Tian Tan

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

Jul 05, 2024
Via arXiv

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Jun 22, 2024
Via arXiv

Text-aware Speech Separation for Multi-talker Keyword Spotting

Jun 18, 2024
Via arXiv

Can Large Language Models Understand Spatial Audio?

Jun 12, 2024
Via arXiv

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Oct 20, 2023
Via arXiv

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Oct 10, 2023
Via arXiv

Connecting Speech Encoder and Large Language Model for ASR

Sep 26, 2023
Via arXiv

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Sep 14, 2023
Via arXiv

Multi-Modality Deep Network for Extreme Learned Image Compression

Apr 26, 2023
Via arXiv

Adjacency Constraint for Efficient Hierarchical Reinforcement Learning

Oct 30, 2021
Via arXiv