Xianzhao Chen

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation

Nov 27, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

Sep 25, 2024

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Jun 22, 2024

Can Large Language Models Understand Spatial Audio?

Jun 12, 2024

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Oct 20, 2023

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Oct 10, 2023

Connecting Speech Encoder and Large Language Model for ASR

Sep 26, 2023

Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition

Jun 09, 2023