Yangze Li

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Nov 01, 2024
Via arXiv

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024
Via arXiv

Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets

May 06, 2024
Via arXiv

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

May 06, 2024
Via arXiv

SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR

Oct 07, 2023
Via arXiv

The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

Sep 24, 2023
Via arXiv

BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR

May 23, 2023
Via arXiv

CASA-ASR: Context-Aware Speaker-Attributed ASR

May 21, 2023
Via arXiv