Fan Yu

CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models

Dec 13, 2024

ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model

Nov 04, 2024

Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap

Oct 22, 2024

SEFraud: Graph-based Self-Explainable Fraud Detection via Interpretative Mask Learning

Jun 17, 2024

MaLa-ASR: Multimedia-Assisted LLM-Based ASR

Jun 09, 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Feb 13, 2024

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

Jan 12, 2024

Hourglass-AVSR: Down-Up Sampling-based Computational Efficiency Model for Audio-Visual Speech Recognition

Dec 14, 2023

BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

Oct 08, 2023

SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR

Oct 07, 2023