Xianghu Yue

VoiceBench: Benchmarking LLM-Based Voice Assistants

Oct 22, 2024

Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models

Sep 27, 2024

Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection

Sep 11, 2024

TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations

Jul 02, 2024

Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

Feb 28, 2024

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Jan 22, 2024

Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder

Jul 19, 2023

Self-Transcriber: Few-shot Lyrics Transcription with Self-training

Nov 18, 2022

token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text

Oct 30, 2022

End-to-End Code-Switching ASR for Low-Resourced Language Pairs

Sep 30, 2019