Shiliang Zhang

Unispeaker: A Unified Approach for Multimodality-driven Speaker Generation

Jan 11, 2025
Via arXiv

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Jan 10, 2025
Via arXiv

Hardware-in-the-loop Simulation Testbed for Geomagnetic Navigation

Dec 16, 2024
Via arXiv

CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models

Dec 13, 2024
Via arXiv

Anti-Forgetting Adaptation for Unsupervised Person Re-identification

Nov 22, 2024
Via arXiv

CTC-Assisted LLM-Based Contextual ASR

Nov 10, 2024
Via arXiv

Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap

Oct 22, 2024
Via arXiv

Long-distance Geomagnetic Navigation in GNSS-denied Environments with Deep Reinforcement Learning

Oct 21, 2024
Via arXiv

Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study

Sep 26, 2024
Via arXiv

Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition

Sep 26, 2024
Via arXiv