Picture for Lingwei Meng

Lingwei Meng

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation

Add code
Oct 27, 2024
Figure 1 for ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Figure 2 for ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Figure 3 for ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Figure 4 for ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Viaarxiv icon

Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech

Add code
Sep 22, 2024
Figure 1 for Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech
Figure 2 for Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech
Figure 3 for Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech
Figure 4 for Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech
Viaarxiv icon

Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions

Add code
Sep 13, 2024
Figure 1 for Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Figure 2 for Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Figure 3 for Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Figure 4 for Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Viaarxiv icon

LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization

Add code
Sep 01, 2024
Figure 1 for LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Figure 2 for LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Figure 3 for LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Figure 4 for LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Viaarxiv icon

Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder

Add code
Jul 15, 2024
Viaarxiv icon

Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System

Add code
Jul 13, 2024
Viaarxiv icon

Autoregressive Speech Synthesis without Vector Quantization

Add code
Jul 11, 2024
Viaarxiv icon

VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

Add code
Jun 12, 2024
Viaarxiv icon

UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization

Add code
Jan 26, 2024
Viaarxiv icon

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

Add code
Jan 08, 2024
Viaarxiv icon