Kun Wei

HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR Models

Sep 30, 2024
Via arXiv

Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text

Sep 17, 2024
Via arXiv

Advancing Multi-talker ASR Performance with Large Language Models

Aug 30, 2024
Via arXiv

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

May 06, 2024
Via arXiv

Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets

May 06, 2024
Via arXiv

Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons

Jan 24, 2024
Via arXiv

Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation

Oct 22, 2023
Via arXiv

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task

Jul 10, 2023
Via arXiv

StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

Jun 01, 2023
Via arXiv

ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis

Apr 13, 2023
Via arXiv