Shuwei He

ERNIE 5.0 Technical Report

Feb 04, 2026

CORD: Bridging the Audio-Text Reasoning Gap via Weighted On-policy Cross-modal Distillation

Jan 23, 2026

MoE Adapter for Large Audio Language Models: Sparsity, Disentanglement, and Gradient-Conflict-Free

Jan 08, 2026

Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

Dec 17, 2024

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

Oct 18, 2024