Kai Yu

Sherman

Recent Advances in Discrete Speech Tokens: A Review

Feb 10, 2025

From Generalist to Specialist: A Survey of Large Language Models for Chemistry

Dec 28, 2024

AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures

Dec 25, 2024

Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario

Dec 24, 2024

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective

Dec 22, 2024

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024

NTC-KWS: Noise-aware CTC for Robust Keyword Spotting

Dec 17, 2024

Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency

Dec 17, 2024

VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization

Dec 13, 2024

Reducing Tool Hallucination via Reliability Alignment

Dec 05, 2024