Kai Yu

Sherman

Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video

Sep 10, 2025
Via arXiv

POSE: Phased One-Step Adversarial Equilibrium for Video Diffusion Models

Aug 28, 2025
Via arXiv

MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR

Aug 26, 2025
Via arXiv

Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025
Via arXiv

ChemDFM-R: An Chemical Reasoner LLM Enhanced with Atomized Chemical Knowledge

Jul 30, 2025
Via arXiv

Reasoning-Driven Retrosynthesis Prediction with Large Language Models via Reinforcement Learning

Jul 23, 2025
Via arXiv

Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning

Jun 12, 2025
Via arXiv

Low-Resource Domain Adaptation for Speech LLMs via Text-Only Fine-Tuning

Jun 06, 2025
Via arXiv

Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding

May 30, 2025
Via arXiv

HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer

May 28, 2025
Via arXiv