Yu-Chiang Frank Wang

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025
Via arXiv

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

Jun 18, 2025
Via arXiv

EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction

Jun 13, 2025
Via arXiv

Universal Speech Enhancement with Regression and Generative Mamba

May 27, 2025
Via arXiv

UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing

May 14, 2025
Via arXiv

VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models

Mar 27, 2025
Via arXiv

Segment Anything, Even Occluded

Mar 08, 2025
Via arXiv

Plan2Align: Predictive Planning Based Test-Time Preference Alignment in Paragraph-Level Machine Translation

Feb 28, 2025
Via arXiv

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

Feb 23, 2025
Via arXiv

MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching

Feb 18, 2025
Via arXiv