Yimeng Zhu

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024
Via arXiv

A3VLM: Actionable Articulation-Aware Vision Language Model

Jun 11, 2024
Via arXiv

On decoder-only architecture for speech-to-text and large language model integration

Jul 14, 2023
Via arXiv

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Mar 01, 2023
Via arXiv