Yu-Chiang Frank Wang

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Jan 14, 2026
Via arXiv

OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding

Jan 14, 2026
Via arXiv

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception

Jan 14, 2026
Via arXiv

TA-Prompting: Enhancing Video Large Language Models for Dense Video Captioning via Temporal Anchors

Jan 06, 2026
Via arXiv

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Dec 23, 2025
Via arXiv

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Dec 22, 2025
Via arXiv

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Dec 16, 2025
Via arXiv

Unified Reinforcement and Imitation Learning for Vision-Language Models

Oct 22, 2025
Via arXiv

Temporal Prompting Matters: Rethinking Referring Video Object Segmentation

Oct 08, 2025
Via arXiv

Continual Personalization for Diffusion Models

Oct 02, 2025
Via arXiv