Fu-En Yang

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Jan 14, 2026

3AM: Segment Anything with Geometric Consistency in Videos

Jan 13, 2026

TA-Prompting: Enhancing Video Large Language Models for Dense Video Captioning via Temporal Anchors

Jan 06, 2026

VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models

Nov 10, 2025

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

Aug 19, 2025

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

Jul 22, 2025

VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models

Mar 27, 2025

MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching

Feb 18, 2025

PromptHSI: Universal Hyperspectral Image Restoration Framework for Composite Degradation

Nov 24, 2024

Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models

Mar 14, 2024