
Xiaokang Yang

Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method

Oct 27, 2025
Via arXiv

FieldGen: From Teleoperated Pre-Manipulation Trajectories to Field-Guided Data Generation

Oct 23, 2025
Via arXiv

Expertise need not monopolize: Action-Specialized Mixture of Experts for Vision-Language-Action Learning

Oct 16, 2025
Via arXiv

Flow-Matching Guided Deep Unfolding for Hyperspectral Image Reconstruction

Oct 02, 2025
Via arXiv

FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing

Sep 26, 2025
Via arXiv

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Aug 27, 2025
Via arXiv

NEARL-CLIP: Interacted Query Adaptation with Orthogonal Regularization for Medical Vision-Language Understanding

Aug 06, 2025
Via arXiv

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

Aug 06, 2025
Via arXiv

MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding

Jul 08, 2025
Via arXiv

MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models

Jun 12, 2025
Via arXiv