Yan Yan

Real-Time Robot Execution with Masked Action Chunking

Jan 27, 2026
Via arXiv

CogniMap3D: Cognitive 3D Mapping and Rapid Retrieval

Jan 13, 2026
Via arXiv

Consistent Instance Field for Dynamic Scene Understanding

Dec 16, 2025
Via arXiv

Distill Video Datasets into Images

Dec 16, 2025
Via arXiv

From Particles to Fields: Reframing Photon Mapping with Continuous Gaussian Photon Fields

Dec 13, 2025
Via arXiv

VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction

Dec 11, 2025
Via arXiv

GLaD: Geometric Latent Distillation for Vision-Language-Action Models

Dec 10, 2025
Via arXiv

TraceFlow: Dynamic 3D Reconstruction of Specular Scenes Driven by Ray Tracing

Dec 10, 2025
Via arXiv

MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs

Nov 18, 2025
Via arXiv

WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting

Nov 11, 2025
Via arXiv