Tong Wang

Jeffrey

MiVE: Multiscale Vision-language features for reference-guided video Editing

May 14, 2026
Via arXiv

UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

May 12, 2026
Via arXiv

ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving

May 06, 2026
Via arXiv

See Further, Think Deeper: Advancing VLM's Reasoning Ability with Low-level Visual Cues and Reflection

Apr 27, 2026
Via arXiv

DataFactory: Collaborative Multi-Agent Framework for Advanced Table Question Answering

Mar 10, 2026
Via arXiv

CMSA-Net: Causal Multi-scale Aggregation with Adaptive Multi-source Reference for Video Polyp Segmentation

Feb 26, 2026
Via arXiv

Generating a Paracosm for Training-Free Zero-Shot Composed Image Retrieval

Feb 03, 2026
Via arXiv

One Ring to Rule Them All: Unifying Group-Based RL via Dynamic Power-Mean Geometry

Jan 30, 2026
Via arXiv

STARS: Shared-specific Translation and Alignment for missing-modality Remote Sensing Semantic Segmentation

Jan 24, 2026
Via arXiv

RayFusion: Ray Fusion Enhanced Collaborative Visual Perception

Oct 09, 2025
Via arXiv