Chang Xu

ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning

Apr 09, 2025
Via arXiv

Marine Saliency Segmenter: Object-Focused Conditional Diffusion with Region-Level Semantic Knowledge Distillation

Apr 03, 2025
Via arXiv

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Mar 27, 2025
Via arXiv

Silent Hazards of Token Reduction in Vision-Language Models: The Hidden Impact on Consistency

Mar 11, 2025
Via arXiv

BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modelling

Mar 05, 2025
Via arXiv

Origami-Inspired Soft Gripper with Tunable Constant Force Output

Mar 03, 2025
Via arXiv

Learning Mask Invariant Mutual Information for Masked Image Modeling

Feb 27, 2025
Via arXiv

Mitigating the Impact of Prominent Position Shift in Drone-based RGBT Object Detection

Feb 13, 2025
Via arXiv

VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation

Feb 04, 2025
Via arXiv

Generative Physical AI in Vision: A Survey

Jan 19, 2025
Via arXiv