Sixiao Zheng

A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding

Jul 09, 2025

TriVLA: A Triple-System-Based Unified Vision-Language-Action Model for General Robot Control

Jul 03, 2025

TriVLA: A Unified Triple-System-Based Unified Vision-Language-Action Model for General Robot Control

Jul 02, 2025

ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning

Mar 30, 2025

VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation

Feb 12, 2025

TemporalStory: Enhancing Consistency in Story Visualization using Spatial-Temporal Attention

Jul 13, 2024

Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT

Feb 24, 2024

Visual Representation Learning with Transformer: A Sequence-to-Sequence Perspective

Jul 19, 2022

HunYuan_tvr for Text-Video Retrivial

Apr 14, 2022

Clustering by the Probability Distributions from Extreme Value Theory

Feb 20, 2022