Siyuan Huang

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

Jan 03, 2025
Via arXiv

A3: Android Agent Arena for Mobile GUI Agents

Jan 02, 2025
Via arXiv

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Dec 16, 2024
Via arXiv

MMAD-Purify: A Precision-Optimized Framework for Efficient and Scalable Multi-Modal Attacks

Oct 17, 2024
Via arXiv

Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention

Oct 09, 2024
Via arXiv

Mirror-Consistency: Harnessing Inconsistency in Majority Voting

Oct 07, 2024
Via arXiv

Autonomous Character-Scene Interaction Synthesis from Text Instruction

Oct 04, 2024
Via arXiv

Effective Tuning Strategies for Generalist Robot Manipulation Policies

Oct 02, 2024
Via arXiv

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Sep 30, 2024
Via arXiv

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024
Via arXiv