Hanbo Zhang

REGNet V2: End-to-End REgion-based Grasp Detection Network for Grippers of Different Sizes in Point Clouds

Oct 12, 2024

GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation

Oct 08, 2024

DexDiff: Towards Extrinsic Dexterity Manipulation of Ungraspable Objects in Unrestricted Environments

Sep 09, 2024

SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction

Feb 20, 2024

Towards Unified Interactive Visual Grounding in The Wild

Jan 30, 2024

Vision-Language Foundation Models as Effective Robot Imitators

Nov 06, 2023

InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions

Oct 18, 2023

What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?

Jul 30, 2023

Robotic Grasping from Classical to Modern: A Survey

Feb 08, 2022

Density-based Curriculum for Multi-goal Reinforcement Learning with Sparse Rewards

Sep 24, 2021