Shizhe Chen
INRIA

MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping (Mar 23, 2026)

Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation (Jun 12, 2025)

ComposeAnything: Composite Object Priors for Text-to-Image Generation (May 30, 2025)

HORT: Monocular Hand-held Objects Reconstruction with Transformers (Mar 27, 2025)

Online 3D Scene Reconstruction Using Neural Object Priors (Mar 24, 2025)

NextBestPath: Efficient 3D Mapping of Unseen Environments (Feb 07, 2025)

Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy (Oct 02, 2024)

Conan-embedding: General Text Embedding with More and Better Negative Samples (Aug 29, 2024)

ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos (Apr 24, 2024)

Think-Program-reCtify: 3D Situated Reasoning with Large Language Models (Apr 23, 2024)