Ruihang Chu

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Mar 27, 2024

DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

Mar 25, 2024

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Mar 13, 2024

A Survey of Reasoning with Foundation Models

Dec 26, 2023

Mask-Attention-Free Transformer for 3D Instance Segmentation

Sep 04, 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation

Jul 04, 2023

DiffComplete: Diffusion-based Generative 3D Shape Completion

Jun 28, 2023

TriVol: Point Cloud Rendering via Triple Volumes

Mar 29, 2023

ICM-3D: Instantiated Category Modeling for 3D Instance Segmentation

Aug 26, 2021

Simultaneous Semantic and Collision Learning for 6-DoF Grasp Pose Estimation

Aug 05, 2021