Hengshuang Zhao

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Dec 10, 2024

Liquid: Language Models are Scalable Multi-modal Generators

Dec 05, 2024

SyncVIS: Synchronized Video Instance Segmentation

Dec 01, 2024

Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning

Nov 12, 2024

One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection

Nov 03, 2024

UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation

Oct 14, 2024

VIRT: Vision Instructed Transformer for Robotic Manipulation

Oct 09, 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024

LION: Linear Group RNN for 3D Object Detection in Point Clouds

Jul 25, 2024

Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation

Jul 21, 2024