Bin Zhao

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025
Via arXiv

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Jan 27, 2025
Via arXiv

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Nov 25, 2024
Via arXiv

Night-to-Day Translation via Illumination Degradation Disentanglement

Nov 21, 2024
Via arXiv

FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives

Oct 29, 2024
Via arXiv

Towards Flexible and Efficient Diffusion Low Light Enhancer

Oct 16, 2024
Via arXiv

Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning

Oct 11, 2024
Via arXiv

Fast-UMI: A Scalable and Hardware-Independent Universal Manipulation Interface

Sep 29, 2024
Via arXiv

Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement

Sep 25, 2024
Via arXiv

AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots

Sep 18, 2024
Via arXiv