Gao Huang

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Mar 13, 2025

Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias

Mar 05, 2025

ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding

Feb 26, 2025

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Feb 25, 2025

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Dec 20, 2024

Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition

Dec 15, 2024

Bridging the Divide: Reconsidering Softmax and Linear Attention

Dec 09, 2024

A Unified Interaction Control Framework for Safe Robotic Ultrasound Scanning with Human-Intention-Aware Compliance

Nov 29, 2024

Advancing Generalization in PINNs through Latent-Space Representations

Nov 28, 2024

Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data

Nov 23, 2024