Gao Huang

Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition

Dec 15, 2024
Via arXiv

Bridging the Divide: Reconsidering Softmax and Linear Attention

Dec 09, 2024
Via arXiv

A Unified Interaction Control Framework for Safe Robotic Ultrasound Scanning with Human-Intention-Aware Compliance

Nov 29, 2024
Via arXiv

Advancing Generalization in PINNs through Latent-Space Representations

Nov 28, 2024
Via arXiv

Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data

Nov 23, 2024
Via arXiv

ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis

Nov 11, 2024
Via arXiv

How Far is Video Generation from World Model: A Physical Law Perspective

Nov 04, 2024
Via arXiv

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Nov 04, 2024
Via arXiv

Exploring contextual modeling with linear complexity for point cloud segmentation

Oct 28, 2024
Via arXiv

LLM-based Optimization of Compound AI Systems: A Survey

Oct 21, 2024
Via arXiv