Qi Sun

Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark

Jan 16, 2025
Via arXiv

$\text{Transformer}^2$: Self-adaptive LLMs

Jan 14, 2025
Via arXiv

Computer Vision-Driven Gesture Recognition: Toward Natural and Intuitive Human-Computer

Dec 24, 2024
Via arXiv

Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning

Dec 17, 2024
Via arXiv

FovealNet: Advancing AI-Driven Gaze Tracking Solutions for Optimized Foveated Rendering System Performance in Virtual Reality

Dec 12, 2024
Via arXiv

BudgetFusion: Perceptually-Guided Adaptive Diffusion Models

Dec 10, 2024
Via arXiv

Detect an Object At Once without Fine-tuning

Nov 04, 2024
Via arXiv

An Evolved Universal Transformer Memory

Oct 17, 2024
Via arXiv

Low Latency Point Cloud Rendering with Learned Splatting

Sep 24, 2024
Via arXiv

Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models

Sep 22, 2024
Via arXiv