Tao Yuan

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Dec 25, 2024
Via arXiv

The Key of Understanding Vision Tasks: Explanatory Instructions

Dec 24, 2024
Via arXiv

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Dec 20, 2024
Via arXiv

Beyond 2:4: exploring V:N:M sparsity for efficient transformer inference on GPUs

Oct 21, 2024
Via arXiv

PR2: A Physics- and Photo-realistic Testbed for Embodied AI and Humanoid Robots

Sep 03, 2024
Via arXiv

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

Jul 16, 2024
Via arXiv

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

Feb 06, 2024
Via arXiv

Structured Attention for Unsupervised Dialogue Structure Induction

Oct 09, 2020
Via arXiv

Joint Inference of States, Robot Knowledge, and Human Beliefs

Apr 25, 2020
Via arXiv

PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points

Dec 16, 2019
Via arXiv