De-An Huang

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Aug 28, 2024

ARDuP: Active Region Video Diffusion for Universal Policies

Jun 19, 2024

X-VILA: Cross-Modality Alignment for Large Language Model

May 29, 2024

What is Point Supervision Worth in Video Instance Segmentation?

Apr 01, 2024

LITA: Language Instructed Temporal-Localization Assistant

Mar 27, 2024

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition

Mar 21, 2024

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

Feb 21, 2024

Deep Multimodal Fusion for Surgical Feedback Classification

Dec 06, 2023

Eureka: Human-Level Reward Design via Coding Large Language Models

Oct 19, 2023

Differentially Private Video Activity Recognition

Jun 27, 2023