
Qifan Wang

Meta AI

Error-driven Data-efficient Large Multimodal Model Tuning

Dec 20, 2024

CompCap: Improving Multimodal Large Language Models with Composite Captions

Dec 06, 2024

Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics

Nov 22, 2024

When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations

Nov 19, 2024

Visual Fourier Prompt Tuning

Nov 02, 2024

Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms

Oct 31, 2024

FIRP: Faster LLM inference via future intermediate representation prediction

Oct 27, 2024

RoRA-VLM: Robust Retrieval-Augmented Vision Language Models

Oct 11, 2024

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Oct 04, 2024

ProteinGPT: Multimodal LLM for Protein Property Prediction and Structure Understanding

Aug 21, 2024