Xiujun Li

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

Nov 05, 2024
Via arXiv

Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms

Oct 24, 2024
Via arXiv

From Text to Pixel: Advancing Long-Context Understanding in MLLMs

May 23, 2024
Via arXiv

VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following

Nov 29, 2023
Via arXiv

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

May 18, 2023
Via arXiv

Self-supervised Pre-training with Hard Examples Improves Visual Representations

Jan 04, 2021
Via arXiv

VinVL: Making Visual Representations Matter in Vision-Language Models

Jan 02, 2021
Via arXiv

MiniVLM: A Smaller and Faster Vision-Language Model

Dec 13, 2020
Via arXiv

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

May 18, 2020
Via arXiv

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

Apr 05, 2020
Via arXiv