Jinliang Zheng

Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning

Oct 02, 2024
Via arXiv

MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment

Jun 28, 2024
Via arXiv

Instruction-Guided Visual Masking

May 30, 2024
Via arXiv

Enhancing Vision-Language Model with Unmasked Token Alignment

May 29, 2024
Via arXiv

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

Apr 11, 2024
Via arXiv

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

Feb 28, 2024
Via arXiv