Zhihui Xie

VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Nov 26, 2024

Learning Versatile Skills with Curriculum Masking

Oct 23, 2024

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment

Oct 12, 2024

Jailbreaking as a Reward Misspecification Problem

Jun 20, 2024

Calibrating Reasoning in Language Models with Internal Consistency

May 29, 2024

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Apr 18, 2024

Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations

Jan 11, 2024

Silkie: Preference Distillation for Large Visual Language Models

Dec 17, 2023

Future-conditioned Unsupervised Pretraining for Decision Transformer

May 26, 2023

Pretraining in Deep Reinforcement Learning: A Survey

Nov 08, 2022