Zhaorun Chen

SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations

Dec 09, 2024
Via arXiv

From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding

Dec 09, 2024
Via arXiv

GRAPE: Generalizing Robot Policy via Preference Alignment

Nov 28, 2024
Via arXiv

Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding

Nov 21, 2024
Via arXiv

Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment

Oct 18, 2024
Via arXiv

Preference Optimization with Multi-Sample Comparisons

Oct 16, 2024
Via arXiv

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Oct 14, 2024
Via arXiv

Can Editing LLMs Inject Harm?

Jul 29, 2024
Via arXiv

AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases

Jul 17, 2024
Via arXiv

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Jul 05, 2024
Via arXiv