Rui Zheng

Multi-Programming Language Sandbox for LLMs

Oct 30, 2024
Via arXiv

Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs

Oct 15, 2024
Via arXiv

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

Oct 13, 2024
Via arXiv

RoCoSDF: Row-Column Scanned Neural Signed Distance Fields for Freehand 3D Ultrasound Imaging Shape Reconstruction

Aug 14, 2024
Via arXiv

RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent

Jul 23, 2024
Via arXiv

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Jul 08, 2024
Via arXiv

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

Jun 26, 2024
Via arXiv

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

Jun 17, 2024
Via arXiv

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

Jun 17, 2024
Via arXiv

Toward Optimal LLM Alignments Using Two-Player Games

Jun 16, 2024
Via arXiv