Zhanhui Zhou

Inference-Time Language Model Alignment via Integrated Value Guidance

Sep 26, 2024

Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level

Jun 17, 2024

Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models

May 29, 2024

ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models

Feb 23, 2024

MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues

Feb 22, 2024

Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!

Feb 21, 2024

Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey

Feb 14, 2024

Beyond One-Preference-for-All: Multi-Objective Direct Preference Optimization for Language Models

Oct 17, 2023