Pei Ke

HPSS: Heuristic Prompting Strategy Search for LLM Evaluators

Feb 18, 2025

CharacterBench: Benchmarking Character Customization of Large Language Models

Dec 16, 2024

The Superalignment of Superhuman Intelligence with Large Language Models

Dec 15, 2024

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Jul 04, 2024

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks

Jul 03, 2024

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Jun 24, 2024

Learning Task Decomposition to Assist Humans in Competitive Programming

Jun 07, 2024

Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering

May 23, 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Feb 26, 2024

Towards Efficient and Exact Optimization of Language Model Alignment

Feb 02, 2024