Pei Ke

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Jul 04, 2024
Via arXiv

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks

Jul 03, 2024
Via arXiv

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Jun 24, 2024
Via arXiv

Learning Task Decomposition to Assist Humans in Competitive Programming

Jun 07, 2024
Via arXiv

Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering

May 23, 2024
Via arXiv

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Feb 26, 2024
Via arXiv

Towards Efficient and Exact Optimization of Language Model Alignment

Feb 02, 2024
Via arXiv

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Dec 05, 2023
Via arXiv

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Nov 30, 2023
Via arXiv

Unveiling the Implicit Toxicity in Large Language Models

Nov 29, 2023
Via arXiv