Shitong Duan

IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization

Aug 12, 2025
Via arXiv

Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values

Jan 13, 2025
Via arXiv

The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment

Dec 24, 2024
Via arXiv

On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models

Mar 07, 2024
Via arXiv

Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization

Mar 06, 2024
Via arXiv

Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning

Oct 30, 2023
Via arXiv