Shuaiyi Nie

S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models

Apr 14, 2025

Revealing the Challenge of Detecting Character Knowledge Errors in LLM Role-Playing

Sep 18, 2024