Sichao Jiang

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Feb 20, 2025
Via arXiv

Evaluating the Robustness to Instructions of Large Language Models

Aug 29, 2023
Via arXiv