Arman Cohan

MSRS: Evaluating Multi-Source Retrieval-Augmented Generation

Aug 28, 2025
Via arXiv

Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning

Aug 26, 2025
Via arXiv

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Jul 17, 2025
Via arXiv

Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers

Jul 03, 2025
Via arXiv

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Jul 01, 2025
Via arXiv

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Jun 18, 2025
Via arXiv

SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing

Jun 05, 2025
Via arXiv

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs

May 30, 2025
Via arXiv

Table-R1: Inference-Time Scaling for Table Reasoning

May 29, 2025
Via arXiv

Judging with Many Minds: Do More Perspectives Mean Less Prejudice?

May 26, 2025
Via arXiv