Tejal Patwardhan

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

Feb 19, 2025
Via arXiv

Humanity's Last Exam

Jan 24, 2025
Via arXiv

OpenAI o1 System Card

Dec 21, 2024
Via arXiv

GPT-4o System Card

Oct 25, 2024
Via arXiv

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Oct 09, 2024
Via arXiv