James Aung

Tony

GPT-4o System Card

Oct 25, 2024
Via arXiv

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Oct 09, 2024
Via arXiv

Large Language Models as Misleading Assistants in Conversation

Jul 16, 2024
Via arXiv