José Hernández-Orallo

Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers

Oct 15, 2024

100 instances is all you need: predicting the success of a new LLM on unseen data by testing on a few instances

Sep 05, 2024

Learning Alternative Ways of Performing a Task

Apr 03, 2024

Animal-AI 3: What's New & Why You Should Care

Dec 18, 2023

An International Consortium for Evaluations of Societal-Scale Risks from Advanced AI

Nov 06, 2023

Predictable Artificial Intelligence

Oct 09, 2023

Inferring Capabilities from Task Performance with Bayesian Triangulation

Sep 21, 2023

Compute and Energy Consumption Trends in Deep Learning Inference

Sep 12, 2021

Conditional Teaching Size

Jun 29, 2021

Automating Data Science: Prospects and Challenges

May 12, 2021