Jose Hernandez-Orallo

Shammie

Conversational Complexity for Assessing Risk in Large Language Models

Sep 02, 2024
Via arXiv

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Via arXiv

Evaluating General-Purpose AI with Psychometrics

Oct 25, 2023
Via arXiv

Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs

Apr 22, 2023
Via arXiv

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Jun 10, 2022
Via arXiv

Evaluating the Apperception Engine

Jul 09, 2020
Via arXiv

Making sense of sensory input

Oct 05, 2019
Via arXiv

Finite Biased Teaching with Infinite Concept Classes

Apr 19, 2018
Via arXiv

AI Evaluation: past, present and future

Aug 21, 2016
Via arXiv

Universal Psychometrics Tasks: difficulty, composition and decomposition

Mar 26, 2015
Via arXiv