Melanie Mitchell

Do AI Models Perform Human-like Abstract Reasoning Across Modalities?

Oct 02, 2025
Via arXiv

Evaluating the Robustness of Analogical Reasoning in Large Language Models

Nov 21, 2024
Via arXiv

Imagining and building wise machines: The centrality of AI metacognition

Nov 04, 2024
Via arXiv

Can Large Language Models generalize analogy solving like people can?

Nov 04, 2024
Via arXiv

Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models

Feb 14, 2024
Via arXiv

Perspectives on the State and Future of Deep Learning - 2023

Dec 19, 2023
Via arXiv

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks

Nov 26, 2023
Via arXiv

The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain

May 11, 2023
Via arXiv

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report

Oct 27, 2022
Via arXiv

Embodied, Situated, and Grounded Intelligence: Implications for AI

Oct 24, 2022
Via arXiv