Jack Parker-Holder

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Nov 20, 2024
Via arXiv

Open-Endedness is Essential for Artificial Superhuman Intelligence

Jun 06, 2024
Via arXiv

Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs

Jun 03, 2024
Via arXiv

Video as the New Language for Real-World Decision Making

Feb 27, 2024
Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Via arXiv

Genie: Generative Interactive Environments

Feb 23, 2024
Via arXiv

Multi-Agent Diagnostics for Robustness via Illuminated Diversity

Jan 24, 2024
Via arXiv

Vision-Language Models as a Source of Rewards

Dec 14, 2023
Via arXiv

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

Oct 04, 2023
Via arXiv

Stabilizing Unsupervised Environment Design with a Learned Adversary

Aug 22, 2023
Via arXiv