Davide Paglieri

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Nov 20, 2024
Via arXiv

Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs

Jun 03, 2024
Via arXiv

Multi-Agent Diagnostics for Robustness via Illuminated Diversity

Jan 24, 2024
Via arXiv