Chandler Smith

Multi-Agent Risks from Advanced AI

Feb 19, 2025

MALT: Improving Reasoning with Multi-Agent LLM Training

Dec 02, 2024

BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices

Nov 20, 2024

Riemannian Optimization for Non-convex Euclidean Distance Geometry with Global Recovery Guarantees

Oct 08, 2024

Evaluating Language Model Character Traits

Oct 05, 2024

Escalation Risks from Language Models in Military and Diplomatic Decision-Making

Jan 07, 2024