Percy Liang

Shammie

RoboReward: General-Purpose Vision-Language Reward Models for Robotics

Jan 08, 2026

Extracting books from production language models

Jan 06, 2026

The 2025 Foundation Model Transparency Index

Dec 11, 2025

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Dec 10, 2025

Beat the long tail: Distribution-Aware Speculative Decoding for RL Training

Nov 17, 2025

On the Entropy Calibration of Language Models

Nov 15, 2025

MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline

Oct 08, 2025

Pre-training under infinite compute

Sep 18, 2025

UQ: Assessing Language Models on Unsolved Questions

Aug 25, 2025

Establishing Best Practices for Building Rigorous Agentic Benchmarks

Jul 03, 2025