Eric Horvitz

Generating Structured Outputs from Language Models: Benchmark and Studies

Jan 18, 2025
Via arXiv

Superhuman performance of a large language model on the reasoning tasks of a physician

Dec 14, 2024
Via arXiv

Steering Language Model Refusal with Sparse Autoencoders

Nov 18, 2024
Via arXiv

From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond

Nov 06, 2024
Via arXiv

Decision-Focused Uncertainty Quantification

Oct 02, 2024
Via arXiv

How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities

Sep 18, 2024
Via arXiv

MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering

Jun 03, 2024
Via arXiv

The Rise of the AI Co-Pilot: Lessons for Design from Aviation and Beyond

Nov 29, 2023
Via arXiv

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

Nov 28, 2023
Via arXiv

Frontier AI Regulation: Managing Emerging Risks to Public Safety

Jul 11, 2023
Via arXiv