Sean Welleck

ImProver: Agent-Based Automated Proof Optimization

Oct 07, 2024
Via arXiv

miniCTX: Neural Theorem Proving with (Long-)Contexts

Aug 05, 2024
Via arXiv

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Aug 01, 2024
Via arXiv

Lean-STaR: Learning to Interleave Thinking and Proving

Jul 14, 2024
Via arXiv

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

Jun 24, 2024
Via arXiv

miniCodeProps: a Minimal Benchmark for Proving Code Properties

Jun 16, 2024
Via arXiv

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

May 02, 2024
Via arXiv

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Mar 14, 2024
Via arXiv

STEER: Unified Style Transfer with Expert Reinforcement

Nov 13, 2023
Via arXiv