Sean Welleck

Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning

Dec 19, 2024
Via arXiv

AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement

Dec 09, 2024
Via arXiv

Evaluating Language Models as Synthetic Data Generators

Dec 04, 2024
Via arXiv

ImProver: Agent-Based Automated Proof Optimization

Oct 07, 2024
Via arXiv

miniCTX: Neural Theorem Proving with (Long-)Contexts

Aug 05, 2024
Via arXiv

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Aug 01, 2024
Via arXiv

Lean-STaR: Learning to Interleave Thinking and Proving

Jul 14, 2024
Via arXiv

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

Jun 24, 2024
Via arXiv

miniCodeProps: a Minimal Benchmark for Proving Code Properties

Jun 16, 2024
Via arXiv

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv