Arvind Narayanan

Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers

Dec 02, 2024
Via arXiv

Inference Scaling $\scriptsize\mathtt{F}$Laws: The Limits of LLM Resampling with Imperfect Verifiers

Nov 26, 2024
Via arXiv

CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark

Sep 17, 2024
Via arXiv

AI Agents That Matter

Jul 01, 2024
Via arXiv

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Jun 26, 2024
Via arXiv

AI Risk Management Should Incorporate Both Safety and Security

May 29, 2024
Via arXiv

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024
Via arXiv

On the Societal Impact of Open Foundation Models

Feb 27, 2024
Via arXiv

Foundation Model Transparency Reports

Feb 26, 2024
Via arXiv

REFORMS: Reporting Standards for Machine Learning Based Science

Aug 15, 2023
Via arXiv