Chirag Nagpal

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Oct 10, 2024

Robust Preference Optimization through Reward Model Distillation

May 29, 2024

A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models

Mar 18, 2024

The Case for Globalizing Fairness: A Mixed Methods Study on Colonialism, AI, and Health in Africa

Mar 11, 2024

Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation

Feb 20, 2024

Transforming and Combining Rewards for Aligning Large Language Models

Feb 01, 2024

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Dec 21, 2023

Recovering Sparse and Interpretable Subgroups with Heterogeneous Treatment Effects with Censored Time-to-Event Outcomes

Feb 24, 2023

Participatory Systems for Personalized Prediction

Feb 08, 2023