Kanishk Gandhi

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

Feb 24, 2025

Non-literal Understanding of Number Words by Language Models

Feb 10, 2025

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Jan 08, 2025

BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery

Jan 02, 2025

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Dec 04, 2024

Human-like Affective Cognition in Foundation Models

Sep 19, 2024

Psychometric Alignment: Capturing Human Knowledge Distributions via Language Models

Jul 22, 2024

Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels

Apr 22, 2024

Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models

Apr 17, 2024

Stream of Search: Learning to Search in Language

Apr 01, 2024