Jason Weston

Google

Scaling Agent Learning via Experience Synthesis

Nov 10, 2025

RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization

Oct 02, 2025

Stochastic activations

Sep 26, 2025

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Aug 27, 2025

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Aug 18, 2025

Learning to Reason for Factuality

Aug 07, 2025

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Jul 31, 2025

MetaCLIP 2: A Worldwide Scaling Recipe

Jul 29, 2025

NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

Jul 02, 2025

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

May 15, 2025