Jason Weston

Google

RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization

Oct 02, 2025

Stochastic activations

Sep 26, 2025

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Aug 27, 2025

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Aug 18, 2025

Learning to Reason for Factuality

Aug 07, 2025

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Jul 31, 2025

MetaCLIP 2: A Worldwide Scaling Recipe

Jul 29, 2025

NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

Jul 02, 2025

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

May 15, 2025

Multi-Token Attention

Apr 01, 2025