Jonas Geiping

Scaling Open-Ended Reasoning to Predict the Future

Dec 31, 2025

Training AI Co-Scientists Using Rubric Rewards

Dec 29, 2025

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

Nov 10, 2025

Efficient Parallel Samplers for Recurrent-Depth Models and Their Connection to Diffusion Language Models

Oct 16, 2025

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Oct 10, 2025

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

Sep 11, 2025

Answer Matching Outperforms Multiple Choice for Language Model Evaluation

Jul 03, 2025

Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling

Jun 14, 2025

Capability-Based Scaling Laws for LLM Red-Teaming

May 26, 2025

Can you Finetune your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models

Apr 08, 2025