Matthias Bethge

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Feb 26, 2025

Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs

Feb 26, 2025

Testing the limits of fine-tuning to improve reasoning in vision language models

Feb 21, 2025

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws

Feb 17, 2025

Great Models Think Alike and this Undermines AI Oversight

Feb 06, 2025

How to Merge Your Multimodal Models Over Time?

Dec 09, 2024

ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities

Dec 09, 2024

Object segmentation from common fate: Motion energy processing enables human-like zero-shot generalization to random dot stimuli

Nov 03, 2024

Centaur: a foundation model of human cognition

Oct 26, 2024

In Search of Forgotten Domain Generalization

Oct 10, 2024