Bernd Bohnet

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Aug 14, 2024
Via arXiv

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Jun 18, 2024
Via arXiv

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation

May 31, 2024
Via arXiv

Many-Shot In-Context Learning

Apr 17, 2024
Via arXiv

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Feb 02, 2024
Via arXiv

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023
Via arXiv

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

Nov 15, 2023
Via arXiv

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Oct 16, 2023
Via arXiv

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

Dec 15, 2022
Via arXiv