Alexander Koller

Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests

Feb 20, 2025

EHOP: A Dataset of Everyday NP-Hard Optimization Problems

Feb 19, 2025

A Survey on Complex Tasks for Goal-Directed Interactive Agents

Sep 27, 2024

Learning Program Behavioral Models from Synthesized Input-Output Pairs

Jul 11, 2024

Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations

Jul 05, 2024

Scope-enhanced Compositional Semantic Parsing for DRT

Jul 02, 2024

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Jun 26, 2024

Fine-grained Controllable Text Generation through In-context Learning with Feedback

Jun 17, 2024

A Dialogue Game for Eliciting Balanced Collaboration

Jun 12, 2024

Simple and effective data augmentation for compositional generalization

Jan 18, 2024