Barbara Plank

Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination

Oct 24, 2024
Via arXiv

Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum

Oct 18, 2024
Via arXiv

Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation

Oct 04, 2024
Via arXiv

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models

Oct 02, 2024
Via arXiv

MultiClimate: Multimodal Stance Detection on Climate Change Videos

Sep 26, 2024
Via arXiv

Behavioral Testing: Can Large Language Models Implicitly Resolve Ambiguous Entities?

Jul 25, 2024
Via arXiv

An Empirical Comparison of Generative Approaches for Product Attribute-Value Identification

Jul 01, 2024
Via arXiv

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Jun 26, 2024
Via arXiv

"Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?

Jun 25, 2024
Via arXiv

CLIMATELI: Evaluating Entity Linking on Climate Change Data

Jun 24, 2024
Via arXiv