Juri Opitz

PARAPHRASUS: A Comprehensive Benchmark for Evaluating Paraphrase Detection Models

Sep 18, 2024
Via arXiv

Natural Language Processing RELIES on Linguistics

May 09, 2024
Via arXiv

A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice

Apr 25, 2024
Via arXiv

Schroedinger's Threshold: When the AUC doesn't predict Accuracy

Apr 04, 2024
Via arXiv

On the Role of Summary Content Units in Text Summarization Evaluation

Apr 02, 2024
Via arXiv

The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics

Oct 30, 2023
Via arXiv

Gzip versus bag-of-words for text classification

Aug 08, 2023
Via arXiv

AMR4NLI: Interpretable and robust NLI measures from semantic graphs

Jun 01, 2023
Via arXiv

With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness

May 26, 2023
Via arXiv

Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks

May 15, 2023
Via arXiv