Daniel Deutsch

Mitigating Metric Bias in Minimum Bayes Risk Decoding

Nov 05, 2024

Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data

Oct 14, 2024

MetricX-24: The Google Submission to the WMT 2024 Metrics Shared Task

Oct 04, 2024

Improving Statistical Significance in Human Evaluation of Automatic Metrics via Soft Pairwise Accuracy

Sep 15, 2024

On the Role of Summary Content Units in Text Summarization Evaluation

Apr 02, 2024

Finding Replicable Human Evaluations via Stable Ranking Probability

Apr 01, 2024

Pinpoint, Not Criticize: Refining Large Language Models via Fine-Grained Actionable Feedback

Nov 15, 2023

There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

Nov 09, 2023

The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics

Oct 30, 2023

Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level

Aug 28, 2023