Daniel Deutsch

Dima

Searching for Difficult-to-Translate Test Examples at Scale

Sep 30, 2025

Deconstructing Self-Bias in LLM-generated Translation Benchmarks

Sep 30, 2025

SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?

Jun 05, 2025

Gemma 3 Technical Report

Mar 25, 2025

Enhancing Human Evaluation in Machine Translation with Comparative Judgment

Feb 25, 2025

WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects

Feb 18, 2025

Overestimation in LLM Evaluation: A Controlled Large-Scale Study on Data Contamination's Impact on Machine Translation

Jan 30, 2025

Mitigating Metric Bias in Minimum Bayes Risk Decoding

Nov 05, 2024

Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data

Oct 14, 2024

MetricX-24: The Google Submission to the WMT 2024 Metrics Shared Task

Oct 04, 2024