Marzena Karpinska

Preliminary WMT24 Ranking of General MT Systems and LLMs

Jul 29, 2024
Via arXiv

CaLMQA: Exploring culturally specific long-form question answering across 23 languages

Jun 25, 2024
Via arXiv

One Thousand and One Pairs: A "novel" challenge for long-context language models

Jun 24, 2024
Via arXiv

Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation

Jun 17, 2024
Via arXiv

FABLES: Evaluating faithfulness and content selection in book-length summarization

Apr 01, 2024
Via arXiv

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024
Via arXiv

Large language models effectively leverage document-level context for literary translation, but critical errors persist

Apr 07, 2023
Via arXiv

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

Mar 23, 2023
Via arXiv

Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature

Oct 25, 2022
Via arXiv

DEMETR: Diagnosing Evaluation Metrics for Translation

Oct 25, 2022
Via arXiv