Craig Thomson

HEDS 3.0: The Human Evaluation Data Sheet Version 3.0

Dec 10, 2024
Via arXiv

AI-based traffic analysis in digital twin networks

Nov 01, 2024
Via arXiv

AI in Energy Digital Twining: A Reinforcement Learning-based Adaptive Digital Twin Model for Green Cities

Jan 28, 2024
Via arXiv

Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

May 02, 2023
Via arXiv

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

Jun 24, 2022
Via arXiv

Generation Challenges: Results of the Accuracy Evaluation Shared Task

Aug 15, 2021
Via arXiv

Underreporting of errors in NLG output, and what to do about it

Aug 08, 2021
Via arXiv

A Gold Standard Methodology for Evaluating Accuracy in Data-To-Text Systems

Nov 08, 2020
Via arXiv

Shared Task on Evaluating Accuracy in Natural Language Generation

Jun 22, 2020
Via arXiv