Shiva Taslimipoor

Prompting open-source and commercial language models for grammatical error correction of English learner text

Jan 15, 2024

Christopher Davis, Andrew Caines, Øistein Andersen, Shiva Taslimipoor, Helen Yannakoudakis, Zheng Yuan, Christopher Bryant, Marek Rei, Paula Buttery

Abstract: Thanks to recent advances in generative AI, we are able to prompt large language models (LLMs) to produce texts which are fluent and grammatical. In addition, it has been shown that we can elicit attempts at grammatical error correction (GEC) from LLMs when prompted with ungrammatical input sentences. We evaluate how well LLMs can perform at GEC by measuring their performance on established benchmark datasets. We go beyond previous studies, which only examined GPT* models on a selection of English GEC datasets, by evaluating seven open-source and three commercial LLMs on four established GEC benchmarks. We investigate model performance and report results against individual error types. Our results indicate that LLMs do not always outperform supervised English GEC models except in specific contexts, namely commercial LLMs on benchmarks annotated with fluency corrections as opposed to minimal edits. We find that several open-source models outperform commercial ones on minimal edit benchmarks, and that in some settings zero-shot prompting is just as competitive as few-shot prompting.

* 8 pages with appendices
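
The difference between zero-shot and few-shot prompting mentioned in the abstract can be illustrated with a minimal sketch. The instruction wording, the `Input:`/`Output:` layout, and the function name `build_gec_prompt` are assumptions made for illustration; they are not the prompts used in the paper.

```python
from typing import Sequence, Tuple


def build_gec_prompt(sentence: str,
                     examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a GEC prompt (hypothetical format, not taken from the paper).

    With no examples the result is a zero-shot prompt; with one or more
    (erroneous, corrected) pairs it becomes a few-shot prompt.
    """
    lines = [
        "Correct the grammatical errors in the sentence. "
        "Make minimal edits and return only the corrected sentence."
    ]
    # Few-shot demonstrations: each pair shows the model the expected mapping.
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    # The sentence to correct, with the output slot left open for the model.
    lines.append(f"Input: {sentence}")
    lines.append("Output:")
    return "\n".join(lines)


zero_shot = build_gec_prompt("She go to school yesterday.")
few_shot = build_gec_prompt(
    "She go to school yesterday.",
    examples=[("I has a dog.", "I have a dog.")],
)
```

The resulting string would then be sent to whichever model is being evaluated; that call is omitted here because it depends on the model and its API.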

On the application of Large Language Models for language teaching and assessment technology

Jul 17, 2023

Andrew Caines, Luca Benedetto, Shiva Taslimipoor, Christopher Davis, Yuan Gao, Oeistein Andersen, Zheng Yuan, Mark Elliott, Russell Moore, Christopher Bryant (+5 more)

Abstract: The recent release of very large language models such as PaLM and GPT-4 has made an unprecedented impact in the popular media and public consciousness, giving rise to a mixture of excitement and fear as to their capabilities and potential uses, and shining a light on natural language processing research which had not previously received so much attention. The developments offer great promise for education technology, and in this paper we look specifically at the potential for incorporating large language models in AI-driven language teaching and assessment systems. We consider several research areas and also discuss the risks and ethical considerations surrounding generative AI in education technology for language learners. Overall we find that larger language models offer improvements over previous models in text generation, opening up routes toward content generation which had not previously been plausible. For text generation they must be prompted carefully and their outputs may need to be reshaped before they are ready for use. For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results according to standard evaluation metrics. For grading it appears that linguistic features established in the literature should still be used for best performance, and for error correction it may be that the models can offer alternative feedback styles which are not measured sensitively with existing methods. In all cases, there is work to be done to experiment with the inclusion of large language models in education technology for language learners, in order to properly understand and report on their capacities and limitations, and to ensure that foreseeable risks such as misinformation and harmful bias are mitigated.

* Accepted at the AIED2023 workshop: Empowering Education with LLMs - the Next-Gen Interface and Content Generation

Constructing Open Cloze Tests Using Generation and Discrimination Capabilities of Transformers

Apr 14, 2022

Mariano Felice, Shiva Taslimipoor, Paula Buttery

Abstract: This paper presents the first multi-objective transformer model for constructing open cloze tests that exploits generation and discrimination capabilities to improve performance. Our model is further enhanced by tweaking its loss function and applying a post-processing re-ranking algorithm that improves overall test structure. Experiments using automatic and human evaluation show that our approach can achieve up to 82% accuracy according to experts, outperforming previous work and baselines. We also release a collection of high-quality open cloze tests along with sample system output and human annotations that can serve as a future benchmark.

* Accepted at Findings of ACL 2022

MTLB-STRUCT @PARSEME 2020: Capturing Unseen Multiword Expressions Using Multi-task Learning and Pre-trained Masked Language Models

Nov 04, 2020

Shiva Taslimipoor, Sara Bahaadini, Ekaterina Kochmar

Abstract: This paper describes a semi-supervised system that jointly learns verbal multiword expressions (VMWEs) and dependency parse trees as an auxiliary task. The model benefits from pre-trained multilingual BERT. BERT hidden layers are shared between the two tasks and we introduce an additional linear layer to retrieve VMWE tags. The dependency parse tree prediction is modelled by a linear layer and a bilinear one plus a tree CRF on top of BERT. The system participated in the open track of the PARSEME shared task 2020 and ranked first in terms of F1-score in identifying unseen VMWEs as well as VMWEs in general, averaged across all 14 languages.

* Accepted for publication at the MWE-LEX 2020 Workshop at COLING

Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions

Feb 27, 2019

Omid Rohanian, Shiva Taslimipoor, Samaneh Kouchaki, Le An Ha, Ruslan Mitkov

Abstract: We introduce a new method to tag Multiword Expressions (MWEs) using a linguistically interpretable language-independent deep learning architecture. We specifically target discontinuity, an under-explored aspect that poses a significant challenge to computational treatment of MWEs. Two neural architectures are explored: Graph Convolutional Network (GCN) and multi-head self-attention. GCN leverages dependency parse information, and self-attention attends to long-range relations. We finally propose a combined model that integrates complementary information from both through a gating mechanism. The experiments on a standard multilingual dataset for verbal MWEs show that our model outperforms the baselines not only in the case of discontinuous MWEs but also in overall F-score.

* Accepted at NAACL-HLT 2019
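
The abstract does not spell out the form of the gating mechanism. As a rough sketch of one common formulation, assumed here rather than taken from the paper, a sigmoid gate computed from both representations can interpolate element-wise between the GCN output and the self-attention output. The names `gated_combine`, `weights`, and `bias` are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_combine(h_gcn: List[float], h_att: List[float],
                  weights: List[List[float]], bias: List[float]) -> List[float]:
    """Combine two token representations with an element-wise gate.

    Assumed formulation (not confirmed by the source):
        g   = sigmoid(W [h_gcn; h_att] + b)
        out = g * h_gcn + (1 - g) * h_att
    `weights` has one row per output dimension and 2 * d columns.
    """
    concat = h_gcn + h_att  # list concatenation: [h_gcn; h_att]
    out = []
    for i in range(len(h_gcn)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        out.append(g * h_gcn[i] + (1.0 - g) * h_att[i])
    return out


# With zero weights and zero bias the gate is 0.5 everywhere,
# so the output is the plain average of the two inputs.
balanced = gated_combine([1.0, 0.0], [0.0, 1.0],
                         weights=[[0.0] * 4, [0.0] * 4], bias=[0.0, 0.0])

# A large positive bias pushes the gate towards 1, favouring the GCN input.
gcn_heavy = gated_combine([1.0, 0.0], [0.0, 1.0],
                          weights=[[0.0] * 4, [0.0] * 4], bias=[20.0, 20.0])
```

In a trained model the gate parameters would be learned, letting each dimension draw more on dependency-based or attention-based information as needed.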

SHOMA at Parseme Shared Task on Automatic Identification of VMWEs: Neural Multiword Expression Tagging with High Generalisation

Sep 09, 2018

Shiva Taslimipoor, Omid Rohanian

Abstract: This paper presents a language-independent deep learning architecture adapted to the task of multiword expression (MWE) identification. We employ a neural architecture comprising convolutional and recurrent layers with the addition of an optional CRF layer at the top. This system participated in the open track of the Parseme shared task on automatic identification of verbal MWEs due to the use of pre-trained Wikipedia word embeddings. It outperformed all participating systems in both open and closed tracks with an overall macro-average MWE-based F1 score of 58.09 averaged among all languages. A particular strength of the system is its superior performance on unseen data entries.
