Abstract: In this work, we introduce a lightweight discourse connective detection system. Employing gradient boosting trained on straightforward, low-complexity features, the proposed approach sidesteps the computational demands of current approaches that rely on deep neural networks. Despite its simplicity, our approach achieves competitive results while offering significant gains in runtime, even on CPU. Furthermore, its stable performance across two unrelated languages suggests that our system is robust in multilingual scenarios. The model is designed to support the annotation of discourse relations, particularly in resource-limited settings, while minimizing performance loss.
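To make the approach concrete, the following is a minimal sketch of a gradient-boosting connective detector over simple token-level features, using scikit-learn. It is not the authors' implementation: the feature set, the toy data, and all names below are illustrative assumptions, shown only to indicate how such a lightweight, CPU-friendly pipeline could be assembled.

```python
# Hedged sketch: gradient boosting over simple surface features for
# discourse connective detection. Feature set and data are hypothetical,
# not the system described in the abstract.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction import DictVectorizer


def token_features(tokens, i):
    """Low-complexity surface features for the token at position i (illustrative set)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        "sent_initial": i == 0,
    }


# Toy training data: each token is labeled 1 if it is part of a discourse connective.
sentences = [
    (["However", ",", "the", "model", "is", "fast", "."], [1, 0, 0, 0, 0, 0, 0]),
    (["The", "corpus", "is", "small", "but", "useful", "."], [0, 0, 0, 0, 1, 0, 0]),
]

X_dicts, y = [], []
for tokens, labels in sentences:
    for i, label in enumerate(labels):
        X_dicts.append(token_features(tokens, i))
        y.append(label)

vec = DictVectorizer()
X = vec.fit_transform(X_dicts)

# Gradient boosting trains and predicts quickly on CPU for feature sets like this.
clf = GradientBoostingClassifier(n_estimators=100)
clf.fit(X, y)

# Flag connective candidates in an unseen sentence.
test_tokens = ["Therefore", ",", "we", "stop", "."]
test_X = vec.transform([token_features(test_tokens, i) for i in range(len(test_tokens))])
print(clf.predict(test_X))
```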
Abstract: We describe Turkish Discourse Bank 1.2, the latest version of a discourse corpus annotated for explicitly or implicitly conveyed discourse relations, their constitutive units, and their senses in the Penn Discourse Treebank style. We present an evaluation of the recently added tokens and examine three commonly occurring dependency patterns that hold among the constitutive units of a pair of adjacent discourse relations, namely shared arguments, full embedding, and partial containment of a discourse relation. We report three major findings: (a) implicitly conveyed relations occur more often than explicitly conveyed relations in the data; (b) two adjacent implicit discourse relations are much more likely to share an argument than two adjacent explicit relations; (c) both full embedding and partial containment of discourse relations are pervasive in the corpus, which may be partly due to subordinator connectives whose preposed subordinate clause tends to be selected together with the matrix clause rather than on its own. Finally, we briefly discuss the implications of our findings for Turkish discourse parsing.