Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Oliver Eulenstein

Empirical Analysis for Unsupervised Universal Dependency Parse Tree Aggregation

Apr 03, 2024

Adithya Kulkarni, Oliver Eulenstein, Qi Li

Abstract:Dependency parsing is an essential task in NLP, and the quality of dependency parsers is crucial for many downstream tasks. Parsers' quality often varies depending on the domain and the language involved. Therefore, it is essential to combat the issue of varying quality to achieve stable performance. In various NLP tasks, aggregation methods are used for post-processing aggregation and have been shown to combat the issue of varying quality. However, aggregation methods for post-processing aggregation have not been sufficiently studied in dependency parsing tasks. In an extensive empirical study, we compare different unsupervised post-processing aggregation methods to identify the most suitable dependency tree structure aggregation method.

Via

Access Paper or Ask Questions

CPTAM: Constituency Parse Tree Aggregation Method

Jan 19, 2022

Adithya Kulkarni, Nasim Sabetpour, Alexey Markin, Oliver Eulenstein, Qi Li

Figure 1 for CPTAM: Constituency Parse Tree Aggregation Method

Figure 2 for CPTAM: Constituency Parse Tree Aggregation Method

Figure 3 for CPTAM: Constituency Parse Tree Aggregation Method

Figure 4 for CPTAM: Constituency Parse Tree Aggregation Method

Abstract:Diverse Natural Language Processing tasks employ constituency parsing to understand the syntactic structure of a sentence according to a phrase structure grammar. Many state-of-the-art constituency parsers are proposed, but they may provide different results for the same sentences, especially for corpora outside their training domains. This paper adopts the truth discovery idea to aggregate constituency parse trees from different parsers by estimating their reliability in the absence of ground truth. Our goal is to consistently obtain high-quality aggregated constituency parse trees. We formulate the constituency parse tree aggregation problem in two steps, structure aggregation and constituent label aggregation. Specifically, we propose the first truth discovery solution for tree structures by minimizing the weighted sum of Robinson-Foulds (RF) distances, a classic symmetric distance metric between two trees. Extensive experiments are conducted on benchmark datasets in different languages and domains. The experimental results show that our method, CPTAM, outperforms the state-of-the-art aggregation baselines. We also demonstrate that the weights estimated by CPTAM can adequately evaluate constituency parsers in the absence of ground truth.

Via

Access Paper or Ask Questions