Bruno Guillaume

SEMAGRAMME, LORIA

AfriSUD: A Dependency Treebank Collection for Evaluating Models on African Languages

Jun 10, 2026

Happy Buzaaba, Cheikh Mouhamadou Bamba Dione, David Ifeoluwa Adelani, Sylvain Kahane, Kim Gerdes, Bruno Guillaume, Kevin Guan, Aremu Anuoluwapo, Naome A. Etori, Shamsuddeen Hassan Muhammad (+8 more)

Abstract: Despite their linguistic diversity and global significance, African languages remain underrepresented in NLP research and resources. We aim to bridge this gap by introducing AfriSUD, the first large-scale collection of syntactically annotated treebanks for nine diverse African languages spanning major language families and regions across Sub-Saharan Africa. Using the Surface-Syntactic Universal Dependencies (SUD) framework, our community-led effort provides high-quality, native-speaker-verified data that capture key typological features such as agglutination and tone. We evaluate a range of models on AfriSUD for part-of-speech tagging and dependency parsing, including non-transformer baselines, multilingual pretrained encoders, and LLMs. Our results reveal a significant syntax gap: models still show clear limitations across the nine languages, suggesting that existing architectures may not fully capture the structural diversity of African-language syntax.

Coconstructions in spoken data: UD annotation guidelines and first results

Mar 30, 2026

Ludovica Pannitto, Sylvain Kahane, Kaja Dobrovoljc, Elena Battaglia, Bruno Guillaume, Caterina Mauri, Eleonora Zucchini

Abstract: The paper proposes annotation guidelines for syntactic dependencies that span speaker turns (including collaborative coconstructions proper, wh-question answers, and backchannels) in spoken language treebanks within the Universal Dependencies framework. Two representations are proposed: a speaker-based representation that follows the segmentation into speech turns, and a dependency-based representation with dependencies across speech turns. New proposals are also put forward to distinguish between reformulations and repairs, and to promote elements in unfinished phrases.

How much of UCCA can be predicted from AMR?

Jul 25, 2022

Siyana Pavlova, Maxime Amblard, Bruno Guillaume

Abstract: In this paper, we consider two currently popular semantic frameworks: Abstract Meaning Representation (AMR), a more abstract framework, and Universal Conceptual Cognitive Annotation (UCCA), an anchored framework. We use a corpus-based approach to build two graph rewriting systems, one deterministic and one non-deterministic, from the former framework to the latter. We present their evaluation and a number of ambiguities that we discovered while building our rules. Finally, we discuss directions for future work on comparing semantic frameworks of different flavors.

* ISA-18 Workshop at LREC2022, Jun 2022, Marseille, France

Graph Querying for Semantic Annotations

Jul 25, 2022

Maxime Amblard, Bruno Guillaume, Siyana Pavlova, Guy Perrier

Abstract: This paper presents how the online tool GREW-MATCH can be used to query and visualise data from existing semantically annotated corpora. A dedicated syntax is available to construct simple to complex queries and execute them against a corpus. Such queries give transverse views of the annotated data; these views help check the consistency of annotations within one corpus or across several corpora. GREW-MATCH can thus be seen as an error mining tool: when inconsistencies are detected, it helps find the sentences that should be fixed. Finally, GREW-MATCH can also be used as a side tool to assist annotation tasks, helping to find annotation examples in existing corpora to compare with the data to be annotated.

* ISA-18 Workshop at LREC2022, Jun 2022, Marseille, France

Non-simplifying Graph Rewriting Termination

Feb 26, 2013

Guillaume Bonfante, Bruno Guillaume

Abstract: So far, a very large amount of work in Natural Language Processing (NLP) relies on trees as the core mathematical structure to represent linguistic information (e.g. in Chomsky's work). However, some linguistic phenomena do not fit properly into trees. In a former paper, we showed the benefit of encoding linguistic structures as graphs and of using graph rewriting rules to compute on those structures. Justified by some linguistic considerations, graph rewriting is characterized by two features: first, there is no node creation along computations, and second, there are non-local edge modifications. Under these hypotheses, we show that uniform termination is undecidable and that non-uniform termination is decidable. We describe two termination techniques based on weights and we give complexity bounds on the derivation length for these rewriting systems.

* EPTCS 110, 2013, pp. 4-16
* In Proceedings TERMGRAPH 2013, arXiv:1302.5997
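
The general idea of a weight-based termination argument mentioned in the abstract above can be illustrated with a toy sketch. This is not the construction from the paper: the graph encoding, the labels `raw`, `mid`, `final`, the `WEIGHTS` table and the `RULES` table are all hypothetical choices made for illustration. The sketch only shows that if no rule creates nodes and every rule application strictly decreases a natural-number weight, rewriting must stop, and the number of steps is bounded by the initial weight.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Edge = Tuple[str, str, str]  # (source node, edge label, target node)
Graph = FrozenSet[Edge]      # nodes are fixed; only edges are rewritten

# Hypothetical weights assigned to edge labels (an assumption of this sketch).
WEIGHTS: Dict[str, int] = {"raw": 3, "mid": 2, "final": 1}

# Hypothetical rules: each one relabels an edge with a strictly lighter label.
RULES: Dict[str, str] = {"raw": "mid", "mid": "final"}


def weight(graph: Graph) -> int:
    """Total weight of a graph: the sum of the weights of its edge labels."""
    return sum(WEIGHTS[label] for _, label, _ in graph)


def step(graph: Graph) -> Optional[Graph]:
    """Apply one rule to the first matching edge, or return None if none applies."""
    for edge in sorted(graph):
        src, label, tgt = edge
        if label in RULES:
            return (graph - {edge}) | {(src, RULES[label], tgt)}
    return None


def normalize(graph: Graph) -> Tuple[Graph, int]:
    """Rewrite until no rule applies; return the normal form and the step count."""
    steps = 0
    while True:
        nxt = step(graph)
        if nxt is None:
            return graph, steps
        # The termination argument: every step strictly decreases the weight.
        assert weight(nxt) < weight(graph)
        graph, steps = nxt, steps + 1


start: Graph = frozenset({("a", "raw", "b"), ("b", "mid", "c")})
normal_form, n_steps = normalize(start)
```

Because the weight is a non-negative integer that drops by at least one per step, `n_steps` can never exceed `weight(start)`. That gives a simple bound on derivation length in this toy setting.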

Motifs de graphe pour le calcul de dépendances syntaxiques complètes

Nov 18, 2010

Jonathan Marchand, Bruno Guillaume, Guy Perrier

Abstract: This article describes a method to build syntactic dependencies starting from the phrase structure parsing process. The goal is to obtain all the information needed for a detailed semantic analysis. Interaction Grammars are used for parsing; the saturation of polarities, which is the core of this formalism, can be mapped to dependency relations. Formally, graph patterns are used to express the set of constraints that control dependency creation.

* Conférence sur le Traitement Automatique des Langues Naturelles - TALN'10, Montréal, Canada (2010)

Analyse en dépendances à l'aide des grammaires d'interaction

Sep 18, 2009

Jonathan Marchand, Bruno Guillaume, Guy Perrier

Abstract: This article proposes a method to extract dependency structures from phrase-structure parsing with Interaction Grammars. Interaction Grammars are a formalism that expresses interactions among words using a polarity system. Syntactic composition is driven by the saturation of polarities. Interactions take place between constituents, but as the grammars are lexicalized, these interactions can be translated to the level of words. Dependency relations are extracted from the parsing process: every dependency is the consequence of a polarity saturation. The dependency relations we obtain can be seen as a refinement of the usual dependency tree. More generally, this work sheds new light on the links between phrase structure and dependency parsing.

* Traitement Automatique des Langues Naturelles, Senlis, France (2009)
