Abstract: Scientific document classification is a critical task that often involves many classes. However, collecting human-labeled data for many classes is expensive and usually leads to label-scarce scenarios. At the same time, recent work has shown that fine-tuning sentence embedding models for few-shot classification is efficient, robust, and effective. In this work, we propose FusionSent (Fusion-based Sentence Embedding Fine-tuning), an efficient and prompt-free approach for few-shot classification of scientific documents with many classes. FusionSent uses the available training examples and their respective label texts to contrastively fine-tune two different sentence embedding models. Afterward, the parameters of both fine-tuned models are fused to combine the complementary knowledge from the separate fine-tuning steps into a single model. Finally, the resulting sentence embedding model is frozen and used to embed the training instances, which then serve as input features for training a classification head. Our experiments show that FusionSent significantly outperforms strong baselines by an average of $6.0$ $F_{1}$ points across multiple scientific document classification datasets. In addition, we introduce a new dataset for multi-label classification of scientific documents, which contains 183,565 scientific articles and 130 classes from the arXiv category taxonomy. Code and data are available at https://github.com/sebischair/FusionSent.
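
To make the pipeline concrete, the following is a minimal sketch of the steps described above. It assumes two sentence-transformers encoders that have already been contrastively fine-tuned (one on pairs of training examples, one on example/label-text pairs), a simple parameter average as the fusion step, and a scikit-learn logistic regression as the classification head; the actual fusion method and head used by FusionSent may differ, and the model paths are placeholders.

```python
# Minimal sketch of the FusionSent pipeline (see lead-in for assumptions).
import copy

from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression


def fuse_models(model_a: SentenceTransformer,
                model_b: SentenceTransformer,
                weight: float = 0.5) -> SentenceTransformer:
    """Fuse two fine-tuned encoders by interpolating their parameters."""
    fused = copy.deepcopy(model_a)
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    fused_state = {}
    for name, tensor_a in state_a.items():
        tensor_b = state_b[name]
        if tensor_a.is_floating_point():
            fused_state[name] = weight * tensor_a + (1.0 - weight) * tensor_b
        else:
            fused_state[name] = tensor_a  # integer buffers are copied as-is
    fused.load_state_dict(fused_state)
    return fused


# Placeholder paths: each encoder is assumed to be contrastively fine-tuned,
# one on pairs of training examples, one on example/label-text pairs.
model_examples = SentenceTransformer("path/to/encoder-finetuned-on-example-pairs")
model_labels = SentenceTransformer("path/to/encoder-finetuned-on-label-texts")
fused_encoder = fuse_models(model_examples, model_labels)

# Freeze the fused encoder, embed the few-shot training instances,
# and train a lightweight classification head on the embeddings.
train_texts = ["Title and abstract of article one ...",
               "Title and abstract of article two ..."]
train_labels = [0, 1]
features = fused_encoder.encode(train_texts, convert_to_numpy=True)
head = LogisticRegression(max_iter=1000).fit(features, train_labels)
```

Averaging only the floating-point parameters keeps integer buffers (for example, position indices) intact; in practice the interpolation weight could be tuned on a held-out split.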
Abstract: Knowledge models are fundamental for dialogue systems to enable conversational interactions that require handling domain-specific knowledge. Ensuring effective communication in information-providing conversations entails aligning user understanding with the knowledge available to the system. However, dialogue systems often face challenges arising from semantic inconsistencies in how information is expressed in natural language compared to how it is represented within the system's internal knowledge. To address this problem, we study the potential of large language models for conversational grounding, a mechanism to bridge information gaps by establishing shared knowledge between dialogue participants. Our approach involves annotating human conversations across five knowledge domains to create a new dialogue corpus called BridgeKG. Through a series of experiments on this dataset, we empirically evaluate the capabilities of large language models in classifying grounding acts and identifying grounded information items within a knowledge graph structure. Our findings offer insights into how these models use in-context learning for conversational grounding tasks and into their common prediction errors, which we illustrate with examples from challenging dialogues. We discuss how the models handle knowledge graphs as a semantic layer between unstructured dialogue utterances and structured information items.
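
The grounding-act classification experiments lend themselves to a short illustration. The sketch below shows one plausible in-context learning setup, assuming an OpenAI-style chat API; the label set (GROUNDING_ACTS), prompt wording, few-shot demonstration, and model name are hypothetical placeholders rather than the authors' configuration, and the identification of grounded information items in the knowledge graph is omitted.

```python
# Illustrative in-context grounding-act classification (see lead-in for assumptions).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical label set; the BridgeKG annotation scheme may differ.
GROUNDING_ACTS = ["request", "provide", "acknowledge", "none"]

# One hypothetical in-context demonstration.
FEW_SHOT_EXAMPLE = (
    "User: Which rooms have a projector?\n"
    "System: Rooms A2 and B1 both have a projector.\n"
    "Grounding act: provide\n"
)


def classify_grounding_act(dialogue_context: str, utterance: str) -> str:
    """Ask the model to assign one grounding-act label to the last utterance."""
    prompt = (
        "Classify the grounding act of the last utterance.\n"
        f"Allowed labels: {', '.join(GROUNDING_ACTS)}\n\n"
        f"{FEW_SHOT_EXAMPLE}\n"
        f"{dialogue_context}\n"
        f"{utterance}\n"
        "Grounding act:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip()


label = classify_grounding_act(
    "User: Is the museum open on Mondays?",
    "System: Yes, it opens at 10 a.m. on Mondays.",
)
```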