Sania Nayab

KGQuest: Template-Driven QA Generation from Knowledge Graphs with LLM-Based Refinement

Nov 14, 2025

Sania Nayab, Marco Simoni, Giulio Rossolini, Andrea Saracino

Abstract: The generation of questions and answers (QA) from knowledge graphs (KGs) plays a crucial role in the development and testing of educational platforms, dissemination tools, and large language models (LLMs). However, existing approaches often struggle with scalability, linguistic quality, and factual consistency. This paper presents a scalable and deterministic pipeline for generating natural language QA from KGs, with an additional refinement step that uses LLMs to further enhance linguistic quality. The approach first clusters KG triplets by relation and creates reusable templates through natural language rules derived from the entity types of objects and relations. A module then leverages LLMs to refine these templates, improving clarity and coherence while preserving factual accuracy. Finally, answer options are instantiated through a selection strategy that introduces distractors from the KG. Our experiments demonstrate that this hybrid approach efficiently generates high-quality QA pairs, combining scalability with fluency and linguistic precision.
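
The pipeline described in the abstract (cluster triplets by relation, fill a per-relation template, draw distractors from the KG) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the toy triplets, the hand-written `TEMPLATES`, and the helper names `cluster_by_relation` and `make_question` are hypothetical and do not come from the paper, and the LLM refinement step is omitted entirely.

```python
import random
from collections import defaultdict

# Toy (subject, relation, object) triplets; hypothetical data, not from the paper.
TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("Rome", "capital_of", "Italy"),
    ("Madrid", "capital_of", "Spain"),
    ("Berlin", "capital_of", "Germany"),
    ("Dante", "born_in", "Florence"),
    ("Galileo", "born_in", "Pisa"),
    ("Verdi", "born_in", "Busseto"),
    ("Fellini", "born_in", "Rimini"),
]

# One hand-written template per relation (assumed wording).
# In the paper, templates come from rules and are then refined by an LLM.
TEMPLATES = {
    "capital_of": "Which country has {subject} as its capital?",
    "born_in": "Where was {subject} born?",
}


def cluster_by_relation(triples):
    """Group (subject, object) pairs under their shared relation."""
    clusters = defaultdict(list)
    for subject, relation, obj in triples:
        clusters[relation].append((subject, obj))
    return clusters


def make_question(subject, relation, answer, clusters, n_distractors=3, rng=None):
    """Fill the relation's template and add distractors from the same cluster."""
    rng = rng or random.Random(0)  # fixed seed keeps the output reproducible
    # Distractors are other objects of the same relation, so they share a type.
    pool = sorted({obj for _, obj in clusters[relation]} - {answer})
    distractors = rng.sample(pool, min(n_distractors, len(pool)))
    options = distractors + [answer]
    rng.shuffle(options)
    return {
        "question": TEMPLATES[relation].format(subject=subject),
        "options": options,
        "answer": answer,
    }


if __name__ == "__main__":
    clusters = cluster_by_relation(TRIPLES)
    print(make_question("Paris", "capital_of", "France", clusters))
```

Drawing distractors from the same relation cluster is one simple way to keep wrong options type-compatible with the correct answer; the paper's actual selection strategy may differ.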

Leveraging Knowledge Graphs and LLMs for Structured Generation of Misinformation

May 30, 2025

Sania Nayab, Marco Simoni, Giulio Rossolini

Abstract: The rapid spread of misinformation, further amplified by recent advances in generative AI, poses significant threats to society, impacting public opinion, democratic stability, and national security. Understanding and proactively assessing these threats requires exploring methodologies that enable structured and scalable misinformation generation. In this paper, we propose a novel approach that leverages knowledge graphs (KGs) as structured semantic resources to systematically generate fake triplets. By analyzing the structural properties of KGs, such as the distance between entities and their predicates, we identify plausibly false relationships. These triplets are then used to guide large language models (LLMs) in generating misinformation statements with varying degrees of credibility. By utilizing structured semantic relationships, our deterministic approach produces misinformation inherently challenging for humans to detect, drawing exclusively upon publicly available KGs (e.g., WikiGraphs). Additionally, we investigate the effectiveness of LLMs in distinguishing between genuine and artificially generated misinformation. Our analysis highlights significant limitations in current LLM-based detection methods, underscoring the necessity for enhanced detection strategies and a deeper exploration of inherent biases in generative models.
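
The structural analysis mentioned in the abstract (using the distance between entities to find plausibly false relationships) can be sketched on a toy graph. This is only an illustration under stated assumptions: the triplets, the undirected breadth-first distance, and the helper names `build_adjacency`, `shortest_distance`, and `candidate_false_triples` are hypothetical, and the paper's actual selection criteria are not reproduced here.

```python
from collections import defaultdict, deque

# Toy (subject, relation, object) triplets; hypothetical data, not from the paper.
TRIPLES = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("bob", "lives_in", "springfield"),
    ("carol", "lives_in", "springfield"),
    ("carol", "works_at", "globex"),
]


def build_adjacency(triples):
    """Undirected adjacency over entities, ignoring relation labels."""
    adj = defaultdict(set)
    for subject, _, obj in triples:
        adj[subject].add(obj)
        adj[obj].add(subject)
    return adj


def shortest_distance(adj, start, goal):
    """Breadth-first search hop count, or None if the goal is unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def candidate_false_triples(triples, relation):
    """List triples absent from the KG, scored by subject-object distance."""
    true_set = set(triples)
    subjects = sorted({s for s, r, _ in triples if r == relation})
    objects = sorted({o for _, r, o in triples if r == relation})
    adj = build_adjacency(triples)
    scored = []
    for s in subjects:
        for o in objects:
            if (s, relation, o) in true_set:
                continue  # skip facts the KG already asserts
            dist = shortest_distance(adj, s, o)
            if dist is not None:
                scored.append((dist, s, relation, o))
    return sorted(scored)  # closest (most plausible-looking) pairs first


if __name__ == "__main__":
    for row in candidate_false_triples(TRIPLES, "works_at"):
        print(row)
```

In this sketch a smaller distance is treated as a proxy for plausibility, which is an assumption; such a ranking could equally serve to build test sets for evaluating detection methods.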

Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost

Jul 29, 2024

Sania Nayab, Giulio Rossolini, Giorgio Buttazzo, Nicolamaria Manes, Fabrizio Giacomelli

Abstract: Today's large language models (LLMs) can solve challenging question-answering tasks, and prompt engineering techniques, such as chain-of-thought (CoT), have gained attention for enhancing the explanation and correctness of outputs. Nevertheless, models require significant time to generate answers augmented with lengthy reasoning details. To address this issue, this paper analyzes the impact of output lengths on LLM inference pipelines and proposes novel metrics to evaluate them in terms of correct conciseness. It also examines the impact of controlling output length through a refined prompt engineering strategy, Constrained-CoT (CCoT), which encourages the model to limit output length. Experiments on pre-trained LLMs demonstrated the benefit of the proposed metrics and the effectiveness of CCoT across different models. For instance, constraining the reasoning of LLaMA2-70b to 100 words improves the accuracy from 36.01% (CoT) to 41.07% (CCoT) on the GSM8K dataset, while reducing the average output length by 28 words.

* Preprint version, under review
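
The idea behind Constrained-CoT, asking the model for step-by-step reasoning while stating a word budget, can be sketched as a small prompt builder plus a length check. The prompt wording, the whitespace-based word count, and the helper names `build_prompt`, `word_count`, and `within_limit` are assumptions for illustration; the paper's exact prompt and metrics may differ. The last helper only recomputes the accuracy gain from the two figures quoted in the abstract.

```python
def build_prompt(question, word_limit=None):
    """Return a CoT prompt, or a length-constrained one if word_limit is set.

    The instruction wording is an assumption, not the paper's exact prompt.
    """
    instruction = "Let's think step by step"
    if word_limit is not None:
        instruction += f" and limit the answer length to {word_limit} words"
    return f"Q: {question}\nA: {instruction}."


def word_count(text):
    """Count whitespace-separated tokens as a rough proxy for output length."""
    return len(text.split())


def within_limit(text, word_limit):
    """Check whether a model output respects the requested word budget."""
    return word_count(text) <= word_limit


def accuracy_gain(cot_accuracy, ccot_accuracy):
    """Difference in percentage points between CCoT and CoT accuracy."""
    return round(ccot_accuracy - cot_accuracy, 2)


if __name__ == "__main__":
    print(build_prompt("What is 12 * 7?", word_limit=100))
    # Figures quoted in the abstract for LLaMA2-70b on GSM8K.
    print(accuracy_gain(36.01, 41.07))
```

With the abstract's figures, the gain works out to 41.07 - 36.01 = 5.06 percentage points.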
