Department of Computer Science and Engineering, University of Texas at Arlington
Abstract: Automated fact-checking benchmarks have largely ignored the challenge of verifying claims against real-world, high-volume structured data, focusing instead on small, curated tables. We introduce a new large-scale, multilingual dataset to address this critical gap. It contains 78,503 synthetic claims grounded in 434 complex OECD tables, which average over 500K rows each. We propose a novel, frame-guided methodology in which algorithms programmatically select significant data points based on six semantic frames to generate realistic claims in English, Chinese, Spanish, and Hindi. Crucially, we demonstrate through knowledge-probing experiments that LLMs have not memorized these facts, forcing systems to perform genuine retrieval and reasoning rather than relying on parametric knowledge. We provide a baseline SQL-generation system and show that our benchmark is highly challenging. Our analysis identifies evidence retrieval as the primary bottleneck, with models struggling to locate the correct data in massive tables. This dataset provides a critical new resource for advancing research on this unsolved, real-world problem.
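As a concrete illustration of what an SQL-generation baseline of this kind involves, the sketch below pairs a toy SQLite table with a hypothetical model-generated query and compares the retrieved value to the claimed one; the table, query, verdict labels, and tolerance are illustrative assumptions, not the paper's actual system.

```python
import sqlite3

def verify_claim(conn, claimed_value, generated_sql, tolerance=1e-6):
    """Execute the model-generated SQL and compare its result to the claimed value."""
    row = conn.execute(generated_sql).fetchone()
    if row is None or row[0] is None:
        return "NOT ENOUGH INFO"   # retrieval failed: no evidence found
    return "SUPPORTED" if abs(float(row[0]) - float(claimed_value)) <= tolerance else "REFUTED"

# Toy stand-in for one OECD-style table (real tables have hundreds of thousands of rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unemployment (country TEXT, year INTEGER, rate REAL)")
conn.executemany("INSERT INTO unemployment VALUES (?, ?, ?)",
                 [("ESP", 2020, 15.5), ("ESP", 2021, 14.8), ("DEU", 2021, 3.6)])

# Hypothetical output of the SQL-generation step for the claim
# "Spain's unemployment rate fell to 14.8% in 2021."
generated_sql = "SELECT rate FROM unemployment WHERE country = 'ESP' AND year = 2021"
print(verify_claim(conn, 14.8, generated_sql))   # -> SUPPORTED
```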
Abstract: Automated Fact-Checking has largely focused on verifying general knowledge against static corpora, overlooking high-stakes domains like law where truth is evolving and technically complex. We introduce CaseFacts, a benchmark for verifying colloquial legal claims against U.S. Supreme Court precedents. Unlike existing resources that map formal texts to formal texts, CaseFacts challenges systems to bridge the semantic gap between layperson assertions and technical jurisprudence while accounting for temporal validity. The dataset consists of 6,294 claims categorized as Supported, Refuted, or Overruled. We construct this benchmark using a multi-stage pipeline that leverages Large Language Models (LLMs) to synthesize claims from expert case summaries, employing a novel semantic similarity heuristic to efficiently identify and verify complex legal overrulings. Experiments with state-of-the-art LLMs reveal that the task remains challenging; notably, augmenting models with unrestricted web search degrades performance compared to closed-book baselines due to the retrieval of noisy, non-authoritative precedents. We release CaseFacts to spur research into legal fact verification systems.
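The sketch below shows one plausible instantiation of such a semantic-similarity heuristic for surfacing candidate overrulings, using the sentence-transformers library; the model name, threshold, and toy texts are assumptions, and in the described pipeline any flagged pair would still be verified downstream.

```python
from sentence_transformers import SentenceTransformer, util

# Model choice and threshold are assumptions for illustration.
model = SentenceTransformer("all-MiniLM-L6-v2")

claims = [
    "States can require residents to buy health insurance.",
    "Separate schools for different races are constitutional.",
]
# Toy summaries of precedents known to have been overruled.
overruled_summaries = [
    "Plessy v. Ferguson upheld racial segregation under 'separate but equal'; "
    "it was later overruled by Brown v. Board of Education.",
]

claim_emb = model.encode(claims, convert_to_tensor=True)
case_emb = model.encode(overruled_summaries, convert_to_tensor=True)
sim = util.cos_sim(claim_emb, case_emb)          # shape: (num_claims, num_cases)

THRESHOLD = 0.45                                  # assumed cut-off
for i, claim in enumerate(claims):
    for j, summary in enumerate(overruled_summaries):
        score = float(sim[i][j])
        if score >= THRESHOLD:
            print(f"Candidate 'Overruled' pair (score {score:.2f}): {claim}")
```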
Abstract: Knowledge graphs (KGs) often contain sufficient information to support the inference of new facts. Identifying logical rules not only improves the completeness of a knowledge graph but also enables the detection of potential errors, reveals subtle data patterns, and enhances the overall capacity for reasoning and interpretation. However, the complexity of such rules, combined with the unique labeling conventions of each KG, can make them difficult for humans to understand. In this paper, we explore the potential of large language models to generate natural language explanations for logical rules. Specifically, we extract logical rules using the AMIE 3.5.1 rule discovery algorithm from the benchmark dataset FB15k-237 and two large-scale datasets, FB-CVT-REV and FB+CVT-REV. We examine various prompting strategies, including zero- and few-shot prompting, the inclusion of variable entity types, and chain-of-thought reasoning. We conduct a comprehensive human evaluation of the generated explanations based on correctness, clarity, and hallucination, and also assess the use of large language models as automatic judges. Our results demonstrate promising performance in terms of explanation correctness and clarity, although several challenges remain for future research. All scripts and data used in this study are publicly available at https://github.com/idirlab/KGRule2NL.
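As a rough illustration, the snippet below formats an AMIE-style Horn rule into a zero-shot explanation prompt, optionally enriched with variable entity types; the rule syntax and prompt wording are assumptions for illustration, not the paper's exact prompts.

```python
def rule_to_prompt(rule: str, entity_types: dict[str, str] | None = None) -> str:
    """Format a mined Horn rule into a zero-shot explanation prompt."""
    lines = [
        "Explain the following logical rule mined from a knowledge graph "
        "in one clear English sentence.",
        f"Rule: {rule}",
    ]
    if entity_types:  # optional variant that includes variable entity types
        typed = ", ".join(f"{var} is a {etype}" for var, etype in entity_types.items())
        lines.append(f"Variable types: {typed}.")
    lines.append("Explanation:")
    return "\n".join(lines)

# Example AMIE-style rule: people tend to work where they live.
rule = "?a livesIn ?b  =>  ?a worksIn ?b"
print(rule_to_prompt(rule, {"?a": "person", "?b": "city"}))
```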
Abstract: Knowledge graph embedding (KGE) models are extensively studied for knowledge graph completion, yet their evaluation remains constrained by unrealistic benchmarks. Commonly used datasets are either faulty or too small to reflect real-world data. Few studies examine the role of mediator nodes, which are essential for modeling n-ary relationships, or investigate model performance variation across domains. Standard evaluation metrics rely on the closed-world assumption, which penalizes models for correctly predicting missing triples, contradicting the fundamental goals of link prediction. These metrics often compress accuracy assessment into a single value, obscuring models' specific strengths and weaknesses. The prevailing evaluation protocol operates under the unrealistic assumption that an entity's properties, for which values are to be predicted, are known in advance. While alternative protocols such as property prediction, entity-pair ranking, and triple classification address some of these limitations, they remain underutilized. This paper conducts a comprehensive evaluation of four representative KGE models on large-scale datasets FB-CVT-REV and FB+CVT-REV. Our analysis reveals critical insights, including substantial performance variations between small and large datasets, both in relative rankings and absolute metrics, systematic overestimation of model capabilities when n-ary relations are binarized, and fundamental limitations in current evaluation protocols and metrics.
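For context, the sketch below reproduces the standard filtered ranking protocol the paper critiques, computing MRR and Hits@10 from per-triple ranks; the scores here are random stand-ins for a real KGE model's output.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities = 1000
test_triples = [(5, 2, 17), (40, 7, 903)]          # (head, relation, tail) ids
known_true_tails = {(5, 2): {17, 21}, (40, 7): {903}}

ranks = []
for h, r, t in test_triples:
    scores = rng.random(num_entities)               # stand-in for model scores over all tails
    # Filtered setting: exclude other known true tails from the ranking.
    for other in known_true_tails[(h, r)] - {t}:
        scores[other] = -np.inf
    rank = 1 + int((scores > scores[t]).sum())
    ranks.append(rank)

ranks = np.array(ranks)
print("MRR:", float((1.0 / ranks).mean()))          # single-value summaries that the
print("Hits@10:", float((ranks <= 10).mean()))      # paper argues obscure model behavior
```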
Abstract: With the vast expansion of content on social media platforms, analyzing and comprehending online discourse has become increasingly complex. This paper introduces LLMTaxo, a novel framework that leverages large language models to automatically construct a taxonomy of factual claims from social media by generating topics at multiple levels of granularity. This approach helps stakeholders navigate the social media landscape more effectively. We implement the framework with different models across three distinct datasets and introduce specially designed taxonomy evaluation metrics for a comprehensive assessment. Evaluations by both human evaluators and GPT-4 indicate that LLMTaxo effectively categorizes factual claims from social media and reveal that certain models perform better on specific datasets.
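A minimal sketch of the kind of prompt such taxonomy construction could rely on is shown below; the level names, prompt wording, and JSON response format are assumptions, not LLMTaxo's actual prompts.

```python
import json

def taxonomy_prompt(claim: str) -> str:
    """Build a prompt asking an LLM to place a claim under three topic levels."""
    return (
        "Assign the factual claim below to one topic at each of three levels of "
        "granularity, from broadest (level 1) to most specific (level 3). "
        'Respond as JSON: {"level_1": ..., "level_2": ..., "level_3": ...}\n'
        f"Claim: {claim}"
    )

claim = "The new vaccine reduces hospitalizations by 90 percent."
print(taxonomy_prompt(claim))

# A hypothetical model response and how it would be parsed:
response = '{"level_1": "Health", "level_2": "Vaccines", "level_3": "Vaccine effectiveness"}'
print(json.loads(response))
```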




Abstract: Frame-semantic parsing is a critical task in natural language understanding, yet the ability of large language models (LLMs) to extract frame-semantic arguments remains underexplored. This paper presents a comprehensive evaluation of LLMs on frame-semantic argument identification, analyzing the impact of input representation formats, model architectures, and generalization to unseen and out-of-domain samples. Our experiments, spanning models from 0.5B to 78B parameters, reveal that JSON-based representations significantly enhance performance, and while larger models generally perform better, smaller models can achieve competitive results through fine-tuning. We also introduce a novel approach to frame identification leveraging predicted frame elements, achieving state-of-the-art performance on ambiguous targets. Despite strong generalization capabilities, our analysis finds that LLMs still struggle with out-of-domain data.
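The sketch below illustrates what a JSON-based input/output representation for frame-semantic argument identification might look like; the field names and the example frame are assumptions rather than the exact schema used in the experiments.

```python
import json

# Illustrative instance: field names and the frame inventory are assumptions.
instance = {
    "sentence": "The company acquired the startup for 2 billion dollars.",
    "target": "acquired",
    "frame": "Getting",
    "frame_elements": ["Recipient", "Theme", "Means"],   # candidate roles to fill
}

prompt = (
    "Identify the text span filling each frame element for the target predicate.\n"
    "Return JSON mapping frame element names to spans (or null if absent).\n"
    f"Input: {json.dumps(instance)}"
)
print(prompt)

# A hypothetical model output in the same JSON format:
output = {"Recipient": "The company", "Theme": "the startup", "Means": None}
print(json.dumps(output, indent=2))
```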
Abstract: Cherry-picking refers to the deliberate selection of evidence or facts that favor a particular viewpoint while ignoring or distorting evidence that supports an opposing perspective. Manually identifying instances of cherry-picked statements in news stories can be challenging, particularly when the opposing viewpoint's story is absent. This study introduces Cherry, an innovative approach for automatically detecting cherry-picked statements in news articles by finding important statements that are missing from the target news story. Cherry analyzes news coverage from multiple sources to identify instances of cherry-picking. Our approach relies on language models that consider contextual information from other news sources to classify statements based on their importance to the event covered in the target news story. Furthermore, this research introduces a novel dataset specifically designed for cherry-picking detection, which was used to train and evaluate the performance of the models. Our best-performing model achieves an F1 score of about 89% in detecting important statements when tested on an unseen set of news stories. Moreover, the results show the importance of incorporating external knowledge from alternative, unbiased narratives when assessing a statement's importance.
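The sketch below captures the overall detection logic in simplified form: statements from alternative coverage of the same event are scored for importance, and important statements absent from the target story are flagged; the importance scorer is a crude stub standing in for the paper's trained language-model classifier, and the toy stories are made up.

```python
def importance_score(statement: str, context: list[str]) -> float:
    """Crude stub: fraction of context statements sharing a word with `statement`.
    A real system would use a fine-tuned language model over statement + context."""
    words = set(statement.lower().split())
    overlap = sum(bool(words & set(s.split())) for s in context)
    return overlap / max(len(context), 1)

def flag_cherry_picking(target_story, alternative_statements, threshold=0.5):
    """Return statements that look important in alternative coverage but are
    missing from the target story (candidate evidence of cherry-picking)."""
    context = [s.lower() for s in alternative_statements]
    target = {t.lower() for t in target_story}
    return [
        s for s in alternative_statements
        if importance_score(s, context) >= threshold and s.lower() not in target
    ]

target = ["The mayor announced a new park."]
alternatives = [
    "The mayor announced a new park.",
    "Funding for the park cuts the library budget in half.",
    "Critics say the park diverts money from the library budget.",
]
print(flag_cherry_picking(target, alternatives))
```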




Abstract: In this paper we present the ClaimBuster dataset of 23,533 statements extracted from all U.S. general election presidential debates and annotated by human coders. The ClaimBuster dataset can be leveraged in building computational methods to identify claims that are worth fact-checking from the myriad of sources of digital or traditional media. The ClaimBuster dataset is publicly available to the research community, and it can be found at http://doi.org/10.5281/zenodo.3609356.
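As one example of how such a dataset can be used, the sketch below trains a simple check-worthiness classifier with scikit-learn; the example sentences and binary labels are illustrative stand-ins, so consult the Zenodo release for the actual schema and label set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy debate sentences with assumed labels: 1 = check-worthy factual claim, 0 = not.
sentences = [
    "Unemployment fell to 3.5 percent last year.",
    "Thank you all for being here tonight.",
    "We spent 700 billion dollars on the program.",
    "I believe in the American people.",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(sentences, labels)

# Probability that a new statement is check-worthy.
print(clf.predict_proba(["The deficit doubled over the last four years."])[:, 1])
```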




Abstract: In the active research area of employing embedding models for knowledge graph completion, particularly for the task of link prediction, most prior studies used two benchmark datasets, FB15k and WN18, in evaluating such models. Most triples in these and other datasets in such studies belong to reverse and duplicate relations, which exhibit high data redundancy due to semantic duplication, correlation, or data incompleteness. This is a case of excessive data leakage---a model is trained using features that otherwise would not be available when the model needs to be applied for real prediction. There are also Cartesian product relations, for which every triple formed by the Cartesian product of applicable subjects and objects is a true fact. Link prediction on the aforementioned relations is easy and can be achieved with even better accuracy using straightforward rules instead of sophisticated embedding models. A more fundamental defect of these models is that the link prediction scenario, given such data, is non-existent in the real world. This paper is the first systematic study with the main objective of assessing the true effectiveness of embedding models when the unrealistic triples are removed. Our experimental results show that these models are much less accurate than previously perceived. Their poor accuracy renders link prediction a task without a truly effective automated solution. Hence, we call for a re-investigation of possible effective approaches.
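The sketch below shows how reverse-relation leakage of the kind described here can be detected with a simple rule: two relations are flagged when most triples of one appear, with head and tail swapped, under the other; the toy triples and the 0.8 overlap threshold are assumptions.

```python
from collections import defaultdict
from itertools import permutations

triples = [
    ("a", "contains", "b"), ("c", "contains", "d"),
    ("b", "contained_by", "a"), ("d", "contained_by", "c"),
    ("a", "likes", "d"),
]

# Group (head, tail) pairs by relation.
pairs_by_rel = defaultdict(set)
for h, r, t in triples:
    pairs_by_rel[r].add((h, t))

# Flag relation pairs whose triples largely mirror each other head-to-tail.
for r1, r2 in permutations(pairs_by_rel, 2):
    swapped = {(t, h) for h, t in pairs_by_rel[r1]}
    overlap = len(swapped & pairs_by_rel[r2]) / len(pairs_by_rel[r1])
    if overlap >= 0.8:   # assumed threshold
        print(f"{r1} looks like the reverse of {r2} (overlap {overlap:.0%})")
```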




Abstract: Entity alignment seeks to find entities in different knowledge graphs (KGs) that refer to the same real-world object. Recent advances in KG embedding have spurred the emergence of embedding-based entity alignment, which encodes entities in a continuous embedding space and measures entity similarities based on the learned embeddings. In this paper, we conduct a comprehensive experimental study of this emerging field. The study surveys 23 recent embedding-based entity alignment approaches and categorizes them based on their techniques and characteristics. We further observe that current approaches use different datasets in evaluation, and that the degree distributions of entities in these datasets are inconsistent with those of real KGs. Hence, we propose a new KG sampling algorithm, with which we generate a set of dedicated benchmark datasets with various heterogeneity and distributions for realistic evaluation. This study also produces an open-source library that includes 12 representative embedding-based entity alignment approaches. We extensively evaluate these approaches on the generated datasets to understand their strengths and limitations. Additionally, for several directions that have not been explored by current approaches, we perform exploratory experiments and report our preliminary findings for future studies. The benchmark datasets, open-source library, and experimental results are all accessible online and will be duly maintained.
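For intuition, the sketch below shows the core retrieval step shared by embedding-based approaches: once entities of two KGs are embedded in a shared space, each source entity is aligned to its nearest neighbor by cosine similarity; the random embeddings and entity names are stand-ins for learned ones.

```python
import numpy as np

rng = np.random.default_rng(0)
kg1_entities = ["Berlin", "Paris", "Tokyo"]
kg2_entities = ["Tokio", "Berlin_(city)", "Paris_(France)"]
emb1 = rng.normal(size=(len(kg1_entities), 64))   # stand-ins for learned embeddings
emb2 = rng.normal(size=(len(kg2_entities), 64))

def normalize(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

# Cosine similarity between every KG1 entity and every KG2 entity.
sim = normalize(emb1) @ normalize(emb2).T

for i, name in enumerate(kg1_entities):
    j = int(sim[i].argmax())                      # nearest-neighbor alignment
    print(f"{name} -> {kg2_entities[j]} (cos = {sim[i, j]:.2f})")
```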