Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jing Tao

Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs

May 21, 2025

Jie Ma, Ning Qu, Zhitao Gao, Rui Xing, Jun Liu, Hongbin Pei, Jiang Xie, Linyun Song, Pinghui Wang, Jing Tao(+1 more)

Abstract:Knowledge graph-based retrieval-augmented generation seeks to mitigate hallucinations in Large Language Models (LLMs) caused by insufficient or outdated knowledge. However, existing methods often fail to fully exploit the prior knowledge embedded in knowledge graphs (KGs), particularly their structural information and explicit or implicit constraints. The former can enhance the faithfulness of LLMs' reasoning, while the latter can improve the reliability of response generation. Motivated by these, we propose a trustworthy reasoning framework, termed Deliberation over Priors (DP), which sufficiently utilizes the priors contained in KGs. Specifically, DP adopts a progressive knowledge distillation strategy that integrates structural priors into LLMs through a combination of supervised fine-tuning and Kahneman-Tversky optimization, thereby improving the faithfulness of relation path generation. Furthermore, our framework employs a reasoning-introspection strategy, which guides LLMs to perform refined reasoning verification based on extracted constraint priors, ensuring the reliability of response generation. Extensive experiments on three benchmark datasets demonstrate that DP achieves new state-of-the-art performance, especially a Hit@1 improvement of 13% on the ComplexWebQuestions dataset, and generates highly trustworthy responses. We also conduct various analyses to verify its flexibility and practicality. The code is available at https://github.com/reml-group/Deliberation-on-Priors.

* Under Review

Via

Access Paper or Ask Questions

FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Apr 02, 2025

Jie Ma, Zhitao Gao, Qi Chai, Jun Liu, Pinghui Wang, Jing Tao, Zhou Su

Figure 1 for FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Figure 2 for FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Figure 3 for FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Figure 4 for FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Abstract:Audio-Visual Question Answering (AVQA) is a challenging multimodal reasoning task requiring intelligent systems to answer natural language queries based on paired audio-video inputs accurately. However, existing AVQA approaches often suffer from overfitting to dataset biases, leading to poor robustness. Moreover, current datasets may not effectively diagnose these methods. To address these challenges, we first introduce a novel dataset, FortisAVQA, constructed in two stages: (1) rephrasing questions in the test split of the public MUSIC-AVQA dataset and (2) introducing distribution shifts across questions. The first stage expands the test space with greater diversity, while the second enables a refined robustness evaluation across rare, frequent, and overall question distributions. Second, we introduce a robust Multimodal Audio-Visual Epistemic Network (MAVEN) that leverages a multifaceted cycle collaborative debiasing strategy to mitigate bias learning. Experimental results demonstrate that our architecture achieves state-of-the-art performance on FortisAVQA, with a notable improvement of 7.81\%. Extensive ablation studies on both datasets validate the effectiveness of our debiasing components. Additionally, our evaluation reveals the limited robustness of existing multimodal QA methods. We also verify the plug-and-play capability of our strategy by integrating it with various baseline models across both datasets. Our dataset and code are available at https://github.com/reml-group/fortisavqa.

* Under Review

Via

Access Paper or Ask Questions

Debate on Graph: a Flexible and Reliable Reasoning Framework for Large Language Models

Sep 05, 2024

Jie Ma, Zhitao Gao, Qi Chai, Wangchun Sun, Pinghui Wang, Hongbin Pei, Jing Tao, Lingyun Song, Jun Liu, Chen Zhang(+1 more)

Figure 1 for Debate on Graph: a Flexible and Reliable Reasoning Framework for Large Language Models

Abstract:Large Language Models (LLMs) may suffer from hallucinations in real-world applications due to the lack of relevant knowledge. In contrast, knowledge graphs encompass extensive, multi-relational structures that store a vast array of symbolic facts. Consequently, integrating LLMs with knowledge graphs has been extensively explored, with Knowledge Graph Question Answering (KGQA) serving as a critical touchstone for the integration. This task requires LLMs to answer natural language questions by retrieving relevant triples from knowledge graphs. However, existing methods face two significant challenges: \textit{excessively long reasoning paths distracting from the answer generation}, and \textit{false-positive relations hindering the path refinement}. In this paper, we propose an iterative interactive KGQA framework that leverages the interactive learning capabilities of LLMs to perform reasoning and Debating over Graphs (DoG). Specifically, DoG employs a subgraph-focusing mechanism, allowing LLMs to perform answer trying after each reasoning step, thereby mitigating the impact of lengthy reasoning paths. On the other hand, DoG utilizes a multi-role debate team to gradually simplify complex questions, reducing the influence of false-positive relations. This debate mechanism ensures the reliability of the reasoning process. Experimental results on five public datasets demonstrate the effectiveness and superiority of our architecture. Notably, DoG outperforms the state-of-the-art method ToG by 23.7\% and 9.1\% in accuracy on WebQuestions and GrailQA, respectively. Furthermore, the integration experiments with various LLMs on the mentioned datasets highlight the flexibility of DoG. Code is available at \url{https://github.com/reml-group/DoG}.

* 12 pages

Via

Access Paper or Ask Questions

Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Aug 18, 2024

Nuo Xu, Pinghui Wang, Junzhou Zhao, Feiyang Sun, Lin Lan, Jing Tao, Li Pan, Xiaohong Guan

Figure 1 for Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Figure 2 for Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Figure 3 for Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Figure 4 for Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Abstract:Legal Judgment Prediction (LJP) aims to automatically predict a law case's judgment results based on the text description of its facts. In practice, the confusing law articles (or charges) problem frequently occurs, reflecting that the law cases applicable to similar articles (or charges) tend to be misjudged. Although some recent works based on prior knowledge solve this issue well, they ignore that confusion also occurs between law articles with a high posterior semantic similarity due to the data imbalance problem instead of only between the prior highly similar ones, which is this work's further finding. This paper proposes an end-to-end model named \textit{D-LADAN} to solve the above challenges. On the one hand, D-LADAN constructs a graph among law articles based on their text definition and proposes a graph distillation operation (GDO) to distinguish the ones with a high prior semantic similarity. On the other hand, D-LADAN presents a novel momentum-updated memory mechanism to dynamically sense the posterior similarity between law articles (or charges) and a weighted GDO to adaptively capture the distinctions for revising the inductive bias caused by the data imbalance problem. We perform extensive experiments to demonstrate that D-LADAN significantly outperforms state-of-the-art methods in accuracy and robustness.

* Accepted by ACM TOIS

Via

Access Paper or Ask Questions

Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Apr 11, 2024

Tao Duan, Junzhou Zhao, Shuo Zhang, Jing Tao, Pinghui Wang

Figure 1 for Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Figure 2 for Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Figure 3 for Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Figure 4 for Representation Learning of Tangled Key-Value Sequence Data for Early Classification

Abstract:Key-value sequence data has become ubiquitous and naturally appears in a variety of real-world applications, ranging from the user-product purchasing sequences in e-commerce, to network packet sequences forwarded by routers in networking. Classifying these key-value sequences is important in many scenarios such as user profiling and malicious applications identification. In many time-sensitive scenarios, besides the requirement of classifying a key-value sequence accurately, it is also desired to classify a key-value sequence early, in order to respond fast. However, these two goals are conflicting in nature, and it is challenging to achieve them simultaneously. In this work, we formulate a novel tangled key-value sequence early classification problem, where a tangled key-value sequence is a mixture of several concurrent key-value sequences with different keys. The goal is to classify each individual key-value sequence sharing a same key both accurately and early. To address this problem, we propose a novel method, i.e., Key-Value sequence Early Co-classification (KVEC), which leverages both inner- and inter-correlations of items in a tangled key-value sequence through key correlation and value correlation to learn a better sequence representation. Meanwhile, a time-aware halting policy decides when to stop the ongoing key-value sequence and classify it based on current sequence representation. Experiments on both real-world and synthetic datasets demonstrate that our method outperforms the state-of-the-art baselines significantly. KVEC improves the prediction accuracy by up to $4.7 - 17.5\%$ under the same prediction earliness condition, and improves the harmonic mean of accuracy and earliness by up to $3.7 - 14.0\%$.

* 12 pages, 31 figures, Accepted by ICDE2024

Via

Access Paper or Ask Questions

High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Mar 02, 2024

Yuya Sasaki, Jing Tao, Yulong Wang

Figure 1 for High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Figure 2 for High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Figure 3 for High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Figure 4 for High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Abstract:Motivated by the empirical power law of the distributions of credits (e.g., the number of "likes") of viral posts in social media, we introduce the high-dimensional tail index regression and methods of estimation and inference for its parameters. We propose a regularized estimator, establish its consistency, and derive its convergence rate. To conduct inference, we propose to debias the regularized estimate, and establish the asymptotic normality of the debiased estimator. Simulation studies support our theory. These methods are applied to text analyses of viral posts in X (formerly Twitter) concerning LGBTQ+.

Via

Access Paper or Ask Questions

TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

Oct 27, 2023

Ziquan Zhu, Jing Tao, Shuihua Wang, Xin Zhang, Yudong Zhang

Figure 1 for TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

Figure 2 for TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

Figure 3 for TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

Figure 4 for TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis

Abstract:This paper proposes applying a novel deep-learning model, TBDLNet, to recognize CT images to classify multidrug-resistant and drug-sensitive tuberculosis automatically. The pre-trained ResNet50 is selected to extract features. Three randomized neural networks are used to alleviate the overfitting problem. The ensemble of three RNNs is applied to boost the robustness via majority voting. The proposed model is evaluated by five-fold cross-validation. Five indexes are selected in this paper, which are accuracy, sensitivity, precision, F1-score, and specificity. The TBDLNet achieves 0.9822 accuracy, 0.9815 specificity, 0.9823 precision, 0.9829 sensitivity, and 0.9826 F1-score, respectively. The TBDLNet is suitable for classifying multidrug-resistant tuberculosis and drug-sensitive tuberculosis. It can detect multidrug-resistant pulmonary tuberculosis as early as possible, which helps to adjust the treatment plan in time and improve the treatment effect.

Via

Access Paper or Ask Questions

Multi-Action Dialog Policy Learning from Logged User Feedback

Feb 27, 2023

Shuo Zhang, Junzhou Zhao, Pinghui Wang, Tianxiang Wang, Zi Liang, Jing Tao, Yi Huang, Junlan Feng

Abstract:Multi-action dialog policy, which generates multiple atomic dialog actions per turn, has been widely applied in task-oriented dialog systems to provide expressive and efficient system responses. Existing policy models usually imitate action combinations from the labeled multi-action dialog examples. Due to data limitations, they generalize poorly toward unseen dialog flows. While reinforcement learning-based methods are proposed to incorporate the service ratings from real users and user simulators as external supervision signals, they suffer from sparse and less credible dialog-level rewards. To cope with this problem, we explore to improve multi-action dialog policy learning with explicit and implicit turn-level user feedback received for historical predictions (i.e., logged user feedback) that are cost-efficient to collect and faithful to real-world scenarios. The task is challenging since the logged user feedback provides only partial label feedback limited to the particular historical dialog actions predicted by the agent. To fully exploit such feedback information, we propose BanditMatch, which addresses the task from a feedback-enhanced semi-supervised learning perspective with a hybrid objective of semi-supervised learning and bandit learning. BanditMatch integrates pseudo-labeling methods to better explore the action space through constructing full label feedback. Extensive experiments show that our BanditMatch outperforms the state-of-the-art methods by generating more concise and informative responses. The source code and the appendix of this paper can be obtained from https://github.com/ShuoZhangXJTU/BanditMatch.

* AAAI 2023

Via

Access Paper or Ask Questions

Federated Learning over Coupled Graphs

Jan 26, 2023

Runze Lei, Pinghui Wang, Junzhou Zhao, Lin Lan, Jing Tao, Chao Deng, Junlan Feng, Xidian Wang, Xiaohong Guan

Abstract:Graphs are widely used to represent the relations among entities. When one owns the complete data, an entire graph can be easily built, therefore performing analysis on the graph is straightforward. However, in many scenarios, it is impractical to centralize the data due to data privacy concerns. An organization or party only keeps a part of the whole graph data, i.e., graph data is isolated from different parties. Recently, Federated Learning (FL) has been proposed to solve the data isolation issue, mainly for Euclidean data. It is still a challenge to apply FL on graph data because graphs contain topological information which is notorious for its non-IID nature and is hard to partition. In this work, we propose a novel FL framework for graph data, FedCog, to efficiently handle coupled graphs that are a kind of distributed graph data, but widely exist in a variety of real-world applications such as mobile carriers' communication networks and banks' transaction networks. We theoretically prove the correctness and security of FedCog. Experimental results demonstrate that our method FedCog significantly outperforms traditional FL methods on graphs. Remarkably, our FedCog improves the accuracy of node classification tasks by up to 14.7%.

* Accepted by IEEE Transactions on Parallel and Distributed Systems

Via

Access Paper or Ask Questions

Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

Sep 07, 2020

Yang Ning, Sida Peng, Jing Tao

Figure 1 for Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

Figure 2 for Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

Figure 3 for Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

Figure 4 for Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data

Abstract:This paper proposes a doubly robust two-stage semiparametric difference-in-difference estimator for estimating heterogeneous treatment effects with high-dimensional data. Our new estimator is robust to model miss-specifications and allows for, but does not require, many more regressors than observations. The first stage allows a general set of machine learning methods to be used to estimate the propensity score. In the second stage, we derive the rates of convergence for both the parametric parameter and the unknown function under a partially linear specification for the outcome equation. We also provide bias correction procedures to allow for valid inference for the heterogeneous treatment effects. We evaluate the finite sample performance with extensive simulation studies. Additionally, a real data analysis on the effect of Fair Minimum Wage Act on the unemployment rate is performed as an illustration of our method. An R package for implementing the proposed method is available on Github.

Via

Access Paper or Ask Questions