Abstract:Although Large Language Models (LLMs) can generate coherent and contextually relevant text, they often struggle to recognise the intent behind a human user's query. Natural Language Understanding (NLU) models, however, interpret the purpose and key information of the user's input to enable responsive interactions. Existing NLU models generally map individual utterances to a dual-level semantic frame, involving sentence-level intent and word-level slot labels. However, real-life conversations are primarily multi-turn and require the interpretation of complex and extended dialogues. Researchers encounter challenges in addressing all facets of multi-turn dialogue with a single unified NLU model. This paper introduces a novel approach, MIDAS, which leverages multi-level intent, domain, and slot knowledge distillation for multi-turn NLU. To achieve this, we construct distinct teachers for varying levels of conversation knowledge, namely, sentence-level intent detection, word-level slot filling, and conversation-level domain classification. These teachers are then fine-tuned to acquire specific knowledge of their designated levels. A multi-teacher loss is proposed to facilitate the combination of these multi-level teachers, guiding a student model in multi-turn dialogue tasks. The experimental results demonstrate the efficacy of our model in improving overall multi-turn conversation understanding, showcasing the potential for advancements in NLU models through the incorporation of multi-level dialogue knowledge distillation techniques.
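The abstract does not give the form of the multi-teacher loss, so the following is only a minimal sketch of one common way to combine several teachers: a weighted sum of temperature-softened KL divergences, one per level. The function names, the per-level weights, the temperature, and the single-token treatment of the slot level are all assumptions made for illustration, not details of MIDAS.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same labels."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def multi_teacher_loss(student_logits, teacher_logits, weights, temperature=2.0):
    """Weighted sum of KL(teacher || student), one term per knowledge level.

    Assumption: the student exposes one output head per level, and each
    teacher supervises only its own level.
    """
    loss = 0.0
    for level, w in weights.items():
        p_teacher = softmax(teacher_logits[level], temperature)
        p_student = softmax(student_logits[level], temperature)
        loss += w * kl_divergence(p_teacher, p_student)
    return loss

# Hypothetical logits for one utterance (slot level reduced to a single token).
teacher = {"intent": [2.0, 0.5, 0.1], "slot": [0.2, 1.5], "domain": [1.0, 1.0, 3.0]}
student = {"intent": [1.0, 1.0, 0.1], "slot": [0.2, 1.5], "domain": [0.5, 1.0, 2.0]}
weights = {"intent": 1.0, "slot": 1.0, "domain": 1.0}
loss = multi_teacher_loss(student, teacher, weights)
```

In this sketch the loss is zero when the student reproduces every teacher distribution and grows as any level diverges; in practice it would typically be added to the supervised task loss.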
Abstract:Federated learning (FL) is a distributed machine learning technique in which multiple clients cooperate to train a shared model without exchanging their raw data. However, heterogeneity of data distribution among clients usually leads to poor model inference. In this paper, a prototype-based federated learning framework is proposed, which can achieve better inference performance with only a few changes to the last global iteration of the typical federated learning process. In the last iteration, the server aggregates the prototypes transmitted from distributed clients and then sends them back to local clients for their respective model inferences. Experiments on two baseline datasets show that our proposal achieves higher accuracy (by at least 1%) and relatively more efficient communication than two popular baselines under different heterogeneous settings.
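The last-iteration step described above can be sketched as follows. The abstract does not say how prototypes are defined, weighted, or used at inference, so this example assumes that a prototype is the per-class mean embedding, that the server averages prototypes weighted by sample count, and that clients classify by nearest prototype; all function names are hypothetical.

```python
from collections import defaultdict
import math

def local_prototypes(features, labels):
    """Client side: return {label: (mean_vector, count)} for local embeddings."""
    sums, counts = {}, defaultdict(int)
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: ([s / counts[y] for s in sums[y]], counts[y]) for y in sums}

def aggregate_prototypes(client_protos):
    """Server side: count-weighted average of each class prototype."""
    sums, totals = {}, defaultdict(int)
    for protos in client_protos:
        for y, (mean, n) in protos.items():
            if y not in sums:
                sums[y] = [0.0] * len(mean)
            sums[y] = [s + n * m for s, m in zip(sums[y], mean)]
            totals[y] += n
    return {y: [s / totals[y] for s in sums[y]] for y in sums}

def predict(global_protos, vec):
    """Client side: label of the nearest global prototype (Euclidean distance)."""
    return min(global_protos, key=lambda y: math.dist(global_protos[y], vec))

# Two clients with skewed label distributions (toy 2-D embeddings).
client_a = local_prototypes([[0.0, 0.0], [2.0, 0.0]], [0, 0])
client_b = local_prototypes([[4.0, 0.0], [10.0, 10.0]], [0, 1])
global_protos = aggregate_prototypes([client_a, client_b])
```

Here client A never sees class 1, yet after receiving `global_protos` it can still assign an embedding near `[10, 10]` to that class, which is the kind of benefit prototype sharing aims for under heterogeneous data.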
Abstract:In contrast to centralized model training that involves data collection, federated learning (FL) enables remote clients to collaboratively train a model without exposing their private data. However, model performance usually degrades in FL due to the heterogeneous data generated by clients with diverse characteristics. One promising strategy to maintain good performance is to limit how far local training drifts from the global model. Previous studies accomplish this by regularizing the distance between the representations learned by the local and global models. However, they only consider representations from the early layers of a model or the layer preceding the output layer. In this study, we introduce FedIntR, which provides a more fine-grained regularization by integrating the representations of intermediate layers into the local training process. Specifically, FedIntR computes a regularization term that encourages closeness between the intermediate layer representations of the local and global models. Additionally, FedIntR automatically determines the contribution of each layer's representation to the regularization term based on the similarity between local and global representations. We conduct extensive experiments on various datasets to show that FedIntR can achieve equivalent or higher performance compared to the state-of-the-art approaches.
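As a rough illustration of a layer-wise, similarity-weighted regularizer, the sketch below measures closeness with cosine similarity and turns the per-layer similarities into weights with a softmax. The abstract does not specify the distance measure, the weighting rule, or even whether more similar layers receive more or less weight, so every one of those choices is an assumption here and should not be read as the FedIntR formula.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def layer_weights(similarities):
    """Assumed weighting rule: softmax over per-layer similarities."""
    exps = [math.exp(s) for s in similarities]
    total = sum(exps)
    return [e / total for e in exps]

def intermediate_regularizer(local_reps, global_reps):
    """Weighted sum over layers of (1 - cosine similarity)."""
    sims = [cosine(l, g) for l, g in zip(local_reps, global_reps)]
    weights = layer_weights(sims)
    return sum(w * (1.0 - s) for w, s in zip(weights, sims))

# One representation vector per intermediate layer (toy values).
local_reps = [[1.0, 0.0], [0.6, 0.8]]
global_reps = [[1.0, 0.0], [0.8, 0.6]]
reg = intermediate_regularizer(local_reps, global_reps)

# A local objective would then combine the task loss with the regularizer;
# mu is a hypothetical trade-off hyperparameter.
mu, task_loss = 0.1, 0.5
local_loss = task_loss + mu * reg
```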
Abstract:Explainable artificial intelligence (XAI) twin systems will be a fundamental enabler of zero-touch network and service management (ZSM) for sixth-generation (6G) wireless networks. A reliable XAI twin system for ZSM requires two components: an extreme analytical ability for discretizing the physical behavior of the Internet of Everything (IoE) and rigorous methods for characterizing the reasoning of such behavior. In this paper, a novel neuro-symbolic explainable artificial intelligence twin framework is proposed to enable trustworthy ZSM for a wireless IoE. The physical space of the XAI twin executes a neural-network-driven multivariate regression to capture the time-dependent wireless IoE environment while determining unconscious decisions of IoE service aggregation. Subsequently, the virtual space of the XAI twin constructs a directed acyclic graph (DAG)-based Bayesian network that can infer a symbolic reasoning score over unconscious decisions through a first-order probabilistic language model. Furthermore, a Bayesian multi-armed bandit-based learning problem is proposed for reducing the gap between the expected explained score and the currently obtained score of the proposed neuro-symbolic XAI twin. To address the challenges of extensible, modular, and stateless management functions in ZSM, the proposed neuro-symbolic XAI twin framework consists of two learning systems: 1) an implicit learner that acts as an unconscious learner in physical space, and 2) an explicit learner that can exploit symbolic reasoning based on implicit learner decisions and prior evidence. Experimental results show that the proposed neuro-symbolic XAI twin can achieve around 96.26% accuracy while guaranteeing an 18% to 44% higher trust score in terms of reasoning and closed-loop automation.
Abstract:One of the core visions of sixth-generation (6G) wireless networks is to incorporate artificial intelligence (AI) for autonomous control of the Internet of Everything (IoE). In particular, the quality of IoE service delivery must be maintained by analyzing contextual metrics of the IoE such as people, data, process, and things. However, challenges arise when the AI model lacks interpretation and intuition for the network service provider. Therefore, this paper provides an explainable artificial intelligence (XAI) framework for quality-aware IoE service delivery that enables both intelligence and interpretation. First, a problem of quality-aware IoE service delivery is formulated by taking into account network dynamics and contextual metrics of IoE, where the objective is to maximize the channel quality index (CQI) of each IoE service user. Second, a regression problem is devised to solve the formulated problem, where explainable coefficients of the contextual matrices are estimated by Shapley value interpretation. Third, the XAI-enabled quality-aware IoE service delivery algorithm is implemented by employing ensemble-based regression models for ensuring the interpretation of contextual relationships among the matrices to reconfigure network parameters. Finally, the experimental results show that the uplink improvement rate reaches 42.43% and 16.32% for AdaBoost and Extra Trees, respectively, while the downlink improvement rate reaches up to 28.57% and 14.29%. However, the AdaBoost-based approach cannot maintain the CQI of IoE service users. Therefore, the proposed Extra Trees-based regression model shows a significant performance gain over other baselines in mitigating the trade-off between accuracy and interpretability.
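To make the idea of Shapley value interpretation concrete, the sketch below computes exact Shapley values for a small regression model by enumerating feature subsets. It is a generic illustration only: a toy linear function stands in for the ensemble regressors named in the abstract, absent features are replaced by a baseline value (one of several possible conventions), and the three features are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Toy stand-in for a fitted regressor over three hypothetical contextual features.
def toy_model(f):
    return 2.0 * f[0] + 3.0 * f[1] - 1.0 * f[2]

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

For a linear model each attribution equals the coefficient times the feature's offset from the baseline, and the attributions sum to `toy_model(x) - toy_model(baseline)`, which gives a simple sanity check before applying the same routine to a non-linear model.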
Abstract:In this work, we present an approach for mining user preferences and making recommendations based on reviews. Various studies have addressed the recommendation problem. However, most of them rely on only one aspect of user-generated content, such as user ratings or user feedback, to state user preferences, and mining a single aspect is insufficient for this purpose. For example, collaborative filtering recommendation tries to identify the preference trend of the crowd of users and then uses that trend to predict the current user's preferences. Therefore, there is a gap between the real user's preferences and the trend of the crowd. Additionally, user preferences can be obtained by mining user reviews, since users often comment on various aspects of products. To solve this problem, we mainly focus on mining product aspects and user aspects inside user reviews to directly state user preferences. We also use Social Network Analysis for the cold-start item problem. For the cold-start user problem, a collaborative filtering algorithm is employed in our work. The framework is general enough to be applied to different recommendation domains. In theory, our method should achieve a significant improvement.
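The "trend of the crowd" that collaborative filtering relies on can be illustrated with a minimal user-based sketch: an unknown rating is predicted as the similarity-weighted average of other users' ratings for the same item. The similarity measure (cosine over co-rated items), the data, and the function names are assumptions for illustration; the abstract does not describe the specific algorithm used.

```python
import math

def cosine_on_shared(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    na = math.sqrt(sum(a[i] ** 2 for i in shared))
    nb = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (na * nb)

def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with non-zero similarity rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine_on_shared(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None

# Toy rating data on a 1-5 scale (hypothetical users and items).
ratings = {
    "u1": {"a": 5, "b": 1},
    "u2": {"a": 5, "b": 1, "c": 4},
    "u3": {"a": 1, "b": 5, "c": 2},
}
pred = predict_rating(ratings, "u1", "c")
```

Because `u1` agrees closely with `u2` and only weakly with `u3`, the prediction lands nearer to `u2`'s rating of 4 than to `u3`'s rating of 2, but it is still pulled toward the crowd, which is the gap the abstract points to.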
Abstract:SentiWordNet is an important lexical resource supporting sentiment analysis in opinion mining applications. In this paper, we propose a novel approach to construct a Vietnamese SentiWordNet (VSWN). SentiWordNet is typically generated from WordNet, in which each synset has numerical scores that indicate its opinion polarities. Many previous studies obtained these scores by applying a machine learning method to WordNet. Unfortunately, a Vietnamese WordNet was not available at the time of writing. Therefore, we propose a method to construct VSWN from a Vietnamese dictionary rather than from WordNet. We show the effectiveness of the proposed method by automatically generating a VSWN with 39,561 synsets. The method is experimentally tested on 266 synsets with respect to positivity and negativity. It attains a result competitive with the English SentiWordNet, with differences of 0.066 and 0.052 for the positivity and negativity sets, respectively.