Peilong Li

Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection

Jan 02, 2025

Kedi Chen, Qin Chen, Jie Zhou, Xinqi Tao, Bowen Ding, Jingwen Xie, Mingchen Xie, Peilong Li, Feng Zheng, Liang He

Abstract: Large Language Models (LLMs) are prone to hallucination, producing non-factual or unfaithful statements, which undermines their use in real-world scenarios. Recent research focuses on uncertainty-based hallucination detection, which uses the output probabilities of LLMs to calculate uncertainty and does not rely on external knowledge or frequent sampling from LLMs. However, most approaches consider only the uncertainty of each independent token, while the intricate semantic relations among tokens and sentences are not well studied. This limits the detection of hallucinations that span multiple tokens and sentences in a passage. In this paper, we propose a method to enhance uncertainty modeling with a semantic graph for hallucination detection. Specifically, we first construct a semantic graph that captures the relations among entity tokens and sentences. Then, we incorporate the relations between two entities for uncertainty propagation to enhance sentence-level hallucination detection. Given that hallucination occurs due to conflict between sentences, we further present a graph-based uncertainty calibration method that integrates the contradiction probability of a sentence with its neighbors in the semantic graph into the uncertainty calculation. Extensive experiments on two datasets demonstrate the advantages of our proposed approach. In particular, we obtain a substantial improvement of 19.78% in passage-level hallucination detection.
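The general idea of uncertainty-based detection with graph calibration can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of mean negative log-probability as sentence uncertainty, the additive calibration rule, and the weight `alpha` are all assumptions made for illustration. The abstract does not specify these details.

```python
import math


def token_uncertainty(prob):
    """Negative log-probability of one generated token (assumed uncertainty measure)."""
    return -math.log(prob)


def sentence_uncertainty(token_probs):
    """Mean token uncertainty over one sentence."""
    return sum(token_uncertainty(p) for p in token_probs) / len(token_probs)


def calibrate(base, neighbors, contradiction, alpha=0.5):
    """Raise each sentence's uncertainty by the average contradiction
    probability with its neighbors in a (hypothetical) semantic graph.

    base          -- list of per-sentence uncertainties
    neighbors     -- dict: sentence index -> list of neighbor indices
    contradiction -- dict: (i, j) -> probability that sentence i contradicts j
    alpha         -- assumed weight of the contradiction term
    """
    calibrated = []
    for i, u in enumerate(base):
        nbrs = neighbors.get(i, [])
        if nbrs:
            conflict = sum(contradiction[(i, j)] for j in nbrs) / len(nbrs)
        else:
            conflict = 0.0  # isolated sentences keep their base uncertainty
        calibrated.append(u + alpha * conflict)
    return calibrated
```

In this sketch, a sentence whose tokens were generated confidently can still be flagged if it conflicts strongly with its graph neighbors, which is the intuition the abstract describes for hallucinations that span several sentences.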

MalIoT: Scalable and Real-time Malware Traffic Detection for IoT Networks

Apr 02, 2023

Ethan Weitkamp, Yusuke Satani, Adam Omundsen, Jingwen Wang, Peilong Li

Abstract: Machine learning is vital in Internet of Things (IoT) malware traffic detection because it can keep pace with the ever-evolving nature of malware. Machine learning algorithms can quickly and accurately analyze the vast amount of data produced by IoT devices, allowing real-time identification of malicious network traffic. The system can handle the exponential growth of IoT devices thanks to distributed systems such as Apache Kafka and Apache Spark, and Intel's oneAPI software stack accelerates model inference, making it a useful tool for real-time malware traffic detection. Together, these technologies create a system that delivers scalable performance and high accuracy, making it a crucial tool for defending against cyber threats in smart communities and medical institutions.

Multi-Task Hierarchical Learning Based Network Traffic Analytics

Jun 05, 2021

Onur Barut, Yan Luo, Tong Zhang, Weigang Li, Peilong Li

Abstract: Classifying network traffic is the basis for important network applications. Prior research in this area has faced challenges with the availability of representative datasets, and many of the results cannot be readily reproduced. This problem is exacerbated by emerging data-driven, machine-learning-based approaches. To address this issue, we present the (Net)² database with three open datasets containing nearly 1.3M labeled flows in total, with a comprehensive list of flow features, for the research community. We focus on broad aspects of network traffic analysis, including both malware detection and application classification. As we continue to grow them, we expect the datasets to serve as a common ground for AI-driven, reproducible research on network flow analytics. We release the datasets publicly and also introduce a Multi-Task Hierarchical Learning (MTHL) model to perform all tasks in a single model. Our results show that MTHL can accurately perform multiple tasks with hierarchical labeling, with a dramatic reduction in training time.
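To make "multiple tasks with hierarchical labeling" concrete, the sketch below expands one hierarchical label per flow into one target per level, so that a single model could be trained with one output per level. The label strings, the `/` separator, and the padding rule are hypothetical and are not taken from the paper or its datasets. The abstract does not describe the MTHL architecture itself.

```python
def split_levels(label, sep="/"):
    """Expand one hierarchical label into its prefix at every depth."""
    parts = label.split(sep)
    return [sep.join(parts[:depth]) for depth in range(1, len(parts) + 1)]


def build_task_targets(labels, sep="/"):
    """Return one target list per hierarchy level (one list per task)."""
    depth = max(len(label.split(sep)) for label in labels)
    targets = [[] for _ in range(depth)]
    for label in labels:
        levels = split_levels(label, sep)
        # Pad shallow labels by repeating their deepest known level
        # (an assumed convention, so every task has a target for every flow).
        levels += [levels[-1]] * (depth - len(levels))
        for k, value in enumerate(levels):
            targets[k].append(value)
    return targets
```

For example, a flow labeled with the hypothetical string `malware/trojan` contributes `malware` to the coarse task and `malware/trojan` to the fine-grained task.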

* 6 pages, 2 figures, 1 table. arXiv admin note: substantial text overlap with arXiv:2004.13006

NetML: A Challenge for Network Traffic Analytics

Apr 25, 2020

Onur Barut, Yan Luo, Tong Zhang, Weigang Li, Peilong Li

Abstract: Classifying network traffic is the basis for important network applications. Prior research in this area has faced challenges with the availability of representative datasets, and many of the results cannot be readily reproduced. This problem is exacerbated by emerging data-driven, machine-learning-based approaches. To address this issue, we provide three open datasets containing almost 1.3M labeled flows in total, with flow features and anonymized raw packets, for the research community. We focus on broad aspects of network traffic analysis, including both malware detection and application classification. We release the datasets in the form of an open challenge called NetML and implement several machine learning methods, including random forest, SVM, and MLP. As we continue to grow NetML, we expect the datasets to serve as a common platform for AI-driven, reproducible research on network flow analytics.

* 27 pages, 39 figures, 7 tables
