Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rui Xi

Anomaly Detection and Generation with Diffusion Models: A Survey

Jun 11, 2025

Yang Liu, Jing Liu, Chengfang Li, Rui Xi, Wenchao Li, Liang Cao, Jin Wang, Laurence T. Yang, Junsong Yuan, Wei Zhou

Abstract:Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing, by identifying unexpected patterns that deviate from established norms in real-world data. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest due to their ability to learn complex data distributions and generate high-fidelity samples, offering a robust framework for unsupervised AD. In this survey, we comprehensively review anomaly detection and generation with diffusion models (ADGDM), presenting a tutorial-style analysis of the theoretical foundations and practical implementations and spanning images, videos, time series, tabular, and multimodal data. Crucially, unlike existing surveys that often treat anomaly detection and generation as separate problems, we highlight their inherent synergistic relationship. We reveal how DMs enable a reinforcing cycle where generation techniques directly address the fundamental challenge of anomaly data scarcity, while detection methods provide critical feedback to improve generation fidelity and relevance, advancing both capabilities beyond their individual potential. A detailed taxonomy categorizes ADGDM methods based on anomaly scoring mechanisms, conditioning strategies, and architectural designs, analyzing their strengths and limitations. We final discuss key challenges including scalability and computational efficiency, and outline promising future directions such as efficient architectures, conditioning strategies, and integration with foundation models (e.g., visual-language models and large language models). By synthesizing recent advances and outlining open research questions, this survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.

* 20 pages, 11 figures, 13 tables

Via

Access Paper or Ask Questions

Outlier Detection Bias Busted: Understanding Sources of Algorithmic Bias through Data-centric Factors

Aug 24, 2024

Xueying Ding, Rui Xi, Leman Akoglu

Abstract:The astonishing successes of ML have raised growing concern for the fairness of modern methods when deployed in real world settings. However, studies on fairness have mostly focused on supervised ML, while unsupervised outlier detection (OD), with numerous applications in finance, security, etc., have attracted little attention. While a few studies proposed fairness-enhanced OD algorithms, they remain agnostic to the underlying driving mechanisms or sources of unfairness. Even within the supervised ML literature, there exists debate on whether unfairness stems solely from algorithmic biases (i.e. design choices) or from the biases encoded in the data on which they are trained. To close this gap, this work aims to shed light on the possible sources of unfairness in OD by auditing detection models under different data-centric factors. By injecting various known biases into the input data -- as pertain to sample size disparity, under-representation, feature measurement noise, and group membership obfuscation -- we find that the OD algorithms under the study all exhibit fairness pitfalls, although differing in which types of data bias they are more susceptible to. Most notable of our study is to demonstrate that OD algorithm bias is not merely a data bias problem. A key realization is that the data properties that emerge from bias injection could as well be organic -- as pertain to natural group differences w.r.t. sparsity, base rate, variance, and multi-modality. Either natural or biased, such data properties can give rise to unfairness as they interact with certain algorithmic design choices.

* 11 pages, 5 figures, 16 appendix pages, accepted at AIES 2024

Via

Access Paper or Ask Questions

Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation

Jul 10, 2024

Wentao Xiao, Yueyang Zhan, Rui Xi, Mengshu Hou, Jianming Liao

Figure 1 for Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation

Figure 2 for Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation

Figure 3 for Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation

Figure 4 for Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation

Abstract:The approximate nearest neighbor search (ANNS) is a fundamental and essential component in information retrieval, with graph-based methodologies demonstrating superior performance compared to alternative approaches. Extensive research efforts have been dedicated to improving search efficiency by developing various graph-based indices, such as HNSW (Hierarchical Navigable Small World). However, the performance of HNSW and most graph-based indices becomes unacceptable when faced with a large number of real-time deletions, insertions, and updates. Furthermore, during update operations, HNSW can result in some data points becoming unreachable, a situation we refer to as the `unreachable points phenomenon'. This phenomenon could significantly affect the search accuracy of the graph in certain situations. To address these issues, we present efficient measures to overcome the shortcomings of HNSW, specifically addressing poor performance over long periods of delete and update operations and resolving the issues caused by the unreachable points phenomenon. Our proposed MN-RU algorithm effectively improves update efficiency and suppresses the growth rate of unreachable points, ensuring better overall performance and maintaining the integrity of the graph. Our results demonstrate that our methods outperform existing approaches. Furthermore, since our methods are based on HNSW, they can be easily integrated with existing indices widely used in the industrial field, making them practical for future real-world applications. Code is available at https://github.com/xwt1/ICPADS-MN-RU.git

Via

Access Paper or Ask Questions

mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

Aug 12, 2023

Jia Zhang, Xin Na, Rui Xi, Yimiao Sun, Yuan He

Figure 1 for mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

Figure 2 for mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

Figure 3 for mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

Figure 4 for mmHawkeye: Passive UAV Detection with a COTS mmWave Radar

Abstract:Small Unmanned Aerial Vehicles (UAVs) are becoming potential threats to security-sensitive areas and personal privacy. A UAV can shoot photos at height, but how to detect such an uninvited intruder is an open problem. This paper presents mmHawkeye, a passive approach for UAV detection with a COTS millimeter wave (mmWave) radar. mmHawkeye doesn't require prior knowledge of the type, motions, and flight trajectory of the UAV, while exploiting the signal feature induced by the UAV's periodic micro-motion (PMM) for long-range accurate detection. The design is therefore effective in dealing with low-SNR and uncertain reflected signals from the UAV. mmHawkeye can further track the UAV's position with dynamic programming and particle filtering, and identify it with a Long Short-Term Memory (LSTM) based detector. We implement mmHawkeye on a commercial mmWave radar and evaluate its performance under varied settings. The experimental results show that mmHawkeye has a detection accuracy of 95.8% and can realize detection at a range up to 80m.

* 9 pages, 14 figures, IEEE SECON2023

Via

Access Paper or Ask Questions