Abstract:Chain-of-Thought (CoT) prompting has achieved remarkable success in unlocking the reasoning capabilities of Large Language Models (LLMs). Although CoT prompting enhances reasoning, its verbosity imposes substantial computational overhead. Recent works that compress reasoning into latent representations focus primarily on outcome alignment and provide little supervision over the intermediate reasoning process, leaving the latent reasoning chain difficult to analyze. To address these challenges, we introduce Render-of-Thought (RoT), the first framework to reify the reasoning chain by rendering textual steps into images, making the latent rationale explicit and traceable. Specifically, we leverage the vision encoders of existing Vision Language Models (VLMs) as semantic anchors to align the vision embeddings with the textual space. This design enables plug-and-play integration without incurring additional pre-training overhead. Extensive experiments on mathematical and logical reasoning benchmarks demonstrate that our method achieves 3-4x token compression and substantial inference acceleration compared to explicit CoT. Furthermore, it maintains competitive performance against other methods, validating the feasibility of this paradigm. Our code is available at https://github.com/TencentBAC/RoT
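For illustration, here is a minimal sketch of the render-then-encode idea: one textual reasoning step is rasterized to an image and passed through a frozen VLM vision encoder, with a linear projection standing in for RoT's alignment to the textual space. The CLIP checkpoint, canvas size, and target hidden size (4096) are assumptions for the sketch, not details from the paper.

```python
# Hypothetical sketch of rendering a reasoning step and encoding it with a VLM
# vision encoder; checkpoint, canvas size, and hidden sizes are assumptions.
import torch
from PIL import Image, ImageDraw
from transformers import CLIPImageProcessor, CLIPVisionModel

def render_step(step_text: str, size=(336, 336)) -> Image.Image:
    """Rasterize a single textual reasoning step onto a white canvas."""
    img = Image.new("RGB", size, "white")
    ImageDraw.Draw(img).text((8, 8), step_text, fill="black")
    return img

processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")
encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
project = torch.nn.Linear(encoder.config.hidden_size, 4096)  # assumed LLM hidden size

pixels = processor(images=render_step("Step 1: 12 * 7 = 84"), return_tensors="pt").pixel_values
with torch.no_grad():
    patch_states = encoder(pixel_values=pixels).last_hidden_state  # (1, patches + 1, d_vision)
visual_reasoning_tokens = project(patch_states)                    # tokens handed to the LLM
```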
Abstract:Existing monocular 3D detectors typically tame the pronounced nonlinearity of 3D bounding box regression through a decoupled prediction paradigm, which employs multiple branches to estimate the geometric center, depth, dimensions, and rotation angle separately. Although this decoupling strategy simplifies the learning process, it inherently ignores the collaborative geometric constraints among attributes, depriving the model of a geometric consistency prior and leading to suboptimal performance. To address this issue, we propose a novel Spatial-Projection Alignment (SPAN) method with two pivotal components: (i) Spatial Point Alignment enforces an explicit global spatial constraint between the predicted and ground-truth 3D bounding boxes, thereby rectifying the spatial drift caused by decoupled attribute regression; (ii) 3D-2D Projection Alignment ensures that the projected 3D box fits tightly within its corresponding 2D detection box on the image plane, mitigating the projection misalignment overlooked in previous works. To ensure training stability, we further introduce a Hierarchical Task Learning strategy that progressively incorporates spatial-projection alignment as the 3D attribute predictions refine, preventing early-stage error propagation across attributes. Extensive experiments demonstrate that the proposed method can be easily integrated into any established monocular 3D detector and delivers significant performance improvements.
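As a sketch of how the two alignment terms could be instantiated, the snippet below compares 3D box corner points and penalizes projected corners that leave the matched 2D box. The corner parameterization and the L1/hinge forms are assumptions, not the paper's exact losses.

```python
# Illustrative stand-in for the Spatial Point / Projection Alignment losses.
import torch

def box3d_corners(center, dims, yaw):
    """Corners (N, 8, 3) of yaw-rotated boxes from center (N, 3), dims (N, 3: l, w, h), yaw (N,)."""
    l, w, h = dims[:, 0], dims[:, 1], dims[:, 2]
    x = torch.stack([l, l, -l, -l, l, l, -l, -l], dim=1) / 2
    y = torch.stack([w, -w, -w, w, w, -w, -w, w], dim=1) / 2
    z = torch.stack([h, h, h, h, -h, -h, -h, -h], dim=1) / 2
    corners = torch.stack([x, y, z], dim=-1)                                # (N, 8, 3)
    cos, sin = torch.cos(yaw), torch.sin(yaw)
    rot = torch.stack([torch.stack([cos, -sin], -1),
                       torch.stack([sin, cos], -1)], dim=1)                 # (N, 2, 2)
    xy = corners[..., :2] @ rot.transpose(1, 2)
    return torch.cat([xy, corners[..., 2:]], dim=-1) + center[:, None, :]

def spatial_point_alignment(pred_box, gt_box):
    """Global constraint: L1 distance between predicted and ground-truth corner points."""
    return (box3d_corners(*pred_box) - box3d_corners(*gt_box)).abs().mean()

def projection_alignment(corners_3d, K, box2d):
    """Penalize projected 3D corners falling outside the matched 2D box (x1, y1, x2, y2)."""
    uvw = corners_3d @ K.t()
    uv = uvw[..., :2] / uvw[..., 2:3].clamp(min=1e-6)
    x1, y1, x2, y2 = box2d.unbind(dim=-1)
    excess_x = (x1[:, None] - uv[..., 0]).clamp(min=0) + (uv[..., 0] - x2[:, None]).clamp(min=0)
    excess_y = (y1[:, None] - uv[..., 1]).clamp(min=0) + (uv[..., 1] - y2[:, None]).clamp(min=0)
    return (excess_x + excess_y).mean()
```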
Abstract:Medical vision-language models (Med-VLMs) have shown impressive results in tasks such as report generation and visual question answering, but they still face several limitations. Most notably, they underutilize patient metadata and lack integration of clinical diagnostic knowledge. Moreover, most existing models are typically trained from scratch or fine-tuned on large-scale 2D image-text pairs, requiring extensive computational resources, and their effectiveness on 3D medical imaging is often limited due to the absence of structural information. To address these gaps, we propose a data-efficient fine-tuning pipeline to adapt 3D CT-based Med-VLMs for 3D MRI and demonstrate its application in Alzheimer's disease (AD) diagnosis. Our system introduces two key innovations. First, we convert structured metadata into synthetic reports, enriching textual input for improved image-text alignment. Second, we add an auxiliary token trained to predict the mini-mental state examination (MMSE) score, a widely used clinical measure of cognitive function that correlates with AD severity. This provides additional supervision for fine-tuning. Applying lightweight prompt tuning to both image and text modalities, our approach achieves state-of-the-art performance on two AD datasets using 1,500 training images, outperforming existing methods fine-tuned on 10,000 images. Code will be released upon publication.
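To make the two innovations concrete, here is a small illustrative sketch: a synthetic-report template built from structured metadata and an auxiliary head that regresses the MMSE score from a dedicated token. The field names, template wording, and layer sizes are assumptions, since the abstract does not specify them.

```python
# Illustrative sketch only; field names, template, and head design are assumptions.
import torch
import torch.nn as nn

def metadata_to_report(meta: dict) -> str:
    """Turn structured patient metadata into a synthetic free-text report."""
    return (f"{meta['age']}-year-old {meta['sex']} patient. "
            f"Education: {meta['education_years']} years. "
            f"APOE4 status: {meta['apoe4']}. Clinical visit: {meta['visit']}.")

class MMSEHead(nn.Module):
    """Auxiliary head regressing the MMSE score from an added token's hidden state."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(hidden_size, hidden_size // 2),
                                 nn.GELU(), nn.Linear(hidden_size // 2, 1))

    def forward(self, mmse_token_state: torch.Tensor) -> torch.Tensor:
        # MMSE ranges from 0 to 30; a scaled sigmoid keeps predictions in range.
        return 30.0 * torch.sigmoid(self.mlp(mmse_token_state)).squeeze(-1)

report = metadata_to_report({"age": 72, "sex": "female", "education_years": 16,
                             "apoe4": "carrier", "visit": "baseline"})
mmse_pred = MMSEHead(hidden_size=768)(torch.randn(4, 768))  # (batch,) predicted scores
```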




Abstract:Deep learning has shown significant potential in diagnosing neurodegenerative diseases from MRI data. However, most existing methods rely heavily on large volumes of labeled data and often yield representations that lack interpretability. To address both challenges, we propose a novel self-supervised cross-encoder framework that leverages the temporal continuity in longitudinal MRI scans for supervision. This framework disentangles learned representations into two components: a static representation, constrained by contrastive learning, which captures stable anatomical features; and a dynamic representation, guided by input-gradient regularization, which reflects temporal changes and can be effectively fine-tuned for downstream classification tasks. Experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that our method achieves superior classification accuracy and improved interpretability. Furthermore, the learned representations exhibit strong zero-shot generalization on the Open Access Series of Imaging Studies (OASIS) dataset and cross-task generalization on the Parkinson Progression Marker Initiative (PPMI) dataset. The code for the proposed method will be made publicly available.
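A minimal sketch of the two constraints described above, with assumed tensor shapes: the cross-encoder architecture and the exact form of the regularizer are not specified in the abstract, so the InfoNCE-style contrastive term and the L1 input-gradient penalty below are illustrative choices.

```python
# Sketch of the static (contrastive) and dynamic (input-gradient) constraints.
import torch
import torch.nn.functional as F

def static_contrastive_loss(z_static_t0, z_static_t1, temperature=0.1):
    """Pull together static codes of the same subject at two timepoints,
    push apart codes of different subjects in the batch."""
    z0 = F.normalize(z_static_t0, dim=-1)
    z1 = F.normalize(z_static_t1, dim=-1)
    logits = z0 @ z1.t() / temperature                      # (B, B) similarity matrix
    targets = torch.arange(z0.size(0), device=z0.device)
    return F.cross_entropy(logits, targets)

def input_gradient_penalty(z_dynamic, mri_input):
    """Regularize how the dynamic code depends on the input scan (mri_input requires grad)."""
    grads = torch.autograd.grad(z_dynamic.sum(), mri_input, create_graph=True)[0]
    return grads.abs().mean()
```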
Abstract:Supervised fine-tuning (SFT) aligns large language models (LLMs) with human intent by training them on labeled task-specific data. Recent studies have shown that malicious attackers can inject backdoors into these models by embedding triggers into harmful question-answer (QA) pairs. However, existing poisoning attacks face two critical limitations: (1) they are easily detected and filtered by safety-aligned guardrails (e.g., LLaMAGuard), and (2) embedding harmful content can undermine the model's safety alignment, resulting in high attack success rates (ASR) even when no trigger is present during inference, thus compromising stealthiness. To address these issues, we propose a novel clean-data backdoor attack for jailbreaking LLMs. Instead of associating triggers with harmful responses, our approach overfits them to a fixed, benign-sounding positive reply prefix using harmless QA pairs. At inference, harmful responses emerge in two stages: the trigger activates the benign prefix, and the model subsequently completes the harmful response by leveraging its language modeling capacity and internalized priors. To further improve attack efficacy, we employ gradient-based coordinate optimization to refine the universal trigger. Extensive experiments demonstrate that our method can effectively backdoor various LLMs for jailbreaking even under the detection of guardrail models, achieving ASRs of 86.67% and 85% on LLaMA-3-8B and Qwen-2.5-7B, respectively, as judged by GPT-4o.
Abstract:Unsupervised pathology detection trains models on non-pathological data to flag deviations as pathologies, offering strong generalizability for identifying novel diseases while avoiding costly annotations. However, building reliable normality models requires vast healthy datasets: hospital data is inherently biased toward symptomatic populations, and privacy regulations hinder the assembly of representative healthy cohorts. To address this limitation, we propose PathoSCOPE, a few-shot unsupervised pathology detection framework that requires only a small set of non-pathological samples (as few as 2 shots), significantly improving data efficiency. We introduce a Global-Local Contrastive Loss (GLCL), comprising a Local Contrastive Loss that reduces the variability of non-pathological embeddings and a Global Contrastive Loss that enhances the discrimination of pathological regions. We also propose a Pathology-informed Embedding Generation (PiEG) module that synthesizes pathological embeddings guided by the global loss, better exploiting the limited non-pathological samples. Evaluated on the BraTS2020 and ChestXray8 datasets, PathoSCOPE achieves state-of-the-art performance among unsupervised methods while maintaining computational efficiency (2.48 GFLOPs, 166 FPS).
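The analytical form of GLCL is not given in the abstract; the center-and-margin stand-in below is only meant to illustrate the intended roles of the local and global terms under assumed inputs.

```python
# Stand-in sketch for the local (compactness) and global (separation) terms of GLCL.
import torch
import torch.nn.functional as F

def global_local_contrastive(normal_emb, synth_patho_emb, margin=1.0):
    """Local term: shrink the spread of non-pathological embeddings around their mean.
    Global term: push synthesized pathological embeddings (e.g., from PiEG) away from it."""
    center = normal_emb.mean(dim=0, keepdim=True)
    local = (normal_emb - center).pow(2).sum(dim=-1).mean()
    dist_patho = (synth_patho_emb - center).pow(2).sum(dim=-1).sqrt()
    global_term = F.relu(margin - dist_patho).mean()
    return local + global_term
```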
Abstract:Time Series Anomaly Detection (TSAD) is essential for uncovering rare and potentially harmful events in unlabeled time series data. Existing methods are highly dependent on clean, high-quality inputs, making them susceptible to noise and real-world imperfections. Additionally, intricate temporal relationships in time series data are often inadequately captured in traditional 1D representations, leading to suboptimal modeling of dependencies. We introduce VISTA, a training-free, unsupervised TSAD algorithm designed to overcome these challenges. VISTA features three core modules: 1) Time Series Decomposition using Seasonal and Trend Decomposition via Loess (STL) to decompose noisy time series into trend, seasonal, and residual components; 2) Temporal Self-Attention, which transforms 1D time series into 2D temporal correlation matrices for richer dependency modeling and anomaly detection; and 3) Multivariate Temporal Aggregation, which uses a pretrained feature extractor to integrate cross-variable information into a unified, memory-efficient representation. VISTA's training-free approach enables rapid deployment and easy hyperparameter tuning, making it suitable for industrial applications. It achieves state-of-the-art performance on five multivariate TSAD benchmarks.
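A rough sketch of the first two VISTA stages under assumed settings: a statsmodels STL decomposition with a guessed period, followed by lifting a 1D residual window into a 2D correlation matrix. The paper's exact configuration, scoring, and pretrained feature extractor are not described in the abstract.

```python
# Illustrative STL decomposition plus a simple 2D temporal correlation matrix.
import numpy as np
from statsmodels.tsa.seasonal import STL

series = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.1 * np.random.randn(1000)
decomp = STL(series, period=100).fit()          # trend / seasonal / residual components
residual = decomp.resid

# "Temporal self-attention": lift a 1D window into a 2D temporal correlation matrix.
window = residual[-128:]
corr2d = np.outer(window, window) / (np.linalg.norm(window) ** 2 + 1e-8)   # (128, 128)
anomaly_score = np.abs(residual[-1])            # placeholder per-point score for illustration
```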




Abstract:Diffusion models have shown great promise in solving inverse problems in image processing. In this paper, we propose a novel, problem-agnostic diffusion approach to inverse problems based on a maximum a posteriori (MAP) guided-term estimation method. Following Bayes' rule, we divide the conditional score function into two terms: the unconditional score function and the guided term. The unconditional score function is approximated by an existing score network, while the guided term is estimated with our MAP-based method. To estimate the guided term, we assume that the space of clean natural images is inherently smooth and introduce a MAP estimate of the $t$-th latent variable. We then substitute this estimate into the formulation of the inverse problem to obtain an approximation of the guided term. We evaluate our method extensively on super-resolution, inpainting, and denoising tasks, and demonstrate performance comparable to DDRM, DMPS, DPS, and $\Pi$GDM.
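The split referred to above is the standard Bayes decomposition of the conditional score; the paper's contribution lies in how the second (likelihood) term is approximated via the MAP estimate, which is only indicated schematically here.

```latex
\nabla_{x_t}\log p_t(x_t \mid y)
  \;=\; \underbrace{\nabla_{x_t}\log p_t(x_t)}_{\text{unconditional score (pretrained network)}}
  \;+\; \underbrace{\nabla_{x_t}\log p_t(y \mid x_t)}_{\text{guided term (MAP-based estimate)}}
```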
Abstract:Single-modal object detection often suffers performance degradation in diverse scenarios. In contrast, multimodal object detection can offer more comprehensive information about object features by integrating data from various modalities. Current multimodal object detection methods generally apply fusion techniques, ranging from conventional neural networks to transformer-based models, to implement feature fusion strategies and exploit complementary information. However, because multimodal images are captured by different sensors, misalignments between them are common, making direct matching challenging and hindering the establishment of strong correlations for the same object across modalities. In this paper, we propose a novel approach called the CrOss-Mamba interaction and Offset-guided fusion (COMO) framework for multimodal object detection. COMO employs the cross-mamba technique to formulate feature interaction equations, enabling multimodal serialized state computation that yields interactive fusion outputs while reducing computational overhead and improving efficiency. In addition, COMO leverages high-level features, which are less affected by misalignment, to facilitate interaction and transfer complementary information between modalities, addressing the positional offsets caused by variations in camera angles and capture times. Furthermore, COMO incorporates a global and local scanning mechanism in the cross-mamba module to capture features with local correlation, which is particularly important in remote sensing images. To preserve low-level features, the offset-guided fusion mechanism ensures effective multiscale feature utilization, allowing the construction of a multiscale fusion data cube that enhances detection performance.
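Purely as an interpretation of the offset-guided idea, the sketch below predicts a per-location offset field from the concatenated modalities and warps one modality's features before fusion. The abstract does not define this operator, so the module name, layer choices, and warping scheme are all assumptions rather than COMO's actual design.

```python
# Hypothetical offset-guided fusion: warp one modality by predicted offsets, then fuse.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OffsetGuidedFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.offset_head = nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_rgb, feat_ir):
        b, _, h, w = feat_rgb.shape
        offsets = self.offset_head(torch.cat([feat_rgb, feat_ir], dim=1))     # (B, 2, H, W)
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=feat_rgb.device),
                                torch.linspace(-1, 1, w, device=feat_rgb.device),
                                indexing="ij")
        base_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        grid = base_grid + offsets.permute(0, 2, 3, 1)                        # shift IR sampling
        feat_ir_aligned = F.grid_sample(feat_ir, grid, align_corners=True)
        return self.fuse(torch.cat([feat_rgb, feat_ir_aligned], dim=1))
```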




Abstract:Monocular 3D object detection has attracted great attention due to its simplicity and low cost. Existing methods typically follow conventional 2D detection paradigms, first locating object centers and then predicting 3D attributes from neighboring features. However, these methods predominantly rely on progressive cross-scale feature aggregation and focus solely on local information, which may result in a lack of global awareness and the omission of small-scale objects. In addition, due to the large variation in object scales across scenes and depths, inaccurate receptive fields often introduce background noise and degrade feature representation. To address these issues, we introduce MonoASRH, a novel monocular 3D detection framework composed of an Efficient Hybrid Feature Aggregation Module (EH-FAM) and an Adaptive Scale-Aware 3D Regression Head (ASRH). Specifically, EH-FAM employs multi-head attention with a global receptive field to extract semantic features for small-scale objects and leverages lightweight convolutional modules to efficiently aggregate visual features across different scales. ASRH encodes 2D bounding box dimensions and then fuses the resulting scale features with the semantic features aggregated by EH-FAM through a scale-semantic feature fusion module. This fusion module guides ASRH in learning dynamic receptive field offsets, incorporating scale priors into 3D position prediction for better scale awareness. Extensive experiments on the KITTI and Waymo datasets demonstrate that MonoASRH achieves state-of-the-art performance.
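To illustrate the idea behind ASRH, the sketch below encodes a 2D box-dimension map, fuses it with semantic features, and uses the result to predict offsets for a deformable convolution that regresses one 3D attribute. The layer sizes, the single depth output, and the fusion form are assumptions, not the paper's exact head.

```python
# Illustrative scale-aware regression head using a deformable convolution.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class ScaleAwareHead(nn.Module):
    def __init__(self, channels: int = 64, kernel_size: int = 3):
        super().__init__()
        self.scale_enc = nn.Sequential(nn.Conv2d(2, channels, 1), nn.ReLU())   # (w, h) of 2D box
        self.offset_pred = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size, 3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=kernel_size // 2)
        self.depth = nn.Conv2d(channels, 1, 1)                                 # one 3D attribute

    def forward(self, semantic_feat, box2d_wh_map):
        scale_feat = self.scale_enc(box2d_wh_map)                  # (B, C, H, W)
        fused = torch.cat([semantic_feat, scale_feat], dim=1)
        offsets = self.offset_pred(fused)                          # dynamic receptive field offsets
        adapted = self.deform(semantic_feat, offsets)
        return self.depth(adapted)

head = ScaleAwareHead()
out = head(torch.randn(2, 64, 96, 320), torch.rand(2, 2, 96, 320))   # (2, 1, 96, 320)
```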