Yuting Jiang

Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models

Jan 24, 2025

Yile Gu, Yifan Xiong, Jonathan Mace, Yuting Jiang, Yigong Hu, Baris Kasikci, Peng Cheng

Abstract: Observability in cloud infrastructure is critical for service providers, driving the widespread adoption of anomaly detection systems for monitoring metrics. However, existing systems often struggle to simultaneously achieve explainability, reproducibility, and autonomy, three properties that are indispensable for production use. We introduce Argos, an agentic system for detecting time-series anomalies in cloud infrastructure by leveraging large language models (LLMs). Argos uses explainable and reproducible anomaly rules as an intermediate representation and employs LLMs to generate such rules autonomously. The system efficiently trains error-free and accuracy-guaranteed anomaly rules through multiple collaborative agents and deploys the trained rules for low-cost online anomaly detection. Our evaluation shows that Argos outperforms state-of-the-art methods, increasing $F_1$ scores by up to $9.5\%$ and $28.3\%$ on public anomaly detection datasets and an internal dataset collected from Microsoft, respectively.
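
To make the idea of an explainable, reproducible anomaly rule concrete, here is a minimal sketch of what such a rule could look like as a plain Python function. It is an illustration only and not taken from Argos: the function name `spike_rule`, the trailing-window z-score logic, and the default `window` and `threshold` values are all assumptions chosen for this example.

```python
from statistics import mean, pstdev
from typing import List


def spike_rule(values: List[float], window: int = 5, threshold: float = 3.0) -> List[bool]:
    """Hypothetical rule: flag a point that deviates from the mean of the
    preceding `window` points by more than `threshold` standard deviations."""
    flags = [False] * len(values)
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            flags[i] = values[i] != mu
        else:
            flags[i] = abs(values[i] - mu) > threshold * sigma
    return flags


# Toy metric series (made-up values) with one obvious spike at index 6.
series = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0]
flags = spike_rule(series)
print([i for i, flagged in enumerate(flags) if flagged])
```

A rule in this form is deterministic, so the same input always yields the same flags, and the condition that fired can be read directly from the code. Those are the two properties the abstract names as explainability and reproducibility.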

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Jan 23, 2025

Zhenghao Lin, Zihao Tang, Xiao Liu, Yeyun Gong, Yi Cheng, Qi Chen, Hang Li, Ying Xin, Ziyue Yang, Kailai Yang(+24 more)

Abstract: We introduce Sigma, an efficient large language model specialized for the system domain, empowered by a novel architecture including DiffQKV attention, and pre-trained on our meticulously collected system domain data. DiffQKV attention significantly enhances the inference efficiency of Sigma by optimizing the Query (Q), Key (K), and Value (V) components in the attention mechanism differentially, based on their varying impacts on model performance and efficiency indicators. Specifically, we (1) conduct extensive experiments that demonstrate the model's varying sensitivity to the compression of the K and V components, leading to the development of differentially compressed KV, and (2) propose augmented Q to expand the Q head dimension, which enhances the model's representation capacity with minimal impact on inference speed. Rigorous theoretical and empirical analyses reveal that DiffQKV attention significantly enhances efficiency, achieving up to a 33.36% improvement in inference speed over conventional grouped-query attention (GQA) in long-context scenarios. We pre-train Sigma on 6T tokens from various sources, including 19.5B tokens of system domain data that we carefully collected and 1T tokens of synthesized and rewritten data. In general domains, Sigma achieves performance comparable to other state-of-the-art models. In the system domain, we introduce the first comprehensive benchmark, AIMicius, where Sigma demonstrates remarkable performance across all tasks, significantly outperforming GPT-4 with an absolute improvement of up to 52.5%.
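
The effect of compressing K and V differently can be illustrated with simple KV-cache arithmetic. The sketch below is a hypothetical calculation, not Sigma's actual configuration: the helper `kv_cache_bytes`, the layer count, head counts, head dimensions, and sequence length are all assumed values, and it assumes that "differential compression" is realized by keeping fewer K heads than V heads.

```python
def kv_cache_bytes(seq_len: int, n_layers: int,
                   n_k_heads: int, k_head_dim: int,
                   n_v_heads: int, v_head_dim: int,
                   bytes_per_elem: int = 2) -> int:
    """Size of the KV cache for one sequence, with K and V sized independently."""
    # Elements stored per token per layer: all K heads plus all V heads.
    per_token = n_k_heads * k_head_dim + n_v_heads * v_head_dim
    return seq_len * n_layers * per_token * bytes_per_elem


# Assumed baseline: GQA with 8 K heads and 8 V heads of dimension 128.
gqa = kv_cache_bytes(32768, 32, n_k_heads=8, k_head_dim=128,
                     n_v_heads=8, v_head_dim=128)

# Assumed differential setting: K compressed to 2 heads, V left at 8 heads.
diff = kv_cache_bytes(32768, 32, n_k_heads=2, k_head_dim=128,
                      n_v_heads=8, v_head_dim=128)

ratio = diff / gqa
print(gqa, diff, ratio)
```

Under these assumed numbers the cache shrinks to 62.5% of the GQA baseline. The calculation only covers memory footprint; it says nothing about the speed or accuracy figures reported in the abstract.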

Infrared and visible image fusion based on Multi-State Contextual Hidden Markov Model

Jan 26, 2022

Xiaoqing Luo, Yuting Jiang, Anqi Wang, Zhancheng Zhang, Xiao-Jun Wu

Abstract: The traditional two-state hidden Markov model divides the high-frequency coefficients into only two states (large and small). Such a scheme is prone to producing an inaccurate statistical model for the high-frequency subbands, which reduces the quality of the fusion result. In this paper, a fine-grained multi-state contextual hidden Markov model (MCHMM) is proposed for infrared and visible image fusion in the non-subsampled shearlet transform (NSST) domain, taking full account of the strong correlations and level of detail of the NSST coefficients. To this end, an accurate soft context variable is designed from the perspective of context correlation. The statistical features provided by the MCHMM are then used to fuse the high-frequency subbands. To ensure visual quality, a fusion strategy based on the difference in regional energy is also proposed for the low-frequency subbands. Experimental results demonstrate that the proposed method achieves superior performance compared with other fusion methods in both subjective and objective aspects.
