Abstract: Evaluating retrieval-ranking systems is crucial for developing high-performing models. While online A/B testing is the gold standard, its high cost and risks to user experience make effective offline methods necessary. However, relying on historical interaction data introduces biases, such as selection, exposure, conformity, and position biases, that distort evaluation metrics. These biases stem from the Missing-Not-At-Random (MNAR) nature of user interactions and cause metrics to favor popular or frequently exposed items over true user preferences. We propose a novel framework for robust offline evaluation of retrieval-ranking systems that transforms MNAR data into Missing-At-Random (MAR) data through reweighting combined with black-box optimization, guided by neural estimation of information-theoretic metrics. Our contributions include (1) a causal formulation for addressing offline evaluation biases, (2) a system-agnostic debiasing framework, and (3) empirical validation of its effectiveness. This framework enables more accurate, fair, and generalizable evaluations, enhancing model assessment before deployment.
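To make the reweighting step concrete, the sketch below shows one standard way to turn MNAR logged interactions into an approximately unbiased, MAR-like estimate: inverse propensity scoring of a ranking metric. This illustrates the general idea only, not the paper's full framework (which couples reweighting with black-box optimization and neural estimation of information-theoretic metrics). The function name `ips_weighted_dcg`, the popularity-style propensities, and all numbers are hypothetical.

```python
import numpy as np

def ips_weighted_dcg(relevance, ranks, propensities, k=10):
    """Self-normalized inverse-propensity-scored DCG@k over logged (MNAR) interactions.

    relevance:    observed relevance labels for the logged items
    ranks:        1-based rank at which each item was shown
    propensities: estimated probability that each item was exposed/observed
    """
    relevance = np.asarray(relevance, dtype=float)
    ranks = np.asarray(ranks)
    propensities = np.asarray(propensities, dtype=float)

    mask = ranks <= k
    gains = (2.0 ** relevance[mask] - 1.0) / np.log2(ranks[mask] + 1.0)
    # Reweight each observed interaction by 1/propensity so that, under the
    # exposure model, over- and under-exposed items contribute as if the data
    # were Missing-At-Random; clipping bounds the variance from rare exposures.
    weights = 1.0 / np.clip(propensities[mask], 0.01, 1.0)
    return float(np.sum(weights * gains) / np.sum(weights))

# Hypothetical usage: popular items (high propensity) are observed more often.
rel = [1, 0, 1, 1, 0]
rnk = [1, 2, 3, 4, 5]
prop = [0.9, 0.7, 0.4, 0.2, 0.1]
print(ips_weighted_dcg(rel, rnk, prop, k=5))
```

Clipping the propensities and self-normalizing the weights are common variance-control choices for IPS-style estimators; they trade a small amount of bias for stability when exposure probabilities are tiny.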
Abstract: Organizations rely heavily on time series metrics to measure and model key aspects of operational and business performance. The ability to reliably detect issues with these metrics is imperative to identifying early indicators of major problems before they become pervasive. Proactively monitoring a large number of diverse and constantly changing time series for anomalies is very challenging, so there are often gaps in monitoring coverage, monitors disabled or ignored due to false-positive alarms, and teams resorting to manual inspection of charts to catch problems. Traditionally, variations in the data generation processes and patterns have required strong modeling expertise to create models that accurately flag anomalies. In this paper, we describe an anomaly detection system that overcomes this common challenge by tracking its own performance and making changes as necessary to each model without requiring manual intervention. We demonstrate that this novel approach outperforms available alternatives on benchmark datasets in many scenarios.
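The abstract does not specify the adaptation mechanism, so the following is only a minimal sketch of the self-monitoring idea: a rolling z-score detector that observes its own recent alarm rate and nudges its threshold toward a target rate instead of waiting for manual retuning. The class name `SelfTuningDetector` and all parameter values are hypothetical.

```python
from collections import deque
import numpy as np

class SelfTuningDetector:
    """Rolling z-score detector that monitors its own alarm rate.

    If alarms fire more (or less) often than target_rate over the recent
    window, the threshold is nudged up (or down), approximating a system
    that retunes each model without manual intervention.
    """

    def __init__(self, window=200, threshold=3.0, target_rate=0.01, step=0.05):
        self.history = deque(maxlen=window)   # recent observations
        self.alarms = deque(maxlen=window)    # recent alarm decisions
        self.threshold = threshold
        self.target_rate = target_rate
        self.step = step

    def update(self, x):
        is_anomaly = False
        if len(self.history) >= 30:  # wait for a minimal baseline
            mu = float(np.mean(self.history))
            sigma = float(np.std(self.history)) or 1e-9
            is_anomaly = abs(x - mu) / sigma > self.threshold
        self.history.append(x)
        self.alarms.append(is_anomaly)
        # Self-monitoring step: compare the realized alarm rate with the
        # target and adjust sensitivity instead of waiting for a human.
        rate = sum(self.alarms) / len(self.alarms)
        if rate > self.target_rate:
            self.threshold += self.step
        elif rate < self.target_rate and self.threshold > 1.0:
            self.threshold -= self.step
        return is_anomaly

# Hypothetical usage on a synthetic stream with one injected spike.
rng = np.random.default_rng(0)
det = SelfTuningDetector()
stream = list(rng.normal(size=500)) + [8.0]
flags = [det.update(x) for x in stream]
print(sum(flags), flags[-1])
```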
Abstract: The recent increase in the scale and complexity of software systems has introduced new challenges to the time series monitoring and anomaly detection process. A major drawback of existing anomaly detection methods is that they lack contextual information to help stakeholders identify the cause of anomalies. This problem, known as root cause detection, is particularly challenging in today's complex distributed software systems, since the metrics under consideration generally have multiple internal and external dependencies. Significant manual analysis and strong domain expertise are required to isolate the correct cause of a problem. In this paper, we propose a method that isolates the root cause of an anomaly by analyzing the patterns in time series fluctuations. Our method treats the time series as observations from an underlying process passing through a sequence of discretized hidden states. The idea is to track the propagation of the effect when a given problem causes unaligned but homogeneous shifts of the underlying states. We evaluate our approach by finding the root cause of anomalies in Zillow's clickstream data, identifying causal patterns among a set of observed fluctuations.
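A rough sketch of the state-shift idea follows, under simplifying assumptions: each series is discretized into quantile states fitted on a warm-up window (rather than inferring hidden states of an explicit process model), the first sustained move into an unseen state is taken as the shift time, and metrics are ranked by shift time so the earliest shifter is the candidate root cause. The helpers `state_shift_time` and `rank_root_causes` are illustrative, not the paper's algorithm.

```python
import numpy as np

def state_shift_time(series, n_states=4, warmup=50, min_run=5):
    """Discretize a series into quantile states fitted on a warm-up window,
    then return the index of the first sustained move into a state never
    visited during warm-up. A crude stand-in for the hidden-state model."""
    series = np.asarray(series, dtype=float)
    # 5 edges [min, q25, q50, q75, max] of the warm-up -> states 0..5,
    # where state 5 means "above anything seen during warm-up".
    edges = np.quantile(series[:warmup], np.linspace(0, 1, n_states + 1))
    states = np.digitize(series, edges)
    seen = set(states[:warmup])
    for t in range(warmup, len(states) - min_run + 1):
        window = states[t:t + min_run]
        # A shift counts only if the process settles homogeneously into a
        # state the warm-up period never visited (filters transient noise).
        if window[0] not in seen and np.all(window == window[0]):
            return t
    return None

def rank_root_causes(metrics):
    """Order metrics by state-shift time: under the propagation assumption,
    the earliest-shifting metric is the likeliest root cause."""
    shifts = {name: state_shift_time(s) for name, s in metrics.items()}
    observed = {k: v for k, v in shifts.items() if v is not None}
    return sorted(observed, key=observed.get)

# Hypothetical usage: the upstream metric shifts before its dependents.
t = np.arange(200)
upstream = np.where(t < 80, 0.0, 3.0) + np.random.default_rng(1).normal(0, 0.3, 200)
downstream = np.where(t < 100, 0.0, 3.0) + np.random.default_rng(2).normal(0, 0.3, 200)
print(rank_root_causes({"upstream": upstream, "downstream": downstream}))
```

Ordering metrics by shift time captures the "unaligned but homogeneous" intuition: the same underlying problem moves each dependent series into a new regime, just at different times along the dependency chain.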