Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pedro Larroy

Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Sep 07, 2021

Michaela Hardt, Xiaoguang Chen, Xiaoyi Cheng, Michele Donini, Jason Gelman, Satish Gollaprolu, John He, Pedro Larroy, Xinyu Liu, Nick McCarthy(+11 more)

Figure 1 for Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Figure 2 for Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Figure 3 for Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Figure 4 for Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

Abstract:Understanding the predictions made by machine learning (ML) models and their potential biases remains a challenging and labor-intensive task that depends on the application, the dataset, and the specific model. We present Amazon SageMaker Clarify, an explainability feature for Amazon SageMaker that launched in December 2020, providing insights into data and ML models by identifying biases and explaining predictions. It is deeply integrated into Amazon SageMaker, a fully managed service that enables data scientists and developers to build, train, and deploy ML models at any scale. Clarify supports bias detection and feature importance computation across the ML lifecycle, during data preparation, model evaluation, and post-deployment monitoring. We outline the desiderata derived from customer input, the modular architecture, and the methodology for bias and explanation computations. Further, we describe the technical challenges encountered and the tradeoffs we had to make. For illustration, we discuss two customer use cases. We present our deployment results including qualitative customer feedback and a quantitative evaluation. Finally, we summarize lessons learned, and discuss best practices for the successful adoption of fairness and explanation tools in practice.

* In Proc. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2974-2983 (2021)

Via

Access Paper or Ask Questions

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Mar 13, 2020

Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, Alexander Smola

Figure 1 for AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Figure 2 for AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Figure 3 for AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Figure 4 for AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Abstract:We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file. Unlike existing AutoML frameworks that primarily focus on model/hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. Experiments reveal that our multi-layer combination of many models offers better use of allocated training time than seeking out the best. A second contribution is an extensive evaluation of public and commercial AutoML platforms including TPOT, H2O, AutoWEKA, auto-sklearn, AutoGluon, and Google AutoML Tables. Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate. We find that AutoGluon often even outperforms the best-in-hindsight combination of all of its competitors. In two popular Kaggle competitions, AutoGluon beat 99% of the participating data scientists after merely 4h of training on the raw data.

Via

Access Paper or Ask Questions