Muhammad Osama

The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks

Sep 30, 2023

Cameron Shinn, Collin McCarthy, Saurav Muralidharan, Muhammad Osama, John D. Owens

Abstract:We introduce the Sparsity Roofline, a visual performance model for evaluating sparsity in neural networks. The Sparsity Roofline jointly models network accuracy, sparsity, and predicted inference speedup. Our approach does not require implementing and benchmarking optimized kernels, and the predicted speedup is equal to what would be measured when the corresponding dense and sparse kernels are equally well-optimized. We achieve this through a novel analytical model for predicting sparse network performance, and validate the predicted speedup using several real-world computer vision architectures pruned across a range of sparsity patterns and degrees. We demonstrate the utility and ease-of-use of our model through two case studies: (1) we show how machine learning researchers can predict the performance of unimplemented or unoptimized block-structured sparsity patterns, and (2) we show how hardware designers can predict the performance implications of new sparsity patterns and sparse data formats in hardware. In both scenarios, the Sparsity Roofline helps performance experts identify sparsity regimes with the highest performance potential.
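As a rough illustration of predicting speedup without benchmarking kernels, the sketch below applies the classic roofline bound (runtime is the larger of compute time and memory-traffic time) to a dense versus weight-sparse matrix multiply. It is not the paper's analytical model. The function names, the hardware numbers in the usage note, and the assumptions that sparse weights store only nonzeros with no index overhead and that both kernels are equally well optimized are hypothetical choices made for this example.

```python
def roofline_time(flops, bytes_moved, peak_flops, bandwidth):
    """Classic roofline bound: runtime is limited by compute or by memory traffic."""
    return max(flops / peak_flops, bytes_moved / bandwidth)


def gemm_speedup(m, k, n, density, peak_flops, bandwidth, bytes_per_elem=2):
    """Predicted dense/sparse runtime ratio for an (m x k) @ (k x n) product.

    Assumptions (illustrative only): the k x n weight matrix is sparse with the
    given density, nonzeros are stored without any index metadata, and the
    activations and outputs stay dense.
    """
    dense_flops = 2 * m * k * n
    dense_bytes = bytes_per_elem * (m * k + k * n + m * n)

    sparse_flops = dense_flops * density
    sparse_bytes = bytes_per_elem * (m * k + density * k * n + m * n)

    t_dense = roofline_time(dense_flops, dense_bytes, peak_flops, bandwidth)
    t_sparse = roofline_time(sparse_flops, sparse_bytes, peak_flops, bandwidth)
    return t_dense / t_sparse


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
    for d in (1.0, 0.5, 0.25, 0.1):
        print(d, gemm_speedup(4096, 4096, 4096, d, 100e12, 1e12))
```

Under these assumptions a large square multiply is compute-bound, so halving the density roughly doubles the predicted speed, while a matrix-vector product (m = 1) is memory-bound and gains slightly less because the dense activations still have to be moved.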

Distributionally Robust Learning in Heterogeneous Contexts

May 18, 2021

Muhammad Osama, Dave Zachariah, Petre Stoica

Abstract:We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts. We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach. The proposed method is computationally feasible and provides statistical guarantees. We demonstrate its performance using both real and synthetic data.

Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees

Jul 03, 2020

Muhammad Osama, Dave Zachariah, Petre Stoica

Abstract:A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.

Learning Robust Decision Policies from Observational Data

Jun 03, 2020

Muhammad Osama, Dave Zachariah, Peter Stoica

Abstract:We address the problem of learning a decision policy from observational data of past decisions in contexts with features and associated outcomes. The past policy may be unknown, and in safety-critical applications, such as medical decision support, it is of interest to learn robust policies that reduce the risk of outcomes with high costs. In this paper, we develop a method for learning policies that reduce tails of the cost distribution at a specified level and, moreover, provide a statistically valid bound on the cost of each decision. These properties are valid under finite samples, even in scenarios with uneven or no overlap between features for different decisions in the observed data, by building on recent results in conformal prediction. The performance and statistical properties of the proposed method are illustrated using both real and synthetic data.

DENS: A Dataset for Multi-class Emotion Analysis

Oct 25, 2019

Chen Liu, Muhammad Osama, Anderson de Andrade

Abstract:We introduce a new dataset for multi-class emotion analysis from long-form narratives in English. The Dataset for Emotions of Narrative Sequences (DENS) was collected from both classic literature available on Project Gutenberg and modern online narratives available on Wattpad, annotated using Amazon Mechanical Turk. A number of statistics and baseline benchmarks are provided for the dataset. Of the tested techniques, we find that the fine-tuning of a pre-trained BERT model achieves the best results, with an average micro-F1 score of 60.4%. Our results show that the dataset provides a novel opportunity in emotion analysis that requires moving beyond existing sentence-level techniques.
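For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F1 score is computed: true positives, false positives, and false negatives are pooled over all classes before precision and recall are taken. The function name and the emotion labels in the example are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN counts across classes, then combine."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, for illustration only.
y_true = ["joy", "fear", "joy", "anger"]
y_pred = ["joy", "joy", "joy", "anger"]
print(micro_f1(y_true, y_pred))  # 0.75
```

When every example carries exactly one label, each misclassification adds one false positive and one false negative, so micro-F1 coincides with plain accuracy, as in the example above (3 of 4 correct).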

* Accepted to EMNLP 2019

Exploring Multilingual Syntactic Sentence Representations

Oct 25, 2019

Chen Liu, Anderson de Andrade, Muhammad Osama

Abstract:We study methods for learning sentence embeddings with syntactic structure. We focus on methods of learning syntactic sentence-embeddings by using a multilingual parallel-corpus augmented by Universal Parts-of-Speech tags. We evaluate the quality of the learned embeddings by examining sentence-level nearest neighbours and functional dissimilarity in the embedding space. We also evaluate the ability of the method to learn syntactic sentence-embeddings for low-resource languages and demonstrate strong evidence for transfer learning. Our results show that syntactic sentence-embeddings can be learned while using less training data, fewer model parameters, and resulting in better evaluation metrics than state-of-the-art language models.

Robust Risk Minimization for Statistical Learning

Oct 03, 2019

Muhammad Osama, Dave Zachariah, Peter Stoica

Abstract:We consider a general statistical learning problem where an unknown fraction of the training data is corrupted. We develop a robust learning method that only requires specifying an upper bound on the corrupted data fraction. The method is formulated as a risk minimization problem that can be solved using a blockwise coordinate descent algorithm. We demonstrate the wide-ranging applicability of the method, including regression, classification, unsupervised learning and classic parameter estimation, with state-of-the-art performance.

Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding

Jan 28, 2019

Muhammad Osama, Dave Zachariah, Thomas Schön

Abstract:We address the problem of inferring the causal effect of an exposure on an outcome across space, using observational data. The data is possibly subject to unmeasured confounding variables which, in a standard approach, must be adjusted for by estimating a nuisance function. Here we develop a method that eliminates the nuisance function, while mitigating the resulting errors-in-variables. The result is a robust and accurate inference method for spatially varying heterogeneous causal effects. The properties of the method are demonstrated on synthetic as well as real data from Germany and the US.

* 10 pages, 10 figures

Learning Localized Spatio-Temporal Models From Streaming Data

Jun 22, 2018

Muhammad Osama, Dave Zachariah, Thomas B. Schön

Abstract:We address the problem of predicting spatio-temporal processes with temporal patterns that vary across spatial regions, when data is obtained as a stream, that is, when the training dataset is augmented sequentially. Specifically, we develop a localized spatio-temporal covariance model of the process that can capture spatially varying temporal periodicities in the data. We then apply a covariance-fitting methodology to learn the model parameters, which yields a predictor that can be updated sequentially with each new data point. The proposed method is evaluated using both synthetic and real climate data, which demonstrate its ability to accurately predict data missing in spatial regions over time.

* 12 pages, 7 figures

Unsupervised Cipher Cracking Using Discrete GANs

Jan 15, 2018

Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser

Abstract:This work details CipherGAN, an architecture inspired by CycleGAN used for inferring the underlying cipher mapping given banks of unpaired ciphertext and plaintext. We demonstrate that CipherGAN is capable of cracking language data enciphered using shift and Vigenere ciphers to a high degree of fidelity and for vocabularies much larger than previously achieved. We present how CycleGAN can be made compatible with discrete data and train in a stable way. We then prove that the technique used in CipherGAN avoids the common problem of uninformative discrimination associated with GANs applied to discrete data.
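To make the two cipher families named in the abstract concrete, here is a minimal sketch of shift and Vigenere encryption over the uppercase Latin alphabet. This is an assumption made for illustration: the abstract refers to larger vocabularies, and the function names below are hypothetical. A shift cipher is treated as a Vigenere cipher with a one-letter key.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed 26-symbol vocabulary


def vigenere(text, key, decrypt=False):
    """Shift each symbol by the matching key symbol, cycling through the key."""
    out = []
    for i, ch in enumerate(text):
        shift = ALPHABET.index(key[i % len(key)])
        if decrypt:
            shift = -shift
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % len(ALPHABET)])
    return "".join(out)


def shift_cipher(text, k, decrypt=False):
    """A shift (Caesar) cipher is a Vigenere cipher with a one-symbol key."""
    return vigenere(text, ALPHABET[k % len(ALPHABET)], decrypt)


print(shift_cipher("HELLO", 3))           # KHOOR
print(vigenere("ATTACKATDAWN", "LEMON"))  # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN
```

Both ciphers are fixed symbol-to-symbol mappings (position-dependent in the Vigenere case), which is the kind of underlying mapping the abstract describes inferring from unpaired ciphertext and plaintext.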