Abstract:In this paper, we study the problem of constrained robust (min-max) optimization in a black-box setting, where the desired optimizer cannot access the gradients of the objective function but may query its values. We present a principled optimization framework, integrating a zeroth-order (ZO) gradient estimator with an alternating projected stochastic gradient descent-ascent method, where the former only requires a small number of function queries and the latter needs just a one-step descent/ascent update. We show that the proposed framework, referred to as ZO-Min-Max, has a sub-linear convergence rate under mild conditions and scales gracefully with problem size. On the application side, we explore a promising connection between black-box min-max optimization and black-box evasion and poisoning attacks in adversarial machine learning (ML). Our empirical evaluations on these use cases demonstrate the effectiveness of our approach and its scalability to dimensions that prohibit using recent black-box solvers.
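The combination described above (a query-based gradient estimate plugged into alternating projected descent-ascent) can be sketched on a toy problem. This is only an illustration under stated assumptions, not the authors' ZO-Min-Max implementation: the objective, the box constraint, the step size, and all function names below are hypothetical choices made for the example.

```python
import math
import random


def objective(x, y):
    # Hypothetical toy objective f(x, y) = ||x||^2 + x.y - ||y||^2,
    # minimized over x and maximized over y; its saddle point is the origin.
    return (sum(a * a for a in x)
            + sum(a * b for a, b in zip(x, y))
            - sum(b * b for b in y))


def unit_direction(dim, rng):
    # Draw a direction uniformly from the unit sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def zo_gradient(func, point, rng, num_dirs=10, mu=1e-3):
    # Zeroth-order estimate: average of random-direction finite differences.
    # Only function values are queried (num_dirs + 1 queries per call).
    dim = len(point)
    base = func(point)
    grad = [0.0] * dim
    for _ in range(num_dirs):
        u = unit_direction(dim, rng)
        shifted = [p + mu * c for p, c in zip(point, u)]
        scale = dim * (func(shifted) - base) / (mu * num_dirs)
        for i in range(dim):
            grad[i] += scale * u[i]
    return grad


def project_box(point, lo=-1.0, hi=1.0):
    # Euclidean projection onto the box [lo, hi]^dim (assumed constraint set).
    return [max(lo, min(hi, p)) for p in point]


def zo_min_max(dim=5, steps=500, lr=0.05, seed=0):
    rng = random.Random(seed)
    x = [0.8] * dim
    y = [-0.8] * dim
    for _ in range(steps):
        # One projected descent step on x with y held fixed.
        gx = zo_gradient(lambda v: objective(v, y), x, rng)
        x = project_box([xi - lr * g for xi, g in zip(x, gx)])
        # One projected ascent step on y using the updated x.
        gy = zo_gradient(lambda v: objective(x, v), y, rng)
        y = project_box([yi + lr * g for yi, g in zip(y, gy)])
    return x, y
```

Calling `zo_min_max()` returns the final pair of iterates; on this toy objective they should end up close to the saddle point at the origin. The number of random directions and the smoothing parameter `mu` trade query cost against estimator variance and bias.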
Abstract:We present a black-box adversarial attack algorithm which sets new state-of-the-art model evasion rates for query efficiency in the $\ell_\infty$ and $\ell_2$ metrics, where only loss-oracle access to the model is available. On two public black-box attack challenges, the algorithm achieves the highest evasion rate, surpassing all of the submitted attacks. Similar performance is observed on a model that is secure against substitute-model attacks. For standard models trained on the MNIST, CIFAR10, and IMAGENET datasets, averaged over the datasets and metrics, the algorithm is 3.8x less failure-prone, and spends in total 2.5x fewer queries than the current state-of-the-art attacks combined given a budget of 10,000 queries per attack attempt. Notably, it requires no hyperparameter tuning or any data/time-dependent prior. The algorithm exploits a new approach, namely sign-based rather than magnitude-based gradient estimation. This shifts the estimation from continuous to binary black-box optimization. With three properties of the directional derivative, we examine three approaches to adversarial attacks. This yields a superior algorithm breaking a standard MNIST model using just 12 queries on average!
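To make the idea of sign-based estimation concrete, the sketch below searches over sign vectors using only loss queries, flipping progressively smaller chunks of signs and keeping a flip when the finite-difference gain improves. It is a simplified illustration, not the authors' algorithm: the chunk schedule, the `delta` value, and the function name `sign_estimate` are assumptions made for the example.

```python
def sign_estimate(loss, x, delta=0.1):
    # Estimate the sign of the gradient of `loss` at `x` from loss queries only.
    # The search space is binary: each coordinate of `signs` is +1 or -1.
    dim = len(x)
    base = loss(x)

    def gain(signs):
        # Finite-difference proxy for the directional derivative along `signs`.
        moved = [xi + delta * si for xi, si in zip(x, signs)]
        return loss(moved) - base

    signs = [1] * dim
    best = gain(signs)
    queries = 2  # one query for `base`, one for the initial sign vector

    chunk = dim
    while True:
        for start in range(0, dim, chunk):
            trial = signs[:]
            for i in range(start, min(start + chunk, dim)):
                trial[i] = -trial[i]
            value = gain(trial)
            queries += 1
            if value > best:  # keep the flip only if the loss gain improves
                signs, best = trial, value
        if chunk == 1:
            break
        chunk = (chunk + 1) // 2  # halve the chunk size for the next pass
    return signs, queries
```

For a loss that is linear in the input, say `loss = lambda v: sum(w_i * v_i ...)` with nonzero weights, the final pass over single coordinates recovers the sign of every weight. On a real model the loss is not separable, so the result is only a heuristic estimate.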
Abstract:Timely prediction of clinically critical events in the Intensive Care Unit (ICU) is important for improving care and survival rates. Most of the existing approaches are based on the application of various classification methods on explicitly extracted statistical features from vital signals. In this work, we propose to eliminate the high cost of engineering hand-crafted features from multivariate time-series of physiologic signals by learning their representation with a sequence-to-sequence auto-encoder. We then propose to hash the learned representations to enable signal similarity assessment for the prediction of critical events. We apply this methodological framework to predict Acute Hypotensive Episodes (AHE) on a large and diverse dataset of vital signal recordings. Experiments demonstrate that the presented framework can accurately predict an upcoming AHE.
Abstract:GANs are difficult to train due to convergence pathologies such as mode and discriminator collapse. We introduce Lipizzaner, an open source software system that allows machine learning engineers to train GANs in a distributed and robust way. Lipizzaner distributes a competitive coevolutionary algorithm which, by virtue of dual, adapting generator and discriminator populations, is robust to collapses. The algorithm is well suited to efficient distribution because it uses a spatial grid abstraction. Training is local to each cell, and strong intermediate training results are exchanged among overlapping neighborhoods, allowing high-performing solutions to propagate and improve with more rounds of training. In experiments on common image datasets, Lipizzaner overcomes critical collapses. Communication overhead scales linearly when increasing the number of compute instances, and we observe that increasing scale leads to improved model performance.
Abstract:With the celebrated success of deep learning, some attempts to develop effective methods for detecting malicious PowerShell programs employ neural nets in a traditional natural language processing setup while others employ convolutional neural nets to detect obfuscated malicious commands at a character level. While these representations may express salient PowerShell properties, our hypothesis is that tools from static program analysis will be more effective. We propose a hybrid approach combining traditional program analysis (in the form of abstract syntax trees) and deep learning. This poster presents preliminary results of a fundamental step in our approach: learning embeddings for nodes of PowerShell ASTs. We classify malicious scripts by family type and explore embedded program vector representations.
Abstract:Generative Adversarial Networks (GANs) have become one of the dominant methods for deep generative modeling. Despite their demonstrated success on multiple vision tasks, GANs are difficult to train and much research has been dedicated towards understanding and improving their gradient-based learning dynamics. Here, we investigate the use of coevolution, a class of black-box (gradient-free) co-optimization techniques and a powerful tool in evolutionary computing, as a supplement to gradient-based GAN training techniques. Experiments on a simple model that exhibits several of the GAN gradient-based dynamics (e.g., mode collapse, oscillatory behavior, and vanishing gradients) show that coevolution is a promising framework for escaping degenerate GAN training behaviors.
Abstract:A central challenge of adversarial learning is to interpret the resulting hardened model. In this contribution, we ask how robust generalization can be visually discerned and whether a concise view of the interactions between a hardened decision map and input samples is possible. We first provide a means of visually comparing a hardened model's loss behavior with respect to the adversarial variants generated during training versus loss behavior with respect to adversarial variants generated from other sources. This allows us to confirm that the association of observed flatness of a loss landscape with generalization that is seen with naturally trained models extends to adversarially hardened models and robust generalization. To complement these means of interpreting model parameter robustness we also use self-organizing maps to provide a visual means of superimposing adversarial and natural variants on a model's decision space, thus allowing the model's global robustness to be comprehensively examined.
Abstract:Malware is constantly adapting in order to avoid detection. Model-based malware detectors, such as SVMs and neural networks, are vulnerable to so-called adversarial examples, which are modest changes to detectable malware that allow the resulting malware to evade detection. Continuous-valued methods that are robust to adversarial examples of images have been developed using saddle-point optimization formulations. We are inspired by them to develop similar methods for the discrete, e.g. binary, domain which characterizes the features of malware. A specific extra challenge of malware is that the adversarial examples must be generated in a way that preserves their malicious functionality. We introduce methods capable of generating functionally preserved adversarial malware examples in the binary domain. Using the saddle-point formulation, we incorporate the adversarial examples into the training of models that are robust to them. We evaluate the effectiveness of the methods and others in the literature on a set of Portable Executable (PE) files. Comparison prompts our introduction of an online measure computed during training to assess general expectation of robustness.
Abstract:This document briefly describes the Black-Box Multi-Objective Optimization Benchmarking (BMOBench) platform. It presents the test problems, evaluation procedure, and experimental setup. BMOBench is then demonstrated by comparing recent multi-objective solvers from the literature, namely SMS-EMOA, DMS, and MO-SOO.
Abstract:Random embedding has been applied with empirical success to large-scale black-box optimization problems with low effective dimensions. This paper proposes the EmbeddedHunter algorithm, which incorporates the technique in a hierarchical stochastic bandit setting, following the optimism in the face of uncertainty principle and breaking away from the multiple-run framework in which random embedding has conventionally been applied, similar to stochastic black-box optimization solvers. Our proposition is motivated by the bounded mean variation in the objective value for a low-dimensional point projected randomly into the decision space of Lipschitz-continuous problems. In essence, the EmbeddedHunter algorithm optimistically expands a partitioning tree over a low-dimensional search space (of dimension equal to the effective dimension of the problem) based on a bounded number of random embeddings of sampled points from the low-dimensional space. In contrast to the probabilistic theoretical guarantees of multiple-run random-embedding algorithms, the finite-time analysis of the proposed algorithm presents a theoretical upper bound on the regret as a function of the algorithm's number of iterations. Furthermore, numerical experiments were conducted to validate its performance. The results show a clear performance gain over recently proposed random embedding methods for large-scale problems, provided the intrinsic dimensionality is low.
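The random-embedding ingredient on its own can be sketched as follows: a low-dimensional point is mapped into the high-dimensional decision space through a random Gaussian matrix, clipped to the feasible box, and the black-box objective is queried there. This sketch uses plain random search in the low-dimensional space and does not implement EmbeddedHunter's partitioning tree or its regret analysis; the objective, the dimensions, and all names below are hypothetical.

```python
import random


def objective(x):
    # Hypothetical 100-dimensional black-box objective whose value depends on
    # only two coordinates, i.e. its effective dimension is 2.
    return (x[3] - 0.5) ** 2 + (x[17] + 0.25) ** 2


def clip(value, lo=-1.0, hi=1.0):
    return max(lo, min(hi, value))


def make_embedding(high_dim, low_dim, rng):
    # Random Gaussian matrix of shape (high_dim, low_dim).
    return [[rng.gauss(0.0, 1.0) for _ in range(low_dim)]
            for _ in range(high_dim)]


def embed(matrix, z):
    # Map a low-dimensional point z into the box [-1, 1]^high_dim.
    return [clip(sum(a * c for a, c in zip(row, z))) for row in matrix]


def random_embedding_search(high_dim=100, low_dim=2, budget=2000, seed=0):
    rng = random.Random(seed)
    matrix = make_embedding(high_dim, low_dim, rng)
    best_z = [0.0] * low_dim
    best_value = objective(embed(matrix, best_z))
    for _ in range(budget):
        # Sample a candidate in the low-dimensional search space only.
        z = [rng.uniform(-1.0, 1.0) for _ in range(low_dim)]
        value = objective(embed(matrix, z))
        if value < best_value:
            best_z, best_value = z, value
    return best_z, best_value
```

`random_embedding_search()` returns the best low-dimensional point found and its objective value. A single random matrix may fail to cover the optimum, which is why random embedding is conventionally used with multiple runs; the abstract above describes replacing that scheme with a bounded number of embeddings per sampled point.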