Abstract:Neural network have achieved remarkable successes in many scientific fields. However, the interpretability of the neural network model is still a major bottlenecks to deploy such technique into our daily life. The challenge can dive into the non-linear behavior of the neural network, which rises a critical question that how a model use input feature to make a decision. The classical approach to address this challenge is feature attribution, which assigns an important score to each input feature and reveal its importance of current prediction. However, current feature attribution approaches often indicate the importance of each input feature without detail of how they are actually processed by a model internally. These attribution approaches often raise a concern that whether they highlight correct features for a model prediction. For a neural network model, the non-linear behavior is often caused by non-linear activation units of a model. However, the computation behavior of a prediction from a neural network model is locally linear, because one prediction has only one activation pattern. Base on the observation, we propose an instance-wise linearization approach to reformulates the forward computation process of a neural network prediction. This approach reformulates different layers of convolution neural networks into linear matrix multiplication. Aggregating all layers' computation, a prediction complex convolution neural network operations can be described as a linear matrix multiplication $F(x) = W \cdot x + b$. This equation can not only provides a feature attribution map that highlights the important of the input features but also tells how each input feature contributes to a prediction exactly. Furthermore, we discuss the application of this technique in both supervise classification and unsupervised neural network learning parametric t-SNE dimension reduction.
Abstract:In the past few years, Generative Adversarial Networks (GANs) have dramatically advanced our ability to represent and parameterize high-dimensional, non-linear image manifolds. As a result, they have been widely adopted across a variety of applications, ranging from challenging inverse problems like image completion, to problems such as anomaly detection and adversarial defense. A recurring theme in many of these applications is the notion of projecting an image observation onto the manifold that is inferred by the generator. In this context, Projected Gradient Descent (PGD) has been the most popular approach, which essentially optimizes for a latent vector that minimizes the discrepancy between a generated image and the given observation. However, PGD is a brittle optimization technique that fails to identify the right projection (or latent vector) when the observation is corrupted, or perturbed even by a small amount. Such corruptions are common in the real world, for example images in the wild come with unknown crops, rotations, missing pixels, or other kinds of non-linear distributional shifts which break current encoding methods, rendering downstream applications unusable. To address this, we propose corruption mimicking -- a new robust projection technique, that utilizes a surrogate network to approximate the unknown corruption directly at test time, without the need for additional supervision or data augmentation. The proposed method is significantly more robust than PGD and other competing methods under a wide variety of corruptions, thereby enabling a more effective use of GANs in real-world applications. More importantly, we show that our approach produces state-of-the-art performance in several GAN-based applications -- anomaly detection, domain adaptation, and adversarial defense, that benefit from an accurate projection.
Abstract:Solving inverse problems continues to be a central challenge in computer vision. Existing techniques either explicitly construct an inverse mapping using prior knowledge about the corruption, or learn the inverse directly using a large collection of examples. However, in practice, the nature of corruption may be unknown, and thus it is challenging to regularize the problem of inferring a plausible solution. On the other hand, collecting task-specific training data is tedious for known corruptions and impossible for unknown ones. We present MimicGAN, an unsupervised technique to solve general inverse problems based on image priors in the form of generative adversarial networks (GANs). Using a GAN prior, we show that one can reliably recover solutions to underdetermined inverse problems through a surrogate network that learns to mimic the corruption at test time. Our system successively estimates the corruption and the clean image without the need for supervisory training, while outperforming existing baselines in blind image recovery. We also demonstrate that MimicGAN improves upon recent GAN-based defenses against adversarial attacks and represents one of the strongest test-time defenses available today.
Abstract:Computed Tomography (CT) reconstruction is a fundamental component to a wide variety of applications ranging from security, to healthcare. The classical techniques require measuring projections, called sinograms, from a full 180$^\circ$ view of the object. This is impractical in a limited angle scenario, when the viewing angle is less than 180$^\circ$, which can occur due to different factors including restrictions on scanning time, limited flexibility of scanner rotation, etc. The sinograms obtained as a result, cause existing techniques to produce highly artifact-laden reconstructions. In this paper, we propose to address this problem through implicit sinogram completion, on a challenging real world dataset containing scans of common checked-in luggage. We propose a system, consisting of 1D and 2D convolutional neural networks, that operates on a limited angle sinogram to directly produce the best estimate of a reconstruction. Next, we use the x-ray transform on this reconstruction to obtain a "completed" sinogram, as if it came from a full 180$^\circ$ measurement. We feed this to standard analytical and iterative reconstruction techniques to obtain the final reconstruction. We show with extensive experimentation that this combined strategy outperforms many competitive baselines. We also propose a measure of confidence for the reconstruction that enables a practitioner to gauge the reliability of a prediction made by our network. We show that this measure is a strong indicator of quality as measured by the PSNR, while not requiring ground truth at test time. Finally, using a segmentation experiment, we show that our reconstruction preserves the 3D structure of objects effectively.
Abstract:Solving inverse problems continues to be a challenge in a wide array of applications ranging from deblurring, image inpainting, source separation etc. Most existing techniques solve such inverse problems by either explicitly or implicitly finding the inverse of the model. The former class of techniques require explicit knowledge of the measurement process which can be unrealistic, and rely on strong analytical regularizers to constrain the solution space, which often do not generalize well. The latter approaches have had remarkable success in part due to deep learning, but require a large collection of source-observation pairs, which can be prohibitively expensive. In this paper, we propose an unsupervised technique to solve inverse problems with generative adversarial networks (GANs). Using a pre-trained GAN in the space of source signals, we show that one can reliably recover solutions to under determined problems in a `blind' fashion, i.e., without knowledge of the measurement process. We solve this by making successive estimates on the model and the solution in an iterative fashion. We show promising results in three challenging applications -- blind source separation, image deblurring, and recovering an image from its edge map, and perform better than several baselines.
Abstract:Interpretability has emerged as a crucial aspect of machine learning, aimed at providing insights into the working of complex neural networks. However, existing solutions vary vastly based on the nature of the interpretability task, with each use case requiring substantial time and effort. This paper introduces MARGIN, a simple yet general approach to address a large set of interpretability tasks ranging from identifying prototypes to explaining image predictions. MARGIN exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, which are defined as those nodes that maximally describe a function defined on the graph. By carefully defining task-specific graphs and functions, we demonstrate that MARGIN outperforms existing approaches in a number of disparate interpretability challenges.