Chuan-Yung Tsai

Monitoring Shortcut Learning using Mutual Information

Jun 27, 2022

Mohammed Adnan, Yani Ioannou, Chuan-Yung Tsai, Angus Galloway, H. R. Tizhoosh, Graham W. Taylor

Abstract: The failure of deep neural networks to generalize to out-of-distribution data is a well-known problem and raises concerns about the deployment of trained networks in safety-critical domains such as healthcare, finance and autonomous vehicles. We study a particular kind of distribution shift: shortcuts or spurious correlations in the training data. Shortcut learning is often only exposed when models are evaluated on real-world data that does not contain the same spurious correlations, posing a serious dilemma for AI practitioners trying to properly assess the effectiveness of a trained model for real-world applications. In this work, we propose to use the mutual information (MI) between the learned representation and the input as a metric to find where in training the network latches onto shortcuts. Experiments demonstrate that MI can be used as a domain-agnostic metric for monitoring shortcut learning.

* Accepted at ICML 2022 Workshop on Spurious Correlations, Invariance, and Stability
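
To make the quantity in the abstract concrete, here is a minimal sketch of mutual information for discrete samples. It uses the textbook plug-in estimator, which is an assumption chosen for illustration and not the estimator from the paper; the function name `mutual_information` and the idea of passing discretized representations are likewise hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ts):
    """Plug-in estimate of I(X; T) in bits from paired discrete samples.

    Illustrative only: the paper's own MI estimator is not described in
    the abstract, so this is the textbook count-based formula.
    """
    n = len(xs)
    joint = Counter(zip(xs, ts))  # counts of each (x, t) pair
    count_x = Counter(xs)         # marginal counts of x
    count_t = Counter(ts)         # marginal counts of t
    mi = 0.0
    for (x, t), c in joint.items():
        # p(x,t) / (p(x) p(t)) simplifies to c * n / (count_x * count_t)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_t[t]))
    return mi


# T fully determines X: MI equals the entropy of X (1 bit here).
dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
# T carries no information about X: MI is zero.
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In a monitoring setting one could evaluate such a quantity at successive training checkpoints and look for changes over time, but how the paper does this is not stated in the abstract.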

DeepRNG: Towards Deep Reinforcement Learning-Assisted Generative Testing of Software

Jan 29, 2022

Chuan-Yung Tsai, Graham W. Taylor

Abstract: Although machine learning (ML) has been successful in automating various software engineering needs, software testing remains a highly challenging topic. In this paper, we aim to improve the generative testing of software by directly augmenting the random number generator (RNG) with a deep reinforcement learning (RL) agent using an efficient, automatically extractable state representation of the software under test. Using the Cosmos SDK as the testbed, we show that the proposed DeepRNG framework provides a statistically significant improvement to the testing of this highly complex software library with over 350,000 lines of code. The source code of the DeepRNG framework is publicly available online.

* Workshop on ML for Systems, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

Domain-Agnostic Clustering with Self-Distillation

Nov 23, 2021

Mohammed Adnan, Yani A. Ioannou, Chuan-Yung Tsai, Graham W. Taylor

Abstract: Recent advancements in self-supervised learning have reduced the gap between supervised and unsupervised representation learning. However, most self-supervised and deep clustering techniques rely heavily on data augmentation, rendering them ineffective for many learning tasks where insufficient domain knowledge exists for performing augmentation. We propose a new self-distillation based algorithm for domain-agnostic clustering. Our method builds upon the existing deep clustering frameworks and requires no separate student model. The proposed method outperforms existing domain-agnostic (augmentation-free) algorithms on CIFAR-10. We empirically demonstrate that knowledge distillation can improve unsupervised representation learning by extracting richer "dark knowledge" from the model than using predicted labels alone. Preliminary experiments also suggest that self-distillation improves the convergence of DeepCluster-v2.

* NeurIPS 2021 Workshop: Self-Supervised Learning - Theory and Practice
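
The contrast between soft targets and predicted labels can be sketched as follows. This is the generic temperature-softened softmax from the knowledge distillation literature, assumed here for illustration; the abstract does not give the paper's loss, and the names `softmax`, `entropy`, `soft_cross_entropy` and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def soft_cross_entropy(target, pred):
    """Cross-entropy of pred against a soft target distribution."""
    return -sum(t * math.log(q) for t, q in zip(target, pred) if t > 0)


logits = [4.0, 2.0, 0.0]        # made-up cluster logits for one sample
sharp = softmax(logits)         # close to a one-hot predicted label
soft = softmax(logits, 4.0)     # keeps relative similarity between clusters
loss = soft_cross_entropy(soft, sharp)
```

The softened distribution keeps the same top cluster as the sharp one but has higher entropy, so it also conveys how the remaining clusters rank. In self-distillation the same model would supply both the targets and the predictions, which matches the abstract's statement that no separate student model is required.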

FusedProp: Towards Efficient Training of Generative Adversarial Networks

Mar 30, 2020

Zachary Polizzi, Chuan-Yung Tsai

Abstract: Generative adversarial networks (GANs) are capable of generating strikingly realistic samples, but state-of-the-art GANs can be extremely computationally expensive to train. In this paper, we propose the fused propagation (FusedProp) algorithm, which can be used to efficiently train the discriminator and the generator of common GANs simultaneously using only one forward and one backward propagation. We show that FusedProp achieves 1.49 times the training speed of conventional GAN training, although further studies are required to improve its stability. By reporting our preliminary results and open-sourcing our implementation, we hope to accelerate future research on the training of GANs.

* source code available at https://github.com/zplizzi/fusedprop

Tensor Switching Networks

Oct 31, 2016

Chuan-Yung Tsai, Andrew Saxe, David Cox

Abstract: We present a novel neural network algorithm, the Tensor Switching (TS) network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to tensor-valued hidden units. The TS network copies its entire input vector to different locations in an expanded representation, with the location determined by its hidden unit activity. In this way, even a simple linear readout from the TS representation can implement a highly expressive deep-network-like function. The TS network hence avoids the vanishing gradient problem by construction, at the cost of larger representation size. We develop several methods to train the TS network, including equivalent kernels for infinitely wide and deep TS networks, a one-pass linear learning algorithm, and two backpropagation-inspired representation learning algorithms. Our experimental results demonstrate that the TS network is indeed more expressive and consistently learns faster than standard ReLU networks.
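
The copying mechanism described above can be sketched for a single layer. This is one reading of the abstract and not the authors' implementation: each hidden unit is assumed to gate a full copy of the input, on when its pre-activation is positive, and the names `ts_expand`, `relu_layer` and `readout_with` are hypothetical.

```python
def ts_expand(W, x):
    """Expanded representation: one gated copy of x per hidden unit.

    Row i equals x when hidden unit i is active (w_i . x > 0), else zeros.
    """
    rows = []
    for w in W:
        pre = sum(wi * xi for wi, xi in zip(w, x))
        gate = 1.0 if pre > 0 else 0.0
        rows.append([gate * xi for xi in x])
    return rows


def relu_layer(W, x):
    """Standard ReLU layer, for comparison."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(w, x))) for w in W]


def readout_with(V, Z):
    """Row-wise linear readout: output i is the dot product of V[i] and Z[i]."""
    return [sum(vi * zi for vi, zi in zip(v, z)) for v, z in zip(V, Z)]


W = [[1.0, -2.0], [0.5, 0.5]]   # made-up weights: 2 hidden units, 2 inputs
x = [1.0, 2.0]
Z = ts_expand(W, x)             # [[0.0, 0.0], [1.0, 2.0]]
same_as_relu = readout_with(W, Z) == relu_layer(W, x)  # True
```

Reading out with the gating weights themselves reproduces the ReLU output, which is one way to see the "generalizes ReLU" claim under this reading; any other readout weights give a function the scalar ReLU unit cannot express. The expanded representation has one input-sized row per hidden unit, which is the larger representation size the abstract mentions.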

Measuring and Understanding Sensory Representations within Deep Networks Using a Numerical Optimization Framework

Feb 17, 2015

Chuan-Yung Tsai, David D. Cox

Abstract: A central challenge in sensory neuroscience is describing how the activity of populations of neurons can represent useful features of the external environment. However, while neurophysiologists have long been able to record the responses of neurons in awake, behaving animals, it is another matter entirely to say what a given neuron does. A key problem is that in many sensory domains, the space of all possible stimuli that one might encounter is effectively infinite; in vision, for instance, natural scenes are combinatorially complex, and an organism will only encounter a tiny fraction of possible stimuli. As a result, even describing the response properties of sensory neurons is difficult, and investigations of neuronal functions are almost always critically limited by the number of stimuli that can be considered. In this paper, we propose a closed-loop, optimization-based experimental framework for characterizing the response properties of sensory neurons, building on past efforts in closed-loop experimental methods, and leveraging recent advances in artificial neural networks to serve as a proving ground for our techniques. Specifically, using deep convolutional neural networks, we asked whether modern black-box optimization techniques can be used to interrogate the "tuning landscape" of an artificial neuron in a deep, nonlinear system, without imposing significant constraints on the space of stimuli under consideration. We introduce a series of measures to quantify the tuning landscapes, and show how these relate to the performances of the networks in an object recognition task. To the extent that deep convolutional neural networks increasingly serve as de facto working hypotheses for biological vision, we argue that developing a unified approach for studying both artificial and biological systems holds great potential to advance both fields together.
