Alvin Chan

Pretraining ECG Data with Adversarial Masking Improves Model Generalizability for Data-Scarce Tasks

Nov 15, 2022

Jessica Y. Bo, Hen-Wei Huang, Alvin Chan, Giovanni Traverso

Abstract: Medical datasets often face the problem of data scarcity, as ground truth labels must be generated by medical professionals. One mitigation strategy is to pretrain deep learning models on large, unlabelled datasets with self-supervised learning (SSL). Data augmentations are essential for improving the generalizability of SSL-trained models, but they are typically handcrafted and tuned manually. We use an adversarial model to generate masks as augmentations for 12-lead electrocardiogram (ECG) data, where masks learn to occlude diagnostically-relevant regions of the ECGs. Compared to random augmentations, adversarial masking reaches better accuracy when transferring to two diverse downstream objectives: arrhythmia classification and gender classification. Compared to a state-of-the-art ECG augmentation method, 3KG, adversarial masking performs better in data-scarce regimes, demonstrating the generalizability of our model.

* Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 9 pages

How Does Frequency Bias Affect the Robustness of Neural Image Classifiers against Common Corruption and Adversarial Perturbations?

May 09, 2022

Alvin Chan, Yew-Soon Ong, Clement Tan

Abstract: Model robustness is vital for the reliable deployment of machine learning models in real-world applications. Recent studies have shown that data augmentation can result in a model over-relying on features in the low-frequency domain, sacrificing performance against low-frequency corruptions, highlighting a connection between frequency and robustness. Here, we take one step further to more directly study the frequency bias of a model through the lens of its Jacobians and its implications for model robustness. To achieve this, we propose Jacobian frequency regularization, which encourages a model's Jacobians to have a larger ratio of low-frequency components. Through experiments on four image datasets, we show that biasing classifiers towards low (high)-frequency components can bring performance gain against high (low)-frequency corruption and adversarial perturbation, albeit with a tradeoff in performance for low (high)-frequency corruption. Our approach elucidates a more direct connection between the frequency bias and robustness of deep learning models.

* IJCAI 2022 Long Oral, Camera-ready full version
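
The "ratio of low-frequency components" mentioned in the abstract above can be illustrated with a small, hedged sketch. It is not the paper's regularizer: it assumes a 1-D vector standing in for one row of a Jacobian, a plain discrete Fourier transform, and a hypothetical `cutoff` that separates low from high frequencies. The function names are illustrative only.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def low_frequency_ratio(x, cutoff):
    """Fraction of spectral energy at frequencies at or below `cutoff`.

    Frequency index k and n - k describe the same frequency for a real
    signal, so the distance from zero is min(k, n - k).
    """
    n = len(x)
    energy = [abs(c) ** 2 for c in dft(x)]
    total = sum(energy)
    if total == 0.0:
        return 0.0  # an all-zero vector has no energy to attribute
    low = sum(e for k, e in enumerate(energy) if min(k, n - k) <= cutoff)
    return low / total


smooth = [1.0] * 8          # constant vector: all energy at frequency 0
rough = [1.0, -1.0] * 4     # alternating vector: all energy at the highest frequency
print(low_frequency_ratio(smooth, cutoff=1))
print(low_frequency_ratio(rough, cutoff=1))
```

A regularizer in this spirit would push the ratio up (low-frequency bias) or down (high-frequency bias) during training; how the paper computes and weights that term is not specified on this page.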

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges

May 08, 2022

Zhenghua Chen, Min Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong

Abstract: Artificial Intelligence (AI) is a fast-growing research and development (R&D) discipline which is attracting increasing attention because of its promise to bring vast benefits for consumers and businesses, particularly in productivity growth and innovation. To date it has reported significant accomplishments in many areas that have been deemed challenging for machines, ranging from computer vision, natural language processing, and audio analysis to smart sensing and many others. The technical trend in realizing these successes has been towards increasingly complex and large AI models so as to solve more complex problems with superior performance and robustness. This rapid progress, however, has taken place at the expense of substantial environmental costs and resources. Besides, debates on the societal impacts of AI, such as fairness, safety and privacy, have continued to grow in intensity. These issues have presented major concerns pertaining to the sustainable development of AI. In this work, we review major trends in machine learning approaches that can address the sustainability problem of AI. Specifically, we examine emerging AI methodologies and algorithms for addressing the sustainability issue of AI in two major aspects, i.e., environmental sustainability and social sustainability of AI. We also highlight the major limitations of existing studies and propose potential research challenges and directions for the development of the next generation of sustainable AI techniques. We believe that this technical review can help to promote the sustainable development of AI R&D activities for the research community.

ORCHARD: A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Nov 28, 2021

Bill Tuck Weng Pung, Alvin Chan

Abstract: The ability to reason with multiple hierarchical structures is an attractive and desirable property of sequential inductive biases for natural language processing. Do the state-of-the-art Transformers and LSTM architectures implicitly encode these biases? To answer this, we propose ORCHARD, a diagnostic dataset for systematically evaluating hierarchical reasoning in state-of-the-art neural sequence models. While there have been prior evaluation frameworks such as ListOps or Logical Inference, our work presents a novel and more natural setting where our models learn to reason with multiple explicit hierarchical structures instead of only one, i.e., requiring both long-term sequence memorization and relational reasoning while reasoning with hierarchical structure. Consequently, backed by a set of rigorous experiments, we show that (1) Transformer and LSTM models surprisingly fail in systematic generalization, and (2) with increased references between hierarchies, Transformer performs no better than random.

FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding

Nov 28, 2021

Bill Tuck Weng Pung, Alvin Chan

Abstract: Inducing latent tree structures from sequential data is an emerging trend in the NLP research landscape today, largely popularized by recent methods such as Gumbel LSTM and Ordered Neurons (ON-LSTM). This paper proposes FASTTREES, a new general purpose neural module for fast sequence encoding. Unlike most previous works that consider recurrence to be necessary for tree induction, our work explores the notion of parallel tree induction, i.e., imbuing our model with hierarchical inductive biases in a parallelizable, non-autoregressive fashion. To this end, our proposed FASTTREES achieves competitive or superior performance to ON-LSTM on four well-established sequence modeling tasks, i.e., language modeling, logical inference, sentiment analysis and natural language inference. Moreover, we show that the FASTTREES module can be applied to enhance Transformer models, achieving performance gains on three sequence transduction tasks (machine translation, subject-verb agreement and mathematical language understanding), paving the way for modular tree induction modules. Overall, we outperform existing state-of-the-art models on logical inference tasks by +4% and mathematical language understanding by +8%.

Deep Extrapolation for Attribute-Enhanced Generation

Jul 07, 2021

Alvin Chan, Ali Madani, Ben Krause, Nikhil Naik

Abstract: Attribute extrapolation in sample generation is challenging for deep neural networks operating beyond the training distribution. We formulate a new task for extrapolation in sequence generation, focusing on natural language and proteins, and propose GENhance, a generative framework that enhances attributes through a learned latent space. Trained on movie reviews and a computed protein stability dataset, GENhance can generate strongly-positive text reviews and highly stable protein sequences without being exposed to similar data during training. We release our benchmark tasks and models to contribute to the study of generative modeling extrapolation and data-driven design in biology and chemistry.

RNA Alternative Splicing Prediction with Discrete Compositional Energy Network

Mar 07, 2021

Alvin Chan, Anna Korsakova, Yew-Soon Ong, Fernaldo Richtia Winnerdy, Kah Wai Lim, Anh Tuan Phan

Abstract: A single gene can encode different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose the discrete compositional energy network (DCEN), which leverages the hierarchical relationships between splice sites, junctions and transcripts to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through the energy values of their constituent splice junctions. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.

* ACM CHIL 2021 Camera-Ready
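
As a rough illustration of modelling transcript probabilities from junction energies, the sketch below makes two assumptions that the abstract does not confirm: a transcript's energy is the sum of its junctions' energies, and probabilities follow a Boltzmann (softmax over negative energy) form across candidate transcripts. All names and numbers are hypothetical.

```python
import math


def transcript_probabilities(junction_energies, transcripts):
    """Map junction energies to a probability for each candidate transcript.

    junction_energies: dict of junction id -> energy (lower = more favourable)
    transcripts: dict of transcript id -> list of junction ids it contains
    """
    # Assumed composition rule: a transcript's energy is the sum of its junctions'.
    totals = {
        name: sum(junction_energies[j] for j in junctions)
        for name, junctions in transcripts.items()
    }
    # Subtract the minimum energy before exponentiating for numerical stability.
    lowest = min(totals.values())
    weights = {name: math.exp(-(e - lowest)) for name, e in totals.items()}
    normaliser = sum(weights.values())
    return {name: w / normaliser for name, w in weights.items()}


energies = {"j1": 1.0, "j2": 2.0, "j3": 3.0}          # made-up values
candidates = {"t1": ["j1", "j2"], "t2": ["j3"], "t3": ["j1"]}
print(transcript_probabilities(energies, candidates))
```

Under these assumptions, transcripts whose junctions have lower total energy receive higher probability, and the probabilities over the candidate set sum to one.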

Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters

Feb 17, 2021

Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu

Abstract: Recent works have demonstrated reasonable success of representation learning in hypercomplex space. Specifically, "fully-connected layers with Quaternions" (4D hypercomplex numbers), which replace real-valued matrix multiplications in fully-connected layers with Hamilton products of Quaternions, both enjoy parameter savings with only 1/4 learnable parameters and achieve comparable performance in various applications. However, one key caveat is that hypercomplex space only exists at very few predefined dimensions (4D, 8D, and 16D). This restricts the flexibility of models that leverage hypercomplex multiplications. To this end, we propose parameterizing hypercomplex multiplications, allowing models to learn multiplication rules from data regardless of whether such rules are predefined. As a result, our method not only subsumes the Hamilton product, but also learns to operate on any arbitrary nD hypercomplex space, providing more architectural flexibility using only $1/n$ of the learnable parameters, for arbitrary $n$, compared with the fully-connected layer counterpart. Experiments applying the approach to LSTM and Transformer models on natural language inference, machine translation, text style transfer, and subject-verb agreement demonstrate its architectural flexibility and effectiveness.

* Published as a conference paper at the 9th International Conference on Learning Representations (ICLR 2021)
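
The parameter saving described above can be sketched in a few lines. This is a minimal illustration, assuming the layer's weight matrix is assembled as a sum of Kronecker products W = sum_i kron(A_i, S_i), with n small n-by-n "rule" matrices A_i and n blocks S_i of size (d_out/n)-by-(d_in/n); the abstract itself does not spell out this construction, and the helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def phm_weight(rules, blocks):
    """Assemble W = sum_i kron(rules[i], blocks[i])."""
    weight = kron(rules[0], blocks[0])
    for rule, block in zip(rules[1:], blocks[1:]):
        weight = mat_add(weight, kron(rule, block))
    return weight


def param_counts(d_in, d_out, n):
    """Parameters of a dense layer vs. the Kronecker-sum layer (biases ignored).

    Assumes d_in and d_out are both divisible by n.
    """
    dense = d_in * d_out
    phm = n ** 3 + (d_in * d_out) // n  # n rule matrices + n weight blocks
    return dense, phm


# With n = 2 and fixed rules, the construction reproduces complex multiplication
# by (3 + 4i) acting on a vector [real, imag].
rules = [[[1, 0], [0, 1]], [[0, -1], [1, 0]]]
blocks = [[[3]], [[4]]]
print(phm_weight(rules, blocks))
print(param_counts(512, 512, 4))
```

When the rule matrices are learned instead of fixed, the same construction no longer depends on a predefined algebra, which is the flexibility the abstract refers to; the n^3 term is negligible next to d_in * d_out / n for typical layer sizes.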

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder

Oct 06, 2020

Alvin Chan, Yi Tay, Yew-Soon Ong, Aston Zhang

Abstract: This paper demonstrates a fatal vulnerability in natural language inference (NLI) and text classification systems. More concretely, we present a 'backdoor poisoning' attack on NLP models. Our poisoning attack uses a conditional adversarially regularized autoencoder (CARA) to generate poisoned training samples by poison injection in latent space. Just by adding 1% poisoned data, our experiments show that a fine-tuned BERT victim classifier's predictions can be steered to the poison target class with success rates of >80% when the input hypothesis is injected with the poison signature, demonstrating that NLI and text classification systems face a huge security risk.

* Accepted in EMNLP-Findings 2020, Camera Ready Version

Player Identification in Hockey Broadcast Videos

Sep 14, 2020

Alvin Chan, Martin D. Levine, Mehrsan Javan

Abstract: We present a deep recurrent convolutional neural network (CNN) approach to solve the problem of hockey player identification in NHL broadcast videos. Player identification is a difficult computer vision problem mainly because of the players' similar appearance, occlusion, and blurry facial and physical features. However, we can observe players' jersey numbers over time by processing variable-length image sequences of players (aka 'tracklets'). We propose an end-to-end trainable ResNet+LSTM network, with a residual network (ResNet) base and a long short-term memory (LSTM) layer, to discover spatio-temporal features of jersey numbers over time and learn long-term dependencies. For this work, we created a new hockey player tracklet dataset that contains sequences of hockey player bounding boxes. Additionally, we employ a secondary 1-dimensional convolutional neural network classifier as a late score-level fusion method to classify the output of the ResNet+LSTM network. This achieves an overall player identification accuracy of over 87% on the test split of our new dataset.

* Volume 165, 1 March 2021, 113891
