Abstract:Supernet training of LLMs is of great interest in industrial applications as it confers the ability to produce a palette of smaller models at constant cost, regardless of the number of models (of different size / latency) produced. We propose a new method called Multistage Low-rank Fine-tuning of Super-transformers (MLFS) for parameter-efficient supernet training. We show that it is possible to obtain high-quality encoder models that are suitable for commercial edge applications, and that while decoder-only models are resistant to a comparable degree of compression, decoders can be effectively sliced for a significant reduction in training time.
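The low-rank fine-tuning referenced in this abstract can be pictured with a standard LoRA-style update, where a frozen base weight is augmented with a small trainable low-rank correction. The sketch below is an illustrative assumption, not the paper's MLFS implementation; the module name, rank, and scaling are placeholders.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (LoRA-style sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors are trained
        # A and B parameterize a rank-`rank` correction to the frozen weight.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Training only A and B keeps the number of updated parameters small, which is the general motivation behind parameter-efficient fine-tuning schemes of this kind.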
Abstract:Few-shot classifiers have been shown to exhibit promising results in use cases where user-provided labels are scarce. These models are able to learn to predict novel classes simply by training on a non-overlapping set of classes, which can be largely attributed to the differences in their mechanisms compared to conventional deep networks. However, these differences also open new opportunities for attackers to mount integrity attacks against such models that are not present in other machine learning setups. In this work, we aim to close this gap by studying a conceptually simple approach to defending few-shot classifiers against adversarial attacks. More specifically, we propose a simple attack-agnostic detection method that uses the concepts of self-similarity and filtering to flag adversarial support sets, which destroy a victim classifier's understanding of a certain class. Our extended evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance across three different few-shot classifiers and across different attack strengths, beating baselines. These results establish our approach as a strong detection method for support set poisoning attacks. We also show that our approach constitutes a generalizable concept, as it can be paired with other filtering functions. Finally, we provide an analysis of our results when we vary two components of our detection approach.
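As a rough illustration of the self-similarity idea (not the authors' exact procedure), a support class can be scored by the average pairwise cosine similarity of its embeddings, and classes whose score falls below a threshold can be flagged. The feature extractor, threshold, and function names below are assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def self_similarity_score(embeddings: torch.Tensor) -> float:
    """Mean pairwise cosine similarity of one class's support embeddings.

    embeddings: (n_shots, dim) tensor produced by any feature extractor.
    """
    n = embeddings.size(0)
    if n < 2:
        return 1.0  # a single-shot class is trivially self-similar
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T                                  # (n, n) cosine similarities
    off_diag = sim[~torch.eye(n, dtype=torch.bool)]
    return off_diag.mean().item()

def flag_adversarial_class(embeddings: torch.Tensor, threshold: float = 0.5) -> bool:
    """Flag a support class whose samples are unusually dissimilar to one another.
    The threshold is an illustrative placeholder, not a value from the paper."""
    return self_similarity_score(embeddings) < threshold
```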
Abstract:Few-shot classifiers excel when training samples are limited, making them useful in real-world applications. However, the advent of adversarial samples threatens the efficacy of such classifiers, and for them to remain reliable, defences against such attacks must be explored. Closer examination of the prior literature reveals a significant gap in this domain. Hence, in this work, we propose a detection strategy to highlight adversarial support sets, which aim to destroy a few-shot classifier's understanding of a certain class of objects. We make use of feature-preserving autoencoder filtering together with the concept of self-similarity of a support set to perform this detection. As such, our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers. Our evaluation on the miniImagenet and CUB datasets is encouraging, with our proposed approach achieving high AUROC scores for detection in general.
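The filtering component can be pictured as passing each support image through an autoencoder and comparing class self-similarity before and after reconstruction. The sketch below uses a toy convolutional autoencoder and a simple difference-based rule; both are hedged stand-ins, not the feature-preserving architecture or the exact detection criterion from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAutoencoder(nn.Module):
    """Toy convolutional autoencoder standing in for the paper's filtering network."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def similarity_shift_after_filtering(embedder, autoencoder, support_images) -> float:
    """Compare mean embedding similarity of a support class before and after filtering.
    A large shift suggests the filter changed the class semantics, which can serve
    as a poisoning signal (illustrative rule, not the paper's exact criterion)."""
    with torch.no_grad():
        z_raw = F.normalize(embedder(support_images), dim=1)
        z_filt = F.normalize(embedder(autoencoder(support_images)), dim=1)
    sim_raw = (z_raw @ z_raw.T).mean().item()
    sim_filt = (z_filt @ z_filt.T).mean().item()
    return abs(sim_raw - sim_filt)
```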
Abstract:Understanding the features that contributed to a prediction is important for high-stakes tasks. In this work, we revisit the idea of a student network that provides an example-based explanation for its prediction in two forms: i) identifying the top-k most relevant prototype examples and ii) showing evidence of similarity between the prediction sample and each of the top-k prototypes. We compare the prediction performance and the explanation performance of the second type of explanation with those of the teacher network. In addition, we evaluate the outlier detection performance of the network. We show that prototype-based students that go beyond similarity kernels deliver meaningful explanations and promising outlier detection results, without compromising classification accuracy.
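The first explanation form, retrieving the top-k most relevant prototypes for a prediction, can be sketched as a nearest-prototype lookup in embedding space. The cosine kernel and function names below are illustrative assumptions rather than the paper's student architecture.

```python
import torch
import torch.nn.functional as F

def top_k_prototypes(query_embedding: torch.Tensor,
                     prototype_embeddings: torch.Tensor,
                     k: int = 5):
    """Return indices and similarity scores of the k prototypes most similar
    to the query sample, using a cosine kernel (an illustrative choice).

    query_embedding: (dim,) embedding of the prediction sample.
    prototype_embeddings: (num_protos, dim) embeddings of the prototype examples.
    """
    q = F.normalize(query_embedding.unsqueeze(0), dim=1)   # (1, dim)
    p = F.normalize(prototype_embeddings, dim=1)            # (num_protos, dim)
    scores = (q @ p.T).squeeze(0)                            # (num_protos,)
    values, indices = torch.topk(scores, k)
    return indices, values
```

The returned prototypes can then be shown alongside the prediction sample as example-based evidence for the decision.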
Abstract:Anomaly detection algorithms find extensive use in various fields, and this area of research has recently made great advances thanks to deep learning. A recent method, the deep Support Vector Data Description (deep SVDD), which is inspired by the classic kernel-based Support Vector Data Description (SVDD), is capable of simultaneously learning a feature representation of the data and a data-enclosing hypersphere. The method has shown promising results in both unsupervised and semi-supervised settings. However, deep SVDD suffers from hypersphere collapse (also known as mode collapse) if the architecture of the model does not comply with certain architectural constraints, e.g. the removal of bias terms. These constraints limit the adaptability of the model and, in some cases, may hurt performance because sub-optimal features are learned. In this work, we consider two regularizers to prevent hypersphere collapse in deep SVDD. The first regularizer is based on injecting random noise via the standard cross-entropy loss. The second regularizer penalizes the minibatch variance when it becomes too small. Moreover, we introduce an adaptive weighting scheme to control the amount of penalization between the SVDD loss and the respective regularizer. Our proposed regularized variants of deep SVDD show encouraging results and outperform a prominent state-of-the-art method on a setup where the anomalies have no apparent geometrical structure.
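The second regularizer can be pictured as a hinge penalty added to the standard deep SVDD objective whenever the per-dimension minibatch variance of the embeddings falls below a margin. The sketch below uses a fixed weight and margin as placeholder assumptions; it does not reproduce the paper's adaptive weighting scheme.

```python
import torch

def deep_svdd_loss_with_variance_reg(embeddings: torch.Tensor,
                                     center: torch.Tensor,
                                     margin: float = 1.0,
                                     weight: float = 0.1) -> torch.Tensor:
    """Deep SVDD distance loss plus a penalty that activates when the minibatch
    variance of the embeddings becomes too small (illustrative sketch).

    embeddings: (batch, dim) network outputs; center: (dim,) hypersphere center.
    """
    # Standard deep SVDD term: mean squared distance to the hypersphere center.
    svdd_loss = ((embeddings - center) ** 2).sum(dim=1).mean()
    # Hinge penalty on per-dimension minibatch variance: zero once the variance
    # exceeds the margin, discouraging hypersphere (mode) collapse.
    var = embeddings.var(dim=0, unbiased=False)
    variance_penalty = torch.relu(margin - var).mean()
    return svdd_loss + weight * variance_penalty
```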