Ning Miao

SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning

Aug 02, 2023

Ning Miao, Yee Whye Teh, Tom Rainforth

Abstract: The recent progress in large language models (LLMs), especially the invention of chain-of-thoughts (CoT) prompting, makes it possible to solve reasoning problems. However, even the strongest LLMs are still struggling with more complicated problems that require non-linear thinking and multi-step reasoning. In this work, we explore whether LLMs have the ability to recognize their own errors, without resorting to external resources. In particular, we investigate whether they can be used to identify individual errors within step-by-step reasoning. To this end, we propose a zero-shot verification scheme to recognize such errors. We then use this verification scheme to improve question-answering performance, by using it to perform weighted voting on different generated answers. We test the method on three math datasets (GSM8K, MathQA, and MATH) and find that it successfully recognizes errors and, in turn, increases final predictive performance.
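
The weighted-voting step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how confidence scores are produced or combined, so the `(answer, confidence)` pairs and the function name `weighted_vote` are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Return the answer whose candidates have the largest total confidence.

    candidates: list of (answer, confidence) pairs, where confidence is a
    hypothetical verification score in [0, 1] for one generated solution.
    """
    totals = defaultdict(float)
    for answer, confidence in candidates:
        totals[answer] += confidence
    # Pick the answer with the highest summed confidence.
    return max(totals, key=totals.get)


# Two low-confidence solutions agree on "20"; one high-confidence solution says "18".
candidates = [("18", 0.9), ("20", 0.2), ("20", 0.3)]
print(weighted_vote(candidates))
```

In this toy input a plain majority vote would pick "20", while the confidence-weighted vote picks "18" because its single supporting solution outweighs the other two combined.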

Side Channel-Assisted Inference Leakage from Machine Learning-based ECG Classification

Apr 04, 2023

Jialin Liu, Ning Miao, Chongzhou Fang, Houman Homayoun, Han Wang

Abstract: The Electrocardiogram (ECG) measures the electrical cardiac activity generated by the heart to detect abnormal heartbeat and heart attack. However, the irregular occurrence of the abnormalities demands continuous monitoring of heartbeats. Machine learning techniques are leveraged to automate the task to reduce labor work needed during monitoring. In recent years, many companies have launched products with ECG monitoring and irregular heartbeat alert. Among all classification algorithms, the time series-based algorithm dynamic time warping (DTW) is widely adopted to undertake the ECG classification task. Though progress has been achieved, the DTW-based ECG classification also brings a new attacking vector of leaking the patients' diagnosis results. This paper shows that the ECG input samples' labels can be stolen via a side-channel attack, Flush+Reload. In particular, we first identify the vulnerability of DTW for ECG classification, i.e., the correlation between warping path choice and prediction results. Then we implement an attack that leverages Flush+Reload to monitor the warping path selection with known ECG data and then build a predictor for constructing the relation between warping path selection and labels of input ECG samples. Based on experiments, we find that the Flush+Reload-based inference leakage can achieve an 84.0% attacking success rate to identify the labels of the two samples in DTW.
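
To make the "warping path" concrete, here is a minimal textbook DTW in Python that returns both the distance and the alignment path. It is a sketch of the standard algorithm, not the classifier studied in the paper; the function name `dtw_path` and the absolute-difference cost are assumptions. The point it illustrates is that which of the three predecessor cells is chosen at each step depends on the input data.

```python
def dtw_path(a, b):
    """Return (distance, path) for classic dynamic time warping of a and b.

    path is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end; each step picks the cheapest predecessor,
    # so the sequence of choices is data-dependent.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        moves = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (cost[i - 1][j], i - 1, j),          # step in a only
            (cost[i][j - 1], i, j - 1),          # step in b only
        ]
        _, i, j = min(moves)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


distance, path = dtw_path([0, 1, 2], [0, 1, 1, 2])
print(distance, path)
```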

Learning Instance-Specific Data Augmentations

May 31, 2022

Ning Miao, Emile Mathieu, Yann Dubois, Tom Rainforth, Yee Whye Teh, Adam Foster, Hyunjik Kim

Abstract: Existing data augmentation methods typically assume independence between transformations and inputs: they use the same transformation distribution for all input instances. We explain why this can be problematic and propose InstaAug, a method for automatically learning input-specific augmentations from data. This is achieved by introducing an augmentation module that maps an input to a distribution over transformations. This is simultaneously trained alongside the base model in a fully end-to-end manner using only the training data. We empirically demonstrate that InstaAug learns meaningful augmentations for a wide range of transformation classes, which in turn provides better performance on supervised and self-supervised tasks compared with augmentations that assume input-transformation independence.

InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents

Jun 25, 2021

Ning Miao, Emile Mathieu, N. Siddharth, Yee Whye Teh, Tom Rainforth

Abstract: We introduce a simple and effective method for learning VAEs with controllable inductive biases by using an intermediary set of latent variables. This allows us to overcome the limitations of the standard Gaussian prior assumption. In particular, it allows us to impose desired properties like sparsity or clustering on learned representations, and incorporate prior information into the learned model. Our approach, which we refer to as the Intermediary Latent Space VAE (InteL-VAE), is based around controlling the stochasticity of the encoding process with the intermediary latent variables, before deterministically mapping them forward to our target latent representation, from which reconstruction is performed. This allows us to maintain all the advantages of the traditional VAE framework, while incorporating desired prior information, inductive biases, and even topological information through the latent mapping. We show that this, in turn, allows InteL-VAEs to learn both better generative models and representations.

Generating Fluent Adversarial Examples for Natural Languages

Jul 13, 2020

Huangzhao Zhang, Hao Zhou, Ning Miao, Lei Li

Abstract: Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. Firstly, as the sentence space is discrete, it is difficult to make small perturbations along the direction of gradients. Secondly, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of gradients. Experiments on IMDB and SNLI show that our proposed MHA outperforms the baseline model on attacking capability. Adversarial training with MHA also leads to better robustness and performance.

* Accepted by ACL 2019

Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods

Jul 13, 2020

Ning Miao, Yuxuan Song, Hao Zhou, Lei Li

Abstract: It has been a common approach to pre-train a language model on a large corpus and fine-tune it on task-specific data. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor, a novel method to alleviate the above issue in text generation tasks by truncating and transferring the probability mass from over-estimated regions to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the fine-tuning approach. Our code is available at this url.

* Accepted by ACL 2020

Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

Jul 12, 2020

Yuxuan Song, Ning Miao, Hao Zhou, Lantao Yu, Mingxuan Wang, Lei Li

Abstract: Auto-regressive sequence generative models trained by Maximum Likelihood Estimation suffer from the exposure bias problem in practical finite sample scenarios. The crux is that the number of training samples for Maximum Likelihood Estimation is usually limited and the input data distributions are different at training and inference stages. Many methods have been proposed to solve the above problem (Yu et al., 2017; Lu et al., 2018), which rely on sampling from the non-stationary model distribution and suffer from high variance or biased estimations. In this paper, we propose ψ-MLE, a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating at the large sample space encountered in text generation. We derive our algorithm from a new perspective of self-augmentation and introduce bias correction with density ratio estimation. Extensive experimental results on synthetic data and real-world text generation tasks demonstrate that our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.

* Accepted to International Conference on Artificial Intelligence and Statistics 2020

Kernelized Bayesian Softmax for Text Generation

Nov 01, 2019

Ning Miao, Hao Zhou, Chengqi Zhao, Wenxian Shi, Lei Li

Abstract: Neural models for text generation require a softmax layer with proper token embeddings during the decoding phase. Most existing approaches adopt a single point embedding for each token. However, a word may have multiple senses depending on the context, some of which might be distinct. In this paper, we propose KerBS, a novel approach for learning better embeddings for text generation. KerBS embodies two advantages: (a) it employs a Bayesian composition of embeddings for words with multiple senses; (b) it is adaptive to semantic variances of words and robust to rare sentence contexts by imposing learned kernels to capture the closeness of words (senses) in the embedding space. Empirical studies show that KerBS significantly boosts the performance of several text generation tasks.

Fixing Gaussian Mixture VAEs for Interpretable Text Generation

Jun 16, 2019

Wenxian Shi, Hao Zhou, Ning Miao, Shenjian Zhao, Lei Li

Abstract: Variational auto-encoder (VAE) with Gaussian priors is effective in text generation. To improve the controllability and interpretability, we propose to use Gaussian mixture distribution as the prior for VAE (GMVAE), since it includes an extra discrete latent variable in addition to the continuous one. Unfortunately, training GMVAE using standard variational approximation often leads to the mode-collapse problem. We theoretically analyze the root cause: maximizing the evidence lower bound of GMVAE implicitly aggregates the means of multiple Gaussian priors. We propose Dispersed-GMVAE (DGMVAE), an improved model for text generation. It introduces two extra terms to alleviate mode-collapse and to induce a better structured latent space. Experimental results show that DGMVAE outperforms strong baselines in several language modeling and text generation benchmarks.

CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling

Nov 14, 2018

Ning Miao, Hao Zhou, Lili Mou, Rui Yan, Lei Li

Abstract: In real-world applications of natural language generation, there are often constraints on the target sentences in addition to fluency and naturalness requirements. Existing language generation techniques are usually based on recurrent neural networks (RNNs). However, it is non-trivial to impose constraints on RNNs while maintaining generation quality, since RNNs generate sentences sequentially (or with beam search) from the first word to the last. In this paper, we propose CGMH, a novel approach using Metropolis-Hastings sampling for constrained sentence generation. CGMH allows complicated constraints such as the occurrence of multiple keywords in the target sentences, which cannot be handled in traditional RNN-based approaches. Moreover, CGMH works in the inference stage, and does not require parallel corpora for training. We evaluate our method on a variety of tasks, including keywords-to-sentence generation, unsupervised sentence paraphrasing, and unsupervised sentence error correction. CGMH achieves high performance compared with previous supervised methods for sentence generation. Our code is released at https://github.com/NingMiao/CGMH

* AAAI 2019
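
Metropolis-Hastings sampling, on which CGMH is built, can be sketched on a toy discrete target. This is a generic illustration, not the CGMH code: the states, the weights, the uniform proposal, and the names `mh_accept_prob` and `mh_sample` are all assumptions. In CGMH the states would be sentences and the proposals word-level edits; here they are just two labels.

```python
import random


def mh_accept_prob(p_cur, p_new, q_forward, q_backward):
    """Metropolis-Hastings acceptance probability.

    p_cur, p_new: unnormalized target weights of current and proposed state.
    q_forward: probability of proposing the new state from the current one.
    q_backward: probability of proposing the current state from the new one.
    """
    return min(1.0, (p_new * q_backward) / (p_cur * q_forward))


def mh_sample(weights, steps, seed=0):
    """Run a chain over the keys of `weights` with a uniform proposal."""
    rng = random.Random(seed)
    states = list(weights)
    q = 1.0 / len(states)  # uniform proposal, identical in both directions
    current = states[0]
    samples = []
    for _ in range(steps):
        proposal = rng.choice(states)
        accept = mh_accept_prob(weights[current], weights[proposal], q, q)
        if rng.random() < accept:
            current = proposal
        samples.append(current)
    return samples


samples = mh_sample({"a": 1.0, "b": 3.0}, steps=20000)
print(samples.count("b") / len(samples))  # should be close to 3 / (1 + 3)
```

Only ratios of target weights enter the acceptance rule, so the target never has to be normalized, which is what makes this family of samplers practical over a space as large as sentences.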
