Abstract: Open-domain neural dialogue models, despite their successes, are known to produce responses that lack relevance, diversity, and, in many cases, coherence. These shortcomings stem from the limited ability of common training objectives to directly express these properties, as well as from their interplay with training datasets and model architectures. To address these problems, this paper proposes bootstrapping a dialogue response generator with an adversarially trained discriminator. The method trains a neural generator in both autoregressive and traditional teacher-forcing modes, with the maximum-likelihood loss of the autoregressive outputs weighted by the score from a metric-based discriminator model. The discriminator input is a mixture of ground-truth labels, the teacher-forcing outputs of the generator, and distractors sampled from the dataset, allowing for richer feedback on the generator's autoregressive outputs. To improve the calibration of the discriminator output, we also bootstrap the discriminator by matching the intermediate features of the ground truth and the generator's autoregressive output. We explore different sampling and adversarial policy-optimization strategies during training in order to understand how to encourage response diversity without sacrificing relevance. Our experiments show that adversarial bootstrapping is effective at addressing exposure bias, leading to improvements in response relevance and coherence. The improvement is demonstrated by state-of-the-art results on the Movie and Ubuntu dialogue datasets with respect to human evaluations and BLEU, ROUGE, and distinct n-gram scores.
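The core idea of the abstract above, weighting the maximum-likelihood loss on the generator's autoregressive outputs by a discriminator score alongside a standard teacher-forcing loss, can be sketched as follows. This is a minimal, hedged illustration, not the authors' code: the `generator`, `discriminator`, and `generator.sample` interfaces are assumptions made for exposition.

```python
# Minimal sketch (PyTorch) of discriminator-weighted MLE bootstrapping.
# Module names and signatures are hypothetical, chosen only to illustrate
# the training objective described in the abstract.
import torch
import torch.nn.functional as F

def bootstrapped_step(generator, discriminator, context, target, pad_id=0):
    # Teacher-forcing pass: standard cross-entropy against the ground truth.
    tf_logits = generator(context, target_inputs=target)            # (B, T, V)
    tf_loss = F.cross_entropy(
        tf_logits.transpose(1, 2), target, ignore_index=pad_id)

    # Autoregressive pass: the generator samples its own response.
    ar_tokens, ar_logits = generator.sample(context)                 # (B, T), (B, T, V)

    # Discriminator scores the sampled response; higher means more human-like.
    # Detached so only the generator's weighted MLE term gets gradients here.
    with torch.no_grad():
        score = discriminator(context, ar_tokens)                    # (B,)

    # Per-example negative log-likelihood of the sampled tokens, scaled by
    # the discriminator score before averaging over the batch.
    ar_nll = F.cross_entropy(
        ar_logits.transpose(1, 2), ar_tokens,
        ignore_index=pad_id, reduction="none").mean(dim=1)           # (B,)
    ar_loss = (score * ar_nll).mean()

    return tf_loss + ar_loss
```

In this reading, the discriminator acts as a learned reward that upweights likely-looking autoregressive samples, which is how the method targets exposure bias without discarding the maximum-likelihood signal.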
Abstract: Speech processing systems rely on robust feature extraction to handle the phonetic and semantic variations found in natural language. While techniques exist for desensitizing features to common noise patterns produced by Speech-to-Text (STT) and Text-to-Speech (TTS) systems, the question remains how to best leverage state-of-the-art language models (which capture rich semantic features but are trained only on written text) on inputs with ASR errors. In this paper, we present Telephonetic, a data augmentation framework that makes language-model features robust to ASR-corrupted inputs. To capture phonetic alterations, we employ a character-level language model trained using probabilistic masking. Phonetic augmentations are generated in two stages: a TTS encoder (Tacotron 2, WaveGlow) and an STT decoder (DeepSpeech). Similarly, semantic perturbations are produced by sampling nearby words in an embedding space computed with the BERT language model. Words are selected for augmentation according to a hierarchical grammar sampling strategy. Telephonetic is evaluated on the Penn Treebank (PTB) corpus, where it demonstrates its effectiveness as a bootstrapping technique for transferring neural language models to the speech domain. Notably, our language model achieves a test perplexity of 37.49 on PTB, which to our knowledge is state-of-the-art among models trained only on PTB.
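The two augmentation channels described above can be illustrated with a small sketch. The `tts_synthesize` and `stt_transcribe` callables below stand in for the Tacotron 2 + WaveGlow and DeepSpeech stages; their names and signatures are assumptions for illustration, not real library APIs, and the hierarchical grammar strategy for choosing which word to perturb is replaced by a uniform random choice.

```python
# Illustrative sketch of the phonetic (TTS -> STT round trip) and semantic
# (BERT embedding neighbors) augmentations described in the abstract.
import random
import torch
from transformers import BertTokenizer, BertModel

def phonetic_augment(sentence, tts_synthesize, stt_transcribe):
    """Round-trip text through TTS then STT so ASR-style errors are injected."""
    audio = tts_synthesize(sentence)   # text -> waveform (assumed callable)
    return stt_transcribe(audio)       # waveform -> noisy transcript (assumed callable)

def semantic_augment(sentence, k=5):
    """Replace one word with a nearby word in BERT's input-embedding space."""
    tok = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")
    emb = model.get_input_embeddings().weight            # (vocab, hidden)

    ids = tok.encode(sentence, add_special_tokens=False)
    pos = random.randrange(len(ids))                      # word chosen uniformly here
    vec = emb[ids[pos]]
    # Cosine similarity of the chosen token against the full vocabulary.
    sims = torch.nn.functional.cosine_similarity(vec.unsqueeze(0), emb)
    neighbors = sims.topk(k + 1).indices.tolist()[1:]     # skip the word itself
    ids[pos] = random.choice(neighbors)
    return tok.decode(ids)
```

A corrupted copy of each training sentence produced by either channel can then be mixed into the language model's training data, which is the bootstrapping role the framework plays.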