Denny Britz

JESC: Japanese-English Subtitle Corpus

Feb 21, 2018

Reid Pryzant, Yongjoo Chung, Dan Jurafsky, Denny Britz

Abstract: In this paper we describe the Japanese-English Subtitle Corpus (JESC). JESC is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It consists of more than 3.2 million examples, making it the largest freely available dataset of its kind. The corpus was assembled by crawling and aligning subtitles found on the web. The assembly process incorporates a number of novel preprocessing elements to ensure high monolingual fluency and accurate bilingual alignments. We summarize its contents and evaluate its quality using human experts and baseline machine translation (MT) systems.

* To appear at LREC 2018. Project website updated

Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models

Jul 31, 2017

Louis Shao, Stephan Gouws, Denny Britz, Anna Goldie, Brian Strope, Ray Kurzweil

Abstract: Sequence-to-sequence models have been applied to the conversation response generation problem where the source sequence is the conversation history and the target sequence is the response. Unlike translation, conversation responding is inherently creative. The generation of long, informative, coherent, and diverse responses remains a hard task. In this work, we focus on the single turn setting. We add self-attention to the decoder to maintain coherence in longer responses, and we propose a practical approach, called the glimpse-model, for scaling to large datasets. We introduce a stochastic beam-search algorithm with segment-by-segment reranking which lets us inject diversity earlier in the generation process. We trained on a combined data set of over 2.3B conversation messages mined from the web. In human evaluation studies, our method produces longer responses overall, with a higher proportion rated as acceptable and excellent as length increases, compared to baseline sequence-to-sequence models with explicit length-promotion. A back-off strategy produces better responses overall, in the full spectrum of lengths.

* To appear in EMNLP 2017

Efficient Attention using a Fixed-Size Memory Representation

Jul 01, 2017

Denny Britz, Melody Y. Guan, Minh-Thang Luong

Abstract: The standard content-based attention mechanism typically used in sequence-to-sequence models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step. In this work, we propose an alternative attention mechanism based on a fixed-size memory representation that is more efficient. Our technique predicts a compact set of K attention contexts during encoding and lets the decoder compute an efficient lookup that does not need to consult the memory. We show that our approach performs on par with the standard attention mechanism while yielding inference speedups of 20% for real-world translation tasks and more for tasks with longer sequences. By visualizing attention scores we demonstrate that our models learn distinct, meaningful alignments.

* EMNLP 2017
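
The abstract above describes the mechanism only at a high level. The following is a minimal sketch of one way such a scheme could look, not the authors' implementation. The encoder states are compressed once into K context vectors, and each decoder step then mixes those K vectors without revisiting the per-token encoder states. The function names, the projection matrices `W_enc` and `W_dec`, and the choice to normalize each slot's scores over source positions are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def encode_memory(encoder_states, W_enc):
    """Compress S encoder states (each of size D) into K context vectors.

    W_enc is a hypothetical K x D projection that scores every source
    position for every memory slot.
    """
    K = len(W_enc)
    D = len(encoder_states[0])
    # scores[s][k]: how strongly source position s contributes to slot k.
    scores = [matvec(W_enc, h) for h in encoder_states]
    memory = []
    for k in range(K):
        # Assumption of this sketch: normalize slot k over source positions.
        weights = softmax([row[k] for row in scores])
        memory.append(
            [sum(w * h[d] for w, h in zip(weights, encoder_states)) for d in range(D)]
        )
    return memory


def decode_lookup(decoder_state, memory, W_dec):
    """Mix the K contexts using only the decoder state.

    W_dec is a hypothetical K x D projection. The cost is O(K * D) per
    decoder step, independent of the source length S.
    """
    slot_weights = softmax(matvec(W_dec, decoder_state))
    D = len(memory[0])
    return [sum(b * c[d] for b, c in zip(slot_weights, memory)) for d in range(D)]


if __name__ == "__main__":
    random.seed(0)
    S, D, K = 7, 4, 3  # source length, state size, number of memory slots
    enc = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(S)]
    W_enc = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
    W_dec = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
    dec = [random.uniform(-1, 1) for _ in range(D)]

    memory = encode_memory(enc, W_enc)        # computed once per source sentence
    context = decode_lookup(dec, memory, W_dec)  # computed once per decoder step
    print(len(memory), len(context))
```

In this sketch the per-step work depends on K rather than on the source length, which is the property the abstract credits for the inference speedup on longer sequences. Standard content-based attention would instead score the decoder state against all S encoder states at every step.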

Massive Exploration of Neural Machine Translation Architectures

Mar 21, 2017

Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le

Abstract: Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users. One major drawback of current architectures is that they are expensive to train, typically requiring days to weeks of GPU time to converge. This makes exhaustive hyperparameter search, as is commonly done with other neural network architectures, prohibitively expensive. In this work, we present the first large-scale analysis of NMT architecture hyperparameters. We report empirical results and variance numbers for several hundred experimental runs, corresponding to over 250,000 GPU hours on the standard WMT English-to-German translation task. Our experiments lead to novel insights and practical advice for building and extending NMT architectures. As part of this contribution, we release an open-source NMT framework that enables researchers to easily experiment with novel techniques and reproduce state-of-the-art results.

* 9 pages, 2 figures, 8 tables, submitted to ACL 2017, open source code at https://github.com/google/seq2seq/
