Abstract: Early work on narrative modeling used explicit plans and goals to generate stories, but the language generation itself was restricted and inflexible. Modern methods use language models for more robust generation, but often lack an explicit representation of the scaffolding and dynamics that guide a coherent narrative. This paper introduces a new model that integrates explicit narrative structure with neural language models, formalizing narrative modeling as a Switching Linear Dynamical System (SLDS). An SLDS is a dynamical system in which the latent dynamics (i.e., how the state vector transforms over time) are controlled by top-level discrete switching variables. The switching variables represent narrative structure (e.g., sentiment or discourse states), while the latent state vector encodes information about the current state of the narrative. This probabilistic formulation allows us to control generation, and the model can be learned in a semi-supervised fashion from both labeled and unlabeled data. Additionally, we derive a Gibbs sampler for our model that can fill in arbitrary parts of the narrative, guided by the switching variables. Our filled-in (English-language) narratives outperform several baselines on both automatic and human evaluations.
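To make the generative process concrete, here is a minimal sketch of an SLDS rollout, assuming Gaussian noise and a Markov chain over the switching variables; all sizes and parameter names (K, D, A, b, pi) are illustrative, and the neural decoder that would map each latent state to a sentence is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, T = 3, 8, 5  # switching states, latent dimension, narrative length (illustrative)

pi = np.full((K, K), 1.0 / K)               # Markov transitions over switching states
A = rng.normal(scale=0.1, size=(K, D, D))   # per-state dynamics matrices
b = rng.normal(scale=0.1, size=(K, D))      # per-state offsets

s = rng.integers(K)        # initial switching variable (e.g., a sentiment state)
z = rng.normal(size=D)     # initial latent narrative state
for t in range(T):
    s = rng.choice(K, p=pi[s])                            # next discrete narrative state
    z = A[s] @ z + b[s] + rng.normal(scale=0.1, size=D)   # state-dependent linear dynamics
    # In the full model, a neural language model would decode sentence t
    # conditioned on z; here we only report the latent trajectory.
    print(f"t={t}, switch={s}, ||z||={np.linalg.norm(z):.2f}")
```

Because each switching variable selects its own linear dynamics, clamping the switching sequence is what makes controllable generation (and the structure-guided Gibbs infilling) possible.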
Abstract: Scripts define knowledge about how everyday scenarios (such as going to a restaurant) are expected to unfold. One challenge in learning scripts is the hierarchical nature of the knowledge: for example, an arrested suspect might plead innocent or guilty, and a very different sequence of events is then expected to follow. To capture this type of information, we propose an autoencoder model whose latent space is defined by a hierarchy of categorical variables. We utilize a recently proposed vector-quantization-based approach, which associates a continuous embedding with each latent variable value. This lets the decoder softly decide which portions of the latent hierarchy to condition on by attending over the value embeddings for a given setting. Our model effectively encodes and generates scripts, outperforming a recent language-model-based method on several standard tasks and achieving substantially lower perplexity.
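As a point of reference, below is a minimal sketch of the vector-quantization step that associates a continuous embedding with each categorical latent value, in the style of VQ-VAE; the codebook sizes and the `quantize` helper are illustrative, and the hierarchical structure (one codebook per level) and the attention over value embeddings are not shown:

```python
import torch

def quantize(h, codebook):
    """Map encoder outputs h (B, D) to their nearest codebook entries (V, D)."""
    # Squared distances between each encoding and every codebook vector
    d = (h.pow(2).sum(1, keepdim=True)
         - 2 * h @ codebook.t()
         + codebook.pow(2).sum(1))
    idx = d.argmin(dim=1)    # discrete latent value chosen for each example
    q = codebook[idx]        # continuous embedding associated with that value
    # Straight-through estimator: the backward pass treats quantization
    # as the identity so gradients reach the encoder.
    q_st = h + (q - h).detach()
    return q_st, idx

codebook = torch.nn.Parameter(torch.randn(16, 64))  # 16 values, 64-dim embeddings (illustrative)
h = torch.randn(4, 64)
q, idx = quantize(h, codebook)
```

In a hierarchical variant, one such codebook per level yields a tree of discrete values whose value embeddings the decoder can attend over.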
Abstract: Seq2Seq-based neural architectures have become the go-to choice for sequence-to-sequence language tasks. Despite their excellent performance on these tasks, recent work has noted that such models usually do not fully capture the linguistic structure required to generalize beyond the dense sections of the data distribution \cite{ettinger2017towards}, and as such are likely to fail on samples from the tail of the distribution (such as inputs that are noisy \citep{belkinovnmtbreak} or of different lengths \citep{bentivoglinmtlength}). In this paper, we examine a model's ability to generalize on a simple symbol rewriting task with a clearly defined structure. We find that the model's ability to generalize this structure beyond the training distribution depends greatly on the chosen random seed, even when performance on the standard test set remains the same. This suggests that a model's ability to capture generalizable structure is highly sensitive to initialization, and that this sensitivity may not be apparent from standard test-set evaluation.
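For concreteness, here is one hypothetical instance of a symbol-rewriting task of the kind described (not necessarily the exact grammar used in the paper): each input symbol must be rewritten, in order, to a token drawn from that symbol's private output vocabulary, so the underlying structure is simple to state but must be induced from data:

```python
import random

# Hypothetical grammar: symbol Xi rewrites to any token from {xi0, xi1, xi2}
rules = {f"X{i}": [f"x{i}{j}" for j in range(3)] for i in range(5)}

def sample_pair(length, rng=random):
    src = [rng.choice(list(rules)) for _ in range(length)]
    tgt = [rng.choice(rules[s]) for s in src]  # one rewrite per symbol, order preserved
    return " ".join(src), " ".join(tgt)

print(sample_pair(3))   # in-distribution length, as seen in training
print(sample_pair(10))  # longer input probing out-of-distribution generalization
```

A model that has captured the underlying rule generalizes to the longer inputs; one that has merely fit the training distribution need not, even when both achieve identical accuracy on a standard test set.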
Abstract: Attention-based neural abstractive summarization systems equipped with copy mechanisms have shown promising results. Despite this success, it has been observed that such systems generate summaries mostly, if not entirely, by copying over phrases, sentences, and sometimes multiple consecutive sentences from the input paragraph, effectively performing extractive summarization. In this paper, we verify this behavior using a state-of-the-art neural abstractive summarization system, the pointer-generator network. We propose a simple baseline method that allows the amount of copying to be controlled without retraining. Experiments indicate that the method provides a strong baseline for abstractive systems seeking high ROUGE scores with minimal overlap with the source article: it substantially reduces n-gram overlap with the original article while staying within 2 points of the original model's ROUGE score.
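The abstract does not spell out the control method, so the following is only an assumed illustration of one decode-time way to discourage copying without retraining; the `rescore` helper, the `penalty` parameter, and the span-tracking convention are all hypothetical:

```python
import math

def rescore(logprob, token, source_tokens, copied_run, penalty=2.0):
    """Hypothetical decode-time rescoring: reduce the log probability of a
    candidate token that appears in the source article, scaled by the
    length of the hypothesis's current run of source-copied tokens."""
    if token in source_tokens:
        return logprob - penalty * math.log1p(copied_run)
    return logprob

# During beam search, each hypothesis would track copied_run (reset to 0
# whenever a non-source token is emitted) and apply rescore() to every
# candidate expansion.
```

Raising `penalty` would trade ROUGE for abstractiveness, mirroring the controllable copy/ROUGE trade-off the abstract reports.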