Jianqiao Li

Improving Text Generation with Student-Forcing Optimal Transport

Oct 12, 2020

Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang (+3 more)

Abstract: Neural language models are often trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. During testing, however, the model is instead conditioned on previously generated tokens, resulting in what is termed exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.

* To appear at EMNLP 2020
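
The idea of matching two sequences with optimal transport, as described in the abstract above, can be illustrated with a small sketch. This is not the authors' implementation: it uses Sinkhorn iterations (a standard entropy-regularized OT solver, which may differ from the solver used in the paper), a cosine cost, and random vectors standing in for token embeddings. The names `cosine_cost`, `sinkhorn_ot`, `teacher`, and `student` are hypothetical.

```python
import numpy as np

def cosine_cost(x, y, eps=1e-8):
    """Pairwise cosine distance between rows of x (n, d) and y (m, d)."""
    x_n = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    y_n = y / (np.linalg.norm(y, axis=1, keepdims=True) + eps)
    return 1.0 - x_n @ y_n.T

def sinkhorn_ot(cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT with uniform marginals.

    Returns the transport plan and the transport cost <plan, cost>.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform mass over tokens of sequence 1
    b = np.full(m, 1.0 / m)      # uniform mass over tokens of sequence 2
    K = np.exp(-cost / reg)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows toward marginal a
        v = b / (K.T @ u)        # rescale columns toward marginal b
    plan = u[:, None] * K * v[None, :]
    return plan, float(np.sum(plan * cost))

# Assumption: random vectors stand in for token embeddings of a sequence
# decoded with ground-truth context (teacher) and one decoded from the
# model's own previous outputs (student).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(5, 8))
student = rng.normal(size=(6, 8))

plan, dist = sinkhorn_ot(cosine_cost(teacher, student))
_, self_dist = sinkhorn_ot(cosine_cost(teacher, teacher))
print(plan.shape, dist, self_dist)
```

In a training setup, a differentiable version of `dist` would be added to the MLE loss as a sequence-level regularizer; the sequences may have different lengths because OT only requires the two marginals to carry equal total mass.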

GO Hessian for Expectation-Based Objectives

Jun 16, 2020

Yulai Cong, Miaoyun Zhao, Jianqiao Li, Junya Chen, Lawrence Carin

Abstract: An unbiased low-variance gradient estimator, termed GO gradient, was proposed recently for expectation-based objectives $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})]$, where the random variable (RV) $\boldsymbol{y}$ may be drawn from a stochastic computation graph with continuous (non-reparameterizable) internal nodes and continuous/discrete leaves. Upgrading the GO gradient, we present for $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})]$ an unbiased low-variance Hessian estimator, named GO Hessian. Considering practical implementation, we reveal that GO Hessian is easy to use with auto-differentiation and Hessian-vector products, enabling efficient, low-cost exploitation of curvature information over stochastic computation graphs. As representative examples, we present the GO Hessian for non-reparameterizable gamma and negative binomial RVs/nodes. Based on the GO Hessian, we design a new second-order method for $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})]$, with rigorous experiments conducted to verify its effectiveness and efficiency.
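
For context on why Hessian-vector products keep curvature information cheap, the generic identity below may help. It is the standard auto-differentiation trick, not the GO Hessian estimator itself; the symbols $\mathcal{L}$ and $\boldsymbol{v}$ are introduced here for illustration, with $\boldsymbol{v}$ a fixed vector that does not depend on $\boldsymbol{\gamma}$ and $\mathcal{L}$ assumed twice differentiable.

$$
\mathcal{L}(\boldsymbol{\gamma}) = \mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})],
\qquad
\nabla_{\boldsymbol{\gamma}}^{2} \mathcal{L}(\boldsymbol{\gamma})\, \boldsymbol{v}
= \nabla_{\boldsymbol{\gamma}} \left[ \left( \nabla_{\boldsymbol{\gamma}} \mathcal{L}(\boldsymbol{\gamma}) \right)^{\top} \boldsymbol{v} \right]
$$

The right-hand side needs only two gradient passes and never forms the full Hessian matrix, so the cost scales with the number of parameters rather than its square.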

GAN Memory with No Forgetting

Jun 13, 2020

Yulai Cong, Miaoyun Zhao, Jianqiao Li, Sijia Wang, Lawrence Carin

Abstract: Seeking to address the fundamental issue of memory in lifelong learning, we propose a GAN memory that is capable of realistically remembering a stream of generative processes with no forgetting. Our GAN memory is based on recognizing that one can modulate the "style" of a GAN model to form perceptually-distant targeted generation. Accordingly, we propose to do sequential style modulations atop a well-behaved base GAN model, to form sequential targeted generative models, while simultaneously benefiting from the transferred base knowledge. Experiments demonstrate the superiority of our method over existing approaches and its effectiveness in alleviating catastrophic forgetting for lifelong classification problems.

Adversarial Learning of a Sampler Based on an Unnormalized Distribution

Jan 03, 2019

Chunyuan Li, Ke Bai, Jianqiao Li, Guoyin Wang, Changyou Chen, Lawrence Carin

Abstract: We investigate adversarial learning in the case when only an unnormalized form of the density can be accessed, rather than samples. With insights so garnered, adversarial learning is extended to the case for which one has access to an unnormalized form $u(x)$ of the target density function, but no samples. Further, new concepts in GAN regularization are developed, based on learning from samples or from $u(x)$. The proposed method is compared to alternative approaches, with encouraging results demonstrated across a range of applications, including deep soft Q-learning.

* Published in AISTATS 2019; Code: https://github.com/ChunyuanLI/RAS
