Title: Length bias in Encoder Decoder Models and a Case for Global Conditioning

Sep 21, 2016

Pavel Sountsov, Sunita Sarawagi

Abstract: Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models such as CRFs that typically assumed conditional independence among non-adjacent variables. However, in practice, encoder-decoder models exhibit a bias towards short sequences that, surprisingly, gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full-sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy affects long sequences more adversely, which explains the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates the above problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for beam search during inference: inference reduces to an efficient dot-product-based search in a vector space.
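To make the last point of the abstract concrete, here is a minimal sketch of what a dot-product search over a closed candidate set could look like. It is not the authors' implementation: the function names `dot` and `best_candidate`, the three candidate replies, and the hand-written three-dimensional embeddings are all assumptions made for illustration. In the model the abstract describes, the input and candidate vectors would be produced by learned networks, which this page does not detail.

```python
from typing import Dict, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def best_candidate(
    input_vec: Sequence[float],
    candidate_vecs: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return the candidate whose vector has the largest dot product with the input.

    Each candidate sequence is scored as a whole with a single dot product,
    so no token-by-token beam search is involved.
    """
    if not candidate_vecs:
        raise ValueError("candidate set must not be empty")
    scored = [(dot(input_vec, vec), text) for text, vec in candidate_vecs.items()]
    score, text = max(scored)  # highest score wins; ties broken by text order
    return text, score


# Hypothetical closed set of output sequences with made-up embeddings.
candidates = {
    "ok": [0.9, 0.1, 0.0],
    "sounds good, see you then": [0.2, 0.8, 0.3],
    "sorry, I cannot make it today": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the input sequence.
query = [0.1, 1.0, 0.2]

reply, score = best_candidate(query, candidates)
print(reply, round(score, 2))
```

In this toy setup the score of a candidate depends only on its vector, not on how many tokens it contains, so a longer sequence is not penalized merely for being long. A real system would presumably replace the linear scan with a batched matrix product or a nearest-neighbor index.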
