Title: Quantifying the vanishing gradient and long distance dependency problem in recursive neural networks and recursive LSTMs

Mar 01, 2016

Phong Le, Willem Zuidema

Abstract: Recursive neural networks (RNNs) and their recently proposed extension, recursive long short-term memory networks (RLSTMs), are models that compute representations for sentences by recursively combining word embeddings according to an externally provided parse tree. Both models thus, unlike recurrent networks, explicitly make use of the hierarchical structure of a sentence. In this paper, we demonstrate that RNNs nevertheless suffer from the vanishing gradient and long-distance dependency problems, and that RLSTMs greatly improve over RNNs on these problems. We present an artificial learning task that allows us to quantify the severity of these problems for both models. We further show that a ratio of gradients (at the root node and a focal leaf node) is highly indicative of the success of backpropagation at optimizing the relevant weights low in the tree. This paper thus provides an explanation for existing, superior results of RLSTMs on tasks such as sentiment analysis, and suggests that the benefits of including hierarchical structure and of including LSTM-style gating are complementary.
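The gradient ratio mentioned in the abstract can be illustrated with a small sketch. This is not the authors' experimental setup: it assumes a toy tanh recursive network with no gating, a focal leaf that always sits on the left branch, random sibling vectors, and weight matrices rescaled to a fixed Frobenius norm. The ratio is taken here as the gradient norm at the focal leaf divided by the gradient norm at the root, which is one possible reading of the abstract. The names `gradient_ratio`, `frobenius_scaled_matrix`, and the parameters `dim`, `gain`, and `seed` are hypothetical.

```python
import math
import random


def frobenius_scaled_matrix(rng, dim, gain):
    """Random dim x dim matrix rescaled so its Frobenius norm equals `gain`."""
    m = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    fro = math.sqrt(sum(x * x for row in m for x in row))
    return [[gain * x / fro for x in row] for row in m]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def gradient_ratio(depth, dim=8, gain=0.9, seed=0):
    """Return ||grad at focal leaf|| / ||grad at root|| for a toy recursive net.

    Assumption: every parent is tanh(W_left @ left + W_right @ right), and the
    focal leaf lies `depth` compositions below the root on the left branch.
    """
    rng = random.Random(seed)
    w_left = frobenius_scaled_matrix(rng, dim, gain)
    w_right = frobenius_scaled_matrix(rng, dim, gain)

    # Forward pass: climb from the focal leaf to the root, storing activations.
    h = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    activations = []
    for _ in range(depth):
        sibling = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        pre = [
            sum(w_left[i][j] * h[j] + w_right[i][j] * sibling[j] for j in range(dim))
            for i in range(dim)
        ]
        h = [math.tanh(p) for p in pre]
        activations.append(h)

    # Backward pass: push an arbitrary root gradient down the left branch.
    grad_root = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    grad = grad_root
    for act in reversed(activations):
        # Multiply by the tanh derivative, then by W_left transposed.
        delta = [g * (1.0 - a * a) for g, a in zip(grad, act)]
        grad = [sum(w_left[i][j] * delta[i] for i in range(dim)) for j in range(dim)]

    return norm(grad) / norm(grad_root)


if __name__ == "__main__":
    for d in (0, 1, 5, 10, 20):
        print(d, gradient_ratio(d))
```

Because the spectral norm of a matrix never exceeds its Frobenius norm and the tanh derivative is at most 1, the ratio in this sketch is bounded above by `gain ** depth`, so with `gain` below 1 it must shrink as the focal leaf moves deeper. A gated model such as an RLSTM is not covered by this bound, and the sketch says nothing about how such a model behaves.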
