In many sequential tasks, a model must remember relevant events from the distant past in order to make correct predictions. Unfortunately, a straightforward application of gradient-based training requires the intermediate computations for every element of a sequence to be stored. This demands prohibitively large amounts of memory when a sequence consists of thousands or even millions of elements and, as a result, makes learning very long-term dependencies infeasible. However, the majority of sequence elements can usually be predicted from temporally local information alone, while the predictions that do depend on long-range context are sparse and exhibit high uncertainty given only local information. We propose MemUP, a new training method that learns long-term dependencies without backpropagating gradients through the entire sequence at once. The method can in principle be applied to any gradient-based sequence learning approach. Our implementation of MemUP for recurrent architectures performs better than or comparably to baselines while requiring significantly less memory.
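To make the memory constraint concrete, the sketch below shows the standard workaround the abstract alludes to: truncated backpropagation through time in PyTorch. This is not the MemUP algorithm itself, only an illustration of the regime MemUP operates in. Detaching the recurrent state between chunks keeps stored activations proportional to the chunk length rather than the full sequence length, but it also prevents gradients from spanning dependencies longer than a chunk, which is precisely the gap MemUP is designed to close. All names and sizes (`chunk_len`, the toy data) are illustrative assumptions.

```python
# Minimal sketch of truncated BPTT (illustrative, not the authors' code).
# Memory for stored activations scales with chunk_len, not seq_len.
import torch
import torch.nn as nn

seq_len, chunk_len, batch, in_dim, hid_dim = 10_000, 50, 8, 16, 64

rnn = nn.GRU(in_dim, hid_dim, batch_first=True)
head = nn.Linear(hid_dim, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()))

x = torch.randn(batch, seq_len, in_dim)   # toy inputs
y = torch.randn(batch, seq_len, 1)        # toy targets

h = torch.zeros(1, batch, hid_dim)
for t in range(0, seq_len, chunk_len):
    xc, yc = x[:, t:t + chunk_len], y[:, t:t + chunk_len]
    out, h = rnn(xc, h)
    loss = nn.functional.mse_loss(head(out), yc)
    opt.zero_grad()
    loss.backward()    # gradients flow only within the current chunk
    opt.step()
    h = h.detach()     # cut the graph: O(chunk_len) activation memory,
                       # but no gradient crosses the chunk boundary
```

MemUP keeps the favorable memory profile of such chunked training while, per the abstract, exploiting the observation that long-range dependencies concentrate in a sparse set of high-uncertainty predictions rather than requiring gradients through every past step.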