Atra Akandeh

Slim LSTM networks: LSTM_6 and LSTM_C6

Jan 18, 2019

Atra Akandeh, Fathi M. Salem

Abstract: We have previously shown that our parameter-reduced variants of Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) are comparable in performance to the standard LSTM RNN on the MNIST dataset. In this study, we show that this is also the case for two diverse benchmark datasets, namely the IMDB review sentiment dataset and the 20 Newsgroups dataset. Specifically, we focus on two of the simplest variants, namely LSTM_6 (i.e., the standard LSTM with three constant fixed gates) and LSTM_C6 (i.e., LSTM_6 with a further reduced cell body input block). We demonstrate that these two aggressively reduced-parameter variants are competitive with the standard LSTM when hyper-parameters, e.g., the learning parameter, the number of hidden units, and the gate constants, are set properly. These architectures speed up training computations, and hence these networks would be more suitable for online training and inference on portable devices with relatively limited computational resources.

* 6 pages, 12 figures, 5 tables
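
To make "standard LSTM with three constant fixed gates" concrete, here is a minimal NumPy sketch of one recurrent step. It is an illustration under assumptions, not the authors' implementation: the function name `lstm6_step`, the tanh nonlinearity, and the default gate constants are placeholders chosen for this example, since the abstract only says that the gate constants are hyper-parameters to be set properly.

```python
import numpy as np


def lstm6_step(x_t, h_prev, c_prev, W_c, U_c, b_c,
               i_const=1.0, f_const=0.5, o_const=1.0):
    """One step of an LSTM-style cell whose three gates are fixed scalars.

    The gate constants are arbitrary placeholders (assumption), not values
    taken from the paper.
    """
    # Candidate cell input: the only block with trainable parameters.
    c_tilde = np.tanh(W_c @ x_t + U_c @ h_prev + b_c)
    # Input and forget gates are constants, so no gate weights are needed.
    c_t = f_const * c_prev + i_const * c_tilde
    # The output gate is a constant as well.
    h_t = o_const * np.tanh(c_t)
    return h_t, c_t


def param_counts(n_hidden, n_input):
    """Trainable parameters: standard LSTM vs. the constant-gate sketch."""
    block = n_hidden * n_hidden + n_hidden * n_input + n_hidden
    # A standard LSTM has four such blocks (three gates plus the cell input).
    return {"standard_lstm": 4 * block, "constant_gates": block}


rng = np.random.default_rng(0)
n_hidden, n_input = 4, 3
W_c = rng.normal(size=(n_hidden, n_input))
U_c = rng.normal(size=(n_hidden, n_hidden))
b_c = np.zeros(n_hidden)

h_t, c_t = lstm6_step(rng.normal(size=n_input),
                      np.zeros(n_hidden), np.zeros(n_hidden),
                      W_c, U_c, b_c)
counts = param_counts(n_hidden, n_input)
```

With all three gates fixed, only the cell-input block is trained, so this sketch carries one quarter of the trainable parameters of a standard LSTM with the same sizes.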

Simplified Long Short-term Memory Recurrent Neural Networks: part III

Jul 14, 2017

Atra Akandeh, Fathi M. Salem

Abstract: This is part III of a three-part work. In parts I and II, we presented eight variants of simplified Long Short-Term Memory (LSTM) recurrent neural networks (RNNs). We note that fast computation, especially with constrained computing resources, is an important factor in processing big time-sequence data. In this part III paper, we present and evaluate two new LSTM model variants that dramatically reduce the computational load while retaining performance comparable to the base (standard) LSTM RNNs. In these new variants, we impose (Hadamard) pointwise state multiplications in the cell-memory network in addition to the gating signal networks.

* 5 pages here (4 pages in the conference version), 10 figures, 5 tables; this is part III of a three-part work, all of which will appear in IKE'17 - The 16th Int'l Conference on Information & Knowledge Engineering, in The 2017 World Congress in Computer Science, Computer Engineering & Applied Computing | CSCE'17, July 17-20, 2017, Las Vegas, Nevada, USA
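
The "(Hadamard) pointwise state multiplication" mentioned in the abstract can be illustrated by replacing the matrix-vector product on the previous state with an elementwise product. The sketch below shows only that general idea for the cell-input block. The function names, the tanh nonlinearity, and the exact placement of the terms are assumptions made for this example, since the listing does not give the paper's equations.

```python
import numpy as np


def candidate_full(x_t, h_prev, W_c, U_c, b_c):
    """Candidate cell input with a full (n x n) state matrix U_c."""
    return np.tanh(W_c @ x_t + U_c @ h_prev + b_c)


def candidate_pointwise(x_t, h_prev, W_c, u_c, b_c):
    """Candidate cell input with a length-n vector u_c applied elementwise."""
    return np.tanh(W_c @ x_t + u_c * h_prev + b_c)


rng = np.random.default_rng(0)
n_hidden, n_input = 5, 3
x_t = rng.normal(size=n_input)
h_prev = rng.normal(size=n_hidden)
W_c = rng.normal(size=(n_hidden, n_input))
u_c = rng.normal(size=n_hidden)
b_c = np.zeros(n_hidden)

# The pointwise form equals the full form restricted to a diagonal matrix.
full_out = candidate_full(x_t, h_prev, W_c, np.diag(u_c), b_c)
point_out = candidate_pointwise(x_t, h_prev, W_c, u_c, b_c)
same = np.allclose(full_out, point_out)

# State-term parameters: n*n for the matrix, n for the vector.
state_params_full = n_hidden * n_hidden
state_params_pointwise = n_hidden
```

The pointwise form costs n multiplications per step on the state term instead of n squared, which is one way to read the abstract's claim of a reduced computational load.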

Simplified Long Short-term Memory Recurrent Neural Networks: part II

Jul 14, 2017

Atra Akandeh, Fathi M. Salem

Abstract: This is part II of a three-part work. Here, we present a second set of five inter-related variants of simplified Long Short-term Memory (LSTM) recurrent neural networks, obtained by further reducing adaptive parameters. Two of these models were introduced in part I of this work. We evaluate and verify our model variants on the benchmark MNIST dataset and assert that these models are comparable to the base LSTM model while using progressively fewer parameters. Moreover, we observe that, when using the ReLU activation, the test accuracy of the standard LSTM drops after a number of epochs when the learning parameter becomes larger. In contrast, all of the new model variants sustain their performance.

* 4 pages, 6 figures, 5 tables; this is part II of a three-part work, all to appear in IKE'17 - The 16th Int'l Conference on Information & Knowledge Engineering, in The 2017 World Congress in Computer Science, Computer Engineering & Applied Computing | CSCE'17, July 17-20, 2017, Las Vegas, Nevada, USA

Simplified Long Short-term Memory Recurrent Neural Networks: part I

Jul 14, 2017

Atra Akandeh, Fathi M. Salem

Abstract: We present five variants of the standard Long Short-term Memory (LSTM) recurrent neural network, obtained by uniformly reducing blocks of adaptive parameters in the gating mechanisms. For simplicity, we refer to these models as LSTM1, LSTM2, LSTM3, LSTM4, and LSTM5, respectively. Such parameter-reduced variants speed up training computations and would be more suitable for implementation on constrained embedded platforms. We comparatively evaluate and verify our five variant models on the classical MNIST dataset and demonstrate that they are comparable to a standard implementation of the LSTM model while using fewer parameters. Moreover, we observe that in some cases the standard LSTM's accuracy drops after a number of epochs when using the ReLU nonlinearity; in contrast, LSTM3, LSTM4, and LSTM5 retain their performance.

* 4 pages, 6 figures, 5 tables. Part I of a three-part publication that will appear in IKE'17 - The 16th Int'l Conference on Information & Knowledge Engineering, in The 2017 World Congress in Computer Science, Computer Engineering & Applied Computing | CSCE'17, July 17-20, 2017, Las Vegas, Nevada, USA
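
The effect of "uniformly reducing blocks of adaptive parameters in the gating mechanisms" on model size can be estimated with a small counting helper. The abstract does not say which blocks each of LSTM1 to LSTM5 drops, so the configurations below are generic illustrations and are not a mapping to those variants. The helper names and the example sizes (100 hidden units, 28 inputs) are assumptions for this sketch.

```python
def gate_params(n_hidden, n_input,
                use_input=True, use_state=True, use_bias=True):
    """Parameters in one block of the form W x + U h + b, with optional terms."""
    count = 0
    if use_input:
        count += n_hidden * n_input   # input weight matrix W
    if use_state:
        count += n_hidden * n_hidden  # state weight matrix U
    if use_bias:
        count += n_hidden             # bias vector b
    return count


def lstm_params(n_hidden, n_input, **gate_kwargs):
    """Total parameters: one full cell-input block plus three identical gates."""
    cell_block = gate_params(n_hidden, n_input)
    # The same reduction is applied uniformly to all three gates.
    return cell_block + 3 * gate_params(n_hidden, n_input, **gate_kwargs)


# Hypothetical sizes chosen only for illustration.
standard = lstm_params(100, 28)
no_input_in_gates = lstm_params(100, 28, use_input=False)
bias_only_gates = lstm_params(100, 28, use_input=False, use_state=False)
```

Because the three gates account for three of the four parameter blocks in a standard LSTM, removing terms from the gates shrinks the model quickly while leaving the cell-input block untouched.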
