Abstract: Online Normalization is a new technique for normalizing the hidden activations of a neural network. Like Batch Normalization, it normalizes over the sample dimension. While Online Normalization does not use batches, it is as accurate as Batch Normalization. We resolve a theoretical limitation of Batch Normalization by introducing an unbiased technique for computing the gradient of normalized activations. Online Normalization works with automatic differentiation by adding statistical normalization as a primitive. This technique can be used in cases not covered by some other normalizers, such as recurrent networks, fully connected networks, and networks with activation memory requirements prohibitive for batching. We show its applications to image classification, image segmentation, and language modeling. We present formal proofs and experimental results on the ImageNet, CIFAR, and PTB datasets.
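To make the batch-free idea concrete, below is a minimal forward-pass sketch in Python/NumPy of per-feature normalization driven by exponentially decaying running statistics instead of batch statistics. The class name OnlineNorm1d, the decay rate alpha, and the exact update rule are illustrative assumptions rather than the paper's algorithm; in particular, the paper's unbiased gradient correction for the backward pass is omitted here.

    import numpy as np

    class OnlineNorm1d:
        # Illustrative sketch: normalize each incoming sample with running
        # per-feature statistics, then update those statistics. Not the
        # paper's exact method (its backward-pass correction is omitted).
        def __init__(self, num_features, alpha=0.99, eps=1e-5):
            self.alpha = alpha                  # decay rate (assumed value)
            self.eps = eps
            self.mu = np.zeros(num_features)    # running mean estimate
            self.var = np.ones(num_features)    # running variance estimate

        def forward(self, x):
            # Normalize one sample using statistics from previous samples.
            y = (x - self.mu) / np.sqrt(self.var + self.eps)
            # Exponential-moving-average update of the running statistics.
            self.mu = self.alpha * self.mu + (1 - self.alpha) * x
            self.var = self.alpha * self.var + (1 - self.alpha) * (x - self.mu) ** 2
            return y

    norm = OnlineNorm1d(num_features=4)
    for x in np.random.randn(10, 4):   # a stream of single samples, no batching
        y = norm.forward(x)

Because each sample is normalized with statistics accumulated online, no batch of activations needs to be held in memory, which is the property the abstract highlights for recurrent networks and memory-constrained settings.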
Abstract: Modeling dynamical systems, both for control purposes and to make predictions about their behavior, is ubiquitous in science and engineering. Predictive state representations (PSRs) are a recently introduced class of models for discrete-time dynamical systems. The key idea behind PSRs and the closely related OOMs (Jaeger's observable operator models) is to represent the state of the system as a set of predictions of the observable outcomes of experiments one can perform in the system. This makes PSRs rather different from history-based models such as nth-order Markov models and from hidden-state-based models such as HMMs and POMDPs. We introduce an interesting construct, the system-dynamics matrix, and show how PSRs can be derived simply from it. We also use this construct to show formally that PSRs are more general than both nth-order Markov models and HMMs/POMDPs. Finally, we discuss the main difference between PSRs and OOMs and conclude with directions for future work.
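As a small illustration of the system-dynamics matrix, the Python/NumPy sketch below builds it for a made-up two-state HMM with binary observations: rows are indexed by histories h, columns by tests t, and each entry is the conditional prediction p(t | h) that test t succeeds after history h. The HMM parameters, the chosen histories and tests, and the emit-then-transition convention are all illustrative assumptions; the point is that the matrix has finite rank (at most the number of hidden states), which is what allows a PSR to represent state with finitely many predictions.

    import numpy as np

    # Made-up two-state HMM with binary observations (all numbers illustrative).
    T = np.array([[0.9, 0.1],
                  [0.2, 0.8]])        # T[s, s'] = P(next state s' | state s)
    O = np.array([[0.7, 0.3],
                  [0.1, 0.9]])        # O[s, o] = P(observe o | state s)
    b0 = np.array([0.5, 0.5])         # initial state distribution

    def seq_prob(obs_seq, b=b0):
        # P(observation sequence) via the HMM forward pass
        # (emit-then-transition convention, an illustrative assumption).
        for o in obs_seq:
            b = (b * O[:, o]) @ T
        return b.sum()

    histories = [(), (0,), (1,), (0, 0), (1, 1)]
    tests = [(0,), (1,), (0, 0), (0, 1)]

    # System-dynamics matrix: entry (i, j) is p(t_j | h_i) = P(h_i then t_j) / P(h_i).
    D = np.array([[seq_prob(h + t) / seq_prob(h) for t in tests]
                  for h in histories])

    print(np.round(D, 3))
    print("rank:", np.linalg.matrix_rank(D))  # at most the number of hidden states (2)

Any maximal set of linearly independent columns of D corresponds to a set of core tests whose predictions suffice as a state representation, which is the sense in which the abstract says PSRs can be derived simply from this construct.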