Abstract: We study a universal approximation property of ODENet and ResNet. An ODENet is the map from the initial value to the final value of an ODE system on a finite interval, and it can be regarded as a mathematical model of a ResNet-type deep learning system. We consider dynamical systems whose vector fields are given by a single composition of the activation function and an affine mapping, which is the most common choice of vector field for ODENet and ResNet in practical machine learning systems. We show that ODENets and ResNets with such a restricted vector field can uniformly approximate an ODENet with a general vector field.
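The relationship between the two models described above can be illustrated with a minimal sketch. It assumes the vector field has the form $f(x) = \tanh(Ax + b)$ and that the ResNet is the forward Euler discretization of $\dot{x} = f(x)$ on $[0, T]$. The matrix `A`, the bias `b`, the choice of `tanh`, and all function names are hypothetical and are not taken from the paper.

```python
import math


def vector_field(x, A, b):
    """f(x) = tanh(A x + b): the activation applied once to an affine map."""
    return [
        math.tanh(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(A, b)
    ]


def resnet_forward(x0, A, b, depth, T=1.0):
    """Apply `depth` residual blocks x <- x + h * f(x) with step h = T / depth.

    This is the forward Euler scheme for dx/dt = f(x) on [0, T].
    """
    h = T / depth
    x = list(x0)
    for _ in range(depth):
        fx = vector_field(x, A, b)
        x = [x_i + h * f_i for x_i, f_i in zip(x, fx)]
    return x


# Hypothetical parameters and input, chosen only for illustration.
A = [[0.0, -1.0], [1.0, 0.0]]
b = [0.1, -0.2]
x0 = [0.5, -0.3]

# A very deep ResNet serves as a numerical stand-in for the ODENet output x(T).
reference = resnet_forward(x0, A, b, depth=20000)

# Maximum componentwise gap between shallower ResNets and the reference.
errors = {}
for depth in (10, 100, 1000):
    out = resnet_forward(x0, A, b, depth)
    errors[depth] = max(abs(o - r) for o, r in zip(out, reference))
    print(depth, errors[depth])
```

In this sketch the gap to the reference shrinks as the depth grows, which is the sense in which a ResNet with many layers behaves like the ODE flow map. It only illustrates the discretization and says nothing about the approximation results themselves.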
Abstract: We prove a universal approximation property (UAP) for a class of ODENets and a class of ResNets, which are used in many deep learning algorithms. The UAP can be stated as follows. Let $n$ and $m$ be the dimensions of the input and output data, respectively, and assume $m\leq n$. Then we show that an ODENet of width $n+m$ with any non-polynomial continuous activation function can approximate any continuous function on a compact subset of $\mathbb{R}^n$. We also show that ResNet has the same property as the depth tends to infinity. Furthermore, we explicitly derive the gradient of a loss function with respect to a certain tuning variable and use it to construct a learning algorithm for ODENet. To demonstrate the usefulness of this algorithm, we apply it to a regression problem, a binary classification problem, and a multinomial classification problem on MNIST.
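One way to write the property above symbolically is sketched here, under the assumption that approximation is measured in the supremum norm (the abstract does not name the norm). The symbols $K$, $F$, $\Phi$, and $\varepsilon$ are notation introduced for illustration only, and the paper's exact statement may differ.

$$
\forall\, K \subset \mathbb{R}^n \text{ compact},\quad \forall\, F \in C(K;\mathbb{R}^m),\quad \forall\, \varepsilon > 0,\quad \exists\, \Phi \text{ (an ODENet of width } n+m\text{)}: \quad \sup_{x \in K} \lvert F(x) - \Phi(x) \rvert < \varepsilon .
$$

How the $n$-dimensional input is embedded into the $(n+m)$-dimensional state and how the $m$-dimensional output is read out are not specified in the abstract and are left open here.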