Abstract:Deep neural networks have achieved remarkable success in diverse applications, prompting the need for a solid theoretical foundation. Recent research has identified the minimal width $\max\{2,d_x,d_y\}$ required for neural networks with input dimensions $d_x$ and output dimension $d_y$ that use leaky ReLU activations to universally approximate $L^p(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta. Here, we present an alternative proof for the minimal width of such neural networks, by directly constructing approximating networks using a coding scheme that leverages the properties of leaky ReLUs and standard $L^p$ results. The obtained construction has a minimal interior dimension of $1$, independent of input and output dimensions, which allows us to show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions. Furthermore, we demonstrate that the normalizing flow LU-Net serves as a distributional universal approximator. We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^{d},\mathbb{R}^{d})$ on compacta when the dimension $d\geq 2$, which provides a constructive proof of a classical theorem of Brenier and Gangbo. In addition, we use a topological argument to establish that for FNNs with monotone Lipschitz continuous activations, $d_x+1$ is a lower bound on the minimal width required for the uniform universal approximation of continuous functions $C^0(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta when $d_x\geq d_y$.