This paper considers deep neural networks for learning weakly dependent processes in a general framework that includes, for instance, regression estimation, time series prediction, time series classification. The $\psi$-weak dependence structure considered is quite large and covers other conditions such as mixing, association,$\ldots$ Firstly, the approximation of smooth functions by deep neural networks with a broad class of activation functions is considered. We derive the required depth, width and sparsity of a deep neural network to approximate any H\"{o}lder smooth function, defined on any compact set $\mx$. Secondly, we establish a bound of the excess risk for the learning of weakly dependent observations by deep neural networks. When the target function is sufficiently smooth, this bound is close to the usual $\mathcal{O}(n^{-1/2})$.