This paper proves an abstract theorem that addresses, in a unified manner, two important problems in function approximation: avoiding the curse of dimensionality and estimating the degree of approximation for out-of-sample extension in manifold learning. We consider an abstract (shallow) network model that includes, for example, neural networks, radial basis function networks, and kernels on data-defined manifolds used for function approximation in various settings. A deep network is obtained by composing shallow networks according to a directed acyclic graph that represents the architecture of the deep network. We prove dimension-independent bounds for approximation by shallow networks in the very general setting of what we call $G$-networks on a compact metric measure space, where the notion of dimension is defined in terms of the cardinality of maximal distinguishable sets, generalizing the notion of dimension of a cube or a manifold. Our techniques yield bounds that improve, without saturation, with the smoothness of the kernel involved in an integral representation of the target function. In the context of manifold learning, our bounds provide estimates on the degree of approximation for an out-of-sample extension of the target function to the ambient space. One consequence of our theorem is that, without the requirement of robust parameter selection, deep networks using a non-smooth activation function such as the ReLU do not provide any significant advantage over shallow networks in terms of the degree of approximation alone.
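For orientation only, the display below sketches the general shape of the networks discussed above; the symbols $G$, $a_k$, $y_k$, and $n$ are schematic placeholders and are not the formal definitions given in the body of the paper. A shallow $G$-network is a map of the form
\[
x \;\longmapsto\; \sum_{k=1}^{n} a_k\, G(x, y_k),
\]
and a deep network is obtained by evaluating such a map at each node of a directed acyclic graph, the inputs to a node being the outputs of the nodes feeding into it.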