Abstract:Recent works explore deep learning's success by examining functions or data with hierarchical structure. Complementarily, research on gradient descent performance for deep nets has shown that noise sensitivity of functions under independent and identically distributed (i.i.d.) Bernoulli inputs establishes learning complexity bounds. This paper aims to bridge these research streams by demonstrating that functions constructed through repeated composition of non-linear functions are noise sensitive under general product measures.