In this paper, we consider a problem in which distributively extracted features are used for performing inference in wireless networks. We elaborate on our proposed architecture, which we herein refer to as "in-network learning", provide a suitable loss function and discuss its optimization using neural networks. We compare its performance with both Federated- and Split learning; and show that this architecture offers both better accuracy and bandwidth savings.