Networked sensing refers to the capability of properly orchestrating multiple sensing terminals to enhance specific figures of merit, e.g., positioning accuracy or imaging resolution. For radio-based sensing, it is essential to understand \textit{when} and \textit{how} sensing terminals should be orchestrated, i.e., to identify the cooperation strategy that best trades performance against cost (e.g., energy consumption, communication overhead, and complexity). This paper addresses networked sensing from a physics-driven perspective, aiming to provide a general theoretical benchmark to evaluate its \textit{imaging} performance bounds and to guide the sensing orchestration accordingly. Diffraction tomography theory (DTT) is employed to quantify the imaging resolution of any radio sensing experiment by inspecting its spectral (or wavenumber) content. In networked sensing, image formation is based on the back-projection integral, which is valid for any network topology and physical configuration of the terminals. \textit{Wavefield networked sensing} is a framework in which multiple sensing terminals are orchestrated during the acquisition process to maximize the imaging quality (resolution and grating-lobe suppression) by pursuing the deceptively simple \textit{wavenumber tessellation principle}. We discuss the possible levels of cooperation between sensing terminals as well as potential killer applications. Remarkably, we show that the proposed method yields high-quality images of the environment under limited-bandwidth conditions by leveraging the coherent combination of multiple multi-static low-resolution images.
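To make the image-formation step concrete, a minimal sketch of a generic multi-static back-projection (delay-and-sum) rule is given below; the notation ($s_{t,r}$ for the demodulated, range-compressed echo of the transmitter--receiver pair $(t,r)$, $\mathbf{p}_t$, $\mathbf{p}_r$ for the terminal positions, $f_0$ for the carrier frequency, $c$ for the speed of light) is illustrative and not necessarily that adopted in the paper:
\begin{equation*}
  I(\mathbf{x}) \,=\, \sum_{(t,r)\in\mathcal{N}} s_{t,r}\big(\tau_{t,r}(\mathbf{x})\big)\, e^{\,j 2\pi f_0\, \tau_{t,r}(\mathbf{x})},
  \qquad
  \tau_{t,r}(\mathbf{x}) \,=\, \frac{\|\mathbf{x}-\mathbf{p}_t\| + \|\mathbf{x}-\mathbf{p}_r\|}{c},
\end{equation*}
i.e., each pair of the network $\mathcal{N}$ contributes its echo evaluated at the bistatic delay of pixel $\mathbf{x}$, phase-compensated at the carrier; coherently accumulating these contributions over many pairs is what enables combining multi-static low-resolution images into a single higher-quality one.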