Wideband spectrum sensing (WSS) is critical for coordinating large numbers of wireless transmissions through spectrum sharing, but the high sampling rate it requires can incur excessive hardware, power, and computation costs. In this article, a deep-learning-based WSS framework that embeds multicoset preprocessing is proposed to enable low-cost sub-Nyquist sampling. A pruned convolutional attention WSS network (PCA-WSSNet) is designed to integrate multicoset preprocessing with a convolutional attention mechanism, and to substantially reduce model complexity through selective weight pruning without performance loss. Furthermore, a transfer learning (TL) strategy that benefits from the model pruning is developed to improve the robustness of PCA-WSSNet using only a few adaptation samples from new scenarios. Simulation results show that PCA-WSSNet outperforms state-of-the-art methods. Compared with direct TL, the pruned TL strategy simultaneously improves prediction accuracy in unseen scenarios, reduces model size, and accelerates model inference.
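As a rough illustration of two ingredients named above, the following NumPy sketch shows (i) multicoset sampling, which keeps only a few fixed sample offsets out of every block of L Nyquist-rate samples, and (ii) weight pruning by magnitude. The function names `multicoset_sample` and `magnitude_prune`, the parameters `L`, `offsets`, and `sparsity`, and the magnitude criterion itself are assumptions made for illustration; the abstract does not specify the pruning criterion or the preprocessing details used in PCA-WSSNet.

```python
import numpy as np

def multicoset_sample(x, L, offsets):
    """Keep x[m*L + c] for each offset c in `offsets` (hypothetical helper).

    Returns an array of shape (p, M), where p = len(offsets) and
    M = len(x) // L. The average rate is p/L of the Nyquist rate.
    """
    offsets = np.asarray(offsets, dtype=int)
    if offsets.size == 0 or offsets.min() < 0 or offsets.max() >= L:
        raise ValueError("offsets must be non-empty and lie in [0, L)")
    if np.unique(offsets).size != offsets.size:
        raise ValueError("offsets must be distinct")
    x = np.asarray(x)
    M = len(x) // L
    blocks = x[:M * L].reshape(M, L)   # row m holds x[m*L : (m+1)*L]
    return blocks[:, offsets].T        # one row per coset

def magnitude_prune(w, sparsity):
    """Zero the `sparsity` fraction of weights with smallest magnitude.

    Assumption: magnitude-based pruning stands in for the paper's
    "selective weight pruning", whose exact rule is not given here.
    Returns (pruned_weights, boolean_keep_mask).
    """
    w = np.asarray(w, dtype=float)
    k = int(round(sparsity * w.size))             # number of weights to drop
    keep = np.ones(w.size, dtype=bool)
    drop = np.argsort(np.abs(w), axis=None)[:k]   # indices of the k smallest
    keep[drop] = False
    keep = keep.reshape(w.shape)
    return np.where(keep, w, 0.0), keep
```

With `L = 4` and `offsets = [0, 2]`, for example, half of the Nyquist-rate samples are retained, and the two coset sequences form the rows of the returned array.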