The reconstruction of micro-Doppler signatures of human movements is a key enabler for fine-grained activity recognition with radio-frequency sensing. In this work, we focus on Joint Communication and Sensing (JCS) systems where, unlike in dedicated radar sensing systems, a suitable tradeoff between sensing accuracy and communication overhead must be attained. Consequently, the micro-Doppler has to be reconstructed from sparse and noisy channel estimates obtained from communication packets, limiting as much as possible the transmission of additional probing signals for sensing. Existing approaches exploit compressed sensing, but produce very poor reconstructions when only a few channel measurements are available, which is often the case in real communication patterns. In addition, the large number of iterations they need to converge hinders their use in real-time systems. Here, we present STAR, a lightweight neural network that combines a single unrolled iterative hard-thresholding layer with an attention mechanism. Our approach exploits the temporal correlation of the micro-Doppler to accurately reconstruct micro-Doppler sequences of human movement even from very sparse channel measurements. In doing so, it combines model-based and data-driven approaches into an interpretable, low-complexity architecture that is amenable to real-time implementation. We evaluate STAR on a public JCS dataset of 60 GHz IEEE 802.11ay channel measurements of human activity traces. Experimental results show that it substantially outperforms state-of-the-art solutions in terms of reconstructed micro-Doppler quality. Remarkably, STAR enables human activity recognition with satisfactory accuracy even with 90%-sparse channel measurements, for which existing techniques fail.
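For readers unfamiliar with the model-based building block named above, the following is a minimal sketch of plain iterative hard thresholding (IHT) applied to a toy version of the problem: recovering a sparse Doppler spectrum from channel samples observed only at a few slow-time indices. It is not the STAR architecture. The attention mechanism and any learned parameters are omitted, and the measurement model (rows of a unitary inverse DFT), the function names, the step size, and the sparsity level `k` are all illustrative assumptions rather than details taken from this work.

```python
import cmath


def partial_idft_matrix(n, rows):
    """Rows of the n-point unitary inverse-DFT matrix at the observed time indices.

    Assumed toy model: channel sample at time t = sum_k spectrum[k] * exp(2j*pi*k*t/n) / sqrt(n).
    """
    scale = 1.0 / n ** 0.5
    return [[scale * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)]
            for t in rows]


def matvec(a, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in a]


def adjoint_matvec(a, r):
    """Compute A^H r (conjugate transpose applied to r)."""
    n = len(a[0])
    return [sum(a[i][k].conjugate() * r[i] for i in range(len(a)))
            for k in range(n)]


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    return [x[i] if i in keep else 0j for i in range(len(x))]


def iht(a, y, k, n_iter=50, step=1.0):
    """Plain IHT: gradient step on ||y - A x||^2, then projection onto k-sparse vectors."""
    x = [0j] * len(a[0])
    for _ in range(n_iter):
        residual = [y_i - v_i for y_i, v_i in zip(y, matvec(a, x))]
        gradient = adjoint_matvec(a, residual)
        x = hard_threshold([x_i + step * g_i for x_i, g_i in zip(x, gradient)], k)
    return x


# Toy usage: a 2-sparse spectrum observed at 6 of 16 slow-time indices
# (the indices and values are hypothetical).
n = 16
spectrum = [0j] * n
spectrum[3] = 2 + 1j
spectrum[10] = -1.5j
observed_times = [0, 1, 4, 6, 9, 13]
a = partial_idft_matrix(n, observed_times)
estimate = iht(a, matvec(a, spectrum), k=2, n_iter=20)
print([i for i, v in enumerate(estimate) if v != 0])
```

Because the rows come from a unitary matrix, the operator norm of the measurement matrix is at most one, which is why a unit step size is a reasonable default in this sketch. An unrolled network in the spirit described above would replace the fixed loop with a small number of such layers whose parameters are learned from data, but how STAR does so is not specified here.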