We propose a general method for solving statistical mechanics problems defined on sparse graphs, such as random graphs, real-world networks, and low-dimensional lattices. Our approach extracts a small feedback vertex set (FVS) of the sparse graph, converting the sparse system into a strongly correlated system with dense, many-body interactions on the feedback set, and then solves it with a variational method based on neural networks to estimate the free energy and observables and to generate unbiased samples via direct sampling. Extensive experiments show that our approach is more accurate than existing approaches for sparse spin glass systems. On random graphs and real-world networks, it significantly outperforms standard methods for sparse systems such as belief propagation; on structured sparse systems such as two-dimensional lattices, it is significantly faster and more accurate than recently proposed variational autoregressive networks using convolutional neural networks.
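
To make the first step concrete, the following is a minimal sketch of how a feedback vertex set might be extracted from a sparse graph. It assumes a simple greedy heuristic (prune degree-one vertices, then remove a highest-degree vertex), which is not necessarily the procedure used in this work; the function names `greedy_feedback_vertex_set` and `is_forest` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def greedy_feedback_vertex_set(edges):
    """Greedy heuristic (an assumption, not the paper's algorithm).

    Returns a set of vertices whose removal leaves the graph acyclic.
    Self-loops are ignored in this sketch.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)  # plain dict: no accidental key creation below

    def remove(v):
        # Delete v and all edges incident to it.
        for w in adj.pop(v):
            adj[w].discard(v)

    fvs = set()
    while adj:
        # Vertices of degree <= 1 cannot lie on a cycle, so prune them.
        leaves = [v for v in adj if len(adj[v]) <= 1]
        while leaves:
            v = leaves.pop()
            if v not in adj:
                continue
            neighbours = list(adj[v])
            remove(v)
            leaves.extend(w for w in neighbours if len(adj[w]) <= 1)
        if not adj:
            break
        # All remaining vertices have degree >= 2, so a cycle exists.
        # Move a highest-degree vertex into the feedback set.
        v = max(adj, key=lambda x: len(adj[x]))
        fvs.add(v)
        remove(v)
    return fvs


def is_forest(edges, removed):
    """Check with union-find that the graph minus `removed` has no cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


# Usage: a 4x4 square lattice with open boundaries.
L = 4
lattice = [((i, j), (i + 1, j)) for i in range(L - 1) for j in range(L)]
lattice += [((i, j), (i, j + 1)) for i in range(L) for j in range(L - 1)]
fvs = greedy_feedback_vertex_set(lattice)
print(len(fvs), is_forest(lattice, fvs))
```

Once the feedback set is removed, the remaining graph is a forest, so the non-feedback spins can be summed out exactly, leaving an effective model on the feedback vertices only. The greedy rule above gives a valid but not necessarily minimum feedback set.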