Program synthesis has recently emerged as a promising approach to the image parsing task. However, most prior works have relied on supervised learning methods, which require ground truth programs for each training image. We present an unsupervised learning algorithm that can parse constructive solid geometry (CSG) images into context-free grammar with a non-differentiable renderer. We propose a grammar-encoded tree LSTM to effectively constrain our search space by leveraging the structure of the context-free grammar while handling the non-differentiable renderer via REINFORCE and encouraging the exploration by regularizing the objective with an entropy term. Instead of using simple Monte Carlo sampling, we propose a lower-variance entropy estimator with sampling without replacement for effective exploration. We demonstrate the effectiveness of the proposed algorithm on a synthetic 2D CSG dataset, which outperforms baseline models by a large margin.