We propose a reinforcement learning based method to identify important configurations that connect reactant and product states along chemical reaction paths. By shooting multiple trajectories from these configurations, we can generate an ensemble of configurations that concentrate on the transition path ensemble. This configuration ensemble can be effectively employed in a neural network-based partial differential equation solver to obtain an approximation solution of a restricted Backward Kolmogorov equation, even when the dimension of the problem is very high. The resulting solution, known as the committor function, encodes mechanistic information for the reaction and can in turn be used to evaluate reaction rates.