Nuclear fusion is regarded as the energy of the future because it offers the possibility of unlimited clean energy. One obstacle to using fusion as a feasible energy source is the stability of the reaction. Ideally, a controller for the reactor would take actions in response to the current state of the plasma in order to prolong the reaction for as long as possible. In this work, we take preliminary steps toward learning such a controller. Since learning on a real-world reactor is infeasible, we instead attempt to learn optimal controls offline via a simulator, where the state of the plasma can be explicitly set. In particular, we introduce a theoretically grounded Bayesian optimization algorithm that recommends a state and action pair to evaluate at every iteration, and we show that this results in more efficient use of the simulator.
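
To make the idea of recommending a state and action pair at every iteration concrete, the following is a minimal sketch under stated assumptions, not the algorithm introduced in this work. It assumes a one-dimensional state and action, a hypothetical `toy_simulator` standing in for the plasma simulator, a Gaussian process surrogate with a fixed squared-exponential kernel, and a stand-in acquisition rule (the upper confidence bound of a pair minus the best posterior mean currently known for that state). All function names and parameter values are illustrative.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=0.2):
    """Squared-exponential kernel between the rows of X1 and X2."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(X_obs, y_obs, X_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_simulator(state, action):
    """Hypothetical reward: the best action for a state is 0.2 + 0.6 * state."""
    return np.exp(-8.0 * (action - (0.2 + 0.6 * state)) ** 2)

def learn_policy(n_iters=60, beta=2.0):
    """Query one (state, action) pair per iteration, then read off a policy."""
    states = np.linspace(0.0, 1.0, 5)
    actions = np.linspace(0.0, 1.0, 21)
    # Every candidate (state, action) pair, in state-major order.
    grid = np.array([(s, a) for s in states for a in actions])
    shape = (len(states), len(actions))

    # Start from a single arbitrary simulator query.
    X_obs = np.array([[0.5, 0.5]])
    y_obs = np.array([toy_simulator(0.5, 0.5)])

    for _ in range(n_iters):
        mean, std = gp_posterior(X_obs, y_obs, grid)
        mean2d = mean.reshape(shape)
        ucb2d = (mean + beta * std).reshape(shape)
        # Stand-in acquisition (an assumption): how far the optimistic value
        # of a pair exceeds the best mean currently known for its state.
        gap = ucb2d - mean2d.max(axis=1, keepdims=True)
        i, j = np.unravel_index(np.argmax(gap), shape)
        s, a = states[i], actions[j]
        # Because the simulator lets us set the state, we can query any pair.
        X_obs = np.vstack([X_obs, [s, a]])
        y_obs = np.append(y_obs, toy_simulator(s, a))

    # Final policy: for each state, the action with the highest posterior mean.
    mean, _ = gp_posterior(X_obs, y_obs, grid)
    policy = actions[mean.reshape(shape).argmax(axis=1)]
    return states, policy

if __name__ == "__main__":
    states, policy = learn_policy()
    for s, a in zip(states, policy):
        print(f"state={s:.2f} -> action={a:.2f}")
```

The point of the sketch is the loop structure: because the simulator allows the state to be set explicitly, the optimizer is free to choose which state to probe as well as which action, rather than only choosing an action for a state it is handed.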