Bayesian optimization is a sample-efficient method for solving expensive, black-box optimization problems. Stochastic programming concerns optimization under uncertainty where, typically, average performance is the quantity of interest. In the first stage of a two-stage problem, here-and-now decisions must be made in the face of this uncertainty, while in the second stage, wait-and-see decisions are made after the uncertainty has been resolved. Many methods in stochastic programming assume that the objective is cheap to evaluate and linear or convex. In this work, we apply Bayesian optimization to solve non-convex, two-stage stochastic programs which are expensive to evaluate. We formulate a knowledge-gradient-based acquisition function to jointly optimize the first- and second-stage variables, establish a guarantee of asymptotic consistency and provide a computationally efficient approximation. We demonstrate comparable empirical results to an alternative we formulate which alternates its focus between the two variable types, and superior empirical results over the standard, naive, two-step benchmark. We show that differences in the dimension and length scales between the variable types can lead to inefficiencies of the two-step algorithm, while the joint and alternating acquisition functions perform well in all problems tested. Experiments are conducted on both synthetic and real-world examples.