Coded illumination can enable quantitative phase microscopy of transparent samples with minimal hardware requirements. Intensity images are captured under different source patterns and processed with non-linear phase retrieval to recover the quantitative phase. The non-linear nature of this processing complicates the design of the coded-illumination patterns. Traditional techniques for experimental design (e.g., condition number optimization or spectral analysis) may not be ideal, because they characterize linear measurement formation models paired with linear reconstructions. Deep neural networks (DNNs) offer an end-to-end framework that can efficiently represent the non-linear process and can be optimized by training. However, DNNs that do not make use of the known physical models require an enormous number of training examples and parameters to properly learn the phase retrieval process. Here, we aim to combine our knowledge of the physics with the power of machine learning. We develop a new data-driven approach to optimizing coded-illumination patterns for an LED array microscope so as to maximize the performance of a given phase reconstruction algorithm. Our general formulation incorporates both the physics of the measurement scheme and the non-linearity of the reconstruction algorithm into the design problem. This enables an efficient parameterization of the problem, so that only a small number of training examples are needed to learn designs that generalize well to the experimental setting without retraining. We show experimental results for both a well-characterized phase target and mouse fibroblast cells, using coded-illumination patterns optimized for a sparsity-based phase reconstruction algorithm. Using only 2 measurements, our learned design achieves accuracy similar to that of Fourier ptychography with 69 measurements.
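As a rough illustration of why the design problem admits a compact parameterization, consider the measurement model under the assumption that the LEDs are mutually incoherent, so that their images add in intensity. Each coded measurement is then a non-negative weighted sum of single-LED intensity images, and each pattern is described by one brightness value per LED. The sketch below is a minimal illustration under that assumption, not the implementation used in this work; the function name `coded_measurement`, the LED count, the image size, and the random stand-in data are all hypothetical.

```python
import numpy as np


def coded_measurement(single_led_images, pattern):
    """Simulate one coded-illumination intensity image.

    single_led_images: array of shape (n_leds, H, W), one intensity image per LED.
    pattern: array of shape (n_leds,), non-negative LED brightness weights.

    Assumption: LEDs are mutually incoherent, so their images sum in intensity.
    """
    pattern = np.asarray(pattern, dtype=float)
    if np.any(pattern < 0):
        raise ValueError("LED brightness weights must be non-negative")
    # Weighted sum over the LED axis gives an (H, W) image.
    return np.tensordot(pattern, single_led_images, axes=1)


# Hypothetical sizes and random stand-in data, for illustration only.
rng = np.random.default_rng(0)
n_leds, H, W = 69, 8, 8
stack = rng.random((n_leds, H, W))

# Two coded patterns: the design variables are just 2 * n_leds weights.
patterns = rng.random((2, n_leds))
measurements = np.stack([coded_measurement(stack, p) for p in patterns])
```

In a learned design, the entries of `patterns` would be the quantities tuned against the reconstruction error of the chosen phase retrieval algorithm; that optimization loop is not shown here.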