We study the complexity of training classical and quantum machine learning (ML) models for predicting outcomes of physical experiments. The experiments depend on an input parameter $x$ and involve the execution of a (possibly unknown) quantum process $\mathcal{E}$. Our figure of merit is the number of runs of $\mathcal{E}$ during training, disregarding other measures of runtime. A classical ML model performs a measurement and records the classical outcome after each run of $\mathcal{E}$, while a quantum ML model can access $\mathcal{E}$ coherently to acquire quantum data; the classical or quantum data is then used to predict outcomes of future experiments. We prove that, for any input distribution $\mathcal{D}(x)$, a classical ML model can provide accurate predictions on average by accessing $\mathcal{E}$ a number of times comparable to that required by the optimal quantum ML model. In contrast, to achieve accurate predictions on all inputs, we show that exponential quantum advantage is possible for certain tasks. For example, to predict expectation values of all Pauli observables in an $n$-qubit system $\rho$, we present a quantum ML model using only $\mathcal{O}(n)$ copies of $\rho$ and prove that classical ML models require $2^{\Omega(n)}$ copies.
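
The distinction between accuracy "on average" and accuracy "on all inputs" can be sketched as follows. This is one possible formalization, not a definition taken from the text above: the target function $f$, the observable $O$, the input state $|x\rangle$, the trained predictor $h$, and the tolerance $\epsilon$ are all notation assumed here for illustration, and the choice of squared error is likewise an assumption.

\[
f(x) = \mathrm{tr}\!\left( O \, \mathcal{E}(|x\rangle\langle x|) \right)
\]
\[
\text{average case:} \quad \mathbb{E}_{x \sim \mathcal{D}} \, \bigl| h(x) - f(x) \bigr|^2 \le \epsilon ,
\qquad
\text{worst case:} \quad \max_{x} \, \bigl| h(x) - f(x) \bigr|^2 \le \epsilon .
\]

In the Pauli example, the worst-case requirement would read as predicting $\mathrm{tr}(P\rho)$ for every $P \in \{I, X, Y, Z\}^{\otimes n}$, a family of $4^n$ observables, which is the setting in which the $\mathcal{O}(n)$ versus $2^{\Omega(n)}$ separation in the number of copies of $\rho$ is stated.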