We generalize the PAC (probably approximately correct) learning model to the quantum world by generalizing the concepts from classical functions to quantum processes, defining the problem of \emph{PAC learning quantum process}, and study its sample complexity. In the problem of PAC learning quantum process, we want to learn an $\epsilon$-approximate of an unknown quantum process $c^*$ from a known finite concept class $C$ with probability $1-\delta$ using samples $\{(x_1,c^*(x_1)),(x_2,c^*(x_2)),\dots\}$, where $\{x_1,x_2, \dots\}$ are computational basis states sampled from an unknown distribution $D$ and $\{c^*(x_1),c^*(x_2),\dots\}$ are the (possibly mixed) quantum states outputted by $c^*$. The special case of PAC-learning quantum process under constant input reduces to a natural problem which we named as approximate state discrimination, where we are given copies of an unknown quantum state $c^*$ from an known finite set $C$, and we want to learn with probability $1-\delta$ an $\epsilon$-approximate of $c^*$ with as few copies of $c^*$ as possible. We show that the problem of PAC learning quantum process can be solved with $$O\left(\frac{\log|C| + \log(1/ \delta)} { \epsilon^2}\right)$$ samples when the outputs are pure states and $$O\left(\frac{\log^3 |C|(\log |C|+\log(1/ \delta))} { \epsilon^2}\right)$$ samples if the outputs can be mixed. Some implications of our results are that we can PAC-learn a polynomial sized quantum circuit in polynomial samples and approximate state discrimination can be solved in polynomial samples even when concept class size $|C|$ is exponential in the number of qubits, an exponentially improvement over a full state tomography.