In this paper, we study the optimality of the Bussgang linear minimum mean squared error (BLMMSE) channel estimator for multiple-input multiple-output systems with 1-bit analog-to-digital converters. We compare the BLMMSE channel estimator with the optimal minimum mean squared error (MMSE) channel estimator, which is generally non-linear, and we develop a novel framework based on the orthant probability of a multivariate normal distribution to compute the MMSE channel estimate. We then analyze the equivalence of the MMSE and BLMMSE channel estimators under specific assumptions on the channel correlation or pilot symbols. Interestingly, the BLMMSE channel estimator turns out to be optimal in several of these cases. Finally, we present a necessary and sufficient condition for the BLMMSE channel estimator to be optimal.
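As a rough illustration of what optimality of a linear estimator can mean under 1-bit quantization, the following minimal sketch considers a toy case that is not taken from the paper: a real-valued scalar channel h ~ N(0, 1), one observation y = h + n with n ~ N(0, noise_var), and a quantized output r = sign(y). Under these assumptions the Bussgang linear estimate is gain * r with gain = sqrt(2/pi) / sqrt(1 + noise_var), and the conditional mean E[h | r = +1] takes the same value, which the code checks by Monte Carlo simulation. The function names `blmmse_gain` and `empirical_conditional_mean` are hypothetical helpers introduced only for this example.

```python
import math
import random


def blmmse_gain(noise_var):
    """Linear (Bussgang) gain for the toy model h ~ N(0,1), y = h + n, r = sign(y)."""
    # For zero-mean jointly Gaussian (h, y): E[h * sign(y)] = sqrt(2/pi) * Cov(h, y) / std(y).
    # Here Cov(h, y) = 1, Var(y) = 1 + noise_var, and Var(r) = 1, so the gain is:
    return math.sqrt(2.0 / math.pi) / math.sqrt(1.0 + noise_var)


def empirical_conditional_mean(noise_var, num_samples=200_000, seed=0):
    """Monte Carlo estimate of the MMSE estimate E[h | r = +1] in the toy model."""
    rng = random.Random(seed)
    noise_std = math.sqrt(noise_var)
    total, count = 0.0, 0
    for _ in range(num_samples):
        h = rng.gauss(0.0, 1.0)
        y = h + rng.gauss(0.0, noise_std)
        if y > 0:  # 1-bit quantizer output r = +1
            total += h
            count += 1
    return total / count


noise_var = 0.5  # assumed noise variance for the illustration
gain = blmmse_gain(noise_var)
mmse_plus = empirical_conditional_mean(noise_var)
print(f"Linear estimate for r = +1:  {gain:.4f}")
print(f"Monte Carlo E[h | r = +1]:   {mmse_plus:.4f}")
```

The two printed values should agree up to Monte Carlo error. This scalar coincidence only illustrates the question the paper addresses; it says nothing about the general multi-antenna, multi-pilot setting.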