Abstract:A basic question in the PAC model of learning is whether proper learning is harder than improper learning. In the classical case, there are examples of concept classes with VC dimension $d$ that have sample complexity $\Omega\left(\frac d\epsilon\log\frac1\epsilon\right)$ for proper learning with error $\epsilon$, while the complexity for improper learning is O$\!\left(\frac d\epsilon\right)$. One such example arises from the Coupon Collector problem. Motivated by the efficiency of proper versus improper learning with quantum samples, Arunachalam, Belovs, Childs, Kothari, Rosmanis, and de Wolf (TQC 2020) studied an analogue, the Quantum Coupon Collector problem. Curiously, they discovered that for learning size $k$ subsets of $[n]$ the problem has sample complexity $\Theta(k\log\min\{k,n-k+1\})$, in contrast with the complexity of $\Theta(k\log k)$ for Coupon Collector. This effectively negates the possibility of a separation between the two modes of learning via the quantum problem, and Arunachalam et al.\ posed the possibility of such a separation as an open question. In this work, we first present an algorithm for the Quantum Coupon Collector problem with sample complexity that matches the sharper lower bound of $(1-o_k(1))k\ln\min\{k,n-k+1\}$ shown recently by Bab Hadiashar, Nayak, and Sinha (IEEE TIT 2024), for the entire range of the parameter $k$. Next, we devise a variant of the problem, the Quantum Padded Coupon Collector. We prove that its sample complexity matches that of the classical Coupon Collector problem for both modes of learning, thereby exhibiting the same asymptotic separation between proper and improper quantum learning as mentioned above. The techniques we develop in the process can be directly applied to any form of padded quantum data. We hope that padding can more generally lift other forms of classical learning behaviour to the quantum setting.
Abstract:Although a concept class may be learnt more efficiently using quantum samples as compared with classical samples in certain scenarios, Arunachalam and de Wolf (JMLR, 2018) proved that quantum learners are asymptotically no more efficient than classical ones in the quantum PAC and Agnostic learning models. They established lower bounds on sample complexity via quantum state identification and Fourier analysis. In this paper, we derive optimal lower bounds for quantum sample complexity in both the PAC and agnostic models via an information-theoretic approach. The proofs are arguably simpler, and the same ideas can potentially be used to derive optimal bounds for other problems in quantum learning theory. We then turn to a quantum analogue of the Coupon Collector problem, a classic problem from probability theory also of importance in the study of PAC learning. Arunachalam, Belovs, Childs, Kothari, Rosmanis, and de Wolf (TQC, 2020) characterized the quantum sample complexity of this problem up to constant factors. First, we show that the information-theoretic approach mentioned above provably does not yield the optimal lower bound. As a by-product, we get a natural ensemble of pure states in arbitrarily high dimensions which are not easily (simultaneously) distinguishable, while the ensemble has close to maximal Holevo information. Second, we discover that the information-theoretic approach yields an asymptotically optimal bound for an approximation variant of the problem. Finally, we derive a sharp lower bound for the Quantum Coupon Collector problem, with the exact leading order term, via the Holevo-Curlander bounds on the distinguishability of an ensemble. All the aspects of the Quantum Coupon Collector problem we study rest on properties of the spectrum of the associated Gram matrix, which may be of independent interest.