We propose a general machine learning-based framework for building an accurate and widely-applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of labels. We demonstrate that the functional that results from this training procedure gives chemically accurate predictions on energy, force, dipole, and electron density for a large class of molecules. It can be continuously improved when more and more data are available.