Recently, mean field control (MFC) has provided a tractable and theoretically grounded approach to otherwise difficult cooperative multi-agent control. However, its assumption of many independent, homogeneous agents may be too stringent in practice. In this work, we propose major-minor mean field control (M3FC), a novel discrete-time generalization of Markov decision processes and MFC to settings with both many minor agents and potentially complex major agents. In contrast to deterministic MFC, M3FC allows for stochastic minor agent distributions with strong correlation between minor agents through the major agent state, which can model arbitrary problem details not bound to any agent. Theoretically, we establish rigorous approximation properties with novel proofs for both M3FC and existing MFC models in the finite multi-agent problem, together with a dynamic programming principle for solving such problems. In the infinite-horizon discounted case, the existence of an optimal stationary policy follows. Algorithmically, we propose the major-minor mean field proximal policy optimization algorithm (M3FPPO) as a novel multi-agent reinforcement learning algorithm and demonstrate its success in illustrative M3FC-type problems.