The accurate modeling and control of nonlinear dynamical effects are crucial for numerous robotic systems. The Koopman formalism emerges as a valuable tool for linear control design in nonlinear systems within unknown environments. However, it still remains a challenging task to learn the Koopman operator with control from data, and in particular, the simultaneous identification of the Koopman linear dynamics and the mapping between the state and Koopman spaces. Conventional approaches, based on single-level unconstrained optimization, may lack model robustness, training efficiency, and long-term predictive accuracy. This paper presents a bi-level optimization framework that jointly learns the Koopman embedding mapping and Koopman dynamics with explicit multi-step dynamical constraints, eliminating the need for heuristically-tuned loss terms. Leveraging implicit differentiation, our formulation allows back-propagation in standard learning framework and the use of state-of-the-art optimizers, yielding more stable and robust system performance over various applications compared to conventional methods.