Abstract:In this paper, we propose a novel and generalizable zero-shot knowledge transfer framework that distills expert sports navigation strategies from web videos into robotic systems with adversarial constraints and out-of-distribution image trajectories. Our pipeline enables diffusion-based imitation learning by reconstructing the full 3D task space from multiple partial views, warping it into 2D image space, closing the planning loop within this 2D space, and transfer constrained motion of interest back to task space. Additionally, we demonstrate that the learned policy can serve as a local planner in conjunction with position control. We apply this framework in the wheelchair tennis navigation problem to guide the wheelchair into the ball-hitting region. Our pipeline achieves a navigation success rate of 97.67% in reaching real-world recorded tennis ball trajectories with a physical robot wheelchair, and achieve a success rate of 68.49% in a real-world, real-time experiment on a full-sized tennis court.