A mega-constellation of low-earth orbit (LEO) satellites has the potential to enable long-range communication with low latency. Integrating this with burgeoning unmanned aerial vehicle (UAV) assisted non-terrestrial networks will be a disruptive solution for beyond 5G systems provisioning large scale three-dimensional connectivity. In this article, we study the problem of forwarding packets between two faraway ground terminals, through an LEO satellite selected from an orbiting constellation and a mobile high-altitude platform (HAP) such as a fixed-wing UAV. To maximize the end-to-end data rate, the satellite association and HAP location should be optimized, which is challenging due to a huge number of orbiting satellites and the resulting time-varying network topology. We tackle this problem using deep reinforcement learning (DRL) with a novel action dimension reduction technique. Simulation results corroborate that our proposed method achieves up to 5.74x higher average data rate compared to a direct communication baseline without SAT and HAP.