Aerial robots are increasingly being utilized for a wide range of environmental monitoring and exploration tasks. However, a key challenge is efficiently planning paths to maximize the information value of acquired data as an initially unknown environment is explored. To address this, we propose a new approach for informative path planning (IPP) based on deep reinforcement learning (RL). Bridging the gap between recent advances in RL and robotic applications, our method combines Monte Carlo tree search with an offline-learned neural network predicting informative sensing actions. We introduce several components making our approach applicable for robotic tasks with continuous high-dimensional state spaces and large action spaces. By deploying the trained network during a mission, our method enables sample-efficient online replanning on physical platforms with limited computational resources. Evaluations using synthetic data show that our approach performs on par with existing information-gathering methods while reducing runtime by a factor of 8-10. We validate the performance of our framework using real-world surface temperature data from a crop field.