We study the optimal scheduling of a collection of multi-component application jobs in an edge computing system that consists of geo-distributed edge computing nodes connected through a wide area network. Scheduling and placing application jobs in an edge system is challenging because of the interdependence among the components of each job, the communication delays between the geographically distributed data sources and edge nodes, and their dynamic availability. In this paper, we explore the feasibility of applying a Deep Reinforcement Learning (DRL)-based design to address these challenges. We introduce a DRL actor-critic algorithm that aims to find an optimal scheduling policy that minimizes the average job slowdown in the edge system. Simulations on both synthetic data and a Google cloud data trace show that our design outperforms a few existing algorithms.