Building on the current point-to-point navigation capabilities of autonomous vehicles, researchers are exploring complex service requests that require a vehicle to visit multiple points of interest. In this paper, we develop GLAD, a layered planning framework for complex service requests in autonomous urban driving. GLAD comprises three layers, for service-level, behavior-level, and motion-level planning. The framework is unique in its tight coupling: the layers exchange user preferences, safety estimates, and motion costs so that the system is optimized as a whole. GLAD is visually grounded through perceptual learning on a dataset of 13.8k instances collected from driving behaviors. Together, these components enable autonomous vehicles to fulfill complex service requests efficiently and safely. Experimental results from both abstract and full simulation show that our system outperforms competitive baselines from the literature.
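
To illustrate the kind of coupling described above, the following minimal sketch shows a service-level planner choosing a visiting order for points of interest using motion costs and safety estimates supplied by lower layers. It is an illustration under stated assumptions, not GLAD's actual algorithm: the location names, the cost and safety tables, the additive risk penalty, and the functions `plan_cost` and `best_service_plan` are all hypothetical.

```python
from itertools import permutations

# Hypothetical motion-level costs (e.g., travel time) for each leg.
MOTION_COST = {
    ("home", "gas"): 4.0,
    ("home", "grocery"): 6.0,
    ("gas", "grocery"): 3.0,
    ("grocery", "gas"): 3.0,
}

# Hypothetical behavior-level safety estimates: probability that
# the driving behaviors on each leg can be executed safely.
SAFETY = {
    ("home", "gas"): 0.60,
    ("home", "grocery"): 0.95,
    ("gas", "grocery"): 0.95,
    ("grocery", "gas"): 0.95,
}


def plan_cost(order, start="home", risk_weight=10.0):
    """Sum motion cost plus a weighted risk penalty over consecutive legs.

    The additive form cost + risk_weight * (1 - safety) is an assumption
    made for this sketch only.
    """
    stops = [start, *order]
    total = 0.0
    for a, b in zip(stops, stops[1:]):
        total += MOTION_COST[(a, b)] + risk_weight * (1.0 - SAFETY[(a, b)])
    return total


def best_service_plan(pois, start="home", risk_weight=10.0):
    """Service level: enumerate visiting orders and keep the cheapest one."""
    return min(
        permutations(pois),
        key=lambda order: plan_cost(order, start, risk_weight),
    )


if __name__ == "__main__":
    pois = ["gas", "grocery"]
    print(best_service_plan(pois, risk_weight=0.0))
    print(best_service_plan(pois, risk_weight=10.0))
```

With `risk_weight=0.0` the planner considers motion cost alone and visits the gas station first. With `risk_weight=10.0` the low safety estimate on the first leg outweighs the shorter distance, so the order flips. This is the sense in which information from lower layers can change a service-level decision.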