Abstract:There has been a paradigm-shift in urban logistic services in the last years; demand for real-time, instant mobility and delivery services grows. This poses new challenges to logistic service providers as the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. Searching the combinatorial action space for efficient routing actions is by itself a complex task of mixed-integer programming (MIP) well-known by the operations research community. This complexity is now multiplied by the challenge of evaluating such actions with respect to their effectiveness given future dynamism and uncertainty, a potentially ideal case for reinforcement learning (RL) well-known by the computer science community. For solving SDVRPs, joint work of both communities is needed, but as we show, essentially non-existing. Both communities focus on their individual strengths leaving potential for improvement. Our survey paper highlights this potential in research originating from both communities. We point out current obstacles in SDVRPs and guide towards joint approaches to overcome them.