Abstract:With the development of mobility-on-demand services, increasing sources of rich transportation data, and the advent of autonomous vehicles (AVs), there are significant opportunities for shared-use AV mobility services (SAMSs) to provide accessible and demand-responsive personal mobility. This paper focuses on the problem of anticipatory repositioning of idle vehicles in a SAMS fleet to enable better assignment decisions in serving future demand. The rebalancing problem is formulated as a Markov Decision Process and a reinforcement learning approach using an advantage actor critic (A2C) method is proposed to learn a rebalancing policy that anticipates future demand and cooperates with an optimization-based assignment strategy. The proposed formulation and solution approach allow for centralized repositioning decisions for the entire vehicle fleet but ensure that the problem size does not change with the size of the vehicle fleet. Using an agent-based simulation tool and New York City taxi data to simulate demand for rides in a SAMS system, two versions of the A2C AV repositioning approach are tested: A2C-AVR(A) observing past demand for rides and learning to anticipate future demand, and A2C-AVR(B) that receives demand forecasts. Numerical experiments demonstrate that the A2C-AVR approaches significantly reduce mean passenger wait times relative to an alternative optimization-based rebalancing approach, at the expense of slightly increased percentage of empty fleet miles travelled. The experiments show comparable performance between the A2C-AVR(A) and (B), indicating that the approach can anticipate future demand based on past demand observations. Testing with various demand and time-of-day scenarios, and an alternative assignment strategy, experiments demonstrate the models transferability to cases unseen at the training stage.