Abstract:We consider the problem of cross-sensor domain adaptation in the context of LiDAR-based 3D object detection and propose Stationary Object Aggregation Pseudo-labelling (SOAP) to generate high quality pseudo-labels for stationary objects. In contrast to the current state-of-the-art in-domain practice of aggregating just a few input scans, SOAP aggregates entire sequences of point clouds at the input level to reduce the sensor domain gap. Then, by means of what we call quasi-stationary training and spatial consistency post-processing, the SOAP model generates accurate pseudo-labels for stationary objects, closing a minimum of 30.3% domain gap compared to few-frame detectors. Our results also show that state-of-the-art domain adaptation approaches can achieve even greater performance in combination with SOAP, in both the unsupervised and semi-supervised settings.
Abstract:Modelling pedestrian behavior is crucial in the development and testing of autonomous vehicles. In this work, we present a hierarchical pedestrian behavior model that generates high-level decisions through the use of behavior trees, in order to produce maneuvers executed by a low-level motion planner using an adapted Social Force model. A full implementation of our work is integrated into GeoScenario Server, a scenario definition and execution engine, extending its vehicle simulation capabilities with pedestrian simulation. The extended environment allows simulating test scenarios involving both vehicles and pedestrians to assist in the scenario-based testing process of autonomous vehicles. The presented hierarchical model is evaluated on two real-world data sets collected at separate locations with different road structures. Our model is shown to replicate the real-world pedestrians' trajectories with a high degree of fidelity and a decision-making accuracy of 98% or better, given only high-level routing information for each pedestrian.
Abstract:We consider the challenge of finding a deterministic policy for a Markov decision process that uniformly (in all states) maximizes one reward subject to a probabilistic constraint over a different reward. Existing solutions do not fully address our precise problem definition, which nevertheless arises naturally in the context of safety-critical robotic systems. This class of problem is known to be hard, but the combined requirements of determinism and uniform optimality can create learning instability. In this work, after describing and motivating our problem with a simple example, we present a suitable constrained reinforcement learning algorithm that prevents learning instability, using recursive constraints. Our proposed approach admits an approximative form that improves efficiency and is conservative w.r.t. the constraint.
Abstract:We explore the complex design space of behaviour planning for autonomous driving. Design choices that successfully address one aspect of behaviour planning can critically constrain others. To aid the design process, in this work we decompose the design space with respect to important choices arising from the current state of the art approaches, and describe the resulting trade-offs. In doing this, we also identify interesting directions of future work.
Abstract:Machine learning can provide efficient solutions to the complex problems encountered in autonomous driving, but ensuring their safety remains a challenge. A number of authors have attempted to address this issue, but there are few publicly-available tools to adequately explore the trade-offs between functionality, scalability, and safety. We thus present WiseMove, a software framework to investigate safe deep reinforcement learning in the context of motion planning for autonomous driving. WiseMove adopts a modular learning architecture that suits our current research questions and can be adapted to new technologies and new questions. We present the details of WiseMove, demonstrate its use on a common traffic scenario, and describe how we use it in our ongoing safe learning research.
Abstract:Markov decision processes (MDP) are useful to model concurrent process optimisation problems, but verifying them with numerical methods is often intractable. Existing approximative approaches do not scale well and are limited to memoryless schedulers. Here we present the basis of scalable verification for MDPSs, using an O(1) memory representation of history-dependent schedulers. We thus facilitate scalable learning techniques and the use of massively parallel verification.