Abstract: The logistics of transporting a package from a storage facility to the consumer's front door usually employs highly specialized robots, often splitting sub-tasks across different systems, e.g., manipulator arms to sort and wheeled vehicles to deliver. More recent endeavors attempt a unified approach with legged and humanoid robots. These solutions, however, occupy large amounts of space, reducing the number of packages that fit into a delivery vehicle. As a result, such bulky robotic systems limit scalability and task parallelization. In this paper, we introduce LIMMS (Latching Intelligent Modular Mobility System) to address both the manipulation and delivery portions of a typical last-mile delivery while maintaining a minimal spatial footprint. LIMMS is a symmetrically designed, 6-degree-of-freedom (DoF) appendage-like robot with wheels and latching mechanisms at both ends. By latching onto a surface and anchoring at one end, LIMMS can function as a traditional 6-DoF manipulator arm. Alternatively, multiple LIMMS units can latch onto a single box and behave like a legged robotic system in which the package is the body. During transit, LIMMS folds up compactly and takes up far less space than traditional robotic systems. A large group of LIMMS units can fit inside a single delivery vehicle, opening the potential for new delivery optimization and hybrid planning methods. In this paper, the feasibility of LIMMS is studied and demonstrated with a hardware prototype as well as simulation results for a range of sub-tasks in a typical last-mile delivery.
Abstract: Practitioners often rely on compute-intensive domain randomization to ensure that reinforcement learning policies trained in simulation transfer robustly to the real world. Due to unmodeled nonlinearities in the real system, however, even such policies can fail to perform stably enough to acquire experience in real environments. In this paper we propose a novel method that guarantees a stable region of attraction for the output of a policy trained in simulation, even for highly nonlinear systems. Our core technique is to construct the controller from "bias-shifted" neural networks and to train the network in the simulator. The modified neural networks not only capture the nonlinearities of the system but also provably preserve linearity in a certain region of the state space, and thus can be tuned to resemble a linear quadratic regulator (LQR) that is known to be stable for the real system. We have tested the method by transferring simulated policies for an inverted pendulum swing-up task to a real system and demonstrated its efficacy.
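To make the bias-shifting idea concrete, below is a minimal sketch (not the paper's exact construction) of a two-layer ReLU policy whose hidden biases are shifted so that every unit stays active in a neighborhood of the origin; in that region the policy is exactly linear, and its local gain can be regularized toward a known-stable LQR gain. All names, dimensions, the shift magnitude, and the regularizer are illustrative assumptions.

```python
import torch
import torch.nn as nn


class BiasShiftedPolicy(nn.Module):
    """Sketch of a "bias-shifted" policy that is exactly linear near x = 0.

    Setting every hidden bias to +shift keeps each ReLU unit active whenever
    its pre-activation w_i . x + shift stays positive, so in that region
    u(x) = (W2 @ W1) x, a linear feedback law that can be tuned to resemble
    a stabilizing LQR. (The paper's full method would presumably constrain
    the biases during training to preserve this property; only the
    construction is shown here.)
    """

    def __init__(self, state_dim, act_dim, hidden_dim, shift=5.0):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, act_dim)
        with torch.no_grad():
            # Shift hidden biases so all units are active near the origin.
            self.fc1.bias.fill_(shift)
            # Cancel the resulting constant term so u(0) = 0, leaving the
            # pure linear map u(x) = W2 @ W1 @ x inside the active region.
            self.fc2.bias.copy_(-shift * self.fc2.weight.sum(dim=1))

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

    def local_gain(self):
        # Effective linear gain where all ReLU units are active.
        return self.fc2.weight @ self.fc1.weight


# Usage sketch: alongside the usual simulator RL objective, penalize the
# deviation of the local gain from a stabilizing LQR gain K (assumed given,
# e.g., from a linearized model of the real pendulum).
policy = BiasShiftedPolicy(state_dim=4, act_dim=1, hidden_dim=64)
K = torch.randn(1, 4)  # placeholder for a real LQR gain matrix
lqr_penalty = ((policy.local_gain() + K) ** 2).sum()  # drives u(x) -> -K x locally
```

Under these assumptions, the region of attraction claim follows because the closed-loop dynamics near the origin coincide with those of the (known-stable) linear controller, while the network remains free to behave nonlinearly elsewhere in the state space.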