Abstract:Reliable pose estimation of uncooperative satellites is a key technology for enabling future on-orbit servicing and debris removal missions. The Kelvins Satellite Pose Estimation Challenge aims at evaluating and comparing monocular vision-based approaches and pushing the state-of-the-art on this problem. This work is based on the Satellite Pose Estimation Dataset, the first publicly available machine learning set of synthetic and real spacecraft imagery. The choice of dataset reflects one of the unique challenges associated with spaceborne computer vision tasks, namely the lack of spaceborne images to train and validate the developed algorithms. This work briefly reviews the basic properties and the collection process of the dataset which was made publicly available. The competition design, including the definition of performance metrics and the adopted testbed, is also discussed. Furthermore, the submissions of the 48 participants are analyzed to compare the performance of their approaches and uncover what factors make the satellite pose estimation problem especially challenging.
Abstract:This work presents a novel Convolutional Neural Network (CNN) architecture and a training procedure to enable robust and accurate pose estimation of a noncooperative spacecraft. First, a new CNN architecture is introduced that has scored a fourth place in the recent Pose Estimation Challenge hosted by Stanford's Space Rendezvous Laboratory (SLAB) and the Advanced Concepts Team (ACT) of the European Space Agency (ESA). The proposed architecture first detects the object by regressing a 2D bounding box, then a separate network regresses the 2D locations of the known surface keypoints from an image of the target cropped around the detected Region-of-Interest (RoI). In a single-image pose estimation problem, the extracted 2D keypoints can be used in conjunction with corresponding 3D model coordinates to compute relative pose via the Perspective-n-Point (PnP) problem. These keypoint locations have known correspondences to those in the 3D model, since the CNN is trained to predict the corners in a pre-defined order, allowing for bypassing the computationally expensive feature matching processes. This work also introduces and explores the texture randomization to train a CNN for spaceborne applications. Specifically, Neural Style Transfer (NST) is applied to randomize the texture of the spacecraft in synthetically rendered images. It is shown that using the texture-randomized images of spacecraft for training improves the network's performance on spaceborne images without exposure to them during training. It is also shown that when using the texture-randomized spacecraft images during training, regressing 3D bounding box corners leads to better performance on spaceborne images than regressing surface keypoints, as NST inevitably distorts the spacecraft's geometric features to which the surface keypoints have closer relation.
Abstract:This work introduces the Spacecraft Pose Network (SPN) for on-board estimation of the pose, i.e., the relative position and attitude, of a known non-cooperative spacecraft using monocular vision. In contrast to other state-of-the-art pose estimation approaches for spaceborne applications, the SPN method does not require the formulation of hand-engineered features and only requires a single grayscale image to determine the pose of the spacecraft relative to the camera. The SPN method uses a Convolutional Neural Network (CNN) with three branches to solve for the pose. The first branch of the CNN bootstraps a state-of-the-art object detector to detect a 2D bounding box around the target spacecraft. The region inside the bounding box is then used by the other two branches of the CNN to determine the attitude by initially classifying the input region into discrete coarse attitude labels before regressing to a finer estimate. The SPN method then uses a novel Gauss-Newton algorithm to estimate the position by using the constraints imposed by the detected 2D bounding box and the estimated attitude. The secondary contribution of this work is the generation of the Spacecraft PosE Estimation Dataset (SPEED). SPEED consists of synthetic as well as actual camera images of a mock-up of the Tango spacecraft from the PRISMA mission. The synthetic images are created by fusing OpenGL-based renderings of the spacecraft's 3D model with actual images of the Earth captured by the Himawari-8 meteorological satellite. The actual camera images are created using a 7 degrees-of-freedom robotic arm, which positions and orients a vision-based sensor with respect to a full-scale mock-up of the Tango spacecraft. The SPN method, trained only on synthetic images, produces degree-level attitude error and cm-level position errors when evaluated on the actual camera images not used during training.
Abstract:On-board estimation of the pose of an uncooperative target spacecraft is an essential task for future on-orbit servicing and close-proximity formation flying missions. However, two issues hinder reliable on-board monocular vision based pose estimation: robustness to illumination conditions due to a lack of reliable visual features and scarcity of image datasets required for training and benchmarking. To address these two issues, this work details the design and validation of a monocular vision based pose determination architecture for spaceborne applications. The primary contribution to the state-of-the-art of this work is the introduction of a novel pose determination method based on Convolutional Neural Networks (CNN) to provide an initial guess of the pose in real-time on-board. The method involves discretizing the pose space and training the CNN with images corresponding to the resulting pose labels. Since reliable training of the CNN requires massive image datasets and computational resources, the parameters of the CNN must be determined prior to the mission with synthetic imagery. Moreover, reliable training of the CNN requires datasets that appropriately account for noise, color, and illumination characteristics expected in orbit. Therefore, the secondary contribution of this work is the introduction of an image synthesis pipeline, which is tailored to generate high fidelity images of any spacecraft 3D model. The proposed technique is scalable to spacecraft of different structural and physical properties as well as robust to the dynamic illumination conditions of space. Through metrics measuring classification and pose accuracy, it is shown that the presented architecture has desirable robustness and scalable properties.