Abstract:Automated 3D pose estimation of satellites and other known space objects is a critical component of space situational awareness. Ground-based imagery offers a convenient data source for satellite characterization; however, analysis algorithms must contend with atmospheric distortion, variable lighting, and unknown reflectance properties. Traditional feature-based pose estimation approaches are unable to discover an accurate correlation between a known 3D model and imagery given this challenging image environment. This paper presents an innovative method for automated 3D pose estimation of known space objects in the absence of satisfactory texture. The proposed approach fits the silhouette of a known satellite model to ground-based imagery via particle filtering. Each particle contains enough information (orientation, position, scale, model articulation) to generate an accurate object silhouette. The silhouette of each particle is compared to an observed image. Comparison is done probabilistically by calculating the joint probability that pixels inside the silhouette belong to the foreground distribution and that pixels outside the silhouette belong to the background distribution. Both foreground and background distributions are computed by observing empty space. The population of particles is resampled at each new image observation, with the probability of a particle being resampled proportional to how well the particle's silhouette matches the observation image. The resampling process maintains multiple pose estimates, which helps in avoiding and escaping local minima. Experiments were conducted on both commercial imagery and LEO satellite imagery. Imagery from the commercial experiments is shown in this paper.
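The silhouette comparison and resampling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each particle's silhouette has already been rendered to a binary mask, that pixels are independent, and that the foreground and background intensity distributions are single Gaussians. The function names and parameters are hypothetical.

```python
import math
import random

def gaussian_logpdf(x, mean, std):
    """Log density of a 1D Gaussian (assumed intensity model)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)

def silhouette_log_likelihood(image, mask, fg, bg):
    """Joint log-probability of the image given a candidate silhouette.

    image: 2D list of pixel intensities.
    mask:  2D list of booleans, True inside the rendered silhouette.
    fg, bg: (mean, std) of the assumed foreground / background Gaussians.
    Pixels are treated as independent, so log-probabilities add.
    """
    total = 0.0
    for image_row, mask_row in zip(image, mask):
        for value, inside in zip(image_row, mask_row):
            mean, std = fg if inside else bg
            total += gaussian_logpdf(value, mean, std)
    return total

def resample(particles, log_weights, rng):
    """Draw a new population with probability proportional to each weight."""
    peak = max(log_weights)  # subtract the peak to avoid underflow in exp
    weights = [math.exp(lw - peak) for lw in log_weights]
    return rng.choices(particles, weights=weights, k=len(particles))
```

In this sketch, a particle whose mask covers the bright object pixels receives a higher log-likelihood than one that does not, and is therefore more likely to survive resampling.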
Abstract:Monocular visual SLAM has become an attractive practical approach for robot localization and 3D environment mapping, since cameras are small, lightweight, inexpensive, and produce high-rate, high-resolution data streams. Although numerous robust tools have been developed, most existing systems are designed to operate in terrestrial environments and at relatively small scale (a few thousand frames) due to constraints on computation and storage. In this paper, we present a feature-based visual SLAM system for aerial video whose simple design permits near real-time operation, and whose scalability permits large-area mapping using tens of thousands of frames, all on a single conventional computer. Our approach consists of two parallel threads: the first incrementally creates small locally consistent submaps and estimates camera poses at video rate; the second aligns these submaps with one another to produce a single globally consistent map via factor graph optimization over both poses and landmarks. Scale drift is minimized through the use of 7-degree-of-freedom similarity transformations during submap alignment. We quantify our system's performance on both simulated and real data sets, and demonstrate city-scale map reconstruction accurate to within 2 meters using nearly 90,000 aerial video frames; to our knowledge, this is the largest and fastest such reconstruction to date.
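A 7-degree-of-freedom similarity transformation combines a rotation (3 degrees of freedom), a translation (3), and a uniform scale (1). The sketch below shows only how such a transformation acts on a 3D point and how it is inverted. It says nothing about how the system estimates the transformations, and all function names are hypothetical.

```python
import math

def rot_z(theta):
    """Rotation matrix about the z axis, used here as a simple example rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(R, p):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]

def transpose(R):
    """Transpose of a 3x3 matrix (the inverse of a rotation)."""
    return [[R[j][i] for j in range(3)] for i in range(3)]

def apply_sim3(scale, R, t, p):
    """Map point p to scale * R p + t."""
    rp = mat_vec(R, p)
    return [scale * rp[i] + t[i] for i in range(3)]

def invert_sim3(scale, R, t):
    """Return (scale, R, t) of the inverse similarity transformation."""
    Rt = transpose(R)
    rt = mat_vec(Rt, t)
    return 1.0 / scale, Rt, [-x / scale for x in rt]
```

Because the scale is an explicit parameter, aligning two submaps with such a transformation can absorb a difference in scale between them, which a 6-degree-of-freedom rigid transformation cannot.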
Abstract:An efficient, fully automatic method for 3D face shape and pose estimation in unconstrained 2D imagery is presented. The proposed method jointly estimates a dense set of 3D landmarks and facial geometry using a single pass of a modified version of the popular "U-Net" neural network architecture. Additionally, we propose a method for directly estimating a set of 3D Morphable Model (3DMM) parameters, using the estimated 3D landmarks and geometry as constraints in a simple linear system. Qualitative modeling results are presented, as well as quantitative evaluation of predicted 3D face landmarks in unconstrained video sequences.
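As an illustration of how model parameters can be recovered from 3D constraints with a linear system, the sketch below fits the coefficients of a linear shape model (observed shape ≈ mean shape + weighted sum of basis vectors) by solving the least-squares normal equations. The abstract does not specify the form of its linear system, so the least-squares formulation, the flattened coordinate layout, and every name here are assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_coefficients(mean, basis, observed):
    """Least-squares coefficients a such that mean + sum_k a[k] * basis[k] ~ observed.

    mean, observed: flattened landmark coordinates [x0, y0, z0, x1, ...].
    basis: list of basis vectors, each the same length as mean.
    """
    residual = [o - m for o, m in zip(observed, mean)]
    k = len(basis)
    # Normal equations: (B B^T) a = B r, with the basis vectors as rows of B.
    gram = [[sum(u * v for u, v in zip(basis[i], basis[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(u * r for u, r in zip(basis[i], residual)) for i in range(k)]
    return solve_linear(gram, rhs)
```

With more coordinates than coefficients the system is overdetermined, and the normal equations give the coefficients that minimize the squared distance between the model and the observed landmarks.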
Abstract:Commercial off-the-shelf (COTS) 3D scanners are capable of generating point clouds covering visible portions of a face with sub-millimeter accuracy at close range, but lack the coverage and specialized anatomic registration provided by more expensive 3D facial scanners. We demonstrate an effective pipeline for joint alignment of multiple unstructured 3D point clouds and registration to a parameterized 3D model which represents shape variation of the human head. Most algorithms separate the problems of pose estimation and mesh warping; however, we propose a new iterative method in which these steps are interwoven. Error decreases with each iteration, showing that the proposed approach is effective in improving geometry and alignment. The approach described is used to align the NDOff-2007 dataset, which contains 7,358 individual scans at various poses of 396 subjects. The dataset has a number of full profile scans which are correctly aligned and contribute directly to the associated mesh geometry. The dataset in its raw form contains a significant number of mislabeled scans, which are identified and corrected based on alignment error using the proposed algorithm. The average point-to-surface distance between the aligned scans and the produced geometries is one half millimeter.