Abstract: Automated 3D pose estimation of satellites and other known space objects is a critical component of space situational awareness. Ground-based imagery offers a convenient data source for satellite characterization; however, analysis algorithms must contend with atmospheric distortion, variable lighting, and unknown reflectance properties. Traditional feature-based pose estimation approaches are unable to establish an accurate correspondence between a known 3D model and imagery in this challenging imaging environment. This paper presents a method for automated 3D pose estimation of known space objects in the absence of satisfactory texture. The proposed approach fits the silhouette of a known satellite model to ground-based imagery via particle filtering. Each particle contains enough information (orientation, position, scale, model articulation) to generate an accurate object silhouette. The silhouette of each particle is compared to an observed image. The comparison is probabilistic: we calculate the joint probability that pixels inside the silhouette belong to the foreground distribution and that pixels outside the silhouette belong to the background distribution. Both foreground and background distributions are estimated by observing empty space. The population of particles is resampled at each new image observation, with the probability of a particle being resampled proportional to how well its silhouette matches the observed image. The resampling process maintains multiple pose estimates, which helps prevent and escape local minima. Experiments were conducted on both commercial imagery and LEO satellite imagery. Imagery from the commercial experiments is shown in this paper.
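The silhouette-likelihood and resampling steps described above can be sketched as follows, assuming grayscale imagery and 256-bin intensity histograms for the foreground and background distributions; the function names, the histogram representation, and the multinomial resampling scheme are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def silhouette_log_likelihood(image, mask, fg_hist, bg_hist, eps=1e-12):
    """Joint log-probability that pixels inside the silhouette mask come from
    the foreground distribution and pixels outside it come from the background.

    image   : (H, W) uint8 grayscale observation
    mask    : (H, W) bool silhouette rendered from one particle's
              orientation, position, scale, and articulation
    fg_hist : (256,) foreground intensity PMF (estimated from empty space)
    bg_hist : (256,) background intensity PMF (estimated from empty space)
    """
    inside = image[mask]
    outside = image[~mask]
    return (np.log(fg_hist[inside] + eps).sum()
            + np.log(bg_hist[outside] + eps).sum())

def resample(particles, log_weights, rng):
    """Draw a new population with probability proportional to how well each
    particle's silhouette matched the observation (multinomial resampling)."""
    w = np.exp(log_weights - log_weights.max())  # stabilize before normalizing
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]
```

Rendering the silhouette mask from a particle's pose parameters is the remaining step and depends on the 3D model and camera projection in use; keeping many resampled particles alive is what lets the filter maintain multiple pose hypotheses.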
Abstract: Monocular visual SLAM has become an attractive practical approach for robot localization and 3D environment mapping, since cameras are small, lightweight, inexpensive, and produce high-rate, high-resolution data streams. Although numerous robust tools have been developed, most existing systems are designed to operate in terrestrial environments and at relatively small scale (a few thousand frames) due to constraints on computation and storage. In this paper, we present a feature-based visual SLAM system for aerial video whose simple design permits near real-time operation, and whose scalability permits large-area mapping using tens of thousands of frames, all on a single conventional computer. Our approach consists of two parallel threads: the first incrementally creates small, locally consistent submaps and estimates camera poses at video rate; the second aligns these submaps with one another to produce a single globally consistent map via factor graph optimization over both poses and landmarks. Scale drift is minimized through the use of 7-degree-of-freedom similarity transformations during submap alignment. We quantify our system's performance on both simulated and real data sets, and demonstrate city-scale map reconstruction accurate to within 2 meters using nearly 90,000 aerial video frames; to our knowledge, this is the largest and fastest such reconstruction to date.
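As an illustration of the 7-degree-of-freedom alignment step, the sketch below recovers a similarity transform (scale, rotation, translation) from corresponding landmarks in two overlapping submaps using the closed-form Umeyama method. This is one plausible way to obtain or initialize such a transform and is an assumption on our part; the paper states that alignment is performed via factor graph optimization over both poses and landmarks.

```python
import numpy as np

def umeyama_sim3(src, dst):
    """Closed-form least-squares similarity transform (Umeyama, 1991):
    finds scale s, rotation R, translation t minimizing
    sum_i || dst_i - (s * R @ src_i + t) ||^2.

    src, dst : (N, 3) arrays of corresponding landmark positions
               in two overlapping submaps.
    Returns (s, R, t) with R a proper rotation (det R = +1).
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    x, y = src - mu_src, dst - mu_dst
    cov = y.T @ x / len(src)                  # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # avoid a reflection solution
    R = U @ S @ Vt
    var_src = (x ** 2).sum() / len(src)       # mean squared centered norm
    s = np.trace(np.diag(D) @ S) / var_src    # optimal scale
    t = mu_dst - s * (R @ mu_src)
    return s, R, t
```

In a full system, such pairwise estimates would typically serve as relative Sim(3) constraints between submaps in the global factor graph, so that scale error is distributed across the map rather than accumulating along the trajectory.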