Abstract: Capturing real-world aerial images for vision-based navigation (VBN) is challenging due to limited availability and conditions that make it nearly impossible to access all desired images from any location. The complexity increases when multiple locations are involved. State-of-the-art solutions, such as flying a UAV (Unmanned Aerial Vehicle) to take pictures or using existing research databases, have significant limitations. SkyAI Sim offers a compelling alternative by simulating a UAV to capture bird's-eye view satellite images at zero yaw with real-world visible-band specifications. This open-source tool allows users to specify the bounding-box (top-left and bottom-right) coordinates of any region on a map. Without the need to physically fly a drone, the virtual Python UAV performs a raster search to capture satellite images using the Google Maps Static API. Users can define parameters such as flight altitude, the camera's aspect ratio and diagonal field of view, and the overlap between consecutive images. SkyAI Sim's capabilities range from capturing a few low-altitude images for basic applications to generating extensive datasets of entire cities for complex tasks like deep learning. This versatility makes SkyAI Sim a valuable tool not only for VBN but also for other applications, including environmental monitoring, construction, and city management. The open-source nature of the tool also allows the raster search to be extended to other missions. A dataset of Memphis, TN, is provided along with this simulator; it was partially generated using SkyAI Sim and also includes data from a 3D world-generation package for comparison.
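
For illustration, the sketch below shows the kind of raster capture described above, assuming the public Google Maps Static API endpoint; the grid spacing, zoom level, and image size are placeholders, since SkyAI Sim's actual mapping from flight altitude, field of view, and overlap to these parameters is not given in the abstract.

```python
# Minimal illustrative sketch (not SkyAI Sim's actual code): a raster sweep over a
# bounding box that downloads zero-yaw satellite tiles from the Google Maps Static API.
# The grid step, zoom level, and image size are placeholders; SkyAI Sim derives them
# from flight altitude, camera field of view, and the requested overlap.
import requests

API_KEY = "YOUR_GOOGLE_MAPS_API_KEY"       # assumed: user-supplied API key
TOP_LEFT = (35.20, -90.10)                 # (lat, lon) of the bounding box, e.g. Memphis, TN
BOTTOM_RIGHT = (35.10, -89.95)
STEP_DEG = 0.01                            # illustrative grid spacing (controls overlap)
ZOOM, SIZE = 18, "640x640"                 # illustrative zoom level and image size

def fetch_tile(lat, lon):
    """Request one bird's-eye satellite image centered on (lat, lon)."""
    params = {
        "center": f"{lat},{lon}",
        "zoom": ZOOM,
        "size": SIZE,
        "maptype": "satellite",
        "key": API_KEY,
    }
    resp = requests.get("https://maps.googleapis.com/maps/api/staticmap", params=params)
    resp.raise_for_status()
    return resp.content                    # PNG bytes

# Raster sweep over the bounding box, saving one tile per grid point.
lat, row = TOP_LEFT[0], 0
while lat >= BOTTOM_RIGHT[0]:
    lon, col = TOP_LEFT[1], 0
    while lon <= BOTTOM_RIGHT[1]:
        with open(f"tile_r{row}_c{col}.png", "wb") as f:
            f.write(fetch_tile(lat, lon))
        lon += STEP_DEG
        col += 1
    lat -= STEP_DEG
    row += 1
```
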
Abstract: In this article, we propose a framework for contactless human-computer interaction (HCI) using novel deep-learning-based super-resolution and tracking techniques. Our system offers unprecedented high-resolution tracking of hand position and motion characteristics by leveraging spatial and temporal features embedded in the reflected radar waveform. Rather than classifying samples from a predefined set of hand gestures, as is common in existing work on deep learning with mmWave radar, our proposed imager employs a regressive fully convolutional neural network (FCNN) approach to improve localization accuracy through spatial super-resolution. While the proposed techniques are suitable for a host of tracking applications, this article focuses on their application as a musical interface to demonstrate the robustness of the gesture-sensing pipeline and deep learning signal processing chain. The user controls the instrument by varying the position and velocity of their hand above the vertically facing sensor. By employing a commercially available multiple-input-multiple-output (MIMO) radar rather than a traditional optical sensor, our framework demonstrates the efficacy of the mmWave sensing modality for fine motion tracking and offers an elegant solution to a host of HCI tasks. Additionally, we provide a freely available software package and user interface for controlling the device, streaming the data to MATLAB in real time, and increasing accessibility to the signal processing and device interface functionality utilized in this article.
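
For illustration, the sketch below shows a regressive fully convolutional network of the kind described above, assuming PyTorch and a range-azimuth radar heatmap as input; the article's actual architecture, input representation, and training details are not given in this abstract.

```python
# Minimal illustrative sketch (assumptions: PyTorch, range-azimuth heatmap input).
# The idea is a fully convolutional network that regresses a super-resolved spatial
# map of the hand position rather than classifying a fixed set of gestures.
import torch
import torch.nn as nn

class FCNNRegressor(nn.Module):
    def __init__(self, in_ch=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Upsampling head produces a finer spatial grid than the raw radar heatmap.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.Conv2d(32, 1, 1),               # per-pixel regression output
        )

    def forward(self, x):                      # x: (batch, 1, range_bins, azimuth_bins)
        return self.decoder(self.encoder(x))   # (batch, 1, 2*range_bins, 2*azimuth_bins)

# Example: a coarse 32x32 radar heatmap regressed to a 64x64 super-resolved map.
heatmap = torch.randn(1, 1, 32, 32)
model = FCNNRegressor()
print(model(heatmap).shape)                    # torch.Size([1, 1, 64, 64])
```
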