Abstract: Synthetic Aperture Radar (SAR) images are prone to noise contamination, which makes target recognition in SAR images very difficult. Inspired by the great success of very deep convolutional neural networks (CNNs), this paper proposes a robust feature extraction method for SAR image target classification that adaptively fuses effective features from different CNN layers. First, a YOLOv4 network is fine-tuned to detect the targets in the respective MF SAR target images. Second, a very deep CNN is trained from scratch on the moving and stationary target acquisition and recognition (MSTAR) database, using small filters throughout the whole network to reduce speckle noise. In addition, small convolution filters decrease the number of parameters in each layer and therefore reduce the computation cost as the CNN goes deeper. The resulting CNN model can extract very deep features from the target images without any noise filtering or pre-processing. Third, multi-canonical correlation analysis (MCCA) is used to adaptively learn CNN features from different layers such that the resulting representations are highly linearly correlated and can therefore achieve better classification accuracy even with a simple linear support vector machine. Experimental results on the MSTAR dataset demonstrate that the proposed method outperforms the state-of-the-art methods.
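The parameter argument in the SAR abstract above can be checked with simple arithmetic. The following is a minimal sketch, not the paper's network: the channel width of 64 and the 3x3 versus 7x7 comparison are assumptions chosen for illustration, and biases are ignored.

```python
def conv_params(kernel, channels_in, channels_out):
    """Weights in one square convolution layer (biases ignored)."""
    return kernel * kernel * channels_in * channels_out


def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


channels = 64  # hypothetical channel width, not taken from the paper

# Three stacked 3x3 layers see the same 7x7 window as a single 7x7 layer.
small_stack = 3 * conv_params(3, channels, channels)
single_large = conv_params(7, channels, channels)

print(receptive_field(3, 3), receptive_field(7, 1))
print(small_stack, single_large)
```

Under these assumptions the stack of small filters covers the same input window with roughly 45% fewer weights, which is the kind of saving the abstract refers to.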
Abstract: 3D alignment has become a very important part of 3D scanning technology. The alignment process can be divided into four steps: key point detection, key point description, initial pose estimation, and alignment refinement. Researchers have contributed several approaches to the literature for each step, which creates a natural need for a comparative study to support an informed choice. In this work, we describe and evaluate the different methods used for 3D registration, with a special focus on RGB-D data, to find the best combinations that permit a complete and more accurate 3D reconstruction of indoor scenes with cheap depth cameras.
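As a rough illustration of the initial pose estimation step named in the registration abstract above, the sketch below fits a least-squares rigid transform to already matched key points. It is deliberately reduced to 2D so that it has a closed form without a linear algebra library; the 3D case is usually solved with an SVD-based method instead. The function name `estimate_rigid_2d` is hypothetical, and this is not one of the methods evaluated in the paper.

```python
import math


def estimate_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target."""
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        # Work with coordinates relative to each centroid.
        px, py, qx, qy = px - sx, py - sy, qx - tx, qy - ty
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    angle = math.atan2(cross, dot)
    c, s = math.cos(angle), math.sin(angle)
    # Translation that maps the rotated source centroid onto the target centroid.
    return angle, (tx - (c * sx - s * sy), ty - (s * sx + c * sy))


# Matched key points: the target is the source rotated by 90 degrees, then shifted by (2, 3).
source_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target_points = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
angle, translation = estimate_rigid_2d(source_points, target_points)
```

The estimate is only as good as the correspondences it is given, which is why the detection, description, and refinement steps matter in the full pipeline.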
Abstract: Road networks are stored in GIS databases as polylines with attributes. Such a representation makes the geographic data impractical for 3D road traffic simulation. In this work, we propose a method to transform raw GIS data into a realistic, operational model for real-time road traffic simulation. The proposed raw-to-simulation-ready data transformation is achieved through several curvature estimation, interpolation/approximation, and clustering schemes. The obtained results show the performance of our approach and demonstrate its adequacy for real traffic simulation scenarios, as can be seen in this video 1.
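One simple way to estimate curvature on a road polyline, given here only as a sketch, is the Menger curvature of each triple of consecutive vertices (the reciprocal of the radius of the circle through them). The abstract above does not say which estimator the authors use, so this choice and the function names are assumptions.

```python
import math


def menger_curvature(a, b, c):
    """Curvature of the circle through three 2D points (0 if collinear)."""
    # Twice the signed triangle area, from the cross product of two edges.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    if ab == 0 or bc == 0 or ca == 0:
        return 0.0  # repeated vertices carry no curvature information
    return 2.0 * abs(cross) / (ab * bc * ca)


def polyline_curvature(points):
    """Curvature estimate at every interior vertex of a polyline."""
    return [menger_curvature(points[i - 1], points[i], points[i + 1])
            for i in range(1, len(points) - 1)]
```

Three vertices sampled from a circle of radius 2 give a curvature of 0.5, and a straight segment gives 0, so the values can be used to separate straight sections from bends before interpolation.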
Abstract: This paper presents a new approach to accurately track a moving vehicle with a multiview setup of red-green-blue depth (RGBD) cameras. We first propose a correction method to eliminate a shift that occurs in depth sensors when they become worn. This issue cannot be corrected with the ordinary calibration procedure. Next, we present a sensor-wise filtering system to correct for an unknown vehicle motion. A data fusion algorithm is then used to optimally merge the sensor-wise estimated trajectories. We implement most parts of our solution on the graphics processor. Hence, the whole system is able to operate at up to 25 frames per second with a configuration of five cameras. Test results show the accuracy we achieved and the robustness of our solution to uncertainties in the measurements and the modelling.
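The tracking abstract above does not name its fusion algorithm. As a hedged illustration of what merging sensor-wise estimates can look like, the sketch below uses inverse-variance weighting, which gives the minimum-variance linear combination when the per-sensor errors are independent and unbiased. It handles one scalar coordinate at one time step; the function name and the sample numbers are hypothetical.

```python
def fuse_estimates(estimates):
    """Fuse (value, variance) pairs by inverse-variance weighting."""
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    fused_variance = 1.0 / total  # never larger than the smallest input variance
    return fused_value, fused_variance


# Two cameras report the same coordinate with different confidence.
value, variance = fuse_estimates([(10.0, 1.0), (14.0, 3.0)])
```

The more certain camera dominates the result, and the fused variance is smaller than either input variance.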
Abstract: This paper presents a novel approach for background/foreground segmentation of RGBD data with Gaussian Mixture Models (GMM). We first perform background subtraction on the colour and depth images separately. The foregrounds resulting from both streams are then fused for a more accurate detection. Our segmentation solution is implemented on the GPU and thus works at the full frame rate of the sensor (30 fps). Test results show its robustness against illumination changes, shadows and reflections.
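The per-stream subtraction and fusion described in the segmentation abstract above can be sketched for a single pixel as follows. This is a simplification under stated assumptions: it keeps one running Gaussian per pixel rather than a full mixture, the learning rate, initial variance, and threshold are hypothetical values, and the logical OR used for fusion is an assumed rule that the abstract does not specify.

```python
class PixelBackground:
    """Running single-Gaussian background model for one pixel channel."""

    def __init__(self, first_value, variance=25.0, rate=0.05, threshold=2.5):
        self.mean = float(first_value)
        self.variance = variance
        self.rate = rate            # learning rate (hypothetical value)
        self.threshold = threshold  # match threshold in standard deviations

    def is_foreground(self, value):
        diff = value - self.mean
        foreground = diff * diff > (self.threshold ** 2) * self.variance
        if not foreground:
            # Only observations that match the background update the model.
            self.mean += self.rate * diff
            self.variance += self.rate * (diff * diff - self.variance)
        return foreground


def fuse(colour_fg, depth_fg):
    """Mark a pixel as foreground if either stream flags it (assumed rule)."""
    return colour_fg or depth_fg


colour_model = PixelBackground(100)                  # grey level
depth_model = PixelBackground(2000, variance=100.0)  # depth in millimetres
is_fg = fuse(colour_model.is_foreground(101), depth_model.is_foreground(1500))
```

Here the colour value barely changes, but the depth reading jumps by half a metre, so the fused decision still reports foreground. This is the situation in which a second stream helps, for example when an object has the same colour as the background.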
Abstract: This work presents a new recursive robust filtering approach for feature-based 3D registration. Unlike the common state-of-the-art alignment algorithms, the proposed method has four advantages that have not yet appeared together in any previous solution. Specifically, it is able to deal with inherent noise contaminating sensory data; it is robust to uncertainties caused by noisy feature localisation; and it combines the advantages of both (Formula presented.) and (Formula presented.) norms for higher performance and better prevention of local minima. The result is an accurate and stable rigid body transformation. The latter enables thorough control over the convergence of the alignment as well as a correct assessment of the quality of registration. The mathematical rationale behind the proposed approach is explained, and the results are validated on physical and synthetic data.
Abstract: In this work, we propose a new head-tracking solution for human-machine real-time interaction with virtual 3D environments. This solution leverages RGBD data to compute the virtual camera pose according to the movements of the user's head. The process starts with the extraction of a set of facial features from the images delivered by the sensor. These features are matched against their respective counterparts in a reference image to compute the current head pose. Afterwards, a prediction approach is used to estimate the most likely next head move (final pose). Pythagorean Hodograph interpolation is then adapted to determine the path and local frames between the two poses. The result is a smooth head trajectory that serves as an input to set the camera in virtual scenes according to the user's gaze. The resulting motion model has several advantages: it is continuous in time, so it adapts to any rendering frame rate; it is ergonomic, as it frees the user from wearing tracking markers; it is smooth and free from rendering jerks; and it minimizes torsion and curvature, as it produces a path with minimum bending energy.