Abstract:The increasing demand for accurate representations of 3D scenes, combined with immersive technologies, has made point clouds extremely popular. However, high-quality point clouds require a large amount of data, which makes compression methods imperative. In this paper, we present a novel, geometry-based, end-to-end compression scheme that combines information on the geometrical features of the point cloud and the user's position, achieving remarkable results for aggressive compression schemes demanding very small bit rates. After separating visible and non-visible points, four saliency maps are calculated, utilizing the point cloud's geometry and distance from the user, the visibility information, and the user's focus point. A combination of these maps results in a final saliency map that indicates the overall significance of each point, so that different regions are quantized with a different number of bits during the encoding process. The decoder reconstructs the point cloud by making use of delta coordinates and solving a sparse linear system. Evaluation studies and comparisons with the geometry-based point cloud compression (G-PCC) algorithm by the Moving Picture Experts Group (MPEG), carried out for a variety of point clouds, demonstrate that the proposed method achieves significantly better results for small bit rates.
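The saliency-driven bit allocation described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the equal-weight averaging of the four maps, the linear saliency-to-bits mapping, the 4 to 12 bit range, and the helper names `combine_saliency`, `bits_for` and `quantize` are all assumptions made for illustration, and a single scalar stands in for the delta coordinates that the actual scheme encodes.

```python
def combine_saliency(maps, weights):
    """Weighted average of per-point saliency maps, each with values in [0, 1]."""
    total = sum(weights)
    n = len(maps[0])
    return [sum(w * m[i] for m, w in zip(maps, weights)) / total for i in range(n)]


def bits_for(saliency, min_bits=4, max_bits=12):
    """Map a saliency value in [0, 1] linearly to an integer bit depth (assumed rule)."""
    return min_bits + round(saliency * (max_bits - min_bits))


def quantize(value, bits, lo=0.0, hi=1.0):
    """Uniformly quantize value in [lo, hi]; return (integer index, dequantized value)."""
    levels = (1 << bits) - 1
    index = round((value - lo) / (hi - lo) * levels)
    return index, lo + index / levels * (hi - lo)


# Hypothetical saliency values for three points: salient and visible,
# moderately salient, and hidden from the user.
geometry = [0.9, 0.2, 0.1]
distance = [0.8, 0.5, 0.1]
visibility = [1.0, 1.0, 0.0]
focus = [1.0, 0.3, 0.0]

saliency = combine_saliency([geometry, distance, visibility, focus], [1, 1, 1, 1])
bits = [bits_for(s) for s in saliency]

# The same coordinate value is stored more coarsely where saliency is low.
value = 0.3141
decoded = [quantize(value, b)[1] for b in bits]
```

Points with higher combined saliency receive more bits, so their worst-case quantization error, half a quantization step, is smaller than that of occluded or peripheral points.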
Abstract:In collaborative tasks where humans work alongside machines, the robot's movements and behaviour can have a significant impact on the operator's safety, health, and comfort. To address this issue, we present a multi-stereo camera system that continuously monitors the operator's posture while they work with the robot. This system uses a novel distributed fusion approach to assess the operator's posture in real-time and to help avoid uncomfortable or unsafe positions. The system adjusts the robot's movements and informs the operator of any incorrect or potentially harmful postures, reducing the risk of accidents, strain, and musculoskeletal disorders. The analysis is personalized, taking into account the unique anthropometric characteristics of each operator, to ensure optimal ergonomics. The results of our experiments show that the proposed approach leads to improved human body postures and offers a promising solution for enhancing the ergonomics of operators in collaborative tasks.
Abstract:Autonomous vehicles are expected to operate safely in real-life road conditions in the coming years. Nevertheless, unanticipated events, such as the presence of unexpected objects in the vicinity of the road, can put safety at risk. The advancement of sensing and communication technologies and the Internet of Things may facilitate the recognition of hazardous situations and information exchange in a cooperative driving scheme, providing new opportunities for increasing collaborative situational awareness. Safe and unobtrusive visualization of the obtained information can now be enabled through the adoption of novel Augmented Reality (AR) interfaces in the form of windshields. Motivated by these technological opportunities, we propose in this work a saliency-based, distributed, cooperative obstacle detection and rendering scheme for increasing the driver's situational awareness through (i) automated obstacle detection, (ii) AR visualization and (iii) sharing of information about upcoming potential dangers with other connected vehicles or road infrastructure. An extensive evaluation study using a variety of real datasets for pothole detection showed that the proposed method provides favorable results and features compared to other recent and relevant approaches.
Abstract:Asthma is a common, usually long-term respiratory disease with a negative impact on society and the economy worldwide. Treatment involves using medical devices (inhalers) that distribute medication to the airways, and its efficiency depends on the precision of the inhalation technique. Health monitoring systems equipped with sensors and embedded with sound signal detection enable the recognition of drug actuation and could be powerful tools for reliable audio content analysis. This paper revisits audio pattern recognition and machine learning techniques for asthma medication adherence assessment and presents the Respiratory and Drug Actuation (RDA) Suite (https://gitlab.com/vvr/monitoring-medication-adherence/rda-benchmark) for benchmarking and further research. The RDA Suite includes a set of tools for audio processing, feature extraction and classification and is provided along with a dataset consisting of respiratory and drug actuation sounds. The classification models in RDA are implemented based on conventional and advanced machine learning and deep network architectures. This study provides a comparative evaluation of the implemented approaches, examines potential improvements and discusses challenges and future trends.
Abstract:Recent advances in 3D scanning technology have enabled the deployment of 3D models in various industrial applications such as digital twins, remote inspection and reverse engineering. Despite their evolving performance, 3D scanners still introduce noise and artifacts in the acquired dense models. In this work, we propose a fast and robust denoising method for dense 3D scanned industrial models. The proposed approach employs conditional variational autoencoders to effectively filter face normals. Training and inference are performed in a sliding patch setup, reducing the size of the required training data and the execution time. We conducted extensive evaluation studies using 3D scanned and CAD models. The results verify plausible denoising outcomes, demonstrating similar or higher reconstruction accuracy compared to other state-of-the-art approaches. Specifically, for 3D models with more than 1e4 faces, the presented pipeline is twice as fast as methods with equivalent reconstruction error.
Abstract:3D representations of highly deformable 3D models, such as dynamic 3D meshes, have recently become very popular due to their wide applicability in various domains. This trend inevitably leads to a demand for storage and transmission of voluminous data sets, making a robust and reliable compression scheme a necessity. In this work, we present an approach for dynamic 3D mesh compression that effectively exploits the spatio-temporal coherence of animated sequences, achieving low compression ratios without noticeably affecting the visual quality of the animation. We show that, in contrast to mainstream approaches that exploit either spatial redundancies (e.g., spectral coding) or temporal redundancies (e.g., PCA-based methods), the proposed scheme achieves increased efficiency by projecting the differential coordinates sequence to the subspace of the covariance of the point trajectories. An extensive evaluation study, using different dynamic 3D models, highlights the benefits of the proposed approach in terms of both execution time and reconstruction quality, providing extremely low bit-per-vertex per-frame (bpvf) rates.
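The projection step named in this abstract can be sketched as follows, under stated assumptions. The abstract does not specify the matrix layout, so this sketch assumes a (3F, N) matrix whose columns are vertex trajectories, a ring-connected toy mesh, random positions as stand-in animation data, and an eigendecomposition of the centered trajectory covariance. The names `ring_laplacian` and `trajectory_basis` are hypothetical. Quantization and entropy coding of the coefficients, which a real codec would need, are omitted.

```python
import numpy as np


def ring_laplacian(n):
    """Graph Laplacian L = I - A/2 for n vertices connected in a ring (toy mesh)."""
    L = np.eye(n)
    for i in range(n):
        L[i, (i - 1) % n] -= 0.5
        L[i, (i + 1) % n] -= 0.5
    return L


def trajectory_basis(P, k):
    """Top-k eigenvectors of the covariance of the point trajectories.

    P has shape (3F, N): column j stacks the x, y, z positions of vertex j
    over all F frames, i.e. it is that vertex's trajectory.
    """
    centered = P - P.mean(axis=1, keepdims=True)
    cov = centered @ centered.T        # (3F, 3F), symmetric
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :k]     # keep the k dominant directions


rng = np.random.default_rng(0)
F, N = 4, 8                            # frames and vertices of the toy animation
P = rng.normal(size=(3 * F, N))        # stand-in vertex positions
L = ring_laplacian(N)
D = P @ L.T                            # differential coordinates, one column per vertex

errors = []
for k in (2, 6, 12):
    U = trajectory_basis(P, k)         # (3F, k) orthonormal basis
    coeffs = U.T @ D                   # (k, N) coefficients to be encoded
    D_hat = U @ coeffs                 # decoder-side approximation
    errors.append(np.linalg.norm(D - D_hat))
```

Each vertex then needs only k coefficients instead of 3F values, and the reconstruction error shrinks as k grows. Vertex positions would afterwards be recovered from the approximated differential coordinates by solving a linear system, which this sketch leaves out.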
Abstract:Geometry processing of 3D objects is of primary interest in many areas of computer vision and graphics, including robot navigation, 3D object recognition, classification, feature extraction, etc. The recent introduction of cheap range sensors has created great interest in many new areas, driving the need for efficient algorithms for 3D object processing. Previously, capturing a 3D object required expensive specialized sensors, such as lasers or dedicated range imaging devices, but this limitation no longer applies. Current approaches to 3D object processing require a significant amount of manual intervention and are still time-consuming, making them unsuitable for real-time applications. The aim of this thesis is to present algorithms, mainly inspired by spectral analysis and subspace tracking, among others, that can facilitate many areas of low-level 3D geometry processing (i.e., reconstruction, outlier removal, denoising, compression), pattern recognition tasks (i.e., significant feature extraction) and high-level applications (i.e., registration and identification of 3D objects in partially scanned and cluttered scenes), taking into consideration different types of 3D models (i.e., static and dynamic point clouds, static and dynamic 3D meshes).