Abstract:Existing neural radiance fields (NeRF)-based novel view synthesis methods for large-scale outdoor scenes are mainly built on a single altitude. Moreover, they often require a priori camera shooting height and scene scope, leading to inefficient and impractical applications when camera altitude changes. In this work, we propose an end-to-end framework, termed AG-NeRF, and seek to reduce the training cost of building good reconstructions by synthesizing free-viewpoint images based on varying altitudes of scenes. Specifically, to tackle the detail variation problem from low altitude (drone-level) to high altitude (satellite-level), a source image selection method and an attention-based feature fusion approach are developed to extract and fuse the most relevant features of target view from multi-height images for high-fidelity rendering. Extensive experiments demonstrate that AG-NeRF achieves SOTA performance on 56 Leonard and Transamerica benchmarks and only requires a half hour of training time to reach the competitive PSNR as compared to the latest BungeeNeRF.
Abstract:Three-dimensional point cloud anomaly detection that aims to detect anomaly data points from a training set serves as the foundation for a variety of applications, including industrial inspection and autonomous driving. However, existing point cloud anomaly detection methods often incorporate multiple feature memory banks to fully preserve local and global representations, which comes at the high cost of computational complexity and mismatches between features. To address that, we propose an unsupervised point cloud anomaly detection framework based on joint local-global features, termed PointCore. To be specific, PointCore only requires a single memory bank to store local (coordinate) and global (PointMAE) representations and different priorities are assigned to these local-global features, thereby reducing the computational cost and mismatching disturbance in inference. Furthermore, to robust against the outliers, a normalization ranking method is introduced to not only adjust values of different scales to a notionally common scale, but also transform densely-distributed data into a uniform distribution. Extensive experiments on Real3D-AD dataset demonstrate that PointCore achieves competitive inference time and the best performance in both detection and localization as compared to the state-of-the-art Reg3D-AD approach and several competitors.
Abstract:Automatic classification of electrocardiogram (ECG) signals plays a crucial role in the early prevention and diagnosis of cardiovascular diseases. While ECG signals can be used for the diagnosis of various diseases, their pathological characteristics exhibit minimal variations, posing a challenge to automatic classification models. Existing methods primarily utilize convolutional neural networks to extract ECG signal features for classification, which may not fully capture the pathological feature differences of different diseases. Transformer networks have advantages in feature extraction for sequence data, but the complete network is complex and relies on large-scale datasets. To address these challenges, we propose a single-layer Transformer network called Multi-Scale Shifted Windows Transformer Networks (MSW-Transformer), which uses a multi-window sliding attention mechanism at different scales to capture features in different dimensions. The self-attention is restricted to non-overlapping local windows via shifted windows, and different window scales have different receptive fields. A learnable feature fusion method is then proposed to integrate features from different windows to further enhance model performance. Furthermore, we visualize the attention mechanism of the multi-window shifted mechanism to achieve better clinical interpretation in the ECG classification task. The proposed model achieves state-of-the-art performance on five classification tasks of the PTBXL-2020 12-lead ECG dataset, which includes 5 diagnostic superclasses, 23 diagnostic subclasses, 12 rhythm classes, 17 morphology classes, and 44 diagnosis classes, with average macro-F1 scores of 77.85%, 47.57%, 66.13%, 34.60%, and 34.29%, and average sample-F1 scores of 81.26%, 68.27%, 91.32%, 50.07%, and 63.19%, respectively.