Abstract: Event cameras operate fundamentally differently from traditional Active Pixel Sensor (APS) cameras, offering significant advantages. Recent research has developed simulators that convert video frames into events, addressing the shortage of real event datasets. Current simulators focus primarily on the logical behavior of event cameras, and the fundamental analogue properties of pixel circuits are seldom considered in simulator design. The gap between the analogue pixel circuit and discrete video frames degrades the synthetic events, particularly in high-contrast scenes. In this paper, we propose a novel method for generating reliable event data based on a detailed analysis of the pixel circuitry in event cameras. We incorporate two analogue properties of event camera pixel circuits into the simulator design: (1) analogue filtering of signals from light intensity to events, and (2) a cutoff frequency that is independent of the video frame rate. Experimental results on two relevant tasks, semantic segmentation and image reconstruction, validate the reliability of the simulated event data, even in high-contrast scenes. Deep neural networks trained on the simulated data generalize well to real event data, confirming that the synthetic events generated by the proposed method are both realistic and well suited for effective training.
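The two analogue properties listed above can be illustrated with a minimal single-pixel sketch. It is not the simulator proposed in the paper. It assumes a first-order low-pass filter acting on log intensity, linear interpolation between frames, and a fixed contrast threshold. The function name `simulate_events` and all parameter values are hypothetical.

```python
import math


def simulate_events(intensities, frame_dt, cutoff_hz=200.0,
                    threshold=0.2, substeps=10):
    """Return (time, polarity) events for one pixel.

    Assumptions (not taken from the abstract): first-order low-pass
    filter on log intensity, linear interpolation between frames,
    and a fixed contrast threshold.
    """
    eps = 1e-6                       # avoids log(0) for black pixels
    dt = frame_dt / substeps         # simulation step inside one frame
    # Exact discretisation of a first-order low-pass filter. The cutoff
    # is given in Hz, so it does not change with the video frame rate.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz * dt)

    log_prev = math.log(intensities[0] + eps)
    filtered = log_prev              # filter state
    reference = log_prev             # level at the last emitted event
    events = []
    t = 0.0
    for frame in intensities[1:]:
        log_next = math.log(frame + eps)
        for k in range(1, substeps + 1):
            # Interpolated log intensity at this sub-step.
            target = log_prev + (log_next - log_prev) * k / substeps
            filtered += alpha * (target - filtered)
            t += dt
            # Emit one event per threshold crossing.
            while filtered - reference >= threshold:
                reference += threshold
                events.append((t, 1))
            while reference - filtered >= threshold:
                reference -= threshold
                events.append((t, -1))
        log_prev = log_next
    return events


if __name__ == "__main__":
    # A high-contrast brightening step between two frames at 100 fps.
    for time_s, polarity in simulate_events([1.0, 10.0], frame_dt=0.01):
        print(f"{time_s:.4f} s  polarity {polarity:+d}")
```

Lowering `cutoff_hz` in this sketch delays and reduces the events for the same pair of frames, which is the kind of analogue effect a purely logical simulator does not capture.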
Abstract: Diffusion Posterior Sampling (DPS) is a novel framework that permits nonlinear CT reconstruction by integrating a diffusion prior with an analytic physical system model, allowing one-time training for different applications. However, baseline DPS can suffer from large variability, hallucinations, and slow reconstruction. This work introduces several strategies designed to enhance the stability and efficiency of DPS CT reconstruction. Specifically, jumpstart sampling allows one to skip many reverse time steps, significantly reducing both the reconstruction time and the sampling variability. Additionally, the likelihood update is modified to simplify the Jacobian computation and improve data consistency more efficiently. Finally, a hyperparameter sweep is conducted to investigate the effects of parameter tuning and to optimize the overall reconstruction performance. Simulation studies demonstrate that the proposed DPS technique achieves up to 46.72% PSNR and 51.50% SSIM enhancement in a low-mAs setting, and a variability reduction of over 31.43% in a sparse-view setting. Moreover, reconstruction time is reduced from >23.5 s/slice to <1.5 s/slice. In a physical data study, the proposed DPS is robust when reconstructing an anthropomorphic phantom that does not strictly follow the prior distribution. Quantitative analysis demonstrates that the proposed DPS can accommodate various dose levels and numbers of views. With 10% dose, reductions of only 5.60% in PSNR and 4.84% in SSIM were observed for the proposed approach. Both simulation and phantom studies demonstrate that the proposed method can significantly improve reconstruction accuracy and reduce computational costs, greatly enhancing the practicality of DPS CT reconstruction.
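The abstract does not say how jumpstart sampling skips reverse time steps. One plausible reading, offered here only as an assumption, is that an initial reconstruction is forward-diffused to an intermediate step using the standard DDPM forward process, and the reverse process then starts from that step. The sketch below illustrates that idea only. The linear beta schedule and the names `linear_beta_schedule`, `cumulative_alphas`, `jumpstart_state`, and `reverse_schedule` are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed, commonly used choice)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta), usually written alpha-bar."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def jumpstart_state(initial_image, alpha_bars, start_step, rng):
    """Forward-diffuse an initial reconstruction to `start_step`.

    Standard DDPM forward process, applied per pixel:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise.
    """
    a = alpha_bars[start_step]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in initial_image]


def reverse_schedule(start_step):
    """Reverse steps that still have to be run, from start_step to 0."""
    return list(range(start_step, -1, -1))


if __name__ == "__main__":
    total_steps, start_step = 1000, 100
    alpha_bars = cumulative_alphas(linear_beta_schedule(total_steps))
    # Hypothetical flattened initial reconstruction (e.g. a fast analytic one).
    initial = [0.2, 0.5, 0.9]
    x_start = jumpstart_state(initial, alpha_bars, start_step, random.Random(0))
    steps = reverse_schedule(start_step)
    print(f"running {len(steps)} of {total_steps} reverse steps")
```

Under this assumption, the number of reverse steps is set by `start_step` rather than by the full schedule length, which is consistent with the reported reduction in reconstruction time.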
Abstract: Dynamic Vision Sensors (DVS) have recently generated great interest because of their wide dynamic range and low latency compared with conventional frame-based cameras. However, their complicated behavior in dim light is still not well understood, which restricts the applications of DVS. In this paper, we analyze the typical DVS circuit and find that the event triggering time is discontinuous. In dim light conditions, the discontinuity becomes prominent. We show that the discontinuity depends exclusively on the rate of change of light intensity. Experimental results on real event data validate the analysis and confirm the existence of the discontinuity, which reveals the non-first-order behavior of DVS in dim light conditions.
Abstract: In this work, we introduce a new deep learning approach based on diffusion posterior sampling (DPS) to perform material decomposition from spectral CT measurements. This approach combines sophisticated prior knowledge from unsupervised training with a rigorous physical model of the measurements. A faster and more stable variant is proposed that uses a jumpstarted process to reduce the number of time steps required in the reverse process and a gradient approximation to reduce the computational cost. Performance is investigated for two spectral CT systems: dual-kVp and dual-layer detector CT. On both systems, DPS achieves a high Structural Similarity Index Measure (SSIM) with only 10% of the iterations used in model-based material decomposition (MBMD). Jumpstarted DPS (JSDPS) further reduces computational time by over 85% and achieves the highest accuracy, the lowest uncertainty, and the lowest computational cost compared with classic DPS and MBMD. The results demonstrate the potential of JSDPS for providing relatively fast and accurate material decomposition based on spectral CT data.
Abstract: As a critical cue for understanding human intention, human gaze provides a key signal for Human-Computer Interaction (HCI) applications. Appearance-based gaze estimation, which directly regresses the gaze vector from eye images, has recently made great progress thanks to Convolutional Neural Network (ConvNet) architectures and open-source large-scale gaze datasets. However, how to encode model-based knowledge into a CNN model to further improve gaze estimation performance remains to be explored. In this paper, we propose HybridGazeNet (HGN), a unified framework that explicitly encodes the geometric eyeball model into the appearance-based CNN architecture. Composed of a multi-branch network and an uncertainty module, HybridGazeNet is trained using a hybridized strategy. Experiments on multiple challenging gaze datasets show that HybridGazeNet achieves better accuracy and generalization ability than existing SOTA methods. The code will be released later.
Abstract: Change detection in heterogeneous remote sensing images is crucial for disaster damage assessment. Recent methods use homogeneous transformation, which transforms the heterogeneous optical and SAR remote sensing images into the same feature space, to achieve change detection. Such transformations mainly operate on the low-level feature space and may corrupt the semantic content, degrading change detection performance. To solve this problem, this paper presents a new homogeneous transformation model termed deep homogeneous feature fusion (DHFF), based on image style transfer (IST). Unlike existing methods, the DHFF method separates the semantic content from the style features in the heterogeneous images to perform homogeneous transformation. This separation prevents the corruption of image semantic content, especially in the regions of change. In this way, detection performance is improved through accurate homogeneous transformation. Furthermore, we present a new iterative IST (IIST) strategy, in which the cost function in each IST iteration measures and thus maximizes the feature homogeneity in additional new feature subspaces for change detection. Change detection is then performed accurately on the original and the transformed images, which lie in the same feature space. Real remote sensing images acquired by SAR and optical satellites are used to evaluate the performance of the proposed method. The experiments demonstrate that the proposed DHFF method achieves significant improvement for change detection in heterogeneous optical and SAR remote sensing images, in terms of both accuracy rate and Kappa index.
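The two reported metrics, accuracy rate and Kappa index, can be computed from a binary change map and a reference map. The sketch below is a minimal version that assumes the Kappa index means Cohen's kappa and that both maps are flattened lists of 0 (unchanged) and 1 (changed). The function name `accuracy_and_kappa` is hypothetical and the example data is made up for illustration.

```python
def accuracy_and_kappa(predicted, reference):
    """Return (accuracy, Cohen's kappa) for two binary change maps."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("maps must be non-empty and of equal length")
    n = len(predicted)
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # change found
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # no change, correct
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # false alarm
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # missed change

    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both maps contain a single identical
        # class; this sketch returns 0.0 in that case.
        return observed, 0.0
    return observed, (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]   # illustrative data only
    reference = [1, 0, 0, 0, 1, 0, 1, 0]
    accuracy, kappa = accuracy_and_kappa(predicted, reference)
    print(f"accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
```

Kappa is reported alongside accuracy because changed pixels are usually a small minority, so a map that predicts no change anywhere can still score a high accuracy while its kappa stays near zero.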