Abstract:Traditional CMOS sensors suffer from restricted dynamic range and sub optimal performance under extreme lighting conditions. They are affected by electronic noise in low light conditions and pixel saturation while capturing high illumination. Recent High Dynamic Range (HDR) Imaging methods, often designed for CMOS Sensors, attempt to address these issues by fusing multiple exposures. However, they frequently introduce artifacts like ghosting and light flickering in dynamic scenarios and non-uniform signal-to-noise ratio in extreme dynamic range conditions. Recently, Single Photon Avalanche Diodes (SPADs), also known as Single Photon Camera (SPC) sensors, have surpassed CMOS sensors due to their capability to capture individual photons with high timing precision. Unlike traditional digital cameras that first convert light energy into analog electrical currents and then digitize them, SPAD sensors perform direct photon detection, making them less susceptible to extreme illumination conditions. Their distinctive non-linear response curve aids in capturing photons across both low-light and high-illumination environments, making them particularly effective for High Dynamic Range Imaging. Despite their advantages, images from SPAD Sensors are often noisy and visually unappealing. To address these challenges, we evaluate state-of-the-art architectures for converting monochromatic SPAD images into Color HDR images at various resolutions. Our evaluation involves both qualitative and quantitative assessments of these architectures, focusing on their effectiveness in each stage of the conversion process.
Abstract:Flare, an optical phenomenon resulting from unwanted scattering and reflections within a lens system, presents a significant challenge in imaging. The diverse patterns of flares, such as halos, streaks, color bleeding, and haze, complicate the flare removal process. Existing traditional and learning-based methods have exhibited limited efficacy due to their reliance on single-image approaches, where flare removal is highly ill-posed. We address this by framing flare removal as a multi-view image problem, taking advantage of the view-dependent nature of flare artifacts. This approach leverages information from neighboring views to recover details obscured by flare in individual images. Our proposed framework, GN-FR (Generalizable Neural Radiance Fields for Flare Removal), can render flare-free views from a sparse set of input images affected by lens flare and generalizes across different scenes in an unsupervised manner. GN-FR incorporates several modules within the Generalizable NeRF Transformer (GNT) framework: Flare-occupancy Mask Generation (FMG), View Sampler (VS), and Point Sampler (PS). To overcome the impracticality of capturing both flare-corrupted and flare-free data, we introduce a masking loss function that utilizes mask information in an unsupervised setting. Additionally, we present a 3D multi-view flare dataset, comprising 17 real flare scenes with 782 images, 80 real flare patterns, and their corresponding annotated flare-occupancy masks. To our knowledge, this is the first work to address flare removal within a Neural Radiance Fields (NeRF) framework.
Abstract:Soil moisture estimation is an important task to enable precision agriculture in creating optimal plans for irrigation, fertilization, and harvest. It is common to utilize statistical and machine learning models to estimate soil moisture from traditional data sources such as weather forecasts, soil properties, and crop properties. However, there is a growing interest in utilizing aerial and geospatial imagery to estimate soil moisture. Although these images capture high-resolution crop details, they are expensive to curate and challenging to interpret. Imagine, an AI-enhanced software tool that predicts soil moisture using visual cues captured by smartphones and statistical data given by weather forecasts. This work is a first step towards that goal of developing a multi-modal approach for soil moisture estimation. In particular, we curate a dataset consisting of real-world images taken from ground stations and their corresponding weather data. We also propose MIS-ME - Meteorological & Image based Soil Moisture Estimator, a multi-modal framework for soil moisture estimation. Our extensive analysis shows that MIS-ME achieves a MAPE of 10.79%, outperforming traditional unimodal approaches with a reduction of 2.6% in MAPE for meteorological data and 1.5% in MAPE for image data, highlighting the effectiveness of tailored multi-modal approaches.
Abstract:Situation awareness is a crucial cognitive skill that enables individuals to perceive, comprehend, and project the current state of their environment accurately. It involves being conscious of relevant information, understanding its meaning, and using that understanding to make well-informed decisions. Awareness systems often need to integrate new knowledge and adapt to changing environments. Ontology reasoning facilitates knowledge integration and evolution, allowing for seamless updates and expansions of the ontology. With the consideration of above, we are providing a quick review on semantic information retrieval and ontology engineering to understand the emerging challenges and future research. In the review we have found that the ontology reasoning addresses the limitations of traditional systems by providing a formal, flexible, and scalable framework for knowledge representation, reasoning, and inference.
Abstract:Current practice in parameter space exploration in euclidean space is dominated by randomized sampling or design of experiment methods. The biggest issue with these methods is not keeping track of what part of parameter space has been explored and what has not. In this context, we utilize the geometric learning of explored data space using modern machine learning methods to keep track of already explored regions and samples from the regions that are unexplored. For this purpose, we use a modified version of a robust random-cut forest along with other heuristic-based approaches. We demonstrate our method and its progression in two-dimensional Euclidean space but it can be extended to any dimension since the underlying method is generic.
Abstract:In the medical field, landmark detection in MRI plays an important role in reducing medical technician efforts in tasks like scan planning, image registration, etc. First, 88 landmarks spread across the brain anatomy in the three respective views -- sagittal, coronal, and axial are manually annotated, later guidelines from the expert clinical technicians are taken sub-anatomy-wise, for better localization of the existing landmarks, in order to identify and locate the important atlas landmarks even in oblique scans. To overcome limited data availability, we implement realistic data augmentation to generate synthetic 3D volumetric data. We use a modified HighRes3DNet model for solving brain MRI volumetric landmark detection problem. In order to visually explain our trained model on unseen data, and discern a stronger model from a weaker model, we implement Gradient-weighted Class Activation Mapping (Grad-CAM) which produces a coarse localization map highlighting the regions the model is focusing. Our experiments show that the proposed method shows favorable results, and the overall pipeline can be extended to a variable number of landmarks and other anatomies.