Abstract:The global demand for unconventional energy sources such as geothermal energy and white hydrogen requires new exploration techniques for precise subsurface structure characterization and potential reservoir identification. Magnetotelluric (MT) inversion is crucial for these tasks, providing critical information on the distribution of subsurface electrical resistivity at depths ranging from hundreds to thousands of meters. However, traditional iterative algorithm-based inversion methods require the adjustment of multiple parameters, demanding time-consuming and exhaustive tuning processes to achieve proper cost function minimization. Although recent advances have incorporated deep learning algorithms for MT inversion, these have been primarily based on supervised learning, which needs large labeled datasets for training. Therefore, it causes issues in generalization and model characteristics that are restricted to the neural network's features. This work utilizes TensorFlow operations to create a differentiable forward MT operator, leveraging its automatic differentiation capability. Moreover, instead of solving for the subsurface model directly, as classical algorithms perform, this paper presents a new deep unsupervised inversion algorithm guided by physics to estimate 1D MT models. Instead of using datasets with the observed data and their respective model as labels during training, our method employs a differentiable modeling operator that physically guides the cost function minimization, making the proposed method solely dependent on observed data. Therefore, the optimization problem is updating the network weights to minimize the data misfit. We test the proposed method with field and synthetic data at different acquisition frequencies, demonstrating that the resistivity models are more accurate than other results using state-of-the-art techniques.
Abstract:Chronic wounds pose an ongoing health concern globally, largely due to the prevalence of conditions such as diabetes and leprosy's disease. The standard method of monitoring these wounds involves visual inspection by healthcare professionals, a practice that could present challenges for patients in remote areas with inadequate transportation and healthcare infrastructure. This has led to the development of algorithms designed for the analysis and follow-up of wound images, which perform image-processing tasks such as classification, detection, and segmentation. However, the effectiveness of these algorithms heavily depends on the availability of comprehensive and varied wound image data, which is usually scarce. This paper introduces the CO2Wounds-V2 dataset, an extended collection of RGB wound images from leprosy patients with their corresponding semantic segmentation annotations, aiming to enhance the development and testing of image-processing algorithms in the medical field.
Abstract:Computational optical imaging (COI) systems have enabled the acquisition of high-dimensional signals through optical coding elements (OCEs). OCEs encode the high-dimensional signal in one or more snapshots, which are subsequently decoded using computational algorithms. Currently, COI systems are optimized through an end-to-end (E2E) approach, where the OCEs are modeled as a layer of a neural network and the remaining layers perform a specific imaging task. However, the performance of COI systems optimized through E2E is limited by the physical constraints imposed by these systems. This paper proposes a knowledge distillation (KD) framework for the design of highly physically constrained COI systems. This approach employs the KD methodology, which consists of a teacher-student relationship, where a high-performance, unconstrained COI system (the teacher), guides the optimization of a physically constrained system (the student) characterized by a limited number of snapshots. We validate the proposed approach, using a binary coded apertures single pixel camera for monochromatic and multispectral image reconstruction. Simulation results demonstrate the superiority of the KD scheme over traditional E2E optimization for the designing of highly physically constrained COI systems.
Abstract:Deep-learning (DL)-based image deconvolution (ID) has exhibited remarkable recovery performance, surpassing traditional linear methods. However, unlike traditional ID approaches that rely on analytical properties of the point spread function (PSF) to achieve high recovery performance - such as specific spectrum properties or small conditional numbers in the convolution matrix - DL techniques lack quantifiable metrics for evaluating PSF suitability for DL-assisted recovery. Aiming to enhance deconvolution quality, we propose a metric that employs a non-linear approach to learn the invertibility of an arbitrary PSF using a neural network by mapping it to a unit impulse. A lower discrepancy between the mapped PSF and a unit impulse indicates a higher likelihood of successful inversion by a DL network. Our findings reveal that this metric correlates with high recovery performance in DL and traditional methods, thereby serving as an effective regularizer in deconvolution tasks. This approach reduces the computational complexity over conventional condition number assessments and is a differentiable process. These useful properties allow its application in designing diffractive optical elements through end-to-end (E2E) optimization, achieving invertible PSFs, and outperforming the E2E baseline framework.
Abstract:Binary Neural Networks emerged as a cost-effective and energy-efficient solution for computer vision tasks by binarizing either network weights or activations. However, common binary activations, such as the Sign activation function, abruptly binarize the values with a single threshold, losing fine-grained details in the feature outputs. This work proposes an activation that applies multiple thresholds following dithering principles, shifting the Sign activation function for each pixel according to a spatially periodic threshold kernel. Unlike literature methods, the shifting is defined jointly for a set of adjacent pixels, taking advantage of spatial correlations. Experiments over the classification task demonstrate the effectiveness of the designed dithering Sign activation function as an alternative activation for binary neural networks, without increasing the computational cost. Further, DeSign balances the preservation of details with the efficiency of binary operations.
Abstract:In the era of cloud computing and data-driven applications, it is crucial to protect sensitive information to maintain data privacy, ensuring truly reliable systems. As a result, preserving privacy in deep learning systems has become a critical concern. Existing methods for privacy preservation rely on image encryption or perceptual transformation approaches. However, they often suffer from reduced task performance and high computational costs. To address these challenges, we propose a novel Privacy-Preserving framework that uses a set of deformable operators for secure task learning. Our method involves shuffling pixels during the analog-to-digital conversion process to generate visually protected data. Those are then fed into a well-known network enhanced with deformable operators. Using our approach, users can achieve equivalent performance to original images without additional training using a secret key. Moreover, our method enables access control against unauthorized users. Experimental results demonstrate the efficacy of our approach, showcasing its potential in cloud-based scenarios and privacy-sensitive applications.
Abstract:Quantized neural networks employ reduced precision representations for both weights and activations. This quantization process significantly reduces the memory requirements and computational complexity of the network. Binary Neural Networks (BNNs) are the extreme quantization case, representing values with just one bit. Since the sign function is typically used to map real values to binary values, smooth approximations are introduced to mimic the gradients during error backpropagation. Thus, the mismatch between the forward and backward models corrupts the direction of the gradient, causing training inconsistency problems and performance degradation. In contrast to current BNN approaches, we propose to employ a binary periodic (BiPer) function during binarization. Specifically, we use a square wave for the forward pass to obtain the binary values and employ the trigonometric sine function with the same period of the square wave as a differentiable surrogate during the backward pass. We demonstrate that this approach can control the quantization error by using the frequency of the periodic function and improves network performance. Extensive experiments validate the effectiveness of BiPer in benchmark datasets and network architectures, with improvements of up to 1% and 0.69% with respect to state-of-the-art methods in the classification task over CIFAR-10 and ImageNet, respectively. Our code is publicly available at https://github.com/edmav4/BiPer.
Abstract:The modern surge in camera usage alongside widespread computer vision technology applications poses significant privacy and security concerns. Current artificial intelligence (AI) technologies aid in recognizing relevant events and assisting in daily tasks in homes, offices, hospitals, etc. The need to access or process personal information for these purposes raises privacy concerns. While software-level solutions like face de-identification provide a good privacy/utility trade-off, they present vulnerabilities to sniffing attacks. In this paper, we propose a hardware-level face de-identification method to solve this vulnerability. Specifically, our approach first learns an optical encoder along with a regression model to obtain a face heatmap while hiding the face identity from the source image. We also propose an anonymization framework that generates a new face using the privacy-preserving image, face heatmap, and a reference face image from a public dataset as input. We validate our approach with extensive simulations and hardware experiments.
Abstract:Seismic data interpolation plays a crucial role in subsurface imaging, enabling accurate analysis and interpretation throughout the seismic processing workflow. Despite the widespread exploration of deep supervised learning methods for seismic data reconstruction, several challenges still remain open. Particularly, the requirement of extensive training data and poor domain generalization due to the seismic survey's variability poses significant issues. To overcome these limitations, this paper introduces a deep-learning-based seismic data reconstruction approach that leverages data redundancy. This method involves a two-stage training process. First, an adversarial generative network (GAN) is trained using synthetic seismic data, enabling the extraction and learning of their primary and local seismic characteristics. Second, a reconstruction network is trained with synthetic data generated by the GAN, which dynamically adjusts the noise and distortion level at each epoch to promote feature diversity. This approach enhances the generalization capabilities of the reconstruction network by allowing control over the generation of seismic patterns from the latent space of the GAN, thereby reducing the dependency on large seismic databases. Experimental results on field and synthetic seismic datasets both pre-stack and post-stack show that the proposed method outperforms the baseline supervised learning and unsupervised approaches such as deep seismic prior and internal learning, by up to 8 dB of PSNR.
Abstract:Integrated sensing and communications (ISAC) systems have gained significant interest because of their ability to jointly and efficiently access, utilize, and manage the scarce electromagnetic spectrum. The co-existence approach toward ISAC focuses on the receiver processing of overlaid radar and communications signals coming from independent transmitters. A specific ISAC coexistence problem is dual-blind deconvolution (DBD), wherein the transmit signals and channels of both radar and communications are unknown to the receiver. Prior DBD works ignore the evolution of the signal model over time. In this work, we consider a dynamic DBD scenario using a linear state space model (LSSM) such that, apart from the transmit signals and channels of both systems, the LSSM parameters are also unknown. We employ a factor graph representation to model these unknown variables. We avoid the conventional matrix inversion approach to estimate the unknown variables by using an efficient expectation-maximization algorithm, where each iteration employs a Gaussian message passing over the factor graph structure. Numerical experiments demonstrate the accurate estimation of radar and communications channels, including in the presence of noise.