Abstract:Promptable foundation models, particularly Segment Anything Model (SAM), have emerged as a promising alternative to the traditional task-specific supervised learning for image segmentation. However, many evaluation studies have found that their performance on medical imaging modalities to be underwhelming compared to conventional deep learning methods. In the world of large pre-trained language and vision-language models, learning prompt from downstream tasks has achieved considerable success in improving performance. In this work, we propose a plug-and-play Prompt Optimization Technique for foundation models like SAM (SAMPOT) that utilizes the downstream segmentation task to optimize the human-provided prompt to obtain improved performance. We demonstrate the utility of SAMPOT on lung segmentation in chest X-ray images and obtain an improvement on a significant number of cases ($\sim75\%$) over human-provided initial prompts. We hope this work will lead to further investigations in the nascent field of automatic visual prompt-tuning.
Abstract:The ability to automatically learn task specific feature representations has led to a huge success of deep learning methods. When large training data is scarce, such as in medical imaging problems, transfer learning has been very effective. In this paper, we systematically investigate the process of transferring a Convolutional Neural Network, trained on ImageNet images to perform image classification, to kidney detection problem in ultrasound images. We study how the detection performance depends on the extent of transfer. We show that a transferred and tuned CNN can outperform a state-of-the-art feature engineered pipeline and a hybridization of these two techniques achieves 20\% higher performance. We also investigate how the evolution of intermediate response images from our network. Finally, we compare these responses to state-of-the-art image processing filters in order to gain greater insight into how transfer learning is able to effectively manage widely varying imaging regimes.
Abstract:Typical convolutional neural networks (CNNs) have several millions of parameters and require a large amount of annotated data to train them. In medical applications where training data is hard to come by, these sophisticated machine learning models are difficult to train. In this paper, we propose a method to reduce the inherent complexity of CNNs during training by exploiting the significant redundancy that is noticed in the learnt CNN filters. Our method relies on finding a small set of filters and mixing coefficients to derive every filter in each convolutional layer at the time of training itself, thereby reducing the number of parameters to be trained. We consider the problem of 3D lung nodule segmentation in CT images and demonstrate the effectiveness of our method in achieving good results with only few training examples.
Abstract:Light rays incident on a transparent object of uniform refractive index undergo deflections, which uniquely characterize the surface geometry of the object. Associated with each point on the surface is a deflection map (or spectrum) which describes the pattern of deflections in various directions. This article presents a novel method to efficiently acquire and reconstruct sparse deflection spectra induced by smooth object surfaces. To this end, we leverage the framework of Compressed Sensing (CS) in a particular implementation of a schlieren deflectometer, i.e., an optical system providing linear measurements of deflection spectra with programmable spatial light modulation patterns. We design those modulation patterns on the principle of spread spectrum CS for reducing the number of observations. The ability of our device to simultaneously observe the deflection spectra on a dense discretization of the object surface is related to a Multiple Measurement Vector (MMV) model. This scheme allows us to estimate both the noise power and the instrumental point spread function. We formulate the spectrum reconstruction task as the solving of a linear inverse problem regularized by an analysis sparsity prior using a translation invariant wavelet frame. Our results demonstrate the capability and advantages of using a CS based approach for deflectometric imaging both on simulated data and experimental deflectometric data. Finally, the paper presents an extension of our method showing how we can extract the main deflection direction in each point of the object surface from a few compressive measurements, without needing any costly reconstruction procedures. This compressive characterization is then confirmed with experimental results on simple plano-convex and multifocal intra-ocular lenses studying the evolution of the main deflection as a function of the object point location.
Abstract:Schlieren deflectometry aims at characterizing the deflections undergone by refracted incident light rays at any surface point of a transparent object. For smooth surfaces, each surface location is actually associated with a sparse deflection map (or spectrum). This paper presents a novel method to compressively acquire and reconstruct such spectra. This is achieved by altering the way deflection information is captured in a common Schlieren Deflectometer, i.e., the deflection spectra are indirectly observed by the principle of spread spectrum compressed sensing. These observations are realized optically using a 2-D Spatial Light Modulator (SLM) adjusted to the corresponding sensing basis and whose modulations encode the light deviation subsequently recorded by a CCD camera. The efficiency of this approach is demonstrated experimentally on the observation of few test objects. Further, using a simple parametrization of the deflection spectra we show that relevant key parameters can be directly computed using the measurements, avoiding full reconstruction.