Abstract:Optical coherence tomography (OCT) is pivotal in corneal imaging for both surgical planning and diagnosis. However, high-speed acquisitions often degrade spatial resolution and increase speckle noise, posing challenges for accurate interpretation. We propose an advanced super-resolution framework leveraging diffusion model plug-and-play (PnP) priors to achieve 4x spatial resolution enhancement alongside effective denoising of OCT Bscan images. Our approach formulates reconstruction as a principled Bayesian inverse problem, combining Markov chain Monte Carlo sampling with pretrained generative priors to enforce anatomical consistency. We comprehensively validate the framework using \emph{in vivo} fisheye corneal datasets, to assess robustness and scalability under diverse clinical settings. Comparative experiments against bicubic interpolation, conventional supervised U-Net baselines, and alternative diffusion priors demonstrate that our method consistently yields more precise anatomical structures, improved delineation of corneal layers, and superior noise suppression. Quantitative results show state-of-the-art performance in peak signal-to-noise ratio, structural similarity index, and perceptual metrics. This work highlights the potential of diffusion-driven plug-and-play reconstruction to deliver high-fidelity, high-resolution OCT imaging, supporting more reliable clinical assessments and enabling advanced image-guided interventions. Our findings suggest the approach can be extended to other biomedical imaging modalities requiring robust super-resolution and denoising.
Abstract:Inverse scattering in optical coherence tomography (OCT) seeks to recover both structural images and intrinsic tissue optical properties, including refractive index, scattering coefficient, and anisotropy. This inverse problem is challenging due to attenuation, speckle noise, and strong coupling among parameters. We propose a regularized end-to-end deep learning framework that jointly reconstructs optical parameter maps and speckle-reduced OCT structural intensity for layer visualization. Trained with Monte Carlo-simulated ground truth, our network incorporates a physics-based OCT forward model that generates predicted signals from the estimated parameters, providing physics-consistent supervision for parameter recovery and artifact suppression. Experiments on the synthetic corneal OCT dataset demonstrate robust optical map recovery under noise, improved resolution, and enhanced structural fidelity. This approach enables quantitative multi-parameter tissue characterization and highlights the benefit of combining physics-informed modeling with deep learning for computational OCT.
Abstract:Robotic deep anterior lamellar keratoplasty (DALK) requires accurate real time depth feedback to approach Descemet's membrane (DM) without perforation. M-mode intraoperative optical coherence tomography (OCT) provides high temporal resolution depth traces, but speckle noise, attenuation, and instrument induced shadowing often result in discontinuous or ambiguous layer interfaces that challenge anatomically consistent segmentation at deployment frame rates. We present a lightweight, topology aware M-mode segmentation pipeline based on UNeXt that incorporates anatomical topology regularization to stabilize boundary continuity and layer ordering under low signal to noise ratio conditions. The proposed system achieves end to end throughput exceeding 80 Hz measured over the complete preprocessing inference overlay pipeline on a single GPU, demonstrating practical real time guidance beyond model only timing. This operating margin provides temporal headroom to reject low quality or dropout frames while maintaining a stable effective depth update rate. Evaluation on a standard rabbit eye M-mode dataset using an established baseline protocol shows improved qualitative boundary stability compared with topology agnostic controls, while preserving deployable real time performance.
Abstract:Training deep learning models for corneal optical coherence tomography (OCT) imaging is limited by the availability of large, well-annotated datasets. We present a configurable Monte Carlo simulation framework that generates synthetic corneal B-scan optical OCT images with pixel-level five-layer segmentation labels derived directly from the simulation geometry. A five-layer corneal model with Gaussian surfaces captures curvature and thickness variability in healthy and keratoconic eyes. Each layer is assigned optical properties from the literature and light transport is simulated using Monte Carlo modeling of light transport in multi-layered tissues (MCML), while incorporating system features such as the confocal PSF and sensitivity roll-off. This approach produces over 10,000 high-resolution (1024x1024) image-label pairs and supports customization of geometry, photon count, noise, and system parameters. The resulting dataset enables systematic training, validation, and benchmarking of AI models under controlled, ground-truth conditions, providing a reproducible and scalable resource to support the development of diagnostic and surgical guidance applications in image-guided ophthalmology.
Abstract:Wide-field high-resolution microscopy requires fast scanning and accurate image mosaicking to cover large fields of view without compromising image quality. However, conventional galvanometric scanning, particularly under sinusoidal driving, can introduce nonuniform spatial sampling, leading to geometric inconsistencies and brightness variations across the scanned field. To address these challenges, we present an image mosaicking framework for wide-field microscopic imaging that is applicable to both linear and sinusoidal galvanometric scanning strategies. The proposed approach combines a translation-based geometric mosaicking model with region-of-interest (ROI) based brightness correction and seam-aware feathering to improve radiometric consistency across large fields of view. The method relies on calibrated scan parameters and synchronized scan--camera control, without requiring image-content-based registration. Using the proposed framework, wide-field mosaicked images were successfully reconstructed under both linear and sinusoidal scanning strategies, achieving a field of view of up to $2.5 \times 2.5~\mathrm{cm}^2$ with a total acquisition time of approximately $6~\mathrm{s}$ per dataset. Quantitative evaluation shows that both scanning strategies demonstrate improved image quality, including enhanced brightness uniformity, increased contrast-to-noise ratio (CNR), and reduced seam-related artifacts after image processing, while preserving a lateral resolution of $7.81~μ\mathrm{m}$. Overall, the presented framework provides a practical and efficient solution for scan-based wide-field microscopic mosaicking.
Abstract:Diffusion models are highly expressive image priors for Bayesian inverse problems. However, most diffusion models cannot operate on large-scale, high-dimensional data due to high training and inference costs. In this work, we introduce a Plug-and-play algorithm for 3D stochastic inference with latent diffusion prior (PSI3D) to address massive ($1024\times 1024\times 128$) volumes. Specifically, we formulate a Markov chain Monte Carlo approach to reconstruct each two-dimensional (2D) slice by sampling from a 2D latent diffusion model. To enhance inter-slice consistency, we also incorporate total variation (TV) regularization stochastically along the concatenation axis. We evaluate our performance on optical coherence tomography (OCT) super-resolution. Our method significantly improves reconstruction quality for large-scale scientific imaging compared to traditional and learning-based baselines, while providing robust and credible reconstructions.
Abstract:We propose an OCT super-resolution framework based on a plug-and-play diffusion model (PnP-DM) to reconstruct high-quality images from sparse measurements (OCT B-mode corneal images). Our method formulates reconstruction as an inverse problem, combining a diffusion prior with Markov chain Monte Carlo sampling for efficient posterior inference. We collect high-speed under-sampled B-mode corneal images and apply a deep learning-based up-sampling pipeline to build realistic training pairs. Evaluations on in vivo and ex vivo fish-eye corneal models show that PnP-DM outperforms conventional 2D-UNet baselines, producing sharper structures and better noise suppression. This approach advances high-fidelity OCT imaging in high-speed acquisition for clinical applications.
Abstract:In recent years, deep learning technology has developed rapidly, and the application of deep neural networks in the medical image processing field has become the focus of the spotlight. This paper aims to achieve needle position detection in medical retinal surgery by adopting the target detection algorithm based on YOLOv5 as the basic deep neural network model. The state-of-the-art needle detection approaches for medical surgery mainly focus on needle structure segmentation. Instead of the needle segmentation, the proposed method in this paper contains the angle examination during the needle detection process. This approach also adopts a novel classification method based on the different positions of the needle to improve the model. The experiments demonstrate that the proposed network can accurately detect the needle position and measure the needle angle. The performance test of the proposed method achieves 4.80 for the average Euclidean distance between the detected tip position and the actual tip position. It also obtains an average error of 0.85 degrees for the tip angle across all test sets.
Abstract:Accurate tongue motion estimation is essential for tongue function evaluation. The harmonic phase processing (HARP) method and the phase vector incompressible registration algorithm (PVIRA) based on HARP can generate motion estimates from tagged MRI images, but they suffer from tag jumping due to large motions. This paper proposes a new registration method by combining the stationary velocity fields produced by PVIRA between successive time frames as a new initialization of the final registration stage to avoid tag jumping. The experiment results demonstrate the proposed method can avoid tag jumping and outperform the existing methods on tongue motion estimates.




Abstract:Tagged magnetic resonance imaging (MRI) has been used for decades to observe and quantify the detailed motion of deforming tissue. However, this technique faces several challenges such as tag fading, large motion, long computation times, and difficulties in obtaining diffeomorphic incompressible flow fields. To address these issues, this paper presents a novel unsupervised phase-based 3D motion estimation technique for tagged MRI. We introduce two key innovations. First, we apply a sinusoidal transformation to the harmonic phase input, which enables end-to-end training and avoids the need for phase interpolation. Second, we propose a Jacobian determinant-based learning objective to encourage incompressible flow fields for deforming biological tissues. Our method efficiently estimates 3D motion fields that are accurate, dense, and approximately diffeomorphic and incompressible. The efficacy of the method is assessed using human tongue motion during speech, and includes both healthy controls and patients that have undergone glossectomy. We show that the method outperforms existing approaches, and also exhibits improvements in speed, robustness to tag fading, and large tongue motion.