Abstract: Spatiotemporal forecasting of complex three-dimensional phenomena (4D: 3D + time) is fundamental to applications in medical imaging, fluid and material dynamics, and geophysics. In contrast to unconstrained neural forecasting models, we propose a Schrödinger-inspired, physics-guided neural architecture that embeds an explicit time-evolution operator within a deep convolutional framework for 4D prediction. From observed volumetric sequences, the model learns voxelwise amplitude, phase, and potential fields that define a complex-valued wavefunction $\psi = A e^{i\phi}$, which is evolved forward in time using a differentiable, unrolled Schrödinger time stepper. This physics-guided formulation yields several key advantages: (i) temporal stability arising from the structured evolution operator, which mitigates drift and error accumulation in long-horizon forecasting; (ii) an interpretable latent representation, where phase encodes transport dynamics, amplitude captures structural intensity, and the learned potential governs spatiotemporal interactions; and (iii) natural compatibility with deformation-based synthesis, which is critical for preserving anatomical fidelity in medical imaging applications. By integrating physical priors directly into the learning process, the proposed approach combines the expressivity of deep networks with the robustness and interpretability of physics-based modeling. We demonstrate accurate and stable prediction of future 4D states, including volumetric intensities and deformation fields, on synthetic benchmarks that emulate realistic shape deformations and topological changes. To our knowledge, this is the first end-to-end 4D neural forecasting framework to incorporate a Schrödinger-type evolution operator, offering a principled pathway toward interpretable, stable, and anatomically consistent spatiotemporal prediction.
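
The following is a minimal, illustrative sketch (not the authors' implementation) of the core mechanism described above: a complex wavefunction $\psi = A e^{i\phi}$ is assembled from learned amplitude, phase, and potential volumes and then propagated by a differentiable, unrolled time stepper. The finite-difference Laplacian, the explicit-Euler update, the symbol V for the potential, and the step size `dt` are assumptions made for illustration; in the actual model these fields would be predicted by a 3D CNN and the loss would backpropagate through the unrolled steps.

```python
# Hypothetical sketch of a differentiable, unrolled Schrodinger time stepper
# over a 3D volume (PyTorch). Not the paper's exact implementation.
import torch

def laplacian3d(psi):
    """Six-neighbor finite-difference Laplacian of a (D, H, W) complex volume."""
    lap = -6.0 * psi
    for dim in (0, 1, 2):
        lap = lap + torch.roll(psi, shifts=1, dims=dim)
        lap = lap + torch.roll(psi, shifts=-1, dims=dim)
    return lap

def schrodinger_unroll(A, phi, V, n_steps=8, dt=0.05):
    """Evolve psi = A * exp(i*phi) under i dpsi/dt = (-0.5*Laplacian + V) psi.

    A, phi, V: real-valued (D, H, W) tensors, assumed to come from a 3D CNN.
    Returns the list of intermediate complex states (the unrolled trajectory).
    """
    psi = A * torch.exp(1j * phi)                    # complex-valued wavefunction
    states = [psi]
    for _ in range(n_steps):
        H_psi = -0.5 * laplacian3d(psi) + V * psi    # Hamiltonian applied to psi
        psi = psi - 1j * dt * H_psi                  # explicit (unrolled) Euler step
        states.append(psi)
    return states

# Usage with randomly initialized fields standing in for network predictions
A = torch.rand(32, 32, 32)
phi = torch.rand(32, 32, 32) * 6.283
V = torch.zeros(32, 32, 32)
trajectory = schrodinger_unroll(A, phi, V)
print(trajectory[-1].shape)   # torch.Size([32, 32, 32]), complex dtype
```

Because every step is an ordinary tensor operation, the whole trajectory stays differentiable, which is what allows such an evolution operator to be embedded inside a deep network and trained end to end.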




Abstract: Accurate volumetric image registration is highly relevant for clinical routines and computer-aided medical diagnosis. Recently, researchers have begun to use Transformers in learning-based methods for medical image registration and have achieved remarkable success. Owing to their strong global modeling capability, Transformers are considered a better option than convolutional neural networks (CNNs) for registration. However, they are bulky models with large parameter counts that demand substantial computation, which hinders deployment on edge devices such as portable systems or hospital workstations. Transformers also require large amounts of training data to produce significant results, and suitable annotated data are often challenging to collect. Although existing CNN-based registration methods can capture rich local information, their limited global modeling capability hampers long-range information interaction and restricts registration performance. In this work, we propose a CNN-based registration method with an enhanced receptive field, a small number of parameters, and strong results on a limited training dataset. To this end, we propose a residual U-Net with embedded parallel dilated-convolutional blocks that enlarge the receptive field. The proposed method is evaluated on inter-patient and atlas-based datasets. We show that its performance is comparable to, and slightly better than, that of transformer-based methods while using only $\SI{1.5}{\percent}$ of their parameters.
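
As a rough sketch of how such a parallel dilated-convolutional block could be written in PyTorch (the dilation rates, normalization choice, and 1x1x1 fusion layer are illustrative assumptions, not the exact design from the paper):

```python
# Hypothetical parallel dilated-convolution block for a residual 3D U-Net.
# Several 3x3x3 convolutions with different dilation rates run side by side
# and their outputs are fused, widening the receptive field at low parameter cost.
import torch
import torch.nn as nn

class ParallelDilatedBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 3)):
        super().__init__()
        # One branch per dilation rate; padding=dilation preserves spatial size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv3d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.InstanceNorm3d(channels),
                nn.LeakyReLU(0.2, inplace=True),
            )
            for d in dilations
        ])
        # 1x1x1 conv fuses the concatenated branch outputs back to `channels`.
        self.fuse = nn.Conv3d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x):
        out = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(out)          # residual connection

# Usage on a volumetric feature map: (batch, channels, D, H, W)
x = torch.randn(1, 16, 48, 48, 48)
block = ParallelDilatedBlock(16)
print(block(x).shape)                       # torch.Size([1, 16, 48, 48, 48])
```

Running several dilation rates in parallel enlarges the effective receptive field without adding depth or many parameters, which matches the stated goal of capturing long-range context with a small, data-efficient CNN.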