Abstract: In marine towed-streamer seismic acquisition, the nearest hydrophone is often two hundred meters away from the source, resulting in missing near-offset traces that degrade critical processing workflows such as surface-related multiple elimination, velocity analysis, and full-waveform inversion. Existing reconstruction methods, such as transform-domain interpolation, often produce kinematic inconsistencies and amplitude distortions, while supervised deep learning approaches require complete ground-truth near-offset data that are unavailable in realistic acquisition scenarios. To address these limitations, we propose a self-supervised diffusion-based framework that reconstructs missing near-offset traces without requiring near-offset reference data. Our method extracts overlapping patches, shifted by a single trace, from the available far-offset section to train a conditional diffusion model, which learns the offset-dependent statistical patterns governing event curvature, amplitude variation, and wavelet characteristics. At inference, we perform trace-by-trace recursive extrapolation from the nearest recorded offset toward zero offset, progressively propagating the learned prior information from far to near offsets. The generative formulation further provides uncertainty estimates via ensemble sampling, quantifying prediction confidence where validation data are absent. Controlled validation experiments on synthetic and field datasets show substantial performance gains over conventional parabolic Radon transform baselines. Operational deployment on actual near-offset gaps demonstrates practical viability where ground-truth validation is impossible. Notably, the reconstructed waveforms preserve realistic amplitude-versus-offset trends despite training exclusively on far-offset observations, and the uncertainty maps accurately identify challenging extrapolation regions.
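
The single-trace-shift patch extraction and the recursive extrapolation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the window width, and the array layout (column 0 is the nearest recorded offset) are assumptions, and `linear_placeholder` is a hypothetical stand-in for sampling from the trained conditional diffusion model.

```python
import numpy as np

def make_training_pairs(gather, width):
    """Slide a window across offsets one trace at a time (assumed layout).

    gather: array (n_samples, n_traces); column 0 is the nearest recorded offset.
    Each condition holds `width` traces; its target is the single trace on the
    near-offset side of that window.
    """
    n_traces = gather.shape[1]
    conditions, targets = [], []
    for i in range(n_traces - width):
        targets.append(gather[:, i])
        conditions.append(gather[:, i + 1 : i + 1 + width])
    return np.stack(conditions), np.stack(targets)

def recursive_extrapolate(gather, n_missing, width, predict_trace):
    """Prepend one predicted trace at a time, moving toward zero offset."""
    out = gather
    for _ in range(n_missing):
        condition = out[:, :width]            # nearest `width` traces so far
        new_trace = predict_trace(condition)  # one trace closer to the source
        out = np.concatenate([new_trace[:, None], out], axis=1)
    return out

def linear_placeholder(condition):
    # Hypothetical stand-in for the diffusion sampler: linear extrapolation
    # from the two nearest traces. A real model would draw a sample instead.
    return 2.0 * condition[:, 0] - condition[:, 1]
```

With a stochastic sampler in place of the placeholder, repeating `recursive_extrapolate` several times and taking the per-sample standard deviation across runs would give one simple form of the ensemble uncertainty mentioned above.
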
Abstract: Seismic data are often difficult to use because of noise contamination, incomplete acquisition, and limited low-frequency information, all of which hinder accurate subsurface imaging and interpretation. Traditional processing methods rely heavily on task-specific designs to address these challenges and fail to account for the variability of the data. To address these limitations, we present a generative seismic foundation model (GSFM), a unified framework based on generative diffusion models (GDMs), designed to tackle multi-task seismic processing challenges, including denoising, backscattered noise attenuation, interpolation, and low-frequency extrapolation. GSFM leverages a pre-training stage on synthetic data to capture the features of clean, complete, and broadband seismic data distributions and applies an iterative fine-tuning strategy to adapt the model to field data. By adopting a target-oriented diffusion process prediction, GSFM improves computational efficiency without compromising accuracy. Synthetic data tests demonstrate that GSFM surpasses benchmarks with equivalent architectures in all tasks and achieves performance comparable to traditional pre-training strategies, even after their fine-tuning. Field data tests further suggest that our iterative fine-tuning approach addresses the generalization limitations of conventional pre-training and fine-tuning paradigms, delivering significantly enhanced performance across diverse tasks. Furthermore, GSFM's inherent probabilistic nature enables effective uncertainty quantification, offering valuable insights into the reliability of processing results.




Abstract: Physics-informed neural networks (PINNs) face significant challenges in modeling multi-frequency wavefields in complex velocity models due to their slow convergence, difficulty in representing high-frequency details, and lack of generalization to varying frequencies and velocity scenarios. To address these issues, we propose Meta-LRPINN, a novel framework that combines low-rank parameterization using singular value decomposition (SVD) with meta-learning and frequency embedding. Specifically, we decompose the weights of the PINN's hidden layers using SVD and introduce a frequency embedding hypernetwork (FEH) that links input frequencies with the singular values, enabling efficient and frequency-adaptive wavefield representation. Meta-learning is employed to provide a robust initialization, improving optimization stability and reducing training time. Additionally, we implement adaptive rank reduction and FEH pruning during the meta-testing phase to further enhance efficiency. Numerical experiments on multi-frequency scattered wavefields for different velocity models demonstrate that Meta-LRPINN achieves much faster convergence and much higher accuracy than baseline methods such as Meta-PINN and vanilla PINN. The proposed framework also shows strong generalization to out-of-distribution frequencies while maintaining computational efficiency. These results highlight the potential of Meta-LRPINN for scalable and adaptable seismic wavefield modeling.
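
The idea of an SVD-parameterized layer whose singular values depend on frequency can be sketched as below. This is an illustrative sketch under stated assumptions, not the Meta-LRPINN code: the layer sizes, the one-layer softplus mapping standing in for the frequency embedding hypernetwork, and the simple largest-value truncation standing in for adaptive rank reduction are all hypothetical choices.

```python
import numpy as np

def init_low_rank_layer(n_in, n_out, rank, rng):
    """Return orthonormal factors U (n_out, rank) and V (n_in, rank)."""
    U = np.linalg.qr(rng.standard_normal((n_out, rank)))[0]
    V = np.linalg.qr(rng.standard_normal((n_in, rank)))[0]
    return U, V

def frequency_to_singular_values(freq, A, b):
    # Hypothetical one-layer hypernetwork: maps a scalar frequency to `rank`
    # singular values; softplus keeps them positive.
    z = A * freq + b
    return np.log1p(np.exp(z))

def low_rank_forward(x, U, V, s):
    # Applies W = U diag(s) V^T to a batch x of shape (batch, n_in)
    # without ever forming the full (n_out, n_in) matrix.
    return ((x @ V) * s) @ U.T

def truncate_rank(U, V, s, keep):
    # Simple stand-in for rank reduction: keep the `keep` largest singular values.
    idx = np.argsort(s)[::-1][:keep]
    return U[:, idx], V[:, idx], s[idx]
```

Because only `s` changes with frequency, the factors `U` and `V` can be shared across frequencies, which is one plausible reading of why such a parameterization is efficient.
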
Abstract: Building subsurface velocity models is essential to using seismic data for Earth discovery, exploration, and monitoring. With the dawn of machine learning, these velocity models (or, more precisely, their distribution) can be stored accurately and efficiently in a generative model. The stored velocity model distributions can be used to regularize, or quantify uncertainties in, inverse problems such as full waveform inversion. However, most generators, such as normalizing flows or diffusion models, treat the image (velocity model) uniformly, disregarding spatial dependencies and resolution changes with respect to the observation locations. To address this weakness, we introduce VelocityGPT, a novel implementation that uses Transformer decoders trained autoregressively to generate a velocity model from the shallow subsurface to the deep. Because seismic data are often recorded on the Earth's surface, a top-down generator can use the inverted information in the shallow part as guidance (a prior) for generating the deep part. To facilitate the implementation, we use an additional network to compress the velocity model. We also inject prior information, such as wells or structure (represented by a migration image), to guide the generation of the velocity model. Using synthetic data, we demonstrate the effectiveness of VelocityGPT as a promising approach in generative model applications for seismic velocity model building.
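
The top-down autoregressive generation described in this abstract can be illustrated with a short sampling loop. This is a sketch only: the integer tokens are assumed to come from the compression network in shallow-to-deep order, and `placeholder_next_token_probs` is a hypothetical, deterministic stand-in for the trained Transformer decoder.

```python
import numpy as np

def generate_top_down(shallow_tokens, n_total, next_token_probs, rng):
    """Extend a known shallow token sequence until it has n_total tokens.

    Each new (deeper) token is sampled from a distribution conditioned on
    every token generated so far.
    """
    tokens = list(shallow_tokens)
    while len(tokens) < n_total:
        probs = next_token_probs(np.array(tokens))
        tokens.append(int(rng.choice(len(probs), p=probs)))
    return np.array(tokens)

def placeholder_next_token_probs(tokens, vocab_size=8):
    # Hypothetical stand-in for the decoder: puts all probability on the next
    # higher code, mimicking velocity that increases with depth.
    probs = np.zeros(vocab_size)
    probs[min(int(tokens[-1]) + 1, vocab_size - 1)] = 1.0
    return probs
```

In this sketch, the shallow tokens play the role of the prior: they are kept fixed and only the deeper part of the sequence is generated.
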