Abstract:Accurate prediction and synthesis of seismic waveforms are crucial for seismic hazard assessment and earthquake-resistant infrastructure design. Existing prediction methods, such as Ground Motion Models and physics-based simulations, often fail to capture the full complexity of seismic wavefields, particularly at higher frequencies. This study introduces a novel, efficient, and scalable generative model for high-frequency seismic waveform generation. Our approach leverages a spectrogram representation of seismic waveform data, which is reduced to a lower-dimensional submanifold via an autoencoder. A state-of-the-art diffusion model is trained to generate this latent representation, conditioned on key input parameters: earthquake magnitude, recording distance, site conditions, and faulting type. The model generates waveforms with frequency content up to 50 Hz. Any scalar ground motion statistic, such as peak ground motion amplitudes and spectral accelerations, can be readily derived from the synthesized waveforms. We validate our model using commonly used seismological metrics, and performance metrics from image generation studies. Our results demonstrate that our openly available model can generate distributions of realistic high-frequency seismic waveforms across a wide range of input parameters, even in data-sparse regions. For the scalar ground motion statistics commonly used in seismic hazard and earthquake engineering studies, we show that the model accurately reproduces both the median trends of the real data and its variability. To evaluate and compare the growing number of this and similar 'Generative Waveform Models' (GWM), we argue that they should generally be openly available and that they should be included in community efforts for ground motion model evaluations.
Abstract:Robust estimation of ground motions generated by scenario earthquakes is critical for many engineering applications. We leverage recent advances in Generative Adversarial Networks (GANs) to develop a new framework for synthesizing earthquake acceleration time histories. Our approach extends the Wasserstein GAN formulation to allow for the generation of ground-motions conditioned on a set of continuous physical variables. Our model is trained to approximate the intrinsic probability distribution of a massive set of strong-motion recordings from Japan. We show that the trained generator model can synthesize realistic 3-Component accelerograms conditioned on magnitude, distance, and $V_{s30}$. Our model captures the expected statistical features of the acceleration spectra and waveform envelopes. The output seismograms display clear P and S-wave arrivals with the appropriate energy content and relative onset timing. The synthesized Peak Ground Acceleration (PGA) estimates are also consistent with observations. We develop a set of metrics that allow us to assess the training process's stability and tune model hyperparameters. We further show that the trained generator network can interpolate to conditions where no earthquake ground motion recordings exist. Our approach allows the on-demand synthesis of accelerograms for engineering purposes.
Abstract:Seismic phase association is a fundamental task in seismology that pertains to linking together phase detections on different sensors that originate from a common earthquake. It is widely employed to detect earthquakes on permanent and temporary seismic networks, and underlies most seismicity catalogs produced around the world. This task can be challenging because the number of sources is unknown, events frequently overlap in time, or can occur simultaneously in different parts of a network. We present PhaseLink, a framework based on recent advances in deep learning for grid-free earthquake phase association. Our approach learns to link phases together that share a common origin, and is trained entirely on tens of millions of synthetic sequences of P- and S-wave arrival times generated using a simple 1D velocity model. Our approach is simple to implement for any tectonic regime, suitable for real-time processing, and can naturally incorporate errors in arrival time picks. Rather than tuning a set of ad hoc hyperparameters to improve performance, PhaseLink can be improved by simply adding examples of problematic cases to the training dataset. We demonstrate the state-of-the-art performance of PhaseLink on a challenging recent sequence from southern California, and synthesized sequences from Japan designed to test the point at which the method fails. These tests show that PhaseLink can precisely associate P- and S-picks to events that are separated by ~12 seconds in origin time. This approach is expected to improve the resolution of seismicity catalogs, add stability to real-time seismic monitoring, and streamline automated processing of large seismic datasets.