Abstract:With the rapid development of gravitational wave astronomy, the increasing number of detected events necessitates efficient methods for parameter estimation and model updates. This study presents a novel approach using knowledge distillation techniques to enhance computational efficiency in gravitational wave analysis. We develop a framework combining ResNet1D and Inverse Autoregressive Flow (IAF) architectures, where knowledge from a complex teacher model is transferred to a lighter student model. Our experimental results show that the student model achieves a validation loss of 3.70 with optimal configuration (40,100,0.75), compared to the teacher model's 4.09, while reducing the number of parameters by 43\%. The Jensen-Shannon divergence between teacher and student models remains below 0.0001 across network layers, indicating successful knowledge transfer. By optimizing ResNet layers (7-16) and hidden features (70-120), we achieve a 35\% reduction in inference time while maintaining parameter estimation accuracy. This work demonstrates significant improvements in computational efficiency for gravitational wave data analysis, providing valuable insights for real-time event processing.
Abstract:Adversarial training with Normalizing Flow (NF) models is an emerging research area aimed at improving model robustness through adversarial samples. In this study, we focus on applying adversarial training to NF models for gravitational wave parameter estimation. We propose an adaptive epsilon method for Fast Gradient Sign Method (FGSM) adversarial training, which dynamically adjusts perturbation strengths based on gradient magnitudes using logarithmic scaling. Our hybrid architecture, combining ResNet and Inverse Autoregressive Flow, reduces the Negative Log Likelihood (NLL) loss by 47\% under FGSM attacks compared to the baseline model, while maintaining an NLL of 4.2 on clean data (only 5\% higher than the baseline). For perturbation strengths between 0.01 and 0.1, our model achieves an average NLL of 5.8, outperforming both fixed-epsilon (NLL: 6.7) and progressive-epsilon (NLL: 7.2) methods. Under stronger Projected Gradient Descent attacks with perturbation strength of 0.05, our model maintains an NLL of 6.4, demonstrating superior robustness while avoiding catastrophic overfitting.