Abstract:The diffusion model has gained significant attention due to its remarkable data generation ability in fields such as image synthesis. However, its strong memorization and replication abilities with respect to the training data also make it a prime target for copyright infringement attacks. This paper provides an in-depth analysis of the spatial similarity of replication in diffusion model and leverages this key characteristic to design a method for detecting poisoning data. By employing a joint assessment of spatial-level and feature-level information from the detected segments, we effectively identify covertly dispersed poisoned samples. Building upon detected poisoning data, we propose a novel defense method specifically targeting copyright infringement attacks by introducing a protection constraint term into the loss function to mitigate the impact of poisoning. Extensive experimental results demonstrate that our approach achieves an average F1 score of 0.709 in detecting copyright infringement backdoors, resulting in an average increase of 68.1% in First-Attack Epoch (FAE) and an average decrease of 51.4% in Copyright Infringement Rate (CIR) of the poisoned model, effectively defending against copyright infringement. Additionally, we introduce the concept of copyright feature inversion, which aids in determining copyright responsibility and expands the application scenarios of defense strategies.
Abstract:We explore adapting foundation models (FMs) from the computer vision domain to geoscience. FMs, large neural networks trained on massive datasets, excel in diverse tasks with remarkable adaptability and generality. However, geoscience faces challenges like lacking curated training datasets and high computational costs for developing specialized FMs. This study considers adapting FMs from computer vision to geoscience, analyzing their scale, adaptability, and generality for geoscientific data analysis. We introduce a workflow that leverages existing computer vision FMs, fine-tuning them for geoscientific tasks, reducing development costs while enhancing accuracy. Through experiments, we demonstrate this workflow's effectiveness in broad applications to process and interpret geoscientific data of lunar images, seismic data, DAS arrays and so on. Our findings introduce advanced ML techniques to geoscience, proving the feasibility and advantages of cross-domain FMs adaptation, driving further advancements in geoscientific data analysis and offering valuable insights for FMs applications in other scientific domains.