Title: Optimistic Verifiable Training by Controlling Hardware Nondeterminism

Mar 16, 2024

Megha Srivastava, Simran Arora, Dan Boneh

Abstract: The increasing compute demands of AI systems have led to the emergence of services that train models on behalf of clients lacking the necessary resources. However, ensuring correctness of training and guarding against potential training-time attacks, such as data poisoning, pose challenges. Existing works on verifiable training largely fall into two classes: proof-based systems, which struggle to scale because they require cryptographic techniques, and "optimistic" methods that rely on a trusted third-party auditor who replicates the training process. A key challenge with the latter is that hardware nondeterminism between GPU types during training prevents an auditor from replicating the training process exactly, and such schemes are therefore non-robust. We propose a method that combines training in a higher precision than the target model, rounding after intermediate computation steps, and storing rounding decisions based on an adaptive thresholding procedure, to successfully control for nondeterminism. Across three different NVIDIA GPUs (A40, Titan XP, RTX 2080 Ti), we achieve exact training replication at FP32 precision for both full training and fine-tuning of ResNet-50 (23M) and GPT-2 (117M) models. Our verifiable training scheme significantly decreases the storage and time costs compared to proof-based systems.

* 11 pages, 5 figures, preprint
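
The abstract's combination of higher-precision computation, rounding, and stored rounding decisions can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes float64 as the higher precision and float32 as the target, uses a fixed `threshold` in place of the paper's adaptive thresholding procedure, and the function names `round_with_log` and `replay_rounding` are hypothetical. The idea shown is only that a trainer can record the rounding direction for values that sit near a rounding boundary, and an auditor whose intermediate values differ slightly can reuse those recorded directions.

```python
import numpy as np

def _float32_neighbours(x64):
    """Return (nearest, lo, hi): the round-to-nearest float32 value and the
    float32 values just below/above each float64 input (1-D array)."""
    x64 = np.asarray(x64, dtype=np.float64)
    nearest = x64.astype(np.float32)
    n64 = nearest.astype(np.float64)
    below = np.nextafter(nearest, np.float32(-np.inf))
    above = np.nextafter(nearest, np.float32(np.inf))
    lo = np.where(n64 <= x64, nearest, below)
    hi = np.where(n64 >= x64, nearest, above)
    return nearest, lo, hi

def round_with_log(x64, threshold):
    """Trainer side (hypothetical helper): round to float32 and log the
    direction for values close to the midpoint between two float32 values.
    `threshold` is a fixed fraction of the float32 gap (an assumption)."""
    x64 = np.asarray(x64, dtype=np.float64)
    nearest, lo, hi = _float32_neighbours(x64)
    lo64, hi64 = lo.astype(np.float64), hi.astype(np.float64)
    gap = hi64 - lo64
    mid = (lo64 + hi64) / 2.0
    ambiguous = (gap > 0) & (np.abs(x64 - mid) < threshold * gap)
    # Map index -> True if the trainer rounded up, False if it rounded down.
    log = {int(i): bool(nearest[i] == hi[i]) for i in np.flatnonzero(ambiguous)}
    return nearest, log

def replay_rounding(x64, log):
    """Auditor side (hypothetical helper): round to float32, but follow the
    trainer's logged direction wherever one was recorded."""
    nearest, lo, hi = _float32_neighbours(x64)
    out = nearest.copy()
    for i, rounded_up in log.items():
        out[i] = hi[i] if rounded_up else lo[i]
    return out

# The first value sits just above a float32 rounding midpoint for the trainer
# and just below it for the auditor, mimicking a tiny hardware discrepancy.
trainer_vals = np.array([1 + 2**-24 + 2**-40, 1.1, 1.25])
auditor_vals = np.array([1 + 2**-24 - 2**-40, 1.1, 1.25])

trainer_out, rounding_log = round_with_log(trainer_vals, threshold=1e-3)
naive_auditor_out = auditor_vals.astype(np.float32)          # diverges from trainer
audited_out = replay_rounding(auditor_vals, rounding_log)    # matches trainer
```

In this toy setting only the first entry is logged, so the stored log stays small relative to the data being rounded. How the threshold is chosen and where rounding is applied during training are details of the paper that this sketch does not attempt to reproduce.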
