Nathaniel Woodward

Re-Simulation-based Self-Supervised Learning for Pre-Training Foundation Models

Mar 11, 2024

Philip Harris, Michael Kagan, Jeffrey Krupa, Benedikt Maier, Nathaniel Woodward

Abstract: Self-Supervised Learning (SSL) is at the core of training modern large machine learning models, providing a scheme for learning powerful representations that can be used in a variety of downstream tasks. However, SSL strategies must be adapted to the type of training data and downstream tasks required. We propose RS3L, a novel simulation-based SSL strategy that employs a method of re-simulation to drive data augmentation for contrastive learning. By intervening in the middle of the simulation process and re-running simulation components downstream of the intervention, we generate multiple realizations of an event, thus producing a set of augmentations covering all physics-driven variations available in the simulator. Using experiments from high-energy physics, we explore how this strategy may enable the development of a foundation model; we show how RS3L pre-training enables powerful performance in downstream tasks such as discrimination of a variety of objects and uncertainty mitigation. In addition to our results, we make the RS3L dataset publicly available for further studies on how to improve SSL strategies.
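The re-simulation idea in the abstract can be illustrated with a toy sketch. This is not the authors' simulator or code: the two-stage split, the function names (`simulate_upstream`, `simulate_downstream`, `resimulated_views`), and all numerical choices are assumptions made purely for illustration. The upstream stage is frozen for a given event, and only the stochastic downstream stage is re-run with new random seeds, so each event yields several realizations that could serve as positive pairs for contrastive learning.

```python
import random

def simulate_upstream(event_seed):
    """Toy upstream stage (hypothetical): fixed parton-like energies for one event."""
    rng = random.Random(event_seed)
    return [rng.uniform(10.0, 100.0) for _ in range(3)]

def simulate_downstream(partons, resim_seed, n_split=4):
    """Toy downstream stage (hypothetical): split each parton into fragments
    with random energy shares, then apply a small Gaussian smearing."""
    rng = random.Random(resim_seed)
    particles = []
    for energy in partons:
        weights = [rng.random() for _ in range(n_split)]
        total = sum(weights)
        for w in weights:
            share = energy * w / total
            particles.append(share * rng.gauss(1.0, 0.05))
    return particles

def resimulated_views(event_seed, n_views=2):
    """Intervene after the upstream stage and re-run only the downstream stage."""
    partons = simulate_upstream(event_seed)  # frozen at the intervention point
    return [
        simulate_downstream(partons, resim_seed=event_seed * 1000 + k)
        for k in range(n_views)
    ]

if __name__ == "__main__":
    view_a, view_b = resimulated_views(event_seed=7)
    # Same underlying event, different downstream realizations.
    print(len(view_a), len(view_b), view_a != view_b)
```

In a contrastive setup, views produced from the same `event_seed` would be treated as positives and views from different events as negatives; the encoder and loss are outside the scope of this sketch.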

* 24 pages, 9 figures
