Abstract: We present the use of self-supervised learning to explore and exploit large unlabeled datasets. Focusing on 42 million galaxy images from the latest data release of the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys, we first train a self-supervised model to distill low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image. We then use the representations to construct and publicly release an interactive semantic similarity search tool. We demonstrate how our tool can be used to rapidly discover rare objects given only a single example, increase the speed of crowd-sourcing campaigns, and construct and improve training sets for supervised applications. While we focus on images from sky surveys, the technique is straightforward to apply to any scientific dataset of any dimensionality. The similarity search web app can be found at https://github.com/georgestein/galaxy_search
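The core of such a semantic similarity search is a nearest-neighbour lookup in the learned representation space. The following is a minimal sketch of that idea, assuming the self-supervised embeddings are already computed and stored in a hypothetical `(N, D)` NumPy array; it is not the released tool's implementation.

```python
# Minimal sketch: nearest-neighbour similarity search over self-supervised
# representations (array names and file paths are hypothetical).
import numpy as np

def similarity_search(representations: np.ndarray, query_index: int, k: int = 10) -> np.ndarray:
    """Return indices of the k galaxies most similar to a single query image.

    representations : (N, D) array of embeddings, L2-normalized per row.
    query_index     : row index of the single labelled example.
    """
    query = representations[query_index]
    # Cosine similarity reduces to a dot product for unit-norm vectors.
    scores = representations @ query
    # Highest scores first; drop the query itself, which ranks first.
    return np.argsort(-scores)[1 : k + 1]

# Example usage (hypothetical file of precomputed embeddings):
# reps = np.load("representations.npy")
# reps /= np.linalg.norm(reps, axis=1, keepdims=True)
# neighbours = similarity_search(reps, query_index=42, k=10)
```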
Abstract: We employ self-supervised representation learning to distill information from 76 million galaxy images from the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys' Data Release 9. Targeting the identification of new strong gravitational lens candidates, we first create a rapid similarity search tool to discover new strong lenses given only a single labelled example. We then show how training a simple linear classifier on the self-supervised representations, requiring only a few minutes on a CPU, can automatically classify strong lenses with great efficiency. We present 1192 new strong lens candidates that we identified through a brief visual identification campaign, and release an interactive web-based similarity search tool and the top network predictions to facilitate the crowd-sourced rapid discovery of additional strong gravitational lenses and other rare objects: github.com/georgestein/ssl-legacysurvey
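A linear classifier on frozen self-supervised representations can be trained in minutes on a CPU. Below is a minimal sketch of this step using scikit-learn logistic regression; the array names and files are hypothetical, and this is not the paper's exact pipeline.

```python
# Minimal sketch: train a linear classifier on frozen self-supervised
# representations to rank strong-lens candidates (hypothetical inputs).
import numpy as np
from sklearn.linear_model import LogisticRegression

# reps:   (N, D) self-supervised embeddings of galaxy images
# labels: (N,) binary array, 1 for known strong lenses, 0 otherwise
reps = np.load("representations.npy")   # hypothetical file
labels = np.load("lens_labels.npy")     # hypothetical file

clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(reps, labels)                   # runs in minutes on a CPU

# Rank the full sample by predicted lens probability for visual inspection.
scores = clf.predict_proba(reps)[:, 1]
top_candidates = np.argsort(-scores)[:1000]
```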