Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Benjamin Liu

USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

Dec 02, 2023

Jeremy Irvin, Lucas Tao, Joanne Zhou, Yuntao Ma, Langston Nashold, Benjamin Liu, Andrew Y. Ng

Figure 1 for USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

Figure 2 for USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

Figure 3 for USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

Figure 4 for USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

Abstract:Large, self-supervised vision models have led to substantial advancements for automatically interpreting natural images. Recent works have begun tailoring these methods to remote sensing data which has rich structure with multi-sensor, multi-spectral, and temporal information providing massive amounts of self-labeled data that can be used for self-supervised pre-training. In this work, we develop a new encoder architecture called USat that can input multi-spectral data from multiple sensors for self-supervised pre-training. USat is a vision transformer with modified patch projection layers and positional encodings to model spectral bands with varying spatial scales from multiple sensors. We integrate USat into a Masked Autoencoder (MAE) self-supervised pre-training procedure and find that a pre-trained USat outperforms state-of-the-art self-supervised MAE models trained on remote sensing data on multiple remote sensing benchmark datasets (up to 8%) and leads to improvements in low data regimes (up to 7%). Code and pre-trained weights are available at https://github.com/stanfordmlgroup/USat .

Via

Access Paper or Ask Questions