Title: Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

Feb 28, 2024

Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie

Abstract: Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches propose masking based on patch informativeness. However, these methods often do not consider the specific requirements of downstream tasks, potentially leading to suboptimal representations for these tasks. In response, we introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that leverages end-to-end feedback from downstream tasks to learn an optimal masking strategy during pretraining. Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning. Compared to existing methods, it demonstrates remarkable improvements across diverse datasets and tasks, showcasing its adaptability and efficiency. Our code is available at: https://github.com/Alexiland/MLOMAE
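
To make the baseline concrete, here is a minimal sketch of the uniform random patch masking that the abstract attributes to standard MAE. It is not the MLO-MAE method and is not taken from the linked repository. The function names (`patchify`, `random_patch_mask`), the 32x32 image, the patch size of 8, and the 0.75 mask ratio are all illustrative assumptions.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into N flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (N, p*p*c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a boolean vector; True marks a masked patch.

    Every patch is equally likely to be masked, which is the uniform
    selection the abstract describes as a limitation.
    """
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    mask = np.zeros(num_patches, dtype=bool)
    mask[perm[:num_masked]] = True
    return mask


# Hypothetical example values, chosen only for illustration.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
patches = patchify(image, patch_size=8)
mask = random_patch_mask(len(patches), mask_ratio=0.75, rng=rng)

visible = patches[~mask]  # what an encoder would see
targets = patches[mask]   # what a decoder would be asked to reconstruct
```

In this sketch the mask ignores image content entirely. A learned strategy of the kind the abstract proposes would replace `random_patch_mask` with a component whose choices are trained using downstream-task feedback.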
