Title: Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets

May 26, 2023

Hayeon Lee, Sohyun An, Minseon Kim, Sung Ju Hwang

Abstract: Distillation-aware Neural Architecture Search (DaNAS) aims to find the student architecture that achieves the best performance and/or efficiency when distilling knowledge from a given teacher model. Previous DaNAS methods have mostly searched for an architecture under a fixed dataset and a fixed teacher. They do not generalize well to a new task consisting of an unseen dataset and an unseen teacher, and therefore must perform a costly search for every new combination of dataset and teacher. For standard NAS tasks without knowledge distillation (KD), computationally efficient meta-learning-based NAS methods have been proposed; these learn a generalized search process over multiple tasks (datasets) and transfer the knowledge obtained from those tasks to a new task. However, because they assume training from scratch without KD from a teacher, they may not be ideal for DaNAS scenarios. To eliminate the excessive computational cost of DaNAS methods and the sub-optimality of rapid NAS methods, we propose a distillation-aware meta accuracy prediction model, DaSS (Distillation-aware Student Search), which predicts a given architecture's final performance on a dataset when performing KD with a given teacher, without actually having to train it on the target task. The experimental results demonstrate that our proposed meta-prediction model successfully generalizes to multiple unseen datasets for DaNAS tasks, largely outperforming existing meta-NAS methods and rapid NAS baselines. Code is available at https://github.com/CownowAn/DaSS
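To make the idea of predictor-guided search concrete, the following is a minimal sketch of how a predictor that takes an architecture, a dataset, and a teacher could be used to rank candidate students without training any of them. It is not the DaSS implementation: `toy_predictor`, its hand-picked weights, the vector encodings, and `search_student` are all hypothetical names and values introduced only for illustration. A real meta-predictor would be a learned model.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector, Vector, Vector], float]


def toy_predictor(arch: Vector, dataset: Vector, teacher: Vector) -> float:
    """Hypothetical stand-in for a trained meta-predictor.

    Scores a student with a fixed dot product over the concatenated
    (architecture, dataset, teacher) encodings. The weights are made up
    for illustration and are NOT learned.
    """
    features = list(arch) + list(dataset) + list(teacher)
    weights = [0.5, -0.2, 0.1, 0.3, 0.4]  # arbitrary assumption
    return sum(w * x for w, x in zip(weights, features))


def search_student(
    candidates: Sequence[Vector],
    dataset: Vector,
    teacher: Vector,
    predictor: Predictor,
    top_k: int = 1,
) -> List[Vector]:
    """Rank candidates by predicted score and return the top_k.

    No candidate is trained; only the predictor is queried.
    """
    scored: List[Tuple[float, Vector]] = [
        (predictor(arch, dataset, teacher), arch) for arch in candidates
    ]
    # Highest predicted score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [arch for _, arch in scored[:top_k]]


# Toy encodings (assumptions): 2-d architecture, 1-d dataset, 2-d teacher.
candidates = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dataset_encoding = (0.5,)
teacher_encoding = (1.0, 0.2)

best = search_student(candidates, dataset_encoding, teacher_encoding, toy_predictor)
```

The design point the sketch illustrates is that the cost of evaluating a new dataset and teacher pair reduces to one predictor query per candidate, rather than one distillation run per candidate.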

* ICLR 2023 (Notable-top-25%)

View paper on OpenReview
