Title: ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients

Jan 26, 2023

Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, Radu Marculescu

Abstract: Neural Architecture Search (NAS) is widely used to automatically design the neural network with the best performance among a large number of candidate architectures. To reduce the search time, zero-shot NAS aims at designing training-free proxies that can predict the test performance of a given architecture. However, as shown recently, none of the zero-shot proxies proposed to date can actually work consistently better than a naive proxy, namely, the number of network parameters (#Params). To improve this state of affairs, as the main theoretical contribution, we first reveal how some specific gradient properties across different samples impact the convergence rate and generalization capacity of neural networks. Based on this theoretical analysis, we propose a new zero-shot proxy, ZiCo, the first proxy that works consistently better than #Params. We demonstrate that ZiCo works better than State-Of-The-Art (SOTA) proxies on several popular NAS-Benchmarks (NASBench101, NATSBench-SSS/TSS, TransNASBench-101) for multiple applications (e.g., image classification/reconstruction and pixel-level prediction). Finally, we demonstrate that the optimal architectures found via ZiCo are as competitive as the ones found by one-shot and multi-shot NAS methods, but with much less search time. For example, ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs on ImageNet within 0.4 GPU days.
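
The abstract does not spell out the proxy's formula, so the following is only a minimal sketch of what the title suggests: an inverse coefficient of variation (mean divided by standard deviation) of gradient magnitudes across samples, aggregated per layer. The function name `inverse_cv_score`, the input layout, the log-of-sum aggregation, and the handling of zero-variance parameters are all assumptions made for illustration and may differ from the paper's definition.

```python
import math
from statistics import fmean, pstdev


def inverse_cv_score(layer_grads, eps=1e-12):
    """Hypothetical training-free score based on gradient statistics.

    layer_grads: list of layers; each layer is a list of per-batch
    gradient vectors (lists of floats of equal length within a layer).
    """
    score = 0.0
    for batches in layer_grads:
        layer_sum = 0.0
        # zip(*batches) groups the values of one parameter across batches.
        for per_param in zip(*batches):
            magnitudes = [abs(g) for g in per_param]
            std = pstdev(magnitudes)
            if std < eps:
                # Assumption: skip parameters whose gradient never varies,
                # to avoid dividing by zero.
                continue
            # Inverse coefficient of variation: mean / standard deviation.
            layer_sum += fmean(magnitudes) / std
        if layer_sum > 0:
            score += math.log(layer_sum)
    return score


# One layer, two batches, one parameter each.
stable = [[[1.0], [3.0]]]   # gradient magnitudes 1 and 3
noisy = [[[1.0], [9.0]]]    # gradient magnitudes 1 and 9
print(inverse_cv_score(stable) > inverse_cv_score(noisy))
```

Under these assumptions, an architecture whose gradients are large on average and consistent across batches receives a higher score than one whose gradients fluctuate strongly, which matches the abstract's claim that gradient properties across samples relate to convergence and generalization.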

* ICLR 2023 Spotlight

View paper on OpenReview
