Title: GradAlign for Training-free Model Performance Inference

Nov 29, 2024

Yuxuan Li, Yunhui Guo

Abstract: Architecture plays an important role in determining the performance of deep neural networks. However, the search for the optimal architecture is often hindered by the vast search space, making it a time-intensive process. Recently, a novel approach known as training-free neural architecture search (NAS) has emerged, aiming to discover the ideal architecture without necessitating extensive training. Training-free NAS leverages various indicators for architecture selection, including metrics such as the count of linear regions, the density of per-sample losses, and the stability of the finite-width Neural Tangent Kernel (NTK) matrix. Despite the competitive empirical performance of current training-free NAS techniques, they suffer from certain limitations, including inconsistent performance and a lack of deep understanding. In this paper, we introduce GradAlign, a simple yet effective method designed for inferring model performance without the need for training. At its core, GradAlign quantifies the extent of conflicts within per-sample gradients at initialization, as substantial conflicts hinder model convergence and ultimately result in worse performance. We evaluate GradAlign against established training-free NAS methods using standard NAS benchmarks, showing better overall performance. Moreover, we show that the widely adopted metric of linear region count may not suffice as a dependable criterion for selecting network architectures at initialization.
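To make the idea of "conflicts within per-sample gradients" concrete, here is a minimal NumPy sketch. The abstract does not state the GradAlign formula, so everything below is an assumption for illustration only: the tiny one-hidden-layer ReLU network, the squared loss, and the use of mean pairwise cosine similarity as a conflict measure. The function names `per_sample_gradients` and `mean_pairwise_cosine` are hypothetical and do not come from the paper.

```python
import numpy as np


def per_sample_gradients(X, y, W1, w2):
    """Per-sample gradients of 0.5 * (f(x) - y)^2 for f(x) = w2 . relu(W1 x).

    Returns an array of shape (n, h * d + h): one flattened gradient per sample.
    """
    H_pre = X @ W1.T                    # (n, h) pre-activations
    H = np.maximum(H_pre, 0.0)          # (n, h) ReLU activations
    err = H @ w2 - y                    # (n,)  d(loss)/d(output)
    g_w2 = err[:, None] * H             # (n, h) gradient w.r.t. w2
    delta = err[:, None] * (H_pre > 0) * w2[None, :]   # (n, h) backprop signal
    g_W1 = delta[:, :, None] * X[:, None, :]           # (n, h, d) gradient w.r.t. W1
    return np.concatenate([g_W1.reshape(len(X), -1), g_w2], axis=1)


def mean_pairwise_cosine(G, eps=1e-12):
    """Mean cosine similarity over all distinct pairs of rows in G.

    Illustrative proxy (an assumption, not the paper's metric): values near 1
    mean the per-sample gradients agree, negative values mean they conflict.
    """
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    U = G / np.maximum(norms, eps)      # unit-normalise each gradient
    C = U @ U.T                         # (n, n) cosine similarity matrix
    off_diag = C[~np.eye(len(G), dtype=bool)]
    return float(off_diag.mean())


# Usage: score a randomly initialised network on a random mini-batch.
rng = np.random.default_rng(0)
n, d, h = 16, 8, 32                     # batch size, input dim, hidden width
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

G = per_sample_gradients(X, y, W1, w2)
score = mean_pairwise_cosine(G)
print(f"gradient agreement at init: {score:.3f}")
```

Under these assumptions, a training-free search would compute such a score once per candidate architecture at initialization and prefer candidates whose per-sample gradients conflict less. No training step is involved.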
