Title: The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Jun 22, 2023

Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank Reddi, Tengyu Ma, Stefanie Jegelka

Abstract: Recent works on over-parameterized neural networks have shown that the stochasticity in optimizers has the implicit regularization effect of minimizing the sharpness of the loss function (in particular, the trace of its Hessian) over the family of zero-loss solutions. More explicit forms of flatness regularization also empirically improve generalization performance. However, it remains unclear why and when flatness regularization leads to better generalization. This work takes the first step toward understanding the inductive bias of minimum-trace-of-Hessian solutions in an important setting: learning deep linear networks from linear measurements, also known as deep matrix factorization. We show that for all depths greater than one, with the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters (i.e., the product of all layer matrices), which in turn leads to better generalization. We empirically verify our theoretical findings on synthetic datasets.
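To make the quantity named in the abstract concrete, here is a minimal NumPy sketch of the Schatten 1-norm (the sum of singular values, also called the nuclear norm) of the end-to-end matrix of a deep linear network. It is an illustration only and not the authors' code: the helper names `end_to_end` and `schatten_1_norm`, the depth of 3, the 4x4 layer shapes, and the random initialization are all assumptions chosen for the example.

```python
import numpy as np

def end_to_end(layers):
    """Return W_L @ ... @ W_1 for layers listed from first (W_1) to last (W_L)."""
    product = layers[0]
    for w in layers[1:]:
        product = w @ product
    return product

def schatten_1_norm(matrix):
    """Schatten 1-norm (nuclear norm): the sum of the singular values."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values.sum())

# Hypothetical depth-3 linear network with 4x4 layers (sizes are assumptions).
rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)) for _ in range(3)]

E = end_to_end(layers)       # end-to-end matrix, i.e. the product of all layers
nuc = schatten_1_norm(E)     # the norm the abstract relates to the Hessian trace
print(f"Schatten 1-norm of the end-to-end matrix: {nuc:.4f}")
```

The sketch only evaluates the norm; it does not compute the trace of the Hessian or check the RIP condition under which the abstract states the approximate equivalence.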
