SGD with large step sizes learns sparse features

Oct 11, 2022


View paper on arXiv or OpenReview
