Edison Guo

FourCastNeXt: Improving FourCastNet Training with Limited Compute

Jan 10, 2024

Edison Guo, Maruf Ahmed, Yue Sun, Rahul Mahendru, Rui Yang, Harrison Cook, Tennessee Leeuwenburg, Ben Evans

Abstract: Recently, the FourCastNet Neural Earth System Model (NESM) has shown impressive results on predicting various atmospheric variables when trained on the ERA5 reanalysis dataset. While FourCastNet enjoys quasi-linear time and memory complexity in sequence length, compared to quadratic complexity in vanilla transformers, training FourCastNet on ERA5 from scratch still requires a large amount of compute resources, which is expensive or even inaccessible to most researchers. In this work, we show improved methods that can train FourCastNet using only 1% of the compute required by the baseline, while maintaining model performance on par with or even better than the baseline.


On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization

Jul 21, 2016

Stephen Gould, Basura Fernando, Anoop Cherian, Peter Anderson, Rodrigo Santa Cruz, Edison Guo

Abstract: Some recent works in machine learning and computer vision involve the solution of a bi-level optimization problem. Here the solution of a parameterized lower-level problem binds variables that appear in the objective of an upper-level problem. The lower-level problem typically appears as an argmin or argmax optimization problem. Many techniques have been proposed to solve bi-level optimization problems, including gradient descent, which is popular with current end-to-end learning approaches. In this technical report we collect some results on differentiating argmin and argmax optimization problems with and without constraints and provide some insightful motivating examples.

* 16 pages, 6 figures
