Abstract: A central problem in data science is to use potentially noisy samples of an unknown function to predict function values for unseen inputs. In classical statistics, predictive error is understood as a trade-off between bias and variance that balances model simplicity against the ability to fit complex functions. However, over-parameterized models exhibit counter-intuitive behaviors, such as "double descent," in which models of increasing complexity show decreasing generalization error. We introduce an alternative paradigm called the generalized aliasing decomposition. We explain the asymptotically small error of complex models as a systematic "de-aliasing" that occurs in the over-parameterized regime. In the limit of large models, the contribution due to aliasing vanishes, leaving an expression for the asymptotic total error that we call the invertibility failure of very large models on few training points. Because the generalized aliasing decomposition can be computed explicitly from the relationship between model class and samples, without seeing any data labels, it can answer questions related to experimental design and model selection before data are collected or experiments performed. We demonstrate this approach on several examples, including classical regression problems and a cluster expansion model used in materials science.
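As one concrete illustration of the double-descent phenomenon this abstract refers to, the sketch below runs minimum-norm least squares with Legendre-polynomial feature maps of growing size on a fixed set of noisy samples. This is our own toy experiment, not the paper's decomposition; the feature map, target function, and model sizes are illustrative assumptions. Note that the pseudoinverse it relies on is determined by the sample locations and the model class alone, echoing the label-free character of the decomposition described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a few noisy samples of a smooth target on [-1, 1].
n_train, n_test, noise = 20, 500, 0.1
x_train = rng.uniform(-1, 1, n_train)
x_test = np.linspace(-1, 1, n_test)
f = lambda x: np.sin(2 * np.pi * x)
y_train = f(x_train) + noise * rng.standard_normal(n_train)

def features(x, p):
    """Legendre-polynomial feature map with p basis functions (the 'model class')."""
    return np.polynomial.legendre.legvander(x, p - 1)

# Model sizes spanning the under- to over-parameterized regimes
# (the interpolation threshold sits near p = n_train).
for p in [5, 10, 20, 40, 80, 160]:
    Phi_tr, Phi_te = features(x_train, p), features(x_test, p)
    # Minimum-norm least-squares fit; the pseudoinverse depends only on the
    # sample locations and the model class, not on the labels y_train.
    coef = np.linalg.pinv(Phi_tr) @ y_train
    test_err = np.mean((Phi_te @ coef - f(x_test)) ** 2)
    print(f"p = {p:4d}   test MSE = {test_err:.3f}")
```

In runs of this kind, the test error typically peaks near the interpolation threshold and then falls again as the model grows, which is the behavior the decomposition is meant to explain.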
Abstract: Due to the success of residual networks (resnets) and related architectures, shortcut connections have quickly become standard tools for building convolutional neural networks. The explanations in the literature for the apparent effectiveness of shortcuts are varied and often contradictory. We hypothesize that shortcuts work primarily because they act as linear counterparts to nonlinear layers. We test this hypothesis by building small image-classification networks from several variations of the standard residual block, each using a different type of linear connection. Our experiments show that other kinds of linear connections can be even more effective than identity shortcuts. Our results also suggest that the best type of linear connection for a given application may depend on both network width and depth.
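To make the idea of "other kinds of linear connections" concrete, here is a minimal sketch of a residual-style block whose skip path can be either the usual identity or a learned 1x1 convolution. It assumes PyTorch, and the 1x1-convolution variant is one illustrative choice of linear connection, not necessarily one of the block designs evaluated in the paper.

```python
import torch
import torch.nn as nn

class LinearShortcutBlock(nn.Module):
    """Residual-style block whose skip path is a configurable linear map.

    shortcut='identity' recovers the standard resnet block; 'conv1x1' replaces
    the identity with a learned 1x1 convolution (one example of a more general
    linear connection).
    """
    def __init__(self, channels, shortcut="identity"):
        super().__init__()
        self.body = nn.Sequential(                      # nonlinear path
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        if shortcut == "identity":
            self.skip = nn.Identity()                   # standard shortcut
        elif shortcut == "conv1x1":
            self.skip = nn.Conv2d(channels, channels, 1, bias=False)  # learned linear map
        else:
            raise ValueError(f"unknown shortcut type: {shortcut}")
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Sum of the nonlinear path and the (linear) skip path.
        return self.relu(self.body(x) + self.skip(x))

# Example: swap shortcut types when assembling a small classifier.
block = LinearShortcutBlock(64, shortcut="conv1x1")
out = block(torch.randn(2, 64, 32, 32))
print(out.shape)   # torch.Size([2, 64, 32, 32])
```

Comparing networks built from blocks that differ only in the `shortcut` argument is one simple way to probe how the choice of linear connection interacts with network width and depth.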