Zeyad Emam

End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking

Feb 15, 2022

Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein

Abstract: Machine learning systems perform well on pattern matching tasks, but their ability to perform algorithmic or logical reasoning is not well understood. One important reasoning capability is logical extrapolation, in which models trained only on small/simple reasoning problems can synthesize complex algorithms that scale up to large/complex problems at test time. Logical extrapolation can be achieved through recurrent systems, which can be iterated many times to solve difficult reasoning problems. We observe that this approach fails to scale to highly complex problems because behavior degenerates when many iterations are applied -- an issue we refer to as "overthinking." We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten. We also employ a progressive training routine that prevents the model from learning behaviors that are specific to iteration number and instead pushes it to learn behaviors that can be repeated indefinitely. These innovations prevent the overthinking problem, and enable recurrent systems to solve extremely hard logical extrapolation tasks, some requiring over 100K convolutional layers, without overthinking.
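The recall idea in the abstract above can be illustrated with a toy sketch. This is not the authors' network: the update rule below is hand-written rather than learned, and the names `step` and `iterate` are hypothetical. It only shows the structural point that when the original problem instance is handed to every iteration, the recurrent state can settle at a fixed point, so running far more iterations than needed does not degrade the answer. The task here is prefix parity (running XOR) over a bit string, chosen purely for illustration.

```python
def step(state, x):
    """One recurrent update. The original input x is passed in every time
    (the "recall" connection), so it cannot be forgotten."""
    return [x[i] ^ (state[i - 1] if i > 0 else 0) for i in range(len(x))]


def iterate(x, num_iters):
    """Apply the same update num_iters times, starting from an all-zero state."""
    state = [0] * len(x)
    for _ in range(num_iters):
        state = step(state, x)
    return state


bits = [1, 0, 1, 1]
few = iterate(bits, len(bits))   # just enough iterations to converge
many = iterate(bits, 100)        # far more iterations than needed
print(few, many)
```

Each iteration makes one more position of the running parity correct, so `len(bits)` iterations suffice, and because the correct answer is a fixed point of `step`, the extra iterations leave it unchanged.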

Datasets for Studying Generalization from Easy to Hard Examples

Aug 13, 2021

Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Arpit Bansal, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein

Abstract: We describe new datasets for studying generalization from easy to hard examples.

On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models

Jul 31, 2021

Zeyad Emam, Andrew Kondrich, Sasha Harrison, Felix Lau, Yushi Wang, Aerin Kim, Elliot Branson

Abstract: High-quality labeled datasets play a crucial role in fueling the development of machine learning (ML), and in particular the development of deep learning (DL). However, since the emergence of the ImageNet dataset and the AlexNet model in 2012, the size of new open-source labeled vision datasets has remained roughly constant. Consequently, only a minority of publications in the computer vision community tackle supervised learning on datasets that are orders of magnitude larger than ImageNet. In this paper, we survey computer vision research domains that study the effects of such large datasets on model performance across different vision tasks. We summarize the community's current understanding of those effects, and highlight some open questions related to training with massive datasets. In particular, we tackle: (a) The largest datasets currently used in computer vision research and the interesting takeaways from training on such datasets; (b) The effectiveness of pre-training on large datasets; (c) Recent advancements and hurdles facing synthetic datasets; (d) An overview of double descent and sample non-monotonicity phenomena; and finally, (e) A brief discussion of lifelong/continual learning and how it fares compared to learning from huge labeled datasets in an offline setting. Overall, our findings are that research on optimization for deep learning focuses on perfecting the training routine and thus making DL models less data hungry, while research on synthetic datasets aims to offset the cost of data labeling. However, for the time being, acquiring non-synthetic labeled data remains indispensable to boost performance.

Understanding Generalization through Visualizations

Jul 16, 2019

W. Ronny Huang, Zeyad Emam, Micah Goldblum, Liam Fowl, Justin K. Terry, Furong Huang, Tom Goldstein

Abstract: The power of neural networks lies in their ability to generalize to unseen data, yet the underlying reasons for this phenomenon remain elusive. Numerous rigorous attempts have been made to explain generalization, but available bounds are still quite loose, and analysis does not always lead to true understanding. The goal of this work is to make generalization more intuitive. Using visualization methods, we discuss the mystery of generalization, the geometry of loss landscapes, and how the curse (or, rather, the blessing) of dimensionality causes optimizers to settle into minima that generalize well.

* 8 pages (excluding acknowledgments and references), 8 figures
