Abstract: Artificial neural networks can acquire many aspects of human knowledge from data, making them promising as models of human learning. But what those networks can learn depends upon their inductive biases -- the factors other than the data that influence the solutions they discover -- and the inductive biases of neural networks remain poorly understood, limiting our ability to draw conclusions about human learning from the performance of these systems. Cognitive scientists and machine learning researchers often focus on the architecture of a neural network as a source of inductive bias. In this paper we explore the impact of another source of inductive bias -- the initial weights of the network -- using meta-learning as a tool for finding initial weights that are adapted for specific problems. We evaluate four widely used architectures -- MLPs, CNNs, LSTMs, and Transformers -- by meta-training 430 different models across three tasks requiring different biases and forms of generalization. We find that meta-learning can substantially reduce or entirely eliminate performance differences across architectures and data representations, suggesting that these factors may be less important as sources of inductive bias than is typically assumed. When differences are present, architectures and data representations that perform well without meta-learning tend to meta-train more effectively. Moreover, all architectures generalize poorly on problems that are far from their meta-training experience, underscoring the need for stronger inductive biases for robust generalization.
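A minimal sketch of what "meta-learning initial weights" can look like in practice, using Reptile-style first-order meta-learning on a toy sine-regression task family as a stand-in for the meta-training procedure described in the abstract; the task family, model, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# Reptile-style sketch: find an initialization adapted to a family of tasks.
# Everything here (sine tasks, MLP, learning rates) is an illustrative assumption.
import copy
import torch
import torch.nn as nn

def sample_task():
    """Hypothetical task: sine regression with random amplitude and phase."""
    amp, phase = torch.rand(1) * 4.0 + 0.1, torch.rand(1) * 3.1416
    def batch(n=32):
        x = torch.rand(n, 1) * 10.0 - 5.0
        return x, amp * torch.sin(x + phase)
    return batch

init_model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5

for meta_step in range(2000):
    task = sample_task()
    adapted = copy.deepcopy(init_model)              # start from the shared initialization
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                     # inner-loop adaptation on one task
        x, y = task()
        loss = nn.functional.mse_loss(adapted(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                            # meta-update: move the initialization
        for p_init, p_adapt in zip(init_model.parameters(), adapted.parameters()):
            p_init += meta_lr * (p_adapt - p_init)   # toward the task-adapted weights
```

After this loop, `init_model` holds an initialization that adapts to new tasks from the same family in a few gradient steps; the abstract's point is that such an initialization, not the architecture alone, can carry much of the inductive bias.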
Abstract: Humans rely on strong inductive biases to learn from few examples and abstract useful information from sensory data. Instilling such biases in machine learning models has been shown to improve their performance on various benchmarks, including few-shot learning, robustness, and alignment. However, finding effective training procedures to achieve that goal can be challenging, as psychologically rich training data such as human similarity judgments are expensive to scale, and Bayesian models of human inductive biases are often intractable for complex, realistic domains. Here, we address this challenge by introducing a Bayesian notion of generative similarity, whereby two datapoints are considered similar if they are likely to have been sampled from the same distribution. This measure can be applied to complex generative processes, including probabilistic programs. We show that generative similarity can be used to define a contrastive learning objective even when its exact form is intractable, enabling learning of spatial embeddings that express specific inductive biases. We demonstrate the utility of our approach by showing how it can be used to capture human inductive biases for geometric shapes, and to better distinguish different abstract drawing styles that are parameterized by probabilistic programs.
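A minimal sketch of how generative similarity can drive a contrastive (InfoNCE-style) objective: two samples drawn from the same generative process form a positive pair, while samples from different processes act as negatives. The random-Gaussian generative process, the encoder, and all names here are illustrative assumptions standing in for the probabilistic programs discussed in the abstract.

```python
# Contrastive sketch: embed points so that samples from the same generative
# process are close. The Gaussian "process" is a hypothetical stand-in.
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample_process():
    """Hypothetical generative process: a random 2-D Gaussian."""
    mean, scale = torch.randn(2) * 3.0, torch.rand(1) + 0.1
    return lambda n: mean + scale * torch.randn(n, 2)

encoder = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 32))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

for step in range(1000):
    processes = [sample_process() for _ in range(64)]
    # Two samples per process give positive pairs; across processes, negatives.
    a = torch.stack([p(1).squeeze(0) for p in processes])
    b = torch.stack([p(1).squeeze(0) for p in processes])
    za, zb = F.normalize(encoder(a), dim=-1), F.normalize(encoder(b), dim=-1)
    logits = za @ zb.t() / 0.1                    # temperature-scaled similarities
    labels = torch.arange(len(processes))         # matching index = positive pair
    loss = F.cross_entropy(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```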
Abstract: Most applications of machine learning to classification assume a closed set of balanced classes. This is at odds with the real world, where class occurrence statistics often follow a long-tailed power-law distribution and it is unlikely that all classes are seen in a single sample. Nonparametric Bayesian models naturally capture this phenomenon, but have significant practical barriers to widespread adoption, namely implementation complexity and computational inefficiency. To address this, we present a method for extracting the inductive bias from a nonparametric Bayesian model and transferring it to an artificial neural network. By simulating data with a nonparametric Bayesian prior, we can metalearn a sequence model that performs inference over an unlimited set of classes. After training, this "neural circuit" has distilled the corresponding inductive bias and can successfully perform sequential inference over an open set of classes. Our experimental results show that the metalearned neural circuit achieves comparable or better performance than particle filter-based methods for inference in these models while being faster and simpler to use than methods that explicitly incorporate Bayesian nonparametric inference.
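A minimal sketch of the simulate-then-metalearn recipe: sequences are drawn from a Chinese restaurant process (CRP) prior, a standard nonparametric Bayesian model, and a sequence model is trained on them; here it merely predicts whether each observation starts a previously unseen class. The CRP-with-Gaussian-observations generator, the LSTM, and all names are illustrative assumptions rather than the paper's actual model.

```python
# Sketch: simulate open-set sequences from a CRP prior and train a sequence
# model on them. The generator, LSTM head, and task are hypothetical choices.
import torch
import torch.nn as nn

def simulate_crp_sequence(length=50, alpha=1.0, dim=2):
    """Draw a sequence of observations whose class labels follow a CRP."""
    counts, means, xs, new_flags = [], [], [], []
    for _ in range(length):
        total = sum(counts)
        probs = torch.tensor(counts + [alpha], dtype=torch.float) / (total + alpha)
        k = torch.multinomial(probs, 1).item()
        if k == len(counts):                      # a previously unseen class
            counts.append(0); means.append(torch.randn(dim) * 5.0)
        counts[k] += 1
        xs.append(means[k] + torch.randn(dim) * 0.5)
        new_flags.append(float(counts[k] == 1))
    return torch.stack(xs), torch.tensor(new_flags)

model = nn.LSTM(input_size=2, hidden_size=64, batch_first=True)
head = nn.Linear(64, 1)
opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()), lr=1e-3)

for step in range(1000):
    xs, new_flags = simulate_crp_sequence()
    hidden, _ = model(xs.unsqueeze(0))            # (1, T, 64) hidden states
    logits = head(hidden).squeeze(-1).squeeze(0)  # per-step "new class" logits
    loss = nn.functional.binary_cross_entropy_with_logits(logits, new_flags)
    opt.zero_grad(); loss.backward(); opt.step()
```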