Konrad P. Kording

Teresa

Grounding Intelligence in Movement

Jul 03, 2025

Melanie Segado, Felipe Parodi, Jordan K. Matelsky, Michael L. Platt, Eva B. Dyer, Konrad P. Kording

Abstract: Recent advances in machine learning have dramatically improved our ability to model language, vision, and other high-dimensional data, yet they continue to struggle with one of the most fundamental aspects of biological systems: movement. Across neuroscience, medicine, robotics, and ethology, movement is essential for interpreting behavior, predicting intent, and enabling interaction. Despite its core significance in our intelligence, movement is often treated as an afterthought rather than as a rich and structured modality in its own right. This reflects a deeper fragmentation in how movement data is collected and modeled, often constrained by task-specific goals and domain-specific assumptions. But movement is not domain-bound. It reflects shared physical constraints, conserved morphological structures, and purposeful dynamics that cut across species and settings. We argue that movement should be treated as a primary modeling target for AI. It is inherently structured and grounded in embodiment and physics. This structure, often allowing for compact, lower-dimensional representations (e.g., pose), makes it more interpretable and computationally tractable to model than raw, high-dimensional sensory inputs. Developing models that can learn from and generalize across diverse movement data will not only advance core capabilities in generative modeling and control, but also create a shared foundation for understanding behavior across biological and artificial systems. Movement is not just an outcome; it is a window into how intelligent systems engage with the world.

* 9 pages, 2 figures

Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units

Jun 24, 2025

Shrey Dixit, Kayson Fakhar, Fatemeh Hadaeghi, Patrick Mineault, Konrad P. Kording, Claus C. Hilgetag

Abstract: Neural networks now generate text, images, and speech with billions of parameters, producing a need to know how each neural unit contributes to these high-dimensional outputs. Existing explainable-AI methods, such as SHAP, attribute importance to inputs, but cannot quantify the contributions of neural units across thousands of output pixels, tokens, or logits. Here we close that gap with Multiperturbation Shapley-value Analysis (MSA), a model-agnostic game-theoretic framework. By systematically lesioning combinations of units, MSA yields Shapley Modes, unit-wise contribution maps that share the exact dimensionality of the model's output. We apply MSA across scales, from multi-layer perceptrons to the 56-billion-parameter Mixtral-8x7B and Generative Adversarial Networks (GANs). The approach demonstrates how regularisation concentrates computation in a few hubs, exposes language-specific experts inside the LLM, and reveals an inverted pixel-generation hierarchy in GANs. Together, these results showcase MSA as a powerful approach for interpreting, editing, and compressing deep neural networks.
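
The lesioning idea in this abstract can be illustrated with a small sketch. It is not the authors' MSA implementation: the toy network, the function names `forward` and `shapley_modes`, and the choice of permutation sampling as the Shapley estimator are all assumptions made for illustration. Each hidden unit receives a contribution array with the same shape as the model output, which is the property the abstract attributes to Shapley Modes.

```python
import numpy as np

def forward(x, W1, W2, lesioned=()):
    """Toy two-layer network (hypothetical); lesioned hidden units are set to zero."""
    h = np.maximum(0.0, x @ W1)
    if len(lesioned):
        h[:, list(lesioned)] = 0.0
    return np.tanh(h @ W2)

def shapley_modes(x, W1, W2, n_permutations=200, seed=0):
    """Permutation-sampling estimate of each hidden unit's Shapley contribution.

    Returns an array of shape (n_units,) + output_shape, so every unit gets a
    contribution map with the dimensionality of the model output.
    """
    rng = np.random.default_rng(seed)
    n_units = W1.shape[1]
    out_shape = forward(x, W1, W2).shape
    modes = np.zeros((n_units,) + out_shape)
    for _ in range(n_permutations):
        lesioned = set(range(n_units))        # start with every unit lesioned
        prev = forward(x, W1, W2, lesioned)
        for u in rng.permutation(n_units).tolist():
            lesioned.discard(u)               # restore one unit at a time
            cur = forward(x, W1, W2, lesioned)
            modes[u] += cur - prev            # marginal contribution of unit u
            prev = cur
    return modes / n_permutations

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 6))
modes = shapley_modes(x, W1, W2, n_permutations=50)
print(modes.shape)  # one (5, 6) contribution map per hidden unit
```

Because the marginal contributions telescope within each permutation, the maps sum to the difference between the intact output and the fully lesioned output, which gives a quick sanity check on any estimator of this kind.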

The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications

Dec 02, 2024

Philippe Brouillard, Chandler Squires, Jonas Wahl, Konrad P. Kording, Karen Sachs, Alexandre Drouin, Dhanya Sridhar

Abstract: Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applications remain limited. Current methods often rely on unrealistic assumptions and are evaluated only on simple synthetic toy datasets, often with inadequate evaluation metrics. In this paper, we substantiate these claims by performing a systematic review of the recent causal discovery literature. We present applications in biology, neuroscience, and Earth sciences, fields where causal discovery holds promise for addressing key challenges. We highlight available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Our goal is to encourage the community to adopt better evaluation practices by utilizing realistic datasets and more adequate metrics.

* 39 pages, 8 figures

Empirical influence functions to understand the logic of fine-tuning

Jun 01, 2024

Jordan K. Matelsky, Lyle Ungar, Konrad P. Kording

Abstract: Understanding the process of learning in neural networks is crucial for improving their performance and interpreting their behavior. This can be approximately understood by asking how a model's output is influenced when we fine-tune on a new training sample. There are desiderata for such influences, such as decreasing influence with semantic distance, sparseness, noise invariance, transitive causality, and logical consistency. Here we use the empirical influence measured using fine-tuning to demonstrate how individual training samples affect outputs. We show that these desiderata are violated both for simple convolutional networks and for a modern LLM. We also illustrate how prompting can partially rescue this failure. Our paper presents an efficient and practical way of quantifying how well neural networks learn from fine-tuning stimuli. Our results suggest that popular models cannot generalize or perform logic in the way they appear to.

A large language model-assisted education tool to provide feedback on open-ended responses

Jul 25, 2023

Jordan K. Matelsky, Felipe Parodi, Tony Liu, Richard D. Lange, Konrad P. Kording

Abstract: Open-ended questions are a favored tool among instructors for assessing student understanding and encouraging critical exploration of course material. Providing feedback for such responses is a time-consuming task that can lead to overwhelmed instructors and decreased feedback quality. Many instructors resort to simpler question formats, like multiple-choice questions, which provide immediate feedback but at the expense of personalized and insightful comments. Here, we present a tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate responses to open-ended questions. Our tool delivers rapid personalized feedback, enabling students to quickly test their knowledge and identify areas for improvement. We provide open-source reference implementations both as a web application and as a Jupyter Notebook widget that can be used with instructional coding or math notebooks. With instructor guidance, LLMs hold promise to enhance student learning outcomes and elevate instructional methodologies.

Neural Networks as Paths through the Space of Representations

Jun 22, 2022

Richard D. Lange, Jordan Matelsky, Xinyue Wang, Devin Kwok, David S. Rolnick, Konrad P. Kording

Abstract: Deep neural networks implement a sequence of layer-by-layer operations that are each relatively easy to understand, but the resulting overall computation is generally difficult to understand. We develop a simple idea for interpreting the layer-by-layer construction of useful representations: the role of each layer is to reformat information to reduce the "distance" to the target outputs. We formalize this intuitive idea of "distance" by leveraging recent work on metric representational similarity, and show how it leads to a rich space of geometric concepts. With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path in a high-dimensional representation space. We develop tools to characterize the geometry of these paths in terms of distances, angles, and geodesics. We then ask three sets of questions of residual networks trained on CIFAR-10: (1) how straight are paths, and how does each layer contribute towards the target? (2) how do these properties emerge over training? and (3) how similar are the paths taken by wider versus deeper networks? We conclude by sketching additional ways that this kind of representational geometry can be used to understand and interpret network training, or to prescriptively improve network architectures to suit a task.

* 9 pages, 4 figures, plus appendices. Submitted to NeurIPS 2022
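
The "path through representation space" picture in this abstract can be made concrete with a rough sketch. The metric used here, the arccosine of linear CKA, is one metric of this kind and is chosen as an assumption; the abstract does not say which metric the authors use. The names `angular_cka` and `path_summary` are hypothetical, and the random tanh layers stand in for a trained network.

```python
import numpy as np

def angular_cka(X, Y):
    """Angular distance arccos(linear CKA) between two (samples x features) matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Kx, Ky = X @ X.T, Y @ Y.T                 # centered Gram matrices
    cka = np.sum(Kx * Ky) / (np.linalg.norm(Kx) * np.linalg.norm(Ky))
    return float(np.arccos(np.clip(cka, -1.0, 1.0)))

def path_summary(layers):
    """Step lengths, total path length, and endpoint distance for a list of activations."""
    steps = [angular_cka(a, b) for a, b in zip(layers[:-1], layers[1:])]
    length = sum(steps)
    direct = angular_cka(layers[0], layers[-1])
    # A ratio of 1 means the layers move straight from the first to the last representation.
    straightness = direct / length if length > 0 else 1.0
    return {"steps": steps, "length": length, "direct": direct, "straightness": straightness}

rng = np.random.default_rng(0)
layers = [rng.normal(size=(50, 10))]
for _ in range(3):                            # random stand-in layers, not a trained model
    layers.append(np.tanh(layers[-1] @ rng.normal(size=(10, 10))))
print(path_summary(layers))
```

Since this distance obeys the triangle inequality, the total path length can never be shorter than the direct distance between the first and last representation, so the straightness ratio stays at or below 1.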

Clustering units in neural networks: upstream vs downstream information

Mar 22, 2022

Richard D. Lange, David S. Rolnick, Konrad P. Kording

Abstract: It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying modularity remains an open problem. We cast the problem of detecting functional modules into the problem of detecting clusters of similar-functioning units. This raises the question of what makes two units functionally similar. For this, we consider two broad families of methods: those that define similarity based on how units respond to structured variations in inputs ("upstream"), and those based on how variations in hidden unit activations affect outputs ("downstream"). We conduct an empirical study quantifying modularity of hidden layer representations of simple feedforward, fully connected networks, across a range of hyperparameters. For each model, we quantify pairwise associations between hidden units in each layer using a variety of both upstream and downstream measures, then cluster them by maximizing their "modularity score" using established tools from network science. We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects. Second, although we observe that there is usually good agreement about clusters within both upstream methods and downstream methods, there is little agreement about the cluster assignments across these two families of methods. This has important implications for representation-learning, as it suggests that finding modular representations that reflect structure in inputs (e.g. disentanglement) may be a distinct goal from learning modular representations that reflect structure in outputs (e.g. compositionality).

* 12 main text pages, 4 main figures, 5 supplemental figures. Will be submitted to TMLR
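
The "modularity score" mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: absolute correlation between unit activations is a stand-in for their upstream and downstream association measures, the score is the standard Newman modularity of a weighted graph, the partitions are supplied by hand instead of being found by maximization, and every function name is hypothetical.

```python
import numpy as np

def unit_similarity(acts):
    """Absolute Pearson correlation between units (columns), with the diagonal zeroed."""
    A = np.abs(np.corrcoef(acts, rowvar=False))
    np.fill_diagonal(A, 0.0)
    return A

def modularity(A, labels):
    """Newman modularity Q of a partition of a weighted, undirected similarity graph."""
    labels = np.asarray(labels)
    two_m = A.sum()                           # total edge weight, counted in both directions
    k = A.sum(axis=1)                         # weighted degree of each unit
    same = labels[:, None] == labels[None, :]
    return float(((A - np.outer(k, k) / two_m) * same).sum() / two_m)

def planted_activations(n_samples=500, units_per_module=4, noise=0.3, seed=0):
    """Synthetic activations: two groups of units, each driven by its own latent factor."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, 2))
    loadings = np.repeat(np.eye(2), units_per_module, axis=1)
    return factors @ loadings + noise * rng.normal(size=(n_samples, 2 * units_per_module))

A = unit_similarity(planted_activations())
print(modularity(A, [0, 0, 0, 0, 1, 1, 1, 1]))  # partition that matches the planted groups
print(modularity(A, [0, 1, 0, 1, 0, 1, 0, 1]))  # partition that mixes the groups
```

A partition aligned with the planted groups scores higher than one that mixes them, and placing all units in a single cluster gives a score of zero, which is why the score is useful as a clustering objective.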

Prospective Learning: Back to the Future

Jan 19, 2022

Joshua T. Vogelstein, Timothy Verstynen, Konrad P. Kording, Leyla Isik, John W. Krakauer, Ralph Etienne-Cummings, Elizabeth L. Ogburn, Carey E. Priebe, Randal Burns, Kwame Kutten(+54 more)

Abstract: Research on both natural intelligence (NI) and artificial intelligence (AI) generally assumes that the future resembles the past: intelligent agents or systems (what we call 'intelligence') observe and act on the world, then use this experience to act on future experiences of the same kind. We call this 'retrospective learning'. For example, an intelligence may see a set of pictures of objects, along with their names, and learn to name them. A retrospective learning intelligence would merely be able to name more pictures of the same objects. We argue that this is not what true intelligence is about. In many real-world problems, both NIs and AIs will have to learn for an uncertain future. Both must update their internal models to be useful for future tasks, such as naming fundamentally new objects and using these objects effectively in a new context or to achieve previously unencountered goals. This ability to learn for the future we call 'prospective learning'. We articulate four relevant factors that jointly define prospective learning. Continual learning enables intelligences to remember those aspects of the past that they believe will be most useful in the future. Prospective constraints (including biases and priors) facilitate the intelligence finding general solutions that will be applicable to future problems. Curiosity motivates taking actions that inform future decision making, including in previously unmet situations. Causal estimation enables learning the structure of relations that guide choosing actions for specific outcomes, even when the specific action-outcome contingencies have never been observed before. We argue that a paradigm shift from retrospective to prospective learning will enable the communities that study intelligence to unite and overcome existing bottlenecks to more effectively explain, augment, and engineer intelligences.

Object Based Attention Through Internal Gating

Jun 08, 2021

Jordan Lei, Ari S. Benjamin, Konrad P. Kording

Abstract: Object-based attention is a key component of the visual system, relevant for perception, learning, and memory. Neurons tuned to features of attended objects tend to be more active than those associated with non-attended objects. There is a rich set of models of this phenomenon in computational neuroscience. However, there is currently a divide between models that successfully match physiological data but can only deal with extremely simple problems and models of attention used in computer vision. For example, attention in the brain is known to depend on top-down processing, whereas self-attention in deep learning does not. Here, we propose an artificial neural network model of object-based attention that captures the way in which attention is both top-down and recurrent. Our attention model works well both on simple test stimuli, such as those using images of handwritten digits, and on more complex stimuli, such as natural images drawn from the COCO dataset. We find that our model replicates a range of findings from neuroscience, including attention-invariant tuning, inhibition of return, and attention-mediated scaling of activity. Understanding object-based attention is both computationally interesting and a key problem for computational neuroscience.

An adversarial algorithm for variational inference with a new role for acetylcholine

Jun 18, 2020

Ari S. Benjamin, Konrad P. Kording

Abstract: Sensory learning in the mammalian cortex has long been hypothesized to involve the objective of variational inference (VI). Likely the most well-known algorithm for cortical VI is the Wake-Sleep algorithm (Hinton et al. 1995). However, Wake-Sleep problematically assumes that neural activities are independent given lower layers during generation. Here, we construct a VI system that is both compatible with neurobiology and avoids this assumption. The core of the system is a wake-sleep discriminator that classifies network states as inferred or self-generated. Inference connections learn by opposing this discriminator. This adversarial dynamic solves a core problem within VI, which is to match the distribution of stimulus-evoked (inference) activity to that of self-generated activity. Meanwhile, generative connections learn to predict lower-level activity as in standard VI. We implement this algorithm and show that it can successfully train the approximate inference network for generative models. Our proposed algorithm makes several biological predictions that can be tested. Most importantly, it predicts a teaching signal that is remarkably similar to known properties of the cholinergic system.
