Kartik Chandra

Building Machines that Learn and Think with People

Jul 22, 2024

Katherine M. Collins, Ilia Sucholutsky, Umang Bhatt, Kartik Chandra, Lionel Wong, Mina Lee, Cedegao E. Zhang, Tan Zhi-Xuan, Mark Ho, Vikash Mansinghka (+3 more)

Abstract: What do we want from machine intelligence? We envision machines that are not just tools for thought, but partners in thought: reasonable, insightful, knowledgeable, reliable, and trustworthy systems that think with us. Current artificial intelligence (AI) systems satisfy some of these criteria, some of the time. In this Perspective, we show how the science of collaborative cognition can be put to work to engineer systems that really can be called "thought partners," systems built to meet our expectations and complement our limitations. We lay out several modes of collaborative thought in which humans and AI thought partners can engage and propose desiderata for human-compatible thought partnerships. Drawing on motifs from computational cognitive science, we motivate an alternative scaling path for the design of thought partners and ecosystems around their use through a Bayesian lens, whereby the partners we construct actively build and reason over models of the human and world.

WatChat: Explaining perplexing programs by debugging mental models

Mar 08, 2024

Kartik Chandra, Tzu-Mao Li, Rachit Nigam, Joshua Tenenbaum, Jonathan Ragan-Kelley

Abstract: Often, a good explanation for a program's unexpected behavior is a bug in the programmer's code. But sometimes, an even better explanation is a bug in the programmer's mental model of the language they are using. Instead of merely debugging our current code ("giving the programmer a fish"), what if our tools could directly debug our mental models ("teaching the programmer to fish")? In this paper, we apply ideas from computational cognitive science to do exactly that. Given a perplexing program, we use program synthesis techniques to automatically infer potential misconceptions that might cause the user to be surprised by the program's behavior. By analyzing these misconceptions, we provide succinct, useful explanations of the program's behavior. Our methods can even be inverted to synthesize pedagogical example programs for diagnosing and correcting misconceptions in students.

How to guess a gradient

Dec 07, 2023

Utkarsh Singhal, Brian Cheung, Kartik Chandra, Jonathan Ragan-Kelley, Joshua B. Tenenbaum, Tomaso A. Poggio, Stella X. Yu

Abstract: How much can you say about the gradient of a neural network without computing a loss or knowing the label? This may sound like a strange question: surely the answer is "very little." However, in this paper, we show that gradients are more structured than previously thought. Gradients lie in a predictable low-dimensional subspace which depends on the network architecture and incoming features. Exploiting this structure can significantly improve gradient-free optimization schemes based on directional derivatives, which have struggled to scale beyond small networks trained on toy datasets. We study how to narrow the gap in optimization performance between methods that calculate exact gradients and those that use directional derivatives. Furthermore, we highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
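
As a rough illustration of what "optimizing with directional derivatives" can mean, the sketch below estimates a gradient by averaging `slope * v` over random directions `v`, where `slope` is the directional derivative along `v`. This is only the generic isotropic-guess baseline, not the structured guessing scheme the paper proposes; the toy quadratic loss and every name here (`CURVATURE`, `TARGET`, `guessed_gradient`, and so on) are assumptions made up for the example.

```python
import math
import random

# Hypothetical toy problem: a separable quadratic loss in five dimensions.
CURVATURE = [1.0, 2.0, 0.5, 3.0, 1.5]
TARGET = [1.0, -2.0, 3.0, 0.0, 2.0]


def loss(w):
    return 0.5 * sum(c * (x - t) ** 2 for c, x, t in zip(CURVATURE, w, TARGET))


def true_gradient(w):
    # Closed-form gradient, used only to judge the quality of the guess.
    return [c * (x - t) for c, x, t in zip(CURVATURE, w, TARGET)]


def directional_derivative(f, w, v, eps=1e-4):
    # Central difference along v; needs two loss evaluations and no backprop.
    plus = f([x + eps * d for x, d in zip(w, v)])
    minus = f([x - eps * d for x, d in zip(w, v)])
    return (plus - minus) / (2 * eps)


def guessed_gradient(f, w, n_guesses, rng):
    # For v ~ N(0, I), the expectation of (grad . v) * v equals grad,
    # so averaging many such terms approaches the true gradient.
    estimate = [0.0] * len(w)
    for _ in range(n_guesses):
        v = [rng.gauss(0.0, 1.0) for _ in w]
        slope = directional_derivative(f, w, v)
        estimate = [e + slope * d / n_guesses for e, d in zip(estimate, v)]
    return estimate


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


w0 = [0.0] * 5
estimate = guessed_gradient(loss, w0, 4000, random.Random(0))
alignment = cosine(estimate, true_gradient(w0))
```

With isotropic directions the number of guesses needed for a given accuracy grows with the dimension, which is one way to see why such schemes have trouble scaling; restricting the guesses to a smaller, well-chosen subspace is the kind of structure the abstract refers to.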

Differentiating Metropolis-Hastings to Optimize Intractable Densities

Jun 30, 2023

Gaurav Arya, Ruben Seyer, Frank Schäfer, Kartik Chandra, Alexander K. Lew, Mathieu Huot, Vikash K. Mansinghka, Jonathan Ragan-Kelley, Christopher Rackauckas, Moritz Schauer

Abstract: We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers, allowing us to differentiate through probabilistic inference, even if the model has discrete components within it. Our approach fuses recent advances in stochastic automatic differentiation with traditional Markov chain coupling schemes, providing an unbiased and low-variance gradient estimator. This allows us to apply gradient-based optimization to objectives expressed as expectations over intractable target densities. We demonstrate our approach by finding an ambiguous observation in a Gaussian mixture model and by maximizing the specific heat in an Ising model.
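
For readers unfamiliar with the sampler being differentiated, here is a minimal sketch of plain random-walk Metropolis-Hastings on a one-dimensional target. It shows only the standard, non-differentiated sampler; the paper's gradient estimator and coupling scheme are not reproduced. The function name, the standard-normal target, and the step size are illustrative assumptions.

```python
import math
import random


def metropolis_hastings(log_density, x0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings for a 1-D unnormalized log density."""
    samples = []
    x = x0
    log_p = log_density(x)
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the proposal ratio cancels.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    # Unnormalized: the sampler never needs the normalizing constant.
    return -0.5 * x * x


samples = metropolis_hastings(
    standard_normal_log_density, 0.0, 20000, 2.0, random.Random(0)
)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The accept/reject branch is a discrete decision, which is what makes differentiating through such a sampler non-trivial.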

* 6 pages, 6 figures; accepted at Differentiable Almost Everything Workshop of ICML 2023

Inferring the Future by Imagining the Past

May 26, 2023

Kartik Chandra, Tony Chen, Tzu-Mao Li, Jonathan Ragan-Kelley, Josh Tenenbaum

Abstract: A single panel of a comic book can say a lot: it shows not only where characters currently are, but also where they came from, what their motivations are, and what might happen next. More generally, humans can often infer a complex sequence of past and future events from a *single snapshot image* of an intelligent agent. Building on recent work in cognitive science, we offer a Monte Carlo algorithm for making such inferences. Drawing a connection to Monte Carlo path tracing in computer graphics, we borrow ideas that help us dramatically improve upon prior work in sample efficiency. This allows us to scale to a wide variety of challenging inference problems with only a handful of samples. It also suggests some degree of cognitive plausibility, and indeed we present human subject studies showing that our algorithm matches human intuitions in a variety of domains that previous methods could not scale to.

Acting as Inverse Inverse Planning

May 26, 2023

Kartik Chandra, Tzu-Mao Li, Josh Tenenbaum, Jonathan Ragan-Kelley

Abstract: Great storytellers know how to take us on a journey. They direct characters to act -- not necessarily in the most rational way -- but rather in a way that leads to interesting situations, and ultimately creates an impactful experience for audience members looking on. If audience experience is what matters most, then can we help artists and animators *directly* craft such experiences, independent of the concrete character actions needed to evoke those experiences? In this paper, we offer a novel computational framework for such tools. Our key idea is to optimize animations with respect to *simulated* audience members' experiences. To simulate the audience, we borrow an established principle from cognitive science: that human social intuition can be modeled as "inverse planning," the task of inferring an agent's (hidden) goals from its (observed) actions. Building on this model, we treat storytelling as "*inverse* inverse planning," the task of choosing actions to manipulate an inverse planner's inferences. Our framework is grounded in literary theory, naturally capturing many storytelling elements from first principles. We give a series of examples to demonstrate this, with supporting evidence from human subject studies.
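
To make "inverse planning" concrete, the following sketch infers which of two goals an agent on a number line is heading toward, using Bayes' rule with a noisily rational (softmax) action model. It is a toy reading of the principle named in the abstract, not the paper's framework; the one-dimensional world, the `beta` rationality parameter, and the function names are all assumptions.

```python
import math


def action_likelihood(pos, action, goal, beta=1.0):
    """P(action | pos, goal) for a noisily rational agent on a line."""

    def utility(a):
        # Closer to the goal after the move means higher utility.
        return -abs(pos + a - goal)

    weights = {a: math.exp(beta * utility(a)) for a in (-1, 1)}
    return weights[action] / sum(weights.values())


def infer_goal(start, actions, goals, beta=1.0):
    """Posterior over goals given observed actions, from a uniform prior."""
    posterior = {g: 1.0 / len(goals) for g in goals}
    pos = start
    for a in actions:
        for g in goals:
            posterior[g] *= action_likelihood(pos, a, g, beta)
        pos += a
    total = sum(posterior.values())
    return {g: p / total for g, p in posterior.items()}


# An observer watches the agent step right twice, starting from 0.
beliefs = infer_goal(start=0, actions=[1, 1], goals=[-3, 3])
```

"Inverse inverse planning" would then sit one level up: searching over candidate action sequences for the one that drives an observer's `beliefs` toward whatever the storyteller wants the audience to think.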

* SIGGRAPH '23

Designing Perceptual Puzzles by Differentiating Probabilistic Programs

Apr 26, 2022

Kartik Chandra, Tzu-Mao Li, Joshua Tenenbaum, Jonathan Ragan-Kelley

Abstract: We design new visual illusions by finding "adversarial examples" for principled models of human perception -- specifically, for probabilistic models, which treat vision as Bayesian inference. To perform this search efficiently, we design a differentiable probabilistic programming language, whose API exposes MCMC inference as a first-class differentiable function. We demonstrate our method by automatically creating illusions for three features of human vision: color constancy, size constancy, and face perception.

* 9 pages; 3 figures; SIGGRAPH '22 Conference Proceedings

Gradient Descent: The Ultimate Optimizer

Sep 29, 2019

Kartik Chandra, Erik Meijer, Samantha Andow, Emilio Arroyo-Fang, Irene Dea, Johann George, Melissa Grueter, Basil Hosmer, Steffi Stumpos, Alanna Tempest (+1 more)

Abstract: Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as the learning rate. There exist many techniques for automated hyperparameter optimization, but they typically introduce even more hyperparameters to control the hyperparameter optimization process. We propose to instead learn the hyperparameters themselves by gradient descent, and furthermore to learn the hyper-hyperparameters by gradient descent as well, and so on ad infinitum. As these towers of gradient-based optimizers grow, they become significantly less sensitive to the choice of top-level hyperparameters, hence decreasing the burden on the user to search for optimal values.
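
The idea of learning a step size by gradient descent can be sketched on a one-dimensional toy problem. In the example below the derivative of the loss with respect to the step size is worked out by hand for a single level; it is an illustrative assumption-laden sketch, not the paper's implementation, and the toy loss, the names, and the hyper-step-size `kappa` are hypothetical.

```python
def grad(w, target=3.0):
    """Gradient of the toy loss 0.5 * (w - target) ** 2."""
    return w - target


def sgd_fixed(w, alpha, steps):
    """Plain gradient descent with a fixed step size alpha."""
    for _ in range(steps):
        w -= alpha * grad(w)
    return w


def sgd_learned_step_size(w, alpha, kappa, steps):
    """Gradient descent whose step size is itself updated by gradient descent."""
    prev_g = None
    for _ in range(steps):
        g = grad(w)
        if prev_g is not None:
            # The current w came from w_prev - alpha * prev_g, so
            # d(loss)/d(alpha) = g * (-prev_g). Take a descent step on alpha.
            alpha -= kappa * (-(g * prev_g))
        w -= alpha * g
        prev_g = g
    return w, alpha


# Both runs start from a step size that is far too small for this problem.
w_fixed = sgd_fixed(0.0, alpha=0.001, steps=100)
w_learned, alpha_learned = sgd_learned_step_size(0.0, alpha=0.001, kappa=0.01, steps=100)
```

Here `kappa` is exactly the kind of extra hyperparameter the abstract mentions; stacking another level would learn `kappa` the same way.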

SPoC: Search-based Pseudocode to Code

Jun 12, 2019

Sumith Kulal, Panupong Pasupat, Kartik Chandra, Mina Lee, Oded Padon, Alex Aiken, Percy Liang

Abstract: We consider the task of mapping pseudocode to long programs that are functionally correct. Given test cases as a mechanism to validate programs, we search over the space of possible translations of the pseudocode to find a program that passes the validation. However, without proper credit assignment to localize the sources of program failures, it is difficult to guide search toward more promising programs. We propose to perform credit assignment based on signals from compilation errors, which constitute 88.7% of program failures. Concretely, we treat the translation of each pseudocode line as a discrete portion of the program, and whenever a synthesized program fails to compile, an error localization method tries to identify the portion of the program responsible for the failure. We then focus search over alternative translations of the pseudocode for those portions. For evaluation, we collected the SPoC dataset (Search-based Pseudocode to Code) containing 18,356 programs with human-authored pseudocode and test cases. Under a budget of 100 program compilations, performing search improves the synthesis success rate over using the top-one translation of the pseudocode from 25.6% to 44.7%.
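
The search described above can be sketched as a best-first search over per-line candidate translations, where a localized failure rules out the offending candidate for that line. This is a simplified illustration under stated assumptions, not the paper's system: the "compiler" is a stand-in that flags the first line missing a semicolon, the candidate probabilities are invented, and test-case validation is omitted.

```python
import heapq
import math


def search(candidates, find_error_line, budget):
    """Best-first search over line translations with error-based pruning.

    candidates[i] is a list of (code, probability) pairs for pseudocode line i,
    sorted from most to least likely. find_error_line(program) returns the index
    of the line blamed for a failure, or None if the program is accepted.
    """
    n = len(candidates)

    def cost(idx):
        # Negative log-probability of the combined translation.
        return -sum(math.log(candidates[i][j][1]) for i, j in enumerate(idx))

    start = (0,) * n
    heap = [(cost(start), start)]
    seen = {start}
    blacklist = set()  # (line, candidate) pairs blamed for earlier failures
    used = 0
    while heap and used < budget:
        _, idx = heapq.heappop(heap)
        # Expand: bump one line at a time to its next-best candidate.
        for i in range(n):
            if idx[i] + 1 < len(candidates[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (cost(nxt), nxt))
        # Skip combinations containing a known-bad line without spending budget.
        if any((i, j) in blacklist for i, j in enumerate(idx)):
            continue
        program = [candidates[i][j][0] for i, j in enumerate(idx)]
        used += 1
        bad = find_error_line(program)
        if bad is None:
            return program, used
        blacklist.add((bad, idx[bad]))
    return None, used


def toy_error_line(program):
    # Stand-in for compile + error localization: blame the first line
    # that does not end with a semicolon.
    for i, line in enumerate(program):
        if not line.endswith(";"):
            return i
    return None


toy_candidates = [
    [("int x = 0;", 0.9), ("x = 0;", 0.1)],
    [("x += 1", 0.6), ("x += 1;", 0.4)],
    [("print x", 0.7), ("cout << x;", 0.3)],
]
program, compilations = search(toy_candidates, toy_error_line, budget=100)
```

In this toy run the blacklist lets the search skip a combination that reuses an already-blamed line, so fewer compilations are spent than a search that ignores the error signal would need.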

* Under submission to NeurIPS 2019
