Dane Corneil

Retrieve to Explain: Evidence-driven Predictions with Language Models

Feb 06, 2024

Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane Corneil

Abstract: Machine learning models, particularly language models, are notoriously difficult to introspect. Black-box models can mask both issues in model training and harmful biases. For human-in-the-loop processes, opaque predictions can drive lack of trust, limiting a model's impact even when it performs effectively. To address these issues, we introduce Retrieve to Explain (R2E). R2E is a retrieval-based language model that prioritizes amongst a pre-defined set of possible answers to a research question based on the evidence in a document corpus, using Shapley values to identify the relative importance of pieces of evidence to the final prediction. R2E can adapt to new evidence without retraining, and incorporate structured data through templating into natural language. We assess on the use case of drug target identification from published scientific literature, where we show that the model outperforms an industry-standard genetics-based approach on predicting clinical trial outcomes.
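
The abstract above mentions Shapley values for attributing a prediction to individual pieces of evidence. As a rough illustration only, the sketch below computes exact Shapley values for a tiny set of evidence passages. The `support` strengths and the noisy-OR `toy_score` function are hypothetical stand-ins for a model's score given a subset of evidence; the abstract does not specify how R2E scores evidence or approximates Shapley values.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_items, value):
    """Exact Shapley values for a set function `value` over items 0..n_items-1.

    Enumerates every subset, so it is only practical for a handful of items.
    """
    phi = [0.0] * n_items
    for i in range(n_items):
        others = [j for j in range(n_items) if j != i]
        for k in range(len(others) + 1):
            # Probability that item i joins a coalition of size k.
            weight = factorial(k) * factorial(n_items - k - 1) / factorial(n_items)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of item i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical per-passage support strengths (assumption, not from the paper).
support = [0.6, 0.3, 0.0]


def toy_score(subset):
    """Toy noisy-OR score: the answer is supported if any passage supports it."""
    p_none = 1.0
    for i in subset:
        p_none *= 1.0 - support[i]
    return 1.0 - p_none


phi = shapley_values(len(support), toy_score)
for index, contribution in enumerate(phi):
    print(f"passage {index}: {contribution:.3f}")
```

In this toy setting the contributions sum to the score with all evidence minus the score with none, and the passage with zero support receives zero attribution, which are two of the standard Shapley properties.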

Working memory facilitates reward-modulated Hebbian learning in recurrent neural networks

Oct 23, 2019

Roman Pogodin, Dane Corneil, Alexander Seeholzer, Joseph Heng, Wulfram Gerstner

Abstract: Reservoir computing is a powerful tool to explain how the brain learns temporal sequences, such as movements, but existing learning schemes are either biologically implausible or too inefficient to explain animal performance. We show that a network can learn complicated sequences with a reward-modulated Hebbian learning rule if the network of reservoir neurons is combined with a second network that serves as a dynamic working memory and provides a spatio-temporal backbone signal to the reservoir. In combination with the working memory, reward-modulated Hebbian learning of the readout neurons performs as well as FORCE learning, but with the advantage of a biologically plausible interpretation of both the learning rule and the learning paradigm.

* NeurIPS 2019 workshop "Real Neurons & Hidden Units: Future directions at the intersection of neuroscience and artificial intelligence", Vancouver, Canada
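
To make the idea of reward-modulated Hebbian learning of a readout more concrete, here is a minimal sketch of one common three-factor rule: the weight change is the product of presynaptic activity, exploratory noise on the output, and the reward relative to a running baseline. Everything in it is an assumption for illustration: the fixed sinusoidal `features` stand in for time-locked reservoir activity, and the rule, target, and parameters are not taken from the paper.

```python
import math
import random


def train_readout(n_features=10, n_steps=50, n_trials=400,
                  lr=0.1, noise_std=0.1, baseline_rate=0.2, seed=0):
    """Train a linear readout with a reward-modulated, noise-driven Hebbian rule."""
    rng = random.Random(seed)

    def features(t):
        # Hypothetical stand-in for reservoir activity at time step t.
        phase = 2.0 * math.pi * t / n_steps
        half = n_features // 2
        return ([math.sin((k + 1) * phase) for k in range(half)] +
                [math.cos((k + 1) * phase) for k in range(n_features - half)])

    def target(t):
        return math.sin(2.0 * math.pi * t / n_steps)

    def mean_squared_error(w):
        # Error of the noise-free readout over one full sequence.
        total = 0.0
        for t in range(n_steps):
            z = sum(wi * ri for wi, ri in zip(w, features(t)))
            total += (z - target(t)) ** 2
        return total / n_steps

    w = [0.0] * n_features
    baseline = [None] * n_steps  # running average reward per time step
    initial_error = mean_squared_error(w)

    for _ in range(n_trials):
        for t in range(n_steps):
            r = features(t)
            xi = rng.gauss(0.0, noise_std)  # exploratory output noise
            z = sum(wi * ri for wi, ri in zip(w, r)) + xi
            reward = -(z - target(t)) ** 2
            if baseline[t] is None:
                # First trial: only initialise the reward baseline.
                baseline[t] = reward
                continue
            # Three-factor update: (reward - baseline) * noise * presynaptic rate.
            modulation = lr * (reward - baseline[t]) * xi
            w = [wi + modulation * ri for wi, ri in zip(w, r)]
            baseline[t] += baseline_rate * (reward - baseline[t])

    return initial_error, mean_squared_error(w)


initial_error, final_error = train_readout()
print(f"initial MSE: {initial_error:.4f}, final MSE: {final_error:.4f}")
```

The update uses only quantities that are locally available at the readout plus a scalar reward signal. On average it follows the gradient of the squared error, because the reward correlates with the exploratory noise.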

Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation

Jun 11, 2018

Dane Corneil, Wulfram Gerstner, Johanni Brea

Abstract: Modern reinforcement learning algorithms reach super-human performance on many board and video games, but they are sample inefficient, i.e. they typically require significantly more playing experience than humans to reach an equal performance level. To improve sample efficiency, an agent may build a model of the environment and use planning methods to update its policy. In this article we introduce Variational State Tabulation (VaST), which maps an environment with a high-dimensional state space (e.g. the space of visual inputs) to an abstract tabular model. Prioritized sweeping with small backups, a highly efficient planning method, can then be used to update state-action values. We show how VaST can rapidly learn to maximize reward in tasks like 3D navigation and efficiently adapt to sudden changes in rewards or transition probabilities.

* Accepted at ICML 2018; camera-ready version
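
For readers unfamiliar with planning on a tabular model, the sketch below shows plain prioritized sweeping on a small deterministic chain. It is an illustration under stated assumptions: the five-state `model`, the function name, and all parameters are hypothetical, it uses ordinary full backups rather than the small-backups variant named in the abstract, and it does not include VaST's mapping from observations to abstract states.

```python
import heapq


def prioritized_sweeping(model, gamma=0.9, theta=1e-6, max_backups=1000):
    """Plan on a deterministic tabular model {(state, action): (reward, next_state)}."""
    q = {sa: 0.0 for sa in model}

    # Index the actions available in each state and the predecessors of each state.
    actions = {}
    predecessors = {}
    for (s, a), (_, s_next) in model.items():
        actions.setdefault(s, []).append(a)
        predecessors.setdefault(s_next, set()).add((s, a))

    def state_value(s):
        # States without modelled actions are treated as terminal (value 0).
        return max((q[(s, a)] for a in actions.get(s, [])), default=0.0)

    def backup_error(sa):
        reward, s_next = model[sa]
        return reward + gamma * state_value(s_next) - q[sa]

    # Max-priority queue via negated priorities (heapq is a min-heap).
    queue = []
    for sa in model:
        priority = abs(backup_error(sa))
        if priority > theta:
            heapq.heappush(queue, (-priority, sa))

    backups = 0
    while queue and backups < max_backups:
        _, sa = heapq.heappop(queue)
        q[sa] += backup_error(sa)  # full backup; stale entries change nothing
        backups += 1
        # The value of state sa[0] may have changed: revisit its predecessors.
        for pred in predecessors.get(sa[0], ()):
            priority = abs(backup_error(pred))
            if priority > theta:
                heapq.heappush(queue, (-priority, pred))
    return q, backups


# Hypothetical chain: states 0..4, action 0 moves left, action 1 moves right.
# Entering the terminal state 4 yields reward 1.
model = {}
for s in range(4):
    model[(s, 0)] = (0.0, max(s - 1, 0))
    model[(s, 1)] = (1.0 if s + 1 == 4 else 0.0, s + 1)

q_values, n_backups = prioritized_sweeping(model)
greedy = {s: max((0, 1), key=lambda a: q_values[(s, a)]) for s in range(4)}
print(greedy, n_backups)
```

Because backups are ordered by the size of the pending value change, the reward at the end of the chain propagates backwards through its predecessors without sweeping over state-action pairs whose values cannot change yet.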
