Aditi Mavalankar

Open-Endedness is Essential for Artificial Superhuman Intelligence

Jun 06, 2024

Edward Hughes, Michael Dennis, Jack Parker-Holder, Feryal Behbahani, Aditi Mavalankar, Yuge Shi, Tom Schaul, Tim Rocktäschel

Abstract: In recent years there has been a tremendous surge in the general capabilities of AI systems, mainly fuelled by training foundation models on internet-scale data. Nevertheless, the creation of open-ended, ever self-improving AI remains elusive. In this position paper, we argue that the ingredients are now in place to achieve open-endedness in AI systems with respect to a human observer. Furthermore, we claim that such open-endedness is an essential property of any artificial superhuman intelligence (ASI). We begin by providing a concrete formal definition of open-endedness through the lens of novelty and learnability. We then illustrate a path towards ASI via open-ended systems built on top of foundation models, capable of making novel, human-relevant discoveries. We conclude by examining the safety implications of generally-capable open-ended AI. We expect that open-ended foundation models will prove to be an increasingly fertile and safety-critical area of research in the near future.

Genie: Generative Interactive Environments

Feb 23, 2024

Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps (+15 more)

Abstract: We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described through text, synthetic images, photographs, and even sketches. At 11B parameters, Genie can be considered a foundation world model. It comprises a spatiotemporal video tokenizer, an autoregressive dynamics model, and a simple and scalable latent action model. Genie enables users to act in the generated environments on a frame-by-frame basis despite training without any ground-truth action labels or other domain-specific requirements typically found in the world model literature. Further, the resulting learned latent action space facilitates training agents to imitate behaviors from unseen videos, opening the path for training generalist agents of the future.

* https://sites.google.com/corp/view/genie-2024/

Provably Efficient Model-based Policy Adaptation

Jun 14, 2020

Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao

Abstract: The high sample complexity of reinforcement learning challenges its use in practice. A promising approach is to quickly adapt pre-trained policies to new environments. Existing methods for this policy adaptation problem typically rely on domain randomization and meta-learning, by sampling from some distribution of target environments during pre-training, and thus face difficulty on out-of-distribution target environments. We propose new model-based mechanisms that are able to make online adaptation in unseen target environments, by combining ideas from no-regret online learning and adaptive control. We prove that the approach learns policies in the target environment that can quickly recover trajectories from the source environment, and establish the rate of convergence in general settings. We demonstrate the benefits of our approach for policy adaptation in a diverse set of continuous control tasks, achieving the performance of state-of-the-art methods with much lower sample complexity.

Goal-conditioned Batch Reinforcement Learning for Rotation Invariant Locomotion

Apr 17, 2020

Aditi Mavalankar

Abstract: We propose a novel approach to learn goal-conditioned policies for locomotion in a batch RL setting. The batch data is collected by a policy that is not goal-conditioned. For the locomotion task, this translates to data collection using a policy learnt by the agent for walking straight in one direction, and using that data to learn a goal-conditioned policy that enables the agent to walk in any direction. The data collection policy used should be invariant to the direction the agent is facing, i.e., regardless of its initial orientation, the agent should take the same actions to walk forward. We exploit this property to learn a goal-conditioned policy using two key ideas: (1) augmenting data by generating trajectories with the same actions in different directions, and (2) learning an encoder that enforces invariance between these rotated trajectories with a Siamese framework. We show that our approach outperforms existing RL algorithms on 3-D locomotion agents like Ant, Humanoid and Minitaur.

* Accepted to the BeTR-RL workshop at ICLR 2020. Link to code: https://github.com/aditimavalankar/gc-batch-rl-locomotion
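
The first idea in the abstract above, reusing the same actions along rotated copies of a trajectory, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a trajectory is stored as planar (x, y) positions with the agent originally walking along +x, and that actions are expressed in the agent's own frame so they stay unchanged under rotation. The names `rotate_xy` and `augment_trajectory` are hypothetical.

```python
import math

def rotate_xy(point, angle):
    """Rotate a 2-D point counter-clockwise about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def augment_trajectory(positions, actions, angles):
    """Build one rotated copy of a trajectory per angle.

    Assumption: actions are body-frame, so each copy reuses them verbatim;
    only the world-frame positions and the goal direction are rotated.
    """
    augmented = []
    for angle in angles:
        augmented.append({
            "positions": [rotate_xy(p, angle) for p in positions],
            "actions": list(actions),
            # Assumed original heading is +x, so the goal direction is +x rotated.
            "goal_direction": rotate_xy((1.0, 0.0), angle),
        })
    return augmented

# Toy data: a straight walk along +x with placeholder action labels.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actions = ["a0", "a1"]
copies = augment_trajectory(positions, actions, [0.0, math.pi / 2])
```

Each entry in `copies` pairs the original actions with a different goal direction, which is the kind of data a goal-conditioned policy could be trained on. The Siamese encoder described as the second idea is not covered by this sketch.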
