Sharath Chandra Raparthy

Teaching Large Language Models to Reason with Reinforcement Learning

Mar 07, 2024
Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Via arXiv

Generalization to New Sequential Decision Making Tasks with In-Context Learning

Dec 06, 2023
Via arXiv

Multi-Objective GFlowNets

Oct 23, 2022
Via arXiv

Continual Learning In Environments With Polynomial Mixing Times

Dec 13, 2021
Via arXiv

Compositional Attention: Disentangling Search and Retrieval

Oct 18, 2021
Via arXiv

Curriculum in Gradient-Based Meta-Reinforcement Learning

Feb 19, 2020
Via arXiv

Generating Automatic Curricula via Self-Supervised Active Domain Randomization

Feb 18, 2020
Via arXiv