David Mguni

All Language Models Large and Small

Feb 19, 2024

ChessGPT: Bridging Policy Learning and Language Modeling

Jun 15, 2023

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

Mar 02, 2023

Semi-Centralised Multi-Agent Reinforcement Learning with Policy-Embedded Training

Sep 02, 2022

Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints

Jun 06, 2022

SEREN: Knowing When to Explore and When to Exploit

May 30, 2022

On the Convergence of Fictitious Play: A Decomposition Approach

May 03, 2022

SAUTE RL: Almost Surely Safe Reinforcement Learning Using State Augmentation

Feb 16, 2022

DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention

Oct 27, 2021

Learning to Shape Rewards using a Game of Switching Controls

Mar 16, 2021