Pierluca D'Oro

MaestroMotif: Skill Design from Artificial Intelligence Feedback
Dec 11, 2024

Controlling Large Language Model Agents with Entropic Activation Steering
Jun 01, 2024

The Curse of Diversity in Ensemble-Based Exploration
May 07, 2024

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons
Mar 12, 2024

Do Transformer World Models Give Better Policy Gradients?
Feb 11, 2024

Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Sep 29, 2023

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Sep 26, 2023

The Primacy Bias in Deep Reinforcement Learning
May 16, 2022

Policy Optimization as Online Learning with Mediator Feedback
Dec 15, 2020

How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization
Apr 29, 2020