Osher Lerner

Boltzmann State-Dependent Rationality

Apr 26, 2024

Osher Lerner

Abstract: This paper extends existing learned models of human behavior with a measured step toward structured irrationality. Specifically, replacing the suboptimality constant $\beta$ in a Boltzmann rationality model with a function over states, $\beta(s)$, yields natural expressivity in a computationally tractable manner. The paper discusses the relevant mathematical theory, sets up several experimental designs, presents limited preliminary results, and proposes future investigations.
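
The substitution of $\beta(s)$ for a constant $\beta$ can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the toy Q-table, and the convention that $\beta$ multiplies the action values (so that larger $\beta$ means behavior closer to optimal) are all assumptions made for illustration.

```python
import numpy as np

def boltzmann_policy(q_values, beta):
    """Action probabilities proportional to exp(beta * Q(s, a)).

    Assumed convention: beta scales the action values, so beta = 0
    gives uniformly random behavior and large beta approaches argmax.
    """
    logits = beta * np.asarray(q_values, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def state_dependent_policy(q_table, beta_fn, state):
    """Same model, but the coefficient is looked up per state via beta_fn."""
    return boltzmann_policy(q_table[state], beta_fn(state))

# Hypothetical two-state, two-action example with identical Q-values per state.
q_table = np.array([[1.0, 0.0],
                    [1.0, 0.0]])

def beta_fn(state):
    # Hypothetical beta(s): fully random in state 0, near-optimal in state 1.
    return (0.0, 5.0)[state]

policy_state_0 = state_dependent_policy(q_table, beta_fn, 0)
policy_state_1 = state_dependent_policy(q_table, beta_fn, 1)
```

With identical action values in both states, the two policies differ only through `beta_fn`, which is the extra expressivity a state-dependent coefficient would provide under these assumptions.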

Precise Object Placement Using Force-Torque Feedback

Apr 26, 2024

Osher Lerner, Zachary Tam, Michael Equi

Abstract: Precise object manipulation and placement is a common problem for household robots, surgery robots, and robots working on in-situ construction. Prior work using computer vision, depth sensors, and reinforcement learning lacks the ability to reactively recover from planning errors, execution errors, or sensor noise. This work introduces a method that uses force-torque sensing to robustly place objects in stable poses, even in adversarial environments. Over 46 trials, our method achieves success rates of 100% for basic stacking and 17% for cases requiring adjustment.

Natural Gradient Deep Q-learning

Mar 20, 2018

Ethan Knight, Osher Lerner

Abstract: This paper presents findings on training a Q-learning reinforcement learning agent using natural gradient techniques. We compare the original deep Q-network (DQN) algorithm to its natural gradient counterpart (NGDQN), measuring NGDQN and DQN performance on classic control environments without target networks. We find that NGDQN performs favorably relative to DQN, converging to significantly better policies faster and more frequently. These results indicate that natural gradient could be used for value function optimization in reinforcement learning to accelerate and stabilize training.
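
For readers unfamiliar with the term, a generic natural gradient step preconditions the ordinary gradient with the inverse Fisher information matrix. The sketch below shows that general idea only. It is not the NGDQN algorithm from the paper, and the function name, the damping term, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1, damping=1e-3):
    """One generic natural gradient update: theta - lr * F^{-1} grad.

    `fisher` is assumed to be a symmetric positive semi-definite estimate
    of the Fisher information matrix. The damping term (an assumption,
    not from the paper) keeps the linear solve well conditioned.
    """
    theta = np.asarray(theta, dtype=float)
    grad = np.asarray(grad, dtype=float)
    damped = np.asarray(fisher, dtype=float) + damping * np.eye(theta.size)
    direction = np.linalg.solve(damped, grad)  # solves damped @ x = grad
    return theta - lr * direction

# Hypothetical toy problem: the gradient is four times larger along the
# first coordinate, and so is the curvature recorded in the Fisher matrix.
theta = np.zeros(2)
grad = np.array([4.0, 1.0])
fisher = np.diag([4.0, 1.0])

plain_step = theta - 1.0 * grad
natural_step = natural_gradient_step(theta, grad, fisher, lr=1.0, damping=0.0)
```

In this toy case the plain gradient step moves four times further along the first coordinate, while the natural gradient step rescales both coordinates by the Fisher matrix and moves equally along each.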
