Abstract: We introduce Dynamic Neural (DN) SARSA(\lambda), a neural-dynamic algorithm for learning a behavioral sequence from delayed reward. DN-SARSA(\lambda) combines Dynamic Field Theory models of behavioral sequence representation, classical reinforcement learning, and a computational neuroscience model of working memory, called Item and Order working memory, which serves as an eligibility trace. DN-SARSA(\lambda) is implemented on both a simulated and a real robot, each of which must learn a specific rewarding sequence of elementary behaviors through exploration. Results show that DN-SARSA(\lambda) performs on par with discrete SARSA(\lambda), demonstrating that general reinforcement learning is feasible without compromising neural dynamics.
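
For orientation, the discrete SARSA(\lambda) baseline mentioned above can be sketched as a single tabular update step. This is a minimal illustration of the standard textbook algorithm with accumulating eligibility traces, not the DN-SARSA(\lambda) implementation. The function name `sarsa_lambda_step`, the array layout, and the parameter values are assumptions chosen for the example.

```python
import numpy as np

def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next, done,
                      alpha=0.5, gamma=0.9, lam=0.8):
    """One tabular SARSA(lambda) update (hypothetical helper, illustrative only).

    Q and E are (n_states, n_actions) arrays, modified in place.
    E is the eligibility trace: it records which state-action pairs were
    visited recently, so a delayed reward can be credited to them.
    """
    # TD target: bootstrap from the next state-action pair unless the episode ended.
    target = r if done else r + gamma * Q[s_next, a_next]
    delta = target - Q[s, a]

    # Accumulating trace: mark the pair that was just visited.
    E[s, a] += 1.0

    # Every pair is updated in proportion to its current eligibility.
    Q += alpha * delta * E

    # Traces decay, so older pairs receive less credit on later steps.
    E *= gamma * lam
    return delta

# Usage: a single terminal transition with reward 1 on a 2-state, 2-action table.
Q = np.zeros((2, 2))
E = np.zeros((2, 2))
delta = sarsa_lambda_step(Q, E, s=0, a=1, r=1.0, s_next=1, a_next=0, done=True)
```

In this discrete form the trace is an explicit table that decays by `gamma * lam` each step. In the approach described in the abstract, that role is played by the Item and Order working memory instead.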