Abstract:Large language models are capable of displaying a wide range of abilities that are not directly connected with the task for which they are trained: predicting the next words of human-written texts. In this article, I discuss the nature of this indirect acquisition process and its relation to other known indirect processes. I argue that an important side effect of such indirect acquisition is the development of integrated abilities. I discuss the extent to which the abilities developed by large language models are predictable. Finally, I briefly discuss the relation between the cognitive skills acquired by these systems and human cognition.
Abstract:Exposing evolving robots to variable conditions is necessary to obtain solutions that are robust to environmental variations and that can cross the reality gap. However, we do not yet have methods for analyzing and understanding the impact of environmental variations on the evolutionary process, and therefore for choosing suitable variation ranges. In this article we introduce a method that permits us to measure the impact of environmental variations, and we analyze the relation between the amplitude of the variations, the modality with which they are introduced, and the performance and robustness of the evolving agents. Our results demonstrate that (i) the evolutionary algorithm can tolerate environmental variations that have a very high impact, (ii) variations affecting the actions of the agent are tolerated much better than variations affecting the initial state of the agent or of the environment, and (iii) improving the accuracy of the fitness measure through multiple evaluations is not always useful. Moreover, our results show that environmental variations permit the generation of solutions that perform better in both varying and non-varying environments.
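As a rough illustration of how the impact of a variation range could be quantified, the sketch below evaluates a fixed parameter vector with and without sampled variations and reports the resulting fitness drop and spread. The `evaluate` function, the uniform sampling, and the impact metric are placeholders chosen for illustration, not the measure defined in the article.

```python
import numpy as np

def evaluate(params, variation):
    """Placeholder fitness: stands in for an episode in which the agent
    defined by 'params' is evaluated under the sampled 'variation'."""
    return -float(np.sum((params - variation) ** 2))

def variation_impact(params, amplitude, n_samples=100, seed=0):
    """Estimate the impact of variations of a given amplitude as the average
    fitness drop (and its spread) relative to the non-varying environment."""
    rng = np.random.default_rng(seed)
    baseline = evaluate(params, np.zeros_like(params))
    varied = np.array([
        evaluate(params, rng.uniform(-amplitude, amplitude, size=params.shape))
        for _ in range(n_samples)
    ])
    return baseline - varied.mean(), varied.std()

params = np.zeros(4)
for amplitude in (0.1, 0.5, 1.0):
    drop, spread = variation_impact(params, amplitude)
    print(f"amplitude={amplitude}: mean fitness drop={drop:.3f}, std={spread:.3f}")
```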
Abstract:In this paper we analyze the qualitative differences between evolutionary strategies and reinforcement learning algorithms by focusing on two popular state-of-the-art algorithms: the OpenAI-ES evolutionary strategy and the Proximal Policy Optimization (PPO) reinforcement learning algorithm -- the most similar methods of the two families. We analyze how the methods differ with respect to: (i) general efficacy, (ii) ability to cope with sparse rewards, (iii) propensity/capacity to discover minimal solutions, (iv) dependency on reward shaping, and (v) ability to cope with variations of the environmental conditions. The analysis of the performance and of the behavioral strategies displayed by the agents trained with the two methods on benchmark problems enables us to demonstrate qualitative differences that were not identified in previous studies, to identify the relative weaknesses of the two methods, and to propose ways to ameliorate some of those weaknesses. We show that the characteristics of the reward function have a strong impact that varies qualitatively not only for OpenAI-ES and PPO but also for alternative reinforcement learning algorithms, thus demonstrating the importance of tailoring the characteristics of the reward function to the algorithm used.
Abstract:We demonstrate how an evolutionary algorithm can be extended with a curriculum learning process that automatically selects the environmental conditions in which the evolving agents are evaluated. The environmental conditions are selected so as to adjust the level of difficulty to the ability level of the current evolving agents and to challenge their weaknesses. The method does not require domain knowledge and does not introduce additional hyperparameters. The results collected on two benchmark problems, which require solving a task in significantly varying environmental conditions, demonstrate that the proposed method outperforms conventional algorithms and generates solutions that are robust to variations.
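The following sketch illustrates one possible way of selecting evaluation conditions whose difficulty matches the current agents: the conditions on which the agent currently performs worst are preferred, so evaluation keeps challenging its weaknesses. The `episode_return` function and the worst-first criterion are assumptions made for illustration, not the selection rule used in the paper.

```python
import numpy as np

def episode_return(agent, condition):
    """Placeholder: evaluate 'agent' in one episode under 'condition'."""
    return float(-np.linalg.norm(agent - condition))

def select_conditions(agent, candidate_conditions, n_select):
    """Pick the conditions on which the current agent performs worst,
    one possible criterion for targeting its weaknesses."""
    scores = np.array([episode_return(agent, c) for c in candidate_conditions])
    worst_first = np.argsort(scores)            # lowest return first
    return [candidate_conditions[i] for i in worst_first[:n_select]]

rng = np.random.default_rng(0)
agent = rng.normal(size=3)
candidates = [rng.uniform(-1, 1, size=3) for _ in range(20)]
curriculum = select_conditions(agent, candidates, n_select=5)
fitness = np.mean([episode_return(agent, c) for c in curriculum])
print(f"fitness on the selected conditions: {fitness:.3f}")
```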
Abstract:We introduce a method that permits co-evolving the body and the control properties of robots. It can be used to adapt the morphological traits of robots with a hand-designed morphological bauplan or to evolve the morphological bauplan as well. Our results indicate that robots with co-adapted body and control traits outperform robots with fixed hand-designed morphologies. Interestingly, the advantage is not due to the selection of better morphologies but rather to the mutual scaffolding process that results from the possibility of co-adapting the morphological traits to the control traits and vice versa. Our results also demonstrate that morphological variations do not necessarily have destructive effects on robot skills.
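A minimal sketch of the co-adaptation idea: morphological and control parameters are encoded in a single genotype and mutated together in a simple (1+1) evolutionary loop, so that body and brain can scaffold each other. The parameter split and the `evaluate` function are placeholders, not the encoding used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def split(genotype, n_morph):
    """The genotype concatenates morphological and control parameters."""
    return genotype[:n_morph], genotype[n_morph:]

def evaluate(genotype, n_morph):
    """Placeholder fitness: stands in for building a robot with the encoded
    morphology, running its controller, and scoring the resulting behavior."""
    morphology, controller = split(genotype, n_morph)
    return -float(np.sum(morphology ** 2) + np.sum((controller - morphology.mean()) ** 2))

n_morph, n_ctrl = 4, 8
parent = rng.normal(size=n_morph + n_ctrl)
parent_fit = evaluate(parent, n_morph)
for generation in range(200):
    child = parent + rng.normal(scale=0.1, size=parent.shape)  # body and brain mutate together
    child_fit = evaluate(child, n_morph)
    if child_fit >= parent_fit:
        parent, parent_fit = child, child_fit
print(f"best fitness: {parent_fit:.3f}")
```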
Abstract:As discussed in previous studies, the efficacy of evolutionary or reinforcement learning algorithms for continuous control optimization can be enhanced by including a neural module dedicated to feature extraction, trained through self-supervised methods. In this paper we report additional experiments supporting this hypothesis and demonstrate that the advantage provided by feature extraction is not limited to problems that benefit from dimensionality reduction or that involve agents operating on the basis of allocentric perception. We introduce a method that permits continuing the training of the feature-extraction module during the training of the policy network and that increases the efficacy of feature extraction. Finally, we compare alternative feature-extraction methods and show that sequence-to-sequence learning yields better results than the methods considered in previous studies.
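The sketch below illustrates, with assumed names and a plain autoencoder objective rather than the sequence-to-sequence variant discussed in the paper, how the training of a feature-extraction module could be continued while the policy is being trained: self-supervised updates on freshly collected observations are interleaved with policy updates that operate on the extracted features.

```python
import torch
import torch.nn as nn

obs_dim, feat_dim = 32, 8

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
decoder = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, obs_dim))
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

def collect_observations(n=256):
    """Placeholder for observations gathered while evaluating the current policy."""
    return torch.randn(n, obs_dim)

def update_policy(features):
    """Placeholder for one policy-optimization step (evolutionary or RL)
    that receives the extracted features instead of the raw observations."""
    pass

for iteration in range(100):
    obs = collect_observations()
    # Self-supervised update of the feature extractor, interleaved with policy training
    reconstruction = decoder(encoder(obs))
    loss = nn.functional.mse_loss(reconstruction, obs)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # The policy is trained on features computed by the (still improving) extractor
    with torch.no_grad():
        update_policy(encoder(obs))
```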
Abstract:We analyze the efficacy of modern neuro-evolutionary strategies for continuous control optimization. Overall, the results collected on a wide variety of qualitatively different benchmark problems indicate that these methods are generally effective and scale well with respect to the number of parameters and the complexity of the problem. We demonstrate the importance of using suitable fitness functions or reward criteria, since functions that are optimal for reinforcement learning algorithms tend to be sub-optimal for evolutionary strategies and vice versa. Finally, we provide an analysis of the role of hyper-parameters that demonstrates the importance of normalization techniques, especially in complex problems.
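As an example of the kind of normalization technique whose importance the analysis highlights, the sketch below implements a simple running observation normalizer based on Welford's online estimates; the details are illustrative and are not the exact scheme used in the experiments.

```python
import numpy as np

class RunningNorm:
    """Online observation normalization: keeps running estimates of the mean
    and variance of each observation component and rescales incoming data."""
    def __init__(self, dim):
        self.count = 1e-4
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)

    def update(self, obs):
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)

    def __call__(self, obs):
        self.update(obs)
        std = np.sqrt(self.m2 / self.count) + 1e-8
        return (obs - self.mean) / std

norm = RunningNorm(dim=3)
for obs in np.random.default_rng(0).normal(5.0, 2.0, size=(1000, 3)):
    normalized = norm(obs)
print(norm.mean, np.sqrt(norm.m2 / norm.count))  # estimated mean ~5, std ~2
```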
Abstract:The possibility of using competitive evolutionary algorithms to generate long-term progress is normally prevented by convergence on limit-cycle dynamics, in which the evolving agents keep progressing against their current competitors by periodically rediscovering previously adopted solutions. This leads to local progress but not to global progress, i.e. progress against all possible competitors. We propose a new competitive algorithm capable of producing long-term global progress thanks to its ability to identify and filter out opportunistic variations, i.e. variations leading to progress against current competitors and to retrogression against other competitors. The efficacy of the method is validated on the co-evolution of predator and prey robots, a classic scenario that has been used in related research. The accumulation of global progress over many generations leads to effective solutions that involve the production of rather articulated behaviors. The complexity of the behavior displayed by the evolving robots tends to increase across generations, although progress in performance is not always accompanied by behavioral complexification.
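A minimal sketch of the filtering idea: a candidate variation is retained only if it progresses against the current competitors without regressing against a sample of previously encountered competitors. The `score` function, the archive construction, and the acceptance test are placeholders for illustration; they do not reproduce the algorithm's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(predator, prey):
    """Placeholder for a tournament: the predator's performance against a prey."""
    return float(-np.linalg.norm(predator - prey))

def mean_score(predator, opponents):
    return float(np.mean([score(predator, p) for p in opponents]))

def accept_variation(parent, child, current_opponents, archive):
    """Retain a variation only if it progresses against the current competitors
    and does not regress against previously encountered competitors."""
    progresses = mean_score(child, current_opponents) > mean_score(parent, current_opponents)
    no_regression = mean_score(child, archive) >= mean_score(parent, archive)
    return progresses and no_regression

predator = rng.normal(size=4)
current_opponents = [rng.normal(size=4) for _ in range(5)]
archive = [rng.normal(size=4) for _ in range(20)]     # competitors from past generations
child = predator + rng.normal(scale=0.1, size=4)      # candidate variation
if accept_variation(predator, child, current_opponents, archive):
    predator = child
```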
Abstract:We demonstrate how the efficiency of the Cartesian Genetic Programming method can be scaled up through the preferential selection of phenotypically larger solutions, i.e. through the preferential selection of larger solutions among equally good solutions. The advantage of the preferential selection of larger solutions is validated on the six, seven, and eight-bit parity problems, on a dynamically varying problem involving the classification of binary patterns, and on the Pagie regression problem. In all cases, the preferential selection of larger solutions provides an advantage in terms of the performance of the evolved solutions and in terms of speed, i.e. the number of evaluations required to evolve optimal or high-quality solutions. The advantage provided by the preferential selection of larger solutions can be further extended by self-adapting the mutation rate through the one-fifth success rule. Finally, for problems like the Pagie regression, in which neutrality plays a minor role, the advantage of the preferential selection of larger solutions can be extended by preferring larger solutions also among quasi-neutral alternative candidate solutions, i.e. solutions achieving slightly different performance.
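The sketch below illustrates the selection rule in a generic (1+λ) loop: offspring that are at least as fit as the parent are retained, ties are broken in favor of the phenotypically larger solution, and the mutation rate is self-adapted with the one-fifth success rule. The `fitness` and `phenotype_size` functions are placeholders (in Cartesian Genetic Programming the size would be the number of active nodes); this is not an actual CGP implementation.

```python
import random

def fitness(genotype):
    """Placeholder task score (e.g. correct outputs on a parity problem);
    many genotypes share the same score, which creates neutrality."""
    return sum(genotype[::2]) // 2

def phenotype_size(genotype):
    """Placeholder for the number of active nodes of the encoded program."""
    return sum(genotype)

def mutate(genotype, rate):
    return [1 - g if random.random() < rate else g for g in genotype]

random.seed(0)
parent = [random.randint(0, 1) for _ in range(60)]
rate, offspring_per_gen = 0.05, 4
for generation in range(500):
    parent_fit = fitness(parent)
    offspring = [mutate(parent, rate) for _ in range(offspring_per_gen)]
    # One-fifth success rule: raise the mutation rate when more than 1/5 of the
    # offspring improve on the parent, lower it otherwise.
    successes = sum(fitness(o) > parent_fit for o in offspring)
    rate = min(0.5, rate * 1.1) if successes > offspring_per_gen / 5 else max(0.001, rate * 0.9)
    # Keep offspring that are at least as good as the parent; among equally good
    # candidates, prefer the phenotypically larger one.
    candidates = [parent] + [o for o in offspring if fitness(o) >= parent_fit]
    best = max(fitness(c) for c in candidates)
    parent = max((c for c in candidates if fitness(c) == best), key=phenotype_size)
print(fitness(parent), phenotype_size(parent))
```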
Abstract:We propose a method for evolving solutions that are robust with respect to variations of the environmental conditions (i.e. that can operate effectively in new conditions immediately, without the need to adapt to variations). The results obtained show that the proposed method is effective and computationally tractable. It permits improving performance on an extended version of the double-pole balancing problem, outperforming the best available human-designed controllers on a car racing problem, and generating rather effective solutions for a swarm robotic problem. The comparison of different algorithms indicates that the CMA-ES and xNES methods, which operate by optimizing a distribution of parameters, represent the best options for the evolution of robust neural network controllers.
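One plausible way of implementing the robust-fitness idea is sketched below: each candidate controller is evaluated over several episodes with randomly varied environmental conditions, and the average return is optimized with CMA-ES. The sketch assumes the pycma package (`cma`) and uses a placeholder `episode_return` in place of the actual benchmark episodes; it is an illustration, not the paper's implementation.

```python
import numpy as np
import cma  # pycma package, assumed to be installed

rng = np.random.default_rng(0)

def episode_return(params, condition):
    """Placeholder for one evaluation episode run under a sampled
    environmental condition (initial state, pole length, track, ...)."""
    return -float(np.sum((params - condition) ** 2))

def robust_fitness(params, n_episodes=10):
    """Average return over several episodes with randomly varied conditions:
    the quantity that is optimized to obtain robust controllers."""
    conditions = rng.uniform(-1.0, 1.0, size=(n_episodes, len(params)))
    return float(np.mean([episode_return(params, c) for c in conditions]))

es = cma.CMAEvolutionStrategy(8 * [0.0], 0.5, {'maxiter': 100, 'verbose': -9})
while not es.stop():
    solutions = es.ask()
    # pycma minimizes, so the (to-be-maximized) robust fitness is negated
    es.tell(solutions, [-robust_fitness(np.asarray(x)) for x in solutions])
es.result_pretty()
```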