Abstract: Recent advanced large language models (LLMs) have showcased an emergent capability for in-context learning, which enables intelligent decision-making through natural language prompts without retraining. This new machine learning paradigm has shown promise in various fields, including general control and optimization problems. Inspired by these advancements, we explore the potential of LLMs for a specific and essential engineering task: parametric shape optimization (PSO). We develop an optimization framework, LLM-PSO, that leverages an LLM to determine the optimal shape of parameterized engineering designs in the spirit of evolutionary strategies. Using the ``Claude 3.5 Sonnet'' LLM, we evaluate LLM-PSO on two benchmark flow optimization problems, aiming to identify drag-minimizing profiles for 1) a two-dimensional airfoil in laminar flow, and 2) a three-dimensional axisymmetric body in Stokes flow. In both cases, LLM-PSO successfully identifies optimal shapes in agreement with benchmark solutions. It also generally converges faster than classical optimization algorithms. Our preliminary exploration may inspire further investigations into harnessing LLMs for shape optimization and engineering design more broadly.
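The abstract above describes LLM-PSO only as an LLM-driven search "in the spirit of evolutionary strategies" and does not give the algorithm itself. As a rough illustration of that general idea, the following minimal sketch shows an evaluate-and-propose loop in which a proposer sees the history of (parameters, cost) pairs and suggests new candidates. Everything in it is an assumption made for illustration: the names `optimize`, `gaussian_proposer`, and `toy_drag` are hypothetical, the cost is a toy quadratic rather than a flow solver, and the proposer is a plain Gaussian mutation standing in for the LLM call.

```python
import random
from typing import Callable, List, Tuple

Candidate = List[float]
History = List[Tuple[Candidate, float]]
Proposer = Callable[[History, int, random.Random], List[Candidate]]


def gaussian_proposer(history: History, n_new: int, rng: random.Random,
                      sigma: float = 0.2) -> List[Candidate]:
    """Hypothetical stand-in for the LLM: mutate the best designs seen so far."""
    elites = sorted(history, key=lambda item: item[1])[:3]
    proposals = []
    for _ in range(n_new):
        parent, _cost = rng.choice(elites)
        proposals.append([x + rng.gauss(0.0, sigma) for x in parent])
    return proposals


def optimize(cost: Callable[[Candidate], float], initial: List[Candidate],
             proposer: Proposer, generations: int = 30, n_new: int = 8,
             seed: int = 0) -> Tuple[Candidate, float]:
    """Repeatedly ask the proposer for candidates, evaluate them, keep a history."""
    rng = random.Random(seed)
    history: History = [(p, cost(p)) for p in initial]
    for _ in range(generations):
        for candidate in proposer(history, n_new, rng):
            history.append((candidate, cost(candidate)))
    # Return the lowest-cost design evaluated so far.
    return min(history, key=lambda item: item[1])


def toy_drag(params: Candidate) -> float:
    """Assumed surrogate cost with a known minimum at (1.0, -0.5)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 0.5) ** 2


best_params, best_cost = optimize(toy_drag, [[0.0, 0.0], [2.0, 1.0]],
                                  gaussian_proposer)
print(best_params, best_cost)
```

In an LLM-based variant, `gaussian_proposer` would presumably be replaced by a function that formats the history into a prompt and parses the model's suggested parameters. How LLM-PSO actually does this is not stated in the abstract.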
Abstract: Recently, the concept of embodied intelligence has been widely accepted and popularized, leading people to naturally consider the potential for commercialization in this field. In this work, we propose a simulation of a specific commercial scenario: human-centered in-building embodied delivery. For this scenario, we have developed a brand-new virtual environment system from scratch, constructing a multi-level connected building space modeled after a polar research station. This environment also includes autonomous human characters and robots with grasping and mobility capabilities, as well as a large number of interactive items. Based on this environment, we have built a delivery dataset containing 13k language instructions to guide robots in providing services. We simulate human behavior through human characters and sample their various needs in daily life. Finally, we propose a method centered around a large multimodal model to serve as the baseline system for this dataset. Compared with prior work on embodied data, our work focuses on a virtual environment centered around human-robot interaction for commercial scenarios. We believe this will bring new perspectives and exploration angles to the embodied community.
Abstract: Machine learning and artificial intelligence have recently become a popular paradigm for designing and optimizing robotic systems across various scales. Recent studies have showcased the innovative application of large language models (LLMs) in industrial control [1] and in directing legged walking robots [2]. In this study, we utilize an LLM, GPT-4, to train two prototypical microrobots for swimming in viscous fluids. Adopting a few-shot learning approach, we develop a minimal, unified prompt composed of only five sentences. The same concise prompt successfully guides two distinct articulated microrobots -- the three-link swimmer and the three-sphere swimmer -- in mastering their signature strokes. These strokes, initially conceptualized by physicists, are now effectively interpreted and applied by the LLM, enabling the microrobots to circumvent the physical constraints inherent to micro-locomotion. Remarkably, our LLM-based decision-making strategy substantially surpasses a traditional reinforcement learning method in terms of training speed. We discuss the nuanced aspects of prompt design, with particular emphasis on reducing the monetary cost of using GPT-4.