Abstract:An (artificial cardiac) pacemaker is an implantable electronic device that sends electrical impulses to the heart to regulate the heartbeat. As the number of pacemaker users continues to rise, so does the demand for features with additional sensors, adaptability, and improved battery performance. Reinforcement learning (RL) has recently been proposed as a performant algorithm for creative design space exploration, adaptation, and statistical verification of cardiac pacemakers. The design of correct reward functions, expressed as a reward machine, is a key programming activity in this process. In 2007, Boston Scientific published a detailed description of their pacemaker specifications. This document has since formed the basis for several formal characterizations of pacemaker specifications using real-time automata and logic. However, because these translations are done manually, they are challenging to verify. Moreover, capturing requirements in automata or logic is notoriously difficult. We posit that it is significantly easier for domain experts, such as electrophysiologists, to observe and identify abnormalities in electrocardiograms that correspond to patient-pacemaker interactions. Therefore, we explore the possibility of learning correctness specifications from such labeled demonstrations in the form of a reward machine and training an RL agent to synthesize a cardiac pacemaker based on the resulting reward machine. We leverage advances in machine learning to extract signals from labeled demonstrations as reward machines using recurrent neural networks and transformer architectures. These reward machines are then used to design a simple pacemaker with RL. Finally, we validate the resulting pacemaker using properties extracted from the Boston Scientific document.
Abstract:This paper investigates whether recent advances in Large Language Models (LLMs) can assist in translating human explanations into a format that can robustly support learning Linear Temporal Logic (LTL) from demonstrations. Both LLMs and optimization-based methods can extract LTL specifications from demonstrations; however, they have distinct limitations. LLMs can quickly generate solutions and incorporate human explanations, but their lack of consistency and reliability hampers their applicability in safety-critical domains. On the other hand, optimization-based methods do provide formal guarantees but cannot process natural language explanations and face scalability challenges. We present a principled approach to combining LLMs and optimization-based methods to faithfully translate human explanations and demonstrations into LTL specifications. We have implemented a tool called Janaka based on our approach. Our experiments demonstrate the effectiveness of combining explanations with demonstrations in learning LTL specifications through several case studies.