Abstract:This paper introduces an automatic affordance reasoning paradigm tailored to minimal semantic inputs, addressing the critical challenges of classifying and manipulating unseen classes of objects in household settings. Inspired by human cognitive processes, our method integrates generative language models and physics-based simulators to foster analytical thinking and creative imagination of novel affordances. Structured with a tripartite framework consisting of analysis, imagination, and evaluation, our system "analyzes" the requested affordance names into interaction-based definitions, "imagines" the virtual scenarios, and "evaluates" the object affordance. If an object is recognized as possessing the requested affordance, our method also predicts the optimal pose for such functionality, and how a potential user can interact with it. Tuned on only a few synthetic examples across 3 affordance classes, our pipeline achieves a very high success rate on affordance classification and functional pose prediction of 8 classes of novel objects, outperforming learning-based baselines. Validation through real robot manipulating experiments demonstrates the practical applicability of the imagined user interaction, showcasing the system's ability to independently conceptualize unseen affordances and interact with new objects and scenarios in everyday settings.
Abstract:This paper presents an innovative approach to integrating Large Language Models (LLMs) with Arduino-controlled Electrohydrodynamic (EHD) pumps for precise color synthesis in automation systems. We propose a novel framework that employs fine-tuned LLMs to interpret natural language commands and convert them into specific operational instructions for EHD pump control. This approach aims to enhance user interaction with complex hardware systems, making it more intuitive and efficient. The methodology involves four key steps: fine-tuning the language model with a dataset of color specifications and corresponding Arduino code, developing a natural language processing interface, translating user inputs into executable Arduino code, and controlling EHD pumps for accurate color mixing. Conceptual experiment results, based on theoretical assumptions, indicate a high potential for accurate color synthesis, efficient language model interpretation, and reliable EHD pump operation. This research extends the application of LLMs beyond text-based tasks, demonstrating their potential in industrial automation and control systems. While highlighting the limitations and the need for real-world testing, this study opens new avenues for AI applications in physical system control and sets a foundation for future advancements in AI-driven automation technologies.