Abstract:The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots. The three proposals should act as white papers for any researcher to take and build upon.
Abstract:Patients managing a complex illness such as cancer face a complex information challenge where they not only must learn about their illness but also how to manage it. Close interaction with healthcare experts (radiologists, oncologists) can improve patient learning and thereby, their disease outcome. However, this approach is resource intensive and takes expert time away from other critical tasks. Given the recent advancements in Generative AI models aimed at improving the healthcare system, our work investigates whether and how generative visual question answering systems can responsibly support patient information needs in the context of radiology imaging data. We conducted a formative need-finding study in which participants discussed chest computed tomography (CT) scans and associated radiology reports of a fictitious close relative with a cardiothoracic radiologist. Using thematic analysis of the conversation between participants and medical experts, we identified commonly occurring themes across interactions, including clarifying medical terminology, locating the problems mentioned in the report in the scanned image, understanding disease prognosis, discussing the next diagnostic steps, and comparing treatment options. Based on these themes, we evaluated two state-of-the-art generative visual language models against the radiologist's responses. Our results reveal variability in the quality of responses generated by the models across various themes. We highlight the importance of patient-facing generative AI systems to accommodate a diverse range of conversational themes, catering to the real-world informational needs of patients.
Abstract:Model-based reasoning agents are ill-equipped to act in novel situations in which their model of the environment no longer sufficiently represents the world. We propose HYDRA - a framework for designing model-based agents operating in mixed discrete-continuous worlds, that can autonomously detect when the environment has evolved from its canonical setup, understand how it has evolved, and adapt the agents' models to perform effectively. HYDRA is based upon PDDL+, a rich modeling language for planning in mixed, discrete-continuous environments. It augments the planning module with visual reasoning, task selection, and action execution modules for closed-loop interaction with complex environments. HYDRA implements a novel meta-reasoning process that enables the agent to monitor its own behavior from a variety of aspects. The process employs a diverse set of computational methods to maintain expectations about the agent's own behavior in an environment. Divergences from those expectations are useful in detecting when the environment has evolved and identifying opportunities to adapt the underlying models. HYDRA builds upon ideas from diagnosis and repair and uses a heuristics-guided search over model changes such that they become competent in novel conditions. The HYDRA framework has been used to implement novelty-aware agents for three diverse domains - CartPole++ (a higher dimension variant of a classic control problem), Science Birds (an IJCAI competition problem), and PogoStick (a specific problem domain in Minecraft). We report empirical observations from these domains to demonstrate the efficacy of various components in the novelty meta-reasoning process.
Abstract:This paper studies how a domain-independent planner and combinatorial search can be employed to play Angry Birds, a well established AI challenge problem. To model the game, we use PDDL+, a planning language for mixed discrete/continuous domains that supports durative processes and exogenous events. The paper describes the model and identifies key design decisions that reduce the problem complexity. In addition, we propose several domain-specific enhancements including heuristics and a search technique similar to preferred operators. Together, they alleviate the complexity of combinatorial search. We evaluate our approach by comparing its performance with dedicated domain-specific solvers on a range of Angry Birds levels. The results show that our performance is on par with these domain-specific approaches in most levels, even without using our domain-specific search enhancements.
Abstract:Planning agents are ill-equipped to act in novel situations in which their domain model no longer accurately represents the world. We introduce an approach for such agents operating in open worlds that detects the presence of novelties and effectively adapts their domain models and consequent action selection. It uses observations of action execution and measures their divergence from what is expected, according to the environment model, to infer existence of a novelty. Then, it revises the model through a heuristics-guided search over model changes. We report empirical evaluations on the CartPole problem, a standard Reinforcement Learning (RL) benchmark. The results show that our approach can deal with a class of novelties very quickly and in an interpretable fashion.
Abstract:This demo paper presents the first system for playing the popular Angry Birds game using a domain-independent planner. Our system models Angry Birds levels using PDDL+, a planning language for mixed discrete/continuous domains. It uses a domain-independent PDDL+ planner to generate plans and executes them. In this demo paper, we present the system's PDDL+ model for this domain, identify key design decisions that reduce the problem complexity, and compare the performance of our system to model-specific methods for this domain. The results show that our system's performance is on par with other domain-specific systems for Angry Birds, suggesting the applicability of domain-independent planning to this benchmark AI challenge.
Abstract:Interactive Task Learning (ITL) is an emerging research agenda that studies the design of complex intelligent robots that can acquire new knowledge through natural human teacher-robot learner interactions. ITL methods are particularly useful for designing intelligent robots whose behavior can be adapted by humans collaborating with them. Various research communities are contributing methods for ITL and a large subset of this research is robot-centered with a focus on developing algorithms that can learn online, quickly. This paper studies the ITL problem from a human-centered perspective to provide guidance for robot design so that human teachers can interact with ITL robots naturally. In this paper, we present 1) a cognitive task analysis of an interactive teaching study (N=10) that extracts and classify various actions intended and executed by human teachers when teaching a robot; 2) in-depth discussion of the teaching approach employed by two participants to understand the need for personal adaptation to individual styles; and 3) requirements for ITL robot design based on our analyses informed by plan-based theories of dialogue, specifically SharedPlans.
Abstract:We propose a new long-term declarative memory for Soar that leverages the computational models of analogical reasoning and generalization. We situate our research in interactive task learning (ITL) and embodied language processing (ELP). We demonstrate that the learning methods implemented in the proposed memory can quickly learn a diverse types of novel concepts that are useful in task execution. Our approach has been instantiated in an implemented hybrid AI system AILEEN and evaluated on a simulated robotic domain.
Abstract:Our research aims to develop intelligent collaborative agents that are human-aware - they can model, learn, and reason about their human partner's physiological, cognitive, and affective states. In this paper, we study how adaptive coaching interactions can be designed to help people develop sustainable healthy behaviors. We leverage the common model of cognition - CMC [26] - as a framework for unifying several behavior change theories that are known to be useful in human-human coaching. We motivate a set of interactive system desiderata based on the CMC-based view of behavior change. Then, we propose PARCoach - an interactive system that addresses the desiderata. PARCoach helps a trainee pick a relevant health goal, set an implementation intention, and track their behavior. During this process, the trainee identifies a specific goal-directed behavior as well as the situational context in which they will perform it. PARCcoach uses this information to send notifications to the trainee, reminding them of their chosen behavior and the context. We report the results from a 4-week deployment with 60 participants. Our results support the CMC-based view of behavior change and demonstrate that the desiderata for proposed interactive system design is useful in producing behavior change.
Abstract:Our research aims to develop interactive, social agents that can coach people to learn new tasks, skills, and habits. In this paper, we focus on coaching sedentary, overweight individuals (i.e., trainees) to exercise regularly. We employ adaptive goal setting in which the intelligent health coach generates, tracks, and revises personalized exercise goals for a trainee. The goals become incrementally more difficult as the trainee progresses through the training program. Our approach is model-based - the coach maintains a parameterized model of the trainee's aerobic capability that drives its expectation of the trainee's performance. The model is continually revised based on trainee-coach interactions. The coach is embodied in a smartphone application, NutriWalking, which serves as a medium for coach-trainee interaction. We adopt a task-centric evaluation approach for studying the utility of the proposed algorithm in promoting regular aerobic exercise. We show that our approach can adapt the trainee program not only to several trainees with different capabilities, but also to how a trainee's capability improves as they begin to exercise more. Experts rate the goals selected by the coach better than other plausible goals, demonstrating that our approach is consistent with clinical recommendations. Further, in a 6-week observational study with sedentary participants, we show that the proposed approach helps increase exercise volume performed each week.