Stephanie M. Lukin

SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus

Nov 19, 2024

Stephanie M. Lukin, Claire Bonial, Matthew Marge, Taylor Hudson, Cory J. Hayes, Kimberly A. Pollard, Anthony Baker, Ashley N. Foots, Ron Artstein, Felix Gervits (+7 more)

Abstract: We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker's intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research in open questions of how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.

* 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) https://aclanthology.org/2024.lrec-main.1259/
* 14 pages, 7 figures
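
The corpus totals quoted in the SCOUT abstract can be checked with a few lines of arithmetic. The sketch below uses only the three counts stated in the abstract; the helper name `per_unit` is hypothetical, and the words-per-utterance figure is derived here rather than reported by the authors.

```python
# Corpus totals as stated in the SCOUT abstract.
UTTERANCES = 89_056
WORDS = 310_095
DIALOGUES = 278


def per_unit(total: int, units: int) -> float:
    """Return the mean number of `total` items per unit (hypothetical helper)."""
    return total / units


# Stated in the abstract as "averaging 320 utterances per dialogue".
utterances_per_dialogue = per_unit(UTTERANCES, DIALOGUES)

# Derived for illustration only; this figure is not given in the abstract.
words_per_utterance = per_unit(WORDS, UTTERANCES)

print(f"{utterances_per_dialogue:.1f} utterances per dialogue")
print(f"{words_per_utterance:.2f} words per utterance")
```

Rounded to the nearest whole number, the first ratio agrees with the 320 utterances per dialogue given in the abstract.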

Human-Robot Dialogue Annotation for Multi-Modal Common Ground

Nov 19, 2024

Claire Bonial, Stephanie M. Lukin, Mitchell Abrams, Anthony Baker, Lucia Donatelli, Ashley Foots, Cory J. Hayes, Cassidy Henry, Taylor Hudson, Matthew Marge (+4 more)

Abstract: In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (occurring in disaster relief or search-and-rescue tasks), where a human and robot are engaged in a joint navigation and exploration task of an unfamiliar environment, but where the robot cannot immediately share high quality visual information due to limited communication constraints. Engaging in a dialogue provides an effective way to communicate, while on-demand or lower-quality visual information can be supplemented for establishing common ground. Within this paradigm, we capture propositional semantics and the illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation. We then capture patterns in how different utterances within and across speaker floors relate to one another in our development of a multi-floor Dialogue Structure annotation schema. Finally, we begin to annotate and analyze the ways in which the visual modalities provide contextual information to the dialogue for overcoming disparities in the collaborators' understanding of the environment. We conclude by discussing the use-cases, architectures, and systems we have implemented from our annotations that enable physical robots to autonomously engage with humans in bi-directional dialogue and navigation.

* Language Resources and Evaluation 2024
* 52 pages, 14 figures

Navigating to Success in Multi-Modal Human-Robot Collaboration: Analysis and Corpus Release

Oct 26, 2023

Stephanie M. Lukin, Kimberly A. Pollard, Claire Bonial, Taylor Hudson, Ron Artstein, Clare Voss, David Traum

Abstract: Human-guided robotic exploration is a useful approach to gathering information at remote locations, especially those that might be too risky, inhospitable, or inaccessible for humans. Maintaining common ground between the remotely-located partners is a challenge, one that can be facilitated by multi-modal communication. In this paper, we explore how participants utilized multiple modalities to investigate a remote location with the help of a robotic partner. Participants issued spoken natural language instructions and received from the robot: text-based feedback, continuous 2D LIDAR mapping, and upon-request static photographs. We noticed that different strategies were adopted in terms of use of the modalities, and hypothesize that these differences may be correlated with success at several exploration sub-tasks. We found that requesting photos may have improved the identification and counting of some key entities (doorways in particular) and that this strategy did not hinder the amount of overall area exploration. Future work with larger samples may reveal the effects of more nuanced photo and dialogue strategies, which can inform the training of robotic agents. Additionally, we announce the release of our unique multi-modal corpus of human-robot communication in an exploration context: SCOUT, the Situated Corpus on Understanding Transactions.

* Proceedings of the 2023 IEEE Robot and Human Interactive Communication Conference
* 7 pages, 3 figures

Envisioning Narrative Intelligence: A Creative Visual Storytelling Anthology

Oct 06, 2023

Brett A. Halperin, Stephanie M. Lukin

Abstract: In this paper, we collect an anthology of 100 visual stories from authors who participated in our systematic creative process of improvised story-building based on image sequences. Following close reading and thematic analysis of our anthology, we present five themes that characterize the variations found in this creative visual storytelling process: (1) Narrating What is in Vision vs. Envisioning; (2) Dynamically Characterizing Entities/Objects; (3) Sensing Experiential Information About the Scenery; (4) Modulating the Mood; (5) Encoding Narrative Biases. In understanding the varied ways that people derive stories from images, we offer considerations for collecting story-driven training data to inform automatic story generation. In correspondence with each theme, we envision narrative intelligence criteria for computational visual storytelling as: creative, reliable, expressive, grounded, and responsible. From these criteria, we discuss how to foreground creative expression, account for biases, and operate in the bounds of visual storyworlds.

* Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
* 21 pages, 11 figures

A Research Platform for Multi-Robot Dialogue with Humans

Oct 12, 2019

Matthew Marge, Stephen Nogar, Cory J. Hayes, Stephanie M. Lukin, Jesse Bloecker, Eric Holder, Clare Voss

Abstract: This paper presents a research platform that supports spoken dialogue interaction with multiple robots. The demonstration showcases our crafted MultiBot testing scenario in which users can verbally issue search, navigate, and follow instructions to two robotic teammates: a simulated ground robot and an aerial robot. This flexible language and robotic platform takes advantage of existing tools for speech recognition and dialogue management that are compatible with new domains, and implements an inter-agent communication protocol (tactical behavior specification), where verbal instructions are encoded for tasks assigned to the appropriate robot.

* Accepted for publication at NAACL 2019; also presented at AI-HRI 2019 (arXiv:1909.04812)

Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes

May 31, 2019

Stephanie M. Lukin, Claire Bonial, Clare R. Voss

Abstract: We describe the task of Visual Understanding and Narration, in which a robot (or agent) generates text for the images that it collects when navigating its environment, by answering open-ended questions, such as 'what happens, or might have happened, here?'

* 2-page extended abstract, presented at the Workshop on Shortcomings in Vision and Language (SiVL), 2019, at the North American Chapter of the Association for Computational Linguistics (NAACL)

A Pipeline for Creative Visual Storytelling

Jul 21, 2018

Stephanie M. Lukin, Reginald Hobbs, Clare R. Voss

Abstract: Computational visual storytelling produces a textual description of events and interpretations depicted in a sequence of images. These texts are made possible by advances and cross-disciplinary approaches in natural language processing, generation, and computer vision. We define a computational creative visual storytelling as one with the ability to alter the telling of a story along three aspects: to speak about different environments, to produce variations based on narrative goals, and to adapt the narrative to the audience. These aspects of creative storytelling and their effect on the narrative have yet to be explored in visual storytelling. This paper presents a pipeline of task-modules, Object Identification, Single-Image Inferencing, and Multi-Image Narration, that serve as a preliminary design for building a creative visual storyteller. We have piloted this design for a sequence of images in an annotation task. We present and analyze the collected corpus and describe plans towards automation.

* Originally published in the Proceedings of the First Workshop on Storytelling (StoryNLP), 2018, at the North American Chapter of the Association for Computational Linguistics (NAACL)

Consequences and Factors of Stylistic Differences in Human-Robot Dialogue

Jul 21, 2018

Stephanie M. Lukin, Kimberly A. Pollard, Claire Bonial, Matthew Marge, Cassidy Henry, Ron Artstein, David Traum, Clare R. Voss

Abstract: This paper identifies stylistic differences in instruction-giving observed in a corpus of human-robot dialogue. Differences in verbosity and structure (i.e., single-intent vs. multi-intent instructions) arose naturally without restrictions or prior guidance on how users should speak with the robot. Different styles were found to produce different rates of miscommunication, and correlations were found between style differences and individual user variation, trust, and interaction experience with the robot. Understanding potential consequences and factors that influence style can inform design of dialogue systems that are robust to natural variation from human users.

* Originally published in the Proceedings of the 19th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2018

ScoutBot: A Dialogue System for Collaborative Navigation

Jul 21, 2018

Stephanie M. Lukin, Felix Gervits, Cory J. Hayes, Anton Leuski, Pooja Moolchandani, John G. Rogers III, Carlos Sanchez Amaro, Matthew Marge, Clare R. Voss, David Traum

Abstract: ScoutBot is a dialogue interface to physical and simulated robots that supports collaborative exploration of environments. The demonstration will allow users to issue unconstrained spoken language commands to ScoutBot. ScoutBot will prompt for clarification if the user's instruction needs additional input. It is trained on human-robot dialogue collected from Wizard-of-Oz experiments, where robot responses were initiated by a human wizard in previous interactions. The demonstration will show a simulated ground robot (Clearpath Jackal) in a simulated environment supported by ROS (Robot Operating System).

* Originally published in the Proceedings of the Association for Computational Linguistics (ACL) 2018, System Demonstrations, 93-98

Laying Down the Yellow Brick Road: Development of a Wizard-of-Oz Interface for Collecting Human-Robot Dialogue

Oct 17, 2017

Claire Bonial, Matthew Marge, Ron Artstein, Ashley Foots, Felix Gervits, Cory J. Hayes, Cassidy Henry, Susan G. Hill, Anton Leuski, Stephanie M. Lukin (+4 more)

Abstract: We describe the adaptation and refinement of a graphical user interface designed to facilitate a Wizard-of-Oz (WoZ) approach to collecting human-robot dialogue data. The data collected will be used to develop a dialogue system for robot navigation. Building on an interface previously used in the development of dialogue systems for virtual agents and video playback, we add templates with open parameters which allow the wizard to quickly produce a wide variety of utterances. Our research demonstrates that this approach to data collection is viable as an intermediate step in developing a dialogue system for physical robots in remote locations from their users - a domain in which the human and robot need to regularly verify and update a shared understanding of the physical environment. We show that our WoZ interface and the fixed set of utterances and templates therein provide for a natural pace of dialogue with good coverage of the navigation domain.

* 7 pages, 2 figures, accepted for oral presentation at the Symposium on Natural Communication for Human-Robot Collaboration, AAAI Fall Symposium Series, November 9-11, 2017, https://www.aaai.org/ocs/index.php/FSS/FSS17
