Abstract:This paper is a report of the Workshop on Simulations for Information Access (Sim4IA) workshop at SIGIR 2024. The workshop had two keynotes, a panel discussion, nine lightning talks, and two breakout sessions. Key takeaways were user simulation's importance in academia and industry, the possible bridging of online and offline evaluation, and the issues of organizing a companion shared task around user simulations for information access. We report on how we organized the workshop, provide a brief overview of what happened at the workshop, and summarize the main topics and findings of the workshop and future work.
Abstract:User simulation is a promising approach for automatically training and evaluating conversational information access agents, enabling the generation of synthetic dialogues and facilitating reproducible experiments at scale. However, the objectives of user simulation for the different uses remain loosely defined, hindering the development of effective simulators. In this work, we formally characterize the distinct objectives for user simulators: training aims to maximize behavioral similarity to real users, while evaluation focuses on the accurate prediction of real-world conversational agent performance. Through an empirical study, we demonstrate that optimizing for one objective does not necessarily lead to improved performance on the other. This finding underscores the need for tailored design considerations depending on the intended use of the simulator. By establishing clear objectives and proposing concrete measures to evaluate user simulators against those objectives, we pave the way for the development of simulators that are specifically tailored to their intended use, ultimately leading to more effective conversational agents.
Abstract:We present a methodology to systematically test conversational recommender systems with regards to conversational breakdowns. It involves examining conversations generated between the system and simulated users for a set of pre-defined breakdown types, extracting responsible conversational paths, and characterizing them in terms of the underlying dialogue intents. User simulation offers the advantages of simplicity, cost-effectiveness, and time efficiency for obtaining conversations where potential breakdowns can be identified. The proposed methodology can be used as diagnostic tool as well as a development tool to improve conversational recommendation systems. We apply our methodology in a case study with an existing conversational recommender system and user simulator, demonstrating that with just a few iterations, we can make the system more robust to conversational breakdowns.
Abstract:While interest in conversational recommender systems has been on the rise, operational systems suitable for serving as research platforms for comprehensive studies are currently lacking. This paper introduces an enhanced version of the IAI MovieBot conversational movie recommender system, aiming to evolve it into a robust and adaptable platform for conducting user-facing experiments. The key highlights of this enhancement include the addition of trainable neural components for natural language understanding and dialogue policy, transparent and explainable modeling of user preferences, along with improvements in the user interface and research infrastructure.
Abstract:Personal knowledge graphs (PKGs) offer individuals a way to store and consolidate their fragmented personal data in a central place, improving service personalization while maintaining full user control. Despite their potential, practical PKG implementations with user-friendly interfaces remain scarce. This work addresses this gap by proposing a complete solution to represent, manage, and interface with PKGs. Our approach includes (1) a user-facing PKG Client, enabling end-users to administer their personal data easily via natural language statements, and (2) a service-oriented PKG API. To tackle the complexity of representing these statements within a PKG, we present an RDF-based PKG vocabulary that supports this, along with properties for access rights and provenance.
Abstract:We observe a change in the way users access information, that is, the rise of conversational information access (CIA) agents. However, the automatic evaluation of these agents remains an open challenge. Moreover, the training of CIA agents is cumbersome as it mostly relies on conversational corpora, expert knowledge, and reinforcement learning. User simulation has been identified as a promising solution to tackle automatic evaluation and has been previously used in reinforcement learning. In this research, we investigate how user simulation can be leveraged in the context of CIA. We organize the work in three parts. We begin with the identification of requirements for user simulators for training and evaluating CIA agents and compare existing types of simulator regarding these. Then, we plan to combine these different types of simulators into a new hybrid simulator. Finally, we aim to extend simulators to handle more complex information seeking scenarios.
Abstract:Conversational systems can be particularly effective in supporting complex information seeking scenarios with evolving information needs. Finding the right products on an e-commerce platform is one such scenario, where a conversational agent would need to be able to provide search capabilities over the item catalog, understand and make recommendations based on the user's preferences, and answer a range of questions related to items and their usage. Yet, existing conversational datasets do not fully support the idea of mixing different conversational goals (i.e., search, recommendation, and question answering) and instead focus on a single goal. To address this, we introduce MG-ShopDial: a dataset of conversations mixing different goals in the domain of e-commerce. Specifically, we make the following contributions. First, we develop a coached human-human data collection protocol where each dialogue participant is given a set of instructions, instead of a specific script or answers to choose from. Second, we implement a data collection tool to facilitate the collection of multi-goal conversations via a web chat interface, using the above protocol. Third, we create the MG-ShopDial collection, which contains 64 high-quality dialogues with a total of 2,196 utterances for e-commerce scenarios of varying complexity. The dataset is additionally annotated with both intents and goals on the utterance level. Finally, we present an analysis of this dataset and identify multi-goal conversational patterns.
Abstract:This paper presents an ecosystem for personal knowledge graphs (PKG), commonly defined as resources of structured information about entities related to an individual, their attributes, and the relations between them. PKGs are a key enabler of secure and sophisticated personal data management and personalized services. However, there are challenges that need to be addressed before PKGs can achieve widespread adoption. One of the fundamental challenges is the very definition of what constitutes a PKG, as there are multiple interpretations of the term. We propose our own definition of a PKG, emphasizing the aspects of (1) data ownership by a single individual and (2) the delivery of personalized services as the primary purpose. We further argue that a holistic view of PKGs is needed to unlock their full potential, and propose a unified framework for PKGs, where the PKG is a part of a larger ecosystem with clear interfaces towards data services and data sources. A comprehensive survey and synthesis of existing work is conducted, with a mapping of the surveyed work into the proposed unified ecosystem. Finally, we identify open challenges and research opportunities for the ecosystem as a whole, as well as for the specific aspects of PKGs, which include population, representation and management, and utilization.
Abstract:DAGFiNN is a conversational conference assistant that can be made available for a given conference both as a chatbot on the website and as a Furhat robot physically exhibited at the conference venue. Conference participants can interact with the assistant to get advice on various questions, ranging from where to eat in the city or how to get to the airport to which sessions we recommend them to attend based on the information we have about them. The overall objective is to provide a personalized and engaging experience and allow users to ask a broad range of questions that naturally arise before and during the conference.