Abstract:Conversational agents (CAs) are gaining traction in both industry and academia, especially with the advent of generative AI and large language models. As these agents are used more broadly by members of the general public and take on a number of critical use cases and social roles, it becomes important to consider the values embedded in these systems. This consideration includes answering questions such as 'whose values get embedded in these agents?' and 'how do those values manifest in the agents being designed?' Accordingly, the aim of this paper is to present the Value-Sensitive Conversational Agent (VSCA) Framework for enabling the collaborative design (co-design) of value-sensitive CAs with relevant stakeholders. Firstly, requirements for co-designing value-sensitive CAs which were identified in previous works are summarised here. Secondly, the practical framework is presented and discussed, including its operationalisation into a design toolkit. The framework facilitates the co-design of three artefacts that elicit stakeholder values and have a technical utility to CA teams to guide CA implementation, enabling the creation of value-embodied CA prototypes. Finally, an evaluation protocol for the framework is proposed where the effects of the framework and toolkit are explored in a design workshop setting to evaluate both the process followed and the outcomes produced.
Abstract:In conversational analyses, humans manually weave multimodal information into the transcripts, which is significantly time-consuming. We introduce a system that automatically expands the verbatim transcripts of video-recorded conversations using multimodal data streams. This system uses a set of preprocessing rules to weave multimodal annotations into the verbatim transcripts and promote interpretability. Our feature engineering contributions are two-fold: firstly, we identify the range of multimodal features relevant to detect rapport-building; secondly, we expand the range of multimodal annotations and show that the expansion leads to statistically significant improvements in detecting rapport-building.