CSELT - Turin, Italy
Abstract:We are interested in the problem of conversational analysis and its application to the health domain. Cognitive Behavioral Therapy is a structured approach in psychotherapy, allowing the therapist to help the patient to identify and modify the malicious thoughts, behavior, or actions. This cooperative effort can be evaluated using the Working Alliance Inventory Observer-rated Shortened - a 12 items inventory covering task, goal, and relationship - which has a relevant influence on therapeutic outcomes. In this work, we investigate the relation between this alliance inventory and the spoken conversations (sessions) between the patient and the psychotherapist. We have delivered eight weeks of e-therapy, collected their audio and video call sessions, and manually transcribed them. The spoken conversations have been annotated and evaluated with WAI ratings by professional therapists. We have investigated speech and language features and their association with WAI items. The feature types include turn dynamics, lexical entrainment, and conversational descriptors extracted from the speech and language signals. Our findings provide strong evidence that a subset of these features are strong indicators of working alliance. To the best of our knowledge, this is the first and a novel study to exploit speech and language for characterising working alliance.
Abstract:Empathy, as defined in behavioral sciences, expresses the ability of human beings to recognize, understand and react to emotions, attitudes and beliefs of others. The lack of an operational definition of empathy makes it difficult to measure it. In this paper, we address two related problems in automatic affective behavior analysis: the design of the annotation protocol and the automatic recognition of empathy from spoken conversations. We propose and evaluate an annotation scheme for empathy inspired by the modal model of emotions. The annotation scheme was evaluated on a corpus of real-life, dyadic spoken conversations. In the context of behavioral analysis, we designed an automatic segmentation and classification system for empathy. Given the different speech and language levels of representation where empathy may be communicated, we investigated features derived from the lexical and acoustic spaces. The feature development process was designed to support both the fusion and automatic selection of relevant features from high dimensional space. The automatic classification system was evaluated on call center conversations where it showed significantly better performance than the baseline.
Abstract:In this paper I describe how miscommunication problems are dealt with in the spoken language system DIALOGOS. The dialogue module of the system exploits dialogic expectations in a twofold way: to model what future user utterance might be about (predictions), and to account how the user's next utterance may be related to previous ones in the ongoing interaction (pragmatic-based expectations). The analysis starts from the hypothesis that the occurrence of miscommunication is concomitant with two pragmatic phenomena: the deviation of the user from the expected behaviour and the generation of a conversational implicature. A preliminary evaluation of a large amount of interactions between subjects and DIALOGOS shows that the system performance is enhanced by the uses of both predictions and pragmatic-based expectations.
Abstract:In this paper we explain how contextual expectations are generated and used in the task-oriented spoken language understanding system Dialogos. The hard task of recognizing spontaneous speech on the telephone may greatly benefit from the use of specific language models during the recognition of callers' utterances. By 'specific language models' we mean a set of language models that are trained on contextually appropriated data, and that are used during different states of the dialogue on the basis of the information sent to the acoustic level by the dialogue management module. In this paper we describe how the specific language models are obtained on the basis of contextual information. The experimental result we report show that recognition and understanding performance are improved thanks to the use of specific language models.
Abstract:This paper presents Dialogos, a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions both to users which get good recognition performance and to the ones which get lower scores. The robust behavior of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows to deal with partial or total breakdowns of the different levels of analysis. We report the field trial data of the system and the evaluation results of the overall system and of the submodules.
Abstract:In this paper, we describe a set of metrics for the evaluation of different dialogue management strategies in an implemented real-time spoken language system. The set of metrics we propose offers useful insights in evaluating how particular choices in the dialogue management can affect the overall quality of the man-machine dialogue. The evaluation makes use of established metrics: the transaction success, the contextual appropriateness of system answers, the calculation of normal and correction turns in a dialogue. We also define a new metric, the implicit recovery, which allows to measure the ability of a dialogue manager to deal with errors by different levels of analysis. We report evaluation data from several experiments, and we compare two different approaches to dialogue repair strategies using the set of metrics we argue for.