Brian McMahan

Discourse Coherence, Reference Grounding and Goal Oriented Dialogue

Jul 08, 2020

Baber Khalid, Malihe Alikhani, Michael Fellner, Brian McMahan, Matthew Stone

Abstract: Prior approaches to realizing mixed-initiative human-computer referential communication have adopted information-state or collaborative problem-solving approaches. In this paper, we argue for a new approach, inspired by coherence-based models of discourse such as SDRT (Asher and Lascarides, 2003), in which utterances attach to an evolving discourse structure and the associated knowledge graph of speaker commitments serves as an interface to real-world reasoning and conversational strategy. As first steps towards implementing the approach, we describe a simple dialogue system in a referential communication domain that accumulates constraints across discourse, interprets them using a learned probabilistic model, and plans clarification using reinforcement learning.

* Accepted for publication at SemDial 2020

Listening to the World Improves Speech Command Recognition

Oct 23, 2017

Brian McMahan, Delip Rao

Abstract: We study transfer learning in convolutional network architectures applied to the task of recognizing audio, such as environmental sound events and speech commands. Our key finding is that it is not only possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition, but that doing so also improves accuracy significantly. We also investigate the effect of increased model capacity on transfer learning for audio, first validating a known result from computer vision, that increasingly deeper networks achieve better accuracy, on two audio datasets: UrbanSound8K and the newly released Google Speech Commands dataset. We then propose a simple multiscale input representation using dilated convolutions and show that it is able to aggregate larger contexts and increase classification performance. Further, models trained with a combination of transfer learning and multiscale input representations need only 40% of the training data to achieve accuracies similar to those of a freshly trained model using 100% of the training data. Finally, we demonstrate a positive interaction effect between the multiscale input and transfer learning, making a case for the joint application of the two techniques.

* 8 pages
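
The multiscale idea mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' architecture: the function names, the kernel, and the dilation rates (1, 2, 4) are assumptions chosen for illustration. It only shows how applying the same kernel at several dilation rates, and stacking the results as channels, lets each output position see progressively larger contexts.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """1-D dilated convolution (cross-correlation form).

    Zero padding is applied implicitly so the output has the same
    length as the input.
    """
    n = len(signal)
    span = dilation * (len(kernel) - 1)  # distance from first to last tap
    left = span // 2                     # centre the kernel on position t
    out = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - left + j * dilation
            if 0 <= idx < n:             # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multiscale_features(signal: Sequence[float], kernel: Sequence[float],
                        dilations: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Hypothetical multiscale representation: one channel per dilation rate."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for d, channel in zip((1, 2, 4), multiscale_features(impulse, [1, 1, 1])):
        print(f"dilation={d} receptive_field={receptive_field(3, d)} {channel}")
```

With a 3-tap kernel, the receptive field grows from 3 to 5 to 9 input positions as the dilation goes from 1 to 2 to 4, without adding any parameters. In a real model the kernels would be learned and the channels fed to further convolutional layers.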
