Angeliki Metallinou

FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking

Jul 18, 2024

Zhuoer Wang, Leonardo F. R. Ribeiro, Alexandros Papangelis, Rohan Mukherjee, Tzu-Yen Wang, Xinyan Zhao, Arijit Biswas, James Caverlee, Angeliki Metallinou

Abstract:API call generation is the cornerstone of large language models' tool-using ability that provides access to the larger world. However, existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. To address these limitations, we propose an output-side optimization approach called FANTASE. Two of the unique contributions of FANTASE are its State-Tracked Constrained Decoding (SCD) and Reranking components. SCD dynamically incorporates appropriate API constraints in the form of Token Search Trie for efficient and guaranteed generation faithfulness with respect to the API documentation. The Reranking component efficiently brings in the supervised signal by leveraging a lightweight model as the discriminator to rerank the beam-searched candidate generations of the large language model. We demonstrate the superior performance of FANTASE in API call generation accuracy, inference efficiency, and context efficiency with DSTC8 and API Bank datasets.
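The trie-constrained decoding idea in the abstract above can be illustrated with a small sketch. This is not the FANTASE implementation: the nested-dict trie, the `<end>` marker, the whole-word "tokens", and the fixed score table standing in for a language model are all assumptions made for illustration. It only shows how a trie restricts each decoding step to continuations that exist in the API documentation.

```python
from typing import Dict, Iterable, List, Sequence

END = "<end>"  # hypothetical end-of-sequence marker, not from the paper


def build_trie(sequences: Iterable[Sequence[str]]) -> dict:
    """Build a nested-dict trie over the allowed token sequences."""
    root: dict = {}
    for seq in sequences:
        node = root
        for tok in list(seq) + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: dict, prefix: Sequence[str]) -> List[str]:
    """Return the tokens that may follow `prefix`, or [] if the prefix is invalid."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []
        node = node[tok]
    return sorted(node)


def constrained_greedy(trie: dict, scores: Dict[str, float]) -> List[str]:
    """Greedy decoding that only considers tokens permitted by the trie.

    `scores` is a toy, context-independent stand-in for model logits.
    """
    out: List[str] = []
    while True:
        candidates = allowed_next(trie, out)
        best = max(candidates, key=lambda t: scores.get(t, float("-inf")))
        if best == END:
            return out
        out.append(best)


# Illustrative API names, split into word-level tokens (assumed, not from the paper).
api_trie = build_trie([
    ["Find", "Restaurants"],
    ["Find", "Movies"],
    ["Reserve", "Restaurant"],
])
toy_scores = {"Find": 0.9, "Reserve": 0.1, "Movies": 0.2,
              "Restaurants": 0.7, "Hotels": 5.0, END: 0.0}
print(constrained_greedy(api_trie, toy_scores))
```

In this toy run, "Hotels" has the highest score but is never emitted, because no path in the trie contains it.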

Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users

Oct 31, 2023

Yohan Jo, Xinyan Zhao, Arijit Biswas, Nikoletta Basiou, Vincent Auvray, Nikolaos Malandrakis, Angeliki Metallinou, Alexandros Potamianos

Abstract:While most task-oriented dialogues assume conversations between the agent and one user at a time, dialogue systems are increasingly expected to communicate with multiple users simultaneously who make decisions collaboratively. To facilitate development of such systems, we release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. To collect this dataset, each user utterance from MultiWOZ 2.2 was replaced with a small chat between two users that is semantically and pragmatically consistent with the original user utterance, thus resulting in the same dialogue state and system response. These dialogues reflect interesting dynamics of collaborative decision-making in task-oriented scenarios, e.g., social chatter and deliberation. Supported by this data, we propose the novel task of multi-user contextual query rewriting: to rewrite a task-oriented chat between two users as a concise task-oriented query that retains only task-relevant information and that is directly consumable by the dialogue system. We demonstrate that in multi-user dialogues, using predicted rewrites substantially improves dialogue state tracking without modifying existing dialogue systems that are trained for single-user dialogues. Further, this method surpasses training a medium-sized model directly on multi-user dialogues and generalizes to unseen domains.

* To Appear in EMNLP-Findings 2023

Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs

Jun 09, 2023

Abishek Komma, Nagesh Panyam Chandrasekarasastry, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, Aram Galstyan

Abstract:Measurement of interaction quality is a critical task for the improvement of spoken dialog systems. Existing approaches to dialog quality estimation either focus on evaluating the quality of individual turns, or collect dialog-level quality measurements from end users immediately following an interaction. In contrast to these approaches, we introduce a new dialog-level annotation workflow called Dialog Quality Annotation (DQA). DQA expert annotators evaluate the quality of dialogs as a whole, and also label dialogs for attributes such as goal completion and user sentiment. In this contribution, we show that: (i) while dialog quality cannot be completely decomposed into dialog-level attributes, there is a strong relationship between some objective dialog attributes and judgments of dialog quality; (ii) for the task of dialog-level quality estimation, a supervised model trained on dialog-level annotations outperforms methods based purely on aggregating turn-level features; and (iii) the proposed evaluation model shows better domain generalization ability compared to the baselines. On the basis of these results, we argue that having high-quality human-annotated data is an important component of evaluating interaction quality for large industrial-scale voice assistant platforms.

Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems

Apr 19, 2021

Anish Acharya, Suranjit Adhikari, Sanchit Agarwal, Vincent Auvray, Nehal Belgamwar, Arijit Biswas, Shubhra Chandra, Tagyoung Chung, Maryam Fazel-Zarandi, Raefer Gabriel (+21 more)

Abstract:Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. Training each component requires annotations which are hard to obtain for every new domain, limiting scalability of such systems. Similarly, rule-based dialogue systems require extensive writing and maintenance of rules and do not scale either. End-to-End dialogue systems, on the other hand, do not require module-specific annotations but need a large amount of data for training. To overcome these problems, in this demo, we present Alexa Conversations, a new approach for building goal-oriented dialogue systems that is scalable, extensible as well as data efficient. The components of this system are trained in a data-driven manner, but instead of collecting annotated conversations for training, we generate them using a novel dialogue simulator based on a few seed dialogues and specifications of APIs and entities provided by the developer. Our approach provides out-of-the-box support for natural conversational phenomena like entity sharing across turns or users changing their mind during conversation without requiring developers to provide any such dialogue flows. We exemplify our approach using a simple pizza ordering task and showcase its value in reducing the developer burden for creating a robust experience. Finally, we evaluate our system using a typical movie ticket booking task and show that the dialogue simulator is an essential component of the system that leads to over 50% improvement in turn-level action signature prediction accuracy.

* NAACL 2021 System Demonstrations Track

Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational Systems

Nov 16, 2020

Chien-Wei Lin, Vincent Auvray, Daniel Elkind, Arijit Biswas, Maryam Fazel-Zarandi, Nehal Belgamwar, Shubhra Chandra, Matt Zhao, Angeliki Metallinou, Tagyoung Chung (+3 more)

Abstract:Goal-oriented dialog systems enable users to complete specific goals like requesting information about a movie or booking a ticket. Typically the dialog system pipeline contains multiple ML models, including natural language understanding, state tracking and action prediction (policy learning). These models are trained through a combination of supervised or reinforcement learning methods and therefore require collection of labeled domain specific datasets. However, collecting annotated datasets with language and dialog-flow variations is expensive, time-consuming and scales poorly due to human involvement. In this paper, we propose an approach for automatically creating a large corpus of annotated dialogs from a few thoroughly annotated sample dialogs and the dialog schema. Our approach includes a novel goal-sampling technique for sampling plausible user goals and a dialog simulation technique that uses heuristic interplay between the user and the system (Alexa), where the user tries to achieve the sampled goal. We validate our approach by generating data and training three different downstream conversational ML models. We achieve 18-50% relative accuracy improvements on a held-out test set compared to a baseline dialog generation approach that only samples natural language and entity value variations from existing catalogs but does not generate any novel dialog flow variations. We also qualitatively establish that the proposed approach is better than the baseline. Moreover, several different conversational experiences have been built using this method, which enables customers to have a wide variety of conversations with Alexa.

* To be presented at Human in the Loop Dialogue Systems Workshop, NeurIPS 2020

MA-DST: Multi-Attention Based Scalable Dialog State Tracking

Feb 07, 2020

Adarsh Kumar, Peter Ku, Anuj Kumar Goyal, Angeliki Metallinou, Dilek Hakkani-Tur

Abstract:Task oriented dialog agents provide a natural language interface for users to complete their goal. Dialog State Tracking (DST), which is often a core component of these systems, tracks the system's understanding of the user's goal throughout the conversation. To enable accurate multi-domain DST, the model needs to encode dependencies between past utterances and slot semantics and understand the dialog context, including long-range cross-domain references. We introduce a novel architecture for this task to encode the conversation history and slot semantics more robustly by using attention mechanisms at multiple granularities. In particular, we use cross-attention to model relationships between the context and slots at different semantic levels and self-attention to resolve cross-domain coreferences. In addition, our proposed architecture does not rely on knowing the domain ontologies beforehand and can also be used in a zero-shot setting for new domains or unseen slot values. Our model improves the joint goal accuracy by 5% (absolute) in the full-data setting and by up to 2% (absolute) in the zero-shot setting over the present state-of-the-art on the MultiWoZ 2.1 dataset.

* Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020)

Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents

Oct 04, 2019

Nikolaos Malandrakis, Minmin Shen, Anuj Goyal, Shuyang Gao, Abhishek Sethi, Angeliki Metallinou

Abstract:Data availability is a bottleneck during early stages of development of new capabilities for intelligent artificial agents. We investigate the use of text generation techniques to augment the training data of a popular commercial artificial agent across categories of functionality, with the goal of faster development of new functionality. We explore a variety of encoder-decoder generative models for synthetic training data generation and propose using conditional variational auto-encoders. Our approach requires only direct optimization, works well with limited data and significantly outperforms the previous controlled text generation techniques. Further, the generated data are used as additional training samples in an extrinsic intent classification task, leading to improved performance by up to 5% absolute f-score in low-resource cases, validating the usefulness of our approach.

* EMNLP WNGT workshop

Simple Question Answering with Subgraph Ranking and Joint-Scoring

Apr 04, 2019

Wenbo Zhao, Tagyoung Chung, Anuj Goyal, Angeliki Metallinou

Abstract:Knowledge graph based simple question answering (KBSQA) is a major area of research within question answering. Although only dealing with simple questions, i.e., questions that can be answered through a single knowledge base (KB) fact, this task is neither simple nor close to being solved. Targeting the two main steps, subgraph selection and fact selection, the research community has developed sophisticated approaches. However, the importance of subgraph ranking and of leveraging the subject-relation dependency of a KB fact has not been sufficiently explored. Motivated by this, we present a unified framework to describe and analyze existing approaches. Using this framework as a starting point, we focus on two aspects: improving subgraph selection through a novel ranking method and leveraging the subject-relation dependency by proposing a joint scoring CNN model with a novel loss function that enforces the well-order of scores. Our methods achieve a new state of the art (85.44% in accuracy) on the SimpleQuestions dataset.

* Accepted by The 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019). 11 pages, 1 figure

Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents

Nov 13, 2018

Aditya Siddhant, Anuj Goyal, Angeliki Metallinou

Abstract:User interaction with voice-powered agents generates large amounts of unlabeled utterances. In this paper, we explore techniques to efficiently transfer the knowledge from these unlabeled utterances to improve model performance on Spoken Language Understanding (SLU) tasks. We use Embeddings from Language Model (ELMo) to take advantage of unlabeled data by learning contextualized word representations. Additionally, we propose ELMo-Light (ELMoL), a faster and simpler unsupervised pre-training method for SLU. Our findings suggest unsupervised pre-training on a large corpus of unlabeled utterances leads to significantly better SLU performance compared to training from scratch and it can even outperform conventional supervised transfer. Additionally, we show that the gains from unsupervised transfer techniques can be further improved by supervised transfer. The improvements are more pronounced in low resource settings, and when using only 1000 labeled in-domain samples, our techniques match the performance of training from scratch on 10-15x more labeled in-domain data.

* To appear at AAAI 2019

Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

Nov 01, 2018

Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon

Abstract:Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact in accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
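To make the compression figure in the abstract above concrete, the sketch below works through the parameter arithmetic of a generic low-rank factorization, where a V x d embedding matrix is replaced by a V x k factor and a k x d factor. The vocabulary size, embedding dimension, and function names are illustrative assumptions, not values or code from the paper.

```python
import math


def factored_params(vocab: int, dim: int, rank: int) -> int:
    """Parameters in a rank-`rank` factorization: (vocab x rank) + (rank x dim)."""
    return rank * (vocab + dim)


def compression(vocab: int, dim: int, rank: int) -> float:
    """Fraction of embedding parameters removed relative to the full vocab x dim matrix."""
    return 1 - factored_params(vocab, dim, rank) / (vocab * dim)


def max_rank_for(vocab: int, dim: int, target: float) -> int:
    """Largest rank whose factorization still removes at least `target` of the parameters."""
    return math.floor((1 - target) * vocab * dim / (vocab + dim))


# Assumed sizes for illustration: 100,000-word vocabulary, 300-dimensional embeddings.
V, D = 100_000, 300
k = max_rank_for(V, D, 0.90)
print(k, factored_params(V, D, k), round(compression(V, D, k), 4))
```

Under these assumed sizes, a 90% reduction requires a rank of at most 29, which shows how aggressive the factorization has to be when the vocabulary dimension dominates.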

* Accepted in Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)
