Zhenpeng Zhou

Zero-Shot Dialogue State Tracking via Cross-Task Transfer

Sep 10, 2021

Zhaojiang Lin, Bing Liu, Andrea Madotto, Seungwhan Moon, Paul Crook, Zhenpeng Zhou, Zhiguang Wang, Zhou Yu, Eunjoon Cho, Rajen Subba(+1 more)

Abstract: Zero-shot transfer learning for dialogue state tracking (DST) enables us to handle a variety of task-oriented dialogue domains without the expense of collecting in-domain data. In this work, we propose to transfer cross-task knowledge from general question answering (QA) corpora to the zero-shot DST task. Specifically, we propose TransferQA, a transferable generative QA model that seamlessly combines extractive QA and multi-choice QA via a text-to-text transformer framework, and tracks both categorical and non-categorical slots in DST. In addition, we introduce two effective ways to construct unanswerable questions, namely negative question sampling and context truncation, which enable our model to handle "none" value slots in the zero-shot DST setting. Extensive experiments show that our approaches substantially improve on the existing zero-shot and few-shot results on MultiWOZ. Moreover, compared to the fully trained baseline on the Schema-Guided Dialogue dataset, our approach shows better generalization ability in unseen domains.

* EMNLP 2021
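
The abstract names two ways of building unanswerable questions but does not spell out the procedure. The following is only a minimal sketch of how such examples might be constructed from ordinary QA data; the dictionary layout, the function names, and the literal "none" label are assumptions made for illustration, not the paper's implementation.

```python
import random

NONE_ANSWER = "none"  # assumed label for unanswerable questions


def negative_question_sampling(examples, rng):
    """Pair each context with a question taken from a different example.

    Assumes the examples cover unrelated passages, so the borrowed
    question cannot be answered from the context it is attached to.
    """
    negatives = []
    for i, ex in enumerate(examples):
        others = [j for j in range(len(examples)) if j != i]
        if not others:
            continue  # a single example has no other question to borrow
        j = rng.choice(others)
        negatives.append({
            "context": ex["context"],
            "question": examples[j]["question"],
            "answer": NONE_ANSWER,
        })
    return negatives


def context_truncation(example):
    """Cut the context just before the first answer occurrence.

    The original question is kept, but the answer is no longer present
    in the shortened context, so the target becomes "none".
    """
    start = example["context"].find(example["answer"])
    if start <= 0:
        return None  # answer absent, or nothing would remain before it
    truncated = example["context"][:start].rstrip()
    if not truncated:
        return None
    return {
        "context": truncated,
        "question": example["question"],
        "answer": NONE_ANSWER,
    }


if __name__ == "__main__":
    data = [
        {"context": "The hotel is in the north of town. It has free parking.",
         "question": "Where is the hotel?", "answer": "north"},
        {"context": "The train leaves at 9:15 from Cambridge.",
         "question": "When does the train leave?", "answer": "9:15"},
    ]
    print(negative_question_sampling(data, random.Random(0)))
    print(context_truncation(data[0]))
```

Both helpers return new examples and leave the answerable originals untouched, so the two kinds of data can be mixed in whatever ratio a training setup calls for.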

Leveraging Slot Descriptions for Zero-Shot Cross-Domain Dialogue State Tracking

May 10, 2021

Zhaojiang Lin, Bing Liu, Seungwhan Moon, Paul Crook, Zhenpeng Zhou, Zhiguang Wang, Zhou Yu, Andrea Madotto, Eunjoon Cho, Rajen Subba

Abstract: Zero-shot cross-domain dialogue state tracking (DST) enables us to handle task-oriented dialogue in unseen domains without the expense of collecting in-domain data. In this paper, we propose a slot description enhanced generative approach for zero-shot cross-domain DST. Specifically, our model first encodes dialogue context and slots with a pre-trained self-attentive encoder, and generates slot values in an auto-regressive manner. In addition, we incorporate Slot Type Informed Descriptions that capture the shared information across slots to facilitate cross-domain knowledge transfer. Experimental results on the MultiWOZ dataset show that our proposed method significantly improves existing state-of-the-art results in the zero-shot cross-domain setting.

* NAACL 2021

Continual Learning in Task-Oriented Dialogue Systems

Dec 31, 2020

Andrea Madotto, Zhaojiang Lin, Zhenpeng Zhou, Seungwhan Moon, Paul Crook, Bing Liu, Zhou Yu, Eunjoon Cho, Zhiguang Wang

Abstract: Continual learning in task-oriented dialogue systems allows us to add new domains and functionalities over time without incurring the high cost of retraining the whole system. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings: intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well, but both achieve inferior performance to the multi-task learning baseline, in which all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.

* 9 pages
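
To make the replay-based strategy mentioned in the abstract concrete, here is a small sketch of generic episodic replay: after each domain is learned, a few of its samples are stored and mixed into the training data of every later domain. The function name, the `memory_per_domain` parameter, and the toy domain data are hypothetical; the benchmark's actual baseline may differ in how it selects and schedules replayed samples.

```python
import random


def build_training_stream(domains, memory_per_domain, seed=0):
    """Return one (domain_name, training_set) pair per learning stage.

    Each training set holds the new domain's samples plus the samples
    replayed from a small memory of previously seen domains.
    """
    rng = random.Random(seed)
    memory = []   # samples kept from earlier domains
    stages = []
    for name, samples in domains:
        samples = list(samples)
        train_set = samples + list(memory)
        rng.shuffle(train_set)
        stages.append((name, train_set))
        # Store a few samples of the domain just learned for later replay.
        keep = min(memory_per_domain, len(samples))
        memory.extend(rng.sample(samples, keep))
    return stages


if __name__ == "__main__":
    toy_domains = [
        ("hotel", ["h1", "h2", "h3"]),
        ("taxi", ["t1", "t2"]),
        ("train", ["r1"]),
    ]
    for name, train_set in build_training_stream(toy_domains, memory_per_domain=1):
        print(name, sorted(train_set))
```

The memory grows by at most `memory_per_domain` samples per domain, which is the memory-size side of the trade-off the abstract refers to; an adapter-based method would instead spend additional parameters per domain.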

Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming

Nov 10, 2020

Zhenpeng Zhou, Ahmad Beirami, Paul Crook, Pararth Shah, Rajen Subba, Alborz Geramifard

Abstract: Motivated by the needs of resource-constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWOZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99+% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99+% accuracy, demonstrating the suitability of DILOG for slot-filling dialogs. We further extend our study to the MultiWOZ dataset, achieving 90+% inform and success metrics. We also observe that these metrics do not capture some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWOZ while achieving similar performance metrics. We conclude with a discussion of the strengths and weaknesses of DILOG.

Optimization of Molecules via Deep Reinforcement Learning

Oct 23, 2018

Zhenpeng Zhou, Steven Kearnes, Li Li, Richard N. Zare, Patrick Riley

Abstract: We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (prioritized experience replay, double $Q$-learning, and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. As a result, our model outperforms several other state-of-the-art algorithms by having a higher success rate of acquiring molecules with better properties. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. We further show the path through chemical space taken to optimize a molecule, to illustrate how the model works.

* Adds Supporting Information
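
Of the reinforcement learning techniques listed in the abstract, double $Q$-learning is the easiest to show in isolation. The sketch below computes the standard double $Q$-learning target, in which the online network chooses the next action and the target network scores it. It is a generic illustration and not MolDQN's code: the function name is hypothetical, and the plain lists of Q-values stand in for network outputs over the valid molecule modifications in the next state.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Return r + gamma * Q_target(s', argmax_a Q_online(s', a)).

    next_q_online and next_q_target hold the two networks' Q-values for
    the same ordered list of actions available in the next state.
    """
    if done or not next_q_online:
        return reward  # terminal step or no valid action: no bootstrapping
    # Action selection uses the online estimates ...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ... while action evaluation uses the target estimates.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    print(double_q_target(1.0, [0.2, 0.8, 0.5], [0.3, 0.4, 0.9], gamma=0.5))
```

Separating selection from evaluation is what reduces the overestimation that arises when a single network both picks and scores the maximizing action.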

Graph Convolution: A High-Order and Adaptive Approach

Oct 20, 2017

Zhenpeng Zhou, Xiaocheng Li

Abstract: In this paper, we present a novel convolutional neural network framework for graph modeling, with the introduction of two new modules specially designed for graph-structured data: the $k$-th order convolution operator and the adaptive filtering module. Importantly, our framework of High-order and Adaptive Graph Convolutional Network (HA-GCN) is a general-purpose architecture that fits various applications, both node-centric and graph-centric, as well as graph generative models. We conducted extensive experiments to demonstrate the advantages of our framework. In particular, our HA-GCN outperforms the state-of-the-art models on node classification and molecule property prediction tasks. It also generates 32% more real molecules on the molecule generation task. Both results will significantly benefit real-world applications such as material design and drug screening.
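
The abstract does not define the $k$-th order convolution operator, so the following is only a rough sketch of the general idea of aggregating over a $k$-hop neighborhood. It assumes a binary reachability matrix built from $(A + I)^k$ with entries clipped to 1, followed by an unweighted sum of neighbor features; the learnable weights and the adaptive filtering module are left out, and all names are hypothetical.

```python
def k_hop_adjacency(adj, k):
    """Binary matrix marking node pairs within k hops (self included)."""
    n = len(adj)
    # A + I, clipped to 0/1: each node reaches itself and its neighbors.
    base = [[1 if (i == j or adj[i][j]) else 0 for j in range(n)]
            for i in range(n)]
    reach = [row[:] for row in base]
    for _ in range(k - 1):
        # One more hop: boolean matrix product of reach and base.
        reach = [[1 if any(reach[i][m] and base[m][j] for m in range(n)) else 0
                  for j in range(n)]
                 for i in range(n)]
    return reach


def aggregate(adj_k, features):
    """Sum each node's feature vectors over its k-hop neighborhood."""
    n = len(features)
    dim = len(features[0])
    return [[sum(adj_k[i][m] * features[m][d] for m in range(n))
             for d in range(dim)]
            for i in range(n)]


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 with one scalar feature per node.
    path = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
    feats = [[1], [2], [4], [8]]
    print(aggregate(k_hop_adjacency(path, 2), feats))
```

Raising `k` widens the receptive field of a single layer, which is the intuition behind a higher-order operator; a trainable version would multiply the clipped matrix elementwise by a weight matrix before aggregating.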
