Yanyan Zou

Automatic Scene-based Topic Channel Construction System for E-Commerce

Oct 06, 2022

Peng Lin, Yanyan Zou, Lingfei Wu, Mian Ma, Zhuoye Ding, Bo Long

Abstract: Scene marketing, which showcases user interests within a specific scenario, has proved effective for offline shopping. To conduct scene marketing on e-commerce platforms, this work presents a novel product form, the scene-based topic channel, which typically consists of a list of diverse products belonging to the same usage scenario and a topic title that describes the scenario with marketing words. Because manual construction of channels is time-consuming given billions of products and customers' dynamic and diverse interests, it is necessary to leverage AI techniques to automatically construct channels for given usage scenarios and even discover novel topics. Specifically, we first frame the channel construction task as a two-step problem, i.e., scene-based topic generation and product clustering, and propose an E-commerce Scene-based Topic Channel construction system (ESTC) to automate production. The system consists of a scene-based topic generation model for the e-commerce domain, product clustering based on topic similarity, and quality control based on automatic model filtering and human screening. Extensive offline experiments and an online A/B test validate the effectiveness of this novel product form and of the proposed system. In addition, we describe our experience deploying the proposed system on a real-world e-commerce recommendation platform.

Automatic Product Copywriting for E-Commerce

Dec 15, 2021

Xueying Zhang, Yanyan Zou, Hainan Zhang, Jing Zhou, Shiliang Diao, Jiajia Chen, Zhuoye Ding, Zhen He, Xueqi He, Yun Xiao(+3 more)

Abstract: Product copywriting is a critical component of e-commerce recommendation platforms. It aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. In this paper, we report our experience deploying the proposed Automatic Product Copywriting Generation (APCG) system on the JD.com e-commerce product recommendation platform. It consists of two main components: 1) natural language generation, which is built from a transformer-pointer network and a pre-trained sequence-to-sequence model trained on millions of training examples from our in-house platform; and 2) copywriting quality control, which is based on both automatic evaluation and human screening. For selected domains, the models are trained and updated daily with refreshed training data. In addition, the model is also used as a real-time writing assistant tool on our live broadcast platform. The APCG system has been deployed at JD.com since Feb 2021. By Sep 2021, it had generated 2.53 million product descriptions and improved the overall averaged click-through rate (CTR) and conversion rate (CVR) by 4.22% and 3.61%, respectively, compared to baselines on a year-on-year basis. The accumulated Gross Merchandise Volume (GMV) generated by our system improved by 213.42% compared to the figure in Feb 2021.

* Accepted by AAAI 2022/IAAI 2022 under the track of "Highly Innovative Applications of AI"

Adaptive Bridge between Training and Inference for Dialogue

Oct 22, 2021

Haoran Xu, Hainan Zhang, Yanyan Zou, Hongshen Chen, Zhuoye Ding, Yanyan Lan

Abstract: Although exposure bias has been widely studied in some NLP tasks, it faces unique challenges in dialogue response generation, a representative one-to-many generation scenario. In real human dialogue, there are many appropriate responses for the same context, differing not only in expression but also in topic. Because the gap between the various ground-truth responses and the generated synthetic response is therefore much larger, exposure bias is more challenging in the dialogue generation task. Moreover, as MLE encourages the model to learn only the common words among different ground-truth responses while ignoring the interesting and specific parts, exposure bias may further lead to the common response generation problem, such as "I don't know" and "HaHa?" In this paper, we propose a novel adaptive switching mechanism, which learns to automatically transition between ground-truth learning and generated learning according to a word-level matching score, such as cosine similarity. Experimental results on both the Chinese STC dataset and the English Reddit dataset show that our adaptive method achieves a significant improvement in terms of metric-based evaluation and human evaluation, compared with state-of-the-art exposure bias approaches. Further analysis on an NMT task also shows that our model can achieve a significant improvement.

* EMNLP 2021
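
The switching idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the paper learns when to switch, whereas the sketch assumes a fixed, hypothetical `threshold`, a toy embedding table `embed`, and the rule that a generated word is fed back to the decoder only when its cosine similarity to the ground-truth word is high enough. The function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def choose_decoder_inputs(gold, generated, embed, threshold=0.8):
    """Pick, per position, which word is fed to the decoder at the next step.

    Assumption for illustration: use the generated word when it matches the
    ground-truth word closely enough, otherwise fall back to the ground truth.
    """
    inputs = []
    for gold_word, gen_word in zip(gold, generated):
        score = cosine(embed[gold_word], embed[gen_word])
        inputs.append(gen_word if score >= threshold else gold_word)
    return inputs


# Toy embeddings, made up for this example.
embed = {"hi": [1.0, 0.0], "hello": [0.9, 0.1], "cat": [0.0, 1.0]}
mixed = choose_decoder_inputs(["hi", "hi"], ["hello", "cat"], embed)
```

Here `mixed` keeps the generated "hello" (close to "hi") and replaces the unrelated "cat" with the ground-truth "hi".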

FCM: A Fine-grained Comparison Model for Multi-turn Dialogue Reasoning

Sep 23, 2021

Xu Wang, Hainan Zhang, Shuai Zhao, Yanyan Zou, Hongshen Chen, Zhuoye Ding, Bo Cheng, Yanyan Lan

Abstract: Despite the success of neural dialogue systems in achieving high performance on leaderboards, they cannot meet users' requirements in practice, due to their poor reasoning skills. The underlying reason is that most neural dialogue models capture only syntactic and semantic information and fail to model the logical consistency between the dialogue history and the generated response. Recently, a new multi-turn dialogue reasoning task has been proposed to facilitate dialogue reasoning research. However, this task is challenging, because there are only slight differences between the illogical response and the dialogue history. How to effectively address this challenge is still worth exploring. This paper proposes a Fine-grained Comparison Model (FCM) to tackle this problem. Inspired by human behavior in reading comprehension, a comparison mechanism is proposed to focus on the fine-grained differences in the representation of each response candidate. Specifically, each candidate representation is compared with the whole history to obtain a history consistency representation. Furthermore, the consistency signals between each candidate and the speaker's own history are considered, to drive the model to prefer a candidate that is logically consistent with the speaker's history. Finally, the above consistency representations are employed to output a ranked list of the candidate responses for multi-turn dialogue reasoning. Experimental results on two public dialogue datasets show that our method obtains higher ranking scores than the baseline models.

* EMNLP 2021 Findings

Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization

Sep 10, 2021

Junpeng Liu, Yanyan Zou, Hainan Zhang, Hongshen Chen, Zhuoye Ding, Caixia Yuan, Xiaojie Wang

Abstract: Unlike well-structured text, such as news reports and encyclopedia articles, dialogue content often comes from two or more interlocutors exchanging information with each other. In such a scenario, the topic of a conversation can vary as it progresses, and the key information for a certain topic is often scattered across multiple utterances of different speakers, which poses challenges for abstractive dialogue summarization. To capture the various topic information of a conversation and outline salient facts for the captured topics, this work proposes two topic-aware contrastive learning objectives, namely coherence detection and sub-summary generation objectives, which are expected to implicitly model the topic change and handle information scattering challenges for the dialogue summarization task. The proposed contrastive objectives are framed as auxiliary tasks for the primary dialogue summarization task, united via an alternative parameter updating strategy. Extensive experiments on benchmark datasets demonstrate that the proposed simple method significantly outperforms strong baselines and achieves new state-of-the-art performance. The code and trained models are publicly available at https://github.com/Junpliu/ConDigSum.

* EMNLP 2021

STEP: Sequence-to-Sequence Transformer Pre-training for Document Summarization

Apr 04, 2020

Yanyan Zou, Xingxing Zhang, Wei Lu, Furu Wei, Ming Zhou

Abstract: Abstractive summarization aims to rewrite a long document into a shorter form, and is usually modeled as a sequence-to-sequence (Seq2Seq) learning problem. Seq2Seq Transformers are powerful models for this problem. Unfortunately, training large Seq2Seq Transformers on limited supervised summarization data is challenging. We therefore propose STEP (shorthand for Sequence-to-Sequence Transformer Pre-training), which can be trained on large-scale unlabeled documents. Specifically, STEP is pre-trained using three different tasks, namely sentence reordering, next sentence generation, and masked document generation. Experiments on two summarization datasets show that all three tasks improve performance by a large margin over a heavily tuned large Seq2Seq Transformer that already includes a strong pre-trained encoder. By using our best task to pre-train STEP, we outperform the best published abstractive model on CNN/DailyMail by 0.8 ROUGE-2 and on New York Times by 2.4 ROUGE-2.
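
To make the three pre-training tasks named in the abstract above more concrete, the following sketch shows one plausible way to build (source, target) text pairs from an unlabeled document. It is an illustration under assumptions, not the paper's pipeline: the function names, the `[MASK]` token, the fixed split point, and the whitespace tokenization are all hypothetical choices made for the example.

```python
import random


def sentence_reordering(sentences, rng):
    """Source: the sentences in shuffled order. Target: the original order."""
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    return " ".join(shuffled), " ".join(sentences)


def next_sentence_generation(sentences, split):
    """Source: the first `split` sentences. Target: the remaining sentences."""
    return " ".join(sentences[:split]), " ".join(sentences[split:])


def masked_document_generation(tokens, start, length, mask="[MASK]"):
    """Source: the document with one contiguous token span masked. Target: the full document."""
    span = tokens[start:start + length]
    masked = tokens[:start] + [mask] * len(span) + tokens[start + length:]
    return " ".join(masked), " ".join(tokens)


doc = ["A b.", "C d.", "E f."]
sr_src, sr_tgt = sentence_reordering(doc, random.Random(0))
nsg_src, nsg_tgt = next_sentence_generation(doc, 1)
mdg_src, mdg_tgt = masked_document_generation(["a", "b", "c", "d"], 1, 2)
```

In each case the target can be derived from the raw document alone, which is why such tasks need no human-written summaries.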

Mining Commonsense Facts from the Physical World

Feb 11, 2020

Yanyan Zou, Wei Lu, Xu Sun

Abstract: Textual descriptions of the physical world implicitly mention commonsense facts, while commonsense knowledge bases explicitly represent such facts as triples. Compared to the dramatic growth of text data, the coverage of existing knowledge bases is far from complete. Most prior studies on populating knowledge bases focus mainly on Freebase. Automatically completing commonsense knowledge bases to improve their coverage remains under-explored. In this paper, we propose a new task of mining commonsense facts from raw text that describes the physical world. We build an effective new model that fuses information from both sequence text and existing knowledge base resources. We then create two large annotated datasets, each with approximately 200k instances, for commonsense knowledge base completion. Empirical results demonstrate that our model significantly outperforms baselines.

Aligning Cross-Lingual Entities with Multi-Aspect Information

Oct 15, 2019

Hsiu-Wei Yang, Yanyan Zou, Peng Shi, Wei Lu, Jimmy Lin, Xu Sun

Abstract: Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent entities in different languages. The task of cross-lingual entity alignment is to match entities in a source language with their counterparts in target languages. In this work, we investigate embedding-based approaches to encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. Specifically, we apply graph convolutional networks (GCNs) to combine multi-aspect information of entities, including topological connections, relations, and attributes of entities, to learn entity embeddings. To exploit the literal descriptions of entities expressed in different languages, we propose two uses of a pretrained multilingual BERT model to bridge cross-lingual gaps. We further propose two strategies to integrate GCN-based and BERT-based modules to boost performance. Extensive experiments on two benchmark datasets demonstrate that our method significantly outperforms existing systems.

* Accepted by EMNLP 2019

Text2Math: End-to-end Parsing Text into Math Expressions

Oct 15, 2019

Yanyan Zou, Wei Lu

Abstract: We propose Text2Math, a model for semantically parsing text into math expressions. The model can be used to solve different math-related problems, including arithmetic word problems and equation parsing problems. Unlike previous approaches, we tackle the problem from an end-to-end structured prediction perspective, where our algorithm aims to predict the complete math expression at once as a tree structure, with minimal manual effort involved in the process. Empirical results on benchmark datasets demonstrate the efficacy of our approach.

* Accepted by EMNLP 2019

Quantity Tagger: A Latent-Variable Sequence Labeling Approach to Solving Addition-Subtraction Word Problems

Aug 31, 2019

Yanyan Zou, Wei Lu

Abstract: An arithmetic word problem typically includes a textual description containing several constant quantities. The key to solving the problem is to reveal the underlying mathematical relations (such as addition and subtraction) among quantities, and then generate equations to find solutions. This work presents a novel approach, Quantity Tagger, that automatically discovers such hidden relations by tagging each quantity with a sign corresponding to one type of mathematical operation. For each quantity, we assume there exists a latent, variable-sized quantity span surrounding the quantity token in the text, which conveys information useful for determining its sign. Empirical results show that our method achieves 5 and 8 points of accuracy gains on two datasets respectively, compared to prior approaches.

* Accepted by ACL 2019
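
The sign-tagging idea in the abstract above can be illustrated with a toy example. The sketch assumes a hypothetical tag set of "+", "-", and "0" (for a quantity that does not take part in the answer) and takes the tags as given by hand. It does not implement the latent quantity-span model that predicts them, and the names `SIGN` and `solve_from_tags` are illustrative.

```python
# Hypothetical tag set: add the quantity, subtract it, or ignore it.
SIGN = {"+": 1, "-": -1, "0": 0}


def solve_from_tags(quantities, tags):
    """Combine the quantities into an answer as a signed sum, one tag per quantity."""
    if len(quantities) != len(tags):
        raise ValueError("each quantity needs exactly one tag")
    return sum(SIGN[tag] * quantity for quantity, tag in zip(quantities, tags))


# "Tom had 5 apples and gave 2 to Mary. How many apples does Tom have now?"
answer = solve_from_tags([5, 2], ["+", "-"])
```

With tags in hand, solving reduces to evaluating x = 5 - 2, so the hard part of the problem is predicting the correct sign for each quantity from its surrounding text.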
