Weipeng P. Yan

LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions

Sep 01, 2017

Yu Wang, Jiayi Liu, Yuxiang Liu, Jun Hao, Yang He, Jinghe Hu, Weipeng P. Yan, Mantian Li

Abstract: We present LADDER, the first deep reinforcement learning agent that can successfully learn control policies for large-scale real-world problems directly from raw inputs composed of high-level semantic information. The agent is based on an asynchronous stochastic variant of DQN (Deep Q Network) named DASQN. The inputs of the agent are plain-text descriptions of states of a game of incomplete information, i.e., real-time large-scale online auctions, and the rewards are auction profits of very large scale. We apply the agent to an essential portion of JD's online RTB (real-time bidding) advertising business and find that it easily beats the former state-of-the-art bidding policy that had been carefully engineered and calibrated by human experts: during JD.com's June 18th anniversary sale, the agent increased the company's ad revenue from that portion by more than 50%, while the advertisers' ROI (return on investment) also improved significantly.
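
For orientation, the standard DQN objective, of which DASQN is described as a variant, can be written as below. This is the usual textbook form, not the paper's own notation: the abstract does not specify the asynchronous and stochastic modifications, and the symbols here ($\theta$ for the online network parameters, $\theta^{-}$ for the target network parameters, $\gamma$ for the discount factor) are assumptions made for illustration.

$$
L(\theta) = \mathbb{E}_{(s, a, r, s')}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\right)^{2}\right]
$$

In the setting the abstract describes, $s$ would correspond to the plain-text description of an auction state and $r$ to the auction profit.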

* 8 pages, 12 figures

Optimizing Gross Merchandise Volume via DNN-MAB Dynamic Ranking Paradigm

Aug 14, 2017

Yan Yan, Wentao Guo, Meng Zhao, Jinghe Hu, Weipeng P. Yan

Abstract: With the transition from people's traditional "brick-and-mortar" shopping to online mobile shopping patterns in the Web 2.0 era, the recommender system plays a critical role in e-commerce and e-retail. This is especially true when designing this system for more than 236 million daily active users. The ranking strategy, the key module of the recommender system, needs to be precise, accurate, and responsive in estimating customers' intents. We propose a dynamic ranking paradigm, named DNN-MAB, that is composed of a pairwise deep neural network (DNN) pre-ranker connected to a revised multi-armed bandit (MAB) dynamic post-ranker. By taking into account explicit and implicit user feedback such as impressions, clicks, and conversions, DNN-MAB is able to adjust the DNN pre-ranking scores to help customers locate the items they are most interested in, so that they can converge quickly and frequently. To the best of our knowledge, frameworks like DNN-MAB have not been discussed in the previous literature for either e-commerce or machine learning audiences. In practice, DNN-MAB has been deployed to production, and it easily outperforms other state-of-the-art models by significantly lifting the gross merchandise volume (GMV), which is the objective metric at JD.
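
The two-stage idea in the abstract (a pre-ranker produces scores, and a bandit post-ranker adjusts them from user feedback) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the abstract does not say how the "revised" MAB works, so this uses plain Thompson sampling over a Beta click-through posterior, and the names `BetaArm`, `PostRanker`, `record`, and `rerank` are hypothetical. The pre-ranking scores are taken as given rather than computed by a DNN.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BetaArm:
    """Beta posterior over one item's click-through rate (assumed feedback model)."""
    clicks: int = 0
    impressions: int = 0

    def sample(self, rng: random.Random) -> float:
        # Beta(1 + clicks, 1 + non-clicks): a uniform prior updated with feedback.
        return rng.betavariate(1 + self.clicks, 1 + self.impressions - self.clicks)


@dataclass
class PostRanker:
    """Hypothetical post-ranker: rescales pre-ranking scores by a sampled CTR."""
    arms: dict = field(default_factory=dict)
    seed: int = 0

    def __post_init__(self):
        self.rng = random.Random(self.seed)

    def record(self, item: str, clicked: bool) -> None:
        # Each call is one impression, optionally followed by a click.
        arm = self.arms.setdefault(item, BetaArm())
        arm.impressions += 1
        arm.clicks += int(clicked)

    def rerank(self, pre_scores: dict) -> list:
        # Thompson sampling: draw a CTR per item, multiply it into the pre-score,
        # and sort items by the adjusted score, highest first.
        adjusted = {
            item: score * self.arms.setdefault(item, BetaArm()).sample(self.rng)
            for item, score in pre_scores.items()
        }
        return sorted(adjusted, key=adjusted.get, reverse=True)
```

A usage example: after `ranker.record("b", clicked=True)` has been called many times and `ranker.record("a", clicked=False)` equally often, `ranker.rerank({"a": 0.6, "b": 0.5})` will almost certainly place `"b"` first even though its pre-ranking score is lower. Multiplying the sampled rate into the pre-score is only one possible way to combine the two stages.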

* 7 pages, 7 figures, accepted by 'IJCAI-17 Workshop AI Applications in E-Commerce'
