Title: Deep Embedding Forest: Forest-based Serving with Deep Embedding Features

Mar 15, 2017

Jie Zhu, Ying Shan, JC Mao, Dong Yu, Holakou Rahmanian, Yi Zhang

Abstract: Deep Neural Networks (DNN) have demonstrated superior ability to extract high-level embedding vectors from low-level features. Despite this success, serving time is still the bottleneck due to the expensive run-time computation of multiple layers of dense matrices. GPGPU, FPGA, or ASIC-based serving systems require additional hardware that is not in the mainstream design of most commercial applications. In contrast, tree or forest-based models are widely adopted because of their low serving cost, but they depend heavily on carefully engineered features. This work proposes a Deep Embedding Forest model that benefits from the best of both worlds. The model consists of a number of embedding layers and a forest/tree layer. The former map high-dimensional (hundreds of thousands to millions) and heterogeneous low-level features to lower-dimensional (thousands) vectors, and the latter ensures fast serving. Built on top of a representative DNN model called Deep Crossing and two forest/tree-based models, XGBoost and LightGBM, a two-step Deep Embedding Forest algorithm is demonstrated to achieve performance on par with or slightly better than that of its DNN counterpart, with only a fraction of the serving time on conventional hardware. After comparison with a joint optimization algorithm called partial fuzzification, also proposed in this paper, it is concluded that the two-step Deep Embedding Forest achieves near-optimal performance. Experiments based on large-scale data sets (up to 1 billion samples) from a major sponsored search engine prove the efficacy of the proposed model.

* 14 pages, 3 figures, 5 tables
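
The data flow of the two-step approach described in the abstract (embed first, then fit a tree model on the frozen embeddings) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the authors' implementation: the hand-written `table` stands in for embedding layers that the paper trains as part of Deep Crossing, averaging rows is an assumed pooling choice, and a single decision stump (`fit_stump`) stands in for an XGBoost or LightGBM forest. All names and the toy data are hypothetical.

```python
# Toy sketch of "embedding layers + tree layer" (all names are hypothetical).

def embed(feature_ids, table):
    """Step 1 stand-in: map sparse feature ids to one dense vector by
    averaging their rows in a fixed (pretend pre-trained) embedding table."""
    dim = len(table[0])
    out = [0.0] * dim
    for fid in feature_ids:
        for j in range(dim):
            out[j] += table[fid][j]
    return [v / len(feature_ids) for v in out]

def fit_stump(X, y):
    """Step 2 stand-in: exhaustive search for the single split
    (dimension, threshold, sign) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                preds = [1 if sign * (x[j] - t) > 0 else 0 for x in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    return best[1:]  # (dimension, threshold, sign)

def predict(stump, x):
    """Serving path: one comparison on the dense embedding vector."""
    j, t, sign = stump
    return 1 if sign * (x[j] - t) > 0 else 0

# Assumed embedding table: ids 0-2 point one way, ids 3-5 the other.
table = [[1.0, 0.2], [0.8, -0.1], [0.9, 0.3],
         [-1.0, 0.1], [-0.7, -0.2], [-0.9, 0.0]]
samples = [[0, 1], [1, 2], [0, 2], [3, 4], [4, 5], [3, 5]]
labels = [1, 1, 1, 0, 0, 0]

embeddings = [embed(s, table) for s in samples]  # step 1: embeddings are frozen
stump = fit_stump(embeddings, labels)            # step 2: tree fit on embeddings
```

At serving time only `embed` and `predict` run, which mirrors the abstract's point that the tree layer replaces the dense upper layers of the network. In practice the stump would be replaced by a gradient-boosted forest trained on the embedding outputs.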
