Abstract: A recommender selects and presents the top-K items to the user at each online request, and a recommendation session consists of several sequential requests. Formulating a recommendation session as a Markov decision process and solving it with a reinforcement learning (RL) framework has attracted increasing attention from both the academic and industrial communities. In this paper, we propose an RL-based industrial short-video recommender ranking framework, which models and maximizes user watch-time in an environment of multi-aspect user preferences through a collaborative multi-agent formulation. Moreover, our framework adopts a model-based learning approach to alleviate sample selection bias, a crucial but intractable problem in industrial recommender systems. Extensive offline evaluations and live experiments confirm the effectiveness of our method over alternatives. Our approach has been deployed on our real large-scale short-video sharing platform, successfully serving hundreds of millions of users.
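As a rough illustration of the session formulation described in this abstract, the sketch below treats each request as one step of a Markov decision process whose reward is the watch time of the presented top-K list. The function names, the discount factor `gamma`, and the use of a plain discounted sum are assumptions made for illustration; the abstract does not specify the reward design or the multi-agent details.

```python
from typing import List


def top_k(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring items, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def session_return(watch_times: List[float], gamma: float = 0.9) -> float:
    """Discounted sum of per-request watch time over one session.

    watch_times[t] is the (hypothetical) total watch time observed after
    the t-th request; gamma is an assumed discount factor.
    """
    return sum((gamma ** t) * r for t, r in enumerate(watch_times))


# One request: rank three candidate videos and present the top two.
print(top_k([0.1, 0.9, 0.5], k=2))

# One session of two requests with 10 s and 20 s of watch time.
print(session_return([10.0, 20.0], gamma=0.5))
```

An RL ranker in this setting would be trained to choose top-K lists that maximize this kind of session-level return, not the watch time of a single request.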
Abstract: Identifying the arrival times of seismic P-phases plays a significant role in real-time seismic monitoring, which provides critical guidance for emergency response activities. While considerable research has been conducted on this topic, efficiently capturing the arrival times of seismic P-phases hidden within densely distributed and noisy seismic waves, such as those generated by the aftershocks of destructive earthquakes, remains a real challenge because existing methods rely on laborious expert supervision. To this end, we present a machine learning-enhanced framework, ML-Picker, for the automatic identification of seismic P-phase arrivals on continuous and massive waveforms. More specifically, ML-Picker consists of three modules, namely Trigger, Classifier, and Refiner, and an ensemble learning strategy is exploited to integrate several machine learning classifiers. An evaluation on the aftershocks following the $M8.0$ Wenchuan earthquake demonstrates that ML-Picker not only achieves the best identification performance but also identifies 120% more seismic P-phase arrivals as complementary data. The experimental results also reveal both the applicability of different machine learning models to waveforms collected from different seismic stations and the regularities of seismic P-phase arrivals that might be neglected during manual inspection. These findings validate the effectiveness, efficiency, flexibility, and stability of ML-Picker. In particular, with the preliminary version of ML-Picker, we won the championship in the First Season and were the runner-up in the Finals of the 2017 International Aftershock Detection Contest hosted by the China Earthquake Administration, in which 1,143 teams from around the world participated.
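The three-module structure named in this abstract can be sketched as follows, assuming the Trigger yields candidate windows, the Classifier stage combines several classifiers by simple majority vote, and the Refiner adjusts each accepted pick. The voting rule, the toy threshold classifiers, and the peak-amplitude refiner are all hypothetical stand-ins; the abstract does not state which ensemble strategy or models ML-Picker uses.

```python
from typing import Callable, List, Sequence, Tuple

Window = Sequence[float]
Classifier = Callable[[Window], bool]
Candidate = Tuple[int, Window]  # (start sample index, waveform window)


def majority_vote(classifiers: List[Classifier], window: Window) -> bool:
    """True when more than half of the classifiers accept the window."""
    votes = sum(1 for clf in classifiers if clf(window))
    return votes * 2 > len(classifiers)


def pick_arrivals(
    candidates: List[Candidate],
    classifiers: List[Classifier],
    refine: Callable[[int, Window], int],
) -> List[int]:
    """Keep candidates the ensemble accepts, then refine each pick."""
    return [
        refine(start, window)
        for start, window in candidates
        if majority_vote(classifiers, window)
    ]


# Hypothetical stand-ins for trained classifiers: simple amplitude rules.
toy_classifiers: List[Classifier] = [
    lambda w: max(abs(x) for x in w) > 1.0,
    lambda w: sum(abs(x) for x in w) / len(w) > 0.5,
    lambda w: abs(w[-1]) > 0.3,
]


def toy_refine(start: int, window: Window) -> int:
    """Hypothetical refiner: move the pick to the largest-amplitude sample."""
    peak = max(range(len(window)), key=lambda i: abs(window[i]))
    return start + peak


# Hypothetical Trigger output: two candidate windows.
toy_candidates: List[Candidate] = [
    (100, [0.0, 0.1, 2.0, 0.5]),
    (200, [0.0, 0.1, 0.0, 0.1]),
]

print(pick_arrivals(toy_candidates, toy_classifiers, toy_refine))
```

Keeping the stages as separate callables means a different classifier set could be plugged in per seismic station without changing the surrounding pipeline.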