Abstract:In online advertising, our aim is to match advertisers with the most relevant users to optimize campaign performance. To achieve this goal, multiple data sources provided by the advertisers or by third-party data providers are used to choose the set of users according to the advertisers' targeting criteria. In this paper, we present a framework that can be applied to assess the quality of such data sources at large scale. This framework efficiently evaluates the similarity of a specific data source's categorization to that of the ground truth, especially in cases where the ground truth is accessible only in aggregate and user-level information is anonymized or unavailable for privacy reasons. We propose multiple methodologies within this framework, present some preliminary assessment results, and evaluate how the methodologies compare to each other. We also present two use cases for the data quality assessment results: the first is targeting specific user categories, and the second is forecasting the desirable audiences we can reach for an online advertising campaign with pre-set targeting criteria.
Abstract:Traditional pairwise sequence alignment is based on matching individual samples from two sequences under time monotonicity constraints. However, in many application settings, matching subsequences (segments) instead of individual samples may provide additional robustness to noise or local non-causal perturbations. This paper presents an approach to segmental sequence alignment that jointly segments and aligns two sequences, generalizing the traditional per-sample alignment. To accomplish this task, we introduce a distance metric between segments based on average pairwise distances and then present a modified pair-HMM (PHMM) that incorporates the proposed distance metric to solve the joint segmentation and alignment task. We also propose a relaxation of our model that improves the computational efficiency of the generic segmental PHMM. Our results demonstrate that this new measure of sequence similarity can lead to improved classification performance, while being resilient to noise, on a variety of sequence retrieval problems, from EEG to motion sequence classification.
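As a rough illustration of a segment distance "based on average pairwise distances", the sketch below averages a per-sample distance over every cross pair of samples drawn from two segments. The abstract does not give the exact formulation, so the Euclidean per-sample distance and the function names `sample_distance` and `segment_distance` are assumptions made for this example; the pair-HMM that performs the joint segmentation and alignment is not shown.

```python
import math
from typing import Sequence

Sample = Sequence[float]


def sample_distance(a: Sample, b: Sample) -> float:
    """Per-sample distance. Euclidean is an assumption, not the paper's stated choice."""
    return math.dist(a, b)


def segment_distance(seg_x: Sequence[Sample], seg_y: Sequence[Sample]) -> float:
    """Average of the per-sample distances over all cross pairs of two segments."""
    if not seg_x or not seg_y:
        raise ValueError("both segments must contain at least one sample")
    total = sum(sample_distance(a, b) for a in seg_x for b in seg_y)
    return total / (len(seg_x) * len(seg_y))


if __name__ == "__main__":
    # Two short 1-D segments; a single noisy sample shifts the average only slightly.
    clean = [(0.0,), (0.1,), (0.2,)]
    noisy = [(0.0,), (0.1,), (1.5,)]
    reference = [(0.0,), (0.1,)]
    print(segment_distance(clean, reference))
    print(segment_distance(noisy, reference))
```

Because every sample in one segment is compared with every sample in the other, a single outlier contributes only a fraction of the total, which is one plausible reading of the robustness to noise that the abstract describes.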
Abstract:Online media offers marketers opportunities to deliver brand messages to a large audience. Advertising technology platforms enable advertisers to find the right audience and deliver ad impressions to them in real time. The recent growth of real-time bidding has made monitoring such a complicated system a significant challenge. With so many components, we need a reliable system that detects possible changes in the system and alerts the engineering team. In this paper, we describe the mechanism that we invented for recovering representative metrics and detecting changes in their behavior. By describing several incident cases, we show that this mechanism is able to detect possible problems in time.
Abstract:Online media gives marketers opportunities to deliver effective brand messages to a wide range of audiences. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. To identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily on bid prediction and optimization models. Even though bid prediction models are well studied in the literature, the equally important subject of model evaluation is usually overlooked. Effective and reliable evaluation of an online bidding model is crucial for making faster model improvements as well as for using marketing budgets more efficiently. In this paper, we present an experimentation framework for bid prediction models, focusing on the practical aspects of model evaluation. Specifically, we outline the unique challenges we encounter in our platform due to a variety of factors such as heterogeneous goal definitions, varying budget requirements across different campaigns, high seasonality, and the auction-based environment for inventory purchasing. Then, we introduce return on investment (ROI) as a unified model performance (i.e., success) metric and explain its merits over more traditional metrics such as click-through rate (CTR) or conversion rate (CVR). Most importantly, we discuss commonly used evaluation and metric summarization approaches in detail and propose a more accurate method for online evaluation of new experimental models against the baseline. Our meta-analysis-based approach addresses various shortcomings of other methods and yields statistically robust results that allow us to conclude experiments more quickly and reliably. We demonstrate the effectiveness of our evaluation strategy through experiments on real campaign data.
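To make the idea of a meta-analysis-based summary concrete, the sketch below pools per-campaign effect estimates with standard fixed-effect inverse-variance weighting. The abstract does not say which meta-analysis model the authors use, so this is a textbook technique shown for illustration only. The function name `pool_fixed_effect` and the choice of input (for example, a per-campaign log ratio of experimental ROI to baseline ROI, with its standard error) are assumptions.

```python
import math
from typing import Sequence, Tuple


def pool_fixed_effect(
    effects: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance pooling of per-campaign effect estimates.

    effects:    one effect estimate per campaign (hypothetical input, e.g. the
                log ratio of experimental ROI to baseline ROI).
    std_errors: the standard error of each estimate; must be positive.
    Returns (pooled effect, pooled standard error, z statistic).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect, and at least one effect")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    # Precise campaigns (small standard error) receive proportionally more weight.
    weights = [1.0 / (se * se) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, pooled / pooled_se


if __name__ == "__main__":
    # Three hypothetical campaigns with different effect sizes and precision.
    pooled, se, z = pool_fixed_effect([0.10, 0.30, -0.05], [0.10, 0.20, 0.15])
    print(f"pooled effect={pooled:.4f} se={se:.4f} z={z:.2f}")
```

Weighting by inverse variance lets campaigns with very different budgets and traffic volumes contribute in proportion to how precisely their effect is measured, rather than being averaged as equals. A random-effects model would be a natural alternative when campaigns are expected to differ in their true effect.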