Abstract:Sound decision-making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human- and machine-generated forecasts. The system provides a platform where users can interact with machine models and thus anchor their judgments on an objective benchmark. The system also aggregates human and machine forecasts, weighting both for propinquity and based on assessed skill while adjusting for overconfidence. We present results from the Hybrid Forecasting Competition (HFC) - larger than comparable forecasting tournaments - including 1085 users forecasting 398 real-world forecasting problems over eight months. Our main result is that the hybrid system generated more accurate forecasts than a human-only baseline that had no machine-generated predictions. We found that skilled forecasters who had access to machine-generated forecasts outperformed those who only viewed historical data. We also demonstrated that the inclusion of machine-generated forecasts in our aggregation algorithms improved performance, both in terms of accuracy and scalability. This suggests that hybrid forecasting systems, which potentially require fewer human resources, can be a viable approach for maintaining a competitive level of accuracy over a larger number of forecasting questions.
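The aggregation idea in this abstract (weighting forecasts and adjusting for overconfidence) can be illustrated with a minimal sketch. It is not the SAGE algorithm: reading "propinquity" as recency, the exponential half-life, the skill weights, and the `temper` factor are all assumptions made for illustration.

```python
import math

def aggregate(forecasts, half_life_days=7.0, temper=0.8):
    """Combine probability forecasts into a single probability.

    forecasts: list of (probability, age_in_days, skill_weight) tuples.
    half_life_days and temper are hypothetical parameters, not values
    taken from the paper.
    """
    # Recency weight halves every `half_life_days`; skill scales it further.
    weights = [skill * 0.5 ** (age / half_life_days) for _, age, skill in forecasts]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one forecast needs a positive weight")
    mean_p = sum(w * p for w, (p, _, _) in zip(weights, forecasts)) / total
    # Clip so the log-odds transform stays finite.
    mean_p = min(max(mean_p, 1e-6), 1 - 1e-6)
    log_odds = math.log(mean_p / (1 - mean_p))
    # temper < 1 pulls the aggregate toward 0.5, i.e. makes it less confident.
    return 1 / (1 + math.exp(-temper * log_odds))

# Three forecasts: (probability, age in days, skill weight).
forecasts = [(0.9, 1, 2.0), (0.6, 10, 1.0), (0.7, 0, 0.5)]
print(round(aggregate(forecasts), 3))
```

Tempering in log-odds space keeps the result inside (0, 1) and leaves a forecast of exactly 0.5 unchanged, which is why it is used here instead of a linear shrink.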
Abstract:To extract essential information from complex data, computer scientists have been developing machine learning models that learn low-dimensional representations. From such advances in machine learning research, not only computer scientists but also social scientists have benefited and advanced their research because human behavior and social phenomena lie in complex data. To document this emerging trend, we survey the recent studies that apply word embedding techniques to human behavior mining, building a taxonomy to illustrate the methods and procedures used in the surveyed papers and highlighting the recent emerging trends applying word embedding models to non-textual human behavior data. We also conduct a simple experiment to warn that common similarity measurements used in the literature could yield different results even if they return consistent results at an aggregate level.
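The warning about similarity measurements can be made concrete with a toy case. The sketch below is not the survey's experiment: the vectors are made up, and cosine similarity and Euclidean distance are only assumed to be among the measures the survey means. It shows two common measures picking different nearest neighbors for the same query.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (ignores vector length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v (sensitive to vector length)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_by_cosine(query, candidates):
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))

def nearest_by_euclidean(query, candidates):
    return min(candidates, key=lambda name: euclidean_distance(query, candidates[name]))

# Hypothetical 2-d embeddings: "a" points the same way as the query but is long,
# "b" is short and points 45 degrees away.
query = (1.0, 0.0)
candidates = {"a": (10.0, 1.0), "b": (0.5, 0.5)}

print(nearest_by_cosine(query, candidates))     # a
print(nearest_by_euclidean(query, candidates))  # b
```

The disagreement comes from vector length: cosine similarity discards it, while Euclidean distance does not, so embeddings with very different norms can be ranked differently by the two measures.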
Abstract:With the recent development of technology, data on detailed human temporal behaviors has become available. Many methods have been proposed to mine those human dynamic behavior data and have revealed valuable insights for research and businesses. However, most methods analyze only sequences of actions and do not study the intertemporal information such as the time intervals between actions in a holistic manner. While actions and action time intervals are interdependent, it is challenging to integrate them because they have different natures: time and action. To overcome this challenge, we propose a unified method that analyzes user actions with intertemporal information (time interval). We simultaneously embed the user's action sequence and its time intervals to obtain a low-dimensional representation of the action along with intertemporal information. Using three real-world data sets, we demonstrate that the proposed method enables us to characterize user actions in terms of temporal context. The results show that explicit modeling of action sequences and intertemporal user behavior information enables successful interpretable analysis.
Abstract:In display ad auctions of Real-Time Bidding (RTB), a typical Demand-Side Platform (DSP) bids based on the predicted probability of click and conversion right after an ad impression. Recent studies find such a strategy is suboptimal and propose a better bidding strategy named lift-based bidding. Lift-based bidding simply bids the price according to the lift effect of the ad impression and achieves maximization of target metrics such as sales. Despite its superiority, lift-based bidding has not yet been widely accepted in the advertising industry. First, lift-based bidding is less profitable for DSP providers under the current billing rule. Second, the practical usefulness of lift-based bidding is not widely understood in the online advertising industry due to the lack of a comprehensive investigation of its impact. We here propose a practically-implementable lift-based bidding system that perfectly fits the current billing rules. We conduct extensive experiments using a real-world advertising campaign and examine the performance under various settings. We find that lift-based bidding, especially unbiased lift-based bidding, is most profitable for both DSP providers and advertisers. Our ablation study highlights that lift-based bidding has a good property for currently dominant first price auctions. The results will motivate the online
Abstract:Online advertisements have become one of today's most widely used tools for enhancing businesses, partly because of their compatibility with A/B testing. A/B testing allows sellers to find effective advertisement strategies such as ad creatives or segmentations. Even though several studies propose techniques to maximize the effect of an advertisement, there is insufficient understanding of the offline shopping behavior of customers invited by online advertisements. Herein, we study the difference in offline behavior between customers who received online advertisements and regular customers (i.e., customers who visit the target shop voluntarily), and the duration of this difference. We analyzed approximately three thousand users' offline behavior with their 23.5 million location records through 31 A/B tests. We first demonstrate the externality that customers with advertisements traverse larger areas than those without advertisements, and this spatial difference lasts several days after their shopping day. We then find a long-run effect of this externality of advertising: a certain portion of the customers invited to the offline shops revisit these shops. Finally, based on these findings on the revisit effect, we utilize a causal machine learning model to propose a marketing strategy to maximize the revisit ratio. Our results suggest that advertisements draw customers who have different behavior traits from regular customers. This study's findings demonstrate that a simple analysis may underrate the effects of advertisements on businesses, and an analysis considering externality can attract potentially valuable customers.
Abstract:Crowdwork often entails tackling cognitively demanding and time-consuming tasks. Crowdsourcing can be used for complex annotation tasks, from medical imaging to geospatial data, and such data powers sensitive applications, such as health diagnostics or autonomous driving. However, the existence and prevalence of underperforming crowdworkers are well recognized and can pose a threat to the validity of crowdsourcing. In this study, we propose the use of a computational framework to identify clusters of underperforming workers using clickstream trajectories. We focus on crowdsourced geopolitical forecasting. The framework can reveal different types of underperformers, such as workers with forecasts whose accuracy is far from the consensus of the crowd, those who provide low-quality explanations for their forecasts, and those who simply copy-paste their forecasts from other users. Our study suggests that clickstream clustering and analysis are fundamental tools to diagnose the performance of crowdworkers in platforms leveraging the wisdom of crowds.
Abstract:Conventional bidding strategies for online display ad auctions rely heavily on observed performance indicators such as clicks or conversions. A bidding strategy naively pursuing these easily observable metrics, however, fails to optimize the profitability of the advertisers. Rather, the bidding strategy that leads to the maximum revenue is a strategy pursuing the performance lift of showing ads to a specific user. Therefore, it is essential to predict the lift-effect of showing ads to each user on their target variables from observed log data. However, there is a difficulty in predicting the lift-effect, as the training data gathered by a past bidding strategy may have a strong bias towards the winning impressions. In this study, we develop the Unbiased Lift-based Bidding System, which maximizes the advertisers' profit by accurately predicting the lift-effect from biased log data. Our system is the first to enable a high-performing lift-based bidding strategy by theoretically alleviating the inherent bias in the log. Real-world, large-scale A/B testing successfully demonstrates the superiority and practicability of the proposed system.
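The contrast between bidding on observed conversions and bidding on lift can be sketched as follows. This is an illustration under stated assumptions, not the proposed system: the function names, the linear bid rule, the conversion value, and the probabilities are hypothetical, and the bias correction described in the abstract is not modeled.

```python
def conversion_based_bid(p_conv_with_ad, value_per_conversion):
    """Bid in proportion to the predicted conversion probability after an ad."""
    return value_per_conversion * p_conv_with_ad

def lift_based_bid(p_conv_with_ad, p_conv_without_ad, value_per_conversion):
    """Bid in proportion to the lift: the change in conversion probability
    caused by showing the ad. Negative lift is floored at zero."""
    lift = p_conv_with_ad - p_conv_without_ad
    return value_per_conversion * max(lift, 0.0)

# Hypothetical users: (P(conversion | ad shown), P(conversion | no ad)).
users = {
    "likely_buyer_anyway": (0.10, 0.09),  # converts often, but the ad adds little
    "persuadable": (0.05, 0.01),          # converts less, but mostly because of the ad
}
value = 100.0  # assumed value of one conversion to the advertiser

for name, (p_with, p_without) in users.items():
    print(name,
          round(conversion_based_bid(p_with, value), 2),
          round(lift_based_bid(p_with, p_without, value), 2))
```

In this toy case the conversion-based rule bids more for the user who would probably convert anyway, while the lift-based rule bids more for the user whose behavior the ad actually changes.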
Abstract:Understanding consumer behavior is an important task, not only for developing marketing strategies but also for the management of economic policies. Detecting consumption patterns, however, is a high-dimensional problem in which various factors that would affect consumers' behavior need to be considered, such as consumers' demographics, circadian rhythm, seasonal cycles, etc. Here, we develop a method to extract multi-timescale expenditure patterns of consumers from a large dataset of scanned receipts. We use a non-negative tensor factorization (NTF) to detect intra- and inter-week consumption patterns at one time. The proposed method allows us to characterize consumers based on their consumption patterns that are correlated over different timescales.
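For reference, a non-negative factorization of a three-way tensor is commonly written in the CP (CANDECOMP/PARAFAC) form below. The choice of modes (consumer c, day of week d, week w) and the symbols are assumptions for illustration; the abstract does not state the tensor layout or the exact factorization used.

```latex
% Rank-R non-negative CP model of an expenditure tensor X
% (assumed modes: consumer c, day of week d, week w).
X_{cdw} \approx \sum_{r=1}^{R} a_{cr}\, b_{dr}\, g_{wr},
\qquad a_{cr} \ge 0,\quad b_{dr} \ge 0,\quad g_{wr} \ge 0 .
```

Under this reading, each component r pairs an intra-week profile (the column b_r) with an inter-week profile (the column g_r), and a_{cr} measures how strongly consumer c follows that pattern.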