Abstract:There is a rumbling debate over the impact of gentrification: presumed gentrifiers have been the target of protests and attacks in some cities, while they have been welcome as generators of new jobs and taxes in others. Census data fails to measure neighborhood change in real-time since it is usually updated every ten years. This work shows that Airbnb data can be used to quantify and track neighborhood changes. Specifically, we consider both structured data (e.g. number of listings, number of reviews, listing information) and unstructured data (e.g. user-generated reviews processed with natural language processing and machine learning algorithms) for three major cities, New York City (US), Los Angeles (US), and Greater London (UK). We find that Airbnb data (especially its unstructured part) appears to nowcast neighborhood gentrification, measured as changes in housing affordability and demographics. Overall, our results suggest that user-generated data from online platforms can be used to create socioeconomic indices to complement traditional measures that are less granular, not in real-time, and more costly to obtain.
Abstract:Studying competition and market structure at the product level instead of brand level can provide firms with insights on cannibalization and product line optimization. However, it is computationally challenging to analyze product-level competition for the millions of products available on e-commerce platforms. We introduce Product2Vec, a method based on the representation learning algorithm Word2Vec, to study product-level competition, when the number of products is large. The proposed model takes shopping baskets as inputs and, for every product, generates a low-dimensional embedding that preserves important product information. In order for the product embeddings to be useful for firm strategic decision making, we leverage economic theories and causal inference to propose two modifications to Word2Vec. First of all, we create two measures, complementarity and exchangeability, that allow us to determine whether product pairs are complements or substitutes. Second, we combine these vectors with random utility-based choice models to forecast demand. To accurately estimate price elasticities, i.e., how demand responds to changes in price, we modify Word2Vec by removing the influence of price from the product vectors. We show that, compared with state-of-the-art models, our approach is faster, and can produce more accurate demand forecasts and price elasticities.