Abstract:This paper investigates data sampling strategies to create a benchmark for dialectal sentiment classification of Google Places reviews written in English. Based on location-based filtering, we collect a self-supervised dataset of reviews in Australian (Australian English), Indian (Indian English), and British (British English) English with self-supervised sentiment labels (1-star to 5-star). We employ sampling techniques based on label semantics, review length, and sentiment proportion and report performances on three fine-tuned BERT-based models. Our multi-dialect evaluation provides pointers to challenging scenarios for inner-circle (Australian English and British English) as well as non-native dialects (Indian English) of English, highlighting the need for more diverse benchmarks.