Abstract:Internet connectivity in schools is critical to provide students with the digital literary skills necessary to compete in modern economies. In order for governments to effectively implement digital infrastructure development in schools, accurate internet connectivity information is required. However, traditional survey-based methods can exceed the financial and capacity limits of governments. Open-source Earth Observation (EO) datasets have unlocked our ability to observe and understand socio-economic conditions on Earth from space, and in combination with Machine Learning (ML), can provide the tools to circumvent costly ground-based survey methods to support infrastructure development. In this paper, we present our work on school internet connectivity prediction using EO and ML. We detail the creation of our multi-modal, freely-available satellite imagery and survey information dataset, leverage the latest geographically-aware location encoders, and introduce the first results of using the new European Space Agency phi-lab geographically-aware foundational model to predict internet connectivity in Botswana and Rwanda. We find that ML with EO and ground-based auxiliary data yields the best performance in both countries, for accuracy, F1 score, and False Positive rates, and highlight the challenges of internet connectivity prediction from space with a case study in Kigali, Rwanda. Our work showcases a practical approach to support data-driven digital infrastructure development in low-resource settings, leveraging freely available information, and provide cleaned and labelled datasets for future studies to the community through a unique collaboration between UNICEF and the European Space Agency phi-lab.
Abstract:This ongoing work attempts to understand and address the requirements of UNICEF, a leading organization working in children's welfare, where they aim to tackle the problem of air quality for children at a global level. We are motivated by the lack of a proper model to account for heavily fluctuating air quality levels across the world in the wake of the COVID-19 pandemic, leading to uncertainty among public health professionals on the exact levels of children's exposure to air pollutants. We create an initial model as per the agency's requirement to generate insights through a combination of virtual meetups and online presentations. Our research team comprised of UNICEF's researchers and a group of volunteer data scientists. The presentations were delivered to a number of scientists and domain experts from UNICEF and community champions working with open data. We highlight their feedback and possible avenues to develop this research further.