Abstract: This paper presents a multi-tiered human-in-the-loop framework for interactive school mapping, designed to improve the accuracy and completeness of educational facility records, particularly in developing regions where such data may be scarce and infrequently updated. The first tier applies a machine-learning analysis of population density, land cover, and existing infrastructure, compared against known school locations, to identify potential gaps and "mislabelled" schools. In subsequent tiers, medium-resolution satellite imagery (Sentinel-2) is investigated to pinpoint regions with a high likelihood of school presence, followed by the application of very high-resolution (VHR) imagery and deep learning models to generate detailed candidate school locations within these prioritised areas. The medium-resolution approach was later removed because it yielded insignificant improvements. Both the medium-resolution and VHR models build upon global pre-training steps to improve generalisation. A key component of the proposed approach is an interactive interface that allows human operators to iteratively review, validate, and refine the mapping results. Preliminary evaluations indicate that the multi-tiered strategy provides a scalable and cost-effective solution for educational infrastructure mapping to support planning and resource allocation.




Abstract: Internet connectivity in schools is critical to provide students with the digital literacy skills necessary to compete in modern economies. For governments to implement digital infrastructure development in schools effectively, accurate internet connectivity information is required. However, traditional survey-based methods can exceed the financial and capacity limits of governments. Open-source Earth Observation (EO) datasets have unlocked our ability to observe and understand socio-economic conditions on Earth from space and, in combination with Machine Learning (ML), can provide the tools to circumvent costly ground-based survey methods to support infrastructure development. In this paper, we present our work on school internet connectivity prediction using EO and ML. We detail the creation of our multi-modal, freely available satellite imagery and survey information dataset, leverage the latest geographically-aware location encoders, and introduce the first results of using the new European Space Agency phi-lab geographically-aware foundational model to predict internet connectivity in Botswana and Rwanda. We find that ML with EO and ground-based auxiliary data yields the best performance in both countries in terms of accuracy, F1 score, and false positive rate, and we highlight the challenges of internet connectivity prediction from space with a case study in Kigali, Rwanda. Our work showcases a practical approach to support data-driven digital infrastructure development in low-resource settings, leveraging freely available information, and provides cleaned and labelled datasets to the community for future studies through a unique collaboration between UNICEF and the European Space Agency phi-lab.




Abstract: This ongoing work attempts to understand and address the requirements of UNICEF, a leading organization working in children's welfare, which aims to tackle the problem of air quality for children at a global level. We are motivated by the lack of a proper model to account for heavily fluctuating air quality levels across the world in the wake of the COVID-19 pandemic, which has left public health professionals uncertain about the exact levels of children's exposure to air pollutants. We create an initial model according to the agency's requirements and generate insights through a combination of virtual meetups and online presentations. Our research team comprised UNICEF researchers and a group of volunteer data scientists. The presentations were delivered to a number of scientists and domain experts from UNICEF and to community champions working with open data. We highlight their feedback and possible avenues to develop this research further.