Engineer Bainomugisha

Sunflower: A New Approach To Expanding Coverage of African Languages in Large Language Models

Oct 08, 2025

Benjamin Akera, Evelyn Nafula Ouma, Gilbert Yiga, Patrick Walukagga, Phionah Natukunda, Trevor Saaka, Solomon Nsumba, Lilian Teddy Nabukeera, Joel Muhanguzi, Imran Sekalala(+4 more)

Abstract: There are more than 2000 living languages in Africa, most of which have been bypassed by advances in language technology. Current leading LLMs exhibit strong performance on a number of the most common languages (e.g. Swahili or Yoruba), but prioritise support for the languages with the most speakers first, resulting in piecemeal ability across disparate languages. We contend that a regionally focussed approach is more efficient, and present a case study for Uganda, a country with high linguistic diversity. We describe the development of Sunflower 14B and 32B, a pair of models based on Qwen 3 with state-of-the-art comprehension in the majority of Ugandan languages. These models are open source and can be used to reduce language barriers in a number of important practical applications.

Gaussian Processes for Monitoring Air-Quality in Kampala

Nov 28, 2023

Clara Stoddart, Lauren Shrack, Richard Sserunjogi, Usman Abdul-Ganiy, Engineer Bainomugisha, Deo Okure, Ruth Misener, Jose Pablo Folch, Ruby Sedgwick

Abstract: Monitoring air pollution is of vital importance to the overall health of the population. Unfortunately, devices that can measure air quality can be expensive, and many cities in low- and middle-income countries have to rely on a sparse allocation of them. In this paper, we investigate the use of Gaussian Processes both for nowcasting current air pollution in places where there are no sensors and for forecasting future air pollution at the sensor locations. In particular, we focus on the city of Kampala in Uganda, using data from AirQo's network of sensors. We demonstrate the advantage of removing outliers, and compare different kernel functions and additional inputs. We also compare two sparse approximations to allow for the large amounts of temporal data in the dataset.
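
To make the nowcasting idea concrete, the following is a minimal sketch of exact Gaussian process regression that predicts a value at an unmonitored location from nearby readings. It is not the paper's model: the squared-exponential kernel, the hyperparameter values, the function names (`rbf_kernel`, `gp_predict`) and the synthetic sensor data are all assumptions made for illustration, and the sparse approximations mentioned in the abstract are not shown.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_predict(x_train, y_train, x_test, noise_var=0.1, lengthscale=1.0, variance=1.0):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale, variance)
    k_ts = rbf_kernel(x_train, x_test, lengthscale, variance)
    k_ss = rbf_kernel(x_test, x_test, lengthscale, variance)
    # Cholesky factor of the noisy training covariance.
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical sensor coordinates and readings (synthetic, not AirQo data).
    sensor_xy = rng.uniform(0.0, 10.0, size=(20, 2))
    readings = np.sin(sensor_xy[:, 0]) + 0.1 * rng.normal(size=20)
    query_xy = np.array([[5.0, 5.0]])  # a location with no sensor
    offset = readings.mean()           # centre the data for the zero-mean GP
    mean, var = gp_predict(sensor_xy, readings - offset, query_xy, lengthscale=2.0)
    print(mean + offset, var)
```

Forecasting at a sensor location would follow the same pattern with time as an input instead of (or alongside) the spatial coordinates.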

Modelling calibration uncertainty in networks of environmental sensors

May 09, 2022

Michael Thomas Smith, Magnus Ross, Joel Ssematimba, Pablo A. Alvarado, Mauricio Alvarez, Engineer Bainomugisha, Richard Wilkinson

Abstract: Networks of low-cost sensors are becoming ubiquitous, but often suffer from poor accuracy and drift. Regular colocation with reference sensors allows recalibration but is complicated and expensive. Alternatively, the calibration can be transferred using low-cost, mobile sensors. However, inferring the calibration (with uncertainty) becomes difficult. We propose a variational approach to model the calibration across the network. We demonstrate the approach on synthetic and real air pollution data, and find it can perform better than the state of the art (multi-hop calibration). We extend it to categorical data produced by citizen-scientist labelling. In summary, the method achieves uncertainty-quantified calibration, the lack of which has been one of the barriers to low-cost sensor deployment and citizen-science research.

* 31 pages (23 pages of content, 4 pages of references, 4 supplementary). 11 figures. 4 tables. Submitted to Journal of the Royal Statistical Society. Series C

Adjoint-aided inference of Gaussian process driven differential equations

Feb 09, 2022

Paterne Gahungu, Christopher W Lanyon, Mauricio A Alvarez, Engineer Bainomugisha, Michael Smith, Richard D. Wilkinson

Abstract: Linear systems occur throughout engineering and the sciences, most notably as differential equations. In many cases the forcing function for the system is unknown, and interest lies in using noisy observations of the system to infer the forcing, as well as other unknown parameters. In differential equations, the forcing function is an unknown function of the independent variables (typically time and space), and can be modelled as a Gaussian process (GP). In this paper we show how the adjoint of a linear system can be used to efficiently infer forcing functions modelled as GPs, after using a truncated basis expansion of the GP kernel. We show how exact conjugate Bayesian inference for the truncated GP can be achieved, in many cases with substantially lower computation than would be required using MCMC methods. We demonstrate the approach on systems of both ordinary and partial differential equations and, by testing on synthetic data, show that the basis expansion approach approximates the true forcing well with a modest number of basis vectors. Finally, we show how to infer point estimates for the non-linear model parameters, such as the kernel length-scales, using Bayesian optimisation.
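
The combination of a truncated basis expansion with conjugate inference can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: it uses random Fourier features as one possible basis for a squared-exponential kernel, places a standard normal prior on the basis weights, and observes the function directly with Gaussian noise. The adjoint step, which in the paper would supply the matrix linking the weights to observations of the differential equation's solution, is omitted; `phi` below simply stands in for that matrix. All function names are hypothetical.

```python
import numpy as np

def make_fourier_features(num_features, input_dim, lengthscale, rng):
    """Return a function mapping inputs to random Fourier features.

    The features approximate a squared-exponential kernel with the
    given lengthscale (an assumed choice of truncated basis).
    """
    freqs = rng.normal(0.0, 1.0 / lengthscale, size=(num_features, input_dim))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def features(x):
        return np.sqrt(2.0 / num_features) * np.cos(x @ freqs.T + phases)

    return features

def conjugate_posterior(phi, y, noise_var):
    """Gaussian posterior over weights w for y = phi @ w + noise, prior w ~ N(0, I)."""
    precision = np.eye(phi.shape[1]) + phi.T @ phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ phi.T @ y / noise_var
    return mean, cov

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic noisy observations of an unknown function of time.
    t = np.linspace(0.0, 5.0, 40)[:, None]
    y = np.sin(2.0 * t[:, 0]) + 0.1 * rng.normal(size=40)
    features = make_fourier_features(50, 1, lengthscale=0.5, rng=rng)
    w_mean, w_cov = conjugate_posterior(features(t), y, noise_var=0.01)
    fit = features(t) @ w_mean  # posterior mean of the function at the inputs
    print(np.sqrt(np.mean((fit - y) ** 2)))
```

Because the model is linear in the weights and the prior and noise are Gaussian, the posterior is available in closed form, which is the reason no MCMC sampling is needed in this simplified setting.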

Machine Learning for a Low-cost Air Pollution Network

Nov 28, 2019

Michael T. Smith, Joel Ssematimba, Mauricio A. Alvarez, Engineer Bainomugisha

Abstract: Data collection in economically constrained countries often necessitates using approximate and biased measurements due to the low cost of the sensors used. This leads to potentially invalid predictions and poor policies or decision-making. This is especially an issue if methods from resource-rich regions are applied without handling these additional constraints. In this paper we show, through the use of an air pollution network example, how using probabilistic machine learning can mitigate some of the technical constraints. Specifically, we experiment with modelling the calibration for individual sensors as either distributions or Gaussian processes over time, and discuss the wider issues around the decision process.

* Presented at NeurIPS 2019 Workshop on Machine Learning for the Developing World
