Abstract: Calibrating low-cost sensors with machine learning techniques is now a widely used methodology. Although many challenges remain in deploying low-cost sensors for air quality monitoring, these sensors have been shown to be useful in conjunction with high-precision instrumentation. Consequently, most research focuses on applying different machine learning calibration techniques. Nevertheless, the successful application of these models depends on the quality of the data obtained by the sensors, and very little attention has been paid to the whole data gathering process, from sensor sampling and data pre-processing to the calibration of the sensor itself. In this article, we present the main sensor sampling parameters and their impact on both the quality of the resulting machine learning-based sensor calibration and energy consumption, thus exposing the existing trade-offs. Finally, results on an experimental node show the impact of the data sampling strategy on the calibration of tropospheric ozone, nitrogen dioxide and nitrogen monoxide low-cost sensors. Specifically, we show how a sampling strategy that minimizes the duty cycle of the sensing subsystem can reduce power consumption while maintaining data quality.