Abstract: Objective: To develop machine learning models that can predict the number of COVID-19 cases per day given the last 14 days of environmental and mobility data. Approach: COVID-19 data from four counties around Toronto, Ontario, were used. Data were prepared into daily records containing the number of new COVID-19 cases, patient demographic data, outdoor weather variables, indoor environment factors, and human movement based on cell mobility and public health restrictions. These data were analyzed to determine the most important variables and their interactions. Predictive models were developed using convolutional neural network (CNN) and long short-term memory (LSTM) deep neural network approaches. Models were developed with 5-fold chronological cross-validation on data from Mar 1 to Oct 14, 2020, and tested on data covering Oct 15 to Dec 24, 2020. Results: On independent test data, the best LSTM models forecasted tomorrow's daily COVID-19 case counts with 90.7% accuracy, and the 7-day rolling average case counts with 98.1% accuracy. The best models to forecast the next 7 days of daily COVID-19 case counts did so with 79.4% accuracy over all days. Models forecasting the 7-day rolling average case counts had a mean accuracy of 83.6% on the same test set. Conclusions: Our findings point to the importance of indoor humidity for the transmission of a virus such as COVID-19. During the coldest portions of the year, when humans spend greater amounts of time indoors or in vehicles, air quality drops within buildings, most notably indoor relative humidity (IRH). Moderate to high indoor temperatures coupled with low IRH (below 20%) create conditions where viral transmission is more likely, both because water vapour ejected from an infected person's mouth can remain in the air longer owing to evaporation, and because dry skin conditions, particularly in a recipient's airway, promote transmission.
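To make the data setup concrete, the following is a minimal sketch of how 14-day input windows and a 5-fold chronological split might be constructed. It is an illustration under stated assumptions, not the authors' pipeline: the function names `make_windows` and `chronological_folds` are hypothetical, the data are synthetic, the feature count is arbitrary, and "chronological cross-validation" is assumed here to mean an expanding-window scheme in which each validation block lies strictly after its training block.

```python
import numpy as np

def make_windows(features, targets, lookback=14):
    """Pair each day's target with the preceding `lookback` days of features."""
    X, y = [], []
    for t in range(lookback, len(features)):
        X.append(features[t - lookback:t])  # days t-14 .. t-1
        y.append(targets[t])                # case count on day t
    return np.asarray(X), np.asarray(y)

def chronological_folds(n_samples, n_folds=5):
    """Yield (train_idx, val_idx) with validation always later than training.

    Assumption: an expanding training window; the abstract does not say
    which chronological scheme was actually used.
    """
    edges = np.linspace(0, n_samples, n_folds + 2, dtype=int)
    for k in range(1, n_folds + 1):
        yield np.arange(0, edges[k]), np.arange(edges[k], edges[k + 1])

# Synthetic stand-in data: 228 days (the length of Mar 1 to Oct 14, 2020)
# and 6 placeholder features. None of these values come from the study.
rng = np.random.default_rng(0)
n_days, n_features = 228, 6
features = rng.normal(size=(n_days, n_features))
targets = rng.poisson(lam=50, size=n_days).astype(float)

X, y = make_windows(features, targets)
folds = list(chronological_folds(len(X)))

for i, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {i}: train={len(train_idx)} samples, val={len(val_idx)} samples")
```

Each window in `X` has shape `(14, n_features)`, which suits either a CNN or an LSTM input layer. Keeping validation blocks strictly later than training blocks avoids leaking future information into model selection, which a shuffled k-fold split would not guarantee for time series.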