Abstract: Identifying mutations of SARS-CoV-2 strains that are associated with phenotypic changes is critical for pandemic prediction and prevention. We compared an explainable convolutional neural network (CNN) with a traditional genome-wide association study (GWAS) for identifying mutations associated with the WHO labels of SARS-CoV-2, which serve as a proxy for virulence phenotypes. We trained a CNN model that classifies genomic sequences into Variants of Concern (VOCs), and then applied SHapley Additive exPlanations (SHAP) to identify the mutations that are important for correct predictions. For comparison, we performed a traditional GWAS to identify mutations associated with VOCs. The comparison shows that the explainable neural network approach more effectively reveals known nucleotide substitutions associated with VOCs, such as those in the spike gene region. Our results suggest that explainable neural networks for genomic sequences offer a promising alternative to traditional genome-wide analysis approaches.
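To make the GWAS side of the comparison concrete, the following is a minimal sketch of a per-site association test, not the authors' pipeline. It assumes aligned sequences of equal length and a binary VOC label per sequence. It treats the most common nucleotide at each position as the reference allele and scores each position with a plain 2x2 chi-square statistic. The function name `chi_square_per_site` and the toy data are hypothetical.

```python
from collections import Counter


def chi_square_per_site(sequences, labels):
    """Score each alignment position for association with a binary label.

    sequences: list of equal-length nucleotide strings (assumed pre-aligned).
    labels:    list of 0/1 flags, 1 meaning the sequence is labeled a VOC.
    Returns one chi-square statistic (1 degree of freedom) per position.
    """
    if len(sequences) != len(labels):
        raise ValueError("sequences and labels must have the same length")
    n = len(sequences)
    length = len(sequences[0])
    scores = []
    for pos in range(length):
        column = [seq[pos] for seq in sequences]
        # Assumption: the most common nucleotide acts as the reference allele.
        reference = Counter(column).most_common(1)[0][0]
        # 2x2 contingency table: mutated or not versus VOC or not.
        a = sum(1 for nt, y in zip(column, labels) if nt != reference and y == 1)
        b = sum(1 for nt, y in zip(column, labels) if nt != reference and y == 0)
        c = sum(1 for nt, y in zip(column, labels) if nt == reference and y == 1)
        d = sum(1 for nt, y in zip(column, labels) if nt == reference and y == 0)
        margins = (a + b) * (c + d) * (a + c) * (b + d)
        if margins == 0:
            # No variation at this site (or only one class present): no signal.
            scores.append(0.0)
        else:
            scores.append(n * (a * d - b * c) ** 2 / margins)
    return scores


if __name__ == "__main__":
    # Hypothetical toy alignment: position 1 separates the two groups.
    toy_sequences = ["ACGT", "ACGT", "ATGT", "ATGT"]
    toy_labels = [0, 0, 1, 1]
    for position, score in enumerate(chi_square_per_site(toy_sequences, toy_labels)):
        print(position, score)
```

A real GWAS would additionally correct for population structure and multiple testing. This sketch only illustrates the per-site contingency test that such an analysis builds on.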
Abstract: COVID-19-related policies were extensively politicized during the 2020 election year in the United States, resulting in polarized viewpoints. Twitter users were particularly engaged during that year. Here we investigated whether COVID-19-related tweets were associated with the overall election results at the state level during the period leading up to election day. We observed weak correlations between the average sentiment of COVID-19-related tweets and the popular vote in two-week intervals, and the trends gradually reversed. We then compared the average sentiment of COVID-19-related tweets between states called in favor of the Republican party (red states) and those called in favor of the Democratic party (blue states). We found that at the beginning of the lockdowns, sentiment in the blue states was much more positive than in the red states. However, sentiment in the red states gradually became more positive during the summer of 2020, and this shift persisted until election day.
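The state-level correlation described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' analysis: it assumes each tweet has already been assigned a numeric sentiment score and a state, and it uses the Pearson coefficient as the correlation measure. The function names, state codes, and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def mean_sentiment_by_state(tweets):
    """Average the sentiment scores per state.

    tweets: iterable of (state, sentiment_score) pairs.
    """
    by_state = {}
    for state, score in tweets:
        by_state.setdefault(state, []).append(score)
    return {state: mean(scores) for state, scores in by_state.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; report 0.0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical tweets from one two-week interval.
    tweets = [("AA", 0.2), ("AA", 0.4), ("BB", -0.1), ("BB", 0.1), ("CC", 0.5)]
    # Hypothetical vote share for one party in each state.
    vote_share = {"AA": 0.48, "BB": 0.55, "CC": 0.41}

    sentiment = mean_sentiment_by_state(tweets)
    states = sorted(sentiment)
    r = pearson([sentiment[s] for s in states], [vote_share[s] for s in states])
    print(f"correlation for this interval: {r:.3f}")
```

Repeating the calculation for each two-week interval gives a series of coefficients whose change over time can then be inspected.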