Title: Whole-Genome Phenotype Prediction with Machine Learning: Open Problems in Bacterial Genomics

Feb 11, 2025

Tamsin James, Ben Williamson, Peter Tino, Nicole Wheeler

Abstract: How can we identify causal genetic mechanisms that govern bacterial traits? Initial efforts entrusting machine learning models to handle the task of predicting phenotype from genotype return high accuracy scores. However, attempts to extract any meaning from the predictive models are found to be corrupted by falsely identified "causal" features. Relying solely on pattern recognition and correlations is unreliable, significantly so in bacterial genomics settings where high-dimensionality and spurious associations are the norm. Though it is not yet clear whether we can overcome this hurdle, significant efforts are being made towards discovering potential high-risk bacterial genetic variants. In view of this, we set up open problems surrounding phenotype prediction from bacterial whole-genome datasets and extending those to learning causal effects, and discuss challenges that impact the reliability of a machine's decision-making when faced with datasets of this nature.

* 13 pages