Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Feb 05, 2025

Zehui Li, Vallijah Subasri, Yifei Shen, Dongsheng Li, Yiren Zhao, Guy-Bart Stan, Caihua Shan

Figure 1 for Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Figure 2 for Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Figure 3 for Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Figure 4 for Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Share this with someone who'll enjoy it:

Abstract:Large Language Models (LLMs) demonstrate remarkable generalizability across diverse tasks, yet genomic foundation models (GFMs) still require separate finetuning for each downstream application, creating significant overhead as model sizes grow. Moreover, existing GFMs are constrained by rigid output formats, limiting their applicability to various genomic tasks. In this work, we revisit the transformer-based auto-regressive models and introduce Omni-DNA, a family of cross-modal multi-task models ranging from 20 million to 1 billion parameters. Our approach consists of two stages: (i) pretraining on DNA sequences with next token prediction objective, and (ii) expanding the multi-modal task-specific tokens and finetuning for multiple downstream tasks simultaneously. When evaluated on the Nucleotide Transformer and GB benchmarks, Omni-DNA achieves state-of-the-art performance on 18 out of 26 tasks. Through multi-task finetuning, Omni-DNA addresses 10 acetylation and methylation tasks at once, surpassing models trained on each task individually. Finally, we design two complex genomic tasks, DNA2Function and Needle-in-DNA, which map DNA sequences to textual functional descriptions and images, respectively, indicating Omni-DNA's cross-modal capabilities to broaden the scope of genomic applications. All the models are available through https://huggingface.co/collections/zehui127

View paper on

Share this with someone who'll enjoy it:

Title:Omni-DNA: A Unified Genomic Foundation Model for Cross-Modal and Multi-Task Learning

Paper and Code