Abhinav Godavarthi

Guided Generative Protein Design using Regularized Transformers

Jan 24, 2022

Egbert Castro, Abhinav Godavarthi, Julian Rubinfien, Kevin B. Givechian, Dhananjay Bhaskar, Smita Krishnaswamy

Abstract: The development of powerful natural language models has increased the ability to learn meaningful representations of protein sequences. In addition, advances in high-throughput mutagenesis, directed evolution, and next-generation sequencing have allowed for the accumulation of large amounts of labeled fitness data. Leveraging these two trends, we introduce Regularized Latent Space Optimization (ReLSO), a deep transformer-based autoencoder that is trained to jointly generate sequences and predict fitness. Using ReLSO, we explicitly model the underlying sequence-function landscape of large labeled datasets and optimize within latent space using gradient-based methods. Through regularized prediction heads, ReLSO introduces a powerful protein sequence encoder and a novel approach for efficient fitness landscape traversal.
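
To make "optimize within latent space using gradient-based methods" concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a hypothetical fitness head that is a simple concave quadratic in the latent vector, so its gradient is available in closed form. In ReLSO the predictor is a learned head on a transformer autoencoder, and the optimized latent would be decoded back to a sequence; both are omitted here. The names `fitness_head`, `fitness_grad`, and `optimize_latent` are illustrative only.

```python
import numpy as np

def fitness_head(z, w, b):
    """Hypothetical fitness predictor: a concave quadratic in latent space."""
    return float(w @ z + b - 0.5 * (z @ z))

def fitness_grad(z, w):
    """Analytic gradient of fitness_head with respect to z."""
    return w - z

def optimize_latent(z0, w, step=0.1, n_steps=200):
    """Plain gradient ascent on predicted fitness, starting from z0."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(n_steps):
        z = z + step * fitness_grad(z, w)
    return z

# Toy 3-dimensional latent space (assumed values, for illustration only).
w = np.array([1.0, -2.0, 0.5])
z_start = np.zeros(3)
z_opt = optimize_latent(z_start, w)
```

For this toy predictor the maximum sits at `z = w`, so the ascent converges there. With a learned, non-concave predictor the same loop would only find a local optimum, which is presumably part of why the abstract emphasizes regularizing the latent space.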

Multimodal data visualization, denoising and clustering with integrated diffusion

Feb 12, 2021

Manik Kuchroo, Abhinav Godavarthi, Guy Wolf, Smita Krishnaswamy

Abstract: We propose a method called integrated diffusion for combining multimodal datasets, or data gathered via several different measurements on the same system, to create a joint data diffusion operator. As real-world data suffer from both local and global noise, we introduce mechanisms to optimally calculate a diffusion operator that reflects the combined information from both modalities. We show the utility of this joint operator in data denoising, visualization, and clustering, where it performs better than other methods for integrating and analyzing multimodal data. We apply our method to multi-omic data generated from blood cells, measuring both gene expression and chromatin accessibility. Our approach better visualizes the geometry of the joint data, captures known cross-modality associations, and identifies known cellular populations. More generally, integrated diffusion is broadly applicable to multimodal datasets generated in many medical and biological systems.
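
As a rough illustration of what a joint diffusion operator over two modalities could look like, the sketch below builds a row-stochastic diffusion operator per modality from a Gaussian kernel and combines them. The abstract does not state the combination rule, so the product of powered per-modality operators used here is an assumption for illustration. The kernel bandwidth `sigma`, the powers `t1` and `t2`, and the random stand-in data are likewise hypothetical.

```python
import numpy as np

def diffusion_operator(X, sigma=1.0):
    """Row-stochastic Markov matrix from a Gaussian kernel on the rows of X."""
    sq = np.sum(X**2, axis=1)
    # Pairwise squared distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    return K / K.sum(axis=1, keepdims=True)

def joint_operator(X1, X2, t1=2, t2=2):
    """Assumed combination rule: product of powered per-modality operators."""
    P1 = np.linalg.matrix_power(diffusion_operator(X1), t1)
    P2 = np.linalg.matrix_power(diffusion_operator(X2), t2)
    return P1 @ P2

# Random stand-ins for two measurements of the same 50 cells.
rng = np.random.default_rng(0)
X_mod1 = rng.normal(size=(50, 10))  # e.g. gene expression features
X_mod2 = rng.normal(size=(50, 6))   # e.g. chromatin accessibility features

P_joint = joint_operator(X_mod1, X_mod2)
X_denoised = P_joint @ X_mod1       # diffusion-based smoothing of modality 1
```

Because each factor is row-stochastic, the product is too, so applying `P_joint` to a data matrix replaces every row with a weighted average of rows that are close in both modalities.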
