Manvel Avetisian

FREED++: Improving RL Agents for Fragment-Based Molecule Generation by Thorough Reproduction

Jan 18, 2024

Alexander Telepov, Artem Tsypin, Kuzma Khrabrov, Sergey Yakukhnov, Pavel Strashnov, Petr Zhilyaev, Egor Rumiantsev, Daniel Ezhov, Manvel Avetisian, Olga Popova(+1 more)

Abstract: A rational design of new therapeutic drugs aims to find a molecular structure with desired biological functionality, e.g., an ability to activate or suppress a specific protein via binding to it. Molecular docking is a common technique for evaluating protein-molecule interactions. Recently, Reinforcement Learning (RL) has emerged as a promising approach to generating molecules with the docking score (DS) as a reward. In this work, we reproduce, scrutinize and improve the recent RL model for molecule generation called FREED (arXiv:2110.01219). Extensive evaluation of the proposed method reveals several limitations and challenges despite the outstanding results reported for three target proteins. Our contributions include fixing numerous implementation bugs and simplifying the model while increasing its quality, significantly extending experiments, and conducting an accurate comparison with current state-of-the-art methods for protein-conditioned molecule generation. We show that the resulting fixed model is capable of producing molecules with superior docking scores compared to alternative approaches.

* 37 pages, 10 figures, to be published in TMLR journal (https://www.jmlr.org/tmlr/)

Gradual Optimization Learning for Conformational Energy Minimization

Nov 05, 2023

Artem Tsypin, Leonid Ugadiarov, Kuzma Khrabrov, Manvel Avetisian, Alexander Telepov, Egor Rumiantsev, Alexey Skrynnik, Aleksandr I. Panov, Dmitry Vetrov, Elena Tutubalina(+1 more)

Abstract: Molecular conformation optimization is crucial to computer-aided drug discovery and materials design. Traditional energy minimization techniques rely on iterative optimization methods that use molecular forces calculated by a physical simulator (oracle) as anti-gradients. However, this is a computationally expensive approach that requires many interactions with a physical simulator. One way to accelerate this procedure is to replace the physical simulator with a neural network. Despite recent progress in neural networks for molecular conformation energy prediction, such models are prone to distribution shift, leading to inaccurate energy minimization. We find that the quality of energy minimization with neural networks can be improved by providing optimization trajectories as additional training data. Still, it takes around $5 \times 10^5$ additional conformations to match the physical simulator's optimization quality. In this work, we present the Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks that significantly reduces the required additional data. The framework consists of an efficient data-collecting scheme and an external optimizer. The external optimizer utilizes gradients from the energy prediction model to generate optimization trajectories, and the data-collecting scheme selects additional training data to be processed by the physical simulator. Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules using $50\times$ less additional data.

* 17 pages, 5 figures
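
The abstract above describes traditional energy minimization as an iterative optimizer that treats forces from an oracle as anti-gradients. A minimal sketch of that baseline idea follows; it is not the GOLF method itself. The `oracle` here is a hypothetical toy (two atoms on a line joined by a harmonic bond), and the names `oracle`, `minimize`, `step_size`, and `n_steps` are illustrative assumptions rather than anything taken from the paper.

```python
from typing import List, Tuple


def oracle(positions: List[float], k: float = 1.0, r0: float = 1.0) -> Tuple[float, List[float]]:
    """Hypothetical toy stand-in for a physical simulator.

    Models two atoms in 1D joined by a harmonic bond with stiffness k and
    equilibrium length r0. Returns the energy and the per-atom forces,
    where force = -dE/dx (the anti-gradient of the energy).
    """
    x1, x2 = positions
    stretch = (x2 - x1) - r0
    energy = 0.5 * k * stretch ** 2
    # dE/dx1 = -k * stretch and dE/dx2 = +k * stretch, so forces are the negatives.
    forces = [k * stretch, -k * stretch]
    return energy, forces


def minimize(positions: List[float], step_size: float = 0.1, n_steps: int = 200) -> Tuple[List[float], float]:
    """Plain gradient descent: move each atom along its force.

    Every iteration costs one oracle call, which is the expense the
    abstract refers to when the oracle is a real simulator.
    """
    positions = list(positions)
    energy, forces = oracle(positions)
    for _ in range(n_steps):
        positions = [x + step_size * f for x, f in zip(positions, forces)]
        energy, forces = oracle(positions)
    return positions, energy


if __name__ == "__main__":
    final_positions, final_energy = minimize([0.0, 3.0])
    print(final_positions, final_energy)
```

Replacing `oracle` with a learned energy model is what makes each step cheap; the abstract's point is that such a model needs extra training data along optimization trajectories to stay accurate.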

Towards Computationally Feasible Deep Active Learning

May 07, 2022

Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Abstract: Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many others. One such problem is the excessive computational resources required to train an acquisition model and estimate its uncertainty on instances in the unlabeled pool. We propose two techniques that tackle this issue for text classification and tagging tasks, offering a substantial reduction of AL iteration duration and the computational overhead introduced by deep acquisition models in AL. We also demonstrate that our algorithm, which leverages pseudo-labeling and distilled models, overcomes one of the essential obstacles revealed previously in the literature. Namely, it was shown that due to differences between an acquisition model used to select instances during AL and a successor model trained on the labeled data, the benefits of AL can diminish. We show that our algorithm, despite using a smaller and faster acquisition model, is capable of training a more expressive successor model with higher performance.

* Accepted at NAACL-2022 Findings
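
The abstract above refers to an acquisition model that estimates uncertainty on instances in the unlabeled pool. As a rough illustration of that selection step only, the sketch below ranks pool instances by predictive entropy and picks the top `k` for annotation. This is generic uncertainty sampling, not the paper's pseudo-labeling or distillation techniques; the function names and the toy probabilities are assumptions made for the example.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(pool_probs: List[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k pool instances with the highest entropy.

    pool_probs[i] is the acquisition model's predicted class distribution
    for unlabeled instance i. Higher entropy means a less certain prediction.
    """
    scores = [entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical predictions for four unlabeled instances, two classes each.
    pool = [[0.98, 0.02], [0.5, 0.5], [0.7, 0.3], [0.9, 0.1]]
    print(select_for_annotation(pool, k=2))
```

Scoring the whole pool requires one forward pass per unlabeled instance in every AL iteration, which is why the cost of the acquisition model matters.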

Project Achoo: A Practical Model and Application for COVID-19 Detection from Recordings of Breath, Voice, and Cough

Jul 12, 2021

Alexander Ponomarchuk, Ilya Burenko, Elian Malkin, Ivan Nazarov, Vladimir Kokh, Manvel Avetisian, Leonid Zhukov

Abstract: The COVID-19 pandemic created significant interest in and demand for infection detection and monitoring solutions. In this paper we propose a machine learning method to quickly triage COVID-19 using recordings made on consumer devices. The approach combines signal processing methods with fine-tuned deep learning networks and provides methods for signal denoising, cough detection and classification. We have also developed and deployed a mobile application that uses a symptom checker together with voice, breath and cough signals to detect COVID-19 infection. The application showed robust performance both on open-source datasets and on the noisy data collected from end users during beta testing.

CoRSAI: A System for Robust Interpretation of CT Scans of COVID-19 Patients Using Deep Learning

May 25, 2021

Manvel Avetisian, Ilya Burenko, Konstantin Egorov, Vladimir Kokh, Aleksandr Nesterov, Aleksandr Nikolaev, Alexander Ponomarchuk, Elena Sokolova, Alex Tuzhilin, Dmitry Umerenkov

Abstract: Analysis of chest CT scans can be used to detect parts of the lungs that are affected by infectious diseases such as COVID-19. Determining the volume of lungs affected by lesions is essential for formulating treatment recommendations and prioritizing patients by severity of the disease. In this paper we adopted an approach based on an ensemble of deep convolutional neural networks for segmentation of slices of lung CT scans. Using our models we are able to segment the lesions, evaluate patients' dynamics, estimate the relative volume of lungs affected by lesions and evaluate the lung damage stage. Our models were trained on data from different medical centers. We compared the predictions of our models with those of six experienced radiologists, and our segmentation model outperformed most of them. On the task of classifying disease severity, our model outperformed all the radiologists.
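
One quantity named in the abstract above, the relative volume of lungs affected by lesions, can be illustrated as a ratio of voxel counts once segmentation masks are available. The sketch below is a simplified assumption, not the CoRSAI implementation: it takes binary masks as nested lists, assumes all voxels have the same physical size, and the name `relative_lesion_volume` is hypothetical.

```python
from typing import List

Mask = List[List[int]]  # one CT slice: 1 = voxel belongs to the region, 0 = it does not


def relative_lesion_volume(lung_masks: List[Mask], lesion_masks: List[Mask]) -> float:
    """Fraction of lung voxels that are also marked as lesion.

    Both arguments hold one binary mask per slice, with matching shapes.
    Lesion voxels outside the lung mask are ignored. Assumes uniform voxel
    size, so the voxel-count ratio equals the volume ratio.
    """
    lung_voxels = 0
    lesion_voxels = 0
    for lung, lesion in zip(lung_masks, lesion_masks):
        for lung_row, lesion_row in zip(lung, lesion):
            for in_lung, in_lesion in zip(lung_row, lesion_row):
                if in_lung:
                    lung_voxels += 1
                    if in_lesion:
                        lesion_voxels += 1
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")
    return lesion_voxels / lung_voxels


if __name__ == "__main__":
    # Hypothetical single 2x3 slice: four lung voxels, one of them a lesion.
    lung = [[[1, 1, 0], [1, 1, 0]]]
    lesion = [[[1, 0, 0], [0, 0, 1]]]
    print(relative_lesion_volume(lung, lesion))
```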

Noise-Resilient Automatic Interpretation of Holter ECG Recordings

Nov 17, 2020

Konstantin Egorov, Elena Sokolova, Manvel Avetisian, Alexander Tuzhilin

Abstract: Holter monitoring, a long-term ECG recording (24 hours or more), contains a large amount of valuable diagnostic information about the patient. Interpreting it is a difficult and time-consuming task for the doctor because every heartbeat needs to be classified, which calls for highly accurate methods of automatic interpretation. In this paper, we present a three-stage process for analysing Holter recordings that is robust to noisy signals. The first stage is a segmentation neural network (NN) with an encoder-decoder architecture that detects the positions of heartbeats. The second stage is a classification NN that classifies heartbeats as wide or narrow. The third stage is gradient boosting decision trees (GBDT) on top of NN features, which incorporates patient-wise features and further increases the performance of our approach. As part of this work, we acquired 5095 Holter recordings of patients annotated by an experienced cardiologist. A committee of three cardiologists served as ground-truth annotators for the 291 examples in the test set. We show that the proposed method outperforms the selected baselines, including two commercial-grade software packages and some methods previously published in the literature.

* Accepted for publication on BIOSIGNALS 2021

Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks

Jul 15, 2020

Pavel Blinov, Manvel Avetisian, Vladimir Kokh, Dmitry Umerenkov, Alexander Tuzhilin

Abstract: In this paper we study the problem of predicting clinical diagnoses from textual Electronic Health Records (EHR) data. We show the importance of this problem in the medical community and present a comprehensive historical review of the problem and proposed methods. As the main scientific contributions, we present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification that implements a novel way of Fully-Connected (FC) layer composition, and a BERT model pretrained only on domain data. To empirically validate our model, we use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits. This is the largest such study for the Russian language and one of the largest globally. We performed a number of comparative experiments with other text representation models on the task of multiclass classification for a 265-disease subset of ICD-10. The experiments demonstrate improved performance of our models compared to other baselines, including a fine-tuned Russian BERT (RuBERT) variant. We also show that our model performs comparably to a panel of experienced medical experts. This gives us hope that deploying this system will reduce misdiagnosis.

* To be published in the proceedings of 2020 International Conference on Artificial Intelligence in Medicine, Minneapolis MN, USA

Radiologist-level stroke classification on non-contrast CT scans with Deep U-Net

Mar 31, 2020

Manvel Avetisian, Vladimir Kokh, Alex Tuzhilin, Dmitry Umerenkov

Abstract: Segmentation of ischemic stroke and intracranial hemorrhage on computed tomography is essential for investigation and treatment of stroke. In this paper, we modified the U-Net CNN architecture for the stroke identification problem using non-contrast CT. We applied the proposed DL model to historical patient data and also conducted clinical experiments involving ten experienced radiologists. Our model achieved strong results on historical data and significantly outperformed seven of the ten radiologists, while being on par with the remaining three.
