Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nimisha Ghosh

TFBS-Finder: Deep Learning-based Model with DNABERT and Convolutional Networks to Predict Transcription Factor Binding Sites

Feb 03, 2025

Nimisha Ghosh, Pratik Dutta, Daniele Santoni

Figure 1 for TFBS-Finder: Deep Learning-based Model with DNABERT and Convolutional Networks to Predict Transcription Factor Binding Sites

Figure 2 for TFBS-Finder: Deep Learning-based Model with DNABERT and Convolutional Networks to Predict Transcription Factor Binding Sites

Figure 3 for TFBS-Finder: Deep Learning-based Model with DNABERT and Convolutional Networks to Predict Transcription Factor Binding Sites

Figure 4 for TFBS-Finder: Deep Learning-based Model with DNABERT and Convolutional Networks to Predict Transcription Factor Binding Sites

Abstract:Transcription factors are proteins that regulate the expression of genes by binding to specific genomic regions known as Transcription Factor Binding Sites (TFBSs), typically located in the promoter regions of those genes. Accurate prediction of these binding sites is essential for understanding the complex gene regulatory networks underlying various cellular functions. In this regard, many deep learning models have been developed for such prediction, but there is still scope of improvement. In this work, we have developed a deep learning model which uses pre-trained DNABERT, a Convolutional Neural Network (CNN) module, a Modified Convolutional Block Attention Module (MCBAM), a Multi-Scale Convolutions with Attention (MSCA) module and an output module. The pre-trained DNABERT is used for sequence embedding, thereby capturing the long-term dependencies in the DNA sequences while the CNN, MCBAM and MSCA modules are useful in extracting higher-order local features. TFBS-Finder is trained and tested on 165 ENCODE ChIP-seq datasets. We have also performed ablation studies as well as cross-cell line validations and comparisons with other models. The experimental results show the superiority of the proposed method in predicting TFBSs compared to the existing methodologies. The codes and the relevant datasets are publicly available at https://github.com/NimishaGhosh/TFBS-Finder/.

Via

Access Paper or Ask Questions

A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis

Dec 10, 2024

Nimisha Ghosh, Daniele Santoni, Indrajit Saha, Giovanni Felici

Figure 1 for A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis

Figure 2 for A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis

Figure 3 for A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis

Abstract:In recent times, Transformer-based language models are making quite an impact in the field of natural language processing. As relevant parallels can be drawn between biological sequences and natural languages, the models used in NLP can be easily extended and adapted for various applications in bioinformatics. In this regard, this paper introduces the major developments of Transformer-based models in the recent past in the context of nucleotide sequences. We have reviewed and analysed a large number of application-based papers on this subject, giving evidence of the main characterizing features and to different approaches that may be adopted to customize such powerful computational machines. We have also provided a structured description of the functioning of Transformers, that may enable even first time users to grab the essence of such complex architectures. We believe this review will help the scientific community in understanding the various applications of Transformer-based language models to nucleotide sequences. This work will motivate the readers to build on these methodologies to tackle also various other problems in the field of bioinformatics.

Via

Access Paper or Ask Questions

Predicting Transcription Factor Binding Sites using Transformer based Capsule Network

Oct 23, 2023

Nimisha Ghosh, Daniele Santoni, Indrajit Saha, Giovanni Felici

Abstract:Prediction of binding sites for transcription factors is important to understand how they regulate gene expression and how this regulation can be modulated for therapeutic purposes. Although in the past few years there are significant works addressing this issue, there is still space for improvement. In this regard, a transformer based capsule network viz. DNABERT-Cap is proposed in this work to predict transcription factor binding sites mining ChIP-seq datasets. DNABERT-Cap is a bidirectional encoder pre-trained with large number of genomic DNA sequences, empowered with a capsule layer responsible for the final prediction. The proposed model builds a predictor for transcription factor binding sites using the joint optimisation of features encompassing both bidirectional encoder and capsule layer, along with convolutional and bidirectional long-short term memory layers. To evaluate the efficiency of the proposed approach, we use a benchmark ChIP-seq datasets of five cell lines viz. A549, GM12878, Hep-G2, H1-hESC and Hela, available in the ENCODE repository. The results show that the average area under the receiver operating characteristic curve score exceeds 0.91 for all such five cell lines. DNABERT-Cap is also compared with existing state-of-the-art deep learning based predictors viz. DeepARC, DeepTF, CNN-Zeng and DeepBind, and is seen to outperform them.

Via

Access Paper or Ask Questions

iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

Feb 10, 2020

Nimisha Ghosh, Sayantan Saha, Rourab Paul

Figure 1 for iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

Figure 2 for iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

Figure 3 for iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

Figure 4 for iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis

Abstract:Data gathered from multiple sensors can be effectively fused for accurate monitoring of many engineering applications. In the last few years, one of the most sought after applications for multi sensor fusion has been fault diagnosis. Dempster-Shafer Theory of Evidence along with Dempsters Combination Rule is a very popular method for multi sensor fusion which can be successfully applied to fault diagnosis. But if the information obtained from the different sensors shows high conflict, the classical Dempsters Combination Rule may produce counter-intuitive result. To overcome this shortcoming, this paper proposes an improved combination rule for multi sensor data fusion. Numerical examples have been put forward to show the effectiveness of the proposed method. Comparative analysis has also been carried out with existing methods to show the superiority of the proposed method in multi sensor fault diagnosis.

Via

Access Paper or Ask Questions

IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

Sep 09, 2019

Rourab Paul, Nimisha Ghosh, Suman Sau, Amlan Chakrabarti, Prasant Mahapatra

Figure 1 for IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

Figure 2 for IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

Figure 3 for IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

Figure 4 for IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain

Abstract:Standard security protocols like SSL, TLS, IPSec etc. have high memory and processor consumption which makes all these security protocols unsuitable for resource constrained platforms such as Internet of Things (IoT). Blockchain (BC) finds its efficient application in IoT platform to preserve the five basic cryptographic primitives, such as confidentiality, authenticity, integrity, availability and non-repudiation. Conventional adoption of BC in IoT platform causes high energy consumption, delay and computational overhead which are not appropriate for various resource constrained IoT devices. This work proposes a machine learning (ML) based smart access control framework in a public and a private BC for a smart city application which makes it more efficient as compared to the existing IoT applications. The proposed IoT based smart city architecture adopts BC technology for preserving all the cryptographic security and privacy issues. Moreover, BC has very minimal overhead on IoT platform as well. This work investigates the existing threat models and critical access control issues which handle multiple permissions of various nodes and detects relevant inconsistencies to notify the corresponding nodes. Comparison in terms of all security issues with existing literature shows that the proposed architecture is competitively efficient in terms of security access control.

* Manuscript

Via

Access Paper or Ask Questions

Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

Jun 24, 2019

Nimisha Ghosh, Rourab Paul, Satyabrata Maity, Krishanu Maity, Sayantan Saha

Figure 1 for Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

Figure 2 for Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

Figure 3 for Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

Figure 4 for Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

Abstract:Fault detection in sensor nodes is a pertinent issue that has been an important area of research for a very long time. But it is not explored much as yet in the context of Internet of Things. Internet of Things work with a massive amount of data so the responsibility for guaranteeing the accuracy of the data also lies with it. Moreover, a lot of important and critical decisions are made based on these data, so ensuring its correctness and accuracy is also very important. Also, the detection needs to be as precise as possible to avoid negative alerts. For this purpose, this work has adopted Dempster-Shafer Theory of Evidence which is a popular learning method to collate the information from sensors to come up with a decision regarding the faulty status of a sensor node. To verify the validity of the proposed method, simulations have been performed on a benchmark data set and data collected through a test bed in a laboratory set-up. For the different types of faults, the proposed method shows very competent accuracy for both the benchmark (99.8%) and laboratory data sets (99.9%) when compared to the other state-of-the-art machine learning techniques.

Via

Access Paper or Ask Questions