Abstract: The scientific literature contains a large amount of information, which makes it a valuable resource for developing text mining methods that extract biomedical relationships. One important type of information is the relationship between single nucleotide polymorphisms (SNPs) and traits. In this paper, we present a BioBERT-GRU method for identifying SNP-trait associations. Evaluated on the SNPPhenA dataset, the method performs better than previous machine learning and deep learning based methods, achieving a precision of 0.883, a recall of 0.882, and an F1-score of 0.881.
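For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-score can be computed for a multi-class relation extraction task. The abstract does not state how the scores were averaged across classes, so the macro-averaging used here is an assumption, and the label names and toy predictions are purely illustrative rather than taken from the SNPPhenA dataset.

```python
from collections import Counter


def per_class_prf(gold, pred):
    """Return {label: (precision, recall, f1)} for two equal-length label lists."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of per-class (precision, recall, f1). Assumed averaging scheme."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy example with hypothetical labels; not data from the paper.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
macro_p, macro_r, macro_f1 = macro_average(per_class_prf(gold, pred))
```

Under macro-averaging, the averaged F1-score is the mean of the per-class F1-scores, so it need not equal the harmonic mean of the averaged precision and recall.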
Abstract: Genome-wide association (GWA) studies constitute a prominent portion of the research conducted on personalized medicine and pharmacogenomics. Only a few methods have recently been developed for extracting mutation-disease associations, and no available method extracts SNP-phenotype associations from text while considering the degree of confidence in those associations. In this study, a relation extraction method relying on linguistic-based negation detection and neutral candidates is first proposed. The experiments show that negation cues and scope, together with the detection of neutral candidates, can be used to implement a relation extraction method that outperforms its kernel-based counterparts, owing to the uniform innate polarity of the sentences and the small number of complex sentences in the corpus. Moreover, a modality-based approach is proposed to estimate the confidence level of an extracted association, which can be used to assess the reliability of the reported association. Keywords: SNP, Phenotype, Biomedical Relation Extraction, Negation Detection.
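To make the idea of cue-based polarity and modality-based confidence concrete, the following is a deliberately simplified sketch and not the method proposed in the study. The cue lists, the function names, and the three confidence levels are all hypothetical, and the sketch ignores negation scope, which the abstract identifies as part of the actual approach.

```python
import re

# Hypothetical cue lists chosen for illustration only.
ASSOCIATION_TRIGGERS = {"associated", "association", "linked", "correlated"}
NEGATION_CUES = {"no", "not", "neither", "nor", "without", "lack"}
WEAK_MODALITY = {"may", "might", "could", "possibly", "suggest", "suggests"}
STRONG_MODALITY = {"significantly", "strongly", "confirmed"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def classify_candidate(sentence):
    """Label a SNP-phenotype candidate sentence as neutral, negative, or positive."""
    tokens = set(tokenize(sentence))
    if not tokens & ASSOCIATION_TRIGGERS:
        return "neutral"   # no association is asserted at all
    if tokens & NEGATION_CUES:
        return "negative"  # association trigger plus a negation cue
    return "positive"


def confidence_level(sentence):
    """Map modality markers to a coarse confidence level (assumed scale)."""
    tokens = set(tokenize(sentence))
    if tokens & WEAK_MODALITY:
        return "weak"
    if tokens & STRONG_MODALITY:
        return "strong"
    return "moderate"


# Made-up example sentences.
examples = [
    "rs1801133 was significantly associated with hypertension.",
    "rs1801133 was not associated with hypertension.",
    "We genotyped rs1801133 in patients with hypertension.",
    "rs1801133 may be associated with hypertension.",
]
results = [(classify_candidate(s), confidence_level(s)) for s in examples]
```

A sentence-level cue match like this would mislabel sentences in which the negation applies to a different clause, which is why a realistic system needs to resolve the scope of each cue.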