
An end-to-end deep learning architecture for extracting protein-protein interactions affected by genetic mutations.

Tung Tran, Ramakanth Kavuluru.

Abstract

The BioCreative VI Track IV (mining protein interactions and mutations for precision medicine) challenge was organized in 2017 with the goal of applying biomedical text mining methods to support advances in precision medicine. As part of the challenge, a new dataset was introduced for building a supervised relation extraction model that takes a test article and returns a list of interacting protein pairs identified by their Entrez Gene IDs. Specifically, such pairs represent proteins participating in a binary protein-protein interaction relation where the interaction is additionally affected by a genetic mutation, referred to as a PPIm relation. In this study, we explore an end-to-end approach for PPIm relation extraction by deploying a three-component pipeline involving deep learning-based named-entity recognition and relation classification models along with a knowledge-based approach for gene normalization. We propose several recall-focused improvements to our original challenge entry, which placed second both when matching on Entrez Gene ID (exact matching) and when matching on HomoloGene ID. On exact matching, the improved system achieved new competitive test results of 37.78% micro-F1 with a precision of 38.22% and a recall of 37.34%, which corresponds to an improvement of approximately three micro-F1 points over the prior best system. When matching on HomoloGene IDs, we report similarly competitive test results of 46.17% micro-F1 with a precision and recall of 46.67% and 45.59%, respectively, corresponding to an improvement of more than eight micro-F1 points over the prior best result. The code for our deep learning system is publicly available at https://github.com/bionlproc/biocppi_extraction.
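The evaluation described in the abstract can be illustrated with a minimal sketch of micro-averaged precision, recall and F1 over gene-ID pairs. This is not the official BioCreative VI scorer, whose details may differ. It assumes that each article's output is a set of unordered ID pairs and that counts are pooled across articles before the metrics are computed. The function names `to_pairs` and `micro_prf`, the optional `id_map` argument (a stand-in for an Entrez Gene to HomoloGene lookup) and all IDs in the example are hypothetical placeholders.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

RawPairs = Iterable[Tuple[str, str]]


def to_pairs(raw: RawPairs, id_map: Optional[Dict[str, str]] = None) -> Set[FrozenSet[str]]:
    """Turn (id_a, id_b) tuples into unordered pairs, optionally remapping IDs.

    id_map is a hypothetical lookup (e.g. gene ID -> homology group ID);
    IDs missing from the map are kept unchanged.
    """
    pairs = set()
    for a, b in raw:
        if id_map is not None:
            a, b = id_map.get(a, a), id_map.get(b, b)
        pairs.add(frozenset((a, b)))  # order of the two proteins is ignored
    return pairs


def micro_prf(gold: Dict[str, RawPairs],
              pred: Dict[str, RawPairs],
              id_map: Optional[Dict[str, str]] = None) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 pooled over all articles."""
    tp = fp = fn = 0
    for doc_id in set(gold) | set(pred):
        g = to_pairs(gold.get(doc_id, []), id_map)
        p = to_pairs(pred.get(doc_id, []), id_map)
        tp += len(g & p)   # predicted pairs that are in the gold set
        fp += len(p - g)   # predicted pairs that are not in the gold set
        fn += len(g - p)   # gold pairs that were not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up example data; the IDs are placeholders, not real gene identifiers.
gold = {"doc1": [("A", "B")], "doc2": [("C", "D"), ("E", "F")]}
pred = {"doc1": [("B", "A")], "doc2": [("C", "D"), ("E2", "F")], "doc3": [("X", "Y")]}

exact = micro_prf(gold, pred)
# Hypothetical homology map in which E and E2 belong to the same group.
relaxed = micro_prf(gold, pred, id_map={"E": "hg1", "E2": "hg1"})
print([round(x, 4) for x in exact])    # [0.5, 0.6667, 0.5714]
print([round(x, 4) for x in relaxed])  # [0.75, 1.0, 0.8571]
```

In this toy example, relaxing the match from the exact ID to a shared homology group turns one miss into a hit, which mirrors why scores under HomoloGene matching are higher than under exact matching.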


Year:  2018        PMID: 30239680      PMCID: PMC6146129          DOI: 10.1093/database/bay092

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


