RLIMS-P 2.0: A Generalizable Rule-Based Information Extraction System for Literature Mining of Protein Phosphorylation Information.

Manabu Torii, Cecilia N Arighi, Gang Li, Qinghua Wang, Cathy H Wu, K Vijay-Shanker.   

Abstract

We introduce RLIMS-P version 2.0, an enhanced rule-based information extraction (IE) system for mining kinase, substrate, and phosphorylation site information from scientific literature. Consisting of natural language processing and IE modules, the system integrates several new features, including the capability of processing full-text articles and generalizability to different post-translational modifications (PTMs). To evaluate the system, sets of abstracts and full-text articles containing a variety of textual expressions were annotated. On the abstract corpus, the system achieved F-scores of 0.91, 0.92, and 0.95 for kinases, substrates, and sites, respectively. The corresponding scores on the full-text corpus were 0.88, 0.91, and 0.92. The system was additionally evaluated on the corpus of the 2013 BioNLP-ST GE task, where it achieved an F-score of 0.87 for the phosphorylation core task, improving upon the results previously reported on that corpus. Full-scale processing of all abstracts in MEDLINE and all articles in the PubMed Central Open Access Subset has demonstrated scalability for mining rich information in the literature, enabling the system's adoption for biocuration and knowledge discovery. The new system is generalizable and will be adapted to tackle other major PTM types. RLIMS-P 2.0 is available online (http://proteininformationresource.org/rlimsp/) and the developed corpora are available from iProLINK (http://proteininformationresource.org/iprolink/).
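
To make the extraction target concrete, the following is a minimal sketch of what rule-based extraction of kinase, substrate, and site arguments can look like. It is not the RLIMS-P rule set, which the abstract does not describe; the two regular-expression patterns, the `PhosphoEvent` type, and the `extract` function are hypothetical and cover only two simple sentence shapes.

```python
import re
from typing import NamedTuple, Optional


class PhosphoEvent(NamedTuple):
    """One extracted phosphorylation event (hypothetical type for this sketch)."""
    kinase: Optional[str]
    substrate: str
    site: Optional[str]


# Hypothetical toy patterns, NOT the rules used by RLIMS-P.
# A site is written as a residue abbreviation plus a position, e.g. "Ser15".
SITE = r"(?P<site>(?:Ser|Thr|Tyr)-?\d+)"
NAME = r"[A-Za-z0-9-]+"

# Active voice: "<kinase> phosphorylates <substrate> [at|on <site>]"
ACTIVE = re.compile(
    rf"(?P<kinase>{NAME}) phosphorylates (?P<substrate>{NAME})"
    rf"(?: (?:at|on) {SITE})?"
)

# Passive voice: "<substrate> is phosphorylated [at|on <site>] [by <kinase>]"
PASSIVE = re.compile(
    rf"(?P<substrate>{NAME}) is phosphorylated"
    rf"(?: (?:at|on) {SITE})?"
    rf"(?: by (?P<kinase>{NAME}))?"
)


def extract(sentence: str) -> Optional[PhosphoEvent]:
    """Return the first event matched by either pattern, or None."""
    for pattern in (ACTIVE, PASSIVE):
        match = pattern.search(sentence)
        if match:
            return PhosphoEvent(
                kinase=match.group("kinase"),
                substrate=match.group("substrate"),
                site=match.group("site"),
            )
    return None


if __name__ == "__main__":
    print(extract("ATM phosphorylates p53 at Ser15."))
    print(extract("p53 is phosphorylated on Ser15 by ATM."))
```

Arguments that a sentence does not mention are returned as `None`, which mirrors the fact that kinase and site are often absent from a given mention. A real system would need far broader coverage of the "variety of textual expressions" the abstract refers to.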

Year:  2015        PMID: 26357075      PMCID: PMC4568560          DOI: 10.1109/TCBB.2014.2372765

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  19 in total

1.  Selective Neuronal Vulnerability in Alzheimer's Disease: A Network-Based Analysis.

Authors:  Jean-Pierre Roussarie; Vicky Yao; Patricia Rodriguez-Rodriguez; Rose Oughtred; Jennifer Rust; Zakary Plautz; Shirin Kasturia; Christian Albornoz; Wei Wang; Eric F Schmidt; Ruth Dannenfelser; Alicja Tadych; Lars Brichta; Alona Barnea-Cramer; Nathaniel Heintz; Patrick R Hof; Myriam Heiman; Kara Dolinski; Marc Flajolet; Olga G Troyanskaya; Paul Greengard
Journal:  Neuron       Date:  2020-06-29       Impact factor: 17.173

2.  Analysis of Protein Phosphorylation and Its Functional Impact on Protein-Protein Interactions via Text Mining of the Scientific Literature.

Authors:  Qinghua Wang; Karen E Ross; Hongzhan Huang; Jia Ren; Gang Li; K Vijay-Shanker; Cathy H Wu; Cecilia N Arighi
Journal:  Methods Mol Biol       Date:  2017

3.  iPTMnet: Integrative Bioinformatics for Studying PTM Networks.

Authors:  Karen E Ross; Hongzhan Huang; Jia Ren; Cecilia N Arighi; Gang Li; Catalina O Tudor; Mengxi Lv; Jung-Youn Lee; Sheng-Chih Chen; K Vijay-Shanker; Cathy H Wu
Journal:  Methods Mol Biol       Date:  2017

Review 4.  Protein Bioinformatics Databases and Resources.

Authors:  Chuming Chen; Hongzhan Huang; Cathy H Wu
Journal:  Methods Mol Biol       Date:  2017

5.  Scalable Text Mining Assisted Curation of Post-Translationally Modified Proteoforms in the Protein Ontology.

Authors:  Karen E Ross; Darren A Natale; Cecilia Arighi; Sheng-Chih Chen; Hongzhan Huang; Gang Li; Jia Ren; Michael Wang; K Vijay-Shanker; Cathy H Wu
Journal:  CEUR Workshop Proc       Date:  2016-11-29

6.  iPTMnet: an integrated resource for protein post-translational modification network discovery.

Authors:  Hongzhan Huang; Cecilia N Arighi; Karen E Ross; Jia Ren; Gang Li; Sheng-Chih Chen; Qinghua Wang; Julie Cowart; K Vijay-Shanker; Cathy H Wu
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

7.  Text Mining and Machine Learning Protocol for Extracting Human-Related Protein Phosphorylation Information from PubMed.

Authors:  Krishnamurthy Arumugam; Raja Ravi Shanker
Journal:  Methods Mol Biol       Date:  2022

8.  A Text Mining and Machine Learning Protocol for Extracting Posttranslational Modifications of Proteins from PubMed: A Special Focus on Glycosylation, Acetylation, Methylation, Hydroxylation, and Ubiquitination.

Authors:  Krishnamurthy Arumugam; Malathi Sellappan; Dheepa Anand; Sadhanha Anand; Subhashini Vedagiri Radhakrishnan
Journal:  Methods Mol Biol       Date:  2022

9.  miRTex: A Text Mining System for miRNA-Gene Relation Extraction.

Authors:  Gang Li; Karen E Ross; Cecilia N Arighi; Yifan Peng; Cathy H Wu; K Vijay-Shanker
Journal:  PLoS Comput Biol       Date:  2015-09-25       Impact factor: 4.475

10.  Mining clinical attributes of genomic variants through assisted literature curation in Egas.

Authors:  Sérgio Matos; David Campos; Renato Pinho; Raquel M Silva; Matthew Mort; David N Cooper; José Luís Oliveira
Journal:  Database (Oxford)       Date:  2016-06-07       Impact factor: 3.451
