iProLINK: an integrated protein resource for literature mining.

Zhang-Zhi Hu, Inderjeet Mani, Vincent Hermoso, Hongfang Liu, Cathy H Wu.

Abstract

The exponential growth of large-scale molecular sequence data and of the PubMed scientific literature has prompted active research in biological literature mining and information extraction to facilitate genome/proteome annotation and improve the quality of biological databases. Motivated by the promise of text mining methodologies and, at the same time, by the lack of adequate curated data for training and benchmarking, the Protein Information Resource (PIR) has developed a resource for protein literature mining: iProLINK (integrated Protein Literature INformation and Knowledge). As PIR focuses its effort on the curation of the UniProt protein sequence database, the goal of iProLINK is to provide curated data sources that can be utilized for text mining research in the areas of bibliography mapping, annotation extraction, protein named entity recognition, and protein ontology development. The data sources for bibliography mapping and annotation extraction include mapped citations (PubMed ID to protein entry and feature line mapping) and annotation-tagged literature corpora. The latter include several hundred abstracts and full-text articles tagged with experimentally validated post-translational modifications (PTMs) annotated in the PIR protein sequence database. The data sources for entity recognition and ontology development include a protein name dictionary, word token dictionaries, protein name-tagged literature corpora along with tagging guidelines, and a protein ontology based on PIRSF protein family names. iProLINK is freely accessible at http://pir.georgetown.edu/iprolink, with hypertext links for all downloadable files.

Year:  2004        PMID: 15556482     DOI: 10.1016/j.compbiolchem.2004.09.010

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   2.877


