Automatic extraction of gene/protein biological functions from biomedical text.

Asako Koike, Yoshiki Niwa, Toshihisa Takagi.

Abstract

MOTIVATION: With the rapid advancement of biomedical science and the development of high-throughput analysis methods, the extraction of various types of information from biomedical text has become critical. Since automatic functional annotations of genes are quite useful for interpreting large amounts of high-throughput data efficiently, the demand for automatic extraction of information related to gene functions from text has been increasing.
RESULTS: We have developed a method for automatically extracting the biological process functions of genes/proteins/families based on Gene Ontology (GO) from text using a shallow parser and sentence structure analysis techniques. When the gene/protein/family names and their functions are described in ACTOR (doer of action) and OBJECT (receiver of action) relationships, the corresponding GO-IDs are assigned to the genes/proteins/families. The gene/protein/family names are recognized using the gene/protein/family name dictionaries developed by our group. To achieve wide recognition of the gene/protein/family functions, we semi-automatically gather functional terms based on GO using co-occurrence, collocation similarities and rule-based techniques. A preliminary experiment demonstrated that our method has an estimated recall of 54-64% with a precision of 91-94% for functions actually described in abstracts. When applied to PubMed, it extracted over 190 000 gene-GO relationships and 150 000 family-GO relationships for major eukaryotes.
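
The dictionary lookup and ACTOR-relationship idea described above can be illustrated with a very small sketch. This is not the authors' implementation: it replaces the shallow parser and sentence structure analysis with a plain regular expression, and the dictionaries (`GENE_NAMES`, `FUNCTION_TERMS`), the verb list (`ACTOR_VERBS`) and the function `extract_gene_go` are hypothetical names introduced only for illustration. The two GO-IDs are real Gene Ontology biological process identifiers.

```python
import re

# Hypothetical toy dictionaries; the dictionaries described in the
# abstract are far larger and were built semi-automatically.
GENE_NAMES = {"TP53", "BRCA1"}
FUNCTION_TERMS = {
    "apoptosis": "GO:0006915",   # apoptotic process
    "dna repair": "GO:0006281",  # DNA repair
}
# Assumed list of verbs that place the gene in the ACTOR role.
ACTOR_VERBS = ("induces", "regulates", "promotes", "mediates")


def extract_gene_go(sentence):
    """Return (gene, GO-ID) pairs for 'GENE <verb> FUNCTION' patterns.

    A regex stand-in for real parsing: it only fires when a known gene
    name is directly followed by an actor verb and a known function term.
    """
    pairs = []
    verb_alt = "|".join(ACTOR_VERBS)
    for gene in sorted(GENE_NAMES):
        for term, go_id in sorted(FUNCTION_TERMS.items()):
            pattern = (
                rf"\b{re.escape(gene)}\b\s+(?:{verb_alt})\s+"
                rf"{re.escape(term)}\b"
            )
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                pairs.append((gene, go_id))
    return pairs


if __name__ == "__main__":
    print(extract_gene_go("TP53 induces apoptosis in tumor cells."))
    print(extract_gene_go("Apoptosis was observed near TP53."))
```

Requiring the gene to precede the verb is what keeps mere co-occurrence (the second sentence) from producing an assignment; a real system would need syntactic analysis to handle passives, coordination and longer-range dependencies.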

Year:  2004        PMID: 15509601     DOI: 10.1093/bioinformatics/bti084

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


