Concept-based annotation of enzyme classes.

Oliver Hofmann, Dietmar Schomburg.

Abstract

MOTIVATION: Given the explosive growth of biomedical data and of the literature describing results and findings, it is becoming increasingly difficult to keep up to date with new information. Keeping databases synchronized with current knowledge is a time-consuming and expensive task, one which can be alleviated by automatically gathering findings from the literature using linguistic approaches. We describe a method to automatically annotate enzyme classes with disease-related information extracted from the biomedical literature for inclusion in such a database.
RESULTS: Enzyme names for the 3901 enzyme classes in the BRENDA database, a repository for quantitative and qualitative enzyme information, were identified in more than 100,000 abstracts retrieved from the PubMed literature database. Phrases in the abstracts were assigned to concepts from the Unified Medical Language System (UMLS) utilizing the MetaMap program, allowing for the identification of disease-related concepts by their semantic fields in the UMLS ontology. Assignments between enzyme classes and diseases were created based on their co-occurrence within a single sentence. False positives could be removed by a variety of filters including minimum number of co-occurrences, removal of sentences containing a negation and the classification of sentences based on their semantic fields by a Support Vector Machine. Verification of the assignments with a manually annotated set of 1500 sentences yielded favorable results of 92% precision at 50% recall, sufficient for inclusion in a high-quality database.
AVAILABILITY: Source code is available from the author upon request.
SUPPLEMENTARY INFORMATION: ftp.uni-koeln.de/institute/biochemie/pub/brenda/info/diseaseSupp.pdf.
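
The sentence-level co-occurrence step and two of the filters named above (minimum number of co-occurrences, removal of negated sentences) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Plain dictionary lookup stands in for MetaMap and the UMLS semantic fields, the Support Vector Machine filter is left out, and the negation cue list, the function names, and the toy sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical negation cues; the abstract does not say which cues were used.
NEGATION_CUES = re.compile(r"\b(no|not|never|without|neither|nor)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cooccurrences(abstracts, enzyme_terms, disease_terms, min_count=2):
    """Count enzyme/disease pairs that share a sentence.

    enzyme_terms and disease_terms map a lowercase surface form to an
    identifier. Substring matching is a stand-in for MetaMap concept mapping.
    """
    counts = Counter()
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            # Filter: skip sentences that contain a negation cue.
            if NEGATION_CUES.search(sentence):
                continue
            lowered = sentence.lower()
            enzymes = {ec for name, ec in enzyme_terms.items() if name in lowered}
            diseases = {d for name, d in disease_terms.items() if name in lowered}
            for ec in enzymes:
                for disease in diseases:
                    counts[(ec, disease)] += 1
    # Filter: keep only pairs seen in at least min_count sentences.
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy input, invented for this example (not data from the study).
abstracts = [
    "Hexokinase activity was reduced in diabetes. The assay was repeated twice.",
    "Hexokinase is not associated with asthma.",
    "Altered hexokinase expression has been reported in diabetes.",
]
enzyme_terms = {"hexokinase": "EC 2.7.1.1"}
disease_terms = {"diabetes": "Diabetes Mellitus", "asthma": "Asthma"}

result = cooccurrences(abstracts, enzyme_terms, disease_terms, min_count=2)
print(result)  # {('EC 2.7.1.1', 'Diabetes Mellitus'): 2}
```

In this toy run the asthma sentence is dropped by the negation filter, and only the hexokinase/diabetes pair reaches the minimum count. Raising `min_count` trades recall for precision, which is the kind of trade-off the reported 92% precision at 50% recall reflects.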

Year:  2005        PMID: 15661799     DOI: 10.1093/bioinformatics/bti284

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  6 in total

1.  Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction.

Authors:  Hanna Suominen; Maree Johnson; Liyuan Zhou; Paula Sanchez; Raul Sirel; Jim Basilakis; Leif Hanlen; Dominique Estival; Linda Dawson; Barbara Kelly
Journal:  J Am Med Inform Assoc       Date:  2014-10-21       Impact factor: 4.497

2.  Development of a classification scheme for disease-related enzyme information.

Authors:  Carola Söhngen; Antje Chang; Dietmar Schomburg
Journal:  BMC Bioinformatics       Date:  2011-08-09       Impact factor: 3.169

3.  The Autoimmune Disease Database: a dynamically compiled literature-derived database.

Authors:  Thomas Karopka; Juliane Fluck; Heinz-Theodor Mevissen; Anne Glass
Journal:  BMC Bioinformatics       Date:  2006-06-27       Impact factor: 3.169

4.  PathBinder--text empirics and automatic extraction of biomolecular interactions.

Authors:  Lifeng Zhang; Daniel Berleant; Jing Ding; Tuan Cao; Eve Syrkin Wurtele
Journal:  BMC Bioinformatics       Date:  2009-10-08       Impact factor: 3.169

5.  Semantic reclassification of the UMLS concepts.

Authors:  Jung-Wei Fan; Carol Friedman
Journal:  Bioinformatics       Date:  2008-07-13       Impact factor: 6.937

6.  Functional group and substructure searching as a tool in metabolomics.

Authors:  Masaaki Kotera; Andrew G McDonald; Sinéad Boyce; Keith F Tipton
Journal:  PLoS One       Date:  2008-02-06       Impact factor: 3.240
