Supporting the curation of biological databases with reusable text mining.

Olivo Miotto, Tin Wee Tan, Vladimir Brusic.

Abstract

Curators of biological databases transfer knowledge from scientific publications, a laborious and expensive manual process. Machine learning algorithms can reduce the workload of curators by filtering relevant biomedical literature, though their widespread adoption will depend on the availability of intuitive tools that can be configured for a variety of tasks. We propose a new method for supporting curators by means of document categorization, and describe the architecture of a curator-oriented tool implementing this method using techniques that require no computational linguistic or programming expertise. To demonstrate the feasibility of this approach, we prototyped an application of this method to support a real curation task: identifying PubMed abstracts that contain allergen cross-reactivity information. We tested the performance of two different classifier algorithms (CART and ANN), applied to both composite and single-word features, using several feature scoring functions. Both classifiers exceeded our performance targets, the ANN classifier yielding the best results. These results show that the method we propose can deliver the level of performance needed to assist database curation.
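The feature-scoring step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the scoring functions that were tested, so the chi-square statistic used here is an assumption, as are the function names and the four toy abstracts (invented for illustration, not taken from the study). The sketch covers single-word features only and stops before the classifier stage.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its set of single-word features."""
    return set(re.findall(r"[a-z]+", text.lower()))


def chi_square_scores(docs, labels):
    """Score each word by the chi-square statistic of its 2x2
    presence/absence versus relevant/irrelevant contingency table."""
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos_df if label else neg_df).update(tokenize(doc))
    scores = {}
    for word in set(pos_df) | set(neg_df):
        a = pos_df[word]   # relevant documents containing the word
        b = neg_df[word]   # irrelevant documents containing the word
        c = n_pos - a      # relevant documents lacking the word
        d = n_neg - b      # irrelevant documents lacking the word
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[word] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_features(scores, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def vectorize(doc, features):
    """Encode a document as a binary presence vector over the features."""
    words = tokenize(doc)
    return [1 if f in words else 0 for f in features]


# Hypothetical toy corpus: 1 = contains cross-reactivity information.
docs = [
    "IgE cross-reactivity between birch pollen and apple allergens",
    "Cross-reactivity of latex and banana allergens in patients",
    "Genome assembly of a soil bacterium",
    "Protein folding kinetics in yeast",
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
features = top_features(scores, 3)
vector = vectorize("Allergens cross-react with pollen", features)
```

On this toy corpus a function word ("and") scores as highly as the topical words, which suggests why a practical pipeline would likely add stop-word filtering before scoring. The resulting binary vectors are the kind of input a decision-tree or neural-network classifier could then be trained on.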

Year:  2005        PMID: 16901087

Source DB:  PubMed          Journal:  Genome Inform        ISSN: 0919-9454


  12 in total

1.  Allergen databases: current status and perspectives. (Review)

Authors:  Adriano Mari; Chiara Rasi; Paola Palazzo; Enrico Scala
Journal:  Curr Allergy Asthma Rep       Date:  2009-09       Impact factor: 4.806

2.  Finding falls in ambulatory care clinical documents using statistical text mining.

Authors:  James A McCart; Donald J Berndt; Jay Jarman; Dezon K Finch; Stephen L Luther
Journal:  J Am Med Inform Assoc       Date:  2012-12-15       Impact factor: 4.497

3.  Literature classification for semi-automated updating of biological knowledgebases.

Authors:  Lars Olsen; Ulrich Johan Kudahl; Ole Winther; Vladimir Brusic
Journal:  BMC Genomics       Date:  2013-10-16       Impact factor: 3.969

4.  Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text.

Authors:  Yael Garten; Russ B Altman
Journal:  BMC Bioinformatics       Date:  2009-02-05       Impact factor: 3.169

5.  Enhancing navigation in biomedical databases by community voting and database-driven text classification.

Authors:  Timo Duchrow; Timur Shtatland; Daniel Guettler; Misha Pivovarov; Stefan Kramer; Ralph Weissleder
Journal:  BMC Bioinformatics       Date:  2009-10-03       Impact factor: 3.169

6.  Rule-based knowledge aggregation for large-scale protein sequence analysis of influenza A viruses.

Authors:  Olivo Miotto; Tin Wee Tan; Vladimir Brusic
Journal:  BMC Bioinformatics       Date:  2008       Impact factor: 3.169

7.  Clustering of cognate proteins among distinct proteomes derived from multiple links to a single seed sequence.

Authors:  Adriano Barbosa-Silva; Venkata P Satagopam; Reinhard Schneider; J Miguel Ortega
Journal:  BMC Bioinformatics       Date:  2008-03-05       Impact factor: 3.169

8.  Identification of human-to-human transmissibility factors in PB2 proteins of influenza A by large-scale mutual information analysis.

Authors:  Olivo Miotto; At Heiny; Tin Wee Tan; J Thomas August; Vladimir Brusic
Journal:  BMC Bioinformatics       Date:  2008       Impact factor: 3.169

9.  Automating document classification for the Immune Epitope Database.

Authors:  Peng Wang; Alexander A Morgan; Qing Zhang; Alessandro Sette; Bjoern Peters
Journal:  BMC Bioinformatics       Date:  2007-07-26       Impact factor: 3.169

10.  Text Categorization of Heart, Lung, and Blood Studies in the Database of Genotypes and Phenotypes (dbGaP) Utilizing n-grams and Metadata Features.

Authors:  Mindy K Ross; Ko-Wei Lin; Karen Truong; Abhishek Kumar; Mike Conway
Journal:  Biomed Inform Insights       Date:  2013-07-22