Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases.

Balu Bhasuran, Gurusamy Murugesan, Sabenabanu Abdulkadhar, Jeyakumar Natarajan.

Abstract

Biomedical named entity recognition (Bio-NER) is a crucial initial step in information extraction and a major research focus in biomedical text mining. In recent years, several models and methodologies have been proposed for recognizing semantic types such as genes, proteins, chemicals, drugs, and other biologically relevant named entities. In this paper, we implemented a stacked ensemble approach combined with fuzzy matching for biomedical named entity recognition of disease names. The underlying concept of stacked generalization is to combine the outputs of base-level classifiers using a second-level meta-classifier in an ensemble. We used conditional random fields (CRFs) as the underlying classification method, with a diverse set of features that are mostly domain-specific as well as orthographically and morphologically relevant. In addition, we used fuzzy string matching to tag rare disease names from our in-house disease dictionary. For fuzzy matching, we incorporated two leading fuzzy search algorithms, Rabin-Karp and Tuned Boyer-Moore. Our proposed approach shows promising results, with F-measures of 94.66%, 89.12%, 84.10%, and 76.71% when evaluated on the training and testing sets of the NCBI disease and BioCreative V CDR corpora. Copyright © 2016 Elsevier Inc. All rights reserved.
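
The dictionary-tagging step can be illustrated with a minimal sketch. It assumes the textbook Rabin-Karp algorithm, which is an exact substring search based on a rolling hash; the abstract does not describe how the authors adapt it for fuzzy matching, so the only tolerance shown here is case-insensitivity. The function names `rabin_karp_find` and `tag_diseases`, the hash parameters, and the two-entry dictionary are hypothetical and are not taken from the paper.

```python
def rabin_karp_find(text: str, pattern: str,
                    base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return the start offset of every exact occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Weight of the leading character of a window of length m.
    high = pow(base, m - 1, mod)
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    hits = []
    for start in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if w_hash == p_hash and text[start:start + m] == pattern:
            hits.append(start)
        if start < n - m:
            # Slide the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return hits


def tag_diseases(text: str, dictionary: list[str]) -> list[tuple[int, int, str]]:
    """Return sorted (start, end, term) spans for dictionary terms found in text.

    Assumption: matching is case-insensitive and the text is ASCII, so
    lowercasing does not change character offsets.
    """
    lowered = text.lower()
    spans = []
    for term in dictionary:
        for start in rabin_karp_find(lowered, term.lower()):
            spans.append((start, start + len(term), term))
    return sorted(spans)


if __name__ == "__main__":
    # Hypothetical two-entry dictionary, for illustration only.
    sentence = "Fabry disease and Pompe disease."
    print(tag_diseases(sentence, ["Pompe disease", "fabry disease"]))
```

In a pipeline like the one the abstract outlines, spans found this way would supplement the entity tags produced by the CRF ensemble; how the two sources are merged is not stated in the abstract.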

Keywords:  Biomedical named entity recognition; Fuzzy matching; Machine learning; Stacked ensemble; Text mining

Year:  2016        PMID: 27634494     DOI: 10.1016/j.jbi.2016.09.009

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  8 in total

1.  Combining Literature Mining and Machine Learning for Predicting Biomedical Discoveries.

Authors:  Balu Bhasuran
Journal:  Methods Mol Biol       Date:  2022

2.  BioBERT and Similar Approaches for Relation Extraction.

Authors:  Balu Bhasuran
Journal:  Methods Mol Biol       Date:  2022

3.  A Text Mining Protocol for Mining Biological Pathways and Regulatory Networks from Biomedical Literature.

Authors:  Sabenabanu Abdulkadhar; Jeyakumar Natarajan
Journal:  Methods Mol Biol       Date:  2022

4.  A Text Mining Pipeline Using Active and Deep Learning Aimed at Curating Information in Computational Neuroscience.

Authors:  Matthew Shardlow; Meizhi Ju; Maolin Li; Christian O'Reilly; Elisabetta Iavarone; John McNaught; Sophia Ananiadou
Journal:  Neuroinformatics       Date:  2019-07

5.  Weighted Random Forests to Improve Arrhythmia Classification.

Authors:  Krzysztof Gajowniczek; Iga Grzegorczyk; Tomasz Ząbkowski; Chandrajit Bajaj
Journal:  Electronics (Basel)       Date:  2020-01-03       Impact factor: 2.397

6.  Automatic extraction of gene-disease associations from literature using joint ensemble learning.

Authors:  Balu Bhasuran; Jeyakumar Natarajan
Journal:  PLoS One       Date:  2018-07-26       Impact factor: 3.240

Review 7.  Artificial Intelligence (AI) in Rare Diseases: Is the Future Brighter?

Authors:  Sandra Brasil; Carlota Pascoal; Rita Francisco; Vanessa Dos Reis Ferreira; Paula A Videira; Gonçalo Valadão
Journal:  Genes (Basel)       Date:  2019-11-27       Impact factor: 4.096

8.  Multi-step ahead meningitis case forecasting based on decomposition and multi-objective optimization methods.

Authors:  Matheus Henrique Dal Molin Ribeiro; Viviana Cocco Mariani; Leandro Dos Santos Coelho
Journal:  J Biomed Inform       Date:  2020-09-22       Impact factor: 6.317
