tmVar: a text mining approach for extracting sequence variants in biomedical literature.

Chih-Hsuan Wei, Bethany R Harris, Hung-Yu Kao, Zhiyong Lu.

Abstract

MOTIVATION: Text-mining mutation information from the literature has become a critical part of the bioinformatics approach to analyzing and interpreting sequence variations in complex diseases in the post-genomic era. It has also been used to assist the creation of disease-related mutation databases. Most existing approaches are rule-based and focus on limited types of sequence variations, such as protein point mutations. Extending their extraction scope therefore requires significant manual effort in examining new instances and developing corresponding rules. New automatic approaches are thus greatly needed for extracting different kinds of mutations with high accuracy.
RESULTS: Here, we report tmVar, a text-mining approach based on conditional random fields (CRFs) for extracting a wide range of sequence variants described at the protein, DNA and RNA levels according to the standard nomenclature developed by the Human Genome Variation Society. In doing so, we cover several important types of mutations that were not considered in past studies. Using a novel CRF label model and feature set, our method achieves higher performance than a state-of-the-art method on both our corpus (91.4 versus 78.1% in F-measure) and that method's own gold standard (93.9 versus 89.4% in F-measure). These results suggest that tmVar is a high-performance method for mutation extraction from biomedical literature.
AVAILABILITY: tmVar software and its corpus of 500 manually curated abstracts are available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/pub/tmVar
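
To make the contrast with rule-based extraction concrete, here is a minimal sketch of a hand-written pattern matcher for a few variant mentions written in Human Genome Variation Society style. It is not tmVar's CRF model, label scheme or feature set, none of which are specified in the abstract. The pattern set, the names AA3, PATTERNS and find_variants, and the example sentences are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately narrow patterns for illustration only.
# This is a rule-based baseline sketch, NOT tmVar's CRF approach.
AA3 = (r"(?:Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|"
       r"Pro|Ser|Thr|Trp|Tyr|Val)")

PATTERNS = {
    # DNA-level substitution, e.g. c.76A>T
    "dna_substitution": re.compile(r"\bc\.\d+[ACGT]>[ACGT]"),
    # Protein-level substitution with three-letter codes, e.g. p.Arg26Trp
    "protein_substitution": re.compile(rf"\bp\.{AA3}\d+{AA3}\b"),
    # DNA-level deletion, e.g. c.1521_1523delCTT
    "dna_deletion": re.compile(r"\bc\.\d+(?:_\d+)?del[ACGT]*"),
}


def find_variants(text):
    """Return (label, mention) pairs in order of appearance in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group(0)))
    return [(label, mention) for _, label, mention in sorted(hits)]


if __name__ == "__main__":
    # Made-up sentence, not taken from the tmVar corpus.
    sentence = "We identified c.76A>T (p.Arg26Trp) and c.1521_1523delCTT."
    for label, mention in find_variants(sentence):
        print(label, mention)
```

A matcher like this finds only the forms it was written for. A free-text description such as "arginine 26 was replaced by tryptophan" returns nothing until someone adds another rule, which is the manual effort the MOTIVATION paragraph describes.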

Year:  2013        PMID: 23564842      PMCID: PMC3661051          DOI: 10.1093/bioinformatics/btt156

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

