Text mining for precision medicine: automating disease-mutation relationship extraction from biomedical literature.

Ayush Singhal¹, Michael Simmons¹, Zhiyong Lu².

Abstract

OBJECTIVE: Identifying disease-mutation relationships is a significant challenge in the advancement of precision medicine. The aim of this work is to design a tool that automates the extraction of disease-related mutations from biomedical text to advance database curation for the support of precision medicine.
MATERIALS AND METHODS: We developed a machine-learning (ML) based method to automatically identify the mutations mentioned in the biomedical literature related to a particular disease. In order to predict a relationship between the mutation and the target disease, several features, such as statistical features, distance features, and sentiment features, were constructed. Our ML model was trained with a pre-labeled dataset consisting of manually curated information about mutation-disease associations. The model was subsequently used to extract disease-related mutations from larger biomedical literature corpora.
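As a rough illustration of the kind of statistical and distance features mentioned above, the following minimal sketch computes two of them for each mutation mentioned alongside a target disease. Everything in it is an assumption made for illustration: the simplified mutation pattern, the names `pair_features` and `PairFeatures`, and the two specific features are hypothetical and are not taken from the authors' system. Sentiment features are omitted.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified pattern for protein-level point mutations such as
# "V600E" (wild-type residue, position, mutant residue). Real mutation taggers
# recognize many more forms.
MUTATION_RE = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]\d+[ACDEFGHIKLMNPQRSTVWY]")


@dataclass
class PairFeatures:
    mutation: str
    same_sentence_count: int = 0   # statistical: sentences mentioning both
    min_token_distance: int = -1   # distance: smallest token gap, -1 if none


def split_sentences(text):
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Whitespace tokens with surrounding punctuation stripped.
    return [t.strip(".,;:()[]") for t in sentence.split()]


def phrase_positions(tokens, phrase):
    # Start indices at which the (possibly multi-word) phrase occurs.
    words = phrase.lower().split()
    lowered = [t.lower() for t in tokens]
    n = len(words)
    return [i for i in range(len(lowered) - n + 1) if lowered[i:i + n] == words]


def pair_features(text, disease):
    """Return {mutation: PairFeatures} for one document and one disease."""
    feats = {}
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        disease_pos = phrase_positions(tokens, disease)
        # Collect positions of each mutation mentioned in this sentence.
        mut_pos = {}
        for i, tok in enumerate(tokens):
            if MUTATION_RE.fullmatch(tok):
                mut_pos.setdefault(tok, []).append(i)
        for mut, positions in mut_pos.items():
            f = feats.setdefault(mut, PairFeatures(mut))
            if disease_pos:
                f.same_sentence_count += 1
                d = min(abs(p - q) for p in positions for q in disease_pos)
                if f.min_token_distance < 0 or d < f.min_token_distance:
                    f.min_token_distance = d
    return feats


if __name__ == "__main__":
    abstract = (
        "The V600E mutation was found in prostate cancer samples. "
        "R175H was also detected. V600E is frequent in prostate cancer."
    )
    for feature in pair_features(abstract, "prostate cancer").values():
        print(feature)
```

Feature vectors of this kind could then be passed, together with curated labels, to any standard classifier. The abstract does not say which learning algorithm was used.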
RESULTS: The performance of the proposed approach was assessed using a benchmarking dataset. Results show that our proposed approach significantly improves on the previous state of the art, obtaining F-measures of 0.880 and 0.845 for prostate and breast cancer mutations, respectively.
DISCUSSION: To demonstrate its utility, we applied our approach to all abstracts in PubMed for 3 diseases (including a non-cancer disease). The mutations extracted were then manually validated against human-curated databases. The validation results show that the proposed approach is useful in a real-world setting to extract uncurated disease mutations from the biomedical literature.
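For readers unfamiliar with the metric, the sketch below shows how an F-measure is computed from extraction counts. It assumes the balanced F1 variant (the harmonic mean of precision and recall); the abstract does not state which variant was used. The counts in the usage example are invented for illustration and are not the study's data.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from raw counts; returns 0.0 when undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of extracted pairs that are correct
    recall = true_pos / (true_pos + false_neg)     # share of curated pairs that were found
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative counts only: 8 correct extractions, 2 spurious, 2 missed.
    print(round(f_measure(8, 2, 2), 3))
```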
CONCLUSIONS: The proposed approach improves the state of the art for mutation-disease extraction from text. It is scalable and can be generalized to identify mutations for any disease at PubMed scale.

Published by Oxford University Press on behalf of the American Medical Informatics Association 2016. This work is written by US Government employees and is in the public domain in the United States.

Keywords: automated extraction; breast cancer; disease-mutation relationship; machine learning; precision medicine; prostate cancer; text mining

Year: 2016 PMID: 27121612 PMCID: PMC4926749 DOI: 10.1093/jamia/ocw041

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
