
Enabling integrative genomic analysis of high-impact human diseases through text mining.

Joel Dudley, Atul J Butte.

Abstract

Our limited ability to perform large-scale translational discovery and analysis of disease characterizations from public genomic data repositories remains a major bottleneck in efforts to translate genomics experiments to medicine. Through comprehensive, integrative genomic analysis of all available human disease characterizations we gain crucial insight into the molecular phenomena underlying pathogenesis as well as intra- and inter-disease differentiation. Such knowledge is crucial in the development of improved clinical diagnostics and the identification of molecular targets for novel therapeutics. In this study we build on our previous work to realize the next important step in large-scale translational discovery and analysis, which is to automatically identify those genomic experiments in which a disease state is compared to a normal control state. We present an automated text mining method that employs Natural Language Processing (NLP) techniques to automatically identify disease-related experiments in the NCBI Gene Expression Omnibus (GEO) that include measurements for both disease and normal control states. In this manner, we find that 62% of disease-related experiments contain sample subsets that can be automatically identified as normal controls. Furthermore, we calculate that the identified experiments characterize diseases that contribute to 30% of all human disease-related mortality in the United States. This work demonstrates that we now have the necessary tools and methods to initiate large-scale translational bioinformatics inquiry across the broad spectrum of high-impact human disease.
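The abstract describes flagging experiments whose sample subsets include both a disease state and a normal control state. The following is only a minimal sketch of that idea, not the authors' method: the term list `CONTROL_TERMS`, the simple negation check, and the function names `is_control_subset` and `has_disease_and_control` are all hypothetical, and the input is assumed to be a plain list of free-text subset descriptions rather than actual GEO records.

```python
import re

# Hypothetical control-state vocabulary (an assumption for illustration;
# the term list used in the paper is not given in this record).
CONTROL_TERMS = ("normal", "control", "healthy")

_CONTROL_RE = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in CONTROL_TERMS) + r")\b",
    re.IGNORECASE,
)
# Very rough negation check: "non-", "non " or "not " directly before the term.
_NEGATION_RE = re.compile(r"\b(?:non|not)[-\s]?$", re.IGNORECASE)


def is_control_subset(description: str) -> bool:
    """Return True if the text mentions a control term that is not negated."""
    for match in _CONTROL_RE.finditer(description):
        prefix = description[: match.start()]
        if not _NEGATION_RE.search(prefix):
            return True
    return False


def has_disease_and_control(subset_descriptions) -> bool:
    """Return True if at least one subset looks like a control and one does not."""
    flags = [is_control_subset(d) for d in subset_descriptions]
    return any(flags) and not all(flags)
```

A keyword match of this kind is deliberately crude; a fuller approach would presumably map terms to a controlled vocabulary and handle negation more carefully.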

Year: 2008    PMID: 18229717    PMCID: PMC2735266

Source DB: PubMed    Journal: Pac Symp Biocomput    ISSN: 2335-6928


