
Evaporative cooling feature selection for genotypic data involving interactions.

B A McKinney, D M Reif, B C White, J E Crowe, J H Moore.

Abstract

MOTIVATION: The development of genome-wide capabilities for genotyping has led to the practical problem of identifying the minimum subset of genetic variants relevant to the classification of a phenotype. This challenge is especially difficult in the presence of attribute interactions, noise and small sample size.
METHODS: Analogous to the physical mechanism of evaporation, we introduce an evaporative cooling (EC) feature selection algorithm that seeks to obtain a subset of attributes with the optimum information temperature (i.e. the least noise). EC uses an attribute quality measure analogous to thermodynamic free energy that combines Relief-F and mutual information to evaporate (i.e. remove) noise features, leaving behind a subset of attributes that contain DNA sequence variations associated with a given phenotype.
RESULTS: EC is able to identify functional sequence variations that involve interactions (epistasis) with other sequence variations that influence their association with the phenotype. This ability is demonstrated on simulated genotypic data with attribute interactions and on real genotypic data from individuals who experienced adverse events following smallpox vaccination. The EC formalism allows us to combine information entropy, energy and temperature into a single information free energy attribute quality measure that balances interaction and main effects.
AVAILABILITY: Open source software, written in Java, is freely available upon request.
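
ILLUSTRATION: The abstract describes the attribute quality measure only in words, so the following Python sketch is a loose illustration and not the authors' Java implementation. It assumes an additive combination, quality = Relief score + temperature × mutual information, and it uses a simplified Relief (one nearest hit and one nearest miss per sample) in place of full Relief-F. The function names, the `temperature` parameter and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(column, labels):
    """Mutual information (in bits) between one discrete attribute and the class."""
    n = len(labels)
    joint = Counter(zip(column, labels))
    p_attr = Counter(column)
    p_class = Counter(labels)
    return sum(
        (c / n) * math.log2((c / n) / ((p_attr[x] / n) * (p_class[y] / n)))
        for (x, y), c in joint.items()
    )


def relief_scores(rows, labels, active):
    """Simplified Relief: one nearest hit and one nearest miss per sample.

    This is a stand-in for Relief-F, which averages over k neighbours.
    """
    def distance(a, b):
        # Hamming distance over the attributes that have not evaporated yet.
        return sum(a[j] != b[j] for j in active)

    n = len(rows)
    weights = {j: 0.0 for j in active}
    for i in range(n):
        hits = [k for k in range(n) if k != i and labels[k] == labels[i]]
        misses = [k for k in range(n) if labels[k] != labels[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: distance(rows[i], rows[k]))
        miss = min(misses, key=lambda k: distance(rows[i], rows[k]))
        for j in active:
            # Reward attributes that differ on the miss, penalise those that differ on the hit.
            weights[j] += ((rows[i][j] != rows[miss][j]) - (rows[i][j] != rows[hit][j])) / n
    return weights


def evaporative_selection(rows, labels, keep, temperature=1.0):
    """Remove ("evaporate") the lowest-quality attribute until `keep` remain.

    ASSUMPTION: quality = relief + temperature * mutual_information.
    The abstract does not state the exact functional form.
    """
    active = list(range(len(rows[0])))
    while len(active) > keep:
        relief = relief_scores(rows, labels, active)
        quality = {
            j: relief[j] + temperature * mutual_information([r[j] for r in rows], labels)
            for j in active
        }
        active.remove(min(active, key=quality.get))
    return active


# Toy data (hypothetical): attribute 0 equals the class label,
# attributes 1-3 are balanced and statistically independent of it.
labels = [i % 2 for i in range(16)]
rows = [[i % 2, (i // 2) % 2, (i // 4) % 2, (i // 8) % 2] for i in range(16)]
print(evaporative_selection(rows, labels, keep=1))
```

In this sketch the `temperature` weight sets the balance between the Relief term, which is the component sensitive to interactions, and the mutual information term, which captures main effects. The toy data contains only a main effect, so it exercises the removal loop rather than the detection of epistasis.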

Year:  2007        PMID: 17586549      PMCID: PMC3988427          DOI: 10.1093/bioinformatics/btm317

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

