
Gene-gene interaction: the curse of dimensionality.

Amrita Chattopadhyay, Tzu-Pin Lu.

Abstract

Genetic variants identified by genome-wide association studies frequently show only modest effects on disease risk, leading to the "missing heritability" problem. One avenue to account for part of this "missingness" is to evaluate gene-gene interactions (epistasis) and thereby elucidate their effect on complex diseases. This can potentially help identify gene functions, pathways, and drug targets. However, the exhaustive evaluation of all possible genetic interactions among millions of single nucleotide polymorphisms (SNPs) raises several issues, collectively known as the "curse of dimensionality". The dimensionality involved in the epistatic analysis of the exponentially growing number of SNP combinations diminishes the usefulness of traditional parametric statistical methods. The immense popularity of multifactor dimensionality reduction (MDR), a non-parametric method proposed in 2001 that collapses multi-dimensional genotypes into a one-dimensional binary variable, led to the emergence of a fast-growing collection of MDR-based methods. Moreover, machine-learning (ML) methods such as random forests and neural networks (NNs), deep-learning (DL) approaches, and hybrid approaches have also been applied extensively in recent years to tackle the dimensionality issue associated with whole-genome gene-gene interaction studies. However, exhaustive searching in MDR-based approaches and variable selection in ML methods still pose the risk of missing relevant SNPs. Furthermore, interpretability issues are a major hindrance for DL methods. To minimize this loss of information, Python-based tools such as PySpark can potentially take advantage of distributed computing resources in the cloud to bring back smaller subsets of data for further local analysis. Parallel computing can be a powerful resource in fighting this "curse". PySpark supports all standard Python libraries and C extensions, making it convenient to write code that delivers dramatic improvements in processing speed for extraordinarily large data sets.

2019 Annals of Translational Medicine. All rights reserved.
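
To make the two central ideas of the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It first counts how many SNP combinations an exhaustive search would have to visit, then performs only the pooling step of MDR for one SNP pair: each two-locus genotype cell is labeled high risk when its case/control ratio exceeds the overall ratio. The function names (`count_interactions`, `mdr_labels`), the 0/1/2 genotype coding, and the toy data are assumptions made for illustration; cross-validation and model selection, which a full MDR analysis requires, are omitted.

```python
from collections import Counter
from math import comb


def count_interactions(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given interaction order."""
    return comb(n_snps, order)


def mdr_labels(genotypes, status, snp_pair):
    """Collapse a two-SNP genotype into one binary high/low-risk attribute.

    genotypes: list of per-sample tuples, each entry coded 0, 1 or 2 (assumed coding)
    status:    list of 1 (case) / 0 (control), same length as genotypes
    snp_pair:  indices (i, j) of the two SNPs to combine
    """
    i, j = snp_pair
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # Count cases and controls in every observed two-locus genotype cell.
    cases, controls = Counter(), Counter()
    for g, s in zip(genotypes, status):
        cell = (g[i], g[j])
        (cases if s == 1 else controls)[cell] += 1

    # Threshold: overall case/control ratio (1.0 for balanced data).
    threshold = n_cases / n_controls
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] > threshold * controls[cell]
    }

    # 1 = high risk, 0 = low risk: the one-dimensional binary attribute.
    return [1 if (g[i], g[j]) in high_risk else 0 for g in genotypes]


# Toy data (invented): disease status follows an XOR-like pattern of two SNPs,
# so neither SNP is informative on its own but the pair is.
toy_genotypes = [(0, 0), (0, 2), (2, 0), (2, 2)] * 2
toy_status = [0, 1, 1, 0] * 2
toy_labels = mdr_labels(toy_genotypes, toy_status, (0, 1))
n_pairs = count_interactions(1_000_000, 2)
```

With one million SNPs, `n_pairs` is already close to 5 × 10^11 pairwise tests, which is the scale that motivates distributing the per-pair work. Because `mdr_labels` depends only on its own SNP pair, it is the kind of function that could, in principle, be mapped over pairs in parallel by a framework such as PySpark.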

Keywords:  Gene-gene interaction; PySpark; deep-learning (DL); machine-learning (ML); multifactor dimensionality reduction (MDR); parallel computing

Year:  2019        PMID: 32042829      PMCID: PMC6989881          DOI: 10.21037/atm.2019.12.87

Source DB:  PubMed          Journal:  Ann Transl Med        ISSN: 2305-5839


