Cengizhan Acikel, Yesim Aydin Son, Cemil Celik, Husamettin Gul.
Abstract
BACKGROUND: Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphisms that form the underlying molecular basis of bipolar disorders.
Keywords: Bipolar disorders; Data Mining; Decision Support; GWAS; MDR; SNP
Year: 2016 PMID: 27920536 PMCID: PMC5127431 DOI: 10.2147/NDT.S112558
Source DB: PubMed Journal: Neuropsychiatr Dis Treat ISSN: 1176-6328 Impact factor: 2.570
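As a rough illustration of the MDR idea named in the abstract, the following Python sketch scores a two-way model: each genotype combination of a SNP pair is labeled high or low risk by comparing its case:control ratio with the overall ratio, and the pair is then scored by balanced accuracy. This is not the authors' pipeline. The 0/1/2 genotype coding, the function names `mdr_pair_score` and `best_pair`, and the toy data are assumptions made for illustration, and the cross-validation that MDR normally relies on is omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def mdr_pair_score(genotypes, labels, i, j):
    """Balanced accuracy of a two-way MDR-style model on SNP columns i and j.

    genotypes: list of rows, each a list of genotype codes (assumed 0/1/2).
    labels: list of 1 (case) or 0 (control), same length as genotypes.
    Assumes at least one case and one control are present.
    """
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        cell = (row[i], row[j])
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    threshold = n_cases / n_controls  # overall case:control ratio

    true_pos = true_neg = 0
    for cell in set(cases) | set(controls):
        # A cell is "high risk" when its case:control ratio exceeds the overall ratio.
        high_risk = cases[cell] > threshold * controls[cell]
        if high_risk:
            true_pos += cases[cell]
        else:
            true_neg += controls[cell]

    return 0.5 * (true_pos / n_cases + true_neg / n_controls)


def best_pair(genotypes, labels):
    """Exhaustively score every SNP pair and return the best-scoring one."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mdr_pair_score(genotypes, labels, *pair))


if __name__ == "__main__":
    # Toy data (hypothetical): SNP 0 and SNP 1 interact, SNP 2 is noise.
    toy_genotypes = [
        [0, 0, 0], [0, 0, 1],
        [0, 1, 0], [0, 1, 1],
        [1, 0, 0], [1, 0, 1],
        [1, 1, 0], [1, 1, 1],
    ]
    toy_labels = [0, 0, 1, 1, 1, 1, 0, 0]
    print(best_pair(toy_genotypes, toy_labels))
```

In the toy data neither SNP 0 nor SNP 1 is informative on its own, yet their combination separates cases from controls, which is the kind of interaction an exhaustive pairwise search is meant to surface. The search cost grows quickly with the number of SNPs and the interaction order.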
Figure 1. Data analysis flowchart.
Abbreviations: BDO, bipolar disorder only; MAF, minor allele frequencies; HWE, Hardy–Weinberg equilibrium; kNN, k-nearest neighbor; MDR, multifactor dimensionality reduction; SNP, single-nucleotide polymorphism; PLINK, open-source whole-genome association analysis toolset version 1.8.
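The MAF and HWE abbreviations above refer to standard SNP quality-control quantities. A minimal sketch of how each can be computed from genotype counts is given below. The function names and example counts are hypothetical, the record does not state the thresholds used in the study, and the chi-square goodness-of-fit check shown here may differ from the test the authors' tooling applies.

```python
def allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of allele A from counts of AA, AB, and BB genotypes."""
    n = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * n)


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF: the frequency of the less common of the two alleles."""
    p = allele_frequency(n_aa, n_ab, n_bb)
    return min(p, 1 - p)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for deviation from HWE.

    Assumes both alleles are observed, so every expected count is positive.
    """
    n = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical counts: the first SNP matches HWE exactly, the second has no heterozygotes.
    print(minor_allele_frequency(25, 50, 25), hwe_chi_square(25, 50, 25))
    print(minor_allele_frequency(50, 0, 50), hwe_chi_square(50, 0, 50))
```

A statistic above 3.84 corresponds to p < 0.05 at one degree of freedom. Genome-wide studies typically apply a much stricter cutoff.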
Main descriptive statistics
| Group | Race | Frequency | Valid percent |
|---|---|---|---|
| BDO | Total | 604 | 34.2 |
| GRU | Total | 1,767 | 65.6 |
| BDO | EA | 339 | 56.1 |
| BDO | AA | 265 | 43.9 |
| GRU | EA | 1,081 | 61.2 |
| GRU | AA | 686 | 38.8 |
Abbreviations: GRU, control genotypes with general research use consent; BDO, bipolar disorder only; EA, European ancestry; AA, African American ancestry.
Single-nucleotide polymorphisms identified in the genome-based model for RF, kNN, and NB methods
| RS ID | RF | kNN | NB | Multifactor dimensionality reduction |
|---|---|---|---|---|
| rs6785 | ☑ | ☑ | ☑ | |
| rs2194124 | ☑ | ☑ | ☑ | |
| rs4792189 | ☑ | ☑ | ☑ | |
| rs7569781 | ☑ | ☑ | ☑ | |
| rs9375098 | ☑ | ☑ | ☑ | |
| rs10415145 | ☑ | ☑ | ☑ | |
| rs10857580 | ☑ | ☑ | ☑ | |
| rs11015814 | ☑ | ☑ | ☑ | |
| rs11015877 | ☑ | ☑ | ☑ | |
| rs732183 | ☑ | ☑ | ☑ | |
| rs11023096 | ☑ | ☑ | | |
| rs1328392 | ☑ | ☑ | | |
| rs2791142 | ☑ | ☑ | | |
| rs1861226 | ☑ | | | |
| rs4654814 | ☑ | | | |
| rs219506 | ☑ | | | |
| rs2055710 | ☑ | | | |
| rs2483023 | ☑ | | | |
| rs9372649 | ☑ | | | |
| rs12145634 | ☑ | | | |
| rs17736182 | ☑ | | | |
Abbreviations: RF, random forest; NB, naïve Bayes; kNN, k-nearest neighbor.
Comparison of the performance of the classification-based models with MDR
| Feature | Method | RF | NB | kNN | MDR (two-way) | MDR (three-way) |
|---|---|---|---|---|---|---|
| Validity criteria | Classification accuracy | 0.734 | 0.702 | 0.733 | 0.647 | 0.721 |
| | | 0.853 | 0.785 | 0.841 | 0.764 | 0.861 |
| | Precision | 0.743 | 0.845 | 0.754 | 0.675 | 0.772 |
| | Recall | 0.998 | 0.734 | 0.954 | 0.664 | 0.883 |
| Overfit | | Very resistant, since bootstrap selection is performed | Relatively risky | Bootstrapping performed to avoid overfitting | Risky; k-fold cross-validation used to overcome the overfitting problem | Risky; k-fold cross-validation used to overcome the overfitting problem |
| Advantages | | Nonparametric | Resistant to noise | Simple, flexible | Nonparametric test | Nonparametric test |
| Disadvantages | | Sensitive to inconsistent data | Accuracy degraded by correlated variables | Sensitive to noise | Too slow | Too slow |
Abbreviations: RF, random forest; NB, naïve Bayes; kNN, k-nearest neighbor; MDR, multifactor dimensionality reduction.
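For readers who want to relate the validity criteria in the table to raw predictions, the sketch below derives classification accuracy, precision, and recall from a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from binary confusion-matrix counts.

    tp/fp: cases predicted correctly / controls predicted as cases.
    tn/fn: controls predicted correctly / cases predicted as controls.
    Assumes at least one positive prediction and at least one true case.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "precision": tp / (tp + fp),     # share of predicted cases that are true cases
        "recall": tp / (tp + fn),        # share of true cases that were detected
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(classification_metrics(tp=90, fp=30, tn=70, fn=10))
```

A classifier that labels nearly every sample as a case scores very high recall with lower precision, so the two measures are best read together.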