Madhuchhanda Bhattacharjee, Mangalathu S Rajeevan, Mikko J Sillanpää.
Abstract
BACKGROUND: The current practice of using only a few strongly associated genetic markers in regression models results in generally low power in prediction or accounting for heritability of complex human traits.
Year: 2015 PMID: 26063326 PMCID: PMC4479222 DOI: 10.1186/s40246-015-0030-6
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Demographic and other characteristics of the subjects selected for analysis
| Factor | Categories | NF subjects | CFS subjects |
|---|---|---|---|
| Age (years) | MQMQM (a) | 31.0/44.3/51.5/56.0/69.0 | 27.0/46.5/51.0/57.5/69.0 |
| Sex | Female/male | 46/12 | 36/7 |
| Race | White/Black/others | 54/2/2 | 40/1/2 |
| BMI | MQMQM | 16.0/25.3/29.0/32.0/40.0 | 23.0/26.0/29.0/32.5/40.0 |
| Onset (b) | Gradual/sudden | 14/1 | 36/6 |
(a) MQMQM represents the minimum, first quartile, median, third quartile, and maximum, respectively
(b) Onset represents gradual vs. sudden onset of illness. This information is available for all but one CFS subject. For NF subjects, onset information is relevant only to the 15 individuals with a past report of chronic fatigue
Top 10 genetic markers associated with CFS based on weighted genetic variation (WGV) estimated by the Bayesian model
| SNP ID | Proxy SNP | Gene symbol (a) | SNP annotation (a) | WGV | SE of WGV (b) |
|---|---|---|---|---|---|
| rs2288831 | rs3212227 | | Intron (UTR-3) | 3.95 | 0.0299 |
| rs2071376 | | | intron | 3.6 | 0.0296 |
| rs2069718 | | | intron | 3.34 | 0.0272 |
| rs846906 | | | intron | 3.29 | 0.0337 |
| rs1923884 | | | intron | 3.16 | 0.0324 |
| rs1799836 | | | Intron | 2.56 | 0.0394 |
| rs363236 | rs3814230 | | UTR-3 (synonymous codon) | 2.31 | 0.0272 |
| rs1396862 | rs1218523 | | Intron (missense codon) | 2.31 | 0.0334 |
| rs891512 | rs743507 | | Intron | 2.18 | 0.0287 |
| rs1124492 | rs46220755 | | Intron | 2.02 | 0.0312 |
(a) Gene symbol and SNP annotation in parentheses correspond to proxy SNPs, if different from the SNPs genotyped for the model
(b) SE of WGV: standard error of weighted genetic variation
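The percentile table further down ranks SNPs by WGV and keeps those above a cutoff. The following is a minimal sketch of that selection step, using only the ten WGV values listed above. The function name `select_snps` and the "greater than or equal to" comparison are assumptions made for illustration; the source does not state how ties at the cutoff are handled.

```python
# WGV values of the ten SNPs listed in the table above.
top10_wgv = {
    "rs2288831": 3.95,
    "rs2071376": 3.60,
    "rs2069718": 3.34,
    "rs846906": 3.29,
    "rs1923884": 3.16,
    "rs1799836": 2.56,
    "rs363236": 2.31,
    "rs1396862": 2.31,
    "rs891512": 2.18,
    "rs1124492": 2.02,
}


def select_snps(wgv_by_snp, cutoff):
    """Keep SNPs whose WGV is at or above the cutoff, strongest first.

    Hypothetical helper: the >= rule is an assumption, not the authors' code.
    """
    kept = [(snp, wgv) for snp, wgv in wgv_by_snp.items() if wgv >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Cutoffs 3.95 and 2.13 are the 100th and 95th percentile rows of the
# percentile table.
for cutoff in (3.95, 2.13):
    selected = select_snps(top10_wgv, cutoff)
    print(cutoff, len(selected), [snp for snp, _ in selected])
```

Under these assumptions the two cutoffs retain 1 and 9 of the listed SNPs, which agrees with the SNP counts reported for those percentiles.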
Fig. 1 Impact of varying the number of SNPs on prediction performance, as measured by sensitivity, specificity, and accuracy
Increase in accuracy with increasing number of SNPs in the predictive model with individual-level allelic information
| Percentile (a) | Cutoff for weighted genetic variation | No. of SNPs | Sensitivity (%) | Specificity (%) | Accuracy (proportion) |
|---|---|---|---|---|---|
| 100 | 3.95 | 1 | 74.42 | 41.38 | 0.55 |
| 95 | 2.13 | 9 | 62.79 | 70.69 | 0.67 |
| 90 | 1.30 | 17 | 65.12 | 81.03 | 0.74 |
| 85 | 1.07 | 26 | 67.44 | 87.93 | 0.79 |
| 80 | 0.96 | 34 | 69.77 | 87.93 | 0.80 |
| 75 | 0.82 | 42 | 76.74 | 87.93 | 0.83 |
| 70 | 0.79 | 52 | 81.40 | 91.38 | 0.87 |
| 65 | 0.74 | 58 | 81.40 | 89.66 | 0.86 |
| 60 | 0.70 | 70 | 86.05 | 93.10 | 0.90 |
| 55 | 0.68 | 76 | 86.05 | 91.38 | 0.89 |
| 50 | 0.63 | 84 | 88.37 | 96.55 | 0.93 |
| 45 | 0.58 | 94 | 88.37 | 96.55 | 0.93 |
| 40 | 0.54 | 100 | 88.37 | 98.28 | 0.94 |
| 35 | 0.51 | 109 | 93.02 | 96.55 | 0.95 |
| 30 | 0.48 | 116 | 93.02 | 96.55 | 0.95 |
| 25 | 0.45 | 125 | 93.02 | 98.28 | 0.96 |
| 20 | 0.42 | 135 | 93.02 | 96.55 | 0.95 |
| 15 | 0.41 | 142 | 93.02 | 98.28 | 0.96 |
| 10 | 0.36 | 150 | 93.02 | 98.28 | 0.96 |
| 5 | 0.32 | 159 | 95.35 | 98.28 | 0.97 |
| 0 | 0.00 | 167 | 100 | 100 | 1.00 |
(a) Percentiles are those for the estimated weighted genetic variation (WGV) under the full model
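In the table above, sensitivity and specificity are given in percent while accuracy is given as a proportion. A short sketch shows how the three columns relate, assuming group sizes of 43 CFS and 58 NF subjects (the sex and race totals in the demographics table). The function name `accuracy_from_rates` is hypothetical.

```python
# Group sizes taken from the sex row of the demographics table.
N_CFS = 36 + 7   # 43 CFS subjects
N_NF = 46 + 12   # 58 NF subjects


def accuracy_from_rates(sensitivity_pct, specificity_pct,
                        n_cases=N_CFS, n_controls=N_NF):
    """Overall accuracy (as a proportion) implied by sensitivity and
    specificity given in percent."""
    true_pos = sensitivity_pct / 100.0 * n_cases
    true_neg = specificity_pct / 100.0 * n_controls
    return (true_pos + true_neg) / (n_cases + n_controls)


# (No. of SNPs, sensitivity %, specificity %) copied from four table rows.
rows = [(1, 74.42, 41.38), (9, 62.79, 70.69),
        (84, 88.37, 96.55), (167, 100.0, 100.0)]
for n_snps, sens, spec in rows:
    print(n_snps, round(accuracy_from_rates(sens, spec), 2))
```

For these four rows the computed values round to 0.55, 0.67, 0.93, and 1.0, matching the reported accuracy column.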
Fig. 2 Impact of varying the threshold and the number of SNPs on prediction performance as assessed by the in-data prediction accuracy
CFS prediction from in-data and tenfold cross-validations
| Model and prediction type | Accuracy (%) | Sensitivity (%) | Specificity (%) | FDR (%) |
|---|---|---|---|---|
| In-data-unconstrained model | 100.0 | 100.0 | 100.0 | 0.0 |
| In-data-constrained model | 100.0 | 100.0 | 100.0 | 0.0 |
| | 79.2 | 74.4 | 82.8 | 23.8 |
| | 72.7 | 60.0 | 83.3 | 25.0 |
| | 70.0 | 75.0 | 66.7 | 40.0 |
| | 80.0 | 75.0 | 83.3 | 25.0 |
| | 100.0 | 100.0 | 100.0 | 0.0 |
| | 70.0 | 50.0 | 83.3 | 33.3 |
| | 70.0 | 50.0 | 83.3 | 33.3 |
| | 70.0 | 75.0 | 66.7 | 40.0 |
| | 70.0 | 75.0 | 66.7 | 40.0 |
| | 100.0 | 100.0 | 100.0 | 0.0 |
| | 90.0 | 80.0 | 100.0 | 0.0 |
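The four metrics in the table above all follow from a 2x2 confusion matrix. The sketch below illustrates the arithmetic for the row reporting 79.2/74.4/82.8/23.8, which matches the tenfold cross-validation result for "Bayes with dependence" in the comparison table. The counts are an assumption: they are back-calculated from the reported percentages and the group sizes of 43 CFS and 58 NF subjects, and are not stated in the source. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, and FDR as percentages."""
    total = tp + fn + tn + fp
    predicted_pos = tp + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        # FDR: share of subjects predicted as CFS who are actually NF.
        "fdr": 100.0 * fp / predicted_pos if predicted_pos else 0.0,
    }


# Assumed counts (not given in the source): 32 of 43 CFS subjects and
# 48 of 58 NF subjects classified correctly.
overall = classification_metrics(tp=32, fn=11, tn=48, fp=10)
print({name: round(value, 1) for name, value in overall.items()})
```

With these assumed counts the rounded values are 79.2, 74.4, 82.8, and 23.8, which reproduces the reported row.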
Fig. 3 Probability of CFS at individual level as estimated by the K-fold cross-validation model and in-data prediction model
CFS prediction from tenfold cross-validation for competing methods
| Model | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| Bayes with dependence | 79.2 | 74.4 | 82.8 |
| Bayes with independence | 77.2 | 65.1 | 86.2 |
| LASSO | 59.4 | 20.9 | 87.9 |
| Ridge regression | 60.4 | 18.6 | 91.4 |