Kylee L Spencer, Lana M Olson, Nathalie Schnetz-Boutaud, Paul Gallins, Anita Agarwal, Alessandro Iannaccone, Stephen B Kritchevsky, Melissa Garcia, Michael A Nalls, Anne B Newman, William K Scott, Margaret A Pericak-Vance, Jonathan L Haines.
Abstract
A major goal of personalized medicine is to pre-symptomatically identify individuals at high risk for disease using knowledge of each individual's particular genetic profile and constellation of environmental risk factors. With the identification of several well-replicated risk factors for age-related macular degeneration (AMD), the leading cause of legal blindness in older adults, this previously unreachable goal is beginning to seem less elusive. However, recently developed algorithms have either been much less accurate than expected, given the strong effects of the identified risk factors, or have not been applied to independent datasets, leaving unknown how well they would perform in the population at large. We sought to increase accuracy by using novel modeling strategies, including multifactor dimensionality reduction (MDR) and grammatical evolution of neural networks (GENN), in addition to the traditional logistic regression approach. Furthermore, we rigorously designed and tested our models in three distinct datasets: a Vanderbilt-Miami (VM) clinic-based case-control dataset, a VM family dataset, and the population-based Age-related Maculopathy Ancillary (ARMA) Study cohort. Using a consensus approach to combine the results from logistic regression and GENN models, our algorithm was successful in differentiating between high- and low-risk groups (sensitivity 77.0%, specificity 74.1%). In the ARMA cohort, the positive and negative predictive values were 63.3% and 70.7%, respectively. We expect that future efforts to refine this algorithm by increasing the sample size available for model building, including novel susceptibility factors as they are discovered, and by calibrating the model for diverse populations will improve accuracy.
Year: 2011 PMID: 21455292 PMCID: PMC3063776 DOI: 10.1371/journal.pone.0017784
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Previous studies that developed an AMD algorithm.
| Reference | Factors in the Model | Method(s) Used | Independent Dataset for Validation? | Sensitivity | Specificity | AUC |
| Gold et al. 2006 | CFH, C2, CFB | Genetic Algorithm | yes | 0.74 | 0.56 | . |
| Hughes et al. 2007 | CFH, ARMS2, smoking | risk score | no | . | . | . |
| Jakobsdottir et al. 2008 | CFH, ARMS2, C2, CFB | logistic regression | no | . | . | . |
| Jakobsdottir et al. 2008 | CFH, ARMS2, C2, age, gender, smoking | Generalized MDR | no | 0.70 | 0.74 | . |
| Jakobsdottir et al. 2009 | CFH, ARMS2, C2 | logistic regression | no | . | . | 0.79 |
| Seddon et al. 2009 | CFH, ARMS2, C2, CFB, C3, CFH*supplement treatment group, age, gender, education, baseline AMD grade, smoking, BMI | logistic regression | no | . | . | 0.83 |
| Gibson et al. 2010 | CFH, ARMS2, C3, SERPING1, age, gender, smoking | logistic regression | no | 0.76 | 0.76 | 0.83 |
Characteristics of the datasets.
| Characteristic | VM Training | VM Testing | VM Families | ARMA | p-value VM Training vs. ARMA |
| Cases/Affecteds (#) | 349 | 87 | 326 | 85 | NA |
| Controls/Unaffecteds (#) | 216 | 54 | 86 | 148 | NA |
| Age of exam [mean (sd)] | 73.5 (8.4) | 73.1 (8.3) | 72.8 (9.4) | 79.3 (3.6) | <0.0001 |
| Gender (% Female) | 61.1 | 59.6 | 67.0 | 51.5 | 0.01 |
| % ever Smokers | 52.0 | 56.0 | 54.6 | 50.2 | 0.64 |
| CFH frequency C allele | 50.6 | 48.9 | 61.9 | 42.9 | 0.01 |
| ARMS2 frequency T allele | 35.7 | 31.2 | 42.4 | 23.8 | <0.0001 |
| CFB frequency A allele | 6.8 | 7.1 | 5.2 | 9.7 | 0.05 |
| C3 frequency C allele | 25.3 | 26.2 | 29 | 25.5 | 0.93 |
Logistic regression model in the VM training dataset.
| Factor | Coefficient | p-value | Odds Ratio | 95% CI Lower | 95% CI Upper |
| Age | 0.13 | <0.001 | 1.13 | 1.10 | 1.17 |
| Smoking | 0.48 | 0.026 | 1.61 | 1.06 | 2.45 |
| CFH Y402H | 1.04 | <0.001 | 2.84 | 2.07 | 3.90 |
| ARMS2 A69S | 0.69 | <0.001 | 2.00 | 1.47 | 2.72 |
| CFB R32Q | −1.10 | <0.001 | 0.33 | 0.18 | 0.60 |
| C3 R102G | 0.41 | 0.014 | 1.51 | 1.09 | 2.11 |
| Constant | −10.48 | <0.001 | . | . | . |
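The table above gives enough to sketch how a fitted logistic regression model turns an individual's risk factors into a predicted probability, and how each odds ratio follows from its coefficient as exp(coefficient). The sketch below is illustrative only. The variable coding is an assumption not stated in this record: age in years, smoking as 0/1 for ever-smoker, and each genotype as a count of the listed allele. The function and key names are hypothetical.

```python
import math

# Coefficients copied from the VM training logistic regression table.
COEFFICIENTS = {
    "age": 0.13,
    "smoking": 0.48,
    "cfh_y402h": 1.04,
    "arms2_a69s": 0.69,
    "cfb_r32q": -1.10,
    "c3_r102g": 0.41,
}
CONSTANT = -10.48


def predicted_probability(person):
    """Return the model probability for one individual.

    Assumed coding (not confirmed by the source): age in years,
    smoking 0/1, genotypes as allele counts.
    """
    logit = CONSTANT + sum(COEFFICIENTS[k] * person[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


# Hypothetical individual, for illustration only.
example = {
    "age": 75,
    "smoking": 1,
    "cfh_y402h": 1,
    "arms2_a69s": 1,
    "cfb_r32q": 0,
    "c3_r102g": 0,
}
probability = predicted_probability(example)
high_risk = probability >= 0.5  # the LR [0.5] calling rule
```

Because the published coefficients are rounded to two decimals, odds ratios recomputed this way can differ from the tabulated values in the last digit.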
Classification rates using the VM training dataset for training and VM testing dataset for testing.
| Method | Sensitivity | Specificity | Unadjusted PPV | Unadjusted NPV | % Overall Correct |
| LR [0.5] | 85.1 | 64.8 | 79.6 | 72.9 | 77.3 |
| MDR | 71.8 (58.6) | 80.5 (61.1) | 86.4 (NA) | 62.3 (NA) | 75.0 (59.6) |
| GENN | 83.9 | 74.1 | 83.9 | 74.1 | 80.1 |
| Consensus–LR, MDR, GENN | 82.8 | 74.1 | 83.7 | 72.7 | 79.4 |
| Consensus–LR, GENN | 77.0 | 74.1 | 82.7 | 66.7 | 75.9 |
PPV = positive predictive value, NPV = negative predictive value, NA = not applicable, LR = logistic regression. LR [0.5] indicates the threshold used for determining model calls in the logistic regression analysis. In this case, all individuals with probabilities ≥0.5 were given a model call of “high-risk”. For MDR, 20.6% of the testing dataset could not be classified. The first entry in the table represents the classification rate with only the individuals that could be classified in the denominator. The number in parentheses gives the classification rate with the entire testing dataset as the denominator. For example, using MDR, 71 individuals who were actually cases could be classified, and of those 51/71 = 71.8% were correctly classified as “high-risk”. Considering all cases that were tested, 51/87 = 58.6% were correctly classified. For the consensus of methods, individuals were called high-risk only if two or more methods classified them as high-risk.
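The two denominators and the consensus rule described in these notes can be sketched as follows. This is a minimal illustration, not the authors' code. The function names are hypothetical, and treating an unclassified MDR call (`None`) as contributing no high-risk vote is an assumption, since the notes do not say how such calls enter the consensus.

```python
def consensus_call(method_calls, min_votes=2):
    """High-risk only if at least `min_votes` methods say high-risk.

    method_calls: iterable of True (high-risk), False (low-risk),
    or None (unclassified; assumed here to cast no vote).
    """
    return sum(1 for call in method_calls if call is True) >= min_votes


def sensitivity(calls, is_case, include_unclassified=False):
    """Percent of true cases called high-risk.

    With include_unclassified=False the denominator holds only cases
    that received a call; with True it holds every case tested.
    """
    case_calls = [c for c, case in zip(calls, is_case) if case]
    if not include_unclassified:
        case_calls = [c for c in case_calls if c is not None]
    hits = sum(1 for c in case_calls if c is True)
    return 100.0 * hits / len(case_calls)


# Reconstructs the MDR example in the notes: 87 cases tested,
# 71 classifiable, 51 correctly called high-risk.
mdr_calls = [True] * 51 + [False] * 20 + [None] * 16
all_cases = [True] * 87

classified_only = round(sensitivity(mdr_calls, all_cases), 1)
whole_dataset = round(sensitivity(mdr_calls, all_cases, True), 1)
```

Here `classified_only` is 71.8 and `whole_dataset` is 58.6, matching the two MDR sensitivity entries in the table.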
Comparison of adjusted and unadjusted PPV and NPV in the VM testing dataset.
| Method | Unadjusted PPV | Unadjusted NPV | Adjusted PPV at Prev = 5.5% | Adjusted NPV at Prev = 5.5% | Adjusted PPV at Prev = 15% | Adjusted NPV at Prev = 15% |
| LR [0.5] | 79.6 | 72.9 | 12.3 | 98.7 | 29.9 | 96.1 |
| MDR | 86.4 | 62.3 | 17.6 | 98.0 | 39.4 | 94.2 |
| GENN | 83.9 | 74.1 | 15.9 | 98.8 | 36.4 | 96.3 |
| Consensus–LR, MDR, GENN | 83.7 | 72.7 | 15.7 | 98.7 | 36.1 | 96.1 |
| Consensus–LR, GENN | 82.7 | 66.7 | 14.8 | 98.2 | 34.4 | 94.8 |
Prev = Prevalence.
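This record does not show how the adjusted values were computed. As a hedged illustration, the standard Bayes' rule conversion from sensitivity, specificity, and an assumed population prevalence appears to reproduce the logistic regression row when given the sensitivity (85.1%) and specificity (64.8%) reported for the VM testing dataset. The function name is hypothetical.

```python
def adjusted_ppv_npv(sensitivity, specificity, prevalence):
    """Prevalence-adjusted PPV and NPV via Bayes' rule.

    All arguments and both return values are proportions in [0, 1].
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Logistic regression at the 0.5 threshold in the VM testing dataset.
ppv_low, npv_low = adjusted_ppv_npv(0.851, 0.648, 0.055)
ppv_high, npv_high = adjusted_ppv_npv(0.851, 0.648, 0.15)
```

Rounded to one decimal as percentages, these give 12.3 and 98.7 at 5.5% prevalence and 29.9 and 96.1 at 15% prevalence. The sharp drop in PPV relative to the unadjusted value reflects how heavily the case-control sample is enriched for cases compared with the general population.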
Classification rates using the VM training dataset for training and the ARMA dataset for testing.
| Method | Sensitivity | Specificity | Unadjusted PPV | Unadjusted NPV | % Overall Correct |
| LR [0.5] | 89.4 | 25.7 | 40.9 | 80.9 | 48.9 |
| LR [0.75] | 62.4 | 59.5 | 46.9 | 73.3 | 60.5 |
| LR [0.87 Optimal] | 36.5 | 87.8 | 63.3 | 70.7 | 69.1 |
| MDR | 68.5 (58.8) | 31.4 (25.0) | 38.2 (NA) | 61.7 (NA) | 45.5 (37.3) |
| GENN | 76.5 | 36.5 | 43.6 | 73.0 | 51.1 |
| Consensus–LR [0.5], MDR, GENN | 77.6 | 33.8 | 40.2 | 72.5 | 49.8 |
| Consensus–LR [0.5], GENN | 74.1 | 41.9 | 42.3 | 73.8 | 53.6 |
| Consensus–LR [0.75], MDR, GENN | 64.7 | 53.4 | 44.4 | 72.5 | 57.5 |
| Consensus–LR [0.75], GENN | 61.2 | 59.5 | 46.4 | 72.7 | 60.1 |
| Consensus–LR [0.87], MDR, GENN | 60.0 | 58.1 | 45.1 | 71.7 | 58.8 |
| Consensus–LR [0.87], GENN | 36.5 | 87.8 | 63.3 | 70.7 | 69.1 |
LR [0.87 Optimal] indicates that the threshold correctly classifying the most individuals, as determined by the ROC curve, was applied to the testing dataset.
See notes accompanying Table 4 for further explanation.
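One simple way to find a threshold that correctly classifies the most individuals is to scan the candidate cut-points along the ROC curve and keep the one with the highest overall accuracy. The sketch below shows that idea under stated assumptions. It is not the authors' procedure, the function name is hypothetical, and the probabilities and labels are made-up toy values rather than study data.

```python
def best_threshold(probabilities, is_case):
    """Return (threshold, accuracy) maximizing overall correct calls.

    Each distinct predicted probability is tried as a cut-point;
    individuals with probability >= threshold are called high-risk.
    Ties keep the lowest threshold reaching the best accuracy.
    """
    best = None
    for threshold in sorted(set(probabilities)):
        correct = sum(
            (p >= threshold) == case
            for p, case in zip(probabilities, is_case)
        )
        accuracy = correct / len(is_case)
        if best is None or accuracy > best[1]:
            best = (threshold, accuracy)
    return best


# Toy data for illustration only (not from the study).
toy_probabilities = [0.2, 0.6, 0.7, 0.9, 0.95]
toy_is_case = [False, False, False, True, True]
threshold, accuracy = best_threshold(toy_probabilities, toy_is_case)
```

A threshold chosen this way trades sensitivity for specificity, which is the pattern the table shows: moving from 0.5 to 0.87 raises overall accuracy from 48.9% to 69.1% while sensitivity falls from 89.4% to 36.5%.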
Logistic regression model in the ARMA dataset.
| Factor | Coefficient | p-value | Odds Ratio | 95% CI Lower | 95% CI Upper |
| Age | 0.05 | 0.22 | 1.05 | 0.97 | 1.14 |
| Smoking | 0.41 | 0.16 | 1.51 | 0.85 | 2.68 |
| CFH Y402H | 0.73 | <0.0001 | 2.08 | 1.39 | 3.11 |
| CFB R32Q | −0.96 | 0.02 | 0.38 | 0.17 | 0.86 |
| ARMS2 A69S | 0.37 | 0.12 | 1.45 | 0.91 | 2.31 |
| C3 R102G | −0.03 | 0.89 | 0.97 | 0.61 | 1.53 |
| Constant | −5.45 | 0.10 | . | . | . |
Classification rates using the ARMA dataset for training and VM training combined with VM testing as the testing dataset.
| Method | Sensitivity | Specificity | Unadjusted PPV | Unadjusted NPV | % Overall Correct |
| LR [0.5] | 37.4 | 94.8 | 92.1 | 48.4 | 59.3 |
| LR [0.30 Optimal] | 79.6 | 70.7 | 81.5 | 68.2 | 76.2 |
| MDR | 48.8 (24.1) | 49.4 (15.9) | 70.5 (NA) | 28.1 (NA) | 49.0 (21.0) |
| GENN | 65.4 | 59.3 | 72.2 | 51.4 | 63.0 |
| Consensus–LR [0.5], MDR, GENN | 42.7 | 90.0 | 87.3 | 49.3 | 60.8 |
| Consensus–LR [0.5], GENN | 35.6 | 94.8 | 91.7 | 47.7 | 58.2 |
| Consensus–LR [0.3], MDR, GENN | 65.4 | 75.9 | 81.4 | 57.6 | 69.4 |
| Consensus–LR [0.3], GENN | 61.2 | 79.6 | 82.9 | 56 | 68.3 |
For MDR, 57.2% of the testing dataset could not be classified. The first entry in the table represents the classification rate with only the individuals that could be classified in the denominator. The number in parentheses gives the classification rate with the entire testing dataset as the denominator.