Ying Lin1, Xiaoning Qian2, Jeffrey Krischer3, Kendra Vehik3, Hye-Seung Lee3, Shuai Huang1.
Abstract
OBJECTIVE: To identify the risk-predictive baseline profile patterns of demographic, genetic, immunologic, and metabolic markers and synthesize these patterns for risk prediction. RESEARCH DESIGN AND METHODS: RuleFit is used to identify the risk-predictive baseline profile patterns of demographic, immunologic, and metabolic markers, using 356 subjects who were randomized into the control arm of the prospective Diabetes Prevention Trial-Type 1 (DPT-1) study. A novel latent trait model is developed to synthesize these baseline profile patterns for disease risk prediction. The primary outcome was Type 1 Diabetes (T1D) onset.
Year: 2014 PMID: 24926781 PMCID: PMC4057076 DOI: 10.1371/journal.pone.0091095
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Baseline statistics of the study subjects.
| | Oral Insulin Trial (N = 186) | Parenteral Insulin Trial (N = 170) |
| IDDM (%) | 53(28%) | 70(41%) |
| Age, years (mean) | 12.30(8.60) | 15.34(9.92) |
| BMI Z-score (median) | −0.90(−2.35−0.38) | −1.48(−3.04−0.01) |
| Race | ||
| White | 163(89.07%) | 128(95.18%) |
| African American | 2(1.09%) | 1(0.60%) |
| Hispanic | 14(7.65%) | 5(3.01%) |
| Other | 7(3.76%) | 6(3.53%) |
| Gender | ||
| Male | 105(56.45%) | 89(52.35%) |
| Female | 81(43.55%) | 81(47.65%) |
| Relationship to patient w/diabetes | ||
| Sibling | 108(58.06%) | 113(66.47%) |
| Offspring | 53(28.49%) | 39(22.94%) |
| Parent | 7(3.76%) | 5(2.94%) |
| Second Degree | 18(9.68%) | 13(7.65%) |
|
| ||
| Primary haplotype | ||
| 0101/0501 | 27(14.59%) | 14(8.24%) |
| 0102/0604 | 10(5.41%) | 16(9.41%) |
| 0201/0201 | 12(6.49%) | 10(5.88%) |
| 0301/0301 | 19(10.27%) | 11(6.47%) |
| 0301/0302 | 77(41.62%) | 75(44.12%) |
| 0501/0201 | 16(8.65%) | 17(10.00%) |
| Other | 25(13.44%) | 10(5.88%) |
| Secondary haplotype | ||
| 0301/0301 | 8(4.32%) | 12(7.06%) |
| 0301/0302 | 71(38.38%) | 45(26.47%) |
| 0501/0201 | 69(37.30%) | 81(47.65%) |
| 0501/0301 | 15(8.11%) | 15(8.82%) |
| Other | 23(12.37%) | 17(10.00%) |
|
| ||
| ICA titer (JDF Units) | 80.00(40.00–160.00) | 160.00(40.00–320.00) |
| IAA titer (nU/ml) (median) | 192.30(83.30–435.70) | 109.25(26.70–295.34) |
| ICA512 (median) | 0.033(0.006–0.677) | 0.081(0.003–0.645) |
| GAD65(median) | 0.204(0.027–0.677) | 0.322(0.024–0.738) |
|
| ||
| Fasting Glucose (mmol/L)- IVGTT | 4.84(0.51) | 4.94(0.49) |
| Fasting Insulin (mU/L)-IVGTT | 15.41(9.68) | 12.01(7.76) |
| FPIR (ul/ml)-IVGTT | 158.88(99.16) | 72.80(37.10) |
| HOMA-R-IVGTT | 3.39(2.35) | 2.69(1.84) |
| FPIR/HOMA-R-IVGTT | 55.64(33.13) | 32.90(17.09) |
| Fasting Glucose (mg/dL)-OGTT | 86.20(7.78) | 89.22(9.58) |
| Two-hour Glucose (mg/dL)-OGTT | 105.74(19.57) | 122.28(31.59) |
| Peak C-Peptide (nmol/L)-OGTT | 5.44(2.19) | 4.84(1.97) |
| AUC C-Peptide(nmol/L)-OGTT | 508.70(205.79) | 439.28(174.03) |
| HBA1C | 5.33(0.34) | 5.38(0.50) |
Note: Data are mean (± SD), n (%), or median (interquartile range).
*BMI Z-score from 2000 CDC Growth chart.
**JDF denotes Juvenile Diabetes Foundation.
Figure 1. Latent trait model for rule synthesis.
The top 10 rules identified by RuleFit.
| | |
| 24.5 (0) < FPIR < 56.5 (0) | |
| Early C-Peptide Response < 3.9 (0.67) | Peak C-Peptide < 4.75 (0.85) |
| Timing of the Peak C-Peptide > 2.5 (0.12) | |
| | |
| ICA < 240 (0) | IAA < 369.7 (3.84) |
| IAA < 369.7 (4.84) | Fasting Glucose (IVGTT) < 103.5 (2.24) |
| Fasting Glucose (IVGTT) < 98.5 (1.72) | |
| | |
| Age < 13.89 (1.45) | ICA > 120 (3.35) |
| BMI > 19.27 (1.37) | AUC C-Peptide < 638.2 (3.61) |
| 2 hr Glucose > 97.5 (4.08) | |
| | |
| Age < 18.24 (0) | 2 hr Glucose < 117.5 (2.27) |
| 2 hr Glucose > 87.5 (1.96) | FPIR > 70.5 (1.13) |
| ICA > 30 (0) | |
| | |
| ICA > 30 (0) | ICA > 60 (1.39) |
| FPIR < 155 (0) | Early C-Peptide Response < 4.1 (0.67) |
| 7.906 (1.04) < Age < 18.24 (0) | |
Note: the value in parentheses is the standard deviation, calculated by 80/20 cross-validation as described in Section 3.4.
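Each rule in the table is a conjunction of threshold conditions on baseline markers. As a minimal sketch, assuming the three conditions in the left column of the first block form one rule (the rule labels did not survive in this table), membership can be checked as below. The field names and the example subject are hypothetical; the per-threshold standard deviations are ignored.

```python
# Illustrative only: field names and the example subject are hypothetical.
# Thresholds are copied from the first block of the rule table, assuming
# its left column lists the conditions of a single rule.

def satisfies_rule(subject, conditions):
    """Return True if the subject meets every (marker, low, high) condition.

    Bounds are exclusive; None means that bound is absent.
    """
    for marker, low, high in conditions:
        value = subject[marker]
        if low is not None and not value > low:
            return False
        if high is not None and not value < high:
            return False
    return True


example_rule = [
    ("fpir", 24.5, 56.5),                      # 24.5 < FPIR < 56.5
    ("early_c_peptide_response", None, 3.9),   # Early C-Peptide Response < 3.9
    ("peak_c_peptide_time", 2.5, None),        # Timing of the Peak C-Peptide > 2.5
]

subject = {
    "fpir": 40.0,
    "early_c_peptide_response": 3.1,
    "peak_c_peptide_time": 3.0,
}

print(satisfies_rule(subject, example_rule))
```

Splitting the subjects by the boolean returned here gives the two groups that each rule defines.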
Figure 2. Kaplan-Meier survival curves (with their 95% confidence intervals) of the two groups defined by each rule: one satisfies the rule (dotted curve) and one does not (solid curve).
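For readers unfamiliar with the estimator behind such curves, the following is a minimal sketch of the Kaplan-Meier product-limit estimate for one group. It is not the authors' code; the function name and the toy data are illustrative assumptions, and the confidence intervals shown in the figure are not computed here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : follow-up time per subject
    events : 1 if onset was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        onsets = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            onsets += data[i][1]
            leaving += 1
            i += 1
        if onsets > 0:
            survival *= 1.0 - onsets / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= leaving
    return curve


# Toy data (hypothetical): the subject at time 2 is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(toy_curve)
```

Running the estimator once on the subjects that satisfy a rule and once on those that do not yields the pair of curves compared in each panel.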
P-values of the logrank test of the ten rules.
| Rules | P-value of the Logrank test | Rules | P-value of the Logrank test |
| Rule 1 | 0.0091 | Rule 6 | 4.44e-15 |
| Rule 2 | 1.98e-13 | Rule 7 | 8.95e-10 |
| Rule 3 | 2.36e-07 | Rule 8 | 1.12e-12 |
| Rule 4 | 3.5e-10 | Rule 9 | 3.55e-11 |
| Rule 5 | 1.13e-07 | Rule 10 | 1.73e-09 |
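The P-values above compare time to onset between the subjects who satisfy a rule and those who do not. A minimal sketch of the standard two-group log-rank test is given here for orientation; it uses only the Python standard library, the function name and toy data are assumptions, and it is not claimed to reproduce the authors' computation.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events observed exactly at time t.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): group A has onset earlier than group B.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(chi2, p)
```

A small P-value indicates that the two survival curves defined by a rule differ more than chance alone would explain.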
Figure 3. Item response functions of the 10 rules.
Figure 4. Item information curves of the 10 rules.
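This record does not give the functional form of the latent trait model. As a hedged illustration of what item response functions and item information curves look like, the sketch below assumes the standard two-parameter logistic (2PL) form from item response theory, with a hypothetical discrimination `a` and location `b` per rule; the paper's own model may differ.

```python
import math


def item_response(theta, a, b):
    """2PL probability that a subject with latent trait theta satisfies the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = item_response(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical parameters, chosen only to show the shape of the curves.
for theta in (-2.0, 0.0, 2.0):
    print(theta, item_response(theta, 1.5, 0.0), item_information(theta, 1.5, 0.0))
```

Under this assumed form, an item is most informative where the latent trait equals its location parameter, and a larger discrimination gives a steeper response curve and a taller information peak.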
Prediction performances of different methods.
| Method | Sonar | Liver | Pima | Cancer | Appendicitis | Heart | Simulated | DPT-1 |
| Rule-based | 0.89 | 0.74 | 0.75 | 0.95 | 0.94 | 0.86 | 0.94 | 0.82 |
| Decision tree | 0.72 | 0.65 | 0.69 | 0.90 | 0.91 | 0.78 | 0.76 | 0.71 |
| Random forest | 0.84 | 0.71 | 0.76 | 0.95 | 0.93 | 0.82 | 0.88 | 0.74 |
| SVM (linear) | 0.82 | 0.68 | 0.72 | 0.93 | 0.87 | 0.79 | 0.65 | 0.62 |
| SVM (Gaussian) | 0.84 | 0.64 | 0.68 | 0.94 | 0.89 | 0.76 | 0.67 | 0.67 |
| SVM (Polynomial) | 0.84 | 0.65 | 0.71 | 0.88 | 0.91 | 0.75 | 0.65 | 0.65 |
| Logistic regression | 0.76 | 0.67 | 0.65 | 0.92 | 0.89 | 0.77 | 0.55 | 0.58 |