Madhusmita Behera, Erin E Fowler, Taofeek K Owonikoko, Walker H Land, William Mayfield, Zhengjia Chen, Fadlo R Khuri, Suresh S Ramalingam, John J Heine.
Abstract
BACKGROUND: Statistical learning (SL) techniques can address non-linear relationships and small datasets but do not provide an output that has an epidemiologic interpretation.
Year: 2011 PMID: 22067671 PMCID: PMC3280940 DOI: 10.1186/1475-925X-10-97
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Figure 1. Modified probabilistic neural network stochastic training and z generation flow diagram. This schema shows the modified probabilistic neural network (PNN) training flow for the differential evolution (DE) sigma-weight vector construction, competition, and feedback from the g to the g+1 populations. The sigma-weight vectors x_g and u_g compete for the next generation. The receiver operating characteristic curve area (Az) from the stochastic cross-validation is derived with ensemble averaging to reduce the chance of passing outliers back to the vector competition. When g = G, the evolution stops and the sigma-weights are used in the PNN to generate z for each patient stochastically with ensemble averaging. The z quantities are then passed to the survival and logistic regression analyses.
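The competition between sigma-weight vectors described in this caption can be sketched as a generic differential evolution loop. This is a minimal illustration and not the authors' implementation: the `fitness` callable is a hypothetical stand-in for the ensemble-averaged Az from stochastic cross-validation, and the population size, scale factor, crossover rate, and bounds are assumed values.

```python
import random


def differential_evolution(fitness, dim, pop_size=12, generations=30,
                           scale=0.5, crossover=0.9, bounds=(0.1, 5.0), seed=0):
    """Generic DE/rand/1/bin maximizer (illustrative; all defaults are assumptions)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Generation g = 0: random candidate vectors x_g.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]

    for _ in range(generations):
        next_pop, next_scores = list(pop), list(scores)
        for i in range(pop_size):
            # Pick three distinct vectors other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated component
            trial = []
            for k in range(dim):
                if k == forced or rng.random() < crossover:
                    value = pop[a][k] + scale * (pop[b][k] - pop[c][k])
                    value = min(max(value, lo), hi)  # clip to the allowed range
                else:
                    value = pop[i][k]
                trial.append(value)
            # Competition: trial u_g replaces x_g only if it scores at least as well.
            trial_score = fitness(trial)
            if trial_score >= scores[i]:
                next_pop[i], next_scores[i] = trial, trial_score
        pop, scores = next_pop, next_scores  # feedback into generation g + 1

    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


def toy_fitness(weights):
    """Hypothetical objective standing in for Az; peaks at weights = (1.0, 2.0)."""
    target = (1.0, 2.0)
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = differential_evolution(toy_fitness, dim=2)
```

In the setting of the caption, `toy_fitness` would be replaced by a routine that trains the PNN with the candidate sigma-weights and returns the averaged Az; the loop itself would be unchanged.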
Patient characteristics.
| Characteristic | Incident n | Incident mean/SD or % | Censored n | Censored mean/SD or % | p-value |
|---|---|---|---|---|---|
| Age | 59 | 69.58/7.85 | 92 | 65.42/8.84 | 0.0038* |
| Grade | 59 | 2.22/0.62 | 92 | 2.10/0.68 | 0.2651* |
| One | 6 | 10.17% | 17 | 18.48% | 0.1656 |
| Two | 34 | 57.63% | 49 | 53.26% | 0.5988 |
| Three | 19 | 32.20% | 26 | 28.26% | 0.6053 |
| Gender | | | | | |
| Male | 38 | 64.41% | 34 | 36.96% | 0.0010 |
| Female | 21 | 35.59% | 58 | 63.04% | 0.0010 |
| Histology subtype | | | | | |
| Adenocarcinoma | 29 | 49.15% | 58 | 63.04% | 0.0919 |
| Squamous | 25 | 42.37% | 20 | 21.74% | 0.1510 |
| Large Cell | 3 | 5.08% | 11 | 11.96% | 0.1555 |
| Adenosquamous | 2 | 3.39% | 3 | 3.26% | 0.9655 |
| Smoking status | | | | | |
| Non-Smoker | 12 | 20.34% | 19 | 20.65% | 0.9629 |
| Smoker | 47 | 79.66% | 73 | 79.35% | 0.9629 |
The patient characteristics are summarized in this table. The incident column refers to group-2 patients (unfavorable survival outcome) and the censored column refers to group-1 patients (favorable survival outcome). The mean and standard deviation (SD) are provided for the continuous variables and percentages (%) are provided for the other variables by group. The number of patients (n) for each variable is given for each group. The incident and censored group characteristics were compared with either the t-test (*) or binomial proportional test. The relevant p-values are provided in the last column (right).
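As an illustration of the proportion comparison mentioned above, the sketch below applies a pooled two-proportion z-test to the male counts from the table. The source does not spell out the exact form of its binomial proportion test, so the pooled z-test is an assumption, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(count_a, n_a, count_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions."""
    p_a, p_b = count_a / n_a, count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Male patients: 38 of 59 incident vs. 34 of 92 censored (counts from the table).
z_stat, p_value = two_proportion_z_test(38, 59, 34, 92)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```

Under this assumption the gender comparison comes out close to the tabulated p-value of 0.0010.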
Odds Ratios.
| Model | SD | Age OR | Az | Covariate | Unit | Covariate OR |
|---|---|---|---|---|---|---|
| Accepted | | | | | | |
| Age | 8.681 | 0.60 | 0.636 | | | |
| Grade adjusted | 8.681 | 0.58 | 0.657 | Grade | 1 | 0.68 |
| Grade and Gender adjusted | 8.681 | 0.63 | 0.703 | Grade | 1 | 0.73 |
| | | | | Gender | Male vs. Female | 0.38 |

| Model | SD | ln(z) OR | Az | Covariate | Unit | Covariate OR |
|---|---|---|---|---|---|---|
| Hybrid | | | | | | |
| z (Age and Grade) | 1.695 | 4.15 | 0.763 | | | |
| Gender adjusted | 1.695 | 3.67 | 0.778 | Gender | Male vs. Female | 0.50 |
The odds ratios (ORs) and 95% confidence intervals (CIs) are provided parenthetically for the variables used in the logistic regression modeling. The ORs for the continuous variables (age and z) are cited per standard deviation (SD) increase in the respective variable or as a unit increase (grade) while controlling for the other variables (covariates) when applicable. The z variable includes grade and age simultaneously. The ORs for the other covariates are listed in the column to the right. The area under the receiver operating characteristic curve (Az) is also provided for each model.
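Because the continuous-variable ORs are quoted per SD increase, it can help to see how such a value maps to a per-unit OR. The sketch below assumes a standard logistic model that is linear in the variable, so that the log-odds coefficient scales with the size of the increment; the function name is hypothetical.

```python
import math


def odds_ratio_per_unit(or_per_sd, sd):
    """Convert an OR quoted per SD increase into an OR per one-unit increase."""
    beta_per_sd = math.log(or_per_sd)   # log-odds change per SD
    return math.exp(beta_per_sd / sd)   # log-odds change per unit, back-transformed


# Age model from the table: OR = 0.60 per SD, with SD = 8.681 years.
age_or_per_year = odds_ratio_per_unit(0.60, 8.681)
print(f"Age OR per year: {age_or_per_year:.3f}")
```

Raising the per-unit value to the power of the SD recovers the tabulated per-SD OR, which is a quick consistency check on the conversion.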
Figure 2. Logistic regression model output for each tumor grade. The plots on the left show the logistic regression model probabilities (P) using the age and grade variables as the model inputs for each tumor grade. The plots on the right show the respective hybrid logistic regression model probabilities (P) using the variable z (i.e., age and grade combined with the probabilistic neural network) as the model input. Because there are overlapping points (patients with the same grade and age), some points are not distinguishable. The censored group (black) is compared with the incident group (red). The curves were fitted with a cubic spline.
Age and z relationships.
| Censored group | Grade 1 | Grade 2 | Grade 3 | All |
|---|---|---|---|---|
| n | 17 | 49 | 26 | 92 |
| Age (mean) | 66.41 | 66.27 | 63.19 | 65.42 |
| z (mean) | 2.11 | 3.91 | 3.12 | 3.36 |

| Incident group | Grade 1 | Grade 2 | Grade 3 | All |
|---|---|---|---|---|
| n | 6 | 34 | 19 | 59 |
| Age (mean) | 73.83 | 68.88 | 69.47 | 69.58 |
| z (mean) | 0.26 | 1.37 | -0.07 | 0.8 |
This table gives the mean values for age and z as a function of tumor grade and censored/incident group status. The number (n) of patients in each category is also provided.
Figure 3. Survival probability curves for age. The upper and lower-age groups were formed by dichotomizing the total collection of patients at their median age. The lower-age curve (upper blue curve) exhibits better survival characteristics than the upper-age group (bottom brown curve).
Hazard relationships for dichotomous age and z.
| Model | Age Hazard Ratio | Az |
|---|---|---|
| Accepted | | |
| Dichotomous Age | 1.72 (1.02, 2.90) | 0.5792 |
| Grade adjusted | 1.78 (1.06, 3.02) | 0.606 |
| Gender adjusted | 1.64 (0.96, 2.78) | 0.669 |
| Grade Gender adjusted | 1.68 (0.99, 2.85) | 0.677 |

| Model | z Hazard Ratio | Az |
|---|---|---|
| Hybrid | | |
| Dichotomous z | 0.25 (0.14, 0.47) | 0.691 |
| Gender adjusted | 0.28 (0.15, 0.53) | 0.738 |
For the age and z variables, two groups were formed using the respective distribution median as the cut-point and compared. The hazard ratios (HRs) are provided with 95% confidence intervals in parentheses. The areas under the receiver operating characteristic curve (Az) derived from the Cox regression models are also provided. Age and z relate inversely to hazard: increased age confers a greater hazard, whereas decreased z confers a greater hazard. To compare the z HR with the age HR, the reciprocal of the z HR is required.
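The reciprocal comparison described above can be worked through directly from the tabulated values. This is a small sketch with a hypothetical helper name; the only point it makes is that the confidence limits swap roles when a ratio is inverted.

```python
def invert_hazard_ratio(hr, ci_low, ci_high):
    """Return the reciprocal HR; the confidence limits swap roles when inverted."""
    return 1.0 / hr, 1.0 / ci_high, 1.0 / ci_low


# Dichotomous z from the table: HR 0.25 (0.14, 0.47).
hr, low, high = invert_hazard_ratio(0.25, 0.14, 0.47)
print(f"Reciprocal z HR: {hr:.2f} ({low:.2f}, {high:.2f})")
# prints "Reciprocal z HR: 4.00 (2.13, 7.14)"
```

On this scale the unadjusted z result can be read alongside the unadjusted dichotomous age HR of 1.72 (1.02, 2.90).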
Survival probability statistical test summaries.
| Model | Test | Chi-Square | DF | p-value |
|---|---|---|---|---|
| Accepted | | | | |
| Dichotomous Age over Strata | Log-Rank | 4.1784 | 1 | 0.0409 |
| | Wilcoxon | 3.4073 | 1 | 0.0649 |
| Dichotomous Age and Gender over Strata | Log-Rank | 12.7383 | 3 | 0.0052* |
| | Wilcoxon | 13.5117 | 3 | 0.0043* |
| Hybrid | | | | |
| Dichotomous z over Strata | Log-Rank | 22.7597 | 1 | < 0.0001 |
| | Wilcoxon | 14.9418 | 1 | 0.0001 |
| Dichotomous z and Gender | Log-Rank | 28.1863 | 3 | < 0.0001* |
| | Wilcoxon | 22.4886 | 3 | < 0.0001* |
The statistical test findings for the various age- and z-related survival probability curves are provided with the degrees of freedom (DF). When comparing more than two survival curves (*), the hypothesis that all the curves were the same was tested against the alternative that at least one curve was different.
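To show how a chi-square statistic and its DF translate into a p-value, the sketch below evaluates the upper-tail probability for two of the tabulated log-rank statistics. It assumes the reference distribution is chi-square with the listed DF and uses closed-form expressions that hold only for DF = 1 and DF = 3; the function name is hypothetical.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail chi-square probability using closed forms for df = 1 and df = 3."""
    tail = math.erfc(math.sqrt(statistic / 2.0))
    if df == 1:
        return tail
    if df == 3:
        return tail + math.sqrt(2.0 * statistic / math.pi) * math.exp(-statistic / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 3")


# Log-rank statistics for the age models, taken from the table.
p_age = chi_square_p_value(4.1784, 1)
p_age_gender = chi_square_p_value(12.7383, 3)
print(f"{p_age:.4f} {p_age_gender:.4f}")
```

A general-purpose routine for arbitrary DF would normally come from a statistics library; the closed forms are used here only to keep the example dependency-free.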
Figure 4. Survival probability curves for z. The upper and lower-z groups were formed by dichotomizing the total collection of patients at their median z value. The upper-z group (upper brown curve) exhibits better survival characteristics than the lower-z group (bottom blue curve). These findings incorporate tumor grade with age via the probabilistic neural network combination.