Abstract
Clinical covariates such as age, gender, tumor grade, and smoking history have been extensively used in the prediction of disease occurrence and progression. On the other hand, genomic biomarkers selected from microarray measurements may provide an alternative, satisfactory way of predicting disease. Recent studies show that better prediction can be achieved by using both clinical and genomic biomarkers. However, because clinical and genomic measurements have different characteristics, combining those covariates in disease prediction is very challenging. We propose a new regularization method, Covariate-Adjusted Threshold Gradient Directed Regularization (Cov-TGDR), for combining different types of covariates in disease prediction. The proposed approach is capable of simultaneous biomarker selection and predictive model building. It allows different degrees of regularization for different types of covariates. We consider biomedical studies with binary outcomes and right-censored survival outcomes as examples. A logistic model and a Cox model are assumed, respectively. Analyses of the Breast Cancer data and the Follicular Lymphoma data show that the proposed approach can have better prediction performance than using clinical or genomic covariates alone.
Keywords: classification; microarray; regularized estimation; survival analysis
Year: 2007 PMID: 19455255 PMCID: PMC2675842
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Breast cancer data. Parameter paths as a function of k for (τ1, τ2) = c(1.0, 0.9). Upper panel: clinical covariates; lower panel: genomic covariates.
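The abstract describes Cov-TGDR as a threshold gradient regularization that applies different degrees of regularization to clinical and genomic covariates, and the caption above indexes the parameter paths by the step k and the pair (τ1, τ2). The following is only a minimal sketch of that idea for the logistic model, not the authors' implementation. It assumes that each threshold is applied relative to the largest absolute gradient within its own covariate group, that there is no intercept, and that the step size is fixed. The function name `cov_tgdr_logistic` and the parameters `tau_clin`, `tau_gene`, `step`, and `n_steps` are hypothetical, and the data are simulated.

```python
import numpy as np


def cov_tgdr_logistic(X_clin, X_gene, y, tau_clin=1.0, tau_gene=0.9,
                      step=0.01, n_steps=200):
    """Thresholded gradient ascent with one threshold per covariate group.

    Returns an (n_steps + 1, p) array whose row k holds the coefficients
    after k steps (row 0 is the all-zero starting point).
    """
    X = np.hstack([X_clin, X_gene])
    n, p = X.shape
    is_clin = np.arange(p) < X_clin.shape[1]  # True for clinical columns
    beta = np.zeros(p)
    path = np.zeros((n_steps + 1, p))
    for k in range(1, n_steps + 1):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - prob) / n  # gradient of the mean log-likelihood
        keep = np.zeros(p, dtype=bool)
        # Assumption: the threshold is relative to the group-wise maximum.
        for mask, tau in ((is_clin, tau_clin), (~is_clin, tau_gene)):
            if mask.any():
                g = np.abs(grad[mask])
                keep[mask] = g >= tau * g.max()
        beta = beta + step * grad * keep  # update only the kept coordinates
        path[k] = beta
    return path


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
X_clin = rng.standard_normal((100, 3))
X_gene = rng.standard_normal((100, 20))
logit = X_clin[:, 0] + X_gene[:, 0]
y = (rng.random(100) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

path = cov_tgdr_logistic(X_clin, X_gene, y)
print(path[-1].round(3))
```

In this sketch a threshold near 1 updates only the coordinates with the largest gradient in a group (sparser fits), while a threshold near 0 updates every coordinate in that group. In practice the thresholds and the number of steps would presumably be chosen by cross-validation, which is not shown here.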
Breast Cancer Data: Cov-TGDR estimation. Variable: variable name (clinical) or systematic name (genomic).
| Variable | Estimate | Variable | Estimate |
| --- | --- | --- | --- |
| Clinical covariates | | | |
| age | −0.193 | | |
| diameter | 0.090 | | |
| grade | 0.214 | | |
| Genomic covariates | | | |
| AB033032 | 0.007 | AJ011306 | −0.214 |
| Contig5816_RC | 0.169 | NM_013438 | 0.045 |
| Contig35148_RC | −0.368 | NM_004994 | 0.142 |
| Contig46909_RC | −0.230 | AL080059 | 0.660 |
| Contig23356_RC | 0.097 | Contig42563_RC | 0.087 |
| Contig35229_RC | −0.134 | NM_006544 | 0.159 |
| Contig28433_RC | −0.014 | NM_005850 | 0.005 |
| NM_003366 | −0.068 | Contig64861_RC | 0.194 |
| NM_020120 | 0.038 | AF055033 | 0.514 |
| NM_020123 | 0.343 | NM_016017 | 0.037 |
| NM_020132 | 0.012 | Contig47544_RC | 0.674 |
| U72507 | −0.089 | Contig48697_RC | 0.029 |
| Contig6238_RC | −0.116 | NM_016361 | −0.174 |
| AF052087 | −0.083 | NM_016448 | 0.029 |
| NM_005007 | −0.082 | Contig412_RC | −0.510 |
| AB018337 | 0.270 | NM_016564 | 0.445 |
| AB040969 | 0.010 | NM_018089 | 0.178 |
| NM_012341 | −0.033 | D13540 | 0.089 |
| Contig47042 | 0.189 | U79298 | −0.177 |
| Contig38438_RC | −0.096 | NM_000127 | 0.234 |
| X67055 | −0.005 | NM_019018 | −0.074 |
| NM_003862 | −0.138 | NM_000207 | −0.049 |
| NM_003882 | −0.083 | AL050227 | −0.010 |
| AF131819 | 0.356 | Contig22253_RC | −0.012 |
| NM_014003 | 0.120 | NM_000801 | 0.059 |
| NM_005393 | 0.304 | | |
Analysis of Breast Cancer Data. # clinical: number of clinical variables. # gene: number of gene expressions. Tuning: optimal tuning parameters. Error: prediction error.
| Method | # clinical | # gene | Tuning | Error |
| --- | --- | --- | --- | --- |
| Clinical-simple | 7 | – | – | 0.371 |
| Clinical-TGDR | 5 | – | | 0.289 |
| Gene-TGDR | – | 50 | | 0.267 |
| Cov-TGDR | 3 | 51 | ( | 0.227 |
Follicular Lymphoma Data: Cov-TGDR estimation. Variable: variable name (clinical) or Affymetrix Feature ID (genomic).
| Variable | Estimate | Variable | Estimate |
| --- | --- | --- | --- |
| Clinical covariates | | | |
| nodal | 0.123 | pstat | 0.194 |
| age | 0.450 | stage | 0.309 |
| ldh | 0.469 | IPI.2 | 0.514 |
| Genomic covariates | | | |
| 223710_at | −0.108 | 240593_x_a | 0.006 |
| 225981_at | 0.222 | 201739_at | −0.020 |
| 226587_at | 0.004 | 202783_at | −0.040 |
| 230280_at | 0.066 | 203612_at | 0.040 |
| 232204_at | −0.050 | 212713_at | −0.028 |
| 232883_at | 0.066 | 215536_at | −0.126 |
| 234062_at | −0.036 | 208470_s_a | 0.214 |
| 235058_at | −0.004 | 216950_s_a | 0.012 |
| 239565_at | 0.016 | 217893_s_a | −0.110 |
| 224280_s_a | −0.202 | 219360_s_a | 0.056 |
| 230938_x_a | 0.054 | 220235_s_a | −0.090 |
| 234792_x_a | 0.054 | | |
Analysis of Follicular lymphoma Data. # clinical: number of clinical variables. # gene: number of gene expressions. Tuning: optimal tuning parameters. Logrank: logrank statistics.
| Method | # clinical | # gene | Tuning | Logrank |
| --- | --- | --- | --- | --- |
| Clinical-simple | 7 | – | – | 17.9 |
| Clinical-TGDR | 6 | – | | 18.1 |
| Gene-TGDR | – | 31 | | 4.0 |
| Cov-TGDR | 6 | 23 | ( | 23.9 |
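The table above reports logrank statistics without saying how they were computed. The following is a rough sketch of the standard two-group logrank chi-square statistic, assuming subjects are split into two risk groups (for example, by dichotomizing a predicted risk score). This record does not describe how the groups were formed, so the grouping is an assumption. The function name `logrank_statistic` and the toy data are hypothetical.

```python
import numpy as np


def logrank_statistic(time, event, group):
    """Two-group logrank chi-square statistic for right-censored data.

    time  : observed times
    event : 1 if the event was observed, 0 if censored
    group : 1 for one risk group, 0 for the other
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event]):  # distinct observed event times
        at_risk = time >= t
        n = at_risk.sum()              # at risk, both groups
        n1 = (at_risk & group).sum()   # at risk, group 1
        died = event & (time == t)
        d = died.sum()                 # events at t, both groups
        d1 = (died & group).sum()      # events at t, group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # hypergeometric variance of d1 given the margins
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return float("nan")  # undefined when no event time is informative
    return obs_minus_exp ** 2 / var


# Toy data, for illustration only: four subjects, all events observed.
stat = logrank_statistic([1, 2, 3, 4], [1, 1, 1, 1], [1, 1, 0, 0])
print(stat)
```

Under this reading, a larger statistic indicates stronger separation between the survival curves of the two groups.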