Egon Urgard, Tõnu Vooder, Urmo Võsa, Kristjan Välk, Mingming Liu, Cheng Luo, Fabian Hoti, Retlav Roosipuu, Tarmo Annilo, Jukka Laine, Christopher M Frenz, Liqing Zhang, Andres Metspalu.
Abstract
NSCLC (non-small cell lung cancer) comprises about 80% of all lung cancer cases worldwide. Surgery is the most effective treatment for patients with early-stage disease. However, 30%-55% of these patients develop recurrence within 5 years. Therefore, markers that can be used to accurately classify early-stage NSCLC patients into different prognostic groups may be helpful in selecting patients who should receive specific therapies.
A previously published dataset was used to evaluate gene expression profiles of different NSCLC subtypes. A moderated two-sample t-test was used to identify differentially expressed genes between all tumor samples and cancer-free control tissue, between SCC samples and AC/BC samples, and between stage I tumor samples and all other tumor samples. Gene expression microarray measurements were validated using qRT-PCR.
Bayesian regression analysis and Kaplan-Meier survival analysis were performed to determine metagenes associated with survival. We identified 599 genes that were down-regulated and 402 genes that were up-regulated in NSCLC compared to normal lung tissue, and 112 genes that were up-regulated and 101 genes that were down-regulated in AC/BC compared to SCC. Further, for stage Ib patients, metagenes potentially associated with survival were identified.
Genes that were differentially expressed between normal lung tissue and cancer showed enrichment in gene ontology terms associated with mitosis and proliferation. Bayesian regression and Kaplan-Meier analysis showed that gene-expression patterns and metagene profiles can be applied to predict the probability of different survival outcomes in NSCLC patients.
Keywords: Kaplan-Meier curve; TNM stage; gene expression pattern; metagenes; microarray; non-small cell lung cancer
Year: 2011 PMID: 21695068 PMCID: PMC3118451 DOI: 10.4137/CIN.S7135
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 2. A) Leave-one-out cross-validation results demonstrating the ability of the model to classify new Ib patients into groups of low and high survival risk. The y-axis gives the posterior probability that a patient is classified into the long survival group (survival time > 1000 days) and the x-axis gives the observed survival time (including censored times). AC and BC patients are denoted by BA and SCC patients by S. Patients above the 0.5 posterior probability line (green dotted line) are classified into the long survival (low risk) group and patients below the line are classified into the short survival (high risk) group. B) A heatmap of the two top metagenes most frequently found to be associated with the two risk groups in the Bayesian model. The genes included in the metagenes are listed on the y-axis and the patients, sorted according to their survival times (days), are listed on the x-axis. High gene-expression values are shown in red and low values in green.
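The leave-one-out procedure and the 0.5 cutoff described in panel A can be sketched as follows. This is only an illustration of the cross-validation loop, not the Bayesian regression model used in the study: the stand-in "model" is a nearest-centroid rule on a single hypothetical metagene score, its output is a pseudo-probability rather than a true posterior, and all function names and values are assumptions made for this example.

```python
def fit_centroids(scores, is_long):
    """Stand-in 'model' (an assumption): mean score of each survival group."""
    long_vals = [s for s, y in zip(scores, is_long) if y]
    short_vals = [s for s, y in zip(scores, is_long) if not y]
    if not long_vals or not short_vals:
        raise ValueError("training set must contain both groups")
    return sum(long_vals) / len(long_vals), sum(short_vals) / len(short_vals)


def pseudo_prob_long(model, score):
    """Pseudo-probability of long survival: closer to the long centroid -> nearer 1."""
    c_long, c_short = model
    d_long = abs(score - c_long)
    d_short = abs(score - c_short)
    if d_long + d_short == 0:
        return 0.5  # both centroids coincide with the score; no information
    return d_short / (d_long + d_short)


def leave_one_out(scores, is_long):
    """Predict each patient from a model fitted on all the other patients."""
    probs = []
    for i in range(len(scores)):
        train_scores = scores[:i] + scores[i + 1:]
        train_labels = is_long[:i] + is_long[i + 1:]
        model = fit_centroids(train_scores, train_labels)
        probs.append(pseudo_prob_long(model, scores[i]))
    return probs


def risk_group(prob_long, cutoff=0.5):
    """Above the cutoff -> long survival (low risk); otherwise high risk."""
    return "low risk" if prob_long > cutoff else "high risk"


# Hypothetical metagene scores and survival labels (not data from the study).
scores = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
is_long = [True, True, True, False, False, False]
groups = [risk_group(p) for p in leave_one_out(scores, is_long)]
```

The point of the loop is that the patient being classified never contributes to the model that classifies them, which is what lets the plot be read as performance on new patients.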
Figure 1. A) Validation of microarray data with qRT-PCR of four up-regulated and down-regulated genes. ▪, average log fold-change for paired lung cancer samples on the microarray (n = 21); □, qRT-PCR average log fold-change for the lung cancer sample pairs that were included on the microarray (n = 8); third marker, qRT-PCR average log fold-change for the lung cancer sample pairs that were not included on the microarray (n = 3). Error bars indicate the standard error of the mean (SEM). B) Correlation between array log2(signal_tumor) − log2(signal_normal) and qRT-PCR ΔΔCt for validated genes, using the same sample pairs as in the previous graph (n = 8). Pearson correlation coefficients (R), correlation test p-values, and best-fitting (least squares) lines are shown.
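The two quantities shown in panel B, a Pearson correlation coefficient and a least-squares line, can be computed as in the minimal sketch below. The function names and the input values are hypothetical placeholders, not the study's measurements, and the correlation test p-value is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the ordinary least-squares fit y = slope*x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical placeholder values for n = 8 sample pairs (not data from the study).
array_log2_diff = [2.1, 1.8, 2.5, 1.2, 3.0, 2.2, 1.5, 2.8]
qpcr_ddct = [-2.0, -1.5, -2.7, -1.0, -3.2, -2.1, -1.6, -2.6]

r = pearson_r(array_log2_diff, qpcr_ddct)
slope, intercept = least_squares_line(array_log2_diff, qpcr_ddct)
```

The placeholder ΔΔCt values are negative for up-regulated genes because, under the commonly used convention, a lower Ct means higher expression; whether the figure flips that sign is not stated in the caption.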
Figure 4. Kaplan-Meier plot of the survival probability in the high and low risk groups predicted by the Bayesian model. The high risk group (short survival) consists of 24 patients and the low risk group (long survival) consists of 22 patients. Vertical drops indicate deaths and ticks on the solid lines are censored survival times. The survival rates of the two groups are significantly different (P = 0.0007).
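How such a curve is built from survival times and censoring flags can be sketched with the standard Kaplan-Meier product-limit estimator. This is a minimal illustration with a hypothetical function name and made-up follow-up data; it does not reproduce the study's curves or the significance test behind the reported P-value.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- observed follow-up times (e.g. days)
    events -- True if the patient died at that time, False if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths > 0:
            # The curve drops only at death times.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data (not data from the study).
times = [200, 450, 450, 800, 1100, 1500]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
```

Censored patients do not cause a drop in the curve, but they shrink the number at risk, so later deaths produce larger steps. That is why the plot marks censored times with ticks rather than vertical drops.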
Figure 3. The empirical ROC curve (solid curve): the true positive rate (TPR) plotted as a function of the false positive rate (FPR) for different cutoff values. Jumps in the curve correspond to changes in the classification outcome of the patients due to the use of different cutoff values. For cutoff value 0, all patients are classified into the long survival group, so FPR = 0 and TPR = 0. At the other extreme, i.e., cutoff value 1, all patients are classified into the short survival group, so FPR = 1 and TPR = 1. The area under the curve (AUC = 0.728, P-value 0.001) is an overall measure of the quality of the classification method. The dashed line (AUC = 0.5) is the expected ROC curve for a totally random classifier.
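The construction of an empirical ROC curve and its AUC can be sketched as below. The sketch assumes, as one reading of the caption's two extreme cases, that "short survival" is the positive class and that a patient is called short survival when the predicted probability of long survival falls below the cutoff. The function names and example values are hypothetical, and the AUC is computed with the trapezoidal rule.

```python
import math


def roc_points(p_long, is_short):
    """ROC points (FPR, TPR) with 'short survival' as the positive class.

    p_long   -- predicted probability of long survival for each patient
    is_short -- True if the patient truly belongs to the short survival group
    A patient is classified as short survival when p_long < cutoff.
    """
    n_pos = sum(is_short)
    n_neg = len(is_short) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one patient in each group")
    points = []
    # The smallest cutoff classifies nobody as positive; infinity classifies everyone.
    for cutoff in sorted(set(p_long)) + [math.inf]:
        tp = sum(1 for p, s in zip(p_long, is_short) if p < cutoff and s)
        fp = sum(1 for p, s in zip(p_long, is_short) if p < cutoff and not s)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under ROC points ordered by increasing FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical predictions (not data from the study).
p_long = [0.15, 0.30, 0.45, 0.55, 0.70, 0.90]
is_short = [True, True, False, True, False, False]
area = auc(roc_points(p_long, is_short))
```

Each distinct predicted probability is a point where the classification of at least one patient changes, which is what produces the jumps in the empirical curve.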