Seyed M Iranmanesh, Nancy L Guo.
Abstract
Integrative analysis of multi-level molecular profiles can distinguish interactions that cannot be revealed based on one kind of data in the analysis of cancer susceptibility and metastasis. DNA copy number variations (CNVs) are common in cancer cells, but their role in cell behaviors and their relationship to gene expression (GE) are poorly understood. An integrative analysis of CNV and genome-wide mRNA expression can discover copy number alterations and their possible regulatory effects on GE. This study presents a novel framework to identify important genes and construct potential regulatory networks based on these genes. Using this approach, DNA copy number aberrations and their effects on GE in lung cancer progression were revealed. Specifically, this approach contains the following steps: (1) select a pool of candidate driver genes, which have significant CNV in lung cancer patient tumors or have a significant association with the clinical outcome at the transcriptional level; (2) rank important driver genes in lung cancer patients with good prognosis and poor prognosis, respectively, and use top-ranked driver genes to construct regulatory networks with the COpy Number and EXpression In Cancer (CONEXIC) method; (3) identify experimentally confirmed molecular interactions in the constructed regulatory networks using Ingenuity Pathway Analysis (IPA); and (4) visualize the refined regulatory networks with the software package Genatomy. The constructed CNV/mRNA regulatory networks provide important insights into potential CNV-regulated transcriptional mechanisms in lung cancer metastasis.
Keywords: DNA copy number variation; lung cancer; mRNA gene expression; regulatory networks
Year: 2014 PMID: 25392690 PMCID: PMC4218678 DOI: 10.4137/CIN.S14055
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Clinical information of patient cohorts analyzed in this study.
| VARIABLES | TCGA DATASET (31) | GEO DATASET (GSE31800) |
|---|---|---|
| Gender: male | 28% | NA |
| Gender: female | 72% | NA |
| Race: White | 65% | NA |
| Race: Black or African American | 5% | NA |
| Race: Asian | 2% | NA |
| Race: not available | 28% | NA |
| Histology: lung adenocarcinoma (n) | 0 | 179 |
| Histology: lung squamous cell carcinoma (n) | 201 | 92 |
| Age (years): mean ± SD | 67.5 ± 8.5 | NA |
| Age (years): [min, max] (median) | [39, 85] (68) | NA |
| Vital status: alive | 58% | NA |
| Vital status: dead | 42% | NA |
| Stage I | 55% | NA |
| Stage II | 24% | NA |
| Stage III | 21% | NA |
| Ethnicity: not Hispanic or Latino | 59% | NA |
| Ethnicity: Hispanic or Latino | 2% | NA |
| Ethnicity: not available | 39% | NA |
| Smoking: current smoker | 16% | NA |
| Smoking: reformed smoker for ≤ 15 years | 56% | NA |
| Smoking: reformed smoker for > 15 years | 21% | NA |
| Smoking: non-smoker | 4% | NA |
| Smoking: not available | 3% | NA |
Figure 1. Overview of integrative analysis of DNA copy number and GE.
Genes with consistent DNA CNVs between lung AC and SQCC in the dataset GSE31800. Percentages indicate the frequency of the CNV in the corresponding patient cohort.
| GENE NAME | CHROMOSOME # | AC N=179 [ACCN GSE31800] | SQCC N=92 [ACCN GSE31800] | CNV TYPE |
|---|---|---|---|---|
| UBE1DC1 | 3q22.1 | 51% | 65% | Loss |
| CMTM6 | 3p22.3 | 51% | 65% | Loss |
| ULK4 | 3p22.1 | 51% | 65% | Loss |
| NKX2-5 | 5q34 | 78% | 78% | Loss |
| NEUROG1 | 5q23–q31 | 78% | 78% | Loss |
| LOC441150 | 6p21.1 | 78% | 76% | Loss |
| C6orf153 | 6p21.1 | 78% | 76% | Loss |
| C6orf134 | 6p21.33 | 78% | 76% | Loss |
| C6orf173 | 6q22.32 | 78% | 76% | Loss |
| C6orf194 | 6p22.1 | 78% | 76% | Loss |
| HIBADH | 7p15.2 | 68% | 67% | Loss |
| IFRD1 | 7q31.1 | 68% | 67% | Loss |
| PMPCB | 7q22.1 | 68% | 67% | Loss |
| TRIB1 | 8q24.13 | 51% | 68% | Loss |
| AZIN1 | 8q22.3 | 51% | 68% | Loss |
| CTSB | 8p22 | 70% | 79% | Loss |
| IKBKB | 8p11.2 | 70% | 79% | Loss |
| IMPAD1 | 8q12.1 | 70% | 79% | Loss |
| CPNE3 | 8q21.3 | 70% | 79% | Loss |
| TUSC3 | 8p22 | 70% | 79% | Loss |
| RIPK2 | 8q21 | 70% | 79% | Loss |
| LYN | 8q13 | 70% | 79% | Loss |
| ENTPD4 | 8p21.3 | 70% | 79% | Loss |
| ABCB9 | 12q24 | 67% | 66% | Loss |
| SPRYD4 | 12q13.3 | 67% | 66% | Loss |
| DDIT3 | 12q13.1 | 67% | 66% | Loss |
| RAB22A | 20q13.32 | 71% | 64% | Loss |
| NCOA6 | 20q11 | 71% | 64% | Loss |
| PRPF6 | 20q13.33 | 71% | 64% | Loss |
| STX16 | 20q13.32 | 71% | 64% | Loss |
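The kind of cross-cohort consistency shown in this table can be illustrated with a minimal sketch. It assumes a gene counts as consistent when both cohorts show the same CNV type at or above a frequency cutoff. The 50% cutoff, the `CnvCall` type, and the function name are illustrative assumptions, not values or code taken from the study. The CTSB and TRIB1 frequencies come from the table; `GENE_X` is a made-up counterexample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnvCall:
    gene: str
    cnv_type: str      # "Gain" or "Loss"
    frequency: float   # fraction of patients in the cohort with this CNV


def consistent_genes(cohort_a, cohort_b, min_frequency=0.5):
    """Return genes whose CNV type matches in both cohorts and whose
    frequency reaches min_frequency (a hypothetical cutoff) in each."""
    calls_b = {call.gene: call for call in cohort_b}
    selected = []
    for call_a in cohort_a:
        call_b = calls_b.get(call_a.gene)
        if call_b is None:
            continue  # gene not profiled in the second cohort
        same_type = call_a.cnv_type == call_b.cnv_type
        frequent = min(call_a.frequency, call_b.frequency) >= min_frequency
        if same_type and frequent:
            selected.append(call_a.gene)
    return selected


# CTSB and TRIB1 values are from the table; GENE_X is invented.
ac = [
    CnvCall("CTSB", "Loss", 0.70),
    CnvCall("TRIB1", "Loss", 0.51),
    CnvCall("GENE_X", "Gain", 0.60),
]
sqcc = [
    CnvCall("CTSB", "Loss", 0.79),
    CnvCall("TRIB1", "Loss", 0.68),
    CnvCall("GENE_X", "Loss", 0.55),
]

print(consistent_genes(ac, sqcc))
```

`GENE_X` is dropped because its CNV type differs between the two cohorts, even though both frequencies pass the cutoff.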
Genes with consistent CNV in the TCGA dataset (31) and SQCC samples in dataset GSE31800. Percentages indicate the frequency of the CNV in the corresponding patient cohort.
| GENE NAME | CHROMOSOME # | SQCC (TCGA) | SQCC (GSE31800) | CNV TYPE |
|---|---|---|---|---|
| C3orf31 | 3p25.2 | 80% | 62% | Gain |
| SELT | 3q25.1 | 82% | 62% | Gain |
| C3orf52 | 3q13.2 | 80% | 64% | Gain |
Genes with a significant association between CNV and survival time in SQCC tumors in the TCGA dataset (31). Top genes are identified based on P-value. Hazard ratios and their confidence intervals are shown. Significant hazard ratios are marked with an asterisk (*).
| GENE NAME | P-VALUE | GAIN HAZARD RATIO (95% CI) | LOSS HAZARD RATIO (95% CI) |
|---|---|---|---|
| C14orf173 | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| BRMS1L | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| CHURC1 | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| NEK9 | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| CIDEB | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| SERPINA3 | 0.0007 | 2.7851 [1.61, 4.81]* | 1.3415 [0.77, 2.32] |
| C14orf172 | 0.0008 | 2.7625 [1.59, 4.77]* | 1.3195 [0.76, 2.28] |
| PTPRU | 0.0013 | 0.7796 [0.49, 1.23] | 0.2738 [0.12, 0.61]* |
| SEMA3F | 0.0014 | 0.4337 [0.24, 0.76]* | 1.1128 [0.54, 2.28] |
| TADA3L | 0.0015 | 0.3342 [0.19, 0.57]* | 0.4007 [0.18, 0.86]* |
| CCNB1IP1 | 0.0016 | 2.6560 [1.52, 4.62]* | 1.3451 [0.77, 2.34] |
| C14orf79 | 0.0016 | 2.6560 [1.52, 4.62]* | 1.3451 [0.77, 2.34] |
| UPB1 | 0.0016 | 0.3726 [0.20, 0.66]* | 0.7158 [0.42, 1.20] |
| SH3BP1 | 0.0016 | 0.3726 [0.20, 0.66]* | 0.7158 [0.42, 1.20] |
| TCF20 | 0.0016 | 0.3726 [0.20, 0.66]* | 0.7158 [0.42, 1.20] |
| MUSTN1 | 0.0020 | 0.4559 [0.25, 0.80]* | 1.1888 [0.57, 2.47] |
| ZC3H14 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| C14orf50 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| TDP1 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| C14orf24 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| KIAA0831 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| PTGER2 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| FOXN3 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| PRMT5 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| RIPK3 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| ZFP36L1 | 0.0021 | 2.5397 [1.46, 4.39]* | 1.2468 [0.71, 2.16] |
| LGR6 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| RFX5 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| CD1D | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| FLVCR1 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| MRPS14 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| DAB1 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| PUSL1 | 0.0022 | 0.9346 [0.59, 1.47] | 0.3007 [0.13, 0.66]* |
| CSDC2 | 0.0022 | 0.3736 [0.20, 0.67]* | 0.7728 [0.46, 1.29] |
| P2RXL1 | 0.0022 | 0.3736 [0.20, 0.67]* | 0.7728 [0.46, 1.29] |
| PNPLA3 | 0.0022 | 0.3736 [0.20, 0.67]* | 0.7728 [0.46, 1.29] |
| CSNK1E | 0.0022 | 0.3736 [0.20, 0.67]* | 0.7728 [0.46, 1.29] |
| CCDC117 | 0.0022 | 0.3736 [0.20, 0.67]* | 0.7728 [0.46, 1.29] |
| ATP8B2 | 0.0023 | 1.0134 [0.64, 1.59] | 0.3111 [0.13, 0.69]* |
| NPPB | 0.0023 | 1.0111 [0.64, 1.58] | 0.3108 [0.13, 0.69]* |
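The asterisks in this table can be cross-checked with a short sketch. It assumes that a hazard ratio is starred when its 95% confidence interval excludes 1, which is the usual convention and agrees with every row shown here, although the record does not state the criterion explicitly. The helper names and the regular expression are illustrative.

```python
import re

# Matches cells such as "2.7851 [1.61, 4.81]*" or "1.3415 [0.77, 2.32]".
HR_PATTERN = re.compile(r"^\s*([\d.]+)\s*\[([\d.]+),\s*([\d.]+)\]\s*(\*?)\s*$")


def parse_hazard(cell):
    """Split a table cell into (hazard_ratio, ci_low, ci_high, starred)."""
    match = HR_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unrecognized hazard ratio cell: {cell!r}")
    hr, low, high, star = match.groups()
    return float(hr), float(low), float(high), star == "*"


def ci_excludes_one(ci_low, ci_high):
    """True when the interval lies entirely above or entirely below 1."""
    return ci_low > 1.0 or ci_high < 1.0


# Gain and loss cells copied from three rows of the table.
rows = {
    "SERPINA3": ("2.7851 [1.61, 4.81]*", "1.3415 [0.77, 2.32]"),
    "PTPRU": ("0.7796 [0.49, 1.23]", "0.2738 [0.12, 0.61]*"),
    "TADA3L": ("0.3342 [0.19, 0.57]*", "0.4007 [0.18, 0.86]*"),
}

for gene, cells in rows.items():
    for label, cell in zip(("gain", "loss"), cells):
        hr, low, high, starred = parse_hazard(cell)
        agrees = ci_excludes_one(low, high) == starred
        print(gene, label, hr, "star matches CI rule:", agrees)
```

A hazard ratio above 1 corresponds to a higher hazard (shorter expected survival) for the CNV group, and a ratio below 1 to a lower hazard.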
mRNA prognostic biomarkers identified in our previous studies (7, 38, 39) ranked as top driver genes in poor-prognosis SQCC patients.
| GENE NAME | MODULATOR? (# MODULES) | MODULATED BY | REFERENCE |
|---|---|---|---|
| SCLY | No | IFRD1 | |
| TNFSF9 | No | CTSB | |
| CD27 | No | STAT4 | |
| DAG1 | No | SELT | N/A |
| SAMD4B | Yes (3) | No modulator | N/A |
| THBS1 | No | CTSB | |
| XPO1 | No | PTGER2 | |
| C8orf70 | No | PTGER2 | |
| STK24 | No | CTSB | |
| AKAP13 | No | CTSB | |
| APOA2 | Yes (3) | No modulator | |
| CCL19 | No | STAT4 | |
| CLIC2 | No | STAT4 | |
| COL14A1 | No | VASH1 | |
| HMBOX1 | No | VASH1 | |
| IRF3 | No | SAMD4B | |
| ATAD4 | Yes (5) | No modulator | |
| SLC39A8 | No | VASH1 | |
| APIN1 | No | IFRD1 | |
| TAF4 | No | SELT | |
| TOMM34 | No | CTSB | |
| VASH1 | Yes (2) | SAMD4B | |
| VIPR2 | No | SERPINA3 | |
| HFE | No | IFRD1 | |
| HNF4A | No | CTSB | N/A |
| STAT6 | No | SERPINA3 | |
Notes:
If a gene is a modulator (driver gene), the number of modules regulated by it is listed in parentheses.
Unpublished mRNA prognostic biomarkers associated with NSCLC outcome.
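The "MODULATED BY" column can be read as a directed edge list from modulator to modulated gene. The sketch below groups a few rows of the table above by modulator. The function name and the use of `None` for "No modulator" are illustrative choices, and the grouping counts genes in this table only, not the CONEXIC modules reported in the "MODULATOR?" column.

```python
from collections import defaultdict

# (gene, modulated_by) pairs copied from a few rows of the table above.
# None marks a gene listed with "No modulator".
rows = [
    ("TNFSF9", "CTSB"),
    ("THBS1", "CTSB"),
    ("STK24", "CTSB"),
    ("CD27", "STAT4"),
    ("CCL19", "STAT4"),
    ("IRF3", "SAMD4B"),
    ("VASH1", "SAMD4B"),
    ("SAMD4B", None),
]


def targets_by_modulator(pairs):
    """Group modulated genes under the gene that modulates them."""
    graph = defaultdict(list)
    for gene, modulator in pairs:
        if modulator is not None:
            graph[modulator].append(gene)
    return dict(graph)


network = targets_by_modulator(rows)
for modulator, targets in sorted(network.items()):
    print(modulator, "->", ", ".join(targets))
```

A mapping of this shape is one simple way to hand a modulator-centered network, such as the CTSB network, to a plotting or graph library.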
mRNA prognostic biomarkers identified in our previous studies (7, 38, 39) ranked as top driver genes in good-prognosis SQCC patients.
| GENE NAME | MODULATOR? (# MODULES) | MODULATED BY | REFERENCE |
|---|---|---|---|
| OGT | No | C6orf134 | |
| CCDC99 | No | CCL19 | |
| CD27 | No | CCL19 | |
| DAG1 | No | C6orf134 | N/A |
| XPO1 | No | PRMT5 | |
| C8orf70 | No | RB1 | |
| AKAP13 | No | RB1 | |
| MSX2 | No | RB1 | |
| ADH1B | Yes (4) | STAT4 | |
| ANXA6 | No | VASH1 | |
| CCL19 | Yes (4) | No modulator | |
| CLIC2 | No | ADH1B | |
| COL14A1 | No | VASH1 | |
| FHL1 | Yes (1) | VASH1 | |
| ICA1 | No | RB1 | |
| IRF3 | No | C6orf134 | |
| IVD | No | VASH1 | |
| SLC39A8 | No | ADH1B | |
| SPIN1 | No | PRMT5 | |
| TAF4 | No | C6orf134 | |
| VASH1 | Yes (4) | ADH1B | |
| HFE | No | C14orf50 | |
| RB1 | Yes (7) | No modulator | |
| STAT6 | No | CCL19 | |
| ZNF638 | No | C6orf134 | |
| UBE1L2 | No | C6orf134 | |
Notes:
If a gene is a modulator (driver gene), the number of modules regulated by it is listed in parentheses.
Unpublished mRNA prognostic biomarkers associated with NSCLC outcome.
Figure 2. Regulatory networks of CTSB in the poor-prognosis SQCC patient group. (A) Experimentally validated network of CTSB in poor-prognosis SQCC patients (31) with IPA analysis. (B) CNV/mRNA regulatory network of CTSB in poor-prognosis SQCC patients (31).
Notes: The top color bar under the driver gene shows mRNA expression, and the bar below it shows CNV. Green color indicates overexpression/amplification, and red color indicates under-expression/deletion.
Figure 3. Regulatory networks of RB1 in the good-prognosis SQCC patient group. (A) Experimentally validated networks for RB1 in good-prognosis SQCC patients (31) with IPA analysis. (B) CNV/mRNA regulatory network of RB1 in good-prognosis SQCC patients (31).
Notes: Green color indicates overexpression/amplification, and red color indicates under-expression/deletion.