Li Cheng, Lin Li, Liling Wang, Xiaofang Li, Hui Xing, Jinting Zhou.
Abstract
Ovarian cancer (OC) is associated with a poor prognosis due to difficulties in early detection. The aims of the present study were to construct a recurrence risk prediction model and to reveal important OC genes or pathways. RNA sequencing data were obtained for 307 OC samples, and the corresponding clinical data were downloaded from The Cancer Genome Atlas database. Additionally, two validation datasets, GSE44104 (20 recurrent and 40 non‑recurrent OC samples) and GSE49997 (204 OC samples), were obtained from the Gene Expression Omnibus database. Differentially expressed genes were screened using the differential expression via distance synthesis algorithm, followed by gene ontology enrichment analysis and weighted gene coexpression network analysis (WGCNA). Furthermore, subnetwork analysis was conducted for the protein‑protein interaction (PPI) network using the BioNet package. Finally, a random forest classifier was constructed based on the subnetwork nodes, and its reliability was validated using the GSE44104 and GSE49997 validation datasets. A total of 44 upregulated and 117 downregulated genes were identified in the recurrent samples. Enrichment analysis indicated that cytochrome P450 family 17 subfamily A member 1 (CYP17A1) was associated with 'positive regulation of steroid hormone biosynthetic process'. WGCNA identified turquoise and grey modules that were significantly correlated with status and prognosis. A significant PPI subnetwork containing 16 nodes was also identified, comprising transcription factor GATA‑4; fibroblast growth factor 9; aromatase; 3β‑hydroxysteroid dehydrogenase/δ5‑4‑isomerase type 2; corticosteroid 11β‑dehydrogenase isozyme 1; CYP17A1; pituitary homeobox 2; left‑right determination factor 1; homeobox protein ARX; estrogen receptor β; steroidogenic factor 1; forkhead box protein L2; myocardin; steroidogenic acute regulatory protein mitochondrial; vesicular inhibitory amino acid transporter; and twist‑related protein 1.
A random forest classifier was constructed using the subnetwork nodes as feature genes; it achieved a 92% true positive rate when classifying recurrent and non‑recurrent OC samples. The classification performance of the random forest classifier was validated using the two independent datasets. Overall, 44 upregulated and 117 downregulated genes associated with OC recurrence were identified. Furthermore, the 16 subnetwork node genes may be important molecules in OC recurrence.
Year: 2018 PMID: 30066910 PMCID: PMC6102638 DOI: 10.3892/mmr.2018.9300
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Figure 1. Volcano plot illustrating the expression distribution of the differentially expressed genes. Upregulated genes are depicted in red and downregulated genes in green.
Significant functional terms enriched for the upregulated and downregulated genes.
| Term | Corrected P-value | Count | Genes |
|---|---|---|---|
| A, Upregulated genes | | | |
| regulation of synapse assembly | 1.04×10⁻⁷ | 5 | |
| regulation of synapse organization | 5.26×10⁻⁷ | 5 | |
| regulation of synapse structure or activity | 5.71×10⁻⁷ | 5 | |
| synapse assembly | 2.08×10⁻⁶ | 5 | |
| positive regulation of synapse assembly | 6.78×10⁻⁶ | 4 | |
| synapse organization | 2.23×10⁻⁵ | 5 | |
| regulation of nervous system development | 2.72×10⁻⁴ | 6 | |
| positive regulation of nervous system development | 4.44×10⁻⁴ | 5 | |
| regulation of developmental process | 1.131×10⁻³ | 8 | |
| regulation of multicellular organismal development | 1.96×10⁻³ | 7 | |
| positive regulation of developmental process | 2.64×10⁻³ | 6 | |
| system development | 2.84×10⁻³ | 10 | |
| regulation of multicellular organismal process | 3.63×10⁻³ | 8 | |
| single-organism developmental process | 5.75×10⁻³ | 11 | |
| developmental process | 6.65×10⁻³ | 11 | |
| single-multicellular organism process | 8.10×10⁻³ | 11 | |
| positive regulation of multicellular organismal process | 8.16×10⁻³ | 6 | |
| multicellular organism development | 9.65×10⁻³ | 10 | |
| regulation of cellular component biogenesis | 9.92×10⁻³ | 5 | |
| B, Downregulated genes | | | |
| single-multicellular organism process | 4.11×10⁻⁵ | 23 | |
| multicellular organismal process | 5.14×10⁻⁵ | 25 | |
| eye development | 5.32×10⁻⁵ | 7 | |
| sensory organ development | 7.88×10⁻⁵ | 8 | |
| multicellular organism development | 1.86×10⁻⁴ | 20 | |
| anatomical structure development | 1.87×10⁻⁴ | 21 | |
| single-organism developmental process | 5.26×10⁻⁴ | 21 | |
| system development | 5.74×10⁻⁴ | 18 | |
| developmental process | 6.76×10⁻⁴ | 21 | |
| positive regulation of steroid hormone biosynthetic process | 3.83×10⁻³ | 2 | |
| positive regulation of growth | 4.05×10⁻³ | 5 | |
| nervous system development | 4.48×10⁻³ | 12 | |
| positive regulation of organ growth | 7.93×10⁻³ | 3 | |
| camera-type eye development | 8.92×10⁻³ | 5 | |
| anatomical structure morphogenesis | 9.15×10⁻³ | 12 | |
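The record does not state which enrichment test or multiple-testing correction produced the "Corrected P-value" column. As an illustration only, the sketch below assumes a hypergeometric over-representation test followed by Benjamini-Hochberg adjustment, which is one common way such values are obtained. The function names and all numbers except the 44 upregulated genes are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which are annotated with the term (assumed test, not from the source)."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 5 of the 44 upregulated genes carry a term that
# annotates 100 genes in an assumed background of 20,000 genes.
p_term = hypergeom_upper_tail(k=5, K=100, n=44, N=20000)

# Hypothetical raw P-values for three terms.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

The Count column in the table corresponds to `k` in this sketch; the background size and per-term annotation counts would come from the annotation database actually used.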
Figure 2. Weighted gene coexpression network analysis cluster dendrogram. Grey denotes the grey module identified by WGCNA, which contains 30 upregulated and 44 downregulated genes; turquoise denotes the turquoise module identified by WGCNA, which contains 14 upregulated and 73 downregulated genes. WGCNA, weighted gene coexpression network analysis.
Figure 3. Module-trait relationship graph illustrating that the turquoise and grey modules are significantly correlated with status/prognosis. Red indicates a positive correlation and green a negative correlation. Grey denotes the grey module identified by WGCNA, which contains 30 upregulated and 44 downregulated genes; turquoise denotes the turquoise module identified by WGCNA, which contains 14 upregulated and 73 downregulated genes. WGCNA, weighted gene coexpression network analysis.
Figure 4. Heatmap displaying gene correlations for the turquoise and grey modules. A deeper color indicates a stronger correlation and a lighter color a weaker correlation. Grey denotes the grey module identified by WGCNA; turquoise denotes the turquoise module identified by WGCNA. WGCNA, weighted gene coexpression network analysis.
Figure 5. Significant subnetwork identified within the protein-protein interaction network. Upregulated genes are denoted in red and downregulated genes in green. GATA4, transcription factor GATA-4; FGF9, fibroblast growth factor 9; CYP19A1, aromatase; HSD3B2, 3β-hydroxysteroid dehydrogenase/δ5-4-isomerase type 2; HSD11B1, corticosteroid 11β-dehydrogenase isozyme 1; CYP17A1, cytochrome P450 family 17 subfamily A member 1; PITX2, pituitary homeobox 2; LEFTY1, left-right determination factor 1; ARX, homeobox protein ARX; ESR2, estrogen receptor β; NR5A1, steroidogenic factor 1; FOXL2, forkhead box protein L2; MYOCD, myocardin; STAR, steroidogenic acute regulatory protein mitochondrial; SLC32A1, vesicular inhibitory amino acid transporter; TWIST1, twist-related protein 1.
Importance scores of the subnetwork nodes.
| Node | Score |
|---|---|
| ARX | 15.203 |
| CYP17A1 | 10.956 |
| CYP19A1 | 10.902 |
| ESR2 | 10.537 |
| FGF9 | 10.148 |
| FOXL2 | 9.827 |
| GATA4 | 9.786 |
| HSD11B1 | 9.223 |
| HSD3B2 | 9.196 |
| LEFTY1 | 8.259 |
| MYOCD | 7.867 |
| NR5A1 | 7.634 |
| SLC32A1 | 6.367 |
| STAR | 6.167 |
| PITX2 | 5.094 |
| TWIST1 | 4.708 |
GATA4, transcription factor GATA-4; FGF9, fibroblast growth factor 9; CYP19A1, aromatase; HSD3B2, 3β-hydroxysteroid dehydrogenase/δ5-4-isomerase type 2; HSD11B1, corticosteroid 11β-dehydrogenase isozyme 1; CYP17A1, cytochrome P450 family 17 subfamily A member 1; PITX2, pituitary homeobox 2; LEFTY1, left-right determination factor 1; ARX, homeobox protein ARX; ESR2, estrogen receptor β; NR5A1, steroidogenic factor 1; FOXL2, forkhead box protein L2; MYOCD, myocardin; STAR, steroidogenic acute regulatory protein mitochondrial; SLC32A1, vesicular inhibitory amino acid transporter; TWIST1, twist-related protein 1.
Figure 6. ROC curve analysis. ROC curves for (A) The Cancer Genome Atlas and (B) GSE44104 datasets. FPR, false positive rate; TPR, true positive rate; AUC, area under the curve; ROC, receiver operating characteristic.
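To make the quantities in the ROC legend concrete, the following minimal sketch shows how FPR/TPR points and the AUC can be computed from classifier scores. It is not the authors' code; the labels and scores are hypothetical toy values (1 = recurrent, 0 = non-recurrent), and the function names are illustrative.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by lowering the decision threshold
    through every distinct score. Labels: 1 = positive, 0 = negative."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: predicted recurrence probabilities for six samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
roc_auc = auc(points)  # 8/9 for this toy data
```

A single reported true positive rate, such as the 92% in the abstract, corresponds to one point on such a curve at a fixed threshold.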
Figure 7. KM survival analysis. KM survival curves for (A) The Cancer Genome Atlas and (B) GSE49997 datasets. Predicted high-risk (cluster 1) groups are indicated in red and low-risk (cluster 2) groups in black. KM, Kaplan-Meier.
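The KM curves compare survival between the predicted risk groups. As a minimal sketch of the Kaplan-Meier product-limit estimator behind such curves, the code below computes the survival probability at each observed event time for one group. The follow-up times and event flags are hypothetical, and the function name is illustrative rather than taken from the study.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one event.
    events[i] is 1 if the event was observed at times[i], 0 if censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) for one predicted risk group.
times = [5, 8, 12, 12, 20, 24]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Running the estimator separately on the high-risk and low-risk groups gives the two step curves of a panel; a log-rank test is a common choice for comparing them, although the record does not state which test was used.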