Mehmet Eren Ahsen, Todd P Boren, Nitin K Singh, Burook Misganaw, David G Mutch, Kathleen N Moore, Floor J Backes, Carolyn K McCourt, Jayanthi S Lea, David S Miller, Michael A White, Mathukumalli Vidyasagar.
Abstract
BACKGROUND: Metastasis via pelvic and/or para-aortic lymph nodes is a major risk factor for endometrial cancer. Lymph-node resection ameliorates risk but is associated with significant co-morbidities. The incidence of lymph node metastasis in patients with stage I disease is 4-22%, but no mechanism exists to predict it accurately. Therefore, national guidelines for primary staging surgery include pelvic and para-aortic lymph node dissection for all patients whose tumor exceeds 2 cm in diameter. We sought to identify a robust molecular signature that can accurately classify risk of lymph node metastasis in endometrial cancer patients. A total of 86 tumors, matched for age and race and evenly distributed between lymph node-positive and lymph node-negative cases, were selected as a training cohort. Genomic micro-RNA expression was profiled for each sample to serve as the predictive feature matrix. An independent set of 28 tumor samples was collected and similarly characterized to serve as a test cohort.
Keywords: Endometrial cancer; Lymph node metastasis; Machine learning; Sparse classification
Year: 2017 PMID: 28361685 PMCID: PMC5374706 DOI: 10.1186/s12864-017-3604-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Clinical Parameters of the training cohort
| Parameter | Category | Lymph node negative | Lymph node positive |
|---|---|---|---|
| Age | ≤ 60 | 23 (50%) | 19 (40%) |
| | > 60 | 23 (50%) | 28 (60%) |
| Race | AA | 2 (4%) | 3 (6%) |
| | non-AA | 44 (96%) | 44 (94%) |
| Tumor Grade | 1 | 8 (17%) | 13 (28%) |
| | 2 | 14 (30%) | 17 (36%) |
| | 3 | 24 (53%) | 17 (36%) |
| LVSI | Present | 17 (38%) | 37 (80%) |
| | Absent | 28 (62%) | 9 (20%) |
| Myometrial Invasion | Inner 1/2 | 23 (50%) | 11 (25%) |
| | Outer 1/2 | 23 (50%) | 33 (75%) |
Fig. 1. Hierarchical clustering of training data. Unsupervised two-way hierarchical clustering of the 213 miRNA expression levels across the 86 tumor samples. The 43 samples at left are lymph node-negative while the 43 samples at right are lymph node-positive. It is evident that there is no discernible pattern in the clustering.
Micro-RNA signature
| Micro-RNA | Weight |
|---|---|
| hsa-miR-3607-3p | –2.43 |
| hsa-miR-299-5p | 2.01 |
| hsa-miR-365 | 1.747 |
| hsa-miR-513a-5p | –2.4368 |
| hsa-miR-29b-1* | 2.2202 |
| hsa-miR-340 | –1.4319 |
| hsa-miR-1284 | 1.8007 |
| hsa_SNORD6 | 1.7312 |
| hsa-miR-934 | –2.223 |
| hsa-miR-3182 | 1.8238 |
| hsa-miR-1908 | –1.1631 |
| hsa-miR-155 | –1.5283 |
| hsa-miR-23c | 1.3968 |
| hsa-miR-451 | –1.2663 |
| hsa-miR-300 | –1.4832 |
| hsa-miR-223 | 1.0996 |
| hsa-miR-150 | –0.7774 |
| hsa-miR-3613-3p | 1.3349 |
| Threshold | –1.0025 |
Fig. 2. Values of the discriminant function on the training cohort of 86 tumors. Negative values of the discriminant correspond to labelling the tumor as node-negative, while positive values correspond to labelling it as node-positive. The 43 node-negative tumors are on the left side of the plot and the 43 node-positive tumors are on the right. The discriminant values of all node-negative tumors are negative and those of all node-positive tumors are positive, so the classifier achieves 100% accuracy on the training cohort.
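The signature table and the sign convention in Fig. 2 suggest a linear discriminant. The following is a minimal sketch, assuming the discriminant has the form f(x) = w·x − threshold and that expression values are normalized the same way as in the study (neither the exact formula nor the normalization is stated in this record). The names `WEIGHTS`, `THRESHOLD`, `discriminant`, and `classify` are illustrative.

```python
# Weights and threshold copied from the micro-RNA signature table.
WEIGHTS = {
    "hsa-miR-3607-3p": -2.43,
    "hsa-miR-299-5p": 2.01,
    "hsa-miR-365": 1.747,
    "hsa-miR-513a-5p": -2.4368,
    "hsa-miR-29b-1*": 2.2202,
    "hsa-miR-340": -1.4319,
    "hsa-miR-1284": 1.8007,
    "hsa_SNORD6": 1.7312,
    "hsa-miR-934": -2.223,
    "hsa-miR-3182": 1.8238,
    "hsa-miR-1908": -1.1631,
    "hsa-miR-155": -1.5283,
    "hsa-miR-23c": 1.3968,
    "hsa-miR-451": -1.2663,
    "hsa-miR-300": -1.4832,
    "hsa-miR-223": 1.0996,
    "hsa-miR-150": -0.7774,
    "hsa-miR-3613-3p": 1.3349,
}
THRESHOLD = -1.0025


def discriminant(expression):
    """Weighted sum of the 18 features minus the threshold.

    ASSUMPTION: the form w.x - threshold is a guess at how the table is used;
    the record only gives the weights, the threshold, and the sign convention.
    """
    return sum(w * expression[name] for name, w in WEIGHTS.items()) - THRESHOLD


def classify(expression):
    """Positive discriminant -> node-positive, otherwise node-negative (per Fig. 2)."""
    return "node-positive" if discriminant(expression) > 0 else "node-negative"


# Hypothetical input: all features at zero except one, purely to exercise the code.
sample = dict.fromkeys(WEIGHTS, 0.0)
sample["hsa-miR-3607-3p"] = 1.0
print(round(discriminant(sample), 4), classify(sample))
```

The sample values here are placeholders, not measurements from either cohort.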
The list of 23 genes and associated cancer sites
| Gene | Associated cancer sites |
|---|---|
| BCL2 | Colorectal cancer, Small cell lung cancer, Prostate cancer |
| MMP2 | Bladder cancer |
| E2F1 | Non-small cell lung cancer, Pancreatic cancer, Small cell lung cancer, Prostate cancer, Bladder cancer |
| MMP9 | Bladder cancer |
| AKT1 | Endometrial cancer, Colorectal cancer, Acute myeloid leukemia, Non-small cell lung cancer, Pancreatic cancer, Small cell lung cancer, Prostate cancer |
| HSP90B1 | Prostate cancer |
| CHUK | Acute myeloid leukemia, Pancreatic cancer, Small cell lung cancer, Prostate cancer |
| IL6 | Prostate cancer |
| NFIA | |
| SCARB1 | |
| RHOB | |
| LMO2 | |
| NFIX | |
| STMN1 | |
| ARPP19 | |
| MIF | |
| ABCB1 | |
| MEF2C | |
| CAB39 | |
| RAB14 | |
| TMED7 | |
| UBE2H | |
| MYB | |
Fig. 3. The network of 740 genes regulated by the 18 micro-RNA features. The micro-RNA with the vast majority of interactions, all of which are confirmed, is hsa-mir-155. Out of the 18 micro-RNAs, three are differentially expressed across the two classes (lymph-positive and lymph-negative) in the training cohort of 86 tumors. The genes regulated by these three micro-RNAs are also shown in the figure.
Fig. 4. The set of 23 key genes and their controlling micro-RNAs. Genes in this figure satisfy one of two criteria: (i) the gene is targeted by more than one micro-RNA in the set of 18 features, or (ii) the gene is targeted by one of the three differentially expressed micro-RNAs, which are the first three, namely hsa-mir-223, hsa-mir-451, and hsa-mir-155.
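The two selection criteria in the Fig. 4 caption amount to a simple filter over a micro-RNA-to-target map. The sketch below illustrates that filter on toy data; the gene names `GENE_A` to `GENE_D` and the target assignments are hypothetical placeholders, not interactions reported in the study, and `select_key_genes` is an illustrative name.

```python
from collections import defaultdict


def select_key_genes(targets, differential):
    """Return genes hit by more than one micro-RNA or by a differential micro-RNA.

    targets: dict mapping each micro-RNA to the set of genes it targets.
    differential: set of differentially expressed micro-RNAs.
    """
    regulators = defaultdict(set)
    for mirna, genes in targets.items():
        for gene in genes:
            regulators[gene].add(mirna)
    return {
        gene: sorted(mirnas)
        for gene, mirnas in regulators.items()
        if len(mirnas) > 1 or mirnas & differential
    }


# The three differentially expressed micro-RNAs named in the caption.
differential = {"hsa-mir-223", "hsa-mir-451", "hsa-mir-155"}

# HYPOTHETICAL target map with placeholder genes, only to exercise the filter.
toy_targets = {
    "hsa-mir-155": {"GENE_A", "GENE_B"},
    "hsa-mir-340": {"GENE_B", "GENE_C"},
    "hsa-mir-300": {"GENE_C", "GENE_D"},
}

key_genes = select_key_genes(toy_targets, differential)
for gene in sorted(key_genes):
    print(gene, key_genes[gene])
```

In this toy map, `GENE_D` is dropped because it has a single regulator that is not among the three differentially expressed micro-RNAs.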
Fig. 5. Values of the discriminant function on the independent cohort of 28 tumors. Negative values of the discriminant correspond to labelling the tumor as node-negative, while positive values correspond to labelling it as node-positive. The 19 node-negative tumors are on the left side of the plot; 15 of the 19 have negative discriminant values and are thus classified correctly. The 9 node-positive tumors are on the right side of the plot; 8 of the 9 have positive discriminant values and are thus classified correctly.
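The counts in the Fig. 5 caption are enough to recompute the headline performance figures. The following short sketch does that arithmetic using the standard definitions of accuracy, sensitivity, and specificity; the function name `classifier_metrics` is illustrative.

```python
def classifier_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, and specificity from 2x2 confusion counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts from the caption: 8 of 9 node-positive and 15 of 19 node-negative correct.
metrics = classifier_metrics(tp=8, fn=1, fp=4, tn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

These correspond to 23 of 28 tumors classified correctly overall.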
Contingency table of classifier performance on test cohort
| Actual / Classification | Positive | Negative | Total | Positive | Negative | Total |
|---|---|---|---|---|---|---|
| Node-Positive | 8 | 1 | 9 | 7 | 2 | 9 |
| Node-Negative | 4 | 15 | 19 | 4 | 15 | 19 |
| Total | 12 | 16 | 28 | 11 | 17 | 28 |
| Accuracy | 0.8214 | | | 0.7857 | | |
| Sensitivity | 0.8889 | | | 0.7778 | | |
| Specificity | 0.7895 | | | 0.7895 | | |
| False Discovery Rate | 0.0625 | | | 0.1174 | | |
| P-value (Fisher exact test) | 0.0012 | | | 0.0104 | | |
| P-value (Barnard exact test) | 0.0004 | | | 0.0037 | | |
(The performance of the classifier on the training cohort of 86 tumors is not shown as it was 100%.) The left part of the table corresponds to sample #198 treated as node-positive, while the right part corresponds to sample #198 treated as node-negative. When sample #198 is treated as node-positive, the classifier has an accuracy of 82.14%, with 23 out of 28 tumors being correctly classified; a sensitivity of 88.89%, with 8 out of 9 lymph-positive tumors being correctly classified; and a specificity of 78.95%, with 15 out of 19 lymph-negative tumors being correctly classified. The P-value of obtaining these values purely by chance was computed as 0.0012 using the Fisher exact test and as 0.0004 using the more powerful Barnard exact test. The corresponding figures with sample #198 treated as node-negative are shown for comparison. Even in this case, all P-values are far lower than the widely accepted threshold of 0.05.
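The Fisher exact P-values can be checked directly from the two contingency tables. The following is a minimal standard-library sketch of a two-sided Fisher exact test, assuming the usual convention of summing the probabilities of all tables no more likely than the observed one; the record does not say which variant the authors used, and `fisher_exact_two_sided` is an illustrative name. The Barnard test is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Left part of the table: sample #198 treated as node-positive.
p_left = fisher_exact_two_sided(8, 1, 4, 15)
# Right part of the table: sample #198 treated as node-negative.
p_right = fisher_exact_two_sided(7, 2, 4, 15)
print(f"{p_left:.4f} {p_right:.4f}")
```

Under this convention the results round to the Fisher values quoted in the table. As a separate arithmetic observation, 1/16 = 0.0625 (node-positive tumors among the 16 negative calls) equals the value in the False Discovery Rate row, whereas false positives among positive calls would give 4/12; this reading is an inference from the counts, not something the record states.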