Shengqin Wang, Zhihong Zheng, Peichao Chen, Mingjiang Wu.
Abstract
BACKGROUND: miRNA isoforms (isomiRs) have been suggested to regulate the same pathways as the canonical miRNA and to play an important biological role in miRNA-mediated gene regulation. A recent study demonstrated that the presence or absence of all isomiRs could efficiently discriminate among 32 TCGA cancer types. Reducing the number of distinguishing isomiR features needed for multiclass tumor discrimination could therefore have a major impact on our understanding of the disease and on the treatment of cancer.
Keywords: Genetic algorithm; Random forest; Tumor classification; isomiR; miRNA
Year: 2019 PMID: 30732570 PMCID: PMC6367816 DOI: 10.1186/s12885-019-5340-y
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Table 1 Tumor types and number of TCGA isomiR samples used in the analysis
| Tumor Types | # of samples |
|---|---|
| Adrenocortical carcinoma [ACC] | 79 |
| Bladder Urothelial Carcinoma [BLCA] | 366 |
| Breast invasive carcinoma [BRCA] | 1064 |
| Cervical squamous cell carcinoma and endocervical adenocarcinoma [CESC] | 299 |
| Cholangiocarcinoma [CHOL] | 35 |
| Colon adenocarcinoma [COAD] | 386 |
| Lymphoid Neoplasm Diffuse Large B-cell Lymphoma [DLBC] | 46 |
| Esophageal carcinoma [ESCA] | 183 |
| Head and Neck squamous cell carcinoma [HNSC] | 487 |
| Kidney Chromophobe [KICH] | 60 |
| Kidney renal clear cell carcinoma [KIRC] | 455 |
| Kidney renal papillary cell carcinoma [KIRP] | 261 |
| Acute Myeloid Leukemia [LAML] | 105 |
| Brain Lower Grade Glioma [LGG] | 497 |
| Liver hepatocellular carcinoma [LIHC] | 356 |
| Lung adenocarcinoma [LUAD] | 445 |
| Lung squamous cell carcinoma [LUSC] | 419 |
| Mesothelioma [MESO] | 82 |
| Ovarian serous cystadenocarcinoma [OV] | 349 |
| Pancreatic adenocarcinoma [PAAD] | 152 |
| Pheochromocytoma and Paraganglioma [PCPG] | 178 |
| Prostate adenocarcinoma [PRAD] | 472 |
| Rectum adenocarcinoma [READ] | 144 |
| Sarcoma [SARC] | 243 |
| Skin Cutaneous Melanoma [SKCM] | 94 |
| Stomach adenocarcinoma [STAD] | 425 |
| Testicular Germ Cell Tumors [TGCT] | 149 |
| Thyroid carcinoma [THCA] | 483 |
| Thymoma [THYM] | 122 |
| Uterine Corpus Endometrial Carcinoma [UCEC] | 514 |
| Uterine Carcinosarcoma [UCS] | 55 |
| Uveal Melanoma [UVM] | 80 |
Fig. 1 The workflow of our GA/RF-based algorithm for detecting reliable sets of cancer-associated 5’isomiRs from TCGA isomiR expression data
Fig. 2 Analysis of the optimal feature sets generated by 100 GA/SVM runs. a The average sensitivity for the 100 generated predictor sets. b The average MCC (Matthews correlation coefficient) for the 100 generated predictor sets [43]. c The prediction accuracies for the 32 tumor classifications. d The average sensitivity of test-set samples predicted to be each of the 32 tumor types. The X-axis and Y-axis list the actual and the predicted cancer type, respectively. The color of each cell in the heatmap is the average sensitivity with which test-set samples of the cancer type on the X-axis were predicted as the cancer type on the Y-axis
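The two metrics named in this caption can be illustrated with a small sketch. It assumes that sensitivity means per-class recall and that MCC is computed with the standard multiclass generalization from a confusion matrix; the source does not state which multiclass MCC variant was used, and the function names and toy labels below are hypothetical.

```python
from math import sqrt


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def per_class_sensitivity(matrix):
    """Recall per class: correct predictions divided by actual class size."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


def multiclass_mcc(matrix):
    """Multiclass Matthews correlation coefficient (assumed variant)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)                # all samples
    correct = sum(matrix[i][i] for i in range(n))          # diagonal
    actual_counts = [sum(row) for row in matrix]           # row sums
    predicted_counts = [sum(matrix[i][k] for i in range(n)) for k in range(n)]
    numerator = correct * total - sum(
        p * t for p, t in zip(predicted_counts, actual_counts))
    denominator = sqrt(
        (total ** 2 - sum(p * p for p in predicted_counts))
        * (total ** 2 - sum(t * t for t in actual_counts)))
    return numerator / denominator if denominator else 0.0


# Toy example with three of the TCGA tumor codes (labels are illustrative only).
labels = ["BRCA", "LUAD", "KIRC"]
actual = ["BRCA", "BRCA", "LUAD", "LUAD", "KIRC", "KIRC"]
predicted = ["BRCA", "BRCA", "LUAD", "KIRC", "KIRC", "KIRC"]
matrix = confusion_matrix(actual, predicted, labels)
sensitivities = per_class_sensitivity(matrix)
mcc = multiclass_mcc(matrix)
```

Averaging `sensitivities` and `mcc` over repeated train/test splits would give per-run summaries of the kind plotted in panels a and b, under the assumptions above.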
Table 2 Detailed description of the 41 highly frequent 5′ isomiRs in the 100 generated predictor sets
| 5’isomiR ID | Chromosome | Strand | Start site | Corresponding miRNA ID | Frequency in 100 runs | Canonical seed region? |
|---|---|---|---|---|---|---|
| 313 | Chr1 | + | 209,432,166 | hsa-miR-205-5p | 88 | Y |
| 698 | Chr6 | + | 52,144,401 | hsa-mir-206 | 85 | Y |
| 233 | Chr7 | – | 27,169,550 | hsa-miR-196b-5p | 75 | N |
| 121 | Chr2 | + | 176,150,330 | hsa-miR-10b-5p | 40 | N |
| 39 | Chr2 | + | 176,150,329 | hsa-miR-10b-5p | 36 | Y |
| 151 | Chr17 | – | 48,579,926 | hsa-miR-10a-5p | 32 | Y |
| 301 | Chr2 | – | 219,001,669 | hsa-miR-375 | 24 | Y |
| 193 | Chr2 | – | 219,001,670 | hsa-miR-375 | 24 | N |
| 317 | Chr2 | – | 219,001,671 | hsa-miR-375 | 23 | N |
| 103 | Chr3 | + | 189,829,974 | hsa-mir-944 | 21 | N |
| 350 | Chr2 | + | 176,150,331 | hsa-miR-10b-5p | 20 | N |
| 529 | Chr1 | – | 220,117,937 | hsa-miR-215-5p | 18 | N |
| 275 | Chr17 | – | 48,579,925 | hsa-miR-10a-5p | 18 | N |
| 506 | Chr3 | + | 189,829,975 | hsa-mir-944 | 16 | Y |
| 207 | Chr5 | – | 149,062,395 | hsa-miR-584-5p | 16 | N |
| 188 | Chr12 | + | 6,963,742 | hsa-miR-200c-3p | 15 | Y |
| 594 | Chr12 | + | 62,603,694 | hsa-let-7i-5p | 14 | N |
| 119 | Chr11 | – | 64,891,426 | hsa-miR-194-5p | 14 | N |
| 884 | ChrX | + | 136,550,892 | hsa-miR-934 | 13 | Y |
| 297 | ChrX | – | 151,958,652 | hsa-miR-224-5p | 13 | N |
| 264 | Chr17 | – | 1,713,934 | hsa-miR-22-3p | 13 | N |
| 208 | Chr17 | – | 48,579,928 | hsa-miR-10a-5p | 13 | N |
| 91 | Chr6 | – | 71,403,576 | hsa-miR-30a-3p | 12 | N |
| 90 | Chr17 | + | 31,560,016 | hsa-miR-193a-5p | 12 | Y |
| 854 | ChrX | + | 136,550,928 | hsa-mir-934 | 12 | N |
| 449 | Chr11 | – | 64,891,390 | hsa-miR-194-3p | 12 | N |
| 372 | Chr14 | – | 101,560,347 | hsa-miR-1247-3p | 12 | N |
| 247 | Chr1 | – | 207,802,474 | hsa-miR-29b-3p | 12 | N |
| 124 | Chr5 | + | 149,428,977 | hsa-miR-143-3p | 12 | N |
| 120 | Chr1 | – | 220,118,228 | hsa-miR-194-5p | 12 | N |
| 572 | Chr20 | + | 62,564,971 | hsa-miR-133a-3p | 11 | N |
| 475 | Chr2 | + | 176,150,328 | hsa-miR-10b-5p | 11 | N |
| 429 | Chr20 | + | 62,554,351 | hsa-miR-1-3p | 11 | Y |
| 392 | Chr9 | – | 21,512,179 | hsa-miR-31-5p | 11 | N |
| 37 | Chr6 | – | 71,403,617 | hsa-miR-30a-5p | 11 | N |
| 358 | Chr7 | – | 129,774,987 | hsa-miR-183-5p | 11 | N |
| 324 | Chr7 | – | 129,770,466 | hsa-miR-182-5p | 11 | N |
| 316 | Chr1 | + | 1,167,124 | hsa-miR-200b-5p | 11 | Y |
| 248 | Chr7 | – | 130,877,491 | hsa-miR-29b-3p | 11 | N |
| 232 | Chr1 | + | 1,167,160 | hsa-miR-200b-3p | 11 | Y |
| 113 | Chr21 | + | 16,539,101 | hsa-miR-99a-5p | 11 | Y |
Fig. 3 The expression levels of the 41 highly frequent isomiRs in the 100 generated predictor sets. The X-axis lists the 5’isomiR IDs used in this study (a detailed description can be found in Table 2). The Y-axis is the log2-transformed RPM value. The line marks the log2-transformed value of 10 RPM. Outliers are hidden
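The Y-axis transformation in this caption can be sketched as follows, assuming RPM is the usual reads-per-million normalization (read count divided by total mapped reads, scaled by one million). The source does not say whether a pseudocount was added before taking the logarithm, so the `pseudocount` parameter and the function names are assumptions for illustration.

```python
from math import log2


def reads_per_million(count, total_mapped_reads):
    """Normalize a raw read count to reads per million mapped reads."""
    return count * 1_000_000 / total_mapped_reads


def log2_rpm(count, total_mapped_reads, pseudocount=0.0):
    """log2 of the RPM value; returns -inf when there is nothing to log."""
    value = reads_per_million(count, total_mapped_reads) + pseudocount
    return log2(value) if value > 0 else float("-inf")


# Reference line drawn in the figure: log2 of 10 RPM (about 3.32).
THRESHOLD = log2(10)

# Hypothetical sample: 50 reads for one isomiR out of 5 million mapped reads.
example = log2_rpm(50, 5_000_000)
above_line = example >= THRESHOLD
```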
Fig. 4 Tumor classification and functional enrichment analysis with the 9 most frequently appearing 5’isomiRs. a Tumor classification. The Y-axis is the average sensitivity for 1000 randomly produced test sets. “a” is the set of the 9 most frequently appearing 5’isomiRs, of which two were from hsa-miR-10b-5p and three belonged to hsa-miR-375. “b” is obtained from “a” by removing the three 5’isomiRs whose 5′ loci differ from that of the canonical miRNA (one from hsa-miR-10b-5p and two from hsa-miR-375). b Bar plot of the enriched GO terms from the DAVID functional annotation analysis. The clusters, with their enrichment scores, are shown on the Y-axis. The –log10(P-value after correction) is plotted on the X-axis