Literature DB >> 32160711

Discovering Cancer-Related miRNAs from miRNA-Target Interactions by Support Vector Machines.

Cong Pian1, Shanjun Mao2, Guangle Zhang3, Jin Du2, Fei Li4, Suet Yi Leung5, Xiaodan Fan6.   

Abstract

MicroRNAs (miRNAs) have been shown to be closely related to cancer progression. Traditional methods for discovering cancer-related miRNAs mostly require significant marginal differential expression, but some cancer-related miRNAs may be non-differentially or only weakly differentially expressed. Such miRNAs are called dark matters miRNAs (DM-miRNAs) and are targeted through the Pearson correlation change on miRNA-target interactions (MTIs), but the efficiency of their method heavily relies on restrictive assumptions. In this paper, a novel method was developed to discover DM-miRNAs using support vector machine (SVM) based on not only the miRNA expression data but also the expression of its regulating target. The application of the new method in breast and kidney cancer datasets found, respectively, 9 and 24 potential DM-miRNAs that cannot be detected by previous methods. Eight and 15 of the newly discovered miRNAs have been found to be associated with breast and kidney cancers, respectively, in existing literature. These results indicate that our new method is more effective in discovering cancer-related miRNAs.
Copyright © 2020 The Author(s). Published by Elsevier Inc. All rights reserved.

Entities:  

Keywords:  cancer-related miRNAs; dark matters; expression data; miRNA-target interactions; support vector machine

Year:  2020        PMID: 32160711      PMCID: PMC7056629          DOI: 10.1016/j.omtn.2020.01.019

Source DB:  PubMed          Journal:  Mol Ther Nucleic Acids        ISSN: 2162-2531            Impact factor:   8.886


Introduction

MicroRNAs (miRNAs) represent a type of small non-coding RNA molecule with about 22 nucleotides found in plants, animals, and viruses that function in post-transcriptional regulation of gene expression and RNA silencing by binding to the 3′ untranslated regions of mRNA.1, 2, 3, 4 miRNAs are abundant in many mammalian cells, and appear to target about 60% of the genes of mammals., Many miRNAs are evolutionarily conserved, which indicates that they have significant biological functions. Research suggests that miRNAs can act as regulators of diverse cellular processes, such as cell differentiation, apoptosis, virus defense, embryonic development, and proliferation., Furthermore, miRNAs have been implicated in many diseases, such as various types of cancers,12, 13, 14 heart conditions, and neurological diseases. Up to now, miRNAs have been studied as promising candidates for diagnostic and prognostic biomarkers, as well as predictors of drug responses. For example, miR-1246 is a potential diagnostic and prognostic biomarker in esophageal squamous cell carcinoma (ESCC), and may act as a cell adhesion-related miRNA released from ESCC that affects distant organs. Research shows that single-nucleotide polymorphisms (SNPs) in miRNAs and their target sites can impact miRNA biology and affect cancer risk, as well as treatment response. It is likely that these SNPs can act as diagnostic and prognostic markers. Thus, discovering pivotal cancer-related miRNAs is an active area of research. The differential expression analysis (DE), which performs two groups comparison for individual miRNA followed by certain multiple comparison correction, may be the most common method of discovering cancer-related miRNAs. For example, in Zhou et al., differentially expressed miRNAs and mRNAs were separately selected as biomarkers using the limma package; in Liao et al., 5 miRNAs of 320 differentially expressed mRNAs were used for prognostic signature construction; in Le et al., a causality discovery-based method was used to uncover the causal regulatory relationship between miRNAs and mRNAs. However, some non-differentially or weak differentially expressed miRNAs may play important regulatory roles in cancer. Pian et al. named this type of miRNA “dark matters” miRNA (DM-miRNA) and developed a method to discover DM-miRNA based on the change of Pearson correlation coefficient (ΔPCC). However, ΔPCC may fail in some situations. For example, if the correlations between a miRNA and its target in cancer and normal samples are consistent as in Figure 1A, ΔPCC will be too small to discover this MTI. Also, ΔPCC is based on Pearson correlation, which cannot detect nonlinear associations, such as in Figure 1B.
Figure 1

Two Situations that ΔPCC Has Difficulty Handling

Points of two colors represent samples from the normal and cancer groups. (A) Consistent correction through embedding. (B) Nonlinear association.

Two Situations that ΔPCC Has Difficulty Handling Points of two colors represent samples from the normal and cancer groups. (A) Consistent correction through embedding. (B) Nonlinear association. Here, we introduce a machine learning method to discover cancer-related miRNAs. More specifically, support vector machines (SVMs) are used to construct nonlinear class separation boundaries in the two-dimensional space of a miRNA and its experimentally validated target. By focusing on experimentally validated miRNA-target interactions (MTIs), we can avoid many false positives as compared with the DE method on marginal expression. With the ability of SVMs to induce complex decision boundaries, we can accommodate nonlinear or even embedded class relationships as in Figure 1. The classification accuracy (ACC, see definition in Materials and Methods) is used to screen signals and compare different approaches.

Results

Results for Breast Cancer

miRNAs with High Classification Accuracy (S1)

We use the breast cancer expression data of each miRNA as the input feature to train an SVM classifier. Figure 3A shows the miRNAs whose ACC is greater than 0.8. The miRNAs in the red rectangular boxes are not experimentally confirmed to be associated with breast invasive carcinoma (BRCA). The remaining miRNAs have been shown to be associated with breast cancer based on the database HMDD 2.0 and literature mining. The PubMed numbers of these miRNAs are shown in Table 1. Figure 2B is the volcano map of miRNAs in Figure 2A. We find that most of these miRNAs are not differentially expressed. The results indicate that the SVM based on miRNA expression data alone can discover partial BRCA-related miRNAs.
Figure 3

The 2,028 mRNAs Whose ACCs Are Greater Than 0.8 for Breast Cancer

(A) The volcano map of 2,028 mRNAs. Red and blue represent downregulation and upregulation, respectively. (B) The enrichment analyses result of the above 2,028 mRNA genes based on KEGG pathways for BRCA.

Table 1

The Literature Reports of the Associations between the miRNAs with ACC >0.8 on miRNA Expression and Breast Cancer

miRNAPubMed No.
miR-13921953071
miR-2117531469
miR-18323060431
miR-14521723890
miR-99a27212167
miR-10b22573479
miR-9619574223
miR-14118376396
let-7c22388088
miR-125b-119738052
miR-20418922924
miR-18219574223
miR-10022926517
miR-59229039599
miR-42918376396
miR-200a20514023
miR-125b-220460378
miR-20617312270
miR-337unknown
miR-48619946373
miR-15b25783158
miR-551bunknown
miR-181b-123759567
miR-38316754881
miR-3226276160
miR-58423479725
miR-133a-122292984
miR-58522328513
miR-19530076862
miR-200b20514023
miR-133b19946373
miR-934unknown
Figure 2

The 32 miRNAs with ACC >0.8 in Breast Cancer

(A) The relationship between the 32 miRNAs and breast cancer. The miRNAs in the red rectangular boxes are so far not experimentally confirmed to be associated with BRCA. (B) The volcano map of the above 32 miRNAs. Most of these miRNAs are not differentially expressed.

The Literature Reports of the Associations between the miRNAs with ACC >0.8 on miRNA Expression and Breast Cancer The 32 miRNAs with ACC >0.8 in Breast Cancer (A) The relationship between the 32 miRNAs and breast cancer. The miRNAs in the red rectangular boxes are so far not experimentally confirmed to be associated with BRCA. (B) The volcano map of the above 32 miRNAs. Most of these miRNAs are not differentially expressed.

miRNAs with High Classification Accuracy (S2)

We also use the breast cancer expression data of each mRNA as the input feature to train an SVM classifier. Figure 3A describes the DE results of 2,028 mRNAs whose ACCs are greater than 0.8. In addition, the enrichment analyses results are shown in Figure 3B. DAVID, is employed for enrichment analyses for the above 2,028 mRNAs based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Some cancer mechanism-related pathways (such as pathways in cancer and the p53 signaling pathway, prostate cancer, miRNAs in cancer, pancreatic cancer, chronic myeloid leukemia, melanoma, the p53 signaling pathway, small cell lung cancer, colorectal cancer) are significantly enriched. These results indicate that the discovered mRNAs are very important in cancers. The 2,028 mRNAs Whose ACCs Are Greater Than 0.8 for Breast Cancer (A) The volcano map of 2,028 mRNAs. Red and blue represent downregulation and upregulation, respectively. (B) The enrichment analyses result of the above 2,028 mRNA genes based on KEGG pathways for BRCA.

MTIs with High Classification Accuracy (S3)

For each of the 155,044 experimentally verified human MTIs from the miRTarBase database, we use the mRNA and miRNA breast cancer expression data of the miRNA-mRNA interaction as the two features of SVM. The MTIs with high ACC >0.8 are selected as candidate MTIs for discovering cancer-related miRNAs.

Discovery of DM-miRNAs in Breast Cancer

To demonstrate why our new method can catch better discriminant information, we analyze the MTIs with ACC >0.9 in the miRNA-mRNA joint space, whereas the corresponding marginal ACC of both the miRNA and the mRNA are <0.8. There are 136 MTIs satisfying the above conditions (Table S2). Thus, although the ACCs based on the marginal miRNA feature and the marginal mRNA feature are both nonideal, the performance of classification of the corresponding MTI, i.e., the joint feature, is significant. Figure 4A shows the 31 miRNAs in 136 MTIs. The miRNAs in the red rectangular boxes are so far not experimentally confirmed to be associated with BRCA. The PubMed numbers of these miRNAs are shown in Table 2. We see that most of these 31 miRNAs are related to BRCA and non-differentially expressed in Figure 4B. There are two differentially expressed miRNAs. Figure 4C represents the expression of miR-452 and IRS1 in normal and cancer samples. We find that it is hard to distinguish the normal and cancer samples based only on the feature of single miRNA or only on the mRNA expression profile data. More specifically, the classification accuracy of using miR-452 or IRS1 alone is 69.61% or 62.55%, respectively. Figure 4D is the scatterplot of miR-452 and IRS1. Compared with the classification performance of either marginal feature miR-452 or IRS1, the detection using the two-dimensional features of miR-452 and IRS1 is much more effective.
Figure 4

The 31 miRNAs in 136 MTIs with [ACC(miRNA-mRNA) > 0.9, ACC(miRNA) < 0.8, ACC(mRNA) < 0.8] for Breast Cancer

(A) The relationship between these miRNAs and cancer. The miRNAs in the red rectangular boxes are not experimentally confirmed to be associated with BRCA. (B) The volcano map of the above 31 miRNAs. Only 2 of 31miRNAs are differentially expressed. (C) The one-dimensional scatterplot of single miR-452 and IRS1 expression values in normal and cancer samples. The left two lines represent the expression value of miR-452 in BRCA and normal tissues, and the right two lines represent the expression value of IRS1 in BRCA and normal tissues. (D) The two-dimensional scatterplot of miRNA-mRNA interaction. The abscissa and ordinate represent the expression values of IRS1 and miR-452.

Table 2

The Literature Reports of the Associations between DM-miRNAs and Breast Cancer

miRNAPubMed No.
miR-28unknown
miR-200c21224848
miR-49727456360
miR-33528795314
miR-48330186493
miR-14023752191
miR-124730249392
miR-378c26749280
miR-14429561704
miR-10a21955614
miR-148b23233531
miR-1468unknown
miR-193a22333974
miR-190b26141719
miR-45427588500
miR-34021692045
miR-9321955614
miR-22422809510
miR-29619754881
miR-45222353773
miR-130129790898
miR-21022952344
miR-59029534690
miR-130b28163094
miR-130a29384218
miR-301b21393507
miR-9828232182
let-7i22388088
miR-14226657485
miR-30a22231442
miR-42128463794

The underlined miRNAs are experimentally confirmed.

The 31 miRNAs in 136 MTIs with [ACC(miRNA-mRNA) > 0.9, ACC(miRNA) < 0.8, ACC(mRNA) < 0.8] for Breast Cancer (A) The relationship between these miRNAs and cancer. The miRNAs in the red rectangular boxes are not experimentally confirmed to be associated with BRCA. (B) The volcano map of the above 31 miRNAs. Only 2 of 31miRNAs are differentially expressed. (C) The one-dimensional scatterplot of single miR-452 and IRS1 expression values in normal and cancer samples. The left two lines represent the expression value of miR-452 in BRCA and normal tissues, and the right two lines represent the expression value of IRS1 in BRCA and normal tissues. (D) The two-dimensional scatterplot of miRNA-mRNA interaction. The abscissa and ordinate represent the expression values of IRS1 and miR-452. The Literature Reports of the Associations between DM-miRNAs and Breast Cancer The underlined miRNAs are experimentally confirmed. If we relax the thresholds in the previous paragraph by analyzing the MTIs with ACC >0.8 in the joint feature and ACC <0.7 in both marginal features, the results are shown in Table 3. The underlined miRNAs are experimentally confirmed to be associated with BRCA. The second and third columns are the fold change (FC) and PubMed numbers of literature reports of these miRNAs, respectively. Most of these miRNAs are not differentially expressed.
Table 3

The FC and Literature Reports of miRNA [ACC(miRNA-mRNA) > 0.8, ACC(miRNA) < 0.7, ACC(miRNA) < 0.7)] for Breast Cancer

miRNAFCPubMed No.
miR-30a0.06522476851
miR-3310.34330063890
miR-23b0.01522231442
miR-170.09118695042
miR-92a-20.03622563438
miR-449a3.00427983918
miR-1340.09528454346
let-7b0.03522403704
miR-1270.08021409395
miR-31270.507unknown
miR-20a0.01822350790
miR-30c-20.07023340433
miR-4210.62728463794
miR-125a0.05223420759
miR-1860.048unknown
miR-8771.131unknown
miR-2220.06221553120
miR-3300.23429630118

The underlined miRNAs are experimentally confirmed.

The FC and Literature Reports of miRNA [ACC(miRNA-mRNA) > 0.8, ACC(miRNA) < 0.7, ACC(miRNA) < 0.7)] for Breast Cancer The underlined miRNAs are experimentally confirmed. In summary, compared with the single miRNA or mRNA, paired MTIs contain more biological information. Therefore, the SVM classifier based on the paired miRNA-mRNA features can effectively discover more DM-miRNAs. We draw receiver operating characteristic (ROC) curves by randomly selecting six MTIs with ACC >0.9 [ACC(miRNA) < 0.8, ACC(mRNA) < 0.8]. Figure 5 shows the classification performance based on the single mRNA, miRNA, and paired MTIs for BRCA. The results indicate that the information of MTIs is more effective. The classification ability of MTIs is significantly better than that of mRNAs and miRNAs. Therefore, MTIs can be effective biomarkers that contain more biological information.
Figure 5

The ROC Curves of Six MTIs with ACC >0.9 for Breast Cancer

The classification results of miR-452-IRS1, miR-98-PNRC1, miR-98-BCL9, miR-1301-CDCA4, miR-130a-TRIM59, and miR-130b-SMOC1. The black and red lines represent the ROC curve based on the single miRNA and mRNA, respectively. The green line represents the ROC curve based on the paired miRNA-mRNA interaction.

The ROC Curves of Six MTIs with ACC >0.9 for Breast Cancer The classification results of miR-452-IRS1, miR-98-PNRC1, miR-98-BCL9, miR-1301-CDCA4, miR-130a-TRIM59, and miR-130b-SMOC1. The black and red lines represent the ROC curve based on the single miRNA and mRNA, respectively. The green line represents the ROC curve based on the paired miRNA-mRNA interaction.

Comparison with DE of miRNAs

In order to show that SVM can effectively screen potential cancer-related miRNAs, we compared the results of SVM and DE. Table 4 records the top 20 |log2(FC)| miRNAs in breast cancer based on the DE. The results in Table 5 indicate that only 4 of the top 20 miRNAs were confirmed to be associated with breast cancer. The underlined miRNAs are experimentally confirmed to be associated with BRCA. However, Table 2 shows that 19 of the top 20 ACC miRNAs were confirmed to be associated with breast cancer, which indicates that using SVM to select cancer-related miRNAs is more effective.
Table 4

The Top 20 |log2(FC)| miRNAs in Breast Cancer

miRNA|log2(FC)|PubMed No.a
miR-8025.41226080894
miR-449c4.186unknown
miR-39274.764unknown
miR-31394.608unknown
miR-124-24.458unknown
miR-4924.32425407488
miR-5734.25325333258
miR-19084.253unknown
miR-5494.084unknown
miR-3156-24.034unknown
miR-3156-14.034unknown
miR-5074.03127167339
miR-31804.017unknown
miR-36123.982unknown
miR-39253.829unknown
miR-1302-33.677unknown
miR-449b3.580unknown
miR-3156-33.569unknown
miR-31483.568unknown
miR-5923.349unknown

The underlined miRNAs are experimentally confirmed to be associated with BRCA.

The third column represents the PubMed number of literature reports of these miRNAs.

Table 5

The Literature Reports of the Associations between DM-miRNAs and Kidney Cancer

miRNAPubMed No.
let-7b28694731
let-7g25951903
let-7i28694731
mir-10028765937
mir-15430138594
mir-15bunknown
mir-18326091793
mir-18628550686
mir-20b26708577
mir-21427226530
mir-216b30231239
mir-23b20562915
mir-26a-128881158
mir-30b28536082
mir-320a27760486
mir-33529070041
mir-340unknown
mir-369unknown
mir-37725776481
mir-483unknown
mir-493unknown
mir-513cunknown
mir-625unknown
mir-675unknown

The underlined miRNAs are experimentally confirmed to be associated with kidney cancer.

The Top 20 |log2(FC)| miRNAs in Breast Cancer The underlined miRNAs are experimentally confirmed to be associated with BRCA. The third column represents the PubMed number of literature reports of these miRNAs. The Literature Reports of the Associations between DM-miRNAs and Kidney Cancer The underlined miRNAs are experimentally confirmed to be associated with kidney cancer.

Results for Kidney Cancer

For comparison with the previous method ΔPCC, we show the results for kidney cancer. As before, we analyze MTIs with ACC >0.9 and whose single miRNA and mRNA have ACC <0.8. A total of 76 such MTIs are selected (Table S3). Table 5 describes the mRNAs in these 76 MTIs. The underlined miRNAs are experimentally confirmed to be associated with kidney cancer. The PubMed numbers of these miRNAs are shown in the second and fourth columns of Table 5. We also compare the results of SVM and DE in kidney renal clear cell carcinoma (KIRC). Table 6 records the top 20 |log2(FC)| miRNAs in kidney cancer based on DE. Table 7 records the top 20 ACC miRNAs in kidney cancer based on SVM classifier. The underlined miRNAs are experimentally confirmed to be associated with KIRC. Results in Table 6 indicate that only 3 of the top 20 miRNAs were confirmed to be associated with kidney cancer. However, Table 7 shows that 16 of the top 20 ACC miRNAs were confirmed to be associated with kidney cancer. These results also indicate that using SVM to select cancer-related miRNAs is more effective.
Table 6

The Top 20 |log2(FC)| miRNAs in Kidney Cancer

miRNA|log2(FC)|PubMed No.a
miR-12935.14328338236
miR-1225.00723056576
miR-8754.582unknown
miR-31664.523unknown
miR-3202-24.431unknown
miR-1285-14.10822294552
miR-12313.869unknown
miR-12503.832unknown
miR-520b3.788unknown
miR-518c3.777unknown
miR-36543.775unknown
miR-219-23.704unknown
miR-21153.602unknown
miR-36173.484unknown
miR-5553.434unknown
miR-548d-23.413unknown
miR-36623.302unknown
miR-19103.289unknown
miR-5973.278unknown
miR-39413.199unknown

The underlined miRNAs are experimentally confirmed to be associated with KIRC.

The third column represents the PubMed number of literature reports of these miRNAs.

Table 7

The Top 20 ACC miRNAs in Kidney Cancer

miRNAACC (%)PubMed No.a
miR-200c98.7529394133
miR-14198.5124647573
miR-20695.5329410711
miR-12294.2829410711
miR-129-194.1024802708
miR-129-293.7528251969
miR-62993.2125381221
miR-58492.8621119662
miR-891a92.68unknown
miR-106b91.9628423523
miR-21091.9629445446
miR-181b-191.43unknown
miR-15a90.8928849086
miR-93490.54unknown
miR-2190.5329131259
miR-42990.3527698878
miR-15190.00unknown
miR-181a-189.8229066014
miR-15589.6429228417
miR-2589.6429079415

The underlined miRNAs are experimentally confirmed to be associated with KIRC.

The third column represents the PubMed number of literature reports of these miRNAs.

The Top 20 |log2(FC)| miRNAs in Kidney Cancer The underlined miRNAs are experimentally confirmed to be associated with KIRC. The third column represents the PubMed number of literature reports of these miRNAs. The Top 20 ACC miRNAs in Kidney Cancer The underlined miRNAs are experimentally confirmed to be associated with KIRC. The third column represents the PubMed number of literature reports of these miRNAs.

Identification of Cancer Types via miRNA-mRNA Association

To verify whether miRNA-mRNA associations can effectively classify cancer types, we designed a multiclass classifier with multiple SVM sub-classifiers to identify the six cancers and the normal tissues. The miRNA-mRNA pairs with joint ACC >0.8 but marginal ACC <0.7 were selected as the features of the classifiers. The detailed flow chart is in Figure 6. The index “1–6” represents the six kinds of cancer (lung squamous cell carcinoma [LUSC], lung adenocarcinoma [LUAD], BRCA, thyroid carcinoma [THCA], prostate adenocarcinoma [PRAD], KIRC), respectively. The index “7” represents the integration of paired normal tissue samples. We divided these seven classes into two subclasses. Further subclasses are further divided into two subclasses, which are so circulated until a single class is obtained. Finally, we evaluated the performance of the classifier using 10-fold cross-validation. The accuracies of the seven classes are shown in Table 8. The diagonal elements are the percentages of real LUSC, LUAD, BRCA, THCA, PRAD, KIRC, and normal samples identified correctly. The remaining elements are the percentage of a class of samples judged to be the six types of samples. The results indicate that the miRNA-mRNA associations can be used to precisely identify cancer types.
Figure 6

The Flow Chart for Constructing the Multiclass Classifier

The numbers 1–6 represent LUSC, LUAD, BRCA, THCA, PRAD, and KIRC, respectively. The number 7 represents the normal tissue samples. The process contains six SVM classifiers. For sample S1, where the type of cancer is not known, if S1 is classified as “1,2,3” using SVM1, then we use SVM2 to judge its type. If S1 is classified as “3,” the final prediction type is BRCA, otherwise S1 needs to be further predicted through SVM4.

Table 8

The Performance of the Multiclass Classifier by Using 10-Fold Cross-Validation

LUSCLUADBRCATHCAPRADKIRCNormal
LUSC97.280.810.290.160.280.540.64
LUAD1.8396.220.520.350.410.360.31
BRCA0.120.2997.160.840420.580.59
THCA0.340.470.4697.380.730.390.23
PRAD0.260.380.410.4697.140.680.67
KIRC0.520.320.370.430.4697.420.48
Normal0.240.330.340.280.520.1598.14
The Flow Chart for Constructing the Multiclass Classifier The numbers 1–6 represent LUSC, LUAD, BRCA, THCA, PRAD, and KIRC, respectively. The number 7 represents the normal tissue samples. The process contains six SVM classifiers. For sample S1, where the type of cancer is not known, if S1 is classified as “1,2,3” using SVM1, then we use SVM2 to judge its type. If S1 is classified as “3,” the final prediction type is BRCA, otherwise S1 needs to be further predicted through SVM4. The Performance of the Multiclass Classifier by Using 10-Fold Cross-Validation

Comparison with Other Methods

Pian et al. provided a method called ΔPCC to discover potential DM-miRNAs by building the basic miRNA-mRNA network (BMMN) and miRNA-long noncoding RNA (lncRNA) network (BMLN). For breast cancer, 124 miRNAs with high activity scores were obtained by BMMN. In this paper, we obtained 49 miRNAs by integrating Tables 2 and 3. Through comparing these 124 and 49 miRNAs, we found that 9 of 49 miRNAs (hsa-miR-331, hsa-miR-142, hsa-miR-3127, hsa-miR-222, hsa-miR-378c, hsa-miR-92a-2, hsa-miR-421, hsa-miR-125a, and hsa-miR-590) did not appear in the 124 miRNAs. Tables 2 and 3 show that all nine of the above miRNAs except hsa-miR-3127 have been confirmed to be associated with breast cancer. For kidney cancer, 70 miRNAs with high activity score were obtained by BMMN. Only one (miR-let-7b) of the 24 miRNAs in Table 5 appears in the above 70 miRNAs. Fifteen of the remaining 23 miRNAs have been confirmed to be associated with kidney cancer. The above results indicate that our new method can find cancer-related miRNAs that cannot be discovered by ΔPCC.

Discussion

Cancers have a high incidence of occurrence globally. Their high mortality rates highlight the urgent need for new treatment methods. miRNAs are important post-transcriptional gene expression regulators. In cancer, the miRNAs aberrantly expressed have significant roles in progression and tumorigenesis. Currently, miRNAs are being studied as biomarkers for diagnosis and prognosis, and as therapeutic tools in cancer. However, some important miRNAs are easily overlooked, when the correlations between these miRNAs and their target genes in cancer and normal samples are consistent. In order to discover these miRNAs, we use a novel method to discover them by building SVM classifiers based on potential joint MTIs. Our results indicate that the new method can detect additional cancer-related miRNAs that cannot be detected by previous methods. Our new method should be considered complementary to previous methods. We also find that the edge biomarkers contain more biological information than the node biomarkers. Compared with the signal miRNA or mRNA biomarkers, edge biomarkers (paired miRNA-mRNA interaction) can more effectively distinguish tumor samples and normal samples. Furthermore, by constructing a classifier with multiple random forest sub-classifiers based on the edge biomarkers, the six cancers can be identified accurately. This will provide a new way to further study the classification of tumor sub-types. In conclusion, our method can help effectively discover new cancer-related miRNAs. These results will contribute to developing novel therapeutic candidates in cancers. Our method also has some limitations. For example, our method is based on the known MTIs from miRTarBase; thus, it cannot detect newly gained MTIs that have not been recorded in miRTarBase. To remedy this potential loss, a systematic scan of all miRNA-mRNA pairs may be needed, which will be very computationally costly.

Materials and Methods

Datasets

We studied different types of cancer, including BRCA, KIRC, LUAD, LUSC, THCA, and prostate adenocarcinoma (PRAD). The expression profiles of these six cancers were downloaded from the database of The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga), which includes 1,071 miRNAs and 20,530 mRNAs. The number of cancer samples is shown in Table 9. The 155,044 experimentally validated MTIs (Table S1) and miRNA-disease associations were obtained from the databases miRTarBase and HMDD v.2.0, respectively.,
Table 9

The Type and Sample Number of Six Different Types of Cancer

Cancer AbbreviationFull Name of CancerNo. of Cancer Tissue SamplesNo. of Paired Normal Tissue Samples
BRCAbreast invasive carcinoma75586
KIRCkidney renal clear cell carcinoma25571
THCAthyroid carcinoma51159
LUADlung adenocarcinoma44519
LUSClung squamous cell carcinoma34238
PRADprostate adenocarcinoma49452
The Type and Sample Number of Six Different Types of Cancer

Flow Chart of the Method

The workflow of DM-miRNA discovery is divided into four steps (Figure 7). First, an SVM classifier is constructed for each of the 1,071 miRNAs based on its expression data in cancer and normal tissues. Therefore, the classification accuracy (ACC) based on each miRNA expression feature is obtained. We select miRNAs with high ACC as set S1. In step 2, likewise, ACC based on each mRNA expression feature is calculated by building 20,530 SVM classifiers. The mRNAs with high ACC are selected as set S2. In step 3, ACCs based on 155,044 paired miRNA-mRNA expression features are also obtained by building 155,044 SVM classifiers. We select paired miRNA-mRNA interactions with high ACC as set S3. Finally, we obtain potential DM-miRNAs by removing the MTIs of S3, which contain miRNAs of S1 or mRNAs of S2.
Figure 7

The Flow Chart of Our Method

The green modules represent the SVM classification results based on the miRNA expression feature. The miRNAs with high ACC are selected as set S1. The orange modules represent the SVM classification results based on the mRNA expression feature. The mRNAs with high ACC are selected as set S2. The blue modules represent the SVM classification results based on the paired MTIs feature. We select paired MTIs with high ACC as set S3. DM-miRNAs are inferred as the MTIs of S3 after removing those containing miRNAs of S1 or mRNAs of S2.

The Flow Chart of Our Method The green modules represent the SVM classification results based on the miRNA expression feature. The miRNAs with high ACC are selected as set S1. The orange modules represent the SVM classification results based on the mRNA expression feature. The mRNAs with high ACC are selected as set S2. The blue modules represent the SVM classification results based on the paired MTIs feature. We select paired MTIs with high ACC as set S3. DM-miRNAs are inferred as the MTIs of S3 after removing those containing miRNAs of S1 or mRNAs of S2.

Parameters of the Model

The kernel, cost, and gamma of SVM were set to radial, 1, and 1, respectively. Because the positive (86 normal samples) and negative samples (755 BRCA samples) were unbalanced, we used the random sub-sampling method to balance the data. We sampled the training set and the testing set 20 times. Each time, 40 positive samples and 40 negative samples were randomly chosen to form a training set. The corresponding test set is randomly selected from the remaining positive and negative samples, which guarantees that there is no overlap between the training and testing sets. The SVM classification accuracy (ACC) of the 20 groups of balanced data was obtained. We use the mean value of the 20 ACCs as the final accuracy. The formula for ACC from any testing data is defined as follows:where TP (true positive) is the number of positive samples that are identified correctly, FN (false negative) is the number of positive samples that are identified incorrectly, TN (true negative) is the number of negative samples that are identified correctly, and FP (false positive) is the number of negative samples that are identified incorrectly.

Author Contributions

C.P. and X.F. conceived and designed the study; C.P., S.M., and G.Z. analyzed the data; J.D., S.Y.L., and F.L. contributed ideas and comments; C.P. and X.F. wrote the paper; and all authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no competing interests.
  26 in total

Review 1.  The functions of animal microRNAs.

Authors:  Victor Ambros
Journal:  Nature       Date:  2004-09-16       Impact factor: 49.962

Review 2.  A Uniform System for the Annotation of Vertebrate microRNA Genes and the Evolution of the Human microRNAome.

Authors:  Bastian Fromm; Tyler Billipp; Liam E Peck; Morten Johansen; James E Tarver; Benjamin L King; James M Newcomb; Lorenzo F Sempere; Kjersti Flatmark; Eivind Hovig; Kevin J Peterson
Journal:  Annu Rev Genet       Date:  2015-10-14       Impact factor: 16.830

3.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Authors:  Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal:  Nat Protoc       Date:  2009       Impact factor: 13.491

4.  Inferring microRNA-mRNA causal regulatory relationships from expression data.

Authors:  Thuc Duy Le; Lin Liu; Anna Tsykin; Gregory J Goodall; Bing Liu; Bing-Yu Sun; Jiuyong Li
Journal:  Bioinformatics       Date:  2013-01-30       Impact factor: 6.937

Review 5.  MicroRNAs in learning, memory, and neurological diseases.

Authors:  Wenyuan Wang; Ester J Kwon; Li-Huei Tsai
Journal:  Learn Mem       Date:  2012-08-16       Impact factor: 2.460

6.  Identification of tissue-specific microRNAs from mouse.

Authors:  Mariana Lagos-Quintana; Reinhard Rauhut; Abdullah Yalcin; Jutta Meyer; Winfried Lendeckel; Thomas Tuschl
Journal:  Curr Biol       Date:  2002-04-30       Impact factor: 10.834

7.  Most mammalian mRNAs are conserved targets of microRNAs.

Authors:  Robin C Friedman; Kyle Kai-How Farh; Christopher B Burge; David P Bartel
Journal:  Genome Res       Date:  2008-10-27       Impact factor: 9.043

Review 8.  SNPing cancer in the bud: microRNA and microRNA-target site polymorphisms as diagnostic and prognostic biomarkers in cancer.

Authors:  David W Salzman; Joanne B Weidhaas
Journal:  Pharmacol Ther       Date:  2012-09-03       Impact factor: 12.310

Review 9.  MicroRNAs: target recognition and regulatory functions.

Authors:  David P Bartel
Journal:  Cell       Date:  2009-01-23       Impact factor: 41.582

10.  Discovering the 'Dark matters' in expression data of miRNA based on the miRNA-mRNA and miRNA-lncRNA networks.

Authors:  Cong Pian; Guangle Zhang; Sanling Wu; Fei Li
Journal:  BMC Bioinformatics       Date:  2018-10-16       Impact factor: 3.169

View more
  2 in total

1.  Inhibitory Effect of miR-339-5p on Glioma through PTP4A1/HMGB1 Pathway.

Authors:  Buyi Zheng; Shouyi Wang; Huanan Shen; Jie Lin
Journal:  Dis Markers       Date:  2022-07-15       Impact factor: 3.464

2.  A novel cross-species differential tumor classification method based on exosome-derived microRNA biomarkers established by human-dog lymphoid and mammary tumor cell lines' transcription profiles.

Authors:  Kaj Chokeshaiusaha; Thanida Sananmuang; Denis Puthier; Catherine Nguyen
Journal:  Vet World       Date:  2022-05-11
  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.