
MCMDA: Matrix completion for MiRNA-disease association prediction.

Jian-Qiang Li, Zhi-Hao Rong, Xing Chen, Gui-Ying Yan, Zhu-Hong You.

Abstract

Nowadays, researchers have realized that microRNAs (miRNAs) are playing a significant role in many important biological processes and they are closely connected with various complex human diseases. However, since there are too many possible miRNA-disease associations to analyze, it remains difficult to predict the potential miRNAs related to human diseases without a systematic and effective method. In this study, we developed a Matrix Completion for MiRNA-Disease Association prediction model (MCMDA) based on the known miRNA-disease associations in HMDD database. MCMDA model utilized the matrix completion algorithm to update the adjacency matrix of known miRNA-disease associations and furthermore predict the potential associations. To evaluate the performance of MCMDA, we performed leave-one-out cross validation (LOOCV) and 5-fold cross validation to compare MCMDA with three previous classical computational models (RLSMDA, HDMP, and WBSMDA). As a result, MCMDA achieved AUCs of 0.8749 in global LOOCV, 0.7718 in local LOOCV and average AUC of 0.8767+/-0.0011 in 5-fold cross validation. Moreover, the prediction results associated with colon neoplasms, kidney neoplasms, lymphoma and prostate neoplasms were verified. As a consequence, 84%, 86%, 78% and 90% of the top 50 potential miRNAs for these four diseases were respectively confirmed by recent experimental discoveries. Therefore, MCMDA model is superior to the previous models in that it improves the prediction performance although it only depends on the known miRNA-disease associations.

Keywords:  disease; matrix completion; miRNA; miRNA-disease association

Year:  2017        PMID: 28177900      PMCID: PMC5400576          DOI: 10.18632/oncotarget.15061

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


INTRODUCTION

MicroRNA (miRNA) is a kind of short non-coding single-stranded RNA (∼22nt) which can regulate the gene expression by binding to the 3′ untranslated regions (UTRs) of its target messenger RNA (mRNA) through base pairing [1, 2]. There are significant differences between the miRNAs in different tissues and different growth stages, which means that miRNAs have differential spatial and temporal expression patterns [3]. Based on plenty of biological experiments, researchers now believe that these small molecules have a wide range of regulation effects on eukaryotic gene expression, not only in human genes but also in genes of many other species [4]. Up to now, researchers have discovered that miRNAs are involved in a series of critical life processes, including early cell growth, proliferation, differentiation [5, 6], apoptosis, death [7], fat metabolism and so on. Therefore, it is no wonder that miRNAs are closely related to many complex human diseases [8, 9]. For example, studies have implicated that miRNA-143 and miRNA-145 are constantly down-regulated in colorectal tumors [10] and recently Croce et al. also have shown that the downregulation of these miRNAs is a common occurrence in breast carcinomas [11]. Besides, studies by Takamizawa et al. [12] and Yanaihara et al. [13] have presented evidence that transcripts of certain let-7 homologs are significantly downregulated in human lung cancer. Based on real-time polymerase chain reaction (PCR), the analysis of miRNA arrays using pooled RNA samples from five gastric cancer patients indicates that the expression of miRNA-107, miRNA-21, miRNA-196a, miRNA-26b, miRNA-9, miRNA-142-3p, miRNA-30b, miRNA-150, miRNA-191, and miRNA-17 was found to be upregulated [14]. However, it is expensive and time-consuming to identify the associations between miRNAs and diseases using experimental methods. 
Considering that large numbers of miRNA-associated datasets are available, computational methods are efficient in predicting miRNA-disease associations in that they can select the most promising associated miRNAs for further experimental studies [15-17]. Therefore, it is necessary for us to make further efforts and develop efficient computational models to predict the potential miRNA-disease associations [16, 18–31]. Many computational methods have been established to predict the potential associations between miRNAs and diseases depending on the assumption that miRNAs with similar functions are more likely to have connections with diseases which share similar phenotypes [32, 33]. Jiang et al. [34] proposed a hypergeometric distribution-based model to predict miRNA-disease associations based on disease phenotype similarity network, miRNA functional similarity network, and known human disease-miRNA association network. However, this method strongly depends on the miRNA-target interactions with a high rate of false positive and false negative samples. Moreover, Shi et al. [35] presented a new model by implementing random walk algorithm on protein-protein interaction (PPI) network based on the idea that miRNAs whose target genes are related to certain diseases are more likely to be associated with these diseases. They made use of the miRNA–target interactions, disease–gene associations, and PPIs to acquire potential associations between the miRNAs and diseases. Mork et al. [36] proposed a miRPD method with the help of protein-disease interactions as well as protein-miRNA interactions, where not only disease-related miRNAs but also potential disease-related proteins were analyzed. By integrating known disease–gene associations and miRNA-target interactions, Xu et al. [37] introduced a miRNA prioritization method which need not rely on the known miRNA-disease associations. 
Instead, they only needed to evaluate the similarity between the targets of miRNAs and disease genes. Nevertheless, all the methods mentioned above suffered from miRNA-target interactions with high rates of false positives and false negatives, which could significantly reduce their accuracy. Researchers also proposed some other computational models that do not rely on miRNA-target interactions. Based on miRNA functional similarity, disease semantic similarity, disease phenotype similarity, and miRNA-disease associations, Xuan et al. [38] presented the HDMP model, which analyzed the miRNAs related to the diseases by considering the functional similarities of each miRNA’s k most similar neighbors. Compared with the previous methods, HDMP assigned higher weight to miRNAs in the same cluster or family since they are more likely to be associated with similar diseases. When applied to new diseases without any known related miRNAs, however, HDMP cannot work since it strongly depends on the neighbors of the miRNAs. Besides, HDMP is based on a local similarity measure rather than a global measure, although global measures can notably improve the prediction performance. Xuan et al. [39] introduced another model called MIDP based on random walk. The labeled nodes in MIDP were assigned higher transition weight than the unlabeled nodes, which efficiently exploited the prior information of nodes and the various ranges of topologies. Notably, MIDP effectively relieved the negative effect of noisy data. MIDP also extended the walk on a miRNA-disease bilayer network to predict candidates specifically for diseases without any known related miRNAs. Recently, Zeng et al. [40] utilized matrix completion to predict the miRNA-disease associations based on miRNA-miRNA network and disease-disease network. 
The method contributed multiple feature sets to address problems related to insufficient miRNA-disease association data. The method could be applied to predict unknown miRNA-disease associations and new pathogenic miRNAs for well-characterized diseases. Chen et al. [41] proposed RWRMDA model which integrated miRNA-miRNA functional similarity and known miRNA-disease associations information to predict miRNA-disease associations. RWRMDA was motivated based on the investigation that global similarity measures are better in predicting the associations between miRNAs and diseases than the previous local network similarity measures. Still, this method fails to predict miRNAs associated with new diseases without any known related miRNAs. Chen et al. [16] presented another model called WBSMDA based on miRNA functional similarity, disease semantic similarity, miRNA-disease associations, and Gaussian interaction profile kernel similarity for miRNAs and diseases. WBSMDA makes a breakthrough in that it succeeds in predicting related miRNAs for new diseases without known related miRNAs and new miRNAs without known related diseases. Recently, Chen et al. [42] presented a model of HGIMDA using miRNA functional similarity, disease semantic similarity, miRNA-disease associations, and Gaussian interaction profile kernel similarities. In HGIMDA, the new miRNA functional similarity network was obtained by combining miRNA functional similarity network with Gaussian interaction profile kernel similarities for miRNAs. The process of calculating new disease similarity network was quite similar. Then, a heterogeneous graph was obtained by combining new miRNA functional similarity network, new disease similarity network and known miRNA-disease associations. Moreover, the potential association between a disease and a miRNA could be inferred based on an iterative equation if they didn’t have known association. It has been verified that HGIMDA obtained a high prediction performance. 
In addition, several computational models have adopted machine learning methods. For instance, Xu et al. [43] developed a miRNA target-dysregulated network (MTDN) based on miRNA-target interactions as well as miRNA and mRNA expression profiles, and used a support vector machine (SVM) classifier to distinguish positive miRNA-disease associations from negative ones. Nevertheless, negative miRNA-disease associations are still fairly difficult to obtain, which seriously decreases the prediction performance of this computational model. Chen et al. [15] presented the RLSMDA model based on semi-supervised learning, which calculated the semantic similarity between different diseases. It is worth mentioning that RLSMDA could identify related miRNAs for diseases without any known associated miRNAs while avoiding the need for negative associations between miRNAs and diseases. The difficulty with RLSMDA lies in finding appropriate parameters and in combining the classifiers from the miRNA space and the disease space. Chen et al. [19] developed another computational model called RBMMMDA based on miRNA-disease associations, which used a restricted Boltzmann machine (RBM), a two-layer undirected graphical model consisting of layers of visible and hidden units. Compared to the previous models, RBMMMDA could obtain not only new miRNA-disease associations but also the corresponding association types. However, its complex parameters remain difficult to learn. In this study, we developed an effective computational model, Matrix Completion for MiRNA-Disease Association prediction (MCMDA), which applies a matrix completion algorithm to the known miRNA-disease associations to predict potential miRNA-disease associations. Compared to the previous computational models, MCMDA predicts the miRNA-disease associations with the matrix completion algorithm, which efficiently updates the low-rank miRNA-disease matrix. 
Besides, negative associations which are required in some previous computational models are not needed in MCMDA. To evaluate the effectiveness of MCMDA, global and local LOOCV as well as 5-fold cross validation were introduced. The AUCs of global and local LOOCV were respectively 0.8749 and 0.7718, and the model obtained the average AUC of 0.8767+/−0.0011 on 5-fold cross validation. Besides, the top 10 and top 50 miRNAs related to colon neoplasms, kidney neoplasms, lymphoma and prostate neoplasms obtained by MCMDA were examined in dbDEMC [44] and miR2Disease [45] database. As a result, 84%, 86%, 78% and 90% of the top 50 potential miRNAs for these four complex diseases were respectively confirmed by recent experimental discoveries. Thus, it proves that MCMDA is effective in predicting potential miRNA-disease associations and it has significant advantages over the previous methods although MCMDA only depends on known miRNA-disease associations.

RESULTS

Performance evaluation

We used global and local LOOCV as well as 5-fold cross validation based on the known miRNA-disease associations in HMDD database to evaluate the performance of MCMDA. Meanwhile, MCMDA was compared with three previous classical computational methods: WBSMDA [16], RLSMDA [15] and HDMP [38]. In LOOCV evaluation, each known association in the database was regarded as the test sample in turn while the other known associations were regarded as training samples. The miRNA-disease pairs without known association evidence were considered as candidate samples. The scores of all miRNA-disease pairs could be obtained after MCMDA was implemented. In global LOOCV, the score of the test sample was compared with the scores of all the candidate samples, while in local LOOCV, the score of the test sample was compared only with the scores of the candidate samples involving the particular disease in the test sample. In 5-fold cross validation, the known miRNA-disease associations were randomly divided into five disjoint parts. Each time, one part was picked out as test samples and the other four parts were treated as training samples. Again, the miRNA-disease pairs without known association evidence were regarded as candidate samples. Then, the score of each test sample was compared with the scores of all the candidate samples. This procedure was repeated five times until each known association had been used as a test sample and its score had been compared with the scores of the candidate samples. Those test samples whose ranks exceeded the given threshold were considered to predict the miRNA-disease associations correctly. Finally, we drew a receiver operating characteristic (ROC) curve to compare MCMDA with all the previous methods. In this curve, the true positive rate (TPR, sensitivity) was plotted against the false positive rate (FPR, 1-specificity) [46]. 
Sensitivity represents the percentage of miRNA-disease test samples whose ranks exceeded the given threshold, while specificity represents the percentage of negative miRNA-disease associations whose ranks were lower than the threshold [47]. The area under the ROC curve (AUC) was calculated to evaluate the accuracy of MCMDA. An AUC of 1 indicates perfect performance, whereas an AUC of 0.5 means that the method merely has a random prediction performance. As a result, the AUCs of MCMDA, WBSMDA, RLSMDA and HDMP were 0.8749, 0.8030, 0.8426, and 0.8366, respectively, in global LOOCV. For local LOOCV, MCMDA, WBSMDA, RLSMDA and HDMP acquired AUCs of 0.7718, 0.8030, 0.8031 and 0.6953, respectively. The average AUCs of MCMDA, WBSMDA, RLSMDA and HDMP were 0.8767+/−0.0011, 0.8185+/−0.0009, 0.8569+/−0.0020 and 0.8342+/−0.0010, respectively, in 5-fold cross validation (See Figure 1). All in all, MCMDA turns out to be more effective in predicting potential miRNA-disease associations compared with the previous methods, especially considering that MCMDA merely depends on the known miRNA-disease associations in the database.
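
The ranking step of the LOOCV procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `loocv_rank`, the toy score matrix and the tie handling (a tie does not lower the rank) are assumptions, and in a real run the scores would be recomputed with the test association hidden from the training data.

```python
import numpy as np

def loocv_rank(scores, known, i, j, local=False):
    """Rank (1 = best) of the held-out pair (i, j) among the candidate pairs.

    scores: score matrix (miRNAs x diseases) from a model trained without (i, j).
    known:  boolean mask of all known associations, including (i, j).
    local:  if True, compare only against candidates of disease j (local LOOCV);
            otherwise compare against all candidates (global LOOCV).
    """
    candidates = ~known  # pairs without known association evidence
    if local:
        pool = scores[candidates[:, j], j]
    else:
        pool = scores[candidates]
    # Count the candidates that score strictly higher than the test sample.
    return 1 + int(np.sum(pool > scores[i, j]))

# Hypothetical toy scores for 3 miRNAs x 2 diseases (not real predictions).
scores = np.array([[0.90, 0.50],
                   [0.40, 0.35],
                   [0.10, 0.30]])
known = np.array([[True, False],
                  [False, True],
                  [False, False]])

global_rank = loocv_rank(scores, known, 1, 1)             # against all candidates
local_rank = loocv_rank(scores, known, 1, 1, local=True)  # against disease 1 only
```

Sweeping a rank threshold over all test samples and recording TPR and FPR at each threshold then gives the ROC curve from which the AUC is computed.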
Figure 1

Performance evaluation comparison between MCMDA and three previous prediction models (RLSMDA, HDMP, WBSMDA) in terms of ROC curve and AUC based on global LOOCV and local LOOCV tested by known miRNA-disease associations in the HMDD database

MCMDA achieved an AUC of 0.8749 in global LOOCV and 0.7718 in local LOOCV. Thus, MCMDA performs better than most of the previous models and proves to be effective in predicting the potential miRNA-disease associations.


Case studies

Furthermore, case studies of four significant diseases related to human health were implemented to practically evaluate the prediction accuracy of MCMDA. The top 10 and top 50 predicted miRNAs related to these diseases were examined in two other miRNA-disease databases, dbDEMC [44] and miR2Disease [45]. Colon neoplasms is a malignant cancer which is commonly found at the boundary of the rectum and sigmoid colon [48]. It is the third most common cancer and the third leading cause of cancer death for both men and women in the United States [49]. However, patients with early-stage colon neoplasms only suffer from subtle symptoms [50], making the disease difficult to detect. To make things worse, it is reported that its occurrence rate has been increasing in recent years [51]. Thus, it is urgent to predict the potential miRNAs related to colon neoplasms. With the help of modern medicine, many miRNAs have been confirmed to be correlated with colon neoplasms. For instance, miRNA-145 targets the insulin receptor substrate-1 and thus inhibits the growth of colon cancer cells [52]. Besides, miRNA-126, which is frequently lost in colon neoplasm cells, suppresses the growth of neoplastic cells by targeting phosphatidylinositol 3-kinase signaling [53]. MCMDA was implemented to predict the top 50 miRNAs associated with colon neoplasms. As a result, 9 of the top 10 and 42 of the top 50 predicted miRNAs associated with colon neoplasms were verified by the dbDEMC and miR2Disease databases (See Table 1).
Table 1

Prediction of the top 50 predicted miRNAs associated with colon neoplasms based on known associations in HMDD database

| miRNA (top 1-25) | Evidence | miRNA (top 26-50) | Evidence |
|---|---|---|---|
| hsa-mir-146a | dbdemc | hsa-mir-196a | dbdemc;miR2Disease |
| hsa-mir-155 | dbdemc;miR2Disease | hsa-mir-29c | dbdemc |
| hsa-mir-122 | unconfirmed | hsa-mir-223 | dbdemc;miR2Disease |
| hsa-mir-21 | dbdemc;miR2Disease | hsa-mir-143 | dbdemc;miR2Disease |
| hsa-mir-34a | dbdemc;miR2Disease | hsa-let-7a | unconfirmed |
| hsa-mir-221 | dbdemc;miR2Disease | hsa-mir-195 | dbdemc;miR2Disease |
| hsa-mir-16 | dbdemc | hsa-mir-200b | dbdemc |
| hsa-mir-125b | dbdemc | hsa-mir-214 | dbdemc |
| hsa-mir-29a | dbdemc;miR2Disease | hsa-mir-106b | dbdemc;miR2Disease |
| hsa-mir-29b | dbdemc;miR2Disease | hsa-mir-23a | miR2Disease |
| hsa-mir-15a | dbdemc | hsa-mir-142 | unconfirmed |
| hsa-mir-133a | dbdemc;miR2Disease | hsa-mir-31 | dbdemc;miR2Disease |
| hsa-mir-222 | dbdemc | hsa-mir-34c | miR2Disease |
| hsa-mir-20a | dbdemc;miR2Disease | hsa-mir-141 | dbdemc;miR2Disease |
| hsa-mir-199a | unconfirmed | hsa-mir-148a | dbdemc |
| hsa-mir-26a | dbdemc;miR2Disease | hsa-mir-182 | dbdemc;miR2Disease |
| hsa-mir-1 | dbdemc;miR2Disease | hsa-mir-200a | unconfirmed |
| hsa-mir-19b | dbdemc;miR2Disease | hsa-let-7c | dbdemc |
| hsa-mir-19a | dbdemc;miR2Disease | hsa-mir-101 | unconfirmed |
| hsa-mir-15b | miR2Disease | hsa-mir-192 | dbdemc;miR2Disease |
| hsa-mir-18a | miR2Disease | hsa-mir-181a | dbdemc;miR2Disease |
| hsa-mir-92a | dbdemc | hsa-mir-9 | dbdemc;miR2Disease |
| hsa-mir-206 | unconfirmed | hsa-mir-133b | dbdemc;miR2Disease |
| hsa-mir-30b | dbdemc;miR2Disease | hsa-mir-34b | dbdemc;miR2Disease |
| hsa-mir-150 | dbdemc;miR2Disease | hsa-mir-183 | dbdemc;miR2Disease |

The first column records top 1-25 related miRNAs. The second column records the top 26-50 related miRNAs.

Kidney neoplasms, also known as renal cancer, is a cancer starting in the cells of the kidney that includes many different types [54]. The two most common types of kidney cancer are renal cell carcinoma (RCC) and transitional cell carcinoma (TCC, also known as urothelial cell carcinoma) of the renal pelvis [55]. The most common symptoms in patients with kidney neoplasms are lumbar pain and hematuria [56]. Many kidney neoplasm-related miRNAs have been reported based on recent biological experiments. For example, the common target ACVR2B of five miRNAs (miRNA-192, miRNA-194, miRNA-215, miRNA-200c and miRNA-141) is strongly expressed in renal childhood neoplasms [57]. In addition, miRNA-23b, by targeting proline oxidase, a novel tumor suppressor protein, could function as an oncogene in renal cancer [58]. Thus, decreasing miRNA-23b expression may prove to be an effective way of inhibiting kidney tumor growth [58]. Based on MCMDA, 7 of the top 10 and 43 of the top 50 potential miRNAs associated with kidney neoplasms were confirmed by the dbDEMC and miR2Disease databases (See Table 2).
Table 2

Prediction of the top 50 predicted miRNAs associated with kidney neoplasms based on known associations in HMDD database

| miRNA (top 1-25) | Evidence | miRNA (top 26-50) | Evidence |
|---|---|---|---|
| hsa-mir-155 | dbdemc | hsa-mir-92a | unconfirmed |
| hsa-mir-146a | dbdemc | hsa-mir-195 | dbdemc |
| hsa-mir-122 | dbdemc;miR2Disease | hsa-mir-126 | dbdemc;miR2Disease |
| hsa-mir-34a | dbdemc | hsa-mir-29c | dbdemc;miR2Disease |
| hsa-mir-221 | unconfirmed | hsa-mir-23a | dbdemc |
| hsa-mir-16 | dbdemc | hsa-mir-143 | dbdemc |
| hsa-mir-125b | unconfirmed | hsa-mir-223 | dbdemc |
| hsa-mir-29a | dbdemc;miR2Disease | hsa-mir-214 | dbdemc;miR2Disease |
| hsa-mir-133a | unconfirmed | hsa-let-7a | dbdemc |
| hsa-mir-29b | dbdemc;miR2Disease | hsa-mir-148a | dbdemc |
| hsa-mir-145 | dbdemc | hsa-mir-200b | dbdemc;miR2Disease |
| hsa-mir-26a | dbdemc;miR2Disease | hsa-mir-31 | dbdemc |
| hsa-mir-199a | dbdemc;miR2Disease | hsa-mir-142 | unconfirmed |
| hsa-mir-222 | dbdemc | hsa-mir-106b | dbdemc;miR2Disease |
| hsa-mir-1 | dbdemc | hsa-mir-34c | dbdemc |
| hsa-mir-15b | dbdemc | hsa-mir-182 | dbdemc;miR2Disease |
| hsa-mir-20a | dbdemc;miR2Disease | hsa-mir-200a | dbdemc |
| hsa-mir-17 | dbdemc;miR2Disease | hsa-mir-101 | dbdemc;miR2Disease |
| hsa-mir-30b | dbdemc | hsa-let-7c | dbdemc |
| hsa-mir-206 | dbdemc | hsa-mir-181a | dbdemc |
| hsa-mir-19a | dbdemc | hsa-mir-9 | dbdemc |
| hsa-mir-196a | dbdemc | hsa-mir-34b | dbdemc |
| hsa-mir-19b | dbdemc;miR2Disease | hsa-mir-183 | dbdemc |
| hsa-mir-18a | dbdemc | hsa-mir-133b | unconfirmed |
| hsa-mir-150 | dbdemc;miR2Disease | hsa-let-7b | unconfirmed |

The first column records top 1-25 related miRNAs. The second column records the top 26-50 related miRNAs.

Lymphoma is a malignant tumor originating in the lymphatic hematopoietic system [59] which consists of two categories: non-Hodgkin lymphoma (NHL) and Hodgkin's lymphoma (HL) [60]. Lymphoma is thought to be associated with gene mutations, as well as viruses, pathogens, radiation, chemical drugs, autoimmune diseases, etc. [61]. For example, re-expression of miRNA-150 induces EBV-positive Burkitt lymphoma differentiation by modulating c-Myb in vitro [62]. Besides, the expression levels of miRNA-21 and miRNA-210 in the plasma of previously untreated lymphoma patients were higher than those of patients treated for 6 or more courses [63]. MCMDA was used to predict the top 10 and top 50 miRNAs related to lymphoma. As a result, 9 of the top 10 and 39 of the top 50 potential miRNAs were confirmed in the dbDEMC and miR2Disease databases (See Table 3).
Table 3

Prediction of the top 50 predicted miRNAs associated with lymphoma based on known associations in HMDD database

| miRNA (top 1-25) | Evidence | miRNA (top 26-50) | Evidence |
|---|---|---|---|
| hsa-mir-30b | dbdemc | hsa-mir-208a | unconfirmed |
| hsa-mir-148a | dbdemc | hsa-mir-26b | dbdemc |
| hsa-mir-373 | dbdemc | hsa-mir-143 | unconfirmed |
| hsa-mir-196a | dbdemc | hsa-mir-9 | dbdemc |
| hsa-mir-23a | dbdemc | hsa-let-7b | dbdemc |
| hsa-mir-206 | dbdemc | hsa-mir-96 | dbdemc |
| hsa-mir-195 | dbdemc | hsa-let-7d | dbdemc |
| hsa-mir-372 | unconfirmed | hsa-mir-93 | dbdemc |
| hsa-mir-199a | dbdemc | hsa-mir-483 | unconfirmed |
| hsa-mir-15b | dbdemc | hsa-mir-371a | unconfirmed |
| hsa-mir-34c | unconfirmed | hsa-let-7e | dbdemc;miR2Disease |
| hsa-mir-34b | dbdemc | hsa-mir-7 | dbdemc |
| hsa-mir-183 | dbdemc | hsa-mir-223 | dbdemc |
| hsa-mir-132 | dbdemc | hsa-mir-106a | dbdemc;miR2Disease |
| hsa-mir-214 | dbdemc | hsa-mir-205 | dbdemc |
| hsa-mir-182 | dbdemc | hsa-mir-222 | dbdemc |
| hsa-mir-31 | unconfirmed | hsa-mir-335 | dbdemc |
| hsa-mir-133a | dbdemc | hsa-mir-27a | dbdemc |
| hsa-mir-212 | dbdemc | hsa-mir-181c | dbdemc |
| hsa-mir-141 | dbdemc | hsa-mir-224 | dbdemc |
| hsa-mir-142 | unconfirmed | hsa-mir-27b | dbdemc |
| hsa-mir-192 | dbdemc | hsa-mir-30a | dbdemc |
| hsa-mir-429 | unconfirmed | hsa-mir-370 | unconfirmed |
| hsa-mir-451a | unconfirmed | hsa-mir-1 | dbdemc |
| hsa-mir-106b | dbdemc | hsa-let-7g | dbdemc |

The first column records top 1-25 related miRNAs. The second column records the top 26-50 related miRNAs.

Prostate neoplasms is a malignant tumor which originates in the epithelial cells of the prostate [64]. Factors that increase the risk of prostate neoplasms include older age, a family history of the disease, race and a diet high in processed meat, red meat or milk products or low in certain vegetables [65]. Up to now, many miRNAs have been discovered to be associated with prostate neoplasms. For instance, the proto-oncogene ERG is a target of miRNA-145 in prostate cancer [66]. MCMDA was used to predict the top 10 and top 50 potential miRNAs associated with prostate neoplasms. As a consequence, 9 of the top 10 and 45 of the top 50 predicted miRNAs were confirmed in the dbDEMC and miR2Disease databases (See Table 4).
Table 4

Prediction of the top 50 predicted miRNAs associated with prostate neoplasms based on known associations in HMDD database

| miRNA (top 1-25) | Evidence | miRNA (top 26-50) | Evidence |
|---|---|---|---|
| hsa-mir-146a | miR2Disease | hsa-mir-150 | dbdemc |
| hsa-mir-122 | unconfirmed | hsa-mir-126 | dbdemc;miR2Disease |
| hsa-mir-155 | dbdemc | hsa-mir-195 | dbdemc;miR2Disease |
| hsa-mir-21 | dbdemc;miR2Disease | hsa-mir-29c | dbdemc |
| hsa-mir-34a | dbdemc;miR2Disease | hsa-mir-223 | dbdemc;miR2Disease |
| hsa-mir-16 | dbdemc;miR2Disease | hsa-mir-143 | dbdemc;miR2Disease |
| hsa-mir-221 | dbdemc;miR2Disease | hsa-mir-23a | dbdemc;miR2Disease |
| hsa-mir-29a | dbdemc | hsa-let-7a | dbdemc;miR2Disease |
| hsa-mir-133a | dbdemc | hsa-mir-200b | unconfirmed |
| hsa-mir-29b | dbdemc;miR2Disease | hsa-mir-214 | dbdemc;miR2Disease |
| hsa-mir-15a | dbdemc;miR2Disease | hsa-mir-148a | miR2Disease |
| hsa-mir-26a | dbdemc;miR2Disease | hsa-mir-106b | dbdemc |
| hsa-mir-222 | dbdemc;miR2Disease | hsa-mir-34c | dbdemc |
| hsa-mir-199a | dbdemc;miR2Disease | hsa-mir-142 | unconfirmed |
| hsa-mir-1 | dbdemc | hsa-mir-31 | dbdemc;miR2Disease |
| hsa-mir-20a | miR2Disease | hsa-mir-141 | miR2Disease |
| hsa-mir-17 | miR2Disease | hsa-mir-182 | dbdemc;miR2Disease |
| hsa-mir-15b | dbdemc | hsa-mir-200a | dbdemc |
| hsa-mir-19a | dbdemc | hsa-mir-101 | dbdemc;miR2Disease |
| hsa-mir-19b | dbdemc;miR2Disease | hsa-let-7c | dbdemc;miR2Disease |
| hsa-mir-206 | dbdemc | hsa-mir-192 | dbdemc |
| hsa-mir-30b | dbdemc;miR2Disease | hsa-mir-181a | dbdemc;miR2Disease |
| hsa-mir-18a | unconfirmed | hsa-mir-9 | dbdemc |
| hsa-mir-196a | dbdemc | hsa-mir-34b | dbdemc |
| hsa-mir-92a | unconfirmed | hsa-mir-133b | dbdemc |

The first column records top 1-25 related miRNAs. The second column records the top 26-50 related miRNAs.

The results of the case studies on the four aforementioned human diseases illustrate that MCMDA achieves excellent prediction performance. Moreover, we prioritized the potential miRNAs associated with all the human diseases in HMDD database (See Supplementary Table 1). We hope that the predictions of MCMDA can be verified in future research.
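
The case-study check (rank the candidate miRNAs of one disease, keep the top k, and count how many appear in an independent database) can be sketched as below. This is only an illustration under stated assumptions: the helper names `top_k_candidates` and `confirmed_fraction` are hypothetical, the miRNA names and scores are placeholders rather than real data, and excluding miRNAs with an already known association is an assumption about how the candidate list is formed.

```python
import numpy as np

def top_k_candidates(scores_j, known_j, mirnas, k):
    """Names of the k highest-scoring miRNAs with no known association to the disease."""
    order = np.argsort(-np.asarray(scores_j), kind="stable")  # descending by score
    ranked = [mirnas[i] for i in order if not known_j[i]]
    return ranked[:k]

def confirmed_fraction(top, validated):
    """Fraction of the predicted miRNAs found in an independent validation set."""
    return sum(name in validated for name in top) / len(top)

# Placeholder data for one disease column (hypothetical, not from HMDD or dbDEMC).
mirnas = ["mir-a", "mir-b", "mir-c", "mir-d"]
scores_j = [0.9, 0.1, 0.7, 0.4]
known_j = [True, False, False, False]
validated = {"mir-c"}

top2 = top_k_candidates(scores_j, known_j, mirnas, k=2)
fraction = confirmed_fraction(top2, validated)
```

With k = 50, the same count corresponds to the percentages reported for each disease.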

DISCUSSION

Nowadays, researchers propose several computational methods to predict the potential associations between miRNAs and diseases because computational models could select the most promising miRNAs related to human diseases and are less expensive than the traditional experimental methods. In order to predict potential miRNA-disease associations, we developed a computational model of MCMDA by analyzing the known miRNA-disease associations and implementing the matrix completion algorithm to get the association score of each miRNA-disease pair. MCMDA obtained excellent prediction performances based on LOOCV and 5-fold cross validation. In addition, the predicted miRNAs associated with four important human diseases: colon neoplasms, kidney neoplasms, lymphoma and prostate neoplasms, were verified by the experimental literatures in dbDEMC and miR2Disease database. The results from cross validation and case studies indicated that MCMDA was effective in predicting potential miRNA-disease associations although it only depends on known miRNA-disease associations. The reasons why MCMDA achieved excellent performances are as follows. Firstly, MCMDA predicts the miRNA-disease associations by using the matrix completion algorithm based on the observation that the miRNA-disease matrix is low-rank. MCMDA fills the candidate samples without known associations with 0 and then iteratively updates them with the predictive scores. Besides, MCMDA is based on the known miRNA-disease associations in HMDD database. Plenty of known associations guarantee the efficiency of the predictions in MCMDA. Finally, negative associations which are required in some previous models are not needed in MCMDA. Yet, there still exist several limitations in MCMDA. Firstly, MCMDA method is based on the known miRNA-disease associations, which means it cannot predict the potential miRNAs associated with the new diseases without any known related miRNAs and potential diseases associated with new miRNAs. 
Besides, there is no powerful method to find the optimal parameters for MCMDA. Finally, the current miRNA-disease associations are insufficient. To be specific, there are merely 5430 known miRNA-disease associations within the possible exploration spaces of 495 miRNAs and 383 diseases. The more known associations are confirmed in the future, the more accurate MCMDA model can become.

MATERIALS AND METHODS

Human miRNA-disease associations

The known miRNA-disease associations were downloaded from HMDD v2.0 database [67], which consisted of 5430 known miRNA-disease associations, 495 miRNAs, and 383 diseases. We furthermore constructed an adjacency matrix $M$ to represent the known miRNA-disease associations. If miRNA $m_i$ is reported to be associated with disease $d_j$ in the database, the value of $M_{ij}$ is 1, and otherwise 0. $\Omega$ denotes the set of all the known associations in matrix $M$, which means $(i,j) \in \Omega$ if $m_i$ is associated with $d_j$. $n_m$ represents the number of miRNAs in HMDD database and $n_d$ represents the number of diseases.
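
As a small illustration of this construction, the sketch below builds the 0/1 adjacency matrix and the mask of known entries from a list of association pairs. The function name `build_adjacency` and the placeholder miRNA and disease names are assumptions made for the example; only the counts 5430, 495 and 383 come from the text.

```python
import numpy as np

def build_adjacency(pairs, mirnas, diseases):
    """Return the 0/1 matrix M (miRNAs x diseases) and the boolean mask of known entries."""
    m_index = {name: i for i, name in enumerate(mirnas)}
    d_index = {name: j for j, name in enumerate(diseases)}
    M = np.zeros((len(mirnas), len(diseases)))
    for mirna, disease in pairs:
        M[m_index[mirna], d_index[disease]] = 1.0
    omega = M == 1.0  # True exactly at the known associations
    return M, omega

# Placeholder names (hypothetical, not taken from HMDD).
mirnas = ["mir-a", "mir-b", "mir-c"]
diseases = ["disease-x", "disease-y"]
pairs = [("mir-a", "disease-x"), ("mir-b", "disease-x"), ("mir-c", "disease-y")]

M, omega = build_adjacency(pairs, mirnas, diseases)
density = omega.sum() / omega.size

# Fraction of known entries in the HMDD v2.0 matrix, from the counts in the text.
hmdd_density = 5430 / (495 * 383)
```

With the HMDD counts, fewer than 3% of the 189585 miRNA-disease pairs are known associations, so almost all entries of $M$ are candidates to be scored.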

MCMDA

We developed MCMDA based on the known miRNA-disease associations in HMDD database to predict the potential associations (See Figure 2). MCMDA uses the singular value thresholding (SVT) algorithm to accomplish the matrix completion procedure. First, the miRNA-disease association matrix M was obtained according to known miRNA-disease associations. Here, all the known associations between miRNAs and diseases in HMDD database are used as training samples.
Figure 2

Flowchart of MCMDA model to predict the potential miRNA-disease associations based on the known associations in HMDD database

The matrix completion algorithm is iterative and a prediction matrix (k denotes the iteration times) can be obtained in each iteration. When MCMDA ends, the matrix ( denotes the ultimate iteration times) is obtained which records the scores of all the possible miRNA-disease pairs. To ensure that the scores of known associations in are close to those in M, the following optimization problem needs to be solved. where is a candidate solution matrix with scores of all the unknown miRNA-disease samples, is the orthogonal projector onto the span of matrices vanishing outside of so that the (i,j) th component of is equal to if or zero otherwise. is a nonlinear function of which can be written as the following form. where is the nuclear form of the matrix which is the sum of the singular values of , denotes the Frobernius form of X which is , is athresholding which will be introduced later. According to [68], problem (2) can be optimized using the Lagrangian multiplier method. Specifically, we introduce a Lagrangian multiplier Y and get the Lagrangian function as below: The singular value decomposition (SVD) of matrix X with rank r, which represents the number of singular values of matrix X, is needed in matrix completion algorithm. where U and V are and matrices. means that is a diagonal matrix with positive singular values on its main diagonal. For , we introduce an operator defined as follows: where is the positive part of . In other words, is equal to if or 0 otherwise and it effectively shrinks the singular values of X toward 0. The value of is according to the previous research of matrix completion algorithm [69]. There are two key steps which are special instances of Uzawa’s algorithm [70] to find a saddle point of (3) in each iteration. We introduce which are a series of matrices to record the intermediate scores of matrices . First, update X with Y: Then, update Y with X: where is a zero matrix [71] and is the step size. 
It is generally accepted that the iteration converges to a unique solution when $0 < \delta < 2$ [72]. Specifically, we set the value of $\delta$ empirically, following the excellent performance reported for a previous model [73]. MCMDA applies the Karush-Kuhn-Tucker (KKT) conditions as the stopping criterion, which is checked in each iteration to make sure that the scores of the known associations in the prediction matrix $X^k$ are close enough to those in the original matrix $M$:

$$\frac{\|P_\Omega(X^k - M)\|_F}{\|P_\Omega(M)\|_F} \le \epsilon$$

where $\epsilon$ is a stopping tolerance, whose value was chosen because it proved appropriate for restricting the number of iterations in a previous algorithm [71]. If the stopping criterion is met, MCMDA stops iterating immediately and the final matrix is obtained. Finally, a parameter maxiter is set, which restricts the maximum number of iterations and avoids an infinite loop. Specifically, maxiter is set to 500 to ensure that the final matrix has reliable predicted scores. The final matrix obtained by the above calculation process can then be used to predict potential miRNA-disease associations.
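Putting the pieces together, the full loop with the relative-residual stopping test and the maxiter cap might look like the following NumPy sketch. It is a minimal illustration, not the published code: the function name `complete_matrix` and the boolean `mask` standing in for $\Omega$ are hypothetical, and `tau`, `delta` and the default `eps=1e-4` are placeholder assumptions rather than the values used in MCMDA. Only `maxiter=500` is taken from the text.

```python
import numpy as np

def complete_matrix(M, mask, tau, delta, eps=1e-4, maxiter=500):
    """Iterate X^k = D_tau(Y^{k-1}), Y^k = Y^{k-1} + delta * P_Omega(M - X^k).

    Stops when ||P_Omega(X^k - M)||_F / ||P_Omega(M)||_F <= eps or after
    maxiter iterations. Returns the last X and the number of iterations run.
    eps is an assumed placeholder; tau and delta must be supplied by the caller.
    """
    M = np.asarray(M, dtype=float)
    Y = np.zeros_like(M)                               # Y^0 is the zero matrix
    X = np.zeros_like(M)
    norm_known = np.linalg.norm(np.where(mask, M, 0.0))  # ||P_Omega(M)||_F
    k = 0
    for k in range(1, maxiter + 1):
        # Update X with Y: shrink the singular values of Y^{k-1} by tau.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Stopping criterion on the known entries only.
        residual = np.where(mask, X - M, 0.0)
        if np.linalg.norm(residual) / norm_known <= eps:
            break
        # Update Y with X: step along the residual on the known entries.
        Y = Y + delta * np.where(mask, M - X, 0.0)
    return X, k

if __name__ == "__main__":
    # Toy adjacency matrix; the mask marks which entries are treated as known.
    M_toy = np.array([[1.0, 1.0, 0.0],
                      [1.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
    mask_toy = np.ones_like(M_toy, dtype=bool)
    mask_toy[0, 1] = False                             # hide one entry to be predicted
    X_toy, iters = complete_matrix(M_toy, mask_toy, tau=1.0, delta=1.0)
    print(iters, X_toy[0, 1])
```

The residual is measured only on the entries in $\Omega$, so the loop can terminate while the scores of the unobserved pairs remain free to take whatever values the low-rank structure implies. Those free entries are the predictions.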
