
TLHNMDA: Triple Layer Heterogeneous Network Based Inference for MiRNA-Disease Association Prediction.

Xing Chen, Jia Qu, Jun Yin.

Abstract

In recent years, microRNAs (miRNAs) have been confirmed to be involved in many important biological processes and associated with various kinds of complex human diseases. Therefore, predicting potential associations between miRNAs and diseases from the large number of verified heterogeneous biological datasets will provide a new perspective for disease therapy. In this article, we developed a novel computational model of Triple Layer Heterogeneous Network based inference for MiRNA-Disease Association prediction (TLHNMDA), which integrates experimentally verified miRNA-disease associations, miRNA-long noncoding RNA (lncRNA) interactions, miRNA functional similarity information, disease semantic similarity information and Gaussian interaction profile kernel similarity for lncRNAs into a triple layer heterogeneous network to predict new miRNA-disease associations. The AUCs of TLHNMDA are 0.8795 and 0.8795 ± 0.0010 based on leave-one-out cross validation (LOOCV) and 5-fold cross validation, respectively. Furthermore, TLHNMDA was applied to three complex human diseases to evaluate its predictive ability: 84% (kidney neoplasms), 78% (lymphoma) and 76% (prostate neoplasms) of the top 50 predicted miRNAs for these diseases could be verified by biological experiments. In addition, based on the HMDD v1.0 database, 98% of the top 50 potential esophageal neoplasms-associated miRNAs were confirmed by experimental reports. It is expected that TLHNMDA could be a useful model to predict potential miRNA-disease associations with high prediction accuracy and stability.


Keywords:  association prediction; computational prediction model; disease; microRNA; triple layer heterogeneous network

Year:  2018        PMID: 30018632      PMCID: PMC6038677          DOI: 10.3389/fgene.2018.00234

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

According to the central dogma of molecular biology, genetic information was found to be stored in protein-coding genes (Crick et al., 1961). Recent studies have revealed that up to 70% of the human genome is transcribed into RNA, whereas protein-coding genes make up less than 2% of the total genome (Djebali et al., 2012). Most of the resulting transcripts are non-coding RNAs (ncRNAs) (Derrien et al., 2012). Based on whether transcript lengths are larger than 200 nucleotides, ncRNAs can be further divided into small ncRNAs and long ncRNAs (lncRNAs) (Kapranov et al., 2007; Guttman et al., 2013). MicroRNAs (miRNAs) are endogenous non-coding RNAs (~22 nt) that bind to the 3′-untranslated regions (3′-UTRs) of their target messenger RNAs (mRNAs) and control gene expression (Ganju et al., 2017). MiRNAs can also serve as positive regulators (Jopling et al., 2005; Vasudevan et al., 2007). Ample evidence indicates that thousands of miRNAs are associated with many critical biological processes (Lu et al., 2008), such as cell proliferation (Cheng et al., 2005), development (Karp and Ambros, 2005), metabolism (Alshalalfa and Alhajj, 2013), aging (Bartel, 2009), transduction (Cui et al., 2006), viral infection (Miska, 2005), and so on. Some researchers also found that allogeneic T cell responses are regulated by miRNAs (Sun et al., 2013). It has also been shown that, by attenuating shared miRNAs, competing endogenous RNAs (ceRNAs) can crosstalk and regulate each other, which is essential for regulating many biological functions (Yuan et al., 2016). Moreover, miRNA34s might be key effectors of p53 tumor-suppressor function, and their inactivation might contribute to certain cancers (Bommer et al., 2007). Recently, experiments further showed that a special class of 5′-capped pre-miRNAs exists in both C. elegans and mouse, which advances the understanding of the transcriptional regulation of miRNA genes themselves (Chen et al., 2017a). 
Therefore, it is no wonder that miRNAs are closely connected with diverse human cancer types, including breast neoplasms, lung neoplasms, colon neoplasms, kidney neoplasms, lymphoma, etc. (Pasquier and Gardès, 2016). For example, studies have indicated that miR-16-1 and miR-15a could cause chromosomal translocations in patients with chronic lymphocytic leukemia (CLL) (Calin et al., 2002). Experiments have further shown that miRNAs may be a new target for the molecular targeted therapy of various cancers (Guzzi et al., 2015; Chen et al., 2017b). Thus, the identification of disease-associated miRNAs can provide a new perspective on the diagnosis, prevention and treatment of complex human diseases (Chen, 2016). However, identifying miRNA-disease associations with traditional biological methods is usually time-consuming and expensive. Therefore, more and more researchers have focused on developing efficient computational models that predict potential miRNA-disease associations by integrating various experimentally validated datasets. The databases HMDD and miR2Disease (Jiang et al., 2009; Li et al., 2014c) have been constructed to collect the associations between human miRNAs and diseases reported in previous biological experiments. Based on the assumption that functionally similar miRNAs tend to be associated with phenotypically similar diseases (Lu et al., 2008; Bandyopadhyay et al., 2010), several computational approaches have been established to infer new miRNA-disease associations. Mork et al. (2014) introduced a computational model named miRPD, which identified potential miRNA–disease associations by systematically combining known miRNA-protein associations with known protein-disease associations. Shi et al. (2013) established a computational framework based on the assumption that miRNAs whose target genes are associated with specific diseases are more likely to be related to these diseases. 
They constructed protein-protein interaction (PPI) networks and implemented a random walk on the network to calculate the probability score of each miRNA-disease pair. Xu et al. (2011) introduced an approach to infer novel human miRNA-disease associations by combining computational target prediction with expression profiles of miRNAs and mRNAs in tumor and non-tumor tissues. In that model, the probability score of each miRNA-disease pair is converted into a functional similarity calculation between miRNA targets and known disease-related genes. More importantly, the model can predict miRNA-disease associations without relying on known miRNA-disease associations. Jiang et al. (2010) proposed a computational model based on the hypergeometric distribution to predict new disease-associated miRNAs by systematically integrating a miRNA functional similarity network, a disease phenotype similarity network, and an experimentally verified disease-miRNA association network. However, the molecular basis of less than 40% of human diseases is known, and the miRNA-target interaction datasets used in the above studies were not highly accurate, which may limit the application of these methods. Researchers have also proposed methods that do not rely on miRNA-target interaction datasets. For example, Chen et al. (2012b) developed Random Walk with Restart for MiRNA–Disease Association (RWRMDA) to identify new disease-associated miRNAs by applying a similarity-based RWR on a miRNA functional similarity network. Xuan et al. (2015) proposed MIRNAs associated with Diseases Prediction (MIDP) to predict new miRNA candidates using a random walk. They built a miRNA network derived from miRNA-associated diseases by integrating node similarities, node prior information and local topological structure. 
The potential association between a disease and a miRNA could then be inferred once the iterative walking process on the network converged. Xuan et al. (2013) further proposed an effective computational approach, HDMP, which comprehensively integrates miRNA functional similarity and the distribution of disease-associated miRNAs among the k most similar neighbors to score new miRNA-disease associations. Li et al. (2017) developed Matrix Completion for MiRNA-Disease Association prediction (MCMDA), a computational method that updates the score of each pair using a matrix completion algorithm; the model efficiently updates the low-rank miRNA-disease association matrix. Chen and Yan (2014) reported a method named Regularized Least Squares for MiRNA-Disease Association prediction (RLSMDA), a semi-supervised classifier built on miRNA functional similarity, disease semantic similarity and known human miRNA-disease associations. Recently, Chen et al. (2016a) introduced Within and Between Score for MiRNA-Disease Association prediction (WBSMDA), which combines integrated similarity with known miRNA-disease associations. The model built two prediction functions, one from the perspective of diseases and one from the perspective of miRNAs, according to the idea that functionally similar miRNAs tend to be associated with similar diseases, and combined them to calculate the association probability of each miRNA-disease pair. Chen et al. (2016b) further developed Heterogeneous Graph Inference for MiRNA-Disease Association prediction (HGIMDA), in which a heterogeneous network was constructed on the basis of miRNA functional similarity, disease semantic similarity and known miRNA-disease associations, and an iterative update equation that propagates information across the heterogeneous network was established to infer new disease-associated miRNAs. 
A deep ensemble miRNA-disease association prediction (DeepMDA) framework was also introduced by Fu and Peng (2017) to identify potential miRNA-disease associations using a three-layer neural network classifier based on high-level features extracted from miRNA and disease similarity. Other computational models for identifying miRNA-disease associations have also been proposed. For example, Liu et al. (2016) predicted miRNA-disease associations by implementing a random walk on a heterogeneous network with multiple data sources. Zou et al. (2015) introduced two computational methods, KATZ and CATAPULT, to predict miRNA-disease pairs based on social network analysis. Pallez et al. (2017) presented a predictive approach named MiRAI that uses an evolutionarily tuned latent semantic analysis. Pasquier and Gardès (2016) predicted miRNA-disease associations with a vector space model. As these studies show, an integration strategy may provide more comprehensive and accurate information for predicting disease-related miRNAs. In fact, miRNA dysregulation is related to many human diseases through many factors, including miRNA-mRNA interactions, miRNA-lncRNA interactions, miRNA-protein interactions and so on. The relationships of miRNAs with genes, coding RNAs, and proteins have been widely used in other computational models for the identification of miRNA-disease associations (Shi et al., 2013; Mork et al., 2014). In this paper, considering that many miRNA-lncRNA interactions have been confirmed by recent biological experiments (Li et al., 2014a), we introduced the model of Triple Layer Heterogeneous Network based inference for MiRNA-Disease Association prediction (TLHNMDA) to identify potential biological links between miRNAs and diseases by integrating multi-level data on miRNAs, diseases, lncRNAs and their associations into a triple layer heterogeneous network. 
We implemented leave-one-out cross validation (LOOCV) and 5-fold cross validation on TLHNMDA to evaluate its performance. The AUC in LOOCV was 0.8795, and the model obtained an average AUC of 0.8795 ± 0.0010 in 5-fold cross validation. Case studies of kidney neoplasms, prostate neoplasms and lymphoma were then carried out to assess the independent prediction performance of the model. As a result, 42, 38, and 39 of the top 50 potential miRNAs for these three important diseases, respectively, were confirmed in the dbDEMC (Yang et al., 2010) and miR2Disease (Jiang et al., 2009) databases. We further tested TLHNMDA on HMDD v1.0 (Lu et al., 2008) to see whether it still performs well. Taking esophageal neoplasms as an example, 49 of the top 50 predicted esophageal neoplasms-associated miRNAs were verified by experimental reports. These results indicate that TLHNMDA is reliable and effective in predicting potential disease-associated miRNAs.

Materials and methods

Human miRNA-disease association

In this paper, the known dataset of human miRNA–disease associations was downloaded from the HMDD v2.0 database. The dataset contains 383 diseases, 495 miRNAs and 5430 high-quality experimentally verified human miRNA-disease associations. An adjacency matrix A was established to denote the known miRNA-disease associations: its rows represent diseases and its columns represent miRNAs. We used the variables nm and nd to represent the number of miRNAs and diseases in the dataset, respectively. The value of A(d(i), m(j)) is 1 when disease d(i) is associated with miRNA m(j), otherwise 0.
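As a minimal illustration of this construction, the Python sketch below builds a binary adjacency matrix from a list of (disease, miRNA) pairs. The function name `build_adjacency` and the three toy pairs are assumptions made for the example; they are not the HMDD v2.0 data or the authors' code.

```python
def build_adjacency(pairs, row_items, col_items):
    """Return a 0/1 matrix with one row per row item and one column per column item."""
    row_index = {name: i for i, name in enumerate(row_items)}
    col_index = {name: j for j, name in enumerate(col_items)}
    matrix = [[0] * len(col_items) for _ in row_items]
    for row_name, col_name in pairs:
        matrix[row_index[row_name]][col_index[col_name]] = 1
    return matrix


# Toy input (illustrative only): rows are diseases, columns are miRNAs.
diseases = ["kidney neoplasms", "lymphoma"]
mirnas = ["hsa-mir-21", "hsa-mir-150", "hsa-mir-210"]
known_pairs = [
    ("kidney neoplasms", "hsa-mir-210"),
    ("lymphoma", "hsa-mir-150"),
    ("lymphoma", "hsa-mir-21"),
]

A = build_adjacency(known_pairs, diseases, mirnas)
nd, nm = len(A), len(A[0])
```

With the real dataset, `A` would have nd = 383 rows and nm = 495 columns.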

miRNA-lncRNA interactions

The dataset of miRNA-lncRNA interactions was obtained from the starBase v2.0 database (Li et al., 2014a), which provides the most comprehensive set of experimentally confirmed miRNA–lncRNA interactions. The dataset consists of 10112 known miRNA-lncRNA interactions involving 132 miRNAs and 1114 lncRNAs. Interactions whose miRNAs do not appear in the dataset of known miRNA-disease associations described above were deleted, leaving 9088 miRNA-lncRNA interactions. We also constructed an adjacency matrix B to represent the known miRNA-lncRNA interactions: its rows represent miRNAs and its columns represent lncRNAs. The variable nl represents the number of lncRNAs in the dataset. If miRNA m(i) interacts with lncRNA l(j), the value of B(m(i), l(j)) is 1, otherwise 0.
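The filtering step can be sketched as follows. This is an illustration under assumptions: the identifiers are invented, and the choice to keep one row in B for every miRNA of the disease dataset (so that B lines up with the columns of A) is an assumption that the text does not state explicitly.

```python
def build_interaction_matrix(interactions, disease_mirnas):
    """Drop interactions whose miRNA is absent from the disease dataset, then build B."""
    allowed = set(disease_mirnas)
    kept = [(m, l) for m, l in interactions if m in allowed]
    lncrnas = sorted({l for _, l in kept})
    row_index = {m: i for i, m in enumerate(disease_mirnas)}
    col_index = {l: j for j, l in enumerate(lncrnas)}
    matrix = [[0] * len(lncrnas) for _ in disease_mirnas]
    for m, l in kept:
        matrix[row_index[m]][col_index[l]] = 1
    return matrix, lncrnas, kept


# Hypothetical identifiers, for illustration only.
disease_mirnas = ["hsa-mir-21", "hsa-mir-150"]
raw_interactions = [
    ("hsa-mir-21", "lncRNA-A"),
    ("hsa-mir-999", "lncRNA-B"),  # miRNA not in the disease dataset: removed
    ("hsa-mir-150", "lncRNA-A"),
    ("hsa-mir-150", "lncRNA-C"),
]

B, lncrnas, kept = build_interaction_matrix(raw_interactions, disease_mirnas)
nl = len(lncrnas)
```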

miRNA functional similarity

Wang et al. (2010) introduced a computational method for the functional similarity of a miRNA pair, m(i) and m(j). The method can be divided into four steps. First, the disease sets D(m(i)) and D(m(j)), i.e., the diseases related to m(i) and m(j), respectively, are identified. Second, the semantic values of all diseases in both sets are calculated according to the corresponding DAGs. Third, the semantic similarity of each disease pair between D(m(i)) and D(m(j)) is computed from their semantic values. In the last step, the functional similarity between m(i) and m(j) is calculated from the semantic similarities obtained in step three. The miRNA functional similarity scores can be downloaded from http://www.cuilab.cn/files/images/cuilab/misim.zip. We built matrix FS to represent the miRNA functional similarity, where FS(m(i), m(j)) is the functional similarity score between miRNAs m(i) and m(j).

Disease semantic similarity model 1

Each disease can be described as a Directed Acyclic Graph (DAG). For example, disease D can be denoted as DAG(D) = (D, T(D), E(D)), where T(D) is the set containing node D itself and its ancestor nodes, and E(D) is the set of edges between parent and child nodes (Wang et al., 2010). The semantic value of disease D can be calculated as follows:

$$DV1(D) = \sum_{d \in T(D)} D1_D(d)$$

$$D1_D(d) = \begin{cases} 1 & \text{if } d = D \\ \max\{\Delta \cdot D1_D(d') \mid d' \in \text{children of } d\} & \text{if } d \neq D \end{cases}$$

where Δ is the semantic contribution factor. The contribution of disease D to its own semantic value is 1. As the distance between D and d increases, the semantic contribution of disease d to D decreases. Thus, diseases in the same layer make the same contribution to the semantic value of disease D. The semantic similarity between diseases d(i) and d(j) in disease semantic similarity model 1 is defined as follows:

$$SS1(d(i), d(j)) = \frac{\sum_{t \in T(d(i)) \cap T(d(j))} \left( D1_{d(i)}(t) + D1_{d(j)}(t) \right)}{DV1(d(i)) + DV1(d(j))}$$
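The model 1 computation can be sketched on a toy DAG as follows. The sketch assumes the usual form of this measure, in which an ancestor receives Δ times the largest contribution among its children and the similarity is the shared contribution divided by the sum of the two semantic values. The value Δ = 0.5, the node names and the `parents` mapping are assumptions for illustration.

```python
def contributions(disease, parents, delta=0.5):
    """Semantic contribution of each node in T(disease) to `disease`."""
    contrib = {disease: 1.0}
    frontier = [disease]
    while frontier:
        child = frontier.pop()
        for parent in parents.get(child, []):
            value = delta * contrib[child]
            # Keep the largest contribution reachable through any child.
            if value > contrib.get(parent, 0.0):
                contrib[parent] = value
                frontier.append(parent)
    return contrib


def semantic_similarity_1(d_i, d_j, parents, delta=0.5):
    c_i = contributions(d_i, parents, delta)
    c_j = contributions(d_j, parents, delta)
    shared = set(c_i) & set(c_j)
    numerator = sum(c_i[t] + c_j[t] for t in shared)
    return numerator / (sum(c_i.values()) + sum(c_j.values()))


# Hypothetical DAG: each node maps to its direct parents.
parents = {"leaf_a": ["mid"], "leaf_b": ["mid"], "mid": ["root"]}

ss_ab = semantic_similarity_1("leaf_a", "leaf_b", parents)
ss_aa = semantic_similarity_1("leaf_a", "leaf_a", parents)
```

Here each leaf has semantic value 1 + 0.5 + 0.25 = 1.75, and the two leaves share the ancestors `mid` and `root`, so `ss_ab` is 1.5 / 3.5.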

Disease semantic similarity model 2

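In disease semantic similarity model 2, different disease terms in the same layer of DAG(D) may appear in different numbers of disease DAGs, and a more specific disease, which appears in fewer disease DAGs, should contribute more to the semantic value of disease D. Therefore, the contribution of disease d to the semantic value of disease D is calculated as follows:

$$D2_D(d) = -\log\left(\frac{\text{number of DAGs including } d}{\text{number of diseases}}\right)$$

In disease semantic similarity model 2, the semantic similarity between d(i) and d(j) is defined as follows:

$$SS2(d(i), d(j)) = \frac{\sum_{t \in T(d(i)) \cap T(d(j))} \left( D2_{d(i)}(t) + D2_{d(j)}(t) \right)}{DV2(d(i)) + DV2(d(j))}$$

where DV2(D) is the sum of D2_D(d) over all d in T(D), analogous to the semantic value in model 1.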
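A sketch of model 2 on the same kind of toy DAG is given below. It assumes the contribution of a term is the negative logarithm of the fraction of disease DAGs that contain it, which matches the description that rarer terms contribute more. The node names and the zero-denominator guard are assumptions for illustration.

```python
import math


def ancestors_with_self(disease, parents):
    """Return T(disease): the disease plus all of its ancestors."""
    seen = {disease}
    stack = [disease]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def model2_contributions(diseases, parents):
    """Contribution of each term: -log(fraction of disease DAGs containing it)."""
    counts = {}
    for disease in diseases:
        for node in ancestors_with_self(disease, parents):
            counts[node] = counts.get(node, 0) + 1
    return {node: -math.log(count / len(diseases)) for node, count in counts.items()}


def semantic_similarity_2(d_i, d_j, parents, contrib):
    t_i = ancestors_with_self(d_i, parents)
    t_j = ancestors_with_self(d_j, parents)
    # In model 2 a term's contribution does not depend on the disease it is
    # contributing to, so each shared term counts twice in the numerator.
    numerator = sum(2 * contrib[t] for t in t_i & t_j)
    denominator = sum(contrib[t] for t in t_i) + sum(contrib[t] for t in t_j)
    return numerator / denominator if denominator else 0.0


# Hypothetical DAG in which every node is itself a disease.
parents = {"leaf_a": ["mid"], "leaf_b": ["mid"], "mid": ["root"]}
diseases = ["leaf_a", "leaf_b", "mid", "root"]

contrib2 = model2_contributions(diseases, parents)
ss2_ab = semantic_similarity_2("leaf_a", "leaf_b", parents, contrib2)
ss2_aa = semantic_similarity_2("leaf_a", "leaf_a", parents, contrib2)
```

In this toy example the root appears in every DAG and therefore contributes nothing, while the leaves, which appear in only one DAG each, contribute the most.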

Gaussian interaction profile kernel similarity

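Gaussian interaction profile kernel similarity for diseases is defined from the known miRNA-disease associations under the assumption that similar diseases tend to be related to more common miRNAs. The binary vector IP(d(u)), the uth row of matrix A, denotes the interaction profile between disease d(u) and each miRNA. The Gaussian interaction profile kernel similarity between diseases d(u) and d(v) is defined as follows:

$$KD(d(u), d(v)) = \exp\left(-\gamma_d \lVert IP(d(u)) - IP(d(v)) \rVert^2\right)$$

$$\gamma_d = \gamma'_d \Big/ \left( \frac{1}{nd} \sum_{u=1}^{nd} \lVert IP(d(u)) \rVert^2 \right)$$

where the parameter γ_d controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter γ'_d by the average number of associated miRNAs over all diseases. Similarly, with IP(m(i)) the ith column of matrix A, the Gaussian interaction profile kernel similarity between miRNAs m(i) and m(j) is defined as follows:

$$KM(m(i), m(j)) = \exp\left(-\gamma_m \lVert IP(m(i)) - IP(m(j)) \rVert^2\right)$$

$$\gamma_m = \gamma'_m \Big/ \left( \frac{1}{nm} \sum_{i=1}^{nm} \lVert IP(m(i)) \rVert^2 \right)$$

Gaussian interaction profile kernel similarity for lncRNAs l(i) and l(j) is calculated in the same way, with IP(l(i)) the ith column of matrix B:

$$KL(l(i), l(j)) = \exp\left(-\gamma_l \lVert IP(l(i)) - IP(l(j)) \rVert^2\right)$$

$$\gamma_l = \gamma'_l \Big/ \left( \frac{1}{nl} \sum_{i=1}^{nl} \lVert IP(l(i)) \rVert^2 \right)$$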
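The kernel can be sketched in a few lines of Python. The sketch assumes the standard Gaussian interaction profile form, exp(−γ‖IP(u) − IP(v)‖²), with γ equal to a base bandwidth divided by the mean squared norm of the profiles. The base bandwidth of 1.0 and the toy matrix are assumptions, and at least one known association is required so that the normalization is defined.

```python
import math


def gaussian_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    n = len(profiles)
    # For binary profiles the squared norm is the number of known associations.
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[u], profiles[v]))
            kernel[u][v] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix: 2 diseases (rows) x 3 miRNAs (columns).
A = [[1, 1, 0],
     [1, 0, 0]]

KD = gaussian_kernel(A)                              # disease kernel uses rows of A
KM = gaussian_kernel([list(c) for c in zip(*A)])     # miRNA kernel uses columns of A
```

The lncRNA kernel would be obtained in the same way from the columns of the miRNA-lncRNA matrix B.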

Integrated similarity for miRNAs and diseases

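Here, the integrated miRNA similarity matrix SM is defined on the basis of miRNA functional similarity and Gaussian interaction profile kernel similarity for miRNAs, and the integrated disease similarity matrix SD is constructed from disease semantic similarity and Gaussian interaction profile kernel similarity for diseases:

$$SM(m(i), m(j)) = \begin{cases} FS(m(i), m(j)) & \text{if } m(i) \text{ and } m(j) \text{ have functional similarity} \\ KM(m(i), m(j)) & \text{otherwise} \end{cases}$$

$$SD(d(i), d(j)) = \begin{cases} SS(d(i), d(j)) & \text{if } d(i) \text{ and } d(j) \text{ have semantic similarity} \\ KD(d(i), d(j)) & \text{otherwise} \end{cases}$$

where

$$SS(d(i), d(j)) = \frac{SS1(d(i), d(j)) + SS2(d(i), d(j))}{2}$$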
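One way to combine the two sources of similarity is sketched below. It assumes the rule commonly used for this kind of integration: take the functional (or averaged semantic) similarity where it is available and fall back to the Gaussian kernel value otherwise. Marking a missing similarity with `None` is an assumption of this sketch, as are the toy values.

```python
def average_similarity(ss1, ss2):
    """Element-wise mean of two similarity matrices; None marks a missing value."""
    return [
        [None if a is None or b is None else (a + b) / 2 for a, b in zip(r1, r2)]
        for r1, r2 in zip(ss1, ss2)
    ]


def integrate(primary, kernel):
    """Use the primary similarity where it exists, else the kernel similarity."""
    return [
        [k if p is None else p for p, k in zip(prow, krow)]
        for prow, krow in zip(primary, kernel)
    ]


# Toy 2 x 2 inputs (illustrative values only).
FS = [[1.0, None], [None, 1.0]]      # functional similarity missing off-diagonal
KM = [[1.0, 0.3], [0.3, 1.0]]
SS1 = [[1.0, 0.4], [0.4, 1.0]]
SS2 = [[1.0, 0.2], [0.2, 1.0]]
KD = [[1.0, 0.6], [0.6, 1.0]]

SM = integrate(FS, KM)
SD = integrate(average_similarity(SS1, SS2), KD)
```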

TLHNMDA

According to the guilt-by-association principle (Barabási et al., 2011), new miRNA–disease associations can be inferred through existing associations between similar miRNAs and similar diseases. Likewise, novel miRNA-lncRNA interactions can be inferred through existing interactions between similar miRNAs and lncRNAs (see Figure 1). We infer new associations in the proposed triple layer heterogeneous network by using an information flow-based method. The new disease-lncRNA association matrix W_dl is constructed by Equation (15) from the miRNA-disease associations W_dm, the miRNA-lncRNA interactions W_ml and the integrated miRNA similarity SM. Once the associations between diseases and lncRNAs are established, new associations between diseases and miRNAs can be defined from them by Equation (16), which is potentially more powerful in capturing miRNA-disease associations because it incorporates lncRNA information into miRNA-disease prediction. As a by-product of the model, a new interaction score for each miRNA-lncRNA pair is obtained by Equation (17) from the miRNA-disease associations W_dm, the disease-lncRNA associations W_dl and the integrated disease similarity SD; in these equations, the superscript T indicates the transpose of the corresponding matrix.
Figure 1

Flowchart of potential disease-miRNA association prediction based on the computational model of TLHNMDA: (A) Constructing miRNA-disease association matrices, miRNA-lncRNA interaction matrices and obtaining integrated similarity network by combining miRNA functional similarity, disease semantic similarity, Gaussian interaction profile kernel similarity; (B) Constructing a triple layer heterogeneous network and predicting potential miRNA-disease associations based on an iterative equation to obtain the stable association probability.

We treat the disease-lncRNA association matrix as a temporary value and substitute Equation (15) for it in Equations (16) and (17). Once the new miRNA-disease associations and new miRNA-lncRNA interactions were obtained, we established an iterative updating procedure based on Equations (18, 19). In the final computational model, α is a decay factor in the range (0,1), and A and B represent the initial disease–miRNA associations and miRNA–lncRNA interactions, respectively. With proper normalization by Equations (24, 25), the disease-miRNA association matrix and the miRNA-lncRNA interaction matrix converge (Wang et al., 2013c) (the proof can be found in the Supplementary Materials). The iteration is considered stable when the change between successive iterates, measured by the L1 norm, is less than a given cutoff (10^−6 in this paper). The three-layer model incorporates miRNA-lncRNA information into miRNA-disease association prediction because miRNA dysregulation may be associated with many complex human diseases through miRNA-lncRNA interactions. As the two iterative equations show, once a new association between a miRNA and a disease is estimated, it can be used to update other miRNA-disease associations and miRNA-lncRNA interactions. Similarly, once a new interaction between a miRNA and a lncRNA is estimated, it can also be used to update other miRNA-disease associations and miRNA-lncRNA interactions. 
Therefore, the layer between miRNAs and diseases and the layer between miRNAs and lncRNAs play equally important roles in the triple layer heterogeneous network, propagating information to identify potential miRNA-disease associations and miRNA-lncRNA interactions simultaneously. To make the two iterative equations work effectively, the known miRNA-disease associations and known miRNA-lncRNA interactions were added to the update equations as weighted terms, because the initial links deserve more credibility. Finally, both matrices are expected to converge, which means that the propagation of information becomes stable.
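The text above names the ingredients of the iteration (a decay factor α, the initial matrices A and B as restart terms, a normalization step and an L1 stopping rule) but the update equations themselves are not shown here. The Python sketch below is therefore only one plausible instantiation, under loudly stated assumptions: the disease-lncRNA matrix is taken as W_dm·SM·W_ml, the disease-miRNA update as W_dl·SL·W_mlᵀ, the miRNA-lncRNA update as W_dmᵀ·SD·W_dl, and each inferred matrix is rescaled by its largest entry. The lncRNA similarity `SL`, the rescaling, α = 0.5 and all helper names are hypothetical choices, not the authors' Equations (15) to (25).

```python
def matmul(x, y):
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def scale_to_unit(x):
    """Assumed normalization: divide by the largest entry so values lie in [0, 1]."""
    peak = max(max(row) for row in x)
    return [[v / peak for v in row] for row in x] if peak > 0 else x


def blend(alpha, inferred, initial):
    return [[alpha * i + (1 - alpha) * s for i, s in zip(ri, rs)]
            for ri, rs in zip(inferred, initial)]


def l1_change(x, y):
    return sum(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def propagate(A, B, SM, SD, SL, alpha=0.5, cutoff=1e-6, max_iter=1000):
    w_dm, w_ml = A, B
    for _ in range(max_iter):
        w_dl = matmul(matmul(w_dm, SM), w_ml)                # disease x lncRNA
        new_dm = matmul(matmul(w_dl, SL), transpose(w_ml))   # disease x miRNA
        new_ml = matmul(matmul(transpose(w_dm), SD), w_dl)   # miRNA x lncRNA
        next_dm = blend(alpha, scale_to_unit(new_dm), A)
        next_ml = blend(alpha, scale_to_unit(new_ml), B)
        change = max(l1_change(next_dm, w_dm), l1_change(next_ml, w_ml))
        w_dm, w_ml = next_dm, next_ml
        if change < cutoff:
            break
    return w_dm, w_ml


# Toy inputs: 2 diseases, 3 miRNAs, 2 lncRNAs (illustrative values only).
A = [[1, 0, 0], [0, 1, 1]]
B = [[1, 0], [0, 1], [1, 1]]
SM = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
SD = [[1.0, 0.4], [0.4, 1.0]]
SL = [[1.0, 0.6], [0.6, 1.0]]

scores_dm, scores_ml = propagate(A, B, SM, SD, SL)
```

Because the restart terms carry weight 1 − α, every known association keeps a score of at least 1 − α in this sketch, which reflects the stated intent that the initial links deserve more credibility.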

Results

Performance evaluation

We implemented LOOCV as well as 5-fold cross validation on the experimentally verified miRNA-disease associations in the HMDD v2.0 database (Li et al., 2014c) to evaluate the prediction performance of TLHNMDA. Moreover, TLHNMDA was compared with four previous classical computational methods: RLSMDA (Chen and Yan, 2014), HDMP (Xuan et al., 2013), WBSMDA (Chen et al., 2016a), and RKNNMDA (Chen et al., 2017c). In LOOCV, each known miRNA-disease association in the database was considered as the test sample in turn, the other known miRNA-disease associations were considered as training samples, and the miRNA-disease pairs without known verified associations were regarded as candidate samples. After TLHNMDA was run, the score of the test sample was compared with the scores of all the candidate samples. In 5-fold cross validation, the experimentally verified miRNA-disease associations were evenly divided into five disjoint parts. Each part was selected in turn as the test samples while the other four parts were regarded as training samples. Similarly, the miRNA-disease pairs without known association evidence were regarded as candidate samples, and the score of each test sample was compared with the scores of all the candidate samples. This process was repeated 100 times, yielding 100 rankings for all miRNA-disease pairs. It is worth noting that almost all models that predict miRNA-disease associations under the assumption that miRNAs with similar functions tend to be related to phenotypically similar diseases were evaluated with LOOCV and 5-fold cross validation (Mork et al., 2014; Xuan et al., 2015; You et al., 2017; Zhong et al., 2017). 
Finally, we drew the Receiver Operating Characteristic (ROC) curve by plotting the true positive rate (TPR, sensitivity) against the false positive rate (FPR, 1 − specificity) at different thresholds to evaluate the performance of TLHNMDA. Sensitivity refers to the percentage of positive miRNA-disease associations ranked higher than the preset threshold, while specificity refers to the percentage of negative miRNA-disease pairs ranked lower than the threshold. The area under the ROC curve (AUC) was then calculated to evaluate the prediction performance of the model: an AUC of 1 indicates perfect prediction performance, and an AUC of 0.5 indicates random prediction performance. In LOOCV, TLHNMDA, RLSMDA, HDMP, WBSMDA, and RKNNMDA obtained AUCs of 0.8795, 0.8426, 0.8366, 0.8030 and 0.7159, respectively (see Figure 2). In 5-fold cross validation, TLHNMDA, RLSMDA, HDMP, WBSMDA, and RKNNMDA obtained average AUCs and corresponding standard deviations of 0.8795 ± 0.0010, 0.8569 ± 0.0020, 0.8342 ± 0.0010, 0.8185 ± 0.0009, and 0.6723 ± 0.0027, respectively.
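The evaluation protocol can be illustrated with two small helpers. This is a simplified sketch, not the authors' evaluation script: `five_fold_splits` shows one way to divide the known associations into five disjoint parts, and `roc_auc` computes the area under the ROC curve as the probability that a test sample outscores a candidate sample (ties counted as one half), which equals the area obtained by sweeping a threshold over pooled scores. Pooling the scores of all test samples is a simplifying assumption, and the toy scores are invented.

```python
import random


def five_fold_splits(n_samples, seed=0):
    """Shuffle sample indices and deal them into five disjoint parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::5] for k in range(5)]


def roc_auc(test_scores, candidate_scores):
    """Probability that a test (positive) sample is ranked above a candidate."""
    wins = 0.0
    for t in test_scores:
        for c in candidate_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(test_scores) * len(candidate_scores))


folds = five_fold_splits(12)
auc_perfect = roc_auc([0.9, 0.8], [0.2, 0.1])
auc_partial = roc_auc([0.9, 0.4], [0.5, 0.1])
```

In the partial example, three of the four test-versus-candidate comparisons are won by the test sample, so the AUC is 0.75.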
Figure 2

Comparison between TLHNMDA, RLSMDA, HDMP, WBSMDA, and RKNNMDA in terms of ROC curve and AUC based on LOOCV. TLHNMDA, RLSMDA, HDMP, WBSMDA, and RKNNMDA achieved AUCs of 0.8795, 0.8426, 0.8366, 0.8030, and 0.7159 in LOOCV, respectively; TLHNMDA outperforms the other models.


Case studies

Here, to evaluate the prediction accuracy of TLHNMDA, case studies were carried out on kidney neoplasms, lymphoma and prostate neoplasms. The 5430 known miRNA-disease associations in HMDD v2.0 were used as the training set. All candidate miRNAs for each disease of interest were ranked according to their predicted scores. The top 50 predicted miRNAs were then picked out and verified in two other important miRNA-disease association databases (i.e., dbDEMC and miR2Disease). Of the 5430 known miRNA-disease associations in HMDD v2.0, 232 also existed in miR2Disease and 546 also existed in dbDEMC. It is noteworthy that there was no overlap between the training samples and the prediction lists, because only candidate miRNAs (miRNAs without any known association with the disease of interest in HMDD v2.0) were ranked and verified in the case studies. Accordingly, none of the top 50 predicted miRNA-disease pairs existed in HMDD v2.0, and the verification of miRNAs in the prediction lists was completely independent of HMDD v2.0. Kidney neoplasms, also known as renal cancer, are a common type of cancer (Manojlovi et al., 1986). They can occur at any age but are most frequent between 50 and 70 years of age (Nickerson et al., 2002). The most common symptoms are lumbar pain and hematuria (Duque et al., 1998). Existing treatments for kidney neoplasms are mainly radiation therapy and chemotherapy, which are not very effective (Zbar et al., 2003). To date, many miRNAs have been reported to be associated with kidney neoplasms. For example, miRNA-192, miRNA-194, miRNA-215, miRNA-200c, and miRNA-141 were shown to be associated with childhood renal neoplasms (Senanayake et al., 2012). MiRNA-210 was reported to be upregulated in renal neoplasms (Eilertsen et al., 2014). 
Another miRNA, miRNA-23b, can act as an oncogene, and reducing its expression would be an effective way to inhibit the growth of kidney tumors, which might contribute to the treatment of renal neoplasms (Liu et al., 2010). In the case study, we applied TLHNMDA to kidney neoplasms to predict potential associated miRNAs. In short, 8 of the top 10 and 42 of the top 50 newly identified miRNAs associated with kidney neoplasms were validated by the two databases dbDEMC and miR2Disease (see Table 1).
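The case-study procedure (rank only candidate miRNAs, take the top k, then check them against independent evidence) can be sketched as follows. The function names, the placeholder miRNA names and the scores are assumptions for illustration and do not come from the study's data.

```python
def top_k_candidates(scores, known, names, k=50):
    """Rank miRNAs with no known association to the disease and keep the top k."""
    candidates = [
        (score, name)
        for score, is_known, name in zip(scores, known, names)
        if not is_known
    ]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in candidates[:k]]


def confirmed_fraction(predicted, evidence):
    """Share of predicted miRNAs that appear in an independent evidence set."""
    return sum(1 for name in predicted if name in evidence) / len(predicted)


# Placeholder data for one disease (illustrative only).
names = ["mir-a", "mir-b", "mir-c", "mir-d"]
scores = [0.9, 0.2, 0.7, 0.5]
known = [1, 0, 0, 0]          # mir-a is already known, so it is never ranked
evidence = {"mir-c"}          # stands in for entries found in dbDEMC or miR2Disease

top2 = top_k_candidates(scores, known, names, k=2)
fraction = confirmed_fraction(top2, evidence)
```

Excluding the known associations before ranking is what keeps the prediction list disjoint from the training data.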
Table 1

Prediction of the top 50 predicted miRNAs associated with kidney neoplasms based on known associations in HMDD v2.0 database.

| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-16 | dbDEMC | hsa-mir-20a | dbDEMC, miR2Disease |
| hsa-mir-15b | dbDEMC | hsa-mir-539 | unconfirmed |
| hsa-mir-195 | dbDEMC | hsa-mir-26a | dbDEMC, miR2Disease |
| hsa-mir-424 | dbDEMC, miR2Disease | hsa-mir-27b | dbDEMC |
| hsa-mir-497 | dbDEMC | hsa-mir-34a | dbDEMC |
| hsa-mir-103a | unconfirmed | hsa-mir-17 | miR2Disease |
| hsa-mir-485 | unconfirmed | hsa-mir-29b | dbDEMC, miR2Disease |
| hsa-mir-23a | dbDEMC | hsa-mir-125b | unconfirmed |
| hsa-mir-214 | dbDEMC, miR2Disease | hsa-mir-143 | dbDEMC |
| hsa-mir-155 | dbDEMC | hsa-mir-128 | dbDEMC |
| hsa-mir-107 | dbDEMC | hsa-mir-320a | unconfirmed |
| hsa-mir-590 | unconfirmed | hsa-mir-708 | unconfirmed |
| hsa-mir-19a | dbDEMC | hsa-mir-124 | dbDEMC |
| hsa-mir-125a | dbDEMC | hsa-mir-149 | dbDEMC |
| hsa-mir-142 | unconfirmed | hsa-mir-199a | dbDEMC, miR2Disease |
| hsa-mir-19b | dbDEMC, miR2Disease | hsa-mir-34c | dbDEMC |
| hsa-mir-138 | dbDEMC | hsa-mir-181a | dbDEMC |
| hsa-mir-26b | dbDEMC | hsa-mir-152 | dbDEMC |
| hsa-mir-150 | dbDEMC, miR2Disease | hsa-mir-106a | dbDEMC, miR2Disease |
| hsa-mir-29c | dbDEMC, miR2Disease | hsa-mir-18a | dbDEMC |
| hsa-mir-370 | dbDEMC | hsa-mir-181b | dbDEMC |
| hsa-mir-31 | dbDEMC | hsa-mir-193a | dbDEMC |
| hsa-mir-185 | dbDEMC, miR2Disease | hsa-mir-7 | dbDEMC, miR2Disease |
| hsa-mir-24 | dbDEMC | hsa-mir-122 | dbDEMC, miR2Disease |
| hsa-mir-29a | dbDEMC, miR2Disease | hsa-mir-106b | dbDEMC, miR2Disease |

The first column records top 1–25 related miRNAs. The second column records the top 26–50 related miRNAs.

Lymphoma is the fastest growing human tumor (Chen et al., 2013); it is a group of blood cell tumors that develop from lymphocytes (a type of white blood cell). The disease consists of two categories: Hodgkin lymphomas (HL) and non-Hodgkin lymphomas (NHL) (Mcduffie et al., 2009). Many lymphoma-related miRNAs have been reported in recent biological experiments. For example, miRNA-150 was confirmed to be a tumor suppressor in malignant lymphoma (Watanabe et al., 2011), and it induces the differentiation of EBV-positive Burkitt lymphoma through the modulation of c-Myb in vitro (Li et al., 2014a). In addition, miR-21 could regulate cell proliferation, invasion, and apoptosis; accordingly, it has potential therapeutic applications in lymphoma (Sekar et al., 2014). We applied TLHNMDA to lymphoma to predict the top 10 and top 50 related miRNAs. In brief, 7 of the top 10 and 39 of the top 50 potential lymphoma-related miRNAs were verified in the dbDEMC and miR2Disease databases (see Table 2).
Table 2

Prediction of the top 50 predicted miRNAs associated with lymphoma based on known associations in HMDD v2.0 database.

| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-15b | dbDEMC | hsa-mir-199a | dbDEMC |
| hsa-mir-195 | dbDEMC | hsa-mir-34c | unconfirmed |
| hsa-mir-424 | dbDEMC | hsa-mir-152 | dbDEMC |
| hsa-mir-497 | dbDEMC | hsa-mir-106a | dbDEMC, miR2Disease |
| hsa-mir-103a | unconfirmed | hsa-mir-181b | dbDEMC |
| hsa-mir-485 | unconfirmed | hsa-mir-193a | unconfirmed |
| hsa-mir-23a | dbDEMC | hsa-mir-7 | dbDEMC |
| hsa-mir-214 | dbDEMC | hsa-mir-106b | dbDEMC |
| hsa-mir-107 | dbDEMC | hsa-mir-22 | dbDEMC |
| hsa-mir-590 | unconfirmed | hsa-mir-27a | dbDEMC |
| hsa-mir-142 | unconfirmed | hsa-mir-144 | unconfirmed |
| hsa-mir-26b | dbDEMC | hsa-mir-326 | dbDEMC |
| hsa-mir-370 | unconfirmed | hsa-mir-93 | dbDEMC |
| hsa-mir-31 | dbDEMC | hsa-mir-186 | dbDEMC |
| hsa-mir-185 | dbDEMC | hsa-mir-30a | dbDEMC |
| hsa-mir-23b | dbDEMC | hsa-mir-148a | dbDEMC |
| hsa-mir-29a | dbDEMC | hsa-mir-182 | dbDEMC |
| hsa-mir-27b | dbDEMC | hsa-mir-199b | dbDEMC |
| hsa-mir-34a | dbDEMC | hsa-mir-145 | dbDEMC, miR2Disease |
| hsa-mir-29b | dbDEMC | hsa-mir-328 | dbDEMC, miR2Disease |
| hsa-mir-125b | unconfirmed | hsa-mir-330 | dbDEMC |
| hsa-mir-143 | dbDEMC, miR2Disease | hsa-mir-421 | unconfirmed |
| hsa-mir-128 | dbDEMC | hsa-mir-1 | dbDEMC |
| hsa-mir-320a | unconfirmed | hsa-mir-181c | dbDEMC |
| hsa-mir-149 | dbDEMC, miR2Disease | hsa-mir-141 | dbDEMC |

The first column records top 1–25 related miRNAs. The second column records the top 26–50 related miRNAs.
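
The counts quoted for lymphoma can be checked against Table 2 with a short tally. The sketch below is only an illustration: the helper name `count_confirmed` and the list layout are assumptions made here, and the ten rows are the top 10 entries copied from Table 2.

```python
# Top-10 lymphoma candidates and their evidence, copied from Table 2.
# The tuple layout (miRNA, evidence) is an assumption made for this sketch.
TOP10_LYMPHOMA = [
    ("hsa-mir-15b", "dbDEMC"),
    ("hsa-mir-195", "dbDEMC"),
    ("hsa-mir-424", "dbDEMC"),
    ("hsa-mir-497", "dbDEMC"),
    ("hsa-mir-103a", "unconfirmed"),
    ("hsa-mir-485", "unconfirmed"),
    ("hsa-mir-23a", "dbDEMC"),
    ("hsa-mir-214", "dbDEMC"),
    ("hsa-mir-107", "dbDEMC"),
    ("hsa-mir-590", "unconfirmed"),
]


def count_confirmed(ranked, k):
    """Count entries among the first k whose evidence is not 'unconfirmed'."""
    return sum(1 for _, evidence in ranked[:k] if evidence != "unconfirmed")


confirmed = count_confirmed(TOP10_LYMPHOMA, 10)
print(f"{confirmed} of top 10 confirmed")  # prints "7 of top 10 confirmed"
```

Extending the list to all 50 rows of Table 2 would give the top 50 tally in the same way.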

Prostate neoplasms are among the most common malignancies in men (Siegel et al., 2013). The malignant tumor originates from the epithelial cells of the prostate (Gmyrek et al., 2001). In recent years, accumulating research has verified that many miRNAs are related to prostate neoplasms. For instance, miR-141, miR-375, miR-21, miR-93, miR-106a, miR-874, miR-1207, and miR-26a were reported to be upregulated in prostate neoplasms (Xiao et al., 2012; Chu et al., 2014; Dong et al., 2015). We also implemented TLHNMDA on prostate neoplasms to identify the related miRNAs. As a result, 7 of the top 10 and 38 of the top 50 potential prostate neoplasms-related miRNAs were confirmed in the dbDEMC and miR2Disease databases (see Table 3).
Table 3

Prediction of the top 50 predicted miRNAs associated with prostate neoplasms based on known associations in HMDD v2.0 database.

| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-15a | dbDEMC miR2Disease | hsa-mir-24 | dbDEMC miR2Disease |
| hsa-mir-16 | dbDEMC miR2Disease | hsa-mir-29a | dbDEMC miR2Disease |
| hsa-mir-15b | dbDEMC | hsa-mir-539 | unconfirmed |
| hsa-mir-195 | dbDEMC miR2Disease | hsa-mir-20a | miR2Disease |
| hsa-mir-424 | unconfirmed | hsa-mir-26a | dbDEMC miR2Disease |
| hsa-mir-497 | miR2Disease | hsa-mir-34a | dbDEMC miR2Disease |
| hsa-mir-103a | unconfirmed | hsa-mir-27b | dbDEMC miR2Disease |
| hsa-mir-485 | unconfirmed | hsa-mir-29b | dbDEMC miR2Disease |
| hsa-mir-23a | dbDEMC miR2Disease | hsa-mir-17 | miR2Disease |
| hsa-mir-214 | dbDEMC miR2Disease | hsa-mir-143 | dbDEMC miR2Disease |
| hsa-mir-155 | dbDEMC | hsa-mir-128 | dbDEMC |
| hsa-mir-107 | unconfirmed | hsa-mir-320a | unconfirmed |
| hsa-mir-590 | unconfirmed | hsa-mir-708 | unconfirmed |
| hsa-mir-19a | dbDEMC | hsa-mir-124 | dbDEMC |
| hsa-mir-125a | dbDEMC miR2Disease | hsa-mir-149 | dbDEMC miR2Disease |
| hsa-mir-142 | unconfirmed | hsa-mir-199a | dbDEMC miR2Disease |
| hsa-mir-19b | dbDEMC miR2Disease | hsa-mir-34c | dbDEMC |
| hsa-mir-138 | dbDEMC | hsa-mir-181a | dbDEMC miR2Disease |
| hsa-mir-26b | dbDEMC miR2Disease | hsa-mir-152 | dbDEMC |
| hsa-mir-150 | dbDEMC | hsa-mir-18a | unconfirmed |
| hsa-mir-370 | miR2Disease | hsa-mir-21 | dbDEMC miR2Disease |
| hsa-mir-29c | dbDEMC | hsa-mir-106a | dbDEMC miR2Disease |
| hsa-mir-31 | dbDEMC miR2Disease | hsa-mir-181b | dbDEMC miR2Disease |
| hsa-mir-185 | unconfirmed | hsa-mir-193a | unconfirmed |
| hsa-mir-23b | dbDEMC miR2Disease | hsa-mir-7 | dbDEMC |

The first column records top 1–25 related miRNAs. The second column records the top 26–50 related miRNAs.

Moreover, we further implemented TLHNMDA on the known miRNA-disease associations in the HMDD v1.0 database (Lu et al., 2008) to test whether the approach works properly on a different dataset. The predicted scores for candidate miRNAs showed that 10 of the top 10 and 49 of the top 50 potential esophageal neoplasms-associated miRNAs were verified by three databases (see Table 4). Lastly, we list the potential miRNAs related to all human diseases, together with the association scores of the entire ranking obtained by TLHNMDA (see Supplementary Table 1).
Table 4

Prediction of the top 50 predicted miRNAs associated with esophageal neoplasms based on HMDD v1.0 database.

| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-15a | dbDEMC HMDD | hsa-mir-143 | dbDEMC HMDD |
| hsa-mir-16 | dbDEMC | hsa-mir-29a | dbDEMC |
| hsa-mir-15b | dbDEMC | hsa-mir-125b | dbDEMC |
| hsa-mir-195 | dbDEMC | hsa-mir-29b | dbDEMC |
| hsa-mir-424 | dbDEMC | hsa-mir-181b | dbDEMC |
| hsa-mir-497 | dbDEMC | hsa-mir-34a | dbDEMC HMDD |
| hsa-mir-214 | dbDEMC HMDD | hsa-mir-106a | dbDEMC |
| hsa-mir-107 | dbDEMC miR2Disease | hsa-mir-106b | dbDEMC |
| hsa-mir-155 | dbDEMC HMDD | hsa-mir-199a | dbDEMC HMDD |
| hsa-mir-19a | dbDEMC HMDD | hsa-mir-330 | dbDEMC |
| hsa-mir-19b | dbDEMC | hsa-mir-20b | dbDEMC |
| hsa-mir-125a | dbDEMC | hsa-mir-26a | dbDEMC HMDD |
| hsa-mir-185 | dbDEMC | hsa-mir-1 | dbDEMC |
| hsa-mir-20a | dbDEMC HMDD | hsa-mir-181a | dbDEMC |
| hsa-mir-24 | dbDEMC | hsa-mir-186 | dbDEMC |
| hsa-mir-17 | dbDEMC | hsa-mir-141 | dbDEMC HMDD |
| hsa-mir-23a | dbDEMC | hsa-mir-93 | dbDEMC |
| hsa-mir-26b | dbDEMC | hsa-mir-421 | dbDEMC |
| hsa-mir-539 | unconfirmed | hsa-mir-222 | dbDEMC |
| hsa-mir-150 | dbDEMC HMDD | hsa-mir-28 | dbDEMC HMDD |
| hsa-mir-23b | dbDEMC | hsa-mir-145 | dbDEMC HMDD |
| hsa-mir-29c | dbDEMC HMDD | hsa-mir-92a | HMDD |
| hsa-mir-370 | dbDEMC | hsa-mir-22 | dbDEMC HMDD |
| hsa-mir-142 | dbDEMC | hsa-mir-199b | dbDEMC |
| hsa-mir-18a | dbDEMC | hsa-mir-34c | dbDEMC HMDD |

The first column records top 1–25 related miRNAs. The second column records the top 26–50 related miRNAs.


Discussion

Although progress has been made in the discovery of miRNAs, their role in physiologic and pathophysiologic processes is only beginning to emerge. As governors of gene expression during cardiovascular development and disease, miRNAs are associated with many critical biological processes (Liu and Olson, 2010). Identification of miRNAs expressed in specific cardiac cell types may provide new diagnostic, prognostic, and therapeutic targets for many forms of cardiovascular disease (Cordes and Srivastava, 2009). Furthermore, aberrant expression of miRNAs has been implicated in various neurological disorders (NDs) of the central nervous system, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, schizophrenia and autism. Dysregulated miRNAs found in patients with NDs may also serve as biomarkers for earlier diagnosis and for monitoring disease progression. Identifying the role of miRNAs in normal cellular processes is critical for developing new therapeutic strategies for NDs (Kamal et al., 2015). Therefore, predicting disease-associated miRNAs is important for understanding disease pathogenesis and for treating a variety of clinically important diseases.
In this paper, following the hypothesis that functionally similar miRNAs and lncRNAs are likely to be associated with similar diseases, we introduced a novel model, named TLHNMDA, which constructs a triple layer heterogeneous network by systematically combining miRNA functional similarity, disease semantic similarity, Gaussian interaction profile kernel similarity, known miRNA-disease associations and miRNA-lncRNA interactions to identify new disease-associated miRNAs. In the model, an iterative updating algorithm that propagates information across the triple layer heterogeneous graph was proposed to obtain the final prediction scores between diseases and miRNAs.
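To make the idea of iteratively propagating scores over a similarity network more concrete, a minimal sketch follows. It is not the update rule of TLHNMDA, which is defined in the Methods and not reproduced in this section. It is a generic label-propagation stand-in that uses only two layers (miRNA and disease) and omits the lncRNA layer. The function names, the value of `alpha`, the tolerance and the toy matrices are all hypothetical.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (rows that sum to 0 are left as zeros)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def propagate(assoc, sim_mirna, sim_disease, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * Sm @ F @ Sd + (1 - alpha) * A until F stops changing.

    assoc:       miRNA x disease 0/1 matrix of known associations (A)
    sim_mirna:   miRNA x miRNA similarity matrix (row-normalized into Sm)
    sim_disease: disease x disease similarity matrix (row-normalized into Sd)
    alpha, tol and max_iter are illustrative values, not the paper's settings.
    """
    sm = row_normalize(sim_mirna)
    sd = row_normalize(sim_disease)
    scores = [row[:] for row in assoc]
    for _ in range(max_iter):
        spread = matmul(matmul(sm, scores), sd)
        updated = [
            [alpha * s + (1 - alpha) * a for s, a in zip(srow, arow)]
            for srow, arow in zip(spread, assoc)
        ]
        change = max(
            abs(u - s) for urow, srow in zip(updated, scores) for u, s in zip(urow, srow)
        )
        scores = updated
        if change < tol:
            break
    return scores


# Toy data (hypothetical): 3 miRNAs, 2 diseases; miRNA-B has no known association.
assoc = [[1, 0], [0, 0], [0, 1]]
sim_mirna = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
sim_disease = [[1.0, 0.3], [0.3, 1.0]]

scores = propagate(assoc, sim_mirna, sim_disease)
for name, row in zip(["miRNA-A", "miRNA-B", "miRNA-C"], scores):
    print(name, [round(v, 3) for v in row])
```

In this toy setting miRNA-B, which has no known association, receives a higher score for the first disease because it is more similar to miRNA-A than to miRNA-C. This is the kind of inference that similarity-based propagation is meant to provide.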
The experimental results from LOOCV and 5-fold cross validation demonstrated that TLHNMDA outperforms four other computational methods. Moreover, case studies of four human diseases (kidney neoplasms, lymphoma, prostate neoplasms and esophageal neoplasms) were carried out, and the results were verified against the experimental literature in the dbDEMC and miR2Disease databases. TLHNMDA turned out to be more reliable and effective in inferring potential miRNA-disease associations than the previous computational models. Therefore, our model could be an effective and useful computational tool for predicting new miRNA-disease associations, and biomedical researchers could use TLHNMDA to computationally identify miRNAs potentially related to the diseases under investigation.
TLHNMDA owes its performance to several factors. Firstly, TLHNMDA improved prediction accuracy and decreased prediction bias by integrating several reliable types of biological datasets, including experimentally verified miRNA-disease associations, known miRNA-lncRNA interactions, the miRNA functional similarity network, the disease semantic similarity network and Gaussian interaction profile kernel similarity. Secondly, the model captures new miRNA-disease associations using global network similarity information, which gives it an advantage over models that rely on local network similarity information. Finally, TLHNMDA is an iterative algorithm that updates the predicted scores based on global network similarity information until convergence, which promotes effective prediction.
However, TLHNMDA also has several limitations. For example, it cannot predict miRNAs associated with new diseases that have no known miRNA-disease associations. Besides, there is no powerful method for finding the optimal parameters of TLHNMDA.
The parameters of the iterative algorithm were selected based on past experience, which cannot guarantee that the model runs in its best state. Finally, the number of miRNA-disease associations and miRNA-lncRNA interactions confirmed by biological experiments is still insufficient. Therefore, in future research, we will try to propose a new model that integrates more of the available biological datasets.
It is noteworthy that many other types of data can also be used to predict miRNA-disease associations, for example, miRNA-mRNA interactions (Li et al., 2014b), miRNA-protein interactions (Shi et al., 2016), miRNA-environmental factor interactions (Chen et al., 2012a), and so on. Because some existing methods rely on different datasets to identify miRNA-disease associations, a direct comparison between their performance and that of the proposed method is not realistic. For example, the two models proposed by Pallez et al. (2017) and Pasquier and Gardès (2016) were based on datasets of miRNA-disease associations, miRNA-neighbor associations, miRNA-target associations, miRNA-word associations and miRNA-family associations. The model proposed by Mork et al. (2014) used miRNA-protein associations and protein-disease associations to predict potential miRNA-disease associations. The model introduced by Shi et al. (2013) for the identification of miRNA-disease associations was based on disease-gene associations, protein-protein interactions and miRNA-target associations. Moreover, Liu et al. (2016) proposed a new computational model to predict unobserved miRNA-disease associations based on disease functional similarity, disease semantic similarity and miRNA similarity. It is worth noting that the miRNA similarity in that model was calculated from miRNA-lncRNA interactions. In addition to the datasets, there are different ways of defining relationships among nodes of the same type.
For example, in DeepMDA, proposed by Fu and Peng (2017), the Gaussian interaction profile kernel similarity for diseases was calculated from three association matrices: the miRNA-disease association matrix, the lncRNA-disease association matrix, and the gene-disease association matrix. The miRNA similarity used in KATZ and CATAPULT, introduced by Zou et al. (2015), was calculated by text mining analysis of their phenotype descriptions in the Online Mendelian Inheritance in Man (OMIM) database. In particular, the relative merits of these different measures are worth further study.
Network analysis and modeling built on diverse data have also been widely applied in other fields. Some studies modeled cancer cells by constructing and modeling networks for individual clones based on tumor genome sequencing (Wang et al., 2013a). Integrative network modeling has been applied to the modeling of drug resistance for personalized treatment (Wang et al., 2013b). Moreover, hallmark-specific networks were modeled to better understand the key cellular processes involved in cancer development and progression (Gao et al., 2016). The hallmarks of cancer are one of the most widely acknowledged organizing principles for research on cancer (Wang et al., 2015). Accumulating evidence indicates that there are associations between cancer hallmarks and genes (Wang et al., 2015). For example, miR-16 obtained the highest score in the case study on kidney neoplasms and the second highest score in the case study on prostate neoplasms. APP, ATG12, and ATF2 are common targets of this miRNA and have been identified as involved in the inflammation hallmark (Wang et al., 2015).
In future work, we plan to extend the proposed model into a new multi-layer prediction model. One extension is to add more diverse datasets of different types (other than the three discussed here) and more associations to the model, and then construct the iterative updating algorithm to identify disease-associated miRNAs.

Author contributions

XC conceived the project, developed the prediction method, designed the experiments, analyzed the results, and wrote the paper. JQ implemented the experiments, analyzed the results, and wrote the paper. JY analyzed the results and revised the paper.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References (10 of 74 shown)

Review 1.  How microRNAs control cell division, differentiation and death.

Authors:  Eric A Miska
Journal:  Curr Opin Genet Dev       Date:  2005-10       Impact factor: 5.578

2.  miR-141 modulates androgen receptor transcriptional activity in human prostate cancer cells through targeting the small heterodimer partner protein.

Authors:  Jing Xiao; Ai-Yu Gong; Alex N Eischeid; Dongqing Chen; Caishu Deng; Charles Y F Young; Xian-Ming Chen
Journal:  Prostate       Date:  2012-02-07       Impact factor: 4.104

3.  Prediction of potential disease-associated microRNAs based on random walk.

Authors:  Ping Xuan; Ke Han; Yahong Guo; Jin Li; Xia Li; Yingli Zhong; Zhaogong Zhang; Jian Ding
Journal:  Bioinformatics       Date:  2015-01-23       Impact factor: 6.937

4.  RKNNMDA: Ranking-based KNN for MiRNA-Disease Association prediction.

Authors:  Xing Chen; Qiao-Feng Wu; Gui-Ying Yan
Journal:  RNA Biol       Date:  2017-04-19       Impact factor: 4.652

5.  Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources.

Authors:  Yuansheng Liu; Xiangxiang Zeng; Zengyou He; Quan Zou
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2016-04-05       Impact factor: 3.710

6.  A non-negative matrix factorization based method for predicting disease-associated miRNAs in miRNA-disease bilayer network.

Authors:  Yingli Zhong; Ping Xuan; Xiao Wang; Tiangang Zhang; Jianzhong Li; Yong Liu; Weixiong Zhang
Journal:  Bioinformatics       Date:  2018-01-15       Impact factor: 6.937

7.  The role of microRNA-150 as a tumor suppressor in malignant lymphoma.

Authors:  A Watanabe; H Tagawa; J Yamashita; K Teshima; M Nara; K Iwamoto; M Kume; Y Kameoka; N Takahashi; T Nakagawa; N Shimizu; K Sawada
Journal:  Leukemia       Date:  2011-04-19       Impact factor: 11.528

8.  Allogeneic T cell responses are regulated by a specific miRNA-mRNA network.

Authors:  Yaping Sun; Isao Tawara; Meng Zhao; Zhaohui S Qin; Tomomi Toubai; Nathan Mathewson; Hiroya Tamaki; Evelyn Nieves; Arul M Chinnaiyan; Pavan Reddy
Journal:  J Clin Invest       Date:  2013-11       Impact factor: 14.808

9.  Prediction of disease-related interactions between microRNAs and environmental factors based on a semi-supervised classifier.

Authors:  Xing Chen; Ming-Xi Liu; Qing-Hua Cui; Gui-Ying Yan
Journal:  PLoS One       Date:  2012-08-24       Impact factor: 3.240

10.  starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data.

Authors:  Jun-Hao Li; Shun Liu; Hui Zhou; Liang-Hu Qu; Jian-Hua Yang
Journal:  Nucleic Acids Res       Date:  2013-12-01       Impact factor: 16.971
