PWCDA: Path Weighted Method for Predicting circRNA-Disease Associations.

Xiujuan Lei1, Zengqiang Fang2, Luonan Chen3,4,5, Fang-Xiang Wu6.   

Abstract

CircRNAs have a particular biological structure and have been shown to play important roles in diseases. Identifying circRNA-disease associations by biological experiments is time-consuming and costly, so it is appealing to develop computational methods for predicting these associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations. First, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs based on the circRNA-disease associations downloaded from the circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate the disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: the disease similarity network, the circRNA similarity network and the circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on the paths connecting them in the heterogeneous network to determine whether the pair is associated. We adopt leave-one-out cross validation (LOOCV) and five-fold cross validation to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. The experimental results, assessed with several validation measures, illustrate the reliability and usefulness of our computational method and indicate that PWCDA can effectively predict potential circRNA-disease associations.

Keywords:  circRNA-disease associations; heterogeneous network; pathway

Year:  2018        PMID: 30384427      PMCID: PMC6274797          DOI: 10.3390/ijms19113410

Source DB:  PubMed          Journal:  Int J Mol Sci        ISSN: 1422-0067            Impact factor:   5.923


1. Introduction

In recent years, an increasing number of circRNAs [1] have been uncovered and have drawn growing attention. CircRNAs are a newly discovered category of non-coding RNAs, a class that also includes many other RNAs, such as miRNAs, lncRNAs and piRNAs [2]. Circular RNA was first discovered in Tetrahymena cells [3]. There is an obvious difference between circular RNAs and common linear RNAs: a circRNA has a closed-loop structure and, unlike a linear RNA, has no free 5′ and 3′ ends [4]. CircRNAs can be classified into four categories: exonic circRNAs, intronic circRNAs, exon-intron circRNAs and intergenic circRNAs [4,5]. Because of this closed-loop structure, they are usually stable, abundant, conserved and expressed in a tissue-specific manner [5]. With the progress of high-throughput sequencing technology [6], more and more circRNAs have been confirmed to play significant roles in different biological processes [7]. Experiments have shown that circRNAs can work as scaffolds in the assembly of protein complexes [8] and are involved in subcellular localization [9], among other functions. They also regulate the expression of their parental genes [10] and act as microRNA (miRNA) sponges [11,12]. In particular, many studies have shown that circRNAs can serve as tumor biomarkers [13,14,15]. Recently, the number of known circRNAs has increased sharply, and several circRNA-disease databases have been developed, such as circR2Disease [16], Circ2Traits [17] and Circ2Disease [18]. At the same time, circRNA-disease associations have been verified by classic biological experiments. However, such experiments are both time-consuming and expensive. Therefore, it is appealing to develop computational methods that can produce reliable predictions and reduce both time and cost.
Although some computational methods have been proposed for predicting miRNA-disease associations [19,20,21], lncRNA-disease associations [22,23] and drug-target associations [18,24,25], there is no computational method for predicting circRNA-disease associations yet. In this study, we propose the first such method, the Path Weighted method for predicting CircRNA-Disease Associations (PWCDA). After building a heterogeneous network consisting of three sub-networks (the disease similarity network, the circRNA similarity network and the circRNA-disease association network), we calculate an association score for each circRNA-disease pair based on the paths connecting them in the heterogeneous network to determine whether the pair is associated. Our method is evaluated with leave-one-out cross validation (LOOCV) and five-fold cross validation. The average AUC (area under the ROC curve) of LOOCV is 0.900, while the AUC of five-fold cross validation is 0.890. To further investigate the performance of the proposed model, we conduct case studies of several common cancers. In addition, we compare our method with other computational prediction methods. The results show that our method outperforms them, which indicates that the proposed model is better able to predict potential circRNA-disease associations.

2. Results and Discussion

2.1. Effect of Parameter

Based on a previous study [26], we fix the maximum path length at 3. If the maximum path length exceeds 3, not only does the running time of the method increase, but the method also picks up noisy information. In this study, we give a comprehensive analysis of the parameter α in our decaying function. After calculating a score for each disease-circRNA pair, we obtain a disease-circRNA association score matrix, from which we calculate the AUC. The results are shown in Table 1. The effect of α on the final AUC is quite small, and the highest values are obtained for α between 1 and 3. Therefore, we set α to 1, which gives the best result. To reduce the running time, we do not use cross validation in this experiment. We also carry out an experiment to analyze another parameter, the threshold γ; the results are shown in Table 2, again without cross validation in order to reduce the running time. The results show that γ has only a small effect on the final AUC. We therefore set γ to 0.5, which gives the highest AUC.
Table 1

The Area Under roc Curve (AUC) value based on changing α and fixed pathway maximum length.

α     0.5      1        1.5      2        3        3.5      4        4.5      5
AUC   0.97100  0.97209  0.97206  0.97208  0.97202  0.97010  0.97010  0.97010  0.96879
Table 2

The AUC value based on changing γ and fixed pathway maximum length.

γ     0.1      0.2      0.3      0.4      0.5      0.6
AUC   0.96483  0.96483  0.96483  0.96500  0.97209  0.97205

2.2. LOOCV

For a given disease i, there are known associations between disease i and a number of circRNAs. In each iteration of LOOCV, we leave one association out as the test sample and use the remaining associations as the training set. If there is only one association between disease i and the circRNAs in our dataset, we do not apply LOOCV to that disease. In LOOCV, we obtain an association score for each circRNA-disease pair and then rank all the predicted association scores. If a score is greater than a pre-set threshold, the corresponding disease-circRNA pair is predicted to be associated. By varying the threshold, we obtain a series of true positive rates (TPRs) and false positive rates (FPRs), which are used to draw the receiver operating characteristic (ROC) curve. Finally, we compare our prediction method with other computational prediction methods [27,28]. The results, shown in Figure 1, indicate that our proposed method outperforms the existing prediction methods.
Figure 1

Comparison of the Path Weighted method for predicting CircRNA-Disease Associations (PWCDA) with other models by leave-one-out cross validation (LOOCV). FPR, false positive rate.
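
The LOOCV procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `loocv`, the pluggable `score_fn` argument and the toy column-count scorer are all assumptions introduced here, and any predictor that maps a training association matrix to a score matrix could be passed in instead.

```python
def loocv(M, score_fn):
    """Leave-one-out over known associations.

    M[i][j] = 1 if disease i and circRNA j are associated, else 0.
    score_fn (hypothetical) takes a training matrix and returns a score matrix.
    Returns a list of (disease, circRNA, score of the held-out pair).
    """
    results = []
    for i, row in enumerate(M):
        # Diseases with a single known association are skipped, as in the text.
        if sum(row) < 2:
            continue
        for j, value in enumerate(row):
            if value != 1:
                continue
            train = [r[:] for r in M]   # copy so M itself is left untouched
            train[i][j] = 0             # hide the held-out association
            scores = score_fn(train)
            results.append((i, j, scores[i][j]))
    return results


def toy_score_fn(train):
    """Stand-in scorer (an assumption): counts diseases linked to each circRNA."""
    col_counts = [sum(col) for col in zip(*train)]
    return [[float(c) for c in col_counts] for _ in train]


M = [[1, 0, 0],
     [1, 0, 1]]
held_out = loocv(M, toy_score_fn)
```

The held-out scores would then be ranked against the scores of the unknown pairs to obtain the ROC curve.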

2.3. Five-Fold Cross Validation

To further illustrate the performance of our proposed method, we also adopt five-fold cross validation. We divide all disease-circRNA associations into five parts. Each time, one part is used as the test set and the remaining four parts make up the training set. We then obtain the scores of all circRNA-disease pairs and follow the same procedure as in LOOCV to draw the ROC curve and compute the AUC. We also compare our proposed computational method with other prediction methods [27,28]. As shown in Figure 2, our method achieves better results than the other methods.
Figure 2

Comparison of PWCDA with other computational methods via five-fold cross validation.
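
A minimal sketch of the five-fold partition of known associations is given below. The function name `k_fold_splits`, the fixed random seed and the toy association list are assumptions for illustration; the source does not say how the five parts were drawn.

```python
import random


def k_fold_splits(associations, k=5, seed=0):
    """Yield (train, test) lists; each association is tested exactly once."""
    items = list(associations)
    random.Random(seed).shuffle(items)          # seed is an arbitrary choice
    folds = [items[f::k] for f in range(k)]     # k nearly equal parts
    for f in range(k):
        test = folds[f]
        train = [a for g, fold in enumerate(folds) if g != f for a in fold]
        yield train, test


# Toy (disease index, circRNA index) pairs, made up for the example.
associations = [(d, c) for d in range(3) for c in range(4)]
splits = list(k_fold_splits(associations))
```

In each round, the test associations would be set to 0 in the association matrix before scoring, in the same way as the single held-out association in LOOCV.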

2.4. Case Studies

We also conduct case studies, which help us further understand the associations between circRNAs and diseases. We choose three common diseases as prediction targets: Breast Cancer [29], Gastric Cancer [30] and Colorectal Cancer [31]. To check the accuracy of our predictions, we use a circRNA-disease database together with circRNA-disease associations that have been experimentally verified in published articles [32]. Breast cancer is one of the most common cancers worldwide [33] and causes thousands of deaths every year. With the development of deep sequencing technology, circRNAs have been confirmed as biomarkers for diagnosing breast cancer. With our computational method, 29 of the top 30 candidate circRNAs are confirmed. For example, circpvt1 (rank 1) can work as a miRNA sponge that regulates miRNAs by moderating let-7 activity [30]. Only hsa_circ_104689 (rank 29) remains unconfirmed. The prediction results are presented in Table 3.
Table 3

The top 30 breast cancer related candidates circRNAs.

Breast Cancer
Rank  circRNA Name/ID                   Evidence
1     circpvt1/hsa_circ_0001821         PMID:279280058
2     circ-foxo3                        circRNAdisease
3     hsa_circ_0001313/circccdc66       PMID:28249903
4     hsa_circ_0007534                  PMID:29593432
5     hsa_circ_0000284/circhipk3        PMID:27050392
6     hsa_circ_0011946                  PMID:29593432
7     hsa_circ_0093869                  PMID:29593432
8     hsa_circ_0001982                  circRNAdisease
9     hsa_circ_0001785                  circRNAdisease
10    hsa_circ_0108942                  circRNAdisease
11    hsa_circ_0068033                  circRNAdisease
12    circamot11/hsa_circ_0004214       circRNAdisease
13    hsa_circ_0006528                  circRNAdisease
14    hsa_circ_0002113                  circRNAdisease
15    hsa_circ_0002874                  circRNAdisease
16    hsa_circ_0001667                  circRNAdisease
17    hsa_circ_0085495                  circRNAdisease
18    hsa_circ_0086241                  circRNAdisease
19    hsa_circ_0092276                  circRNAdisease
20    hsa_circ_0003838                  circRNAdisease
21    circvrk1                          PMID:29221160
22    circbrip                          PMID:29221160
23    circola                           PMID:29221160
24    circetfa                          PMID:29221160
25    circmed13                         PMID:29221160
26    circbc111b                        PMID:28739726
27    circdennd4c                       circRNAdisease
28    hsa_circ_103110/hsa_circ_0004771  circRNAdisease
29    hsa_circ_104689/hsa_circ_0001824  unconfirmed
30    hsa_circ_104821/hsa_circ_0001875  circRNAdisease

Gastric cancer [34] has a high mortality rate in humans. It can arise in any tissue of the stomach. These tumors are usually malignant, and they can also destroy the surrounding nervous tissue. With our computational method, 25 of the top 30 candidate circRNAs are confirmed by another database, circRNAdisease. For example, hsa_circ_0076304 and hsa_circ_0076305 (the top two candidates) have been found to be downregulated in gastric cancer [35]. circpvt1 (rank 3), which is upregulated in gastric cells, can be regarded as a sponge of the miR-125 family [13]. More details of the results are shown in Table 4.
Table 4

The top 30 gastric cancer related candidates circRNAs.

Gastric Cancer
Rank  circRNA Name/ID                   Evidence
1     hsa_circ_0076305                  circRNAdisease
2     hsa_circ_0076304                  circRNAdisease
3     circpvt1/hsa_circ_0001821         circRNAdisease
4     hsa_circ_0001649                  unconfirmed
5     hsa_circ_0000284/circhipk3        unconfirmed
6     hsa_circ_0014717                  circRNAdisease
7     cdr1as/cirs-7/hsa_circ_0001946    unconfirmed
8     hsa_circ_0003195                  circRNAdisease
9     hsa_circ_0000520                  circRNAdisease
10    hsa_circ_0074362                  circRNAdisease
11    hsa_circ_0001017                  circRNAdisease
12    hsa_circ_0061276                  circRNAdisease
13    circ-zfr                          unconfirmed
14    circma0047905/hsa_circ_0047905    circRNAdisease
15    circma0138960/hsa_circ_0138960    circRNAdisease
16    circma0138960/hsa-circma7690-15   circRNAdisease
17    hsa_circ_0000181                  circRNAdisease
18    hsa_circ_0000745                  circRNAdisease
19    hsa_circ_0085616                  circRNAdisease
20    hsa_circ_0006127                  circRNAdisease
21    hsa_circ_0000026                  circRNAdisease
22    hsa_circ_0000144                  circRNAdisease
23    hsa_circ_0032821                  circRNAdisease
24    hsa_circ_0005529                  circRNAdisease
25    hsa_circ_0061274                  circRNAdisease
26    hsa_circ_0005927                  circRNAdisease
27    hsa_circ_0092341                  circRNAdisease
28    hsa_circ_0001561                  unconfirmed
29    circlarp4                         circRNAdisease
30    hsa_circ_0035431                  circRNAdisease

Colorectal cancer [36] is one of the three most frequent cancers in women. Even though the incidence of colorectal cancer has been declining for a long time, a large proportion of patients still die from it each year. In this study, 24 of the top 30 candidate circRNAs are confirmed. For example, hsa_circ_0001649 (rank 1) [31] has been found to be downregulated in colorectal cancer tissue, and hsa_circ_0007534 (rank 2) [37] is upregulated in different colorectal cancer cells. More details of the results are presented in Table 5.
Table 5

The top 30 colorectal cancer related candidates circRNAs.

Colorectal Cancer
Rank  circRNA Name/ID                            Evidence
1     hsa_circ_0001649                           PMID:29421663
2     hsa_circ_0007534                           PMID:29364478
3     cdr1as/cirs-7/hsa_circ_0001946             circRNAdisease
4     hsa_circ_0000284/circhipk3                 PMID:27050392
5     hsa_circ_0001313/circccdc66                circRNAdisease
6     ciritch/hsa_circ_0001141/hsa_circ_001763   unconfirmed
7     hsa_circ_0014717                           PMID:29571246
8     hsa_circ_0000567                           PMID:29333615
9     hsa_circ_000984/hsa_circ_0001724           circRNAdisease
10    hsa_circ_0020397                           circRNAdisease
11    hsa_circ_0007031                           circRNAdisease
12    hsa_circ_0000504                           circRNAdisease
13    hsa_circ_0007006                           circRNAdisease
14    hsa_circ_0074930                           circRNAdisease
15    hsa_circ_0048232                           circRNAdisease
16    has-circ_0006174                           circRNAdisease
17    hsa_circ_0008509                           circRNAdisease
18    hsa_circ_0084021                           circRNAdisease
19    circ_banp                                  circRNAdisease
20    hsa_circrna_103809                         circRNAdisease
21    hsa_circrna_104700                         circRNAdisease
22    hsa_circ_0000069                           circRNAdisease
23    hsa_circ_001988/hsa_circ_0001451           circRNAdisease
24    hsa_circ_0000677/hsa_circ_001569/circabcc  circRNAdisease
25    circ_kldhc10/hsa_circ_0082333              PMID:26138677
26    circ_stxbp51                               unconfirmed
27    circ-shkbp1                                unconfirmed
28    circ-fbxw7                                 unconfirmed
29    hsa_circ_0046701                           unconfirmed
30    circttbk2/hsa_circ_0000594                 unconfirmed

3. Materials and Methods

3.1. Human circRNA-Disease Associations Network

All circRNA-disease associations are downloaded from the circR2Disease database [16] (http://bioinfo.snnu.edu.cn/CircR2Disease/). This initial dataset contains 739 associations between 661 circRNA entities and 100 disease entities, collected from three main species: human, mouse and rat. In this study, we select 541 circRNA entities and 83 human disease entities from the initial dataset, including Gastric cancer, Breast cancer and Colorectal cancer. This gives 592 experimentally verified circRNA-disease associations, which make up our circRNA-disease association network with adjacency matrix M. If there is a verified association between disease i and circRNA j, the entry M(i, j) is equal to 1; otherwise it is equal to 0.
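
As a small illustration of how the adjacency matrix M can be built from a list of association pairs, consider the sketch below. The function name and the toy disease and circRNA labels are made up for the example and are not taken from circR2Disease.

```python
def build_association_matrix(pairs, diseases, circrnas):
    """Return M as a list of rows, with M[i][j] = 1 if disease i and circRNA j
    have a known association and 0 otherwise."""
    d_index = {d: i for i, d in enumerate(diseases)}
    c_index = {c: j for j, c in enumerate(circrnas)}
    M = [[0] * len(circrnas) for _ in diseases]
    for disease, circ in pairs:
        M[d_index[disease]][c_index[circ]] = 1
    return M


# Toy data, made up for illustration.
diseases = ["disease_A", "disease_B"]
circrnas = ["circ_1", "circ_2", "circ_3"]
pairs = [("disease_A", "circ_1"), ("disease_B", "circ_1"), ("disease_B", "circ_3")]
M = build_association_matrix(pairs, diseases, circrnas)
```

Row i of M is the association profile of disease i, and column j is the association profile of circRNA j.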

3.2. CircRNA Semantic Similarity

To calculate circRNA semantic similarity, we download circRNAs and their related target genes from circR2Disease. We also need gene-related annotation terms, which can be downloaded from the Human Protein Reference Database (HPRD) [38] (http://www.hprd.org/). According to previous literature [39,40,41], several kinds of methods can be used to calculate the semantic similarity of circRNA-related gene GO terms, including path-length-based methods, information-content-based methods, common-term-based methods and hybrid methods. In this study, we use a common-term-based method to measure circRNA similarity scores based on the Jaccard index. In previous studies [21,42], genes have been widely used to infer RNA similarity. Thus, the more gene-related terms two circRNAs $C_i$ and $C_j$ share, the higher their similarity score. Denote CS as the circRNA semantic similarity matrix; its entry CS(i, j) is calculated by the following formula:

$$CS(i, j) = \frac{|G_i \cap G_j|}{|G_i \cup G_j|}$$

where $G_i$/$G_j$ denotes the set of GO terms related to the target genes of circRNA $C_i$/$C_j$.
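
The common-term (Jaccard) similarity can be sketched in a few lines. The term labels below are placeholders rather than real GO identifiers, and returning 0 for two empty sets is a convention chosen here to represent a missing score.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |a ∩ b| / |a ∪ b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(term_sets):
    """Pairwise Jaccard similarity for a list of term sets."""
    n = len(term_sets)
    return [[jaccard(term_sets[i], term_sets[j]) for j in range(n)] for i in range(n)]


# Placeholder term sets for three circRNAs; the third has no annotated terms.
go_terms = [{"t1", "t2", "t3"}, {"t2", "t3", "t4"}, set()]
CS = similarity_matrix(go_terms)
```

The first two circRNAs share two of four distinct terms, so their score is 0.5, while the unannotated circRNA gets a score of 0 with every other circRNA. The same function applies to the disease gene annotation sets.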

3.3. Disease Functional Similarity

We use disease-related gene annotations to measure disease functional similarities. These gene annotations are extracted from two online databases. The first one is DisGeNET [43] (http://www.disgenet.org/web/DisGeNET/menu), which collects 381,056 gene-disease associations (GDAs) between 16,666 genes and 13,172 diseases. In addition, we download disease phenotype data from OMIM (Online Mendelian Inheritance in Man) [44], a biological database that is updated daily. We use the OMIM_2018_04_24 version. We then integrate these annotation resources of disease-related genes, which helps us obtain more reliable performance. Previous studies [45] describe several methods for calculating disease similarities. The common ones include annotation-based, function-based and topology-based measurements [46,47,48,49]. We adopt an annotation-based method and apply the Jaccard index, a standard measure of similarity between two finite sets, to estimate the similarity scores between diseases. Let $g_i$ be the collection of gene annotations associated with disease $d_i$. We calculate the functional similarity score of two diseases $d_i$ and $d_j$ as the Jaccard similarity coefficient of $g_i$ and $g_j$. Denote DS as the disease functional similarity matrix; its entry DS(i, j) is calculated by the following formula:

$$DS(i, j) = \frac{|g_i \cap g_j|}{|g_i \cup g_j|}$$

We have now constructed the circRNA semantic similarity matrix based on circRNA-related GO terms and the disease functional similarity matrix based on disease-related gene annotations. However, one essential weakness cannot be ignored: both similarity matrices are sparse, which means that the similarity of many pairs of diseases (or circRNAs) cannot be calculated from their functional (or semantic) information.
To alleviate this weakness, the Gaussian interaction profile (GIP) kernel similarity [50,51] is adopted in this study to get additional information about the similarity of diseases and circRNAs.

3.4. CircRNA GIP Kernel Similarity

The GIP kernel similarity rests on the assumption that the more similar two circRNAs are, the more likely they are to show similar patterns of association and non-association with diseases. GIP kernel similarity has been widely used to calculate similarity based on the topological features of known association networks, such as miRNA-disease association networks [52], lncRNA-disease association networks [53] and drug-target association networks [54]. Accordingly, it is also used in this study to calculate the similarity of circRNAs and of diseases, in order to overcome the sparsity of the disease functional similarity matrix and the circRNA semantic similarity matrix. Following previous literature [54], we use a binary vector C(i) to indicate which diseases circRNA i is associated with. The GIP kernel similarity between circRNAs C(i) and C(j) is computed by the following formula:

$$GC(i, j) = \exp\left(-\gamma_c \lVert C(i) - C(j) \rVert^2\right)$$

The parameter $\gamma_c$ adjusts the kernel bandwidth and is calculated by the following formula:

$$\gamma_c = \gamma' \Big/ \left( \frac{1}{n} \sum_{i=1}^{n} \lVert C(i) \rVert^2 \right)$$

where n is the number of circRNAs in our final dataset. The parameter γ' is set to 1 following the previous study [54], in which this value gave good performance.

3.5. Disease GIP Kernel Similarity

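We also calculate the GIP kernel similarity score between diseases i and j as follows:

$$GD(i, j) = \exp\left(-\gamma_d \lVert d(i) - d(j) \rVert^2\right)$$

$$\gamma_d = \gamma' \Big/ \left( \frac{1}{n} \sum_{i=1}^{n} \lVert d(i) \rVert^2 \right)$$

where d(i) and d(j) are the association profiles of diseases i and j, respectively, $\gamma_d$ is the kernel bandwidth, n is the number of diseases in our final dataset, and γ' is also set to 1 based on previous studies.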
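
The GIP kernel computation for either circRNAs or diseases can be sketched as follows, assuming the standard form in which the bandwidth is γ' divided by the mean squared norm of the association profiles. The function name `gip_kernel` and the toy matrix are illustrative assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel for a list of binary profiles."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are empty; bandwidth is undefined")
    gamma = gamma_prime / mean_sq_norm          # normalized kernel bandwidth
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [[math.exp(-gamma * sq_dist(profiles[i], profiles[j]))
             for j in range(n)] for i in range(n)]


# Toy association matrix: rows are diseases, columns are circRNAs.
M = [[1, 0, 0],
     [1, 0, 1]]
GD = gip_kernel(M)                                # disease profiles are rows
GC = gip_kernel([list(col) for col in zip(*M)])   # circRNA profiles are columns
```

Every diagonal entry equals 1 because each profile is at distance 0 from itself, and the kernel is dense: even pairs with no shared annotation receive a nonzero score.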

3.6. Combine Multiple Similarity (circRNA and Disease)

We integrate the GIP kernel similarity for circRNAs with the semantic similarity of circRNAs to construct the circRNA similarity network. Specifically, the elements of the adjacency matrix of this network are calculated as follows: We also integrate the GIP kernel similarity for diseases with the functional similarity of diseases to construct the disease similarity network. Specifically, the elements of the adjacency matrix of this network are calculated as follows:
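
One way to read this integration step, assuming the GIP kernel score is used only where the semantic (or functional) score is missing, is sketched below. This fill-in rule is an assumption made for illustration, suggested by the stated goal of covering missing similarity scores; averaging the two scores where both exist would be an equally simple variant.

```python
def integrate(primary, gip):
    """Use the primary similarity where it is nonzero, else fall back to GIP.

    primary: semantic (circRNA) or functional (disease) similarity matrix.
    gip: GIP kernel similarity matrix of the same shape.
    The fill-in rule itself is an assumption, not confirmed by the source.
    """
    n = len(primary)
    return [[primary[i][j] if primary[i][j] != 0 else gip[i][j]
             for j in range(n)] for i in range(n)]


# Toy matrices: the third item has no annotation-based similarity at all.
primary = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.0],
           [0.0, 0.0, 0.0]]
gip = [[1.0, 0.2, 0.7],
       [0.2, 1.0, 0.3],
       [0.7, 0.3, 1.0]]
combined = integrate(primary, gip)
```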

3.7. Constructing Heterogeneous Network

After we obtain the final disease similarity scores and circRNA similarity scores, we can construct an initial heterogeneous network, which is composed of the disease similarity network, the circRNA similarity network and the disease-circRNA association network. In this initial heterogeneous network, some edges have small weights, which may represent noise. Therefore, to weaken the effect of those unimportant or noisy edges, we set a threshold γ (γ is equal to 0.5 based on previous studies [26] and our experiment) to remove them. Specifically, let $P$ and $P_0$ be the adjacency matrices of the final and the initial heterogeneous networks, respectively; then we have:

$$P(i, j) = \begin{cases} P_0(i, j), & \text{if } P_0(i, j) \ge \gamma \\ 0, & \text{otherwise} \end{cases}$$
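
The assembly and thresholding of the heterogeneous network can be sketched as below. The node ordering (circRNAs first, then diseases), the omission of self-loops and the function name `build_network` are assumptions for illustration; edges with weight below γ are set to 0, following the description above.

```python
def build_network(csim, dsim, M, gamma=0.5):
    """Weighted adjacency matrix over circRNA nodes followed by disease nodes.

    csim: circRNA similarity (nc x nc); dsim: disease similarity (nd x nd);
    M: association matrix with M[i][j] = 1 for disease i and circRNA j.
    """
    nc, nd = len(csim), len(dsim)
    size = nc + nd
    P0 = [[0.0] * size for _ in range(size)]
    for a in range(nc):
        for b in range(nc):
            if a != b:                       # no self-loops (assumption)
                P0[a][b] = csim[a][b]
    for i in range(nd):
        for k in range(nd):
            if i != k:
                P0[nc + i][nc + k] = dsim[i][k]
        for j in range(nc):
            P0[nc + i][j] = P0[j][nc + i] = float(M[i][j])
    # Remove edges whose weight is below the threshold gamma.
    return [[w if w >= gamma else 0.0 for w in row] for row in P0]


# Toy inputs, made up for illustration.
csim = [[1.0, 0.8, 0.3], [0.8, 1.0, 0.6], [0.3, 0.6, 1.0]]
dsim = [[1.0, 0.4], [0.4, 1.0]]
M = [[1, 0, 0], [1, 0, 1]]
P = build_network(csim, dsim, M)
```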

3.8. Performance Metrics

In this study, we adopt the AUC value to measure the prediction results. The AUC is the area under the ROC curve, which plots the true positive rate (TPR) versus the false positive rate (FPR). The following equations are used to calculate the TPR and FPR:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}$$

where TP is the number of positive samples (known associations) that are identified correctly, and TN is the number of negative samples (unknown associations) that are identified correctly. FP is the number of negative samples that are incorrectly identified as positive, while FN is the number of positive samples that are incorrectly identified as negative.
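
To make the metric concrete, the sketch below sweeps a threshold over a list of scores, collects the (FPR, TPR) points and integrates them with the trapezoidal rule. The function names and the four toy scores are illustrative; the labels are assumed to contain at least one positive and one negative sample.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy scores: label 1 marks a known association, 0 an unknown pair.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels)
area = auc(points)
```

In this toy case three of the four positive-negative pairs are ranked correctly, which matches the computed area of 0.75.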

3.9. PWCDA

In this study, we propose a novel computational model called PWCDA (a Path-Weighted CircRNA-Disease Associations method) to predict potential associations between circRNAs and diseases. The framework of our method is depicted in Figure 3. PWCDA traverses the nodes along each path in the heterogeneous network without visiting any node twice. To avoid traversing the same node repeatedly, we adopt the depth-first search (DFS) algorithm and mark the traversed nodes during each pass. Depth-first search is implemented as a recursive function that traverses the graph by moving along its edges. We modify it to mark nodes as they are visited in the recursion and to remove the marks before returning from each recursive call. We set the maximum search length η to 3 steps according to previous studies [26]. That is, for circRNA i and disease j, there are several kinds of paths: circRNA i connecting to disease j directly; a neighbor circRNA of circRNA i connecting to disease j, or circRNA i connecting to a neighbor disease of disease j; and a neighbor circRNA of circRNA i connecting to a neighbor disease of disease j. The choice of these paths is based on the hypothesis that the larger the similarity score between two circRNAs, the higher the probability that they share the same associations. The final score of each circRNA-disease pair is obtained by summing the weights of all such paths connecting the pair.
Figure 3

The flowchart of PWCDA is illustrated by five main steps. Step 1: Calculate circRNA semantic similarity and disease similarity scores, respectively. Step 2: Calculate GIP Kernel similarity scores for circRNAs and diseases. Step 3: Integrate circRNA (disease) semantic (functional) similarity with circRNA/disease GIP Kernel similarity, respectively. Step 4: Construct the heterogeneous network. Step 5: Calculate an association score for each circRNA-disease pair.

The more paths there are between circRNA j and disease i, the greater the predictive score they obtain. Accordingly, the set of paths that connect circRNA $C_j$ to disease $d_i$ can be represented as {p1, p2, …, pm}, where m is the number of paths connecting disease $d_i$ and circRNA $C_j$ whose length does not exceed η. The final predictive score of $C_j$ and $d_i$ is calculated as follows:

$$Score(C_j, d_i) = \sum_{k=1}^{m} S(p_k)$$

where $S(p)$ is the score of the path $p = \{e_1, e_2, \ldots, e_n\}$ [42], which is calculated from the weights $w(e_k)$ of its edges as follows:

$$S(p) = \left( \prod_{k=1}^{n} w(e_k) \right)^{decay(p)}$$

The longer a path is, the smaller its contribution, which means that a longer path has less effect on predicting potential circRNA-disease associations than a shorter one. Therefore, the decaying function is an exponential function that reduces the influence of long paths on the final prediction scores, which can be represented as Equation (14):

$$decay(p) = \alpha \times \exp(len(p)) \tag{14}$$

where α is a constraint factor and len(p) is the length of path p. An example of calculating the score between circRNA c1 and disease d2 is shown in Figure 4. In Figure 4, three paths, {c1-c4-d2}, {c1-c3-d1-d2} and {c1-c5-d3-d2}, which are marked in red, are used to calculate the score between c1 and d2. The score of c1 and d2 is therefore the sum of the scores of these three paths:

$$Score(c_1, d_2) = (w_2 \times w_5)^{3 \exp(2)} + (w_1 \times w_4 \times w_7)^{3 \exp(3)} + (w_3 \times w_6 \times w_8)^{3 \exp(3)}$$

There are also other paths that connect c1 with d2. Because those paths, such as {c1-c2-c5-d3-d2}, are longer than 3, we do not consider them.
Figure 4

The paths between c1 and d2 that are within the maximum path length.
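
The path search and scoring can be sketched with a recursive depth-first search that marks nodes on entry and unmarks them on return. The sketch assumes that the score of a path is the product of its edge weights raised to the power α × exp(len(p)), which is the reading suggested by the worked example; the function names and the toy edge weights are made up and do not come from the paper.

```python
import math


def undirected(edges):
    """Build a dict-of-dicts graph from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def path_score(graph, source, target, max_len=3, alpha=1.0):
    """Sum the scores of all simple paths from source to target with at most
    max_len edges, found by depth-first search."""
    total = 0.0
    visited = {source}

    def dfs(node, product, length):
        nonlocal total
        if node == target:
            # Assumed decay: exponent alpha * exp(path length).
            total += product ** (alpha * math.exp(length))
            return
        if length == max_len:
            return
        for nxt, w in graph[node].items():
            if nxt not in visited and w > 0:
                visited.add(nxt)              # mark before descending
                dfs(nxt, product * w, length + 1)
                visited.discard(nxt)          # unmark before returning

    dfs(source, 1.0, 0)
    return total


# Toy network with the same shape as the worked example; weights are made up.
graph = undirected([
    ("c1", "c4", 0.8), ("c4", "d2", 1.0),
    ("c1", "c3", 0.6), ("c3", "d1", 1.0), ("d1", "d2", 0.9),
    ("c1", "c5", 0.7), ("c5", "d3", 1.0), ("d3", "d2", 0.5),
    ("c1", "c2", 0.9), ("c2", "c5", 0.9),
])
score = path_score(graph, "c1", "d2")
```

The four-edge path c1-c2-c5-d3-d2 is never scored, because the search stops extending a path once it has three edges.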

4. Conclusions

As more and more diseases are found to be related to circRNAs, researchers are paying increasing attention to disease-related circRNAs. Although experimental methods can find potential circRNA-disease associations with high precision, the process is both time-consuming and expensive. Here, we have proposed an effective computational method called PWCDA, which can predict potential circRNA-disease associations. First, we calculate disease/circRNA similarities by combining their functional/semantic similarity with GIP kernel similarity. Second, we build a heterogeneous network comprising the circRNA-disease association sub-network, the disease similarity sub-network and the circRNA similarity sub-network. PWCDA then searches all paths within three steps to compute an association score for each circRNA-disease pair and so determine whether the pair is associated. To thoroughly investigate the performance of our proposed method, we adopt LOOCV and five-fold cross validation. We have also compared our method with two state-of-the-art prediction methods, and the comparison shows that our method works much better than they do. The AUC value of five-fold cross validation is 0.884. Moreover, we apply our method to three diseases (Breast Cancer, Gastric Cancer and Colorectal Cancer) as case studies. Several factors may explain why our proposed method performs better than other computational models. First, we take into account the sparsity of the disease/circRNA similarity sub-networks by integrating disease functional similarity scores and circRNA semantic similarity scores with their corresponding GIP kernel similarity scores. Second, following previous studies, we use only the paths within three steps, which reduces noisy information.
Although we have combined different similarity scores, some information is still unavailable. Therefore, we set a threshold and remove the edges whose weights are less than it. Although our method performs much better than other computational models, its limitations cannot be ignored. The prediction of associations between circRNAs and diseases is a relatively new research field, and the amount of usable data is limited. The ratio of positive samples to negative samples among circRNA-disease associations is seriously unbalanced. There are two main ways to address this problem. One is to update the circRNA-disease database to obtain new data. The other is to balance the data by using the same number of positive and negative samples. Furthermore, our computational method tends to predict circRNA-disease associations that are already covered by the known association dataset, and it predicts relatively few novel circRNA-disease associations. We will therefore incorporate more biological data to overcome this weakness. As a future topic, this work can be applied to disease diagnosis based on network biomarkers [55,56,57] and to disease prediction based on dynamic network biomarkers [58,59,60] in an accurate and reliable manner.
  59 in total

1.  Measures of semantic similarity and relatedness in the biomedical domain.

Authors:  Ted Pedersen; Serguei V S Pakhomov; Siddharth Patwardhan; Christopher G Chute
Journal:  J Biomed Inform       Date:  2006-06-10       Impact factor: 6.317

2.  Survey: Enhancing protein complex prediction in PPI networks with GO similarity weighting.

Authors:  True Price; Francisco I Peña; Young-Rae Cho
Journal:  Interdiscip Sci       Date:  2013-12-04       Impact factor: 2.233

3.  Diagnosing phenotypes of single-sample individuals by edge biomarkers.

Authors:  Wanwei Zhang; Tao Zeng; Xiaoping Liu; Luonan Chen
Journal:  J Mol Cell Biol       Date:  2015-04-26       Impact factor: 6.216

4.  Circular RNA profile identifies circPVT1 as a proliferative factor and prognostic marker in gastric cancer.

Authors:  Jie Chen; Yan Li; Qiupeng Zheng; Chunyang Bao; Jian He; Bin Chen; Dongbin Lyu; Biqiang Zheng; Yu Xu; Ziwen Long; Ye Zhou; Huiyan Zhu; Yanong Wang; Xianghuo He; Yingqiang Shi; Shenglin Huang
Journal:  Cancer Lett       Date:  2016-12-13       Impact factor: 8.679

5.  Circular intronic long noncoding RNAs.

Authors:  Yang Zhang; Xiao-Ou Zhang; Tian Chen; Jian-Feng Xiang; Qing-Fei Yin; Yu-Hang Xing; Shanshan Zhu; Li Yang; Ling-Ling Chen
Journal:  Mol Cell       Date:  2013-09-12       Impact factor: 17.970

6.  DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes.

Authors:  Janet Piñero; Núria Queralt-Rosinach; Àlex Bravo; Jordi Deu-Pons; Anna Bauer-Mehren; Martin Baron; Ferran Sanz; Laura I Furlong
Journal:  Database (Oxford)       Date:  2015-04-15       Impact factor: 3.451

7.  Dynamic network biomarker indicates pulmonary metastasis at the tipping point of hepatocellular carcinoma.

Authors:  Biwei Yang; Meiyi Li; Wenqing Tang; Weixin Liu; Si Zhang; Luonan Chen; Jinglin Xia
Journal:  Nat Commun       Date:  2018-02-14       Impact factor: 14.919

8.  Drug-Target Interaction Prediction through Label Propagation with Linear Neighborhood Information.

Authors:  Wen Zhang; Yanlin Chen; Dingfang Li
Journal:  Molecules       Date:  2017-11-25       Impact factor: 4.411

9.  Inhibition of RNA lariat debranching enzyme suppresses TDP-43 toxicity in ALS disease models.

Authors:  Maria Armakola; Matthew J Higgins; Matthew D Figley; Sami J Barmada; Emily A Scarborough; Zamia Diaz; Xiaodong Fang; James Shorter; Nevan J Krogan; Steven Finkbeiner; Robert V Farese; Aaron D Gitler
Journal:  Nat Genet       Date:  2012-10-28       Impact factor: 38.330

10.  Predicting disease associations via biological network analysis.

Authors:  Kai Sun; Joana P Gonçalves; Chris Larminie; Nataša Przulj
Journal:  BMC Bioinformatics       Date:  2014-09-17       Impact factor: 3.169

