Dong Ouyang, Rui Miao, Jianjun Wang, Xiaoying Liu, Shengli Xie, Ning Ai, Qi Dang, Yong Liang.
Abstract
Many studies have indicated that miRNAs contribute to the occurrence and development of diseases through a variety of underlying mechanisms. Computational models can save time, reduce cost, and discover potential associations on a large scale. However, most existing computational models based on matrix or tensor decomposition cannot recover positive samples well. Moreover, the high noise in biological similarity networks and the need to preserve similarity relationships in a low-dimensional space remain challenging. To this end, we propose a novel computational framework, called WeightTDAIGN, to identify potential multiple types of miRNA-disease associations. WeightTDAIGN recovers positive samples well and improves prediction performance by weighting positive samples. It integrates more auxiliary information related to miRNAs and diseases into the tensor decomposition framework, focuses on learning a low-rank tensor space, and constrains the projection matrices with the L2,1 norm to reduce the impact of redundant information on the model. In addition, WeightTDAIGN preserves the local structure information in the biological similarity network by introducing graph Laplacian regularization. Our experimental results show that the sparser the dataset, the more satisfactory the performance of WeightTDAIGN. The results of case studies further illustrate that WeightTDAIGN can accurately predict miRNA-disease-type associations.
Keywords: L2,1 norm; graph Laplacian regularization; multi-view biological similarity network; multiple types of miRNA–disease associations; weighted tensor decomposition
Year: 2022 PMID: 35910021 PMCID: PMC9335924 DOI: 10.3389/fbioe.2022.911769
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
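Two ingredients named in the abstract can be stated concretely: the L2,1 norm (the sum of the Euclidean norms of a matrix's rows) and a reconstruction error in which positive entries are up-weighted. The following is a minimal NumPy sketch under stated assumptions. The function names, the multiplicative form of the weighting, the `weight` value, and the toy arrays are illustrative and are not taken from the paper, whose exact objective function is not reproduced in this record.

```python
import numpy as np

def l21_norm(P):
    """L2,1 norm: sum of the Euclidean (L2) norms of the rows of P."""
    return float(np.sqrt((P ** 2).sum(axis=1)).sum())

def weighted_squared_error(X, X_hat, weight):
    """Squared reconstruction error with positive entries (X == 1) scaled by `weight`.

    `weight` is a hypothetical hyperparameter; zero entries keep weight 1.
    """
    W = np.where(X == 1, weight, 1.0)
    return float((W * (X - X_hat) ** 2).sum())

# Toy projection matrix whose rows have norms 5, 0 and 1
P = np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 1.0]])
print(l21_norm(P))  # 6.0

# Toy binary tensor (2 miRNAs x 2 diseases x 1 type) and a constant reconstruction
X = np.array([[[1.0], [0.0]], [[0.0], [0.0]]])
X_hat = np.full_like(X, 0.5)
print(weighted_squared_error(X, X_hat, weight=1.0))  # 1.0
print(weighted_squared_error(X, X_hat, weight=4.0))  # 1.75
```

Because the L2,1 norm penalizes whole rows, it tends to drive entire rows toward zero, which is one way to read the abstract's statement that it reduces the impact of redundant information.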
Statistics of all datasets used in this study.
| Dataset | #miRNA | #Disease | #Type | #Association | #Density (%) |
|---|---|---|---|---|---|
| MDA v2.0-2 | 211 | 59 | 4 | 1,410 | 2.83 |
| MDA v2.0-3 | 69 | 25 | 4 | 586 | 8.49 |
| MDA v2.0-4 | 40 | 20 | 4 | 347 | 10.84 |
| MDA v3.2-5 | 125 | 65 | 5 | 4,785 | 11.78 |
#Disease, number of diseases; #miRNA, number of miRNAs; #Association, number of associations; #Type, number of association types; #Density, density of the association tensor (percentage of entries that are known associations).
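The density column is consistent with dividing the number of associations by the number of cells in the miRNA × disease × type tensor. This reading is inferred from the table values rather than stated in the source; the short check below, with a hypothetical helper name, recomputes each value from the other columns.

```python
# (miRNAs, diseases, types, associations, reported density in %) copied from the table
datasets = {
    "MDA v2.0-2": (211, 59, 4, 1410, 2.83),
    "MDA v2.0-3": (69, 25, 4, 586, 8.49),
    "MDA v2.0-4": (40, 20, 4, 347, 10.84),
    "MDA v3.2-5": (125, 65, 5, 4785, 11.78),
}

def density_percent(n_mirna, n_disease, n_type, n_assoc):
    """Percentage of tensor cells that hold a known association."""
    return 100.0 * n_assoc / (n_mirna * n_disease * n_type)

for name, (m, d, t, a, reported) in datasets.items():
    computed = density_percent(m, d, t, a)
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}%")
```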
FIGURE 1. The workflow of our proposed WeightTDAIGN model for predicting potential multiple types of miRNA–disease associations. (A) Multi-view miRNA and disease similarity networks are incorporated into tensor decomposition; Gipk denotes the Gaussian interaction profile kernel. (B) One slice of the tensor is taken as an example to show how weights are assigned to positive samples. (C) If the similarity S between two miRNAs (or two diseases) is high, the embeddings of the two corresponding nodes will be very similar in the low-dimensional embedding space (that is, the nodes have the same color).
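Panels (A) and (C) can be illustrated with a small sketch. It assumes the commonly used form of the Gaussian interaction profile kernel (bandwidth normalized by the mean squared norm of the interaction profiles) and the unnormalized graph Laplacian L = D − S. The bandwidth parameter `gamma_prime`, the toy association matrix, and the random embeddings are hypothetical, and the paper's exact settings may differ.

```python
import numpy as np

def gip_kernel(A, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix A."""
    sq_norms = (A ** 2).sum(axis=1)
    gamma = gamma_prime / sq_norms.mean()  # bandwidth normalized by mean profile norm
    # Pairwise squared Euclidean distances between interaction profiles
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * A @ A.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def laplacian_penalty(S, U):
    """tr(U^T L U) with L = D - S, the unnormalized graph Laplacian of S."""
    L = np.diag(S.sum(axis=1)) - S
    return float(np.trace(U.T @ L @ U))

# Toy miRNA-disease association matrix (3 miRNAs x 4 diseases), illustrative only
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
S = gip_kernel(A)

rng = np.random.default_rng(0)
U = rng.normal(size=(3, 2))  # hypothetical low-dimensional embeddings

# The trace form equals half the similarity-weighted sum of squared embedding distances
pairwise = 0.5 * sum(S[i, j] * np.sum((U[i] - U[j]) ** 2)
                     for i in range(3) for j in range(3))
print(np.isclose(laplacian_penalty(S, U), pairwise))  # True
```

Minimizing this penalty pulls the embeddings of highly similar nodes together, which is the behavior panel (C) depicts.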
FIGURE 2. The influence of different hyperparameters on WeightTDAIGN based on the MDA v2.0-4 dataset. (A) The impact of hyperparameters α and β on WeightTDAIGN and (B) the impact of hyperparameter r on WeightTDAIGN. Note that, to facilitate visualization in panel (A), we use 2^n to represent 2 × 10^n when n < 0.
FIGURE 3. The influence of different hyperparameters on WeightTDAIGN based on the MDA v2.0-4 dataset. (A) The impact of hyperparameter r′ on WeightTDAIGN and (B) the impact of hyperparameter episode on WeightTDAIGN.
FIGURE 4. The influence of different hyperparameters on WeightTDAIGN based on the MDA v2.0-4 dataset. (A) The impact of hyperparameter weight on WeightTDAIGN and (B) the impact of hyperparameter k on WeightTDAIGN.
The performance of all models evaluated by 5-fold cross-validation under CV .
| Dataset | Model | AUC | AUPR | F1 | MSE |
|---|---|---|---|---|---|
| MDA v2.0-2 | RBMMMDA | 0.855252 | 0.845632 | 0.791036 | 0.819910 |
| — | CP | 0.912348 | 0.924856 | 0.834056 | 0.538964 |
| — | TFAI | 0.911866 | 0.924786 | 0.835192 | 0.540715 |
| — | TDRC | 0.900086 | 0.915792 | 0.822848 | 0.559964 |
| — | TDAI | 0.908280 | 0.920506 | 0.828144 | 0.549998 |
| — | TDAIGN | 0.910948 | 0.923750 | 0.829668 | 0.539089 |
| — | WeightTDAIGN | | | | |
| MDA v2.0-3 | RBMMMDA | 0.787792 | 0.777968 | 0.744724 | 0.734902 |
| — | CP | 0.944596 | 0.955340 | 0.875980 | 0.298760 |
| — | TFAI | 0.936446 | 0.947686 | 0.861838 | 0.357501 |
| — | TDRC | 0.931564 | 0.942962 | 0.852522 | 0.362024 |
| — | TDAI | 0.944558 | 0.955166 | 0.875612 | 0.308545 |
| — | TDAIGN | 0.948034 | 0.957388 | 0.876888 | 0.307712 |
| — | WeightTDAIGN | | | | |
| MDA v2.0-4 | RBMMMDA | 0.790984 | 0.781692 | 0.746280 | 0.674311 |
| — | CP | 0.930410 | 0.943304 | 0.852792 | 0.322797 |
| — | TFAI | 0.933182 | 0.944776 | 0.853160 | 0.323449 |
| — | TDRC | 0.928236 | 0.940798 | 0.852378 | 0.338039 |
| — | TDAI | 0.933482 | 0.944868 | 0.856652 | 0.323383 |
| — | TDAIGN | 0.933978 | 0.945050 | 0.856182 | 0.323505 |
| — | WeightTDAIGN | | | | |
| MDA v3.2-5 | RBMMMDA | 0.857432 | 0.852302 | 0.787936 | 0.568752 |
| — | CP | 0.862516 | 0.867546 | 0.789574 | 0.514538 |
| — | TFAI | 0.862770 | 0.867616 | 0.790276 | 0.514886 |
| — | TDRC | 0.861966 | 0.866272 | 0.790432 | 0.512712 |
| — | TDAI | 0.862536 | 0.867434 | 0.789696 | 0.514705 |
| — | TDAIGN | 0.862640 | 0.867566 | 0.789866 | 0.514937 |
| — | WeightTDAIGN | | | | |
Because the open-source web server is no longer available, the results reported here were obtained with our re-implementation of the original algorithms.
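For readers unfamiliar with the metrics in the table above, AUC can be read as the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, and MSE as the mean squared difference between labels and scores. The sketch below computes both on made-up labels and scores. It does not reproduce the paper's cross-validation protocol or its negative sampling, which are not described in this record.

```python
def auc_score(labels, scores):
    """Fraction of positive-negative pairs ranked correctly (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mse(labels, scores):
    """Mean squared error between binary labels and predicted scores."""
    return sum((y - s) ** 2 for y, s in zip(labels, scores)) / len(labels)

# Made-up example: three positives and three negatives
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auc_score(labels, scores))  # 8 of the 9 positive-negative pairs are ordered correctly
print(mse(labels, scores))
```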
The performance of all models evaluated by 5-fold cross-validation under CV .
| Dataset | Model | Top-1 precision | Top-1 recall | Top-1 F1 |
|---|---|---|---|---|
| MDA v2.0-2 | RBMMMDA | 0.368033 | 0.318645 | 0.323771 |
| — | CP | 0.584426 | 0.505831 | 0.528142 |
| — | TFAI | 0.587705 | 0.508327 | 0.526776 |
| — | TDRC | 0.585246 | 0.506537 | 0.529235 |
| — | TDAI | 0.575410 | 0.498121 | 0.524454 |
| — | TDAIGN | 0.609016 | 0.527208 | 0.550273 |
| — | WeightTDAIGN | | | |
| MDA v2.0-3 | RBMMMDA | 0.390437 | 0.309279 | 0.318581 |
| — | CP | 0.568611 | 0.452077 | 0.492431 |
| — | TFAI | 0.583551 | 0.463528 | 0.492374 |
| — | TDRC | 0.585770 | 0.466194 | 0.495993 |
| — | TDAI | 0.572821 | 0.454553 | 0.492710 |
| — | TDAIGN | 0.585747 | 0.465819 | 0.500618 |
| — | WeightTDAIGN | | | |
| MDA v2.0-4 | RBMMMDA | 0.359857 | 0.269427 | 0.277623 |
| — | CP | 0.544385 | 0.409100 | 0.433500 |
| — | TFAI | 0.555865 | 0.416682 | 0.443161 |
| — | TDRC | 0.577683 | 0.434773 | 0.467879 |
| — | TDAI | 0.569554 | 0.425020 | 0.454890 |
| — | TDAIGN | 0.577398 | 0.431664 | 0.462733 |
| — | WeightTDAIGN | | | |
| MDA v3.2-5 | RBMMMDA | 0.542408 | 0.325892 | 0.347250 |
| — | CP | 0.581700 | 0.349530 | 0.377299 |
| — | TFAI | 0.585528 | 0.351849 | 0.381735 |
| — | TDRC | 0.597008 | 0.358833 | 0.388722 |
| — | TDAI | 0.580311 | 0.348687 | 0.375127 |
| — | TDAIGN | 0.588309 | 0.353523 | 0.383068 |
| — | WeightTDAIGN | | | |
Because the open-source web server is no longer available, the results reported here were obtained with our re-implementation of the original algorithms.
Top 50 disease-related miRNAs predicted by WeightTDAIGN based on MDA v2.0-4.
| miRNA | Disease | Type | Score | PMID | miRNA | Disease | Type | Score | PMID |
|---|---|---|---|---|---|---|---|---|---|
| hsa-mir-34a | Colorectal neoplasms | Target | 1.424456 | 24370784 | hsa-mir-200b | Prostatic neoplasms | Target | 1.028036 | 21224847 |
| hsa-mir-17 | Carcinoma and hepatocellular | Target | 1.325876 | 23418359 | hsa-mir-124-1 | Breast neoplasms | Target | 1.003782 | 22085528 |
| hsa-mir-145 | Breast neoplasms | Target | 1.307542 | 19360360 | hsa-mir-19a | Carcinoma and hepatocellular | Target | 0.985647 | 28724429 |
| hsa-mir-125b-1 | Carcinoma and hepatocellular | Target | 1.289963 | 22293115 | hsa-mir-17 | Melanoma | Circulation | 0.983089 | 20529253 |
| hsa-mir-125b-1 | Breast neoplasms | Target | 1.267228 | 19738052 | hsa-mir-19b-1 | Carcinoma and hepatocellular | Target | 0.975428 | 17188425 |
| hsa-mir-15a | Leukemia, lymphocytic, chronic, and B cell | Target | 1.258953 | 19498445 | hsa-mir-200c | Carcinoma and hepatocellular | Epigenetics | 0.974930 | 23222811 |
| hsa-mir-125b-2 | Breast neoplasms | Target | 1.230360 | 19738052 | hsa-mir-16-1 | Multiple myeloma | Target | 0.967225 | 23104180 |
| hsa-mir-34a | Breast neoplasms | Target | 1.227899 | 21814748 | hsa-mir-18a | Breast neoplasms | Genetics | 0.957834 | 16754881 |
| hsa-mir-16-1 | Leukemia, lymphocytic, chronic, and B cell | Target | 1.199684 | 19498445 | hsa-mir-29c | Breast neoplasms | Target | 0.956380 | 22330642 |
| hsa-mir-125b-2 | Carcinoma and hepatocellular | Target | 1.166586 | 22293115 | hsa-mir-200c | Breast neoplasms | Target | 0.948673 | 23209748 |
| hsa-mir-200b | Carcinoma and hepatocellular | Epigenetics | 1.162828 | 22370893 | hsa-mir-29b-1 | Breast neoplasms | Target | 0.939779 | 22864815 |
| hsa-mir-221 | Breast neoplasms | Target | 1.152046 | 21868360 | hsa-mir-15a | Multiple myeloma | Target | 0.937730 | 23104180 |
| hsa-mir-19b-1 | Breast neoplasms | Genetics | 1.130424 | 16754881 | hsa-mir-16-1 | Breast neoplasms | Target | 0.933405 | 19250063 |
| hsa-mir-200a | Carcinoma and hepatocellular | Epigenetics | 1.127118 | 21837748 | hsa-mir-200a | Prostatic neoplasms | Target | 0.929946 | 21224847 |
| hsa-mir-200a | Breast neoplasms | Target | 1.118853 | 21926171 | hsa-mir-18a | Carcinoma and hepatocellular | Genetics | 0.925369 | 15944709 |
| hsa-mir-124-2 | Carcinoma and hepatocellular | Target | 1.110440 | 21672940 | hsa-mir-17 | Colorectal neoplasms | Epigenetics | 0.924489 | 22308110 |
| hsa-mir-124-3 | Carcinoma and hepatocellular | Target | 1.110440 | 21672940 | hsa-mir-16-1 | Prostatic neoplasms | Genetics | 0.922385 | 17940623 |
| hsa-mir-19a | Breast neoplasms | Genetics | 1.079414 | 16754881 | hsa-mir-16-1 | Carcinoma and hepatocellular | Target | 0.916582 | 23226427 |
| hsa-mir-124-2 | Breast neoplasms | Target | 1.078512 | 22085528 | hsa-mir-218-1 | Breast neoplasms | Genetics | 0.913360 | 16754881 |
| hsa-mir-124-3 | Breast neoplasms | Target | 1.078512 | 22085528 | hsa-mir-124-1 | Carcinoma and hepatocellular | Target | 0.912254 | 21672940 |
| hsa-mir-200c | Stomach neoplasms | Target | 1.057460 | 25986864 | hsa-mir-16-2 | Carcinoma and hepatocellular | Target | 0.910880 | 23226427 |
| hsa-mir-200c | Prostatic neoplasms | Target | 1.054738 | 21224847 | hsa-mir-15a | Prostatic neoplasms | Genetics | 0.906294 | 17940623 |
| hsa-mir-17 | Breast neoplasms | Genetics | 1.049201 | 16754881 | hsa-mir-133a-2 | Colorectal neoplasms | Epigenetics | 0.906253 | 22766685 |
| hsa-mir-126 | Carcinoma and non-small cell lung | Circulation | 1.046761 | 22009180 | hsa-mir-34a | Prostatic neoplasms | Target | 0.892918 | 21240262 |
| hsa-mir-31 | Breast neoplasms | Epigenetics | 1.045268 | 22289355 | hsa-mir-200b | Breast neoplasms | Target | 0.889298 | 20514023 |
FIGURE 5. The association network of the top 50 predictions for miRNAs with type as the target in breast neoplasms. (A) Predicted association between miRNAs and breast neoplasms. (B) Functional similarity network between miRNAs associated with breast neoplasms. Darker colors indicate higher similarity between miRNAs. The similarity values range from 0.5 to 1.
FIGURE 6. The enrichment analysis of miRNA target gene sets. (A) The statistical significance of target gene sets associated with hsa-mir-218-1. (B) The statistical significance of target gene sets associated with hsa-mir-218-2.