Xiongfei Tian1, Ling Shen1, Pengfei Gao2, Li Huang3,4, Guangyi Liu1, Liqian Zhou1, Lihong Peng1,2.
Abstract
Coronavirus disease 2019 (COVID-19) is spreading rapidly, and researchers around the world are searching for treatment clues. Drug repositioning, a rapid and cost-effective way to find therapeutic options among available FDA-approved drugs, has been applied to drug discovery for COVID-19. In this study, we develop a novel drug repositioning method (VDA-KLMF) to prioritize possible anti-SARS-CoV-2 drugs by integrating virus sequences, drug chemical structures, known Virus-Drug Associations (VDAs), and Logistic Matrix Factorization with Kernel diffusion. First, Gaussian kernels of viruses and drugs are built based on known VDAs and nearest neighbors. Second, a sequence similarity kernel of viruses and a chemical structure similarity kernel of drugs are constructed based on biological features and an identity matrix. Third, the Gaussian kernel and the similarity kernel are diffused. Fourth, a logistic matrix factorization model with kernel diffusion is proposed to identify potential anti-SARS-CoV-2 drugs. Finally, molecular dockings between the inferred antiviral drugs and the junction of the SARS-CoV-2 spike protein-ACE2 interface are performed to investigate their binding abilities. VDA-KLMF is compared with two state-of-the-art VDA prediction models (VDA-KATZ and VDA-RWR) and three classical association prediction methods (NGRHMDA, LRLSHMDA, and NRLMF) based on 5-fold cross validations on viruses, drugs, and VDAs on three datasets. It obtains the best recalls, AUCs, and AUPRs, significantly outperforming the other five methods under the three different cross validations. We observe that four chemical agents, each predicted on two of the three datasets, namely remdesivir, ribavirin, nitazoxanide, and emetine, may be clues for the treatment of COVID-19. The docking results suggest that the key residues K353 and G496 may affect the binding energies and dynamics between the inferred anti-SARS-CoV-2 chemical agents and the junction of the spike protein-ACE2 interface.
Integrating various biological data, Gaussian kernel, similarity kernel, and logistic matrix factorization with kernel diffusion, this work demonstrates that a few chemical agents may assist in drug discovery for COVID-19.
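The first step of the abstract (Gaussian kernels built from known VDAs) can be illustrated with a minimal sketch. The abstract does not give the kernel formula, so the code below assumes the commonly used Gaussian interaction profile form, with the bandwidth normalized by the mean squared norm of the association profiles; the function name `gaussian_profile_kernel` and the toy matrix are hypothetical, and the nearest-neighbor and diffusion steps are not covered.

```python
import numpy as np

def gaussian_profile_kernel(Y: np.ndarray) -> np.ndarray:
    """Gaussian kernel between the rows of a binary association matrix.

    Assumption: bandwidth = 1 / (mean squared norm of the row profiles),
    a common choice that is NOT confirmed by the abstract.
    """
    Y = np.asarray(Y, dtype=float)
    sq_norms = (Y ** 2).sum(axis=1)               # ||y_i||^2 for each row
    mean_sq = sq_norms.mean()
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0
    # Pairwise squared Euclidean distances between row profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    sq_dists = np.maximum(sq_dists, 0.0)          # guard against round-off
    return np.exp(-gamma * sq_dists)

# Toy virus-by-drug association matrix (3 viruses, 4 drugs), made up for illustration.
Y = np.array([[1, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1]])

K_virus = gaussian_profile_kernel(Y)      # virus-virus kernel from row profiles
K_drug = gaussian_profile_kernel(Y.T)     # drug-drug kernel from column profiles
print(np.round(K_virus, 3))
```

In this toy matrix the first two viruses share a drug and the third shares none with them, so the kernel assigns the first pair a higher similarity than either has with the third.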
Keywords: anti-SARS-CoV-2 drug; kernel diffusion; logistic matrix factorization; molecular docking; virus-drug association
Year: 2022 PMID: 35295301 PMCID: PMC8919055 DOI: 10.3389/fmicb.2022.740382
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Statistics for three VDA networks.
| Datasets | Viruses | Drugs | VDAs |
| Dataset 1 | 12 | 78 | 96 |
| Dataset 2 | 69 | 128 | 770 |
| Dataset 3 | 34 | 203 | 407 |
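As a quick reading aid, the density of each network (known VDAs divided by all possible virus-drug pairs) follows from the counts in the table above. The snippet is a small arithmetic sketch; the helper name `density` is illustrative only.

```python
# Counts taken from the table above: (viruses, drugs, known VDAs).
datasets = {
    "Dataset 1": (12, 78, 96),
    "Dataset 2": (69, 128, 770),
    "Dataset 3": (34, 203, 407),
}

def density(n_viruses: int, n_drugs: int, n_vdas: int) -> float:
    """Fraction of all virus-drug pairs that are known associations."""
    return n_vdas / (n_viruses * n_drugs)

densities = {name: density(*counts) for name, counts in datasets.items()}
for name, value in densities.items():
    print(f"{name}: {value:.4f}")
```

The three networks come out at roughly 10%, 9%, and 6% density, so known associations are a small minority of all pairs in every dataset.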
FIGURE 1. The flowchart of the VDA-KLMF framework.
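The logistic matrix factorization component of the framework can be sketched in its plainest form: each association probability is the sigmoid of an inner product between a virus latent vector and a drug latent vector, fitted by gradient ascent on the regularized log-likelihood. This is a generic illustration under stated assumptions, not the VDA-KLMF model itself; the kernel diffusion terms, any weighting of positive samples, and the hyperparameters (`rank`, `lr`, `reg`, `epochs`) are all hypothetical simplifications.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def fit_logistic_mf(Y, rank=2, lr=0.1, reg=0.01, epochs=500, seed=0):
    """Plain logistic matrix factorization (no kernel terms).

    Maximizes sum(Y*log(P) + (1-Y)*log(1-P)) - reg/2 * (|U|^2 + |V|^2),
    where P = sigmoid(U @ V.T).
    """
    Y = np.asarray(Y, dtype=float)
    rng = np.random.default_rng(seed)
    n_viruses, n_drugs = Y.shape
    U = 0.1 * rng.standard_normal((n_viruses, rank))   # virus latent factors
    V = 0.1 * rng.standard_normal((n_drugs, rank))     # drug latent factors
    for _ in range(epochs):
        P = sigmoid(U @ V.T)
        G = Y - P                       # gradient of the log-likelihood w.r.t. the logits
        U_next = U + lr * (G @ V - reg * U)
        V = V + lr * (G.T @ U - reg * V)
        U = U_next
    return sigmoid(U @ V.T)

# Toy virus-by-drug association matrix, made up for illustration.
Y = np.array([[1, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1]])
scores = fit_logistic_mf(Y)
print(np.round(scores, 2))
```

Unknown pairs are then ranked by their predicted probability; in a repositioning setting the highest-scoring unknown pairs for a given virus are the candidate drugs.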
The optimal parameter combinations of six VDA prediction methods.
| Method | Dataset 1 | Dataset 2 | Dataset 3 |
| NGRHMDA | α = 0.4, β = 0.8 | α = 0.6, β = 0.9 | α = 0.9, β = 0.9 |
| LRLSHMDA | μ | μ | μ |
| NRLMF | | | |
| VDA-KATZ | β = 0.04, | β = 0.06, | β = 0.05, |
| VDA-RWR | | | |
| VDA-KLMF | | | |
The performance of six VDA prediction methods on three datasets under CV1.
| Datasets | Methods | Recall | Specificity | Precision | F1 score | AUC | AUPR |
| Dataset 1 | NGRHMDA | | 0.3997 ± 0.0071 | 0.0366 ± 0.0024 | 0.0643 ± 0.0039 | 0.7026 ± 0.0411 | |
| | LRLSHMDA | 0.1299 ± 0.0272 | 0.6170 ± 0.0034 | 0.0047 ± 0.0005 | 0.0084 ± 0.0009 | 0.1844 ± 0.0307 | 0.0121 ± 0.0001 |
| | NRLMF | 0.4933 ± 0.0072 | 0.6494 ± 0.0248 | 0.1572 ± 0.0168 | 0.1842 ± 0.0159 | 0.6621 ± 0.0260 | 0.1827 ± 0.0180 |
| | VDA-KATZ | 0.2616 ± 0.0499 | 0.5407 ± 0.0125 | 0.0125 ± 0.0015 | 0.0184 ± 0.0023 | 0.2683 ± 0.0543 | 0.0248 ± 0.0023 |
| | VDA-RWR | 0.4977 ± 0.0132 | | 0.0830 ± 0.0146 | 0.1055 ± 0.0111 | | 0.1090 ± 0.0266 |
| | VDA-KLMF | 0.6460 ± 0.0702 | 0.5122 ± 0.0081 | | | 0.7495 ± 0.0575 | 0.2538 ± 0.0598 |
| Dataset 2 | NGRHMDA | 0.3987 ± 0.0107 | 0.5823 ± 0.0085 | 0.0461 ± 0.0007 | 0.0329 ± 0.0011 | 0.4301 ± 0.0098 | 0.0236 ± 0.0040 |
| | LRLSHMDA | 0.3507 ± 0.0077 | 0.4585 ± 0.0047 | 0.0435 ± 0.0001 | 0.0179 ± 0.0003 | 0.3173 ± 0.0053 | 0.0122 ± 0.0001 |
| | NRLMF | 0.5156 ± 0.0023 | 0.6303 ± 0.0134 | 0.1541 ± 0.0086 | 0.1895 ± 0.0078 | 0.6545 ± 0.0100 | 0.1614 ± 0.0094 |
| | VDA-KATZ | 0.5912 ± 0.0080 | 0.3143 ± 0.0039 | 0.0122 ± 0.0002 | 0.0232 ± 0.0003 | 0.3981 ± 0.0073 | 0.0142 ± 0.0001 |
| | VDA-RWR | 0.5106 ± 0.0025 | | 0.0620 ± 0.0025 | 0.0844 ± 0.0021 | 0.6932 ± 0.0074 | 0.0658 ± 0.0030 |
| | VDA-KLMF | | 0.5279 ± 0.0018 | | | | |
| Dataset 3 | NGRHMDA | 0.4435 ± 0.0207 | 0.4699 ± 0.0122 | 0.0124 ± 0.0009 | 0.0232 ± 0.0017 | 0.4058 ± 0.0228 | 0.0817 ± 0.0158 |
| | LRLSHMDA | 0.1801 ± 0.0099 | 0.5777 ± 0.0021 | 0.0017 ± 0.0001 | 0.0074 ± 0.0003 | 0.2920 ± 0.0100 | 0.0077 ± 0.0001 |
| | NRLMF | 0.5416 ± 0.0056 | | | | 0.7591 ± 0.0146 | 0.2279 ± 0.0174 |
| | VDA-KATZ | 0.5712 ± 0.0185 | 0.3631 ± 0.0025 | 0.0117 ± 0.0005 | 0.0216 ± 0.0010 | 0.4639 ± 0.0173 | 0.0131 ± 0.0005 |
| | VDA-RWR | 0.5270 ± 0.0057 | 0.7021 ± 0.0115 | 0.0355 ± 0.0076 | 0.0812 ± 0.0071 | 0.7276 ± 0.0118 | 0.0372 ± 0.0092 |
| | VDA-KLMF | | 0.5179 ± 0.0029 | 0.1459 ± 0.0127 | 0.2044 ± 0.0172 | | |
The best results are denoted in bold in each column.
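For reference, the four threshold-based columns in the table follow the standard confusion-matrix definitions. The sketch below computes them from counts; the counts in the example are hypothetical and do not come from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Recall, specificity, precision, and F1 from confusion-matrix counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "f1": f1}

# Hypothetical counts for a sparse test fold: few positives, many negatives.
m = classification_metrics(tp=8, fp=32, tn=60, fn=2)
print(m)
```

With these made-up counts, recall is 0.8 while precision is only 0.2, which illustrates how a method can recover most known associations and still show low precision when positives are rare.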
The performance of six VDA prediction methods on three datasets under CV3.
| Datasets | Methods | Recall | Specificity | Precision | F1 score | AUC | AUPR |
| Dataset 1 | NGRHMDA | 0.5783 ± 0.0141 | 0.5582 ± 0.0160 | 0.0335 ± 0.0013 | 0.0615 ± 0.0024 | 0.6459 ± 0.0155 | 0.0410 ± 0.0035 |
| | LRLSHMDA | 0.8034 ± 0.0117 | 0.5804 ± 0.0050 | 0.0696 ± 0.0015 | 0.1119 ± 0.0017 | 0.8403 ± 0.0099 | 0.2838 ± 0.0212 |
| | NRLMF | 0.6482 ± 0.0073 | 0.7665 ± 0.0127 | | | 0.8679 ± 0.0092 | 0.6511 ± 0.0171 |
| | VDA-KATZ | 0.6976 ± 0.0118 | 0.6639 ± 0.0168 | 0.1067 ± 0.0101 | 0.1517 ± 0.0112 | 0.8803 ± 0.0106 | 0.3513 ± 0.0144 |
| | VDA-RWR | 0.4824 ± 0.0089 | | 0.1110 ± 0.0077 | 0.1153 ± 0.0058 | 0.8582 ± 0.0097 | 0.1268 ± 0.0100 |
| | VDA-KLMF | | 0.5440 ± 0.0008 | 0.3001 ± 0.0040 | 0.3670 ± 0.0055 | | |
| Dataset 2 | NGRHMDA | 0.4544 ± 0.0053 | 0.3643 ± 0.0099 | 0.0112 ± 0.0002 | 0.0218 ± 0.0005 | 0.3011 ± 0.0055 | 0.0121 ± 0.0002 |
| | LRLSHMDA | 0.7838 ± 0.0050 | 0.4837 ± 0.0060 | 0.0757 ± 0.0008 | 0.0733 ± 0.0014 | 0.8248 ± 0.0020 | 0.0731 ± 0.0019 |
| | NRLMF | 0.5565 ± 0.0024 | | | | 0.8146 ± 0.0030 | 0.3335 ± 0.0062 |
| | VDA-KATZ | 0.5512 ± 0.0069 | 0.7558 ± 0.0124 | 0.0464 ± 0.0009 | 0.0805 ± 0.0013 | 0.8296 ± 0.0023 | 0.0834 ± 0.0028 |
| | VDA-RWR | 0.5022 ± 0.0016 | 0.6651 ± 0.0052 | 0.0326 ± 0.0008 | 0.0574 ± 0.0011 | 0.6675 ± 0.0049 | 0.0328 ± 0.0010 |
| | VDA-KLMF | | 0.5311 ± 0.0003 | 0.2077 ± 0.0016 | 0.2829 ± 0.0016 | | |
| Dataset 3 | NGRHMDA | 0.3582 ± 0.0165 | 0.4423 ± 0.0208 | 0.0071 ± 0.0002 | 0.0119 ± 0.0005 | 0.2554 ± 0.0088 | 0.0078 ± 0.0006 |
| | LRLSHMDA | 0.8124 ± 0.0051 | 0.5237 ± 0.0023 | 0.0312 ± 0.0004 | 0.0552 ± 0.0008 | 0.8169 ± 0.0048 | 0.1057 ± 0.0103 |
| | NRLMF | 0.5890 ± 0.0038 | | | | 0.8572 ± 0.0048 | |
| | VDA-KATZ | 0.7116 ± 0.0166 | 0.5564 ± 0.0302 | 0.0359 ± 0.0010 | 0.0626 ± 0.0015 | 0.8478 ± 0.0042 | 0.0847 ± 0.0034 |
| | VDA-RWR | 0.5053 ± 0.0031 | 0.7049 ± 0.0068 | 0.0369 ± 0.0025 | 0.0556 ± 0.0024 | 0.7123 ± 0.0067 | 0.0374 ± 0.0028 |
| | VDA-KLMF | | 0.5224 ± 0.0004 | 0.1631 ± 0.0023 | 0.2331 ± 0.0025 | | 0.3906 ± 0.0158 |
The best results are denoted in bold in each column.
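The abstract describes 5-fold cross validation on viruses, on drugs, and on VDAs. A minimal sketch of how such splits could be generated over a virus-by-drug matrix is given below. This record does not state which of CV1, CV2, and CV3 corresponds to which setting, so the code uses descriptive mode names instead; the function `fold_masks` and its arguments are hypothetical.

```python
import numpy as np

def fold_masks(shape, mode, n_folds=5, seed=0):
    """Boolean test masks, one per fold, over a virus-by-drug matrix.

    mode="viruses": whole rows are held out together,
    mode="drugs":   whole columns are held out together,
    mode="pairs":   individual virus-drug entries are held out.
    """
    rng = np.random.default_rng(seed)
    n_viruses, n_drugs = shape
    if mode == "viruses":
        labels = rng.permutation(n_viruses) % n_folds      # balanced fold label per row
        grid = np.repeat(labels[:, None], n_drugs, axis=1)
    elif mode == "drugs":
        labels = rng.permutation(n_drugs) % n_folds        # balanced fold label per column
        grid = np.repeat(labels[None, :], n_viruses, axis=0)
    elif mode == "pairs":
        grid = (rng.permutation(n_viruses * n_drugs) % n_folds).reshape(shape)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [grid == k for k in range(n_folds)]

# Matrix shape of Dataset 1 (12 viruses, 78 drugs).
masks_v = fold_masks((12, 78), "viruses")
masks_d = fold_masks((12, 78), "drugs")
masks_p = fold_masks((12, 78), "pairs")
print([int(m.sum()) for m in masks_p])
```

Holding out whole rows or columns tests prediction for viruses or drugs with no known associations in training, which is a harder setting than holding out individual pairs.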
The performance of six VDA prediction methods on three datasets under CV2.
| Datasets | Methods | Recall | Specificity | Precision | F1 score | AUC | AUPR |
| Dataset 1 | NGRHMDA | 0.6435 ± 0.0185 | 0.6713 ± 0.0112 | 0.0468 ± 0.0012 | 0.0850 ± 0.0021 | 0.8329 ± 0.0031 | 0.0674 ± 0.0074 |
| | LRLSHMDA | 0.7938 ± 0.0069 | 0.5762 ± 0.0049 | 0.0695 ± 0.0014 | 0.1122 ± 0.0014 | 0.8249 ± 0.0064 | 0.3127 ± 0.0240 |
| | NRLMF | 0.6069 ± 0.0085 | 0.7454 ± 0.0165 | | 0.3648 ± 0.0105 | 0.8409 ± 0.0106 | 0.5510 ± 0.0214 |
| | VDA-KATZ | 0.6889 ± 0.0120 | 0.6348 ± 0.0162 | 0.0925 ± 0.0168 | 0.1328 ± 0.0170 | 0.8419 ± 0.0096 | 0.3896 ± 0.0140 |
| | VDA-RWR | 0.5070 ± 0.0094 | | 0.1393 ± 0.0052 | 0.1294 ± 0.0047 | 0.9182 ± 0.0023 | 0.1576 ± 0.0062 |
| | VDA-KLMF | | 0.5459 ± 0.0028 | 0.3088 ± 0.0057 | | | |
| Dataset 2 | NGRHMDA | 0.4867 ± 0.0116 | 0.8504 ± 0.0022 | 0.0395 ± 0.0005 | 0.0719 ± 0.0008 | 0.8017 ± 0.0008 | 0.0567 ± 0.0020 |
| | LRLSHMDA | 0.7720 ± 0.0036 | 0.4152 ± 0.0034 | 0.0085 ± 0.0007 | 0.0639 ± 0.0012 | 0.7334 ± 0.0029 | 0.1074 ± 0.0058 |
| | NRLMF | 0.5477 ± 0.0026 | 0.7476 ± 0.0094 | | 0.2787 ± 0.0046 | 0.7848 ± 0.0061 | 0.2916 ± 0.0079 |
| | VDA-KATZ | 0.5913 ± 0.0082 | 0.5699 ± 0.0107 | 0.0427 ± 0.0005 | 0.0696 ± 0.0004 | 0.6886 ± 0.0033 | 0.1086 ± 0.0052 |
| | VDA-RWR | 0.5045 ± 0.0020 | | 0.0454 ± 0.0007 | 0.0814 ± 0.0010 | 0.8025 ± 0.0029 | 0.0460 ± 0.0007 |
| | VDA-KLMF | | 0.5327 ± 0.0007 | 0.2309 ± 0.0041 | | | |
| Dataset 3 | NGRHMDA | 0.4579 ± 0.0155 | 0.7070 ± 0.0042 | 0.0227 ± 0.0003 | 0.0279 ± 0.0007 | 0.6772 ± 0.0024 | 0.0351 ± 0.0015 |
| | LRLSHMDA | 0.7420 ± 0.0063 | 0.5235 ± 0.0020 | 0.0241 ± 0.0002 | 0.0493 ± 0.0005 | 0.7468 ± 0.0054 | 0.0623 ± 0.0067 |
| | NRLMF | 0.5592 ± 0.0056 | | | 0.2390 ± 0.0059 | 0.7847 ± 0.0075 | 0.2989 ± 0.0130 |
| | VDA-KATZ | 0.7246 ± 0.0068 | 0.3995 ± 0.0058 | 0.0297 ± 0.0002 | 0.0491 ± 0.0004 | 0.6840 ± 0.0058 | 0.0964 ± 0.0034 |
| | VDA-RWR | 0.5054 ± 0.0082 | 0.8087 ± 0.0064 | 0.0815 ± 0.0013 | 0.0628 ± 0.0019 | 0.8168 ± 0.0048 | 0.1002 ± 0.0013 |
| | VDA-KLMF | | 0.5245 ± 0.0011 | 0.1810 ± 0.0035 | | | |
The best results are denoted in bold in each column.
FIGURE 2. The AUC values predicted by six VDA prediction methods (D denotes dataset, D1 denotes dataset 1, D2 denotes dataset 2, D3 denotes dataset 3).
FIGURE 3. The AUPR values predicted by six VDA prediction methods (D denotes dataset, D1 denotes dataset 1, D2 denotes dataset 2, D3 denotes dataset 3).
FIGURE 4. Effect of Gaussian kernel on virus-drug association prediction performance.
The effect of different r on VDA-KLMF on three datasets under CV3.
| Datasets | Methods | Recall | Precision | F1 score | AUC | AUPR |
| Dataset 1 | | 0.8668 ± 0.0079 | 0.2794 ± 0.0034 | 0.3471 ± 0.0044 | 0.9105 ± 0.0088 | 0.6174 ± 0.0222 |
| | | 0.8820 ± 0.0098 | 0.2942 ± 0.0040 | 0.3614 ± 0.0043 | 0.9276 ± 0.0109 | 0.7205 ± 0.0258 |
| | | 0.8869 ± 0.0096 | 0.2986 ± 0.0043 | 0.3657 ± 0.0051 | 0.9331 ± 0.0106 | 0.7506 ± 0.0255 |
| | | 0.8896 ± 0.0056 | 0.2994 ± 0.0036 | 0.3661 ± 0.0046 | 0.9360 ± 0.0062 | 0.7614 ± 0.0209 |
| | | | | | | 0.7631 ± 0.0259 |
| | | 0.8887 ± 0.0121 | 0.2994 ± 0.0055 | 0.3659 ± 0.0070 | 0.9349 ± 0.0134 | |
| | | 0.8874 ± 0.0100 | 0.2997 ± 0.0033 | 0.3658 ± 0.0043 | 0.9339 ± 0.0109 | 0.7650 ± 0.0209 |
| | | 0.8895 ± 0.0111 | 0.2997 ± 0.0042 | 0.3659 ± 0.0056 | 0.9362 ± 0.0123 | 0.7568 ± 0.0208 |
| | | 0.8845 ± 0.0139 | 0.2986 ± 0.0051 | 0.3649 ± 0.0071 | 0.9304 ± 0.0150 | 0.7600 ± 0.0275 |
| | | 0.8839 ± 0.0086 | 0.2990 ± 0.0045 | 0.3653 ± 0.0052 | 0.9300 ± 0.0096 | 0.7592 ± 0.0243 |
| Dataset 2 | | 0.8069 ± 0.0040 | 0.1965 ± 0.0026 | 0.2716 ± 0.0023 | 0.8363 ± 0.0044 | 0.3269 ± 0.0110 |
| | | 0.8195 ± 0.0030 | 0.2042 ± 0.0018 | 0.2791 ± 0.0016 | 0.8502 ± 0.0033 | 0.3619 ± 0.0089 |
| | | 0.8245 ± 0.0042 | 0.2070 ± 0.0021 | 0.2821 ± 0.0024 | 0.8556 ± 0.0046 | 0.3742 ± 0.0093 |
| | | 0.8241 ± 0.0030 | 0.2067 ± 0.0018 | 0.2819 ± 0.0019 | 0.8553 ± 0.0033 | 0.3729 ± 0.0089 |
| | | 0.8254 ± 0.0025 | 0.2072 ± 0.0014 | 0.2825 ± 0.0013 | 0.8567 ± 0.0004 | 0.3741 ± 0.0074 |
| | | 0.8250 ± 0.0038 | 0.2074 ± 0.0016 | 0.2824 ± 0.0019 | 0.8562 ± 0.0042 | 0.3761 ± 0.0073 |
| | | 0.8255 ± 0.0033 | 0.2070 ± 0.0017 | 0.2824 ± 0.0019 | 0.8568 ± 0.0035 | 0.3733 ± 0.0078 |
| | | 0.8250 ± 0.0032 | 0.2071 ± 0.0018 | 0.2824 ± 0.0016 | 0.8562 ± 0.0035 | 0.3748 ± 0.0083 |
| | | 0.8255 ± 0.0033 | | | 0.8568 ± 0.0036 | |
| | | 0.8241 ± 0.0036 | 0.2063 ± 0.0021 | 0.2816 ± 0.0019 | 0.8552 ± 0.0040 | 0.3709 ± 0.0097 |
| | | | 0.2074 ± 0.0024 | 0.2828 ± 0.0023 | | 0.3752 ± 0.0125 |
| Dataset 3 | | 0.8401 ± 0.0065 | 0.1581 ± 0.0026 | 0.2252 ± 0.0027 | 0.8617 ± 0.0070 | 0.3620 ± 0.0184 |
| | | 0.8458 ± 0.0081 | 0.1536 ± 0.0026 | 0.2244 ± 0.0023 | 0.8677 ± 0.0086 | 0.3276 ± 0.0161 |
| | | 0.8458 ± 0.0081 | 0.1536 ± 0.0026 | 0.2244 ± 0.0023 | 0.8677 ± 0.0086 | 0.3276 ± 0.0161 |
| | | 0.8577 ± 0.0083 | 0.1603 ± 0.0032 | 0.2298 ± 0.0034 | 0.8804 ± 0.0088 | 0.3695 ± 0.0208 |
| | | | 0.1625 ± 0.0019 | 0.2326 ± 0.0021 | | 0.3836 ± 0.0131 |
| | | 0.8583 ± 0.0089 | 0.1597 ± 0.0036 | 0.2296 ± 0.0039 | 0.8810 ± 0.0095 | 0.3657 ± 0.0235 |
| | | 0.8631 ± 0.0072 | | | 0.8861 ± 0.0076 | |
| | | 0.8574 ± 0.0078 | 0.1608 ± 0.0031 | 0.2306 ± 0.0029 | 0.8800 ± 0.0083 | 0.3757 ± 0.0211 |
| | | 0.8622 ± 0.0047 | 0.1609 ± 0.0024 | 0.2312 ± 0.0020 | 0.8851 ± 0.0050 | 0.3710 ± 0.0182 |
| | | 0.8567 ± 0.0064 | 0.1554 ± 0.0024 | 0.2273 ± 0.0023 | 0.8791 ± 0.0068 | 0.3351 ± 0.0149 |
| | | 0.8585 ± 0.0048 | 0.1531 ± 0.0026 | 0.2238 ± 0.0023 | 0.8757 ± 0.0050 | 0.3199 ± 0.0222 |
The best results are denoted in bold in each column.
The predicted top 10 drugs associated with SARS-CoV-2 on dataset 1.
| Rank | Drug | Evidence |
| 1 | Remdesivir | |
| 2 | Ribavirin | |
| 3 | Oseltamivir | |
| 4 | Zanamivir | |
| 5 | Mycophenolic acid | |
| 6 | Chloroquine | |
| 7 | Peramivir | |
| 8 | Laninamivir | Unconfirmed |
| 9 | Rimantadine | |
| 10 | Presatovir | |
The predicted top 10 drugs associated with SARS-CoV-2 on dataset 3.
| Rank | Drug | Evidence |
| 1 | Nitazoxanide | |
| 2 | Ribavirin | |
| 3 | Chloroquine | |
| 4 | Umifenovir | DOI: |
| 5 | Camostat | |
| 6 | Favipiravir | |
| 7 | Emetine | |
| 8 | Amantadine | |
| 9 | Hexachlorophene | |
| 10 | Irbesartan | Unconfirmed |
Binding energy between the predicted antiviral drugs and the junction of the S protein-ACE2 interface.
| Drug | Binding energy (kcal/mol) | Drug | Binding energy (kcal/mol) |
| Remdesivir | –7.00 | BCX4430 Galidesivir | –6.87 |
| Ribavirin | –6.59 | Camostat | –7.48 |
| Nitazoxanide | –7.74 | Cyclosporine | –8.92 |
| Favipiravir | –5.32 | Hexachlorophene | –7.67 |
| Emetine | –6.95 | Irbesartan | –8.13 |
| Chloroquine | –5.82 | Laninamivir | –5.7 |
| Mycophenolic acid | –7.0 | Navitoclax | –8.39 |
| Rimantadine | –6.63 | Niclosamide | –8.06 |
| Silvestrol | –5.54 | Oseltamivir | –6.5 |
| Umifenovir | –6.89 | Peramivir | –6.88 |
| Zanamivir | –5.96 | Presatovir | –8.38 |
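The binding energies above can be ranked with a short script. In docking studies a more negative binding energy is usually read as stronger predicted binding, so the sketch sorts in ascending order; the values are copied from the table, and the variable names are illustrative.

```python
# Binding energies (kcal/mol) copied from the table above.
binding_energy = {
    "Remdesivir": -7.00, "Ribavirin": -6.59, "Nitazoxanide": -7.74,
    "Favipiravir": -5.32, "Emetine": -6.95, "Chloroquine": -5.82,
    "Mycophenolic acid": -7.0, "Rimantadine": -6.63, "Silvestrol": -5.54,
    "Umifenovir": -6.89, "Zanamivir": -5.96, "BCX4430 Galidesivir": -6.87,
    "Camostat": -7.48, "Cyclosporine": -8.92, "Hexachlorophene": -7.67,
    "Irbesartan": -8.13, "Laninamivir": -5.7, "Navitoclax": -8.39,
    "Niclosamide": -8.06, "Oseltamivir": -6.5, "Peramivir": -6.88,
    "Presatovir": -8.38,
}

# Sort from most negative (strongest predicted binding) to least negative.
ranked = sorted(binding_energy.items(), key=lambda item: item[1])

for drug, energy in ranked[:5]:
    print(f"{drug:20s} {energy:6.2f}")

# The four agents highlighted in the abstract, for comparison.
highlighted = ["Remdesivir", "Ribavirin", "Nitazoxanide", "Emetine"]
print({drug: binding_energy[drug] for drug in highlighted})
```

By this ordering cyclosporine has the lowest energy in the table, while the four highlighted agents fall between -6.59 and -7.74 kcal/mol.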
FIGURE 5. Molecular dockings between the predicted four possible antiviral drugs against COVID-19 (remdesivir, ribavirin, nitazoxanide, and emetine) and the junction of the S protein-ACE2 interface. (A) remdesivir (Peng et al., 2021; Wang J. et al., 2021; Shen et al., 2022), (B) ribavirin (Peng et al., 2021; Wang J. et al., 2021; Shen et al., 2022), (C) nitazoxanide, and (D) emetine.
The predicted top 10 drugs associated with SARS-CoV-2 on dataset 2.
| Rank | Drug | Evidence |
| 1 | Remdesivir | |
| 2 | Emetine | |
| 3 | BCX4430 Galidesivir | |
| 4 | Niclosamide | |
| 5 | Cyclosporine | |
| 6 | Silvestrol | DOI: |
| 7 | Mycophenolic acid | |
| 8 | Favipiravir | |
| 9 | Nitazoxanide | |
| 10 | Navitoclax | |