Shatadru Majumdar1, Soumik Kumar Nandi1, Shuvam Ghosal1, Bavrabi Ghosh1, Writam Mallik1, Nilanjana Dutta Roy1, Arindam Biswas2, Subhankar Mukherjee3, Souvik Pal3, Nabarun Bhattacharyya3.
Abstract
To combat the ongoing COVID-19 pandemic, treatment with drugs and vaccines is essential in addition to ventilation support. In this paper, we present a list of ligands that are expected to have the highest binding affinity with the S-glycoprotein of 2019-nCoV and could therefore be used to develop a drug against the novel coronavirus. We implemented an architecture based on 1D convolutional networks to predict drug-target interaction (DTI) values. The network was trained on the KIBA (Kinase Inhibitor Bioactivity) dataset. With this network, we predicted the KIBA scores (which give a measure of binding affinity) of a list of ligands against the S-glycoprotein of 2019-nCoV. Based on these KIBA scores, we propose the 33 top-ranked ligands, which have a high predicted binding affinity with the S-glycoprotein of 2019-nCoV and can thus be considered for drug formulation.
Keywords: 1D CNN; Binding affinity; COVID-19; Drug–target interaction values; ECFP4; KIBA; Ligand; Protein Sequence Composition; S-glycoprotein
Year: 2021 PMID: 33552306 PMCID: PMC7852055 DOI: 10.1007/s12559-021-09840-x
Source DB: PubMed Journal: Cognit Comput ISSN: 1866-9956 Impact factor: 4.890
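The abstract names 1D convolutional networks as the building block of the DTI model. As a minimal sketch of what a single 1D convolution step does, assuming a toy binary fingerprint and hand-picked filter weights (both hypothetical and not taken from the paper), the following pure-Python code slides one filter over the input, applies a ReLU and keeps the strongest response. It illustrates the operation only and is not the authors' architecture.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` without padding.

    This is cross-correlation (no kernel flip), which is what deep-learning
    libraries usually call "convolution".
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Clamp negative activations to zero."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Reduce a feature map to its single strongest response."""
    return max(values)


# Hypothetical 8-bit fingerprint and 3-tap filter, for illustration only.
fingerprint = [0, 1, 1, 0, 1, 0, 0, 1]
kernel = [1.0, -1.0, 1.0]

feature_map = relu(conv1d_valid(fingerprint, kernel))
pooled = global_max_pool(feature_map)
print(feature_map, pooled)
```

In a trained network the filter weights are learned, many filters run in parallel, and the pooled features feed dense layers that output the predicted score.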
KIBA Dataset Description
| Protein | Compound | Interaction |
|---|---|---|
| 229 | 2111 | 118254 |
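The counts in the table show how densely the protein-compound matrix is populated. The short calculation below, using only the three numbers from the table (the variable names are illustrative), shows that roughly a quarter of all possible pairs carry a recorded interaction value.

```python
# Counts taken from the KIBA dataset description table.
n_proteins = 229
n_compounds = 2111
n_interactions = 118254

# Every protein could in principle be paired with every compound.
possible_pairs = n_proteins * n_compounds

# Fraction of those pairs that have an interaction value in the dataset.
density = n_interactions / possible_pairs
print(f"{possible_pairs} possible pairs, density = {density:.3f}")
```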
Fig. 1 The block diagram of the proposed work
Fig. 2 (a) KIBA scores, (b) length of SMILES (chars), (c) length of protein sequence (chars)
Fig. 3 The proposed architecture for the DTI model
Fig. 4 Training RMSE loss vs. the epoch number
Fig. 5 MSE vs. the epoch number
Fig. 6 MAE vs. the epoch number
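Figures 4 to 6 track three standard regression errors: RMSE, MSE and MAE. As a reminder of how they relate, here is a minimal sketch using their usual definitions. The score values are hypothetical and are not results from the paper.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted scores, for illustration only.
y_true = [11.0, 12.5, 13.0, 11.5]
y_pred = [11.5, 12.0, 13.0, 12.5]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

Because squaring weights large residuals more heavily, RMSE is never smaller than MAE on the same predictions.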
SMILES codes of the top 33 ligands in descending order of their binding affinity with the S-glycoprotein of SARS-CoV-2
| SMILES code no. | SMILES code |
|---|---|
| 16 | NCCNC(=O)c1cccc(c1)c2cnc(Nc3cc(ccn3)N4CCC(F)(F)CC4)s2 |
| 17 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4C[C@H]5CC[C@@H]4CN5)c(Cl)c3)n1)c6cn[nH]c6 |
| 19 | COCCCc1cc(Nc2nc(NCc3onc(C)c3)ncc2Br)n[nH]1 |
| 20 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCCNCC4)c(Cl)c3)n1)c5cn[nH]c5 |
| 29 | Clc1cc(Nc2nc(cn3c(cnc23)c4cn[nH]c4)C5CC5)ccc1C(=O)N6C[C@H]7CC[C@@H]6CN7 |
| 32 | CN1CC(CN(C)C1=O)c2ccc(NC(=O)c3nc(c[nH]3)C … c(c2)C4=CCCCC4 |
| 36 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNCC4)c(Cl)c3)n1)c5cn[nH]c5 |
| 37 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNC5(CC5)C4)c(Cl)c3)n1)c6cn[nH]c6 |
| 39 | CN1CC(CN(C)C1=O)c2ccc(NC(=O)c3nc(c[nH]3)C … c(c2)C4=CCCCCC4 |
| 40 | Cc1cc(CNc2ncc(Br)c(Nc3cc([nH]n3)C4CC4)n2)on1 |
| 44 | Cn1ncc(NC(=O)c2nc(sc2N)c3c(F)cccc3F)c1N4CCCN(CC4)C5CNC5 |
| 47 | CC(C)c1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNCC4)c(Cl)c3)n1)c5cn[nH]c5 |
| 51 | Clc1cc(Nc2nc(cn3c(cnc23)c4cn[nH]c4)C5CC5)ccc1C(=O)N6CCNCC6 |
| 54 | Cc1cn2c(cnc2c(Nc3ccc(cc3F)C(=O)N4CCNCC4)n1)c5cn[nH]c5 |
| 55 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCN(CCO)CC4)c(Cl)c3)n1)c5cn[nH]c5 |
| 56 | CN1CC(CN(C)C1=O)c2ccc(NC(=O)c3nc(c[nH]3)C … C4=CCC(C)(C)CC4 |
| 63 | C[C@H](Nc1nc(nc2c1cc(C(=O)NCCN(C)C)n2C)n3cnc4ccncc34)c5ccccc5 |
| 64 | Cc1cc2c(Nc3ccc4nc(N)sc4c3)c(cnc2cc1OCCCN5CCNCC5)C |
| 67 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNCC45CC5)c(Cl)c3)n1)c6cn[nH]c6 |
| 68 | O=C(Nc1ccc(cc1C2=CCCCC2)C3CCN(CCC … C |
| 70 | CC1(C)CNCCN1C(=O)c2ccc(Nc3nc(cn4c(cnc34)c5cn[nH]c5)C6CC6)cc2Cl |
| 71 | CCN1CCC(C1) … (NC(=O)NCC(F)(F)F)c4 |
| 74 | COc1ccc(CCNC(=O)c2cc3C(=O)N4C=CC=C(C)C4=Nc3s2)cc1OC |
| 75 | CC(C)c1nc2c(Nc3ccc(C(=O)N4CCNCC4)c(Cl)c3)nc(C)cn2c1c5cn[nH]c5 |
| 79 | Clc1cc(Nc2nc(cn3c(cnc23)c4cn[nH]c4)C5CC5)ccc1C(=O)N6CCNC7(CC7)C6 |
| 88 | Clc1cc(Nc2nc(cn3c(cnc23)c4cn[nH]c4)C5CC5)ccc1C(=O)N6CCNCC67CC7 |
| 90 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNCC4)c(c3)C5CC5)n1)c6cn[nH]c6 |
| 91 | Cc1cn2c(cnc2c(Nc3ccc(cc3F)C(=O)N4CCNCC4(C)C)n1)c5cn[nH]c5 |
| 93 | Fc1cc(ccc1Nc2nc(cn3c(cnc23)c4cn[nH]c4)C5CC5)C(=O)N6CCNCC6 |
| 95 | Cc1cn2c(cnc2c(Nc3ccc(C(=O)N4CCNCC4(C)C)c(Cl)c3)n1)c5cn[nH]c5 |
| 96 | CC(C)(N)CC(=O)N1CCC(CC1)c2ccc(NC(=O)c3nc(c[nH]3)C … c(c2)C4=CCCCC4 |
| 99 | CN1CC(CN(C)S1(=O)=O)c2ccc(NC(=O)c3nc(c[nH]3)C … c(c2)C4=CCCCC4 |
| 100 | Fc1ccc(NC(=O)C2=C(CCC2)c3nc(Nc4cc([nH]n4)C5CC5)c6cccn6n3)cn1 |
Fig. 7 2D diagrams corresponding to the 3D structures of ligands 16, 20, 54, 75 and 99, given in Table 2
Fig. 8 Visualization of the PDB file representing chain A of the S-glycoprotein of SARS-CoV-2
Druggability test of a few predicted ligands taken from Table 2
| Serial no. | SMILES code no. | Molecular weight (g/mol) | No. of hydrogen bond donors | No. of hydrogen bond acceptors | No. of rotatable bonds | Partition coefficient | No. of rules satisfied |
|---|---|---|---|---|---|---|---|
| 1 | 16 | 458.538 | 3 | 7 | 4 | 3.8727 | 5 |
| 2 | 17 | 462.945 | 3 | 7 | 2 | 3.4012 | 5 |
| 3 | 19 | 422.287 | 3 | 8 | 4 | 3.1933 | 5 |
| 4 | 20 | 450.934 | 3 | 7 | 2 | 3.26032 | 5 |
| 5 | 29 | 488.983 | 2 | 6 | 5 | 3.97 | 5 |
| 6 | 32 | 418.501 | 2 | 8 | 3 | 3.572 | 5 |
| 7 | 36 | 436.907 | 3 | 7 | 2 | 2.8702 | 5 |
| 8 | 37 | 462.945 | 3 | 7 | 2 | 3.4028 | 5 |
| 9 | 39 | 432.528 | 2 | 4 | 3 | 3.962 | 5 |
| 10 | 40 | 390.245 | 3 | 7 | 3 | 3.492 | 5 |
| 11 | 44 | 488.568 | 3 | 9 | 3 | 2.1402 | 5 |
| 12 | 47 | 464.961 | 3 | 7 | 2 | 3.6852 | 5 |
| 13 | 51 | 462.945 | 3 | 7 | 2 | 3.4392 | 5 |
| 14 | 54 | 420.452 | 3 | 7 | 2 | 2.3559 | 5 |
| 15 | 55 | 480.96 | 3 | 8 | 2 | 2.5749 | 5 |
| 16 | 56 | 446.555 | 2 | 4 | 3 | 4.20798 | 5 |
| 17 | 63 | 483.58 | 2 | 9 | 5 | 3.1667 | 5 |
| 18 | 64 | 473.606 | 3 | 9 | 5 | 3.4028 | 5 |
| 19 | 67 | 462.945 | 3 | 7 | 2 | 3.4028 | 5 |
| 20 | 68 | 428.54 | 2 | 5 | 3 | 4.584 | 5 |
| 21 | 70 | 490.99 | 3 | 7 | 2 | 4.2178 | 5 |
| 22 | 71 | 499.541 | 3 | 6 | 5 | 4.583 | 5 |
| 23 | 74 | 423.494 | 1 | 7 | 5 | 3.2073 | 5 |
| 24 | 75 | 478.988 | 3 | 7 | 2 | 3.993 | 5 |
| 25 | 79 | 488.983 | 3 | 7 | 2 | 3.971 | 5 |
| 26 | 88 | 488.983 | 3 | 7 | 2 | 3.9718 | 5 |
| 27 | 90 | 442.527 | 3 | 7 | 2 | 3.0942 | 5 |
| 28 | 91 | 462.945 | 3 | 7 | 2 | 2.92498 | 5 |
| 29 | 93 | 446.49 | 3 | 7 | 2 | 2.9249 | 5 |
| 30 | 95 | 464.961 | 3 | 7 | 2 | 3.6488 | 5 |
| 31 | 96 | 474.609 | 3 | 5 | 3 | 4.324 | 5 |
| 32 | 99 | 454.556 | 2 | 5 | 3 | 2.696 | 5 |
| 33 | 100 | 444.474 | 3 | 7 | 3 | 4.1837 | 5 |
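The last column of the table counts how many druggability rules each ligand satisfies. The source does not state the thresholds, so the sketch below assumes the commonly cited Lipinski limits (molecular weight at most 500, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors, partition coefficient at most 5) plus an assumed limit of 10 rotatable bonds. The function name and dictionary keys are illustrative. Under these assumptions, the two rows checked here each satisfy all five rules, in line with the table.

```python
# Assumed cut-offs: the source table does not list them explicitly.
ASSUMED_LIMITS = {
    "mol_weight": 500.0,     # g/mol
    "h_donors": 5,
    "h_acceptors": 10,
    "rotatable_bonds": 10,
    "partition_coeff": 5.0,  # logP
}


def rules_satisfied(ligand):
    """Count how many assumed upper limits the ligand's properties respect."""
    return sum(ligand[name] <= limit for name, limit in ASSUMED_LIMITS.items())


# Property values copied from the table rows for SMILES codes 16 and 71.
ligand_16 = {"mol_weight": 458.538, "h_donors": 3, "h_acceptors": 7,
             "rotatable_bonds": 4, "partition_coeff": 3.8727}
ligand_71 = {"mol_weight": 499.541, "h_donors": 3, "h_acceptors": 6,
             "rotatable_bonds": 5, "partition_coeff": 4.583}

print(rules_satisfied(ligand_16), rules_satisfied(ligand_71))
```

Ligand 71 sits just under the assumed 500 g/mol limit, so a stricter molecular-weight cut-off would change its count.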
Comparison analysis
| Architectures | RMSE Values |
|---|---|
| PADME-ECFP | 0.7915 |
| KronRLS | 0.6566 |
| SimBoost | 0.4711 |
| Our Architecture | 0.83 |
Fig. 9 Comparative analysis based on RMSE score