Nikhil Maroli1,2, Balu Bhasuran1, Jeyakumar Natarajan3, Ponmalai Kolandaivel4.
Abstract
A novel coronavirus (SARS-CoV-2) has caused a major outbreak in humans all over the world. Several proteins interplay during the entry and replication of this virus in humans. Here, we used text mining and a named entity recognition method to identify co-occurrences of the important COVID-19 genes/proteins in the interaction network, based on the frequency of interaction. Network analysis revealed a set of genes/proteins, highly dense gene/protein clusters, and sub-networks of angiotensin-converting enzyme 2 (ACE2), helicase, spike (S) protein (trimeric), membrane (M) protein, envelope (E) protein, and nucleocapsid (N) protein. The isolated proteins were screened against procyanidin, a flavonoid from plants, using molecular docking. Further, molecular dynamics simulations of critical proteins such as ACE2, Mpro, and spike protein were performed to elucidate the inhibition mechanism. A strong network of hydrogen bonds and hydrophobic interactions, along with van der Waals interactions, inhibits receptors that are essential to the entry and replication of SARS-CoV-2. The binding energy, which arises largely from van der Waals interactions, was calculated through molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) (ACE2 = -50.21 ± 6.3, Mpro = -89.50 ± 6.32, and spike = -23.06 ± 4.39 kJ/mol) and also confirms the affinity of procyanidin towards the critical receptors. Communicated by Ramaswamy H. Sarma.
Keywords: Text mining; covid; molecular docking; molecular dynamics simulation; procyanidin
Year: 2020 PMID: 32960159 PMCID: PMC7544928 DOI: 10.1080/07391102.2020.1823887
Source DB: PubMed Journal: J Biomol Struct Dyn ISSN: 0739-1102
Figure 1. Schematic architecture of the text mining approach for identifying critical proteins and their biological co-occurrence in COVID-19. The approach used the CORD-19 database as the literature source. A list of highly studied COVID-19 proteins was manually constructed, and a dictionary matching procedure was applied to find protein co-occurrences and their frequencies. Finally, a protein co-occurrence network was created and analyzed to find functional associations among critical proteins in COVID-19.
Figure 2. (a) Linear frequency of the important proteins related to COVID-19. (b) Top 10 critical proteins connected with COVID-19. (c) Protein co-occurrence network of COVID-19. (d) Highly reported proteins related to COVID-19.
Highly interacting benchmarking gene-gene interaction pairs.
| Gene A | Gene B | Frequency |
|---|---|---|
| ORF1a | ORF1b | 303 |
| ACE2 | S protein | 144 |
| NSP3 | NSP4 | 124 |
| NSP13 | Helicase | 104 |
| NSP2 | NSP3 | 93 |
| NSP14 | ExoN | 76 |
| NSP3 | NSP5 | 60 |
| NSP7 | NSP8 | 56 |
| NSP14 | NSP10 | 46 |
| Helicase | RdRp | 45 |
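
The pair frequencies above come from dictionary matching over the literature. As a rough illustration only, the sketch below counts how often two dictionary entries appear in the same sentence. The dictionary contents, the sentence-level window, and the names `PROTEIN_DICTIONARY`, `find_proteins`, and `cooccurrence_counts` are assumptions made for this example; the source does not specify the matching window or the synonym lists used.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-dictionary: canonical name -> surface forms to match.
# The real, manually curated protein list is not given in the source.
PROTEIN_DICTIONARY = {
    "ACE2": ["ACE2", "angiotensin-converting enzyme 2"],
    "S protein": ["S protein", "spike protein"],
    "NSP13": ["NSP13"],
    "Helicase": ["helicase"],
}

def find_proteins(sentence, dictionary=PROTEIN_DICTIONARY):
    """Return the canonical names whose surface forms occur in the sentence."""
    found = set()
    for name, forms in dictionary.items():
        for form in forms:
            # Case-insensitive match that must not sit inside a longer word.
            pattern = r"(?<!\w)" + re.escape(form) + r"(?!\w)"
            if re.search(pattern, sentence, re.IGNORECASE):
                found.add(name)
                break
    return found

def cooccurrence_counts(sentences, dictionary=PROTEIN_DICTIONARY):
    """Count, per unordered protein pair, the sentences mentioning both."""
    counts = Counter()
    for sentence in sentences:
        names = sorted(find_proteins(sentence, dictionary))
        for pair in combinations(names, 2):
            counts[pair] += 1
    return counts

# Made-up example sentences, not taken from CORD-19.
sentences = [
    "The spike protein binds ACE2 on the host cell.",
    "ACE2 is the receptor for the S protein.",
    "NSP13 acts as a helicase during replication.",
]
counts = cooccurrence_counts(sentences)
for (gene_a, gene_b), frequency in counts.most_common():
    print(gene_a, gene_b, frequency)
```

Sorting the names before pairing keeps each pair in one canonical order, so (ACE2, S protein) and (S protein, ACE2) are counted together.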
Figure 3. Network analysis properties of the COVID-19 protein co-occurrence network. (a) Centrality measures (closeness, blue; betweenness, green; harmonic closeness, red) of each protein. (b) Hub (red) and authority (blue) values of each protein. (c) PageRank (blue) and clustering (red) statistics of each protein. (d) Triangles (red) and eccentricity (blue) measures of each protein.
Connected proteins in each subnetwork with network analysis measurements.
| Hub Protein | Connected Components | Closeness centrality | Betweenness centrality | Eigenvector centrality | PageRank | Hub |
|---|---|---|---|---|---|---|
| ACE2 | S protein | 0.468354 | 0.069328 | 0.309572 | 0.022098 | 0.089565 |
| | E protein | 0.493333 | 0.033867 | 0.314617 | 0.012692 | 0.103421 |
| | ORF8 | 0.373737 | 0.023807 | 0.082985 | 0.036943 | 0.00629 |
| | Surface glycoprotein | 0.336364 | 0 | 0.04398 | 0.007195 | 0.015058 |
| Helicase | RdRp | 0.587302 | 0.087633 | 0.890853 | 0.065624 | 0.284112 |
| | NSP2 | 0.587302 | 0.046212 | 0.968646 | 0.036751 | 0.333196 |
| | N protein | 0.578125 | 0.083752 | 0.790443 | 0.02517 | 0.305663 |
| | NSP10 | 0.45122 | 0.017466 | 0.523294 | 0.039623 | 0.12841 |
| | ExoN | 0.486842 | 0.014993 | 0.606035 | 0.036957 | 0.228533 |
| | NSP12 | 0.578125 | 0.059686 | 0.946454 | 0.036738 | 0.243958 |
| | NSP13 | 0.397849 | 0 | 0.272434 | 0.01722 | 0.128044 |
| | NSP9 | 0.513889 | 0.010905 | 0.629844 | 0.025474 | 0.183429 |
| | Hel | 0.45122 | 0.002244 | 0.355033 | 0.021652 | 0.040965 |
| S Protein | E protein | 0.493333 | 0.033867 | 0.314617 | 0.012692 | 0.103421 |
| | M protein | 0.406593 | 0.005986 | 0.16895 | 0.014046 | 0.046801 |
| | N protein | 0.578125 | 0.083752 | 0.790443 | 0.02517 | 0.305663 |
| | ORF1a | 0.587302 | 0.094328 | 0.78173 | 0.047027 | 0.247667 |
| | ACE2 | 0.373737 | 0.017162 | 0.093084 | 0.012052 | 0.013174 |
| | RdRp | 0.587302 | 0.087633 | 0.890853 | 0.065624 | 0.284112 |
| N Protein | E protein | 0.493333 | 0.033867 | 0.314617 | 0.012692 | 0.103421 |
| | M protein | 0.406593 | 0.005986 | 0.16895 | 0.014046 | 0.046801 |
| | NSP1 | 0.649123 | 0.215808 | 1 | 0.026757 | 0.347485 |
| | NSP12 | 0.578125 | 0.059686 | 0.946454 | 0.036738 | 0.243958 |
| | NSP15 | 0.468354 | 0.035962 | 0.262739 | 0.015383 | 0.047781 |
| | NSP2 | 0.587302 | 0.046212 | 0.968646 | 0.036751 | 0.333196 |
| | NSP3 | 0.578125 | 0.099375 | 0.767396 | 0.0163 | 0.293389 |
| | NSP4 | 0.528571 | 0.017165 | 0.619071 | 0.017897 | 0.221376 |
| | Helicase | 0.480519 | 0.018757 | 0.649594 | 0.036368 | 0.185347 |
| | ORF1a | 0.587302 | 0.094328 | 0.78173 | 0.047027 | 0.247667 |
| | RdRp | 0.587302 | 0.087633 | 0.890853 | 0.065624 | 0.284112 |
| | S protein | 0.468354 | 0.069328 | 0.309572 | 0.022098 | 0.089565 |
| M Protein | E protein | 0.493333 | 0.033867 | 0.314617 | 0.012692 | 0.103421 |
| | S protein | 0.468354 | 0.069328 | 0.309572 | 0.022098 | 0.089565 |
| | NSP15 | 0.468354 | 0.035962 | 0.262739 | 0.015383 | 0.047781 |
| | N protein | 0.578125 | 0.083752 | 0.790443 | 0.02517 | 0.305663 |
| E Protein | S protein | 0.468354 | 0.069328 | 0.309572 | 0.022098 | 0.089565 |
| | N protein | 0.578125 | 0.083752 | 0.790443 | 0.02517 | 0.305663 |
| | ACE2 | 0.373737 | 0.017162 | 0.093084 | 0.012052 | 0.013174 |
| | M protein | 0.406593 | 0.005986 | 0.16895 | 0.014046 | 0.046801 |
| | NSP1 | 0.649123 | 0.215808 | 1 | 0.026757 | 0.347485 |
| | ORF1a | 0.587302 | 0.094328 | 0.78173 | 0.047027 | 0.247667 |
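
To make the closeness column concrete, here is a minimal sketch of closeness centrality on an unweighted graph, computed with breadth-first search. The edge list is a small illustrative subset built from hub/connected-protein pairs listed in the table (ACE2, S, E, M, and N proteins only), so the printed values are not expected to match the table, which was computed on the full co-occurrence network. The function names are hypothetical, and the source does not state which software produced the reported measures.

```python
from collections import deque

# Illustrative subset of edges taken from hub/connected-protein pairs in the
# table; the full network has many more nodes and edges.
EDGES = [
    ("ACE2", "S protein"), ("ACE2", "E protein"),
    ("S protein", "E protein"), ("S protein", "M protein"),
    ("S protein", "N protein"), ("N protein", "E protein"),
    ("N protein", "M protein"), ("M protein", "E protein"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def bfs_distances(adjacency, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances

def closeness_centrality(adjacency):
    """Closeness = (number of reachable nodes) / (sum of distances to them)."""
    result = {}
    for node in adjacency:
        distances = bfs_distances(adjacency, node)
        total = sum(distances.values())
        reachable = len(distances) - 1
        result[node] = reachable / total if total > 0 else 0.0
    return result

closeness = closeness_centrality(build_adjacency(EDGES))
for protein, value in sorted(closeness.items(), key=lambda item: -item[1]):
    print(f"{protein}: {value:.4f}")
```

In this toy subset the S and E proteins reach every other node in one step, so they score 1.0, while ACE2 needs two steps to reach the M and N proteins and scores lower.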
Molecular docking results of procyanidin with different proteins associated with COVID-19.
| Protein | Binding energy (kcal/mol) | Hydrogen bonds | Hydrophobic interactions | Function of the protein |
|---|---|---|---|---|
| Host translation inhibitor nsp1 | −8.5 | 9Asn, 12Thr, 13His, 81His, 83His, 134His | 14Val, 134His | Inhibits host translation by interacting with the 40S ribosomal subunit |
| Non-structural protein 2 (nsp2) | −8.8 | 125Pro, 127Ala, 129Pro, 263Ser, 263Ser | 129Pro | Plays a role in the modulation of the host cell survival signaling pathway |
| Papain-like proteinase | −9.7 | 178Asp, 181Thr, 211Ser, 215Lys, 218Asp, 807Asp, 1301Leu, 1313Asn | 1301Leu, 1316Pro | Responsible for the cleavages located at the N-terminus of the replicase polyprotein |
| Non-structural protein 4 (nsp4) | −9.7 | 177Glu, 225Thr, 295Thr, 429Lys | 177Glu, 299Tyr, 428Leu | Participates in the assembly of virally induced cytoplasmic double-membrane vesicles necessary for viral replication |
| Proteinase 3CL-PRO | −7.5 | 131Arg, 197Asp, 199Thr, 238Asn, 272Leu | 239Tyr, 272Leu | Cleaves the C-terminus of replicase polyprotein at 11 sites |
| Non-structural protein 6 (nsp6) | −8.9 | 174Asn, 179Val, 255Asn | 171Val, 175Tyr, 184Phe | Plays a role in the initial induction of autophagosomes from the host endoplasmic reticulum |
| Non-structural protein 7 (nsp7) | −8.1 | 5Asp, 34Gln, 37Asn | 7Lys, 37Asn, 41Leu | Forms a hexadecamer with nsp8 |
| Non-structural protein 8 (nsp8) | −8.7 | 101Asp, 105Asn, 108Asn, 109Asn, 140Asn, 148Thr | 104Asn, 148Thr | Forms a hexadecamer with nsp7 |
| Non-structural protein 9 (nsp9) | −8.4 | 11Gln, 26Asp, 29Leu, 47Asp, 86Lys | 29Leu | Participates in viral replication by acting as an ssRNA-binding protein |
| Non-structural protein 10 (nsp10) | −8.3 | 28Lys, 85Asn, 87Asp, 91Asp, 92Leu, 116Val | 112Leu | Plays a pivotal role in viral transcription by stimulating both nsp14 3′-5′ exoribonuclease and nsp16 2′-O-methyltransferase activities |
| RNA-directed RNA polymerase (RdRp) | −8.9 | 456Tyr, 556Thr, 622Cys, 624Arg, 682Ser | 555Arg | Responsible for replication and transcription of the viral RNA genome |
| Helicase (Hel) | −9.5 | 139Lys, 142Glu, 178Arg, 179Asn, 339Arg, 361Asn, 383Asp, 410Thr | 178Arg, 179Asn, 410Thr | Multi-functional protein with a zinc-binding domain at the N-terminus, displaying RNA and DNA duplex-unwinding activities with 5′ to 3′ polarity |
| Guanine-N7 methyltransferase (ExoN) | −9.6 | 290Val, 292Trp, 354Gln, 286Asn, 422Asn, 426Phe | 290Val, 292Trp, 335Pro, 336Lys, 424His | Exoribonuclease activity acting on both ssRNA and dsRNA in a 3′ to 5′ direction, and N7-guanine methyltransferase activity |
| Uridylate-specific endoribonuclease (NendoU) | −8.2 | 62Asn, 64Lys, 82Asn, 83Thr, 124Asp | 15Phe, 60Lys, 63Ile, 64Lys, 83Thr | Uridylate-specific endoribonuclease |
| 2′-O-methyltransferase (2′-O-MT) | −10.1 | 30Tyr, 46Lys, 74Ser, 130Asp, 170Lys, 198Asn, 203Glu | 76Lys | Mediates 2′-O-ribose methylation of the 5′-cap structure of viral mRNAs |
| Surface glycoprotein/Spike protein (S) | −10.3 | 801Asn, 925Asn, 929Ser, 935Gln, 936Asp, 1138Tyr | 928Asn, 1140Pro | Binding to human ACE2 and other possible receptors |
| ORF3a | −7.9 | 10Ile, 12Thr, 131Trp, 134Arg, 135Ser, 227His | 9Thr | Forms homotetrameric potassium-sensitive ion channels (viroporin) and may modulate virus release |
| E Protein | −9.0 | 2Tyr, 32Ala | 25Val, 28Leu, 74Leu | Plays a central role in virus morphogenesis and assembly |
| M Protein | −7.8 | 4Ser, 42Arg, 107Arg, 125His, 126Gly | 8Ile, 34Leu, 97Ile, 107Arg, 110Trp, 128Ile | Component of the viral envelope that plays a central role in virus morphogenesis and assembly via its interactions with other viral proteins |
| ORF6 | −6.7 | 13Glu, 38Lys | 12Ala, 38Lys | Stimulates cellular DNA synthesis |
| ORF7a | −8.0 | 60Ser, 61Thr, 62Gln | 109Phe, 110Ile | Suppression of host tetherin activity |
| ORF8 | −8.9 | 48Arg, 53Lys | 6Phe, 7Leu, 10Ile, 52Arg | |
| N Protein | −9.7 | 3Asp, 17Phe, 19Gly, 21Ser, 23Ser, 25Gly, 281Gln, 393Thr | 394Leu, 395Leu, 403Phe | Packages the positive-strand viral genome RNA into a helical ribonucleocapsid (RNP) and plays a fundamental role during virion assembly through its interactions with the viral genome and membrane protein M |
| ORF10 | −7.2 | 15Ser, 20Arg | 11Phe, 13Ile, 16Leu | Exact function is not known but identified in the pathway |
| ACE2 | −8.9 | 44Ser, 47Ser, 350Asp, 382Asp, 385Tyr, 393Arg, 394Asn, 401His | 40Phe, 390Phe | Receptor that binds the virus protein |
| Mpro | −9.2 | 26Thr, 46Ser, 143Gly, 144Ser, 145Cys | 142Asn | |
| Spike | −9.5 | 375Ser, 376Thr, 404Gly, 405Asp, 408Arg, 410Ile | | |
Figure 4. ACE2-procyanidin docked complex and representation of the different interactions: hydrogen bonds, hydrophobic, van der Waals, and electrostatic interactions. Hydrogen bonds are marked with green lines, and red colour in the left panel represents the hydrophobic interactions.
Figure 5. Mpro-procyanidin docked complex and representation of the different interactions: hydrogen bonds, hydrophobic, van der Waals, and electrostatic interactions. Hydrogen bonds are marked with green lines, and red colour in the left panel represents the hydrophobic interactions.
Figure 6. Spike protein-procyanidin docked complex and representation of the different interactions: hydrogen bonds, hydrophobic, van der Waals, and electrostatic interactions. Hydrogen bonds are marked with green lines, and red colour in the left panel represents the hydrophobic interactions.
Figure 7. Cα-based root-mean-square fluctuations (RMSF) calculated through principal component analysis. The top panel shows the RMSF along the first eigenvector and the bottom panel shows the RMSF along the second eigenvector. Native proteins are shown in cyan and procyanidin-bound proteins in red. (a) Native ACE2, (b) ACE2-procyanidin, (c) Mpro, (d) Mpro-procyanidin, (e) spike protein, (f) spike protein-procyanidin. All procyanidin-bound proteins show higher RMSF than their native forms.
Figure 8. (a) Radius of gyration of the native proteins and the procyanidin-protein complexes. Native proteins show stable and lower values than the procyanidin-bound proteins. (b) Number of hydrogen bonds formed between procyanidin and the proteins. (c) Electrostatic interaction energy between procyanidin and the proteins. (d) van der Waals interaction energy between procyanidin and the proteins. The van der Waals interaction energies are found to be higher in all the proteins. (e)-(g) Energy contribution of residues at the binding site: (e) ACE2, (f) Mpro, and (g) spike protein.
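
For readers unfamiliar with the quantity in panel (a), the radius of gyration is the mass-weighted root-mean-square distance of the atoms from their centre of mass. The sketch below is a generic single-frame calculation on made-up coordinates; the function name `radius_of_gyration` and the example data are assumptions for illustration, and this is not the analysis tool used in the study.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of atoms from their centre of mass."""
    total_mass = sum(masses)
    # Centre of mass, one component per Cartesian axis.
    centre = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted_sq = sum(
        m * sum((c[axis] - centre[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Made-up frame: four equal-mass atoms at unit distance from the origin.
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
masses = [12.0, 12.0, 12.0, 12.0]
rg = radius_of_gyration(coords, masses)
print(f"Rg = {rg:.3f}")
```

Applied frame by frame to a trajectory, a larger value indicates a less compact structure, which is how the comparison between native and procyanidin-bound proteins in panel (a) is read.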
Figure 12. Binding energy decomposition obtained from MM-PBSA calculations for (a) ACE2, (b) Mpro, and (c) spike protein. High van der Waals interaction energies are observed in all cases. Epol represents the polar solvation energy, Eapo the apolar solvation energy, and Eele and Evdw the electrostatic and van der Waals interaction energies, respectively. Delta-E represents the total binding energy.
Binding energies calculated by the MM-PBSA method.
| Protein | van der Waals energy (kJ/mol) | Electrostatic energy (kJ/mol) | Polar solvation energy (kJ/mol) | SASA energy (kJ/mol) | Binding energy (kJ/mol) |
|---|---|---|---|---|---|
| ACE2 | −184.16 ± 3.1 | −52.72 ± 5.2 | 208.61 ± 3.03 | −21.94 ± 1.03 | −50.21 ± 6.3 |
| Mpro | −193.13 ± 1.3 | −69.83 ± 3.06 | 194.74 ± 1.06 | −21.28 ± 3.02 | −89.50 ± 6.32 |
| Spike | −53.53 ± 1.6 | −31.47 ± 2.03 | 68.94 ± 2.36 | −7.00 ± 3.12 | −23.06 ± 4.39 |
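
The binding energy in the last column is the sum of the four component terms in each row. The short check below recomputes that sum from the mean values in the table and compares it with the reported total; the uncertainties are left out, and the dictionary and function names are chosen for this example only.

```python
# Mean energy terms in kJ/mol, copied from the table above (errors omitted).
TERMS = {
    "ACE2":  {"vdw": -184.16, "ele": -52.72, "polar": 208.61, "sasa": -21.94},
    "Mpro":  {"vdw": -193.13, "ele": -69.83, "polar": 194.74, "sasa": -21.28},
    "Spike": {"vdw": -53.53,  "ele": -31.47, "polar": 68.94,  "sasa": -7.00},
}
# Reported binding energies in kJ/mol, from the same table.
REPORTED = {"ACE2": -50.21, "Mpro": -89.50, "Spike": -23.06}

def binding_energy(terms):
    """Sum of van der Waals, electrostatic, polar solvation, and SASA terms."""
    return terms["vdw"] + terms["ele"] + terms["polar"] + terms["sasa"]

for protein, terms in TERMS.items():
    total = binding_energy(terms)
    print(f"{protein}: {total:.2f} kJ/mol (reported {REPORTED[protein]:.2f})")
```

The favourable van der Waals and electrostatic terms are largely offset by the unfavourable polar solvation term, which is why the van der Waals contribution dominates the net binding energy.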