
Consensus virtual screening of dark chemical matter and food chemicals uncover potential inhibitors of SARS-CoV-2 main protease.

Marisa G Santibáñez-Morán1, Edgar López-López2, Fernando D Prieto-Martínez1, Norberto Sánchez-Cruz1, José L Medina-Franco1.   

Abstract

The pandemic caused by SARS-CoV-2 (COVID-19 disease) has claimed more than 500 000 lives worldwide, and more than nine million people are infected. Unfortunately, an effective drug or vaccine for its treatment is yet to be found. The increasing information available on critical molecular targets of SARS-CoV-2 and active compounds against related coronaviruses facilitates the proposal (or repurposing) of drug candidates for the treatment of COVID-19, with the aid of in silico methods. As part of a global effort to fight the COVID-19 pandemic, herein we report a consensus virtual screening of extensive collections of food chemicals and compounds known as dark chemical matter. The rationale is to contribute to global efforts with a description of currently underexplored chemical space regions. The consensus approach included combining similarity searching with various queries and fingerprints, molecular docking with two docking protocols, and ADMETox profiling. We propose compounds commercially available for experimental testing. The full list of virtual screening hits is disclosed. This journal is © The Royal Society of Chemistry.

Year:  2020        PMID: 35517466      PMCID: PMC9055157          DOI: 10.1039/d0ra04922k

Source DB:  PubMed          Journal:  RSC Adv        ISSN: 2046-2069            Impact factor:   4.036


Introduction

Coronaviruses (COVs) can infect humans and other animal species. Some of them cause previously studied diseases such as Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS). SARS-CoV-2 is an emergent virus that causes COVID-19,[1] which the World Health Organization (WHO) currently classifies as a pandemic, with more than ten million confirmed cases and more than 500 000 deaths worldwide (as of June 30th, 2020).[2] SARS-CoV-2 has a complex architecture and, as with other viruses, several proteins are involved in viral internalization and replication. The life cycle of SARS-CoV-2 starts with the recognition of the viral spike protein by host cell factors (the ACE2 receptor and TMPRSS2). After that, the internalization and uncoating process is mediated by membrane proteins. Once inside the host cell, RNA replication and biosynthesis of viral polypeptides are carried out (RdRp and ribosomes, respectively). Finally, the processing of precursor proteins by the main protease (3CLpro or Mpro) and their assembly contribute to the generation of new virions.[3-5] These main targets offer an avenue for the development of new treatments via rational drug design. Examples include the spike protein, the RNA polymerase, and the chymotrypsin-like cysteine protease (3CLpro or Mpro), which are presented in Fig. 1.[3-5] Of these, the main protease (Mpro) is a promising target for the design and proposal of new therapies due to the lack of homologous proteins in humans.[6] Its selective inhibition would also interrupt the natural life cycle of SARS-CoV-2, preventing viral replication and dissemination. Several research groups are actively pursuing Mpro as a molecular target to identify drug candidates for the treatment of COVID-19.
Fig. 1

Schematic life cycle and main studied targets of SARS-CoV-2. (A) Cellular recognition; (B) internalization and uncoating process; (C) biosynthesis of viral proteins and RNA replication; and (D) assembly of new virions.

Computational methods can efficiently filter large and diverse compound libraries to select potential candidates for drug development.[7,8] Recently published works show a tendency towards drug repurposing and towards screening structurally diverse libraries (e.g., with broad scaffold diversity) and natural products.[9-13] Moreover, the search for novel compounds that are commercially available or can be synthesized has seen renewed interest (e.g., screening part or all of the ZINC database).[9,14-16] Table 1 summarizes representative examples of virtual screening (VS) studies directed to different molecular targets, including SARS-CoV-2 Mpro. Most of these efforts relied on structure-based drug design (SBDD); a few others included similarity searching and quantitative structure–activity relationship (QSAR) modeling.[17] Thus, many compounds suggested by computational methods could be evaluated quickly with in vitro techniques. However, computational consensus methodologies could improve on the performance of each individual technique.

Table 1

Representative virtual screening studies to identify drug candidates for the treatment of COVID-19

| Target | Experimental methods | Libraries | Compounds screened/outcome | Ref. |
|---|---|---|---|---|
| Mpro | Deep docking | ZINC 15 | 1.3 billion/1,000ᵃ | 9 |
| Mpro | Pharmacophore model, molecular docking, and dynamics | Marine natural products | 14 064/17ᵃ | 10 |
| Mpro | Pharmacophore screening and molecular docking | ZINC | 50 000/10ᵃ | 15 |
| Spike protein | Homology modeling and molecular docking | FDA | 3300/12ᵃ | 18 |
| Mpro, PLpro and RdRp | Homology modeling, molecular docking, and dynamics | DrugBank and traditional Chinese medicine | 1973/57ᵃ | 11 |
| ACE2 | Molecular docking | Literature compilation (natural products) | –/5ᵃ | 12 |
| Mpro | Molecular docking | Literature compilation (natural products) | 80/8ᵃ | 13 |
| Mpro | Molecular docking | FDA | 486/20ᵃ | 19 |
| Mpro | Molecular docking, and dynamics | ZINC | 606 million/12ᵃ | 20 |
| Mpro | Similarity search and QSAR modeling | DrugBank (marketed, withdrawn, experimental, and investigational) | 9615/41ᵃ | 17 |
| Mpro | Molecular docking and dynamics | DrugBank (approved and drug candidates in clinical trials) | 2201/5ᵃ | 21 |
| Mpro and TMPRSS2 | Homology modeling and molecular docking | ZINC | 34 500/8ᵃ | 14 |
| Mpro | Induced fit docking | In-house | 10 000/6ᵇ | 22 |

ᵃ Computational hits.

ᵇ Active hits.

The goal of this work is to propose active compounds against Mpro from SARS-CoV-2 and related coronaviruses. One of the novelties of the present study lies in the chemical space probed: food chemicals and molecules in the Dark Chemical Matter (DCM) collection, which, to the best of our knowledge, have been explored for SARS-CoV-2 only on a limited basis. Thus, the rationale was to expand the search of chemical space and suggest molecules for experimental screening. Active compounds could later be optimized to increase activity. As a screening strategy, we started with similarity searching using different fingerprints to pre-select compounds using data fusion strategies. Compounds selected by similarity searching were then screened by molecular docking with two different programs. The final selection of computational hits was based on consensus scoring, information on protein–ligand contacts, and the ADMETox (absorption, distribution, metabolism, excretion, and toxicity) profile of the compounds. Additional criteria used to guide the selection of hit candidates for testing included predictions by freely available machine learning (ML) models for SARS-CoV-2 activity developed by Collaborations Pharmaceuticals, Inc.[23]

Materials and methods

Herein we combined ligand- and structure-based methods to virtually screen compounds from two primary molecular databases and select hit candidates for testing. Ligand-based methods were based on similarity searching using the principles of data fusion.[24,25] Structure-based approaches were based on molecular docking and consensus scoring.[26] The selection of hit compounds also considered the predicted ADMETox profile as well as predictions by ML models made freely available by Collaborations Pharmaceuticals. Fig. 2 outlines the main VS strategy and hit selection. Overall, two general approaches were considered, distinguished by the type of reference compounds used in the similarity searching. In the first approach (left-hand side of Fig. 2), three HIV-1 protease inhibitors approved for clinical use were used as queries. As elaborated below in Section 3.1, the three compounds have shown in vitro activity against SARS-CoV or SARS-CoV-2. In the second approach (right-hand side of Fig. 2), 1052 compounds with potential affinity for SARS-CoV-2 Mpro or SARS-CoV Mpro were used as queries. The workflow in Fig. 2 is described in more detail in the next subsections.
Fig. 2

General workflow of the virtual screening approach used in this work.

Screening and reference databases

Table 2 summarizes the four major types of data sets considered in this study.

Table 2

Main screening data sets and reference compounds considered in this work

| Dataset | Content overview and sizeᵃ | Rationale | Ref. |
|---|---|---|---|
| Actives | N3, alpha-ketoamides 11a, 11r, and 11s, carmofur, cinanserin, disulfiram, ebselen, PX12, shikonin, and tideglusib | Reference compounds used in docking to compare docking scores and predicted binding modes | 22 and 27 |
| FooDB | 22 880 compounds | Large library of food chemicals. Smaller food chemical data sets have been screened | 28 |
| Dark chemical matter (DCM) | 139 329 compounds | Large, underexplored screening library. Likelihood to shed light into the darkness of the COVID-19 pandemic | 29 |
| ZINC (top-ranked hits) | 10 top-ranked virtual screening hits of ZINC using deep docking/Glide and SARS-CoV-2 Mpro (PDB ID: 6LU7) | Further consensus of published computational hits with other docking programs (Vina and MOE) | 9 |

ᵃ After data curation.

One of the screening databases was the public food chemical database (FooDB), with 23 883 compounds.[28] The chemical diversity and coverage of chemical space of FooDB have been reported, revealing that food chemicals are structurally diverse and have, in general, large molecular complexity.[30] DCM was the other screening database. DCM is a collection of 139 352 compounds that had shown no activity when tested in at least 100 screening assays.[29] Even though DCM has a low activity profile against common targets, the rationale for screening this collection was to explore regions of chemical space that are currently overlooked. Moreover, DCM has yielded active molecules in other assays,[31,32] supporting the value of screening this region of chemical space. The structures of FooDB and DCM were curated and standardized with RDKit, CDK (Chemistry Development Kit), and ChemAxon tools. For molecules with more than one fragment, the largest component was retained; compounds containing an atom type other than H, C, O, N, S, P, F, Cl, Br, I, B, Si, and Se were removed. The lowest-energy tautomer of each remaining compound was generated. After curation, FooDB and DCM contained 22 880 and 139 329 compounds, respectively (Table 2).

Active compounds from the study of Jin et al.[22] were used as a reference. These were the peptide-like inhibitor N3, carmofur, cinanserin, disulfiram, ebselen, PX12, shikonin, tideglusib, and alpha-ketoamides (11a, 11r, 11s).[27] Lopinavir, nelfinavir, and ritonavir were additional reference compounds for the molecular docking performed in AutoDock Vina. To identify additional potential hit compounds, we included the top 10 ranked virtual screening hits from the study of Ton et al.[9] The authors of that work screened the ZINC database against SARS-CoV-2 Mpro (PDB ID 6LU7) using the docking program Glide. The rationale for using this set was to explore further the predicted profile of top-ranked compounds using different docking programs (i.e., Vina and MOE, vide infra).
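The two curation rules stated above (keep the largest component, then discard compounds with atom types outside the allowed list) can be illustrated with a minimal, toolkit-free sketch. This is not the RDKit/CDK/ChemAxon workflow used in the study: molecules are represented here as hypothetical lists of fragments, each fragment a list of element symbols, "largest" is assumed to mean the highest heavy-atom count, and the element filter is assumed to be applied after fragment stripping. Tautomer generation is not covered.

```python
# Allowed atom types, as listed in the text.
ALLOWED_ELEMENTS = frozenset(
    {"H", "C", "O", "N", "S", "P", "F", "Cl", "Br", "I", "B", "Si", "Se"}
)


def heavy_atom_count(fragment):
    """Number of non-hydrogen atoms in a fragment (assumed size measure)."""
    return sum(1 for element in fragment if element != "H")


def largest_fragment(fragments):
    """Return the fragment with the most heavy atoms."""
    return max(fragments, key=heavy_atom_count)


def curate(molecules):
    """Keep the largest fragment of each molecule and drop molecules whose
    retained fragment contains an element outside ALLOWED_ELEMENTS.

    `molecules` maps a (hypothetical) compound ID to a list of fragments.
    """
    curated = {}
    for mol_id, fragments in molecules.items():
        if not fragments:
            continue
        main = largest_fragment(fragments)
        if set(main) <= ALLOWED_ELEMENTS:
            curated[mol_id] = main
    return curated


# Toy input: a sodium salt (counter-ion is stripped) and a platinum complex
# (removed because Pt is not an allowed atom type).
toy_library = {
    "salt": [["C", "C", "O", "O"], ["Na"]],
    "complex": [["C", "N", "Pt"]],
}
curated_library = curate(toy_library)
```

In a real pipeline the same two steps would be carried out on parsed chemical structures rather than on element lists.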

Similarity searching

Eight two-dimensional molecular fingerprints (Molecular ACCess System (MACCS) keys (166 bits), Morgan 2 [ECFP4-like], Morgan 3 [ECFP6-like], FeatMorgan, AtomPair, Torsion, Layered, and Pattern) were generated for all the queries, the 22 880 compounds in FooDB, and the 139 329 molecules in DCM. In the first virtual screening approach (Fig. 2), nelfinavir, lopinavir, and ritonavir were used as independent queries (vide infra). The molecular similarity between each of the queries and each of the molecules in FooDB and DCM was estimated with the Tanimoto coefficient.[33] Compounds with a Tanimoto coefficient higher than the median plus two standard deviations were considered hits. Molecules labeled as hits according to more than one molecular fingerprint (consensus hits) were selected. The consensus hits for the three queries were additionally analyzed by molecular docking. In the second approach (Fig. 2), 1052 compounds with potential affinity for SARS-CoV-2 Mpro or SARS-CoV Mpro were selected from published molecular docking studies[9,19,27,34,35,59] and used as queries. The structure file with the chemical structures of the 1052 compounds is available in the ESI.† Mean-fusion and max-fusion similarity scores were determined using the eight molecular fingerprints and the Tanimoto coefficient.[36] Compounds with max-fusion and mean-fusion similarity scores higher than the median plus two standard deviations for more than one fingerprint were selected as consensus hits and evaluated by molecular docking. The molecular similarity analyses were carried out in KNIME, employing the RDKit node for molecular fingerprint generation and the CDK node for the similarity calculation.[37,38]
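The selection logic described above can be sketched in plain Python. This is an illustration only, not the KNIME/RDKit/CDK workflow used in the study: fingerprints are represented as hypothetical sets of on-bit indices, the function names are invented for this example, and the sample standard deviation is assumed because the text does not specify which estimator was used.

```python
from statistics import mean, median, stdev


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def fused_scores(query_fps, library_fps, fuse=max):
    """Fuse the similarities of each library compound to all queries.

    fuse=max gives max-fusion; fuse=mean gives mean-fusion.
    """
    return {
        cid: fuse([tanimoto(q, fp) for q in query_fps])
        for cid, fp in library_fps.items()
    }


def hits_above_cutoff(scores):
    """IDs whose score exceeds median + 2 * standard deviation."""
    values = list(scores.values())
    cutoff = median(values) + 2 * stdev(values)  # sample SD is an assumption
    return {cid for cid, score in scores.items() if score > cutoff}


def consensus_hits(hits_per_fingerprint, min_fingerprints=2):
    """IDs flagged as hits by more than one fingerprint."""
    counts = {}
    for hits in hits_per_fingerprint:
        for cid in hits:
            counts[cid] = counts.get(cid, 0) + 1
    return {cid for cid, n in counts.items() if n >= min_fingerprints}


# Toy example with two hypothetical fingerprint types.
hits_fp1 = hits_above_cutoff({f"c{i}": 0.1 for i in range(9)} | {"c9": 0.9})
hits_fp2 = hits_above_cutoff({f"c{i}": 0.2 for i in range(9)} | {"c9": 0.8})
selected = consensus_hits([hits_fp1, hits_fp2])
```

For the single-query approach, `fused_scores` is not needed: the Tanimoto values to one query are thresholded directly with `hits_above_cutoff`, once per fingerprint, before taking the consensus.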

Molecular docking

To enhance the likelihood of finding active compounds, two docking programs with different algorithms were used, namely AutoDock Vina, version 1.1.2,[39] and Molecular Operating Environment (MOE) v.2019.[40] As explained below, the docking protocol for each program was validated with the available experimental information. Docking with AutoDock Vina was conducted with two crystallographic structures obtained from the Protein Data Bank (PDB),[41] namely SARS-CoV-2 Mpro (PDB ID 6LU7)[22] and the structurally related SARS-CoV Mpro (PDB ID 5N5O).[42] These structures are co-crystallized with a peptide-like inhibitor (N3) and an alpha-ketoamide inhibitor (11s), respectively. The crystal structures were prepared in AutoDock Tools. The grid box was constructed based on the binding site of the alpha-ketoamide inhibitors 11a and 11s. The ligands were normalized, their clean 3D form was generated, hydrogens were added, and the molecules were optimized using the Universal Force Field (UFF) in KNIME. The results were visualized in PyMol (version 2.3). An induced fit docking protocol for SARS-CoV-2 Mpro (PDB ID 6LU7) was carried out with MOE v.2019. The protein was prepared with the “Quick prepare” tool using the parameters assigned by the PFROSST force field. The peptide-like inhibitor N3 was removed, and its binding site was used to direct the docking. Poses generated with the Triangle Matcher method were refined with the induced fit protocol, and all other parameters were left at their default values. This protocol was validated using experimental information recently published by Jin et al.[22] The binding poses were successfully reproduced. The binding scores showed a correlation of 0.703 with the in vitro inhibition values of the data set.
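The validation step above compares docking scores with measured inhibition values. The text does not state which correlation coefficient was used; the sketch below assumes Pearson's r, and the numbers in the usage example are placeholders rather than data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder values only: docking scores (kcal/mol) and an activity measure
# for the same hypothetical reference compounds, in the same order.
docking_scores = [-9.4, -8.1, -7.0, -6.2, -5.2]
activity_values = [0.7, 1.8, 2.1, 9.4, 21.0]
r = pearson_r(docking_scores, activity_values)
```

Whether a rank-based coefficient (e.g., Spearman) would be more appropriate depends on how the inhibition values are distributed, which the text does not specify.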

ADME/Tox profiling

Early consideration of ADME/Tox properties is fundamental in current drug discovery efforts. Among the several free chemoinformatic resources available,[43] we employed SwissADME[44] to calculate more than 40 related properties, including descriptors associated with drug-likeness, solubility, blood-brain barrier (BBB) permeability, Pgp substrate status, inhibition of CYPs, Bioavailability Score, PAINS alerts, and the number of violations of empirical rules (Lipinski, Veber, Egan, Brenk). The full list of ADME/Tox-related properties calculated with SwissADME is in the ESI.† We have used SwissADME to profile other compound databases of pharmaceutical relevance.[45]

Results and discussion

We describe the results of similarity searching, molecular docking, and ADMETox profiling, followed by the combined analysis used to select hit compounds for experimental testing. As previously stated, Mpro is a promising drug target due to its importance in the COVs life cycle (Fig. 1, vide supra). The recently published SARS-CoV-2 Mpro crystal structure showed a 96% similarity with SARS-CoV Mpro and conservation of the active binding site. To search for SARS-CoV-2 Mpro inhibitors in underexplored regions of chemical space, we assessed the molecular similarity of the FooDB and DCM databases to compounds that potentially inhibit SARS-CoV Mpro or SARS-CoV-2 Mpro. As a first approach, three HIV-1 protease inhibitors approved for clinical use, namely lopinavir, ritonavir, and nelfinavir, were used as queries or reference compounds. Lopinavir and ritonavir have shown activity against SARS-CoV[46,47] and are currently in clinical trials for the treatment of COVID-19. In addition, molecular dynamics predicted binding affinity of both molecules for the active site of SARS-CoV Mpro,[48] and there is recent evidence of in vitro activity of lopinavir against SARS-CoV-2.[49] Another protease inhibitor with in vitro activity against SARS-CoV,[50] nelfinavir, has been predicted by molecular dynamics to have high binding affinity for SARS-CoV-2 Mpro.[51,52] Thus, nelfinavir was also included as a reference for the similarity search. Despite those observations, there is still no conclusive evidence of the effectiveness of these drugs in the treatment of COVID-19,[53-57] which encourages the identification of other existing molecules that target SARS-CoV-2. After the ligands were prepared (as described in the Methods, Section 2.3), 143 consensus hits from FooDB were found to be highly similar to nelfinavir, lopinavir, and ritonavir (i.e., with similarity values above the median plus two standard deviations).
From the 143 consensus hits, 40 compounds with drug-like properties were selected for further analysis. Five hundred compounds with significantly high Tanimoto similarity values to nelfinavir, lopinavir, and ritonavir were selected from the DCM database. DCM compounds are constantly tested in HTS assays and were therefore considered to have suitable physicochemical properties for drug development. It is thus not surprising that a larger number of consensus hits for the three drugs was found in DCM, considering that the molecular and physicochemical properties of DCM do not differ significantly from those of approved drugs. In contrast, FooDB was not assembled to be “drug-like.” A small dataset of 1052 compounds with predicted affinity for SARS-CoV-2 Mpro was assembled to broaden the search for potential Mpro inhibitors. Although these alternative reference compounds are only potentially (not confirmed) active, it has been suggested that they can increase the likelihood of identifying active molecules. Such an approach is reminiscent of what has been described as “turbo-similarity searching”.[58] As more data become available, a larger and more chemically diverse set could be integrated. Meanwhile, the top hits reported in six peer-reviewed molecular docking studies were included.[9,19,27,34,35,59] After ligand preparation, 178 and 174 consensus hits were recovered from FooDB and DCM, respectively. Significant hits were found for five of the eight molecular fingerprints, highlighting the advantages of using multiple molecular fingerprints.[60] Four compounds were consensus hits in both similarity searching approaches: DBB13044 and DBB18117 from FooDB, and DCM33835 and DCM97265 from the DCM database. The total number of consensus hits further analyzed by molecular docking and in silico ADMETox profiling was 888 compounds (including stereoisomers). Molecular docking against SARS-CoV Mpro (PDB ID 5N5O) was performed with AutoDock Vina.
The docking scores for the reference compounds ranged from −8.5 to −4.1 kcal mol−1, with a mean value of −6.8 kcal mol−1. Of note, lopinavir, ritonavir, and nelfinavir were included as references. A total of 393 of the hits selected by molecular similarity had docking scores above (i.e., less favorable than) this mean value. However, reference compounds with docking scores above the mean value, such as ebselen (−6.2 kcal mol−1), still bound to the active site of SARS-CoV Mpro through four hydrogen bonds with residues Lys141, Gly143, Ser144, and Cys145. Hence, a hard cut-off value based purely on docking scores was not established. The docking scores for the reference compounds docked to SARS-CoV-2 Mpro in MOE ranged from −9.4 to −5.16 kcal mol−1. Fig. 3 shows the predicted binding modes of representative hit compounds with Mpro. As discussed below in Section 3.4, Hit Selection (vide infra), the selected hit compounds shown in Fig. 3 had favorable docking scores with Vina and MOE and had at least one interaction with the catalytic residues His41, Cys145 and/or Glu166 (key interactions reported).[22] According to the docking models, other important interactions were observed. DBB2790 makes Pi–H interactions with the sidechain of His41, an H-bond interaction with the sidechain of Cys145, and H-bond interactions with the sidechain and backbone of Glu166; DCM78683 makes H-bond interactions with the sidechains of Asn142 and Cys145; and DCM111769 makes Pi–H interactions with Glu166. These proposed compounds are predicted to bind preferentially to the P1, P2, and P3 regions.
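The comparison of hit scores against the reference compounds can be written as a short sketch. The function names and all values are hypothetical; the only convention taken from the text is that more negative docking scores are more favorable, so a score "above" the reference mean is less favorable.

```python
from statistics import mean


def summarize_reference_scores(reference_scores):
    """Best (most negative), worst, and mean docking score of the references."""
    values = list(reference_scores.values())
    return {"best": min(values), "worst": max(values), "mean": mean(values)}


def less_favorable_than(hit_scores, cutoff):
    """IDs of hits whose docking score is above (less favorable than) cutoff."""
    return sorted(cid for cid, score in hit_scores.items() if score > cutoff)


# Placeholder scores in kcal/mol, not values from the study.
references = {"ref_a": -8.0, "ref_b": -6.0, "ref_c": -4.0}
hits = {"hit_1": -7.5, "hit_2": -5.0, "hit_3": -6.0}

summary = summarize_reference_scores(references)
flagged = less_favorable_than(hits, summary["mean"])
```

Consistent with the decision not to apply a hard cut-off, such a flag would serve only as one input to the later consensus selection, alongside the predicted contacts with the catalytic residues.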
Fig. 3

Binding modes of three selected hits within SARS-CoV-2 Mpro (PDB ID 6LU7) as predicted by Molecular Operating Environment.

A literature survey revealed that the VS hit DBB2790 (Fig. 3) has a high structural similarity to compound GC373 (a molecule with nanomolar activity against Mpro from SARS-CoV-2).[61] In 2013 Kim et al. reported GC373 as an inhibitor of Mpro from feline coronavirus.[62] Moreover, the protein–ligand interactions of both compounds and Mpro are similar. These observations support the potential antiviral activity of DBB2790.

ADMETox

For 888 selected hits, the ADMETox-related descriptors were computed with SwissADME. As described hereunder, some of these descriptors were used as a guide for the classification of hit compounds in different priority groups. The main types of ADMETox descriptors considered were those associated with drug-likeness, solubility, and cytochromes' inhibition.

Hit selection

Instead of establishing stringent (and arguably heuristic and hard) cut-off values, the compounds selected by molecular similarity were classified into four groups considering their interactions with the catalytic residues of the SARS-CoV-2 Mpro (H41 and C145), their commercial availability, ADMETox characteristics, and their predicted activity by ML. Thereby, most compounds with suitable profiles were classified into one of the groups. The number of the group is associated with the priority for acquisition and testing. Table 3 summarizes the group classification strategy and the number of compounds that were classified into each group. A further description of each group is presented below.

Table 3

Summary of the classification criteria to prioritize the compounds in four groups for testing. The number of compounds in each group is indicated

| Group | Number of compounds | Commercial availabilityᵃ | In silico safety criteriaᵇ | Hydrogen bonds with H41 or C145 | Active according to machine learning |
|---|---|---|---|---|---|
| 1 | 41 | Available | Safe | Present | Active/inactive |
| 2 | 10 | Available | Safe | Not present | Active |
|   |    | Available | Not safe | Present | Active |
| 3 | 34 | Not available | Safe | Present | Active/inactive |
| 4 | 20 | Not available | Safe | Not present | Active |
|   |    | Not available | Not safe | Present | Active |

ᵃ Compounds reported as “in-stock” in the ZINC database were considered commercially available.

ᵇ Compounds that do not have PAINS alerts, do not pass through the BBB, and are predicted to not inhibit CYP1A2, CYP2C19, CYP2C9, CYP2D6 or CYP3A4.
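The classification rules summarized in the table above can be expressed as a small decision function. This is a sketch of our reading of those rules rather than the code used in the study; the `Hit` record and its field names are hypothetical, and the input flags would in practice come from ZINC (availability), SwissADME (safety), the docking poses (hydrogen bonds), and the ML model.

```python
from dataclasses import dataclass

CYP_ISOFORMS = frozenset({"CYP1A2", "CYP2C19", "CYP2C9", "CYP2D6", "CYP3A4"})


@dataclass(frozen=True)
class Hit:
    available: bool            # reported as "in-stock" in ZINC
    pains_alerts: int          # number of PAINS alerts
    bbb_permeant: bool         # predicted to pass through the BBB
    cyp_inhibited: frozenset   # names of CYP isoforms predicted to be inhibited
    hbond_catalytic: bool      # H-bond with H41 or C145 in the docking pose
    ml_active: bool            # predicted active by the ML model


def is_safe(hit):
    """In silico safety criteria: no PAINS, no BBB passage, no CYP inhibition."""
    return (
        hit.pains_alerts == 0
        and not hit.bbb_permeant
        and not (CYP_ISOFORMS & hit.cyp_inhibited)
    )


def priority_group(hit):
    """Return the priority group (1-4) or None for non-priority compounds."""
    safe = is_safe(hit)
    if safe and hit.hbond_catalytic:
        # Groups 1 and 3 do not depend on the ML prediction.
        return 1 if hit.available else 3
    if hit.ml_active and (safe != hit.hbond_catalytic):
        # Exactly one of the two criteria (safety, H-bond) is met.
        return 2 if hit.available else 4
    return None


example = Hit(
    available=True,
    pains_alerts=0,
    bbb_permeant=False,
    cyp_inhibited=frozenset(),
    hbond_catalytic=True,
    ml_active=False,
)
group = priority_group(example)
```

Compounds that fail both the safety and hydrogen-bond criteria, or that meet only one of them without an ML prediction of activity, fall outside the four groups.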

Group 1 includes commercially available compounds that meet our safety criteria (based on the predictions of SwissADME), i.e., they do not have PAINS alerts, do not pass through the BBB, and do not inhibit CYP1A2, CYP2C19, CYP2C9, CYP2D6 or CYP3A4. The molecules in this group are predicted to form hydrogen bonds with at least one of the catalytic residues of PDB ID 6LU7. Table 4 summarizes the 41 molecules that fell into this top priority group.

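Table 4

Virtual screening hits selected. The complete hit list is available in the ESI

| Set | ID / ZINC ID | Vina's score 5N5O (kcal mol−1) | MOE's score 6LU7 (kcal mol−1) | GIᵃ absorption | Pgpᵇ substrate | Aliᶜ log S | Ali class | Lipinski violations | Brenk violations | Bioavailabilityᵈ |
|---|---|---|---|---|---|---|---|---|---|---|
| foodb_mfsm | DBB9450 / 169676920 | −6.6 | −10.9 | Low | Yes | −8.39 | Poorly soluble | 3 | 4 | 0.17 |
| foodb_mfsm | DBB5554 / 85545908 | −7.9 | −10.9 | Low | Yes | −6.76 | Poorly soluble | 3 | 2 | 0.17 |
| foodb_mfsm | DBB2790 / 4217536 | −7.8 | −9.3 | Low | Yes | −6.4 | Poorly soluble | 3 | 3 | 0.17 |
| dcm_ch | DCM11021434805301 | −7.4 | −9.2 | Low | Yes | −2.7 | Soluble | 1 | 1 | 0.55 |
| dcm_ch | DCM12203415990331 | −7 | −8.9 | High | Yes | −3.55 | Soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM735988918473 | −7.2 | −8.7 | Low | Yes | −4.01 | Moderately soluble | 1 | 1 | 0.55 |
| foodb_mfsm | DBB2455 / 53057130 | −7.6 | −8.6 | Low | Yes | −4.76 | Moderately soluble | 1 | 1 | 0.55 |
| dcm_ch | DCM227938144961 | −6.8 | −8.5 | Low | Yes | −3.84 | Soluble | 1 | 1 | 0.55 |
| dcm_ch | DCM822164270581 | −7.1 | −8.3 | High | Yes | −2.36 | Soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM555338917865 | −6.4 | −8.3 | High | Yes | −1.82 | Very soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM1193539409555 | −7.8 | −8.2 | Low | Yes | −4.18 | Moderately soluble | 0 | 0 | 0.56 |
| dcm_ch | DCM65267100771995 | −6.2 | −8.2 | Low | Yes | −1.75 | Very soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB13825 / 4228235 | −7.4 | −8.1 | Low | No | 0.85 | Highly soluble | 2 | 4 | 0.17 |
| dcm_ch | DCM1317799159501 | −6.4 | −8 | Low | Yes | −3.37 | Soluble | 1 | 1 | 0.55 |
| dcm_ch | DCM65270100778159 | −7.2 | −7.9 | High | Yes | −0.63 | Very soluble | 0 | 0 | 0.55 |
| dcm_ch | DCM828319109751 | −7.8 | −7.8 | Low | No | −2.37 | Soluble | 0 | 1 | 0.55 |
| foodb_mfsm | DBB13483 / 5283951 | −6.3 | −7.8 | High | No | −3.7 | Soluble | 0 | 2 | 0.55 |
| foodb_mfsm | DBB13002 / 2005305 | −7.3 | −7.8 | Low | No | −3.27 | Soluble | 2 | 4 | 0.11 |
| foodb_mfsm | DBB14163 / 8577218 | −7.4 | −7.7 | Low | No | −2.11 | Soluble | 2 | 2 | 0.11 |
| dcm_ch | DCM13178315954557 | −6.9 | −7.7 | High | Yes | −2.24 | Soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM93255ᵉ / 32980237 | −7.2 | −7.7 | High | No | −3.22 | Soluble | 0 | 1 | 0.55 |
| foodb_mfsm | DBB13917 / 2036915 | −7.5 | −7.7 | Low | No | −2.74 | Soluble | 2 | 2 | 0.11 |
| dcm_mfsm | DCM116923ᵉ / 2970717 | −6.5 | −7.7 | High | Yes | −2.54 | Soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM104784083870 | −6.6 | −7.6 | Low | No | 0.02 | Highly soluble | 1 | 3 | 0.55 |
| dcm_ch | DCM28770100778693 | −6.7 | −7.6 | High | Yes | −2.52 | Soluble | 0 | 0 | 0.55 |
| dcm_ch | DCM33486ᵉ / 1181094 | −6.6 | −7.5 | High | Yes | −3.62 | Soluble | 0 | 1 | 0.55 |
| dcm_ch | DCM30682ᵉ / 1577795 | −6.4 | −7.5 | High | No | −3.76 | Soluble | 0 | 2 | 0.55 |
| dcm_ch | DCM110206ᵉ / 12652624 | −7.2 | −7.5 | High | Yes | −4.15 | Moderately soluble | 0 | 2 | 0.55 |
| dcm_mfsm | DCM91011ᵉ / 6754750 | −7.7 | −7.4 | High | No | −1.49 | Very soluble | 0 | 1 | 0.55 |
| foodb_mfsm | DBB13919 / 4228265 | −7.7 | −7.4 | Low | No | −1.97 | Very soluble | 2 | 2 | 0.17 |
| foodb_mfsm | DBB17132ᵉ / 20431033 | −6.2 | −7.1 | High | No | −1.7 | Very soluble | 0 | 2 | 0.55 |
| dcm_ch | DCM131782ᵉ / 2126038 | −7.1 | −7.1 | High | No | −0.08 | Very soluble | 0 | 1 | 0.55 |
| dcm_mfsm | DCM7172418056800 | −6.2 | −7.1 | Low | Yes | −4.37 | Moderately soluble | 0 | 2 | 0.55 |
| dcm_mfsm | DCM94188ᵉ / 18143600 | −7.1 | −6.9 | High | No | −2.63 | Soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB20185 / 2242693 | −6.1 | −6.6 | Low | No | 0.98 | Highly soluble | 0 | 2 | 0.55 |
| foodb_mfsm | DBB17114 / 4090721 | −7 | −6.5 | High | No | −1.38 | Very soluble | 0 | 2 | 0.55 |
| foodb_mfsm | DBB18961 / 4321512 | −6.8 | −6.5 | Low | No | −0.96 | Very soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB18947 / 1303441 | −6.1 | −6.1 | High | No | −0.42 | Very soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB19736 / 2040854 | −5.4 | −6.1 | High | No | 2.05 | Highly soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB19719 / 1532770 | −5.6 | −5.9 | High | No | 1.67 | Highly soluble | 0 | 0 | 0.55 |
| foodb_mfsm | DBB21857ᵉ,ᶠ / 895813 | −5.8 | −5.6 | High | No | −1.75 | Very soluble | 0 | 0 | 0.56 |

ᵃ GI gastrointestinal.

ᵇ Pgp P-glycoprotein.

ᶜ Ali topological method implemented from Ali J. et al. 2012.[63]

ᵈ Probability that the compound will have F > 10%.

ᵉ Compounds that do not violate any of the following rules: Lipinski, Ghose, Veber, Egan, and Muegge.

ᶠ Compounds predicted to be active by the ML model.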

Group 2 comprises ten commercially available compounds that are predicted to be active by ML but fail one of the other two criteria: they either meet our safety criteria but do not form hydrogen bonds with the catalytic residues, or form hydrogen bonds with the catalytic residues but do not meet our safety criteria. Group 3 consists of 34 molecules that are not commercially available but meet the safety criteria and form hydrogen bonds with at least one of the catalytic residues. These compounds would be suited for synthesis and testing. Group 4 contains 20 molecules that are not commercially available and are predicted to be active by ML. However, they either do not meet the safety criteria or do not form hydrogen bonds with the catalytic residues. According to our classification, compounds in this group would have the lowest priority for acquisition (that is, synthesis, since they are not commercially available) and testing. Compounds that do not fall into any of these four groups were considered non-priority for acquisition. Table 4 summarizes the in silico profile of representative hit compounds selected for experimental validation. Table 5 summarizes the information on the 18 compounds from FooDB listed in group 1, with their corresponding IDs and annotated sources. Interestingly, some of the selected hits that were structurally similar to potential Mpro inhibitors were from endogenous sources. For instance, angiotensin II (DBB9450) and angiotensin IV (DBB5554) (a degradation product) were predicted as binders of the active site of SARS-CoV-2 Mpro.
Key interactions predicted were hydrogen-bonds with His41, Ser46, Cys145, Gln189 (DBB9450) and Thr26, Met49, Cys145, and Glu166 (DBB5554). Angiotensin II (ANG-II) is an octapeptide hormone product of angiotensin I's cleavage by the angiotensin-converting enzyme (ACE). ANG-II binds to AT1 and AT2 receptors; the activation of AT1 receptors by ANG-II induces vasoconstriction, vasopressin and aldosterone release, thirst, renal sodium reabsorption, angiogenesis, vascular aging, and inflammation. ANG-II can be converted to angiotensin 1–7 by the angiotensin-converting enzyme II (ACE2). The action of aminopeptidase A and aminopeptidase N produces angiotensin III and angiotensin IV, respectively.

Table 5

Representative food chemicals as hits in the virtual screening

| IDs | FooDB annotation |
|---|---|
| DBB9450/FDB022383 | Angiotensin II, endogenous |
| DBB5554/FDB022385 | Angiotensin IV |
| DBB2790/FDB023765 | Tetragastrin, endogenous |
| DBB2455/FDB023767 | Morphiceptin, endogenous |
| DBB13825/FDB031192 | Tetrahydrofolate |
| DBB13483/FDB013079 | Neotame, artificial sweetener |
| DBB13002/FDB022600 | 5-Methyltetrahydrofolic acid (5-MTHF) |
| DBB14163/FDB014504 | Folic acid |
| DBB13917/FDB022702 | Aminopterin |
| DBB13919/FDB022395 | Dihydrofolic acid |
| DBB17132/FDB028374 | Phenylbutyrylglutamine, metabolite of phenylbutyrate |
| DBB20185/FDB003618 | Gamma-L-glutamyl-L-phenylalanine, soft-necked garlic |
| DBB17114/FDB029352 | Indole acetyl glutamine, endogenous |
| DBB18961/FDB023789 | N4-Acetylcytidine, endogenous |
| DBB18947/FDB022917 | 5-Methyldeoxycytidine (5-mdc) |
| DBB19736/FDB012937 | Carnosine 44A |
| DBB19719/FDB022217 | Homocarnosine, metabolite |
| DBB21857/FDB022212 | Hydroxyphenylacetylglycine, endogenous human metabolite |

Angiotensin 1–7 has opposite actions to ANG-II. Because ACE2 mediates the entry of SARS-CoV-2 into host cells and ACE2 activity may be downregulated after virus infection, the accumulation of ANG-II could be linked to the development of severe symptoms of COVID-19. If Mpro inhibitors are structurally similar to ANG-II, their potential binding affinity for the active site of ACE2 should be evaluated. Some studies have assessed the ability of ACE2 inhibitors to prevent SARS-CoV from entering cells.[64] However, inhibition of ACE2 function could cause overaccumulation of ANG-II and promote its undesired effects. Nonetheless, DCM compounds are unlikely to elicit dual inhibition of SARS-CoV-2 Mpro and ACE2, considering that these molecules had shown no activity against common targets evaluated in HTS assays. Food folates such as 5-MTHF, folic acid, dihydrofolic acid, and tetrahydrofolate (Table 5) were also among the compounds in the top priority group, with observed hydrogen bonds to the catalytic residues of SARS-CoV-2 Mpro and favorable docking scores (below −7.4 kcal mol−1). Folates are cofactors in many one-carbon transfer reactions, including nucleotide synthesis for DNA and RNA synthesis, interconversion of serine and glycine, methionine generation, and methylation of histones, DNA, proteins, phospholipids, and neurotransmitters. Folate deficiency has been linked to neural tube defects, brain dysfunction, coronary heart disease, and increased risk of colorectal and breast cancer.[65] Since mammalian cells cannot synthesize folate de novo, naturally occurring food folates and synthetic folic acid are used in dietary supplements and fortified food.
Nevertheless, recent studies showed that a high intake of folic acid might be associated with a risk of developing leukemia and other conditions such as cancer, arthritis, and insulin resistance, and with masking of vitamin B12 deficiency.[66] Thus, the implications of low and high plasma levels of folates in COVID-19 patients must be evaluated. Our results suggest that folates could inhibit SARS-CoV-2 Mpro, but their activity in in vitro and in vivo assays remains to be confirmed. Broadening our knowledge of the impact of a healthy diet, and of the specific mechanisms through which food chemicals participate in the progression of COVID-19, could offer a simple approach to preventing and combating the current pandemic. Intriguingly, aminopterin (DBB13917), a folic acid analog that inhibits the enzyme dihydrofolate reductase, was also a potential Mpro inhibitor. Aminopterin is one of the so-called antifolates, which interfere with folate metabolism and, in turn, nucleotide synthesis. Currently, methotrexate, an aminopterin analog with less toxic effects, is in clinical trials for the treatment of COVID-19 (NCT04352465). Methotrexate is an immunosuppressant used in the treatment of cancer and inflammatory conditions; it is often administered concurrently with folic acid.
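The prioritization logic described for the folates (a docking score below −7.4 kcal mol−1 together with hydrogen bonds to the catalytic residues) can be illustrated with a minimal sketch. It is not the authors' workflow: the `DockedPose` class, the compound identifiers, and the scores are hypothetical, the catalytic residues are assumed to be the His41/Cys145 dyad, and requiring at least one such hydrogen bond (rather than both) is an assumption.

```python
from dataclasses import dataclass, field

# Assumption: His41 and Cys145 are taken as the catalytic dyad of SARS-CoV-2 Mpro.
CATALYTIC_RESIDUES = {"His41", "Cys145"}
SCORE_CUTOFF = -7.4  # kcal/mol, the threshold quoted in the text


@dataclass
class DockedPose:
    """Hypothetical container for one docking result."""
    compound_id: str
    score: float  # docking score in kcal/mol (more negative is better)
    hbond_residues: set = field(default_factory=set)


def is_priority_hit(pose, cutoff=SCORE_CUTOFF):
    """True when the score is below the cutoff and at least one
    hydrogen bond involves a catalytic residue (assumed criterion)."""
    return pose.score < cutoff and bool(pose.hbond_residues & CATALYTIC_RESIDUES)


# Made-up poses for illustration only; these are not results from the study.
poses = [
    DockedPose("cmpd_A", -8.1, {"Cys145", "Glu166"}),
    DockedPose("cmpd_B", -7.9, {"Glu166"}),   # good score, no catalytic contact
    DockedPose("cmpd_C", -6.5, {"His41"}),    # catalytic contact, weak score
]
priority = [p.compound_id for p in poses if is_priority_hit(p)]
print(priority)
```

In this toy set only `cmpd_A` satisfies both conditions; the other two each fail one of them.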

Top-ranked hits from deep docking of ZINC

The ten top-ranked compounds from the analysis conducted by Ton et al. were included in this study (vide supra).[9] Even though the ML model did not predict activity against the main protease for these molecules, they represent new hits selected from billions of compounds in the ZINC database. They had good docking scores in our analyses, and three of them (ZINC1218583693, ZINC1186058814, and ZINC1655436520) met our safety criteria and interacted with the catalytic residues of SARS-CoV-2 Mpro. Furthermore, ZINC1655436520 also formed hydrogen bonds with residues Phe140, Leu141, Gly143, Ser144, Cys145, and Glu166 of SARS-CoV Mpro; it is predicted to have good water solubility and high GI absorption, and it does not violate the Lipinski, Ghose, Veber, Egan, or Muegge rules.
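As an illustration of what a drug-likeness check of this kind involves, the sketch below evaluates two of the rule sets named above, Lipinski and Veber, on precomputed descriptors. The descriptor values are hypothetical and do not belong to any compound in this study; the Ghose, Egan, and Muegge filters are omitted for brevity, and in practice the descriptors would come from a cheminformatics toolkit or web server.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count violations of Lipinski's rule of five:
    MW <= 500, logP <= 5, H-bond donors <= 5, H-bond acceptors <= 10."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def passes_veber(rotatable_bonds, tpsa):
    """Veber criteria: at most 10 rotatable bonds and
    topological polar surface area of at most 140 square angstroms."""
    return rotatable_bonds <= 10 and tpsa <= 140


# Hypothetical descriptor values, for illustration only.
candidate = {
    "mw": 412.5, "logp": 2.8, "hbd": 3, "hba": 7,
    "rotatable_bonds": 6, "tpsa": 98.4,
}

n_viol = lipinski_violations(
    candidate["mw"], candidate["logp"], candidate["hbd"], candidate["hba"]
)
veber_ok = passes_veber(candidate["rotatable_bonds"], candidate["tpsa"])
print(n_viol, veber_ok)
```

A compound reported as not violating these rules would correspond to zero Lipinski violations and a passing Veber check.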

Conclusions

Herein we report a consensus structure- and ligand-based virtual screening of two large chemical databases, namely 22 880 food chemicals and 139 329 compounds classified as dark chemical matter, to identify potential drug candidates for the treatment of COVID-19 targeting the SARS-CoV-2 Mpro. This work is part of our continued effort to systematically identify bioactive food chemicals.[67] We also screened top-ranked hits identified in a previous VS of 1.6 billion molecules from ZINC using Glide.[9] The similarity searching followed two approaches. The first approach yielded 40 drug-like food chemicals and 500 DCM molecules with high similarity to nelfinavir, lopinavir, and ritonavir. The data fusion approach returned 178 food chemicals and 174 DCM compounds. In total, 888 hit compounds were subjected to molecular docking with two docking programs. The hit compounds were selected considering docking score, predicted interactions with key residues, and ADMETox profiling. An additional criterion used as a guide was a prediction by ML models developed by collaborators in North Carolina, USA.[68] After applying the selection criteria, 105 hits in total were identified, of which several are commercially available at reasonable prices and ready for experimental testing. The full list of hit compounds annotated with the in silico profile is available in the ESI.† We disclose that a preliminary version of this work is available as a preprint.[69] This work contributes to a global effort to screen compound databases from different sources with the aim of identifying candidate drugs for the treatment of COVID-19. To the best of our knowledge, this is one of the first reports to systematically screen a large food chemical database and one of the first to explore the molecules in DCM for COVID-19.
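The similarity searching with data fusion summarized above can be sketched as follows. This is a toy illustration under stated assumptions: fingerprints are represented as sets of on-bit indices, similarity is the Tanimoto coefficient, and the fusion rule is the maximum similarity over all queries. The fusion rule and fingerprints actually used in the study are not specified in this section, and the query and candidate names below are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_fusion(candidate_fp, query_fps):
    """Fuse similarities to several queries by keeping the maximum (assumed rule)."""
    return max(tanimoto(candidate_fp, q) for q in query_fps.values())


# Toy fingerprints; real ones would be computed from chemical structures.
queries = {
    "query_1": {1, 4, 7, 9},
    "query_2": {2, 4, 8},
    "query_3": {1, 2, 3, 4},
}
library = {
    "cand_A": {1, 4, 7},
    "cand_B": {5, 6},
    "cand_C": {2, 4},
}

# Rank library compounds by their fused similarity, highest first.
ranked = sorted(
    library,
    key=lambda cid: max_fusion(library[cid], queries),
    reverse=True,
)
print(ranked)
```

Max fusion favors compounds that closely resemble any single query; other rules, such as the mean similarity, would instead favor compounds that moderately resemble all queries.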

Conflicts of interest

The authors declare no conflict of interest.