Alejandro Speck-Planche, Valeria V Kleandrova.
Abstract
Respiratory viruses are infectious agents that can cause pandemics. The danger they pose is currently evidenced by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for the COVID-19 pandemic, but other viruses such as SARS-CoV-1, the influenza A and B viruses (IAV and IBV, respectively), and the respiratory syncytial virus (RSV) can also lead to globally spread viral diseases. From a biological point of view, most of these viruses can cause an organ-damaging hyperinflammatory response known as the cytokine storm (CS). Computational approaches constitute an essential component of modern drug development campaigns and therefore have the potential to accelerate the discovery of chemicals able to simultaneously inhibit multiple molecular and nonmolecular targets. We report here the first multicondition model based on quantitative structure-activity relationships and an artificial neural network (mtc-QSAR-ANN) for the virtual design and prediction of molecules with dual pan-antiviral and anti-CS profiles. Our mtc-QSAR-ANN model exhibited an accuracy higher than 80%. By interpreting the different descriptors present in the mtc-QSAR-ANN model, we could retrieve several molecular fragments whose assembly led to new molecules with drug-like properties and predicted pan-antiviral and anti-CS activities.
Year: 2022 PMID: 36120024 PMCID: PMC9476185 DOI: 10.1021/acsomega.2c03363
Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343
Different Experimental Conditions under Which the Molecules of the Present Dataset Were Assayed
| ma | cutoff value (nM) | bt | ai |
|---|---|---|---|
| EC50 (nM) | ≤3800 | IAV (A-Puerto Rico-8-1934 (H1N1)) | F (organism-based format) |
| EC50 (nM) | | IAV (A-Puerto Rico-8-1934 (H1N1)) | F (cell-based format) |
| EC50 (nM) | | IAV (A-Puerto Rico-8-1934 (H1N1)) | F (assay format) |
| EC50 (nM) | ≤10000 | IBV | F (organism-based format) |
| EC50 (nM) | | IBV | F (cell-based format) |
| EC50 (nM) | ≤300 | RSV | F (organism-based format) |
| EC50 (nM) | | RSV | F (cell-based format) |
| EC50 (nM) | ≤7500 | SARS-CoV-1 | F (organism-based format) |
| EC50 (nM) | ≤1200 | SARS-CoV-2 | F (organism-based format) |
| IC50 (nM) | ≤8600 | IAV (A-Puerto Rico-8-1934 (H1N1)) | F (organism-based format) |
| IC50 (nM) | | IAV (A-Puerto Rico-8-1934 (H1N1)) | F (cell-based format) |
| IC50 (nM) | ≤10000 | IBV | F (organism-based format) |
| IC50 (nM) | ≤3600 | RSV | F (organism-based format) |
| IC50 (nM) | ≤1080 | SARS-CoV-1 | F (organism-based format) |
| IC50 (nM) | ≤6070 | SARS-CoV-2 | F (organism-based format) |
| IC50 (nM)p | ≤1100 | Caspase-1 | B (assay format) |
| IC50 (nM)p | | Caspase-1 | B (single protein format) |
| IC50 (nM)p | | Caspase-1 | B (cell-based format) |
| IC50 (nM)p | ≤1635 | TNF-alpha | B (single protein format) |
| IC50 (nM)p | | TNF-alpha | F (assay format) |
| IC50 (nM)p | | TNF-alpha | B (assay format) |
| IC50 (nM)p | | TNF-alpha | B (cell-based format) |
| IC50 (nM)p | | TNF-alpha | F (cell-based format) |
Codes for the different experimental conditions cj, which are combinations of the elements ma (measures of activity), bt (biological targets), and ai (assay information containing diverse experimental protocols).
Measures of inhibitory activity. EC50 (nM) is the effective concentration leading to a 50% reduction in the cytopathicity caused by a virus or a 50% inhibition of viral replication, IC50 (nM) is the concentration required for 50% inhibition of the virus, and IC50 (nM)p is the concentration required for 50% inhibition of a protein associated with the cytokine storm.
Value of activity from which a chemical was annotated as active [BAi(cj) = 1].
Targets (respiratory viruses or proteins associated with the cytokine storm).
Assay information related to the different test protocols. Each annotation combines the columns “assay type” (first letter) and “BioAssay Ontology” (phrase between parentheses), which were extracted from the ChEMBL file containing inhibitory activity data.
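The annotation rule described in these notes, where a chemical is labeled active [BAi(cj) = 1] when its measured value does not exceed the cutoff for its experimental condition, can be sketched as follows. This is only an illustration: the dictionary layout and the helper name `annotate_activity` are hypothetical, and only three conditions with explicitly stated cutoffs are copied from the table.

```python
# Cutoffs (nM) copied from three table rows that state one explicitly.
# Keys are (measure of activity, biological target, assay information).
CUTOFFS_NM = {
    ("EC50", "RSV", "F (organism-based format)"): 300.0,
    ("EC50", "SARS-CoV-2", "F (organism-based format)"): 1200.0,
    ("IC50", "SARS-CoV-2", "F (organism-based format)"): 6070.0,
}


def annotate_activity(value_nm, condition):
    """Return 1 (active) if value_nm <= cutoff for the condition, else 0.

    Hypothetical helper; raises KeyError for conditions not listed above.
    """
    cutoff = CUTOFFS_NM[condition]
    return 1 if value_nm <= cutoff else 0


sars2_ec50 = ("EC50", "SARS-CoV-2", "F (organism-based format)")
labels = [annotate_activity(v, sars2_ec50) for v in (150.0, 1200.0, 5000.0)]
```

Because the cutoff depends on the whole condition, the same concentration can be labeled differently under different conditions: 5000 nM is inactive as an EC50 against SARS-CoV-2 but would be active as an IC50 against the same virus.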
Symbols and Codes of the D[GTI]cj Descriptors Present in the mtc-QSAR-ANN Model
| symbology | code | concept |
|---|---|---|
| | | deviation of the Kier–Hall (valence) connectivity index based only on cluster subgraphs of order 6 |
| | | deviation of the normalized Kier–Hall (valence) connectivity index based only on cluster subgraphs of order 4 |
| | | deviation of the Kier–Hall (valence) connectivity index based only on path subgraphs of order 6 |
| | | deviation of the normalized spectral moment of order 1 based on bonds weighted by the Gasteiger–Marsili charges |
| | | deviation of the normalized spectral moment of order 2 based on bonds weighted by the Gasteiger–Marsili charges |
| | | deviation of the normalized Kier–Hall (valence) connectivity index based only on path-cluster subgraphs of order 4 |
| | | deviation of the normalized edge (bond) connectivity index based only on path subgraphs of order 5 |
| | | deviation of the normalized Balaban index |
| | | deviation of the spectral moment of order 1 based on bonds weighted by the hydrophobicity contributions |
| | | deviation of Kier’s shape index based only on path subgraphs of order 3 |
| | | deviation of the Balaban index |
| | | deviation of the normalized spectral moment of order 6 based on bonds weighted by the polar surface area |
| | | deviation of the normalized spectral moment of order 5 based on bonds weighted by the molar refractivity |
| | | deviation of the normalized Kier–Hall (valence) connectivity index based only on chain subgraphs of order 5 |
| | | deviation of the normalized Kier flexibility index |
The D[GTI]cj descriptors with ending “ma” characterize both the molecular structure and the measures of inhibitory activity. Those D[GTI]cj descriptors with the ending “bt” describe the chemical structure as well as the biological targets (respiratory viruses and proteins associated with the cytokine storm). Finally, the D[GTI]cj descriptors with ending “ai” characterize the chemical structure and information related to different experimental assay protocols.
From now on, for the sake of simplicity, the codes will be used instead of the original symbols to explain either the statistical significance or the physicochemical interpretation of the D[GTI]cj descriptors.
Statistical Performance of the mtc-QSAR-ANN Model
| symbols | training set | test set |
|---|---|---|
| NActive | 1156 | 374 |
| CCCActive | 981 | 297 |
| Sn (%) | 84.86 | 79.41 |
| NInactive | 1435 | 469 |
| CCCInactive | 1225 | 379 |
| Sp (%) | 85.37 | 80.81 |
| MCC | 0.700 | 0.600 |
NActive, number of molecules/cases labeled as active; NInactive, number of molecules/cases annotated as inactive; CCCActive, number of molecules/cases correctly classified as active; CCCInactive, number of molecules/cases correctly classified as inactive; Sn (%), statistical sensitivity (percentage of molecules/cases correctly classified as active); Sp (%), statistical specificity (percentage of molecules/cases correctly classified as inactive); MCC, Matthews’ correlation coefficient.
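The statistics defined above can be recomputed from the four counts in each column. The sketch below uses the standard confusion-matrix definitions of sensitivity, specificity, overall accuracy, and the Matthews correlation coefficient; the function name `classification_metrics` is hypothetical, and the counts are those reported in the table.

```python
from math import sqrt


def classification_metrics(n_active, ccc_active, n_inactive, ccc_inactive):
    """Compute Sn (%), Sp (%), accuracy (%), and MCC from the table counts."""
    tp = ccc_active                  # actives correctly classified
    fn = n_active - ccc_active       # actives misclassified as inactive
    tn = ccc_inactive                # inactives correctly classified
    fp = n_inactive - ccc_inactive   # inactives misclassified as active
    sn = 100.0 * tp / n_active
    sp = 100.0 * tn / n_inactive
    acc = 100.0 * (tp + tn) / (n_active + n_inactive)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


train = classification_metrics(1156, 981, 1435, 1225)
test = classification_metrics(374, 297, 469, 379)
```

With these counts, the overall accuracy comes out slightly above 85% for the training set and slightly above 80% for the test set, which agrees with the abstract's statement that the model's accuracy exceeds 80%.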
Tendencies of Variation of the D[GTI]cj Descriptors in the mtc-QSAR-ANN Model
| descriptors | active (class-based mean) | inactive (class-based mean) | tendency |
|---|---|---|---|
| | 1.4391 × 10⁻² | 5.9700 × 10⁻² | decrease |
| | 2.8440 × 10⁻² | 3.1588 × 10⁻² | decrease |
| | 1.6610 × 10⁻³ | −5.2028 × 10⁻² | increase |
| | −3.1118 × 10⁻² | 1.2448 × 10⁻¹ | decrease |
| | 1.0951 × 10⁻² | −1.2250 × 10⁻¹ | increase |
| | 1.1698 × 10⁻² | 1.0999 × 10⁻¹ | decrease |
| | −2.9689 × 10⁻² | −3.8392 × 10⁻² | increase |
| | 1.3665 × 10⁻² | 1.3667 × 10⁻¹ | decrease |
| | −2.4072 × 10⁻² | 8.7241 × 10⁻² | decrease |
| | 4.9842 × 10⁻³ | −2.3010 × 10⁻² | increase |
| | 1.6154 × 10⁻² | 4.7476 × 10⁻² | decrease |
| | 1.3308 × 10⁻² | −1.1988 × 10⁻³ | increase |
| | 1.4227 × 10⁻³ | 9.9303 × 10⁻² | decrease |
| | 8.0425 × 10⁻³ | 2.4919 × 10⁻² | decrease |
| | 8.4249 × 10⁻³ | 4.8861 × 10⁻² | decrease |
These are the averages calculated for each D[GTI]cj descriptor by considering chemicals (from the training set) belonging to a defined class (active or inactive).
Variation of the value of a D[GTI]cj descriptor that should be expected to increase the inhibitory activity against the respiratory viruses and the proteins associated with the cytokine storm.
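The tendency column appears consistent with a simple comparison of the two class-based means: a descriptor should increase when its mean over the actives is higher than its mean over the inactives, and decrease otherwise. The sketch below checks that reading against the tabulated values. The rule is an inference from the numbers rather than a procedure stated in the source, the rows are listed in table order because the descriptor names are not available here, and the function name `tendency` is hypothetical.

```python
# (active mean, inactive mean) for the 15 descriptors, in table order.
CLASS_MEANS = [
    (1.4391e-2, 5.9700e-2),
    (2.8440e-2, 3.1588e-2),
    (1.6610e-3, -5.2028e-2),
    (-3.1118e-2, 1.2448e-1),
    (1.0951e-2, -1.2250e-1),
    (1.1698e-2, 1.0999e-1),
    (-2.9689e-2, -3.8392e-2),
    (1.3665e-2, 1.3667e-1),
    (-2.4072e-2, 8.7241e-2),
    (4.9842e-3, -2.3010e-2),
    (1.6154e-2, 4.7476e-2),
    (1.3308e-2, -1.1988e-3),
    (1.4227e-3, 9.9303e-2),
    (8.0425e-3, 2.4919e-2),
    (8.4249e-3, 4.8861e-2),
]

# Tendency column as reported in the table, in the same order.
REPORTED = [
    "decrease", "decrease", "increase", "decrease", "increase",
    "decrease", "increase", "decrease", "decrease", "increase",
    "decrease", "increase", "decrease", "decrease", "decrease",
]


def tendency(active_mean, inactive_mean):
    """Assumed rule: move the descriptor toward the mean of the active class."""
    return "increase" if active_mean > inactive_mean else "decrease"


derived = [tendency(a, i) for a, i in CLASS_MEANS]
```

Under this assumed rule, all 15 derived tendencies match the reported column, including the row in which both class means are negative.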
Figure 1. Sensitivity values as measures of the importance of the D[GTI]cj descriptors present in the mtc-QSAR-ANN model.
Figure 2. Different molecular fragments whose presence favorably affects the values of the D[GTI]cj descriptors. The symbols have the following meanings: A = O or S; X = C or N; Y1 = O, −CH2– or −NH–; Y2 and Y3 can be any atom; Y4 = O or −NH–; Z1 and Z2 can be F or any functional group whose electronegative atom (can be only O or N) is the one attached to the aromatic ring.
Figure 3. New molecules designed by assembling several fragments according to the physicochemical and structural interpretation of the D[GTI]cj descriptors.
Summary of the Predictions Performed by the mtc-QSAR-ANN Model for the Designed Molecules
| cj | DP-001 | DP-002 | DP-003 | DP-004 | DP-005 | DP-006 | DP-007 | DP-008 |
|---|---|---|---|---|---|---|---|---|
| | 52.88 | 60.88 | 24.08 | 29.62 | 71.47 | 86.00 | 70.60 | 71.84 |
| | 88.40 | 91.62 | 71.30 | 78.93 | 96.18 | 99.06 | 95.29 | 93.36 |
| | 90.56 | 91.98 | 73.59 | 78.25 | 94.58 | 95.56 | 89.16 | 89.46 |
| | 75.59 | 78.16 | 98.34 | 98.56 | 88.62 | 86.73 | 96.56 | 96.56 |
| | 95.54 | 96.10 | 98.93 | 99.09 | 98.04 | 98.28 | 99.23 | 99.20 |
| | 65.55 | 71.68 | 61.22 | 68.75 | 63.53 | 82.51 | 79.92 | 74.79 |
| | 87.66 | 84.65 | 89.23 | 86.12 | 76.52 | 63.78 | 51.61 | 57.04 |
| | 12.44 | 17.24 | 9.92 | 14.98 | 33.60 | 76.73 | 83.15 | 78.64 |
| | 80.37 | 80.37 | 88.57 | 90.72 | 90.01 | 95.24 | 98.51 | 98.41 |
| | 70.77 | 77.26 | 43.63 | 50.75 | 84.86 | 92.18 | 82.03 | 83.10 |
| | 88.97 | 92.17 | 73.60 | 80.72 | 96.22 | 98.95 | 93.86 | 91.29 |
| | 62.41 | 67.01 | 97.46 | 97.89 | 82.87 | 81.14 | 94.97 | 94.70 |
| | 42.66 | 50.53 | 55.69 | 64.14 | 46.43 | 71.93 | 75.12 | 68.21 |
| | 18.20 | 25.63 | 17.83 | 26.53 | 49.93 | 86.47 | 90.15 | 86.64 |
| | 80.27 | 81.67 | 91.00 | 93.23 | 92.40 | 96.54 | 98.81 | 98.63 |
| | 9.93 | 5.63 | 41.45 | 24.99 | 8.38 | 11.35 | 60.80 | 73.98 |
| | 98.31 | 98.08 | 96.43 | 96.55 | 99.04 | 99.03 | 97.20 | 97.20 |
| | 88.70 | 89.80 | 89.77 | 91.90 | 94.64 | 95.43 | 96.21 | 95.50 |
| | 59.44 | 66.91 | 18.14 | 22.38 | 60.12 | 74.77 | 28.50 | 23.29 |
| | 44.64 | 42.04 | 46.34 | 44.05 | 48.14 | 54.20 | 55.20 | 54.41 |
| | 0.43 | 0.31 | 7.54 | 4.32 | 0.07 | 0.10 | 4.16 | 7.26 |
| | 54.11 | 55.17 | 61.31 | 61.79 | 62.25 | 73.36 | 71.89 | 70.13 |
| | 92.53 | 90.32 | 82.20 | 77.51 | 86.73 | 75.53 | 50.58 | 58.71 |
This refers to the different experimental conditions as reported in Table .
The numbers in this table are the predicted values of probability for each molecule to be considered active. In the Supporting Information (Table S10), these probability values appear in a column named Prob.(%)Act.
Physicochemical Properties Suggesting the Druglikeness of the Designed Molecules
| ID | HD | HA | MW | M log P | A log P | AMR | NAT | NRB | TPSA |
|---|---|---|---|---|---|---|---|---|---|
| DP-001 | 3 | 9 | 448.51 | 1.247 | 1.614 | 112.72 | 52 | 6 | 133.17 |
| DP-002 | 3 | 9 | 448.51 | 1.247 | 1.614 | 112.72 | 52 | 6 | 133.17 |
| DP-003 | 4 | 10 | 429.44 | 0.914 | 1.548 | 106.65 | 51 | 6 | 125.99 |
| DP-004 | 4 | 10 | 429.44 | 0.914 | 1.548 | 106.65 | 51 | 6 | 125.99 |
| DP-005 | 4 | 10 | 450.51 | 1.137 | 1.330 | 112.99 | 51 | 5 | 127.77 |
| DP-006 | 3 | 10 | 451.49 | 1.137 | 1.977 | 111.04 | 50 | 6 | 124.97 |
| DP-007 | 4 | 10 | 430.45 | 1.762 | 1.904 | 108.41 | 51 | 6 | 108.56 |
| DP-008 | 4 | 10 | 430.45 | 1.762 | 1.904 | 108.41 | 51 | 6 | 108.56 |
In the table, the abbreviations have the following meanings: HD, the number of atoms behaving as hydrogen bond donors; HA, the number of atoms acting as hydrogen bond acceptors; MW, the molecular weight; M log P, the logarithm of Moriguchi’s octanol/water partition coefficient; A log P, the logarithm of Ghose–Crippen’s octanol/water partition coefficient; AMR, Ghose–Crippen’s molar refractivity; NAT, the number of atoms; NRB, the number of rotatable bonds; and TPSA, the topological polar surface area.
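The properties listed here are the ones used by three widely cited druglikeness filters, so a quick check of the designed molecules against them can be sketched as follows. This section does not say which filters the authors applied; the choice of Lipinski's rule of five (with the M log P ≤ 4.15 variant), the Ghose filter, and Veber's rules is an assumption, as is the function name `druglikeness_report`. Each row has nine values after the ID, and they are read in the order of the abbreviations above.

```python
# (HD, HA, MW, MlogP, AlogP, AMR, NAT, NRB, TPSA), copied from the table.
MOLECULES = {
    "DP-001": (3, 9, 448.51, 1.247, 1.614, 112.72, 52, 6, 133.17),
    "DP-002": (3, 9, 448.51, 1.247, 1.614, 112.72, 52, 6, 133.17),
    "DP-003": (4, 10, 429.44, 0.914, 1.548, 106.65, 51, 6, 125.99),
    "DP-004": (4, 10, 429.44, 0.914, 1.548, 106.65, 51, 6, 125.99),
    "DP-005": (4, 10, 450.51, 1.137, 1.330, 112.99, 51, 5, 127.77),
    "DP-006": (3, 10, 451.49, 1.137, 1.977, 111.04, 50, 6, 124.97),
    "DP-007": (4, 10, 430.45, 1.762, 1.904, 108.41, 51, 6, 108.56),
    "DP-008": (4, 10, 430.45, 1.762, 1.904, 108.41, 51, 6, 108.56),
}


def druglikeness_report(hd, ha, mw, mlogp, alogp, amr, nat, nrb, tpsa):
    """Check one molecule against three assumed druglikeness filters."""
    # Lipinski's rule of five, using the MlogP <= 4.15 form of the logP rule.
    lipinski = hd <= 5 and ha <= 10 and mw <= 500 and mlogp <= 4.15
    # Ghose filter: ranges for MW, AlogP, molar refractivity, atom count.
    ghose = (
        160 <= mw <= 480
        and -0.4 <= alogp <= 5.6
        and 40 <= amr <= 130
        and 20 <= nat <= 70
    )
    # Veber's rules: limits on rotatable bonds and polar surface area.
    veber = nrb <= 10 and tpsa <= 140
    return {"Lipinski": lipinski, "Ghose": ghose, "Veber": veber}


REPORTS = {name: druglikeness_report(*props) for name, props in MOLECULES.items()}
```

Under these assumed thresholds, every designed molecule satisfies all three filters, which is in line with the table's title.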
Figure 4. Development and use of an mtc-QSAR-ANN model. The D[GTI]cj descriptors were calculated by applying the Box-Jenkins approach; such calculations were carried out in Microsoft Excel. Before finding the mtc-QSAR-ANN model, the data set was randomly split into training and test sets, which accounted for 75% and 25% of the data set, respectively. The abbreviation “INTP” signifies the physicochemical and structural interpretations of the D[GTI]cj descriptors.
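The two data-preparation steps named in this caption, the random 75%/25% split and the Box-Jenkins deviation descriptors, can be illustrated with a minimal sketch. The exact formula is not given in this section, so the form used here is an assumption: each deviation is the global topological index (GTI) of a molecule minus the mean GTI of the training-set cases labeled active under the same condition element, with no normalization. The function names, the dictionary keys (`gti`, `active`, `bt`), and the toy values are all hypothetical, and the authors performed their calculations in Microsoft Excel rather than Python.

```python
import random
from collections import defaultdict


def split_dataset(cases, train_fraction=0.75, seed=0):
    """Randomly split cases into training and test lists (75%/25% by default)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def box_jenkins_deviations(cases, train_cases, element):
    """Assumed form: D[GTI]cj = GTI - mean(GTI of training actives with same element).

    Raises KeyError if a case's element has no active case in the training set.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for case in train_cases:
        if case["active"] == 1:
            sums[case[element]] += case["gti"]
            counts[case[element]] += 1
    means = {key: sums[key] / counts[key] for key in sums}
    return [case["gti"] - means[case[element]] for case in cases]


# Toy cases with made-up GTI values, all sharing one biological target (bt).
toy_train = [
    {"gti": 2.0, "active": 1, "bt": "RSV"},
    {"gti": 4.0, "active": 1, "bt": "RSV"},
    {"gti": 10.0, "active": 0, "bt": "RSV"},
]
toy_deviations = box_jenkins_deviations(toy_train, toy_train, "bt")

# Splitting eight placeholder cases gives six for training and two for testing.
train_part, test_part = split_dataset(list(range(8)))
```

Running the same function once per element ("ma", "bt", "ai") would give three deviation descriptors per index, which corresponds to the "ma", "bt", and "ai" endings described for the D[GTI]cj descriptors.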