David T Stanton, Jennifer R Baker, Adam McCluskey, Stefan Paula.
Abstract
The Arylhydrocarbon Receptor (AhR), a member of the Per-ARNT-SIM transcription factor family, has been identified as a potential new target for treating breast cancer. A series of 2-phenylacrylonitriles targeting AhR has been developed that shows promising and selective activity against cancerous cell lines while sparing normal, non-cancerous cells. A quantitative structure-activity relationship (QSAR) modeling approach was pursued to generate a predictive model for cytotoxicity, both to support ongoing synthetic activities and to provide structure-activity information for the design of new structures. Our recent work identified a number of compounds that exhibited false positive cytotoxicity values in the standard MTT assay. This work describes a good-quality model that not only predicts the activity of compounds in the MCF-7 breast cancer cell line but also flagged, as compounds with aberrant biological behavior, structures that subsequently gave false positive values in the MTT assay. The model therefore supports the design of future compounds with in vitro cytotoxic activity against breast cancer and helps avoid synthesizing compounds anticipated to show anomalous cytotoxic behavior, greatly enhancing the design of such compounds.
Keywords: 2-phenylacrylonitriles; Breast cancer; Drug design; MCF-7; MTT assay; Model development; Model interpretation; QSAR
Year: 2021 PMID: 33945106 PMCID: PMC8093599 DOI: 10.1007/s10822-021-00387-5
Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 3.686
Fig. 1 Structures of ANI-7 and NAP-6, the initial ligands from our previous work that demonstrated breast cancer cytotoxicity via their action in the AhR pathway
Fig. 2 A dot plot illustrating the distribution of the observed activity data (log 1/GI50) for the full dataset (N = 80). Each symbol represents a single observation (structure)
Structure identification and observed cell growth inhibition values for the 80 structures used in this study
| Structure label | Observed growth inhibition (GI50, µM) | Observed growth inhibition (log 1/GI50, µM) | IUPAC Name |
|---|---|---|---|
| A1 | 17 | − 1.23E+00 | ( |
| A2 | 15 | − 1.18E+00 | ( |
| A3 | 4 | − 6.02E−01 | ( |
| A5 | 0.56 | 2.52E−01 | ( |
| A7 | 65 | − 1.81E+00 | 2-(4-fluorophenyl)-3-(1 |
| A8 | 26 | − 1.41E+00 | 2-(4-chlorophenyl)-3-(1 |
| A9 | 37 | − 1.57E+00 | 2-(4-aminophenyl)-3-(1 |
| A11 | 49 | − 1.69E+00 | ( |
| A22 | 2.2 | − 3.42E−01 | ( |
| A23 | 1.5 | − 1.76E−01 | ( |
| A25 | 6.5 | − 8.13E−01 | ( |
| A26 | 4.3 | − 6.33E−01 | ( |
| A27 | 16 | − 1.20E+00 | ( |
| A28 | 0.13 | 8.86E−01 | ( |
| A29 | 7.2 | − 8.57E−01 | ( |
| A30 | 23 | − 1.36E+00 | ( |
| A31 | 0.6 | 2.22E−01 | ( |
| A32 | 25 | − 1.40E+00 | ( |
| A33 | 15 | − 1.18E+00 | ( |
| A35 | 28 | − 1.45E+00 | ( |
| B2 | 31 | − 1.49E+00 | ( |
| B3 | 1.3 | − 1.14E−01 | ( |
| B4 | 7 | − 8.45E−01 | ( |
| B14 | 2.8 | − 4.47E−01 | ( |
| B15 | 3.7 | − 5.68E−01 | ( |
| B16 | 2.3 | − 3.62E−01 | ( |
| B17 | 6.9 | − 8.39E−01 | ( |
| B19 | 4 | − 6.02E−01 | ( |
| B20 | 0.23 | 6.38E−01 | ( |
| C1 | 27 | − 1.43E+00 | ( |
| C11 | 23 | − 1.36E+00 | ( |
| C18 | 35 | − 1.54E+00 | ( |
| C21 | 13 | − 1.11E+00 | (2 |
| C23 | 11 | − 1.04E+00 | 3‐(1H‐indol‐3‐yl)‐2‐[(1 |
| C27 | 34 | − 1.53E+00 | ( |
| C28 | 6 | − 7.78E−01 | ( |
| C29 | 27 | − 1.43E+00 | ( |
| C30 | 20 | − 1.30E+00 | ( |
| C31 | 21 | − 1.32E+00 | ( |
| C33 | 9 | − 9.54E−01 | ( |
| C34 | 11 | − 1.04E+00 | ( |
| C35 | 3 | − 4.77E−01 | ( |
| C36 | 20 | − 1.30E+00 | ( |
| C37 | 7 | − 8.45E−01 | ( |
| C39 | 18 | − 1.26E+00 | ( |
| C40 | 36 | − 1.56E+00 | ( |
| C41 | 11 | − 1.04E+00 | ( |
| C44 | 33 | − 1.52E+00 | ( |
| C45 | 16 | − 1.20E+00 | ( |
| C46 | 29 | − 1.46E+00 | ( |
| C48 | 21 | − 1.32E+00 | ( |
| C50 | 45 | − 1.65E+00 | ( |
| C51 | 29 | − 1.46E+00 | ( |
| C52 | 8 | − 9.03E−01 | ( |
| D6 | 2.5 | − 3.98E−01 | ( |
| D7 | 15 | − 1.18E+00 | (2 |
| D8 | 9.8 | − 9.91E−01 | ( |
| D10 | 1.7 | − 2.30E−01 | ( |
| D12 | 1.9 | − 2.79E−01 | ( |
| D14 | 23 | − 1.36E+00 | ( |
| E13 | 26 | − 1.41E+00 | ( |
| E14 | 25 | − 1.40E+00 | ( |
| E15 | 2.9 | − 4.62E−01 | ( |
| E16 | 5.3 | − 7.24E−01 | ( |
| E18 | 0.89 | 5.06E−02 | ( |
| E19 | 1 | 0.00E+00 | ( |
| E20 | 0.33 | 4.81E−01 | ( |
| E21 | 0.48 | 3.19E−01 | ( |
| E26 | 3.5 | − 5.44E−01 | ( |
| E27 | 12 | − 1.08E+00 | ( |
| E28 | 7.4 | − 8.69E−01 | ( |
| E29 | 2.8 | − 4.47E−01 | ( |
| E30 | 4.5 | − 6.53E−01 | ( |
| E32 | 0.32 | 4.95E−01 | ( |
| E35 | 0.03 | 1.52E+00 | ( |
| E36 | 0.17 | 7.70E−01 | ( |
| E37 | 0.28 | 5.53E−01 | ( |
| E38 | 0.034 | 1.47E+00 | ( |
| E39 | 2.1 | − 3.22E−01 | ( |
| F13c | 13 | − 1.11E+00 | (2 |
The structure label indicates the source of the data: the letter identifies the published article, and the number is the one assigned to the structure in that article. The letter assignments are A: Tarleton et al. [19]; B: Tarleton et al. [22]; C: Tarleton et al. [23]; D: Al Otaibi et al. [24]; E: Baker et al. [6]; F: Baker et al. [21]
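The two activity columns in the table are related by log 1/GI50 = −log10(GI50), with GI50 in micromolar. The following is a minimal sketch of that conversion, using a few rows from the table as a check; the function names are illustrative and do not come from the paper.

```python
import math


def log_inverse_gi50(gi50_um: float) -> float:
    """Convert a GI50 value (micromolar) to log10(1/GI50)."""
    if gi50_um <= 0:
        raise ValueError("GI50 must be a positive concentration")
    return -math.log10(gi50_um)


def gi50_from_log_inverse(value: float) -> float:
    """Invert the transform: recover GI50 (micromolar) from log10(1/GI50)."""
    return 10.0 ** (-value)


# A few rows copied from the table: structure label -> GI50 (micromolar)
examples = {"A1": 17, "A5": 0.56, "E19": 1, "E35": 0.03}

for label, gi50 in examples.items():
    activity = log_inverse_gi50(gi50)
    print(f"{label}: GI50 = {gi50} uM, log 1/GI50 = {activity:.2E}")
```

On this scale, larger values mean more potent compounds: sub-micromolar GI50 values map to positive numbers and GI50 values above 1 µM map to negative numbers.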
Details of the QSAR model derived for the set of 75 observations from the 2-phenylacrylonitrile data set
| Descriptor label | Coefficient | Partial-F | Variance inflation factor |
|---|---|---|---|
| n2pag13 | 0.3754 | 81.24 | 5.42 |
| SdssC | 1.606 | 93.97 | 4.52 |
| SaaCH | 0.2088 | 37.20 | 9.15 |
| xch5 | 6.628 | 66.90 | 1.83 |
| netype22 | − 0.2500 | 25.86 | 3.72 |
| SssNH | 0.2865 | 31.09 | 2.55 |
| dxp9 | − 1.341 | 13.57 | 4.08 |
| Intercept | − 4.182 | | |
R² = 0.726, s = 0.344, Q²(LOO) = 0.663, overall F-value = 25.39
Equation form: Obs. log(1/GI50) = 1.606 × SdssC + 0.2088 × SaaCH + 0.2865 × SssNH + 6.628 × xch5 − 1.341 × dxp9 − 0.2500 × netype22 + 0.3754 × n2pag13 − 4.182
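The equation can be applied directly once the seven descriptor values for a structure are known. The sketch below encodes the published coefficients and, as a consistency check, recomputes the overall F-value from R² using the standard regression identity F = (R²/k) / ((1 − R²)/(n − k − 1)) with n = 75 observations and k = 7 descriptors. The function names and the descriptor values in the example are hypothetical and are not taken from the paper.

```python
# Coefficients and intercept copied from the model equation.
COEFFICIENTS = {
    "SdssC": 1.606,
    "SaaCH": 0.2088,
    "SssNH": 0.2865,
    "xch5": 6.628,
    "dxp9": -1.341,
    "netype22": -0.2500,
    "n2pag13": 0.3754,
}
INTERCEPT = -4.182


def predict_log_inverse_gi50(descriptors: dict) -> float:
    """Evaluate the linear model for one structure's descriptor values."""
    missing = set(COEFFICIENTS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return INTERCEPT + sum(
        coef * descriptors[name] for name, coef in COEFFICIENTS.items()
    )


def overall_f(r2: float, n_obs: int, n_desc: int) -> float:
    """Overall F-statistic of a least-squares fit, computed from R^2."""
    return (r2 / n_desc) / ((1.0 - r2) / (n_obs - n_desc - 1))


# HYPOTHETICAL descriptor values, for illustration only.
hypothetical = {
    "SdssC": 1.0, "SaaCH": 10.0, "SssNH": 2.5, "xch5": 0.1,
    "dxp9": 0.2, "netype22": 2, "n2pag13": 3,
}
print(f"predicted log 1/GI50: {predict_log_inverse_gi50(hypothetical):.3f}")
print(f"F from R^2: {overall_f(0.726, 75, 7):.2f}")
```

The recomputed F-value agrees with the reported 25.39 to within the rounding of R² to three decimal places. Note that descriptor labels are matched exactly, including case.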
Labels, a brief description and a diagram of the key molecular structure feature for each of the seven molecular descriptors in the final model
| Descriptor label | Description | Key structural feature |
|---|---|---|
| SdssC | Sum of the atom-level E-State of all carbon atoms in the molecule of type =C< | |
| SaaCH | Sum of the atom-level E-State of all unsubstituted aromatic carbon atoms in the molecule | |
| SssNH | Sum of the atom-level E-State of all nitrogen atoms in the molecule of type –NH– | |
| xch5 | Simple 5th-order chain (five ring bonds) molecular connectivity index (only the ring bonds are considered in this version; no extra-ring bonds are included in the subgraph) | |
| dxp9 | Simple 9th-order path difference molecular connectivity index. Computed by taking the difference between xp9 for the structure in question and the same descriptor for the hypothetical unbranched version of the structure with the same atom count and atom types | |
| netype22 | Count of single edges between two delta-2 vertices | |
| n2pag13 | Count of 2nd-order path subgraphs (two consecutive bonds) between a delta-1 vertex and a delta-3 vertex | |
The descriptor labels within the winMolconn output are case sensitive
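The two count descriptors, netype22 and n2pag13, can be illustrated on a hydrogen-suppressed molecular graph, where the delta value of a vertex is its number of non-hydrogen neighbors. The sketch below is a plain-Python reading of the table descriptions, not the winMolconn implementation. It assumes that "single edge" means a bond of order 1 and that "between" refers to the two end vertices of the two-bond path; the function names and the example graphs are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def neighbor_map(bonds):
    """bonds: list of (atom_i, atom_j, bond_order) on a hydrogen-suppressed graph."""
    neighbors = defaultdict(set)
    for i, j, _order in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    return neighbors


def count_netype22(bonds):
    """Count single bonds whose two end atoms both have delta = 2."""
    nbrs = neighbor_map(bonds)
    delta = {atom: len(adj) for atom, adj in nbrs.items()}
    return sum(
        1 for i, j, order in bonds
        if order == 1 and delta[i] == 2 and delta[j] == 2
    )


def count_n2pag13(bonds):
    """Count two-bond paths a-b-c whose end atoms have delta 1 and delta 3."""
    nbrs = neighbor_map(bonds)
    delta = {atom: len(adj) for atom, adj in nbrs.items()}
    count = 0
    for center, adj in nbrs.items():
        # Each unordered pair of neighbors of `center` defines one 2-bond path.
        for a, c in combinations(sorted(adj), 2):
            if {delta[a], delta[c]} == {1, 3}:
                count += 1
    return count


# Carbon skeletons as (atom, atom, bond order); atoms are numbered arbitrarily.
n_pentane = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)]
three_methylpentane = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (2, 5, 1)]

for name, mol in [("n-pentane", n_pentane), ("3-methylpentane", three_methylpentane)]:
    print(name, "netype22 =", count_netype22(mol), "n2pag13 =", count_n2pag13(mol))
```

Under these assumptions, branching removes delta-2 to delta-2 single bonds and creates delta-1 to delta-3 two-bond paths, so the two counts respond in opposite directions when a chain is branched.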
Fig. 3 Fit plot for the QSAR model showing the correlation of the fitted and observed activity values for the 75 observations in the model training set
Fig. 4 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-1 of the model. Activity value is log 1/GI50, micromolar
Fig. 5 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-2 of the model. Activity value is log 1/GI50, micromolar
Fig. 6 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-3 of the model. Activity value is log 1/GI50, micromolar
Fig. 7 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-4 of the model. Activity value is log 1/GI50, micromolar
Fig. 8 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-5 of the model. Activity value is log 1/GI50, micromolar
Fig. 9 Structures and molecular descriptor values for examples illustrating the key structure features identified in Component-6 of the model. Activity value is log 1/GI50, micromolar
Fig. 10 A graphic illustration of the important structure regions identified in the structure-activity relationship extracted using PLS analysis. The main features identified are the linking group (a), which connects the two terminal groups (b) and (c). The structure of E36 is used to illustrate the features favored in each region
Identity, observed and predicted activity (MTT, log 1/GI50, micromolar) for the five outliers identified during model development
| Structure label | Observed activity (MTT, log 1/GI50, µM) | Predicted activity (MTT, log 1/GI50, µM) | Prediction error | Observed activity (SRB, log 1/GI50, µM) |
|---|---|---|---|---|
| A28 | 0.886 | − 0.631 | 1.52 | − 1.26 |
| E32 | 0.495 | − 0.242 | 0.737 | − 0.398 |
| E35 | 1.52 | − 0.655 | 2.18 | − 0.380 |
| E38 | 1.47 | − 0.335 | 1.80 | ND |
| F13c | − 1.11 | 0.214 | − 1.33 | ND |
The available results from the SRB assay are also shown
ND: not determined
Fig. 11 Predicted activity results for the five modeling outliers computed using the model. The prediction results are shown in the context of the training set fitted results
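The prediction errors in the outlier table are the observed minus the predicted MTT activity. The sketch below recomputes them from the tabulated values and compares their magnitude with the model standard error (s = 0.344). The cutoff of two standard errors is an assumption made here for illustration; the outlier criterion actually used during model development is not stated in this record. Small differences from the tabulated errors arise because the table values are rounded.

```python
S_MODEL = 0.344  # standard error of the fit, from the model statistics

# label: (observed MTT log 1/GI50, predicted MTT log 1/GI50), from the outlier table
OUTLIERS = {
    "A28": (0.886, -0.631),
    "E32": (0.495, -0.242),
    "E35": (1.52, -0.655),
    "E38": (1.47, -0.335),
    "F13c": (-1.11, 0.214),
}


def prediction_error(observed: float, predicted: float) -> float:
    """Signed prediction error: observed minus predicted activity."""
    return observed - predicted


def exceeds_cutoff(observed: float, predicted: float,
                   s: float = S_MODEL, k: float = 2.0) -> bool:
    """True if |error| > k * s. The value k = 2 is an ASSUMED cutoff."""
    return abs(prediction_error(observed, predicted)) > k * s


for label, (obs, pred) in OUTLIERS.items():
    err = prediction_error(obs, pred)
    flag = "beyond 2s" if exceeds_cutoff(obs, pred) else "within 2s"
    print(f"{label}: error = {err:+.3f} ({err / S_MODEL:+.1f} s), {flag}")
```

A positive error means the MTT assay reported more activity than the model predicted, which is the pattern shown by A28, E32, E35 and E38; F13c deviates in the opposite direction.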