Nikhil Verma, Harpreet Singh, Divya Khanna, Prashant Singh Rana, Sanjay Kumar Bhadada.
Abstract
In humans, oxidative stress is involved in the development of diabetes, cancer, hypertension, Alzheimer's disease, and heart failure. One of the mechanisms in the cellular defence against oxidative stress is the activation of the Nrf2-antioxidant response element (ARE) signalling pathway. The main aim of this study is to compute the activity, efficacy, and potency scores of the ARE signalling pathway and to propose a multi-level prediction scheme for them, since this pathway contributes substantially to reducing oxidative stress in humans. The required knowledge is gathered through the process of knowledge discovery from data, and machine learning techniques are then applied to build the multi-level scheme. The proposed scheme is validated using K-fold cross-validation, and an accuracy of 90% is achieved for predicting the activity score of ARE molecules, which determines their power to reduce oxidative stress.
Year: 2019 PMID: 31538958 PMCID: PMC8687196 DOI: 10.1049/iet-syb.2018.5078
Source DB: PubMed Journal: IET Syst Biol ISSN: 1751-8849 Impact factor: 1.615
Fig. 1. Antioxidant defence against free radical induced damage in a human body
Fig. 2. Neutralisation of free radical by antioxidant
Fig. 3. Nrf2‐antioxidant signalling pathway
Molecular descriptors calculated by PaDEL
| Descriptor type | Descriptor ID | Class |
|---|---|---|
| AcidicGroupCount | nAcid | 2D |
| ALOGP | ALogP, ALogP2, AMR | 2D |
| APol | Apol | 2D |
| aromatic atoms count | naAromAtom | 2D |
| aromatic bonds count | nAromBond | 2D |
| atom count | nAtom, nHeavyAtom, nH, nB, nC, nN, nO, nS, nP, nF, nCl, nBr, nI | 2D |
| BasicGroupCount | nBase | 2D |
| BondCount | nBonds, nBonds2, nBondsS, nBondsS2, nBondsS3, nBondsD, nBondsD2, nBondsT, nBondsQ | 2D |
| BPol | Bpol | 2D |
| carbon types | C1SP1, C2SP1, C1SP2, C2SP2, C3SP2, C1SP3, C2SP3, C3SP3, C4SP3 | 2D |
| HBondAcceptorCount | nHBAcc, nHBAcc2, nHBAcc3, nHBAcc_Lipinski | 2D |
| HBondDonorCount | nHBDon, nHBDon_Lipinski | 2D |
| LargestChain | nAtomLC | 2D |
| LargestPiSystem | nAtomP | 2D |
| LongestAliphaticChain | nAtomLAC | 2D |
| MannholdLogP | MLogP | 2D |
| McGowanVolume | McGowan_Volume | 2D |
| MLFER | MLFER_A, MLFER_BH, MLFER_BO, MLFER_S, MLFER_E, MLFER_L | 2D |
| ring count | nRing, n3Ring, n4Ring, n5Ring, n6Ring, n7Ring, n8Ring, n9Ring, n10Ring, n11Ring, n12Ring, nG12Ring, nFRing, nF4Ring, nF5Ring, nF6Ring, nF7Ring, nF8Ring, nF9Ring, nF10Ring, nF11Ring, nF12Ring, nFG12Ring, nTRing, nT4Ring, nT5Ring, nT6Ring, nT7Ring, nT8Ring, nT9Ring, nT10Ring, nT11Ring, nT12Ring, nTG12Ring | 2D |
| rotatable bonds count | nRotB | 2D |
| rule of five | LipinskiFailures | 2D |
| topological polar surface area | TopoPSA | 2D |
| van der Waals volume | VABC | 2D |
| weight | MW | 2D |
| XLogP | XLogP | 2D |
| charged partial surface area | PPSA‐1, PPSA‐2, PPSA‐3, PNSA‐1, PNSA‐2, PNSA‐3, DPSA‐1, DPSA‐2, DPSA‐3, FPSA‐1, FPSA‐2, FPSA‐3, FNSA‐1, FNSA‐2, FNSA‐3, WPSA‐1, WPSA‐2, WPSA‐3, WNSA‐1, WNSA‐2, WNSA‐3, RPCG, RNCG, RPCS, RNCS, THSA, TPSA, RHSA, RPSA | 3D |
| moment of inertia | MOMI‐X, MOMI‐Y, MOMI‐Z, MOMI‐XY, MOMI‐XZ, MOMI‐YZ, MOMI‐R | 3D |
| PubChem fingerprint | Hierarchic element counts; rings in a canonic extended smallest set of smallest rings ring set; simple atom pairs; simple atom nearest neighbours; detailed atom neighbourhoods; simple SMARTS patterns; complex SMARTS patterns | fingerprint |
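The `LipinskiFailures` descriptor listed above counts rule-of-five violations. As a rough illustration, the sketch below counts violations using the commonly cited thresholds (molecular weight above 500, logP above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The function name and the example values are hypothetical, and PaDEL's own implementation may apply additional or slightly different criteria.

```python
def lipinski_failures(mw, logp, hbd, hba):
    """Count violations of the commonly cited rule-of-five thresholds."""
    checks = [
        mw > 500,   # molecular weight (cf. MW)
        logp > 5,   # octanol-water partition coefficient (cf. XLogP)
        hbd > 5,    # hydrogen-bond donors (cf. nHBDon_Lipinski)
        hba > 10,   # hydrogen-bond acceptors (cf. nHBAcc_Lipinski)
    ]
    return sum(checks)


# Hypothetical descriptor values, for illustration only.
small_molecule = lipinski_failures(mw=180.2, logp=1.2, hbd=1, hba=4)
large_molecule = lipinski_failures(mw=812.0, logp=6.3, hbd=7, hba=12)
print(small_molecule, large_molecule)
```

A molecule that passes every threshold scores zero; each exceeded threshold adds one to the count.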
ML models used for classification of molecules
| ID | Model | Method | Package | Tuning parameter(s) | Ref. |
|---|---|---|---|---|---|
| M1 | ada boost | ada | kernlab, rpart, ada, hmeasure | maxdepth, cp, minsplit, xval, iter | [ |
| M2 | decision tree | rpart | rpart, hmeasure | parms, control | [ |
| M3 | linear model | multinom | car, nnet, hmeasure | maxit | [ |
| M4 | neural network | nnet | nnet, hmeasure | size, MaxNWts, maxit | [ |
| M5 | random forest | randomForest | randomForest, hmeasure | ntree, mtry | [ |
| M6 | SVM | ksvm | e1071, kernlab, hmeasure | rules, pruned, kernel | [ |
Fig. 4. Multi‐level proposed prediction scheme for new molecules
Evaluation results of models used for binary classification by oversampling
| ID | Model | AUC | H‐measure | Gini | Accuracy, % |
|---|---|---|---|---|---|
| M1 | ada boost | 0.849 | 0.4328 | 0.6976 | 62.5 |
| M2 | decision tree | 0.7502 | 0.26328 | 0.5002 | 53.0 |
| M3 | linear model | 0.7634 | 0.3202 | 0.5268 | 51.5 |
| M4 | neural network | 0.7162 | 0.21 | 0.4322 | 48.5 |
| M5 | random forest | 0.8608 | 0.47 | 0.7216 | 72.0 |
| M6 | SVM | 0.8248 | 0.3792 | 0.649 | 55.0 |
Evaluation results of models used for binary classification by undersampling
| ID | Model | AUC | H‐measure | Gini | Accuracy, % |
|---|---|---|---|---|---|
| M1 | ada boost | 0.836 | 0.385 | 0.672 | 75.0 |
| M2 | decision tree | 0.743 | 0.233 | 0.486 | 70.86 |
| M3 | linear model | 0.777 | 0.286 | 0.554 | 71.01 |
| M4 | neural network | 0.739 | 0.222 | 0.478 | 68.1 |
| M5 | random forest | 0.862 | 0.451 | 0.723 | 79.2 |
| M6 | SVM | 0.796 | 0.32 | 0.591 | 72.39 |
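The Gini column in both evaluation tables can be cross-checked against the AUC column through the standard identity Gini = 2 × AUC − 1. The short sketch below copies the (AUC, Gini) pairs from the two tables and reports the largest deviation from that identity. The tolerance of 0.002 is an assumption chosen to absorb rounding in the published figures.

```python
# (AUC, Gini) pairs copied from the oversampling and undersampling tables.
oversampling = {
    "M1": (0.849, 0.6976), "M2": (0.7502, 0.5002), "M3": (0.7634, 0.5268),
    "M4": (0.7162, 0.4322), "M5": (0.8608, 0.7216), "M6": (0.8248, 0.649),
}
undersampling = {
    "M1": (0.836, 0.672), "M2": (0.743, 0.486), "M3": (0.777, 0.554),
    "M4": (0.739, 0.478), "M5": (0.862, 0.723), "M6": (0.796, 0.591),
}


def max_gini_deviation(table):
    """Largest absolute gap between the reported Gini and 2*AUC - 1."""
    return max(abs(gini - (2 * auc - 1)) for auc, gini in table.values())


TOLERANCE = 0.002  # assumed allowance for rounding in the reported values
dev_over = max_gini_deviation(oversampling)
dev_under = max_gini_deviation(undersampling)
print(dev_over <= TOLERANCE, dev_under <= TOLERANCE)
```

Under this tolerance every row in both tables is consistent with the identity, so the Gini column carries no information beyond the AUC column.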
Fig. 5. ROC curves of models used for binary classification by oversampling
Panels: ROC curves for the first, second, third, fourth, and fifth sub‐datasets
Fig. 6. ROC curves of models used for binary classification by undersampling
AUC values for red curves in Fig. 5
| Curve | Fig. | Fig. | Fig. | Fig. | Fig. |
|---|---|---|---|---|---|
| AUC | 0.905 | 0.895 | 0.808 | 0.830 | 0.866 |
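As a small arithmetic check, the five AUC values in this table can be averaged. The sketch below does so. The mean coincides with the AUC of 0.8608 reported for random forest (M5) in the oversampling table, which suggests, though the record does not state it, that the red curves correspond to that model.

```python
# AUC values copied from the table of red curves in Fig. 5.
red_curve_auc = [0.905, 0.895, 0.808, 0.830, 0.866]

# Arithmetic mean over the five values, rounded to four decimals.
mean_auc = round(sum(red_curve_auc) / len(red_curve_auc), 4)
print(mean_auc)
```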
Cross‐validation results
| mtry | RMSE | R² | MAE |
|---|---|---|---|
| 2 | 0.1655441 | 0.9145148 | 0.09410988 |
| 3 | 0.1640467 | 0.9148280 | 0.08907119 |
| 4 | 0.1642636 | 0.9139220 | 0.08748790 |
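One common way to read such a tuning table is to pick the `mtry` value with the lowest cross-validated RMSE. The sketch below applies that rule to the RMSE and MAE values copied from the table. The selection criterion is an assumption, since the record does not state how the final `mtry` was chosen.

```python
# Rows copied from the cross-validation table: mtry -> (RMSE, MAE).
cv_results = {
    2: (0.1655441, 0.09410988),
    3: (0.1640467, 0.08907119),
    4: (0.1642636, 0.08748790),
}

# Assumed selection rule: smallest RMSE wins.
best_by_rmse = min(cv_results, key=lambda m: cv_results[m][0])

# For comparison, the value that minimises MAE instead.
best_by_mae = min(cv_results, key=lambda m: cv_results[m][1])

print(best_by_rmse, best_by_mae)
```

The two criteria disagree here (RMSE favours 3, MAE favours 4), although the differences between candidates are small in either column.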
Regression model evaluation results for predicting activity, efficacy and potency score
| No. | Model for | Correlation | R² | Accuracy, % |
|---|---|---|---|---|
| 1 | activity score | 0.86 | 0.74 | 90 |
| 2 | potency score | 0.6 | 0.36 | 82.5 |
| 3 | efficacy score | 0.68 | 0.46 | 80 |
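In each row of this table, the value that follows the correlation matches the squared correlation rounded to two decimals. The sketch below verifies that relationship on the figures copied from the table. It is only a consistency check and makes no claim about how the authors computed the values.

```python
# (correlation, following value) pairs copied from the regression table.
regression_rows = {
    "activity score": (0.86, 0.74),
    "potency score": (0.6, 0.36),
    "efficacy score": (0.68, 0.46),
}

# Square each correlation and round to two decimals for comparison.
squared = {name: round(r ** 2, 2) for name, (r, _) in regression_rows.items()}
all_match = all(squared[name] == reported
                for name, (_, reported) in regression_rows.items())
print(squared, all_match)
```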