Ashok K Sharma1, Shubham K Jaiswal1, Nikhil Chaudhary1, Vineet K Sharma2.
Abstract
The human gut microbiota comprises a diverse group of microbial species with an enormous metabolic potential, which can alter the metabolism of orally administered drugs and lead to individual- or population-specific differences in drug response. Given the large, heterogeneous pool of human gut bacteria and their metabolic enzymes, investigating the species-specific contribution to xenobiotic/drug metabolism through experimental studies is a challenging task. We have therefore developed a novel computational approach to predict the metabolic enzymes and gut bacterial species that can potentially carry out the biotransformation of a xenobiotic/drug molecule. A substrate database was constructed for metabolic enzymes from 491 available human gut bacteria. The structural properties (fingerprints) of these substrates were extracted and used to develop random forest models, which displayed average accuracies of up to 98.61% and 93.25% on cross-validation and the blind set, respectively. After the EC subclass is predicted, the specific metabolic enzyme (EC) is identified using a molecular similarity search. The performance was further evaluated on an independent set of FDA-approved drugs and other clinically important molecules. To our knowledge, this is the only available approach, implemented as the 'DrugBug' tool, for predicting xenobiotic/drug metabolism by metabolic enzymes of the human gut microbiota.
Year: 2017 PMID: 28852076 PMCID: PMC5575299 DOI: 10.1038/s41598-017-10203-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The distribution of substrate molecules into the six EC classes, shown by Principal Component Analysis. Each substrate molecule is represented by a circle colour-coded by its class.
Figure 2. Optimization of parameters to construct the final RF model for classification into six EC classes.
Performance evaluation of EC class-specific RF model using three different methods.
All results use the hybrid fingerprint. RF models were trained on the dataset without upsampling and with upsampling.
| Validation on test sets | TPR (%), without upsampling | TNR (%), without upsampling | PPV (%), without upsampling | ACC (%), without upsampling | MCC, without upsampling | TPR (%), with upsampling | TNR (%), with upsampling | PPV (%), with upsampling | ACC (%), with upsampling | MCC, with upsampling |
|---|---|---|---|---|---|---|---|---|---|---|
| CV-10 FOLD | 52.83 | 92.63 | 60.32 | 89.33 | 0.49 | 91.58 | 98.32 | 91.46 | 97.19 | 0.89 |
| Splitting and Testing | 60.84 | 92.71 | 49.05 | 88.94 | 0.46 | 87.02 | 97.49 | 87.25 | 95.75 | 0.84 |
| Blind Set | 55.52 | 94.17 | 66.15 | 92.14 | 0.54 | 67.41 | 94 | 63.19 | 91.18 | 0.59 |
TPR = True Positive Rate or Sensitivity, TNR = True Negative Rate or Specificity, PPV = Positive Predictive Value or Precision, ACC = Accuracy, MCC = Matthews correlation coefficient.
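As a reading aid for the metric abbreviations above, the following is a minimal sketch of how the five measures can be computed from the confusion counts of a single class treated one-vs-rest. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper; how the authors aggregated per-class values is not stated in this record.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute TPR, TNR, PPV, ACC (in percent) and MCC from confusion counts.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts for one class treated one-vs-rest (an assumption).
    """
    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    tpr = 100.0 * ratio(tp, tp + fn)                   # sensitivity
    tnr = 100.0 * ratio(tn, tn + fp)                   # specificity
    ppv = 100.0 * ratio(tp, tp + fp)                   # precision
    acc = 100.0 * ratio(tp + tn, tp + fp + tn + fn)    # accuracy
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)            # Matthews correlation
    return {"TPR": tpr, "TNR": tnr, "PPV": ppv, "ACC": acc, "MCC": mcc}


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With imbalanced classes, ACC can stay high while TPR and MCC are low, which is the pattern visible in the columns for the models trained without upsampling.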
Performance evaluation of EC subclass-specific RF models using three different methods.
All results use the hybrid fingerprint. RF models were trained on the dataset without upsampling and with upsampling.
| EC class | Validation on test sets | TPR (%), without upsampling | TNR (%), without upsampling | PPV (%), without upsampling | ACC (%), without upsampling | MCC, without upsampling | TPR (%), with upsampling | TNR (%), with upsampling | PPV (%), with upsampling | ACC (%), with upsampling | MCC, with upsampling |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EC1 | CV-10 FOLD | 62.41 | 97.1 | 65.11 | 95.9 | 0.61 | 97.03 | 99.82 | 97.17 | 99.67 | 0.97 |
| EC1 | Splitting and Testing | 82.14 | 97.02 | 75.71 | 95.26 | 0.75 | 97.09 | 99.82 | 97.02 | 99.66 | 0.97 |
| EC1 | Blind Set | 87.62 | 98.18 | 95.07 | 97.5 | 0.89 | 83.17 | 98.73 | 83.67 | 97.3 | 0.81 |
| EC2 | CV-10 FOLD | 55.65 | 93.39 | 56.96 | 89.27 | 0.50 | 86.67 | 98.52 | 86.26 | 97.33 | 0.85 |
| EC2 | Splitting and Testing | 66.94 | 94.23 | 64.75 | 90.34 | 0.6 | 85.36 | 98.44 | 85.83 | 97.17 | 0.84 |
| EC2 | Blind Set | 81.76 | 94.36 | 81.75 | 91.02 | 0.76 | 83.09 | 96.12 | 80.95 | 92.86 | 0.77 |
| EC3 | CV-10 FOLD | 65.38 | 97.38 | 78.82 | 96.3 | 0.68 | 97.04 | 99.63 | 97 | 99.34 | 0.97 |
| EC3 | Splitting and Testing | 88.77 | 96.56 | 93.91 | 93.93 | 0.87 | 95.4 | 99.42 | 95.3 | 98.96 | 0.95 |
| EC3 | Blind Set | 95 | 97.5 | 94.44 | 96.3 | 0.92 | 95 | 97.5 | 94.44 | 96.3 | 0.92 |
| EC4 | CV-10 FOLD | 59.21 | 88.86 | 69.42 | 86.27 | 0.53 | 91.86 | 98.84 | 91.47 | 97.96 | 0.9 |
| EC4 | Splitting and Testing | 70 | 76.98 | 48.91 | 73.15 | 0.34 | 88.83 | 98.48 | 89.06 | 97.26 | 0.87 |
| EC4 | Blind Set | 78.57 | 82.5 | 80.55 | 83.33 | 0.63 | 80.83 | 91.22 | 75 | 86.54 | 0.67 |
| EC5* | CV-10 FOLD | 76.67 | 83.54 | 85 | 91.16 | 0.7 | 95.62 | 98.91 | 95.93 | 98.25 | 0.95 |
| EC5* | Splitting and Testing | 88.89 | 75 | 93.75 | 86.36 | 0.62 | 97.77 | 99.39 | 97.5 | 99 | 0.97 |
| EC6* | CV-10 FOLD | 95.74 | 93.77 | 92.61 | 95.16 | 0.89 | 97.78 | 99.44 | 97.79 | 99.11 | 0.97 |
| EC6* | Splitting and Testing | 95 | 95 | 90 | 92.86 | 0.85 | 98 | 99.46 | 97.78 | 99.11 | 0.97 |
*For the EC5 and EC6 classes, validation on the blind set could not be performed because too few molecules represent these classes. The average accuracies for cross-validation, splitting and testing, and the blind set were 98.61%, 98.52% and 93.25%, respectively.
TPR = True Positive Rate or Sensitivity, TNR = True Negative Rate or Specificity, PPV = Positive Predictive Value or Precision, ACC = Accuracy, MCC = Matthews correlation coefficient.
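The average accuracies quoted in the footnote appear to be simple means of the ACC (%) values for the models trained with upsampling. That reading is an inference from the table, not something the source states. The sketch below recomputes the means from the tabulated values; the variable names are illustrative.

```python
from statistics import mean

# ACC (%) values for the RF models trained with upsampling, copied from the
# EC subclass table above, in the order EC1..EC6 (blind set: EC1..EC4 only).
acc_with_upsampling = {
    "CV-10 FOLD": [99.67, 97.33, 99.34, 97.96, 98.25, 99.11],
    "Splitting and Testing": [99.66, 97.17, 98.96, 97.26, 99.0, 99.11],
    "Blind Set": [97.3, 92.86, 96.3, 86.54],
}

# Unweighted mean per validation method (assumed aggregation).
averages = {method: mean(values) for method, values in acc_with_upsampling.items()}

for method, value in averages.items():
    print(f"{method}: {value:.3f}")
```

Under this assumption the cross-validation and blind-set means reproduce the reported 98.61% and 93.25%, and the splitting-and-testing mean agrees with the reported 98.52% to within rounding in the second decimal place.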
Figure 3. (a) Complete workflow for the construction of DrugBug. (b) Steps for the analysis of a query molecule through the DrugBug web server. DrugBug consists of three components: the EC class-specific RF module (RF module 1), the EC subclass-specific RF module (RF module 2) and a similarity search module. In the given example, the query molecule is analyzed by these modules to identify the EC number and the corresponding metabolic enzyme, which was found in two bacterial genomes (M1 and M2). In each of the predicted bacteria (M1 and M2), two or more proteins (P1 and P2) similar to the EC enzyme were found.
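The third component in the Figure 3 caption, the similarity search that picks a specific enzyme within the predicted subclass, can be illustrated with a small sketch. It assumes fingerprints are represented as sets of "on" bit positions and uses the Tanimoto coefficient as the similarity measure. Both choices are assumptions made for illustration; this record does not say which measure DrugBug uses. The enzyme labels `EC-A` and `EC-B`, the fingerprints and the function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar_enzyme(query_bits, substrates_by_enzyme):
    """Return (enzyme label, score) of the known substrate closest to the query.

    substrates_by_enzyme maps an enzyme label to the fingerprints of its known
    substrates, restricted to the subclass predicted by the RF modules.
    """
    best_label, best_score = None, -1.0
    for label, fingerprints in substrates_by_enzyme.items():
        for fingerprint in fingerprints:
            score = tanimoto(query_bits, fingerprint)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy data: two enzymes with two known substrates each.
substrates = {
    "EC-A": [frozenset({1, 4, 7, 9}), frozenset({1, 4, 8})],
    "EC-B": [frozenset({2, 3, 5, 9}), frozenset({2, 3, 6})],
}
query = frozenset({2, 3, 5, 7})

label, score = most_similar_enzyme(query, substrates)
print(label, round(score, 2))
```

In a full pipeline the predicted enzyme would then be looked up in the gut bacterial genomes to list the species (M1, M2 in the figure) and proteins (P1, P2) that carry it; that lookup is not sketched here.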
Prediction of gut bacteria and the corresponding metabolic enzyme for biotransformation of some selected FDA-approved drugs and other clinically important molecules.
*Enzyme was known in human host
1: FDA-approved drug; 2: Pharmacologically active plant derivative; 3: Pharmacologically active synthetic molecule; Ref: Reference
Figure 4. Schematic representation of digoxin metabolism. (a) Structure of digoxin, (b) Metabolism of digoxin by gut microbes, (c) Metabolism of digoxin at low gastric pH in the human host, (d) Metabolism of digoxin in the liver, (e) Previous reports on the metabolism of digoxin and (f) Prediction of digoxin metabolism by the DrugBug approach.