Tina Begum, Tapash Chandra Ghosh, Surajit Basak.
Abstract
Identifying the factors that underlie adverse drug reactions in target proteins is important for developing therapeutic drugs with minimal or no side effects. In this context, we performed a comparative evolutionary rate analysis between genes exhibiting drug side effect(s) (SET) and genes showing no side effect (NSET), with the aim of increasing the prediction accuracy of SET/NSET proteins using evolutionary rate determinants. We found that SET proteins are more conserved than NSET proteins. The difference in evolutionary rates between SET and NSET proteins depends primarily on their noncomplex-forming nature (protein complex association number = 0), phylogenetic age, multifunctionality, membrane localization, and transmembrane helix content, irrespective of their essentiality, total druggability (total number of drugs/target), mRNA expression level, and tissue expression breadth. We also introduce two novel terms, killer druggability (number of drugs with killing side effect(s)/target) and essential druggability (number of drugs targeting essential proteins/target), to explain the evolutionary rate variation between SET and NSET proteins. Interestingly, we noticed that SET proteins are younger than NSET proteins and that multifunctional younger SET proteins are candidates for acquiring killing side effects. We provide evidence that higher killer druggability, multifunctionality, and transmembrane helices support the conservation of SET proteins over NSET proteins in spite of their recent origin. By employing all these entities, our Support Vector Machine model predicts human SET/NSET proteins to a high degree of accuracy (∼86%).
Keywords: essential druggability; killer druggability; nonside effect associated drug target (NSET); protein evolutionary rates; side effect associated drug target (SET); support vector machine (SVM)
Year: 2017 PMID: 28391292 PMCID: PMC5499873 DOI: 10.1093/gbe/evw301
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
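The abstract defines total druggability as the number of drugs per target and killer druggability as the number of drugs with killing side effect(s) per target. A minimal sketch of how these two counts could be tallied is given below. The drug names, target names, and the `killer_drugs` set are made-up illustrations, not data from the paper, and the function is not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical drug-target pairs (illustrative only, not from the paper).
drug_target_pairs = [
    ("drugA", "P1"),
    ("drugB", "P1"),
    ("drugC", "P2"),
]

# Hypothetical set of drugs annotated with a killing side effect.
killer_drugs = {"drugA"}


def druggability(pairs, killers):
    """Return {target: (total_druggability, killer_druggability)}.

    total_druggability  = number of distinct drugs acting on the target
    killer_druggability = number of those drugs that are in `killers`
    """
    drugs_by_target = defaultdict(set)
    for drug, target in pairs:
        drugs_by_target[target].add(drug)
    return {
        target: (len(drugs), len(drugs & killers))
        for target, drugs in drugs_by_target.items()
    }


result = druggability(drug_target_pairs, killer_drugs)
print(result)
```

Using sets means a drug listed twice against the same target is counted once. Whether the paper deduplicates in the same way is an assumption.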
Figure. The relationships between protein evolutionary rates (dN) and target drug numbers of human SET vs. NSET proteins in different bins. P_MWT < 0.05 (Mann–Whitney U test) denotes a significant difference between groups.
Figure. The impact of complex-forming ability on dN/dS of SET and NSET proteins. The bar graphs show the difference in the distribution of dN/dS between SET and NSET proteins in complex-forming and noncomplex-forming groups, considering (a) all protein complexes and (b) large protein complexes (size ≥ 5) of the CORUM database. P_MWT < 0.05 (Mann–Whitney U test) between groups was taken to indicate a statistically significant difference. Error bars show the 95% confidence interval.
Age Distribution of SET/NSET Proteins within Complex Forming and Noncomplex Forming Groups
| Resource | Class | Complex-Forming Proteins | P value | Noncomplex-Forming Proteins | P value |
|---|---|---|---|---|---|
| ProteinHistorian server | SET | ≈885 Ma | 1.82 × 10⁻¹ | ≈965 Ma | 1.10 × 10⁻²** |
| | NSET | ≈1183 Ma | | ≈1169 Ma | |
| Supplementary Material | Old | SET: 14.69% | 5.50 × 10⁻¹ | SET: 57.47% | 1.10 × 10⁻³** |
| | | NSET: 15.92% | | NSET: 63.31% | |
| | New | SET: 4.64% | 6.38 × 10⁻¹ | SET: 21.13% | 3.49 × 10⁻²** |
| | | NSET: 4.10% | | NSET: 13.71% | |
Note.—Significant differences between pairs (P < 0.05) are marked with **. The Mann–Whitney U test was used to assess the difference in the numerical protein age data taken from ProteinHistorian.
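The group comparisons in these tables rely on the Mann–Whitney U test. The sketch below shows one way to compute the U statistic and a two-sided P value using the large-sample normal approximation. It is an illustration only: it applies no tie or continuity correction, and the source does not say which software or variant the authors used. The sample values are invented.

```python
import math


def rank_with_ties(values):
    """Assign 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided P) using the normal approximation.

    No tie or continuity correction is applied, so the P value is
    approximate and best suited to reasonably large samples.
    """
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u, p_two_sided


# Invented dN-like values for two groups (not data from the paper).
group_a = [0.03, 0.05, 0.06, 0.04, 0.02]
group_b = [0.08, 0.09, 0.07, 0.10, 0.11]
u_stat, p_value = mann_whitney_u(group_a, group_b)
print(u_stat, p_value)
```

For small samples or heavily tied data, an exact or tie-corrected implementation from a statistics package would be the safer choice.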
Killer Druggability Per Target Proteins Using Age Data from Two Different Resources
| Gene Class | Killer Druggability/Target (Considering age data of Domazet-Lošo and Tautz) | Killer Druggability/Target (Considering Protein Historian server) |
|---|---|---|
| New | 0.715 | 0.421 |
| Old | 0.210 | 0.248 |
| P value | 1.30 × 10⁻⁴** | 3.60 × 10⁻²** |
Note.—Significant differences between groups (P < 0.05) are marked with **. The Mann–Whitney U test was used to assess the differences between groups. For age data from ProteinHistorian, we considered proteins with age ≤ 500 Ma as “new” and proteins with age > 500 Ma as “old.”
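The note above states the rule used to turn numerical ProteinHistorian ages into classes: age ≤ 500 Ma is "new" and age > 500 Ma is "old". A minimal sketch of that binning, followed by a per-class average of killer druggability, is shown below. The target records are invented, and treating the table values as per-class means is an assumption, since the source does not state the summary statistic.

```python
from statistics import mean

AGE_CUTOFF_MA = 500  # threshold stated in the table note


def age_class(age_ma):
    """Classify a protein as 'new' (age <= 500 Ma) or 'old' (age > 500 Ma)."""
    return "new" if age_ma <= AGE_CUTOFF_MA else "old"


# Hypothetical records: (target name, age in Ma, killer druggability).
targets = [
    ("T1", 120.0, 2),
    ("T2", 500.0, 0),
    ("T3", 910.0, 1),
    ("T4", 1500.0, 0),
]


def mean_killer_druggability(records):
    """Average killer druggability per age class (assumed summary statistic)."""
    groups = {"new": [], "old": []}
    for _name, age_ma, killer_druggability in records:
        groups[age_class(age_ma)].append(killer_druggability)
    return {cls: mean(values) for cls, values in groups.items() if values}


summary = mean_killer_druggability(targets)
print(summary)
```

Note that a protein aged exactly 500 Ma falls in the "new" class under the stated rule.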
Comparison of Features between Drug Targets with Killing Side Effect(s) and with Nonkilling/No Side Effect(s)
| Serial no. | Groups | Evolutionary Rates (dN) | Pleiotropic Index | Tissue Expression Breadth |
|---|---|---|---|---|
| 1 | Drug targeted proteins with killer druggability > 0 | 0.058 | 24.195 | 7 |
| 2 | Drug targeted proteins with killer druggability = 0 | 0.081 | 15.636 | 12 |
| | P value | 3.48 × 10⁻⁴** | 5.94 × 10⁻¹³** | 2.09 × 10⁻⁸** |
Note.—The Mann–Whitney U test was used to assess differences between groups. Significant differences (P_MWT < 0.05) are marked with **.
Figure. Comparisons of evolutionary rates (dN) of SET vs. NSET proteins within different age groups. (a) The categorical age data are those of Domazet-Lošo and Tautz (2008); (b) the numerical protein age data were obtained from ProteinHistorian (Capra et al. 2012). For numerical data, we considered proteins with age ≤ 500 Ma as “new” and the rest as “old” proteins. The plots show the importance of young/new gene age in the disparity of evolutionary rates between SET and NSET proteins. Error bars represent the 95% confidence interval.
Distribution of SET/NSET Membrane Proteins within Complex Forming and Noncomplex Forming New Age Class Considering Targets with Killer Druggability = 0
| Resource | Complex-Forming Proteins | Z score (P value) | Noncomplex-Forming Proteins | Z score (P value) |
|---|---|---|---|---|
| Age data provided by Domazet-Lošo and Tautz | SET: 1.16% | Z = 0.679, | SET: 9.30% | |
| | NSET: 0.69% | | NSET: 3.54% | |
| From age data of ProteinHistorian server | SET: 1.74% | | SET: 16.28% | |
| | NSET: 1.10% | | NSET: 5.84% | |
Note.—Significant differences (P < 0.05) between pairs, together with their corresponding Z scores, are highlighted in bold.
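The table compares SET and NSET percentages and reports Z scores, but the source does not name the exact test. One common choice for comparing two proportions is the pooled two-proportion z-test, sketched below purely as an assumption about what such a comparison could look like. The counts in the example are invented, because the underlying sample sizes are not given here.

```python
import math


def two_proportion_z(successes_1, total_1, successes_2, total_2):
    """Pooled two-proportion z-test; returns (z, two-sided P).

    Assumes both totals are positive and the pooled proportion is
    strictly between 0 and 1 (otherwise the standard error is zero).
    """
    p1 = successes_1 / total_1
    p2 = successes_2 / total_2
    pooled = (successes_1 + successes_2) / (total_1 + total_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / total_1 + 1 / total_2))
    z = (p1 - p2) / standard_error
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Invented counts: 30 of 100 in one group vs. 10 of 100 in the other.
z_score, p_value = two_proportion_z(30, 100, 10, 100)
print(round(z_score, 3), p_value)
```

With the invented counts, |Z| exceeds 1.96, which corresponds to P < 0.05 for a two-sided test under the normal approximation.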
Efficiency Evaluation of Our Optimized SVM Model
| Test Set | MCC | Sensitivity | Specificity | PPV | NPV | |
|---|---|---|---|---|---|---|
| Our data set using categorical gene age data | 0.4810 | 29.26% | 99.74% | 94.94% | 85.30% | 44.57% |
| Our data set using numerical gene age data | 0.5056 | 31.47% | 99.75% | 97.02% | 85.11% | 47.39% |
Performance measures were averaged over 10 randomized test sets to obtain a single value.
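The labeled columns of the table (MCC, sensitivity, specificity, PPV, NPV) are all standard functions of a binary confusion matrix, and the note says they were averaged over 10 randomized test sets. The sketch below shows those standard formulas and a simple per-metric average. The confusion-matrix counts are invented, and which class (SET or NSET) the authors treated as positive is not stated in the source.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes each denominator (tp+fn, tn+fp, tp+fp, tn+fn) is nonzero.
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


def average_metrics(confusion_matrices):
    """Average each metric over several test sets, given (tp, fp, tn, fn) tuples."""
    per_set = [classification_metrics(*counts) for counts in confusion_matrices]
    return {
        name: sum(metrics[name] for metrics in per_set) / len(per_set)
        for name in per_set[0]
    }


# Invented (tp, fp, tn, fn) counts for two hypothetical test sets.
test_sets = [(30, 10, 50, 10), (50, 0, 50, 0)]
averaged = average_metrics(test_sets)
print(averaged)
```

Averaging metrics per test set, as here, is not the same as pooling all counts into one confusion matrix first. The note is consistent with the per-set approach, but the exact procedure is not described.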