Maria Ryaboshapkina, Mårten Hammar.
Abstract
Tissue-specific genes are believed to be good drug targets due to improved safety. Here we show that this intuitive notion is not reflected in phase 1 and 2 clinical trials, despite the historic success of tissue-specific targets and their 2.3-fold overrepresentation among targets of marketed non-oncology drugs. We compare properties of tissue-specific genes and drug targets. We show that tissue-specificity of the target may also be related to efficacy of the drug. The relationship may be indirect (enrichment in Mendelian disease and PTVesc genes) or direct (elevated betweenness centrality scores for tissue-specifically produced enzymes and secreted proteins). Reduced evolutionary conservation of tissue-specific genes may represent a bottleneck for drug projects, prompting development of novel models with smaller evolutionary gap to humans. We show that the opportunities to identify tissue-specific drug targets are not exhausted and discuss potential use cases for tissue-specific genes in drug research.
Year: 2019 PMID: 31076736 PMCID: PMC6510781 DOI: 10.1038/s41598-019-43829-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Tissue-specific genes were overrepresented among targets of phase 3 drugs and targets of marketed non-oncology drugs. Prevalence of tissue-specific genes among targets of drugs for (a) non-oncology and (b) oncology indications. Percentages of tissue-specific genes among targets of drugs in each phase of clinical development were plotted in comparison to the ‘background’ distribution among all protein-coding genes (black line). Tissue-specificity was defined at nine increasingly stringent constraints x = 2 to 10. (c) Fisher test p-value and fold enrichment for each gene category and each constraint x. Enrichment values >1 indicated over-representation of tissue-specific genes while values <1 indicated under-representation of tissue-specific genes. Nominal p-values < 0.05 were considered statistically significant.
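The enrichment statistics in panel (c) can be illustrated with a minimal sketch. It assumes that fold enrichment is the proportion of tissue-specific genes among drug targets divided by the proportion among all protein-coding genes, and that the Fisher test is two-sided on a 2x2 table of targets versus non-target genes; the source states neither detail. The counts in the example are invented for illustration and are not taken from the paper.

```python
from math import comb


def fold_enrichment(hits_in_set, set_size, hits_in_background, background_size):
    """Ratio of two proportions (assumed definition, not confirmed by the source)."""
    return (hits_in_set / set_size) / (hits_in_background / background_size)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return min(1.0, total)


# Hypothetical counts: 100 drug targets, 40 of them tissue-specific;
# 1,000 protein-coding genes in total, 200 of them tissue-specific.
targets_ts, targets_all = 40, 100
genes_ts, genes_all = 200, 1000

enrichment = fold_enrichment(targets_ts, targets_all, genes_ts, genes_all)
p_value = fisher_exact_two_sided(
    targets_ts,                                   # targets, tissue-specific
    targets_all - targets_ts,                     # targets, not tissue-specific
    genes_ts - targets_ts,                        # non-targets, tissue-specific
    (genes_all - targets_all) - (genes_ts - targets_ts),  # non-targets, other
)
print(f"fold enrichment = {enrichment:.2f}, p = {p_value:.3g}")
```

In this sketch the drug targets are removed from the background rows so that the two rows of the table do not overlap; whether the original analysis did the same is not stated.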
Figure 2. Tissue-specific targets represented less frequently reused (a,b) and older (c) subsets of targets of marketed drugs. (a) Numbers and percentages of targets of marketed drugs that were targeted by candidate drugs in clinical trials, i.e., reused. (b) Number of candidate drugs per reused target. (c) Year of regulatory approval by FDA or another agency of the first drugs modulating tissue-specific targets compared to non-tissue-specific targets of marketed drugs. For example, carglumic acid was the first marketed drug modulating CPS1 and it was approved in 2010.
Figure 3. Tissue-specific genes were less evolutionarily conserved in mice compared to all protein-coding genes and drug targets. (a) Percentages of genes without 1-to-1 orthologs in mice. A 1-to-1 ortholog refers to a human gene with one unique counterpart in mouse, as opposed to 1-to-many or many-to-many orthologs that arise from duplication or gene fusion events. (b) Ka/Ks ratios for human-mouse 1-to-1 orthologs. (c) Sequence similarity of human protein-coding genes and their 1-to-1 orthologs in mice.
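The ortholog categories in panel (a) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the input is a hypothetical list of (human gene, mouse gene) orthology pairs, the gene identifiers are placeholders, and a human gene absent from the list is treated as having no ortholog.

```python
from collections import defaultdict


def classify_orthologs(pairs):
    """Label each human gene as '1-to-1', '1-to-many' or 'many-to-many'.

    pairs: iterable of (human_gene, mouse_gene) orthology relationships.
    """
    human_to_mouse = defaultdict(set)
    mouse_to_human = defaultdict(set)
    for human, mouse in pairs:
        human_to_mouse[human].add(mouse)
        mouse_to_human[mouse].add(human)

    labels = {}
    for human, mice in human_to_mouse.items():
        several_mouse = len(mice) > 1
        several_human = any(len(mouse_to_human[m]) > 1 for m in mice)
        if several_mouse and several_human:
            labels[human] = "many-to-many"
        elif several_mouse or several_human:
            labels[human] = "1-to-many"
        else:
            labels[human] = "1-to-1"
    return labels


def percent_without_one_to_one(genes, labels):
    """Percentage of the given human genes that lack a 1-to-1 mouse ortholog."""
    genes = list(genes)
    if not genes:
        raise ValueError("gene list must not be empty")
    missing = sum(1 for g in genes if labels.get(g) != "1-to-1")
    return 100.0 * missing / len(genes)


# Placeholder identifiers, for illustration only.
example_pairs = [
    ("H1", "M1"),                  # unique counterpart in each species
    ("H2", "M2a"), ("H2", "M2b"),  # one human gene, two mouse genes
    ("H3", "M3"), ("H4", "M3"),    # two human genes share one mouse gene
]
example_labels = classify_orthologs(example_pairs)
print(example_labels)
print(percent_without_one_to_one(["H1", "H2", "H5"], example_labels))
```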
Figure 4. Tissue-specific genes were enriched in disease genes with potential gain-of-function mechanism but not loss-of-function mechanism. The bars show percentages of (a) OMIM, (b) PTVesc and (c) loss-of-function tolerant (ExAC pLI ≤ 0.1) and intolerant (pLI ≥ 0.9) genes in each gene category. Violin plots in (d) show loss-of-function tolerance as a continuous score according to the gnomAD consortium data. Higher values indicate higher tolerance to loss-of-function variation.
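The pLI thresholds used for panel (c) translate into a simple classification. The sketch below applies the two cut-offs from the caption (tolerant at pLI ≤ 0.1, intolerant at pLI ≥ 0.9) and reports percentages for one gene category. The label names and the "intermediate" bucket are assumptions added for illustration, and the scores in the example are made up.

```python
def classify_pli(pli):
    """Bucket a pLI score using the cut-offs given in the figure caption."""
    if pli <= 0.1:
        return "tolerant"
    if pli >= 0.9:
        return "intolerant"
    return "intermediate"  # assumed label for scores between the cut-offs


def pli_percentages(pli_scores):
    """Percentage of genes in each bucket for one gene category."""
    scores = list(pli_scores)
    if not scores:
        raise ValueError("need at least one pLI score")
    counts = {"tolerant": 0, "intolerant": 0, "intermediate": 0}
    for score in scores:
        counts[classify_pli(score)] += 1
    return {label: 100.0 * n / len(scores) for label, n in counts.items()}


# Made-up scores for a hypothetical gene category.
print(pli_percentages([0.0, 0.1, 0.5, 0.95]))
```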
Tissue-specific genes had elevated betweenness centrality scores compared to all protein-coding genes and negative controls (rhLOF) in STRING v10.5.
| Group | Median (IQR) | Nominal p | Bonferroni p |
|---|---|---|---|
| All proteins, N = 19,574 | 14 (0–19,548) | Reference | Reference |
| Tissue-specific (x = 2), N = 4,474 | 30 (0–19,534.7) | 1.3e-06 | 1.9e-05 |
| Tissue-specific (x = 6), N = 1,004 | 32.1 (0–19,536.3) | 7.3e-03 | 0.11 |
| Tissue-specific (x = 10), N = 553 | 38.6 (1–19,806.7) | 2.5e-04 | 3.7e-03 |
| Essential, N = 1,713 | 29,414.2 (40–130,570) | 8.9e-179 | 1.3e-177 |
| OMIM, N = 3,844 | 14,754.6 (7–59,036.2) | 2.6e-169 | 3.9e-168 |
| PTVesc, N = 1,912 | 13 (0–19,012.8) | 0.17 | 1 |
| rhLOF, N = 107 | 8 (0–2,241.5) | 0.20 | 1 |
| Marketed, oncology, N = 211 | 48,734.4 (11,077.1–201,844.9) | 2.0e-46 | 3.0e-45 |
| Marketed, non-oncology, N = 476 | 1,326 (13–42,544) | 9.4e-23 | 1.4e-21 |
| Phase 3, oncology, N = 145 | 70,125.7 (15,160.1–365,211.2) | 6.5e-34 | 9.8e-33 |
| Phase 3, non-oncology, N = 265 | 8,555.3 (38.2–59,080.5) | 5.3e-21 | 8.0e-20 |
| Phase 2, oncology, N = 239 | 58,807 (595.9–375,702) | 9.4e-46 | 1.4e-44 |
| Phase 2, non-oncology, N = 314 | 1,858.1 (8.8–51,355.6) | 4.4e-15 | 6.6e-14 |
| Phase 1, oncology, N = 253 | 51,848.7 (5,140.7–222,555.3) | 6.8e-46 | 1.0e-44 |
| Phase 1, non-oncology, N = 78 | 12,652.2 (51.5–58,423.5) | 2.9e-07 | 4.3e-06 |
IQR stands for interquartile range. P-values are from a two-tailed Mann-Whitney U test between each gene category and all protein-coding genes (marked as ‘Reference’).
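
For readers unfamiliar with the metric, betweenness centrality counts how many shortest paths between other nodes pass through a given node. The sketch below implements Brandes' algorithm for an unweighted, undirected graph and returns unnormalized scores. It is an illustration only: the source does not say which software, edge weights or confidence cut-offs were applied to STRING v10.5, and the protein names here are placeholders.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj: dict mapping each node to an iterable of its neighbours; every
    node must appear as a key. Uses Brandes' accumulation scheme.
    """
    scores = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1.0 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {v: s / 2.0 for v, s in scores.items()}


# Placeholder network: a hub protein P0 linked to three partners.
toy_network = {
    "P0": ["P1", "P2", "P3"],
    "P1": ["P0"],
    "P2": ["P0"],
    "P3": ["P0"],
}
print(betweenness_centrality(toy_network))
```

In the toy network every shortest path between two partners runs through the hub, so the hub receives the whole score and the peripheral nodes receive none.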
Betweenness centrality scores for tissue-specific genes were elevated due to enzymes and genes encoding secreted proteins.
| Group | Median (IQR) | Mann-Whitney U, p | Bonferroni p |
|---|---|---|---|
| Tissue-specific (x = 2), N = 4,474 | | | |
| Secreted and enzyme (N = 71) | 220.8 (3.1–57,712.4) | 2.7e-03 | 1.3e-02 |
| Secreted (N = 584) | 47.4 (1–19,911.5) | 2.4e-04 | 1.2e-03 |
| Enzyme (N = 363) | 10,147.6 (10.4–56,871.6) | 3.9e-22 | 1.9e-21 |
| Transporter (N = 151) | 21.8 (1–240.1) | 0.40 | 1 |
| Transcription factor (N = 273) | 12 (0–19,366.8) | 0.28 | 1 |
| Neither of the above (N = 3,032) | 21.8 (0–19,099.1) | Reference | Reference |
| Tissue-specific (x = 6), N = 1,004 | | | |
| Secreted and enzyme (N = 42) | 122.7 (1.6–40,206.9) | 5.2e-02 | 0.26 |
| Secreted (N = 252) | 236.4 (2–27,831.9) | 7.0e-06 | 3.5e-05 |
| Enzyme (N = 72) | 7,294 (39–51,895.2) | 2.3e-08 | 1.1e-07 |
| Transporter (N = 57) | 28 (3–292.8) | 0.44 | 1 |
| Transcription factor (N = 48) | 13 (0–19,343.1) | 0.81 | 1 |
| Neither of the above (N = 533) | 18 (0–3,438.1) | Reference | Reference |
| Tissue-specific (x = 10), N = 553 | | | |
| Secreted and enzyme (N = 34) | 192.9 (1.6–40,206.9) | 7.0e-02 | 0.35 |
| Secreted (N = 186) | 105 (2–35,507.9) | 2.6e-03 | 1.3e-02 |
| Enzyme (N = 54) | 8,318.3 (54.9–60,946.2) | 3.0e-06 | 1.5e-05 |
| Transporter (N = 35) | 31 (7.8–9,919.4) | 0.18 | 0.91 |
| Transcription factor (N = 25) | 14 (1–19,570) | 0.63 | 1 |
| Neither of the above (N = 219) | 13 (0–10,240.7) | Reference | Reference |
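
The Bonferroni columns in the two tables above are consistent with multiplying each nominal p-value by the number of comparisons and capping the result at 1. The factors used below (15 for the first table, 5 per block in the second) are inferred from the row counts and the reported values, not stated in the source. Small differences from the printed values are expected because the tables show rounded nominal p-values.

```python
def bonferroni(p_values, n_comparisons=None):
    """Bonferroni-adjust p-values: multiply by the number of tests, cap at 1.

    n_comparisons defaults to the number of p-values supplied.
    """
    p_values = list(p_values)
    m = len(p_values) if n_comparisons is None else n_comparisons
    return [min(1.0, p * m) for p in p_values]


# Nominal p-values copied from the first table (15 groups vs. reference).
print(bonferroni([7.3e-03, 0.17], n_comparisons=15))

# Nominal p-value copied from the second table (5 classes vs. reference).
print(bonferroni([2.4e-04], n_comparisons=5))
```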
Figure 5. Tissue-specific genes (x = 6) that were not yet explored as targets of marketed or clinical trial drugs but were potentially druggable or had human genetic evidence. Dots indicate overlapping sets, while bars on the top indicate overlap size. For example, the fifth column indicates that 40 secreted proteins had known crystal structures and did not have other indications of druggability. Proteins classified as Tchem in the TCRD database have potent compounds with binding affinities in the nanomolar (G-protein coupled receptors, nuclear hormone receptors, kinases) or lower micromolar range (ion channels and other target categories). Such chemical compounds can be optimized and transitioned to clinical trials. Secreted proteins may be amenable to antibody therapies. Other tissue-specific genes have some knowledge around them to start a chemical development program. Chemical compounds with binding activities in ChEMBL v23, possibly less potent than the criteria used to define Tchem, offer hints to infer structure-activity relationships and to discover and optimize a lead compound. A known crystal structure may be used to identify binding pockets and design binding molecules. Known drug targets in the same protein family may aid chemical discovery through sequence similarity and homology modelling.
Figure 6. Combining tissue-specificity with genetic evidence may represent an effective de-risking strategy for non-oncology drug targets. The graph shows prevalence of genes that are both tissue-specific at each x and have an OMIM Morbid Map entry or are PTVesc among drug targets compared to all protein-coding genes. The table below contains fold enrichment and p-values. Enrichment <1 indicates depletion in tissue-specific genes with human genetic evidence.