Aroon D Hingorani, Valerie Kuan, Chris Finan, Felix A Kruger, Anna Gaulton, Sandesh Chopade, Reecha Sofat, Raymond J MacAllister, John P Overington, Harry Hemingway, Spiros Denaxas, David Prieto, Juan Pablo Casas.
Abstract
Lack of efficacy in the intended disease indication is the major cause of clinical phase drug development failure. Explanations could include the poor external validity of preclinical (cell, tissue, and animal) models of human disease and the high false discovery rate (FDR) in preclinical science. FDR is related to the proportion of true relationships available for discovery (γ), and to the type 1 (false-positive) and type 2 (false-negative) error rates of the experiments designed to uncover them. We estimated the FDR in preclinical science, its effect on drug development success rates, and the improvements expected from using human genomics rather than preclinical studies as the primary source of evidence for drug target identification. Calculations were based on a sample space defined by all human diseases (the 'disease-ome'), represented as columns, and all protein-coding genes (the 'protein-coding genome'), represented as rows, producing a matrix of unique gene- (or protein-) disease pairings. We parameterised the space based on 10,000 diseases, 20,000 protein-coding genes, 100 causal genes per disease and 4000 genes encoding druggable targets, and examined the effect on the inferences drawn of varying the parameters and a range of underlying assumptions. We estimated γ, defined mathematical relationships between preclinical FDR and drug development success rates, and estimated improvements in success rates based on human genomics (rather than orthodox preclinical studies). Around one in every 200 protein-disease pairings was estimated to be causal (γ = 0.005), giving an FDR in preclinical research of 92.6%, which likely makes a major contribution to the reported drug development failure rate of 96%. The observed success rate was only slightly greater than expected for a random pick from the sample space. Values for γ back-calculated from reported preclinical and clinical drug development success rates were also close to the a priori estimates.
Substituting genome-wide (or druggable genome-wide) association studies for preclinical studies as the major information source for drug target identification was estimated to reverse the probability of late-stage failure, because of the more stringent type 1 error rate employed and the ability to interrogate every potential druggable target in the same experiment. Genetic studies conducted at much larger scale, with greater resolution of disease end-points, e.g. by connecting genomics and electronic health record data within healthcare systems, have the potential to produce radical improvement in drug development success rate.
Year: 2019 PMID: 31827124 PMCID: PMC6906499 DOI: 10.1038/s41598-019-54849-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
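The relationship between α, β and γ, the true discovery rate (TDR) and the false discovery rate (FDR).
| Outcome | Causal pairings | Non-causal pairings | Hypotheses tested | TDR | FDR |
|---|---|---|---|---|---|
| Declared positive | (1 − β)γ | α(1 − γ) | (1 − β)γ + α(1 − γ) | (1 − β)γ / [(1 − β)γ + α(1 − γ)] | α(1 − γ) / [(1 − β)γ + α(1 − γ)] |
| Declared negative | βγ | (1 − α)(1 − γ) | βγ + (1 − α)(1 − γ) | | |
| Total | γ | 1 − γ | 1 | | |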
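The relationship between α, β, γ and the discovery rates can be checked numerically. The following is a minimal sketch, assuming the usual definitions (true positives = (1 − β)γ, false positives = α(1 − γ)); the function name `discovery_rates` is illustrative and not taken from the paper.

```python
def discovery_rates(alpha: float, beta: float, gamma: float) -> dict:
    """Return the declared-positive share, TDR and FDR.

    alpha: type 1 (false-positive) error rate
    beta:  type 2 (false-negative) error rate
    gamma: proportion of tested pairings that are truly causal
    """
    true_pos = (1 - beta) * gamma      # causal pairings declared positive
    false_pos = alpha * (1 - gamma)    # non-causal pairings declared positive
    declared_pos = true_pos + false_pos
    return {
        "declared_positive": declared_pos,
        "tdr": true_pos / declared_pos,
        "fdr": false_pos / declared_pos,
    }


# Values quoted in the abstract: gamma = 0.005 with alpha = 0.05, power = 0.8
rates = discovery_rates(alpha=0.05, beta=0.2, gamma=0.005)
print(f"FDR = {rates['fdr']:.3f}")  # prints "FDR = 0.926"
```

With these inputs the sketch gives the 92.6% preclinical FDR quoted in the abstract.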
Figure 1. Sample space (N × N) defined by 10,000 human diseases (columns) and 20,000 protein-coding genes (rows). A region comprising 1/10,000th of the whole sample space is enlarged: (a) based on 10 causative genes per disease; (b) based on 100 causative genes per disease; and (c) based on 1000 causative genes per disease. Each cell represents a unique gene-disease pairing. Dark blue cells indicate causal gene-disease pairings, light blue cells indicate druggable gene-disease pairings, and red cells indicate pairings that are both causal and druggable.
Figure 2. Venn diagram illustrating (a) the probabilities of selecting and (b) the numbers of a causal, druggable gene-disease pair, a druggable gene-disease pair (TD) and a causal gene-disease pair (CD), given 200 × 10⁶ gene-disease pairings, 100 causal genes per disease and 4000 druggable genes among the 20,000 in the genome. (Not to scale.)
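The counts and probabilities behind this sample space follow from simple arithmetic. The sketch below uses the parameters stated in the abstract; it assumes that causal genes are druggable in the same proportion as the genome as a whole (4000 of 20,000), and all variable names are illustrative.

```python
N_DISEASES = 10_000
N_GENES = 20_000
CAUSAL_PER_DISEASE = 100
N_DRUGGABLE = 4_000

total_pairs = N_DISEASES * N_GENES                # all gene-disease pairings
causal_pairs = CAUSAL_PER_DISEASE * N_DISEASES    # causal pairings
druggable_pairs = N_DRUGGABLE * N_DISEASES        # druggable pairings

# Assumption: druggability is independent of causality, so the druggable
# fraction of causal genes equals the druggable fraction of the genome.
causal_druggable_per_disease = CAUSAL_PER_DISEASE * N_DRUGGABLE // N_GENES
causal_druggable_pairs = causal_druggable_per_disease * N_DISEASES

p_causal = causal_pairs / total_pairs
p_druggable = druggable_pairs / total_pairs
p_causal_druggable = causal_druggable_pairs / total_pairs

print(total_pairs, causal_pairs, druggable_pairs, causal_druggable_pairs)
print(p_causal, p_druggable, p_causal_druggable)
```

Under these assumptions, one in 200 pairings is causal (γ = 0.005), matching the abstract, and each disease has 20 targets that are both causal and druggable.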
Figure 3. Re-assorted ‘therapeutic genome’ of a hypothetical disease (d1). The 20,000 protein-coding genes are organised into 100 causal and 19,900 non-causal genes. Causal genes are further subdivided into 20 that are also druggable and 80 that are not. Of the 20 causal, druggable genes, 3 are the targets of licensed drugs for the treatment of d1. Of the non-causal genes, 3980 are druggable but not causal for d1. The right-hand panel indicates the number of true and false positive genes (including druggable genes) expected in a GWAS of d1 undertaken with a sample size that provides power 1 − β = 0.8 and a type 1 error rate of α = 5 × 10⁻⁸ at all loci.
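The expected GWAS yields for this hypothetical disease can be worked out directly from the gene counts, power and type 1 error rate given in the caption. The sketch below treats each gene as a single independent test, which is a simplification made here for illustration; variable names are not from the paper.

```python
# Gene counts for the hypothetical disease d1 (from the caption)
CAUSAL, NON_CAUSAL = 100, 19_900
CAUSAL_DRUGGABLE, NON_CAUSAL_DRUGGABLE = 20, 3_980

POWER = 0.8     # 1 - beta
ALPHA = 5e-8    # genome-wide type 1 error rate

# Simplification: one independent test per gene
expected_true_pos = CAUSAL * POWER
expected_false_pos = NON_CAUSAL * ALPHA
expected_true_pos_druggable = CAUSAL_DRUGGABLE * POWER
expected_false_pos_druggable = NON_CAUSAL_DRUGGABLE * ALPHA

print(f"all genes:       TP = {expected_true_pos:.0f}, FP = {expected_false_pos:.6f}")
print(f"druggable genes: TP = {expected_true_pos_druggable:.0f}, "
      f"FP = {expected_false_pos_druggable:.6f}")
```

On these assumptions about 80 causal genes (16 of them druggable) would be detected, while the expected number of false positives stays far below one.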
The relationship between α, β, and γ; TP, TN, FP, and FN; and the declared success rate (S) in preclinical and clinical drug development (see text for details).
| True relationship | No true relationship | All | ||
|---|---|---|---|---|
| 1 | ||||
| 1 |
Figure 4. Back-calculation of the proportion of true target-disease relationships (γ) studied in preclinical development, inferred from observed rates of clinical success (S_c = 0.1) and preclinical success (S_pc = 0.4). Estimates of γ assume power in clinical phase development (1 − β) = 0.8 and a false positive rate in clinical development α = 0.05, so that the proportion of true target-disease relationships in clinical development is γ = 0.0667. The graph shows estimates of γ (red line) for a range of values of power (1 − β) in preclinical development, with corresponding estimates of the preclinical false positive rate α (blue line). (See text for details.)
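The back-calculation can be sketched by inverting the success-rate relationship S = (1 − β)γ + α(1 − γ). The code below is an illustration rather than the authors' implementation: it assumes that the proportion of true relationships entering clinical development equals the preclinical true discovery rate, and the function names are hypothetical.

```python
def gamma_from_success(s: float, alpha: float, power: float) -> float:
    """Invert S = power * gamma + alpha * (1 - gamma) for gamma."""
    return (s - alpha) / (power - alpha)


def preclinical_back_calc(s_pc: float, gamma_c: float, power_pc: float):
    """Infer preclinical gamma and alpha from the preclinical success rate.

    Assumption: gamma_c (true relationships entering clinical development)
    equals the preclinical TDR, so preclinical true positives = gamma_c * s_pc.
    """
    true_pos = gamma_c * s_pc
    gamma_pc = true_pos / power_pc
    alpha_pc = (s_pc - true_pos) / (1 - gamma_pc)
    return gamma_pc, alpha_pc


# Clinical stage: S_c = 0.1, alpha = 0.05, power = 0.8 (values from the caption)
gamma_c = gamma_from_success(s=0.1, alpha=0.05, power=0.8)
print(f"gamma (clinical) = {gamma_c:.4f}")  # 0.0667, as stated in the caption

# Preclinical stage: S_pc = 0.4, over a range of assumed preclinical power
for power_pc in (0.2, 0.4, 0.6, 0.8):
    gamma_pc, alpha_pc = preclinical_back_calc(0.4, gamma_c, power_pc)
    print(f"power = {power_pc:.1f}: gamma = {gamma_pc:.4f}, alpha = {alpha_pc:.4f}")
```

The loop mirrors the two curves described in the caption: one value of γ and one of α for each assumed level of preclinical power.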
Figure 5. Distribution of number of licensed drug compounds per target.
Figure 6. Probability of orthodox drug development success according to the number of candidate targets in the initial sampling frame (left panel) and the number of parallel preclinical development programmes pursued (right panel). The calculations assume there are 4000 druggable genes and 20 causal, druggable targets per disease.
Illustrative examples of mapping SNPs curated in the GWAS catalogue to genomic linkage disequilibrium (LD) intervals containing targets of licensed and clinically used drugs (adapted with modification from Finan C, Gaulton A, et al. Sci. Translational Med. 2017 Mar 29; 9(383). pii: eaag1166. doi: 10.1126/scitranslmed.aag1166).
| Gene | Drug | Molecule | Curation code | GWAS EFO term | Drug indication (FDB) | Associated variant | Reference (PMID) | Minimum distance from druggable gene (bp) | Distance rank of druggable gene | Number of genes in LD interval | Number of druggable genes in LD interval |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ALDH2 | DISULFIRAM | Small molecule | 1 | alcohol drinking; drinking behavior | Alcoholism (adjunctive treatment) | rs11066280; rs12229654; rs2074356; rs671 | 21270382; 21372407; 23364009; 24277619 | 6016–790230 | 1–18 | 22–33 | 2–4 |
| PDE4D | AMINOPHYLLINE | Small molecule | 1 | asthma | Acute asthma; Acute exacerbation of chronic obstructive airways disease; Bronchial asthma; Chronic obstructive pulmonary disease; Left ventricular failure - cardiac failure - cardiac asthma; Reversible airways obstruction; Routine maintenance therapy in chronic bronchitis and asthma | rs1588265 | 19426955 | 448153 | 1 | 2 | 1 |
| IGF1R | MECASERMIN | Protein | 1 | body height | Growth failure due to primary IGF-1 deficiency | rs2871865 | 20881960; 25429064 | 2696 | 1 | 2 | 1 |
| TNFSF11 | DENOSUMAB | Antibody | 1 | bone density | Prevention of skeletal related events in advanced malignancy involving bone; Treatment of bone loss associated with hormone ablation in prostate cancer; Treatment of osteoporosis in postmenopausal women to prevent fractures | rs17536328; rs9525638 | 24945404 | 6157–8295 | 1 | 1 | 1 |
| ESR1 | TAMOXIFEN CITRATE | Small molecule | 1 | breast carcinoma | Carcinoma of breast; Infertility - female - anovulatory | rs140068132; rs3757318; rs9383938 | 22976474; 23535729; 25327703 | 9531–63713 | 1–2 | 2 | 1 |
| PLG | ALTEPLASE | Enzyme | 1 | coronary heart disease; large artery stroke; stroke | Acute ischaemic stroke: fibrinolytic treatment; Thrombolysis in acute myocardial infarction; Thrombolysis of occluded central venous access devices; Thrombolytic treatment in acute massive pulmonary embolism | rs10455872 | 24262325 | 113152 | 3 | 3 | 2 |
| TNF | ADALIMUMAB | Antibody | 1 | Crohn’s disease | Active polyarticular juvenile chronic arthritis-inadequate response to MTX; Active progressive rheumatoid arthritis; Moderate to severe plaque psoriasis: when other treatment is inappropriate; Moderate/severe ulcerative colitis: when other treatment is inappropriate; Rheumatoid arthritis when inadequate response to DMARDs incl. methotrexate; Severe active rheumatoid arthritis; Severe ankylosing spondylitis in adults if conventional therapy inadequate; Treatment of active & progressive psoriatic arthritis when DMARD inadequate; Treatment of active Crohn’s disease | rs1799964 | 21102463 | 1036 | 2 | 13 | 4 |
| CACNA1D | AMLODIPINE | Small molecule | 1 | diastolic blood pressure | Essential hypertension when stabilised on same ingreds.in same proportions; Hypertension-not adequately controlled by individual components; Prinzmetal’s angina; Prophylaxis of chronic stable angina pectoris; Treatment of essential hypertension | rs9810888 | 25249183 | 106912 | 1 | 1 | 1 |
| NPC1L1 | EZETIMIBE | Small molecule | 1 | LDL cholesterol; low density lipoprotein cholesterol measurement; total cholesterol measurement | Combined hyperlipidaemia: lipid lowering therapy adjunct to diet; Homozygous familial hypercholesterolaemia (adjunct to statin therapy); Homozygous familial hypercholesterolaemia: Adjunct to diet; Homozygous sitosterolaemia (phytosterolaemia); Primary hypercholesterolaemia (hyperlipidaemia type IIa): Adjunct to diet; Primary hypercholesterolaemia: lipid lowering therapy adjunct to diet | rs2072183 | 20686565; 24097068 | 1734 | 1 | 1 | 1 |
| PPARA | GEMFIBROZIL | Small molecule | 1 | LDL cholesterol; low density lipoprotein cholesterol measurement; total cholesterol measurement | Mixed hyperlipidaemia when statin is contraindicated or not tolerated; Primary hypercholesterolaemia: lipid lowering therapy adjunct to diet; Reduction of cardiac events in hypercholesterolaemia; Severe hypertriglyceridaemia with or without low HDL cholesterol | rs4253772 | 24097068 | 12050 | 1 | 7 | 2 |
| CASR | CINACALCET HYDROCHLORIDE | Small molecule | 1 | calcium measurement | Homoeopathic; Hypercalcaemia due to malignant disease; Hypercalcaemia in primary HPT when parathyroidectomy contraindicated; Secondary hyperparathyroidism in end stage renal disease: treatment | rs17251221; rs1801725 | 20661308; 20705733; 24068962 | 1585–12095 | 1 | 5 | 1 |
| IL6R | TOCILIZUMAB | Antibody | 1 | rheumatoid arthritis | Active juvenile idiopathic arthritis (unresp to NSAIDs) in comb with MTX; Active juvenile idiopathic arthritis when inadequate response to NSAIDs; Rheumatoid arthritis (unresp to DMARD/TNF inhib.) in comb with methotrexate; Rheumatoid arthritis when inadequate response to DMARDs incl. methotrexate | rs2228145 | 24390342 | 14956 | 1 | 1 | 1 |
| TNF | ADALIMUMAB | Antibody | 1 | rheumatoid arthritis | Active polyarticular juvenile chronic arthritis-inadequate response to MTX; Active progressive rheumatoid arthritis; Moderate to severe plaque psoriasis: when other treatment is inappropriate; Moderate/severe ulcerative colitis: when other treatment is inappropriate; Rheumatoid arthritis when inadequate response to DMARDs incl. methotrexate; Severe active rheumatoid arthritis; Severe ankylosing spondylitis in adults if conventional therapy inadequate; Treatment of active & progressive psoriatic arthritis when DMARD inadequate; Treatment of active Crohn’s disease | rs2596565 | 24532677 | 190015 | 24 | 145 | 27 |
| ABCC8 | GLIPIZIDE | Small molecule | 1 | type II diabetes mellitus | Non insulin dependent diabetes mellitus when diet has failed | rs5219 | 19056611 | 4860–5802 | 3 | 5 | 3 |
| ABCC8 | GLYBURIDE | Small molecule | 1 | type II diabetes mellitus | Type 2 diabetes (NIDDM) not controlled by diet,weight loss & exercise alone | rs5215; rs5219 | 17463248; 17463249; 19056611; 24509480 | 4860–5802 | 3 | 5 | 3 |
| ABCC8 | NATEGLINIDE | Small molecule | 1 | type II diabetes mellitus | Control of type-2 diabetes (NIDDM) with metformin if metformin inadequate | rs5219 | 19056611 | 4860–5802 | 3 | 5 | 3 |
| ABCC8 | REPAGLINIDE | Small molecule | 1 | type II diabetes mellitus | Control of type-2 diabetes (NIDDM) with metformin if metformin inadequate; Type 2 diabetes (NIDDM) not controlled by diet,weight loss & exercise alone | rs5219 | 19056611 | 4860–5802 | 3 | 5 | 3 |
| KCNJ11 | GLIMEPIRIDE | Small molecule | 1 | type II diabetes mellitus | Type 2 diabetes (NIDDM) not controlled by diet,weight loss & exercise alone | rs5219 | 19056611 | 1224–1306 | 1 | 5 | 3 |
| KCNJ11 | GLIPIZIDE | Small molecule | 1 | type II diabetes mellitus | Non insulin dependent diabetes mellitus when diet has failed | rs5219 | 19056611 | 1224–1306 | 1 | 5 | 3 |
| KCNJ11 | GLYBURIDE | Small molecule | 1 | type II diabetes mellitus | Type 2 diabetes (NIDDM) not controlled by diet,weight loss & exercise alone | rs5215; rs5219 | 17463248; 17463249; 19056611; 24509480 | 1224–1306 | 1 | 5 | 3 |
| KCNJ11 | NATEGLINIDE | Small molecule | 1 | type II diabetes mellitus | Control of type-2 diabetes (NIDDM) with metformin if metformin inadequate | rs5219 | 19056611 | 1224–1306 | 1 | 5 | 3 |
| KCNJ11 | REPAGLINIDE | Small molecule | 1 | type II diabetes mellitus | Control of type-2 diabetes (NIDDM) with metformin if metformin inadequate; Type 2 diabetes (NIDDM) not controlled by diet,weight loss & exercise alone | rs5219 | 19056611 | 1224–1306 | 1 | 5 | 3 |
| PPARG | PIOGLITAZONE HYDROCHLORIDE | Small molecule | 1 | type II diabetes mellitus | Combination treatment of Type 2 diabetes with insulin; Control of type-2 diabetes if metformin+sulphonylurea therapy is inadequate; Monotherapy for type2 diabetes if overweight and metformin inappropriate; Oral combination treatment of type 2 diabetes | rs1801282 | 24509480 | 64258 | 1 | 1 | 1 |
| SCN1A | OXCARBAZEPINE | Small molecule | 1 | Mesial temporal lobe epilepsy with hippocampal sclerosis; febrile seizures | Epilepsy - combination of both partial and tonic-clonic seizures; Epilepsy - partial seizures | rs7587026 | 24014518 | 5773–52194 | 1 | 3 | 1 |
| GRIN3B | MEMANTINE HYDROCHLORIDE | Small molecule | 1 | Alzheimers disease | Moderate to severe Alzheimer’s disease; No information available | rs115550680 | 23571587 | 40689 | 8 | 8 | 2 |
| SLC22A12 | SULFINPYRAZONE | Small molecule | 1 | urate measurement | Gout (prophylaxis); Gouty arthritis; Hyperuricaemia | rs2078267; rs478607 | 20884846; 23263486 | 23999–108243 | 2–3 | 2–3 | 2 |
| SLC22A11 | PROBENECID | Small molecule | 1 | urate measurement; uric acid measurement | | rs17300741; rs2078267 | 19503597; 20884846; 23263486 | 6233–8364 | 1 | 1–2 | 1–2 |
| SCN2A | CARBAMAZEPINE | Small molecule | 2 | febrile seizures | Epilepsy - grand mal; Epilepsy - partial seizures; Epilepsy - tonic-clonic seizures; Prophylaxis of manic-depressive illness unresponsive to lithium; Trigeminal neuralgia | rs3769955 | 25344690 | 14186 | 1 | 1 | 1 |
| DIO1 | PROPYLTHIOURACIL | Small molecule | 3 | thyroxine; thyroxine measurement | Hyperthyroidism; Thyrotoxic crisis; Unlicensed product | rs2235544 | 23408906 | 1189 | 1 | 4 | 1 |
| PDE4D | DIPYRIDAMOLE | Small molecule | 4 | asthma | Alternative to exercise stress in thallium-201 myocardial imaging; Ischemic stroke: Secondary prevention (with/without aspirin); Secondary prevention of ischaemic stroke; Secondary prevention of transient ischaemic attacks; Thromboembolism+prosthetic heart valve: prophylaxis (+oral anticoagulant); Transient ischemic attacks: Secondary prevention (with/without aspirin) | rs1588265 | 19426955 | 448153 | 1 | 2 | 1 |
| ACHE | RIVASTIGMINE | Small molecule | 4 | resting heart rate | Mild - moderate dementia in Alzheimer’s disease; Mild - moderate dementia in idiopathic Parkinson’s disease | rs12666989; rs314370 | 20639392 | 861–34407 | 3–7 | 9 | 4 |
| ACHE | NEOSTIGMINE METHYLSULFATE | Small molecule | 4 | heart rate | Myasthenia gravis; Paralytic ileus; Paroxysmal supra-ventricular tachyarrhythmias; Post operative distention; Post operative urinary retention; Reversal of residual competitive neuromuscular block; Unlicensed product | rs13245899 | 23583979 | 861–34407 | 1–7 | 9 | 4 |
| CHRM2 | TOLTERODINE TARTRATE | Small molecule | 4 | heart rate | Symptomatic treatment of urinary urgency, frequency or urge incontinence | rs2350782 | 23583979 | 62368 | 1 | 3 | 1 |
The gene encoding the drug target is listed using its HUGO Gene Nomenclature Committee (HGNC) designation. Drug names and indications are from First Databank (FDB). GWAS SNPs are listed by reference SNP (rs) number and physical distances are in base pairs (bp). Curation code refers to the correspondence between the treatment indication and the GWAS disease or trait association (see text). Examples are shown of treatment indication rediscoveries, which refer to a match between a drug target indication and a genetic association (curation code 1 = precise match, code 2 = disease area match). For many of these, the drug target gene is the sole occupant of the LD interval defined by the GWAS SNP. Examples come from a variety of disease areas and, for some diseases (e.g. type 2 diabetes and rheumatoid arthritis), multiple target rediscoveries are noted. Examples of rediscoveries of mechanism of action (curation code 3) and of mechanism-based side effects (curation code 4) are also seen.
A priori estimates of preclinical (pc), clinical (c) and overall (o) drug development success contrasting orthodox (non-genomic) with genomic approaches.
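| Causal genes per disease | γ | α_pc | β_pc | FDR_pc | S_pc | TDR_pc | α_c | β_c | FDR_c | TDR_c | S_c | S_o |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.0001 | 0.05 | 0.2 | 0.9984024 | 0.05008 | 0.0015976 | 0.05 | 0.2 | 0.97503657 | 0.02496343 | 0.051198203 | 0.00256 |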
| 100 | 0.001 | 0.05 | 0.2 | 0.98423645 | 0.05075 | 0.01576355 | 0.05 | 0.2 | 0.79601594 | 0.20398406 | 0.06182266 | 0.00314 |
| 1000 | 0.01 | 0.05 | 0.2 | 0.86086957 | 0.0575 | 0.13913043 | 0.05 | 0.2 | 0.27887324 | 0.72112676 | 0.154347826 | 0.00888 |
| 10 | 0.0001 | 0.00000005 | 0.2 | 0.00062455 | 0.00008 | 0.99937545 | 0.05 | 0.2 | 0.000039057 | 0.99996094 | 0.79953159 | 0.000064 |
| 100 | 0.001 | 0.00000005 | 0.2 | 0.000062434 | 0.0008 | 0.99993757 | 0.05 | 0.2 | 3.9023E-06 | 0.9999961 | 0.799953175 | 0.00064 |
| 1000 | 0.01 | 0.00000005 | 0.2 | 6.1875E-06 | 0.008 | 0.99999381 | 0.05 | 0.2 | 3.8672E-07 | 0.99999961 | 0.799995359 | 0.0064 |
| 10 | 0.0005 | 0.05 | 0.2 | 0.99205955 | 0.050375 | 0.00794045 | 0.05 | 0.2 | 0.8864745 | 0.1135255 | 0.055955335 | 0.00282 |
| 100 | 0.005 | 0.05 | 0.2 | 0.9255814 | 0.05375 | 0.074418605 | 0.05 | 0.2 | 0.43736264 | 0.56263736 | 0.105813953 | 0.00569 |
| 1000 | 0.05 | 0.05 | 0.2 | 0.54285714 | 0.0875 | 0.45714286 | 0.05 | 0.2 | 0.06909091 | 0.93090909 | 0.392857143 | 0.03438 |
| 10 | 0.0005 | 0.00000005 | 0.2 | 0.00012492 | 0.00040005 | 0.99987508 | 0.05 | 0.2 | 7.8085E-06 | 0.99999219 | 0.799906309 | 0.00032 |
| 100 | 0.005 | 0.00000005 | 0.2 | 0.000012437 | 0.00400005 | 0.99998756 | 0.05 | 0.2 | 7.7734E-07 | 0.99999922 | 0.799990672 | 0.0032 |
| 1000 | 0.05 | 0.00000005 | 0.2 | 0.0000011875 | 0.04000008 | 0.99999881 | 0.05 | 0.2 | 7.4219E-08 | 0.99999993 | 0.799999109 | 0.032 |
TDR, FDR, S_pc, S_c and S_o are presented at different values of α (type 1 error rate), β (type 2 error rate) and γ (proportion of causal and druggable targets): (a) when the sample space is defined by all protein-coding genes, and (b) when the sample space is restricted to the druggable genome. See text for details.
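The tabulated success rates can be reproduced by chaining two stages. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes that the proportion of true relationships entering the clinical stage equals the TDR of the preceding target-identification stage, and that overall success is the product of the two stage success rates. The function names `stage` and `pipeline` are hypothetical.

```python
def stage(alpha: float, beta: float, gamma: float):
    """Return (success rate, TDR) for one development stage."""
    true_pos = (1 - beta) * gamma
    false_pos = alpha * (1 - gamma)
    success = true_pos + false_pos
    return success, true_pos / success


def pipeline(gamma: float, alpha_pc: float, beta_pc: float = 0.2,
             alpha_c: float = 0.05, beta_c: float = 0.2) -> dict:
    """Chain target identification (pc) and clinical (c) stages."""
    s_pc, tdr_pc = stage(alpha_pc, beta_pc, gamma)
    # Assumption: gamma at the clinical stage is the TDR of the first stage
    s_c, tdr_c = stage(alpha_c, beta_c, tdr_pc)
    return {"S_pc": s_pc, "TDR_pc": tdr_pc,
            "S_c": s_c, "TDR_c": tdr_c, "S_o": s_pc * s_c}


# Druggable-genome sample space with 100 causal genes per disease (gamma = 0.005)
orthodox = pipeline(gamma=0.005, alpha_pc=0.05)   # conventional preclinical alpha
genomic = pipeline(gamma=0.005, alpha_pc=5e-8)    # genome-wide significance

for label, result in (("orthodox", orthodox), ("genomic", genomic)):
    print(label, {name: round(value, 6) for name, value in result.items()})
```

Under these assumptions the clinical success rate rises from about 0.11 with the orthodox approach to about 0.80 with the genomic approach, in line with the corresponding rows of the table.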
Selected examples of Academia, Pharma, and Pharma-Academia initiatives concerning genomics and drug development.
| Initiative | Partners | Drug development model | Aims |
|---|---|---|---|
| Accelerating Drug Development and Repurposing Incubator at Vanderbilt University (a) | Multiple departments at Vanderbilt University Medical Centre | Academic incubator | De-identified genotype data linked to de-identified demographic and health record data to aid precision drug development and drug repurposing |
| DECODE Genetics (b) | Decode is a subsidiary of Amgen, a biopharmaceutical company | Within-company | Discover genetic variation underlying human disease in the Icelandic population with the aim of diagnosing, treating and preventing disease |
| Open Targets (c) | GSK, Biogen, European Bioinformatics Institute, Wellcome Trust Sanger Institute | Pre-competitive, open access | Public-private initiative based on the use of genomics for drug target validation |
| Astra Zeneca Centre for Genomics Research (d) | Human Longevity, Inc; Wellcome Trust Sanger Institute; Institute for Molecular Medicine, Finland | Within-company | ‘Integrated genomics initiative to transform drug discovery and development across (AZ’s) entire therapeutic pipeline’ |
| Eisai Andover Innovative Medicines Institute (e) | Seeking collaborations with external scientific partners | Pre-competitive research consortia | ‘Executing novel therapeutic targets validated by human genetics’ |
| Regeneron Genetics Centre (f) | Geisinger Health System, and other health service and academic partners | Within-company | ‘Comparing genetic information against medical histories … to develop new means of diagnosing, preventing and/or treating medical conditions’ |
| GSK-Regeneron UK Biobank Partnership (g) | GSK, Regeneron and UK Biobank | Industry-academia partnership, with 9-month exclusivity period for Pharma partners | Exome sequencing of stored DNA from UK Biobank participants: 50,000 samples in year 1, 500,000 by year 3. |
(a) http://online.liebertpub.com/doi/10.1089/adt.2016.772
(b) http://www.decode.com/
(c) https://www.opentargets.org/
(d) https://www.astrazeneca.com/media-centre/press-releases/2016/AstraZeneca-launches-integrated-genomics-approach-to-transform-drug-discovery-and-development-22042016.html
(e) http://us.eisai.com/research/andover-innovative-medicines-institute
(f) https://www.regeneron.com/genetics-center
(g) http://www.ukbiobank.ac.uk/2017/03/gsk-regeneron-initiative-to-develop-better-treatments-more-quickly
Figure 7. Study designs relevant to drug target identification and validation based on human genomics: (a) conventional genome-wide association analysis, in which variation in 20,000 genes is tested against a single disease; (b) phenome-wide association analysis of a gene encoding a drug target, in which variation in a single druggable gene is evaluated against many (all) diseases; (c) druggable genome- and phenome-wide association analysis; and (d) whole genome- and phenome-wide association analysis.