Anselm S Hoppmann1,2, Pascal Schlosser3, Rolf Backofen4, Ekkehart Lausch1, Anna Köttgen2.
Abstract
Genome-wide association studies (GWAS) evaluate associations between genetic variants and a trait or disease of interest free of prior biological hypotheses. GWAS require stringent correction for multiple testing, with genome-wide significance typically defined as association p-value < 5 × 10⁻⁸. This study presents a new tool that uses external information about genes to prioritize SNP associations (GenToS). For a given list of candidate genes, GenToS calculates an appropriate statistical significance threshold and then searches for trait-associated variants in summary statistics from human GWAS. It thereby allows for identifying trait-associated genetic variants that do not meet genome-wide significance. The program additionally tests for enrichment of significant candidate gene associations in the human GWAS data compared to the number expected by chance. As proof of principle, this report used external information from a comprehensive resource of genetically manipulated and systematically phenotyped mice. Based on selected murine phenotypes for which human GWAS data for corresponding traits were publicly available, several candidate gene input lists were derived. Using GenToS for the investigation of candidate genes underlying murine skeletal phenotypes in data from a large human discovery GWAS meta-analysis of bone mineral density resulted in the identification of significantly associated variants in 29 genes. Index variants in 28 of these loci were subsequently replicated in an independent GWAS replication step, highlighting that they are true positive associations. One signal, COL11A1, has not been discovered through GWAS so far and represents a novel human candidate gene for altered bone mineral density. The number of observed genes that contained significant SNP associations in human GWAS based on murine candidate gene input lists was much greater than the number expected by chance across several complex human traits (enrichment p-value as low as 10⁻¹⁰).
GenToS can be used with any candidate gene list and any GWAS summary file, runs on a desktop computer, and is freely available.
Year: 2016 PMID: 27612175 PMCID: PMC5017755 DOI: 10.1371/journal.pone.0162466
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. GenToS principle.
(A) First, GenToS extracts for each gene on a given candidate gene input list the region of the gene including a user-defined flanking region. (B) Next, all independent SNPs within each region are identified from a reference population, and a significance threshold based on the number of independent SNPs is calculated. (C) In the final step, SNPs with an association p-value below the calculated significance threshold are extracted from the human GWAS summary results. (D) Enrichment of the number of observed significant genes (vertical line) can be assessed visually compared to the expected number based on a null distribution derived by resampling from a binomial distribution (histogram).
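Steps (A) to (C) can be sketched in a few lines of Python. This is a minimal illustration, not the GenToS implementation: the `Region` type, the function names, the toy SNP records, and the Bonferroni-style rule (alpha divided by the number of independent SNPs) are all assumptions made for the example. The caption only states that the threshold is based on the number of independent SNPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical gene region: chromosome plus start/end in base pairs."""
    gene: str
    chrom: str
    start: int
    end: int


def with_flank(region, flank):
    """Step (A): extend a gene region by `flank` base pairs on both sides."""
    return Region(region.gene, region.chrom,
                  max(0, region.start - flank), region.end + flank)


def listwise_threshold(n_independent_snps, alpha=0.05):
    """Step (B): ASSUMED Bonferroni-style threshold over independent SNPs."""
    if n_independent_snps <= 0:
        raise ValueError("need at least one independent SNP")
    return alpha / n_independent_snps


def significant_snps(summary, regions, threshold):
    """Step (C): collect SNPs inside any region with p below the threshold."""
    hits = {}
    for snp_id, chrom, pos, p_value in summary:
        for r in regions:
            inside = chrom == r.chrom and r.start <= pos <= r.end
            if inside and p_value < threshold:
                hits.setdefault(r.gene, []).append((snp_id, p_value))
    return hits


# Toy data; gene names, positions and p-values are made up.
genes = [Region("GENE_A", "1", 10_000, 20_000),
         Region("GENE_B", "2", 50_000, 60_000)]
regions = [with_flank(g, 5_000) for g in genes]
summary = [
    ("snp1", "1", 6_000, 1e-6),    # inside the flank of GENE_A, significant
    ("snp2", "1", 15_000, 0.2),    # inside GENE_A, not significant
    ("snp3", "2", 70_000, 1e-9),   # outside the flank of GENE_B
    ("snp4", "2", 55_000, 4e-5),   # inside GENE_B, significant
]
threshold = listwise_threshold(n_independent_snps=1_000)
hits = significant_snps(summary, regions, threshold)
print(threshold, hits)
```

In this toy run the list-wise threshold is far less stringent than the genome-wide 5 × 10⁻⁸, which is why `snp1` and `snp4` are reported even though neither would reach genome-wide significance.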
Fig 2. QQ-plots of the number of observed significant genes under the null hypothesis comparing random draws of gene input lists and simulated draws.
The graph shows that simulated draws based on a binomial experiment approximate the number of significant genes under the null hypothesis derived from iterations of randomly generated input gene lists, while being computationally more efficient. QQ plots were generated across a range of possible significance thresholds. Spearman correlation coefficients were determined for each setting and found to be in the range of 0.90–1.00.
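The simulated draws can be illustrated as follows. The sketch assumes that, under the null hypothesis, each gene on a list of `n_genes` genes is independently significant with the same probability `p_gene`; both names and the example values are hypothetical, and the source does not specify how the per-gene probability is obtained.

```python
import random


def simulate_null_counts(n_genes, p_gene, n_draws, seed=0):
    """Draw `n_draws` counts of significant genes from Binomial(n_genes, p_gene).

    Each draw is the sum of `n_genes` independent Bernoulli(p_gene) trials.
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [
        sum(rng.random() < p_gene for _ in range(n_genes))
        for _ in range(n_draws)
    ]


def empirical_enrichment_p(null_counts, observed):
    """Fraction of simulated counts at least as large as the observed count."""
    return sum(c >= observed for c in null_counts) / len(null_counts)


# Hypothetical setting: 500 genes, 5% chance per gene of a null hit.
null_counts = simulate_null_counts(n_genes=500, p_gene=0.05, n_draws=2_000)
mean_count = sum(null_counts) / len(null_counts)
p_emp = empirical_enrichment_p(null_counts, observed=40)
print(mean_count, p_emp)
```

Simulating counts this way avoids re-running the full SNP lookup for every randomly drawn gene list, which is the computational saving the caption refers to.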
Fig 3. GenToS identifies significant enrichment of genes containing femoral neck bone mineral density-associated SNPs based on candidate gene input lists for murine bone phenotypes.
For each of the six candidate gene input lists, the number of expected significant genes under the null hypothesis was generated based on iterations of randomly drawn gene lists that contained an equal number of genes as the respective candidate gene input list and is displayed as a histogram. In addition, the binomial density distribution corresponding to the candidate gene input list significance threshold was overlaid (dots connected with lines). The observed number of significant genes based on the use of GenToS with the candidate gene input lists and the human GWAS results for femoral neck bone mineral density is indicated by a vertical black line. The enrichment p-value is computed from the complementary cumulative binomial distribution (see Methods).
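A complementary cumulative binomial probability can be computed directly, as sketched below. The function name, the inclusive convention P(X ≥ k), and the example numbers are assumptions for illustration; the Methods of the original article define the exact quantity GenToS reports. The sum is evaluated in log space so that large gene lists do not overflow.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(k_observed, n_genes, p_gene):
    """P(X >= k_observed) for X ~ Binomial(n_genes, p_gene), with 0 < p_gene < 1."""
    if not 0.0 < p_gene < 1.0:
        raise ValueError("p_gene must lie strictly between 0 and 1")
    total = 0.0
    for i in range(max(k_observed, 0), n_genes + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (
            lgamma(n_genes + 1) - lgamma(i + 1) - lgamma(n_genes - i + 1)
            + i * log(p_gene) + (n_genes - i) * log1p(-p_gene)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


# Hypothetical example: 30 significant genes observed on a list of 1247 genes
# when each gene has a 1% chance of a null hit (about 12.5 expected).
p_enrich = binomial_upper_tail(k_observed=30, n_genes=1247, p_gene=0.01)
print(p_enrich)
```

A small value means the observed count lies far in the right tail of the null distribution, which corresponds to the vertical line falling well to the right of the histogram in the figure.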
Genes identified by GenToS in association with human bone mineral density phenotypes that reached genome-wide significance and were replicated in previous GWAS.
| Gene | Cyto-band | index SNP, FNBMD | p-value, FNBMD | index SNP, LSBMD | p-value, LSBMD | in GWAS Catalog | OMIM number | monogenic phenotype |
|---|---|---|---|---|---|---|---|---|
| 1p31.3 | rs1430742 | 2.91E-13 | rs878548 | 1.52E-19 | Estrada K, rs12407028-T, 3 x 10–45 (LSBMD) | |||
| 2p16.2 | rs11898505 | 9.72E-12 | Estrada K, rs4233949-C, 2 x 10–18 (LSBMD) | |||||
| 2q21 | rs13005448 | 7.99E-07 | Estrada K, rs7584262-T, 1 x 10–9 (FNBMD) | |||||
| 2q24.3 | rs1346004 | 1.62E-10 | rs1346004 | 4.44E-08 | Estrada K, rs1346004-A, 4 x 10–30 (LSBMD); Duncan EL, rs6710518-T, 5 x 10–10 (femoral neck) | 211900 | Tumoral calcinosis, hyperphosphatemic, familial | |
| 4p16.3 | rs3755955 | 3.73E-07 | Estrada K, rs3755955-A, 5 x 10–15 (LSBMD) | 607014; 607015; 607016 | Mucopolysaccharidosis Ih; Mucopolysaccharidosis Ih/s; Mucopolysaccharidosis Is | |||
| 4q22.1 | rs1054629 | 9.23E-10 | rs1471403 | 1.67E-10 | Estrada K, rs6532023-T, 1 x 10–27 (LSBMD); Zhang L, rs1463104-?, 2 x 10–9 (spine) | |||
| 5q14.3 | rs17558396 | 6.54E-07 | Estrada K, rs1366594-A, 4 x 10–61 (FNBMD); Zhang L, rs6894139-?, 7 x 10–18 (FNK); Zheng HF, rs11951031-T, 9 x 10–9; Duncan EL, rs6710518-T, 8 x 10–10 (femoral neck) | 613443; 613443 | Chromosome 5q14.3 deletion syndrome; Mental retardation, stereotypic movements, epilepsy, and/or cerebral malformations | |||
| 6q25.1 | rs3020331 | 1.30E-14 | rs2941741 | 1.19E-14 | Estrada K, rs4869742-T, 4 x 10–35 (LSBMD); Paternoster L, rs6909279-G, 1 x 10–9 (Cortical vBMD) | 615363; 114480; 157300; 608446 | Estrogen resistance; {Breast cancer}; {Migraine, susceptibility to}; {Myocardial infarction, susceptibility to}; {Breast cancer} (no OMIM); {Migraine, susceptibility to} (no OMIM) | |
| 7q31.31 | rs3801387 | 4.67E-14 | rs3801387 | 1.58E-15 | Estrada K, rs3801387-A, 3 x 10–51 (LSBMD); Zhang L, rs10242100-?, 2 x 10–10 (hip) | |||
| 8q24.12 | rs4242592 | 1.58E-14 | rs10505348 | 3.22E-18 | Estrada K, rs2062377-A, 3 x 10–39 (LSBMD); Zhang L, rs4424296-?, 9 x 10–14 (spine); Richards JB, rs4355801-A, 8 x 10–10; Paternoster L, rs2062377-A, 1 x 10–7 (Cortical vBMD) | 239000 | Paget disease of bone 5, juvenile-onset | |
| 11p11.2 | rs7932354 | 2.62E-08 | Estrada K, rs7932354-T, 5 x 10–18 (FNBMD) | |||||
| 11p15.2-p15.1 | rs4757390 | 3.94E-07 | Estrada K, rs7108738-T, 1 x 10–32 (FNBMD); Zhang L, rs7108738-?, 1 x 10–15 (FNK) | |||||
| 11q13.2 | rs608343 | 5.77E-07 | rs3736228 | 1.32E-10 | Estrada K, rs3736228-T, 2 x 10–26 (LSBMD); Richards JB, rs3736228-T, 6 x 10–12; Zhang L, rs525592-?, 3 x 10–11 (spine) | 601813; 144750; 607634; 259770; 144750; 607636; 601884; 166710 | Exudative vitreoretinopathy 4; Hyperostosis, endosteal; Osteopetrosis, autosomal dominant 1; Osteoporosis-pseudoglioma syndrome; Osteosclerosis; van Buchem disease, type 2; [Bone mineral density variability 1]; {Osteoporosis} | |
| 12q13.13 | rs10876528 | 1.23E-07 | rs894737 | 6.38E-10 | Estrada K, rs736825-C, 8 x 10–16 (LSBMD) | |||
| 12q13.13 | rs2016266 | 5.44E-07 | rs2016266 | 3.97E-12 | Estrada K, rs2016266-A, 3 x 10–20 (LSBMD) | 613849 | ?Osteogenesis imperfecta, type XII | |
| 16p13.3 | rs9921222 | 2.22E-07 | rs9921222 | 7.26E-08 | Estrada K, rs9921222-T, 1 x 10–16 (LSBMD) | 607864; 114550 | ?Caudal duplication anomaly; Hepatocellular carcinoma, somatic | |
| 16p13.3 | rs13336428 | 2.55E-07 | Estrada K, rs13336428-A, 1 x 10–16 (FNBMD) | 166600; 611490 | Osteopetrosis, autosomal dominant 2; Osteopetrosis, autosomal recessive 4 | |||
| 17q21.31 | rs2741856 | 5.10E-08 | rs2741856 | 1.29E-07 | Estrada K, rs4792909-T, 2 x 10–11 (FNBMD) | 122860; 269500; 239100 | Craniodiaphyseal dysplasia, autosomal dominant; Sclerosteosis 1; Van Buchem disease | |
| 18q21.33 | rs884205 | 2.62E-08 | Estrada K, rs884205-A, 2 x 10–17 (LSBMD) | 174810; 612301; 602080 | Osteolysis, familial expansile; Osteopetrosis, autosomal recessive 7; {Paget disease of bone 2, early-onset} | |||
| 20p12.2 | rs6514116 | 8.88E-08 | rs6040061 | 9.23E-10 | Estrada K, rs3790160-T, 3 x 10–19 (LSBMD); Kung AW, rs2273061-A, 5 x 10–8 | 118450; 187500 | Alagille syndrome; Tetralogy of Fallot; Deafness, congenital heart defects, and posterior embryotoxon (non OMIM) |
The index SNP is defined as the SNP with the lowest association p-value with a given trait. The GWAS Catalog entry refers to results obtained from the NHGRI GWAS catalog upon entry of the given index SNP. Monogenic phenotypes are retrieved from OMIM. Of note, several of these genes only achieved genome-wide significance after the replication step, whereas GenToS is based on data from the discovery step and already implicated the genes at this point. Empty cells for LSBMD and FNBMD p-values and SNP identifiers indicate that no SNP in the gene contained significant associations below any of the six murine candidate gene list-wise thresholds.
*LSBMD and FNBMD entries from the GWAS catalog represent summary estimates from the combined discovery and replication step.
LSBMD = Lumbar spine bone mineral density; FNBMD = Femoral neck bone mineral density; OMIM = Online Mendelian Inheritance in Man database
Newly implicated genes identified by GenToS in association with bone mineral density phenotypes.
These genes either mapped into known associated GWAS regions but were not previously named as the index gene, or were not replicated at genome-wide significance at the time the GWAS data was published.
| Gene | Cyto-band | index SNP, FNBMD | p-value, FNBMD | index SNP, LSBMD | p-value, LSBMD | in GWAS Catalog | Info | OMIM number | monogenic phenotype | Additional annotation information based on SNiPA |
|---|---|---|---|---|---|---|---|---|---|---|
| 1p21.1 | rs11809524 | 1.41E-06 | Estrada K, rs11809524-T, 9 x 10–6 (FNBMD) | 228520; 154780; 604841; 603932 | Fibrochondrogenesis 1; Marshall syndrome; Stickler syndrome, type II; {Lumbar disc herniation, susceptibility to} | |||||
| 4q22.1 | rs1054629 | 9.228E-10 | rs17711209 | 1.42E-06 | Duncan EL, rs1054627-G, 8 x 10–7 (femoral neck) | proximity to | ||||
| 11p11.2 | rs6485702 | 2.605E-07 | proximity to | 616304; 212780; 614305 | ?Myasthenic syndrome, congenital, 17; Cenani-Lenz syndactyly syndrome; Sclerosteosis 2 | |||||
| 11p11.2 | rs2070852 | 3.164E-08 | proximity to | 613679; 613679; 188050; 614390; 601367 | Dysprothrombinemia; Hypoprothrombinemia; Thrombophilia due to thrombin defect; {Pregnancy loss, recurrent, susceptibility to, 2}; {Stroke, ischemic, susceptibility to} | Cis-eQTL for | ||||
| 4p16.3 | rs6827815 | 3.966E-06 | rs6827815 | 3.13E-06 | Zhang L, rs6827815, 5 x 10–12 | upstream gene variant; Putative effect on regulation; Cis-eQTL for different genes | ||||
| 12q13.13 | rs11614913 | 6.981E-08 | rs11614913 | 8.92E-10 | proximity to | Cis-eQTL for | ||||
| 12q13.13 | rs10876528 | 1.226E-07 | rs894737 | 6.38E-10 | proximity to | Cis-eQTL for | ||||
| 12q13.13 | rs12300425 | 1.23E-06 | proximity to | Cis-eQTL for | ||||||
| 12q13.13 | rs11614913 | 6.981E-08 | rs11614913 | 8.92E-10 | proximity to | Cis-eQTL for |
The index SNP is defined as the SNP with the lowest association p-value with a given trait. The GWAS Catalog entry refers to results obtained from the NHGRI GWAS catalog upon entry of the given index SNP. Monogenic phenotypes are retrieved from OMIM. Empty cells for LSBMD and FNBMD p-values and SNP identifiers indicate that no SNP in the gene contained significant associations below any of the six murine candidate gene list-wise thresholds.
*LSBMD and FNBMD entries from the GWAS catalog represent summary estimates from the combined discovery and replication step.
SNiPA was used to retrieve cis-eQTL evidence from numerous tissues. Evidence is indicated when any tissue showed indication of an eQTL.
LSBMD = Lumbar spine bone mineral density; FNBMD = Femoral neck bone mineral density; OMIM = Online Mendelian Inheritance in Man database
Ontology terms and number of genes in each murine input gene list.
| Ontology term | MP identifier | Genes in list | Genes after filtering |
|---|---|---|---|
| abnormal skeleton physiology | MP:0005508 | 518 | 498 |
| abnormal skeleton morphology | MP:0001533 | 1292 | 1247 |
| abnormal skeleton development | MP:0002113 | 379 | 366 |
| abnormal bone mineralization | MP:0002896 | 134 | 128 |
| abnormal vertebrae morphology | MP:0000137 | 324 | 317 |
| abnormal long bone morphology | MP:0003723 | 184 | 180 |
| abnormal circulating glucose level | MP:0000188 | 560 | 543 |
| abnormal fasted circulating glucose level | MP:0013277 | 60 | 60 |
| decreased circulating glucose level | MP:0005560 | 324 | 315 |
| decreased fasted circulating glucose level | MP:0013278 | 21 | 21 |
| increased circulating glucose level | MP:0005559 | 272 | 263 |
| increased fasted circulating glucose level | MP:0013279 | 44 | 44 |
| abnormal circulating insulin level | MP:0001560 | 385 | 373 |
| abnormal insulin secretion | MP:0003564 | 147 | 144 |
| increased insulin secretion | MP:0003058 | 42 | 39 |
| decreased circulating insulin level | MP:0002727 | 240 | 234 |
| decreased insulin secretion | MP:0003059 | 110 | 109 |
| increased circulating insulin level | MP:0002079 | 163 | 157 |
| increased systemic arterial blood pressure | MP:0002842 | 131 | 128 |
| increased systemic arterial systolic blood pressure | MP:0006144 | 67 | 65 |
| decreased systemic arterial blood pressure | MP:0002843 | 92 | 87 |
| decreased systemic arterial systolic blood pressure | MP:0006264 | 39 | 36 |
| hyperglycemia | MP:0001559 | 99 | 93 |
| abnormal glucose tolerance | MP:0005291 | 406 | 394 |
For each list, <5% of genes were filtered, mostly because they mapped to human gonosomes (sex chromosomes), for which GWAS summary statistics were not available. Other reasons for filtering included ambiguous mapping and accounted for <1% of filtered genes for each list.
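The filtered share can be recomputed from the table. The sketch below does so for the six skeletal lists, assuming that the third and fourth table columns give the number of genes before and after filtering; that reading of the columns, and all names in the code, are assumptions.

```python
# (genes before filtering, genes after filtering), copied from the table above;
# the before/after interpretation of the two columns is an assumption.
skeletal_lists = {
    "abnormal skeleton physiology": (518, 498),
    "abnormal skeleton morphology": (1292, 1247),
    "abnormal skeleton development": (379, 366),
    "abnormal bone mineralization": (134, 128),
    "abnormal vertebrae morphology": (324, 317),
    "abnormal long bone morphology": (184, 180),
}


def filtered_fraction(n_before, n_after):
    """Share of genes removed by filtering, as a fraction of the original list."""
    if n_before <= 0 or not 0 <= n_after <= n_before:
        raise ValueError("expected 0 <= n_after <= n_before and n_before > 0")
    return (n_before - n_after) / n_before


fractions = {
    term: filtered_fraction(before, after)
    for term, (before, after) in skeletal_lists.items()
}
for term, frac in fractions.items():
    print(f"{term}: {frac:.1%} filtered")
```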