Amelia D Wallace1, George A Wendt2, Lisa F Barcellos1, Adam J de Smith3, Kyle M Walsh4, Catherine Metayer1, Joseph F Costello5, Joseph L Wiemels3,5, Stephen S Francis2,3.
Abstract
Approximately 8% of the human genome consists of endogenous retroviral insertions (ERVs) originating from historic retroviral integration into germ cells. The function of ERVs as regulators of gene expression is well established. Less well studied are insertional polymorphisms of ERVs and their contribution to the heritability of complex phenotypes. The most recent integration of ERV, HERV-K, is expressed in a range of complex human conditions from cancer to neurologic diseases. Using an in-house computational pipeline and whole-genome sequencing data from the diverse 1,000 Genomes Phase 3 population (n = 2,504), we identified 46 polymorphic HERV-K insertions that are tagged by adjacent single nucleotide polymorphisms (SNPs). To test the potential role of polymorphic HERV-K in the heritability of complex diseases, existing databases were queried for enrichment of established relationships between the HERV-K insertion-associated SNPs (hiSNPs) and tissue-specific gene expression and disease phenotypes. Overall, hiSNPs for the 46 polymorphic HERV-K sites were statistically enriched (p < 1.0E-16) for eQTLs across 44 human tissues. Fifteen of the 46 HERV-K insertions had hiSNPs annotated in the EMBL-EBI GWAS Catalog and cumulatively associated with >100 phenotypes. Experimental factor ontology enrichment analysis suggests that polymorphic HERV-K specifically contribute to neurologic and immunologic disease phenotypes, including traits related to intracranial volume (FDR 2.00E-09), Parkinson's disease (FDR 1.80E-09), and autoimmune diseases (FDR 1.80E-09). These results provide strong candidates for context-specific study of polymorphic HERV-K insertions in disease-related traits, serving as a roadmap for future studies of the heritability of complex disease.
Keywords: GWAS; HERV-K; eQTL; polymorphism; recombination
Year: 2018 PMID: 30154825 PMCID: PMC6102640 DOI: 10.3389/fgene.2018.00298
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
46 polymorphic HERV-K insertions with hiSNPs.
| HERV-K insertion | Prev | Count | HERV-K insertion | Prev | Count |
| chr3:14132679 | 0.96 | 453 | chr1:106015875 | 0.04 | 436 |
| chr3:125609298 | 0.54 | 194 | chr1:111802591 | 0.59 | 234 |
| chr3:129776131 | 0.47 | 203 | chr1:223578304 | 0.01 | 206 |
| chr3:195654395 | 0.96 | 205 | chr4:9603240 | 0.67 | 969 |
| chr4:120263688 | 0.68 | 1416 | chr4:9981605 | 0.02 | 650 |
| chr5:8937848 | 0.83 | 345 | chr5:4537604 | 0.01 | 176 |
| chr6:32505702 | 0.13 | 1817 | chr5:64388440 | 0.07 | 226 |
| chr6:32746812 | 0.08 | 222 | chr5:80442266 | 0.05 | 49 |
| chr7:16237347 | 0.81 | 259 | chr6:32648036 | 0.35 | 4965 |
| chr7:158029477 | 0.28 | 239 | chr6:161270899 | 0.84 | 576 |
| chr8:7355392 | 0.14 | 108 | chr7:158773385 | 0.01 | 102 |
| chr8:18651453 | 0.52 | 199 | chr11:60449890 | 0.07 | 292 |
| chr8:37050885 | 0.32 | 125 | chr12:44313657 | 0.27 | 593 |
| chr10:135355522 | 0.18 | 155 | chr12:124066477 | 0.13 | 444 |
| chr11:71478951 | 0.82 | 379 | chr13:90743183 | 0.12 | 292 |
| chr11:71875417 | 0.88 | 117 | chr15:63374594 | 0.68 | 238 |
| chr12:55727210 | 0.76 | 385 | chr19:21841536 | 0.20 | 613 |
| chr14:20552746 | 0.30 | 144 | chr19:22414379 | 0.43 | 993 |
| chr17:44361947 | 0.17 | 2300 | chr19:22457244 | 0.01 | 907 |
| chr19:386675 | 0.10 | 19 | chr19:29855781 | 0.55 | 536 |
| chr19:52924209 | 0.39 | 37 | chr19:57996939 | 0.02 | 191 |
| chr20:25215439 | 0.84 | 93 | chr20:12402387 | 0.03 | 271 |
| chr21:15654234 | 0.65 | 19 | chrX:93606603 | 0.02 | 186 |
Not previously recognized as polymorphic.
Prev, prevalence averaged across 5 super populations.
Counts averaged across 5 super populations.
Figure 1. Manhattan plot. Results from a genome-wide association study for a polymorphic HERV-K insertion discovered with HERVnGoSeq at chr1:111802591. The gray horizontal line represents the threshold for genome-wide significance of 5 × 10⁻⁸. Logistic regression was performed separately within each of the five super populations in the 1,000 Genomes. AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.
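The threshold in the Figure 1 caption can be related to the vertical axis of the plot with a short sketch. It assumes the usual Manhattan-plot convention of plotting −log10(p). The function names, the dictionary layout, and the SNP labels are hypothetical and are not taken from HERVnGoSeq.

```python
import math

# Genome-wide significance threshold stated in the Figure 1 caption.
GENOME_WIDE_ALPHA = 5e-8


def manhattan_height(p_value):
    """Height of a point on a Manhattan plot, assuming a -log10(p) axis."""
    return -math.log10(p_value)


def significant_snps(results, alpha=GENOME_WIDE_ALPHA):
    """Keep, per super population, the SNP labels with p strictly below alpha.

    `results` is a hypothetical layout: {population: [(snp_label, p_value), ...]}.
    """
    return {
        population: [snp for snp, p in rows if p < alpha]
        for population, rows in results.items()
    }


if __name__ == "__main__":
    # Made-up p-values for illustration only; not data from the study.
    toy_results = {
        "EUR": [("snp_a", 1e-12), ("snp_b", 1e-3)],
        "AFR": [("snp_a", 4.9e-8), ("snp_c", 5e-8)],
    }
    print(manhattan_height(GENOME_WIDE_ALPHA))
    print(significant_snps(toy_results))
```

Keeping the results separated by super population mirrors the caption, which states that the regression was run within each population rather than on the pooled sample.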
Figure 2. (A) Ideogram. Relative genomic locations of HERV-K insertions with (green) and without (blue) hiSNPs identified via HERVnGoSeq. (B) Histogram. Frequency distribution of identified HERV-K insertion prevalences for insertions with (green) and without (blue) hiSNPs.
Figure 3. Heat map of log-transformed p-values for enrichment of hiSNPs that are eQTLs. Includes the 30 polymorphic HERV-K insertion sites with strong SNP associations in the European continental population. Enrichment of SNPs that are both hiSNPs and eQTLs was determined using Fisher's exact test. Results are stratified across 44 human tissue types.
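The Fisher's exact test named in the Figure 3 caption can be illustrated on a single 2 × 2 table (hiSNP or not, eQTL or not) for one tissue. This is a minimal sketch, assuming a one-sided test for over-representation; the caption does not say whether the authors used a one-sided or two-sided alternative, and the function name and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(both, hisnp_only, eqtl_only, neither):
    """One-sided Fisher's exact p-value for over-representation.

    Arguments are the four cells of a 2 x 2 table:
      both        SNPs that are hiSNPs and eQTLs
      hisnp_only  hiSNPs that are not eQTLs
      eqtl_only   eQTLs that are not hiSNPs
      neither     SNPs that are neither
    """
    n_hisnp = both + hisnp_only
    n_eqtl = both + eqtl_only
    total = both + hisnp_only + eqtl_only + neither

    # Number of ways to choose the hiSNP set from all SNPs.
    denominator = comb(total, n_hisnp)

    # Sum hypergeometric terms for overlaps at least as large as observed.
    largest_overlap = min(n_hisnp, n_eqtl)
    tail = sum(
        comb(n_eqtl, k) * comb(total - n_eqtl, n_hisnp - k)
        for k in range(both, largest_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the study.
    print(fisher_enrichment_p(both=3, hisnp_only=1, eqtl_only=1, neither=3))
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for very small tail probabilities until the final division. A per-tissue heat map such as the one described would repeat this calculation once per insertion and tissue.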
Figure 4. Manhattan plots. Plots of hiSNP sets for 15 HERV-K insertion sites among the 30 polymorphic HERV-K insertion sites with strong SNP associations in the European continental population. Vertical black lines denote HERV-K insertion locations. hiSNPs that are annotated in the NHGRI-EBI GWAS Catalog are represented by red points.
HERV-K insertions with hiSNPs annotated in NHGRI-EBI GWAS Catalog and associated traits.
| HERV-K insertion | Trait | hiSNP (OR/β) | PMID |
| chr1:111802591 | Interferon alpha levels in systemic lupus erythematosus | rs7411387 | 25338677 |
| chr4:120263688 | Corneal astigmatism | rs11098499 | 23322567 |
| | Educational attainment | rs10028773 | 25201988 |
| chr5:64388440 | Schizophrenia | rs17206232 | |
| chr5:8937853 | Obesity-related traits | rs11134338 | 23251661 |
| chr6:161270898 | Lipoprotein (a) - cholesterol levels | rs1620921 | 25575512 |
| | Lipoprotein (a) levels | rs9355814 (β = 0.33), rs783147 (β = 0.3) | 26377243, 21900290 |
| | Protein quantitative trait loci | rs7770628 | 18464913 |
| chr10:135355522 | Obesity-related traits | rs2249694 (β = 0.03) | 23251661 |
| chr12:124066477 | Pubertal anthropometrics | rs786425 | 23449627 |
| chr12:55727213 | Contrast sensitivity | rs12230513 | 24152035 |
| chr13:90743183 | Longevity | rs2882281 | 20834067 |
| chr15:63374594 | Blood metabolite levels | rs1472631 (β = 0.037) | 24816252 |
| | Mean platelet volume | rs11071720 (β = 0.061) | 19820697 |
| | Metabolic traits | rs2652822 (β = 0.085) | 21886157 |
| | Platelet count | rs3809566 (β = 2.43 × 10⁻⁹) | 22139419 |
| | Social communication problems | rs17828380 (β = 0.18) | 24564958 |
| chr17:44361947 | Bone mineral density | rs1864325 (β = 0.04) | 22504420 |
| | Corticobasal degeneration | rs12185268 (OR = 1.3) | 26077951 |
| | Epithelial ovarian cancer | rs183211 (OR = 1.11) | 25581431 |
| | Idiopathic pulmonary fibrosis | rs17690703 (OR = 1.43) | 24429156 |
| | Interstitial lung disease | rs1981997 (OR = 1.41) | 23583980 |
| | Intracranial volume | rs9303525 (β = 14.9) | 22504418 |
| | Male-pattern baldness | rs12373124 (OR = 1.33) | 22693459 |
| | Ovarian cancer in BRCA1 mutation carriers | rs183211 (OR = 1.25) | 23544013 |
| | Parkinson's disease | rs12185268 (OR = 3.46), rs17577094 (OR = 1.56), rs17649553 (OR = 1.3), rs183211, rs199515 (OR = 1.32), rs199533 (OR = 1.28), rs415430, rs8070723 (OR = 1.3) | 21738487, 24842889, 25064009, 21812969, 22451204, 20711177, 21812969, 21044948 |
| | Progressive supranuclear palsy | rs8070723 (OR = 5.11) | 21685912 |
| | Subcortical brain region volumes | rs17689882 (β = 13460.47), rs8072451 (β = 14489.99) | 25607358, 25607350 |
| chr19:22414379 | Body mass index (change over time) | rs8105895 | 25378290 |
| | Chagas cardiomyopathy in Trypanosoma cruzi seropositivity | rs2262909 | 24324551 |
| | Dental caries | rs10404998 | 23259602, 23259602, 23064961 |
| | Response to statin therapy (LDL-C) | rs931608 | 22331829 |
| | Telomere length | rs1975174 | 20421499, 23001564 |
| chr6:32505702 | Cervical cancer | rs9272143 (OR = 1.49) | 28806749 |
| | Hepatitis B vaccine response | rs3135363 (OR = 1.53) | 24282030 |
| | Hepatitis C induced liver cirrhosis | rs3135363 (OR = 1.37) | 23321320 |
| | Hepatocellular carcinoma | rs9272105 (OR = 1.28) | 22807686 |
| | Leishmaniasis (visceral) | rs9271858 (OR = 1.41) | 23291585 |
| | Response to interferon beta therapy | rs9272105 (β = 0.026) | 21502966 |
| | Rheumatoid arthritis | rs2157337 | 21156761 |
| | Systemic sclerosis | rs3129763 (OR = 1.65) | 21779181 |
| chr6:32648036 | Alzheimer's disease (late onset) | rs9271192 (OR = 1.1) | 24162737 |
| | Antinuclear antibody levels | rs2395185 | 25186300 |
| | Arthritis (juvenile idiopathic) | rs2395148 (OR = 5.37) | 18576341 |
| | Asthma | rs3117098 (OR = 1.16), rs7775228 | 21804548, 21804548, 23028483, 29273806 |
| | Asthma and hay fever | rs9273373 | 24388013 |
| | Atopic dermatitis | rs9469099 (OR = 1.61) | 23042114 |
| | Chronic lymphocytic leukemia | rs674313 (OR = 1.69) | 21131588 |
| | Circulating myeloperoxidase levels (serum) | rs3134931 (β = 0.05) | 23620142 |
| | Cystic fibrosis severity | rs9268905 | 21602797 |
| | Dementia and core Alzheimer's disease neuropathologic changes | rs7453498 | 25188341 |
| | Epstein-Barr virus immune response (EBNA-1) | rs477515 | 23326239 |
| | Follicular lymphoma | rs12195582 (OR = 1.78), rs2647012 | 25279986, 21533074 |
| | Hepatitis B vaccine response | rs477515 | 24282030 |
| | Hepatitis C induced liver cirrhosis | rs3817963 | 23321320 |
| | Hepatocellular carcinoma (hepatitis B virus related) | rs9275319 | 23242368 |
| | Hodgkin's lymphoma | rs2395185 | 22286212, 24920014 |
| | Hypothyroidism | rs3129720 (OR = 1.16) | 22493691 |
| | IgA nephropathy | rs2856717 | 25305756, 26028593, 25305756, 25305756 |
| | IgE grass sensitization | rs7775228 | 22036096 |
| | Inflammatory bowel disease | rs477515 | 18758464 |
| | Leprosy | rs9271100 (OR = 1.68) | 25642632 |
| | Lung adenocarcinoma | rs3817963 | 22797724 |
| | Lung cancer | rs2395185 | 23143601 |
| | Lupus nephritis in systemic lupus erythematosus | rs2647012 | 24925725 |
| | Lymphoma | rs2647045 | 23349640, 23349640, 23349640 |
| | Multiple sclerosis (OCB status) | rs3129720 (OR = 1.91), rs3817963 | 23472185, 23472185, 23472185 |
| | Narcolepsy (age of onset) | rs7744020 | 24204295 |
| | Nasopharyngeal carcinoma | rs28421666 | 20512145 |
| | Nephropathy | rs9275596 | 21399633 |
| | Neurofibrillary tangles | rs34075049 | 25188341 |
| | Parkinson's disease | rs2395163 (OR = 1.24), rs9275326 (OR = 1.18) | 22451204, 28892059 |
| | Peanut allergy | rs9275596 | 25710614 |
| | Primary biliary cirrhosis | rs7774434 | 22961000 |
| | Rheumatoid arthritis | rs12194148 | 21156761, 24782177, 17804836, 24782177, 24390342, 23918589 |
| | Sarcoidosis | rs2076530 | 22936702 |
| | Schizophrenia | rs9274623 (OR = 1.14) | 26198764 |
| | Sjogren's syndrome | rs9271588 | 24097066 |
| | Systemic lupus erythematosus | rs2647012 | 21408207, 19838193 |
| | Systemic sclerosis | rs9275390 (OR = 2.38) | 21779181 |
| | Type 1 diabetes | rs9272346 | 17554300 |
| | Ulcerative colitis | rs1063355 | 24837172, 20228799, 23128233, 19915573, 23511034, 18836448, 20228798 |
| | Vitiligo | rs3806156 | 20410501 |
| | Waist-hip ratio | rs2076529 (β = 0.02) | 20935629 |
| | Waist-to-hip ratio adjusted for body mass index | rs7759742 | 28443625 |
| chr6:32746812 | Kawasaki disease | rs2857151 (OR = 1.47) | 22446962 |
* denotes strength of HERV-K::hiSNP association; all p-values are below the Bonferroni threshold.
p < 1.0e−10,
p < 1.0e−15,
p < 1.0e−20. OR/β value from SNP: Phenotype association.
PMID, PubMed ID for original research article in GWAS Catalog describing the SNP-phenotype association.
Figure 5. Mean distance to nearest recombination hotspot. Distances indicated for polymorphic HERV-K insertions with and without hiSNPs (dashed lines) and the distribution of mean distances of random genomic locations matched to HERV-K insertions on proximal GC content and chromosome. Distributions were derived from 1000 repeated random samples with replacement. (A) Distances from nearest ChIP-seq-based recombination hotspot, (B) Distances from nearest LD-based recombination hotspot.
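The resampling comparison described in the Figure 5 caption can be sketched as follows. This is a simplified illustration, assuming positions on a single chromosome and a pre-built pool of background positions. The matching on GC content and chromosome mentioned in the caption is not implemented here, and all function and parameter names are hypothetical.

```python
import bisect
import random


def nearest_distance(position, sorted_hotspots):
    """Distance from a position to the closest hotspot (non-empty sorted list)."""
    i = bisect.bisect_left(sorted_hotspots, position)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = sorted_hotspots[max(i - 1, 0): i + 1]
    return min(abs(position - hotspot) for hotspot in neighbours)


def mean_nearest_distance(positions, sorted_hotspots):
    """Mean of the nearest-hotspot distances over a set of positions."""
    distances = [nearest_distance(p, sorted_hotspots) for p in positions]
    return sum(distances) / len(distances)


def resampled_null(insertions, background, sorted_hotspots, n_draws=1000, seed=0):
    """Compare the observed mean distance with resampled background means.

    Returns the observed mean and the fraction of draws (with an add-one
    correction) whose mean distance is less than or equal to the observed one.
    """
    rng = random.Random(seed)
    observed = mean_nearest_distance(insertions, sorted_hotspots)
    at_most_observed = 0
    for _ in range(n_draws):
        # Sample background positions with replacement, as in the caption.
        draw = rng.choices(background, k=len(insertions))
        if mean_nearest_distance(draw, sorted_hotspots) <= observed:
            at_most_observed += 1
    return observed, (at_most_observed + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Made-up coordinates for illustration only; not data from the study.
    hotspots = sorted([1_000, 50_000, 120_000])
    insertions = [1_200, 49_000, 118_500]
    background = list(range(0, 150_000, 500))
    print(resampled_null(insertions, background, hotspots))
```

The binary search keeps each lookup logarithmic in the number of hotspots, which matters when the null distribution requires many thousands of distance calculations.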