Ana Jeroncic, Yasin Memari, Graham RS Ritchie, Audrey E Hendricks, Anja Kolb-Kokocinski, Angela Matchan, Veronique Vitart, Caroline Hayward, Ivana Kolcic, Dominik Glodzik, Alan F Wright, Igor Rudan, Harry Campbell, Richard Durbin, Ozren Polašek, Eleftheria Zeggini, Vesna Boraska Perica.
Abstract
We whole-exome sequenced 176 individuals from the isolated population of the island of Vis in Croatia to describe the architecture of exonic variation. We found 290 577 single nucleotide variants (SNVs), 65% of which are singletons, low frequency or rare variants. A total of 25 430 (9%) SNVs are novel, not previously catalogued in NHLBI GO Exome Sequencing Project, UK10K-Generation Scotland, 1000Genomes Project, ExAC or NCBI Reference Assembly dbSNP. The majority of these variants (76%) are singletons. Comparable to data obtained from UK10K-Generation Scotland that were sequenced and analysed using the same protocols, we detected an enrichment of potentially damaging variants (non-synonymous and loss-of-function) in the low frequency and common variant categories. On average, 115 (range 93-140) genotypes with loss-of-function variants, 23 (15-34) of which were homozygous, were identified per person. The landscape of loss-of-function variants across the exome revealed that variants mainly accumulated in genes on the xenobiotic-related pathways, the majority of which coded for enzymes. The frequency of loss-of-function variants was additionally increased in regions of runs of homozygosity in Vis, where variants mainly affected signalling pathways. This work confirms the isolate status of the Vis population by means of whole-exome sequencing and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species. By cataloguing the exomic variants and describing the allelic structure of the Vis population, this study will serve as a valuable resource for future genetic studies of human diseases, population genetics and evolution in this population.
Year: 2016 PMID: 27049301 PMCID: PMC4950961 DOI: 10.1038/ejhg.2016.23
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
The count of all variants, and of completely novel variants, in each functional effect group, separated by allele frequency categories. The first numeric column is the total; the remaining seven columns are the allele frequency categories.

All variants

| Functional effect | Total | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Loss_of_function | 1775 | 892 | 219 | 100 | 152 | 93 | 79 | 240 |
| Non_synonymous | 60 011 | 22 939 | 6988 | 3626 | 6119 | 4834 | 3944 | 11 561 |
| Splice_region | 8748 | 2657 | 886 | 468 | 892 | 791 | 689 | 2365 |
| Synonymous | 44 582 | 12 891 | 4292 | 2501 | 4491 | 4111 | 3658 | 12 638 |
| UTR | 17 165 | 4945 | 1792 | 1073 | 1865 | 1585 | 1438 | 4467 |
| ncRNA | 23 033 | 6292 | 2210 | 1256 | 2274 | 2190 | 2145 | 6666 |
| Intronic | 131 531 | 35 463 | 13 016 | 7519 | 13 797 | 12 426 | 11 828 | 37 482 |
| Upstream | 2056 | 506 | 206 | 118 | 204 | 181 | 209 | 632 |
| Downstream | 1142 | 265 | 108 | 63 | 105 | 114 | 131 | 356 |
| Regulatory | 73 | 13 | 8 | 5 | 8 | 5 | 11 | 23 |
| Intergenic | 461 | 106 | 28 | 18 | 43 | 38 | 50 | 178 |
| Total | 290 577 | 86 969 | 29 753 | 16 747 | 29 950 | 26 368 | 24 182 | 76 608 |

Completely novel variants

| Functional effect | Total | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Loss_of_function | 269 | 230 | 27 | 7 | 5 | 0 | 0 | 0 |
| Non_synonymous | 4783 | 3978 | 515 | 167 | 116 | 3 | 1 | 3 |
| Splice_region | 552 | 455 | 60 | 17 | 19 | 1 | 0 | 0 |
| Synonymous | 2141 | 1738 | 250 | 85 | 57 | 5 | 1 | 5 |
| UTR | 1702 | 1312 | 240 | 81 | 59 | 5 | 0 | 5 |
| ncRNA | 2273 | 1772 | 305 | 109 | 71 | 8 | 3 | 5 |
| Intronic | 13 308 | 10 328 | 1890 | 604 | 427 | 30 | 11 | 18 |
| Upstream | 230 | 175 | 32 | 7 | 9 | 1 | 0 | 6 |
| Downstream | 120 | 86 | 19 | 8 | 6 | 0 | 0 | 1 |
| Regulatory | 9 | 6 | 3 | 0 | 0 | 0 | 0 | 0 |
| Intergenic | 44 | 38 | 3 | 2 | 0 | 1 | 0 | 0 |
| Total | 25 431 | 20 118 | 3344 | 1087 | 769 | 54 | 16 | 43 |
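The two "Total" rows of the table above can be cross-checked against the percentages quoted in the abstract. The following is a minimal sketch, not part of the original analysis: the variable and function names are hypothetical, and it assumes that the first five frequency columns together correspond to the "singletons, low frequency or rare" variants mentioned in the abstract.

```python
# Counts copied from the two "Total" rows of the table above.
# First value: overall total; list: the seven allele frequency categories.
all_total = 290_577
all_by_category = [86_969, 29_753, 16_747, 29_950, 26_368, 24_182, 76_608]
novel_total = 25_431
novel_by_category = [20_118, 3_344, 1_087, 769, 54, 16, 43]


def check_row(total, by_category):
    """Return True when the category counts add up to the stated total."""
    return sum(by_category) == total


# Assumption (not stated in the table): the first five categories are the
# singleton, rare and low frequency bins; the last two are common variants.
rare_share = sum(all_by_category[:5]) / all_total

# Share of all SNVs that are completely novel.
novel_share = novel_total / all_total

print(check_row(all_total, all_by_category))
print(check_row(novel_total, novel_by_category))
print(f"{rare_share:.1%} {novel_share:.1%}")
```

Under that assumption the first share rounds to the 65% quoted in the abstract, and the novel share rounds to 9%.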
Figure 1. Proportion of functional effects by allele frequency categories.
Figure 2. Proportion of functional effects of completely novel variants by MAF.
Genes with LoF variants — summary of predicted gene product function and location using gene ontology terms and pathway analysis
| Pathway | Source | Pathway size | Genes with LoF variants (%) | P | q | Category |
|---|---|---|---|---|---|---|
| CYP2E1 reactions | Reactome | 11 | 6 (54.5%) | 0.00004 | 0.0518 | Xenobiotics related |
| Leukotriene metabolism | EHMN | 104 | 19 (18.3%) | 9.89 × 10⁻⁵ | 0.0518 | Lipid metabolism |
| Galactose metabolism | KEGG | 30 | 9 (30.0%) | 0.000148 | 0.0518 | Other |
| Tryptophan degradation | INOH | 66 | 14 (21.2%) | 0.000158 | 0.0518 | Xenobiotics related |
| Metabolism of xenobiotics by cytochrome P450 | KEGG | 74 | 15 (20.3%) | 0.000159 | 0.0518 | Xenobiotics related |
| Androgen and oestrogen biosynthesis and metabolism | EHMN | 87 | 16 (18.4%) | 0.000318 | 0.0789 | Lipid metabolism |
| Fatty acids | Reactome | 15 | 6 (40.0%) | 0.000339 | 0.0789 | Lipid metabolism |
| Xenobiotics | Reactome | 21 | 7 (33.3%) | 0.000398 | 0.0803 | Xenobiotics related |
| Chemical carcinogenesis | KEGG | 81 | 15 (18.5%) | 0.000448 | 0.0803 | Xenobiotics related |
| C21-steroid hormone biosynthesis and metabolism | EHMN | 57 | 12 (21.1%) | 0.000493 | 0.0803 | Lipid metabolism |
Annotations are ordered by q-values.
The tryptophan degradation pathway is classified as xenobiotic-related because tryptophan metabolites are known to activate the aryl hydrocarbon receptor, a transcription factor known to mediate most of the toxic and carcinogenic effects of a wide variety of environmental contaminants.
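The percentage column and the stated ordering of the pathway table above can be verified directly from the listed numbers. This is a small consistency check only, with hypothetical names; it assumes the third column is the pathway size and the fourth is the number of member genes carrying LoF variants, and it does not reproduce the enrichment test that produced the P and q values.

```python
# (pathway, pathway size, genes with LoF variants, reported %, q-value),
# copied from the table above.
rows = [
    ("CYP2E1 reactions", 11, 6, 54.5, 0.0518),
    ("Leukotriene metabolism", 104, 19, 18.3, 0.0518),
    ("Galactose metabolism", 30, 9, 30.0, 0.0518),
    ("Tryptophan degradation", 66, 14, 21.2, 0.0518),
    ("Metabolism of xenobiotics by cytochrome P450", 74, 15, 20.3, 0.0518),
    ("Androgen and oestrogen biosynthesis and metabolism", 87, 16, 18.4, 0.0789),
    ("Fatty acids", 15, 6, 40.0, 0.0789),
    ("Xenobiotics", 21, 7, 33.3, 0.0803),
    ("Chemical carcinogenesis", 81, 15, 18.5, 0.0803),
    ("C21-steroid hormone biosynthesis and metabolism", 57, 12, 21.1, 0.0803),
]


def percentage_matches(size, hits, reported, tol=0.05):
    """True when hits/size agrees with the reported percentage to one decimal."""
    return abs(100 * hits / size - reported) <= tol


# Pathways whose reported percentage disagrees with hits / size.
mismatches = [name for name, size, hits, pct, _q in rows
              if not percentage_matches(size, hits, pct)]

# The footnote says annotations are ordered by q-value.
q_values = [q for *_rest, q in rows]
ordered_by_q = q_values == sorted(q_values)

print(mismatches, ordered_by_q)
```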
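Putative LoF variants (n=9) with extremely high variability among populations: Vis and 1000Genomes super populations EUR, ASN, AFR, AMR
| Chr | Position | rsID | | | | | | | | Functional annotation |
|---|---|---|---|---|---|---|---|---|---|---|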
| 1 | 27942176 | rs2231879 | T | C | 0.02 | 0.02 | 0.07 | 0.51 | — | Regulatory element ID: ENSR00001518649, regulatory_region_variant, — Gene ID: FGR, splice_acceptor_variant, nc_transcript_variant, 1; intron_variant, 6 |
| 5 | 111481696 | rs17134155 | C | T | 0.18 | 0.18 | 0.13 | 0.52 | 0.05 | Regulatory element ID: ENSR00001287518, regulatory_region_variant, — Gene ID: EPB41L4A, splice_acceptor_variant, nc_transcript_variant, 1 |
| 6 | 139576544 | rs41289819 | G | A | 0.13 | 0.16 | 0.14 | 0.53 | 0.02 | Gene ID: TXLNB, stop_gained:373:125, 1; intron_variant, 1 |
| 7 | 144364918 | rs67644764 | G | T | 0.06 | 0.05 | 0.11 | 0.61 | 0.002 | Gene ID: TPK1, stop_gained:71:24, 1; intron_variant, NMD_transcript_variant, 2; intron_variant, 3; intron_variant, nc_transcript_variant, 1; upstream_gene_variant, 2; synonymous_variant, NMD_transcript_variant:129:43:L>L, 1; 5_prime_UTR_variant, 1 |
| 16 | 66861836 | rs7195853 | G | A | 0.05 | 0.07 | 0.09 | 0.56 | 0.03 | Gene ID: NAE1, splice_donor_variant, NMD_transcript_variant, 1; intron_variant, NMD_transcript_variant, 3; intron_variant, 8; intron_variant, nc_transcript_variant, 4 |
| 16 | 90110950 | rs1048149 | C | T | 0.12 | 0.14 | 0.19 | 0.59 | 0.03 | Regulatory element ID: ENSR00000512444, regulatory_region_variant, — Gene ID: ENSG00000222019, stop_gained:68:23, 2; stop_gained,NMD_transcript_variant:68:23, 1; non_coding_exon_variant, nc_transcript_variant, 1 Gene ID: GAS8, 3_prime_UTR_variant, NMD_transcript_variant, 1; 3_prime_UTR_variant, 1; downstream_gene_variant, 5; non_coding_exon_variant,nc_transcript_variant, 1 |
| 17 | 72588806 | rs545652 | C | A | 0.11 | 0.14 | 0.19 | 0.52 | 0.04 | Gene ID: C17orf77, stop_gained:621:207, 2; downstream_gene_variant, 1 Gene ID: CD300LD, upstream_gene_variant, 1 |
| 22 | 42336172 | rs5758511 | G | A | 0.25 | 0.27 | 0.20 | 0.03 | 0.51 | Regulatory element ID: ENSR00000085774, regulatory_region_variant, — Gene ID: CENPM, stop_gained:7:3, 1; intron_variant, 5; downstream_gene_variant, 1 |
| X | 75004529 | rs1343879 | C | A | 0.02 | 0.03 | 0.24 | 0.05 | 0.91 | Gene ID: MAGEE2, stop_gained:358:120, 1 |
The genomic reference sequence used is GRCh37/hg19. Across populations, the allele frequency of each variant ranges from rare to that of the common major allele.
Called with the Ensembl Variant Effect Predictor v2.8 against Ensembl 70.
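One simple way to see how strongly these nine variants differ among populations is to take, for each variant, the difference between its highest and lowest listed allele frequency. The sketch below uses that spread purely as an illustration; it is not the criterion the authors used to select the variants, the names are hypothetical, and the five frequencies per variant are kept in table order without assigning them to specific populations. The missing value shown as a dash in the table is stored as `None`.

```python
# Allele frequencies copied from the table above, in column order.
freqs = {
    "rs2231879":  [0.02, 0.02, 0.07, 0.51, None],
    "rs17134155": [0.18, 0.18, 0.13, 0.52, 0.05],
    "rs41289819": [0.13, 0.16, 0.14, 0.53, 0.02],
    "rs67644764": [0.06, 0.05, 0.11, 0.61, 0.002],
    "rs7195853":  [0.05, 0.07, 0.09, 0.56, 0.03],
    "rs1048149":  [0.12, 0.14, 0.19, 0.59, 0.03],
    "rs545652":   [0.11, 0.14, 0.19, 0.52, 0.04],
    "rs5758511":  [0.25, 0.27, 0.20, 0.03, 0.51],
    "rs1343879":  [0.02, 0.03, 0.24, 0.05, 0.91],
}


def spread(values):
    """Highest minus lowest observed frequency, ignoring missing entries."""
    observed = [v for v in values if v is not None]
    return max(observed) - min(observed)


spreads = {rsid: spread(values) for rsid, values in freqs.items()}
most_variable = max(spreads, key=spreads.get)

# List variants from the largest to the smallest spread.
for rsid, s in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{rsid}\t{s:.2f}")
```

Every variant in the table shows a spread of more than 0.45 between its extreme populations, with the MAGEE2 variant rs1343879 showing the largest difference.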
Genes with LoF variants in ROH hotspots: summary of predicted gene product function and location using gene ontology terms and pathway analysis
| Pathway | Source | Pathway size | Genes with LoF variants (%) | P | q | Category |
|---|---|---|---|---|---|---|
| Allograft rejection | Wikipathways | 80 | 5 (6.2%) | 2.21 × 10⁻⁵ | 0.00225 | Immune response to allograft |
| Cytochrome P450—arranged by substrate type | Reactome | 61 | 4 (6.6%) | 0.00013 | 0.00662 | Xenobiotics metabolism |
| Phase 1—functionalization of compounds | Reactome | 79 | 4 (5.1%) | 0.000353 | 0.00882 | Xenobiotics metabolism |
| Olfactory signalling pathway | Reactome | 427 | 8 (1.9%) | 0.000401 | 0.00882 | Response to external signal |
| Warfarin pathway, pharmacokinetics | PharmGKB | 8 | 2 (25.0%) | 0.000496 | 0.00882 | Xenobiotics metabolism |
| Allograft rejection— | KEGG | 37 | 3 (8.1%) | 0.000519 | 0.00882 | Immune response to allograft |
Annotations are ordered by q-values.