Sophie Hackinger, Thirsa Kraaijenbrink, Yali Xue, Massimo Mezzavilla, George van Driem, Mark A Jobling, Peter de Knijff, Chris Tyler-Smith, Qasim Ayub.
Abstract
High-altitude adaptation in Tibetans is influenced by introgression of a 32.7-kb haplotype from the Denisovans, an extinct branch of archaic humans. The haplotype lies within the endothelial PAS domain protein 1 gene (EPAS1) and has also been reported in Sherpa. We genotyped 19 variants in this genomic region in 1507 Eurasian individuals, including 1188 from Bhutan and Nepal residing at altitudes between 86 and 4550 m above sea level. Derived alleles for five SNPs characterizing the core Denisovan haplotype (AGGAA) were present at high frequency not only in Tibetans and Sherpa, but also among many populations from the Himalayas, showing a significant correlation with altitude (Spearman's correlation coefficient = 0.75, p value = 3.9 × 10⁻¹¹). Seven East- and South-Asian 1000 Genomes Project individuals shared the Denisovan haplotype extending beyond the 32-kb region, enabling us to refine the haplotype structure and identify a candidate regulatory variant (rs370299814) that might be interacting in an additive manner with the derived G allele of rs150877473, the variant previously associated with high-altitude adaptation in Tibetans. Denisovan-derived alleles were also observed at frequencies of 3–14% in the 1000 Genomes Project African samples. The closest African haplotype is, however, separated from the Asian high-altitude haplotype by 22 mutations, whereas only three mutations, including rs150877473, separate the Asians from the Denisovan, consistent with distant shared ancestry for African and Asian haplotypes and Denisovan adaptive introgression.
Year: 2016 PMID: 26883865 PMCID: PMC4796332 DOI: 10.1007/s00439-016-1641-2
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Fig. 1 Core Denisovan haplotype frequency in (a) Eurasia, (b) Nepal and (c) Bhutan. Circles represent populations residing at various altitudes, indicated by colours as shown in the legend key. The circle areas are proportional to population sizes. Derived alleles for five SNPs (rs115321619, rs73926263, rs73926264, rs73926265, rs55981512) constitute the core Denisovan haplotype (AGGAA) and its frequency is shown as a black pie-slice inside each coloured circle. Three-letter population codes are defined in Table S1. The maps were obtained from d-map.com using the URLs given in each panel.
Fig. 2 Median-joining network of genotyped intronic SNPs in EPAS1. The median-joining network was constructed using phased haplotypes of frequency >1. Each coloured circle represents a haplotype and its area is proportional to frequency. Archaic haplotypes are shown in black (Denisovan) and yellow (Neanderthal); grey circles represent chromosomes with an extended 19 SNP haplotype in which the five-SNP core Denisovan haplotype is present, whereas those in orange lack this five-SNP haplotype. Lines between haplotypes indicate mutational distance. The haplotypes are labelled as listed in Table S5. Haplotype 3 is common in Tibetans and is separated by six mutations from haplotype 57, which does not contain the core Denisovan haplotype.
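The "mutational distance" drawn between nodes of such a network can be thought of, in its simplest form, as the number of SNP positions at which two phased haplotypes differ. The sketch below illustrates only that counting step, not the median-joining algorithm itself. The function name `mutational_distance` and the comparison haplotype `GGGAC` are hypothetical and chosen for illustration; only the core haplotype string `AGGAA` comes from the text above.

```python
def mutational_distance(hap_a: str, hap_b: str) -> int:
    """Count the positions at which two haplotypes carry different alleles.

    Both haplotypes must list alleles for the same SNPs in the same order.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same number of SNPs")
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)


# Core Denisovan haplotype as given in the abstract.
core_denisovan = "AGGAA"

# Hypothetical comparison haplotype, invented for illustration only;
# it is NOT taken from the study's data.
hypothetical_other = "GGGAC"

distance = mutational_distance(core_denisovan, hypothetical_other)
print(f"Positions that differ: {distance}")
```

A real network would be built from all 19 genotyped SNPs (or from sequence data) with dedicated software; this count is only the pairwise ingredient.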
Fig. 3 Correlation between altitude and core Denisovan haplotype frequency. Each circle represents a population sample genotyped in this study.
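As a rough illustration of the statistic behind this figure, Spearman's coefficient is the Pearson correlation of the ranks of the two variables, here altitude and haplotype frequency per population. The sketch below computes it with the standard library only. The helper names and all numeric values are assumptions made up for illustration; they are not the study's data and will not reproduce the reported coefficient of 0.75.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up per-population values for illustration only (NOT study data).
altitude_m = [86, 900, 1500, 2400, 3100, 4550]
core_haplotype_freq = [0.00, 0.02, 0.01, 0.10, 0.25, 0.40]

rho = spearman_rho(altitude_m, core_haplotype_freq)
print(f"Spearman rho = {rho:.3f}")
```

Because the coefficient depends only on ranks, it captures a monotonic increase of frequency with altitude without assuming the relationship is linear. A p value, as reported in the abstract, would require an additional significance test not shown here.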
Fig. 4 ENCODE regulatory features in the region surrounding the core Denisovan haplotype. (a) A 136,701-bp region on chromosome 2 (46,561,000–46,697,700 in GRCh37) that spans the EPAS1 and TMEM247 genes, showing the SNPs genotyped in this study (vertical lines, top row), the copy number deletion frequent in Tibetans (TED block) and variants that differ between the human reference sequence (hg19) and the Denisovan, an archaic hominin. The region containing the five SNPs (rs115321619, rs73926263, rs73926264, rs73926265, rs55981512) that constitute the core Denisovan haplotype is highlighted in blue. The GENCODE (Version 19) transcript annotation is shown and the two EPAS1 transcripts expressed in lung tissues are shown in red. The TED is located ~80 kb downstream in another gene (TMEM247), in a region that has low conservation GERP scores and no ENCODE-annotated regulatory feature in the human lung fibroblast cell line (NHLF). (b) An expanded view of the upstream ~40-kb region (demarcated by the black box in panel a). The EPAS1 genotyped SNPs track shows the two plausible candidate regulatory variants (vertical red lines) that are located upstream of the core Denisovan haplotype (highlighted in blue). Both candidate variants are located in an evolutionarily conserved region of open chromatin, as depicted by the DNase I Hypersensitivity Clusters in the 125 cell line track, and show ENCODE chromatin state segmentation associated with an active promoter site in a human lung fibroblast cell line (NHLF). One of the SNPs (rs370299814) is associated with CTCF binding in many cell lines (such as the chronic myelogenous leukemia K562 cell line shown here) and enhancer-1-associated marks in NHLF. The lower part of this panel shows a median-joining haplotype network for this region in the 1000 Genomes Project African and Asian samples, with circles representing haplotypes with areas proportional to frequency. Population origins are indicated by node colours as shown in the legend. Phased low-coverage sequences generated by Phase 3 of the 1000 Genomes Project were used for network construction along with the high-coverage Denisovan and Neanderthal sequences. Only African and Asian samples containing derived alleles for the core Denisovan haplotype were used in the network construction. The arrow indicates the branch leading towards the samples with the introgressed haplotype and the location of the candidate regulatory mutation (rs370299814) on this branch. Three mutations separate the common Asian high-altitude haplotype from the Denisovan, whereas the closest African haplotype is separated from this Asian haplotype by 22 mutations.