Charles E Larsen, Dennis R Alford, Michael R Trautwein, Yanoh K Jalloh, Jennifer L Tarnacki, Sushruta K Kunnenkeri, Dolores A Fici, Edmond J Yunis, Zuheir L Awdeh, Chester A Alper.
Abstract
We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight "common" European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture as well as improved regional CEH markers, and they raise questions concerning regional recombination hotspots.
Year: 2014 PMID: 25299700 PMCID: PMC4191933 DOI: 10.1371/journal.pgen.1004637
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Table 1. MHP cell line and CEH allele-level typing in the core MHC region and HLA-DPB1.
| Conserved Ext. Haplotype | Number Sequenced |
| B7,DR15 | 23 |
| B18,DR15 | 5 |
| B8,DR3 | 30 |
| B18,DR3 | 18 |
| C4,B44,DR7 | 6 |
| C16,B44,DR7 | 4 |
| B57,DR7 | 4 |
| B44,DR4,DQ7 | 7 |
| B49,DR4,DQ8 | 1 |
| B44,DR4,DQ8 | 6 |
| B62,SC33,DR4,DQ8 | 8 |
| B38,SC21,DR4,DQ8 | 9 |
| B60,SC31,DR4,DQ8 | 10 |
| B62,SB42,DR4,DQ8 | 7 |
Shown are MHC alleles for the eight MHP cell lines and, underneath each, for the population CEH(s) that share HLA-DR-DQ specificities with them. Although a known CEH shares HLA-DR-DQ specificities with APD, that CEH does not share significant class II sequence similarity with APD and is not displayed. Genes are in chromosomal order from telomere to centromere, except that CFB and C2 are switched because complotype was historically defined in the order shown. HLA gene alleles are shown at the highest definition known up to 4-digit resolution. Alleles containing “/” indicate microvariation.
Abbreviations: UNK = Unknown (insufficient data).
This CEH has two possible HLA-C alleles: *04:01 and *04:09N [14], [16].
Figure 1. A map of the MHC class II and extended class II regions of chromosome 6p21.
Sequenced sub-regions are marked by colored blocks (top). Distances (kb) are to scale from the human reference sequence. Gene locations from HLA-DRA on the telomeric (T) end to DAXX on the centromeric (C) end are shown.
Figure 2. MHP sequences represent CEHs to variable extents in MHC class II from HLA-DQA2 to DAXX.
MHP cell line names (left) and their complotypes (when known) and some of their HLA alleles are above their corresponding lines. Shown is the region (solid horizontal line) in which the listed MHP sequence represented the dominant CEH sequence sharing MHC markers identical or similar to the cell line. Cell line specificities or alleles sometimes differed from the “represented” CEHs (see text). A dashed line indicates the region within which a break point between the shared identity of the MHP sequence and the dominant sequence occurred but could not be precisely localized. Icons at the ends of MANN, COX and SSTO show precise break points of shared sequence identity. The relative location of several class II genes (telomere toward centromere from left to right) is shown to scale on the abscissa.
Amplicon DOB7.5 SNPs determined from resequencing.
| Amplicon ID | Gene/Region | rs# | Location |
| DOB7.5-1 | | 2857101 | 32,794,676 |
| DOB7.5-2 | | 2857100 | 32,794,856 |
Figure 3. CEH sequence fixity and crossover frequencies from HLA-DQA2 to DAXX.
Chromosomal location is shown to scale on the abscissa and starts at the mid-point between HLA-DRB1 and HLA-DQB1 (A–C) or at HLA-DQB1 (D–O). The locations of several HLA class II and extended class II genes are marked by arrows below Figures 3A–C and 3O. The 11 regions analyzed for normalized crossover frequency (NCF) are enumerated in Figure 3D. The numbers of haplotypes analyzed for each CEH are given in Table 1. Sequence fixities (A) and NCFs (D–F) are shown for the CEHs B7,DR15 (black open circles), (D); B8,DR3 (green closed circles), (E); and B18,DR3 (red squares), (F). Sequence fixities (B) and NCFs (G–J) are shown for the CEHs C4,B44,DR7 (green closed circles), (G); C16,B44,DR7 (purple open circles), (H); B57,DR7 (red squares), (I); and B44,DR4,DQ7 (blue diamonds), (J). Asterisks (*) in Figures 3B, 3G and 3H indicate that sequence fixities and NCFs could not be determined centromeric to the last data points for the two B44,DR7 CEHs. Sequence fixities (C) and NCFs (K–O) for various DR4,DQ8 CEHs are shown. These include the CEHs B44,SC30/SC31 (black diamonds), (K); B62,SC33 (green closed circles), (L); B38,SC21 (purple open circles), (M); B60,SC31 (red squares), (N); and B62,SB42 (blue triangles), (O). NCFs are normalized to the remaining conserved sequences and to 1 Mb relative to the distance over which crossovers were observed, and values are displayed for 11 sub-regions (Table S2).
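The normalization described in the caption can be illustrated with a minimal sketch. It assumes one reading of that sentence: the number of crossovers observed in a sub-region is divided by the number of haplotypes still carrying the conserved sequence at the start of that sub-region, then scaled from the sub-region length to 1 Mb. The function name, parameter names and example numbers are hypothetical and are not taken from the paper or from Table S2.

```python
def normalized_crossover_frequency(crossovers, conserved_haplotypes, distance_kb):
    """Crossovers per remaining conserved haplotype, scaled to 1 Mb.

    Assumed reading of the caption, not the authors' published code:
      crossovers            observed breaks of the dominant sequence in the sub-region
      conserved_haplotypes  haplotypes still conserved on entering the sub-region
      distance_kb           length (kb) over which the crossovers were observed
    """
    if conserved_haplotypes <= 0:
        raise ValueError("conserved_haplotypes must be positive")
    if distance_kb <= 0:
        raise ValueError("distance_kb must be positive")
    per_haplotype = crossovers / conserved_haplotypes
    # 1 Mb = 1000 kb, so the per-haplotype rate is multiplied by 1000 / distance_kb.
    return per_haplotype * (1000.0 / distance_kb)


# Hypothetical example: 2 crossovers among 20 conserved haplotypes over 50 kb.
example_ncf = normalized_crossover_frequency(2, 20, 50.0)
```

Under this reading, a short sub-region with few crossovers can still yield a high value, which is why the scaling to a common 1 Mb distance matters when comparing the 11 sub-regions.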
Figure 4. Shared and divergent sequences in related CEHs.
A) A region of nearly identical sequence for the B8,DR3 and B18,DR3 CEHs was previously reported [19] and is represented by the broken-line rectangle; it ends just centromeric to MTC30P1, approximately 50 kb centromeric to HLA-DQB1 [18]. B) Shared sequence for the B7,DR15 and B18,DR15 CEHs is shown in the broken-line rectangle. Sequence identity for these two CEHs ends centromerically between introns 8 and 6 of TAP2. C) Shared and divergent sequences for four DR4,DQ8 CEHs are shown in the broken-line rectangle. HLA-B*15:01 and HLA-B*40:01 are alleles of the B62 and B60 specificities, respectively.