Ke He1, Chun-Hong Liang1, Ying Zhu2, Peter Dunn3, Ayong Zhao1, Piotr Minias4.
Abstract
The Major Histocompatibility Complex (MHC) is a hyper-polymorphic genomic region, which forms a part of the vertebrate adaptive immune system and is crucial for intra- and extra-cellular pathogen recognition (MHC-I and MHC-IIA/B, respectively). Although recent advancements in high-throughput sequencing methods sparked research on the MHC in non-model species, the evolutionary history of MHC gene structure is still poorly understood in birds. Here, to explore macroevolutionary patterns in the avian MHC architecture, we retrieved contigs with antigen-presenting MHC and MHC-related genes from available genomes based on third-generation sequencing. We identified: 1) an ancestral avian MHC architecture with compact size and tight linkage between MHC-I, MHC-IIA/IIB and MHC-related genes; 2) three major patterns of MHC-IIA/IIB unit organization in different avian lineages; 3) lineage-specific gene translocation events (e.g., separation of the antigen-processing TAP genes from the MHC-I region in passerines); and 4) the presence of a single MHC-IIA gene copy in most taxa, showing evidence of strong purifying selection (low dN/dS ratio and low number of positively selected sites). Our study reveals long-term macroevolutionary patterns in the avian MHC architecture and provides the first evidence of important transitions in the genomic arrangement of the MHC region over the last 100 million years of bird evolution.
Keywords: MHC architecture; MHC gene structure; avian MHC; macroevolutionary; third-generation sequencing genome
Year: 2022 PMID: 35251132 PMCID: PMC8893315 DOI: 10.3389/fgene.2022.823686
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. The protocol used to reconstruct the evolution of avian MHC structure.
Gene numbers, linkage, and gene arrangement of MHC-I and MHC-II genes in non-passerine birds (TGS-based genomes).
| Species | MHC-I copy number | MHC-IIB copy number | MHC-IIA copy number | Type of MHC-I region | Type of MHC-II region | MHC-I and II linkage | MHC-IIA and IIB linkage | MHC-IIA-IIB structure | MHC-IIA and COL11A2 linkage | Linked contig |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 2 | 3 | 1 | GalGal-like | GalGal-like | Y | N | — | N | I-related ∼ IIB-related |
|  | 2 | 2 | 1 | GalGal-like | GalGal-like | Y | N | — | N | I-related ∼ IIB-related |
|  | 0 | 1 | 1 | — | GalGal-like | — | N | — | N | — |
|  | 2 | 3 | 0 | GalGal-like | GalGal-like | Y | — | — | N | I-related ∼ IIB-related |
|  | 1 | 1 | 1 | GalGal-like | AnaPla-like | N | Y | IIA ∼ IIB ∼ IIB | Y | IIB-related ∼ IIA |
|  | 6 | 2 | 1 | GalGal-like | StrHab-like | N | Y | IIA ∼ IIB | Y | IIB-related ∼ IIA |
|  | 2 | 2 | 1 | GalGal-like | AnaPla-like | N | Y | IIA ∼ IIB ∼ IIB | Y | IIB-related ∼ IIA |
|  | 11 | 2 | 1 | GalGal-like | AnaPla-like | Y (551 kb) | Y | IIA ∼ IIB ∼ IIB | Y | I-related ∼ IIB-related ∼ IIA |
|  | 3 | 2 | 1 | GalGal-like | AnaPla-like | Y (104 kb) | Y | IIA ∼ IIB ∼ IIB | Y | I-related ∼ IIB-related ∼ IIA |
|  | 0 | 0 | 0 | — | — | — | — | — | — |  |
|  | 9 | 2 | 2 | GalGal-like-V | NipNip-like (2) | N | Y | (IIA ∼ IIB) × 2 | Y | IIB-related ∼ IIA |
|  | 5 | 3 | 3 | GalGal-like | NipNip-like (2) | Y | Y | (IIA ∼ IIB) × 3 | N | I-related ∼ IIB-related ∼ IIA |
|  | 4 | 0 | 0 | GalGal-like | — | — | — | — | N | — |
|  | 0 | 0 | 0 | — | — | — | — | — | N | — |
|  | 5 | 3 | 2 | GalGal-like | AnaPla-like | Y (972 kb) | Y | IIB ∼ IIA ∼ IIB; IIA ∼ IIB | N | I-related ∼ IIB-related ∼ IIA |
|  | 2 | 6 | 1 | GalGal-like | AnaPla-like | Y (336 kb) | Y | IIA ∼ IIB ∼ IIB × 5 | Y | I-related ∼ IIA ∼ IIB-related |
|  | 4 | 2 | 2 | GalGal-like-V | NipNip-like (2) | N | Y | (IIA ∼ IIB) × 2 | Y | IIB-related ∼ IIA |
|  | 3 | 1 | 1 | GalGal-like | StrHab-like | Y (150 kb) | Y | IIA ∼ IIB | Y | I-related ∼ IIB-related ∼ IIA |
|  | 3 | 0 | 1 | GalGal-like-V | — | — | — | — | Y |  |
|  | 5 | 2 | 1 | GalGal-like | AnaPla-like | Y | Y | IIA ∼ IIB ∼ IIB | N | I-related ∼ IIB-related ∼ IIA |
|  | 3 | 2 | 1 | GalGal-like | StrHab-like | N | Y | IIA ∼ IIB | Y | IIB-related ∼ IIA |
|  | 3 | 2 | 2 | GalGal-like | NipNip-like (2) | N | Y | (IIA ∼ IIB) × 2 | N | IIA ∼ IIB, MHC-I-related |
|  | 17 | 3 | 3 | GalGal-like | NipNip-like (2) | Y | Y | (IIA ∼ IIB) × 2; IIB ∼ IIA | N | I ∼ IIB-related ∼ IIA ∼ I-related |
|  | 0 | 1 | 1 | — | StrHab-like | — | Y | IIA ∼ IIB | N | IIB-related ∼ IIA |
|  | 0 | 0 | 0 | — | — | — | — | — | N | — |
|  | 5 | 7 | 1 | — | StrHab-like | Y | Y | IIA ∼ IIB; and others | N | I ∼ IIB-related; IIB-related ∼ IIA |
|  | 1 | 4 | 0 |  | Only IIB (3) | N | — | Only IIB (3) | N | — |
|  | 3 | 1 | 1 | — | StrHab-like | Y | Y | IIA ∼ IIB | N | I ∼ IIB-related ∼ IIA |
|  | 2 | 1 | 1 | GalGal-like | StrHab-like | N | Y | IIA ∼ IIB | N | IIB-related ∼ IIA |
|  | 3 | 3 | 3 | GalGal-like | NipNip-like (3) | N | Y | (IIA ∼ IIB) × 3 | Y | IIB-related ∼ IIA |
|  | 4 | 8 | 6 | Other | Other | N | Y | Listed in footnote | N | Other |
|  | 2 | 2 | 2 | GalGal-like | NipNip-like (2) | Y (175 kb) | Y | (IIA ∼ IIB) × 2 | Y | I-related ∼ IIA ∼ IIB-related |
Note: Linkage means that genes or gene regions were located within the same contig. If the distance between the MHC-I region and the IIB region exceeded 100 kb, it is given in brackets.
Estimated based on a partial exon.
Genome data IDs of AnaPla correspond to GCA_015476345.1 (AnaPla-1) and GCA_900411745.1 (AnaPla-2).
In NipNip-like, the numbers in brackets indicate the number of duplicated IIA-IIB dyads.
Although MHC-I and MHC-II genes were not in the same contig, some MHC-I- and MHC-II-related genes were located in the same contig.
The MHC-I-related region contained MHC-II genes (I × 2 ∼ IIB ∼ IIA ∼ TAP1/2 region ∼ I ∼ TAPBP ∼ I × 7) and some MHC-I genes were located in the MHC-II region (I × 5 ∼ DMA/B region ∼ BRD2 ∼ IIA ∼ IIB ∼ IIA ∼ IIB).
COL11A2 was not adjacent to IIA but was in the same contig as BRD2.
In Picoides pubescens, there was one IIA ∼ IIB unit and one contig with 6 IIB.
In Pogoniulus pusillus, no TAP genes were found in the genome data, so we did not infer the pattern of the MHC-I-related region.
Species-specific gene arrangement was IIA-IIB × 3; (IIB × 2 ∼ I × 2) × 2 ∼ TAP1/2; IIA ∼ IIB ∼ IIA × 3 ∼ BRD2 ∼ DMA/B.
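The linkage convention in the table note above (same contig means linked, with the distance reported only when it exceeds 100 kb) can be illustrated with a minimal Python sketch. The `Region` record, the function name `linkage_label`, and the example coordinates are hypothetical and are not taken from the study; the sketch only assumes that each gene region is described by a contig identifier and start/end positions in base pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A gene region on a contig (hypothetical record, not from the study)."""
    name: str    # e.g. "MHC-I" or "MHC-IIB"
    contig: str  # contig identifier
    start: int   # start coordinate in bp
    end: int     # end coordinate in bp (inclusive)


def linkage_label(a: Region, b: Region, report_above_bp: int = 100_000) -> str:
    """Return "N", "Y", or "Y (<distance> kb)" following the table note."""
    # Regions on different contigs are treated as unlinked.
    if a.contig != b.contig:
        return "N"
    # Gap between the two intervals; overlapping regions have a gap of 0.
    gap = max(max(a.start, b.start) - min(a.end, b.end), 0)
    # The distance is reported only when it exceeds the threshold (100 kb).
    if gap > report_above_bp:
        return f"Y ({gap / 1000:.0f} kb)"
    return "Y"


if __name__ == "__main__":
    mhc1 = Region("MHC-I", "contig_1", 1_000, 50_000)
    mhc2b = Region("MHC-IIB", "contig_1", 601_000, 650_000)
    print(linkage_label(mhc1, mhc2b))
```

With these made-up coordinates the two regions sit on the same contig and are separated by 551,000 bp, so the label takes the "Y (distance)" form used in the linkage column.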
Gene numbers, linkage, and gene arrangement of MHC-I and MHC-II genes in passerine birds (TGS-based genomes).
| Species | TAP1 (gene copies) | TAP2 (gene copies) | IIA (gene copies) | COL11A2 (gene copies) | MHC-I contigs | MHC-IIB contigs | TAP1-TAP2 linkage | MHC-I and TAPs linkage | Existence of MHC-I and IIB linkage | MHC-IIA and IIB linkage |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 1 | 1 | 1 | 1 | 1 | 1 | N | N | N | Y |
|  | 1 | 1 | 1 | — | 1 | 2 | Y | N | Y (1) | N |
|  | 1 | 1 | — | 1 | 2 | 2 | Y | N | Y (1) | — |
|  | 1 | 1 | 1 | 1 | 4 | 5 | Y | N | N | Y |
|  | 1 | 1 | — | 1 | 1 | 1 | Y | N | N | — |
|  | 1 | — | 1 | 1 | 4 | 24 | — | N | Y (3) | N |
|  | 1 | 1 | 1 | 1 | 2 | 3 | Y | N | Y (2) | Y |
|  | 1 | 1 | — | 1 | 3 | 6 | Y | N | Y (3) | — |
|  | 1 | 1 | 1 | 1 | 20 | 135 | Y | N | Y (5) | N |
|  | 1 | 1 | 1 | 1 | 13 | 34 | Y | N | Y (2) | N |
|  | 3 | 1 | 1 | 1 | 1 | 1 | Y | N | N | N |
|  | 1 | 1 | — | 1 | 2 | 2 | N | N | N | — |
|  | — | 1 | — | 1 | 1 | 3 | — | Y (>700 kb) | N | — |
Note: The numbers in brackets indicate the number of contigs containing both MHC-I and MHC-IIB genes.
Only one linked TAP1-TAP2 pair was found, despite three TAP1 copies in Sporophila hypoxantha.
FIGURE 2. The patterns of gene arrangement in (A) MHC class I-related, and (B) class II-related regions. *n indicates copy number variation of a single gene (MHC-I or MHC-IIB), while D with parentheses indicates duplication of a linked region. For the MHC-I region, the GalGal-like-V variant indicates a duplicated (I × n ∼ TAP1 ∼ TAP2 ∼ I × n) unit, and the Passer-like pattern indicates a lack of linkage between TAPs and MHC-I genes. For the MHC-II region, the NipNip-like pattern indicates a duplicated (IIB ∼ IIA) unit. The dashed box of COL11A2 indicates that this gene may be either present or absent in the MHC-II region.
FIGURE 3. The arrangement of MHC-I and MHC-II regions in 20 non-passerine bird species. The MHC-I-related region is marked in green, while the MHC-II region including or excluding the MHC-IIA gene is marked in purple and yellow, respectively. MHC-IIA is marked in blue when not linked to the core MHC-II region. A single slash indicates the same contig or chromosome (numbers listed under the gene arrangement patterns indicate the total length of missing distances associated with single slashes), while a double slash indicates a different contig or chromosome. The numbers marked with asterisks (above the gene arrangement) indicate distances that do not match the scale. Note: The visualization of MHC structure for Anas platyrhynchos is based on genome-2 in Table 1 (GCA_900411745.1), while the distance between MHC-I and MHC-II regions was 310 kb in genome-1 (GCA_015476345.1).
FIGURE 4. The arrangement of MHC-I, MHC-II, and TAP genes in eight passerine bird species. A single slash indicates the same contig or chromosome (numbers listed under the gene arrangement patterns indicate the total length of missing distances associated with single slashes), while *number indicates the number of duplication events.
Gene numbers, linkage, and gene arrangement of MHC-I and MHC-II genes in waterfowl Anseriformes (SGS- and TGS-based genomes). TGS-based data are marked with asterisks (*).
| Species | Family | MHC-I copy number | MHC-IIA copy number | MHC-IIB copy number | I-IIB related region linkage | IIA-IIB linkage | IIA-COL11A2 linkage |
|---|---|---|---|---|---|---|---|
|  | Anatidae | 6 | 1 | 2 | N | Y | Y |
|  | Anatidae | 1 | 1 | 1 | Y (24,085 kb) | Y | Y |
|  | Anatidae | 8 | 1 | 4 | Y (103 kb) | Y | Y |
|  | Anatidae | 2 | 1 | 0 (0, 0, 0) | N | — | Y |
|  | Anatidae | 1 | 1 | 1 | N | Y | N |
|  | Anseranatidae | 0 (1, 1, 1) | 1 | 0 (1, 1, 2) | N | N | Y |
|  | Anatidae | 2 | 1 | 1 | N | Y | Y |
|  | Anatidae | 11 | 1 | 3 | Y (558.7 kb) | Y | Y |
|  | Anatidae | 2 | 1 | 0 (0, 1, 1) | N | N | Y |
|  | Anatidae | 1 | 1 | 0 (1, 0, 0) | Y (73.6 kb) | Y | Y |
|  | Anhimidae | 0 (0, 2, 1) | — | 0 (0, 1, 1) | N | — | — |
|  | Anatidae | 2 | 1 | 2 | N | Y | Y |
|  | Anatidae | 1 | 1 | 0 | N | — | Y |
|  | Anatidae | 2 | 1 | 2 | Y (59.5 kb) | Y | Y |
|  | Anatidae | 1 | 1 | 0 | N | — | N |
|  | Anatidae | 1 | 1 | 1 | N | N | Y |
|  | Anatidae | 1 | 1 | 0 (0, 1, 1) | N | N | N |
|  | Anatidae | 1 | 1 | 1 | N | N | Y |
Note: If exons 2-4 were not found within a single contig, the number of hits containing separate exons 2, 3, and 4 is listed in brackets.
For linked regions, the numbers in brackets indicate the distance between the two regions.
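The copy-number notation described in the note above, such as `0 (1, 1, 2)`, can be sketched in Python as follows. This is only an illustration under stated assumptions: the input is a hypothetical list of `(contig_id, exon_number)` search hits, a complete copy is counted when exons 2, 3, and 4 all occur on the same contig, and the per-exon breakdown is printed whenever no complete copy is found. The function names and contig labels are invented for the example and do not reproduce the study's pipeline.

```python
from collections import Counter, defaultdict

EXONS = (2, 3, 4)  # exons considered in the table note


def summarize_exon_hits(hits):
    """Summarize (contig_id, exon_number) hits.

    Returns (complete_copies, (hits_exon2, hits_exon3, hits_exon4)).
    Assumption: a copy is complete when exons 2-4 share one contig.
    """
    per_contig = defaultdict(Counter)
    for contig, exon in hits:
        if exon in EXONS:
            per_contig[contig][exon] += 1
    # Per contig, the number of complete copies is limited by the rarest exon.
    complete = sum(min(c[e] for e in EXONS) for c in per_contig.values())
    # Total number of hits for each exon across all contigs.
    totals = tuple(sum(c[e] for c in per_contig.values()) for e in EXONS)
    return complete, totals


def format_cell(complete, totals):
    """Format a cell as "n", or "0 (a, b, c)" when no complete copy exists."""
    if complete > 0:
        return str(complete)
    return f"0 ({totals[0]}, {totals[1]}, {totals[2]})"


if __name__ == "__main__":
    # Hypothetical hits: each exon lies on a different contig.
    hits = [("ctgA", 2), ("ctgB", 3), ("ctgC", 4), ("ctgD", 4)]
    print(format_cell(*summarize_exon_hits(hits)))
```

In this made-up input no contig carries all three exons, so the cell reports zero complete copies followed by the number of separate hits for exons 2, 3, and 4.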
FIGURE 5. The predicted ancient MHC architecture in birds and its evolution in derived groups. Asterisks indicate characters that are unique to a group.