Naxin Huo, Tingting Zhu, Shengli Zhang, Toni Mohr, Ming-Cheng Luo, Jong-Yeol Lee, Assaf Distelfeld, Susan Altenbach, Yong Q Gu.
Abstract
α-Gliadins are a major group of gluten proteins in wheat flour that contribute to the end-use properties for food processing and contain major immunogenic epitopes that can cause serious health-related issues, including celiac disease (CD). α-Gliadins are also the youngest group of gluten proteins and are encoded by a large gene family. The majority of the gene family members evolved independently in the A, B, and D genomes of different wheat species after their separation from a common ancestral species. To gain insights into the origin and evolution of these complex genes, the genomic regions of the Gli-2 loci encoding α-gliadins were characterized from the tetraploid wild emmer, a progenitor of hexaploid bread wheat that contributed the AABB genomes. Genomic sequences of the Gli-2 locus regions for the wild emmer A and B genomes were first reconstructed using the genome sequence scaffolds along with optical genome maps. A total of 24 and 16 α-gliadin genes were identified for the A and B genome regions, respectively. α-Gliadin pseudogene frequencies of 86% for the A genome and 69% for the B genome were primarily caused by C to T substitutions in the highly abundant glutamine codons, resulting in the generation of premature stop codons. Comparison with the homologous regions from the hexaploid wheat cv. Chinese Spring indicated considerable sequence divergence of the two A genomes at the genomic level. In comparison, conserved regions between the two B genomes were identified that included α-gliadin pseudogenes containing shared nested TE insertions. Analyses of the genomic organization and phylogenetic tree reconstruction indicate that although orthologous gene pairs derived from speciation were present, large portions of α-gliadin genes were likely derived from differential gene duplications or deletions after the separation of the homologous wheat genomes ~0.5 MYA.
The higher number of full-length intact α-gliadin genes in hexaploid wheat than in wild emmer suggests that human selection through domestication might have had an impact on α-gliadin evolution. Our study provides insights into the rapid and dynamic evolution of genomic regions harboring the α-gliadin genes in wheat.
Keywords: Celiac disease; Gene duplication; Genome evolution; Phylogeny; Wheat gluten proteins; α-Gliadin gene family
Year: 2019 PMID: 31197605 PMCID: PMC6797660 DOI: 10.1007/s10142-019-00686-z
Source DB: PubMed Journal: Funct Integr Genomics ISSN: 1438-793X Impact factor: 3.410
α-Gliadin genes and mutation events in the pseudogenes
| Gene ID | Length (bp)* | Types of mutations | In-frame stop codon no. | Stop codons next to Q |
|---|---|---|---|---|
| | 924 | Stop codon | 2 | 1 |
| | 879 | Stop codon | 2 | 2 |
| | 785 | Stop codon | 1 | 1 |
| | 859 | Frameshift and stop codon | 1 | 1 |
| | 859 | Frameshift and stop codon | 3 | 2 |
| | 583 | Deletion at 5′ end, frameshift, and gap | 0 | 0 |
| | 696 | Deletion at 5′ end, frameshift, and stop codon | 2 | 1 |
| | 859 | Stop codon | 4 | 2 |
| | 846 | Stop codon | 2 | 2 |
| | 834 | Stop codon | 2 | 1 |
| | 837 | Stop codon | 1 | 1 |
| | 845 | Frameshift and stop codon | 8 | 5 |
| | 845 | Stop codon | 3 | 1 |
| | 894 | Intact full-length | | |
| | 870 | Stop codon | 1 | 0 |
| | 456 | Deletion at 3′ end and stop codon | 2 | 0 |
| | 887 | Intact full-length | | |
| | 845 | Frameshift and stop codon | 5 | 1 |
| | 579 | Frameshift, stop codon, and gap | 1 | 0 |
| | 861 | Intact full-length | | |
| | 858 | Intact full-length | | |
| | 859 | Deletion at 5′ end | | |
| | 855 | Stop codon | 1 | 1 |
| | 849 | Stop codon | 1 | 1 |
| | 949 | Intact full-length | | |
| | 996 | Intact full-length | | |
| | 1023 | Stop codon | 1 | 1 |
| | 969 | Intact full-length | | |
| | 927 | Stop codon | 1 | 1 |
| | 918 | Intact full-length | | |
| | 765 | TE insertion and stop codon | 3 | 3 |
| | 852 | TE insertion and stop codon | 3 | 3 |
| | 813 | Deletion at 5′ end and stop codon | 3 | 3 |
| | 1104 | TE insertions and stop codon | 1 | 0 |
| | 231 | Deletion at 5′ end and stop codon | 1 | 0 |
| | 669 | Deletion at 5′ end and stop codon | 1 | 1 |
| | 606 | Stop codon | 2 | 0 |
| | 146 | Deletions at both 5′ and 3′ ends | | |
| | 885 | Intact full-length | | |
| | 237 | Deletion at 5′ end and stop codon | 1 | 0 |
Note: *For pseudogenes containing TE insertions, the length was calculated after the TE sequence was removed.
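The two stop-codon columns in the table can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the toy sequence, and the reading of "next to Q" as "immediately adjacent to a glutamine codon" are all assumptions made here for illustration. The only biology it relies on is the standard genetic code, in which a C to T change at the first position turns the glutamine codons CAA and CAG into the stop codons TAA and TAG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code
GLN_CODONS = {"CAA", "CAG"}           # glutamine (Q)


def codons(seq):
    """Split a coding sequence into complete codons, reading from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def in_frame_stops(seq):
    """List (codon_index, codon) for in-frame stops, skipping the final codon."""
    cs = codons(seq)
    return [(i, c) for i, c in enumerate(cs[:-1]) if c in STOP_CODONS]


def c_to_t_origin(stop):
    """Glutamine codon that gives this stop by one C->T change, else None."""
    candidate = "C" + stop[1:]
    return candidate if candidate in GLN_CODONS else None


def stops_next_to_gln(seq):
    """Count premature stops with a glutamine codon directly before or after.

    Assumption: this is one plausible reading of the 'Stop codons next to Q'
    column, not a definition taken from the paper.
    """
    cs = codons(seq)
    count = 0
    for i, c in enumerate(cs[:-1]):
        if c in STOP_CODONS:
            neighbors = cs[max(i - 1, 0):i] + cs[i + 1:i + 2]
            if any(n in GLN_CODONS for n in neighbors):
                count += 1
    return count


# Hypothetical toy sequence (not a real alpha-gliadin gene):
# ATG CAA CAG TAG CAA CCA TGA
toy = "ATGCAACAGTAGCAACCATGA"
premature = in_frame_stops(toy)
origins = [c_to_t_origin(codon) for _, codon in premature]
```

In this toy case the single premature stop, TAG, sits inside a run of glutamine codons and traces back to CAG, which is the pattern the abstract describes for the glutamine-rich α-gliadin repeats. TGA stops return `None` from `c_to_t_origin`, since no glutamine codon can produce them with a single C to T change.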
Fig. 1 Pairwise comparison of α-gliadin genomic regions from two homologous genomes. Dotplot analyses between homologous regions from the A genomes (a) and B genomes (b) of wild emmer and hexaploid wheat Chinese Spring were performed using the YASS program with default parameter settings (Noe and Kucherov 2005). The positions of α-gliadin genes are indicated with arrows along the axes
Fig. 2 Phylogenetic trees of α-gliadin genes. Nucleotide sequences of α-gliadin genes from the A genomes (a) and B genomes (b) of wild emmer and Chinese Spring were used for phylogenetic analysis. Phylogenetic trees were reconstructed with the neighbor-joining method. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the α-gliadin genes analyzed. The percentages of replicate trees in which the associated genes clustered together in the bootstrap test (1000 replicates) are shown next to the branches
Fig. 3 Expression profiles of α-gliadin genes in wild emmer. Expression profiles were generated using previously published Illumina RNA-seq transcriptome datasets from developing grain 12 days after anthesis (Avni et al. 2017). Pseudogenes are indicated with *
Fig. 4 Sequence analysis of deduced α-gliadin proteins in wild emmer. Deduced protein sequences from nine full-length α-gliadin genes from wild emmer were aligned using ClustalW with manual editing to improve the alignment quality. The first four and last seven amino acids of the mature α-gliadins are highlighted in yellow, with amino acid substitutions indicated in red. The sequences of three CD epitopes are highlighted in magenta. Two poly-Q regions are indicated