Plaimein Amnuaycheewa1, Mohamed Abdelmoteleb2, John Wise3, Barbara Bohle4, Fatima Ferreira5, Afua O Tetteh6, Steve L Taylor3, Richard E Goodman3.
Abstract
Celiac disease (CeD) is an autoimmune enteropathy induced by prolamin and glutelin proteins in wheat, barley, rye, and triticale recognized by genetically restricted major histocompatibility complex (MHC) receptors. Patients with CeD must avoid consuming these proteins. Regulators in Europe and the United States expect an evaluation of CeD risks from proteins in genetically modified (GM) crops or novel foods for wheat-related proteins. Our database includes evidence-based causative peptides and proteins and two amino acid sequence comparison tools for CeD risk assessment. Sequence entries are based on the review of published studies of specific gluten-reactive T cell activation or intestinal epithelial toxicity. The initial database in 2012 was updated in 2018 and 2022. The current database holds 1,041 causative peptides and 76 representative proteins. The FASTA sequence comparison of 76 representative CeD proteins provides insurance against possible unreported epitopes. Validation was conducted using protein homologs from Pooideae and non-Pooideae monocots, dicots, and non-plant proteins. Criteria for minimum percent identity and maximum E-scores are guidelines. Exact matches to any of the 1,041 peptides suggest risks, while FASTA alignment to the 76 CeD proteins suggests possible risks. Matched proteins should be tested further by CeD-specific CD4/8+ T cell assays or in vivo challenges before their use in foods.
Keywords: Pooideae; T-cell epitopes; celiac disease; gluten; peptide database; prolamin; risk assessment; sequence comparison
Year: 2022 PMID: 35769554 PMCID: PMC9234867 DOI: 10.3389/falgy.2022.900573
Source DB: PubMed Journal: Front Allergy ISSN: 2673-6101
Statistics of the AllergenOnline.org CeD peptide and protein database version construction and inclusion characteristics.
| Category | Characteristic | Version 1 | Version 2 |
|---|---|---|---|
| References | Number of publication references | 68 | |
| | Publication year of references | 1984–2012 | 1984– |
| Peptides | Number of peptides | 1,016 | |
| | Number of native peptides | 464 | |
| | Number of deamidated peptides | 552 | |
| | Number of immunogenic peptides | 998 | |
| | Number of CD4+ T cell reactive peptides | 997 | |
| | Number of CD8+ T cell reactive peptides | 1 | 1 |
| | Number of toxic peptides (without T cell reactivity) | 18 | |
| | Length of peptides (AA) | 8–55 | |
| | Averaged length of peptides (AA) | 16 ± 4 | 16 ± 4 |
| Proteins | Number of proteins | 68 | |
| | Number of proteins in | 43 | 43 |
| | Number of synthetic constructs in | 1 | 1 |
| | Number of proteins in | 2 | 2 |
| | Number of proteins in | 11 | |
| | Number of proteins in | 6 | 6 |
| | Number of proteins in | 3 | |
| | Number of proteins in | 2 | 2 |
| | Length of proteins (AA) | 20–800 | 20–800 |
Changes between the two database versions are in bold font.
Complete list of the 72 reference publications with the PubMed links is available on the database.
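The peptide counts given first in each row of the statistics table can be cross-checked against one another. The following is a minimal sketch, assuming that native/deamidated, immunogenic/toxic-only, and CD4+/CD8+ are mutually exclusive partitions; that reading of the row labels is an assumption, and the dictionary keys and the function name `check_partitions` are hypothetical.

```python
# Counts copied from the first value given in each row of the statistics table.
counts = {
    "peptides": 1016,
    "native": 464,
    "deamidated": 552,
    "immunogenic": 998,
    "cd4_reactive": 997,
    "cd8_reactive": 1,
    "toxic_only": 18,
}


def check_partitions(c):
    """Return named consistency checks (True means the sums agree).

    Assumption: each pair of categories below partitions its parent count.
    """
    return {
        "native + deamidated == peptides":
            c["native"] + c["deamidated"] == c["peptides"],
        "immunogenic + toxic_only == peptides":
            c["immunogenic"] + c["toxic_only"] == c["peptides"],
        "cd4 + cd8 == immunogenic":
            c["cd4_reactive"] + c["cd8_reactive"] == c["immunogenic"],
    }


for name, ok in check_partitions(counts).items():
    print(f"{name}: {ok}")
```

Under these assumptions all three checks hold: 464 + 552 = 1,016, 998 + 18 = 1,016, and 997 + 1 = 998.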
FASTA sequence identity scores and alignments of the representative prolamin-like protein groups clustered by source organism types that were tested with the AllergenOnline.org CeD database version 1.
| Group | Protein group | Number of proteins | CeD peptide match | Alignment (AA) | Identity (%) | E-score |
|---|---|---|---|---|---|---|
| I | Prolamins in Pooideae with CeD peptides | 2,104 | Yes | 827 (827) | 100 | 2.8e−179 |
| | | | | 287 (290) | 100 | 7.8e−81 |
| | | | | 842 (838) | 98.1 | 1.4e−195 |
| | Prolamins in Pooideae without CeD peptides | 562 | No | 20 (20) | 95 | 2.9e−05 |
| | | | | 187 (288) | 98.4 | 2.7e−45 |
| | | | | 290 (288) | 79.3 | 3.5e−63 |
| II | Prolamins and prolamin-like proteins in Chloridoideae, Ehrhartoideae, and Panicoideae | 1,059 | No | 54 (52) | 40.7 | 6.7 |
| | | | | 12 (20) | 66.7 | 1.9 |
| | | | | 268 (360) | 41 | 3.5e−17 |
| III | Prolamin-like proteins in Dicotyledons | 1,050 | No | 68 (68) | 33.8 | 2.3 |
| | | | | 10 (20) | 60 | 8.8 |
| | | | | 121 (648) | 30.6 | 1.8e−06 |
| IV | Unrelated proteins (animals, fungi and microbes) | 48 | No | 29 (29) | 58.6 | 3.8 |
| | | | | 11 (20) | 72.7 | 5.8e−03 |
| | | | | 437 (439) | 41.2 | 8.7e−25 |
Proteins were identified from the NCBI protein database using keywords: gluten, glutelin, glutenin, prolamin, prolamine, gliadin, hordein, secalin, avenin, zein, kafirin, coixin, canein, and pennisetin.
Thirty-five proteins were obtained by BLAST searches of the 68 representative celiac proteins against the NCBI Protein-Protein (non-redundant sequences) database, excluding Pooideae (taxid: 147368); these hits had close to 45% identity over 100 AA and an E-score close to 1e-14. None had a direct CeD peptide match.
proteins were obtained by BLAST searches with the 68 representative celiac proteins against the NCBI Protein-Protein (non-redundant sequences) database with the exclusion of Pooideae (taxid: 147368).
Repeat of the FASTA sequence identity scores and alignments of the larger representative prolamin-like protein groups clustered by source organism types that were tested with the AllergenOnline.org CeD database version 2.
| Group | Protein group | Number of proteins | CeD peptide match | Alignment (AA) | Identity (%) | E-score |
|---|---|---|---|---|---|---|
| I | Prolamins in Pooideae with CeD peptides | 4,623 | Yes | 828 (828) | 100 | 1.0e−177 |
| | | | | 439 (290) | 100 | 1.6e−165 |
| | | | | 455 (455) | 100 | 8.4e−153 |
| | Prolamins in Pooideae without CeD peptides | 1,163 | No | 291 (288) | 98.6 | 3.7e−09 |
| | | | | 264 (279) | 98.9 | 1.1e−73 |
| | | | | 266 (269) | 98.5 | 3.6e−68 |
| II | Prolamins and prolamin-like proteins in non-Pooideae monocots | 1,755 | No | 292 (250) | 37.3 | 3.6e−09 |
| | | | | 168 (181) | 40.5 | 9.1e−09 |
| | | | | 222 (222) | 37.4 | 2.4e−08 |
| III | Prolamin-like proteins in Dicotyledons | 4,724 | No | 305 (838) | 32.1 | 1.6e−04 |
| | | | | 372 (439) | 28.8 | 9.5e−04 |
| | | | | 253 (290) | 29.2 | 9.3e−03 |
Proteins were identified from the NCBI protein database using keywords: gluten, glutelin, glutenin, prolamin, prolamine, gliadin, hordein, secalin, avenin, zein, kafirin, coixin, canein, and pennisetin.
Forty-four proteins were obtained by BLAST searches of the 72 representative celiac proteins against the NCBI Protein-Protein (non-redundant sequences) database, excluding Pooideae (taxid: 147368); these hits are close to significance based on percent identity and on E-scores close to or below 1e-14. None contained a CeD peptide match; however, one had 45–50% identity to four wheat glutens.
Figure 1. Taxonomic tree of cereal and dicotyledonous plants based on NCBI taxonomy. Published evidence on CeD-safe foods shows reactions only to grains of the Pooideae subfamily of grasses.
Figure 2. (A) Amino acid sequence alignments of an α-gliadin (NCBI accession number: CAB76964) with 53 overlapping CeD-associated peptides identified with the exact sequence match tool; (B) full FASTA sequence alignment results with homology scores of the α-gliadin theoretically substituted with 13 alanine residues; (C) full FASTA sequence alignment results with homology scores of the α-gliadin theoretically substituted with 11 alanine residues.
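The exact sequence match step in panel (A) amounts to finding every database peptide that occurs verbatim in a query protein, including peptides that overlap one another. The following is a minimal sketch of that idea, not the database's actual implementation; the function name `find_exact_peptide_matches`, the peptide identifiers, and the toy sequences are hypothetical and are not entries from the CeD database.

```python
from typing import Dict, List, Tuple


def find_exact_peptide_matches(
    query: str, peptides: Dict[str, str]
) -> List[Tuple[str, int, int]]:
    """Return (peptide_id, start, end) for every exact occurrence in `query`.

    Positions are 1-based and inclusive. Overlapping occurrences are all
    reported, because the search restarts one residue after each hit.
    """
    query = query.upper()
    hits = []
    for pep_id, pep in peptides.items():
        pep = pep.upper()
        start = query.find(pep)
        while start != -1:
            hits.append((pep_id, start + 1, start + len(pep)))
            start = query.find(pep, start + 1)
    # Sort by position so overlapping peptides appear in sequence order.
    return sorted(hits, key=lambda h: (h[1], h[2], h[0]))


# Toy data (hypothetical sequences chosen only to show overlapping matches).
toy_peptides = {"pep_a": "PQPQLPY", "pep_b": "QLPYPQP", "pep_c": "AAAAAAAA"}
toy_query = "MKPQPQLPYPQPQLPYSS"

for pep_id, start, end in find_exact_peptide_matches(toy_query, toy_peptides):
    print(pep_id, start, end)

# Replacing one residue with alanine removes every exact match that spans it.
substituted = toy_query[:6] + "A" + toy_query[7:]
print(find_exact_peptide_matches(substituted, toy_peptides))
```

In this toy case, a single alanine substitution removes two of the three exact matches, which illustrates why an exact-match search alone can miss closely related sequences and why the alignment step in panels (B) and (C) is a useful complement.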
Figure 3. Proposed evaluation criteria to predict the likelihood that a query protein will elicit CeD. An exact match to any of the 1,041 peptides indicates probable rejection. Alternatively, a FASTA3 alignment with an E-score limit of 1e-14, a minimum alignment length > 100 AA, and a percent identity of 45% should trigger testing or rejection.
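The criteria in Figure 3 can be sketched as a small decision function. This is a hedged illustration under stated assumptions, not the authors' tool: the names `FastaHit` and `evaluate_query` are hypothetical, the three FASTA thresholds are assumed to apply jointly, and the boundary handling (E-score at or below 1e-14, identity at or above 45%) is an assumption, since the source describes these limits as guidelines.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the Figure 3 caption.
E_SCORE_LIMIT = 1e-14
MIN_ALIGNMENT_LENGTH = 100   # AA; the caption states "> 100 AA"
MIN_PERCENT_IDENTITY = 45.0


@dataclass
class FastaHit:
    """Summary of the best FASTA alignment to a representative CeD protein."""
    e_score: float
    alignment_length: int
    percent_identity: float


def evaluate_query(has_exact_peptide_match: bool,
                   best_hit: Optional[FastaHit]) -> str:
    """Classify a query protein following the Figure 3 criteria.

    Assumption: all three FASTA thresholds must be met together, and the
    inclusive/exclusive boundaries below are a reading of the caption.
    """
    if has_exact_peptide_match:
        return "probable rejection"
    if best_hit is not None and (
        best_hit.e_score <= E_SCORE_LIMIT
        and best_hit.alignment_length > MIN_ALIGNMENT_LENGTH
        and best_hit.percent_identity >= MIN_PERCENT_IDENTITY
    ):
        return "testing or rejection"
    return "no criterion met"


# Values borrowed from the version 1 validation table, reading the first
# number of the alignment column as the aligned length (an assumption).
print(evaluate_query(True, FastaHit(2.8e-179, 827, 100.0)))
print(evaluate_query(False, FastaHit(3.5e-63, 290, 79.3)))
print(evaluate_query(False, FastaHit(3.5e-17, 268, 41.0)))
```

The third call shows why the criteria are combined: an alignment of 268 AA with an E-score of 3.5e−17 passes the E-score and length limits but falls below 45% identity, so it does not trigger testing under this reading.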