Ping Song, Rod Herman, Siva Kumpatla.
Abstract
In the context of regulatory assessment of transgenic proteins for potential allergenicity, a previous investigation demonstrated that a 1:1 FASTA comparison using an E-value of 1.0E-09 as a criterion is equivalent in sensitivity to the conventional FASTA search (using the whole sequence as a query) for >35% identity over 80 amino acids, but with improved specificity. A further study, using groups of known cross-reactive peanut allergens, indicates that the sensitivity of this approach is superior to the conventional FASTA search and equivalent to the 80-mer sliding window FASTA search recommended by WHO/FAO. Specifically, the 1:1 FASTA approach eliminated the technical issues resulting from lack of identification of short query sequences with high identity to known allergens, or high identity over short amino acid stretches, and from different E-value settings when searching for >35% identity over 80 aa. Based on the performance of this simple application of existing bioinformatics tools, and its ease of implementation and interpretation in the context of a regulatory assessment, we advocate adoption of this 1:1 FASTA approach as a supplement to the FAO/WHO/CODEX criterion (>35% identity over 80 aa) formulated 13 years ago. Adoption of this approach eliminates many biologically irrelevant homology hits generated by the FAO/WHO/CODEX criterion and improves the safety assessment of GM crops.
Keywords: Allergen; Bioinformatics; Transgenic crops
Year: 2015 PMID: 28962455 PMCID: PMC5598423 DOI: 10.1016/j.toxrep.2015.08.005
Source DB: PubMed Journal: Toxicol Rep ISSN: 2214-7500
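The 80-mer sliding window search named in the abstract splits the query protein into every overlapping 80-residue fragment and searches each fragment separately. The sketch below illustrates only the fragment generation step; the function name `sliding_windows` and the handling of queries shorter than 80 residues (yielded once, whole) are assumptions made for illustration, not details taken from the paper.

```python
from typing import Iterator, Tuple

WINDOW = 80  # window length in residues, from the FAO/WHO criterion


def sliding_windows(seq: str, size: int = WINDOW) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every overlapping window of `size` residues.

    Assumption: a query no longer than `size` is yielded once, unchanged.
    """
    if len(seq) <= size:
        yield 0, seq
        return
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


if __name__ == "__main__":
    # A hypothetical 82-residue query produces 82 - 80 + 1 = 3 fragments.
    toy_query = "A" * 82
    print(len(list(sliding_windows(toy_query))))  # prints 3
```

Each fragment would then be submitted as its own FASTA query, which is why a protein of length n produces n − 79 searches under this scheme.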
Cross-reactivity of peanut allergens [3].
| Protein superfamily | Cupin | Cupin | Prolamin | Prolamin | Prolamin | Prolamin |
|---|---|---|---|---|---|---|
| Protein family | Vicilin or 7S globulin | Legumin or 11S globulin | 2S albumin | 2S albumin | 2S albumin | nsLTP |
| Allergen | Ara h 1 | Ara h 3 | Ara h 2 | Ara h 6 | Ara h 7 | Ara h 9 |
| Isoallergen (UniProt accession) | Ara h 1.0101 (P43238) | Ara h 3.0101 (O82580) | Ara h 2.0101 (Q6PSU2) | Ara h 6 (Q647G9) | Ara h 7.0101 (Q9SQH1) | Ara h 9.0101 (B6CEX8) |
| Cross-reactivity | With other legume and tree nut vicilins and Ara h 2 and Ara h 3 | With other legumes and tree nut legumins and Ara h 1, 2, and 6 | With 2S albumins from almond and Brazil nut, and Ara h 1, 3 and 6 | With Ara h 1–3 | Not known | With peach and hazelnut nsLTP (Pru p 3 and Cor a 8) |
| Number of sequences in AllergenOnline database V14 | 27 | 36 | 12 | 9 | 32 | |
Hits detected by 1:1 FASTA with E-value of ≤1.0E-09 as a threshold or 80-mer sliding window FASTA but not by conventional whole sequence FASTA with >35% identity over 80 amino acids or longer as threshold.
| Peanut allergen (UniProt accession) | 1:1 FASTA: hit accession | 1:1 FASTA: E-value | 80-mer sliding window FASTA: hit accession (GenBank accession) | 80-mer sliding window FASTA: identity (%) | 80-mer sliding window FASTA: alignment overlap | Whole sequence FASTA: identity (%) | Whole sequence FASTA: alignment overlap | Whole sequence FASTA: E-value |
|---|---|---|---|---|---|---|---|---|
| Ara h 1 | AAK15089.1 | 3.80E-025 | AAK15089.1 | 37.5 | 80 | 31.7 | 619 | 1.2E-18 |
| | AAM73729.1 | 6.20E-017 | AAM73729.1 | 45.1 | 82 | 28.8 | 579 | 3.7E-12 |
| | AAM73730.2 | 1.60E-018 | AAM73730.2 | 45.1 | 82 | 28.7 | 579 | 4.0E-12 |
| Ara h 1 | AAK15089.1 | 5.50E-025 | AAK15089.1 | 55.1 | 78 | 33.2 | 561 | 7.4E-18 |
| | AAM73729.1 | 1.80E-019 | AAM73729.1 | 45.1 | 82 | 28.9 | 599 | 2.0E-13 |
| | AAM73730.2 | 1.60E-019 | AAM73730.2 | 37.5 | 80 | 28.7 | 599 | 2.0E-18 |
| | AAF18269.1 | 1.10E-017 | AAF18269.1 | 45.1 | 82 | 35.0 | 625 | 2.2E-13 |
| Ara h 3 | O23878.1 | 2.50E-020 | O23878.1 | 43.6 | 78 | 33.4 | 575 | 9.8E-19 |
| | Q9XFM4.1 | 9.10E-021 | Q9XFM4.1 | 43.6 | 78 | 34.3 | 545 | 4.3E-19 |
| Ara h 2 | CAA38363 | 1.80E-05 | CAA38363 | 35.8 | 81 | 27.9 | 179 | 9.0E-05 |
FASTA (v35.04) searches were conducted using default settings (Matrix = BLOSUM50; Gap Penalties = −10/−2; ktup = 2; E-value = 10).
AAK15089.1: 7S globulin from Sesamum indicum (sesame); AAM73729.1 and AAM73730.2: vicilin-like protein from Anacardium occidentale (cashew); AAF18269.1: vicilin-like protein precursor from Juglans regia; O23878.1: 13S globulin seed storage protein (legumin-like protein) from Fagopyrum esculentum (common buckwheat); Q9XFM4.1: 13S globulin seed storage protein (legumin-like protein) from Fagopyrum esculentum (common buckwheat); CAA38363: 2S albumin from Bertholletia excelsa (Brazil nut).
E-values from the whole sequence FASTA searches of Ara h 1, Ara h 2, and Ara h 3 were generated against databases with 27, 12, and 36 sequence entries, respectively. The E-values of the alignments generated by the conventional whole sequence FASTA search are significant, except for the one between Ara h 2 and the 2S albumin large subunit (accession CAA38363), but these alignments would not be classified as hits under the criterion of >35% identity over 80 aa or longer.
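To make the two decision rules concrete, the following sketch applies them to two rows of the table above (the first Ara h 1 row against AAK15089.1, and the Ara h 2 row against CAA38363). The class and function names are hypothetical, and treating "over 80 amino acids or longer" as a strict overlap ≥ 80 test is an assumption; the paper's own tooling may handle overlaps shorter than 80 residues differently.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1.0e-09  # 1:1 FASTA threshold stated in the abstract
MIN_IDENTITY = 35.0       # percent; the criterion is strictly greater than 35%
MIN_OVERLAP = 80          # residues


@dataclass
class Alignment:
    identity: float  # percent identity
    overlap: int     # alignment overlap in residues


def is_identity_hit(aln: Alignment) -> bool:
    """>35% identity over 80 aa or longer (assumed: overlap >= 80)."""
    return aln.identity > MIN_IDENTITY and aln.overlap >= MIN_OVERLAP


def is_one_to_one_hit(e_value: float) -> bool:
    """1:1 FASTA rule: E-value at or below 1.0E-09."""
    return e_value <= E_VALUE_CUTOFF


# Values copied from the table: (1:1 E-value, 80-mer alignment, whole-sequence alignment)
rows = {
    "Ara h 1 vs AAK15089.1": (3.80e-25, Alignment(37.5, 80), Alignment(31.7, 619)),
    "Ara h 2 vs CAA38363": (1.80e-05, Alignment(35.8, 81), Alignment(27.9, 179)),
}

results = {
    name: (is_one_to_one_hit(e), is_identity_hit(win), is_identity_hit(whole))
    for name, (e, win, whole) in rows.items()
}

if __name__ == "__main__":
    for name, (one_to_one, window, whole) in results.items():
        print(f"{name}: 1:1={one_to_one}, 80-mer={window}, whole={whole}")
```

Under these assumptions the Ara h 1 pair is flagged by both the 1:1 and the 80-mer rules but missed by the whole sequence search, while the Ara h 2 pair is flagged only by the 80-mer rule.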