
Trypsin cleavage sites are highly unlikely to occur in celiac-causing restricted epitopes.

Rod A Herman, Ping Song, Henry P Mirsky.

Abstract

To assess risk, the European Food Safety Authority requires that the amino-acid sequences of newly expressed proteins in genetically engineered (GE) crops be searched for partial matches with 9-mer restricted epitopes known to cause celiac disease. None of the 26 known celiac-causing 9-mer epitopes contains an in-silico predicted trypsin cleavage site. The probability of this occurring by chance alone is 0.000056. Based on the absence of in-silico predicted trypsin cleavage sites within 9-mer epitopes known to cause celiac disease, it can be concluded with very high confidence that true celiac-causing epitopes are highly unlikely to contain in-silico predicted trypsin cleavage sites and that this criterion can reliably be used to exclude the risk that imperfect 9-mer peptide matches within newly expressed proteins from GE crops cause celiac disease.

Keywords:  Celiac; bioinformatics; genetically engineered; proteins; trypsin cleavage site

Year:  2019        PMID: 31743058      PMCID: PMC7289517          DOI: 10.1080/21645698.2019.1692612

Source DB:  PubMed          Journal:  GM Crops Food        ISSN: 2164-5698            Impact factor:   3.074


Introduction

Celiac disease is an adverse immunological response among a subset of humans to gluten proteins found in wheat, rye, and barley.[1] The European Food Safety Authority (EFSA) recently issued regulatory guidance on the risk assessment of newly expressed proteins in genetically engineered (GE) crops for the potential to cause celiac disease[2] as a follow-up to regulation implemented by the European Commission.[3] This guidance includes bioinformatic methods that search for imperfect nine amino-acid peptide matches with known celiac-causing restricted epitopes or matches with short amino-acid motifs. In addition, the guidance includes criteria to exclude certain matches as low risk based on the presence of certain amino acids at certain positions (e.g., positively charged amino acids at positions 1, 4, 6, 7, and 9) within the 9-mer alignment. While no in-silico predicted trypsin sites (hereafter referred to as trypsin cleavage sites; see https://web.expasy.org/peptide_cutter/) are present within the 26 unique celiac epitopes listed in the guidance as derived from Sollid et al.[4], EFSA recently clarified in a consultation that the presence of a trypsin site within a 9-mer alignment present within a GE protein would not be considered an exclusion criterion in regulatory submissions for GE crops (https://www.efsa.europa.eu/sites/default/files/event/190405-p04.pdf). Here, the likelihood that all 26 bona fide celiac epitopes would, by chance alone, lack an in-silico predicted trypsin cleavage site was estimated. While a putative trypsin-site exclusion criterion may reflect digestive susceptibility, it is noteworthy that the physicochemical properties of trypsin cleavage sites might simply inhibit binding to the biochemical molecules required to induce celiac symptoms and not reflect actual digestion in vivo (which could be incomplete). The results of this investigation should clarify the value of these exclusion criteria in separating true celiac epitopes from low-risk sequences.
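For orientation, the kind of in-silico prediction referred to above can be sketched in a few lines of Python. This is a minimal sketch that assumes the simplified rule described later in the text (cleavage on the carboxyl-terminal side of lysine or arginine unless the next residue is proline); it is not the PeptideCutter model, and the function name `predicted_trypsin_sites` is hypothetical.

```python
def predicted_trypsin_sites(peptide: str) -> list[int]:
    """Return 1-based positions of K/R residues whose C-terminal bond
    is a predicted cleavage site under the simplified rule.

    Assumption: cleave after K or R unless the next residue is P.
    The WKP and MRP exceptions are ignored here.
    """
    seq = peptide.replace(" ", "").upper()
    sites = []
    # The last residue has no C-terminal bond inside the peptide.
    for i in range(len(seq) - 1):
        if seq[i] in "KR" and seq[i + 1] != "P":
            sites.append(i + 1)
    return sites


# The one listed epitope that contains arginine is blocked by proline.
print(predicted_trypsin_sites("F R P Q Q P Y P Q"))  # []
# A hypothetical imperfect match with lysine substituted at position 3.
print(predicted_trypsin_sites("PQKSFPQQQ"))          # [3]
```

An imperfect 9-mer match would be flagged under the proposed criterion when this list is non-empty.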

Methods and Materials

Trypsin cleavage sites occur in proteins, on average, once per 14 amino acids[5], and there are eight peptide bonds in each 9-mer epitope known to cause celiac disease. However, the 26 unique epitopes tabulated by Sollid et al. have certain amino acids at certain positions that are considered conserved motifs for celiac epitopes, which reduces the total number of potential trypsin-cleavage-site substitutions that are possible. The number of possible trypsin cleavage sites that could occur in each of the 26 unique 9-mer epitopes identified by Sollid et al. (and included in the EFSA guidance document) was determined (Table 1). Positions were counted where lysine or arginine could be substituted into these epitopes without being blocked by a proline on the carboxyl-terminal side (as part of the conserved celiac motif)[6] and without disrupting the celiac motif. Since none of the celiac motifs initiate with a proline, the number of carboxyl-terminal bonds on non-motif amino acids equals the number of potential trypsin-cleavage-site substitutions that are possible. The random likelihood that none of these 26 epitopes contains a trypsin cleavage site was calculated using the binomial distribution, based on the average occurrence of trypsin cleavage sites in proteins of once per 14 amino acids.
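The binomial calculation described above can be reproduced with a short Python sketch. It assumes that each potential position is an independent trial with a per-position site probability of 1/14, and it takes the position totals (132, and 73 with positions 1, 4, 6, 7, and 9 excluded) from Table 1; the names `binomial_pmf` and `SITE_RATE` are illustrative only.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumption: one trypsin cleavage site per 14 amino acids on average,
# treated as an independent per-position probability.
SITE_RATE = 1 / 14

# Probability of zero sites across all potential positions in Table 1.
p_all = binomial_pmf(0, 132, SITE_RATE)
# Same, after excluding positions 1, 4, 6, 7, and 9.
p_restricted = binomial_pmf(0, 73, SITE_RATE)

print(f"{p_all:.6f}")         # 0.000056
print(f"{p_restricted:.4f}")  # 0.0045
```

With zero successes the binomial term reduces to (13/14)^n, so the result depends only on the number of potential positions counted.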

Table 1. Celiac disease 9-mer restricted epitopes and potential number of trypsin-cleavage-site substitutions in peptide matches outside of motif.

Celiac restricted epitope    Potential trypsin    Potential trypsin cleavage sites
(motif shown in bold in      cleavage sites       (with positions 1, 4, 6, 7,
the original table)                               and 9 excluded)
P F P Q P Q L P Y            5                    3
P Y P Q P Q L P Y            5                    3
P Q P Q L P Y P Q            4                    2
F R P Q Q P Y P Q            4                    2
P Q Q S F P Q Q Q            8                    3
I Q P Q Q P A Q L            4                    2
Q Q P Q Q P Y P Q            4                    2
S Q P Q Q Q F P Q            5                    3
P Q P Q Q Q F P Q            5                    3
Q Q P Q Q P F P Q            4                    2
P Q P Q Q P F C Q            4                    2
Q Q P F P Q Q P Q            5                    3
P F P Q P Q Q P F            5                    3
P Q P Q Q P F P W            4                    2
P F S Q Q Q Q P V            5                    3
F S Q Q Q Q S P F            5                    3
P F P Q P Q Q P F            5                    3
P Q P Q Q P F P Q            4                    2
P F P Q P Q Q P F            5                    3
P Q P Q Q P F P Q            4                    2
P Y P E Q Q E P F            5                    3
P Y P E Q Q Q P F            5                    3
Q G S F Q P S Q Q            7                    4
Q Q P Q Q P F P Q            7                    4
Q Q P Q Q P Y P Q            7                    4
Q G Y Y P T S P Q            7                    4
Total                        132                  73
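As an arithmetic check on Table 1, the per-epitope counts can be tallied in Python. The two lists below are transcribed row by row from the table; the variable names are illustrative, and the sketch checks only the column totals, not how each count was derived from the motif positions.

```python
# Potential trypsin cleavage sites per epitope, in Table 1 row order.
all_positions = [5, 5, 4, 4, 8, 4, 4, 5, 5, 4, 4, 5, 5,
                 4, 5, 5, 5, 4, 5, 4, 5, 5, 7, 7, 7, 7]

# Same rows with positions 1, 4, 6, 7, and 9 excluded.
restricted_positions = [3, 3, 2, 2, 3, 2, 2, 3, 3, 2, 2, 3, 3,
                        2, 3, 3, 3, 2, 3, 2, 3, 3, 4, 4, 4, 4]

total_all = sum(all_positions)
total_restricted = sum(restricted_positions)

print(len(all_positions), total_all, total_restricted)  # 26 132 73
```

The tallies agree with the totals reported in the last row of the table.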

Results and Discussion

The current EFSA guidance for celiac risk, which allows imperfect matches between newly expressed proteins in GE crops and known celiac-causing epitopes, yields many false positives[7], so additional exclusion criteria are critical to practical implementation as part of risk assessment. This high false-positive rate is especially problematic because the subsequent assessment of risk depends on modeling protocols (predicting binding to HLA-DQ2 or HLA-DQ8 molecules and recognition by T cells) that are not yet developed or validated as fit for purpose (https://etendering.ted.europa.eu/cft/cft-display.html?cftId=4505). The 26 celiac epitopes reported by Sollid et al. contain 132 positions where arginine or lysine could be substituted without a blocking proline on the carboxyl-terminal side of the substitution from the intact celiac motif (potential trypsin cleavage sites). It is noteworthy that no blocking prolines are found in any celiac motif. Based on trypsin cleavage sites occurring on average once per 14 amino acids, the random chance that none of the 26 celiac epitopes contains a trypsin cleavage site is <1 in 10,000 (0.000056). While this probability calculation is based on the 26 unique 9-mer epitopes reported by Sollid et al., 31 epitopes are listed in that publication because five 9-mer epitopes are duplicated across different source organisms. These duplicate epitopes could be treated as additional opportunities for the occurrence of trypsin cleavage sites, further diminishing the chance that trypsin cleavage sites are randomly excluded from true celiac-disease-causing epitopes. The celiac restricted epitopes are also most often repeated multiple times within each celiac-causing protein, creating many more opportunities for a mutation that could introduce a trypsin site. 
Thus, the probability reported here is highly conservative in that the random chance of finding no bona fide celiac restricted epitopes containing an in-silico predicted trypsin cleavage site is even more remote when these factors are considered. Furthermore, two exceptions to proline blocking trypsin cleavage (WKP and MRP) were excluded from this analysis as they have a negligible effect on the probability calculations. It is also noteworthy that only 1 of the 26 unique restricted epitopes known to cause celiac disease contains the lysine or arginine required for trypsin cleavage, and this potential cleavage site is blocked by proline on the carboxyl-terminal side (F R P Q Q P Y P Q). Trypsin is known to sometimes cleave peptides before proline, even though such bonds are not classically considered trypsin cleavage sites[8]; however, as indicated earlier, the binding properties of in-silico predicted trypsin cleavage sites to the biochemical molecules required to induce celiac symptoms (rather than digestion by trypsin) may explain the absence of in-silico predicted trypsin cleavage sites in celiac peptides. Finally, even when the currently allowed exclusion of positively charged arginine and lysine in positions 1, 4, 6, 7, and 9 of the 26 Sollid et al. 9-mers is considered, there are 73 remaining potential trypsin cleavage sites (Table 1), and the probability that none of these randomly contains a trypsin cleavage site is still very small (0.0045). Although potential digestion in the intestine by trypsin within the 9-mer restricted celiac-causing epitopes could preclude the presence of trypsin cleavage sites within these epitopes (especially due to the expected exposure of these epitopes within the peptides allowing for binding to the biochemical molecules that cause celiac disease), the physicochemical properties of the in-silico predicted trypsin cleavage sites might simply inhibit binding to the biochemical molecules associated with celiac disease. 
The inhibitory properties of positively charged lysine and arginine have already been implicated in reduced binding at multiple positions within the 9-mer restricted epitopes, and the added effect of a carboxyl-terminal proline may block this inhibition in a manner similar to that inhibiting trypsin-catalyzed cleavage. Whatever the mechanism, it is highly unlikely that in-silico predicted trypsin cleavage sites are absent from known celiac-causing restricted epitopes by chance. In conclusion, based on the absence of in-silico predicted trypsin cleavage sites in known celiac restricted epitopes, there is high confidence that the presence of an in-silico predicted trypsin cleavage site in a putative celiac 9-mer peptide excludes its risk as a true celiac-disease-causing sequence. This high confidence supports using the presence of an in-silico predicted trypsin cleavage site in a putative celiac-causing peptide contained within a newly expressed protein in a GE crop as a reliable exclusion criterion for celiac disease risk.
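The observation that only one listed epitope contains lysine or arginine, and that its site is blocked by proline, can be checked against the Table 1 sequences with a short Python sketch. It assumes the simplified rule used throughout (cleavage after K or R unless followed by P, ignoring the WKP and MRP exceptions); the helper name `has_predicted_site` is hypothetical.

```python
# The 26 rows of Table 1, spaces removed, in table order.
EPITOPES = [
    "PFPQPQLPY", "PYPQPQLPY", "PQPQLPYPQ", "FRPQQPYPQ", "PQQSFPQQQ",
    "IQPQQPAQL", "QQPQQPYPQ", "SQPQQQFPQ", "PQPQQQFPQ", "QQPQQPFPQ",
    "PQPQQPFCQ", "QQPFPQQPQ", "PFPQPQQPF", "PQPQQPFPW", "PFSQQQQPV",
    "FSQQQQSPF", "PFPQPQQPF", "PQPQQPFPQ", "PFPQPQQPF", "PQPQQPFPQ",
    "PYPEQQEPF", "PYPEQQQPF", "QGSFQPSQQ", "QQPQQPFPQ", "QQPQQPYPQ",
    "QGYYPTSPQ",
]


def has_predicted_site(seq: str) -> bool:
    """True if any internal K/R is not followed by P (simplified rule)."""
    return any(
        seq[i] in "KR" and seq[i + 1] != "P" for i in range(len(seq) - 1)
    )


# Epitopes containing lysine or arginine at any position.
with_kr = [e for e in EPITOPES if "K" in e or "R" in e]
# Epitopes with an unblocked predicted cleavage site.
cleavable = [e for e in EPITOPES if has_predicted_site(e)]

print(with_kr)    # ['FRPQQPYPQ']
print(cleavable)  # []
```

Under this rule, the arginine in F R P Q Q P Y P Q is followed by proline, so no listed epitope carries a predicted site.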
  6 in total

1.  Systematic and quantitative comparison of digest efficiency and specificity reveals the impact of trypsin quality on MS-based proteomics.

Authors:  Julia Maria Burkhart; Cornelia Schumbrutzki; Stefanie Wortelkamp; Albert Sickmann; René Peiman Zahedi
Journal:  J Proteomics       Date:  2011-11-30       Impact factor: 4.044

2.  Does trypsin cut before proline?

Authors:  Jesse Rodriguez; Nitin Gupta; Richard D Smith; Pavel A Pevzner
Journal:  J Proteome Res       Date:  2007-12-08       Impact factor: 4.466

3.  Q-X1-P-X2 motif search for potential celiac disease risk has poor selectivity.

Authors:  Ping Song; Nancy Podevin; Henry Mirsky; Jennifer Anderson; Bryan Delaney; Carey Mathesius; Laura Rowe; Rod A Herman
Journal:  Regul Toxicol Pharmacol       Date:  2018-09-26       Impact factor: 3.271

4.  Guidance on allergenicity assessment of genetically modified plants.

Authors:  Hanspeter Naegeli; Andrew Nicholas Birch; Josep Casacuberta; Adinda De Schrijver; Mikolaj Antoni Gralak; Philippe Guerche; Huw Jones; Barbara Manachini; Antoine Messéan; Elsa Ebbesen Nielsen; Fabien Nogué; Christophe Robaglia; Nils Rostoks; Jeremy Sweet; Christoph Tebbe; Francesco Visioli; Jean-Michel Wal; Philippe Eigenmann; Michelle Epstein; Karin Hoffmann-Sommergruber; Frits Koning; Martinus Lovik; Clare Mills; Francisco Javier Moreno; Henk van Loveren; Regina Selb; Antonio Fernandez Dumont
Journal:  EFSA J       Date:  2017-06-22

Review 5.  Celiac disease.

Authors:  E Rivera; A Assiri; S Guandalini
Journal:  Oral Dis       Date:  2013-03-18       Impact factor: 3.511

Review 6.  Nomenclature and listing of celiac disease relevant gluten T-cell epitopes restricted by HLA-DQ molecules.

Authors:  Ludvig M Sollid; Shuo-Wang Qiao; Robert P Anderson; Carmen Gianfrani; Frits Koning
Journal:  Immunogenetics       Date:  2012-02-10       Impact factor: 2.846

