Piotr Minkiewicz, Małgorzata Darewicz, Anna Iwaniak, Jolanta Sokołowska, Piotr Starowicz, Justyna Bucholska, Monika Hrynkiewicz.
Abstract
A common subsequence is a fragment of the amino acid chain that occurs in more than one protein. Common subsequences may be an object of interest for food scientists as biologically active peptides, epitopes, and/or protein markers that are used in comparative proteomics. An individual bioactive fragment, in particular the shortest fragment containing two or three amino acid residues, may occur in many protein sequences. An individual linear epitope may also be present in multiple sequences of precursor proteins. Although recent recommendations for prediction of allergenicity and cross-reactivity include not only sequence identity, but also similarities in secondary and tertiary structures surrounding the common fragment, local sequence identity may be used to screen protein sequence databases for potential allergens in silico. The main weakness of the screening process is that it overlooks allergens and cross-reactivity cases without identical fragments corresponding to linear epitopes. A single peptide may also serve as a marker of a group of allergens that belong to the same family and, possibly, reveal cross-reactivity. This review article discusses the benefits for food scientists that follow from the common subsequences concept.
Keywords: allergens; biologically active peptides; biomarkers; databases; epitopes
Year: 2015 PMID: 26340620 PMCID: PMC4613229 DOI: 10.3390/ijms160920748
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Location of biologically active fragments in the sequence of yeast (Saccharomyces cerevisiae, strain ATCC 204508/S288c) protease B inhibitor 2 (Accession No. P0CT04 in the UniProt Knowledgebase [21,22]). (1) angiotensin I-converting enzyme (ACE) inhibitors; (2) glucose uptake stimulators; (3) antioxidant fragments; (4) dipeptidyl peptidase IV inhibitors; (5) calmodulin-dependent phosphodiesterase 1 inhibitors; (6) renin inhibitors; (7) fragments with other activities (see Table 1). Bioactive fragments were found with the use of the BIOPEP search engine [23,24], where the protein sequence was the query. The search was performed in May 2014.
Table 1. Reference data for biologically active fragments of yeast (Saccharomyces cerevisiae, strain ATCC 204508/S288c) protease B inhibitor 2, indicated in Figure 1.
| ID a | Sequence b | Activity | Primary Resource c | Reference |
|---|---|---|---|---|
| 3379 | AKK | ACE inhibitor | Muscle of fish of the genus Sardina d | [ |
| 3532 | GY | ACE inhibitor | Synthetic | [ |
| 7587 | VP | ACE inhibitor | Synthetic | [ |
| 7600 | AG | ACE inhibitor | Synthetic | [ |
| 7602 | HL | ACE inhibitor | Synthetic | [ |
| 7604 | KG | ACE inhibitor | Synthetic | [ |
| 7607 | GS | ACE inhibitor | Synthetic | [ |
| 7616 | GG | ACE inhibitor | Synthetic | [ |
| 7623 | EA | ACE inhibitor | Synthetic | [ |
| 7654 | NKL | ACE inhibitor | Wakame ( | [ |
| 7683 | NF | ACE inhibitor | Garlic ( | [ |
| 7692 | KF | ACE inhibitor | Garlic ( | [ |
| 7693 | KL | ACE inhibitor | Wakame ( | [ |
| 7698 | NK | ACE inhibitor | Wakame ( | [ |
| 7827 | IE | ACE inhibitor | Bovine ( | [ |
| 7828 | EV | ACE inhibitor | Bovine ( | [ |
| 7829 | VE | ACE inhibitor | Bovine ( | [ |
| 7832 | LN | ACE inhibitor | Bovine ( | [ |
| 7840 | EK | ACE inhibitor | Bovine ( | [ |
| 7841 | KE | ACE inhibitor | Bovine ( | [ |
| 8320 | VL | Glucose uptake stimulating | Bovine ( | [ |
| 8322 | IV | Glucose uptake stimulating | Bovine ( | [ |
| 8325 | II | Glucose uptake stimulating | Bovine ( | [ |
| 8329 | EE | Vasoactive substance release stimulating | Soybean ( | [ |
| 3305 | LH | Antioxidant | Soybean ( | [ |
| 3317 | HL | Antioxidant | Soybean ( | [ |
| 3319 | HH | Antioxidant | Soybean ( | [ |
| 7794 | VHH | Antioxidant | Chicken ( | [ |
| 7995 | LHL | Antioxidant | Synthetic | [ |
| 8130 | EAK | Antioxidant | Bonito ( | [ |
| 8217 | LK | Antioxidant | Chicken ( | [ |
| 3751 | KK | Bacterial permease ligand | Synthetic | [ |
| 3181 | VP | Dipeptidyl peptidase IV inhibitor | Rat ( | [ |
| 3184 | HA | Dipeptidyl peptidase IV inhibitor | Rat ( | [ |
| 8249 | KF | CaMPDE inhibitor | Pea ( | [ |
| 8250 | EF | CaMPDE inhibitor | Pea ( | [ |
| 8248 | KF | Renin inhibitor | Pea ( | [ |
| 8251 | EF | Renin inhibitor | Pea ( | [ |
a ID number in the BIOPEP database; b Sequence given in a single-letter code; c Source from which the peptide was isolated for the first time; d Organism used as a food resource. Abbreviations used in Table 1: ACE—angiotensin I-converting enzyme; CaMPDE—calmodulin-dependent phosphodiesterase 1.
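The search summarized in Figure 1 and Table 1 amounts to locating every occurrence of each known bioactive fragment in a query protein sequence. The following is a minimal sketch of that idea, not the BIOPEP search engine itself. It assumes a small in-memory subset of Table 1 and a made-up toy sequence (not the real P0CT04 sequence); the names `FRAGMENTS`, `find_fragments`, and `TOY_SEQUENCE` are hypothetical.

```python
from collections import defaultdict

# Subset of Table 1: (BIOPEP ID, fragment, activity).
FRAGMENTS = [
    (3379, "AKK", "ACE inhibitor"),
    (7692, "KF", "ACE inhibitor"),
    (8249, "KF", "CaMPDE inhibitor"),
    (8248, "KF", "Renin inhibitor"),
    (3319, "HH", "Antioxidant"),
]


def find_fragments(sequence, fragments):
    """Map each activity to a sorted list of (1-based position, fragment) hits."""
    hits = defaultdict(list)
    for _biopep_id, fragment, activity in fragments:
        start = sequence.find(fragment)
        while start != -1:
            hits[activity].append((start + 1, fragment))
            # Advance by one residue so overlapping occurrences are also found.
            start = sequence.find(fragment, start + 1)
    return {activity: sorted(found) for activity, found in hits.items()}


# Toy sequence for illustration only (an assumption, not a real protein).
TOY_SEQUENCE = "MAKKFHHHL"
result = find_fragments(TOY_SEQUENCE, FRAGMENTS)
for activity, found in result.items():
    print(activity, found)
```

In this toy case the dipeptide KF is reported under three activities, which mirrors how a single short fragment in Table 1 can carry several database entries.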
Table 2. Examples of protocols involving the search for shorter fragments in sequences of proteins or peptides relevant for food and/or nutrition sciences.
| Database Search Application | Reference |
|---|---|
| Location of short, bioactive fragments in sequences of peptides released during hydrolysis of bovine and trout meat proteins in the porcine digestive tract. Peptides used as query sequences were identified by mass spectrometry. | [ |
| Location of bioactive fragments in sequences of rapeseed proteins. Protein sequences from UniProt were used as queries. | [ |
| Location of bioactive fragments in sequences of bovine meat proteins. Protein sequences from UniProt were used as queries. | [ |
| Location of short, bioactive fragments in sequences of peptides released during hydrolysis of fish sarcoplasmic proteins. Peptides used as query sequences were identified by mass spectrometry. | [ |
| Location of bioactive fragments in sequences of cereal proteins. Protein sequences from UniProt were used as queries. | [ |
| The BIOPEP database was used to determine the profiles of potential biological activity of salmon proteins. Some of the predicted peptides were identified in protein hydrolysates by liquid chromatography and mass spectrometry. | [ |
| Location of bioactive fragments in sequences of proteins from the human digestive tract, followed by proteolysis simulation by digestive proteolytic enzymes. Protein sequences from UniProt were used as queries. | [ |
| Location of bioactive fragments in sequences of amaranthus proteins. Protein sequences from UniProt were used as queries. | [ |
Table 3. Proteins containing the fragment PANLPWGSSNV with ACE inhibitory activity [66] (ID 49468 in the PepBank database [67,68]). The UniProt Knowledgebase [21,22] was screened with the BLAST program [69,70] with the use of the above sequence as a query and screening parameters described by Minkiewicz et al. [71]. The search was performed in May 2014.
| No | Protein Name | Entry Name in UniProtKB | Organism a |
|---|---|---|---|
| 1. | Uncharacterized protein | TR:W4ZV89_WHEAT | |
| 2. | Glyceraldehyde-3-phosphate dehydrogenase | SP:G3P3_YEAST | |
| 3. | Glyceraldehyde-3-phosphate dehydrogenase | TR:A6ZUK2_YEAS7 | |
| 4. | Glyceraldehyde-3-phosphate dehydrogenase | TR:B3LI45_YEAS1 | |
| 5. | Glyceraldehyde-3-phosphate dehydrogenase | TR:B5VJD4_YEAS6 | |
| 6. | Glyceraldehyde-3-phosphate dehydrogenase | TR:C8Z985_YEAS8 | |
| 7. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7KD02_YEASA | |
| 8. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7KP33_YEASL | |
| 9. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7LUX3_YEASV | |
| 10. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7NI37_YEASO | |
| 11. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7Q4A2_YEASB | |
| 12. | Glyceraldehyde-3-phosphate dehydrogenase | TR:E7QF80_YEASZ | |
| 13. | Glyceraldehyde-3-phosphate dehydrogenase | TR:G2WES0_YEASK | |
| 14. | Tdh3p | TR:H0GGT7_9SACH | |
| 15. | Uncharacterized protein | TR:J7S7S3_KAZNA | |
| 16. | Tdh3p | TR:N1P2H7_YEASC | |
| 17. | Tdh3p | TR:W7PUI3_YEASX | |
| 18. | Tdh3p | TR:W7RBG4_YEASX |
a Defined by the Latin name and NCBI taxonomic identifier [72,73] (in parentheses).
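The screening reported in the table above used BLAST against the UniProt Knowledgebase. As a simplified stand-in, the sketch below performs a plain exact-substring scan over an in-memory collection of sequences. It is not BLAST and does not reproduce the cited screening parameters; the entry names and sequences in `toy_database` are invented for illustration, and `proteins_with_fragment` is a hypothetical helper.

```python
FRAGMENT = "PANLPWGSSNV"  # ACE-inhibitory fragment used as the query above


def proteins_with_fragment(database, fragment):
    """Return the sorted entry names whose sequence contains the fragment exactly."""
    return sorted(name for name, seq in database.items() if fragment in seq)


# Hypothetical entries with made-up sequences; these are not UniProt records.
toy_database = {
    "TOY1_EXAMPLE": "MKTPANLPWGSSNVAA",
    "TOY2_EXAMPLE": "MKTPANLPWGASNVAA",  # one substitution, so no exact match
    "TOY3_EXAMPLE": "PANLPWGSSNV",
}

matches = proteins_with_fragment(toy_database, FRAGMENT)
print(matches)
```

An exact scan of this kind shares the weakness noted in the abstract: a sequence differing by a single residue from the query, such as the second toy entry, is overlooked.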
Figure 2. Distribution of proteins containing at least one of the four IgE-binding epitopes of ω5-gliadin [102] across protein families. (a) Distribution based on the number of proteins containing epitopes in the family; (b) percentage content of proteins with epitopes in the family. Panel (b) shows the data for families containing at least two proteins with epitopes.
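The two quantities plotted in Figure 2 can be derived from a per-protein annotation of family membership and epitope presence. The sketch below is one way to compute them under that assumption; the family names and records are invented, and `family_distribution` is a hypothetical helper, not the procedure used to prepare the figure.

```python
from collections import Counter


def family_distribution(proteins, min_with_epitope=2):
    """Summarize epitope occurrence per protein family.

    proteins: list of (family, has_epitope) pairs.
    Returns (counts, percentages): counts maps each family to the number of
    its proteins that contain an epitope; percentages gives the share of such
    proteins, restricted to families with at least `min_with_epitope` of them.
    """
    totals = Counter(family for family, _ in proteins)
    with_epitope = Counter(family for family, has in proteins if has)
    percentages = {
        family: 100.0 * n / totals[family]
        for family, n in with_epitope.items()
        if n >= min_with_epitope
    }
    return dict(with_epitope), percentages


# Invented records for illustration only.
toy_proteins = [
    ("family_A", True), ("family_A", True),
    ("family_A", False), ("family_A", False),
    ("family_B", True), ("family_B", False),
]
counts, percentages = family_distribution(toy_proteins)
print(counts, percentages)
```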
Table 4. Proteins containing the fragments FFVAPFPEVFGK and FESNFNTQATNR, used as markers of αs1-casein from milk and lysozyme from eggs, respectively [126]. The UniProt Knowledgebase [21,22] was screened with the BLAST program [69,70] with the use of each of the above sequences as a query and screening parameters described by Minkiewicz et al. [71]. The search was performed in April 2015.
| No | Entry Name in UniProtKB | Allergome Annotation | Organism a |
|---|---|---|---|
| Peptide (R)FFVAPFPEVFGK b—marker of αs1-casein | |||
| 1. | CASA1_BOVIN | Bos d 9.0101; Code 10197 | |
| 2. | CASA1_BUBBU | Bub b 8; Code 1259 | |
| 3. | G3C8Y4_BUBBU | ||
| 4. | B5B3R8_BOVIN | Bos d 9; Code 2734 | |
| 5. | L8I5S0_9CETA | ||
| 6. | G3C8Y5_BUBBU | ||
| 7. | Q4F6X6_BUBBU | ||
| Peptide (K)FESNFNTQATNR c—marker of lysozyme C | |||
| 1. | LYSC_CHICK | Gal d 4.0101; Code 3294 | |
| 2. | LYSC_COTJA | ||
| 3. | B8YK77_GALLA | Gal la 4; Code 9143 | |
| 4. | B8YK75_GALSO | Gal so 4; Code 9144 | |
| 5. | B8YK79_CHICK | Gal d 4; Code 362 | |
| 6. | B8YJP1_CHICK | Gal d 4; Code 362 | |
| 7. | B8YJN9_CHICK | Gal d 4; Code 362 | |
| 8. | B8YJT7_CHICK | Gal d 4; Code 362 | |
a Defined by the Latin name and NCBI taxonomic identifier [72,73] (in parentheses); b Fragment preceded by arginine residue in the sequences of all proteins annotated in the Table. The preceding residue (in parentheses) was included in the query sequence; c Fragment preceded by lysine residue in the sequences of all proteins annotated in the Table. The preceding residue (in parentheses) was included in the query sequence.
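Footnotes b and c state that the residue preceding each marker peptide was included in the query. A minimal sketch of that check follows: it reports a marker only where the expected residue sits immediately before it. The toy sequences are invented and are not the proteins listed in the table; `MARKERS` and `marker_positions` are hypothetical names.

```python
# Marker peptide -> residue expected immediately before it (footnotes b and c).
MARKERS = {
    "FFVAPFPEVFGK": "R",  # marker of alpha-s1-casein, preceded by arginine
    "FESNFNTQATNR": "K",  # marker of lysozyme C, preceded by lysine
}


def marker_positions(sequence, marker, preceding):
    """Return 1-based start positions of `marker` where the residue directly
    before it equals `preceding`."""
    query = preceding + marker
    positions = []
    start = sequence.find(query)
    while start != -1:
        # +1 converts to 1-based numbering, +1 skips the preceding residue.
        positions.append(start + 2)
        start = sequence.find(query, start + 1)
    return positions


# Invented sequences: the second copy of the casein marker follows alanine,
# so it is not reported.
toy_casein = "MKRFFVAPFPEVFGKEKAAFFVAPFPEVFGK"
toy_lysozyme = "AKFESNFNTQATNRGG"

casein_hits = marker_positions(toy_casein, "FFVAPFPEVFGK", MARKERS["FFVAPFPEVFGK"])
lysozyme_hits = marker_positions(toy_lysozyme, "FESNFNTQATNR", MARKERS["FESNFNTQATNR"])
print(casein_hits, lysozyme_hits)
```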
Table 5. Selected applications of mass spectrometry for the identification of food peptides.
| Aim of the Experiment | Mass Spectrometry Technique | Separation Method | Reference |
|---|---|---|---|
| Identification of Angiotensin I-converting enzyme (ACE) inhibitory peptides released during simulated gastrointestinal digestion of salmon ( | ESI-IT-MS/MS, SRM | RP-HPLC, low TFA concentration in mobile phase | [ |
| Detection and quantitative determination of peptides that are markers of bovine ( | ESI-MS/MS, SRM | RP-HPLC | [ |
| Detection and quantitative determination of peptides that are markers of mustard allergen Sin a 1 in foods | ESI-MS/MS, SRM | RP-HPLC | [ |
| Identification of peptides from peanut ( | nano-ESI Q-TOF MS/MS | capillary RP-HPLC | [ |
| Identification of peptides from soybean ( | MALDI-TOF and MALDI-TOF-TOF | RP-HPLC | [ |
Abbreviations used in Table 5: ESI: electrospray ionization; IT: ion trap; MALDI: Matrix-Assisted Laser Desorption Ionization; MS: mass spectrometry; MS/MS: tandem mass spectrometry; Nano-ESI: nanoelectrospray; Q-TOF: quadrupole-time-of-flight; RP-HPLC: reversed-phase high-performance liquid chromatography; SRM: selected reaction monitoring; TFA: trifluoroacetic acid; TOF: time of flight.