
A comprehensive dataset of the extra virgin olive oil (EVOO) proteome.

Antonio Jesús Castro¹, Elena Lima-Cabello¹, Juan de Dios Alché¹.

Abstract

Proteins and peptides are minor components of vegetable oils. The presence of these compounds in virgin olive oil was first reported in 2001, but the nature of the olive oil proteome is still a puzzling question for food science researchers. In this paper, we have compiled for the first time a comprehensive proteomic dataset of olive fruit and fungal proteins that are present at low but measurable concentrations in a vegetable oil from a crop of great agronomical relevance such as olive (Olea europaea L.). Accurate mass nLC-MS data were collected in high definition direct data analysis (HD-DDA) mode using the ion mobility separation step. Protein identification was performed using the Mascot Server v2.2.07 software (Matrix Science) against an ad hoc database made of olive protein entries. Starting from this proteomic record, the impact of these proteins on olive oil stability and quality could be tested. Moreover, the effect of olive oil proteins on human health and their potential use as functional food components could also be evaluated. In addition, this dataset provides a resource for further functional comparisons across other vegetable oils, and expands the proteomic resources to non-model species, thus allowing further comparative inter-species studies. The data presented here are related to the research article of Castro et al. [1].


Keywords: Extra virgin olive oil (EVOO); Lipoxygenase; Olea europaea; Proteomics; seed storage proteins (SSP)

Year: 2021 PMID: 33665245 PMCID: PMC7900235 DOI: 10.1016/j.dib.2021.106822

Source DB: PubMed Journal: Data Brief ISSN: 2352-3409

Specifications Table

Value of the Data

This proteomic dataset provides a comprehensive list of olive fruit and fungal proteins present in extra virgin olive oil (EVOO). Starting from these proteomic data, the impact of different protein components present in EVOO on its stability and quality could be further studied. The effect of the olive oil proteins on human health and their potential use as functional food components could also be evaluated. This dataset provides a resource for further functional comparisons across other vegetable oils. Expanding the proteomic resources to non-model but agronomically relevant species will allow further comparative inter-species studies.

Data Description

A comprehensive dataset of olive fruit (i.e. pulp and seed) proteins identified in the extra virgin olive (cv. Picual) oil (EVOO) is provided in Table S1. In addition, Table S2 provides a list of fungal proteins present in EVOO, derived from the yeast-like fungus Aureobasidium pullulans, which is ubiquitous in the phyllosphere and carposphere of the olive tree [4]. Tables S1 and S2 are available with this article as Excel (.xlsx) files and each list of identified proteins is displayed in drop-down table format. Proteins and the organisms in which they were identified were annotated according to the NCBI nr database [5]. The main tables also report a number of parameters for each protein identified, including: a) the gel slice in which the protein was isolated and identified, b) the number of amino acid residues, c) the theoretical mass and isoelectric point, d) the total percent coverage (i.e. the number of AA in all found peptides divided by the total number of AA in the entire protein sequence), e) the total number of identified proteins in the protein group of a master protein, f) the total number of peptide sequences unique to a protein group, g) the total number of distinct peptide sequences in the protein group, and h) the total number of identified peptide sequences (PSM, peptide spectrum matches) for the protein. Mascot searches were carried out against the olive genome (https://denovo.cnag.cat/olive; [6]) and transcriptome (http://reprolive.eez.csic.es/olivodb; [7]) records and the Uniprot database (https://www.uniprot.org/; [8]). For each search, the protein score (i.e. the sum of the scores of the individual peptides), the protein coverage, the number of distinct peptide sequences in the protein group and the number of identified peptide sequences are also provided.
For each protein identified, the sheet can be expanded by clicking on the [+] key located on the left margin of the Excel file, which opens the row parameters for the associated peptides, including: a) the AA sequences of identified peptides, b) the number of PSMs for the protein, c) the number of proteins in which this peptide is found, d) the number of protein groups in which this peptide is found, e) fixed and variable modifications of peptides, f) the number of missed cleavage positions, g) the monoisotopic mass of the peptide, h) the accession number of the protein in the corresponding database, i) the top level confidence (only the high-confidence data were considered) achieved with the peptide sequence, j) the score for the peptide after the Mascot search in the corresponding database, and k) the experimental m/z value for the peptide after the Mascot search in the corresponding database. MS/MS spectra corresponding to hand-validated peptides are also included as Figs. S1–S23 embedded in a single PDF file. The mass spectrometry raw data were deposited to the ProteomeXchange Consortium via the PRIDE [2], [3] partner repository with the dataset identifier PXD019894 (www.ebi.ac.uk/pride/archive/projects/PXD019894). Alternatively, raw data files can also be downloaded from ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2020/09/PXD019894.
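The coverage and protein score parameters described above can be illustrated with a minimal Python sketch. It is not the code used to build Tables S1 and S2. The function names and the example sequence are hypothetical, and the sketch assumes that each residue is counted once even when peptides overlap, which the definition above does not state explicitly.

```python
def sequence_coverage(protein_seq: str, peptides: list[str]) -> float:
    """Percent of residues in protein_seq covered by at least one peptide.

    Assumption: overlapping peptides are not double-counted.
    """
    if not protein_seq:
        return 0.0
    covered = [False] * len(protein_seq)
    for pep in peptides:
        # Mark every occurrence of the peptide in the protein sequence.
        start = protein_seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein_seq.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein_seq)


def protein_score(peptide_scores: list[float]) -> float:
    """Protein score as the sum of the individual peptide scores."""
    return sum(peptide_scores)


# Hypothetical 10-residue sequence with two overlapping peptides.
coverage = sequence_coverage("MKTAYIAKQR", ["MKTA", "TAYI"])
score = protein_score([30.5, 42.0])
```

In this toy case the two peptides cover residues 1 to 6 of 10, so the coverage is 60%.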

Experimental Design, Materials and Methods

Materials

Freshly bottled extra virgin olive (cv. Picual) oil (2018 harvest) of the Protected Designation of Origin (PDO) “Montes de Granada” was purchased from a local market and stored in the dark at 15–18 °C until use. All chemicals used had purity greater than 99%.

In situ digestion of extra virgin olive oil proteins

Extra virgin olive oil proteins were extracted, electrophoresed on 1-D polyacrylamide gels and stained as described in [1]. The gel lane containing the EVOO proteins was systematically cut from the top (slice S1) to the bottom (slice S10) into slices of ∼1 cm width each (see Fig. 1A in ref. [1]). In situ digestion of proteins was performed using the MassPREP Station (Micromass, Manchester, UK). Gel slices were washed three times in a mixture containing 25 mM NH4HCO3: acetonitrile (ACN) (1:1, v/v). The Cys-residues were reduced by 50 µL of 10 mM dithiothreitol (DTT) at 57 °C and alkylated by 50 µL of 55 mM iodoacetamide at room temperature. After gel dehydration with ACN, proteins were digested overnight at room temperature in 15 µL of a solution containing 12.5 ng µL−1 of a modified porcine trypsin (Promega, Madison, WI, USA) prepared in 25 mM NH4HCO3. Finally, a double extraction was performed, first with 60% (v/v) ACN in 5% (v/v) formic acid, and subsequently with 100% (v/v) ACN.
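As a quick check of the reagent quantities implied by the volumes and concentrations above, the following Python sketch converts them to absolute amounts per gel slice. The helper name is hypothetical, and the sketch assumes the stated volumes are the amounts added to each slice.

```python
def amount(volume_ul: float, concentration: float) -> float:
    """Volume (µL) times concentration gives the amount delivered.

    With ng/µL the result is in ng; with mM (nmol/µL) it is in nmol.
    """
    return volume_ul * concentration


trypsin_ng = amount(15, 12.5)      # 15 µL of 12.5 ng/µL trypsin
dtt_nmol = amount(50, 10)          # 50 µL of 10 mM DTT
iodoacetamide_nmol = amount(50, 55)  # 50 µL of 55 mM iodoacetamide

print(f"trypsin: {trypsin_ng} ng")
print(f"DTT: {dtt_nmol} nmol, iodoacetamide: {iodoacetamide_nmol} nmol")
```

Under these assumptions each slice receives 187.5 ng of trypsin, and the alkylating agent is in 5.5-fold molar excess over DTT.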

LC-MS data acquisition and analysis

Nano-liquid chromatography (nLC) of the resulting tryptic peptides was performed using a nanoACQUITY UPLC© system (Waters, Milford, MA, USA), equipped with a nanoACQUITY UPLC© Peptide BEH C18 (200 mm length × 75 µm ID, 1.7 µm particle size) nano-column (catalog no. 186007483, Waters), coupled with a nanoACQUITY UPLC C18 trap column (20 mm length × 180 µm ID, 5 µm particle size) (catalog no. 186007496, Waters). About 0.5 µg was loaded per run. The solvent system consisted of 0.1% (v/v) formic acid in water (mobile phase A) and 0.1% (v/v) formic acid in acetonitrile (ACN) (mobile phase B). Elution was performed at a flow rate of 300 nL min−1, using a linear gradient (5 to 60%) of mobile phase B over a chromatographic ramp of 120 min. A lock mass compound [Glu1]-Fibrinopeptide B (100 fmol µL−1) (catalog no. 196007091–2, Waters) was delivered by an auxiliary pump of the LC system at 500 nL min−1 to the reference sprayer of the NanoLockSpray Exact Mass Ionization source (Waters Corp.) of the mass spectrometer. A Synapt G2Si ESI Q-Mobility-TOF spectrometer (Waters Corp.) equipped with an ion mobility chamber (T-Wave-IMS) was used for high definition data acquisition analysis. The mass spectrometer was operated in positive ESI mode with the following settings: the source temperature was set to 120 °C and the dry gas flow to 3.7 ml min−1. The nano-electrospray voltage was optimized to 1.3 kV. Data were post-acquisition lock mass corrected using the doubly charged monoisotopic ion of [Glu1]-Fibrinopeptide B. Accurate LC-MS data were collected in High Definition Direct Data Analysis (HD-DDA) mode, which enhances signal intensities using the ion mobility separation step [9].
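The linear gradient described above can be written as a simple function of time. The following Python sketch is only an illustration: the function name is hypothetical, and holding the composition constant outside the 0 to 120 min ramp is an assumption, since equilibration and wash steps are not described here.

```python
def percent_b(t_min: float, start: float = 5.0, end: float = 60.0,
              ramp_min: float = 120.0) -> float:
    """Percent mobile phase B at time t_min for a linear ramp.

    Assumption: composition is clamped to start/end outside the ramp.
    """
    if t_min <= 0:
        return start
    if t_min >= ramp_min:
        return end
    return start + (end - start) * t_min / ramp_min


# Composition at the midpoint of the ramp.
midpoint = percent_b(60)

# Total eluent volume over the ramp at 300 nL/min, in µL.
ramp_volume_ul = 300 * 120 / 1000
```

With the stated settings the slope is 55/120, or about 0.46% B per minute, and roughly 36 µL of eluent passes through the column during the ramp.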

Database search and protein identification

Protein identification was performed using the Mascot Server v2.2.07 software (Matrix Science, London, UK) against an ad hoc-generated database composed of protein entries retrieved from the olive genome [6] and transcriptome [7] records, as well as known contaminant proteins such as human keratins and trypsin, extracted from the NCBI nr protein database [5]. Identifications were accepted with a peptide mass tolerance of less than 15 ppm and a fragment mass tolerance of 0.2 Da. Up to 3 missed cleavage points were allowed and the following parameters were taken into account: carbamidomethylation of Cys-residues (+57 Da) as fixed modification, oxidation of Met (+16 Da) as variable modification, and peptide charges of +2 and +3. In addition, searches were performed without any molecular weight (Mr) or isoelectric point (pI) restrictions. To calculate the false discovery rate (FDR) [10], the search was performed using the “decoy” option in Mascot (Matrix Science). Peptide identifications extracted from Mascot result files were validated at a final peptide FDR of 1%. Peptide matches were also manually validated if their score was close to the Mascot homology threshold for a given Mascot p value.
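The two acceptance criteria above, a precursor mass error below 15 ppm and a 1% FDR estimated from decoy matches, can be sketched in Python as follows. This is not Mascot's internal procedure. The function names and the toy scores are hypothetical, and the FDR is assumed here to be the simple ratio of decoy to target matches above a score cutoff.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 15.0) -> bool:
    """True if the absolute mass error is below tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


def score_threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.01):
    """Lowest score cutoff at which decoy/target matches <= max_fdr.

    Assumption: FDR is estimated as (decoy hits) / (target hits)
    among matches scoring at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    for cutoff in sorted(set(target_scores)):
        n_target = sum(s >= cutoff for s in target_scores)
        n_decoy = sum(s >= cutoff for s in decoy_scores)
        if n_target and n_decoy / n_target <= max_fdr:
            return cutoff
    return None


# Toy data: five target matches and two low-scoring decoy matches.
cutoff = score_threshold_for_fdr([10, 20, 30, 40, 50], [12, 15])
```

In the toy data, a cutoff of 10 would admit two decoys against five targets, while a cutoff of 20 admits none, so 20 is returned. Other FDR formulations exist, for example for searches against a concatenated target-decoy database, and would change the arithmetic.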

CRediT Author Statement

Antonio Jesús Castro: Conceptualization, Investigation, Data curation, Visualization, Writing - Original draft preparation, Writing - Reviewing and Editing; Elena Lima-Cabello: Investigation, Validation, Writing - Reviewing and Editing; Juan de Dios Alché: Conceptualization, Funding Acquisition, Supervision, Writing - Reviewing and Editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Subject	Agricultural and Biological Sciences
Specific subject area	Food Science, biochemical composition of vegetable oils
Type of data	Drop-down tables, figures
How data were acquired	Data were acquired by nanoLC-MS using a nanoAcquity UPLC system (Waters Corp.) and a Synapt G2Si ESI Q-Mobility-TOF spectrometer (Waters Corp.) equipped with an ion mobility chamber (T-Wave-IMS)
Data format	Excel files with data analysis output, figures embedded in a single PDF file. Raw data are also available on a public data repository (for more information, please see the Data accessibility section below)
Parameters for data collection	Extra virgin olive (cv. Picual) oil (2018 harvest) of the Protected Designation of Origin (PDO) “Montes de Granada” was used as material for protein extraction
Description of data collection	Proteins were extracted from five liters of EVOO and subjected to SDS-PAGE separation prior to nLC-MS analysis. Accurate mass nLC-MS data were collected in high definition direct data analysis (HD-DDA) mode using the ion mobility separation step. Protein identification was performed using the Mascot Server v2.2.07 software (Matrix Science) against an in-house olive protein database
Data source location	Estación Experimental del Zaidín (CSIC), Granada, Spain
Data accessibility	The mass spectrometry raw data have been deposited to the ProteomeXchange Consortium via the PRIDE [2], [3] partner repository with the dataset identifier PXD019894 (http://www.ebi.ac.uk/pride/archive/projects/PXD019894). Analyzed data are with this article.
Related research article	A.J. Castro, E. Lima-Cabello, J.D. Alché, Identification of seed storage proteins as the major constituents of the extra virgin olive oil proteome, Food Chemistry: X 7 (2020) 100099 http://doi.org/10.1016/j.fochx.2020.100099

9 in total

1. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry.

Authors: Joshua E Elias; Steven P Gygi
Journal: Nat Methods Date: 2007-03 Impact factor: 28.547

2. Ion mobility tandem mass spectrometry enhances performance of bottom-up proteomics.

Authors: Dominic Helm; Johannes P C Vissers; Christopher J Hughes; Hannes Hahne; Benjamin Ruprecht; Fiona Pachl; Arkadiusz Grzyb; Keith Richardson; Jason Wildgoose; Stefan K Maier; Harald Marx; Mathias Wilhelm; Isabelle Becher; Simone Lemeer; Marcus Bantscheff; James I Langridge; Bernhard Kuster
Journal: Mol Cell Proteomics Date: 2014-08-08 Impact factor: 5.911

3. Metabarcoding Analysis of Fungal Diversity in the Phyllosphere and Carposphere of Olive (Olea europaea).

Authors: Ahmed Abdelfattah; Maria Giulia Li Destri Nicosia; Santa Olga Cacciola; Samir Droby; Leonardo Schena
Journal: PLoS One Date: 2015-07-01 Impact factor: 3.240

4. Identification of seed storage proteins as the major constituents of the extra virgin olive oil proteome.

Authors: Antonio Jesús Castro; Elena Lima-Cabello; Juan de Dios Alché
Journal: Food Chem X Date: 2020-06-27

5. The PRIDE database and related tools and resources in 2019: improving support for quantification data.

Authors: Yasset Perez-Riverol; Attila Csordas; Jingwen Bai; Manuel Bernal-Llinares; Suresh Hewapathirana; Deepti J Kundu; Avinash Inuganti; Johannes Griss; Gerhard Mayer; Martin Eisenacher; Enrique Pérez; Julian Uszkoreit; Julianus Pfeuffer; Timo Sachsenberg; Sule Yilmaz; Shivani Tiwary; Jürgen Cox; Enrique Audain; Mathias Walzer; Andrew F Jarnuczak; Tobias Ternent; Alvis Brazma; Juan Antonio Vizcaíno
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

6. UniProt: a worldwide hub of protein knowledge.

Authors:
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

7. The ProteomeXchange consortium in 2020: enabling 'big data' approaches in proteomics.

Authors: Eric W Deutsch; Nuno Bandeira; Vagisha Sharma; Yasset Perez-Riverol; Jeremy J Carver; Deepti J Kundu; David García-Seisdedos; Andrew F Jarnuczak; Suresh Hewapathirana; Benjamin S Pullman; Julie Wertz; Zhi Sun; Shin Kawano; Shujiro Okuda; Yu Watanabe; Henning Hermjakob; Brendan MacLean; Michael J MacCoss; Yunping Zhu; Yasushi Ishihama; Juan A Vizcaíno
Journal: Nucleic Acids Res Date: 2020-01-08 Impact factor: 16.971

8. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome.

Authors: Rosario Carmona; Adoración Zafra; Pedro Seoane; Antonio J Castro; Darío Guerrero-Fernández; Trinidad Castillo-Castillo; Ana Medina-García; Francisco M Cánovas; José F Aldana-Montes; Ismael Navas-Delgado; Juan de Dios Alché; M Gonzalo Claros
Journal: Front Plant Sci Date: 2015-08-11 Impact factor: 5.753

9. Genome sequence of the olive tree, Olea europaea.

Authors: Fernando Cruz; Irene Julca; Jèssica Gómez-Garrido; Damian Loska; Marina Marcet-Houben; Emilio Cano; Beatriz Galán; Leonor Frias; Paolo Ribeca; Sophia Derdak; Marta Gut; Manuel Sánchez-Fernández; Jose Luis García; Ivo G Gut; Pablo Vargas; Tyler S Alioto; Toni Gabaldón
Journal: Gigascience Date: 2016-06-27 Impact factor: 6.524
