
TAILS N-terminomic and proteomic datasets of healthy human dental pulp.

Ulrich Eckhard¹, Giada Marino¹, Simon R Abbey¹, Ian Matthew², Christopher M Overall³.

Abstract

The data described here provide an in-depth proteomic assessment of the human dental pulp proteome and N-terminome (Eckhard et al., 2015) [1]. A total of 9 human dental pulps were processed and analyzed by the positional proteomics technique TAILS (Terminal Amine Isotopic Labeling of Substrates) N-terminomics. In total, 38 liquid chromatography tandem mass spectrometry (LC-MS/MS) datasets were collected and analyzed using four database search engines in combination with statistical downstream evaluation, yielding by far the largest proteomic and N-terminomic dataset of any dental tissue to date. The raw mass spectrometry data and the corresponding metadata have been deposited in ProteomeXchange under the PXD identifier PXD002264; Supplementary Tables described in this article are available via Mendeley Data (10.17632/555j3kk4sw.1).


Keywords: Human dental pulp; N-terminome; Proteome; TAILS N-terminomics; Tandem mass spectrometry

Year: 2015 PMID: 26587561 PMCID: PMC4625376 DOI: 10.1016/j.dib.2015.10.003

Source DB: PubMed Journal: Data Brief ISSN: 2352-3409

Value of the data

Largest compendium of human dental pulp proteome datasets.
Generated by liquid chromatography tandem mass spectrometry (LC-MS/MS), reporting >4000 proteins and >9000 protein N-termini.
Comprehensive assessment of global in vivo proteolytic processing by TAILS N-terminomics in a healthy human tissue.
Proteolytic processing represents an eminent post-translational modification which frequently alters protein function and localization. Thus, large-scale positional proteomics datasets such as this one are required to decipher pervasive proteolytic networks in health and disease.

Data

The data in the ProteomeXchange archive (PXD002264) provide a novel description of the human pulp proteome and N-terminome as seen by TAILS N-terminomics [1]. These data were analyzed using four different database search engines, namely Mascot [2], X! Tandem [3], Comet [4], and MS-GF+ [5], in combination with PeptideProphet [11], iProphet [12], and ProteinProphet [13], all compiled within the Trans Proteomic Pipeline (v4.8.0, PHILAE) [14]. A variety of other free or commercial database search engines, such as MyriMatch [15], MS Amanda [16], or OMSSA [17], and data analysis pipelines (e.g. MaxQuant [18], PeptideShaker [19], or Scaffold [20]) could be used for additional analysis, e.g. to explore additional N-terminal modifications [21], [22]. We provide a comprehensive set of >380000 peptide spectrum matches (PSMs), corresponding to >21000 unique peptides that mapped to >4000 proteins. More than 9000 protein N-termini were identified, including 5292 neo-N-termini, indicative of pervasive proteolytic processing even in healthy human tissues [1], [8], [23]. Furthermore, 316 and 762 protein N-termini were identified with Met1 intact or removed (N-terminal methionine excision), respectively, and 125, 21, and 58 represented natural N-termini after signal-, transit-, and pro-peptide removal, respectively. Together with 24 UniProt-curated internal processing sites (representing e.g. the proteolytic release of cryptic peptides or known proteolytic activation events, as in the case of complement component C3 and the release of C3a anaphylatoxin), a total of 1306 UniProt [24] and TopFIND [25] curated protein N-termini were identified.
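
The reported total of curated protein N-termini can be reproduced from the category counts given in the paragraph above. The following is only a bookkeeping sketch; the dictionary keys are illustrative labels chosen here, not field names from the deposited tables.

```python
# Category counts of curated protein N-termini, transcribed from the text.
# The key names are illustrative labels, not column names from the dataset.
curated_n_termini = {
    "Met1 intact": 316,
    "Met1 removed": 762,
    "signal peptide removed": 125,
    "transit peptide removed": 21,
    "propeptide removed": 58,
    "internal processing site": 24,
}

total_curated = sum(curated_n_termini.values())
print(f"Curated protein N-termini: {total_curated}")
```

The sum matches the 1306 UniProt and TopFIND curated N-termini stated in the text.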
For an up-to-date TAILS N-terminomics protocol, please refer to our webpage (www.clip.ubc.ca; CLIP stands for Canadian Laboratories for Innovations in Proteomics), which hosts many proteomic resources centered on proteolytic processing. These include (i) TopFIND, the knowledgebase and analysis resource for protein termini and protease processing [25], [26], [27]; (ii) WebPICS [28], for the streamlined analysis of active site specificity profiling experiments by PICS (Proteomic Identification of protease Cleavage Sites) [29], [30], [31], [32], [33]; (iii) resources such as CLIPPER or TAILS annotator, which aim to facilitate TAILS analysis [34]; (iv) information on the protease- and inhibitor-centric CLIP-CHIP™ microarray [35]; and (v) information on the new proteomic tool LysargiNase, a protease that cleaves N-terminal to lysine and arginine residues and thus complements trypsin in proteomic approaches [36], and that makes protein C-terminal peptides more amenable to mass spectrometry owing to the LysargiNase-generated N-terminal basic residue [36], [37]. The following materials and methods section will enable other investigators and laboratories to design similar experimental procedures to study human dental pulp, or any other human or animal tissue, by TAILS N-terminomics or a comparable proteomic technique. Importantly, we used only trypsin for pulp proteome digestion. Additional digestion approaches (e.g. using GluC, LysargiNase, chymotrypsin, or multiple enzymes) [37], [38], [39] may yield an even more comprehensive picture of the dental pulp proteome and N-terminome.

Materials and methods

Dental pulp collection and proteome extraction

Nine healthy dental pulps were collected within 5–10 min of prophylactic extraction of healthy teeth numbered 38 and 48. Written informed consent was obtained from the patients before surgery in accordance with a protocol approved by the University of British Columbia Clinical Research Ethics Board (UBC CREB). After extensive washing, extracted third molars (wisdom teeth) were sectioned without traumatizing the pulp tissue, and the pulp was immediately transferred into 250 μl 8 M guanidine hydrochloride, snap frozen on dry ice, and stored at −80 °C. Specimens were separately homogenized on ice using a tissue homogenizer (Ultra-Turrax; IKA Works, Inc.), and proteins were precipitated by chloroform/methanol [40]. Pellets were redissolved in 0.5 ml of 8 M guanidine hydrochloride and protein concentrations were determined by Bradford protein assay (Bio-Rad) using 1:10 dilutions of the samples (in ddH2O).

TAILS N-terminomics

Protein N-termini were enriched from human dental pulp proteomes by TAILS N-terminomics as described in detail previously [6], [7], [8], [9]. Briefly, 1.0 mg of non-fractionated sample was diluted in 4 M guanidine hydrochloride, reduced with 5 mM dithiothreitol (DTT; 30 min, 65 °C), and cysteines were carbamidomethylated using 10 mM iodoacetamide (45 min, room temperature in the dark). After quenching with 10 mM DTT (30 min, room temperature), pH was adjusted to 6.5 for reductive dimethylation of primary amines (i.e. α-amines of protein N-termini and ε-amines of lysine sidechains) with 40 mM isotopically heavy formaldehyde (13CD2 in D2O; Cambridge Isotopes) and 20 mM sodium cyanoborohydride (overnight, 37 °C). To ensure completion of amine blocking, 20 mM heavy formaldehyde and 20 mM cyanoborohydride were added (2 h, 37 °C) after overnight incubation. After quenching with 100 mM Tris–HCl, pH 6.8 (30 min, 37 °C), samples were precipitated with chloroform/methanol for reaction clean-up [40]. Protein pellets were resolubilized in a small volume (25–50 μl) of 50 mM NaOH, pH-neutralized using 100 mM HEPES (pH 7.5) to 250 μl, diluted 1:1 with HPLC-grade water, and digested with mass spectrometry-grade trypsin (Trypsin Gold, Promega) at a proteome:enzyme ratio of 100:1 (w/w) overnight (37 °C). Digestion efficiency was confirmed by SDS-PAGE; in case of incomplete digestion, more trypsin was added (2 h, 37 °C). An aliquot of 50 μg of tryptic digest was saved for shotgun-like analysis (preTAILS). The TAILS samples were incubated with a 5-fold excess (w/w) of water-soluble HPG-ALD polymer (http://flintbox.com/public/project/1948/) and 20 mM sodium cyanoborohydride (overnight, 37 °C, pH 6.8) to remove trypsin-generated internal and C-terminal peptides.
After quenching the reaction with Tris–HCl (100 mM, pH 6.8, 30 min, 37 °C), unbound peptides representing naturally blocked or experimentally labeled N-terminal peptides were recovered in the filtrate following ultra-filtration (Amicon Ultra-0.5, MWCO 10 kDa). All preTAILS and TAILS samples were desalted using C18 StageTips [41], snap frozen in liquid nitrogen, and stored at −80 °C until LC-MS/MS analysis.
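
As a rough planning aid, the reagent amounts implied by the stated ratios can be worked out as below. This is a minimal sketch under two assumptions that the text does not state: that no material is lost during precipitation and clean-up, and that the 5-fold polymer excess is calculated relative to the digest mass remaining after the preTAILS aliquot is removed. All variable names are illustrative.

```python
# Values transcribed from the protocol text.
proteome_ug = 1000.0          # 1.0 mg starting material
proteome_to_enzyme = 100.0    # proteome:trypsin ratio (w/w)
pretails_aliquot_ug = 50.0    # saved for shotgun-like analysis
polymer_fold_excess = 5.0     # HPG-ALD polymer excess (w/w)

# Trypsin needed for the overnight digest.
trypsin_ug = proteome_ug / proteome_to_enzyme

# ASSUMPTION: no sample loss, so the TAILS input is everything that is
# left after removing the preTAILS aliquot.
tails_input_ug = proteome_ug - pretails_aliquot_ug

# ASSUMPTION: polymer excess is taken relative to the TAILS input mass.
polymer_mg = polymer_fold_excess * tails_input_ug / 1000.0

print(f"trypsin: {trypsin_ug:.1f} ug")
print(f"TAILS input: {tails_input_ug:.1f} ug")
print(f"HPG-ALD polymer: {polymer_mg:.2f} mg")
```

In practice, protein recovery after chloroform/methanol precipitation is below 100%, so measured amounts should replace the nominal 1.0 mg where available.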

Mass spectrometry

Purified peptide samples were analyzed using a quadrupole time-of-flight mass spectrometer (Accurate Mass G6550A Q-TOF) coupled online to a 1200 Series nanoflow HPLC (160 nl enrichment column; 0.075 mm×150 mm analytical column packed with Zorbax 300SB-C18 5 μm stationary phase) with a Chip Cube ionization interface (all Agilent Technologies) with temperature set at 6 °C. Each sample was automatically loaded on the enrichment column at a flow rate of 4 μl/min of Buffer A (0.1% formic acid in HPLC-grade water) and at 4 μl injection flush volume. A 110.2 min gradient was then established with the nano-pump at 300 nl/min: from 0% to 5% Buffer B (99.9% acetonitrile, 0.1% formic acid) over 2 min, from 5% to 45% Buffer B over the next 78 min, to 60% over 10 min, to 95% Buffer B over 0.1 min, held at 95% for 20 min, and then reduced to 3% Buffer B over 0.1 min to recondition the column for the next analysis. Peptides were ionized by electrospray ionization (ESI; 1.8 kV), and mass spectrometry analysis was performed in positive polarity with precursor ions detected from 300 to 2000 m/z. The top three ions per scan were selected for collision-induced dissociation (CID) using a narrow exclusion window of 1.3 amu (atomic mass unit) and at an MS/MS scan rate of two spectra per second. Collision energy was calculated automatically depending on the charge state of the parent ions, and precursor ions were then excluded from further CID for 30 s. The entire LC-MS/MS system was run by Mass Hunter version B.02.01 (Agilent Technologies).
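
The gradient described above can be written out as a simple timetable, which also confirms that the segments add up to the stated 110.2 min. This is an illustrative sketch only; the tuple layout and function name are chosen here and do not correspond to any instrument method file format.

```python
# Gradient segments transcribed from the text:
# (start % Buffer B, end % Buffer B, duration in min).
GRADIENT = [
    (0, 5, 2.0),     # initial ramp
    (5, 45, 78.0),   # main separation ramp
    (45, 60, 10.0),  # second ramp
    (60, 95, 0.1),   # step up to wash
    (95, 95, 20.0),  # wash hold
    (95, 3, 0.1),    # return to low % B for reconditioning
]


def timetable(segments):
    """Return (time_min, percent_b) breakpoints, checking segment continuity."""
    points = [(0.0, segments[0][0])]
    elapsed = 0.0
    for start_b, end_b, duration in segments:
        if start_b != points[-1][1]:
            raise ValueError("segment does not start where the previous one ended")
        elapsed += duration
        # Round to 0.1 min to suppress floating-point noise in the running sum.
        points.append((round(elapsed, 1), end_b))
    return points


points = timetable(GRADIENT)
total_min = points[-1][0]

for time_min, percent_b in points:
    print(f"{time_min:6.1f} min  {percent_b:3d}% B")
print(f"total gradient time: {total_min} min")
```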

Data analysis

Conversion of acquired MS/MS raw data into mgf and mzXML files was performed using MSConvert [42]. Four different search engines, namely Mascot v2.4 [2], X! TANDEM CYCLONE TPP 2011.12.01.1 [3], [14], MS-GF+ v10072 [5], and Comet 2015.01 rev 0 [4], were used for peptide spectral matching in the human UniProt protein database (release October 2013). The following database search criteria were applied: semi-ArgC cleavage pattern allowing for 2 missed cleavages; 20 ppm tolerance for MS1 and 50 ppm for MS2 (0.25 Da in case of Mascot [2]); cysteine carbamidomethylation (+57.0215 Da) and lysine dimethylation (+34.0631 Da) were set as fixed modifications; N-terminal acetylation (+42.0106 Da) and dimethylation (+34.0631 Da) were set as variable modifications. Further variable modifications included cyclization of N-terminal glutamine (Gln->pyro-Glu; −17.0266 Da), glutamate (Glu->pyro-Glu; −18.0106 Da), and carbamidomethylated cysteine (pyro-cmC; −17.0266 Da), as well as Met oxidation (+15.9949 Da). PeptideProphet [11] and iProphet [12] were used within the Trans Proteomic Pipeline (v4.8.0 PHILAE) [14] to statistically evaluate and combine all identified peptide spectrum matches (PSMs) using a 1% false discovery rate (FDR) cut-off. ProteinProphet [13] was used for peptide grouping, and only proteins with a probability ≥0.95 were reported, corresponding to a protein FDR of approx. 0.7%. Note that if peptides match multiple proteins, ProteinProphet [13] determines protein groups (i.e. including all proteins identified by the same set of peptides) and names one representative entry. Proteins and protein N-termini were annotated using neXtProt (release 2014-09-19) [43], UniProtKB (April 2015) [24], and TopFIND [25], [26], [27]; e.g. chromosomal location, protein evidence status, sequence position and curation status of identified protein N-termini, and sequence distances to known protein maturation sites such as signal-, transit-, and pro-peptide removal sites were added to the respective entries.
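
The modification mass shifts listed above follow from the net elemental change of each modification. The sketch below recomputes them from standard monoisotopic isotope masses and compares them with the values in the text. The elemental compositions are assumptions supplied here from general chemistry, not taken from the article: for heavy dimethylation, two amine hydrogens are taken to be replaced by two 13CD2H groups, with the extra hydrogen coming from non-deuterated cyanoborohydride. The `ppm_error` helper illustrates how a ppm tolerance such as the 20 ppm MS1 window is evaluated; all names are illustrative.

```python
# Monoisotopic masses (Da) of the isotopes needed here.
MONO = {
    "H": 1.00782503207,
    "D": 2.0141017778,     # deuterium (2H)
    "C": 12.0,
    "13C": 13.0033548378,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

# ASSUMED net elemental change per modification; negative counts are losses.
DELTAS = {
    "carbamidomethyl (C)": {"C": 2, "H": 3, "N": 1, "O": 1},
    # Two H replaced by two 13CD2H groups: net gain of 2x 13C and 4x D.
    "heavy dimethyl (K, N-term)": {"13C": 2, "D": 4},
    "acetyl (N-term)": {"C": 2, "H": 2, "O": 1},
    "Gln->pyro-Glu": {"N": -1, "H": -3},   # loss of NH3
    "Glu->pyro-Glu": {"H": -2, "O": -1},   # loss of H2O
    "pyro-cmC": {"N": -1, "H": -3},        # loss of NH3
    "oxidation (M)": {"O": 1},
}

# Mass shifts as reported in the text (Da, rounded to four decimals).
REPORTED = {
    "carbamidomethyl (C)": 57.0215,
    "heavy dimethyl (K, N-term)": 34.0631,
    "acetyl (N-term)": 42.0106,
    "Gln->pyro-Glu": -17.0266,
    "Glu->pyro-Glu": -18.0106,
    "pyro-cmC": -17.0266,
    "oxidation (M)": 15.9949,
}


def mass_shift(composition):
    """Sum isotope masses weighted by their (possibly negative) counts."""
    return sum(MONO[isotope] * count for isotope, count in composition.items())


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


computed = {name: mass_shift(comp) for name, comp in DELTAS.items()}
for name, value in computed.items():
    print(f"{name:28s} computed {value:+9.4f}  reported {REPORTED[name]:+9.4f}")
```

All computed shifts agree with the reported values to within 0.0001 Da, the precision at which the text reports them.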

Specifications Table

Subject area	Biology
More specific subject area	Dental biology, proteolytic processing, protein N-termini, proteomics, N-terminomics, Human Proteome Project (HPP)
Type of data	Mass spectrometry raw-files; search engine output files; processed metadata (.xlsx) reporting identified peptide spectrum matches (1% FDR) and proteins (protein probability ≥0.95)
How data was acquired	Liquid chromatography tandem mass spectrometry (LC-MS/MS): Accurate Mass G6550A Quadrupole-time-of-flight (Q-TOF) mass spectrometer coupled on-line to a 1200 Series nanoflow HPLC with a Chip Cube interface (Agilent).
Data format	RAW files: .d folders and .mzXML files; peak lists: .mgf; post-database-search output files from Mascot [2], X! Tandem [3], Comet [4], and MS-GF+ [5]: .pep.xml; metadata: .xlsx files
Experimental factors	Healthy dental pulps were collected within 10 min of routine extraction of lower third molars (wisdom teeth; teeth 38 and 48); written informed consent was obtained from the patients before surgery. Teeth were partially sectioned and mechanically split, exposing the dental pulp. The pulp was immediately transferred into 250 μl 8 M guanidine hydrochloride, frozen on dry ice, and stored at −80 °C for a maximum period of 30 days. Specimens were separately homogenized on ice, proteins extracted, and samples cleaned-up using chloroform/methanol precipitation. Pellets were redissolved in 0.5 mL of 8 M guanidine hydrochloride and protein concentrations were determined using 1:10 dilutions (in ddH2O) and Bradford protein assay (Bio-Rad) with bovine serum albumin (BSA) for calibration.
Experimental features	TAILS N-terminomics [6], [7], [8], [9], [10] was performed on 1 mg of human dental pulp proteome extracts to identify both proteins (preTAILS; pulp proteome) and corresponding protein N-termini (TAILS; pulp N-terminome). Primary amines of protein N-termini and lysine sidechains were blocked and labeled by dimethylation before trypsinization at the intact protein level. Labeled proteomes were then digested using proteomics-grade trypsin overnight, and trypsin-generated internal and C-terminal peptides with new free N-terminal primary amines were covalently bound to an aldehyde-derivatized, dendritic polyglycerol polymer (HPG-ALD; http://flintbox.com/public/project/1948/). Unbound peptides representing original and processed protein N-termini (either naturally blocked by e.g. Nα-acetylation or experimentally labeled by dimethylation) were separated from polymer-bound internal and C-terminal peptides using ultra filtration and analyzed by LC-MS/MS (TAILS sample). To increase proteome coverage, approx. 50 μg of each pulp sample were analyzed immediately after tryptic digest, omitting the polymer-mediated N-terminal enrichment step (preTAILS).
Data source location	Overall Laboratory, Centre for Blood Research, Department of Oral Biological and Medical Sciences, Faculty of Dentistry, University of British Columbia, Vancouver, BC, Canada. 49°15′44.5″N 123°14′41.8″W.
Data accessibility	The mass spectrometry raw data and metadata have been deposited to ProteomeXchange under the PXD identifier PXD002264. Supplementary tables described in this article are available via Mendeley Data (10.17632/555j3kk4sw.1).

43 in total

1. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search.

Authors: Andrew Keller; Alexey I Nesvizhskii; Eugene Kolker; Ruedi Aebersold
Journal: Anal Chem Date: 2002-10-15 Impact factor: 6.986

2. Open mass spectrometry search algorithm.

Authors: Lewis Y Geer; Sanford P Markey; Jeffrey A Kowalak; Lukas Wagner; Ming Xu; Dawn M Maynard; Xiaoyu Yang; Wenyao Shi; Stephen H Bryant
Journal: J Proteome Res Date: 2004 Sep-Oct Impact factor: 4.466

3. Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem Lys-C/trypsin proteolysis over trypsin digestion.

Authors: Timo Glatter; Christina Ludwig; Erik Ahrné; Ruedi Aebersold; Albert J R Heck; Alexander Schmidt
Journal: J Proteome Res Date: 2012-10-16 Impact factor: 4.466

4. Factor Xa subsite mapping by proteome-derived peptide libraries improved using WebPICS, a resource for proteomic identification of cleavage sites.

Authors: Oliver Schilling; Ulrich auf dem Keller; Christopher M Overall
Journal: Biol Chem Date: 2011-11 Impact factor: 3.915

5. The Human Dental Pulp Proteome and N-Terminome: Levering the Unexplored Potential of Semitryptic Peptides Enriched by TAILS to Identify Missing Proteins in the Human Proteome Project in Underexplored Tissues.

Authors: Ulrich Eckhard; Giada Marino; Simon R Abbey; Grace Tharmarajah; Ian Matthew; Christopher M Overall
Journal: J Proteome Res Date: 2015-08-14 Impact factor: 4.466

6. Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips.

Authors: Juri Rappsilber; Matthias Mann; Yasushi Ishihama
Journal: Nat Protoc Date: 2007 Impact factor: 13.491

7. A cross-platform toolkit for mass spectrometry and proteomics.

Authors: Matthew C Chambers; Brendan Maclean; Robert Burke; Dario Amodei; Daniel L Ruderman; Steffen Neumann; Laurent Gatto; Bernd Fischer; Brian Pratt; Jarrett Egertson; Katherine Hoff; Darren Kessner; Natalie Tasman; Nicholas Shulman; Barbara Frewen; Tahmina A Baker; Mi-Youn Brusniak; Christopher Paulse; David Creasy; Lisa Flashner; Kian Kani; Chris Moulding; Sean L Seymour; Lydia M Nuwaysir; Brent Lefebvre; Frank Kuhlmann; Joe Roark; Paape Rainer; Suckau Detlev; Tina Hemenway; Andreas Huhmer; James Langridge; Brian Connolly; Trey Chadick; Krisztina Holly; Josh Eckels; Eric W Deutsch; Robert L Moritz; Jonathan E Katz; David B Agus; Michael MacCoss; David L Tabb; Parag Mallick
Journal: Nat Biotechnol Date: 2012-10 Impact factor: 54.908

8. MS Amanda, a universal identification algorithm optimized for high accuracy tandem mass spectra.

Authors: Viktoria Dorfer; Peter Pichler; Thomas Stranzl; Johannes Stadlmann; Thomas Taus; Stephan Winkler; Karl Mechtler
Journal: J Proteome Res Date: 2014-06-26 Impact factor: 4.466

9. MS-GF+ makes progress towards a universal database search tool for proteomics.

Authors: Sangtae Kim; Pavel A Pevzner
Journal: Nat Commun Date: 2014-10-31 Impact factor: 14.919

10. Cleavage specificity analysis of six type II transmembrane serine proteases (TTSPs) using PICS with proteome-derived peptide libraries.

Authors: Olivier Barré; Antoine Dufour; Ulrich Eckhard; Reinhild Kappelhoff; François Béliveau; Richard Leduc; Christopher M Overall
Journal: PLoS One Date: 2014-09-11 Impact factor: 3.240

4 in total

1. Active site specificity profiling datasets of matrix metalloproteinases (MMPs) 1, 2, 3, 7, 8, 9, 12, 13 and 14.

Authors: Ulrich Eckhard; Pitter F Huesgen; Oliver Schilling; Caroline L Bellac; Georgina S Butler; Jennifer H Cox; Antoine Dufour; Verena Goebeler; Reinhild Kappelhoff; Ulrich Auf dem Keller; Theo Klein; Philipp F Lange; Giada Marino; Charlotte J Morrison; Anna Prudova; David Rodriguez; Amanda E Starr; Yili Wang; Christopher M Overall
Journal: Data Brief Date: 2016-02-22

2. Moonlighting matrix metalloproteinase substrates: Enhancement of proinflammatory functions of extracellular tyrosyl-tRNA synthetase upon cleavage.

Authors: Parker G Jobin; Nestor Solis; Yoan Machado; Peter A Bell; Simran K Rai; Nam Hoon Kwon; Sunghoon Kim; Christopher M Overall; Georgina S Butler
Journal: J Biol Chem Date: 2019-11-26 Impact factor: 5.157

Review 3. Dentin Matrix Metalloproteinases: A Futuristic Approach Toward Dentin Repair and Regeneration.

Authors: Paridhi Agrawal; Pradnya Nikhade; Manoj Chandak; Anuja Ikhar; Rushikesh Bhonde
Journal: Cureus Date: 2022-08-12

4. Complementing the pulp proteome via sampling with a picosecond infrared laser (PIRL).

Authors: Yaghoup Feridouni Khamaneh; Parnian Kiani; R J Dwayne Miller; Hartmut Schlüter; Reinhard E Friedrich
Journal: Clin Oral Investig Date: 2021-05-12 Impact factor: 3.573
