Literature DB >> 23979514

Experimental and computational analysis of the secretome of the hyperthermophilic archaeon Pyrococcus furiosus.

G Schmid1, G Mathiesen, M O Arntzen, V G H Eijsink, M Thomm.   

Abstract

Although Pyrococcus furiosus is one of the best studied hyperthermophilic archaea, to date no experimental investigation of the extent of protein secretion has been performed. We describe experimental verification of the extracellular proteome of P. furiosus grown on starch. LC-MS/MS-based analysis of culture supernatants led to the identification of 58 proteins. Fifteen of these proteins had a putative N-terminal signal peptide (SP), tagging the proteins for translocation across the membrane. The detected proteins with predicted SPs and known function were almost exclusively involved in important extracellular functions, like substrate degradation or transport. Most of the 43 proteins without predicted N-terminal signal sequences are known to have intracellular functions, mainly (70 %) related to intracellular metabolism. In silico analyses indicated that the genome of P. furiosus encodes 145 proteins with N-terminal SPs, including 21 putative lipoproteins and 17 with a class III peptide. From these we identified 15 (10 %; 7 SPI, 3 SPIII and 5 lipoproteins) under the specific growth conditions of this study. The putative lipoprotein signal peptides have a unique sequence motif, distinct from the motifs in bacteria and other archaeal orders.

Entities:  

Mesh:

Substances:

Year:  2013        PMID: 23979514      PMCID: PMC3824201          DOI: 10.1007/s00792-013-0574-0

Source DB:  PubMed          Journal:  Extremophiles        ISSN: 1431-0651            Impact factor:   2.395


Introduction

Pyrococcus furiosus is a heterotrophic, anaerobic, hyperthermophilic archaeon, belonging to the order Thermococcales. This deep branching organism was isolated from geothermally heated marine sediments near Vulcano Island, Italy (Fiala and Stetter 1986). The microbe can utilize starch and a range of other glucans, as well as peptides as carbon and energy sources. P. furiosus produces organic acids, H2 and CO2 as fermentation end products and is a model organism for analysis of the biochemistry and molecular biology of archaea. Recent advances in the development of genetic tools (Waege et al. 2010; Lipscomb et al. 2011) have further facilitated research on this organism and led to a stronger focus on the possible exploitation of P. furiosus in biotechnological applications, e.g., for biofuel production (Basen et al. 2012). Secreted and surface-located proteins sense the environment and support both protection against toxic components and passage of nutrients into the cells. In eukarya, bacteria and archaea most proteins are translocated across the cytoplasmic membrane using the Sec pathway (Driessen et al. 1998; Pohlschroder et al. 2005). Some of these secreted proteins are retained at the surface, while others are released to the surroundings. Secretion is directed by a N-terminal signal peptide (SP) that has a positively charged N-terminus (n-region), a hydrophobic (h-) region, and a c-region containing mostly small, uncharged residues and a characteristic cleavage site (von Heijne 1990). During or shortly after translocation across the membrane, the SP is cleaved off by a signal peptidase. Depending on the type of signal peptide (primarily defined by the character of the c-region), proteins are either recognized by a signal peptidase I (SPase I) or a signal peptidase II (SPase II). SPase I substrates are often released as soluble proteins, while SPase II substrates get attached to the cell membrane by a lipid anchor (lipoproteins). The SPs of bacterial lipoproteins contain a typical lipobox with a conserved cysteine as the first residue downstream of the cleavage site to which the lipo-anchor is attached (Hayashi and Wu 1990). In archaea, no SPase II homolog has been identified to date, although many archaeal genomes, including the P. furiosus genome, encode for proteins with lipobox containing N-terminal signal peptides (Saleh et al. 2010). Archaea contain a third type of secretion signal referred to as SPIII signal peptides, which are similar to bacterial type IV prepilin signal peptides (Ng et al. 2009). SPIII signal peptides lack the c-region; the cleavage site occurs directly after the n-region and the h-region is left as a part of the mature protein. Another common type of secretion mechanism is the twin-arginine translocation (Tat) pathway, which allows the secretion of proteins in a folded state (Sargent et al. 1998). The extent of Tat utilization varies widely in different organisms and in many archaea no or only few potential Tat substrates have been identified in the genome (Dilks et al. 2003). It is reported that the genome of P. furiosus encodes for three proteins with an N-terminus similar to known Tat substrates (Dilks et al. 2003), but as no homologs to the Tat transportation system have been identified, it is doubtful that P. furiosus uses this pathway. There is a lack of experimentally verified data for the composition of archaeal secretomes, and current knowledge about this topic is therefore primarily based on the results generated by predictive programs like SignalP or ExProt (Bardy et al. 2003; Saleh et al. 2010). A drawback of these commonly used programs is that they are based on sequence information of bacteria or eukaryotes. The reliability of the predictions is therefore questionable. In 2009, the first prediction program (PRED-SIGNAL) was presented, which is based on 69 experimentally verified archaeal SPs (Bagos et al. 2009). The signal peptides used as input for developing PRED-SIGNAL are derived from different orders; they include 19 SPs from the order Thermococcales, thereof six from P. furiosus. In the present study, we describe a first study of the secretome of P. furiosus DSM 3638 by experimental proteomics in combination with in silico analysis. Extracellular proteins were concentrated from the supernatants of cultures grown on starch. Proteins were analyzed after in-gel trypsination using nano online liquid chromatography (nanoLC) combined with high-resolution tandem mass spectrometry (MS/MS). Further, the SP-dependent secretome of P. furiosus was predicted by an in silico analysis of the genome and characteristic features of the signal peptide sequences for secreted proteins and lipoproteins were identified.

Materials and methods

Culture medium and growth conditions

Pyrococcus furiosus DSM 3638 was cultivated under anaerobic conditions in sulfur-free medium based on 1/2SME medium (Fiala and Stetter 1986). The medium was supplemented with 0.1 % (w/v) starch as primary carbon source and 0.05 % (w/v) yeast extract. The medium was prepared anaerobically and reduced with 0.03 g l−1 Na2S·3H2O. The cultures were grown in 1-l bottles with 330 ml medium, incubated at 95 °C with shaking at 200 rpm. The growth media were inoculated with 1.65 ml (0.5 %) of a fresh P. furiosus pre-culture (grown until late exponential growth phase).

Preparation of extracellular proteins

The P. furiosus cultures were sampled at late exponential growth phase (1–2 × 108 cells/ml). The cultures were transferred to 450-ml centrifuge beakers inside an anaerobic chamber and centrifuged anaerobically for 25 min at 6,000g. After the harvesting the supernatants were sterile filtered (0.2 μm pore size). The proteins in the cell-free supernatant fractions were concentrated by a two-step procedure. First, the proteins of 330-ml supernatant were concentrated using a Vivacell 250 ultrafiltration unit (Sartorius AG, Germany, filter cut off 5 kDa) following the procedure provided by the manufacturer, to a volume of 20 ml, diluted with 180 ml 50 mM Tris/HCl pH 7.7 and finally re-concentrated to 20 ml. In the second step, the proteins were further concentrated to 500–700 μl using a Vivaspin 20 centrifugal concentrator (Sartorius AG, Germany, filter cut off 5 kDa). After the final concentration step, the proteins were precipitated by acetone, a step that was crucial to obtain protein samples of sufficient quality for the subsequent analyses. In short, four volumes of ice cold acetone were added followed by 2 h incubation at −20 °C. After centrifugation (15 min, 21,500g, 4 °C) the supernatants were carefully discarded and the protein pellets were dried at room temperature. The protein pellets were dissolved in 100 mM Tris/Cl pH 7.7 and the protein concentration was determined by the Lowry assay (Simonian and Smith 2001).

SDS-PAGE and in-gel trypsin digestion

To visualize and separate the proteins, samples from two Pyrococcus cultures were applied to two lanes of a 4–20 % Tris/Glycine Mini-Protean TGX gel (Bio-Rad, Hercules/California, USA). The gel was stained with Coomassie [0.2 % (w/v) Coomassie Brilliant Blue R250, 40 % (v/v) isopropanol, 7 % (v/v) acetic acid] and destained with distilled water. Gel lanes were sliced into 6 pieces with a scalpel and individual pieces were subjected to in-gel protein digestion with trypsin (Promega, Mannheim, Germany) following the protocol of Shevchenko et al. (2006). After trypsination, the 12 samples (six per lane, two parallel lanes) with digested proteins were individually desalted, using C18-StageTips (Rappsilber et al. 2003) and subsequently analyzed by nanoLC–MS/MS.

Identification of proteins by Orbitrap-MS

Peptides were analyzed by an ESI-Orbitrap (LTQ Orbitrap XL, Thermo Scientific, Bremen, Germany) mass spectrometer coupled to an Ultimate 3000 nano-LC system (Dionex, Sunnyvale CA). For separation of peptides an Acclaim PepMap 100 column (120 mm × 75 μm) packed with 3 μm C18 particles (100 Å pore size) (Dionex) was used. A flow rate of 300 nl/min was employed with a solvent gradient of 7–35 % B in 40 min, to 50 % B in 3 min and then to 80 % B in 2 min. Solvent A was 0.1 % formic acid and solvent B was 0.1 % formic acid/90 % ACN. The mass spectrometer was operated in data-dependent mode in order to automatically switch between Orbitrap-MS and LTQ-MS/MS acquisition. Survey full scan MS spectra (from m/z 300–2,000) were acquired in the Orbitrap with the resolution R = 60,000 at m/z 400 (after accumulation to a target of 500,000 charges in the LTQ). The method used allowed sequential isolation of the most intense ions, up to six, depending on signal intensity, for fragmentation on the linear ion trap using collision induced dissociation (CID) at a target value of 10,000 charges. For accurate mass measurements, the lock mass option was enabled in MS mode and the polydimethylcyclosiloxane ions generated in the electrospray process from ambient air were used for internal recalibration during the analysis. Target ions already selected for MS/MS were dynamically excluded for 60 s. General mass spectrometry conditions were electrospray voltage, 1.6 kV; no sheath and auxiliary gas flow. Ion selection threshold was 5,000 counts for MS/MS and an activation Q value of 0.25 and activation time of 30 ms were in addition applied for MS/MS. Data were acquired using Xcalibur v2.5.5 and processed into searchable mgf-files using ProteoWizard v2.1.2708. The data were then searched against a local Fasta database of P. furiosus extracted from NCBI (4,867 sequences) using Mascot (Perkins et al. 1999) as search engine. Allowed variable post-translational modifications were: deamidation of glutamines and asparagines, oxidation of methionines, propionamidylation of cysteines, and, for peptides with N-terminal glutamines, conversion of glutamine to pyro-glutamic acid. As enzyme trypsin was chosen and the maximum number of allowed miscleavages was 1. The accuracy of precursor ions was set to 10 ppm and for fragment ions 0.6 Da. Mascot result files were imported into Scaffold v.3.00.08 (Proteome Software, Portland, Oregon, USA) (Searle 2010) and researched with X!Tandem with default parameters. For valid protein identification at least 1 unique peptide in both parallels was required with a probability ≥95 % and total protein probability ≥98 %. For protein quantification unique peptide counts were exported from Scaffold and emPAI values (Ishihama et al. 2005) were calculated using an in-house python-script.

Bioinformatic analysis of identified proteins

All identified proteins in the supernatant fractions were analyzed for N-terminal signal sequences using the programs SignalP 4.0 (Petersen et al. 2011) and PRED-SIGNAL (Bagos et al. 2009). PRED-SIGNAL was used to do a genome-wide signal peptide analysis of P. furiosus. The genome sequence was extracted from the NCBI gene bank (http://www.ncbi.nlm.nih.gov/genome/?term=pyrococcus%20furiosus). LipoP 1.0 (Juncker et al. 2003) was used to predict lipoproteins. Selected proteins were analyzed for N-terminal transmembrane segments using the Phobius web server (Kall et al. 2007). Putative domain annotations in hypothetical proteins were done using Pfam 26.0 (Punta et al. 2012).

Results and discussion

Proteins identified in the supernatant

Supernatant fractions from P. furiosus grown on starch as major carbon source were collected in the late exponential growth phase, using anaerobic conditions during harvesting to prevent cell lysis. The proteins in the cell-free supernatant were concentrated by ultrafiltration followed by acetone precipitation. The concentrated proteins were separated by 1D-gel electrophoresis (Fig. 1) and converted to tryptic peptides by in-gel trypsination. The peptides were then analyzed using high resolution LC–MS/MS as described in the “Materials and methods” section.
Fig. 1

A representative coomassie stained SDS-PAGE gel showing concentrated proteins from P. furiosus cell-free culture supernatant. 1 PageRuler Prestained Protein Ladder; molecular weights are indicated in kDa. 2 37 μg protein from culture supernatant

A representative coomassie stained SDS-PAGE gel showing concentrated proteins from P. furiosus cell-free culture supernatant. 1 PageRuler Prestained Protein Ladder; molecular weights are indicated in kDa. 2 37 μg protein from culture supernatant Using this approach, 58 proteins were identified (Tables 1, 2), including major enzymes for starch degradation. Previous microarray analysis have shown that the amylopullulanase PF1935*, the maltotriose binding protein PF1938 and the hypothetical protein PF1109 are the only extracellular proteins that are specifically up-regulated when P. furiosus grows on starch (Lee et al. 2006). All these proteins were also found in the current study (Table 1), confirming their importance in starch metabolism. Although quantification of proteins is not straightforward in the present type of LC–MS/MS experiments, it is possible to obtain a relative measure of protein quantities by quantifying the number of peptide counts acquired per protein. Since this number will be biased by differences in the occurrence of basic residues (i.e., tryptic cleavage sites), the peptide counts need to be corrected by a factor representing the number of likely observable peptides for a given protein. This approach yields the so-called emPAI value (exponentially modified protein abundance index) (Ishihama et al. 2005). As expected, the emPAI quantification of the data indicated that the above-mentioned proteins involved in starch metabolism were among the most abundant in the culture medium (Table S1).
Table 1

Proteins with a predicted N-terminal signal sequence identified in the supernatant of P. furiosus DSM 3638 grown on starch

GeneAccession numberGene productPfama Predicted signal sequenceMW (kDa)Unique peptidesb Total coverage (%)
PF011918892036Periplasmic sugar binding proteinSP II61.2820
PF019018892121Hypothetical proteinBacterial extracellular solute-binding proteinsSP I94.83644
PF028718892232PyrolysinSP III154.42827
PF033718976709FlagellinSP III28.6425
PF04772183106Alpha amylaseSP I52.9921
PF1109/PF1110c

18893182

18893183

Extracellular starch binding proteinSP I20/7c 28/44c
PF120918893298Oligopeptide ABC transporter (oligopeptide-binding protein)SP I71.634
PF130418893406Hypothetical proteinUnknown functionSP III93.7915
PF139918893514Putative ATPase, vanadate-sensitiveSP I65.72452
PF140818893525Putative dipeptide-binding proteinSP IIe 79.43052
PF150518893637Hypothetical proteinTranslocon-associated protein betaSP I78.51425
PF169518893856Hypothetical lipoproteinBasic membrane proteinSP II44.2921
PF177418893949Iron (III) ABC transporter, ATP-binding proteinSP II40.31146
PF1935*d 75993212AmylopullulanaseSP I127.15247
PF193818202323Maltotriose-binding proteinSP II48.22059

aSignificant hits obtained after searches in Pfam 26.0 (Punta et al. 2012) for hypothetical proteins

bThe column shows the cumulative number of unique peptide hits from two biological replicates. A protein was considered as significant if it was identified by at least one unique peptide in both parallels

cA recent study has shown, that pf1109 and pf1110 represent a single ORF, encoding a starch-binding protein (Comfort et al. 2008). In the text we are using the term PF1109 for this gene product. In the Table, we have split the total number of detected peptides and the total coverage into the numbers detected for each of the two originally annotated ORFs

dThe previously annotated ORFs pf1934 and pf1935 are one continuous gene, now termed pf1935* (Lee et al. 2006)

eThis protein was among the 18 SPIII proteins predicted by Szabó et al. (2007). We conclude that PF1408 is a lipoprotein (see text)

Table 2

Proteins without predicted N-terminal signal sequences identified in the supernatant of P. furiosus DSM 3638 grown on starch

Functional groupGeneAccession numberGene product (NCBI)Pfama MW (kDa)Unique peptidesb Total coverage (%)Phobius TM domainc
Energy and metabolismPF004318891945Phosphoenolpyruvate synthase901117No
PF028918892234Phosphoenolpyruvate carboxykinase7346No
PF034618892300Aldehyde ferredoxin oxidoreductase673257No
PF02721351936Alpha-amylase761015No
PF045618892427Carboxypeptidase 159812No
PF058818892584Phospho-sugar mutase50719No
PF059718976969IAA-amino acid hydrolase49410No
PF075118892770Flavoprotein47514No
PF082518892854Prolyl endopeptidase71611No
PF09651197364Pyruvate ferredoxin oxidoreductase beta-236628No
PF09661197363Pyruvate ferredoxin oxidoreductase alpha-244717No
PF120318893290Formaldehyde ferredoxin oxidoreductase701824No
PF126618893362Cystathionine gamma-lyase41832No
PF128318893381Rubrerythrin20526No
PF139418893507Phosphoglycerate dehydrogenase34516No
PF142118893540Hypothetical 4-aminobutyrate aminotransferase511542No
PF147218893598Aspartate/serine transaminase43414No
PF148018893608Formaldehyde ferredoxin oxidoreductase wor56535No
PF153518893671Alpha-glucan phosphorylase981320No
PF154018893678Acetyl coenzyme A synthetase50617No
PF154718893685Endoglucanase39826No
PF16021122753Glutamate dehydrogenase471230No
PF161618893766Myo-inositol-1-phosphate synthase42833No
PF17191373331Protease I19761No
PF177818893953Serine hydroxymethyltransferase481129No
PF178718893964Acetyl-CoA synthetase26835No
PF186618978238S-adenosylmethionine synthetase44726No
PF192018894116Triosephosphate isomerase24424No
PF195918894161Phosphonopyruvate decarboxylase bcpc45411No
PF196118894163Tungsten-containing formaldehyde ferredoxin oxidoreductase wor46958No
TransportPF193318894131Putative sugar transport ATP-hydrolyzing41513No
PF193618894134Putative sugar transport inner membrane protein (malg-like)454126
Translation and transcriptionPF137518893486Translation elongation factor eF-1, subunit alpha48824No
PF180318893984LSU ribosomal protein L30P18422No
PF188118978253Chromatin protein10230No
Protein foldingPF197418894178Thermosome, single subunit602043No
Cell divisionPF052518892510Cell division protein441233No
Hypothetical proteinsPF038033359476Hypothetical protein PF0380ParB-like nuclease domain28725No
PF054718892536Hypothetical protein PF0547CobW/HypB/UreG, nucleotide-binding domain511333No
PF104718893110Hypothetical protein PF1047FUN14 family (unknown function)102273
PF111118893184Hypothetical protein PF1111Protein of unknown function DUF4340513No
PF150018893630Hypothetical protein PF1500PRC-barrel domain10661No
PF183718894020Hypothetical protein PF1837ATP-grasp domain; Binding-protein-dependent transport system inner membrane component26419No

aSignificant hits obtained after search in Pfam 26.0 (Punta et al. 2012) for hypothetical proteins

bThe column shows the number of unique peptide hits from two biological replicates. A protein was considered as significant if it was identified by at least one unique peptide in both parallels

cPrediction of transmembrane (TM) domains using the Phobius web server (Kall et al. 2007)

Proteins with a predicted N-terminal signal sequence identified in the supernatant of P. furiosus DSM 3638 grown on starch 18893182 18893183 aSignificant hits obtained after searches in Pfam 26.0 (Punta et al. 2012) for hypothetical proteins bThe column shows the cumulative number of unique peptide hits from two biological replicates. A protein was considered as significant if it was identified by at least one unique peptide in both parallels cA recent study has shown, that pf1109 and pf1110 represent a single ORF, encoding a starch-binding protein (Comfort et al. 2008). In the text we are using the term PF1109 for this gene product. In the Table, we have split the total number of detected peptides and the total coverage into the numbers detected for each of the two originally annotated ORFs dThe previously annotated ORFs pf1934 and pf1935 are one continuous gene, now termed pf1935* (Lee et al. 2006) eThis protein was among the 18 SPIII proteins predicted by Szabó et al. (2007). We conclude that PF1408 is a lipoprotein (see text) Proteins without predicted N-terminal signal sequences identified in the supernatant of P. furiosus DSM 3638 grown on starch aSignificant hits obtained after search in Pfam 26.0 (Punta et al. 2012) for hypothetical proteins bThe column shows the number of unique peptide hits from two biological replicates. A protein was considered as significant if it was identified by at least one unique peptide in both parallels cPrediction of transmembrane (TM) domains using the Phobius web server (Kall et al. 2007) In addition to these proteins that are known to play a key role in starch degradation, we detected the α-amylase PF0477 and the periplasmic sugar binding protein PF0119. The gene pf0477 is up-regulated when Pyrococcus grows on peptides, indicating that this α-amylase may be involved in a metabolic switch from peptide to α-glucan degradation, when α-glucans become available during growth on peptides (Lee et al. 2006). The sugar binding protein PF0119 is not known to be specifically expressed under glycolytic or proteolytic growth conditions and might therefore play a more general role in sugar uptake. Furthermore, we identified the serine protease pyrolysin (PF0287), which was previously shown to be cell envelope-associated (Eggen et al. 1990; Voorhorst et al. 1996), and the peptide binding proteins PF1209 and PF1408. All the above-mentioned proteins with (putative) roles in starch and protein metabolism were among the in total 15 detected proteins in the supernatant fractions with a putative N-terminal signal peptide (Table 1, for details see below). Seven additional proteins with a predicted signal peptide were identified in the supernatant fractions, an ATPase (PF1399), an ATP-binding transporter protein (PF1774), four hypothetical proteins and a flagellin (PF0337). In addition, we identified 43 proteins in the supernatant fraction without a typical N-terminal signal peptide (Table 2). Judged by the emPAI values, these proteins varied in abundance: some were among the most abundant of all detected proteins, whereas the majority appeared in the lower regions of the abundance list (Table S1). Most of these 43 proteins are predicted to have intracellular functions and are therefore not supposed to be actively secreted. Intracellular proteins are regularly found in the culture media of bacteria (Antelmann et al. 2001; Trost et al. 2005) and archaea (Palmieri et al. 2009; Ellen et al. 2010a) and it remains to some extent uncertain whether this is a result of artifacts such as cell lysis or whether this reflects active secretion of intracellular proteins, some of which may even have different intracellular and extracellular functions [“moonlighting” proteins; (Huberts and van der Klei 2010)]. In the case of archaea, an additional possible explanation could be protein export via membrane vesicles (Soler et al. 2008; Ellen et al. 2009, 2010b; Deatherage and Cookson 2012). Further work is needed to establish which of these possible explanations are valid.

Signal sequences of experimentally verified extracellular proteins

The 58 proteins identified in the supernatant fractions were analyzed for N-terminal signal sequences using the programs PRED-SIGNAL (Bagos et al. 2009) and SignalP 4.0 (Petersen et al. 2011). PRED-SIGNAL is trained on signal sequences of archaea and is the only available program that specifically predicts archaeal SPs. The program predicted that 15 of the identified proteins contain N-terminal SPs (Table 1). SignalP 4.0 is the most used prediction program for SPs as it is based on a considerable number of experimentally verified eukaryotic and bacterial SPs. When the program was selected to search for signal peptides of Gram-positive bacteria, 11 of the 15 proteins selected by PRED-SIGNAL were predicted to contain N-terminal SPs. When SignalP was selected to search for eukaryotic signal peptides, 14 of the 15 proteins selected by PRED-SIGNAL were identified. The additional protein identified by PRED-SIGNAL (relative to SignalP) was a flagellin (PF0337). Previous analyses of the P. furiosus genome using FlaFind have led to the identification of 18 proteins putatively carrying a class III signal peptide (Szabó et al. 2007). We identified four of these proteins in the supernatant fractions (PF0287, PF0337, PF1304 and PF1408; Table 1) and all these four proteins were also predicted to be secreted by PRED-SIGNAL. For reasons described below, we conclude that one of these four, PF1408, in fact is a lipoprotein. Analysis of the 43 proteins without a predicted signal sequence using the Phobius server (Kall et al. 2007), indicated that only two of these proteins (PF1047 and PF1963) contain transmembrane segments (3 and 6, respectively). In both these proteins one of the transmembrane segments is located N-terminally and could potentially function as a signal sequence without a cleavage site that co-directs insertion into the cell membrane. For the detection of lipoproteins, all identified proteins were analyzed with the prediction program LipoP 1.0 (Juncker et al. 2003). Although lipoproteins are naturally attached to the cell membrane, it is not unusual to find them in the supernatant fraction as a result of natural shedding (Cole et al. 2005; Tjalsma et al. 2008; Bøhle et al. 2011). LipoP predicted that three of the 58 proteins in the supernatant fraction are lipoproteins (PF0119, PF1695 and PF1774). A manual examination of all identified proteins (see below) showed that two other proteins (PF1938 and PF1408) have features in the N-terminus very similar to the lipoproteins predicted by LipoP (Table 3). Besides a positively charged n-region and a leucine-rich h-region these proteins share the sequence G/CIGG (‘/’ indicates the cleavage site). This motif matches the lipobox motif previously suggested for Pyrococcus spp. (Albers et al. 2004). PF1408 has previously been predicted to harbor a class III signal peptide (Szabó et al. 2007).
Table 3

Predicted N-terminal signal sequences of proteins identified in the culture medium

GenePredicted signal peptide
SPIa
PF1109MRRNAQVFAMVLLLVLSGIPKALA/LYTPTPFSID
PF1209MKRLVGVLIGAFVIFGVFGQVVAA/QEQELPREET
PF1399MKVKKIAALAVGAAVAGATLGFASA/QGEVPEIPKD
PF0477MNIKKLTPLLTLLLFFIVLASPVSAA/KYLELEEGGV
PF1935*b MSRKLSLLLVFLIFGSMLGANNIVKA/EEPKPLNVII
PF0190MRKKLVGILTILVALGMLVSPLLKPVAA/EDQKVLKIAM
PF1505MKKASILLIIMLIASGLTIFNPKA/LGLEKYSTLT
Lipoproteinsc
PF1938d MRRATYAFALLAILVLGVVASG/CIGGGTTTPT
PF1408d,e MKKGLLAILLVGVMVLGTFGSG/CIGGGTQTQT
PF0119c MKHKAVFLLVVLISGVLASG/CIGGETKETQ
PF1774c MKRAIPVFLLIVLVWISG/CIGGGTSTIP
PF1695c MRKVGITLSVVALVIMGFVAG/CIGGTQTQGE
SPIIIf
PF0337MKKG/AIGIGTLIVFIAMVLVAAVAAGVLI
PF0287MNKKG/LTVLFIAIMLLSVVPVHFVS
PF1304MRRG/FIINSTLLILIIPLLLLAATYAEI

Predicted cleavage sites are indicated by “/”; the h-regions of the signal peptides are underlined

aSignal peptides and cleavage sites under “SPI” were predicted by the PRED-SIGNAL program that is optimized for archaea

bThe previously annotated ORFs pf1934 and pf1935 are currently considered one continuous gene, now termed pf1935* (Lee et al. 2006)

cSignal peptides of lipoproteins were predicted using LipoP 1.0 combined with manual inspection of the SPI sequences (see text)

dManually predicted lipoproteins (see text for details)

eThis protein was among the 18 SPIII proteins predicted by Szabó et al. 2007. We conclude that PF1408 is a lipoprotein (see text)

fSignal peptides predicted previously using FlaFind by Szabó et al. 2007

Predicted N-terminal signal sequences of proteins identified in the culture medium Predicted cleavage sites are indicated by “/”; the h-regions of the signal peptides are underlined aSignal peptides and cleavage sites under “SPI” were predicted by the PRED-SIGNAL program that is optimized for archaea bThe previously annotated ORFs pf1934 and pf1935 are currently considered one continuous gene, now termed pf1935* (Lee et al. 2006) cSignal peptides of lipoproteins were predicted using LipoP 1.0 combined with manual inspection of the SPI sequences (see text) dManually predicted lipoproteins (see text for details) eThis protein was among the 18 SPIII proteins predicted by Szabó et al. 2007. We conclude that PF1408 is a lipoprotein (see text) fSignal peptides predicted previously using FlaFind by Szabó et al. 2007 To summarize, the 15 extracellular proteins with signal peptides detected in this study (Tables 1, 3) comprise seven proteins with a predicted SPase I cleavage site, five proteins with a putative SPase II cleavage site (lipoproteins) and three with a predicted SPase III cleavage site. The predicted SPase I signal sequences are very similar in length (24–28 amino acids) and amino acid composition (Table 3). They have two or more lysine or arginine residues at the N-terminus and a distinct hydrophobic region dominated by leucines (Fig. 2a). The signal peptides of the lipoproteins are 18–22 amino acids in length and share the motif ([S(A)]G/CIGG) around the predicted cleavage site (indicated by ‘/’) (Table 3; Fig. 2b).
Fig. 2

Frequency plot for signal peptides based on multiple alignment of 16 residues upstream and 10 residues downstream of predicted signal peptide cleavage sites. a A composition map based on 107 predicted SPase I signal sequences identified in the P. furiosus DSM 3638 genome. b A composition map based on 21 predicted lipoprotein signal sequences identified in the P. furiosus DSM 3638 genome. These pictures were made with WebLogo (Crooks et al. 2004)

Frequency plot for signal peptides based on multiple alignment of 16 residues upstream and 10 residues downstream of predicted signal peptide cleavage sites. a A composition map based on 107 predicted SPase I signal sequences identified in the P. furiosus DSM 3638 genome. b A composition map based on 21 predicted lipoprotein signal sequences identified in the P. furiosus DSM 3638 genome. These pictures were made with WebLogo (Crooks et al. 2004)

Genome-wide analysis

An analysis of the whole predicted proteome of P. furiosus using PRED-SIGNAL, led to identification of 166 proteins with putative N-terminal signal sequences (Table S2). Surprisingly, 31 of these proteins had an uncharged or even negatively charged N-terminus. Signal peptides usually have a positively charged N-terminus, which interacts with the negatively charged inner part of the cytoplasmic membrane (von Heijne 1990). All signal peptides from the experimentally verified signal peptide-containing secreted proteins in this study had at least two positively charged amino acids at the N-terminus (Table 3). Whether these 31 proteins without a positive net charge at the N-terminus are actively secreted remains to be seen. The SPs of the remaining 135 proteins exhibited typical key features of signal peptides (von Heijne 1990) (Table S2). For the identification of lipoproteins we combined a computational analysis by LipoP with a manual examination. In the first step, the whole genome of P. furiosus was searched for lipoproteins using LipoP. The program identified 18 putative lipoproteins, two of which (PF1298, PF2063) were not identified by PRED-SIGNAL (Table S2; this brings the total number of putatively secreted proteins to 137). All 18 identified SPs contained a Gly-Cys motif at the predicted cleavage site. In a second step, all of the 135 proteins predicted to be secreted that were not predicted to be lipoproteins by LipoP were searched for the occurrence of a Gly-Cys motif within the first 25 amino acids. This analysis identified three additional putative lipoproteins, including the maltotriose-binding protein, PF1938, and the putative dipeptide binding protein, PF1408, identified in the secretome (see above and Tables 1, 3), as well as the hypothetical protein PF0978. We suggest these proteins to be lipoproteins due to their signal peptide features and the typical lipobox motif (Tables 3; S2). Six of the 21 putative lipoproteins shared the sequence SG/CIGG, which has previously been suggested to be the consensus sequence of Pyrococcus spp. lipoboxes (Albers et al. 2004). Of the 18 proteins predicted by Szabó et al. (2007) to contain a class III signal peptide (Table S2), ten are predicted by PRED-SIGNAL to harbor an SP I signal peptide. One of these ten was manually predicted to be a lipoprotein (see above). This brings the total number of putatively secreted proteins to 145. Interestingly, a frequency plot (Fig. 2b) of the 21 putative lipoproteins, including PF1408, showed that the −2 position upstream of the cleavage site is dominated by serine, representing almost two-thirds of the amino acids at that position. In bacteria the −2 position is dominated by the apolar amino acid alanine (Hayashi and Wu 1990). Another interesting feature is that the +2 position is dominated by isoleucine, with a frequency of 76 % (Fig. 2b). In bacteria isoleucine is usually not found at the +2 position (Hayashi and Wu 1990). From the +3 to the +5 position glycine is dominating, while threonine is the most abundant amino acid at the positions +6 to +10. This sequence profile can neither be found in bacteria, nor is it known from other archaea. As Pyrococcus is a very deep branching organism, we assume that the profile of its lipoprotein SP sequences represents an ancestral type. It is conceivable that the deviating, probably ancestral sequence profile of the lipoprotein SPs in Pyrococcus requires a different type of signal peptidase II, compared to the known bacterial one. This might explain why in Pyrococcus no signal peptidase II homolog has been identified yet. The putative SPase I signal peptides (Table S2) were on average 5 residues longer (~25 residues) compared to the SPase II signal peptides (~20 residues), similar to what has been reported for bacteria (Klein et al. 1988; von Heijne 1989). In the n-region lysine (62 % of basic residues) is more frequent than arginine (36 % of basic residues), which reflects the common archaeal pattern (Bagos et al. 2009) (Table S2). A frequency plot for all 107 predicted SPase I signal peptides showed that the −1 position relative to the cleavage site is clearly dominated by alanine (72 SPs), while at the −3 position valine (35 SPs) is the most frequent amino acid, followed by alanine and serine (both 17 SPs) (Fig. 2a). This is in accordance with the common archaeal pattern (Bagos et al. 2009), the only exception being that, generally, the −3 residue is an alanine rather than a valine as in P. furiosus. Interestingly, in the h-regions of the P. furiosus SPs leucine is the most frequent amino acid. Such a pre-dominance of leucine in the h-region is also found in eukaryotes (Bagos et al. 2009), underpinning the possible evolutionary relationship between eukaryotes and the deep-branching archaeon P. furiosus. In summary, we suggest that 145 proteins of P. furiosus are secreted by use of an N-terminal signal sequence, including 21 lipoproteins (Table S2). This corresponds to 6.7 % of the P. furiosus proteome, which is significantly less compared to a previous prediction where the secretome of P. furiosus was estimated to comprise 9 % of the proteome (Saleh et al. 2010). This disparity may be due to the fact that in the latter study the secretome was predicted using ExProt, a program trained on signal peptides from bacteria.

Conclusion

Generally, little is known about the secretomes of archaea. The present study adds to a slowly growing data set which so far seems to indicate that secretion in archaea is a limited process, with only few signal-peptide containing proteins being freely secreted into the growth medium (Saunders et al. 2006; Ellen et al. 2010a). Under the specific growth conditions investigated in this study, only 15 proteins with N-terminal signal sequences were identified, that is only 10 % of all proteins with putative N-terminal signal peptides. This low number may be due to limited expression/release of secreted proteins, whereas some secreted proteins may escape detection due to them being relatively resistant to trypsination even after SDS-PAGE. The sequences of the SPase I signal peptides share features with common archaeal sequence patterns. Remarkably, the sequence motifs around the putative or predicted SPase II cleavage sites of the lipoproteins of P. furiosus differ from the motifs found in bacteria and other archaea. The combination of this first experimental glimpse of the secretome and our analysis of signal peptide sequences should provide a useful basis for further studies on protein secretion in this hyperthermophilic archaeon. Below is the link to the electronic supplementary material. Supplementary material 1 (PDF 261 kb) Supplementary material 2 (PDF 81 kb)
  43 in total

1.  Probability-based protein identification by searching sequence databases using mass spectrometry data.

Authors:  D N Perkins; D J Pappin; D M Creasy; J S Cottrell
Journal:  Electrophoresis       Date:  1999-12       Impact factor: 3.535

2.  Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics.

Authors:  Juri Rappsilber; Yasushi Ishihama; Matthias Mann
Journal:  Anal Chem       Date:  2003-02-01       Impact factor: 6.986

3.  Prokaryotic utilization of the twin-arginine translocation pathway: a genomic survey.

Authors:  Kieran Dilks; R Wesley Rose; Enno Hartmann; Mechthild Pohlschröder
Journal:  J Bacteriol       Date:  2003-02       Impact factor: 3.490

4.  Prediction of lipoprotein signal peptides in Gram-negative bacteria.

Authors:  Agnieszka S Juncker; Hanni Willenbrock; Gunnar Von Heijne; Søren Brunak; Henrik Nielsen; Anders Krogh
Journal:  Protein Sci       Date:  2003-08       Impact factor: 6.725

5.  Archaeal signal peptides--a comparative survey at the genome level.

Authors:  Sonia L Bardy; Jerry Eichler; Ken F Jarrell
Journal:  Protein Sci       Date:  2003-09       Impact factor: 6.725

6.  In-gel digestion for mass spectrometric characterization of proteins and proteomes.

Authors:  Andrej Shevchenko; Henrik Tomas; Jan Havlis; Jesper V Olsen; Matthias Mann
Journal:  Nat Protoc       Date:  2006       Impact factor: 13.491

7.  Virus-like vesicles and extracellular DNA produced by hyperthermophilic archaea of the order Thermococcales.

Authors:  Nicolas Soler; Evelyne Marguet; Jean-Marc Verbavatz; Patrick Forterre
Journal:  Res Microbiol       Date:  2008-06-25       Impact factor: 3.992

8.  Functional-genomics-based identification and characterization of open reading frames encoding alpha-glucoside-processing enzymes in the hyperthermophilic archaeon Pyrococcus furiosus.

Authors:  Donald A Comfort; Chung-Jung Chou; Shannon B Conners; Amy L VanFossen; Robert M Kelly
Journal:  Appl Environ Microbiol       Date:  2007-12-21       Impact factor: 4.792

9.  Shedding & shaving: disclosure of proteomic expressions on a bacterial face.

Authors:  Harold Tjalsma; Lambert Lambooy; Peter W Hermans; Dorine W Swinkels
Journal:  Proteomics       Date:  2008-04       Impact factor: 3.984

10.  Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server.

Authors:  Lukas Käll; Anders Krogh; Erik L L Sonnhammer
Journal:  Nucleic Acids Res       Date:  2007-05-05       Impact factor: 16.971

View more
  3 in total

1.  Computational and experimental analysis of the secretome of Methylococcus capsulatus (Bath).

Authors:  Stine Indrelid; Geir Mathiesen; Morten Jacobsen; Tor Lea; Charlotte R Kleiveland
Journal:  PLoS One       Date:  2014-12-05       Impact factor: 3.240

Review 2.  Non-proteolytic functions of microbial proteases increase pathological complexity.

Authors:  Veronica M Jarocki; Jessica L Tacchi; Steven P Djordjevic
Journal:  Proteomics       Date:  2015-02-06       Impact factor: 3.984

3.  Efficacy of signal peptide predictors in identifying signal peptides in the experimental secretome of Picrophilous torridus, a thermoacidophilic archaeon.

Authors:  Neelja Singhal; Anjali Garg; Nirpendra Singh; Pallavi Gulati; Manish Kumar; Manisha Goel
Journal:  PLoS One       Date:  2021-08-06       Impact factor: 3.240

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.