Comparative Membrane Proteomics Reveals a Nonannotated E. coli Heat Shock Protein.

Peijia Yuan^1,2, Nadia G D'Lima^1,2, Sarah A Slavoff^1,2,3.

Abstract

Recent advances in proteomics and genomics have enabled discovery of thousands of previously nonannotated small open reading frames (smORFs) in genomes across evolutionary space. Furthermore, quantitative mass spectrometry has recently been applied to analysis of regulated smORF expression. However, bottom-up proteomics has remained relatively insensitive to membrane proteins, suggesting they may have been underdetected in previous studies. In this report, we add biochemical membrane protein enrichment to our previously developed label-free quantitative proteomics protocol, revealing a never-before-identified heat shock protein in Escherichia coli K12. This putative smORF-encoded heat shock protein, GndA, is likely to be ∼36-55 amino acids in length and contains a predicted transmembrane helix. We validate heat shock-regulated expression of the gndA smORF and demonstrate that a GndA-GFP fusion protein cofractionates with the cell membrane. Quantitative membrane proteomics therefore has the ability to reveal nonannotated small proteins that may play roles in bacterial stress responses.

Year: 2017; PMID: 29039649; PMCID: PMC5761644; DOI: 10.1021/acs.biochem.7b00864

Source DB: PubMed; Journal: Biochemistry; ISSN: 0006-2960; Impact factor: 3.162

Despite their varied and often essential functions, small proteins have been consistently underannotated in both prokaryotic and eukaryotic genomes.[1] Small open reading frame (smORF)-encoded small proteins function in bacteria as regulators of sporulation, cell division, membrane transport, membrane-bound enzymes, protein kinases, and chaperones.[1−7] In a study of 51 recently discovered small Escherichia coli proteins, 21 were upregulated under a specific stress or growth condition.[8] Notably, 90% of the small proteins that exhibited regulated expression were predicted to contain single transmembrane helices.[8] It is therefore reasonable to hypothesize that additional small, membrane-associated bacterial stress response proteins remain to be discovered. Of the three leading technologies for smORF discovery (computational genomics,[9] ribosome footprinting,[10,11] and liquid chromatography–tandem mass spectrometry (LC–MS/MS) proteomics[12−15]), LC–MS/MS has the advantage of direct detection of peptides derived from nonannotated proteins and has recently been extended to quantitative analysis.[16−19] However, bottom-up LC–MS/MS proteomics affords relatively poor detection of membrane proteins due to their low abundance and hydrophobicity,[20,21] suggesting that membrane-associated, nonannotated small proteins may have been missed by previous quantitative LC–MS/MS studies. To address this limitation, we present a workflow for quantitative membrane proteomics. We apply this methodology to the E. coli K12 heat shock response, enabling the discovery of a previously nonannotated, membrane-associated small heat shock protein, which we provisionally name GndA.

We and others recently reported a label-free quantitation protocol for comparative profiling of nonannotated peptides between two conditions.[16,19] Figure 1A provides an overview of a membrane-focused quantitative proteomic workflow. Briefly, E. coli K12 substr. MG1655 is grown under standard (control) conditions or subjected to heat shock, then lysed. Cell membranes are pelleted via ultracentrifugation, and the membrane proteome is resolubilized and separated on a peptide gel.[15] Protein bands of low molecular weight are excised and subjected to trypsin digest. The digest is then fractionated by electrostatic repulsion hydrophilic interaction chromatography (ERLIC), and the fractions are analyzed by LC–MS/MS. Subsequently, the data are searched against a 6-frame translation of the E. coli K12 MG1655 genome using MASCOT, permitting identification of both known and nonannotated peptides. Annotated peptides are excluded with a string-matching algorithm[14] via comparison to the E. coli K12 MG1655 proteome. For semiquantitative, comparative analysis of peptide abundance, sequences detected only in the heat shock sample and not the control are identified, then MS1 extracted ion chromatogram (EIC) peak intensities at the same retention time are compared.
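The database-construction and filtering steps described above can be illustrated with a short Python sketch. This is not the authors' pipeline (the search itself was run in MASCOT, and the string-matching code of ref 14 is not reproduced here). The function names and toy inputs are hypothetical, and the standard codon table is used on the assumption that its sense-codon assignments match the bacterial genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed adequate
# for E. coli sense codons; alternative start codons are not modeled here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translate(dna: str) -> dict:
    """Translate a DNA sequence in all six reading frames.

    Keys are '+1', '+2', '+3' (forward strand) and '-1', '-2', '-3'
    (reverse strand). Stop codons appear as '*', unknown codons as 'X'.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
            frames[f"{strand}{offset + 1}"] = "".join(
                CODON_TABLE.get(codon, "X") for codon in codons
            )
    return frames


def exclude_annotated(peptides, annotated_proteins):
    """Keep only peptides that are not a substring of any annotated protein."""
    return [p for p in peptides
            if not any(p in protein for protein in annotated_proteins)]


def heat_shock_only(heat_shock_peptides, control_peptides):
    """Return peptides identified in the heat shock sample but not the control."""
    return sorted(set(heat_shock_peptides) - set(control_peptides))
```

In practice, each translated frame would be split at stop codons to produce the search database, and peptides surviving both filters would then be carried forward to the EIC comparison.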

Figure 1

Discovery of small open reading frame (smORF)-encoded membrane proteins through quantitative proteomics. (A) Overview of membrane-targeted quantitative proteomic discovery protocol. (B) MS/MS spectrum corresponding to a nonannotated tryptic peptide fragment detected only in the heat shock sample. Identified y-ions and b-ions are shown in red on the spectrum and indicated on the peptide sequence to which the spectrum was matched. (C) Extracted ion chromatograms (EICs) comparing peaks (shown in stick mode) corresponding to the peptide ion m/z value detected in (B) in heat shock and control conditions at the same retention time. The same y-axis scale is used in both conditions. A viewing window of 1 Da around the parent ion mass is used.

Prior to analysis of nonannotated sequences, we first validated our workflow’s ability to quantify differential expression of peptides from an annotated heat shock protein. Comparing the EICs for selected tryptic fragments of the heat shock protein DnaJ (Hsp40) and a known non-heat shock protein, 50S ribosomal subunit protein L6 (RplF), verified that the DnaJ peptide was detected only during heat shock, while the RplF peptide was detected equally under both normal growth and heat shock (Supporting Information (SI), Figure S1), as expected. Therefore, we reliably distinguished heat shock responsive vs constitutive expression using label-free quantitation.

Second, we analyzed our workflow’s size selectivity. To do so, we first plotted the sizes of all annotated proteins identified in both our heat shock and control samples that were subjected to membrane enrichment. This analysis revealed a clear enrichment of small proteins, with the most commonly detected protein sizes ranging from 10 to 20 kDa (SI, Figure S2A), similar to size distributions obtained for soluble proteins in past LC–MS/MS proteomics studies of smORFs.[14−16]

Finally, we confirmed that we obtained an enrichment in peptides derived from membrane proteins by comparing our membrane-enriched control sample (not subjected to heat shock) to a previously reported sample grown under similar conditions that was not subjected to membrane preparation.[19] We compared all of the annotated proteins identified using our membrane-enriched sample and the sample without membrane enrichment against a list of all E. coli K12 substr. MG1655 membrane proteins obtained from EcoCyc. These searches showed that 412/1208, or 34%, of annotated proteins detected from the membrane enrichment workflow had a membrane localization annotation, as opposed to 488/1849, or 26%, of annotated proteins detected from the regular workflow without enriching for membrane proteins (SI, Figure S2B). Of these proteins with membrane annotation, we detected peptides from 135 of them only in the workflow with membrane enrichment. These results suggest that our workflow provides an enhancement in the detection of peptides derived from membrane proteins while retaining small size selectivity.

The results of our proteomic analysis of heat shock and control samples are presented in SI, Proteomic results, and protein-level identifications are ranked according to sequence coverage. Because we focused on molecular genetic validation rather than statistical analysis of replicates to identify GndA as a heat shock protein (vide infra), we note that only a single experimental replicate is presented, so any other candidate heat shock-specific peptides must be considered putative. Nevertheless, our data set may aid hypothesis generation about regulated expression of predicted proteins. For example, peptides mapping to four known or predicted small proteins without currently annotated heat shock functions were detected in the heat shock sample but not in the control sample (SI, Figures S3 and S4). Two of these proteins are known or predicted to localize to the membrane (YfgG and YghG), and three currently lack functional characterization. Further experiments will be required to test heat shock responsive expression of these proteins.

In our heat shock data set, we identified precisely one nonannotated tryptic peptide exhibiting excellent sequence coverage (Figure 1B). Comparative analysis of the extracted ion chromatogram for this nonannotated tryptic peptide revealed MS1 ion intensity in our heat shock sample and not in the control (Figure 1C). This nonannotated peptide maps uniquely to an open reading frame (ORF) that is contained entirely within the gene gnd in the +2 reading frame (Figure 2). The putative protein that would be produced by translation of this ORF would therefore be completely different from Gnd at the amino acid level. Because of its coencoding with gnd, we refer to the smORF as gndA. There are two in-frame ATG codons upstream of the sequence putatively encoding the peptide detected by LC–MS/MS, either of which could plausibly initiate translation of GndA (Figure 2B). The length of GndA would thus most likely be 36–54 amino acids. Because bottom-up proteomics does not provide full sequence coverage for this putative protein, we have not yet confirmed the start codon or complete primary sequence for GndA, and it remains possible that neither in-frame ATG codon is the correct start site for this protein.
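The start-codon reasoning above (scan upstream of the detected peptide, in its reading frame, for ATG codons before the next stop codon) can be sketched in Python as follows. The function name, coordinates, and toy sequence are hypothetical and are not the gnd sequence. Only ATG is considered, mirroring the text, although non-ATG start codons are also possible in bacteria.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_atg_candidates(dna: str, peptide_start: int, stop_index: int):
    """List in-frame ATG codons upstream of a detected peptide.

    dna           -- nucleotide sequence containing the ORF
    peptide_start -- 0-based index of the first nucleotide encoding the peptide
    stop_index    -- 0-based index of the first nucleotide of the ORF stop codon

    Returns (atg_index, protein_length_in_residues) tuples, nearest ATG first.
    Scanning stops at the first in-frame stop codon upstream of the peptide.
    """
    dna = dna.upper()
    if (stop_index - peptide_start) % 3 != 0:
        raise ValueError("peptide start and stop codon are not in the same frame")

    candidates = []
    pos = peptide_start - 3
    while pos >= 0:
        codon = dna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # the ORF cannot extend past an in-frame stop codon
        if codon == "ATG":
            # Residues from this ATG up to (not including) the stop codon.
            candidates.append((pos, (stop_index - pos) // 3))
        pos -= 3
    return candidates


# Toy sequence (hypothetical): stop, ATG, GCT, ATG, AAA, peptide-coding region, stop.
toy = "TAA" "ATG" "GCT" "ATG" "AAA" "GGTCTGCGT" "TAA"
print(upstream_atg_candidates(toy, peptide_start=15, stop_index=24))
# -> [(9, 5), (3, 7)]
```

Each candidate start gives a different predicted protein length, which is why the text reports a length range rather than a single value.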

Figure 2

Location of the nonannotated gene, gndA, within the E. coli MG1655 genome. (A) A gene locus diagram shows the coordinate of the stop codon downstream of a frame-shifted sequence within the annotated gnd gene. Sizes are proportional to gene lengths and directionality of coding sequences is indicated with arrows. (B) The coding sequence of gnd is shown with the sequence corresponding to the tryptic peptide fragment detected by MS/MS bolded and underlined. Highlighted in red are two upstream, in-frame candidate ATG start codons.

Because we identified only a single tryptic peptide that mapped to GndA, rigorous molecular genetic confirmation of its expression was required. We verified that gndA was expressed and upregulated during heat shock by generating a chromosomally tagged strain with the coding sequence for the tandem epitope tag SPA[8] integrated at the 3′ end of the predicted gndA smORF. We confirmed the site of SPA tag insertion via integration check PCR and sequencing (SI, Figure S5). We grew the SPA-tagged strain under control and heat shock conditions and specifically detected expression of an immunoreactive band during heat shock (Figure 3). (Many membrane proteins exhibit anomalous mobility in SDS-PAGE,[22] so the apparent migration of GndA-SPA may not exactly correlate with its molecular weight.) This result is consistent with expression of a small protein in the gndA reading frame during heat shock.

Figure 3

gndA is expressed and upregulated during heat shock. (A) An E. coli MG1655 strain was generated with the SPA epitope tag (followed by a kanamycin selection marker, kan) introduced at the C-terminus of GndA. (B) Cell lysates of SPA-tagged and wild-type E. coli MG1655 strains grown at 30 °C (control) and 45 °C (heat shock) were separated on a 16% tricine gel and stained with Coomassie blue (right). Western blotting was performed on the same samples using anti-FLAG antibody to detect a portion of the SPA tag (left).

In the absence of a complete assignment of the gndA coding sequence, it remained possible that the observed peptide was generated via an alternative mechanism, such as ribosomal frameshifting during translation of 6-phosphogluconate dehydrogenase (Gnd), the protein product of gnd. We therefore tested whether GndA can be translated independently. We generated a pET21a plasmid containing the genomic sequence from the annotated ATG start codon of gnd to the stop codon of gndA. GFP was fused to the C-terminus of GndA to enhance stability and enable immunoblotting. We also generated a second construct in which the start codon of gnd was deleted. Expression of both constructs in BL21 cells produced the same product, which migrates at a slightly higher apparent molecular weight than GFP alone (SI, Figure S6). This result is consistent with independent translation of GndA, although it does not exclude all alternative interpretations.

Bioinformatic and biochemical analyses suggest that the predicted primary sequence of GndA may correspond to a small transmembrane protein. A portion of the putative GndA sequence (Figure 4A), highlighted in red, was predicted by three programs (TMPred, Phobius,[23,24] and PredictProtein[25]) to form a transmembrane helix. Using the GFP fusion construct employed in SI, Figure S6, we performed a membrane fractionation. We verified by Western blotting that GndA-GFP is highly enriched in the membrane pellet after ultracentrifugation as compared to total clarified lysate and the soluble fraction, consistent with membrane localization (Figure 4C). A BLAST search against the NCBI nonredundant protein database did not reveal significant homology between GndA and known proteins (data not shown), and the predicted primary sequence of GndA lacks a signal sequence. Therefore, determination of the full sequence, function, mechanism of membrane insertion, inner vs outer membrane localization, and orientation of GndA in the membrane will require further study.
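To give a rough sense of how a transmembrane helix can be predicted from primary sequence, the sketch below computes a Kyte-Doolittle sliding-window hydropathy profile in Python. This is a simpler, classical heuristic and is not the algorithm used by TMPred, Phobius, or PredictProtein. The 19-residue window and 1.6 threshold are the conventional Kyte-Doolittle choices, assumed here for illustration, and the test sequences are hypothetical rather than the GndA sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list:
    """Return the mean hydropathy of every window-length segment."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(sequence: str, window: int = 19,
                             threshold: float = 1.6) -> bool:
    """True if any window reaches the hydropathy threshold.

    Sequences shorter than the window yield an empty profile and return False.
    """
    return any(score >= threshold
               for score in hydropathy_profile(sequence, window))


# Hypothetical sequences for illustration only.
hydrophobic_core = "MKR" + "LIVALLIVALLIVALLIVAL" + "KRDE"
charged_only = "DEKRDEKRDEKRDEKRDEKRDEKR"
print(has_candidate_tm_segment(hydrophobic_core))  # hydrophobic stretch present
print(has_candidate_tm_segment(charged_only))      # no hydrophobic stretch
```

A positive result from a heuristic like this only flags a candidate segment; as the text notes, membrane association still has to be tested biochemically.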

Figure 4

GndA is enriched in the membrane fraction. (A) The hypothetical primary sequence of GndA contains a predicted transmembrane helix (red). (B) BL21 cells were transformed with a pET21a plasmid encoding a GndA-GFP fusion protein. (C) Cell lysates were fractionated, separated on a 16% tricine gel, and stained with Coomassie blue as a loading control (right). Western blotting was performed on the same samples, probing using anti-GFP antibody (left). kDa, molecular weight ladder; CL, clarified lysate; S, soluble fraction; M, membrane pellet; PL, preclarified lysate.

In summary, we have developed an LC–MS/MS method to detect a peptide derived from a nonannotated small membrane protein regulated by heat shock, GndA. Notably, gndA would have been difficult to identify through alternative approaches to smORF discovery, including bioinformatics and ribosome footprinting, because the frameshifted gndA coding sequence is completely contained within the larger gnd sequence. Thus, our method presents a complementary approach to new gene discovery. In the future, we anticipate that this method can be extended to profiling of nonannotated membrane proteins expressed under different stress conditions and in other organisms.

25 in total

1. 'Intergenic' blr gene in Escherichia coli encodes a 41-residue membrane protein affecting intrinsic susceptibility to certain inhibitors of peptidoglycan synthesis.

Authors: R S Wong; L M McMurry; S B Levy
Journal: Mol Microbiol Date: 2000-07 Impact factor: 3.501

2. A combined transmembrane topology and signal peptide prediction method.

Authors: Lukas Käll; Anders Krogh; Erik L L Sonnhammer
Journal: J Mol Biol Date: 2004-05-14 Impact factor: 5.469

Review 3. Proteomics of integral membrane proteins--theory and application.

Authors: Anna E Speers; Christine C Wu
Journal: Chem Rev Date: 2007-08 Impact factor: 60.622

4. Improved Identification and Analysis of Small Open Reading Frame Encoded Polypeptides.

Authors: Jiao Ma; Jolene K Diedrich; Irwin Jungreis; Cynthia Donaldson; Joan Vaughan; Manolis Kellis; John R Yates; Alan Saghatelian
Journal: Anal Chem Date: 2016-03-24 Impact factor: 6.986

5. Comparative proteogenomics of twelve Roseobacter exoproteomes reveals different adaptive strategies among these marine bacteria.

Authors: Joseph Alexander Christie-Oleza; Juana Maria Piña-Villalonga; Rafael Bosch; Balbina Nogales; Jean Armengaud
Journal: Mol Cell Proteomics Date: 2011-11-28 Impact factor: 5.911

6. An unusually small gene required for sporulation by Bacillus subtilis.

Authors: P A Levin; N Fan; E Ricca; A Driks; R Losick; S Cutting
Journal: Mol Microbiol Date: 1993-08 Impact factor: 3.501

7. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling.

Authors: Nicholas T Ingolia; Sina Ghaemmaghami; John R S Newman; Jonathan S Weissman
Journal: Science Date: 2009-02-12 Impact factor: 47.728

8. PredictProtein--an open resource for online prediction of protein structural and functional features.

Authors: Guy Yachdav; Edda Kloppmann; Laszlo Kajan; Maximilian Hecht; Tatyana Goldberg; Tobias Hamp; Peter Hönigschmid; Andrea Schafferhans; Manfred Roos; Michael Bernhofer; Lothar Richter; Haim Ashkenazy; Marco Punta; Avner Schlessinger; Yana Bromberg; Reinhard Schneider; Gerrit Vriend; Chris Sander; Nir Ben-Tal; Burkhard Rost
Journal: Nucleic Acids Res Date: 2014-05-05 Impact factor: 16.971

9. Feedback inhibition in the PhoQ/PhoP signaling system by a membrane peptide.

10. Discovery of human sORF-encoded polypeptides (SEPs) in cell lines and tissue.
