Vaishnavi Ravikumar, Nicolas C Nalpas, Viktoria Anselm, Karsten Krug, Maša Lenuzzi, Martin Sebastijan Šestak, Tomislav Domazet-Lošo, Ivan Mijakovic, Boris Macek.
Abstract
Bacillus subtilis is a sporulating Gram-positive bacterium widely used in basic research and biotechnology. Despite being one of the best-characterized bacterial model organisms, recent proteomics studies identified only about 50% of its theoretical protein count. Here we combined several hundred MS measurements to obtain a comprehensive map of the proteome, phosphoproteome and acetylome of B. subtilis grown at 37 °C in minimal medium. We covered 75% of the theoretical proteome (3,159 proteins), detected 1,085 phosphorylation and 4,893 lysine acetylation sites and performed a systematic bioinformatic characterization of the obtained data. A subset of the analyzed MS files allowed us to reconstruct a network of Hanks-type protein kinases, Ser/Thr/Tyr phosphatases and their substrates. We applied genomic phylostratigraphy to gauge the evolutionary age of B. subtilis protein classes and revealed that protein modifications were present on the oldest bacterial proteins. Finally, we performed a proteogenomic analysis by mapping all MS spectra onto a six-frame translation of the B. subtilis genome and found evidence for 19 novel ORFs. We provide the most extensive overview of the proteome and post-translational modifications for B. subtilis to date, with insights into functional annotation and evolutionary aspects of the B. subtilis genome.
Year: 2018 PMID: 30467398 PMCID: PMC6250715 DOI: 10.1038/s41598-018-35589-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Functional annotation of Bacillus subtilis proteins. Proteome categories comprise: Detected = all identified proteins; Undetected = proteins not identified in our study; Phosphorylated = all proteins found phosphorylated at least once; Acetylated = all proteins found acetylated at least once. On the x-axis, the foreground ratio is plotted for each KEGG pathway; this ratio is the number of proteins in a given category and pathway divided by the total number of proteins in that pathway. On the y-axis, the KEGG pathway description is displayed. The color gradient corresponds to the p-value adjusted for multiple testing, from lowest (red) to highest (blue). The size of the dots corresponds to the protein count per KEGG pathway for each category.
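The foreground ratio described in the Figure 1 caption can be illustrated with a minimal sketch. The function name, pathway labels and protein identifiers below are hypothetical toy values chosen for illustration; they are not taken from the study, and this is not the authors' analysis code.

```python
def foreground_ratios(pathway_members, category_members):
    """Return {pathway: ratio}, where ratio is the number of proteins that are
    both in the pathway and in the category, divided by the pathway size."""
    ratios = {}
    for pathway, proteins in pathway_members.items():
        if not proteins:
            continue  # skip empty pathways to avoid division by zero
        hits = proteins & category_members
        ratios[pathway] = len(hits) / len(proteins)
    return ratios


# Toy data (assumed): two pathways and one proteome category.
pathways = {
    "pathway_A": {"p1", "p2", "p3", "p4"},
    "pathway_B": {"p3", "p5"},
}
detected = {"p1", "p2", "p3"}

ratios = foreground_ratios(pathways, detected)
for name, value in sorted(ratios.items()):
    print(f"{name}: {value:.2f}")
```

The same function could be called once per category (Detected, Undetected, Phosphorylated, Acetylated) to obtain one ratio per pathway and category.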
Figure 2. Interaction network of regulated putative substrates (direct or indirect) of all analyzed kinases and phosphatases. Kinases and their phosphorylation events are depicted in red, and phosphatases and their dephosphorylation events are depicted in black. Proteins that are regulated by the respective kinases and phosphatases in the opposite direction are indicated by light grey lines. Respective putative direct substrates are depicted in brown. All proteins in the above interaction network are referred to as nodes and their interactions as edges.
Figure 3. The phylostratigraphy map of the B. subtilis subsp. subtilis str. 168 genome depicting the distribution of: (a) detected and undetected, (b) phosphorylated and acetylated proteins. Estimated evolutionary origins of individual genes are mapped on the depicted reference evolutionary tree (x-axis). All B. subtilis genes have been distributed into 15 groups (phylostrata) according to the estimated point of emergence of their protein family founders. In panel (a), the y-axis denotes the percentage of detected and undetected proteins out of the total number of proteins assigned to each phylostratum. In panel (b), the y-axis denotes the percentage of modified proteins out of the total number of proteins assigned to each phylostratum.
Figure 4. MS/MS quality and genomic coverage. (a) Circos graph representation of the B. subtilis genome, including the annotated and potentially novel ORFs expressed in this study. (b) Venn diagram illustrating the MS coverage at peptide and protein levels in the context of the B. subtilis genome. (c) Visualization of the genomic region for seq_51322. The top panel includes known ORFs (in blue for the + strand and in red for the − strand) and all ORFs generated from the six-frame genome translation (in green for the + strand and in orange for the − strand); color lightness indicates whether the ORF was identified in our data (dark color for expressed ORFs and light color for non-expressed ORFs). The middle panel is zoomed in on the expressed novel ORF of interest to show peptide sequences (peptides mapping to the novel ORF in khaki, peptides from known ORFs in purple) and RT-PCR nucleotide sequences at this genomic location. The bottom panel contains the MS/MS spectra of the top-scoring novel PSM.
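The six-frame genome translation mentioned in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the helper names and the toy DNA string are hypothetical, the standard codon assignments are assumed (start-codon rules are ignored), and candidate ORFs are simply taken as stop-to-stop segments.

```python
from itertools import product

# Standard codon assignments, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def stop_to_stop_orfs(protein, min_len=1):
    """Split a translated frame at stop codons ('*') into candidate ORFs."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


# Toy sequence (assumed), not a B. subtilis genomic region.
frames = six_frame_translation("ATGAAATAGGCC")
for label, protein in frames.items():
    print(label, protein, stop_to_stop_orfs(protein))
```

In a proteogenomic search, the candidate ORFs from all six frames would typically be filtered by a minimum length and written to a sequence database against which MS/MS spectra are matched.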
Novel ORFs selected for RT-PCR validation.
| Database ID | Sequences | PCR validated | ORF length (aa) | Annotation |
|---|---|---|---|---|
| seq_154909 | FIKISRSESASK, ISRSESASK | Yes | 70 | Uncharacterised (no BLAST results) |
| seq_51322 | LENKTNNQLLVK, TNNQLLVK, VNSALNSLVK | Yes | 69 | Uncharacterised (no BLAST results) |
| seq_163507 | NVMYRLCYFLSEK, SPGMFSGLFVFK | No | 57 | Unclear/false positive |
| seq_134853 | AFGRMLRLILMMPMK, RWLALSSRQSCCLIGNTIIGAWISSSNEFIN | No | 51 | Unclear/false positive |
| seq_145510 | SVMLSAVQELLCGSILK, TLLNYFLRPAMNLFPAK | Yes | 45 | Erroneous termination of P42977 |
| seq_49263 | SLRYLHQETVQTSK, YLHQETVQTSKPSSR | Yes | 20 | Erroneous termination of P12043 |
| seq_223100 | MNISSNVCRPMIMLK, NISSNVCRPMIMLK | No | 17 | Unclear/false positive |
Potentially novel ORFs identified in this study, including associated peptide sequences and possible annotation for these events.