Hosein Mohimani, Roland D Kersten, Wei-Ting Liu, Mingxun Wang, Samuel O Purvine, Si Wu, Heather M Brewer, Ljiljana Pasa-Tolic, Nuno Bandeira, Bradley S Moore, Pavel A Pevzner, Pieter C Dorrestein.
Abstract
Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially from microbial sources, are a large group of bioactive natural products that are a promising source of new (bio)chemistry and bioactivity (1). In light of exponentially growing microbial genome databases and improved mass spectrometry (MS)-based metabolomic platforms, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes from metabolomic data sets. Here, we introduce RiPPquest, a tandem mass spectrometry database search tool for identification of microbial RiPPs, and apply it to lanthipeptide discovery. RiPPquest uses genomics to limit the search space to the vicinity of RiPP biosynthetic genes and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). We highlight RiPPquest by connecting multiple RiPPs from extracts of Streptomyces to their gene clusters and by the discovery of a new class III lanthipeptide, informatipeptin, from Streptomyces viridochromogenes DSM 40736. The name reflects that it is a natural product discovered by mass spectrometry-based genome mining using algorithmic tools rather than manual inspection of mass spectrometry data and genetic information. The presented tool is available at cyclo.ucsd.edu.
Year: 2014 PMID: 24802639 PMCID: PMC4215869 DOI: 10.1021/cb500199h
Source DB: PubMed Journal: ACS Chem Biol ISSN: 1554-8929 Impact factor: 5.100
Figure 1. Workflow implemented in the RiPPquest algorithm for automated peptidogenomics of RiPPs. (a) Prediction of lanthipeptide gene clusters in the microbial genome sequence. (b) Generation of 10 kb windows centered at the LANC domain of each gene cluster. (c) Prediction of ORFs in each gene cluster. (d) Selection of all candidate precursor peptide ORFs shorter than 100 aa. (e) Generation of candidate core peptides from the C-terminal half of each selected ORF. (f) Generation of all biosynthetic and gas-phase products of each core peptide, exemplified by peptide TFCRS. (g) Generation of an MS/MS peptide database of predicted lanthipeptide products. (h) MS/MS analysis of the microbial extract. (i) Matching of MS/MS data against the MS/MS lanthipeptide spectral database with computed p-values. (j) Molecular network analysis of MS/MS data to identify peptide homologues and to confirm PSMs.
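Steps (b), (d), (e), and (f) of the workflow can be illustrated with a minimal Python sketch. This is not the RiPPquest implementation: the names `window_around`, `core_peptide`, and `candidates` are hypothetical, the rounding used for the "C-terminal half" is an assumption, and the enumeration covers only dehydration counts (from zero up to the number of Ser/Thr residues in the core), not the gas-phase products or other modifications mentioned in the caption.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

MAX_PRECURSOR_LEN = 100  # step (d): keep ORFs shorter than 100 aa


@dataclass(frozen=True)
class Candidate:
    core: str          # candidate core peptide sequence
    dehydrations: int  # number of water losses assumed for this product


def window_around(center: int, genome_len: int, size: int = 10_000) -> Tuple[int, int]:
    """Step (b): a window of `size` bp centred on `center`, clipped to the genome."""
    half = size // 2
    return max(0, center - half), min(genome_len, center + half)


def core_peptide(orf: str) -> str:
    """Step (e): C-terminal half of the ORF (integer rounding is an assumption)."""
    return orf[len(orf) // 2:]


def candidates(orfs: List[str]) -> Iterator[Candidate]:
    """Steps (d)-(f): filter short ORFs, take the core, enumerate dehydration counts."""
    for orf in orfs:
        if len(orf) >= MAX_PRECURSOR_LEN:
            continue  # too long to be a candidate precursor peptide
        core = core_peptide(orf)
        # Only Ser and Thr can be dehydrated, so they bound the count.
        max_dehyd = sum(core.count(residue) for residue in "ST")
        for n in range(max_dehyd + 1):
            yield Candidate(core, n)


if __name__ == "__main__":
    # Toy ORF whose C-terminal half is the caption's example peptide TFCRS.
    for cand in candidates(["MKAAATFCRS"]):
        print(cand)
```

Each candidate would then be turned into a theoretical spectrum for the database in step (g); that part is outside the scope of this sketch.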
Top Lanthipeptides Discovered by the RiPPquest Automated Peptidogenomics Pipelineᵃ
| peptide | species | MS/MS type | MS/MS precursor m/z | z | p-value | annotation |
|---|---|---|---|---|---|---|
| informatipeptin | | CID | 1065.56 | 2 | 9.2 × 10⁻⁸ | TDGGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | HCD | 1065.54 | 2 | 0.005 | TDGGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | CID | 1015.02 | 2 | 0.013 | DGGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | HCD | 1015.02 | 2 | 0.03 | DGGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | CID | 957.51 | 2 | 0.0026 | GGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | HCD | 957.50 | 2 | 0.0098 | GGGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | CID | 928.49 | 2 | 0.0003 | GGASTVSLLSCISAASVLLCL (6 Dehyd) |
| | | HCD | 928.99 | 2 | 0.012 | GGASTVSLLSCISAASVLLCL (6 Dehyd) |
| AmfS | | CID | 1028.04 | 2 | 2.5 × 10⁻¹⁰ | GSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| SRO-2212 | | CID | 1107.07 | 2 | 6.3 × 10⁻¹⁰ | TGSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | HCD | 1107.06 | 2 | 5.3 × 10⁻⁷ | TGSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | HCD | 738.37 | 3 | 6.9 × 10⁻⁵ | TGSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | CID | 1056.55 | 2 | 2.0 × 10⁻⁹ | GSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | HCD | 1065.53 | 2 | 0.001 | GSQVSLLVCEYSSLSVVLCTP (3 Dehyd) |
| | | HCD | 710.693 | 3 | 0.001 | GSQVSLLVCEYSSLSVVLCTP (3 Dehyd) |
| | | CID | 1142.58 | 2 | 3.8 × 10⁻⁶ | ATGSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | HCD | 1142.57 | 2 | 0.013 | ATGSQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| | | HCD | 1028.02 | 2 | 5.0 × 10⁻⁵ | SQVSLLVCEYSSLSVVLCTP (4 Dehyd) |
| SRO-3108 | | CID | 1555.68 | 2 | 1.2 × 10⁻⁶ | TTWACATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | HCD | 1564.17 | 2 | 0.02 | TTWACATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | CID | 1582.81 | 2 | 1.0 × 10⁻⁷ | {TTWAC}+54ATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | CID | 1636.71 | 2 | 0.001 | {TTWAC}+162ATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | CID | 1037.11 | 2 | 0.0001 | TTWACATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | HCD | 1037.11 | 3 | 0.0007 | TTWACATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | CID | 1055.59 | 3 | 0.001 | {TTWAC}+54ATVTLTVTVCSPTGTLCGSCSMGTRGCC (9 Dehyd) |
| | | CID | 1067.92 | 3 | 0.0001 | TVTVCSPTGTLCGSCSMGTRGCC (5 Dehyd) |
| | | HCD | 1067.42 | 2 | 7.7 × 10⁻⁷ | TVTVCSPTGTLCGSCSMGTRGCC (5 Dehyd) |
| | | CID | 1123.99 | 2 | 4.6 × 10⁻⁷ | LTVTVCSPTGTLCGSCSMGTRGCC (5 Dehyd) |
| | | CID | 1166.47 | 2 | 0.0002 | TLTVTVCSPTGTLCGSCSMGTRGCC (6 Dehyd) |
| | | CID | 1215.54 | 2 | 0.0003 | VTLTVTVCSPTGTLCGSCSMGTRGCC (6 Dehyd) |
| | | CID | 1256.13 | 2 | 0.038 | TVTLTVTVCSPTGTLCGSCSMGTRGCC (7 Dehyd) |
| | | CID | 1075.42 | 2 | 0.009 | TVTVCS{PTGTLCGSCSMG}+16TRGCC (4 Dehyd) |
| | | CID | 1174.49 | 2 | 5.6 × 10⁻⁸ | TLTVTVCSPTGTLCGSCSMGTRGCC (5 Dehyd) |
| SapB | | CID | 1013.99 | 1 | 0.02 | TGSRASLLLCGDSSLSITTCN (4 Dehyd) |
| SapB | | HCD | 934.96 | 2 | 0.016 | SRASLLLCGDSSLSITTCN (4 Dehyd) |
| | | HCD | 1013.99 | 2 | 4.0 × 10⁻⁵ | TGSRASLLLCGDSSLSITTCN (4 Dehyd) |
| SAL-2242 | | HCD | 1062.53 | 2 | 5.7 × 10⁻⁵ | GSQISLLICEYSSLSVTLCTP (4 Dehyd) |
| | | HCD | 1113.06 | 2 | 0.015 | TGSQISLLICEYSSLSVTLCTP (5 Dehyd) |
| | | HCD | 1122.06 | 2 | 0.017 | TGSQISLLICEYSSLSVTLCTP (4 Dehyd) |
ᵃ The number of dehydrations involved in lanthipeptide processing is shown in each case. Abbreviations: m/z, mass-to-charge ratio; z, precursor ion charge; Dehyd, dehydration. Corresponding annotated spectra are shown in SI Figure S4.
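The relation between an annotation in the table (sequence plus dehydration count) and its precursor m/z can be checked with a short calculation. The sketch below is an illustration rather than RiPPquest code: it assumes standard monoisotopic residue masses, that each dehydration removes exactly one water, and that the precursor is a multiply protonated ion; the function name `precursor_mz` is hypothetical. The bracketed mass offsets used in some annotations (for example `+54`) are not handled.

```python
# Standard monoisotopic residue masses (Da) for the 20 proteinogenic amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def precursor_mz(sequence: str, dehydrations: int, charge: int) -> float:
    """m/z of the [M + zH]z+ ion of a linear peptide after `dehydrations` water losses."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Residues plus one water for the free termini, minus one water per dehydration.
    neutral = sum(MONO[aa] for aa in sequence) + WATER - dehydrations * WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    # Full-length informatipeptin annotation from the table: 6 Dehyd, z = 2.
    print(round(precursor_mz("TDGGGASTVSLLSCISAASVLLCL", 6, 2), 2))
    # SRO-2212 annotation from the table: 4 Dehyd, z = 2 and z = 3.
    print(round(precursor_mz("TGSQVSLLVCEYSSLSVVLCTP", 4, 2), 2))
    print(round(precursor_mz("TGSQVSLLVCEYSSLSVVLCTP", 4, 3), 2))
```

Under these assumptions the computed values fall within a few hundredths of an m/z unit of the corresponding table entries (1065.54 to 1065.56, 1107.06 to 1107.07, and 738.37).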
Figure 2. Characterization of class III lanthipeptide informatipeptin from Streptomyces viridochromogenes DSM 40736 by RiPPquest. (a) PSM of informatipeptin. (b) Gene cluster analysis of informatipeptin. (c) Predicted structures of informatipeptin. Abbreviation: N/A = not annotated.