Michael A Skinnider, Chad W Johnston, Nishanth J Merwin, Chris A Dejong, Nathan A Magarvey.
Abstract
BACKGROUND: Among naturally occurring small molecules, tRNA-derived cyclodipeptides are a class that has attracted attention for its diverse and desirable biological activities. However, no tools are available to link cyclodipeptide synthases identified within prokaryotic genome sequences to their chemical products. Consequently, it is unclear how many genetically encoded cyclodipeptides represent novel products, and which producing organisms should be targeted for discovery.
Keywords: Cyclodipeptide synthase; Genome mining; Secondary metabolism
Year: 2018 PMID: 29334896 PMCID: PMC5767969 DOI: 10.1186/s12864-018-4435-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Schematic overview of an algorithm for tRNA-derived cyclodipeptide identification and prediction. Given a microbial genome sequence as input, our algorithm uses a hidden Markov model to identify cyclodipeptide synthases (CDPSs) and cluster them with surrounding biosynthetic and resistance enzymes, including tailoring enzymes specific to cyclodipeptide biosynthesis. CDPS active site residues are identified by multiple sequence alignment to a large database of CDPS sequences, and their aminoacyl-tRNA substrates are predicted by a naive Bayes classifier. Finally, tailoring reactions are executed to generate a predicted cyclodipeptide chemical structure
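The substrate prediction step in Fig. 1 can be illustrated with a minimal sketch of a categorical naive Bayes classifier over active site residues. This is not the authors' implementation: the function names, the Laplace smoothing constant, and the four-residue toy training data are all hypothetical, chosen only to show how per-position residue frequencies can be combined into a substrate prediction.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def train(examples, alpha=1.0):
    """Count class and per-position residue frequencies.

    examples: list of (active_site_residues, substrate_label) pairs,
    where all residue strings have the same length.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter(label for _, label in examples)
    # residue_counts[label][position][residue] -> count
    residue_counts = defaultdict(lambda: defaultdict(Counter))
    for residues, label in examples:
        for pos, aa in enumerate(residues):
            residue_counts[label][pos][aa] += 1
    return class_counts, residue_counts, alpha


def predict(model, residues):
    """Return the substrate label with the highest log posterior."""
    class_counts, residue_counts, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        log_p = math.log(n / total)  # class prior
        for pos, aa in enumerate(residues):
            count = residue_counts[label][pos][aa]
            # Smoothed likelihood of this residue at this position
            log_p += math.log((count + alpha) / (n + alpha * len(ALPHABET)))
        scores[label] = log_p
    return max(scores, key=scores.get)


# Hypothetical toy data: invented residue strings, not real CDPS active sites.
toy_examples = [
    ("LYFE", "Phe"),
    ("LYFQ", "Phe"),
    ("MWGE", "Leu"),
    ("MWGD", "Leu"),
]
model = train(toy_examples)
```

With this toy model, `predict(model, "LYFD")` favours the label whose training sequences share the most residues position by position. Smoothing keeps an unseen residue from driving a class probability to zero.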
Fig. 2 Validation of cyclodipeptide synthase chemical structure prediction in PRISM. a–b Accuracy of CDPS active site residue prediction in leave-one-out cross-validation (LOOCV) for the P1 (a) and P2 (b) active sites. c Accuracy of CDPS aminoacyl-tRNA substrate prediction at the P1 and P2 active sites in LOOCV. d Tanimoto coefficient accuracy of chemical structure prediction for five known cluster-compound pairs
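For readers unfamiliar with the Tanimoto coefficient used in Fig. 2d, the sketch below shows the standard definition: the number of features shared by two fingerprints divided by the number of features present in either. The record does not state which molecular fingerprint was used, so the fingerprints here are hypothetical sets of integer feature indices, and the handling of two empty fingerprints is a convention chosen for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is an iterable of the indices of its "on" features.
    Returns |A intersect B| / |A union B|, a value between 0.0 and 1.0.
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Both fingerprints empty: treated as no similarity here
        # (a convention of this sketch; toolkits differ on this case).
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fingerprints for a predicted and a known structure.
predicted = {1, 2, 3, 5}
known = {2, 3, 5, 8}
score = tanimoto(predicted, known)  # 3 shared features / 5 total = 0.6
```

A score of 1.0 means the two fingerprints are identical, and 0.0 means they share no features.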
Fig. 3 Unique tRNA-derived cyclodipeptide clusters in a sample of 93,107 prokaryotic genomes, organized by genus of producing organism. Genera with fewer than five cyclodipeptide clusters are marked as “Other”
Fig. 4 a Frequency of predicted tRNA-derived cyclodipeptides in a global analysis of prokaryotic CDPS biosynthesis. b Comparison of known and predicted novel genetically encoded tRNA-derived cyclodipeptides. c Chemocentric view of known NRPS- and CDPS-derived cyclodipeptides and predicted tRNA-derived cyclodipeptides. “CDPS” refers to cyclodipeptides known to be produced by characterized CDPS enzymes, while “Non-CDPS only” refers to naturally occurring cyclodipeptides that are not known to be produced by any characterized CDPS
Fig. 5 a Frequency of known cyclodipeptide tailoring reaction domains found in association with cyclodipeptide clusters. In all panels, frequency corresponds to the number of unique CDPS clusters in which the tailoring reaction was observed. b Frequency of other PRISM tailoring reaction domains found in association with cyclodipeptide clusters. Only domains found at least three times are shown. c Frequency of PRISM resistance domains found in association with cyclodipeptide clusters. Only domains found at least three times are shown. d Frequency of biosynthetic and non-biosynthetic Pfam domains found in association with cyclodipeptide clusters
Fig. 6 a Sequence similarity network of unique CDPSs encoded within 93,107 prokaryotic genomes. b Selected cyclodipeptide biosynthetic gene cluster families discussed in the text