Auke J van Heel, Anne de Jong, Manuel Montalbán-López, Jan Kok, Oscar P Kuipers.
Abstract
Identifying genes encoding bacteriocins and ribosomally synthesized and posttranslationally modified peptides (RiPPs) can be a challenging task. Peptides that lack strong homology to previously identified peptides are especially easy to overlook. Extensive use of BAGEL2 and user feedback have led us to develop BAGEL3. BAGEL3 features genome mining of prokaryotes, which is largely independent of open reading frame (ORF) predictions and has been extended to cover more (novel) classes of posttranslationally modified peptides. BAGEL3 uses an identification approach that combines direct mining for the gene and indirect mining via context genes. For heavily modified peptides in particular, such as lanthipeptides, sactipeptides, glycocins and others, this genetic context harbors valuable information that is used for mining purposes. The bacteriocin and context protein databases have been updated, and it is now easy for users to submit novel bacteriocins or RiPPs. The output has been simplified to allow user-friendly analysis of the results, in particular for large (meta-genomic) datasets. The genetic context of identified candidate genes is fully annotated. As input, BAGEL3 uses FASTA DNA sequences or folders containing multiple FASTA-formatted files. BAGEL3 is freely accessible at http://bagel.molgenrug.nl.
Year: 2013 PMID: 23677608 PMCID: PMC3692055 DOI: 10.1093/nar/gkt391
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic overview of the BAGEL3 genome mining procedure. BAGEL3 uses two different approaches in parallel to find bacteriocins and modified peptides. Both approaches take nucleotide sequences in FASTA format as input. The first (left, red) is the context-based approach. The second (right, blue) is the simpler precursor peptide-based mining. Both methods contribute to a single summary table with links to detailed graphical reports.
Currently supported classes of RiPPs and the rules used to identify potential clusters
| Name | Rule |
|---|---|
| Bottromycin | (PF04055) AND (PF02624) |
| Cyanobactin | (CyaG) |
| Glycocin | (TIGR04195) AND (PF03412) |
| Lanthipeptide class II | (PF05147) AND (PF13575) |
| Lanthipeptide class I | (PF04737\|PF04738\|PF14028) AND (PF05147) |
| Lanthipeptide class III | (lanKC) |
| Lanthipeptide class IV | (PF05147) AND (LanL) |
| LAPs | (PF02624) AND (PF00881) |
| Lasso peptide | (PF13471) AND (PF00733) |
| Linaridin | (LinL) |
| Microcin | (PF02794) |
| Sactipeptides | (SacCD) AND (PF04055) |
| Thiopeptide | (PF02624) AND (PF00881) AND (PF14028) |
In the rules, "|" means OR and AND marks an additional requirement. The rules in this table describe the criteria that a stretch of DNA has to match to become an area of interest (AOI). Some rules might overlap, and therefore they are checked in an ordered fashion. In this way, the more stringent rule is checked after the less stringent.
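The rule table above can be read as a small Boolean rule set over the domain and motif hits found in a stretch of DNA. The following is a minimal sketch of that reading, not BAGEL3's actual implementation. It assumes that the table order is the checking order and that, when several rules match, the last (more stringent) match is the one reported. The names `RULES`, `strip_version`, `matching_classes` and `classify` are hypothetical.

```python
# Each rule is a list of AND-groups; each group is a set of OR-alternatives.
# Rule contents are copied from the table; the order is ASSUMED to be the check order.
RULES = [
    ("Bottromycin", [{"PF04055"}, {"PF02624"}]),
    ("Cyanobactin", [{"CyaG"}]),
    ("Glycocin", [{"TIGR04195"}, {"PF03412"}]),
    ("Lanthipeptide class II", [{"PF05147"}, {"PF13575"}]),
    ("Lanthipeptide class I", [{"PF04737", "PF04738", "PF14028"}, {"PF05147"}]),
    ("Lanthipeptide class III", [{"lanKC"}]),
    ("Lanthipeptide class IV", [{"PF05147"}, {"LanL"}]),
    ("LAPs", [{"PF02624"}, {"PF00881"}]),
    ("Lasso peptide", [{"PF13471"}, {"PF00733"}]),
    ("Linaridin", [{"LinL"}]),
    ("Microcin", [{"PF02794"}]),
    ("Sactipeptides", [{"SacCD"}, {"PF04055"}]),
    ("Thiopeptide", [{"PF02624"}, {"PF00881"}, {"PF14028"}]),
]


def strip_version(hit):
    """Drop a trailing version suffix, e.g. 'PF04737.5' -> 'PF04737'."""
    return hit.split(".", 1)[0]


def matching_classes(hits):
    """Return every class whose rule is satisfied, in rule order."""
    found = {strip_version(h) for h in hits}
    # A rule matches when every AND-group shares at least one hit with `found`.
    return [name for name, groups in RULES if all(found & group for group in groups)]


def classify(hits):
    """Return the last matching class (ASSUMED tie-break), or None."""
    matches = matching_classes(hits)
    return matches[-1] if matches else None


if __name__ == "__main__":
    print(matching_classes({"PF02624", "PF00881", "PF14028"}))
    print(classify(["PF04737.5", "PF04738.5", "PF05147.5"]))
```

Under these assumptions, a region carrying PF02624, PF00881 and PF14028 satisfies both the LAPs and the Thiopeptide rules, and the later, more stringent Thiopeptide rule decides the label.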
Figure 2. Example detailed report of a lantibiotic cluster encoding a nisin variant and its modification enzymes found in Streptococcus suis J14 (NC_017618.1) using BAGEL3. The target gene (smallORF_6) was in this case identified by the specialized small ORF calling procedure.
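The text here does not describe how the small ORF calling procedure works. As a generic illustration of what calling small ORFs can look like, the sketch below scans all six reading frames for short ATG-to-stop stretches. It is not BAGEL3's procedure: the function names, the restriction to ATG start codons, and the length thresholds are all assumptions made for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def small_orfs(dna, min_codons=20, max_codons=100):
    """Yield (strand, start, end, sequence) for short ORFs in six frames.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand. Length is counted in codons, excluding the stop codon.
    The thresholds are hypothetical defaults, not BAGEL3 settings.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if min_codons <= n_codons <= max_codons:
                        yield strand, start, i + 3, seq[start:i + 3]
                    start = None  # close the ORF and look for the next start


if __name__ == "__main__":
    example = "CC" + "ATG" + "GCT" * 25 + "TAA" + "GG"
    for orf in small_orfs(example):
        print(orf)
```

A scan like this is independent of any upstream gene prediction, which is the property the abstract highlights for short precursor peptide genes.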
A selection of novel RiPPs identified by BAGEL3
| DNA screened | Homology | Identification | Sequence |
|---|---|---|---|
| | Sporulation-killingfactor_skfA [1e-10] | Context: SacCD | Sactipeptide: MSNHNVRNEPAPAWESSAQNNLSKPAGIPLIKSVGCAACWGAK NISLTRACLPPTPINLAL |
| pTEF2 | Enterocin_96 [2e-41] (exact match) | Context: TIGR04195 PF03412 | Glycocin: MLNKKLLENGVVNAVTIDELDAQFGGMSKRDCNL MKACCAGQAVTYAIHSLLNRLGGDSSDPAGCNDIVRKYCK |
| | leader_abc mature_ab PF02052.7 leaderLanBC | Context: PF04737.5 PF04738.5 PF05147.5 | Lantibiotic: MPKYDDFDLNLKQTSASNQKDTRVTSVMACTPGTCNNKCPN TNWLCSNVCVTKTCWTCA |
| | leader_abc PF02052.7 leaderLanBC | Context: PF04737.5 PF04738.5 PF05147.5 | Lantibiotic: MPKYDDFDLNLKQNVSSSNKEPRITSIKWCTPG TCNNTCKGDSTLKSNCCGGSLMCSLGGC |
| | Trunkamide [1e-06] | Context: CyaG | Cyanobactin: MPCYPSYDGVDASVCMPCYPSYDGVDASVCMPCYP SYDDAE |
| | Capistruin [5e-23] | Context: PF13471.1 PF00733.16 | Lasso peptide: MVRFLAKLLRSTIHGSHGVSLDAVSSTHGTPGFQTPDARV ISRFGFN |