Lucija Slemc, Jernej Jakše, Alessandro Filisetti, Damir Baranasic, Antonio Rodríguez-García, Francesco Del Carratore, Stefano Maria Marino, Jurica Zucko, Antonio Starcevic, Martin Šala, Mercedes Pérez-Bonilla, Marina Sánchez-Hidalgo, Ignacio González, Fernando Reyes, Olga Genilloud, Vicki Springthorpe, Dušan Goranovič, Gregor Kosec, Gavin H Thomas, Davide De Lucrezia, Hrvoje Petković, Miha Tome.
Abstract
Streptomyces rimosus ATCC 10970 is the parental strain of the industrial strains used for commercial production of the important antibiotic oxytetracycline. As an actinobacterium with a large linear chromosome containing numerous long repeat regions, high GC content, and a single giant linear plasmid (GLP), S. rimosus has a genome that is challenging to assemble. Here, we apply a hybrid sequencing approach that combines short- and long-read next-generation sequencing platforms with whole-genome restriction analysis by pulsed-field gel electrophoresis (PFGE) to produce a high-quality reference genome for this biotechnologically important bacterium. By using PFGE to separate and isolate plasmid DNA from chromosomal DNA, we successfully sequenced the GLP using Nanopore data alone. Using this approach, we compared the sequence of the GLP in the parent strain ATCC 10970 with those found in two semi-industrial progenitor strains, R6-500 and M4018. Sequencing of the GLPs of these three S. rimosus strains revealed several rearrangements accompanied by transposase genes, suggesting that transposases play an important role in plasmid and genome plasticity in S. rimosus. Comparing the polished annotation of secondary metabolite biosynthetic pathways with metabolite analysis of the ATCC 10970 strain also refined our knowledge of the secondary metabolite arsenal of these strains. The proposed methodology is highly applicable to a variety of sequencing projects, as evidenced by the reliable assemblies obtained.
IMPORTANCE The genomes of Streptomyces species are difficult to assemble due to long repeats, extrachromosomal elements (giant linear plasmids [GLPs]), rearrangements, and high GC content. To improve the quality of the genome of S. rimosus ATCC 10970, a producer of oxytetracycline, we validated the assembly of GLPs with a new approach that combines pulsed-field gel electrophoresis separation with GLP isolation, and we sequenced the isolated GLP with Oxford Nanopore technology. By examining the sequenced plasmids of ATCC 10970 and two industrial progenitor strains, R6-500 and M4018, we identified large GLP rearrangements. Analysis of the assembled plasmid sequences shed light on the role of transposases in the genome plasticity of this species. The new methodological approach developed for Nanopore sequencing is highly applicable to a variety of sequencing projects. In addition, we present the annotated reference genome sequence of ATCC 10970 with a detailed analysis of the biosynthetic gene clusters.
Keywords: Oxford Nanopore sequencing; Streptomyces rimosus; biosynthetic gene clusters; genome; linear plasmid; oxytetracycline; pulsed-field electrophoresis; transposase
Year: 2022 PMID: 35377231 PMCID: PMC9045324 DOI: 10.1128/spectrum.02434-21
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
Streptomyces rimosus WT genome assembly (GenBank accession no. GCF_006229535.1) and annotation data
| Source | No. of bases | Circular | GC content (%) | No. of CDS | No. of tRNAs | No. of rRNAs |
|---|---|---|---|---|---|---|
| Chromosome | 9,365,899 | No | 72.0 | 8,085 | 68 | 21 |
| Plasmid | 292,604 | No | 69.6 | 282 | | |
| Total | 9,658,503 | | 71.9 | 8,367 | 68 | 21 |
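The Total row can be cross-checked from the two replicon rows. The following is a minimal sketch in Python, not part of the original analysis; it assumes the total GC content is the length-weighted mean of the per-replicon values, which are themselves rounded to one decimal place in the table, so the result is approximate.

```python
# Values copied from the assembly table above.
replicons = {
    "chromosome": {"bases": 9_365_899, "gc_percent": 72.0, "cds": 8_085},
    "plasmid": {"bases": 292_604, "gc_percent": 69.6, "cds": 282},
}

# Totals are plain sums over the two replicons.
total_bases = sum(r["bases"] for r in replicons.values())
total_cds = sum(r["cds"] for r in replicons.values())

# Assumption: total GC is the length-weighted mean of the rounded per-replicon GC values.
weighted_gc = (
    sum(r["bases"] * r["gc_percent"] for r in replicons.values()) / total_bases
)

print(total_bases, total_cds, round(weighted_gc, 1))
```

Under these assumptions the sums and the weighted GC value agree with the Total row (9,658,503 bases, 8,367 CDS, 71.9% GC).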
FIG 1 Dot plot alignment for four publicly available S. rimosus complete genome assemblies.
FIG 2 (A) Dot plot alignment between publicly available S. rimosus genome sequences and the GenBank GCF_006229535.1 plasmid assembly. (B) Alignment of genome assembly GCF_000716745.1 from strain NRRL B-8076 to our GCF_006229535.1 plasmid assembly, with the junctions between aligned contigs highlighted. The highlighted junctions are coding sequences for transposases (Table S8).
FIG 3 Determination of plasmid size in S. rimosus strains ATCC 10970, M4018, and R6-500 by separation of plasmid DNA using pulsed-field gel electrophoresis (PFGE). M, lambda PFG ladder marker (New England BioLabs).
Overview of Nanopore sequencing and Canu de novo assembly of isolated plasmid DNA
| Features | M4018 | ATCC 10970 | R6-500 |
|---|---|---|---|
| No. of base-called sequences | 161,092 | 81,553 | 653,342 |
| Yield of base-called sequences (Mb) | 308.88 | 252.43 | 1,733.66 |
| Control lambda phage reads | 45,408 | 6,400 | 82,806 |
| Base-called sequences for assembly (Mb) | 175.01 | 232.06 | 1,472.47 |
| Avg read length (bp) | 1,512.86 | 3,087.81 | 2,580.85 |
| Longest read length (bp) | 25,954 | 39,848 | 253,123 |
| Avg GC content (%) | 66.79 | 68.03 | 68.63 |
| Assembled contig (bp) | 299,081 | 291,520 | 189,364 |
| Coverage (×) | 39.35 | 53.11 | 86.45 |
| Additional contigs | 7 | 14 | 3 |
| Contig lengths (bp) | 1,516–7,273 | 3,220–13,771 | 3,424–6,784 |
FIG 4 R6-500 plasmid analysis. (A) Coverage of Nanopore reads mapped to the R6-500 plasmid assembly. The proposed 167-kb inverted repeat has twice the coverage of Nanopore reads compared to the central region (167 to 190 kb); CDS, annotated coding sequences. The only unduplicated coding sequence in the central region, an IS5 family transposase (R6500_083610), is highlighted in red at 173.6 kb. (B) In silico digestion of our R6-500 plasmid assembly with AseI, BfrI, and XbaI enzymes.
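An in silico digestion like the one in panel B can be illustrated with a short Python sketch. This is not the tool used in the study; the function name `digest_linear` and the toy sequence are hypothetical, and the recognition sites (AseI, AT^TAAT; BfrI, C^TTAAG; XbaI, T^CTAGA) are the commonly listed ones for these enzymes. All three sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "AseI": ("ATTAAT", 2),  # AT^TAAT
    "BfrI": ("CTTAAG", 1),  # C^TTAAG
    "XbaI": ("TCTAGA", 1),  # T^CTAGA
}


def digest_linear(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # step by 1 to catch overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 72-bp toy sequence with one XbaI site and one AseI site.
toy = "GC" * 10 + "TCTAGA" + "GC" * 15 + "ATTAAT" + "GC" * 5

for name, (site, offset) in ENZYMES.items():
    print(name, digest_linear(toy, site, offset))
```

For a real plasmid assembly, the predicted fragment lengths would be compared with the band sizes observed on the PFGE gel.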
General features of the plasmids found in the three strains
| Strain | Length (bp) | GC content (%) | CDS (no.) | Median CDS length (bp) | Transposases (no.) | Transcriptional regulators (no.) | Hypothetical proteins (no.) |
|---|---|---|---|---|---|---|---|
| WT5260 | 292,604 | 69.6 | 282 | 621 | 21 | 18 | 148 |
| M4018 | 299,299 | 69.5 | 291 | 618 | 23 | 18 | 152 |
| R6 | 189,563 (∼356 kbp) | 69.8 | 202 (386) | 564 | 14 (24) | 11 (21) | 117 (225) |
Numbers in parentheses are the feature counts when the proposed large inverted repeat in strain R6 is included.
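The ∼356-kbp figure for strain R6 can be reproduced from the assembled length and the proposed repeat. The sketch below is an illustration only; it assumes the repeat arm is exactly 167,000 bp (the FIG 4 caption gives it as 167 kb) and that the full plasmid consists of two arms flanking the central region.

```python
assembled_bp = 189_563   # R6 assembly length from the table above
repeat_arm_bp = 167_000  # assumption: the 167-kb arm from FIG 4, taken as exact

# The assembly contains one arm plus the central region.
central_bp = assembled_bp - repeat_arm_bp

# Proposed full plasmid: arm + central region + inverted copy of the arm.
full_length_bp = 2 * repeat_arm_bp + central_bp

print(central_bp, full_length_bp)
```

With these inputs the central region is about 22.6 kb and the full length about 356.6 kb, consistent with the value in parentheses in the table.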
FIG 5 Schematic representation of the rearrangements of the DNA sequences of the linear plasmids from ATCC 10970, M4018, and R6-500. Rectangles, annotated transposase-related coding sequences for each plasmid sequence; asterisk, 6.7-kb transposase fragment from the chromosome present in the M4018 plasmid; ribbons, ATCC 10970 plasmid rearrangements in the M4018 and R6-500 plasmids; dotted ribbon, proposed long inverted repeat in the R6-500 plasmid; red dotted rectangle, central region of the R6-500 plasmid, consisting largely of scattered duplications; scale, size of the genomic sequences in bp.
FIG 6 Schematic representation of the whole genome for strain ATCC 10970 (GenBank assembly no. GCF_006229535.1). First (outer) ring, size of the genomic sequences in bp; second ring, plasmid and chromosome identification; third ring, forward/reverse CDSs, with marked dots showing functional RNA elements (orange, tRNA; yellow, rRNA) and numbered BGCs; fourth ring, marked BGCs (orange, polyketide synthase-like clusters; green, nonribosomal peptide-like clusters; blue, remaining clusters); fifth ring, GC content (colored line) and GC average at 71.9% (black line).
Putative biosynthetic gene clusters in the S. rimosus ATCC 10970 genome and isolated metabolites in our study
| Cluster no. in ATCC 10970 | Type | Position from | Position to | Most similar known biosynthetic gene cluster (percent similarity) | Metabolites detected in extract in our study |
|---|---|---|---|---|---|
| Chromosome | |||||
| 1 | NRPS fragment | 90930 | 97183 | Paromomycin (7) | Guanipiperazines A and B |
| 2 | PKS type I-NRPS | 188819 | 209096 | NA | |
| 3 | Terpene | 209047 | 217564 | Isorenieratene (85) | |
| 4 | NRPS | 225846 | 253508 | Atratumycin (13) | |
| 5 | Type I PKS | 321687 | 347936 | Sceliphrolactam (32) | |
| 6 | Type I PKS | 399364 | 499930 | Nystatin A1 (72) | Rimocidin, CE108, amide CE108 |
| 7 | NRPS | 513458 | 544839 | Qinichelins (22) | |
| 8 | Lassopeptide | 579166 | 586929 | Lagmysin (80) | |
| 9 | Type II PKS | 628015 | 655782 | Oxytetracycline (100) | Oxytetracycline |
| 10 | Type I PKS | 786388 | 806568 | NA | |
| 11 | Lantipeptide | 899955 | 907971 | NA | |
| 12 | Type I PKS | 922668 | 952762 | Spiroindimicins/indimicins/lynamicins (6) | |
| 13 | NRPS like | 989591 | 1015728 | Stenothricin (13) | |
| 14 | NRPS-PKS type | 1034416 | 1064312 | Rimosamide (92) | Rimosamides A–D |
| 15 | NRPS | 1095198 | 1140552 | Daptomycin (14) | |
| 16 | Arylpolyene | 1162316 | 1218483 | Herboxidiene (3) | |
| 17 | Terpene | 1386125 | 1399202 | Hopene (76) | |
| 18 | NRPS | 1568818 | 1619165 | Isocomplestatin (93) | |
| 19 | Melanin | 1756702 | 1763509 | Bagremycin A/B (11) | |
| 20 | Lantipeptide | 2189994 | 2200974 | NA | |
| 21 | NRPS | 2267432 | 2288427 | Streptobactin (70) | |
| 22 | NRPS | 2320795 | 2393710 | Ulleungmycin (36) | |
| 23 | NRPS-PKS type | 3089234 | 3116494 | Tyrobetaine (100) | Tyrobetaine, tyrobetaine-2, chlorotyrobetaine, chlorotyrobetaine-2 |
| 24 | NRPS | 4147387 | 4194710 | Mannopeptimycin (22) | |
| 25 | Arylpolyene | 4258214 | 4287270 | Fusaricidin B (25) | |
| 26 | NRPS | 4793268 | 4840550 | Ishigamide (61) | |
| 27 | Lassopeptide | 5834963 | 5841023 | Moomysin (50) | |
| 28 | Lantipeptide | 6587454 | 6598475 | SAL-2242 (77) | |
| 29 | Terpene | 6817266 | 6819473 | Geosmin (100) | |
| 30 | Ectoine | 7244554 | 7247941 | Ectoine (100) | Ectoine, hydroxyectoine |
| 31 | Siderophore | 7331013 | 7336394 | Desferrioxamine E (100) | |
| 32 | Siderophore | 7433301 | 7442083 | NA | |
| 33 | Terpene | 8052420 | 8062100 | NA | |
| 34 | Type I PKS-NRPS | 8343488 | 8380063 | Marinacarboline (23) | |
| 35 | NRPS | 8502626 | 8519135 | Deimino-antipain (66) | Chymostatin A, B, C |
| 36 | NRPS like | 8619558 | 8643234 | NA | |
| 37 | PKS type I or PKS type I-saccharide | 8655191 | 8687260 | Tetronasin (9) | |
| 38 | NRPS | 8692521 | 8715452 | Mannopeptimycin (14) | |
| 39 | Terpene | 8720327 | 8725815 | NA | |
| 40 | Other-NRPS like | 8825293 | 8867032 | A83543A (8) | |
| 41 | Butyrolactone | 8884982 | 8896849 | Cyphomycin (11) | |
| 42 | PKS type I-NRPS | 8971199 | 8996615 | NA | |
| 43 | NRPS | 9016185 | 9065343 | Teicoplanin (28) | |
| 44 | Nucleoside | 9075785 | 9088816 | Pseudouridimycin (68) | Pseudouridimycin |
| 45 | NRPS | 9091105 | 9149322 | NA | |
| 46 | NRPS | 9257979 | 9275999 | NA | |
| Plasmids | |||||
| 1P | Type I PKS | 143989 | 163050 | Kanamycin (1) | |
| 2P | NRPS | 215829 | 230795 | NA | |
FIG 7 Structures of secondary metabolites detected in the extract from S. rimosus ATCC 10970.