
Complete Genome Sequence of Nocardia farcinica W6977T Obtained by Combining Illumina and PacBio Reads.

Christopher A Gulvik, Robert A Arthur, Ben W Humrighouse, Dhwani Batra, Lori A Rowe, Brent A Lasker, John R McQuiston.

Abstract

The complete genome sequence of the Nocardia farcinica type strain was obtained by combining Illumina HiSeq and PacBio reads, producing a single 6.29-Mb chromosome and 2 circular plasmids. Bioinformatic analysis identified 5,991 coding sequences, including putative genes for virulence, antimicrobial resistance, transposons, and biosynthesis gene clusters.

Year:  2019        PMID: 30687825      PMCID: PMC6346157          DOI: 10.1128/MRA.01373-18

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Nocardia farcinica, first isolated in 1888 (1), is an opportunistic bacterial pathogen of clinical relevance due to high levels of morbidity and mortality and a high degree of intrinsic antimicrobial resistance (2, 3). We report here the completed genome sequence of the Nocardia farcinica type strain obtained from the ATCC and identify potential markers for virulence, antimicrobial resistance, transposons, and the biosynthesis of secondary metabolites.

A single colony was inoculated into Trypticase soy broth and grown at 35°C for 5 days. Genomic DNA was extracted using the MasterPure DNA purification kit (Epicentre, Madison, WI, USA). A 20-kb library was prepared with the SMRTbell template prep kit 1.0 (Pacific Biosciences, Menlo Park, CA, USA). The library was bound to polymerase using the DNA/polymerase binding kit P6v2 (PacBio), loaded on a single-molecule real-time (SMRT) cell (PacBio), and then sequenced with C4v2 chemistry (PacBio) on an RS II (PacBio) instrument.

An aliquot of the same DNA preparation was sheared on an M220 Focused-ultrasonicator (Covaris, Inc., Woburn, MA, USA) to generate DNA fragments averaging 500 bp in length. The NEBNext Ultra DNA kit (New England BioLabs, Ipswich, MA, USA) was used to create libraries, and paired-end sequencing (2 × 250 bp) was performed with a MiSeq version 2 500-cycle reagent kit (Illumina, San Diego, CA, USA).

PacBio raw reads of ≥1 kbp were converted into FASTQ format with bash5tools 0.8.0. These 77,719 long reads (643,112,291 bp) were scrubbed using DASCRUBBER-wrapper (4) and Gene Myers’s Dazzler utilities (DALIGNER, datander, DAScover, DASedit, DASpatch, DASqv, DAStrim, REPmask, and TANmask) (5). Using default parameters, reads were self-aligned with DALIGNER in order to mask interspersed repeats with REPmask, and tandem repeats (3.3% of reads, 0.6% of bases) were identified with datander and masked with TANmask.
The reads were then realigned with repeat sites masked to find overlaps and estimate coverage with DAScover. The intrinsic qualities, for which lower values indicate better quality, were plotted with DASqv to identify thresholds (best 80% were ≤Q22, worst 7% were ≥Q29), which were used in read trimming and patching with DAStrim and DASpatch. Finally, 72,526 scrubbed reads (516,421,081 bp) were captured with DASedit and DB2fasta. This scrubbing process discarded 106.7 Mbp (16.6%) and repaired 48.8 Mbp (7.5%) of low-quality nucleotides, removed 110.9 Mbp of chimeras, and clipped off 8.9 Mbp of missed adaptamers.

Paired-end Illumina reads were cleaned with BBDuk 37.77 to remove PhiX, and Trimmomatic 0.36 (6) was used to remove adapters and discard sequences with a Phred score of less than 30. Illumina and PacBio reads were then assembled with Unicycler 0.4.6 (7), which depended on SPAdes 3.12.0 (8), minimap2 (9), miniasm (10), racon 1.3.1 (11), and BLAST 2.7.1+ (12) for assembly. Short-read polishing of the assembly was also accomplished in Unicycler, which used ALE 4aec46e (13), Bowtie 2.3.4.1 (14), SAMtools 1.8 (15), and Pilon 1.22 (16) and fixed 24 assembly errors. CheckM 1.0.11 (17) indicated that the genome is 99.8% complete (two missing markers) when using 799 Nocardia genus reference markers. The genome was annotated with the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (18).

The W6977T genome contains 6,291,633 bp, with a G+C content of 70.0%, 5,991 coding sequences (CDSs), 3 complete rRNA operons, 57 tRNAs, 5 CRISPR arrays, and two circular plasmids of 96.7 and 41.2 kb, respectively. A total of 129 virulence-related genes and 11 antimicrobial resistance-related genes, many previously described for Nocardia farcinica IFM 10152 (19), were identified. Thirteen transposable elements, 85 genomic biosynthesis gene clusters (BGCs), and 1 plasmid BGC were identified using antiSMASH 4.0 (20), suggesting that the genome shows great potential for the discovery of new natural products.
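As a rough cross-check of the read statistics reported above, the following Python sketch recomputes a few derived figures from the stated read and base counts. The function and variable names are illustrative and not part of the authors' pipeline, and the depth estimate assumes that the reported 6,291,633 bp is the appropriate denominator, which the text does not state explicitly.

```python
# Figures copied from the announcement; all names here are illustrative.
RAW_READS = 77_719            # PacBio reads before scrubbing
RAW_BP = 643_112_291          # PacBio bases before scrubbing
SCRUBBED_READS = 72_526       # PacBio reads after scrubbing
SCRUBBED_BP = 516_421_081     # PacBio bases after scrubbing
DISCARDED_BP = 106.7e6        # low-quality bases discarded (reported, rounded)
GENOME_BP = 6_291_633         # reported genome size (assumed denominator)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize():
    """Derive summary statistics from the reported counts."""
    return {
        # Share of raw bases discarded as low quality
        "discarded_pct": percent(DISCARDED_BP, RAW_BP),
        # Net change in bases and reads across the whole scrubbing step
        "net_bp_lost": RAW_BP - SCRUBBED_BP,
        "reads_lost": RAW_READS - SCRUBBED_READS,
        # Mean read length before and after scrubbing
        "mean_len_raw": RAW_BP / RAW_READS,
        "mean_len_scrubbed": SCRUBBED_BP / SCRUBBED_READS,
        # Approximate long-read depth over the reported genome size
        "long_read_depth": SCRUBBED_BP / GENOME_BP,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:,.1f}")
```

Under these assumptions, the discarded fraction agrees with the reported 16.6%, and the scrubbed long reads correspond to roughly 80-fold depth.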

Data availability.

The whole-genome sequence of Nocardia farcinica W6977T has been deposited at the DDBJ/ENA/GenBank database under the accession number CP031418 for the chromosome and numbers CP031419 and CP031420 for plasmids 1 and 2, respectively. Illumina and PacBio raw reads have been submitted to the SRA under BioSample number SAMN09723209.
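For readers who want to retrieve the deposited sequences programmatically, one possible approach is to build NCBI E-utilities `efetch` URLs for the three accession numbers. This is a minimal sketch rather than a procedure described by the authors; the helper name `efetch_url` and the `ACCESSIONS` mapping are hypothetical, and the code only constructs URLs without contacting the server.

```python
from urllib.parse import urlencode

# Base endpoint of the NCBI E-utilities efetch service
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers as given in the data availability statement
ACCESSIONS = {
    "chromosome": "CP031418",
    "plasmid 1": "CP031419",
    "plasmid 2": "CP031420",
}


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record (hypothetical helper)."""
    query = urlencode({
        "db": "nuccore",       # nucleotide database
        "id": accession,
        "rettype": rettype,    # e.g. "fasta" or "gb"
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


if __name__ == "__main__":
    for replicon, accession in ACCESSIONS.items():
        print(replicon, efetch_url(accession))
```

Each URL can then be passed to a downloader such as `urllib.request.urlopen`; NCBI's usage guidelines on request rates apply.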
References (10 of 17 listed)

1.  The complete genomic sequence of Nocardia farcinica IFM 10152.

Authors:  Jun Ishikawa; Atsushi Yamashita; Yuzuru Mikami; Yasutaka Hoshino; Haruyo Kurita; Kunimoto Hotta; Tadayoshi Shiba; Masahira Hattori
Journal:  Proc Natl Acad Sci U S A       Date:  2004-10-04       Impact factor: 11.205

2.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

3.  ALE: a generic assembly likelihood evaluation framework for assessing the accuracy of genome and metagenome assemblies.

Authors:  Scott C Clark; Rob Egan; Peter I Frazier; Zhong Wang
Journal:  Bioinformatics       Date:  2013-01-09       Impact factor: 6.937

4.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

5.  Clinical and laboratory features of the Nocardia spp. based on current molecular taxonomy (review).

Authors:  Barbara A Brown-Elliott; June M Brown; Patricia S Conville; Richard J Wallace
Journal:  Clin Microbiol Rev       Date:  2006-04       Impact factor: 26.132

6.  BLAST+: architecture and applications.

Authors:  Christiam Camacho; George Coulouris; Vahram Avagyan; Ning Ma; Jason Papadopoulos; Kevin Bealer; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2009-12-15       Impact factor: 3.169

7.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

8.  CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

Authors:  Donovan H Parks; Michael Imelfort; Connor T Skennerton; Philip Hugenholtz; Gene W Tyson
Journal:  Genome Res       Date:  2015-05-14       Impact factor: 9.043

9.  Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement.

Authors:  Bruce J Walker; Thomas Abeel; Terrance Shea; Margaret Priest; Amr Abouelliel; Sharadha Sakthikumar; Christina A Cuomo; Qiandong Zeng; Jennifer Wortman; Sarah K Young; Ashlee M Earl
Journal:  PLoS One       Date:  2014-11-19       Impact factor: 3.240

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937

