
De Novo Assembly and Annotation of the Complete Genome Sequence of Myxococcus xanthus DZ2.

Rodolfo Aramayo, Beiyan Nan

Abstract

We report the assembly and annotation of a high-quality genome sequence for Myxococcus xanthus strain DZ2 (GenBank accession number CP080538), created using a combination of short reads generated using DNBSEQ technology (BGI Genomics) and long high-fidelity (HiFi) reads generated using Pacific Biosciences (PacBio) technology.

Year:  2022        PMID: 35384715      PMCID: PMC9119067          DOI: 10.1128/mra.01074-21

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

The Myxococcus xanthus isolate (GenBank accession number CP080538) reported here was originally acquired by David Zusman from the Roger Stanier collection at UC Berkeley and named DZ2 (1). While this work was in progress, Jain et al. (2) reported the assembly of a Myxococcus xanthus DZ2 isolate of similar origin (CP070500). Although the CP070500 assembly is a substantial improvement over the previously reported draft DZ2 assembly (3), the assembly reported here (CP080538) is larger and differs in gene content (Table 1).
TABLE 1 Comparative analysis of the highly related Myxococcus xanthus strain DZ2 assemblies

                                          Data for assembly (GenBank accession no.):
Characteristic(a)                         CP080538        CP070500
Genome size (bp)                          9,365,783       9,359,382
Total no. of genes                        7,581           7,576
Total no. of CDSs                         7,499           7,494
No. of genes (coding)                     7,402           7,408
No. of CDSs (with protein)                7,402           7,408
No. of RNA genes                          82              82
No. of complete rRNAs (5S, 16S, 23S)      4, 4, 4         4, 4, 4
No. of tRNAs                              66              66
No. of ncRNAs                             4               4
No. of CDSs (without protein)             97              86
No. of pseudogenes:
  Total                                   97              86
  With ambiguous residues                 0               0
  Frameshifted                            49              37
  Incomplete                              52              56
  With an internal stop                   14              12
  With multiple problems                  15              16
No. of CRISPR arrays                      4               4

(a) CDS, coding DNA sequences; ncRNAs, noncoding RNAs; rRNAs, ribosomal RNAs; tRNAs, transfer RNAs.

Cells of Myxococcus xanthus strain DZ2, derived from frozen stock from the Roger Stanier collection at UC Berkeley, were grown at 32°C in liquid CYE medium (10 g/L Casitone, 5 g/L yeast extract, 8 mM MgSO4, and 10 mM 3-(N-morpholino)propanesulfonic acid [MOPS], pH 7.6), harvested by centrifugation, and frozen in liquid nitrogen. The pellet was ground into a fine powder, and DNA was extracted by lysing the cells with cetyltrimethylammonium bromide (CTAB) at 65°C. The DNA was purified using phenol/chloroform/isoamyl alcohol, followed by ethanol precipitation.

DNA sequencing was performed by BGI Genomics using two different technologies: DNBSEQ and high-fidelity (HiFi) PacBio sequencing. Construction of the DNA libraries, DNA sequencing, and quality control (QC) of the long HiFi reads and of the short reads derived from PCR-free rolling circle replication of the DNA nanoballs were all performed by BGI Genomics using their standard operating procedures (SOPs) (4). Sequencing generated a total of 436,127 HiFi PacBio subreads with an average length of 9,018 bp, totaling 3,933.4 Mbp (representing 420-fold genome coverage), and ∼10.2 million 100-bp paired-end DNBSEQ reads, totaling 2,042.54 Mbp (representing 218-fold coverage). Short reads that passed our standard QC protocol (5) and long reads with a quality value (QV) of >20 (that is, >99% accuracy) were used for assembly.

Genome assembly was performed using HiCanu version 2.1.1-Java-1.8 (6), Unicycler version 0.4.8 (7), and Velvet version 1.2.10 (8, 9). We performed a total of 48 assemblies (36 HiCanu, 11 Unicycler, and 1 Velvet). The parameters used and the associated resulting data are described in a supplemental Zenodo repository (10).
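The coverage and accuracy figures above follow from two simple relations: fold coverage is the total number of sequenced bases divided by the genome size, and a Phred quality value QV corresponds to a per-base accuracy of 1 - 10^(-QV/10). The sketch below reproduces that arithmetic from the reported numbers. It is an illustration only, not the authors' code; the function names are hypothetical, and the CP080538 assembly size from Table 1 is assumed as the genome size.

```python
# Illustrative arithmetic only; function names are hypothetical.
GENOME_SIZE_BP = 9_365_783  # assumed: CP080538 assembly size from Table 1


def fold_coverage(total_bases: float, genome_size: int = GENOME_SIZE_BP) -> float:
    """Return sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def phred_to_accuracy(qv: float) -> float:
    """Convert a Phred quality value to per-base accuracy, 1 - 10^(-QV/10)."""
    return 1.0 - 10.0 ** (-qv / 10.0)


hifi_bases = 3_933.4e6     # reported HiFi total, in bp
dnbseq_bases = 2_042.54e6  # reported DNBSEQ total, in bp

print(f"HiFi coverage:   {fold_coverage(hifi_bases):.1f}x")
print(f"DNBSEQ coverage: {fold_coverage(dnbseq_bases):.1f}x")
# Read count times mean read length should approximate the reported HiFi total.
print(f"HiFi reads x mean length: {436_127 * 9_018 / 1e6:.1f} Mbp")
print(f"Accuracy at QV 20: {phred_to_accuracy(20):.2%}")
```

Rounded to whole numbers, the two depths agree with the reported 420-fold and 218-fold coverage, and QV 20 corresponds to the 99% accuracy threshold.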
The largest circular contig of the HiCanu assemblies (contig tig00000005 of Assembly 33) was ∼23.8 kbp larger than the largest Unicycler assembly and was thus selected for analysis. We then used PSI-CD-HIT version 4.8.1/blastn version 2.12.0+ (11–14) to verify that the smaller homologous contigs generated by the other assemblies were contained within this largest HiCanu assembly (10). Given that all the smaller homologous contigs were indeed contained within the largest HiCanu contig (tig00000005), this contig was declared our genome assembly and submitted to the NCBI databases (10). Genome annotation was performed using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) version 5.2 (15).

The CP080538 assembly presented here differs from the CP070500 assembly in size and gene content. The CP080538 assembly is 6.4 kbp larger than the CP070500 assembly, and while it has five more genes and 11 more pseudogenes, it has six fewer protein-coding CDSs (7,402 versus 7,408). Both assemblies have the same number of rRNAs, tRNAs, ncRNAs, and CRISPR arrays (Table 1).
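The differences quoted in this comparison can be checked directly against Table 1. The following is a minimal sketch that tabulates CP080538 minus CP070500 for a subset of rows; the dictionary layout and the function name are hypothetical, and only values printed in Table 1 are used.

```python
# Values copied from Table 1 as (CP080538, CP070500); layout is illustrative.
TABLE_1 = {
    "Genome size (bp)": (9_365_783, 9_359_382),
    "Total no. of genes": (7_581, 7_576),
    "Total no. of CDSs": (7_499, 7_494),
    "No. of CDSs (with protein)": (7_402, 7_408),
    "No. of RNA genes": (82, 82),
    "No. of pseudogenes (total)": (97, 86),
    "No. of CRISPR arrays": (4, 4),
}


def assembly_differences(table: dict) -> dict:
    """Return CP080538 minus CP070500 for each characteristic."""
    return {name: ours - theirs for name, (ours, theirs) in table.items()}


for name, delta in assembly_differences(TABLE_1).items():
    print(f"{name:<30} {delta:+,d}")
```

A positive value means the characteristic is larger in CP080538. Note that the total CDS count is higher in CP080538 while the count of CDSs with protein is lower; the gap is accounted for by the additional pseudogenes.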

Data availability.

The raw data associated with this publication have been deposited at NCBI under BioProject accession number PRJNA748417 and SRA accession numbers SRX11508707 (PacBio reads) and SRX11508706 (Illumina reads). The complete genome sequence has been deposited in GenBank under accession number CP080538.1. The different genome assembly commands and resulting assembly files generated throughout this work have all been deposited at Zenodo (10).
  13 in total

1.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.

Authors:  Weizhong Li; Adam Godzik
Journal:  Bioinformatics       Date:  2006-05-26       Impact factor: 6.937

2.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

3.  Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler.

Authors:  Daniel R Zerbino; Gayle K McEwen; Elliott H Margulies; Ewan Birney
Journal:  PLoS One       Date:  2009-12-22       Impact factor: 3.240

4.  Ultrafast clustering algorithms for metagenomic sequence analysis.

Authors:  Weizhong Li; Limin Fu; Beifang Niu; Sitao Wu; John Wooley
Journal:  Brief Bioinform       Date:  2012-07-06       Impact factor: 11.622

5.  De Novo Assembly and Annotation of the Complete Genome Sequence of Myxococcus xanthus DZ2.

Authors:  Rodolfo Aramayo; Beiyan Nan
Journal:  Microbiol Resour Announc       Date:  2022-04-06

6.  Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.

Authors:  Sergey Koren; Brian P Walenz; Konstantin Berlin; Jason R Miller; Nicholas H Bergman; Adam M Phillippy
Journal:  Genome Res       Date:  2017-03-15       Impact factor: 9.043

7.  Translational control of one-carbon metabolism underpins ribosomal protein phenotypes in cell division and longevity.

Authors:  Nairita Maitra; Chong He; Heidi M Blank; Mitsuhiro Tsuchiya; Birgit Schilling; Matt Kaeberlein; Rodolfo Aramayo; Brian K Kennedy; Michael Polymenis
Journal:  Elife       Date:  2020-05-20       Impact factor: 8.140

8.  Complete Genome Assembly of Myxococcus xanthus Strain DZ2 Using Long High-Fidelity (HiFi) Reads Generated with PacBio Technology.

Authors:  Rikesh Jain; Bianca H Habermann; Tâm Mignot
Journal:  Microbiol Resour Announc       Date:  2021-07-15

9.  CD-HIT: accelerated for clustering the next-generation sequencing data.

Authors:  Limin Fu; Beifang Niu; Zhengwei Zhu; Sitao Wu; Weizhong Li
Journal:  Bioinformatics       Date:  2012-10-11       Impact factor: 6.937

10.  Draft Genome Sequence of Myxococcus xanthus Wild-Type Strain DZ2, a Model Organism for Predation and Development.

Authors:  Susanne Müller; Jonathan W Willett; Sarah M Bahr; Cynthia L Darnell; Katherine R Hummels; Carolyn K Dong; Hera C Vlamakis; John R Kirby
Journal:  Genome Announc       Date:  2013-05-09