Zhao Chen, David L Erickson, Jianghong Meng.
Abstract
BACKGROUND: We benchmarked the hybrid assembly approaches of MaSuRCA, SPAdes, and Unicycler for bacterial pathogens using Illumina and Oxford Nanopore sequencing by determining genome completeness and accuracy, antimicrobial resistance (AMR), virulence potential, multilocus sequence typing (MLST), phylogeny, and pan genome. Ten bacterial species (10 strains) were tested with simulated reads of both mediocre and low quality, whereas 11 bacterial species (12 strains) were tested with real reads.
Keywords: Bacterial pathogen; Genomic analyses; Hybrid assembly; Illumina sequencing; MaSuRCA; Oxford Nanopore sequencing; SPAdes; Unicycler
Year: 2020 PMID: 32928108 PMCID: PMC7490894 DOI: 10.1186/s12864-020-07041-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Hybrid assemblies of bacterial strains with simulated Illumina short reads and mediocre-quality Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler compared to their corresponding reference genomes
| Strain | Contigs (MaSuRCA) | Contigs (SPAdes) | Contigs (Unicycler) | Contigs (Reference) | Total length, bp (MaSuRCA) | Total length, bp (SPAdes) | Total length, bp (Unicycler) | Total length, bp (Reference) | GC content, % (MaSuRCA) | GC content, % (SPAdes) | GC content, % (Unicycler) | GC content, % (Reference) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 (0 cir.; 2 dead) | 39 | 1 (0 cir.; 2 dead) | 1 | 6,264,321 | 6,261,361 | 6,264,377 | 6,264,404 | 66.56 | 66.57 | 66.56 | 66.56 |
| | 2 (0 cir.; 4 dead) | 456 | 3 (0 cir.; 6 dead) | 3 | 5,591,228 | 5,518,282 | 5,594,148 | 5,594,605 | 50.49 | 50.40 | 50.48 | 50.48 |
| | 3 (0 cir.; 6 dead) | 47 | 3 (0 cir.; 6 dead) | 3 | 5,503,594 | 5,508,786 | 5,489,222 | 5,503,926 | 35.24 | 35.25 | 35.23 | 35.24 |
| | 1 (0 cir.; 2 dead) | 38 | 1 (0 cir.; 2 dead) | 1 | 5,521,107 | 5,517,301 | 5,521,041 | 5,521,203 | 57.56 | 57.57 | 57.56 | 57.56 |
| | 2 (0 cir.; 4 dead) | 43 | 3 (0 cir.; 3 dead) | 2 | 4,951,226 | 4,956,436 | 4,947,278 | 4,951,383 | 52.24 | 52.24 | 52.24 | 52.24 |
| | 3 (1 cir.; 4 dead) | 54 | 4 (1 cir.; 6 dead) | 4 | 4,658,297 | 4,642,775 | 4,663,118 | 4,663,565 | 56.64 | 56.67 | 56.64 | 56.64 |
| | 2 (1 cir.; 2 dead) | 86 | 1 (0 cir.; 2 dead) | 1 | 4,398,971 | 4,393,287 | 4,392,893 | 4,393,047 | 28.04 | 28.01 | 28.02 | 28.02 |
| | 2 (1 cir.; 2 dead) | 7 | 1 (0 cir.; 2 dead) | 1 | 2,950,445 | 2,944,768 | 2,944,366 | 2,944,528 | 38.01 | 37.98 | 37.98 | 37.98 |
| | 1 (0 cir.; 2 dead) | 47 | 1 (0 cir.; 2 dead) | 1 | 2,821,292 | 2,825,487 | 2,821,211 | 2,821,361 | 32.87 | 32.87 | 32.87 | 32.87 |
| | 1 (0 cir.; 2 dead) | 2 | 1 (0 cir.; 2 dead) | 1 | 1,641,372 | 1,641,539 | 1,641,262 | 1,641,481 | 30.55 | 30.55 | 30.55 | 30.55 |
cir., circularized contigs
dead, dead ends
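The contig cells in this table pack three counts into one string, and the total lengths are easiest to judge as a difference from the reference. The sketch below is one illustrative way to read such cells, not part of the study's pipeline; the helper names `parse_contig_cell` and `length_delta` are hypothetical, and the sample values are copied from the first data row of the table.

```python
import re

# Matches cells such as "3 (1 cir.; 4 dead)".
CELL = re.compile(r"(\d+)\s*\((\d+)\s*cir\.;\s*(\d+)\s*dead\)")


def parse_contig_cell(cell):
    """Split a contig cell into contig, circularized, and dead-end counts."""
    text = cell.strip()
    match = CELL.fullmatch(text)
    if match is None:
        # Cells that hold only a contig count (e.g. the SPAdes column).
        return {"contigs": int(text), "circular": None, "dead_ends": None}
    contigs, circular, dead_ends = map(int, match.groups())
    return {"contigs": contigs, "circular": circular, "dead_ends": dead_ends}


def length_delta(assembly_bp, reference_bp):
    """Return (difference in bp, difference as a percentage of the reference)."""
    diff = assembly_bp - reference_bp
    return diff, 100.0 * diff / reference_bp


if __name__ == "__main__":
    reference = 6_264_404  # first data row, reference total length
    lengths = {"MaSuRCA": 6_264_321, "SPAdes": 6_261_361, "Unicycler": 6_264_377}
    for name, bp in lengths.items():
        diff, pct = length_delta(bp, reference)
        print(f"{name}: {diff:+d} bp ({pct:+.4f}%)")
    print(parse_contig_cell("1 (0 cir.; 2 dead)"))
```

A negative difference means the assembly is shorter than the reference; the same helpers apply unchanged to the real-read table.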
Hybrid assemblies of bacterial strains with simulated Illumina short reads and low-quality Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler
| Strain | Contigs (MaSuRCA) | Contigs (SPAdes) | Contigs (Unicycler) | Total length, bp (MaSuRCA) | Total length, bp (SPAdes) | Total length, bp (Unicycler) | GC content, % (MaSuRCA) | GC content, % (SPAdes) | GC content, % (Unicycler) |
|---|---|---|---|---|---|---|---|---|---|
| | 4 (0 cir.; 8 dead) | 64 | 1 (0 cir.; 2 dead) | 6,173,430 | 6,261,368 | 6,264,384 | 66.55 | 66.57 | 66.56 |
| | 6 (0 cir.; 12 dead) | 550 | 83 (0 cir.; 6 dead) | 5,503,884 | 5,487,649 | 5,562,446 | 50.50 | 50.38 | 50.45 |
| | 7 (0 cir.; 14 dead) | 78 | 3 (0 cir.; 6 dead) | 5,433,227 | 5,482,457 | 5,489,222 | 35.26 | 35.17 | 35.23 |
| | 1 (0 cir.; 2 dead) | 121 | 1 (0 cir.; 2 dead) | 5,482,918 | 5,505,259 | 5,520,752 | 57.54 | 57.57 | 57.56 |
| | 3 (0 cir.; 6 dead) | 78 | 3 (0 cir.; 3 dead) | 4,848,561 | 4,952,963 | 4,947,288 | 52.22 | 52.24 | 52.24 |
| | 18 (0 cir.; 36 dead) | 72 | 4 (1 cir.; 6 dead) | 4,492,402 | 4,640,822 | 4,663,144 | 56.60 | 56.67 | 56.64 |
| | 2 (1 cir.; 2 dead) | 109 | 11 (0 cir.; 2 dead) | 4,398,971 | 4,379,115 | 4,372,396 | 28.04 | 27.95 | 27.92 |
| | 4 (1 cir.; 6 dead) | 11 | 1 (0 cir.; 2 dead) | 2,927,219 | 2,933,282 | 2,942,862 | 38.02 | 37.94 | 37.97 |
| | 3 (0 cir.; 6 dead) | 37 | 1 (0 cir.; 2 dead) | 2,759,087 | 2,818,488 | 2,821,119 | 32.88 | 32.84 | 32.87 |
| | 1 (0 cir.; 2 dead) | 8 | 1 (0 cir.; 2 dead) | 1,630,638 | 1,640,924 | 1,641,330 | 30.56 | 30.55 | 30.55 |
cir., circularized contigs
dead, dead ends
Hybrid assemblies of bacterial strains with real Illumina short reads and Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler compared to their corresponding reference genomes
| Strain | Contigs (MaSuRCA) | Contigs (SPAdes) | Contigs (Unicycler) | Contigs (Reference) | Total length, bp (MaSuRCA) | Total length, bp (SPAdes) | Total length, bp (Unicycler) | Total length, bp (Reference) | GC content, % (MaSuRCA) | GC content, % (SPAdes) | GC content, % (Unicycler) | GC content, % (Reference) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2 (2 cir.; 1 dead) | 585 | 97 (3 cir.; 0 dead) | 2 | 5,803,063 | 5,674,058 | 5,722,981 | 5,778,004 | 50.67 | 50.57 | 50.67 | 50.68 |
| | 5 (0 cir.; 2 dead) | 488 | 108 (1 cir.; 0 dead) | 2 | 5,604,298 | 5,512,883 | 5,522,347 | 5,593,613 | 50.65 | 50.53 | 50.57 | 50.64 |
| | 1 (1 cir.; 0 dead) | 89 | 1 (1 cir.; 0 dead) | 1 | 5,536,665 | 5,543,412 | 5,536,929 | 5,536,659 | 57.53 | 57.53 | 57.53 | 57.53 |
| | 1 (1 cir.; 0 dead) | 56 | 1 (1 cir.; 0 dead) | 1 | 5,221,910 | 5,230,821 | 5,221,909 | 5,221,909 | 57.58 | 57.53 | 57.58 | 57.58 |
| | 2 (2 cir.; 0 dead) | 113 | 2 (2 cir.; 0 dead) | 2 | 4,990,691 | 5,003,034 | 4,992,376 | 4,992,376 | 55.59 | 55.56 | 55.59 | 55.59 |
| | 2 (2 cir.; 0 dead) | 80 | 2 (2 cir.; 0 dead) | 2 | 4,806,601 | 4,816,325 | 4,806,603 | 4,808,805 | 52.21 | 52.23 | 52.21 | 52.22 |
| | 1 (1 cir.; 0 dead) | 66 | 1 (1 cir.; 0 dead) | 1 | 4,917,488 | 4,939,416 | 4,917,491 | 4,917,511 | 52.05 | 51.99 | 52.05 | 52.05 |
| | 3 (2 cir.; 1 dead) | 75 | 4 (4 cir.; 0 dead) | 4 | 4,577,802 | 4,597,231 | 4,581,781 | 4,581,781 | 56.73 | 56.71 | 56.73 | 56.73 |
| | 2 (2 cir.; 1 dead) | 18 | 2 (2 cir.; 0 dead) | 2 | 3,102,056 | 3,140,938 | 3,137,527 | 3,108,102 | 37.95 | 37.91 | 37.90 | 37.91 |
| | 4 (1 cir.; 6 dead) | 75 | 3 (3 cir.; 0 dead) | 2 | 2,749,203 | 2,774,416 | 2,763,803 | 2,757,659 | 32.86 | 32.92 | 32.82 | 32.84 |
| | 2 (2 cir.; 0 dead) | 55 | 2 (2 cir.; 0 dead) | 2 | 1,811,805 | 1,774,178 | 1,814,450 | 1,782,911 | 30.53 | 30.67 | 30.54 | 30.53 |
| | 3 (3 cir.; 0 dead) | 51 | 3 (3 cir.; 0 dead) | 3 | 1,750,134 | 1,765,016 | 1,750,172 | 1,750,177 | 31.41 | 31.56 | 31.41 | 31.41 |
cir., circularized contigs
dead, dead ends
Genotypes and predicted phenotypes of antimicrobial resistance (AMR) of bacterial strains with simulated Illumina short reads and either mediocre-quality or low-quality Oxford Nanopore long reads, as predicted based on their MaSuRCA, SPAdes, and Unicycler assemblies and compared to their corresponding reference genomes [a]
| Strain | Genotype | Predicted phenotype | ||||||
|---|---|---|---|---|---|---|---|---|
| MaSuRCA | SPAdes | Unicycler | Reference | MaSuRCA | SPAdes | Unicycler | Reference | |
Kanamycin Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone Chloramphenicol Fosfomycin | Kanamycin Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone Chloramphenicol Fosfomycin | Kanamycin Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone Chloramphenicol Fosfomycin | Kanamycin Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone Chloramphenicol Fosfomycin | |||||
| Fosfomycin | Fosfomycin | Fosfomycin | Fosfomycin | |||||
| Ampicillin | Ampicillin | Ampicillin | Ampicillin | |||||
| Fosfomycin | Fosfomycin | Fosfomycin | Fosfomycin | |||||
| Ampicillin | Ampicillin | Ampicillin | Ampicillin | |||||
[a] No antimicrobial resistance genes (ARGs) were detected in E. coli O157:H7 Sakai, S. Typhimurium LT2, C. sakazakii ATCC 29544, or S. aureus NCTC 8325, as predicted based on their hybrid assemblies using MaSuRCA, SPAdes, and Unicycler
Genotypes and predicted phenotypes of antimicrobial resistance (AMR) of bacterial strains with real Illumina short reads and Oxford Nanopore long reads, as predicted based on their MaSuRCA, SPAdes, and Unicycler assemblies and compared to their corresponding reference genomes [a]
| Strain | Genotype | Predicted phenotype | ||||||
|---|---|---|---|---|---|---|---|---|
| MaSuRCA | SPAdes | Unicycler | Reference | MaSuRCA | SPAdes | Unicycler | Reference | |
| Ampicillin | Ampicillin | Ampicillin | Ampicillin | |||||
Ampicillin Fosfomycin Chloramphenicol | Ampicillin Fosfomycin Chloramphenicol | Ampicillin Fosfomycin Chloramphenicol | Ampicillin Fosfomycin Chloramphenicol | |||||
Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone | Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone | Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone | Ampicillin Amoxicillin/clavulanic acid Cefoxitin Ceftriaxone | |||||
Erythromycin Azithromycin Tetracycline | Ampicillin | Ampicillin | Ampicillin | |||||
Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | |||||
Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | Kanamycin Ampicillin Tetracycline | |||||
[a] No antimicrobial resistance genes (ARGs) were detected in E. coli O26:H11 CFSAN027343, E. coli O26:H11 CFSAN027350, E. cancerogenus CFSAN086183, S. Bareilly CFSAN000189, C. sakazakii CFSAN068773, or L. monocytogenes CFSAN008100, as predicted based on their hybrid assemblies using MaSuRCA, SPAdes, and Unicycler
Numbers of virulence genes of bacterial strains with simulated Illumina short reads and mediocre-quality Oxford Nanopore long reads, as predicted based on their MaSuRCA, SPAdes, and Unicycler assemblies and compared to their corresponding reference genomes
| Strain | Virulence genes (MaSuRCA) | Virulence genes (SPAdes) | Virulence genes (Unicycler) | Virulence genes (Reference) |
|---|---|---|---|---|
| | 241 | 241 | 241 | 241 |
| | 126 | 128 | 126 | 126 |
| | 13 | 13 | 13 | 13 |
| | 10 | 10 | 10 | 10 |
| | 118 | 118 | 118 | 118 |
| | 2 | 2 | 2 | 2 |
| | 0 | 0 | 0 | 0 |
| | 32 | 32 | 32 | 32 |
| | 63 | 63 | 63 | 63 |
| | 118 | 119 | 119 | 119 |
Numbers of virulence genes of bacterial strains with simulated Illumina short reads and low-quality Oxford Nanopore long reads, as predicted based on their MaSuRCA, SPAdes, and Unicycler assemblies
| Strain | Virulence genes (MaSuRCA) | Virulence genes (SPAdes) | Virulence genes (Unicycler) |
|---|---|---|---|
| | 184 | 184 | 184 |
| | 110 | 128 | 128 |
| | 13 | 13 | 13 |
| | 10 | 10 | 10 |
| | 107 | 117 | 118 |
| | 2 | 2 | 2 |
| | 0 | 0 | 0 |
| | 32 | 32 | 32 |
| | 62 | 63 | 63 |
| | 118 | 119 | 118 |
Numbers of virulence genes of bacterial strains with real Illumina short reads and Oxford Nanopore long reads, as predicted based on their MaSuRCA, SPAdes, and Unicycler assemblies and compared to their corresponding reference genomes
| Strain | Virulence genes (MaSuRCA) | Virulence genes (SPAdes) | Virulence genes (Unicycler) | Virulence genes (Reference) |
|---|---|---|---|---|
| | 115 | 115 | 114 | 121 |
| | 110 | 108 | 109 | 115 |
| | 10 | 10 | 10 | 10 |
| | 10 | 10 | 10 | 10 |
| | 15 | 15 | 15 | 16 |
| | 109 | 109 | 109 | 111 |
| | 23 | 23 | 23 | 22 |
| | 2 | 2 | 2 | 4 |
| | 31 | 31 | 31 | 32 |
| | 63 | 63 | 63 | 63 |
| | 104 | 105 | 104 | 107 |
| | 76 | 76 | 76 | 77 |
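To read the real-read virulence table at a glance, it can help to count how often each assembler reproduces the reference gene count exactly and how far it strays at worst. The following is a minimal sketch under that framing, not a method from the study; the `summarize` helper is hypothetical, and the tuples are the table rows in order (MaSuRCA, SPAdes, Unicycler, Reference).

```python
# Virulence gene counts copied from the real-read table, one tuple per row:
# (MaSuRCA, SPAdes, Unicycler, Reference).
ROWS = [
    (115, 115, 114, 121),
    (110, 108, 109, 115),
    (10, 10, 10, 10),
    (10, 10, 10, 10),
    (15, 15, 15, 16),
    (109, 109, 109, 111),
    (23, 23, 23, 22),
    (2, 2, 2, 4),
    (31, 31, 31, 32),
    (63, 63, 63, 63),
    (104, 105, 104, 107),
    (76, 76, 76, 77),
]
ASSEMBLERS = ("MaSuRCA", "SPAdes", "Unicycler")


def summarize(rows):
    """Per assembler: rows matching the reference exactly, and the largest gap."""
    summary = {name: {"exact": 0, "max_abs_diff": 0} for name in ASSEMBLERS}
    for *counts, reference in rows:
        for name, count in zip(ASSEMBLERS, counts):
            diff = abs(count - reference)
            if diff == 0:
                summary[name]["exact"] += 1
            summary[name]["max_abs_diff"] = max(summary[name]["max_abs_diff"], diff)
    return summary


if __name__ == "__main__":
    for name, stats in summarize(ROWS).items():
        print(f"{name}: {stats['exact']}/{len(ROWS)} rows match the reference, "
              f"largest gap {stats['max_abs_diff']} genes")
```

Matching counts do not guarantee that the same genes were recovered, so this summary is only a coarse check on the tabulated numbers.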
Fig. 1. Whole-genome phylogenetic tree of the hybrid assemblies of Pseudomonas aeruginosa PAO1 with simulated Illumina short reads and mediocre- or low-quality Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler in addition to the reference genome (in red) compared to 30 P. aeruginosa strains. The scale bar indicates the genetic distance.
Fig. 2. Whole-genome phylogenetic tree of the hybrid assemblies of Listeria monocytogenes CFSAN008100 with real Illumina short reads and Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler in addition to the reference genome (in red) compared to 30 L. monocytogenes strains. The scale bar indicates the genetic distance.
Fig. 3. Core-genome phylogenetic tree of the hybrid assemblies of Escherichia coli O157:H7 Sakai with simulated Illumina short reads and mediocre- or low-quality Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler in addition to the reference genome (in red) compared to 30 Shiga toxin-producing E. coli (STEC) strains. The scale bar indicates the genetic distance.
Fig. 4. Core-genome phylogenetic tree of the hybrid assemblies of Cronobacter sakazakii CFSAN068773 with real Illumina short reads and Oxford Nanopore long reads using MaSuRCA, SPAdes, and Unicycler in addition to the reference genome (in red) compared to 30 C. sakazakii strains. The scale bar indicates the genetic distance.
Fig. 5. Pan genomes of the hybrid assemblies of Salmonella Typhimurium LT2 with simulated Illumina short reads and mediocre- or low-quality Oxford Nanopore long reads using MaSuRCA (mediocre-quality, a; low-quality, d), SPAdes (mediocre-quality, b; low-quality, e), and Unicycler (mediocre-quality, c; low-quality, f) and 20 S. Typhimurium strains compared to the reference genome (g).
Fig. 6. Pan genomes of the hybrid assemblies of Campylobacter jejuni CFSAN032806 with real Illumina short reads and Oxford Nanopore long reads using MaSuRCA (a), SPAdes (b), and Unicycler (c) and 20 C. jejuni strains compared to the reference genome (d).