Jochen Klumpp, Derrick E Fouts, Shanmuga Sozhamannan.
Abstract
The dawn of next-generation sequencing technologies has opened up exciting possibilities for whole genome sequencing of a plethora of organisms. The second- and third-generation sequencing technologies, based on cloning-free, massively parallel sequencing, have enabled the generation of a deluge of genomic sequences of both prokaryotic and eukaryotic origin in the last seven years. However, whole genome sequencing of bacterial viruses has not kept pace with this revolution, despite the fact that their genomes are orders of magnitude smaller than those of bacteria and other organisms. Sequencing phage genomes poses several challenges: (1) obtaining pure phage genomic material, (2) PCR amplification biases and (3) the complex nature of their genetic material, with features such as methylated bases and repeats that are inherently difficult to sequence and assemble. Here we describe conclusions drawn from our efforts in sequencing hundreds of bacteriophage genomes from a variety of Gram-positive and Gram-negative bacteria using Sanger, 454, Illumina and PacBio technologies. Based on our experience, we propose several general considerations regarding sample quality, the choice of technology and a "blended approach" for generating reliable whole genome sequences of phages.
Year: 2012 PMID: 23275870 PMCID: PMC3530529 DOI: 10.4161/bact.22111
Source DB: PubMed Journal: Bacteriophage ISSN: 2159-7073
Table 1. Summary of bacteriophage shotgun genome sequencing projects
| Phage name (host bacteria) | Virus family | Genome size | Number of reads | Average read length | Time to complete sequencing | Reference |
|---|---|---|---|---|---|---|
| P40 | | 35.64 kb | 164 | 942 | 55 d | 44 |
| ΦS63 | | 33.61 kb | 263 | 915 | 27 d | 45 |
| B653 | | 31.17 kb | 235 | 923 | 31 d | unpublished |
| NF5 | | 36.95 kb | 287 | 850 | 25 d | 46 |
| BL3 | | 41.52 kb | 242 | 871 | 28 d | 46 |
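
As a rough cross-check of Table 1, nominal fold coverage can be estimated as the number of reads times the average read length, divided by the genome size. The sketch below assumes the read lengths are given in bp (the table states no unit) and ignores trimming and vector sequence, so the values probably overestimate the usable coverage. The function name `nominal_coverage` is hypothetical and not taken from the source.

```python
def nominal_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Return nominal fold coverage: total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


# (phage, genome size in bp, number of reads, average read length) from Table 1.
# Read lengths are assumed to be in bp.
TABLE_1 = [
    ("P40", 35_640, 164, 942),
    ("ΦS63", 33_610, 263, 915),
    ("B653", 31_170, 235, 923),
    ("NF5", 36_950, 287, 850),
    ("BL3", 41_520, 242, 871),
]

for name, size_bp, n_reads, read_len in TABLE_1:
    cov = nominal_coverage(n_reads, read_len, size_bp)
    print(f"{name}: ~{cov:.1f}x nominal coverage")
```

Under these assumptions, every phage in the table comes out at single-digit nominal coverage.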

Figure 1. Sanger read pile-up in the assembly of a shotgun library sequencing approach of Listeria phage P70. Image captured from CLC Genomics Workbench 5.1. The upper scale shows sequence length in bp. Forward reads are green, reverse reads are red and mate-pair reads are blue. Light green and light red indicate trimmed sequence parts. The coverage plot shows the region of sequence and cloning bias, which features significantly higher coverage (up to 55-fold) than the rest of the contig sequence (2–21-fold).
Table 2. Results of 454 sequencing of bacteriophage genomes. A) 16 bacteriophage genomes on one sequencing plate
| Sample (Host) | Number of sequences | Number of bases | Average read length | Average coverage | # contigs |
|---|---|---|---|---|---|
| 1 | 11344 | 2161998 | 191 | 48x | 4 |
| 2 | 30271 | 6025214 | 199 | 46x | 2 |
| 3 | 26871 | 4771285 | 178 | 58x | 7 |
| 4 | 39479 | 7344858 | 186 | 198x | 16 |
| 5 | 27507 | 5082812 | 185 | 34x | 14 |
| 6 | 30877 | 6066035 | 196 | 46x | 3 |
| 7 | 668 | 116563 | 174 | 0.83x | 9 |
| 8 | 34842 | 6112171 | 175 | 76x | 3 |
| 9 | 16325 | 3237309 | 198 | 27x | 6 |
| 10 | 25081 | 4662743 | 186 | 34x | 7 |
| 11 | 25805 | 5450722 | 211 | 39x | 3 |
| 12 | 45510 | 8086412 | 194 | 71x | 3 |
| 13 | 37291 | 6585482 | 177 | 78x | 8 |
| 14 | 35671 | 7045886 | 198 | 201x | 11 |
| 15 | 37561 | 7474743 | 199 | 59x | 19 |
| 16 | 23148 | 4517691 | 195 | 30x | 9 |
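
The columns of Table 2 are related arithmetically, which allows a quick consistency check. The sketch below recomputes the average read length as bases divided by sequences and backs out the target size implied by the reported coverage. It assumes that average coverage was computed as total bases divided by genome length, which the table does not state; if coverage was derived from mapped reads only, the implied sizes will be off. The helper names are hypothetical.

```python
def mean_read_length(n_bases: int, n_reads: int) -> float:
    """Average read length = total bases / number of sequences."""
    return n_bases / n_reads


def implied_target_size(n_bases: int, coverage: float) -> float:
    """Target size in bp implied by coverage = total bases / size (assumed definition)."""
    return n_bases / coverage


# (sample, number of sequences, number of bases, reported read length, reported coverage)
# Three rows copied from Table 2A.
ROWS = [
    (1, 11344, 2161998, 191, 48.0),
    (7, 668, 116563, 174, 0.83),
    (14, 35671, 7045886, 198, 201.0),
]

for sample, n_reads, n_bases, reported_len, coverage in ROWS:
    recomputed = mean_read_length(n_bases, n_reads)
    size_kb = implied_target_size(n_bases, coverage) / 1000
    print(f"sample {sample}: mean read length {recomputed:.1f} "
          f"(reported {reported_len}), implied size ~{size_kb:.0f} kb")
```

For these three rows, the recomputed read lengths round to the reported values. Sample 7, with only 668 sequences and 0.83x coverage, stands out as far too shallow for assembly.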

Figure 2. De novo assembly of approximately 60 million Illumina reads generated for a 178 kb Cronobacter phage. The assembly produced 219 large contigs, at least 20 of which are of similar size to or larger than the actual phage genome. The phage genome stands out because of its unusually high sequence coverage of 22,880-fold. Several other assemblies also feature reliable coverage when viewed separately from the rest.
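
One common response to coverage this deep is to randomly subsample the reads before assembly; this is not a step reported in the caption. Random subsampling scales the coverage of every template proportionally, so the fraction to keep is simply target coverage divided by observed coverage. The sketch below is a minimal calculation under that assumption. The 100-fold target and the function name are hypothetical, chosen only for illustration.

```python
def subsample_fraction(observed_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep so that coverage drops to roughly the target."""
    if observed_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    # Never keep more than all reads.
    return min(1.0, target_coverage / observed_coverage)


TOTAL_READS = 60_000_000    # approximate read count from the Figure 2 caption
OBSERVED_COVERAGE = 22_880  # phage coverage from the Figure 2 caption
TARGET_COVERAGE = 100       # hypothetical target, not from the source

fraction = subsample_fraction(OBSERVED_COVERAGE, TARGET_COVERAGE)
reads_to_keep = round(TOTAL_READS * fraction)
print(f"keep {fraction:.4%} of reads (~{reads_to_keep:,} reads)")
```

With these numbers, well under 1% of the reads would be retained, which also lowers the coverage of any contaminating DNA by the same factor.
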
Table 3. Results from PacBio RS sequencing of bacteriophage CP-51 (Bacillus) and P70 (Listeria) DNA using SMRTanalysis version 1.3
| Phage name | # of SMRT cells (# of 45 min movies) | Pre-filter # of bases | Post-filter # of bases | # of post-filter reads | Post-filter mean read length (library insert size) in nt | Post-filter mean read quality |
|---|---|---|---|---|---|---|
| P70 | 3 (6) | 354960481 | 70378003 | 33848 | 1881 (1800–2000) | 0.871 |
| CP-51 | 6 (12) | 676462435 | 212266847 | 107146 | 1770 (2800) | 0.875 |
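
The pre- and post-filter base counts in Table 3 show how much raw PacBio output survived filtering. The sketch below computes the retained fraction and the post-filter yield per SMRT cell directly from the table values. The dictionary layout and function names are hypothetical, and nothing is assumed about the filter criteria themselves.

```python
# Values copied from Table 3.
TABLE_3 = {
    "P70": {"smrt_cells": 3, "pre_bases": 354_960_481, "post_bases": 70_378_003},
    "CP-51": {"smrt_cells": 6, "pre_bases": 676_462_435, "post_bases": 212_266_847},
}


def retained_fraction(pre_bases: int, post_bases: int) -> float:
    """Fraction of raw bases that passed filtering."""
    return post_bases / pre_bases


def post_filter_bases_per_cell(post_bases: int, smrt_cells: int) -> float:
    """Average post-filter yield per SMRT cell, in bases."""
    return post_bases / smrt_cells


for phage, row in TABLE_3.items():
    kept = retained_fraction(row["pre_bases"], row["post_bases"])
    per_cell_mb = post_filter_bases_per_cell(row["post_bases"], row["smrt_cells"]) / 1e6
    print(f"{phage}: {kept:.1%} of bases retained, ~{per_cell_mb:.1f} Mb per SMRT cell")
```

For both phages, well under half of the raw bases pass the filter, with a lower retained fraction for P70 than for CP-51.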