Appolinaire Djikeng, Rebecca Halpin, Ryan Kuzmickas, Jay Depasse, Jeremy Feldblyum, Naomi Sengamalay, Claudio Afonso, Xinsheng Zhang, Norman G Anderson, Elodie Ghedin, David J Spiro.
Abstract
BACKGROUND: Most emerging health threats are of zoonotic origin. For the overwhelming majority, the causative agents are RNA viruses, which include but are not limited to HIV, Influenza, SARS, Ebola, Dengue, and Hantavirus. A better understanding of global viral diversity is therefore increasingly important for surveillance and for predicting pandemic threats; this will require rapid and flexible methods for complete viral genome sequencing.
Year: 2008 PMID: 18179705 PMCID: PMC2254600 DOI: 10.1186/1471-2164-9-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Overview of the strategy. Viral particles are separated from host contaminants using centrifugation and filtration. Viral particles are treated with DNAse I to remove contaminating nucleic acids. Random priming is used to generate 500–1000 bp amplicons, which are size-selected and cloned. Colonies are picked and sequenced. Sequences are trimmed and assembled. Contigs are closed using sequence-specific primers.
Figure 2. Outline of the SISPA method. A. Viral RNA is converted to cDNA using random-tagged and poly-A-tagged primers (FR26RV-N and FR40RV-T). B. Second-strand DNA is synthesized using Klenow (exo-) DNA polymerase in the presence of random-tagged and virus-specific 5'-end oligo primers. C. Double-stranded DNA is amplified by PCR using the primer tag (FR20RV). D. Amplicons are separated by electrophoresis, and products ranging from 500–1000 nucleotides are cloned into the TOPO vector. 96–288 colonies are picked, plasmid DNA is purified, and the inserts are sequenced.
Viral isolates discussed in this study.
| Virus | Genome type | Genome length (bp) | |
| Woodchuck hepatitis virus (WHV) | dsDNA | 3308 | n/a |
| Enterobacteriophage MS2 (MS2) | ssRNA positive | 3569 | 10^8 |
| Enterobacteriophage M13 (M13) | ssDNA | 6407 | 10^8 |
| Human Rhinovirus 16 (HRV16) | ssRNA positive | 7124 | n/a |
| Turkey Astrovirus (TA) | ssRNA positive | 7355 | n/a |
| Newcastle disease virus (NDV) | ssRNA negative | 15186 | n/a |
| Bacteriophage lambda (lambda) | dsDNA | 48502 | 10^8 |
Figure 3. Representative assemblies of viruses described in this study. Images shown were generated using the DNASTAR SeqMan program. A. Enterobacteriophage MS2 (3569 bp). B. Human Rhinovirus 16 (7124 bp). C. Newcastle disease virus Lasota (15186 bp). All assemblies have been aligned with their reference genomes. Gaps and low-coverage areas that require closure are circled.
Figure 4. A. Depth of coverage of viruses. Depth-of-coverage statistics were generated for each contig from the output of the DNASTAR SeqMan program. Average coverage is the summed length of all sequence reads in a contig, including gaps, divided by the contig length. The average and standard deviation were determined for each virus. B. Correlation of genome coverage with colonies picked. The SISPA method was performed for enterobacteriophage M13 (6407 bp), Newcastle disease virus Lasota (15186 bp) and enterobacteriophage lambda (48502 bp). One, two or three 96-well blocks of clones were sequenced, trimmed and assembled. The sum of the total lengths of edited contigs for each condition was calculated as a percentage of the total reference genome length.
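The two quantities defined in this caption can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the SeqMan computation itself: the function names and the example read and contig lengths are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not say which form was used.

```python
from statistics import mean, stdev


def average_coverage(read_lengths, contig_length):
    """Summed length of all reads in a contig (gaps included) / contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return sum(read_lengths) / contig_length


def percent_genome_covered(contig_lengths, reference_length):
    """Summed contig length expressed as a percentage of the reference genome."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return 100.0 * sum(contig_lengths) / reference_length


# Hypothetical contigs for one virus: (read lengths in bp, contig length in bp).
contigs = {
    "contig_1": ([510, 495, 620, 580], 1100),
    "contig_2": ([700, 650], 900),
}

per_contig = [average_coverage(reads, length) for reads, length in contigs.values()]

# Per-virus summary across contigs (sample standard deviation is an assumption).
print(f"mean depth = {mean(per_contig):.2f}, sd = {stdev(per_contig):.2f}")

# Percent of a hypothetical 3569 bp reference represented by the two contigs.
print(f"{percent_genome_covered([1100, 900], 3569):.1f}% of reference")
```

Note that the percentage is a plain ratio of summed contig length to reference length, so it would exceed 100 if contigs overlapped; the caption does not say how such cases were handled.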
Lander-Waterman analysis of viral genome coverage.
| Virus Name | Total Sequences | Observed Coverage | Expected Coverage | Observed Redundancy | Expected Redundancy |
| WHV | 121 | 0.84 | 1.00 | 14.86 | 18.45 |
| MS2 | 283 | 0.93 | 1.00 | 40.29 | 40.20 |
| M13 | 232 | 0.90 | 1.00 | 21.26 | 18.36 |
| HRV16 | 195 | 0.90 | 1.00 | 14.33 | 13.88 |
| TA | 148 | 0.93 | 1.00 | 11.29 | 10.22 |
| NDV | 349 | 0.97 | 1.00 | 13.72 | 11.65 |
| Lambda | 281 | 0.52 | 0.95 | 3.72 | 2.92 |
Observed coverage and redundancy were compared with the expected coverage and redundancy predicted by the Lander-Waterman model for the total number of sequences in each assembly.
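As a rough illustration of how the expected columns can arise, the standard Lander-Waterman relations give an expected redundancy of c = N·L/G (N reads of mean length L over a genome of length G) and an expected fraction covered of 1 − e^(−c). The sketch below applies these to the sequence counts and genome lengths in the tables above. The mean read length is not stated in this section; the value of 507 bp is an assumption back-calculated from the reported expected redundancies, and the function names are hypothetical.

```python
import math


def expected_redundancy(n_reads, mean_read_len, genome_len):
    """Lander-Waterman redundancy c = N * L / G."""
    return n_reads * mean_read_len / genome_len


def expected_coverage(redundancy):
    """Expected fraction of the genome covered at redundancy c: 1 - exp(-c)."""
    return 1.0 - math.exp(-redundancy)


# ASSUMPTION: mean read length in bp, inferred from the table, not given in the text.
ASSUMED_READ_LEN = 507

# (virus, total sequences, genome length in bp, reported expected redundancy)
rows = [
    ("WHV", 121, 3308, 18.45),
    ("MS2", 283, 3569, 40.20),
    ("M13", 232, 6407, 18.36),
    ("HRV16", 195, 7124, 13.88),
    ("TA", 148, 7355, 10.22),
    ("NDV", 349, 15186, 11.65),
    ("Lambda", 281, 48502, 2.92),
]

for name, n_reads, genome_len, reported in rows:
    c = expected_redundancy(n_reads, ASSUMED_READ_LEN, genome_len)
    print(
        f"{name}: redundancy {c:.2f} (reported {reported:.2f}), "
        f"expected coverage {expected_coverage(c):.2f}"
    )
```

Under this assumed read length the computed redundancies fall within about 0.1 of the reported values, and only lambda, at a redundancy near 3, has an expected coverage visibly below 1.00.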
Figure 5. Relationship between initial virus particle number, genome coverage and percent non-specific sequences generated by SISPA. MS2 viruses were diluted to 10^8, 10^6, 10^4, and 10^2 particles per SISPA DNAse I reaction. The sum of the total lengths of edited contigs for each dilution was calculated as a percentage of the total reference genome length. Non-specific sequences were defined as those that did not match the reference genome at a cutoff value below 10^-25.
Specific sequences and contaminants in turkey astrovirus and human rhinovirus 16 assemblies.
| Contig length (bp) | Number of sequences | blastn match |
| 689 | 2 | Avian |
| 501 | 2 | None |
| 423 | 2 | None |
| 518 | 2 | None |
| 259 | 2 | None |
| 1267 | 28 | Turkey astrovirus |
| 2785 | 67 | Turkey astrovirus |
| 1692 | 43 | Turkey astrovirus |
| 148 | ||
| 93.24% | ||
| 90.84% | ||
| 265 | 8 | Mammalian |
| 537 | 2 | Mammalian |
| 981 | 27 | Mammalian |
| 1091 | 7 | Bacterial |
| 1297 | 10 | Bacterial |
| 553 | 1 | None |
| 385 | 1 | None |
| 342 | 3 | None |
| 815 | 32 | None |
| 909 | 18 | None |
| 676 | 8 | None |
| 487 | 9 | None |
| 4823 | 105 | Human Rhinovirus 16 |
| 1685 | 90 | Human Rhinovirus 16 |
| 321 | ||
| 60.75% | ||
| 90.23% | ||
Sequences were analyzed against a non-redundant database using the blastn algorithm. Virus-specific sequences were identified as those matching the reference genome with a blastn cutoff below 10^-25. Non-specific (non-viral contaminant) sequences were identified as those with a cutoff value below 10^-10, while "None" means that no blastn results were found below the 10^-10 cutoff value.
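The classification rule in this footnote can be written as a small decision function. The sketch below is an assumption-laden illustration rather than the authors' pipeline: it assumes each sequence or contig carries the best blastn value against the reference genome and the best value against the non-redundant database (either may be absent), and the names `classify` and `percent_specific` are hypothetical. The second helper recomputes the share of virus-specific sequences from the sequence counts listed in the table.

```python
from typing import Optional

VIRAL_CUTOFF = 1e-25        # match to the reference genome must be below this
CONTAMINANT_CUTOFF = 1e-10  # any other database match must be below this


def classify(ref_value: Optional[float], nr_value: Optional[float]) -> str:
    """Label a sequence as 'viral', 'contaminant' or 'none' from its best blastn values."""
    if ref_value is not None and ref_value < VIRAL_CUTOFF:
        return "viral"
    if nr_value is not None and nr_value < CONTAMINANT_CUTOFF:
        return "contaminant"
    return "none"


def percent_specific(viral_sequences: int, total_sequences: int) -> float:
    """Virus-specific sequences as a percentage of all sequences in the assembly."""
    return 100.0 * viral_sequences / total_sequences


# Sequence counts taken from the table rows above.
ta_viral = 28 + 67 + 43   # the three turkey astrovirus contigs
hrv_viral = 105 + 90      # the two human rhinovirus 16 contigs

print(f"TA: {percent_specific(ta_viral, 148):.2f}% virus-specific")
print(f"HRV16: {percent_specific(hrv_viral, 321):.2f}% virus-specific")
```

These ratios work out to 93.24% and 60.75%, which match two of the unlabeled percentage rows in the table and suggest those rows report the share of virus-specific sequences; the remaining percentage rows (90.84% and 90.23%) are not reproduced by this calculation.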