Nathan L Yozwiak, Peter Skewes-Cox, Mark D Stenglein, Angel Balmaseda, Eva Harris, Joseph L DeRisi.
Abstract
Dengue virus is an emerging infectious agent that infects an estimated 50-100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness.
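The barcoding strategy mentioned in the abstract can be illustrated with a minimal demultiplexing sketch. The barcode sequences, their length, their position at the start of each read, and the sample names below are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical 4-nt barcodes and sample names (not taken from the study).
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}
BARCODE_LEN = 4  # assumed barcode length


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group pooled reads by their leading barcode and trim the barcode off."""
    by_sample = defaultdict(list)
    for read in reads:
        prefix, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose prefix matches no known barcode are set aside.
        sample = barcodes.get(prefix, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled = ["ACGTGGGTTT", "TGCAAAACCC", "ACGTCCCAAA", "NNNNGGGGGG"]
binned = demultiplex(pooled)
```

This sketch requires an exact barcode match; a production pipeline would likely also tolerate sequencing errors in the barcode, which is omitted here.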
Year: 2012 PMID: 22347512 PMCID: PMC3274504 DOI: 10.1371/journal.pntd.0001485
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Summary of viruses identified in this study.
| Patient code | Clinic virus ID | Virochip virus ID | Sequencing virus ID | Virus TaxID | # virus reads | # initial reads | Fraction virus reads |
|---|---|---|---|---|---|---|---|
| 187 | DENV-2 | DENV-2 | Dengue virus 2 | 11060 | 4280 | 1.1E+06 | 3.9E−03 |
| 275 | DENV-2 | DENV-2 | Dengue virus 2 | 11060 | 1511 | 1.6E+06 | 9.7E−04 |
| 282 | DENV-2 | DENV-2 | Dengue virus 2 | 11060 | 699 | 1.6E+06 | 4.2E−04 |
| 266 | DENV-2 | DENV-2 | Dengue virus 2 | 11060 | 135749 | 4.8E+06 | 2.8E−02 |
| 274 | DENV-1 | DENV-1 | Dengue virus 1 | 11053 | 27 | 1.2E+06 | 2.3E−05 |
| 401 | HAV | HAV | Hepatitis A virus | 12092 | 2164 | 1.8E+05 | 1.2E−02 |
| 401 | HAV | HAV | Hepatitis A virus | 12092 | 4562 | 1.3E+06 | 3.5E−03 |
| 235 | - | - | Human herpesvirus 6 | 10368 | 116 | 5.5E+06 | 2.1E−05 |
| 451 | - | - | Human herpesvirus 6 | 10368 | 88 | 2.7E+06 | 3.2E−05 |
| 207 | - | - | Human herpesvirus 6 | 10368 | 390 | 9.6E+06 | 4.1E−05 |
| 432 | - | - | Human herpesvirus 6 | 10368 | 411 | 3.5E+06 | 1.2E−04 |
| 574 | - | - | Human herpesvirus 6 | 10368 | 138 | 3.2E+06 | 4.4E−05 |
| 370 | - | - | Human herpesvirus 6 | 10368 | 90 | 3.2E+06 | 2.9E−05 |
| 78 | - | - | Human herpesvirus 6 | 10368 | 113 | 1.2E+06 | 9.8E−05 |
| 131 | - | - | Human herpesvirus 6 | 10368 | 24 | 1.2E+06 | 2.0E−05 |
| 183 | - | - | Human herpesvirus 6 | 10368 | 66 | 3.0E+06 | 2.2E−05 |
| 270 | - | - | Human herpesvirus 6 | 10368 | 28 | 1.2E+06 | 2.4E−05 |
| 344 | - | - | Human herpesvirus 6 | 10368 | 303 | 1.3E+06 | 2.2E−04 |
| 350 | - | - | Human herpesvirus 6 | 10368 | 48 | 3.0E+06 | 1.6E−05 |
| 438 | - | - | Human herpesvirus 6 | 10368 | 72 | 4.4E+06 | 1.6E−05 |
| 315 | - | - | African swine fever virus | 10497 | 42 | 1.9E+06 | 2.2E−05 |
| 382 | - | - | Human herpesvirus 4 | 10376 | 44 | 9.6E+05 | 4.6E−05 |
| 387 | - | - | GB virus C | 54290 | 171 | 9.0E+06 | 1.9E−05 |
| 180 | - | - | GB virus C | 54290 | 42 | 8.0E+05 | 5.2E−05 |
| 161 | - | - | Human parvovirus B19 | 10798 | 14 | 3.0E+06 | 4.7E−06 |
| 118 | - | - | Circovirus-like genome RW-E | 642255 | 177 | 7.4E+06 | 2.4E−05 |
| 323 | - | - | Circovirus-like genome RW-E | 642255 | 12 | 5.0E+06 | 2.4E−06 |
| 363 | - | - | Circovirus-like genome RW-E | 642255 | 17 | 1.9E+06 | 8.9E−06 |
| 371 | - | - | Circovirus-like genome RW-E | 642255 | 21 | 1.6E+06 | 1.3E−05 |
| 387 | - | - | Circovirus-like genome RW-E | 642255 | 92 | 9.0E+06 | 1.0E−05 |
| 355 | - | - | Beak and feather disease virus | 77856 | 12 | 2.1E+06 | 5.7E−06 |
| 345 | - | - | Beak and feather disease virus | 77856 | 62 | 2.2E+06 | 2.9E−05 |
| 315 | - | - | Swan circovirus | 459957 | 26 | 1.9E+06 | 1.4E−05 |
| 329 | - | - | Gull circovirus | 400121 | 14 | 2.2E+06 | 6.3E−06 |
| 321 | - | - | Porcine circovirus 1 | 133704 | 30 | 4.6E+06 | 6.5E−06 |
| 375 | - | - | Porcine circovirus 1 | 133704 | 53 | 3.8E+06 | 1.4E−05 |
| 377 | - | - | Cyclovirus PK5034 | 742916 | 81 | 6.6E+06 | 1.2E−05 |
| 322 | - | - | Cyclovirus PK5222 | 742917 | 206 | 3.8E+06 | 5.5E−05 |
| 235 | - | - | Torque teno virus | 68887 | 23 | 5.5E+06 | 4.2E−06 |
| 73 | - | TTV | Torque teno midi virus 1 | 687379 | 137 | 6.9E+06 | 2.0E−05 |
| 505 | - | - | Torque teno virus | 68887 | 37 | 6.9E+06 | 5.3E−06 |
| 505 | - | - | Small anellovirus | 393049 | 25 | 6.9E+06 | 3.6E−06 |
| 457 | - | - | Torque teno virus | 68887 | 29 | 1.5E+07 | 1.9E−06 |
| 171 | - | - | Torque teno mini virus 2 | 687370 | 18 | 1.2E+06 | 1.6E−05 |
| 159 | - | TTV | Torque teno mini virus 5 | 687373 | 143 | 2.6E+06 | 5.6E−05 |
| 179 | - | - | Torque teno mini virus 1 | 687369 | 17 | 1.8E+06 | 9.3E−06 |
| 193 | - | - | Torque teno mini virus 2 | 687370 | 56 | 1.6E+06 | 3.6E−05 |
| 183 | - | TTV | Torque teno mini virus 3 | 687371 | 139 | 3.0E+06 | 4.6E−05 |
| 156 | - | TTV | Torque teno midi virus 1 | 687379 | 213 | 2.3E+06 | 9.1E−05 |
| 186 | - | - | Torque teno virus 15 | 687354 | 1701 | 2.0E+06 | 8.3E−04 |
| 282 | - | TTV | Torque teno midi virus 1 | 687379 | 61 | 1.6E+06 | 3.7E−05 |
| 335 | - | - | Torque teno virus | 68887 | 47 | 1.7E+06 | 2.8E−05 |
| 330 | - | - | TTV-like mini virus | 93678 | 77 | 1.8E+06 | 4.2E−05 |
| 270 | - | - | Torque teno virus 8 | 687347 | 82 | 1.2E+06 | 7.1E−05 |
| 331 | - | - | Torque teno midi virus 2 | 687380 | 113 | 1.4E+06 | 8.2E−05 |
| 349 | - | TTV | Torque teno midi virus | 432261 | 47 | 1.6E+06 | 2.9E−05 |
| 350 | - | TTV | Torque teno mini virus 4 | 687372 | 51 | 3.0E+06 | 1.7E−05 |
| 566 | - | TTV | Torque teno mini virus 4 | 687372 | 206 | 1.9E+06 | 1.1E−04 |
| 377 | - | - | Torque teno mini virus 4 | 687372 | 153 | 6.6E+06 | 2.3E−05 |
| 168 | | TTV | | | | 1.9E+05 | |
| 263 | - | TTV | | | | 1.5E+06 | |
For each sample, the NCBI TaxID and name given are those of the virus species with the highest number of BLAST hits.
The two entries for patient 401 were prepared from aliquots of the same serum sample.
In its deep sequencing dataset, Sample 168 had 9 reads matching TTV, just below our positive identification threshold.
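The last column of the table is the number of virus reads divided by the number of initial reads. A minimal sketch of that arithmetic and of a read-count cutoff follows. The cutoff of 10 reads is an assumption: the footnote states only that 9 reads fell just below the positive identification threshold, not what the threshold is.

```python
# Assumed cutoff for illustration only; the exact threshold is not stated.
MIN_VIRUS_READS = 10


def fraction_virus_reads(virus_reads, initial_reads):
    """Fraction of a sample's initial reads that matched the virus."""
    return virus_reads / initial_reads


def is_positive(virus_reads, min_reads=MIN_VIRUS_READS):
    """True when the virus read count meets the (assumed) cutoff."""
    return virus_reads >= min_reads


# Patient 187 (DENV-2): 4280 virus reads out of about 1.1E+06 initial reads.
frac = fraction_virus_reads(4280, 1.1e6)
print(f"{frac:.1e}")  # 3.9e-03, matching the table entry
```

Because the initial read counts in the table are rounded to two significant figures, fractions recomputed this way can differ slightly from the tabulated values for some rows.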
Figure 1. Bioinformatic filtering of deep sequencing data.
Average percent remaining reads after each of the filtering steps. Low-quality and low-complexity reads are removed first, followed by iterative BLAT and BLAST comparisons to human sequence. Averages were calculated for all samples (n = 130). Inset: secondary pipeline depicting post-filtering viral searches. The dashed bubble includes future methods to improve the sensitivity of viral sequence detection.
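The first two filtering steps and the "percent remaining" bookkeeping described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the mean-quality cutoff, the trinucleotide-based complexity score, and its cutoff are all hypothetical, and the human-sequence steps (BLAT and BLAST) are not implemented; they would be added as further entries in `steps`.

```python
MIN_MEAN_QUALITY = 20   # assumed Phred cutoff, not from the study
MIN_COMPLEXITY = 0.5    # assumed complexity cutoff, not from the study


def mean_quality(quals):
    """Mean Phred score of a read."""
    return sum(quals) / len(quals)


def complexity(seq, k=3):
    """Distinct k-mers divided by total k-mers; low values mean repetitive."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return len(set(kmers)) / len(kmers) if kmers else 0.0


def run_filters(reads, steps):
    """Apply filters in order; report percent of initial reads kept after each."""
    total = len(reads)
    remaining = list(reads)
    report = []
    for name, keep in steps:
        remaining = [r for r in remaining if keep(r)]
        pct = 100.0 * len(remaining) / total if total else 0.0
        report.append((name, pct))
    return remaining, report


# Each read is a (sequence, per-base Phred scores) pair.
reads = [
    ("ACGTTGCAAGCT", [30] * 12),
    ("AAAAAAAAAAAA", [30] * 12),  # low complexity
    ("ACGTTGCAAGCT", [5] * 12),   # low quality
    ("GATTACAGATCC", [35] * 12),
]
steps = [
    ("low_quality", lambda r: mean_quality(r[1]) >= MIN_MEAN_QUALITY),
    ("low_complexity", lambda r: complexity(r[0]) >= MIN_COMPLEXITY),
]
kept, report = run_filters(reads, steps)
```

Running the cheap filters first means the slower alignment-based human filtering only sees reads that survived them.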
Figure 2. HHV-6B genome coverage in positive samples.
Histograms of HHV-6B genome coverage generated by aligning reads with a minimum of 90% identity over the total read length to the genome. The depth of sequence coverage was calculated as the total Kb of aligned sequence per 1 Kb bin over the HHV-6B reference genome. Genome track representation adapted from Dominguez et al. [65]. The blue box represents conserved genes across the betaherpesvirus subfamily, the orange boxes represent core genes across the herpesvirus family, the green box represents the late structural genes (gp82-105), and the asterisk denotes the origin of lytic gene replication. Inset text for each histogram is the sample code. Coverage is shown for samples with greater than 80 HHV-6 reads.
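The binned coverage calculation described in this caption can be sketched as below. The input format is an assumption: each alignment is a `(start, end, identity)` tuple with 0-based, half-open reference coordinates and identity already computed over the full read length. The aligner and its output format are not specified in the caption.

```python
def binned_coverage(alignments, genome_len, bin_size=1000, min_identity=0.90):
    """Total kb of aligned sequence falling in each bin of the reference."""
    n_bins = -(-genome_len // bin_size)  # ceiling division
    kb_per_bin = [0.0] * n_bins
    for start, end, identity in alignments:
        if identity < min_identity:
            continue  # drop alignments below the identity cutoff
        end = min(end, genome_len)
        pos = max(start, 0)
        # Split the aligned interval at bin boundaries.
        while pos < end:
            b = pos // bin_size
            chunk_end = min((b + 1) * bin_size, end)
            kb_per_bin[b] += (chunk_end - pos) / 1000.0
            pos = chunk_end
    return kb_per_bin


# Toy 3 kb reference with three alignments; the last one fails the cutoff.
alignments = [(950, 1050, 0.98), (1200, 1300, 0.95), (2000, 2100, 0.85)]
coverage = binned_coverage(alignments, genome_len=3000)
```

An alignment that spans a bin boundary contributes to both bins in proportion to the bases it places in each, so the per-bin totals sum to the total aligned sequence.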
Figure 3. Circovirus-like NI sequence coverage and phylogeny.
Phylogenetic neighbor-joining tree of amino acid sequences showing the relationship between Circovirus-like NI rep sequences (red) and 19 representative replicase sequences. Abbreviations: CV, circovirus; Ba, Barbel; Bat, Bat ZS/Yunnan-China/2009; BFDV, beak and feather disease virus; Ca, Canary; Circo-like, Circovirus-like genome; Cyclo, cyclovirus; PKbeef, PKbeef23/PAK/2009; Du, Muscovy duck; Ed, Entamoeba dispar; Gi, Giardia intestinalis; PCV2, Porcine circovirus 2; RodCV, Rodent stool-associated circular genome virus; UncCV, uncultured circovirus. For a full list of sequences and accession numbers, see Table S2.