Todd N Wylie, Kristine M Wylie, Brandi N Herter, Gregory A Storch.
Abstract
Metagenomic shotgun sequencing (MSS) is an important tool for characterizing viral populations. It is culture independent, requires no a priori knowledge of the viruses in the sample, and may provide useful genomic information. However, MSS can lack sensitivity and may yield insufficient data for detailed analysis. We have created a targeted sequence capture panel, ViroCap, designed to enrich nucleic acid from DNA and RNA viruses from 34 families that infect vertebrate hosts. A computational approach condensed ∼1 billion bp of viral reference sequence into <200 million bp of unique, representative sequence suitable for targeted sequence capture. We compared the effectiveness of detecting viruses in standard MSS versus MSS following targeted sequence capture. First, we analyzed two sets of samples, one derived from samples submitted to a diagnostic virology laboratory and one derived from samples collected in a study of fever in children. We detected 14 and 18 viruses in the two sets, comprising 19 genera from 10 families, with dramatic enhancement of genome representation following capture enrichment. The median fold-increases in percentage viral reads post-capture were 674 and 296. Median breadth of coverage increased from 2.1% to 83.2% post-capture in the first set and from 2.0% to 75.6% in the second set. Next, we analyzed samples containing a set of diverse anellovirus sequences and demonstrated that ViroCap could be used to detect viral sequences with up to 58% variation from the references used to select capture probes. ViroCap substantially enhances MSS for a comprehensive set of viruses and has utility for research and clinical applications.
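As an illustration of the fold-increase metric mentioned in the abstract, the following minimal sketch assumes that the fold-increase is the ratio of the post-capture percentage of viral reads to the precapture percentage; the abstract does not spell out the exact formula. The function names and all read counts are hypothetical and are not taken from the study.

```python
from statistics import median


def percent_viral(viral_reads, total_reads):
    """Percentage of all sequence reads that were assigned to a virus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * viral_reads / total_reads


def fold_increase(pre_pct, post_pct):
    """Assumed definition: post-capture percentage divided by precapture percentage."""
    if pre_pct <= 0:
        raise ValueError("fold-increase is undefined when no viral reads were seen precapture")
    return post_pct / pre_pct


# Hypothetical counts: (pre viral, pre total, post viral, post total)
samples = [
    (50, 1_000_000, 20_000, 1_000_000),
    (10, 2_000_000, 5_000, 1_000_000),
    (200, 1_000_000, 40_000, 1_000_000),
]

folds = [
    fold_increase(percent_viral(pre_v, pre_t), percent_viral(post_v, post_t))
    for pre_v, pre_t, post_v, post_t in samples
]
median_fold = median(folds)
print(f"median fold-increase: {median_fold:.0f}")
```

Working with percentages rather than raw counts keeps the comparison meaningful when the pre- and post-capture libraries are sequenced to different depths.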
Year: 2015 PMID: 26395152 PMCID: PMC4665012 DOI: 10.1101/gr.191049.115
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Taxonomic distribution of target genomes included in ViroCap. Shown are the viral groups and families included in the ViroCap targeted sequence capture panel. A highlighted subset illustrates underlying genera. To view complete genera for all families, see Supplemental Figure S1A. Taxonomic assignments were obtained from the NCBI Taxonomy Viewer (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?opt=virus&taxid=10239).
Results of metagenomic shotgun sequencing for pooled specimens before and after viral targeted sequence capture
Figure 2. Targeted sequence capture enrichment. Examples are given showing the impact of targeted sequence capture on breadth and depth of genome coverage for eight representative viral genomes (A–H). For illustrative purposes, all of the coverage panels in this figure have been normalized by removing (deduplicating) reads based on identical alignment start-sites. Nucleotide positions along the reference genome are shown on the x-axis. The depth of deduplicated reads is shown on the y-axis. The shaded portion indicates the sequence coverage (breadth and depth) for each virus. Post-capture sequence coverage is represented in the larger panels in blue; precapture sequence coverage is shown in the insets in red. Note that y-axis ranges are different for each panel. At the top of each panel is shown the breadth of coverage (BoC) for the sample. The header of each panel includes breadth of coverage gain (BoC gain), sample id, and reference genome name and NCBI version number. BoC gain is calculated by subtracting the percentage of the length of the reference genome that was covered by sequence reads in precapture MSS from the percentage of the length of the reference genome covered by post-capture sequence reads.
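The deduplication and BoC gain calculations described in the Figure 2 legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes alignments are given as 0-based, half-open (start, end) intervals on a single reference genome, and that the first read seen at each start site is the one kept. All names and coordinates are hypothetical.

```python
def deduplicate_by_start(alignments):
    """Keep one alignment per start site (assumption: the first one encountered)."""
    seen_starts = set()
    kept = []
    for start, end in alignments:
        if start not in seen_starts:
            seen_starts.add(start)
            kept.append((start, end))
    return kept


def breadth_of_coverage(alignments, genome_length):
    """Percentage of reference positions covered by at least one alignment."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = [False] * genome_length
    for start, end in alignments:
        # Clip each interval to the reference so out-of-range coordinates are ignored.
        for pos in range(max(start, 0), min(end, genome_length)):
            covered[pos] = True
    return 100.0 * sum(covered) / genome_length


def boc_gain(pre_alignments, post_alignments, genome_length):
    """Post-capture BoC minus precapture BoC, in percentage points."""
    return (breadth_of_coverage(post_alignments, genome_length)
            - breadth_of_coverage(pre_alignments, genome_length))


# Hypothetical 1000-bp reference with toy alignments.
genome_length = 1000
pre = deduplicate_by_start([(100, 150), (100, 150)])   # duplicate start site collapses
post = deduplicate_by_start([(0, 500), (400, 800)])    # overlapping reads, 800 bp covered

pre_boc = breadth_of_coverage(pre, genome_length)
post_boc = breadth_of_coverage(post, genome_length)
gain = boc_gain(pre, post, genome_length)
print(f"precapture BoC {pre_boc:.1f}%, post-capture BoC {post_boc:.1f}%, gain {gain:.1f}")
```

Because breadth counts each reference position once, deduplication changes the depth shown on the y-axis but not the breadth; the per-position boolean array is chosen for clarity rather than efficiency on large genomes.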
Results of metagenomic shotgun sequencing for individual specimens before and after viral targeted sequence capture
Figure 3. Targeted sequence capture identifies divergent sequences. (A) The percentage identity of the top high-scoring segment pair (HSP) identified from the BLAST alignment of anellovirus contig sequences to the references used to design ViroCap is plotted on the y-axis. The x-axis represents the percentage of the length of the anellovirus contig covered after targeted sequence capture. (B) This coverage plot represents the sequence coverage of a divergent anellovirus contig sequence. The figure is designed as described in the figure legend for Figure 2, with the following addition: The post-capture coverage plot is shaded to show regions of nucleotide sequence variation between the anellovirus contig and the most similar reference genome in the ViroCap panel. Dark shading represents areas of identical sequence, and each position with nucleotide mismatch between aligned sequences is shown in the lighter color. All of the HSPs are shown, rather than just the top HSP.
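To make the identity and mismatch shading in the Figure 3 legend concrete, the sketch below computes percentage identity and mismatch positions for a pair of already-aligned sequences. It is a simplified stand-in, not BLAST: it assumes two equal-length aligned strings with "-" marking gaps, and it treats variation as 100 minus percentage identity. The function names and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def mismatch_positions(aligned_a, aligned_b):
    """0-based alignment columns that differ (the positions drawn in the lighter color)."""
    return [i for i, (a, b) in enumerate(zip(aligned_a, aligned_b)) if a != b]


# Hypothetical aligned segment: contig versus its most similar reference.
contig = "ACGTACGTAC"
reference = "ACGTTCGAAC"

identity = percent_identity(contig, reference)
variation = 100.0 - identity
mismatches = mismatch_positions(contig, reference)
print(f"identity {identity:.1f}%, variation {variation:.1f}%, mismatches at {mismatches}")
```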