Geert Cremers, Lavinia Gambelli, Theo van Alen, Laura van Niftrik, Huub J M Op den Camp.
Abstract
With the emergence of Next Generation Sequencing, major advances were made with regard to identifying viruses in natural environments. However, bioinformatic research on viruses is still limited because of the low amounts of viral DNA that can be obtained for analysis. To overcome this limitation, DNA is often amplified with multiple displacement amplification (MDA), which may cause an unavoidable bias. Here, we describe a case study in which the virome of a bioreactor is sequenced using Ion Torrent technology. DNA-spiking of samples is compared with MDA-amplified samples. DNA for spiking was obtained by amplifying a bacterial 16S rRNA gene. After sequencing, the 16S rRNA gene reads were removed by mapping to the Silva database. Three samples were tested: a whole genome from Enterobacteria P1 phage and two viral metagenomes from an infected bioreactor. For one sample, the new DNA-spiking protocol was compared with the MDA technique. When MDA was applied, the overall GC content of the reads showed a bias towards lower GC%, indicating a change in composition of the DNA sample. Assemblies using all available reads from both the MDA and the DNA-spiked samples resulted in six viral genomes. All six genomes could be almost completely retrieved (97.9%–100%) when mapping the reads from the DNA-spiked sample to those six genomes. In contrast, only 6.3%–77.7% of three viral genomes was covered by reads obtained using the MDA amplification method, and only three were nearly fully covered (97.4%–100%). This case study shows that DNA-spiking could be a simple and inexpensive alternative with very low bias for sequencing of metagenomes for which low amounts of DNA are available.
Keywords: Bacteriophage; DNA spiking; Metagenome; Metavirome; Multiple displacement amplification; Virus
Year: 2018 PMID: 29441238 PMCID: PMC5807891 DOI: 10.7717/peerj.4351
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Overview of the parameters and results from sequencing, trimming and mapping to the six viral sequences and P1 phage.
| Sample | DNA (ng) | 16S (ng) | Shearing cycle; 1 min on/ 1 min off | # reads | Trimming settings | # Trimmed reads | Mapped reads to 16S | Remaining reads | Expected % of non 16S reads | Observed % non 16S reads | % mapped to #1–6 | % mapped to P1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| neg-16S | 0 | 43.5 | 6× | 58,989 | 25–375 bp | 58,616 | 58,293 | 323 | 0.00 | 0.55 | 0.00 | 0.00 |
| P1-spiked | 5 | 130 | 10× | 543,649 | 25–400 bp | 522,819 | 496,250 | 26,569 | 3.85 | 5.08 | 0.02 | 31.82 |
| P1-MDA | 100 | NA | 6× | 373,351 | <25 bp; 25–325 bp | 327,451 | 36 | 327,415 | 100 | 99.99 | 0.00 | 94.83 |
| DNA-spiked-GP | 7 | 50 | 6× | 4,760,807 | 25–400 bp; 25–350 bp | 4,636,949 | 4,557,290 | 79,659 | 14.00 | 1.72 | 31.31 | 0.01 |
| DNA-spiked-SP | 0.1 | 43.5 | 6× | 4,334,460 | 25–375 bp | 4,268,134 | 3,727,603 | 540,531 | 0.23 | 12.66 | 64.43 | 0.53 |
| 1 × MDA | 5.1 | NA | 6× | 797,971 | 25–325 bp; 25–400 bp and 15 bp on 5′ end | 770,366 | 3,811 | 766,555 | ∼99.9 | 99.51 | 1.34 | 0.15 |
| 2 × MDA | 100 | NA | 6× | 190,509 | 25–375 bp | 187,178 | 1,667 | 185,511 | ∼99.9 | 99.11 | 1.24 | 0.09 |
| neg-MDA | 100 | NA | 9× | 94,905 | 25–340 bp | 93,407 | 775 | 92,632 | 100 | 99.17 | 0.00 | 0.00 |
Notes.
P1 refers to phage P1; GP refers to DNA isolated using the general protocol; SP refers to DNA isolated using the specialized protocol.
Obtained from Qubit measurements.
Quality trimming = 0.05, ambiguous nucleotide limit = 2.
Mapping settings = local; 0.5 length; 0.9 similarity.
Mapping settings = local; 0.5 length; 0.95 similarity.
Only remaining reads used.
NA, not applicable.
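The two percentage columns for non-16S reads can be checked against the other columns of the table. The following is a minimal sketch, assuming that the observed percentage is the number of remaining reads divided by the number of trimmed reads, and that the expected percentage is the ratio of sample DNA to 16S spike DNA. Both formulas are inferred from the tabulated values rather than stated in the source, and the function names are hypothetical.

```python
def observed_non_16s_pct(trimmed_reads: int, mapped_to_16s: int) -> float:
    """Percentage of trimmed reads that did NOT map to 16S (assumed definition)."""
    if trimmed_reads <= 0:
        raise ValueError("trimmed_reads must be positive")
    remaining = trimmed_reads - mapped_to_16s
    return 100.0 * remaining / trimmed_reads


def expected_non_16s_pct(dna_ng: float, spike_ng: float) -> float:
    """Sample DNA relative to 16S spike DNA, in percent (assumed definition)."""
    if spike_ng <= 0:
        raise ValueError("spike_ng must be positive")
    return 100.0 * dna_ng / spike_ng


# Values taken from the P1-spiked row of the table.
p1_observed = round(observed_non_16s_pct(522_819, 496_250), 2)
p1_expected = round(expected_non_16s_pct(5, 130), 2)
print(p1_observed, p1_expected)
```

Under these assumptions the P1-spiked row gives the tabulated 5.08 (observed) and 3.85 (expected).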
Figure 1. Overview of the methods used to compare MDA and 16S ribosomal DNA spiking.
After DNA isolation, half of the sample was amplified using MDA, while the other half was spiked with 16S ribosomal DNA. Contigs and reads containing 16S ribosomal DNA were filtered out through mapping to the Silva database.
Figure 2. Coverage of the Enterobacteria P1 phage after sequencing using MDA amplification and 16S ribosomal DNA spiking.
Figure 3. Distribution of reads obtained from Ion Torrent sequencing using three different sample preparation methods based on their individual GC content in % of the total number of reads in one sample.
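A per-read GC distribution of this kind can be computed along the following lines. This is a minimal sketch rather than the authors' pipeline: the bin width of 5 percentage points, the handling of ambiguous bases, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def gc_percent(seq: str) -> float:
    """GC content of one read in percent, ignoring non-ACGT characters."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def gc_distribution(reads: Iterable[str], bin_width: int = 5) -> Dict[int, float]:
    """Map each GC bin (lower edge, in %) to the share of reads in it, in %."""
    counts: Counter = Counter()
    total = 0
    for read in reads:
        lower_edge = int(gc_percent(read) // bin_width) * bin_width
        # Reads with exactly 100% GC go into the last bin.
        lower_edge = min(lower_edge, 100 - bin_width)
        counts[lower_edge] += 1
        total += 1
    if total == 0:
        return {}
    return {edge: 100.0 * n / total for edge, n in sorted(counts.items())}


# Toy reads, for illustration only.
print(gc_distribution(["AAAA", "ATGC", "GGCC", "ATGC"]))
```

Expressing each bin as a share of the reads in its own sample makes samples with very different read counts comparable.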
Figure 4. Comparison of the viral reads from the individual datasets mapping to the six assembled virus genomes (Green, DNA-spiked sample; Blue, 1 × MDA; Red, 2 × MDA).
(A) Horizontal coverage of the viral genomes with reads from the individual datasets. (B) Distribution of the reads from the individual datasets over the assembled viral genomes. (C) Depth (vertical coverage) of the viral genomes with reads from the individual datasets as a measure of abundance.
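The difference between horizontal coverage (panel A) and depth (panel C) can be illustrated with a small calculation. This is a hedged sketch, assuming read alignments are available as 0-based, half-open `(start, end)` intervals on a genome of known length. The function name and input format are hypothetical and not taken from the source.

```python
from typing import Iterable, Tuple


def coverage_stats(genome_length: int,
                   alignments: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (horizontal coverage in %, mean depth) for one genome."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        start = max(0, start)
        end = min(genome_length, end)
        if start >= end:
            continue  # skip empty or out-of-range intervals
        diff[start] += 1
        diff[end] -= 1

    covered_positions = 0
    summed_depth = 0
    depth = 0
    for position in range(genome_length):
        depth += diff[position]  # running sum gives the depth at this position
        if depth > 0:
            covered_positions += 1
        summed_depth += depth

    horizontal = 100.0 * covered_positions / genome_length
    mean_depth = summed_depth / genome_length
    return horizontal, mean_depth


# Toy example: two overlapping reads on a 10 bp genome.
print(coverage_stats(10, [(0, 5), (3, 8)]))
```

A genome can have a high mean depth yet low horizontal coverage when reads pile up on a few regions, which is why the two measures are reported separately.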
Figure 5. Differential coverage of the viral contigs assembled using a combination of the DNA-spiked sample and the 1 × MDA sample with each individual read set.
Each circle represents a contig present after assembly and the placement in the plot shows the abundance of the contig for each read set. Two similar read sets would result in a diagonal straight line. GC content of the different contigs is indicated as depicted in the colour scale and the size of the bubble depicts the length of the contig. Three outliers caused by the MDA amplification method are not shown in the plot.
Overview of the 16S reads in DNA-spiked-SP, 1× MDA and 2× MDA in number of reads and percentage.
The percentages are colour-coded in a gradient from 0% (red) to 100% (green). The reads were identified by mapping to the SILVA 16S rRNA database v128. Mapping settings: DNA-spiked-SP (length 0.6, similarity 0.99); 1 × MDA and 2 × MDA (length 0.5, similarity 0.95). Ambiguous reads were removed from the set.
| Taxon | DNA-spiked-SP (#) | DNA-spiked-SP (%) | 1 × MDA (#) | 1 × MDA (%) | 2 × MDA (#) | 2 × MDA (%) |
|---|---|---|---|---|---|---|
| Rhizobiales | 61 | 18.9 | 1 | 1.1 | 0 | 0.0 |
| OPB41 | 2 | 0.6 | 0 | 0.0 | 0 | 0.0 |
| Actinomycetales | 3 | 0.9 | 0 | 0.0 | 0 | 0.0 |
| SAR11 | 6 | 1.9 | 5 | 5.4 | 1 | 3.0 |
| Latescibacteria | 7 | 2.2 | 0 | 0.0 | 0 | 0.0 |
| Clostridia | 25 | 7.8 | 0 | 0.0 | 0 | 0.0 |
| Saccharibacteria | 7 | 2.2 | 4 | 4.3 | 4 | 12.1 |
| Omnitrophica | 18 | 5.6 | 0 | 0.0 | 0 | 0.0 |
| Streptomycales | 12 | 3.7 | 0 | 0.0 | 0 | 0.0 |
| Microgenomates | 2 | 0.6 | 17 | 18.5 | 6 | 18.2 |
| WS6 | 0 | 0.0 | 53 | 57.6 | 15 | 45.5 |
| Woesearchaeota__DHVEG-6 | 0 | 0.0 | 1 | 1.1 | 4 | 12.1 |
| Parcubacteria | 1 | 0.3 | 1 | 1.1 | 0 | 0.0 |
| Lactobacillales | 0 | 0.0 | 2 | 2.2 | 0 | 0.0 |
| Staphylococcaceae | 1 | 0.3 | 2 | 2.2 | 0 | 0.0 |
| Bacillales | 0 | 0.0 | 5 | 5.4 | 0 | 0.0 |
| Chloroflexi | 15 | 4.7 | 0 | 0.0 | 0 | 0.0 |
| Pseudomonadales | 0 | 0.0 | 0 | 0.0 | 1 | 3.0 |
| Thaumarchaeota | 0 | 0.0 | 0 | 0.0 | 1 | 3.0 |
| Nitrospira | 13 | 4.0 | 0 | 0.0 | 0 | 0.0 |
| Actinobacteria | 73 | 22.7 | 0 | 0.0 | 0 | 0.0 |
| Nitrospina | 17 | 5.3 | 0 | 0.0 | 0 | 0.0 |
| Cyanobacteria | 0 | 0.0 | 1 | 1.1 | 1 | 3.0 |
| Other | 59 | 18.3 | 0 | 0.0 | 0 | 0.0 |
| Total read count | 322 | | 92 | | 33 | |