Sandra C Abel Nielsen, Christian A W Bruhn, Jose Alfredo Samaniego, Jemma Wadsworth, Nick J Knowles, M Thomas P Gilbert.
Abstract
Swine vesicular disease virus (SVDV) is an enterovirus that is both genetically and antigenically closely related to human coxsackievirus B5 within the Picornaviridae family. SVDV is the causative agent of a highly contagious (though rarely fatal) vesicular disease in pigs. We report a rapid method that is suitable for sequencing the complete protein-encoding sequences of SVDV isolates in which the RNA is relatively intact. The approach couples a single PCR amplification reaction, using only a single PCR primer set to amplify the near-complete SVDV genome, with deep sequencing that uses a small fraction of the capacity of a Roche GS FLX sequencing platform. Sequences were initially verified through one of two criteria: either a match between a de novo assembly and a reference mapping, or a match among all five reference mappings performed against a fixed set of starting reference genomes separated by significant genetic distances within the same species of viruses. All reference mappings used an iterative method to avoid bias. Further verification was achieved through phylogenetic analysis against published SVDV genomes and additional Enterovirus B sequences. This approach allows high confidence in the obtained consensus sequences and provides sufficiently high and evenly dispersed sequence coverage to allow future studies of intra-host variation.
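The iterative reference mapping mentioned in the abstract can be pictured as a fixed-point loop: map the reads, call a consensus, use that consensus as the next reference, and stop once it no longer changes. The following is a minimal sketch under stated assumptions, not the authors' pipeline. The names `iterate_mapping`, `remap`, and `toy_remap` are hypothetical, and `toy_remap` merely stands in for a real read mapper plus consensus caller.

```python
from typing import Callable, Tuple


def iterate_mapping(start_ref: str,
                    remap: Callable[[str], str],
                    max_rounds: int = 20) -> Tuple[str, int]:
    """Repeat map-and-call-consensus until the consensus stops changing.

    `remap` is a hypothetical callable that maps the reads against the
    given reference and returns the resulting consensus sequence.
    Returns the converged consensus and the number of rounds used.
    """
    reference = start_ref
    for round_no in range(1, max_rounds + 1):
        consensus = remap(reference)
        if consensus == reference:
            return consensus, round_no
        reference = consensus
    raise RuntimeError("mapping did not converge within max_rounds")


def toy_remap(reference: str, truth: str = "ACGTACGT") -> str:
    """Toy stand-in for a mapper: corrects one mismatch per round."""
    for i, (ref_base, true_base) in enumerate(zip(reference, truth)):
        if ref_base != true_base:
            return reference[:i] + true_base + reference[i + 1:]
    return reference


# Two different starting references should converge on the same consensus.
result_a, rounds_a = iterate_mapping("AAGTACGA", toy_remap)
result_b, rounds_b = iterate_mapping("TTTTTTTT", toy_remap)
```

In this toy setting, agreement between `result_a` and `result_b` plays the role of the convergence check across different starting references; a real mapper would not be expected to behave this simply.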
Year: 2014 PMID: 24816564 PMCID: PMC4016283 DOI: 10.1371/journal.pone.0097180
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Decision diagram for validation of consensus sequences.
The diagram illustrates the logic of the methodology applied to obtain consensus sequences validated for further analysis. The process begins with de novo assembly as described in the Methods section. From there, the starting point is the blue-outlined box in the top left-hand corner. Positive answers follow the green ‘YES’ arrows, negative answers follow the red ‘NO’ arrows, and grey arrows are followed in all cases. Termination in a red box should lead to a thorough analysis of upstream sources of error, ranging from contaminated samples to late-stage in silico problems. Arriving at the first green box means that the consensus sequence from assembly/mapping has been verified. Arriving at the final green box means that the sample sequence is fully validated, now also with regard to sample provenance, and is ready to be used for further scientific analysis.
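The decision logic can be approximated in a few lines of code. This is a simplified sketch based only on the criteria stated in the abstract and this caption, not a transcription of the diagram itself. The function name `validation_status`, the status labels, and the use of exact string equality as the definition of a "match" are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def validation_status(assembly: Optional[str],
                      mappings: Sequence[str],
                      phylogeny_ok: bool,
                      n_refs: int = 5) -> str:
    """Return 'investigate', 'verified', or 'validated' for one isolate.

    assembly     : de novo consensus, or None if assembly failed
    mappings     : consensus sequences from iterative reference mappings
    phylogeny_ok : outcome of the phylogenetic provenance check
    A 'match' is taken to be exact equality (an assumption).
    """
    if assembly is not None:
        # Criterion 1: the assembly agrees with a reference mapping.
        verified = any(m == assembly for m in mappings)
    else:
        # Criterion 2: all mappings from the fixed reference set agree.
        verified = len(mappings) == n_refs and len(set(mappings)) == 1

    if not verified:
        return "investigate"  # red box: look for upstream errors
    if not phylogeny_ok:
        return "verified"     # first green box only
    return "validated"        # final green box


status_assembled = validation_status("ACGT", ["ACGT"], phylogeny_ok=True)
status_mapped = validation_status(None, ["ACGT"] * 5, phylogeny_ok=False)
status_failed = validation_status(None, ["ACGT"] * 4 + ["ACGA"], phylogeny_ok=True)
```

How the actual diagram treats a verified sequence that fails the phylogenetic check is not stated in the caption; the sketch simply leaves such a sequence at the "verified" stage.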
Results of assemblies and mappings of five Hong Kong swine vesicular disease virus (SVDV) isolates.
| SVDV isolate | HKN/8/73 | HKN/18/74 | HKN/25/75 | HKN/1/77 | HKN/5/77 |
|---|---|---|---|---|---|
| Passage history | IB-RS-2 passage 1 | IB-RS-2 passage 1 | IB-RS-2 passage 1 | IB-RS-2 passage 2 | IB-RS-2 passage 2 |
| Harvest date | 13 May 1975 | 29 Oct 1974 | 24 March 1975 | 7 March 1977 | 7 March 1977 |
| Accession no. | KF963276 | KF963277 | KF963278 | KF963274 | KF963275 |
| Reads assigned | 4287 | 5972 | 7078 | 7095 | 9985 |
| Unique reads | 4003 | 5489 | 6515 | 6497 | 9190 |
| Read length of unique reads | | | | | |
| Mean | 164.2 | 167.9 | 176.1 | 170.7 | 135.2 |
| Std. dev. | 111.2 | 113.5 | 116.1 | 119.5 | 94.9 |
| Max. | 526 | 596 | 525 | 529 | 526 |
| De novo assembly | | | | | |
| Coverage | 3217 | 3660 | Coding region coverage not achieved | 6314 | Coding region coverage not achieved |
| Min. | 28 | 32 | | 84 | |
| Max. | 152 | 174 | | 339 | |
| Mean | 75.4 | 89.6 | | 153.3 | |
| Std. dev. | 20.8 | 24.7 | | 40.4 | |
| Iterative mapping, starting reference 1 | | | | | |
| Coverage | 3981 | 5449 | 6473 | 6463 | 9157 |
| Min. | 50 | 67 | 92 | 83 | 93 |
| Max. | 172 | 313 | 448 | 362 | 330 |
| Mean | 93.5 | 131.4 | 164.0 | 159.6 | 177.6 |
| Std. dev. | 18.8 | 43.4 | 57.2 | 52.0 | 40.4 |
| Iterative mapping, starting reference 2 | | | | | |
| Coverage | - | - | 6472 | - | 9153 |
| Min. | - | - | 92 | - | 92 |
| Max. | - | - | 448 | - | 330 |
| Mean | - | - | 164.0 | - | 177.6 |
| Std. dev. | - | - | 57.2 | - | 40.4 |
| Iterative mapping, starting reference 3 | | | | | |
| Coverage | - | - | 6473 | - | 9150 |
| Min. | - | - | 92 | - | 93 |
| Max. | - | - | 448 | - | 330 |
| Mean | - | - | 164.0 | - | 177.5 |
| Std. dev. | - | - | 57.2 | - | 40.3 |
| Iterative mapping, starting reference 4 | | | | | |
| Coverage | - | - | 6474 | - | 9151 |
| Min. | - | - | 92 | - | 93 |
| Max. | - | - | 448 | - | 330 |
| Mean | - | - | 164.0 | - | 177.5 |
| Std. dev. | - | - | 57.2 | - | 40.3 |
| Iterative mapping, starting reference 5 | | | | | |
| Coverage | - | 5446 | 6473 | - | 9151 |
| Min. | - | 67 | 92 | - | 93 |
| Max. | - | 313 | 448 | - | 330 |
| Mean | - | 131.4 | 164.0 | - | 177.5 |
| Std. dev. | - | 43.4 | 57.2 | - | 40.3 |
| ‘Polymorphic’ sites | 1 (1) | 2 (0) | 0 (-) | 0 (-) | 0 (-) |
The first row shows the names of the five isolates, their passage history, and harvest date. The second row shows the number of reads assigned to each isolate after separation by MID barcodes, and the number of these that had unique sequences (i.e. with duplicates removed), which were kept for assembly/mapping. Some reads are also lost during de-multiplexing; however, this number is even smaller than the total number of duplicates (not shown). Row three shows the mean, standard deviation, and maximum of the read lengths of the unique reads. The fourth row shows results from de novo assembly, including the number of reads mapped and the minimum, maximum, mean, and standard deviation of the depth of coverage of the contig produced. The following five rows show results from iterative mapping against the five selected starting reference sequences (two SVDV sequences, a coxsackievirus B5, a coxsackievirus B3, and a coxsackievirus B6 sequence). The three isolates with successful de novo assemblies were iteratively mapped against one or two of these sequences for validation (see Figure 1), whereas the remaining two isolates were iteratively mapped against all five starting references. For each of these two isolates, the final mapping statistics are virtually identical across all five starting references, clearly indicating that the iterative mapping process has converged. The final row shows the total number of ‘polymorphic’ sites obtained when the consensus sequence from assembly (if successful) and all of the mappings are multiple-aligned for each isolate (shown in parentheses is whether any of the ‘polymorphisms’ are due to base calls where a non-redundant call in one sequence matches the category of a redundant call in the other).
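The counting described for the final row can be sketched as follows. This assumes that "redundant" calls are IUPAC ambiguity codes and that a site counts as compatible when the sets of bases allowed by every call in the column share at least one base. The function name `polymorphic_sites` and the toy alignments are hypothetical and are not taken from the study data.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def polymorphic_sites(aligned):
    """Count differing alignment columns and how many are ambiguity-only.

    aligned : equal-length, already aligned consensus sequences
    Returns (total, compatible), where `compatible` counts columns whose
    calls differ only because an ambiguity code covers the other call(s).
    """
    total = 0
    compatible = 0
    for column in zip(*aligned):
        calls = {base.upper() for base in column}
        if len(calls) == 1:
            continue  # identical calls: not polymorphic
        total += 1
        # Unknown symbols (e.g. gaps) are treated as matching only themselves.
        allowed = [IUPAC.get(call, {call}) for call in calls]
        if set.intersection(*allowed):
            compatible += 1
    return total, compatible


ambiguity_only = polymorphic_sites(["ACGT", "ACRT", "ACGT"])
true_differences = polymorphic_sites(["ACGT", "ATGA"])
```

Here `ambiguity_only` describes an alignment with one differing column that is explained by an `R` (A or G) call, while `true_differences` describes two columns with incompatible calls.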
Figure 2. Phylogenetic validation of samples.
Final validation of the samples was obtained using a maximum likelihood approach, as described in Methods, on the 1B-1C-1D genome region, which corresponds to the outer capsid proteins VP2, VP3 and VP1. The five Hong Kong SVDVs that were isolated in the 1970s and sequenced in this study are shown in red, and all other SVD virus isolates are shown in pink. Coxsackievirus B5 isolates are shown in blue, and other Enterovirus B serotypes are shown in black. As expected from the previous literature, all CV-B5 together with all SVDV form a monophyletic cluster, supported by a bootstrap proportion of 100/100. Within this cluster, all the SVD virus isolates, including those sequenced in this study, form a monophyletic cluster with bootstrap support of 96/100. Additionally, none of the five Hong Kong isolates sequenced here shows any obvious aberration in its position in the topology with regard to either geographical information or branch lengths (vs. age). The tree is rooted on the branch leading from CV-B5 to all other CV-B sequences, and all branch labels show support as bootstrap proportions out of one hundred. Support values for nodes of minor importance to the verification are not shown.