Steven Sijmons1, Kim Thys2, Michaël Corthout1, Ellen Van Damme2, Marnix Van Loock2, Stefanie Bollen1, Sylvie Baguet1, Jeroen Aerssens2, Marc Van Ranst1, Piet Maes1.
Abstract
Human cytomegalovirus (HCMV) is a ubiquitous virus that can cause serious sequelae in immunocompromised patients and in the developing fetus. The coding capacity of the 235 kbp genome is still incompletely understood, and there is a pressing need to characterize the genomic content of clinical isolates. In this study, a procedure for the high-throughput generation of full genome consensus sequences from clinical HCMV isolates is presented. This method relies on a low number of passages of clinical isolates on human fibroblasts, followed by digestion of cellular DNA and purification of viral DNA. After multiple displacement amplification, highly pure viral DNA is generated. These extracts are suitable for high-throughput next-generation sequencing and assembly of consensus sequences. Through a series of validation experiments, we showed that the workflow reproducibly generated consensus sequences representative of the virus population present in the original clinical material. Additionally, the performance of 454 GS FLX and/or Illumina Genome Analyzer datasets in consensus sequence deduction was evaluated. Based on assembly performance data, the Illumina Genome Analyzer was the platform of choice in the presented workflow. Analysis of the consensus sequences derived in this study confirmed the presence of gene-disrupting mutations in clinical HCMV isolates independent of in vitro passaging. These mutations were identified in genes RL5A, UL1, UL9, UL111A and UL150. In conclusion, the presented workflow provides opportunities for high-throughput characterization of complete HCMV genomes that could deliver new insights into HCMV coding capacity and genetic determinants of viral tropism and pathogenicity.
Year: 2014 PMID: 24755734 PMCID: PMC3995935 DOI: 10.1371/journal.pone.0095501
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Schematic overview of the amplification, sequencing and analysis workflow.
UL and US denote the unique long and unique short regions of the genome; IRL and IRS denote the internal repeats.
Figure 2. Multiple Displacement Amplification (MDA) selectively amplifies viral but not cellular DNA.
Amounts of viral and cellular DNA were estimated using qPCR before and after amplification of the DNA extraction products using MDA (pre- and post-MDA). [A] shows the increase in the absolute amount of viral DNA (µg); [B] shows the relative increase of viral to cellular DNA (% viral DNA).
Mapping of 454 GS FLX and IGA reads to strain consensus sequences.
| Strain | GenBank accession | Isolate and/or passage number | # reads mapped | # reads unmapped | % reads mapped | qPCR sample purity (%) | Average read depth (454 GS FLX + IGA) |
| Merlin | NC_006273 | | 5,855,670 | 76,782 | 99 | 100 | 1306 (23+1283) |
| BE/9/2010 | KC519319 | p2 | 7,166,157 | 351,662 | 95 | 100 | 1611 (43+1568) |
| | | p5 | 8,934,863 | 226,933 | 98 | 100 | 1978 (19+1959) |
| | | p7 | 8,445,946 | 607,953 | 93 | 99 | 1879 (28+1851) |
| | | p11 | 6,781,195 | 1,542,530 | 81 | 74 | 1507 (22+1485) |
| BE/10/2010 | KC519320 | i1 p2 | 10,359,782 | 63,203 | 99 | 100 | 2262 (22+2240) |
| | | i2 p2 | 5,963,342 | 50,527 | 99 | 100 | 1314 (27+1287) |
| BE/11/2010 | KC519321 | p2 | 8,855,022 | 325,142 | 96 | 99 | 1971 (30+1941) |
| | | p5 | 9,205,907 | 470,107 | 95 | 100 | 2046 (26+2020) |
| | | p9 | 5,751,100 | 682,788 | 89 | 92 | 1275 (13+1262) |
| BE/21/2010 | KC519322 | up | 5,429,700 | 39,097,554 | 12 | 85 | 1077 (0+1077) |
| | | p4 | 6,008,424 | 209,938 | 97 | 84 | 1390 (64+1326) |
| BE/27/2010 | KC519323 | i1 p4 | 1,190,000 | 2,150,782 | 36 | 90 | 273 (14+259) |
| | | i2 p4 | 1,256,717 | 89,568 | 93 | 97 | 328 (44+284) |
i = isolate number.
p = passage number.
up = unpassaged.
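The derived columns of the mapping table can be recomputed from the raw counts. The sketch below assumes that "% reads mapped" is mapped reads divided by mapped plus unmapped reads, rounded to a whole percent, and that the average read depth is the sum of the 454 GS FLX and IGA depths; the function names are hypothetical and are not taken from the study.

```python
def percent_mapped(mapped: int, unmapped: int) -> int:
    """Share of reads that mapped, as a whole percent (assumed definition)."""
    total = mapped + unmapped
    if total == 0:
        raise ValueError("no reads to evaluate")
    return round(100 * mapped / total)


def combined_depth(depth_454: int, depth_iga: int) -> int:
    """Total average read depth, assumed to be the sum over both platforms."""
    return depth_454 + depth_iga


# (mapped, unmapped) read counts copied from three rows of the table.
rows = {
    "Merlin": (5_855_670, 76_782),
    "BE/9/2010 p11": (6_781_195, 1_542_530),
    "BE/21/2010 up": (5_429_700, 39_097_554),
}

for name, (mapped, unmapped) in rows.items():
    print(f"{name}: {percent_mapped(mapped, unmapped)}% reads mapped")

print("Merlin depth:", combined_depth(23, 1283))
```

Under these assumptions the three rows reproduce the tabulated values of 99, 81 and 12 percent, and the Merlin depth of 1306.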
Figure 3. Assembly performance using 454 GS FLX, IGA or both and freeware or commercial software suites.
Boxplots representing [A] the range of N50 contig lengths and [B] the number of gaps in contig coverage of consensus sequences after de novo assembly of 454 GS FLX, IGA or combined datasets, respectively. The central line in each box represents the median, the top and bottom represent the 75th and 25th percentiles, and the error bars represent minimum and maximum values. Median values are stated above each boxplot. Datasets (454 GS FLX and/or IGA) and software suites (CLC Genomics Workbench, MIRA, Velvet or Phrap combining MIRA and Velvet assemblies) are indicated below the plots. Since normality was violated, overall differences in N50 contig length and number of gaps were tested with the non-parametric Friedman test (n = 13; N50 contig length: χ²(5) = 42.506, p<0.001; gaps: χ²(5) = 37.275, p<0.001). Comparisons between assemblies based on different datasets were made using the Wilcoxon Signed Ranks Test with Bonferroni correction; p-values are reported in the figure. Because of the Bonferroni correction, differences are only significant when p<0.017.
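Two quantities in this caption can be illustrated with a short sketch: the N50 contig length and the Bonferroni-corrected significance threshold. The N50 definition used here is the common one (the length of the shortest contig in the smallest set of longest contigs covering at least half of the total assembly length). The contig lengths are invented for illustration. The choice of three pairwise comparisons and a base alpha of 0.05 is an assumption that happens to reproduce the stated threshold of 0.017; the caption does not state either value.

```python
def n50(contig_lengths):
    """N50 under the common definition: walk contigs from longest to
    shortest and return the length at which the running total first
    reaches half of the summed contig length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons


# Hypothetical contig lengths (bp); not data from the study.
example_contigs = [100, 80, 50, 30, 20]
print("N50:", n50(example_contigs))

# Assumed: alpha = 0.05 and three pairwise dataset comparisons.
print("threshold:", round(bonferroni_alpha(0.05, 3), 3))
```

For the example contigs the total length is 280, so the running sum first reaches 140 at the second-longest contig and the N50 is 80.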
Comparison of strain BE/21/2010 consensus sequences, derived directly from the clinical material (BE/21/2010 up) and after four cell culture passages (BE/21/2010 p4).
| Nucleotide position | Genome region | BE/21/2010 up | BE/21/2010 p4 | Length range in other HCMV strains |
| 6,055–63 | non-coding, UL | 9–10 C’s | 9–10 C’s | 7–12 |
| 96,658–81 | ncRNA4.9 | 23–24 T’s | 23–24 T’s | 7–24 |
| 99,184–207 | UL69 | 5–8 CGG’s | 8 CGG’s | 2–8 |
| 231,849–60 | non-coding, US | 9–13 G’s | 10–13 G’s | 8–15 |
| 232,207–20 | non-coding, US | 11–15 G’s | 11–15 G’s | 9–15 |
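The differences listed in this table are lengths of homopolymer and trinucleotide repeat tracts. As a minimal sketch of how such a tract length might be counted in a sequence, the hypothetical helper below returns the largest number of back-to-back copies of a repeat unit. It is an illustration only, not the method used in the study, and the example sequences are invented.

```python
def longest_repeat_run(sequence: str, unit: str) -> int:
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    step = len(unit)
    for start in range(len(sequence) - step + 1):
        copies = 0
        position = start
        # Count back-to-back copies of the unit beginning at `start`.
        while sequence.startswith(unit, position):
            copies += 1
            position += step
        best = max(best, copies)
    return best


# Invented sequences: a 10-base C tract and an 8-copy CGG tract.
homopolymer_example = "ACGT" + "C" * 10 + "GA"
trinucleotide_example = "TT" + "CGG" * 8 + "A"

print(longest_repeat_run(homopolymer_example, "C"))
print(longest_repeat_run(trinucleotide_example, "CGG"))
```

The scan restarts at every position, which is slower than skipping past each run but avoids missing a longer run that begins in a different phase; for tracts of a few dozen bases this cost is negligible.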
Gene-disrupting mutations in clinical HCMV strains.
| Strain | RL5A | UL1 | UL9 | UL111A | UL150 |
| BE/9/2010 | wt | wt | wt | wt | wt |
| BE/10/2010 | wt | wt | point mutation° | wt | wt |
| BE/11/2010 | 11 bp deletion°’ | several point mutations* | wt | wt | wt |
| BE/21/2010 | 17 bp deletion” | wt | point mutation | wt | 2 bp deletion |
| BE/27/2010 | 11 bp deletion°’ | several point mutations°* | point mutation° | 220 bp deletion° | wt |
| Other published full genome strains | JP, HAN13” | JHC | AF1 | JP, PH | CINCY+Towne |
wt = wild-type.
JP (GQ221975), HAN13 (GQ221973), JHC (HQ380895), AF1 (GU179291), PH (AC146904), CINCY+Towne (GU980198).
°Mutations verified by PCR amplification (Table S6) and Sanger sequencing of the viral gene in the original clinical material.
’, ”, * Identical mutations in unrelated strains.