Stephanie Goya, Laura E Valinotto, Estefania Tittarelli, Gabriel L Rojo, Mercedes S Nabaes Jodar, Alexander L Greninger, Jonathan J Zaiat, Marcelo A Marti, Alicia S Mistchenko, Mariana Viegas.
Abstract
Over the last decade, the number of viral genome sequences deposited in available databases has grown exponentially. However, sequencing methodologies vary widely, and many published works have relied on viral enrichment by viral culture or nucleic acid amplification with specific primers rather than on unbiased techniques such as metagenomics. The genomes of RNA viruses are highly variable, and these enrichment methodologies may be difficult to achieve or may bias the results. To obtain genomic sequences of human respiratory syncytial virus (HRSV) from positive nasopharyngeal aspirates (NPA), diverse methodologies were evaluated and compared. A total of 29 complete or nearly complete viral genomes were obtained. The best performance was achieved with DNase I treatment of the RNA extracted directly from the NPA, sequence-independent single-primer amplification (SISPA), and library preparation with the Nextera XT DNA Library Prep Kit with manual normalization. An average of 633,789 and 1,674,845 filtered reads per library were obtained with the MiSeq and NextSeq 500 platforms, respectively. The higher output of the NextSeq 500 was accompanied by an increase in the percentage of duplicated reads generated during SISPA (from an average of 1.5% duplicated viral reads on the MiSeq to an average of 74% on the NextSeq 500). HRSV genome recovery was not affected by the presence or absence of duplicated reads, but the computational demand of the analysis increased. Considering that only samples with viral load ≥ E+06 copies/ml NPA were tested, no correlation was observed between sample viral load and either the number of total filtered reads or the number of mapped viral reads. With the best methodology, the HRSV genomes showed a mean coverage of 98.46%. In addition, genomes of human metapneumovirus (HMPV), human rhinovirus (HRV) and human parainfluenza virus types 1-3 (HPIV1-3) were obtained with the selected optimal methodology.
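The per-platform read averages quoted in the abstract can be checked against the comparison table in this record. The following is a minimal sketch, not the authors' analysis code: the helper name `average_reads` is hypothetical, and the three values are the "Total filtered reads" entries of the MiSeq rows.

```python
from statistics import mean

# Total filtered reads for the three MiSeq libraries in the comparison table
# (HRSV/A/001 with methodology A; HRSV/B/002 with methodologies A and B).
miseq_filtered_reads = [469_660, 635_184, 796_524]


def average_reads(read_counts):
    """Return the arithmetic mean of per-library read counts, rounded to an integer.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    return round(mean(read_counts))


print(average_reads(miseq_filtered_reads))  # 633789
```

The result agrees with the MiSeq average reported in the abstract. The same calculation over the NextSeq 500 rows would be expected to give the second reported average.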
Year: 2018 PMID: 29940028 PMCID: PMC6016902 DOI: 10.1371/journal.pone.0199714
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Scheme of sample treatments and library preparation workflow.
Different treatments are combined in the methodologies which are denoted with the letters A to I.
Comparison of sequencing results applying different methodologies.
| Sample | Viral load (copies/ml of NPA) (a) | Methodology ID (b) | Total filtered reads (c) | % reads mapped to viral reference (d) | % duplicated reads | % genome coverage (e) | Average depth of coverage (f) | Min/Max depth of coverage | % reads aligned to host (g) | % reads aligned to rRNA (h) |
|---|---|---|---|---|---|---|---|---|---|---|
| MiSeq sequencing | | | | | | | | | | |
| HRSV/A/001 | 1.51E+08 | A | 469,660 | 1.27 | 2.37 | 99.7 | 37.60 | 0/134 | 78 | 10.44 |
| HRSV/B/002 | 4.13E+08 | A | 635,184 | 1.58 | 3.60 | 99.8 | 65.30 | 0/249 | 95 | 13.59 |
| HRSV/B/002 | 4.13E+08 | B | 796,524 | 1.50 | 3.30 | 99.8 | 84.70 | 0/297 | 96 | 15.64 |
| NextSeq 500 sequencing | | | | | | | | | | |
| HRSV/A/003 | 9.49E+06 | B | 1,378,092 | 5.62 | 91.21 | 96.5 | 49.20 | 0/227 | 58 | 20.11 |
| HRSV/B/004 | 9.91E+07 | B | 2,317,914 | 4.90 | 93.18 | 98.6 | 56.03 | 0/230 | 25 | 24.00 |
| HRSV/B/004 | 9.91E+07 | C | 1,899,144 | 10.26 | 93.63 | 99.7 | 89.30 | 0/234 | 13 | 31.01 |
| HRSV/B/004 | 9.91E+07 | D | 3,054,972 | 4.86 | 94.81 | 96.9 | 55.60 | 0/233 | 19 | 22.22 |
| HRSV/B/004 | 9.91E+07 | E | 3,967,890 | 6.81 | 96.56 | 97.2 | 64.60 | 0/234 | 11 | 30.74 |
| HRSV/B/005 | 9.38E+07 | F | 2,209,306 | 6.19 | 95.02 | 91 | 49.80 | 0/237 | 11 | 23.08 |
| HRSV/B/005 | 9.38E+07 | G | 2,121,050 | 21.66 | 97.29 | 99.8 | 90.50 | 0/239 | 18 | 24.64 |
| HRSV/B/006 | 2.38E+08 | C | 904,954 | 80.62 | 98.11 | 99.7 | 97.70 | 0/239 | 63 | 9.78 |
| HRSV/B/006 | 2.38E+08 | E | 4,571,192 | 79.42 | 99.51 | 99.6 | 124.20 | 0/240 | 61 | 8.87 |
| HRSV/B/006 | 2.38E+08 | H | 460,408 | 1.24 | 34.85 | 99.4 | 22.30 | 0/67 | 97 | 9.21 |
| HRSV/B/006 | 2.38E+08 | I | 2,391,816 | 1.02 | 65.84 | 99.8 | 50.40 | 0/119 | 95 | 9.59 |
| HRSV/A/007 | 2.40E+08 | C | 200,216 | 27.33 | 86.31 | 96.2 | 52.14 | 0/205 | 83 | 12.54 |
| HRSV/A/007 | 2.40E+08 | E | 353,266 | 30.28 | 91.75 | 94.36 | 59.05 | 0/223 | 80 | 12.77 |
| HRSV/A/007 | 2.40E+08 | H | 1,315,052 | 0.03 | 7.08 | 53.33 | 1.74 | 0/13 | 99 | 6.86 |
| HRSV/A/007 | 2.40E+08 | I | 1,914,840 | 0.03 | 13.51 | 51.72 | 2.19 | 0/35 | 98 | 14.04 |
| HRSV/B/008 | 2.32E+08 | C | 157,926 | 11.77 | 80.18 | 94.04 | 25.33 | 0/192 | 93 | 13.44 |
| HRSV/B/008 | 2.32E+08 | E | 216,860 | 12.22 | 86.10 | 88.75 | 23.57 | 0/189 | 91 | 13.26 |
| HRSV/B/008 | 2.32E+08 | H | 1,513,366 | 0.08 | 14.32 | 93.61 | 6.19 | 0/25 | 98 | 11.99 |
| HRSV/B/008 | 2.32E+08 | I | 4,150,692 | 0.07 | 30.27 | 93.96 | 10.99 | 0/52 | 97 | 12.62 |
| HRSV/A/009 | 4.53E+07 | C | 1,170,558 | 4.91 | 97.92 | 99.91 | 83.73 | 0/237 | 75 | 9.72 |
| HRSV/A/009 | 4.53E+07 | E | 617,392 | 41.37 | 96.44 | 97.14 | 61.29 | 0/237 | 74 | 8.95 |
| HRSV/A/009 | 4.53E+07 | H | 715,448 | 0.06 | 5.32 | 56.69 | 2.07 | 0/17 | 98 | 11.08 |
| HRSV/A/009 | 4.53E+07 | I | 1,563,020 | 0.05 | 11.93 | 79.85 | 3.58 | 0/22 | 97 | 11.43 |
| HRSV/A/010 | 3.51E+08 | C | 357,194 | 50.62 | 89.96 | 99.76 | 122.23 | 0/210 | 50 | 14.77 |
| HRSV/B/011 | 1.28E+08 | C | 478,062 | 62.88 | 95.64 | 99.93 | 93.60 | 0/239 | 67 | 17.42 |
| HMPV/001 | 1.31E+08 | C | 1,448,584 | 1.88 | 81.82 | 99.03 | 36.93 | 0/179 | 84 | 24.22 |
| HMPV/002 | 3.94E+07 | C | 121,464 | 0.75 | 15.27 | 86.82 | 5.11 | 0/41 | 49 | 9.56 |
| HPIV1/001 | 6.14E+07 | C | 981,338 | 0.99 | 77.13 | 62.86 | 15.02 | 0/139 | 93 | 80.00 |
| HPIV1/002 | 2.84E+09 | C | 1,066,666 | 68.06 | 97.29 | 99.9 | 143.09 | 0/236 | 49 | 15.20 |
| HPIV2/001 | 5.51E+10 | C | 1,838,012 | 82.36 | 98.41 | 100 | 178.92 | 2/230 | 47 | 14.00 |
| HPIV2/002 | 1.43E+08 | C | 518,934 | 18.02 | 90.01 | 96.67 | 64.79 | 0/181 | 36 | 10.34 |
| HPIV3/001 | 1.97E+08 | C | 528,358 | 35.63 | 93.92 | 98.12 | 81.60 | 0/211 | 48 | 10.09 |
| HPIV3/002 | 9.34E+08 | C | 9,293,908 | 95.37 | 99.70 | 99.98 | 204.30 | 0/241 | 44 | 0.30 |
| HRV/C/001 | 9.59E+07 | C | 1,146,828 | 19.67 | 95.62 | 98.21 | 123.76 | 0/185 | 28 | 9.09 |
(a) NPA: nasopharyngeal aspirate.
(b) Descriptions of methodologies A-I are given in the manuscript and Fig 1.
(c) Reads were filtered to retain those with quality scores greater than or equal to 30.
(d) For HRSV, the GenBank accession numbers of the reference sequences are KU950583 for subtype A and JX576745 for subtype B; for HMPV, KF530179; for HRV, GU219984; for HPIV, KF530212, MF077313 and KJ672618 for types 1, 2 and 3, respectively.
(e) Percentage relative to the length of the corresponding reference sequence.
(f) Average depth of coverage was calculated without duplicated reads.
(g) Human reference: GRCh38.p7.
(h) Human cytoplasmic and mitochondrial rRNA: NT_167214 and NC_012920.
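The coverage columns of the table can be illustrated with a small calculation over per-base depths. This is a minimal sketch under stated assumptions: the function name `coverage_summary` and the toy depth vector are hypothetical, the input is assumed to be one depth value per reference position computed after duplicate removal (footnote f), and a position is assumed to count as covered when its depth is at least 1, a threshold the record does not state.

```python
def coverage_summary(depths, min_depth=1):
    """Summarise per-base depths over a reference of length len(depths).

    depths    -- one integer per reference position (assumed deduplicated)
    min_depth -- assumed threshold for calling a position covered
    """
    if not depths:
        raise ValueError("depths must contain one value per reference position")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        # Percentage relative to the reference length (footnote e).
        "pct_genome_coverage": 100.0 * covered / len(depths),
        # Mean depth across every reference position, including zeros.
        "average_depth": sum(depths) / len(depths),
        "min_depth": min(depths),
        "max_depth": max(depths),
    }


# Toy 10-position "genome"; the values are illustrative only.
toy_depths = [0, 0, 3, 5, 8, 8, 6, 2, 0, 4]
summary = coverage_summary(toy_depths)
print(summary["pct_genome_coverage"])  # 70.0
print(summary["average_depth"])        # 3.6
```

In practice the depth vector would come from an alignment tool's per-position output. Only the arithmetic is shown here.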
Fig 2. Coverage profiles of HRSV obtained per sample per methodology.
Methodologies A-I are indicated at the left of each profile. The genome organization, with genes in green and their coding regions in yellow, is shown at the top of the figure. Genome regions with a depth of coverage greater than 4 are underlined in orange. GenBank accession numbers are indicated in brackets. Only 6 representative samples are shown; the other profiles are shown in S1 Fig.
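The regions highlighted in the coverage profiles, those with a depth of coverage greater than 4, can be derived from a per-base depth vector as contiguous intervals. The following is a hedged sketch rather than the procedure used to draw the figure: `regions_above` is a hypothetical helper, positions are assumed to be 1-based, and the depth values are toy data.

```python
def regions_above(depths, threshold=4):
    """Return 1-based inclusive (start, end) intervals where depth > threshold."""
    regions = []
    start = None  # start of the interval currently being extended, if any
    for pos, depth in enumerate(depths, start=1):
        if depth > threshold:
            if start is None:
                start = pos
        elif start is not None:
            # Depth dropped to the threshold or below: close the open interval.
            regions.append((start, pos - 1))
            start = None
    if start is not None:
        # The last interval runs to the end of the reference.
        regions.append((start, len(depths)))
    return regions


# Toy per-base depths; the values are illustrative only.
toy_depths = [0, 0, 3, 5, 8, 8, 6, 2, 0, 4, 9, 9]
print(regions_above(toy_depths))  # [(4, 7), (11, 12)]
```

A position with depth exactly 4 is excluded because the caption specifies a depth strictly greater than 4.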