Josep Gregori1, Juan I Esteban2, María Cubero3, Damir Garcia-Cehic3, Celia Perales4, Rosario Casillas5, Miguel Alvarez-Tejado6, Francisco Rodríguez-Frías7, Jaume Guardia2, Esteban Domingo4, Josep Quer2.
Abstract
We investigated the reliability and reproducibility of HCV viral quasispecies quantification by ultra-deep pyrosequencing (UDPS). The study was divided into two parts. First, by UDPS sequencing of clone-mix samples, we established the global noise level of UDPS and fine-tuned a data treatment workflow previously optimized for HBV sequence analysis. Second, we studied the reproducibility of the methodology by comparing 5 amplicons from two patient samples on three massive sequencing platforms (FLX+, FLX and Junior) after applying the error filters developed from the clonal/control study. After noise filtering of the UDPS results, the three replicates showed the same 12 polymorphic sites above 0.7%, with a mean CV of 4.86%. Two polymorphic sites below 0.6% were identified by two replicates and one replicate, respectively. A total of 25, 23 and 26 haplotypes were detected by GS-Junior, GS-FLX and GS-FLX+. The observed CVs for the normalized Shannon entropy (Sn), the mutation frequency (Mf), and the nucleotide diversity (Pi) were 1.46%, 3.96% and 3.78%. The mean absolute differences in the two patients (5 amplicons each), on the GS-FLX and GS-FLX+, were 1.46%, 3.96% and 3.78% for Sn, Mf and Pi. No false polymorphic site was observed above 0.5%. Our results indicate that UDPS is an optimal alternative to molecular cloning for quantitative study of HCV viral quasispecies populations, both in complexity and composition. We propose a UDPS data treatment workflow for amplicons from RNA viral quasispecies which, at a sequencing depth of at least 10,000 reads per strand, yields: sequences and frequencies of consensus haplotypes above 0.5% abundance with no erroneous mutations, with high confidence; resistant mutants present as minor variants at the 1% level, with high confidence that variants are not missed; and highly confident measures of quasispecies complexity.
Year: 2013 PMID: 24391758 PMCID: PMC3877031 DOI: 10.1371/journal.pone.0083361
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. UDPS data treatment workflow to obtain error-free HCV haplotypes as a sampling of the actual HCV viral quasispecies.
Figure 2. Observed substitution errors in the experiment with clones.
a) Forward strand. b) Reverse strand. The dominant G-to-T substitutions on fw (C-to-A on rv) are tied to in vitro transcription of the (+) strand in the plasmid, which is used as the HCV viral source.
PCR and UDPS errors.
| Lane | By columns F(.01)+Av+F(.25): point mutations, Err | By columns: point mutations, OK | By columns: haplotypes | By rows F(.10)+I+F(.25): point mutations, Err | By rows: point mutations, OK | By rows: haplotypes |
| Lane 1 | 2 | 2 | - | 2 | 2 | 5 |
| Lane 5 | 5 | 2 | - | 6 | 2 | 9 |
| Lane 9 | 0 | 2 | - | 0 | 2 | 3 |
| Lane 13 | 7 | 2 | - | 7 | 2 | 10 |
| Lane 2 | 12 | 0 | - | 12 | 0 | 13 |
| Lane 6 | 18 | 0 | - | 20 | 0 | 21 |
| Lane 10 | 2 | 0 | - | 0 | 0 | 1 |
| Lane 14 | 4 | 0 | - | 5 | 0 | 6 |
| Lane 3 | 10 | 0 | - | 13 | 0 | 14 |
| Lane 7 | 2 | 0 | - | 3 | 0 | 4 |
| Lane 11 | 13 | 0 | - | 13 | 0 | 14 |
| Lane 15 | 6 | 0 | - | 6 | 0 | 7 |
| Lane 4 | 11 | 0 | - | 10 | 0 | 11 |
| Lane 8 | 5 | 0 | - | 6 | 0 | 7 |
| Lane 12 | 12 | 0 | - | 15 | 0 | 16 |
| Lane 16 | 4 | 0 | - | 7 | 0 | 8 |
Experiment with clones. Point mutations and haplotypes found when cutting consensus point mutations (analysis by columns) or consensus haplotypes (analysis by rows) at 0.25% abundance, or at a cut-off of 0.50% abundance. Err indicates the number of false segregating sites, and OK the number of true segregating sites.
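The "by rows" label F(.10)+I+F(.25) can be read as an abundance filter at 0.10% on each strand, an intersection of the forward and reverse haplotypes, and a second filter at 0.25%. The following is a minimal sketch of that reading, not the authors' implementation. The function names, the toy read counts, and the choice to pool forward and reverse reads after the intersection are all assumptions made for illustration.

```python
def abundance_filter(counts, min_fraction):
    """Keep haplotypes whose share of the reads is at least min_fraction."""
    total = sum(counts.values())
    return {h: n for h, n in counts.items() if n / total >= min_fraction}


def strand_intersection(fw, rv):
    """Keep haplotypes seen on both strands and pool their read counts."""
    return {h: fw[h] + rv[h] for h in fw.keys() & rv.keys()}


def by_rows(fw_counts, rv_counts, pre=0.001, post=0.0025):
    """F(pre) on each strand, then I, then F(post) on the pooled counts."""
    fw = abundance_filter(fw_counts, pre)
    rv = abundance_filter(rv_counts, pre)
    return abundance_filter(strand_intersection(fw, rv), post)


# Hypothetical read counts per haplotype (not data from the study).
fw = {"AAAA": 9900, "AAAT": 95, "ACAA": 5}
rv = {"AAAA": 9888, "AAAT": 100, "AGAA": 12}
result = by_rows(fw, rv)
print(result)
```

In this toy case "ACAA" fails the per-strand filter and "AGAA" is seen on one strand only, so both are discarded as likely noise, while the two haplotypes confirmed on both strands survive.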
Number of reads per sample and strand at each data treatment step, and haplotype distribution at a consensus haplotype abundance >0.5%.
| Sample and strand | Identified | QF plus Correction | 0.1% Filter | Intersection | Intersection (fw+rv) | 0.5% Filter (fw+rv) | Observed haplotypes: M3 | M1∶M3 | M3∶M1 | Others | Nominal: M1 |
| 1 fw | 12582 | 11376 | 10880 | 10744 | 28895 | 27602 | 99,22 | 0,78 | 5 | ||
| 1 rv | 20853 | 19235 | 18214 | 18151 | |||||||
| 2 fw | 13514 | 12301 | 11598 | 11392 | 29964 | 27217 | 100 | 1 | |||
| 2 rv | 21869 | 19975 | 18805 | 18572 | |||||||
| 3 fw | 16494 | 15047 | 14177 | 13988 | 26063 | 23941 | 100 | 0,5 | |||
| 3 rv | 14184 | 12919 | 12251 | 12075 | |||||||
| 4 fw | 13017 | 11740 | 11088 | 10925 | 29272 | 26905 | 100 | 0,1 | |||
| 4 rv | 21533 | 19826 | 18649 | 18347 | |||||||
| 5 fw | 12450 | 11315 | 10746 | 10518 | 27109 | 25604 | 99,26 | 0,74 | 5 | ||
| 5 rv | 19250 | 17677 | 16786 | 16591 | |||||||
| 6 fw | 11130 | 9767 | 9222 | 9163 | 21046 | 18234 | 100 | 1 | |||
| 6 rv | 14269 | 12804 | 12069 | 11883 | |||||||
| 7 fw | 6332 | 5613 | 5420 | 5332 | 30574 | 29527 | 100 | 0,5 | |||
| 7 rv | 29004 | 27089 | 25556 | 25242 | |||||||
| 8 fw | 9084 | 8045 | 7791 | 7637 | 23609 | 22504 | 100 | 0,1 | |||
| 8 rv | 18723 | 17168 | 16213 | 15972 | |||||||
| 9 fw | 15022 | 13965 | 12918 | 12765 | 25262 | 24517 | 98,46 | 0,95 | 0,58 | 5 | |
| 9 rv | 14594 | 13443 | 12608 | 12497 | |||||||
| 10 fw | 12490 | 11480 | 10683 | 10381 | 24153 | 24012 | 100 | 1 | |||
| 10 rv | 16240 | 14900 | 13880 | 13772 | |||||||
| 11 fw | 14141 | 12825 | 12066 | 11811 | 25581 | 22875 | 100 | 0,5 | |||
| 11 rv | 16413 | 14840 | 13901 | 13770 | |||||||
| 12 fw | 13189 | 11822 | 11132 | 10951 | 24780 | 22194 | 100 | 0,1 | |||
| 12 rv | 16467 | 14955 | 14013 | 13829 | |||||||
| 13 fw | 7554 | 6516 | 6170 | 6066 | 20055 | 18714 | 98,06 | 1,02 | 0,92 | 5 | |
| 13 rv | 17048 | 15468 | 14490 | 13989 | |||||||
| 14 fw | 12429 | 11368 | 10750 | 10559 | 27509 | 25723 | 100 | 1 | |||
| 14 rv | 19862 | 18232 | 17172 | 16950 | |||||||
| 15 fw | 4894 | 4127 | 3973 | 3825 | 22323 | 20819 | 100 | 0,5 | |||
| 15 rv | 22561 | 20431 | 18884 | 18498 | |||||||
| 16 fw | 8198 | 7209 | 6793 | 6501 | 17714 | 16647 | 100 | 0,1 | |||
| 16 rv | 13472 | 12130 | 11341 | 11213 | |||||||
| Total reads | 478862 | 435608 | 410239 | 403909 | 377035 |
| Yield per step | | 91,0% | 94,2% | 98,5% | 93,3% |
| Cumulative yield | | 91,0% | 85,7% | 84,3% | 78,7% |
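The two percentage rows that close the read-count table are consistent with a per-step yield (reads kept relative to the previous step) and a cumulative yield (reads kept relative to the reads initially identified). A short sketch that recomputes them from the five totals in the table is given below; the dictionary keys follow the column headers, and the function name is an illustrative choice.

```python
# Total reads after each data treatment step, taken from the table above.
totals = {
    "Identified": 478862,
    "QF plus Correction": 435608,
    "0.1% Filter": 410239,
    "Intersection": 403909,
    "0.5% Filter": 377035,
}


def yields(step_totals):
    """Return (step, per-step yield %, cumulative yield %) for each step after the first."""
    names = list(step_totals)
    first = step_totals[names[0]]
    rows = []
    for prev, cur in zip(names, names[1:]):
        step = 100 * step_totals[cur] / step_totals[prev]
        cumulative = 100 * step_totals[cur] / first
        rows.append((cur, round(step, 1), round(cumulative, 1)))
    return rows


rows = yields(totals)
for name, step, cumulative in rows:
    print(f"{name}: {step}% of previous step, {cumulative}% of identified reads")
```

Read this way, about 78.7% of the identified reads remain after the final 0.5% filter.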
Reproducibility: Quasispecies complexity measures for a single amplicon replicated on three platforms, located in two laboratories.
| 3a. FCH 0.25% | Hpl | Eta | S | Mf×10³ | Sn | Pi×10³ |
| JR UCTS | 46 | 22 | 21 | 2,4977 | 0,6239 | 4,249 |
| FLX UCTS | 43 | 23 | 22 | 2,3547 | 0,6147 | 4,021 |
| FLX+CRAG | 47 | 21 | 20 | 2,5805 | 0,6339 | 4,360 |
| CV | | | | 4,61% | 1,54% | 4,11% |
Hpl stands for number of haplotypes, Eta for number of mutations, S for number of polymorphic sites, Mf for mutation frequency, Sn for normalized Shannon entropy and Pi for nucleotide diversity.
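The CV row of this table can be recomputed from the three platform values of each measure. The sketch below assumes the coefficient of variation is the sample standard deviation (n-1 denominator) divided by the mean; with that assumption the table's Mf, Sn and Pi values are reproduced. Decimal commas from the table are written as decimal points in the code.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100 * stdev(values) / mean(values)


# Values for JR UCTS, FLX UCTS and FLX+ CRAG, copied from the table above.
measures = {
    "Mf": [2.4977, 2.3547, 2.5805],
    "Sn": [0.6239, 0.6147, 0.6339],
    "Pi": [4.249, 4.021, 4.360],
}

cvs = {name: round(cv_percent(values), 2) for name, values in measures.items()}
print(cvs)
```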
Reproducibility: Comparison of mutant abundance at each polymorphic site for a single amplicon replicated on three platforms, located in two laboratories when considering all CH>0.5%.
| pos | UCTS Jr | UCTS FLX | CRAG FLX+ | Mean | SD | CV |
| 7223 | 25,89 | 25,41 | 25,82 | 25,71 | 0,2593 | 1,01% |
| 7598 | 14,61 | 13,63 | 14,99 | 14,41 | 0,7017 | 4,87% |
| 7484 | 11,95 | 11,06 | 12,22 | 11,74 | 0,6070 | 5,17% |
| 7346 | 11,47 | 11,37 | 11,66 | 11,50 | 0,1476 | 1,28% |
| 7532 | 10,30 | 9,15 | 10,56 | 10,00 | 0,7504 | 7,50% |
| 7303 | 3,62 | 3,71 | 3,78 | 3,70 | 0,0802 | 2,17% |
| 7229 | 2,08 | 1,95 | 2,09 | 2,04 | 0,0781 | 3,83% |
| 7370 | 1,24 | 1,14 | 1,27 | 1,22 | 0,0713 | 5,86% |
| 7455 | 1,24 | 1,14 | 1,27 | 1,22 | 0,0681 | 5,59% |
| 7442 | 1,19 | 0,92 | 1,16 | 1,09 | 0,1480 | 13,58% |
| 7459 | 0,73 | 0,74 | 0,83 | 0,77 | 0,0551 | 7,18% |
| 7579 | 0,73 | 0,74 | 0,83 | 0,77 | 0,0551 | 7,18% |
| 7340 | 0,59 | 0,62 | 0,60 | 0,60 | 0,0169 | 2,82% |
| 7475 | 0,58 | 0,58 | 0,58 | |||
| 7246 | 0,59 | 0,59 | ||||
| Mean | 4,86% | |||||
| Min | 0% | |||||
| Max | 13,58% |
Reproducibility: Measures of quasispecies complexity for ten amplicons replicated on GS-FLX and GS-FLX+ located in two different laboratories.
| FLX @ UCTS | Amplicon | Hpl | Eta | S | Mf·10³ | Sn | Pi·10³ |
| Sample 1 | Ampl 1 | 31 | 25 | 25 | 2,1118 | 0,5807 | 4,019 |
| | Ampl 2 | 24 | 22 | 22 | 1,1382 | 0,5069 | 2,062 |
| | Ampl 3 | 22 | 16 | 15 | 1,9863 | 0,5467 | 3,626 |
| | Ampl 4 | 24 | 16 | 15 | 1,9620 | 0,5130 | 3,539 |
| | Ampl 5 | 23 | 13 | 13 | 2,0445 | 0,6315 | 3,493 |
| Sample 2 | Ampl 1 | 25 | 20 | 20 | 2,9966 | 0,6095 | 4,064 |
| | Ampl 2 | 20 | 19 | 19 | 0,7650 | 0,4703 | 1,492 |
| | Ampl 3 | 16 | 14 | 14 | 2,1070 | 0,4955 | 2,772 |
| | Ampl 4 | 16 | 15 | 15 | 0,7122 | 0,3541 | 1,375 |
| | Ampl 5 | 20 | 13 | 12 | 2,0502 | 0,7163 | 3,263 |
Data from CH>0.5%.
Hpl haplotypes, Eta number of mutations, S number of polymorphic sites, Mf mutation frequency, Sn normalized Shannon entropy, Pi nucleotide diversity.
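To make the three complexity measures concrete, the sketch below computes Sn, Mf and Pi from a set of aligned haplotypes with read counts. The definitions are common textbook forms and are assumptions here, since this record does not state the exact formulas: Sn is the Shannon entropy divided by the logarithm of the number of haplotypes, Mf is the abundance-weighted fraction of positions that differ from the dominant haplotype, and Pi is the expected number of differences per site between two reads drawn at random, without a finite-sample correction. The function names and the toy data are illustrative.

```python
import math
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def complexity(haplotypes):
    """haplotypes: dict mapping an aligned sequence to its read count."""
    seqs = list(haplotypes)
    total = sum(haplotypes.values())
    p = [haplotypes[s] / total for s in seqs]
    length = len(seqs[0])
    master = max(seqs, key=haplotypes.get)  # dominant haplotype

    # Assumed normalisation: entropy divided by ln(number of haplotypes).
    entropy = -sum(q * math.log(q) for q in p)
    sn = entropy / math.log(len(seqs)) if len(seqs) > 1 else 0.0

    # Assumed Mf: weighted fraction of positions differing from the master.
    mf = sum(q * hamming(s, master) for q, s in zip(p, seqs)) / length

    # Assumed Pi: mean pairwise differences per site, no n/(n-1) correction.
    pi = sum(
        2 * p[i] * p[j] * hamming(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    ) / length

    return {"Sn": sn, "Mf": mf, "Pi": pi}


# Hypothetical two-haplotype population (not data from the study).
toy = {"AAAAAAAAAA": 90, "AAAAAAAAAT": 10}
out = complexity(toy)
print(out)
```

Other normalisations of the entropy (for example by the logarithm of the number of reads) are also in use, so values computed this way are not expected to match the tables exactly.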
Figure 3. Reproducibility: Haplotype abundance relative difference vs haplotype mean abundance, and box plot of relative differences.
Ten samples in FLX and FLX+. Dot color identifies the amplicon to which the haplotype belongs.