Baptiste Monsion, Hervé Duborjal, Stéphane Blanc.
Abstract
BACKGROUND: Pathogens such as fungi, bacteria and especially viruses are highly variable even within an individual host, which makes it difficult to distinguish and accurately quantify the numerous allelic variants that co-exist in a single nucleic acid sample. Most currently available techniques are based on real-time PCR or primer extension and often require multiplexing adjustments that impose a practical limit on the number of alleles that can be monitored simultaneously at a single locus.
Year: 2008 PMID: 18291029 PMCID: PMC2276495 DOI: 10.1186/1471-2164-9-85
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Sequence signatures identify different markers in a mixed DNA solution. (A) Sequences of the six genetic markers pCa-VIT1 to -6. The blue A on the left is the last common adenine shared by all pCa-VIT1 to -6 plasmids upstream of the polymorphic region. Adenine residues marked in green are theoretically expected to yield discriminating peaks; other A residues are indicated in black. The expected position (in bp) of each base from the fluorescent sequencing primer is indicated at the top. (B) Experimentally observed positions of discriminating A peaks (named VIT1.1 to VIT6.2 at the top) on sequence traces (not shown) from individually sequenced markers. Sequences of all six markers are slightly distorted in order to match the actual position of A peaks observed on individual electropherograms. The red A residues were each confirmed to yield a discriminating peak. Though not theoretically predicted to do so, the red-boxed As proved to yield discriminating peaks experimentally. The remaining green As failed to produce discriminating peaks. Sequencing reactions of pCa-VIT1 to -6 were repeated three times, and the observed position of all peaks was highly reproducible: ±0.2 bp among repeats. At the bottom, the number in blue is the average position of the last A common to all sequences upstream of the differential markers. The numbers in red are the average positions of the observed discriminating peaks. (C) Single-letter sequencing electropherogram from a mixed DNA solution containing all six markers. A mix of all six pCa-VIT1 to -6 plasmids in equal amounts was used as a template for PCR and subsequent single-letter sequencing. All discriminating peaks defined in B are indicated by arrows on the electropherogram. The shaded blue peak corresponds to the last common A. The scale at the top indicates the nucleotide position relative to that of the sequencing primer. The scale on the left is used for scoring the height of the peaks in arbitrary units provided by the STRand program. The two purple peaks correspond to molecular weight markers. # These peaks are artefacts. ‡ Though appearing to be discriminating, these peaks actually overlap small artefactual peaks observed on at least one electropherogram from individually sequenced markers (not shown). They are thus not used further.
Figure 2. Standard curves for converting the intensity of discriminating peaks into the frequency of the corresponding markers. Standard curves were established for all 14 discriminating peaks shown in Figure 1. Only those for VIT2.1 and VIT4.2 are shown here, as they illustrate the worst and the best fit, respectively, between the recorded data and the polynomial regression function calculated using Excel (correlation coefficients, R², are shown). Vertical bars represent the standard deviation among repeats.
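The standard-curve step can be sketched as a least-squares polynomial fit followed by evaluation of the fitted curve. This is only an illustration under stated assumptions: the polynomial degree (2), the choice of a relative peak height as the predictor, and the calibration points are all hypothetical, since the caption gives neither the degree nor the underlying data. The calibration points below are synthetic, generated from a known quadratic so that the fit can be checked.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c such that y ~ c[0] + c[1]*x + c[2]*x**2 + ...
    """
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i)
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Synthetic calibration points (NOT data from the paper): hypothetical relative
# peak heights and the marker frequencies (%) produced by a known quadratic.
true_coeffs = [0.0, 30.0, 70.0]
heights = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
frequencies = [evaluate(true_coeffs, h) for h in heights]

coeffs = fit_polynomial(heights, frequencies, degree=2)
estimate = evaluate(coeffs, 0.5)  # frequency estimate for a new peak height
```

With real calibration data, one such curve would be fitted per discriminating peak, and the quality of each fit would be judged by its R² as in the figure.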
Table 1. Composition of plasmid DNA mixes
| 1 | 75 | 25 | ||||
| 2 | 75 | 25 | ||||
| 3 | 25 | 75 | ||||
| 4 | 50 | 25 | 25 | |||
| 5 | 25 | 25 | 50 | |||
| 6 | 25 | 50 | 25 | |||
| 7 | 25 | 25 | 50 | |||
| 8 | 25 | 25 | 50 | |||
| 9 | 16.67 | 16.67 | 16.67 | 16.67 | 16.67 | 16.67 |
| 10 | 33.33 | 33.33 | 33.33 | |||
| 11 | 33.33 | 33.33 | 33.33 | |||
| 12 | 40 | 60 | ||||
| 13 | 40 | 60 | ||||
| 14 | 60 | 40 | ||||
Different mixes containing some or all pCa-VIT plasmids in the proportions indicated.
Table 2. Accuracy of QSS on mixtures of plasmid DNA before corrections
| 1 | 74.8 | 22.2 | 97 | ||||
| 2 | 78.02 | 28.5 | 106.52 | ||||
| 3 | 19.4 | 71.1 | 90.5 | ||||
| 4 | 50 | 23.5 | 29.6 | 103.1 | |||
| 5 | 27.3 | 29.1 | 54.6 | 111 | |||
| 6 | 27.8 | 52.9 | 28.1 | 108.8 | |||
| 7 | 23.3 | 23.4 | 51.7 | 98.4 | |||
| 8 | 22.7 | 21.6 | 46.2 | 90.5 | |||
| 9 | 17 | 16.9 | 16.6 | 16.8 | 16.7 | 16.5 | 100.5 |
| 10 | 30.3 | 32.1 | 34.6 | 97 | |||
| 11 | 36.4 | 40.1 | 33.5 | 110 | |||
| 12 | 36.5 | 59.8 | 96.3 | ||||
| 13 | 39.2 | 58.3 | 97.5 | ||||
| 14 | 66.8 | 53.1 | 119.9 | ||||
Marker frequency quantification, obtained from the standard curves shown in Figure 2. The sum of frequencies in each mix is shown on the right.
Table 3. Accuracy of QSS on mixtures of plasmid DNA after final corrections
| 1 | 77.1 | 22.9 | 100 | ||||
| 2 | 73.3 | 26.7 | 100 | ||||
| 3 | 21.4 | 78.6 | 100 | ||||
| 4 | 48.5 | 22.8 | 28.7 | 100 | |||
| 5 | 24.6 | 26.2 | 49.2 | 100 | |||
| 6 | 25.6 | 48.6 | 25.8 | 100 | |||
| 7 | 23.7 | 23.7 | 52.6 | 100 | |||
| 8 | 25.1 | 23.9 | 51 | 100 | |||
| 9 | 16.9 | 16.8 | 16.6 | 16.7 | 16.6 | 16.4 | 100 |
| 10 | 31.2 | 33.1 | 35.7 | 100 | |||
| 11 | 33.1 | 36.4 | 30.5 | 100 | |||
| 12 | 37.9 | 62.1 | 100 | ||||
| 13 | 40.2 | 59.8 | 100 | ||||
| 14 | 55.7 | 44.3 | 100 | ||||
Frequency values after proportional correction to give a total of 100% in each mix.
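The proportional correction can be sketched as a simple rescaling of the raw estimates by their sum. The function name is hypothetical; the input values are the uncorrected estimates for mixes 1, 12 and 14 from the table of results before corrections.

```python
def proportional_correction(raw_frequencies):
    """Rescale raw frequency estimates (%) so that they sum to 100%."""
    total = sum(raw_frequencies)
    if total <= 0:
        raise ValueError("sum of raw frequencies must be positive")
    return [100.0 * f / total for f in raw_frequencies]


# Raw estimates (%) before correction for three of the plasmid mixes
raw_mixes = {
    1: [74.8, 22.2],    # raw sum 97.0
    12: [36.5, 59.8],   # raw sum 96.3
    14: [66.8, 53.1],   # raw sum 119.9
}

corrected = {
    mix: [round(f, 1) for f in proportional_correction(raw)]
    for mix, raw in raw_mixes.items()
}

for mix, values in corrected.items():
    print(mix, values)
# 1 [77.1, 22.9]
# 12 [37.9, 62.1]
# 14 [55.7, 44.3]
```

These rescaled values agree with the corrected frequencies reported for the same mixes.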
Figure 3. Accuracy of QSS measurements for the six genetic markers. Observed values are plotted against expected values for all measurements summarized in Table 3, plus four independent repeats of the analysis of mixes 9 to 14. For each VIT marker, a linear regression function was deduced (colored lines). The near-perfect R² correlation coefficients obtained in each case illustrate the high degree of accuracy of QSS.
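The regression of observed on expected frequencies can be illustrated with an ordinary least-squares fit. This is a sketch, not the authors' analysis: the figure fits one line per marker, whereas the example below pools the expected and corrected values of mixes 1, 12 and 14 only, because the marker assignment of each column is not recoverable from the flattened tables. The function name is hypothetical.

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Expected (%) and observed, corrected (%) frequencies for mixes 1, 12 and 14,
# pooled across markers (an assumption made for this illustration only)
expected = [75, 25, 40, 60, 60, 40]
observed = [77.1, 22.9, 37.9, 62.1, 55.7, 44.3]

slope, intercept, r_squared = linear_regression(expected, observed)
print(f"slope={slope:.3f} intercept={intercept:.2f} R2={r_squared:.3f}")
```

A slope close to 1, an intercept close to 0 and an R² close to 1 indicate that observed frequencies track the expected ones; this small pooled subset should not be expected to reproduce the per-marker values in the figure.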
Table 4. Reproducibility of QSS on mixtures of plasmid DNA
| Replicate | VIT1 | VIT2 | VIT3 | VIT4 | VIT5 | VIT6 | Sum |
| 1 | 16.92 | 16.78 | 16.55 | 16.72 | 16.59 | 16.43 | 100 |
| 2 | 17.45 | 16.99 | 17.14 | 16.12 | 16.19 | 16.12 | 100 |
| 3 | 17.17 | 16.58 | 16.81 | 16.74 | 16.79 | 15.92 | 100 |
| 4 | 17.06 | 16.47 | 16.32 | 16.68 | 17.33 | 16.14 | 100 |
| 5 | 17.19 | 16.36 | 17.03 | 16.60 | 17.21 | 15.62 | 100 |
| Mean | 17.16 | 16.63 | 16.77 | 16.57 | 16.82 | 16.05 | 100 |
| SD | 0.195 | 0.251 | 0.336 | 0.261 | 0.465 | 0.301 | |
Reproducibility of QSS as evaluated by the standard deviation (SD) on 5 replications of the analysis of Mix n°9. Proportional correction of the frequency of all markers was applied to give a total of 100% in each mix.
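The reproducibility figures can be recomputed from the five replicates as a per-column standard deviation. The sketch below assumes the sample standard deviation (n − 1 denominator), which is not stated in the caption, and it works from the rounded values printed in the table, so small differences from the reported SD row are expected.

```python
import statistics

# Five replicate analyses of mix 9 (rows), six marker columns (values in %)
replicates = [
    [16.92, 16.78, 16.55, 16.72, 16.59, 16.43],
    [17.45, 16.99, 17.14, 16.12, 16.19, 16.12],
    [17.17, 16.58, 16.81, 16.74, 16.79, 15.92],
    [17.06, 16.47, 16.32, 16.68, 17.33, 16.14],
    [17.19, 16.36, 17.03, 16.60, 17.21, 15.62],
]

# SD row as reported in the table
reported_sd = [0.195, 0.251, 0.336, 0.261, 0.465, 0.301]

# Transpose to get one list of replicate values per marker column
columns = list(zip(*replicates))
means = [statistics.mean(col) for col in columns]
sds = [statistics.stdev(col) for col in columns]  # sample SD, n - 1 denominator

for mean, sd, reported in zip(means, sds, reported_sd):
    print(f"mean={mean:.2f} sd={sd:.3f} reported={reported:.3f}")
```

Under this assumption the recomputed SDs agree with the reported ones to within about 0.003, which is consistent with the reported values having been calculated from unrounded frequencies.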
Table 5. Accuracy and reliability of QSS on viral DNA extracted from an infected plant
| 42 | 40 | 42 | 41 | 41 | 41 | 40 | 41 | 41 | 44 | 42 | 44 | 44 | 43 | 43 | 42 | 43 | 42 | 43 | 42.0 | 1.3 | |
| 5.4 | 5.6 | 5.5 | 5.8 | 5.9 | 6.6 | 6.0 | 5.8 | 6.1 | 4.9 | 5.6 | 5.2 | 5.2 | 5.1 | 5.3 | 6.6 | 5.8 | 5.3 | 5.4 | 5.6 | 0.5 | |
| 39 | 36 | 40 | 38 | 37 | 39 | 39 | 40 | 39 | 41 | 39 | 41 | 42 | 42 | 42 | 38 | 40 | 42 | 42 | 39.9 | 1.8 | |
| 11 | 12 | 11 | 11 | 12 | 11 | 11 | 11 | 11 | 9 | 11 | 9 | 8 | 9 | 9 | 11 | 9 | 9 | 9 | 10.3 | 1.2 | |
| 1.4 | 3.3 | 1.1 | 1.9 | 2.5 | 1.3 | 2.2 | 1.3 | 1.2 | 1.1 | 1.4 | 0.6 | ND | 0.5 | 0.6 | 2.8 | 1.4 | 1.1 | 1.1 | 1.5 | 0.7 | |
| 1.1 | 2.9 | 1.0 | 1.9 | 1.7 | 1.4 | 2.1 | 1.2 | 1.0 | ND | 0.7 | ND | ND | ND | ND | ND | ND | ND | ND | 1.5 | 0.6 | |
Viral DNA was extracted from a plant infected with the CaMV Mix6VIT, and processed independently 19 times for PCR amplification and QSS analysis. Average final estimates (proportionally corrected to give a total of 100% in each case) and standard deviation (SD) among repeats are shown on the right.
Peaks yielded by the low-frequency markers VIT5 and VIT6, when identified, emerged just above the baseline of the sequence traces. As indicated in the text, the corresponding frequency estimates reported in this table fall below the threshold of QSS accuracy, as confirmed by the high SD associated with VIT5 and VIT6, and by the numerous repeats where estimates could not be obtained (ND) because the corresponding peaks were not clearly distinguishable on the electropherogram.
Table 6. Sum of the frequency estimates of all the markers before final correction
| 125 | 137 | 120 | 128 | 129 | 123 | 125 | 122 | 122 | 116 | 121 | 111 | 110 | 111 | 110 | 117 | 110 | 112 | 111 | |
The sum of the frequency estimates of all markers before final correction is given here as an indicator of the quality of the sequence traces. Two subsets of differing quality can be distinguished (see text): replicates 1–11 (sums range from 116 to 137%) and replicates 12–19 (sums range from 111 to 117%).
Table 7. Robustness of QSS on samples of sub-optimal quality
| 41.2 | 43.0 | 1.1 | 0.7 | |
| 5.8 | 5.5 | 0.4 | 0.5 | |
| 38.8 | 41.3 | 1.3 | 1.3 | |
| 11.1 | 9.2 | 0.7 | 0.6 | |
| 1.7 | 1.0 | 0.7 | 0.8 | |
| 1.5 | ND | 0.6 | ND | |
Mean frequency estimates and SD for all markers within the two subsets shown in Table 6.
Table 8. Compared performance of QSS and other available methods
| Method | Reproducibility: MSD (%) a | Accuracy: MD b | Accuracy: R² b |
| QSS | 0.281 to 0.950 | 1.255 | 0.984 to 0.999 |
| BAMPER [38] | 3.8* | ND | 0.9999 |
| Micro-Array [39, 40] | 3.5 to 4.1 | 2.4 | 0.971 to 0.9921 |
| Micro-Satellite [41] | ND | ND | 0.97 |
| Mass Spectrometry [29] | 1.55 | ND | ND |
| PE+DHPLC [42] | 1.4 | 1.2 | 0.977 |
| Pyrosequencing [29, 43, 44] | 0.07* to 1.9 | ND | 0.979 to 0.996 |
| Quantitative Sequencing [28] | 4.2 | 1.44 | ND |
| RFLP [29] | 2.8 | ND | ND |
| RFMP [25] | ND | ND | 0.992 |
| Single base extension [26, 29] | 0.27 to 1.75 | 1.5 to 2.15 | ND |
| SYBR Green [28, 45] | 1.65 to 6.47 | 1 to 1.12 | 0.997 |
| TaqMan Probe [28, 29, 46] | 0.75 to 3.18 | 1.47 | 0.9984 |
a Reproducibility is evaluated by the median of standard deviations (MSD) among repeats on different variants, and is expressed as a frequency in %. For QSS, the overall MSD is 0.401%; the values shown here, 0.281% and 0.950%, are calculated from Table 4 (plasmid mix) and Table 5 (viral DNA population extracted from plant), respectively.
b Accuracy is estimated i) by the median deviation (MD), which is the median of all differences between observed and expected allele frequencies, and ii) by the correlation coefficient between observed and expected frequencies (R²).
ND: not determined.
(*) These values refer to standard deviation (SD), because the data available in the literature do not allow calculation of the MSD.
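The two summary statistics defined in the footnotes can be sketched as follows. The MSD inputs are the SD values reported for the five plasmid-mix replicates and for the 19 viral DNA replicates. For the MD, taking absolute differences is an assumption (the footnote says only "differences"), and the example uses mixes 1, 12 and 14 alone, so it is not expected to reproduce the MD reported for QSS.

```python
import statistics

# Reported per-marker SDs (%) among repeats
sd_plasmid_mix = [0.195, 0.251, 0.336, 0.261, 0.465, 0.301]
sd_viral_dna = [1.3, 0.5, 1.8, 1.2, 0.7, 0.6]

# Median of standard deviations (MSD) for each data set
msd_plasmid = statistics.median(sd_plasmid_mix)
msd_viral = statistics.median(sd_viral_dna)
print(round(msd_plasmid, 3), round(msd_viral, 3))  # 0.281 0.95


def median_deviation(observed, expected):
    """Median of absolute differences between observed and expected frequencies.

    Using absolute values is an assumption made for this sketch.
    """
    return statistics.median(abs(o - e) for o, e in zip(observed, expected))


# Expected (%) and corrected observed (%) frequencies for mixes 1, 12 and 14
expected = [75, 25, 40, 60, 60, 40]
observed = [77.1, 22.9, 37.9, 62.1, 55.7, 44.3]
md_subset = median_deviation(observed, expected)
print(round(md_subset, 1))  # 2.1
```

The two MSD values match the range given for QSS in the reproducibility column.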