Vishakha Sharma1, Hoi Yan Chow1, Donald Siegel1, Elisa Wurmbach1.
Abstract
Massively parallel sequencing (MPS) is a powerful tool transforming DNA analysis in multiple fields ranging from medicine, to environmental science, to evolutionary biology. In forensic applications, MPS offers the ability to significantly increase the discriminatory power of human identification as well as aid in mixture deconvolution. However, before the benefits of any new technology can be employed, its quality, consistency, sensitivity, and specificity must be rigorously evaluated in order to gain a detailed understanding of the technique, including sources of error, error rates, and other restrictions/limitations. This extensive study assessed the performance of Illumina's MiSeq FGx MPS system and ForenSeq™ kit in nine experimental runs including 314 reaction samples. In-depth data analysis evaluated the consequences of different assay conditions on test results. Variables included: sample numbers per run, targets per run, DNA input per sample, and replications. Results are presented as heat maps revealing patterns for each locus. Data analysis focused on read numbers (allele coverage), drop-outs, drop-ins, and sequence analysis. The study revealed that loci with high read numbers performed better and resulted in fewer drop-outs and well-balanced heterozygous alleles. Several loci were prone to drop-outs, which led to falsely typed homozygotes and therefore to genotype errors. Sequence analysis of allele drop-in typically revealed a single nucleotide change (deletion, insertion, or substitution). Analyses of sequences, no template controls, and spurious alleles suggest no contamination during library preparation, pooling, and sequencing, but indicate that sequencing or PCR errors may have occurred due to DNA polymerase infidelities.
Finally, we found that utilizing Illumina's FGx System at recommended conditions does not guarantee 100% outcomes for all samples tested, including the positive control, and required manual editing due to low read numbers and/or allele drop-in. These findings are important for progressing towards implementation of MPS in forensic DNA testing.
Year: 2017 PMID: 29121662 PMCID: PMC5679668 DOI: 10.1371/journal.pone.0187932
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Overview of experiments.
Nine experimental runs were performed with a total of 336 samples: 314 reaction samples, 9 NTCs, and 13 positive controls (2800M). Expt. I is the benchmark.
| Experiment Number | Primer Mix | Number of samples | Total DNA per Flow Cell (pg) | Description | A: Cluster density (k/mm2) | B: Cluster passing filter (%) | C: Phasing (%) | D: Pre-Phasing (%) | Q30 | ERN |
|---|---|---|---|---|---|---|---|---|---|---|
| I | A | 32 | | | | 88.9 | 0.186 | 0.065 | 58.4 | 1589 |
| II | A | 32 | 31,000 | Repeat of Expt. I, using the same kit but 11 weeks later. | | 94.91 | 0.213 | 0.031 | 48.8 | 757 |
| III | B | 32 | 31,000 | Same samples as in Expt. I but using Primer Mix B (more targets). | | 95.78 | 0.157 | 0.086 | 57.3 | 890 |
| IV | A | 96 | 95,000 | 96 samples (including 5 positive controls), each with 1 ng DNA input. | | 86.18 | 0.139 | 0.102 | 48.8 | 774 |
| V | A | 32 | 10,030 | Sensitivity study: 6 samples (3 M and 3 F) at 800, 400, 200, 100, and 50 pg DNA input. | | 91.97 | 0.164 | 0.018 | 51.5 | 681 |
| VI | A | 32 | 10,030 | Repeat of Expt. V | | 94.92 | 0.161 | 0.006 | 50.9 | 506 |
| VII | B | 32 | 10,030 | Same samples as in Expt. V but using Primer Mix B (more targets). | | 92.9 | 0.171 | 0 | 47.8 | 471 |
| VIII | A | 32 | 16,000 | Sensitivity test: same samples as in Expt. I, each with 500 pg DNA input. | | 95.85 | 0.166 | 0 | 47.5 | 459 |
| IX | A | 16 | 1,500 | Sensitivity test: 16 samples (including controls) with 100 pg DNA input. | | 92.4 | 0.132 | 0 | 39.9 | 357 |
1 Primer Mix A has 153 loci; Primer Mix B has 231 loci.
2 Acceptable ranges for quality parameters. A: Cluster density: 400–1650 k/mm2; B: Cluster passing filter: ≥ 80%; C: Phasing: ≤ 0.25%; D: Pre-Phasing: ≤ 0.15%.
3 Error probability: The percentage of bases that have a quality score >30 (1 base call out of 1000 is predicted to be incorrect) generated after the 25th cycle. The higher the percentage the better the run quality.
4 ERN: average Experiment Read Number (the average read number for each experimental run calculated by using the true alleles for all samples and loci, including a-, Y-, X-STRs, and iSNPs).
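The acceptance limits in footnote 2 can be applied mechanically to the per-run values in the overview table. The following is a minimal sketch, not the authors' or the instrument software's procedure: the `RunQuality` class and `failed_checks` function are hypothetical names introduced here, and cluster density (parameter A) is left out because its values are not present in the table as reproduced.

```python
from dataclasses import dataclass
from typing import List

# Acceptance limits taken from footnote 2 (parameters B, C, and D).
MIN_PASSING_FILTER_PCT = 80.0
MAX_PHASING_PCT = 0.25
MAX_PREPHASING_PCT = 0.15


@dataclass(frozen=True)
class RunQuality:
    """Hypothetical container for one run's quality parameters."""
    experiment: str
    passing_filter_pct: float  # B
    phasing_pct: float         # C
    prephasing_pct: float      # D


def failed_checks(run: RunQuality) -> List[str]:
    """Return the names of the quality parameters outside their limits."""
    failures = []
    if run.passing_filter_pct < MIN_PASSING_FILTER_PCT:
        failures.append("cluster passing filter")
    if run.phasing_pct > MAX_PHASING_PCT:
        failures.append("phasing")
    if run.prephasing_pct > MAX_PREPHASING_PCT:
        failures.append("pre-phasing")
    return failures


# Values for three of the nine runs, copied from the overview table.
RUNS = [
    RunQuality("I", 88.9, 0.186, 0.065),
    RunQuality("IV", 86.18, 0.139, 0.102),
    RunQuality("IX", 92.4, 0.132, 0.0),
]

if __name__ == "__main__":
    for run in RUNS:
        problems = failed_checks(run)
        status = "within limits" if not problems else ", ".join(problems)
        print(f"Expt. {run.experiment}: {status}")
```

The three runs listed here stay within the limits for B, C, and D; a run would be flagged only when one of its values crosses a threshold.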
Fig 1. Heat map of read numbers for all samples and loci.
The read numbers are shown in color code (low, medium, and high in blue, yellow, and red, respectively) for each sample and locus. Loci shown in columns: aSTRs (28): AMEL, TPOX, D3S1358, FGA, D5S818, CSF1PO, D7S820, D8S1179, THO1, vWA, D13S317, D16S539, D18S51, D21S11, D1S1656, D2S441, D2S1338, D4S2408, D6S1043, D9S1122, D10S1248, D12S391, Penta E, D17S1301, D19S433, D20S482, Penta D, D22S1045; Y-STRs (24): DYF387S1, DYS19, DYS385a-b, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS437, DYS438, DYS439, DYS448, DYS460, DYS481, DYS505, DYS522, DYS533, DYS549, DYS570, DYS576, DYS612, DYS635, DYS643, Y-GATA-H4; X-STRs (7): DXS10074, DXS10103, DXS10135, DXS7132, DXS7423, DXS8378, HPRTB; iSNPs (94): rs1490413, rs560681, rs1294331, rs10495407, rs891700, rs1413212, rs876724, rs1109037, rs993934, rs12997453, rs907100, rs1357617, rs4364205, rs2399332, rs1355366, rs6444724, rs2046361, rs279844, rs6811238, rs1979255, rs717302, rs159606, rs13182883, rs251934, rs338882, rs13218440, rs1336071, rs214955, rs727811, rs6955448, rs917118, rs321198, rs737681, rs763869, rs10092491, rs2056277, rs4606077, rs1015250, rs7041158, rs1463729, rs1360288, rs10776839, rs826472, rs735155, rs3780962, rs740598, rs964681, rs1498553, rs901398, rs10488710, rs2076848, rs2107612, rs2269355, rs2920816, rs2111980, rs10773760, rs1335873, rs1886510, rs1058083, rs354439, rs1454361, rs722290, rs873196, rs4530059, rs1821380, rs8037429, rs1528460, rs729172, rs2342747, rs430046, rs1382387, rs9905977, rs740910, rs938283, rs8078417, rs1493232, rs9951171, rs1736442, rs1024116, rs719366, rs576261, rs1031825, rs445251, rs1005533, rs1523537, rs722098, rs2830795, rs2831700, rs914165, rs221956, rs733164, rs987640, rs2040411, rs1028528. Reaction samples are in rows. Experiments are boxed. Note that female samples did not show any read numbers at the Y-STRs and were kept in white. Locus drop-outs (LDO) are marked in black.
Positive controls (2800M) of each experiment are shown at the bottom (kept in the same experimental order as in the figure; for Expt. IV a total of 5 positive controls were included).
Fig 4. Heat map of ACRs for all samples and loci.
Loci in columns follow the same order as in Fig 1 for aSTRs (28), Y-STRs (2): DYF387S1 and DYS385a-b; X-STRs (7), and iSNPs (94). Reaction samples are in rows. Experiments and positive controls (2800M) are boxed. Color code: green–ACR ≥0.7; yellow–ACR between 0.5–0.7; orange–ACR between 0.3–0.5; red–ACR ≤0.3; white–female samples at Y-STRs, male samples at X-STRs, and homozygotes that did not show an ACR; gray–ADO; and black–LDO.
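The color code of Fig 4 amounts to a simple binning of ACR values. The sketch below is illustrative only. It assumes that the ACR of a heterozygous locus is the lower allele read count divided by the higher one (the caption does not spell out the formula), and it assigns the shared boundary value 0.5 to the yellow bin, which is also an assumption because the caption's ranges overlap at their edges. The function names are hypothetical.

```python
def allele_coverage_ratio(reads_allele_1: int, reads_allele_2: int) -> float:
    """ACR of a heterozygous locus, assumed here to be lower reads / higher reads."""
    if reads_allele_1 <= 0 or reads_allele_2 <= 0:
        raise ValueError("both alleles need a positive read count")
    low, high = sorted((reads_allele_1, reads_allele_2))
    return low / high


def acr_color(acr: float) -> str:
    """Map an ACR value to the Fig 4 color bins."""
    if acr >= 0.7:
        return "green"
    if acr >= 0.5:   # boundary 0.5 placed in yellow (assumption)
        return "yellow"
    if acr > 0.3:
        return "orange"
    return "red"     # caption: red for ACR <= 0.3


if __name__ == "__main__":
    # Average ACRs reported for the two multi-copy Y-STRs.
    for locus, acr in [("DYS385a-b", 0.46), ("DYF387S1", 0.67)]:
        print(locus, acr_color(acr))
```

With this binning, the reported average ACRs of DYS385a-b (0.46) and DYF387S1 (0.67) fall in the orange and yellow bins, respectively.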
Sensitivity, dropouts, and ACR.
| aSTR | Read number, average (SD) | ACR | Y-STR | Read number, average (SD) | X-STR | Read number, average (SD) | ACR | iSNP | Read number, average (SD) | ACR |
|---|---|---|---|---|---|---|---|---|---|---|
| vWA | 276 (164) | 0.77 | DYS460 | 255 (169) | DXS10103 | 68 (53) | 0.77 | rs1736442 | 34 (16) | 0.78 |
| AMEL | 332 (220) | 0.72 | Y-GATA-H4 | 295 (182) | HPRTB | 1508 (1180) | 0.81 | rs1031825 | 46 (22) | 0.76 |
| D1S1656 | 394 (262) | 0.76 | DYS389II | 306 (230) | DXS10135 | 1711 (1587) | 0.63 | rs719366 | 57 (30) | 0.79 |
| CSF1PO | 590 (371) | 0.83 | DYS522 | 342 (233) | | | | rs7041158 | 65 (39) | 0.79 |
| D12S391 | 696 (492) | 0.78 | DYS448 | 426 (470) | | | | rs1294331 | 67 (37) | 0.79 |
| THO1 | 3593 (2274) | 0.87 | DYS392 | 3386 (2792) | DXS10074 | 3008 (2473) | 0.78 | rs1109037 | 1835 (1202) | 0.82 |
| D20S482 | 3380 (2233) | 0.84 | DYS438 | 3210 (1977) | DXS7132 | 2509 (2149) | 0.85 | rs4364205 | 1293 (864) | 0.82 |
| D3S1358 | 2271 (1497) | 0.83 | DYS576 | 3031 (1837) | DXS8378 | 2344 (2010) | 0.78 | rs722098 | 1052 (682) | 0.84 |
| D9S1122 | 2253 (1476) | 0.83 | DYS505 | 2331 (1519) | DXS7423 | 2011 (1640) | 0.84 | rs430046 | 948 (610) | 0.84 |
| D8S1179 | 2057 (1319) | 0.83 | DYS389I | 2215 (1267) | | | | rs251934 | 928 (611) | 0.85 |
| | | | | | | | | rs2040411 | 854 (568) | 0.85 |

| aSTR | ACR | f | Y-STR | ACR | f | X-STR | ACR | f | iSNP | f |
|---|---|---|---|---|---|---|---|---|---|---|
| Penta D | 0.67 | 23/314 | DYS385a-b | 0.46 | 33/164 | DXS10103 | 0.77 | 29/314 | rs1736442 | 85/314 |
| vWA | 0.77 | 4/314 | DYS448 | | 23/164 | DXS10135 | 0.63 | 19/314 | rs1031825 | 52/314 |
| D1S1656 | 0.76 | 4/314 | DYS390 | | 23/164 | | | | rs719366 | 43/314 |
| AMEL | 0.72 | 2/314 | DYF387S1 | 0.67 | 10/164 | | | | rs1294331 | 37/314 |
| CSF1PO | 0.83 | 1/314 | DYS389II | | 5/164 | | | | rs7041158 | 30/314 |
1 SD: Standard deviation
2 ACR: Allele coverage ratio. The average ACRs for aSTRs, X-STRs, and iSNPs were 0.79, 0.78, and 0.81, respectively; the ACRs of the Y-STRs DYS385a-b and DYF387S1 were 0.46 and 0.67, respectively.
3 f: Frequency
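The frequencies in the table are given as counts such as 85/314, which are easier to compare across loci when converted to percentages. The sketch below does only that conversion. It assumes that each f entry is the number of affected samples over the number of samples typed at that locus; the helper name `frequency_percent` is hypothetical.

```python
# A few f entries copied from the table, keyed by locus.
FREQUENCIES = {
    "Penta D": "23/314",
    "DYS385a-b": "33/164",
    "DXS10103": "29/314",
    "rs1736442": "85/314",
}


def frequency_percent(entry: str) -> float:
    """Convert an 'affected/total' entry to a percentage."""
    affected_text, total_text = entry.split("/")
    affected, total = int(affected_text), int(total_text)
    if total <= 0 or affected < 0 or affected > total:
        raise ValueError(f"implausible frequency entry: {entry!r}")
    return 100.0 * affected / total


if __name__ == "__main__":
    # Print loci from the most to the least frequently affected.
    ranked = sorted(FREQUENCIES.items(),
                    key=lambda item: frequency_percent(item[1]),
                    reverse=True)
    for locus, entry in ranked:
        print(f"{locus}: {entry} = {frequency_percent(entry):.1f}%")
```

Expressed this way, rs1736442 is affected in roughly 27% of samples and DYS385a-b in roughly 20%, compared with about 7% for Penta D.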
Effects of experimental conditions on read numbers.
| Test | Comparison of Experimental Runs | Fold-Change of Average Read Numbers of Correct Alleles | Results |
|---|---|---|---|
| 1. Experimental repeat | V / VI | 1.3 | Good experimental replication. |
| 2. Kit stability (testing 11 weeks apart) | I / II | 2.1 | Aged kit appeared to decline in activity. |
| 3. Varying DNA input | I / VIII | 3.5 | Reducing DNA input reduced read numbers. |
| 4. Varying the numbers of samples: 32 vs 96 | I / IV | 2.1 | Increasing the number of samples reduced read numbers. |
| 5. Varying the numbers of targets | I / III | 1.8 | Increasing the number of targets reduced read numbers. |
| | V / VII | 1.4 | |
| | VI / VII | 1.1 | |
| 6. Varying DNA input | V | 7.4 | Smaller amounts of DNA within the same run had lower read numbers. |
| | VI | 4.5 | |
| | VII | 6.8 | |
| 7. X-STRs within a single run: female samples vs male samples | For all Expt. | 3.8 vs 1.6 | |
1 The ERNs of the respective experimental runs were used to calculate the fold-change.
2 DNA input refers to the amount of sample DNA used at the library preparation stage, not amounts added to the flow cell.
3 Primer Mix A has 153 loci; Primer Mix B has 231 loci
4 Compare with Fig 1
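Footnote 1 states that the fold-changes between runs were calculated from the ERNs, so the between-run comparisons (tests 1 to 5) can be checked against the ERN column of the overview table. The sketch below does this under the assumption that the fold-change is the plain ratio of the two ERNs; the dictionary and function names are hypothetical. The within-run values of tests 6 and 7 cannot be reproduced this way, since they depend on per-sample read numbers not listed here.

```python
# Average Experiment Read Numbers (ERN) from the overview table.
ERN = {
    "I": 1589, "II": 757, "III": 890, "IV": 774, "V": 681,
    "VI": 506, "VII": 471, "VIII": 459, "IX": 357,
}

# Between-run comparisons listed in tests 1 to 5.
COMPARISONS = [
    ("V", "VI"),    # 1. experimental repeat
    ("I", "II"),    # 2. kit stability
    ("I", "VIII"),  # 3. varying DNA input
    ("I", "IV"),    # 4. 32 vs 96 samples
    ("I", "III"),   # 5. varying the number of targets
    ("V", "VII"),
    ("VI", "VII"),
]


def fold_change(numerator_run: str, denominator_run: str) -> float:
    """Ratio of two runs' ERNs (assumed definition of the fold-change)."""
    return ERN[numerator_run] / ERN[denominator_run]


if __name__ == "__main__":
    for first, second in COMPARISONS:
        print(f"{first} / {second}: {fold_change(first, second):.1f}")
```

Rounded to one decimal place, these ratios agree with the tabulated fold-changes (for example, 1589 / 757 ≈ 2.1 for I / II and 1589 / 459 ≈ 3.5 for I / VIII).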
Fig 2. Heat map of STR sequence analysis for all samples and loci.
Loci in columns follow the same order as in Fig 1 for aSTRs (28), Y-STRs (24), and X-STRs (7). Reaction samples are in rows. Experiments and positive controls (2800M) are boxed. Color code: red–genotype error not flagged by UAS; orange–genotype error flagged by UAS; pink–typed stutter plus typed sequence error (SE); purple–typed SE; yellow–typed stutter; black–locus drop-out (LDO); gray–untyped stutter; turquoise–untyped SE plus untyped SE from stutter; light blue–untyped SE; and green–no artifacts. Note, female samples did not show sequences at Y-STRs and were kept in white (except ADIs, see text). The white spacing between a-, Y-, and X-STRs separates the STRs.
Fig 3. Heat map of iSNP genotype analysis for all samples and loci.
Loci in columns follow the same order as in Fig 1 for iSNPs (94). Reaction samples are in rows. Experiments and positive controls (2800M) are boxed. Color code: green–correct genotype; light green–editable genotype (see text); yellow–additional C-allele; red–genotype error; gray–ADO; black–LDO.