Cydne L Holt1, Kathryn M Stephens1, Paulina Walichiewicz1, Keenan D Fleming1, Elmira Forouzmand1, Shan-Fu Wu1.
Abstract
Forensic mitochondrial DNA (mtDNA) analysis conducted using next-generation sequencing (NGS), also known as massively parallel sequencing (MPS), brings advantages over Sanger-type sequencing, such as deep coverage per base (herein referred to as read depth per base pair (bp)), simultaneous sequencing of multiple samples (libraries) and increased operational efficiencies. This report describes the design and developmental validation, according to forensic quality assurance standards, of end-to-end workflows for two multiplexes, comprising the ForenSeq mtDNA control region and mtDNA whole-genome kits, the MiSeq FGx™ instrument and ForenSeq universal analysis software (UAS) 2.0/2.1. Polymerase chain reaction (PCR) enrichment and a tiled amplicon approach target small, overlapping amplicons (60–150 bp and 60–209 bp for the control region and mtGenome, respectively). The system provides convenient access to data files that can be used outside of the UAS if desired. Studies assessed a range of environmental and situational variables, including but not limited to buccal samples, rootless hairs, dental and skeletal remains, concordance of control region typing between the two multiplexes and as compared to orthogonal data, assorted sensitivity studies, two-person DNA mixtures and PCR-based performance testing. Limitations of the system and implementation considerations are discussed. Data indicated that the two mtDNA multiplexes, together with the MiSeq FGx and ForenSeq software, meet or exceed forensic DNA quality assurance (QA) guidelines with robust, reproducible performance on samples of various quantities and qualities.
Keywords: ForenSeq; MiSeq; Scientific Working Group on DNA Analysis Methods (SWGDAM); forensic; massively parallel sequencing; mitochondria; mtDNA; next generation sequencing; sequencing by synthesis; validation
Year: 2021 PMID: 33921728 PMCID: PMC8073089 DOI: 10.3390/genes12040599
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Schematic of ForenSeq mtDNA control region and mtDNA whole-genome library preparation design and workflow. (a) PCR-based target enrichment and tagging create DNA templates consisting of regions of interest flanked by universal primer sequences. Index adapters then attach to the tags. Resultant paired-end, dual-indexed, tiled libraries are amplified, purified, and pooled into one tube for MiSeq FGx sequencing and analysis with ForenSeq UAS or other software; (b) the ForenSeq protocol divides each DNA sample into two PCRs with separate primer sets (set 1, set 2) in a tiled strategy that promotes efficient amplification of overlapping amplicons, allows complete coverage and reduces unintended byproducts; (c) the two-PCR approach can facilitate confirmation of variant(s) that reside under a primer: when a primer-binding site mutation exists under a primer in one primer set (asterisk under a set 1 primer in this diagram), that variant can be reliably detected in amplicons extended from the companion primer set (set 2). Alternative approaches can more frequently produce ambiguity regarding primer sequences versus variants.
Mock casework human remains: control region and mtGenome coverage, variants and European DNA Profiling Group (EDNAP) mtDNA population (EMPOP) haplogrouping, control region concordance between mtDNA multiplexes.
| Sample group | Sample | Control Region Multiplex: coverage | Control Region Multiplex: variants | Control Region Multiplex: EMPOP haplogroup | mtGenome Multiplex: coverage | mtGenome Multiplex: no-call positions | mtGenome Multiplex: variants | mtGenome Multiplex: EMPOP haplogroup |
|---|---|---|---|---|---|---|---|---|
| CONTEMPORARY SAMPLES | Tooth 1661, InnoGenomics | 100% | 73G 150T 152C 263G 315.1C 523c 524a 16124C 16223T 16311C 16399G | L3d1b1 | 99% | 5086–5177 | 73G 150T 152C 263G 315.1C 497M 1 523c 524a 16124C 16223T 16311C 16399G | L3d1b1 |
| | Tooth 1662, InnoGenomics | 100% | 73G 153G 195C 225A 226C 263G 309.1c 315.1C 16189c 2 16193.1c 16223T 16278T 16519C | X2 + 225 | 99% | 8290–8379 | 73G 153G 195C 225A 226C 263G 309.1c 315.1C 16189c 16193.1c 16223T 16278T 16519C | X2b4a1 |
| | Tooth 1663, InnoGenomics | 100% | 73G 150T 152C 195C 198T 263G 315.1C 16189c 16223T 16320T 16519C | L3e2a1 | 100% | | 73G 150T 152C 195C 198T 263G 315.1C 16189c 16223T 16320T 16519C | L3e2a1b3 |
| | Tooth 1664, InnoGenomics | 100% | 73G 146C 152C 195C 263G 309.1C 315.1C 378Y 507C 16223T 16278T 16286T 16294T 16309G 16390A 16519C | L2a1a2 | 98% | 519, 4044–4175, 7216–7367 | 73G 146C 152C 195C 263G 309.1C 315.1C 378Y 507C 16223T 16278T 16286T 16294T 16309G 16390A 16519C | L2a1a2b |
| | Tooth 1665, InnoGenomics | 100% | 64T 93G 185A 189G 200G 236C 247A 263G 315.1C 523a 524c 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16311C 16320T 16325C 16362C | L0a1a + 200 | 97% | 4044–4175, 4299–4379, 7021–7182, 7192–7196, 7206, 7216–7367 | 64T 93G 185A 189G 200G 236C 247A 263G 315.1C 523a 524c 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16311C 16320T 16325C 16362C | L0a1a2 |
| | Bone S1, | 100% | 73G 150T 263G 315.1C 16189c 16193.1c 16270T 16398A | U5b2a2 | 99% | 5307, 5327, 5334–5338, 5343, 6718–6810, 7308–7310, 12,563–12,564, 15,571–15,573 | 73G 150T 263G 315.1C 16183M 3 16189c 16193.1c 16270T 16398A | U5b2a2b |
| | Bone S2, embalmed, SHSU | 100% | 73G 150T 185A 228A 263G 295T 309.1C 315.1C 462T 489C 16069T 16126C | J1c | 97% | 1103, 1132–1138, 1150, 1164, 1172, 2661–2663, 3606, 5307–5347, 6121, 6139, 6718–6810, 7256–7342, 7508–7559, 11,187–11,189, 12,466–12,614, 15,190, 15,539–15,581 | 73G 150T 185A 228A 263G 295T 309.1C 315.1C 462T 489C 16069T 16126C | J1c |
| | Bone S3, embalmed, SHSU | 100% | 73G 143A 146C 152C 189G 195C 263G 315.1C 16129A 16189c 16192T 16223T 16278T 16294T 16309G 16390A | L2a1 | 98% | 4044–4175, 5081–5177, 5335–5336, 7216–7310 | 73G 143A 146C 152C 189G 195C 263G 315.1C 16129A 16189c 16192T 16223T 16278T 16294T 16309G 16390A | L2a1n |
| | Bone S4, | 100% | 73G 152C 263G 315.1C 16093Y 16256T 16270T 16399G | U5a1 | 100% | | 73G 152C 263G 315.1C 16093Y 16256T 16270T 16399G | U5a1a1b |
| | Bone S5, | 100% | 195C 263G 315.1C 523a 524c | R0 | 99% | 5858–5975, 8444–8446, 12,466–12,536, 12,563–12,614 | 195C 263G 315.1C 523a 524c | H4a1a4b |
| | Bone S6, | 100% | 73G 263G 309.1c 315.1C 16126C 16294T 16296T 16519C | T2 | 100% | 2663, 3550–3606, 5334–5337, 7308–7310, 15,571–15,574 | 73G 263G 309.1c 315.1C 481Y 4 16126C 16294T 16296T 16519C | T2a1a |
| | Bone S7, | 100% | 263G 309.1c 315.1C 316A 16291T 16519C | H1j2a | 100% | 15,572 | 263G 309.1c 309.2c 5 315.1C 316A 16291T 16519C | H1j2a |
| ANCIENT SAMPLES | Interred bone P2, PSU | 100% | 73G 6 263G 315.1c 7 489Y 8 16192Y 16256Y 16260Y 16270T 16291T 16399R | U5a1b1 | 96% | 1094–1177, 2668–2671, 3590–3591, 5307–5346, 6109–6141, 6719–6810, 7256–7342, 7545, 8291–8379, 11,193–11,197, 12,466–12,614, 15,519–15,581 | 73R 6 263G 315.1C 7 523a 9 16076M 10 16192Y 16256Y 16260Y 16270T 16291T 16399R | U5a1b1c |
| | Interred bone P43pt1, PSU | 100% | 152C 263G 309.1c 315.1c 16234T 16270Y | H | 99% | 5340–5344, 6718–6810, 7314–7318, 15,579 | 152C 263G 309.1c 315.1c 495Y 11 506Y 12 16234T 16270Y | H13a1d |
| | Interred bone P48, PSU | 100% | 257R 263G 315.1C 477C 13 16093Y 16192Y 16270Y 16519C | H1c | 99% | 5307–5347, 6718–6810, 7258–7266, 7273, 7288–7340, 8345–8349, 12,555–12,559 | 257R 263G 315.1C 477Y 13 514Y 14 16093Y 16192Y 16270Y 16519C | H1 |
| | Interred bone P73, PSU | 100% | 73G 153G 195C 263G 309.1C 15 309.2c 315.1C 17 489G 16189c 16223T 16278T 16294T 16519C | X1′2′3 | 98% | 2668–2671, 5307–5347, 6109–6141, 6718–6810, 7256–7348, 12,555–12,559, 15,520–15,581 | 73G 153G 195C 263G 309.1c 15 309.2c 310Y 16 315.1c 17 459Y 18 489G 494Y 496Y 497Y 511Y 513R 514Y 518Y 557Y 19 16188c 20 16189c 16223T 16278T 16294T 16519C | X2 |
1 Base call A present at position 497 (10.4%) in the mtGenome multiplex run was not detected in the control region run; 2 base call 16189c should be called 16189C 16193c; 3 base call C present at position 16,183 (6.3%; below AT) in control region multiplex run and at 6.3% (above AT) in mtGenome run; 4 base call T present at position 481 at 18.7% (above AT) in mtGenome run not detected in control region run; 5 C insertion present at position 309 at 9.2% (above AT) in mtGenome run not detected in control region run (region had very low read depth); 6 base call A present at position 73 at 7.3% (below AT) in control region multiplex run and at 7.5% (above AT) in mtGenome run; 7 reference sequence present at position 315 at 10% (above AT) in control region multiplex run and at 3% (below AT) in mtGenome run; 8 base call C present at position 489 at 12.8% (above AT) in control region multiplex run and at 1.9% (below AT) in mtGenome run; 9 deletion present at position 523 at 7.8% (below AT) in control region multiplex run and at 6.4% (above AT) in mtGenome run; 10 base call A present at position 16,076 at 7.1% (above AT) in mtGenome run not detected in control region run; 11 base call T present at position 495 at 1.1% (below AT) in control region multiplex run and at 6.6% (above AT) in mtGenome run; 12 base call T present at position 506 at 0.5% (below AT) in control region run and at 7.2% (above AT) in mtGenome run; 13 base call T present at position 477 at 6.7% (less than AT) in control region multiplex run and at 9.8% (greater than AT) in mtGenome run; 14 base call T present at position 514 at 0.3% (less than AT) in control region multiplex run and at 7.8% (greater than AT) in mtGenome run; 15 reference sequence present at position 309 at 4.9% (less than AT) in control region multiplex run and at 11.2% (greater than AT) in mtGenome run; 16 base call C present at position 310 at 7.7% (greater than AT) in mtGenome multiplex run not detected in control region (see Section 2.13); 17 reference sequence present at position 315 at 6.1% (less than AT) in control region multiplex run and at 10.5% (greater than AT) in mtGenome run; 18 low-level mixed base variants from position 459 to 518 at ~14% in mtGenome multiplex run are present in the control region run at ~1% (less than the AT); 19 base call T present at position 557 at 7.3% (less than AT) in control region multiplex run and at 6.3% (greater than AT) in mtGenome run; 20 reference sequence present at position 16,188 at 7.5% (less than AT) in control region multiplex run and at 6.8% (above the AT) in mtGenome run.
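Most of the footnotes above come down to whether a minor base's read frequency clears the analytical threshold (AT) used in a given run. The following is a minimal sketch of that decision, not the UAS algorithm: `call_position` is a hypothetical helper, the 10% (control region) and 6% (mtGenome) thresholds are the values quoted elsewhere in this summary, and the major base A at position 16,183 is inferred from the 16183M (A/C) call in the table.

```python
# IUPAC ambiguity codes for two-base mixtures (standard nomenclature).
IUPAC_TWO_BASE = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("AC"): "M",
    frozenset("GT"): "K", frozenset("CG"): "S", frozenset("AT"): "W",
}


def call_position(base_fractions, analytical_threshold):
    """Return a call for one position from per-base read fractions (0-1).

    Hypothetical helper for illustration only: bases at or above the
    threshold are kept; one surviving base gives a plain call, two give an
    IUPAC mixed-base code, and anything else is reported as "N".
    """
    kept = sorted(
        base for base, fraction in base_fractions.items()
        if fraction >= analytical_threshold
    )
    if len(kept) == 1:
        return kept[0]
    if len(kept) == 2:
        return IUPAC_TWO_BASE[frozenset(kept)]
    return "N"


# Footnote 3: minor base C observed at 6.3% at position 16,183 in both runs.
fractions_16183 = {"A": 0.937, "C": 0.063}
control_region_call = call_position(fractions_16183, 0.10)  # assumed 10% AT
mtgenome_call = call_position(fractions_16183, 0.06)        # assumed 6% AT
print(control_region_call, mtgenome_call)
```

Under these assumptions the same 6.3% minor base is filtered out at the higher threshold and reported as a mixed base at the lower one, which is the pattern footnote 3 describes.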
Figure 2. mtGenome data visualization of a mock casework sample as displayed in ForenSeq universal analysis software (UAS). (a) “Sample details” view zoomed to control region only; (b) zoomed out to display the whole mtGenome. The three main sections of the UAS sample details view are labeled in (a) and shown in (b): mtDNA navigator (upper left), Position Viewer (upper right) and coverage plot (bottom); a subset of software options and tools are also labeled in (a). (a) Commercially cremated bone sample S1 view of control region with 100% coverage and 100% variant calls reported. Variants are indicated by blue-colored tick marks in the mtDNA navigator; zero orange-colored tick marks are displayed, indicating complete coverage and zero “no calls”, which would render in orange (see (b)). (b) Bone sample S1 zoomed out view of entire mtGenome, as compared to (a) with 98% coverage due to “no calls” in eight regions (see Table 1), which are visible as orange-colored tick marks in the mtDNA navigator circle. Notes: 100% control region variant concordance was observed between both kits for bone sample S1; base call “C” was present at position 16,183 at 6.3% (below default AT) in control region run and at 6.3% (above default AT) in mtGenome run (Table 1).
Mock casework: control region and mtGenome coverage, variants and EMPOP haplogrouping, control region concordance between multiplexes and among matched buccal samples and rootless hair shafts from six individuals.
| Sample | Input | Control Region Multiplex: coverage | Control Region Multiplex: no-call positions | Control Region Multiplex: variants | Control Region Multiplex: EMPOP haplogroup | Control Region Multiplex: concordance | mtGenome Multiplex: coverage | mtGenome Multiplex: no-call positions | mtGenome Multiplex: variants | mtGenome Multiplex: EMPOP haplogroup | mtGenome Multiplex: concordance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Buccal sample 2 | 100 pg | 100% | | 73G 146C 150T 263G 309.1c 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1 + 146 | | 100% | | 73G 146C 150T 263G 309.1c 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1e | |
| 0.5 cm Hair sample 2 | 12 µL | 99.9% | 310 | 73G 146C 150T 263G 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1 + 146 | 100% | 100% | | 73G 146C 150T 263G 309.1c 1 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1e | 100% |
| 2 cm Hair sample 2 | 12 µL | 100% | | 73G 146C 150T 263G 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1 + 146 | 100% | 100% | | 73G 146C 150T 263G 309.1c 1 315.1C 523a 524c 16126C 16292T 16294T 16296T 16519C | T2c1e | 100% |
| Buccal sample 4 | 100 pg | 99.9% | 310 | 146C 263G 309.1C 315.1C 16142T 16325C | HV | | 99.7% | 9538–9590 | 146C 263G 309.1c 2 309.2c 3 315.1C 16142T 16325C | H47 | |
| 0.5 cm Hair sample 4 | 12 µL | 96.9% | 303–346 | 146C 263G 16142T 16325C | HV | 100% | 98.7% | 8290–8379, 9538–9590, 12,496–12,601 | 146C 263G 309.1c 2 315.1C 16142T 16325C | H47 | 100% |
| 2 cm Hair sample 4 | 12 µL | 96.3% | 303–347 | 146C 263G 16142T 16325C | HV | 100% | 99.0% | 9541, 9545–9547, 9549–9550, 9552, 9555–9557, 9564, 9568, 9570–9571, 9577, 9581, 9588–9589, 12,466–12,614 | 146C 263G 309.1c 2 315.1C 16142T 16325C | H47 | 100% |
| Buccal sample 5 | 100 pg | 100% | | 263G 315.1C | R0 | | 100% | | 263G 315.1C | H4a1a1 | |
| 0.5 cm Hair sample 5 | 12 µL | 100% | | 263G 315.1C | R0 | 100% | 100% | | 263G 315.1C | H4a1a1 | 100% |
| 2 cm Hair sample 5 | 12 µL | 100% | | 263G 315.1C | R0 | 100% | 100% | | 263G 315.1C | H4a1a1 | 100% |
| Buccal sample 8 | 100 pg | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | |
| 0.5 cm Hair sample 8 | 12 µL | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | 100% | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | 100% |
| 2 cm Hair sample 8 | 12 µL | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | 100% | 100% | | 73G 150T 194T 263G 315.1C 489C 523a 524c 16223T 16362C 16519C | D4b2b2a | 100% |
| Buccal sample 11 | 100 pg | 100% | | 73G 152C 249del 263G 309.1c 315.1C 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | | 100% | | 73G 152C 249del 263G 309.1c 315.1C 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | |
| 0.5 cm Hair sample 11 | 12 µL | 96.1% | 303–347 | 73G 152C 249del 263G 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | 100% | 99.8% | 9489–9526 | 73G 152C 249del 263G 309.1c 315.1C 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | 100% |
| 2 cm Hair sample 11 | 12 µL | 99.9% | 310 | 73G 152C 249del 263G 309.1c 315.1C 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | 100% | 100% | | 73G 152C 249del 263G 309.1c 315.1C 523a 524c 16108T 16129A 16162G 16172C 16232A 16304C 16357C 16519C | F1a1a | 100% |
| Buccal sample 12 | 100 pg | 100% | | 195Y 263G 309.1c 315.1C 4 16519C | R0 | | 100% | | 195Y 263G 309.1c 315.1C 4 16519C | H40b | |
| 0.5 cm Hair sample 12 | 12 µL | 96.1% | 303–347 | 195Y 263G 16519C | R0 | 100% | 98.5% | 5307, 5311–5312, 5318, 5321, 5323–5332, 6718–6810, 7256–7342, 15,519–15,581 | 195Y 263G 309.1c 315.1c 4 489Y 5 16519C | H40b | 100% |
| 2 cm Hair sample 12 | 12 µL | 100% | | 195Y 263G 309.1c 315.1C 4 16519C | R0 | 100% | 99.5% | 6718–6810 | 195Y 263G 309.1c 315.1c 4 16519C | H40b | 100% |
1 C insertion at position 309 less than the 10% AT in the control region multiplex run; 2 mixed variants 309.1c less than the 10% AT in the control region multiplex run; 3 2nd C Insertion at position 309 less than the 10% AT in the control region multiplex run; 4 reference sequence present in control region multiplex run, less than the 10% AT: 6.2% in buccal, 8.1% in 2 cm hair; the reference sequence is present in the mtGenome multiplex run greater than AT: 11% in 0.5 cm hair, 6.2% in 2 cm hair and less than AT at 3.4% in buccal; 5 C present at position 489 in 0.5 cm hair at 22% not detected in control region multiplex run nor in buccal or 2 cm hair from individual 12.
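The position lists that accompany sub-100% coverage values in the tables (for example 9538–9590 for buccal sample 4) are read here as no-call positions. As a rough sketch, they can be counted and turned into an estimated coverage figure. The function names are hypothetical, the 16,569 bp denominator is the rCRS length, and the software may compute its reported percentages differently, so small differences from the tabulated values are to be expected.

```python
import re

MTGENOME_LENGTH = 16569  # length of the rCRS in base pairs


def count_no_call_positions(no_call_text):
    """Count positions in a list such as '5307, 5311–5312, 15,519–15,581'.

    Entries are separated by a comma followed by whitespace; a comma with no
    following space is treated as a thousands separator. Ranges may be
    written with an en dash, a minus sign or a hyphen and are inclusive.
    """
    total = 0
    for entry in re.split(r",\s+", no_call_text.strip()):
        if not entry:
            continue
        bounds = [int(part.replace(",", "")) for part in re.split(r"[–−-]", entry)]
        total += bounds[-1] - bounds[0] + 1
    return total


def estimated_coverage(no_call_text, region_length=MTGENOME_LENGTH):
    """Estimate percent coverage as the share of positions with a call."""
    return 100.0 * (1 - count_no_call_positions(no_call_text) / region_length)


# Buccal sample 4, mtGenome multiplex: a single no-call range.
print(count_no_call_positions("9538–9590"))
print(round(estimated_coverage("9538–9590"), 1))
```

For this row the estimate rounds to the 99.7% shown in the table; rows with several ranges can differ by a few tenths of a percent.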
Figure 3. Sensitivity studies of the control region and mtGenome multiplexes: DNA inputs, library purification and normalization methods, sample plexity. Various gDNA template inputs (x-axis) are plotted against total reads per sample (y-axis, left) and against detection of expected variants under a given set of conditions (y-axis, right); the latter is plotted as open and closed circles in a horizontal row atop each graph. (a) Control region multiplex: dilution series of gDNAs HG01204 (solid bars, closed circle) and NA18524 (hatched bars, open circle). Libraries were prepared in duplicate, purified 1× and normalized with BBN (average of two reps plotted); (b) control region multiplex: dilution series of gDNAs HG01204 (solid bars, closed circle) and NA18524 (hatched bars, open circle). Libraries were prepared in duplicate, purified 2× and normalized with QN (average of two reps plotted); (c) mtGenome multiplex: dilution series of HL60 gDNA. Libraries were prepared in duplicate, purified 1×, normalized with BBN (solid bars, closed circle), or purified 2× and normalized with QN (hatched bars, open circle) (average of two reps plotted); (d) control region multiplex: dilution series of HL60 gDNA. DNA libraries were prepared in quadruplicate with three negative amplification controls, purified 1× and normalized with QN. MiSeq FGx runs using the micro sequencing kit were conducted with either 12 or 47 sample plexity, shown as solid or hatched bars, respectively; (e) mtGenome multiplex: dilution series of HL60 gDNA. Libraries were prepared in duplicate, purified 2× and normalized with QN. MiSeq FGx runs using the standard sequencing kit were conducted with either 8 or 16 sample plexity, shown as solid or hatched bars, respectively.
Two-person (2800M:HL60) mtDNA mixtures at different ratios and DNA inputs: ForenSeq control region and mtGenome multiplexes.
| ForenSeq | gDNA | AT 1 | Mixture | Expected | Expected | Expected | Expected | Observed | Minor |
|---|---|---|---|---|---|---|---|---|---|
| Control | 100 pg | 3.7% | 1:3 | 25:75 | 22–36 | 64–78 | 10 | 10 | 100% |
| | 100 pg | 3.7% | 1:5 | 17:83 | 10–17 | 82–90 | 10 | 10 | 100% |
| | 100 pg | 3.7% | 1:15 | 6:94 | 4–7 | 93–96 | 10 | 10 | 100% |
| | 5 pg | 3.7% | 1:3 | 25:75 | 24–36 | 64–76 | 10 | 10 | 100% |
| | 5 pg | 3.7% | 1:5 | 17:83 | 4–19 | 81–96 | 10 | 11 2 | 100% |
| | 5 pg | 3.7% | 1:15 | 6:94 | 3–7 | 93–97 | 10 | 8 | 80% |
| mtGenome | 100 pg | 6% | 1:1 | 50:50 | 26–47 | 53–74 | 27 | 27 | 100% |
| | 100 pg | 6% | 1:3 | 25:75 | 11–25 | 75–89 | 27 | 27 | 100% |
| | 100 pg | 6% | 1:9 | 10:90 | 0–11 | 89–100 | 27 | 15 | 55.6% |
| | 5 pg | 6% | 1:1 | 50:50 | 24–47 | 53–76 | 27 | 27 | 100% |
| | 5 pg | 6% | 1:3 | 25:75 | 8–19 | 81–92 | 27 | 27 | 100% |
| | 5 pg | 6% | 1:9 | 10:90 | 0–8 | 92–100 | 27 | 7 | 25.9% |
1 AT: analytical threshold; note: a custom 3.7% AT in ForenSeq UAS was applied for the analysis of the control region multiplex in this study; 2 unexpected mixed base variant 501Y was at 4.4%.
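The arithmetic behind two columns of the mixture table can be sketched as follows. This assumes, from the values themselves, that the percentage pair after each ratio is the nominal minor:major split and that the final column is observed minor-contributor variants divided by expected ones; both function names are hypothetical.

```python
def expected_percentages(minor_parts, major_parts):
    """Nominal minor:major percentages for a mixture ratio such as 1:5."""
    total = minor_parts + major_parts
    minor_pct = round(100 * minor_parts / total)
    return minor_pct, 100 - minor_pct


def minor_detection_rate(observed_variants, expected_variants):
    """Percent of expected minor-contributor variants that were observed."""
    return round(100 * observed_variants / expected_variants, 1)


# Ratios used in the study, written as (minor parts, major parts).
for ratio in [(1, 1), (1, 3), (1, 5), (1, 9), (1, 15)]:
    print(ratio, expected_percentages(*ratio))

# mtGenome multiplex, 1:9 mixture: 15 of 27 (100 pg) and 7 of 27 (5 pg).
print(minor_detection_rate(15, 27), minor_detection_rate(7, 27))
```

Under this reading, the 1:9 mtGenome mixtures sit right at the edge of the 6% AT (nominal minor fraction 10%, observed 0–11%), which is where the detection rate drops.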
Repeatability and reproducibility studies summary: control region and mtGenome multiplexes.
| Input | Repeatability: Control Region Multiplex | Repeatability: mtGenome Multiplex | Reproducibility: Control Region Multiplex | Reproducibility: mtGenome Multiplex |
|---|---|---|---|---|
| | | | | |
| 2 pg | 100% | 97.9% 1 | 100% | 97.9% 2 |
| 20 pg | N/A | 99.4% 3 | 100% | N/A |
| 100 pg | 100% | 99.98% 4 | 100% | 100% |
| | | | | |
| 2 pg | 100% | 100% 5 | 100% | 100% 6 |
| 20 pg | N/A | 100% 7 | 100.0% | N/A |
| 100 pg | 100% | 100% 8 | 100.0% | 100% 9 |
| | | | | |
| 2 pg | 7129 | 3580 | 6983 | 1260 |
| 20 pg | N/A | 1764 | 20,728 | N/A |
| 100 pg | 43,401 | 3146 | 29,574 | 3645 |
Notes: Repeatability and reproducibility were analyzed using 48 samples per multiplex with 16 samples per run across 12 MiSeq FGx runs (three runs each for each multiplex for each study). This generated 56,693 and 795,312 data points (bases called per mtDNA position) for the control region and the mtGenome, respectively, in repeatability studies, and another 56,693 and 795,312 data points in reproducibility studies. 1 Average loss of coverage of 356 bases for the 2 pg HL60 samples (n = 18); 2 average loss of coverage of 342 bases for the 2 pg HL60 samples (n = 9); 3 average loss of coverage of 106 bases for the 20 pg HL60 samples (n = 5); 4 average loss of coverage of four bases for the 100 pg Coriell samples (n = 5); no loss of coverage for 100 pg HL60 samples; 5 heteroplasmy: “C” at position 1490, and “A” at position 4821, were not detected at 2 pg in HL60 in all nine replicates; 6 heteroplasmy: “C” at position 1490, and “A” at position 4821, were not detected at 2 pg in HL60 in 17 of 18 replicates, or 16 of 18 replicates, respectively. Heteroplasmy was detected at position 1490 at 7% in one replicate; in the two replicates where heteroplasmy at position 4821 occurred, an average of 8% was observed; 7 heteroplasmy: “C” at position 1490, and “A” at position 4821, were not detected at 20 pg in HL60 in four of six replicates, or in six of six replicates, respectively. In the two replicates where heteroplasmy at position 1490 occurred, an average of 2.9% was observed; 8 heteroplasmy: “A” at position 4821 was not detected at 100 pg in HL60 in six of nine replicates; in the three replicates where heteroplasmy at position 4821 occurred, an average of 6% was observed. Heteroplasmy was detected at position 1490 at 4.5% in all nine replicates; 9 heteroplasmy: “C” at position 1490 was not detected at 100 pg in HL60 in two of 18 replicates. Heteroplasmy was detected at position 1490 at 5.1% in 16 replicates and at position 4821 at an average of 6.2% for the 18 replicates.
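The run layout and the mtGenome data-point count in the notes above can be checked with simple arithmetic. The sketch below assumes that one data point is one called position per sample and that the mtGenome spans the 16,569 positions of the rCRS. The control region count (56,693) is not reproduced because the per-sample number of positions behind it is not stated here.

```python
MTGENOME_POSITIONS = 16569  # rCRS length, assumed number of positions per sample

studies = 2               # repeatability and reproducibility
multiplexes = 2           # control region and mtGenome
runs_per_combination = 3  # three runs for each multiplex in each study
samples_per_run = 16

total_runs = studies * multiplexes * runs_per_combination
samples_per_multiplex_per_study = runs_per_combination * samples_per_run
mtgenome_data_points = samples_per_multiplex_per_study * MTGENOME_POSITIONS

print(total_runs)                       # 12 MiSeq FGx runs
print(samples_per_multiplex_per_study)  # 48 samples per multiplex
print(f"{mtgenome_data_points:,}")      # 795,312 data points
```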
mtDNA control region concordance studies: control region multiplex vs. whole mtGenome multiplex in five well-characterized DNA samples.
| Sample | Expected CR Variants | Control Region Kit: Observed CR Variants | Whole Genome Kit: Observed CR Variants |
|---|---|---|---|
| | 64Y 73G 195C 204C 207A 263G 309.1C 315.1C 16183C 16189C 16193.1c 16193.2c 16223T 16278T 16519C | 64Y 73G 195C 204C 207A 263G 309.1C 315.1C 16183C 16189C 16193.1c 16193.2c 16223T 16278T 16519C | 64Y 73G 195C 204C 207A 263G 309.1C 315.1C 16183C 16189C 16193.1c 16193.2c 16223T 16278T 16519C |
| | 93G 195C 214G 263G 309.1C 309.2C 315.1C 16311C 16519C | 93G 195C 214G 263G 309.1C 309.2C 315.1C 16311C 16519C | 93G 195C 214G 263G 309.1C 309.2C 315.1C 16311C 16519C |
| | 73G 150T 152C 263G 295T 315.1C 489C 16069T 16193T 16278T 16362C | 73G 150T 152C 263G 295T 315.1C 489C 16069T 16193T 16278T 16362C | 73G 150T 152C 263G 295T 315.1C 489C 16069T 16193T 16278T 16362C |
| | 263G 315.1C 16357C 16519C | 263G 315.1C 16357C 16519C | 263G 315.1C 16357C 16519C |
| | 73G 185A 228A 263G 295T 315.1C 462T 482C 489C 16069T 16126C 16292T | 73G 185A 228A 263G 295T 315.1C 462T 482C 489C 16069T 16126C 16292T | 73G 185A 228A 263G 295T 315.1C 462T 482C 489C 16069T 16126C 16292T |
Concordance studies: orthogonal WGS data vs. ForenSeq mtDNA multiplexes, haplogroup assignment using EMPOP.
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| HG00181 | 73G 195C 263G 309.1C 315.1C 499A 524.1a 524.2c 16356C 16519C | U4 | 6922–6988 | 73G 195C 263G 309.1C 315.1C 499A 524.1a 524.2c 16356C 16519C | U4d1a1 | 100% | 100% |
| HG00383 | 263G 315.1C 523a 524c 16093C 16129A 16316G 16519C | H27 | 263G 315.1C 523a 524c 16093C 16129A 16316G 16519C | H27a | 100% | 100% | |
| HG00384 | 73G 150T 263G 309.1c 309.2c 315.1C 16144C 16183M 16189C 16193.1c 16193.2c 16270T | U5b1b1a | 73G 150T 263G 309.1c 309.2c 315.1C 16144C 16183M 16189C 16193.1c 16193.2c 16270T | U5b1b1a | 100% | 100% | |
| HG00844 | 73G 249del 263G 309.1C 310Y 1 315.1C 489C 16092C 16189C 16193.1c 16193.2c 16223T 16298C 16327T 16355T 16519C | C | 470–519, 3550–3606, 13,013–13,080, 15,539–15,581 | 73G 249del 263G 309.1c 310Y 315.1C 16092C 16189c 2 16193.1c 16223T 16298C 16327T 16355T 16519C | C7a | 100% | 100% |
| HG01197 | 73G 150T 263G 279C 315.1C 455.1T 517T 16224C 16270T | U5b2b3a | 73G 150T 263G 279C 315.1C 455.1T 517T 523a 16181R 16224C 16270T | U5b2b3a | 100% | 100% | |
| HG01204 | 73G 249del 290del 291del 315.1C 489C 493G 523a 524c 16223T 16298C 16325C 16327T 16519C | C1b2 | 73G 249del 290del 291del 315.1C 489C 493G 523a 524c 16223T 16298C 16325C 16327T 16519C | C1b2 | 100% | 100% | |
| HG01205 | 73G 263G 315.1C 523a 524c 16093C 16223T 16278T 16362C 16519C | L3b1a | 73G 189R 263G 315.1C 523a 524c 16093C 16223T 16278T 16362C 16519C | L3b1a + @16124 | 100% | 100% | |
| HG01497 | 73G 263G 309.1C 309.2c 315.1c 498del 499A 524.1a 524.2c 16183c 16189C 16193.1c 16217C 16519C | B2d | 73G 263G 309.1C 309.2c 315.1c 498del 499A 524.1a 524.2c 16183c 16189C 16193.1c 16217C 16519C | B2d | 100% | 100% | |
| HG01498 | 73G 263G 307c 308c 309c 310c 498del 499A 16182c 16183c 16189C 16193.1c 16217C 16519C | B2d | 73G 263G 307c 308c 309c 310c 498del 499A 16182c 16183c 16189C 16193.1c 16217C 16519C | B2d | 100% | 100% | |
| HG01550 | 73G 263G 309.1C 309.2c 309.3c 310Y 315.1C 498del 499A 16182C 16183c 16189C 16193.1c 16217C 16519C | B2d | 73G 263G 309.1C 309.2c 309.3c 310Y 315.1C 498del 499A 16182c 16183c 16189C 16193.1c 16217C 16519C | B2d | 100% | 100% | |
| HG01551 | 73G 150T 263G 315.1C 523a 524c 16051G 16223T 16264T 16519C | L3e4a | 73G 150T 263G 315.1C 523a 524c 16051G 16223T 16264T 16519C | L3e4a | 100% | 100% | |
| HG01790 | 263G 309.1C 309.2C 315.1C | R0 | 263G 309.1C 309.2c 315.1C | H33a | 100% | 100% | |
| HG02190 | 73G 150T 263G 315.1C 489C 523a 524c 16172C 16182C 16183c 16189C 16193.1c 16223T 16362C 16519C | D5a2 | 73G 150T 263G 315.1C 489C 523a 524c 16172C 16182c 16183c 16189C 16193.1c 16223T 16362C 16519C | D5a2b | 100% | 100% | |
| HG02215 | 263G 315.1C 16311C 16519C | R0 | 263G 315.1C 16311C 16519C | H3m | 100% | 100% | |
| HG02236 | 214G 263G 315.1C 16172C 16519C | HV | 214G 263G 315.1C 16172C 16519C | H1 | 100% | 100% | |
| HG02238 | 263G 309.1C 309.2C 315.1C 16129A 16519C | H | 263G 309.1C 309.2c 315.1C 16129A 16519C | H1j1 | 100% | 100% | |
| HG02239 | 263G 292C 309.1C 315.1C 16519C | R0 | 263G 292C 309.1c 315.1C 16519C | H1 | 100% | 100% | |
| HG02317 | 73G 143A 146C 152C 195C 263G 309.1C 315.1C 16129A 16223T 16278T 16294T 16309G 16390A | L2a1c + 16129 | 73G 143A 146C 152C 195C 263G 309.1C 315.1C 16129A 16223T 16278T 16294T 16309G 16390A | L2a1c5 | 100% | 100% | |
| HG02322 | 73G 89C 93G 95C 152C 182T 186A 189C 236C 247A 263G 297G 315.1C 316A 523a 524c 16129A 16182C 16183c 16189C 16223T 16235G 16274A 16278T 16293G 16294T 16311C 16360T 16519C | L1c1a2 | 4044–4175, 7256–7367 | 73G 89C 93G 95C 152C 182T 186A 189C 236C 247A 263G 297G 315.1C 316A 523a 524c 16129A 16182c 16183c 16189C 16223T 16235G 16274A 16278T 16293G 16294T 16311C 16360T 16519C | L1c1a2 | 100% | 100% |
| HG02449 | 73G 150T 263G 273Y 309.1C 315.1C 523a 524c 16051G 16223T 16264T 16519C | L3e4a | 73G 150T 263G 273Y 309.1C 315.1C 523a 524c 16051G 16223T 16264T 16519C | L3e4a | 100% | 100% | |
| HG02450 | 73G 150T 195C 263G 309.1C 315.1C 499A 16223T 16320T 16399G 16519C | L3e2a1b1 | 73G 150T 195C 263G 309.1C 315.1C 499A 16223T 16320T 16399G 16519C | L3e2a1b1 | 100% | 100% | |
| HG02513 | 73G 249del 263G 309.1C 315.1C 521a 522c 523a 524c 16172C 16304C 16465T 16519C | F1a2a | 73G 249del 263G 309.1c 315.1C 521a 522c 523a 524c 16172C 16304C 16465T 16519C | F1a2a | 100% | 100% | |
| HG02521 | 73G 150T 263G 309.1c 315.1C 16111T 16129A 16223T 16257A 16261T | N9a1 | 73G 150T 263G 309.1c 315.1C 16111T 16129A 16223T 16257A 16261T | N9a1 | 100% | 100% | |
| HG03369 | 73G 150T 195C 263G 315.1C 16223T 16265T 16519C | L3e3 | 73G 150T 195C 263G 315.1C 16223T 16265T 16519C | L3e3b | 100% | 100% | |
| HG03370 | 73G 263G 315.1C 372C 523a 524c 16124C 16223T 16278T 16519C | L3 | 73G 263G 315.1C 372C 523a 524c 16124C 16223T 16278T 16519C | L3b1a | 100% | 100% | |
| HG03372 | 73G 150T 195C 263G 315.1C 16223T 16265T 16519C | L3e3 | 73G 150T 195C 263G 315.1C 16223T 16265T 16519C | L3e3b | 100% | 100% | |
| HG03577 | 73G 150T 195C 263G 309.1C 315.1C 16177G 16223T 16311C 16320T 16354T 16519C | L3e2 | 73G 150T 195C 263G 309.1C 315.1C 16177G 16223T 16311C 16320T 16354T 16519C | L3e2a | 100% | 100% | |
| HG03578 | 73G 146C 152C 195C 263G 315.1C 524.1a 524.2c 524.3a 524.4c 16223T 16233G 16278T 16294T 16309G 16368C 16390A 16519C | L2a1a1 | 73G 146C 152C 195C 263G 315.1C 524.1a 524.2c 524.3a 524.4c 16223T 16233G 16278T 16294T 16309G 16368C 16390A 16519C | L2a1a1 | 100% | 100% | |
| HG03583 | 73G 189C 195C 263G 315.1C 523del 524c 16126C 16179T 16215G 16223T 16256A 16284G 16311C | L3h1b1a | 73G 189C 195C 263G 315.1C 523del 524c 16126C 16179T 16215G 16223T 16256A 16284G 16311C | L3h1b1a | 100% | 100% | |
| HG03594 | 16T 73G 93G 188G 200G 204C 263G 309.1C 315.1C 489C 16153A 16223T 16287T 16327A 16519C | M91a | 16T 73G 93G 188G 200G 204C 263G 309.1C 315.1C 489C 16153A 16223T 16287T 16327A 16519C | M91a | 100% | 100% | |
| HG03595 | 41T 73G 153G 263G 309.1C 315.1C 489C 16223T 16234T 16295G 16311C 16320T 16519C | M | 41T 73G 153G 263G 309.1C 315.1C 489C 16223T 16234T 16295G 16311C 16320T 16519C | M49 | 100% | 100% | |
| HG03600 | 73G 195A 263G 315.1C 489C 523a 524c 16179del 16223T 16519C | M30 | 73G 195A 263G 315.1C 489C 523a 524c 16179del 16223T 16519C | M30d1 | 100% | 100% | |
| NA12812 | 44.1C 263G 309.1C 309.2C 315.1C 16093C 16129A 16183C 16189C 16193.1c 16519C | HV | 44.1C 263G 309.1C 309.2C 315.1C 16093C 16129A 16183C 16189C 16193.1c 16519C | H1 + 16189 | 100% | 100% | |
| NA12813 | 73G 263G 309.1C 315.1C 16114A 16129A 16189c 16192Y 16192.1t 16256T 16270T 16294T 16526A | U5a2a | 73G 263G 309.1C 315.1C 16114A 16129A 16189c 16192Y 16192.1t 16256T 16270T 16294T 16526A | U5a2a | 100% | 100% | |
| NA12814 | 73G 263G 315.1C 16192T 16256T 16270T 16291T 16399G | U5a1b1 | 73G 263G 315.1C 16192T 16256T 16270T 16291T 16399G | U5a1b1a2 | 100% | 100% | |
| NA12815 | 73G 263G 315.1C 16129A 16316G 16519C | H | 73G 263G 315.1C 16129A 16316G 16519C | H27 | 100% | 100% | |
| NA12872 | 263G 309.1C 309.2c 315.1C 16172C 16311C | HV | 263G 309.1C 309.2c 315.1C 16172C 16311C | HV6 | 100% | 100% | |
| NA12873 | 152C 195C 263G 309.1c 309.2c 315.1C 16293G 16311C 16525G | H11a6 | 152C 195C 263G 309.1c 309.2c 315.1C 16293G 16311C 16525G | H11a6 | 100% | 100% | |
| NA12874 | 73G 185A 188G 228A 263G 295T 309.1C 315.1C 462T 489C 16069T 16126C 16319A | J1c | 73G 185A 188G 228A 263G 295T 309.1C 315.1C 462T 489C 523a 16069T 16126C 16319A | J1c8a | 100% | 100% | |
| NA19240 | 73G 150T 152C 195C 263G 315.1C 16172C 16183c 16189C 16193.1c 16223T 16293T 16320T 16519C | L3e2b | 73G 150T 152C 195C 263G 315.1C 16172C 16183c 16189C 16193.1c 16223T 16293T 16320T 16519C | L3e2b5 | 100% | 100% | |
| NA20346 | 73G 150T 195C 263G 315.1C 16145A 16172C 16189c 16193.1c 16193.2c 16223T 16320T 16519C | L3e2b | 73G 150T 195C 263G 315.1C 16145A 16172C 16189c 16193.1c 16193.2c 16223T 16320T 16519C | L3e2b1a1 | 100% | 100% | |
| NA20356 | 73G 263G 309.1c 315.1C 16172C 16219G 16278T 16291Y 16519C | U6a | 73G 263G 309.1c 315.1C 16172C 16219G 16278T 16291Y 16519C | U6a5 | 100% | 100% | |
| NA20509 | 263G 309.1C 309.2C 309.3c 310Y 315.1C 523a 524c 16182C 16183c 16189C 16193.1c 16261T 16274A 16356C 16519C | H1b | 263G 309.1C 309.2C 309.3c 315.1C 523a 524c 16182c 16183c 16189C 16193.1c 16193.2c 16261T 16274A 16356C 16519C | H1b1 | 100% | 100% | |
| NA20510 | 73G 189G 195C 204C 207A 263G 315.1C 16192T 16223T 16309G 16325C 16519C | W6 | 73G 189G 195C 204C 207A 263G 315.1C 16192T 16223T 16309G 16325C 16519C | W6 | 100% | 100% | |
| NA20828 | 73G 263G 309.1c 315.1C 497T 524.1a 524.2c 524.3a 524.4c 16129A 16177G 16224C 16311C 16390A 16519C | K1a12a1a | 73G 263G 309.1c 315.1C 497T 524.1a 524.2c 524.3a 524.4c 16129A 16177G 16224C 16311C 16390A 16519C | K1a4f1 | 100% | 100% | |
| NA20832 | 146C 263G 309.1C 309.2c 315.1C 16519C | HV | 146C 263G 309.1C 309.2c 315.1C 16519C | H1n | 100% | 100% | |
| NA20845 | 73G 152C 263G 309.1c 315.1C 489C 16086C 16129A 16223T 16519C | M | 73G 152C 263G 309.1c 315.1C 489C 16086C 16129A 16223T 16519C | M5a2a | 100% | 100% | |
| NA21143 | 73G 146C 263G 309.1C 309.2c 315.1C 489C 16129A 16223T 16320T | M | 73G 146C 263G 309.1C 309.2c 315.1C 489C 16129A 16223T 16320T | M5c1 | 100% | 100% | |
| NA21144 | 73G 195C 263G 315.1C 16093C 16519C | R8 | 73G 195C 263G 315.1C 16093C 16519C | R8a1b | 100% | 100% |
1 See Section 2.13 regarding 310Y; 2 see Section 2.13 regarding 16189c.
Lineage distribution of 48,882 sequences addressed in ForenSeq mtDNA primer design [46].
| L Lineages “African”: hg | # | % | M Lineages “Asian”: hg | # | % | N Lineages “Eurasian”: hg | # | % |
|---|---|---|---|---|---|---|---|---|
| L3 | 2135 | 35.6% | M | 5250 | 50% | H | 9167 | 28% |
| L0 | 1500 | 25% | D | 2358 | 22% | U | 4231 | 13% |
| L2 | 1322 | 22% | C | 1651 | 16% | B | 4193 | 13% |
| L1 | 878 | 14.7% | E | 456 | 4% | J | 2319 | 7% |
| L4 | 105 | 1.8% | G | 437 | 4% | T | 2237 | 7% |
| L5 | 39 | 0.7% | Z | 191 | 2% | K | 1817 | 6% |
| L6 | 12 | 0.2% | Q | 177 | 2% | F | 1663 | 5% |
| Total | 5991 | 100% | Total | 10,520 | 100% | A | 1386 | 4% |
| Overall 12% | | | Overall 22% | | | R | 1077 | 3% |
| | | | | | | N | 785 | 2% |
| | | | | | | HV | 735 | 2% |
| | | | | | | I | 718 | 2% |
| | | | | | | V | 693 | 2% |
| | | | | | | W | 529 | 2% |
| | | | | | | X | 470 | 1% |
| | | | | | | P | 159 | 0.5% |
| | | | | | | Y | 135 | 0.4% |
| | | | | | | S | 49 | 0.2% |
| | | | | | | O | 8 | 0.02% |
| | | | | | | Total | 32,371 | 100% |
| | | | | | | Overall 66% | | |
Notes: “hg” denotes haplogroup, “#” is the total number of mtGenomes in the database for each category.
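The percentages in the lineage table follow directly from the counts. The short sketch below recomputes the within-group shares for the L lineages and the overall share of each macro-lineage in the 48,882 sequences. The dictionary layout and variable names are illustrative, and only the L-lineage counts are spelled out to keep the example short.

```python
# Counts copied from the lineage table.
l_lineage_counts = {
    "L3": 2135, "L0": 1500, "L2": 1322, "L1": 878,
    "L4": 105, "L5": 39, "L6": 12,
}
group_totals = {"L": 5991, "M": 10520, "N": 32371}

database_total = sum(group_totals.values())

# Share of each haplogroup within the L lineages, to one decimal place.
l_shares = {
    haplogroup: round(100 * count / group_totals["L"], 1)
    for haplogroup, count in l_lineage_counts.items()
}

# Share of each macro-lineage in the whole database, as a whole percent.
overall_shares = {
    group: round(100 * total / database_total)
    for group, total in group_totals.items()
}

print(database_total)
print(l_shares)
print(overall_shares)
```

The three group totals sum to the 48,882 sequences named in the caption, and the overall shares match the 12%, 22% and 66% rows.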
Figure 4. Critical reagents in PCR-based studies: potassium chloride concentration effect on variant detection and coverage in the control region and mtGenome multiplexes. Effects of varied KCl concentrations in the four mtDNA PCR buffers (mtPCR1 and mtPCR2 for each kit; x-axes in (a) and (b)) on variants detected (% relative to total; y-axis in (a)) and on bases with no coverage (% relative to control; y-axis in (b)) were assessed using 100 pg of HL60 positive control gDNA. 100% KCl is the titration control, and “Ctrl” is the commercial buffer lot of mtPCR1 and mtPCR2 for both multiplexes. Increased KCl relative to the control, and thus the manufactured standard, can contribute to data loss, as has been reported for forensic PCR systems generally.
Figure 5. Control region data visualization of Coriell gDNA sample HG03370 as displayed in the ForenSeq universal analysis software (UAS) coverage plot (position vs. coverage), with the position viewer visible in the center at position 0 (vertical line). Tiled amplicons are shown in variously colored horizontal brackets under the coverage plot, as are schematics of hypervariable regions I and II. Three black vertical rectangles show examples of “bat ears”, or regions of coverage generated between overlapping amplicons.
Figure 6. Species-specificity study: (a) basic local alignment search tool (BLAST) results for targeted human mtDNA amplicon (query) and porcine genome coordinates (subject, 3427–3506); (b) UAS screenshot of variant positions between the rCRS and the one amplicon detected in the replicate porcine libraries.
Figure 7. Stability studies: chemical insults and coverage with control region and mtGenome multiplexes. Stability studies were conducted, as described in Materials and Methods, by assessing the effects of the PCR inhibitors calcium (two concentrations), humic acid (two concentrations) and 10 ng E. coli DNA (x-axis) on average coverage (%, y-axis) of mitochondrial DNA amplification using each ForenSeq mtDNA multiplex and 100 pg of HL60 positive control DNA. Untreated HL60 DNA is shown at the far left, where complete coverage and 100% call rates were observed for each kit. Solid black bars and solid gray bars indicate % bases detected (coverage) above the default analytical threshold in the ForenSeq UAS; hatched black bars and hatched gray bars indicate variant call rates.