Brian Young1, Jonathan L King2, Bruce Budowle2,3, Luigi Armogida1.
Abstract
Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described.
Year: 2017 PMID: 28542338 PMCID: PMC5436856 DOI: 10.1371/journal.pone.0178005
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Illustration of locus definitions as defined by flanking sequence landmarks (FSLs).
The genomic sub-sequence between GRCh38/hg38 positions chr2:68,011,892 and chr2:68,012,004 includes the forensic STR locus D2S441 and SNP locus rs74640515. Flanking sub-sequences can be used as landmarks to delimit and define sub-segments of PCR amplicons for forensic analysis. FSLs of 10 nucleotides are indicated in underlined font. FSLs need not be unique in the genome; they need only be unique within the portion of the genome sequenced in an assay. Three separate locus definitions are illustrated: (a) the polymorphic repeat region of the STR locus D2S441, (b) the polymorphic SNP locus rs74640515, and (c) the haplotype sub-segment including both D2S441 and rs74640515. FSLs may be set equal to the PCR binding sites, making the delimited locus the entire PCR amplicon.
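The landmark idea in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `extract_locus`, the two 10-nucleotide landmarks, and the toy read are all hypothetical, and the sketch assumes the landmarks themselves are excluded from the returned locus.

```python
from typing import Optional


def extract_locus(read: str, left_fsl: str, right_fsl: str) -> Optional[str]:
    """Return the sub-sequence between two flanking landmarks, or None.

    Assumption: the landmarks are excluded from the returned locus and the
    right landmark is searched for only downstream of the left one.
    """
    start = read.find(left_fsl)
    if start == -1:
        return None  # left landmark absent from this read
    start += len(left_fsl)
    end = read.find(right_fsl, start)
    if end == -1:
        return None  # right landmark absent downstream of the left one
    return read[start:end]


# Hypothetical 10-nt landmarks and a toy read (not the real D2S441 sequence).
LEFT = "ACGTTGCAAG"
RIGHT = "GGATCCTTAC"
read = "TT" + LEFT + "TCTATCTATCTA" + RIGHT + "AA"
locus = extract_locus(read, LEFT, RIGHT)
```

Reads that lack either landmark return `None` and would simply be left out of the analysis for that locus definition.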
Frequency spectrum of the exhaustive and mutually exclusive set of sequence types generated for simple and compound repeat loci D5S818 and D12S391, respectively, observed in sample T36814.
The number of individual reads (tokens) comprising each type is indicated by N, and counts of types one repeat motif shorter than either allele are highlighted in bold font. For ease of reading, selected repeat motifs are bracketed, with the number following the bracket indicating the number of tandem repeats.
| Type Category | D5S818 | D12S391 | ||
|---|---|---|---|---|
| N | Sequence | N | Sequence | |
| Allele | 381 | 542 | ||
| 294 | 377 | |||
| Molecular | 9 | |||
| 2 | ||||
| 9 | ||||
| 6 | ||||
| 3 | ||||
| 3 | ||||
| 3 | ||||
| 3 | ||||
| Background Noise | 2 | 2 | ||
| 2 | 2 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | 1 | |||
| 1 | ||||
| 1 | ||||
| 1 | ||||
| 1 | ||||
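A frequency spectrum of this kind, in which every read (token) at a locus is assigned to exactly one sequence type, can be tabulated with a simple counter. The sketch below is illustrative only: the function name `frequency_spectrum` and the toy token strings are assumptions, not data from sample T36814.

```python
from collections import Counter
from typing import List, Tuple


def frequency_spectrum(tokens: List[str]) -> List[Tuple[str, int]]:
    """Group identical locus sequences into types and count tokens per type.

    Every token falls into exactly one type, so the types are exhaustive
    and mutually exclusive. Types are returned from most to least frequent.
    """
    return Counter(tokens).most_common()


# Toy tokens (hypothetical sequences): two abundant types and one singleton.
tokens = ["AGATAGATAGAT"] * 3 + ["AGATAGAT"] * 2 + ["AGATAGGTAGAT"]
spectrum = frequency_spectrum(tokens)
```

Assigning each type to a category (allele, molecular artifact, or background noise) is a separate step that depends on the known genotype and the stutter model.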
Fig 2. Per-locus analytical threshold (AT) values calculated from the range of observed noise by each of two methods.
AT values by method A (blue boxes) are set at twice the range of observed per-locus noise intensities, where per-locus noise is defined as responses not attributable to alleles or molecular artifacts (stutter). AT values by method B (orange boxes) are set using the same formula, but where per-locus noise is defined as responses not attributable to authentic alleles or N-1 molecular artifacts. Boxes summarize data across 4 DNA samples, and the plot Y-axis is truncated at 100 read counts for purposes of readability.
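The two noise definitions can be compared with a short sketch. It assumes that "range" means the maximum minus the minimum read count among noise types at a locus; the category labels, function names, and read counts are hypothetical and chosen only to show how the two methods diverge.

```python
from typing import Dict, List


def noise_range_at(noise_counts: List[int], multiplier: int = 2) -> int:
    """AT as a multiple of the noise range (assumed here to be max minus min)."""
    if not noise_counts:
        return 0
    return multiplier * (max(noise_counts) - min(noise_counts))


def locus_at(counts: Dict[str, List[int]], method: str) -> int:
    """Compute a per-locus AT under noise definition A or B.

    Method A: noise excludes alleles and all molecular artifacts.
    Method B: noise excludes alleles and N-1 artifacts only, so the
    remaining artifacts are pooled with the noise.
    """
    noise = list(counts.get("noise", []))
    if method == "B":
        noise += counts.get("other_artifact", [])
    elif method != "A":
        raise ValueError("method must be 'A' or 'B'")
    return noise_range_at(noise)


# Hypothetical read counts per type at one locus.
toy = {
    "allele": [400, 300],
    "n_minus_1": [40, 35],
    "other_artifact": [9, 6, 3],
    "noise": [2, 2, 1, 1, 1],
}
at_a = locus_at(toy, "A")  # noise range is 2 - 1
at_b = locus_at(toy, "B")  # noise range is 9 - 1
```

Because Method B pools the non-N-1 artifacts with the noise, its range, and therefore its AT, can only be equal to or larger than that of Method A for the same locus.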
Fig 3. Comparison of analytical threshold (AT) values calculated from noise with ATs calculated as a percentage of allele coverage (signal).
AT values by method A (blue boxes) are set at twice the range of observed per-locus noise intensities, where per-locus noise is defined as responses not attributable to alleles or molecular artifacts (stutter). AT values by method B (orange boxes) are set using the same formula, but where per-locus noise is defined as responses not attributable to alleles or N-1 molecular artifacts. The black line represents AT values set as a constant 1.5% of allele coverage. In all cases, AT values are converted to percentages of the average allele coverage on a per-locus basis.
Fig 4. Effect of read coverage on analytical thresholds (AT) calculated by three different methods.
Each data point represents a single sample-locus combination. AT values calculated as a fixed 1.5% of total read coverage increase linearly with read coverage (black diamonds) and are calculated only for instances with at least 650 total reads, the minimum specified by the Illumina ForenSeq protocol. At this minimum, 1.5% corresponds to a noise level of about 10 reads, which for purposes of comparison has been used as the minimum AT value for all three methods. AT values based on background noise defined as the residual after removal of alleles and all stutter artifacts (Method A) are less sensitive to locus coverage (blue discs). An AT of 10 reads is generally sufficient for instances with coverages below 5,000 reads. AT values based on background noise defined as the residual after removal of alleles and N-1 stutter artifacts (Method B) trend upward with increasing locus coverage. Both Methods A and B are also applied to instances with fewer than 650 reads.
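The arithmetic behind the percentage-based threshold can be made concrete with a small sketch. The function name `percent_at` and its parameters are hypothetical; the 1.5% fraction, the 650-read minimum, and the 10-read floor are taken from the caption.

```python
from typing import Optional


def percent_at(total_reads: int,
               fraction: float = 0.015,
               min_reads: int = 650,
               floor: float = 10) -> Optional[float]:
    """AT as a fixed fraction of total read coverage, with a 10-read floor.

    Returns None below the minimum coverage, where the percentage-based
    threshold is not calculated.
    """
    if total_reads < min_reads:
        return None
    return max(floor, fraction * total_reads)


at_minimum = percent_at(650)   # 1.5% of 650 is 9.75, so the floor applies
at_high = percent_at(5000)     # 1.5% of 5,000 reads
at_too_low = percent_at(649)   # below the minimum coverage
```

The threshold grows in direct proportion to coverage, which is the linear trend of the black diamonds in the figure.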