David H Warshauer, Jennifer D Churchill, Nicole Novroski, Jonathan L King, Bruce Budowle.
Abstract
Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.
Keywords: Allele variants; Massively parallel sequencing; Nextera; STRait Razor; Sequence polymorphism; Y-STR
Year: 2015 PMID: 26391384 PMCID: PMC4610967 DOI: 10.1016/j.gpb.2015.08.001
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Nominal allele sequence variants that differ from the published sequences
| Locus | Reference motif | Allele | Observed repeat motif | Coverage | AFA | CAU | HIS | Variant type | Haplogroup |
|---|---|---|---|---|---|---|---|---|---|
| DYS389I | [TCTG]3[TCTA] | 9 | [TCTA]9 | 60 | 0 | 1 | 0 | RPV | R1b |
| DYS389II | [TCTG] | 29 | [TCTG]6[TCTA]10N48[TCTG]3[TCTA]10 | 25 | 0 | 0 | 1 | RPV | E1b1b |
| | | | [TCTG]6[TCTA]11N48[TCTG]3[TCTA]9 | 6 | 1 | 0 | 0 | RPV | E1b1a |
| | | 30 | [TCTG]6[TCTA]11N48[TCTG]3[TCTA]10 | 5–29 | 1 | 0 | 0 | RPV | E1b1a |
| | | 31 | [TCTG]6[TCTA]11N48[TCTG]3[TCTA]11 | 8 | 0 | 1 | 0 | RPV | E1b1a |
| | | 32 | [TCTG]6[TCTA]13N48[TCTG]3[TCTA]10 | 6 | 1 | 0 | 0 | RPV | E1b1b |
| DYS390 | [TCTG]8[TCTA] | 21 | [TCTG]8[TCTA]8[TCTG]1[TCTA]4 | 18–188 | 1 | 0 | 0 | RPV | E1b1a |
| | | | [TCTG]8[TCTA]9[TCTG]1[TCTA]3 | 72 | 1 | 0 | 0 | RPV | E1b1b |
| DYS393 | [AGAT] | 13 | [ | 59 | 0 | 1 | 0 | A/C SNP | R1a |
| DYS481 | [CTT] | 25 | [CT | 413 | 0 | 1 | 0 | T/G SNP | I2a |
| | | 26 | [CT | 211 | 0 | 1 | 0 | T/G SNP | E1b1a |
| DYS518 | [AAAG]3[GAAG]1[AAAG] | 36 | [AAAG]3[GAAG]1[AAAG]14[GGAG]1[AAAG]4N6[AAAG]13 | 31 | 0 | 0 | 1 | RPV | G2a |
| | | 37 | [AAAG]3[GAAG]1[AAAG]16[GGAG]1[AAAG]4N6[AAAG]12 | 13 | 0 | 1 | 0 | RPV | R1b |
| | | 38 | [AAAG]3[GAAG]1[AAAG]14[GGAG]1[AAAG]4N6[AAAG]15 | 44 | 0 | 0 | 1 | RPV | J2a |
| | | | [AAAG]3[GAAG]1[AAAG]15[GGAG]1[AAAG]4N6[AAAG]14 | 10–68 | 2 | 2 | 1 | RPV | E1b1a, I2a, J2b, R1b |
| | | 39 | [AAAG]3[GAAG]1[AAAG]18[GGAG]1[AAAG]4N6[AAAG]12 | 26 | 0 | 0 | 1 | RPV | I2b |
| | | 40 | [AAAG]3[GAAG]1[AAAG]18[GGAG]1[AAAG]4N6[AAAG]13 | 22 | 1 | 0 | 0 | RPV | E1b1a |
| | | 41 | [AAAG]3[GAAG]1[AAAG]16[GGAG]1[AAAG]4N6[AAAG]16 | 22 | 0 | 1 | 0 | RPV | R1a |
| DYS635 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA] | 23 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA]3[TCTA]8 | 247 | 0 | 0 | 1 | RPV | R1b |
Note: n, p, and q represent number of individual repeats per short tandem repeat unit. AFA, African American; CAU, Caucasian; HIS, Hispanic; RPV, repeat pattern variant. Reference motifs are based on sequences provided in STRBase (http://www.cstl.nist.gov/strbase/ystr_fact.htm) and those published by D’Amato and colleagues [8]. SNP in the observed repeat motif is underlined.
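The bracketed notation in the table can be read mechanically. As a minimal sketch (the helper names `parse_motif`, `nominal_allele`, and `expand` are hypothetical and are not part of STRait Razor), the Python below parses an observed motif, sums its repeat counts, and expands it to a nucleotide string. It assumes that tokens such as `N48` denote non-repeat spacers of that many bases that do not count toward the nominal allele, which is consistent with the complete rows above. Truncated entries such as `[CT` are rejected rather than guessed.

```python
import re

# A token is either a bracketed repeat block "[UNIT]count" or a spacer "Ncount".
TOKEN = re.compile(r"\[([ACGT]+)\](\d+)|N(\d+)")


def parse_motif(motif):
    """Split a motif string into (unit, count) pairs; spacers become ("N", length)."""
    tokens, pos = [], 0
    for match in TOKEN.finditer(motif):
        if match.start() != pos:  # unparsed characters between tokens
            break
        unit, count, spacer = match.groups()
        tokens.append((unit, int(count)) if unit else ("N", int(spacer)))
        pos = match.end()
    if pos != len(motif):
        raise ValueError(f"cannot parse {motif!r} at position {pos}")
    return tokens


def nominal_allele(motif):
    """Sum the repeat counts, skipping spacers (assumption: spacers are not counted)."""
    return sum(count for unit, count in parse_motif(motif) if unit != "N")


def expand(motif):
    """Write the motif out as a plain string; spacer bases are emitted as 'N'."""
    return "".join(unit * count for unit, count in parse_motif(motif))


# The two DYS390 allele 21 variants from the table above.
a = "[TCTG]8[TCTA]8[TCTG]1[TCTA]4"
b = "[TCTG]8[TCTA]9[TCTG]1[TCTA]3"
print(nominal_allele(a), nominal_allele(b))  # 21 21
print(len(expand(a)) == len(expand(b)))      # True: same length
print(expand(a) == expand(b))                # False: different sequence
```

The two DYS390 variants have the same length and the same nominal allele, so a length-based assay could not separate them, but their expanded sequences differ.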
Novel allele sequence variants
| Locus | Reference motif | Allele | Observed repeat motif | Coverage | AFA | CAU | HIS | Variant type | Haplogroup |
|---|---|---|---|---|---|---|---|---|---|
| DYS449 | [TTTC] | 25 | [TTTC]11N50[TTTC]14 | 10 | 0 | 0 | 1 | RPV | J1 |
| DYS505 | [TCCT] | 11 | [TCCT]11 | 28–55 | 1 | 2 | 5 | RPV | E1b1b, G2a, I1, O/Q, R1b |
| | | 14 | [TCCT]14 | 24 | 1 | 0 | 0 | RPV | E1b1a |
| DYS533 | [ATCT] | 9 | [ATCT]9 | 113 | 0 | 0 | 1 | RPV | G2a |
| | | 11 | [ATCT]11 | 8–629 | 4 | 5 | 4 | RPV | E1b1a, E1b1b, I1, J2a, O/Q, R1b |
| | | 13 | [ATCT]13 | 83–458 | 1 | 1 | 2 | RPV | R1b |
| | | 14 | [ATCT]14 | 129 | 0 | 1 | 0 | RPV | R1b |
| DYS549 | [GATA] | 10 | [GATA]10 | 362–402 | 1 | 1 | 0 | RPV | E1b1a, I2a |
| | | 11 | [GATA]11 | 15–390 | 5 | 0 | 1 | RPV | E1b1a, E1b1b |
| DYS570 | [TTTC] | 23 | [TTTC]5[T | 192 | 0 | 0 | 1 | T/C SNP | E1b1b |
| DYS576 | [AAAG] | 13 | [AAAG]13 | 360 | 1 | 0 | 0 | RPV | E1b1a |
| | | 22 | [AAAG]22 | 149 | 0 | 0 | 1 | RPV | R1b |
| DYS612 | [CCT]5[CTT]1[TCT]4[CCT]1[TCT] | 35 | [CCT]5[CTT]1[TCT]4[CCT]1[TCT]17[ | 122 | 0 | 0 | 1 | T/C SNP | J2b |
| DYS635 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA] | 24 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA]2[TCTA]10 | 9 | 1 | 0 | 0 | RPV | R1b |
| | | 25 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA]2[TCTA]11 | 23–28 | 0 | 1 | 0 | RPV | R1b |
| | | 26 | [TCTA]4[TGTA]2[TCTA]2[TGTA]2[TCTA]2[TGTA]2[TCTA]12 | 13 | 1 | 0 | 0 | RPV | R1b |
| DYS643 | [CTTTT] | 8 | [CTTTT]8 | 395 | 0 | 0 | 1 | RPV | J2a |
| | | 14 | [CTTTT]14 | 34 | 1 | 0 | 0 | RPV | E1b1a |
Note: n and p represent number of individual repeats per short tandem repeat unit. AFA, African American; CAU, Caucasian; HIS, Hispanic; RPV, repeat pattern variant. Reference motifs are based on sequences provided in STRBase (http://www.cstl.nist.gov/strbase/ystr_fact.htm) and those published by D’Amato and colleagues [8] and Butler and colleagues [19]. SNP in the observed repeat motif is underlined.
Figure 1. STRait Razor algorithm for detection of STR alleles
The repeat region is shown in bold, capitalized font, while the flanking regions are shown in plain, lowercase font. Surrounding sequences are shown in plain, capitalized font.
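The caption describes a repeat region bracketed by flanking sequences. A minimal sketch of that general idea, flank-anchored extraction, might look as follows in Python. It is not STRait Razor's actual implementation: the function names, the flanking sequences, and the read are all made up for illustration, and the sketch assumes exact flank matches and a fixed repeat unit length.

```python
def extract_repeat_region(read, flank5, flank3):
    """Return the bases between the 5' and 3' flanks, or None if a flank is absent."""
    read, flank5, flank3 = read.upper(), flank5.upper(), flank3.upper()
    start = read.find(flank5)
    if start == -1:
        return None
    start += len(flank5)                # repeat region begins after the 5' flank
    end = read.find(flank3, start)      # search for the 3' flank downstream only
    if end == -1:
        return None
    return read[start:end]


def length_based_allele(region, unit_len):
    """Call an allele from length alone: (complete repeat units, leftover bases)."""
    return divmod(len(region), unit_len)


# Hypothetical flanks and read (illustration only, not real locus sequences).
flank5, flank3 = "ccgttacg", "ttgacgga"
read = "AC" + flank5 + "GATA" * 10 + flank3 + "GT"

region = extract_repeat_region(read, flank5, flank3)
print(region == "GATA" * 10)            # True
print(length_based_allele(region, 4))   # (10, 0)
```

Keeping the extracted `region` string, rather than only its length, is what allows alleles of the same length to be compared by sequence.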