Halimureti Simayijiang, Niels Morling, Claus Børsting.
Abstract
Massively parallel sequencing (MPS) offers a useful alternative to capillary electrophoresis (CE) based analysis of human identification markers in forensic genetics. By sequencing short tandem repeats (STRs) instead of determining the fragment lengths by CE, the sequence variation within the repeat region and the flanking regions may be identified. In this study, we typed 264 Uyghur individuals using the MiSeq FGx™ Forensic Genomics System and Primer Mix A of the ForenSeq™ DNA Signature Prep Kit that amplifies 27 autosomal STRs, 25 Y-STRs, seven X-STRs, and 94 HID-SNPs. STRinNGS v.1.0 and GATK 3.6 were used to analyse the STR regions and HID-SNPs, respectively. Increased allelic diversity was observed for 33 STRs with the PCR-MPS assay. The largest increases were found in DYS389II and D12S391, where the numbers of sequenced alleles were 3-4 times larger than those of alleles determined by repeat length alone. A relatively large number of flanking region variants (28 SNPs and three InDels) were observed in the Uyghur population. Seventeen of the flanking region SNPs were rare, and 12 of these SNPs had no accession number in dbSNP. The combined mean match probability and typical paternity index based on 26 sequenced autosomal STRs were 3.85E-36 and 1.49E+16, respectively. This was 10 000 times lower and 1 000 times higher, respectively, than the same parameters calculated from STR repeat lengths.

Key Points
- Sequencing data on STRs and SNPs used for human identification are presented for the Uyghur population.
- STRinNGS v.1.0 was used to analyse the flanking regions of STRs.
- The concordance between PCR-CE and PCR-MPS results was 99.86%.
- Detection of sequence variation in STRs and their flanking regions increased the allelic diversity.
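A combined match probability of this kind is conventionally the product of per-locus match probabilities, treating the loci as independent. The sketch below illustrates that idea and why splitting a length-based allele into sequence-based alleles lowers the value. It assumes Hardy-Weinberg proportions, the allele frequencies are hypothetical, and the function names are illustrative; it is not the study's own calculation.

```python
from itertools import combinations
from math import prod


def locus_match_probability(allele_freqs):
    """Sum of squared genotype frequencies at one locus.

    Assumes Hardy-Weinberg proportions: homozygotes p_i^2,
    heterozygotes 2 * p_i * p_j.
    """
    homozygotes = sum(p ** 4 for p in allele_freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return homozygotes + heterozygotes


def combined_match_probability(loci):
    """Product of per-locus match probabilities (loci assumed independent)."""
    return prod(locus_match_probability(freqs) for freqs in loci)


# Hypothetical locus: length-based typing distinguishes two alleles ...
length_based = [0.5, 0.5]
# ... while sequencing splits one of them into two sequence variants.
sequence_based = [0.5, 0.25, 0.25]

print(locus_match_probability(length_based))
print(locus_match_probability(sequence_based))
print(combined_match_probability([sequence_based, sequence_based]))
```

The sequence-based locus gives the smaller match probability, which is the direction of the effect reported in the abstract.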
Keywords: ForenSeq™ DNA Signature Prep Kit; Forensic sciences; Uyghur; forensic genetics; massively parallel sequencing (MPS); short tandem repeat (STR); single nucleotide polymorphism (SNP)
Year: 2020 PMID: 35784409 PMCID: PMC9246034 DOI: 10.1080/20961790.2020.1779967
Source DB: PubMed Journal: Forensic Sci Res ISSN: 2471-1411
Figure 1. Box-and-whisker plot of the heterozygote balance for each STR locus. Outliers are indicated by dots. X-STRs and Y-STRs are indicated in grey and dark grey, respectively.
Number of observed alleles in the Uyghur population.
| STR locus | Chromosome | Number of alleles based on the number of repeats | Number of alleles based on the STR sequence variation | Number of alleles based on the sequence variation in the STR and flanks | Increase in number of alleles (%) |
|---|---|---|---|---|---|
| D12S391 | 12 | 17 | 55 | 55 | 234 |
| D21S11 | 21 | 15 | 41 | 42 | 180 |
| D2S1338 | 2 | 14 | 34 | 34 | 143 |
| D3S1358 | 3 | 7 | 17 | 17 | 143 |
| D16S539 | 16 | 8 | 10 | 18 | 125 |
| D7S820 | 7 | 9 | 12 | 20 | 122 |
| vWA | 12 | 8 | 17 | 17 | 113 |
| D5S818 | 5 | 7 | 8 | 14 | 100 |
| D13S317 | 13 | 8 | 8 | 16 | 100 |
| D8S1179 | 8 | 11 | 21 | 21 | 91 |
| D9S1122 | 9 | 8 | 15 | 15 | 88 |
| D2S441 | 2 | 11 | 16 | 20 | 82 |
| D20S482 | 20 | 9 | 9 | 13 | 44 |
| D1S1656 | 1 | 15 | 21 | 21 | 40 |
| D4S2408 | 4 | 6 | 8 | 8 | 33 |
| D17S1301 | 17 | 10 | 13 | 13 | 30 |
| D6S1043 | 6 | 18 | 21 | 22 | 22 |
| CSF1PO | 5 | 7 | 8 | 8 | 14 |
| D18S51 | 18 | 17 | 19 | 19 | 12 |
| FGA | 4 | 19 | 21 | 21 | 11 |
| Penta D | 21 | 11 | 11 | 12 | 9 |
| D19S433 | 19 | 14 | 14 | 15 | 7 |
| Penta E | 15 | 20 | 21 | 21 | 5 |
| TPOX | 2 | 7 | 7 | 7 | 0 |
| D10S1248 | 10 | 9 | 9 | 9 | 0 |
| TH01 | 11 | 7 | 7 | 7 | 0 |
| DXS10135 | X | 23 | 61 | 67 | 191 |
| DXS10074 | X | 13 | 16 | 19 | 46 |
| DXS7132 | X | 8 | 8 | 11 | 38 |
| DXS8378 | X | 7 | 7 | 7 | 0 |
| DXS7423 | X | 5 | 5 | 5 | 0 |
| HPRTB | X | 9 | 9 | 9 | 0 |
| DYS389 | Y | 6 | 25 | 25 | 317 |
| | Y | 3 | 3 | 3 | 0 |
| | Y | 5 | 12 | 12 | 140 |
| DYF387S1 | Y | 10 | 33 | 33 | 230 |
| DYS460/461 | Y | 7 | 17 | 19 | 171 |
| | Y | 6 | 6 | 6 | 0 |
| | Y | 4 | 4 | 6 | 50 |
| DYS448 | Y | 6 | 13 | 15 | 150 |
| DYS481 | Y | 10 | 15 | 21 | 110 |
| DYS390 | Y | 6 | 11 | 12 | 100 |
| DYS612 | Y | 9 | 15 | 16 | 78 |
| DYS437 | Y | 4 | 6 | 7 | 75 |
| DYS635 | Y | 9 | 13 | 13 | 44 |
| DYS438 | Y | 6 | 7 | 8 | 33 |
| DYS19 | Y | 5 | 6 | 6 | 20 |
| Y-GATA-H4 | Y | 5 | 5 | 6 | 20 |
| DYS643 | Y | 6 | 7 | 7 | 17 |
| DYS570 | Y | 8 | 8 | 8 | 0 |
| DYS385a-b | Y | 14 | 14 | 14 | 0 |
| DYS391 | Y | 4 | 4 | 4 | 0 |
| DYS392 | Y | 7 | 7 | 7 | 0 |
| DYS439 | Y | 6 | 6 | 6 | 0 |
| DYS505 | Y | 7 | 7 | 7 | 0 |
| DYS522 | Y | 6 | 6 | 6 | 0 |
| DYS533 | Y | 5 | 5 | 5 | 0 |
| DYS549 | Y | 4 | 4 | 4 | 0 |
| DYS576 | Y | 8 | 8 | 8 | 0 |
(a) Loci with variations in the flanking regions.
(b) The DYS389 locus consists of two compound STR regions. The marker conventionally referred to as DYS389II includes both STRs.
(c) Conventionally referred to as DYS389I.
(d) The amplicons included two STRs: DYS460 and DYS461.
(e) Referred to as DYS461.
(f) Referred to as DYS460.
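The last column of the table appears to be the relative increase from the repeat-length allele count to the count that includes STR and flanking sequence variation. The sketch below assumes that formula (it is inferred from the table, not stated in this excerpt) and applies it to four rows copied from the table; `percent_increase` is an illustrative name.

```python
def percent_increase(n_repeat_alleles, n_sequence_alleles):
    """Relative increase (%) from length-based to sequence-based allele counts."""
    return 100.0 * (n_sequence_alleles - n_repeat_alleles) / n_repeat_alleles


# (alleles by repeat number, alleles by STR + flank sequence), taken from the table
rows = {
    "D21S11": (15, 42),
    "DXS10135": (23, 67),
    "DYS448": (6, 15),
    "TPOX": (7, 7),
}

for locus, (by_length, by_sequence) in rows.items():
    print(f"{locus}: {percent_increase(by_length, by_sequence):.0f}%")
```

For these four loci the printed values are 180%, 191%, 150% and 0%, matching the table's last column.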
SNPs and InDels observed in the STR flanking regions.
| Locus | STR locus | Upstream/downstream | Chromosome | Position (a) | Most frequent allele | Least frequent allele | Minor allele frequency |
|---|---|---|---|---|---|---|---|
| rs74640515 | D2S441 | Upstream | 2 | 68,239,054 | G | A | 0.080 |
| 68239142G > A | D2S441 | Downstream | 2 | 68,239,142 | G | A | 0.002 |
| rs73801920 | D5S818 | Upstream | 5 | 123,111,246 | C | A | 0.165 |
| 92449928C > T | D6S1043 | Upstream | 6 | 92,449,928 | C | T | 0.004 |
| rs7789995 | D7S820 | Upstream | 7 | 83,789,520 | T | A | 0.068 |
| del:83789519 | D7S820 | Upstream | 7 | 83,789,519 | T | del:83,789,519 | 0.002 |
| rs16887642 | D7S820 | Downstream | 7 | 83,789,602 | G | A | 0.157 |
| rs75219269 | vWA | Upstream | 12 | 6,093,136 | A | G | 0.127 |
| rs73250432 | D13S317 | Upstream | 13 | 82,722,135 | C | T | 0.008 |
| rs9546005 | D13S317 | Downstream | 13 | 82,722,204 | T | A | 0.508 |
| rs202043589 | D13S317 | Downstream | 13 | 82,722,208 | A | T | 0.034 |
| rs11642858 | D16S539 | Downstream | 16 | 86,386,367 | A | C | 0.288 |
| 86386297A > G | D16S539 | Upstream | 16 | 86,386,297 | A | G | 0.002 |
| rs563997442 | D16S539 | Upstream | 16 | 86,386,298 | C | G | 0.009 |
| rs745607776 | D19S433 | Upstream | 19 | 30,417,136-7 | CT | del:30,417,136-7 | 0.004 |
| rs77560248 | D20S482 | Upstream | 20 | 4,506,326 | C | T | 0.070 |
| rs561985213 | D20S482 | Upstream | 20 | 4,506,327 | G | A | 0.002 |
| 20554419C > T | D21S11 | Downstream | 21 | 20,554,419 | C | T | 0.002 |
| rs186259515 | Penta D | Downstream | 21 | 45,056,154 | A | G | 0.004 |
| 64655583C > T | DXS7132 | Downstream | X | 64,655,583 | C | T | 0.020 |
| rs56195635 | DXS10074 | Upstream | X | 66,977,164 | C | G | 0.002 |
| rs771349963 | DXS10074 | Upstream | X | 66,977,180 | G | A | 0.012 |
| del:9306454-6 | DXS10135 | Downstream | X | 9,306,454-6 | AGA | del:9,306,454-6 | 0.047 |
| rs368663163 | DYS481 | Upstream | Y | 8,426,362 | G | A | 0.079 |
| 14467152G > A | DYS437 | Downstream | Y | 14,467,152 | G | A | 0.008 |
| 14937880A > C | DYS438 | Downstream | Y | 14,937,880 | A | C | 0.016 |
| rs758940870 | DYS390 | Downstream | Y | 17,275,043 | T | C | 0.016 |
| 15752741T > C | DYS612 | Downstream | Y | 15,752,741 | T | C | 0.008 |
| 21050775T > C | DYS460 | Upstream | Y | 21,050,775 | T | C | 0.008 |
| 21050824T > A | DYS460 | Upstream | Y | 21,050,824 | T | A | 0.008 |
| 24365062A > G | DYS448 | Upstream | Y | 24,365,062 | A | G | 0.032 |
| 18743636A > G | Y-GATA-H4 | Downstream | Y | 18,743,636 | A | G | 0.008 |
(a) Positions in the GRCh37 genome build.
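A minor allele frequency is the number of observations of the less common allele divided by the number of chromosomes typed. The sketch below is a rough plausibility check, assuming all 264 individuals were typed at an autosomal locus (528 chromosomes). The observation counts are hypothetical, since the study's actual counts are not given in this excerpt.

```python
def minor_allele_frequency(minor_count, n_chromosomes):
    """Fraction of typed chromosomes that carry the minor allele."""
    if n_chromosomes <= 0:
        raise ValueError("n_chromosomes must be positive")
    return minor_count / n_chromosomes


N_INDIVIDUALS = 264                        # sample size stated in the abstract
AUTOSOMAL_CHROMOSOMES = 2 * N_INDIVIDUALS  # assumes complete typing at the locus

# Hypothetical counts: one and two observations of a rare flanking variant.
for count in (1, 2):
    maf = minor_allele_frequency(count, AUTOSOMAL_CHROMOSOMES)
    print(f"{count} observation(s): MAF = {maf:.3f}")
```

Under these assumptions one and two observations round to 0.002 and 0.004, the two smallest autosomal frequencies in the table. The denominators differ for X and Y markers, because males carry one copy of each.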