
A 124-plex Microhaplotype Panel Based on Next-generation Sequencing Developed for Forensic Applications.

Jing-Bo Pang1,2,3, Min Rao1,2,3, Qing-Feng Chen1,2, An-Quan Ji1,2,3, Chi Zhang1,2, Ke-Lai Kang1,2, Hao Wu1,2, Jian Ye4,5, Sheng-Jie Nie6, Le Wang7,8,9.   

Abstract

Microhaplotypes are an emerging type of forensic genetic marker that are expected to support multiple forensic applications. Here, we developed a 124-plex panel for microhaplotype genotyping based on next-generation sequencing (NGS). The panel yielded sequencing data that were balanced both within and between loci, with a high percentage of effective reads. A full genotype was determined with as little as 0.1 ng of input DNA. Parallel mixture experiments and in-depth comparative analyses were performed with capillary-electrophoresis-based short tandem repeat (STR) and NGS-based microhaplotype genotyping, and demonstrated that microhaplotypes are far superior to STRs for mixture deconvolution. DNA from Han Chinese individuals (n = 256) was sequenced with the 124-plex panel. In total, 514 alleles were observed, and the forensic genetic parameters were calculated. A comparison of the forensic parameters for the 20 microhaplotypes with the top Ae values in the 124-plex panel and 20 commonly used forensic STRs showed that these microhaplotypes were as effective as STRs in identifying individuals. A linkage disequilibrium analysis showed that 106 of the 124 microhaplotypes were inherited independently, and the combined match probability for these 106 microhaplotypes was 5.23 × 10^-66. We conclude that this 124-plex microhaplotype panel is a powerful tool for forensic applications.

Year:  2020        PMID: 32029845      PMCID: PMC7004988          DOI: 10.1038/s41598-020-58980-x

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

The microhaplotype is a powerful new type of forensic genetic marker[1,2]. It is the combination of two or more closely linked single-nucleotide polymorphisms (SNPs) within DNA segments of 200 base pairs (bp), and offers multiple forensic applications[3-7]. Short tandem repeat (STR) genotyping is currently the dominant technology in forensic DNA laboratories. Although it works well with single-sourced DNA samples, great challenges are encountered with DNA mixtures because stutters in the major donor DNA can be indistinguishable from alleles in the minor donor DNA[3,8]. Stutters are unavoidable during the replication of repetitive DNA, and they severely interfere with mixture deconvolution. SNPs are not repetitive sequences, but are typically biallelic, which restricts their utility in the analysis of mixtures. Microhaplotypes have the advantages of both STRs and SNPs because they are multiallelic and do not produce stutters during amplification. Therefore, microhaplotypes are perfect genetic markers for mixture deconvolution. Although capillary electrophoresis (CE)-based genetic analyzers are widely used in forensic DNA laboratories, these machines are unsuitable for microhaplotype genotyping[8]. Several methods have been used for microhaplotype detection. TaqMan assays have been used to type each SNP that constitutes a microhaplotype[8], followed by a PHASE software analysis to determine the cis/trans relationships between individual SNP alleles. Single-strand conformational polymorphisms[9] and high-resolution melting curves[4] have also been used for microhaplotype genotyping. These methods are simple and inexpensive, but they can pose problems when multiplexing different loci or dealing with mixed samples. MinION, a nanopore sequencing machine, has also been used for microhaplotype sequencing[10], but the accuracy of sequencing for forensic applications must be improved. Next-generation sequencing (NGS) is well accepted by the forensic community. 
Both the Illumina and Ion Torrent sequencers are high throughput, with appropriate read lengths for microhaplotypes[11,12], and NGS can directly determine the phase between SNP alleles. Based on these characteristics, NGS is considered the optimal strategy for microhaplotype genotyping, and the development of NGS has made microhaplotypes a powerful new type of genetic marker for forensic analyses[2]. Zhu et al.[13], Qu et al.[14], Turchi et al.[12], and Kidd et al.[15] have studied microhaplotypes for forensic applications on the MiSeq, HiSeq, Ion Personal Genome Machine (PGM), and Ion S5™ platforms, respectively. Attempts to develop NGS-based microhaplotype panels and microhaplotype population data have also been reported. In 2017, 89 microhaplotypes were sequenced with two primer pools in 73 Italian samples[12], and this panel was later optimized to 87 loci by the same research group[16]. Another research team constructed a 74-plex microhaplotype assay and sequenced 278 samples from three different populations[15]. In the present study, we developed and evaluated a multiplex amplification system containing 124 microhaplotype loci. Parallel mixture experiments were performed with CE-based STR and NGS-based microhaplotype genotyping methods to compare their capacities for forensic mixture deconvolution. Microhaplotype allelic diversity and forensic estimations were determined for a Han Chinese population.

Results

The 124-plex microhaplotype panel

A total of 124 microhaplotype loci were multiplexed in a single primer pool. The number of SNPs contained at each locus ranged from 2 to 5, and 52 loci contained ≥ 3 SNPs (Supplementary Table S1). The molecular extent of the loci ranged from 13 to 210 nt, with an average of 108 nt. The primer sequences, primer concentrations, and amplicon sizes of the 124-plex panel are summarized in Table 1. The amplicons ranged from 63 to 298 bp, with an average size of 212 bp (Supplementary Fig. S1).
Table 1

The 124 microhaplotype loci and the related parameter information of primers.

Locus namePrimers for PCR amplificationCASLocus namePrimers for PCR amplificationCAS
mh01KK-002TCTGGATAAGGGAGGAAGAAACT0.20135mh11KK-037TTTCCATCTCACCAGGCATCA0.08222
GCCTTCTAGTTCTGAAGCCAATATCCTGGGATAACAGGAAAGAAATC
mh01KK-070CCCACTCCAGCATCACTCAC0.04152mh11KK-038CCCAGGGTTGTTGCTTCCA0.08269
TTCTACCTGAAGAGCAAGTCCCCTCTAAAACCCGACGCTGC
mh01KK-072CCCTTTTCCGAATTTTCCTG0.08115mh11KK-039TGTTCCTGCCAAACCATTCA0.04197
GTATTCCCCTACTTTGTCTTCTGGGACCTCGTTGTCACTGATGATACTA
mh01KK-106ATCCAGTCCCGCTGCCTG0.04244mh11KK-040TGAACTTCCTGCACAGCATTAA0.04126
GATGTCAGATTTTCTTAGGACCGAAAGTGAAAGGGAGCGGAGG
mh01KK-117GTCTCCCCACAAAGCATTGC0.04243mh11KK-041GCAATCTTGGGGTGGTCTTT0.0491
GGTCACATCACCATCTCCGTCCCGACCCGTCCCACCA
mh01KK-205TAGAAGAAAGCACTAATGGGGTAAT0.04248mh11KK-089ACCTGCTCTGCTCACCTAACTCA0.04115
CAATTCGCAACAGTGAAAGCATGGATGCCTCCTGTGCCTGTA
mh01KK-210TCCAGAGTGGTTTGCAGGC0.04278mh11KK-090GTTGAGTCTGGGGAGGTTGC0.02150
AAGTAATTGGCTCCAGGTGACACTCCGTTCTCCACAGTGCTG
mh01KK-211AGATCAAGTCGGCCACGATG0.04243mh11KK-091CCCACCAAAGGAGCTGTACC0.20190
CACCTCCTCCATAATCCACAAGTGGAGAAGACTGGCGAGCAGA
mh02KK-003TGTGCAATGAAGAGCTAACTTGTG0.04178mh11KK-180GACCTGCCTGCTTTTCCTGA0.08288
GCTGGGCTGGCTAGACCCTTTGCACCCTCGCTTCCC
mh02KK-005GCTGGGCCCTAACAGTCTCA0.08259mh11KK-187CTGACTGTCAGCACTCCAGTATCA0.04250
CAACAGCCATTGACTTTTCCCTGGGTCTCGCCGCAAG
mh02KK-073TGGAAAATGGTTCTGAATCGG0.04127mh11KK-191GGGAAACAAAGGTATGTAAAGGC0.04296
CACTTTATGGATTAACTCAACCTGGCAGCAGTTCAGGCAAAGAGC
mh02KK-102ATCCTTAGTTGGGTAACCCTGTC0.12214mh12KK-043TCCTTAGGCAATGAGAAAACACTG0.12243
AAATGCTCCTAGGTGAGTCTAATGTGCAACCAAAAGAAGCCTCAGTC
mh02KK-134TTTGTGGCACTGGAGAACTG0.04198mh12KK-045GGTTATACCCTAAAACTAAAGTCTCGG0.04298
CAATGTCCTTGAGGCTCGTAGATGTGCCTGCTCGTCTATCAA
mh02KK-136ATCCCCACTCCCCATGTTC0.08162mh12KK-046CAAATAGGAACACTGGTATAGGAGG0.08200
CTCAGTATGTTTTGAGCACTTTCAGTGGATTCAGGGGCATGGA
mh02KK-201TTTGAGTATGCTCTGTAGATGCTTC0.12169mh12KK-092TGGGGATGAACAGCTTGGA0.20182
GAGTAACTGCTTCTCAAGTTGGAATTTGGTATGGCTTTGGCTAACTT
mh02KK-202GTGGGAGGGAACTTTCTGAGA0.08277mh12KK-093GCGTGATAGTGGCAATGATGG0.04236
GTTGGGATTAGGGTTGGTATTGCTTCTTACAGTTTCCTTGTTTCCGA
mh02KK-213CCCACCATTTGCCATGCT0.04236mh12KK-202TCCACCACCCACCTCTTCA0.08254
CTCGGGTAGGGCTTTCTTTGACGTACAACCTGAGCCACTGAT
mh03KK-006TGACCGGACGCCATAGCC0.04132mh13KK-047ACAGTTACAACAAGAAGGAAATGGA0.20286
GTCCTACATTACATGGTGTATAAAGCTCAGGGGACGGGAAACAAATGATC
mh03KK-007TTTCAGTTTGTTCTTGGCAGC0.1294mh13KK-213GAGACAGCAAGGAGAACTTCAGTT0.04215
TGCTGGAGATGTTATCAAGGCTCTCAAATGGCGGGCTTCT
mh03KK-008CATGAACCTAGCAACAGACGAGC0.20272mh13KK-217TGCAAAATTTGGCTCAACAAGC0.08281
GTGCAGAAAGATTCCAAAGGAGAATGGTGTATTGCCAAACAGAAAAGG
mh03KK-009GCCATTGCCGAAGACGAT0.04234mh13KK-218TAATAAAACTGGAATCATAAGCATAGC0.08209
CAACCAAGCCCCAAAGAGTCACTAGAGTAATGCAGAACTCACATGTTA
mh03KK-150GTGCCATTTACTGACCACCTATTA0.20297mh13KK-223ACTAGAGTAATGCAGAACTCACATGTTA0.08280
CCTGGGATCCACTGAAAGATTTGACCAGCCTCTTTACATGGAGT
mh04KK-010TGAGCACAGAAGGAGCGATG0.04128mh13KK-225GAATTGGAGCTACAGCCACACT0.08203
TGTGGGGTCACTTCAGGATAATCTGATGAAAAGGGAAGTGGAAA
mh04KK-011GTGTCTAATGGCCGCTGTAGTAA0.04142mh13KK-226AGTACAGTTTTCTCACCCCATAGG0.08191
GCTCAGGAATTTTCATCTGCTTTAATGGCTGTGGAAAGGGTAATA
mh04KK-013CATTGCAGTCATCTGAAATAAGCAC0.04250mh14KK-048GCCGTGGTGTCTGGAAAAC0.12231
TTGGAAGCACCATACCACTCAGGAGAAGCCAATGCAGGAGTCT
mh04KK-015TGGTCTGGTTTATTTTGGTTGG0.04226mh14KK-068TCTGTTCCATTGGCTCCTCTAC0.04158
GGCAAAGGGGAATGACTGAGCAGCTCACTTTTGCCCCTTT
mh04KK-016AGATTCAAGTTGAACTTTTAGACATCTG0.12196mh14KK-101CGGGATAAGGAATTAATCAAGGA0.56284
TTTTCTTCCTAGGGCTACAATTACAGCCATTAATATTTATTGTGATTACAACTG
mh04KK-017ATTGTACTGGTCGGATAATGAGC0.04290mh15KK-066CGGGACAAGGAATAGCCAGT0.20238
ACTTCACTATACACTGGCTTTCTCCCTTACCTGCCAACATATTCACCATA
mh04KK-019AACAATGATGCTACCTTCAGTGC0.20257mh15KK-067TTCTCCCCATTAAGCCATCCT0.04263
ATTCTTATTTGGAAGATTACAACAGGCCAGAAGAAGCAAAGACATCAAGA
mh04KK-021ACCACAGCGCCAAATGATG0.04282mh15KK-095CCCTAAACACCAGGATAGCAGTT0.04189
GGAGGGGATCCTTTAGGACAGTTTGAGGACGCTGCTGTTACTGT
mh04KK-028GCTGACTAATCTTGTGATGGTGAA0.04104mh15KK-104TTCCCACCTCACCTACATAATCT0.08240
CGGCATCGTGGAAAGTGTTGATGGAGCAGTAGTGATGAAGACA
mh04KK-029CTGATGGGTTTGGTAGAGTCCTT0.02174mh16KK-049ACTGCCCTGGAGATTGTTTCA0.08270
CACTTGCGTCGTCTTTGGCTGCTAATCCTGTCCCGTTTCT
mh04KK-074CCATCTTGAGTGCATTGGTTTA0.08172mh16KK-096CCGTGGACCGCTACATCTC0.04115
GTTTAGCACAAGGAACCACTGAAGTGCTGAAGACGACACTGGC
mh05KK-022GAGGACAGAGCCCAAACCAT0.04191mh16KK-255GGGCTTTCTGCTCAGACTTTC0.08236
AGGAGACAGAAATACTCCAAGAGGGCCTCCACGGGGACTTATTA
mh05KK-023TGGCACAGTGAGCACCTTCT0.04261mh16KK-302CTTATGCTTGGGTCCATCTCAG0.04194
GACTTATCCCAAAGCACAAACCTATACCACGGATTTCCCCTCA
mh05KK-062AGATCACATATCATGCGACATCC0.0863mh17KK-052GCTCAGGCAGGAGGTCA0.20288
TCCCTTGCTAAGTCCCTCACTGCGCCTACTGTGCGTG
mh05KK-078TCAGGAAGGACAGGATAGACAGC0.04162mh17KK-053CGCTACTCTTTTGCCTGACCT0.02244
AGTTCTCAGTGCCATTGCTTATCTCCCAACTATTCTGATTCTCGC
mh05KK-079AAACCCTGCATATTTGCTATGG0.08158mh17KK-054CCCGCTGGAGGAGCAAAAGT0.04135
GGCTCGGCGTTTTCTATTGGAGCACGGAAGTTAGGATGGA
mh05KK-170GACACATGGAGGACAAAAGTGAACT0.04210mh17KK-055CCCAAAACTGACAGCCCAAG0.20234
GCTGGTGATGACAAGTGAGATGTGTGGGGTGAACAGCTCTGAC
mh06KK-025GGAGTTAGCCGTGGTATGTTTG0.20229mh17KK-076TCAAACCCAGAGCCATCCC0.02195
CCATACGCTCCTGATAGTTGTTTAAGGGCAAAGGACCGTGATG
mh06KK-026AAGGACTTTCCCTGCTGTTCTAT0.04158mh17KK-077ACAGCCTCTACCCACCAAATG0.04184
ACGCAACACTCTTTTCGCTATTAGATGTCAGCCAGAAGATCAGC
mh06KK-080CAGTAACACTTACTACATATGAATTGAGAA0.20192mh17KK-105CCCGTCCCTTCCAACCC0.20193
CATGTCACATGTATTTTAATATCACAAATCTCACCTTCCCGCCTCC
mh06KK-101GCCTTGTAAGATTTCTCATCTGC0.04242mh17KK-110AGGTTTACCTTGGCATGTTCC0.04264
AGCTGGGAGTGGCCCATGCCAGCCCTGTTTCTAAAAGTGT
mh07KK-030CATTGGTAAGTTGAGTACATAACAGTTC0.20209mh17KK-272CCCTCTGGTTTTCCTTGGAT0.20261
GCTTTTATGCAGTCCTAAGGAAATGGAACATCACGGGAATCTTTT
mh07KK-031GAAGGAAAGATGTCACAGATGCG0.04215mh18KK-285TCACATCATGACGTCTACTGGG0.08246
GGAAAACCGCCAGCATAGCGATCTGTTCCTCAAAGAAGAATTGG
mh07KK-081CCATCTGTACCACGGCATCA0.04245mh18KK-293CACCCACTGAAGTTTGAGCAGA0.04165
TCTCCTACATTCATAACTCCTCCACCCTAATCAAGGCTATGGATACCTATCT
mh07KK-082AGCAGTAAAGCAGGCTGAGGC0.08235mh19KK-056CAAGCGGGAGCCCATG0.08289
TTTTGGGATGTAGTGAAGAGGCTCCCCGCCTCGGTCTC
mh08KK-032ACACCTCCCTGGAAACAACC0.12260mh19KK-057AAATGTCCTGGTCTTGATGGC0.04244
CAACTCTTACGTTCATCAATACCGGGGGAAAGCAGTAGTGAATGG
mh09KK-033TACACGGTTGCCAGAAGAAAA0.04175mh19KK-299CTCTATCATGTGGCCTGGCA0.04216
GAGGTAACACTACGAGGGAAGATTCTGGTGGGTCGCATGTCTC
mh09KK-034TGGTCCTGTCCTCATAGCACTT0.12194mh19KK-301TCTCAAAGACAGACCCACTACGG0.08168
GTATTGAAGTGATAGTTTTACAGTTTCCTAGAAGATTCATGCTGGCTTCAATAGT
mh09KK-035TTCTTTCAGCAAACCCACCC0.04298mh20KK-058TATAGAGCAGGGCCAGGCA0.04205
GGCTCTGATCTGACGGCAAGTGAAACCATCTCCAAGTCCAG
mh09KK-152AATGTGGTAACTGAGACTAGGAGAATC0.08241mh20KK-059TCATAGCAGCTGGTCTCGTTG0.04225
TCGAACTTCATAGGCTGACTCCTCCCTGGCTGTGCTCATGT
mh09KK-153GGGGATTGGCAGTCTTCATG0.04180mh20KK-307TCCTACAGCATTCAATTACCAAAGC0.08250
ACAGCCTCGTAAGGGGAGCTTGAGCATTACCACGATCACTTCTA
mh09KK-157AGTCTAGGGCTGGAGTTGGGT0.04233mh21KK-315TTATGTGGTAGGAGCCTAAAAGAAG0.04285
GGACCATCAGCATCAATAGCCTGTGACCCCTGACCTTGCTG
mh10KK-083GTGGTTCTATTTAATGTGAAGCCTG0.04224mh21KK-316TCATAAACTACAGCTGGCAGACC0.04208
GCTGGCAGAACTGGGATTTGCTCCTTAATATCTTCCCATGTCCA
mh10KK-084CTGTTGCTAATATCTTACCTGCTCC0.02126mh21KK-320TGACTGGGAGGCTGTGGAGA0.04283
GCTCTTACACGAAGTTACATTAGGGATGCTGGAATTAGAGGCGTGA
mh10KK-085AAGGGGCAGAAACTGGGAG0.04117mh21KK-324GGGCGAGCAGGGGTCA0.04196
GGGGATGGAAAACAGAGCCGCATTTCCGCTGACGCTAT
mh10KK-086TGGATTGGAGCCCAGGTATT0.08167mh22KK-060CGTGATTCAGGAGCACCAGC0.08213
ACACTGATTTCCCTCAAGGTCATTTTCCAGGTCTGACAACGG
mh10KK-087AAAGACTTGCTCCATTCCCTATTC0.12231mh22KK-061CTTTAGGGGTGGCAAGTCTCC0.02218
TGATTCTCCACGCTGCCACCACTTAGGGACTGGGGAACTC
mh10KK-088CAAAACTACATTCTTCACTGGGG0.08250mh22KK-064AAAGCGGTGAACAGGTGGA0.04263
ACTGCCTCTGATCTTTCTCACCTTGGTCACAGTTCTTGGTCCG
mh10KK-101CCCAGGACTGTCTGAGCATCT0.02170mh22KK-069GCAGCACTTTCTTTCATTCATTCC0.04144
TGTCTCCCTCCACAGCATGAAACCATGAGTGCTACAAAGGC
mh11KK-036GCCAAAGCTCCCTAATAGCTC0.08240mh22KK-303AGTTCATCCTGCAGCCCATC0.02181
CAGAAATAAAAGGCTAAATGTATGGATCGGACCCCACCTTTCTTGT

C: Final concentrations of the primers; AS: Amplicon sizes.

To evaluate the performance of this assay, we sequenced 10 reference samples. The numbers of total reads and reads representing microhaplotype alleles were calculated and are shown in Fig. 1. Around 100,000 total reads were obtained for each sample. The reads representing alleles accounted for over 90% of the total reads, and even 99% for some samples, indicating that the quality of the sequencing data was good.
Figure 1

Read counts and percentage of reads representing the alleles for 10 reference samples. Number of effective reads (those called as microhaplotype alleles) are shown in orange, and the total reads are shown in blue.

The allele coverage ratio (ACR) was used to evaluate the heterozygosity balance. The ACRs were calculated for the 10 reference samples by dividing the read depth of the lower-coverage allele by that of the higher-coverage allele at each heterozygous locus. All average ACRs were above 0.7, indicating that the heterozygosity balance of the 124-plex assay was good (Fig. 2). To examine the interlocus balance of this 124-plex panel, we calculated the average percentage depth of coverage (DoC) for each locus (Fig. 3). Each locus accounted for 0.2%–2% of the effective reads, 0.8% on average.
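The two balance metrics described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names, locus labels, and read depths are hypothetical, and only the definitions (ACR as lower over higher allele coverage, DoC as each locus's percentage of the effective reads) are taken from the text.

```python
def allele_coverage_ratio(depth_a: int, depth_b: int) -> float:
    """ACR for one heterozygous locus: lower allele depth / higher allele depth."""
    low, high = sorted((depth_a, depth_b))
    if high == 0:
        raise ValueError("both alleles have zero coverage")
    return low / high


def percent_depth_of_coverage(locus_reads: dict[str, int]) -> dict[str, float]:
    """Percentage of a sample's effective reads that falls on each locus."""
    total = sum(locus_reads.values())
    return {locus: 100.0 * reads / total for locus, reads in locus_reads.items()}


# Hypothetical read depths for illustration only (not data from the study).
het_depths = {"locusA": (480, 520), "locusB": (300, 400)}
acrs = {locus: allele_coverage_ratio(*pair) for locus, pair in het_depths.items()}

doc = percent_depth_of_coverage({"locusA": 1000, "locusB": 700, "locusC": 300})

# A perfectly uniform 124-locus panel would give each locus 100/124 of the reads.
uniform_share = 100.0 / 124

print(acrs)
print(doc)
print(round(uniform_share, 2))
```

A perfectly balanced 124-locus panel would assign about 0.81% of the effective reads to each locus, which is consistent with the reported average of 0.8%.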
Figure 2

Average allele coverage ratio (ACR) for each locus. Horizontal black line, number of heterozygotes for each calculated ACR. Error bars represent standard deviations.

Figure 3

Average percentage (%) depth of coverage (DoC) for each locus. Error bars represent standard deviations.

To evaluate the sensitivity of the 124-plex assay, a dilution series of genomic DNA 9947 A (1.0, 0.5, 0.2, and 0.1 ng) was sequenced. All 124 microhaplotypes were successfully genotyped with a sequencing depth of ≥ 30 × when 1.0 ng, 0.5 ng, 0.2 ng, or 0.1 ng of input DNA was used (Supplementary Table S2 and Supplementary Figs. S2–S5), demonstrating the highly sensitive performance of the 124-plex assay.

Mixture study

To compare the effectiveness of microhaplotypes and STRs in the analysis of forensic mixtures, we prepared artificially mixed DNA samples with commercial genomic DNAs 9947 A and 2800 M, and performed parallel CE-based STR profiling and NGS-based microhaplotype genotyping experiments (Table 2 and Supplementary Figs. S5–S18). Representative data are summarized and compared in Fig. 4.
Table 2

Summary of STR-based and microhaplotype-based analysis of artificially mixed biological samples.

Genetic marker | Mixture | Number of loci with fully called 9947 A alleles | 9947 A drop-out loci | Number of loci interfered by stutters in mixture deconvolution | Loci interfered by stutters in mixture deconvolution | Number of remaining effective loci | Percentage of remaining effective loci
STR | 9947 A:2800 M = 1:1 | 21 | - | 0 | - | 21 | 100.00%
STR | 9947 A:2800 M = 1:3 | 21 | - | 0 | - | 21 | 100.00%
STR | 9947 A:2800 M = 1:6 | 21 | - | 7 | D16S539, CSF1PO, D18S51, D19S433, FGA, D22S1045, D2S1338 | 14 | 66.67%
STR | 9947 A:2800 M = 1:9 | 19 | D22S1045, D2S1338 | 11 | D3S1358, vWA, D16S539, CSF1PO, D18S51, D19S433, FGA, D5S818, D7S820, D10S1248, D12S391 | 8 | 38.10%
STR | 9947 A:2800 M = 1:19 | 8 | D3S1358, vWA, D16S539, CSF1PO, TPOX, D18S51, D19S433, TH01, D22S1045, SE33, D1S1656, D12S391, D2S1338 | 7 | D8S1179, D21S11, FGA, D5S818, D13S317, D7S820, D10S1248 | 1 | 4.76%
Microhaplotype | 9947 A:2800 M = 1:1 | 124 | - | 0 | - | 124 | 100.00%
Microhaplotype | 9947 A:2800 M = 1:3 | 123 | mh02KK-136 | 0 | - | 123 | 99.19%
Microhaplotype | 9947 A:2800 M = 1:6 | 123 | mh02KK-136 | 0 | - | 123 | 99.19%
Microhaplotype | 9947 A:2800 M = 1:9 | 123 | mh02KK-136 | 0 | - | 123 | 99.19%
Microhaplotype | 9947 A:2800 M = 1:19 | 114 | mh01KK-205, mh02KK-136, mh04KK-019, mh05KK-079, mh07KK-030, mh08KK-032, mh10KK-087, mh12KK-043, mh17KK-055, mh21KK-324 | 0 | - | 114 | 91.94%
Figure 4

Representative STR profiles and representative microhaplotype genotyping histograms for the mixture experiments. Signal peaks for D16S539-12 and CSF1PO-11 in the 1:6 mixture are indicated as “Allele” and “Stutter”, respectively, for comparison. Numbers under each microhaplotype allele are the numeral allele names assigned to allow the microhaplotype data to be read conveniently.

Allele dropouts can severely interfere with a mixture analysis. Therefore, we examined the dropout alleles of the minor contributor (9947 A), and calculated the number of loci with fully called 9947 A alleles for each artificially mixed sample. In the STR profiles, no allele dropout was observed for the 1:1, 1:3, or 1:6 mixture (Table 2). Two alleles (D22S1045-11 and D2S1338-19) of the minor contributor dropped out in the analysis of the 1:9 mixture (Supplementary Fig. S12), and only 38% of the STR loci (8/21) reported full 9947 A alleles when the mixture ratio was 1:19 (Table 2, Supplementary Fig. S13). In contrast, 92% of the microhaplotypes (114/124) reported full 9947 A alleles for the 1:19 mixture (Table 2, Supplementary Fig. S18). No allele drop-in was observed for the 1:3, 1:6, 1:9, or 1:19 mixture. Only two artefacts dropped in (mh02KK-003-GTC and mh20KK-059-AG) with low sequencing depths (40× and 30×, see Supplementary Fig. S19) when analyzing the 1:1 mixture. These data indicated that the NGS-based microhaplotypes were superior to the CE-based STRs in genotyping the alleles of the minor contributor. We then investigated the effect of STR stutters on the analysis of these mixtures. When 9947 A and 2800 M were mixed at a 1:1 ratio, the alleles from both contributors were very similar in peak height or sequencing depth (Fig. 4, Supplementary Figs. S9 and S14). Neither STRs nor microhaplotypes were effective in mixture deconvolution at this ratio.
When the mixture ratio was 1:3, the peak heights of alleles from the minor contributor were significantly lower than those of the major contributor and significantly higher than the STR stutters (Fig. 4, Supplementary Figs. S10 and S15). Both STRs and microhaplotypes were effective in mixture deconvolution. However, at mixture ratios of 1:6, 1:9, and 1:19, the minor contributor STR alleles were indistinguishable from the stutters of the major contributor because their peak heights were similar (Fig. 4). For example, in the 1:6 mixture, D16S539-12 (an allele of the minor contributor 9947 A) and CSF1PO-11 (a stutter of the major contributor 2800 M) were both at the n-1 stutter position, with similar intensities. Their peak heights were 5%−10% of those of their possible parent alleles, which is typical for STR stutters. Incorrect allele/stutter interpretation can readily occur in such situations. However, with microhaplotype genotyping, the alleles from the major and minor contributors were easily distinguishable in the various mixture ratios based on their sequencing depths (Table 2, Supplementary Figs. S15–S18). Taken together, only 38.10% and 4.76% of the STR loci were effective in analyzing the 1:9 and 1:19 mixtures, respectively, whereas 99.19% and 91.94% of the microhaplotypes were effective in analyzing the same mixtures, respectively (Table 2). These data confirm that microhaplotypes are reliable genetic markers for the deconvolution of forensic mixtures.
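The percentages quoted above can be reproduced from the locus counts in Table 2. The sketch below assumes, as the table's numbers suggest, that the remaining effective loci are the total loci minus the minor-contributor drop-out loci minus the loci interfered by stutters; the function name is hypothetical, and the counts are those reported in Table 2.

```python
def effective_loci(total: int, dropout: int, stutter_interfered: int) -> tuple[int, float]:
    """Return (remaining effective loci, percentage of the panel they represent)."""
    remaining = total - dropout - stutter_interfered
    return remaining, 100.0 * remaining / total


# (total loci, 9947 A drop-out loci, stutter-interfered loci) from Table 2.
str_1_9 = effective_loci(21, 2, 11)
str_1_19 = effective_loci(21, 13, 7)
mh_1_9 = effective_loci(124, 1, 0)
mh_1_19 = effective_loci(124, 10, 0)

for label, (n, pct) in [
    ("STR 1:9", str_1_9),
    ("STR 1:19", str_1_19),
    ("Microhaplotype 1:9", mh_1_9),
    ("Microhaplotype 1:19", mh_1_19),
]:
    print(f"{label}: {n} loci, {pct:.2f}%")
```

The same relation also holds for the 1:6 STR row (21 − 0 − 7 = 14 loci, 66.67%).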

Population data

A total of 256 Han Chinese individuals residing in Gansu Province were genotyped, and 514 alleles were observed (Table 3), with approximately four alleles per locus on average. Thirteen alleles were observed for locus mh01KK-117, which was the highest number in this dataset. Single alleles were observed for two loci, mh10KK-084 and mh17KK-076, indicating that there was no genetic diversity at these two loci in this Han Chinese population. Therefore, the forensic parameters were not calculated for these two loci. The forensic statistical parameters were calculated for the other 122 loci, and are summarized in Table 4.
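As an illustration of how per-locus statistics follow from allele counts, the sketch below computes allele frequencies, the effective number of alleles (Ae), and the expected heterozygosity (He) for one locus. It assumes the commonly used definitions Ae = 1/Σp² and He = n/(n − 1) × (1 − Σp²), where n is the number of observed chromosomes; this section does not state which formulas or software the authors used, so these are assumptions, and the function names are hypothetical. The counts are those listed for mh01KK-070 in Table 3.

```python
def allele_frequencies(counts: dict[str, int]) -> dict[str, float]:
    """Convert allele counts at one locus into allele frequencies."""
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def effective_alleles(counts: dict[str, int]) -> float:
    """Ae = 1 / sum(p_i^2) (assumed definition)."""
    freqs = allele_frequencies(counts)
    return 1.0 / sum(p * p for p in freqs.values())


def expected_heterozygosity(counts: dict[str, int]) -> float:
    """He with the small-sample correction n / (n - 1) (assumed definition)."""
    n = sum(counts.values())
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)


# Allele counts for mh01KK-070 as listed in Table 3.
counts = {"AG": 97, "AT": 415}

freqs = allele_frequencies(counts)
ae = effective_alleles(counts)
he = expected_heterozygosity(counts)

print({a: round(p, 4) for a, p in freqs.items()})
print(round(ae, 4), round(he, 4))
```

For this locus the rounded results (frequencies 0.1895 and 0.8105, Ae 1.4433, He 0.3077) agree with the values listed in Tables 3 and 4, although that agreement for one locus does not establish which formulas were actually applied.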
Table 3

Allele frequencies of 124 microhaplotypes in the Chinese Han population (N = 256).

GenotCountFreGenotCountFreGenotCountFreGenotCountFre
mh01KK-002mh01KK-070mh01KK-072mh01KK-106
AA3230.6333AG970.1895CG3510.6937CAAG10.0022
AG1200.2353AT4150.8105TC1550.3063CAGA2510.5553
GA190.0373mh01KK-205mh01KK-210CAGG680.1504
GG480.0941CCAG1300.2632CC820.1660CGAG50.0111
mh01KK-117TCAG870.1761TC1430.2895TAGG1270.2810
AACC1930.3955TTAA740.1498TT2690.5445mh01KK-211
AACT840.1721TTAG660.1336mh02KK-073ACT300.0673
AAGC120.0246TTGG1370.2773GC3580.7075ATC1930.4327
AAGT60.0123mh02KK-005GT1050.2075ATT1270.2848
AGCC540.1107AG1950.3916TC20.0040GCC30.0067
AGCT150.0307GA1740.3494TT410.0810GTC930.2085
AGGC30.0061GG1290.2590mh02KK-201mh02KK-102
CACC830.1701mh02KK-136GA140.0287GAC110.0259
CACT50.0102GTA40.0116GG220.0451GGT4090.9646
CAGC150.0307GTC50.0145TA4520.9262TGC40.0094
CAGT10.0020TCA110.0320mh03KK-007mh02KK-202
CGCC80.0164TCC1850.5378CC1840.3622CA2530.4961
CGCT90.0184TTC1390.4041TC1830.3602CC10.0020
mh02KK-003mh03KK-006TT1410.2776GA2560.5020
GCC20.0039AA2880.5692mh03KK-150mh03KK-008
GTC180.0353AG1690.3340AACA1620.3476CG10.0020
TCC3960.7765TA490.0968GACC1910.4099CT210.0427
TTC480.0941mh03KK-009GGCC1130.2425TG2580.5244
TTT460.0902CC190.0373mh04KK-010TT2120.4309
mh02KK-134TC1440.2824AA2920.5703mh04KK-011
ACCG230.0456TT3470.6804AG1660.3242AC2150.4674
ACTA30.0060mh04KK-013GA470.0918AT1180.2565
ACTG250.0496AAGAT480.0964GG70.0137GT1270.2761
ATCA190.0377CAGAT610.1225mh04KK-015mh04KK-016
ATCG2340.4643CAGGT30.0060AC3390.6673CC600.1172
ATTA560.1111CGAAT20.0040AT1120.2205TC1160.2266
ATTG560.1111CGGAC270.0542TT570.1122TT3360.6563
TCTA10.0020CGGAT640.1285mh04KK-017mh04KK-029
TCTG110.0218CGGGT2930.5884ACA250.0912TC4220.8242
TTCG70.0139mh04KK-021GCA180.0657TT900.1758
TTTG690.1369AG1600.3226GCG1850.6752mh05KK-023
mh02KK-213GA1620.3266GTA460.1679GCG260.0544
CAT290.0566GG1740.3508mh04KK-028TCG2970.6213
CGT1560.3047mh05KK-022CA20.0039TTG1400.2929
TGC170.0332CA2590.5078CC1560.3047TTT150.0314
TGT3100.6055CC1520.2980TC3540.6914mh05KK-062
mh04KK-019TC990.1941mh05KK-170AA1360.2677
AA2260.4431mh05KK-079CAAA490.0984AC2480.4882
AG2620.5137CC2750.5413CAAG500.1004TA1240.2441
GA220.0431CT2330.4587CAGA120.0241mh06KK-025
mh04KK-074mh06KK-080CAGG140.0281AGG430.1503
AC10.0020AG80.0158CGAA630.1265GGG2430.8497
AT4460.8745CG4980.9842CGAG600.1205mh07KK-030
GT630.1235mh07KK-081CGGA350.0703ACC1590.6023
mh05KK-078-C30.0059CGGG160.0321GAC520.1970
GA810.1582-T5090.9941TAAA740.1486GCC530.2008
GG4310.8418mh09KK-034TAAG1230.2470mh08KK-032
mh06KK-026AA140.0276TAGG10.0020CG670.1683
ACG10.0020GA1080.2126TGAA10.0020TA390.0980
ATG170.0335GG3860.7598mh06KK-101TG2920.7337
GCA120.0236mh09KK-153AA4130.8381mh09KK-152
GCG4610.9075CAA230.0477GA10.0020AGCA970.1964
GTG170.0335CAC310.0643GG790.1599ATCG10.0020
mh07KK-031CGA50.0104mh07KK-082ATTA310.0628
CA2690.5316TAA1910.3963TC1980.3898ATTG2670.5405
CG980.1937TAC1160.2407TG3100.6102GTCG980.1984
TG1390.2747TGA870.1805mh10KK-083
mh09KK-033TGC290.0602mh09KK-035GC410.0807
ACG1640.3241mh09KK-157CG1570.3257TC4670.9193
GCG1810.3577ACCAT150.0305CT1950.4046mh10KK-088
GCT80.0158ACTAT450.0915TG1300.2697GC3550.9492
GTG1530.3024GCCAC2390.4858mh10KK-087GT190.0508
mh10KK-084GCCCC20.0041AG3510.6964mh11KK-038
TG5121.0000GCCCT1390.2825GA1530.3036CG2360.5388
mh10KK-085GTCAC520.1057mh11KK-037TA180.0411
CC2540.4961mh10KK-086ACG2020.4040TG1840.4201
CT2580.5039GA3020.5945GCG2080.4160mh11KK-089
mh10KK-101GC1490.2933GTG900.1800AT2640.5156
AG1810.3620TA570.1122mh11KK-041CG2190.4277
CA800.1600mh11KK-036AG450.0893CT290.0566
CG2390.4780AA1500.2941GA2820.5595mh11KK-187
mh11KK-039AG1550.3039GG1770.3512CCCA2260.4575
GG360.0706CG2050.4020mh11KK-180CCCG60.0121
GT2210.4333mh11KK-040AACC30.0066GCCG30.0061
TT2530.4961AC3080.8324AACG10.0022GCGA20.0040
mh11KK-090CG620.1676AATC380.0830GCGG1330.2692
AC3310.6490mh11KK-091AATG10.0022GTCA10.0020
GT1790.3510-C780.1535ACCC2000.4367GTGG1230.2490
mh11KK-191-T4300.8465ACCG130.0284mh12KK-046
CAGT1030.2239mh12KK-043ACTC540.1179GA1440.2824
CGAT650.1413CCG470.0925ACTG110.0240GG1300.2549
TAAC860.1870CTA2510.4941GCCC50.0109TA1330.2608
TAAT2040.4435CTG2090.4114GCCG1280.2795TG1030.2020
TGAT20.0043TCG10.0020GCTC30.0066mh13KK-047
mh12KK-092mh12KK-093GCTG10.0022CC1030.2146
CT1830.3735AT4020.7882mh12KK-045CT730.1521
TC3070.6265TA1080.2118CT350.0694TC150.0313
mh13KK-213mh13KK-217TC3940.7817TT2890.6021
CCA970.3255AACA60.0121TT750.1488mh13KK-223
CCG540.1812AACG260.0526mh12KK-202CCCT790.1561
TAG510.1711AATA10.0020AACT970.1972CGCC10.0020
TCA830.2785AATG1780.3603AATC1810.3679CGCT1040.2055
TCG130.0436AGCA610.1235AGTT920.1870CGTC1390.2747
mh13KK-218AGCG660.1336CATC20.0041CGTT1000.1976
CCCC80.0161AGTG800.1619CATT1190.2419TCCT160.0316
CCCT200.0403GATG10.0020CGTT10.0020TGCT670.1324
CTCC470.0948GGCA20.0040mh14KK-048mh14KK-068
CTCT610.1230GGCG670.1356AC70.0152AC1830.3574
CTTC950.1915GGTG60.0121AT2730.5935AT2860.5586
CTTT480.0968mh13KK-226GC370.0804CC430.0840
TCCC20.0040CA100.0207GT1430.3109mh15KK-067
TCCT40.0081CG1380.2851mh15KK-066GC2300.4563
TTCC180.0363TA3360.6942AG2020.4139GT880.1746
TTCT730.1472mh14KK-101AT620.1270TC1760.3492
TTTC220.0444AT710.1530CG1150.2357TT100.0198
TTTT980.1976GC130.0280CT1090.2234mh16KK-255
mh13KK-225GT3800.8190mh16KK-096ACCG380.0769
AAG650.1280mh15KK-095CA3280.6457ACTA10.0020
ACG2000.3937CA2600.5078CG1790.3524ACTG1550.3138
GAA1030.2028TA2210.4316TG10.0020GACA1730.3502
GAG1330.2618TG310.0605mh16KK-302GATA170.0344
GCG70.0138mh16KK-049ACTT620.1225GCCA40.0081
mh15KK-104AAAAG1500.3036GCTC990.1957GCCG390.0789
CAG80.0158ACAAA430.0870GCTT810.1601GCTG670.1356
TAA80.0158ACAAG50.0101GTAT1880.3715mh17KK-054
TAG540.1067ACAGA120.0243GTTT760.1502AA1830.4816
TCG4360.8617ACGGA2110.4271mh17KK-053AG840.2211
mh17KK-055CCAAA720.1457CT2340.4699GG1130.2974
AC2270.5881CCGGA10.0020TC2050.4116mh17KK-105
AT10.0026mh17KK-052TT590.1185ATA120.0235
CC470.1218AA1010.2186mh17KK-077ATG4980.9765
CT1110.2876AG1480.3203GG4400.8594mh18KK-293
mh17KK-110GA2020.4372TG720.1406AGAA1210.2430
CA70.0137GG110.0238mh18KK-285AGGA10.0020
CG4320.8471mh17KK-076AGCG450.0886ATAA80.0161
TG710.1392AG5121.0000CACG2560.5039ATGA720.1446
mh19KK-056mh17KK-272CGCG60.0118GGAA1980.3976
CA2650.5430CCCT2460.5125CGCT890.1752GGAG770.1546
CC10.0020TCAT250.0521CGTG1120.2205GGGA90.0181
TA180.0369TCCC280.0583mh19KK-299GTAA70.0141
TC2040.4180TCCT1160.2417ACGAA10.0020GTAG20.0040
mh21KK-315TTCC650.1354ATGAA680.1382GTGA30.0060
ACC280.0562mh19KK-057ATGAG10.0020mh19KK-301
ACT10.0020CCG3310.6567GCAAA500.1016AGGT40.0078
ATC1060.2129CTG1390.2758GCAAG2190.4451GAAC4160.8157
ATT130.0261CTT340.0675GCATG1000.2033GGAC20.0039
GCC690.1386mh20KK-059GCGTA430.0874GGAT880.1725
GCT280.0562AA1360.2698GCGTG100.0203mh20KK-058
GTC840.1687AG500.0992mh20KK-307CAC1620.3240
GTT1690.3394GG3180.6310CTGA1420.2971TAC1480.2960
mh21KK-316mh21KK-320TTAA1010.2113TAT1360.2720
ACAC1980.3976AACA560.1181TTGA1920.4017TGC540.1080
ACGC30.0060AACG1270.2679TTGC430.0900mh22KK-060
ACGT1320.2651AATA10.0021mh21KK-324CA1440.2903
ATGC440.0884AGCG10.0021CCAA30.0062CG1700.3427
GCGC1200.2410GACA1410.2975CCAG140.0288GG1820.3669
GTGC10.0020GACG180.0380CCTA190.0391mh22KK-303
mh22KK-061GATA810.1709CCTG30.0062CGGG3250.6423
AAA840.1667GGCA220.0464CTAA1400.2881CTGG360.0711
AAG40.0079GGCG270.0570CTTA500.1029TGGG1450.2866
GAA2560.5079mh22KK-064CTTG10.0021mh22KK-069
GAG310.0615AAT4320.8438TCAG1850.3807AG460.0898
GGG1290.2560GAT800.1563TCTG690.1420GG1660.3242
TTAA20.0041GT3000.5859

Genot: allele genotype; Count: allele count; Fre: allele frequency.

Table 4

Forensic parameters of 122 microhaplotypes in the Chinese Han population (N = 256).

Microhaplotype | MP | PD | PE | TPI | Ho | He | p | Ae
mh01KK-002 | 0.2706 | 0.7294 | 0.2261 | 1.0897 | 0.5412 | 0.5343 | 0.5217 | 2.1426
mh01KK-070 | 0.5323 | 0.4677 | 0.0518 | 0.6845 | 0.2695 | 0.3077 | 0.0636 | 1.4433
mh01KK-072 | 0.4226 | 0.5774 | 0.1338 | 0.8785 | 0.4308 | 0.4258 | 0.8834 | 1.7391
mh01KK-106 | 0.2293 | 0.7707 | 0.1954 | 1.0180 | 0.5089 | 0.5912 | 0.0002 | 2.4386
mh01KK-117 | 0.0858 | 0.9142 | 0.5455 | 2.1786 | 0.7705 | 0.7710 | 0.3833 | 4.3362
mh01KK-205 | 0.0826 | 0.9174 | 0.5504 | 2.2054 | 0.7733 | 0.7841 | 0.5912 | 4.5984
mh01KK-210 | 0.2356 | 0.7644 | 0.2485 | 1.1435 | 0.5628 | 0.5933 | 0.4856 | 2.4518
mh01KK-211 | 0.1476 | 0.8524 | 0.2813 | 1.2253 | 0.5919 | 0.6851 | 0.0600 | 3.1606
mh02KK-003 | 0.4045 | 0.5955 | 0.0981 | 0.7969 | 0.3726 | 0.3796 | 0.8549 | 1.6099
mh02KK-005 | 0.1928 | 0.8072 | 0.3729 | 1.4821 | 0.6627 | 0.6588 | 0.9792 | 2.9197
mh02KK-073 | 0.3620 | 0.6380 | 0.1338 | 0.8785 | 0.4308 | 0.4507 | 0.0288 | 1.8175
mh02KK-102 | 0.8745 | 0.1255 | 0.0034 | 0.5327 | 0.0613 | 0.0689 | 0.2225 | 1.0738
mh02KK-134 | 0.0952 | 0.9048 | 0.5098 | 2.0000 | 0.7500 | 0.7358 | 0.8278 | 3.7641
mh02KK-136 | 0.3507 | 0.6493 | 0.0166 | 0.5850 | 0.1454 | 0.5477 | 0.0000 | 2.2033
mh02KK-201 | 0.7437 | 0.2563 | 0.0154 | 0.5810 | 0.1393 | 0.1395 | 0.5707 | 1.1618
mh02KK-202 | 0.3596 | 0.6404 | 0.1662 | 0.9515 | 0.4745 | 0.5029 | 0.3858 | 2.0078
mh02KK-213 | 0.2878 | 0.7122 | 0.2441 | 1.1327 | 0.5586 | 0.5373 | 0.9422 | 2.1564
mh03KK-006 | 0.2830 | 0.7170 | 0.2428 | 1.1295 | 0.5573 | 0.5562 | 0.6474 | 2.2478
mh03KK-007 | 0.1898 | 0.8102 | 0.3767 | 1.4941 | 0.6654 | 0.6633 | 0.7868 | 2.9586
mh03KK-008 | 0.3124 | 0.6876 | 0.1840 | 0.9919 | 0.4959 | 0.5386 | 0.0204 | 2.1623
mh03KK-009 | 0.3660 | 0.6340 | 0.1396 | 0.8916 | 0.4392 | 0.4569 | 0.7746 | 1.8381
mh03KK-150 | 0.1785 | 0.8215 | 0.1894 | 1.0043 | 0.5022 | 0.6538 | 0.0000 | 2.8765
mh04KK-010 | 0.2778 | 0.7222 | 0.3023 | 1.2800 | 0.6094 | 0.5621 | 0.3058 | 2.2780
mh04KK-011 | 0.2048 | 0.7952 | 0.2857 | 1.2366 | 0.5957 | 0.6409 | 0.0016 | 2.7741
mh04KK-013 | 0.1898 | 0.8102 | 0.2748 | 1.2087 | 0.5864 | 0.6113 | 0.0178 | 2.5644
mh04KK-015 | 0.3147 | 0.6853 | 0.1945 | 1.0160 | 0.5079 | 0.4945 | 0.8615 | 1.9743
mh04KK-016 | 0.3061 | 0.6939 | 0.2015 | 1.0323 | 0.5156 | 0.5053 | 0.8547 | 2.0172
mh04KK-017 | 0.4434 | 0.5566 | 0.0330 | 0.6343 | 0.2117 | 0.5052 | 0.0000 | 2.0133
mh04KK-019 | 0.3128 | 0.6872 | 0.1927 | 1.0119 | 0.5059 | 0.5389 | 0.1373 | 2.1638
mh04KK-021 | 0.1854 | 0.8146 | 0.3378 | 1.3778 | 0.6371 | 0.6676 | 0.0402 | 2.9958
mh04KK-028 | 0.4162 | 0.5838 | 0.1384 | 0.8889 | 0.4375 | 0.4300 | 0.9446 | 1.7516
mh04KK-029 | 0.5456 | 0.4544 | 0.0623 | 0.7111 | 0.2969 | 0.2903 | 0.8304 | 1.4080
mh04KK-074 | 0.6374 | 0.3626 | 0.0308 | 0.6281 | 0.2039 | 0.2204 | 0.3388 | 1.2820
mh05KK-022 | 0.2202 | 0.7798 | 0.3201 | 1.3281 | 0.6235 | 0.6168 | 0.8422 | 2.6014
mh05KK-023 | 0.2921 | 0.7079 | 0.1893 | 1.0042 | 0.5021 | 0.5253 | 0.7528 | 2.1018
mh05KK-062 | 0.2033 | 0.7967 | 0.2843 | 1.2330 | 0.5945 | 0.6317 | 0.4003 | 2.7058
mh05KK-078 | 0.5705 | 0.4295 | 0.0576 | 0.6995 | 0.2852 | 0.2669 | 0.3472 | 1.3630
mh05KK-079 | 0.3957 | 0.6043 | 0.2166 | 1.0672 | 0.5315 | 0.4976 | 0.3140 | 1.9864
mh05KK-170 | 0.0380 | 0.9620 | 0.7619 | 4.2931 | 0.8835 | 0.8610 | 0.4205 | 7.1065
mh06KK-025 | 0.7376 | 0.2624 | 0.0000 | 0.5035 | 0.0070 | 0.2564 | 0.0000 | 1.3432
mh06KK-026 | 0.6922 | 0.3078 | 0.0229 | 0.6048 | 0.1732 | 0.1740 | 0.3517 | 1.2102
mh06KK-080 | 0.9534 | 0.0466 | 0.0002 | 0.5080 | 0.0158 | 0.0312 | 0.0008 | 1.0321
mh06KK-101 | 0.5701 | 0.4299 | 0.0453 | 0.6676 | 0.2510 | 0.2726 | 0.3590 | 1.3738
mh07KK-030 | 0.3246 | 0.6754 | 0.0617 | 0.7097 | 0.2955 | 0.5603 | 0.0000 | 2.2633
mh07KK-031 | 0.2342 | 0.7658 | 0.3316 | 1.3602 | 0.6324 | 0.6056 | 0.3496 | 2.5279
mh07KK-081 | 0.9768 | 0.0232 | 0.0001 | 0.5059 | 0.0117 | 0.0117 | 1.0000 | 1.0118
mh07KK-082 | 0.3776 | 0.6224 | 0.1465 | 0.9071 | 0.4488 | 0.4766 | 0.3554 | 1.9073
mh08KK-032 | 0.4155 | 0.5845 | 0.0488 | 0.6769 | 0.2613 | 0.4249 | 0.0000 | 1.7355
mh09KK-033 | 0.1789 | 0.8211 | 0.3805 | 1.5060 | 0.6680 | 0.6767 | 0.0061 | 3.0799
mh09KK-034 | 0.4537 | 0.5463 | 0.0788 | 0.7515 | 0.3347 | 0.3774 | 0.0003 | 1.6043
mh09KK-035 | 0.1928 | 0.8072 | 0.3572 | 1.4345 | 0.6515 | 0.6589 | 0.6051 | 2.9196
mh09KK-152 | 0.1894 | 0.8106 | 0.2948 | 1.2602 | 0.6032 | 0.6273 | 0.4162 | 2.6740
mh09KK-153 | 0.1031 | 0.8969 | 0.4174 | 1.6284 | 0.6930 | 0.7439 | 0.1148 | 3.8810
mh09KK-157 | 0.1652 | 0.8348 | 0.3728 | 1.4819 | 0.6626 | 0.6651 | 0.2222 | 2.9738
mh10KK-083 | 0.7378 | 0.2622 | 0.0167 | 0.5853 | 0.1457 | 0.1487 | 0.6690 | 1.1742
mh10KK-085 | 0.3750 | 0.6250 | 0.1875 | 1.0000 | 0.5000 | 0.5010 | 1.0000 | 1.9999
mh10KK-086 | 0.2699 | 0.7301 | 0.2204 | 1.0763 | 0.5354 | 0.5491 | 0.3924 | 2.2122
mh10KK-087 | 0.4464 | 0.5536 | 0.1773 | 0.9767 | 0.4881 | 0.4237 | 0.0178 | 1.7326
mh10KK-088 | 0.8779 | 0.1221 | 0.0007 | 0.5137 | 0.0267 | 0.0967 | 0.0000 | 1.1067
mh10KK-101 | 0.2269 | 0.7731 | 0.3055 | 1.2887 | 0.6120 | 0.6161 | 0.9879 | 2.5965
mh11KK-036 | 0.1919 | 0.8081 | 0.3730 | 1.4826 | 0.6628 | 0.6609 | 0.7357 | 2.9373
mh11KK-037 | 0.2118 | 0.7882 | 0.3417 | 1.3889 | 0.6400 | 0.6326 | 0.0278 | 2.7124
mh11KK-038 | 0.3318 | 0.6682 | 0.2148 | 1.0631 | 0.5297 | 0.5327 | 0.0017 | 2.1345
mh11KK-039 | 0.3052 | 0.6948 | 0.2725 | 1.2028 | 0.5843 | 0.5623 | 0.1308 | 2.2787
mh11KK-040 | 0.6233 | 0.3767 | 0.0115 | 0.5675 | 0.1189 | 0.2797 | 0.0000 | 1.3869
mh11KK-041 | 0.2801 | 0.7199 | 0.1981 | 1.0244 | 0.5119 | 0.5567 | 0.0569 | 2.2504
mh11KK-089 | 0.2959 | 0.7041 | 0.2202 | 1.0756 | 0.5352 | 0.5490 | 0.8886 | 2.2122
mh11KK-090 | 0.4014 | 0.5986 | 0.1539 | 0.9239 | 0.4588 | 0.4565 | 1.0000 | 1.8368
mh11KK-091 | 0.5853 | 0.4147 | 0.0430 | 0.6615 | 0.2441 | 0.2605 | 0.3349 | 1.3512
mh11KK-180 | 0.1223 | 0.8777 | 0.3384 | 1.3795 | 0.6376 | 0.7104 | 0.0336 | 3.4343
mh11KK-187 | 0.1829 | 0.8171 | 0.3412 | 1.3876 | 0.6397 | 0.6574 | 0.2171 | 2.9071
mh11KK-1910.13640.86360.35221.41980.64780.69980.52123.3140
mh12KK-0430.25310.74690.22041.07630.53540.57920.47632.3699
mh12KK-0450.44120.55880.08790.77300.35320.36260.89551.5672
mh12KK-0460.11050.88950.46241.79580.72160.74800.56873.9449
mh12KK-0920.38690.61310.14960.91420.45310.46890.68181.8796
mh12KK-0930.50140.49860.06270.71230.29800.33450.09171.5011
mh12KK-2020.12350.87650.52732.08470.76020.73380.59783.7356
mh13KK-0470.23930.76070.19871.02560.51250.56850.01142.3113
mh13KK-2130.16370.83630.21531.06430.53020.75500.00004.0395
mh13KK-2170.07090.92910.46091.78990.72070.79100.02674.7474
mh13KK-2180.03770.96230.78554.76920.89520.86560.24367.3473
mh13KK-2230.07180.92820.58882.43270.79450.80190.62515.0081
mh13KK-2250.13090.86910.48651.89550.73620.72020.42683.5560
mh13KK-2260.39330.60670.12220.85210.41320.43730.53781.7741
mh14KK-0480.28820.71180.17270.96640.48260.54560.00052.1951
mh14KK-0680.29600.70400.25681.16360.57030.55430.30202.2380
mh14KK-1010.55130.44870.03410.63740.21550.30580.00001.4390
mh15KK-0660.13220.86780.39251.54430.67620.70860.11303.4141
mh15KK-0670.20920.79080.34551.40000.64290.64020.38932.7695
mh15KK-0950.28950.71050.19801.02400.51170.55320.20572.2329
mh15KK-1040.59570.40430.04070.65540.23720.24610.25391.3256
mh16KK-0490.15110.84890.40431.58330.68420.69730.00843.2878
mh16KK-0960.39760.60240.15530.92700.46060.45990.53571.8483
mh16KK-2550.10580.89420.50801.99190.74900.74860.06423.9543
mh16KK-3020.09410.90590.50471.97660.74700.76200.19784.1750
mh17KK-0520.17650.82350.28271.22870.59310.65930.18712.9227
mh17KK-0530.23350.76650.24801.14220.56230.59690.10152.4736
mh17KK-0540.23480.76520.19211.01060.50530.63250.00002.7085
mh17KK-0550.31930.68070.41161.60830.68910.55810.00012.2555
mh17KK-0770.60630.39370.03980.65310.23440.24220.60311.3187
mh17KK-1050.91030.08970.00200.52470.04710.04601.00001.0482
mh17KK-1100.57210.42790.04530.66750.25100.26340.37981.3567
mh17KK-2720.16970.83030.28591.23710.59580.65590.01942.8943
mh18KK-2850.15990.84010.30821.29590.61420.66010.37862.9305
mh18KK-2930.11110.88890.42641.66000.69880.73870.02373.8057
mh19KK-0560.30500.69500.16040.93850.46720.53010.10122.1231
mh19KK-0570.33250.66750.17070.96180.48020.48900.70071.9534
mh19KK-2990.11150.88850.39631.55700.67890.72450.09163.6110
mh19KK-3010.52440.47560.06120.70830.29410.30540.42421.4384
mh20KK-0580.12960.87040.44071.71230.70800.72320.80803.5940
mh20KK-0590.29850.70150.21301.05880.52780.52030.99512.0801
mh20KK-3070.13690.86310.29901.27130.60670.69910.02713.3076
mh21KK-3150.07860.92140.58272.39420.79120.78650.67184.6484
mh21KK-3160.13730.86270.42021.63820.69480.70720.64243.3985
mh21KK-3200.07510.92490.55592.23580.77640.79140.48944.7555
mh21KK-3240.10820.89180.46681.81340.72430.74040.40083.8302
mh22KK-0600.17650.82350.28701.24000.59680.66490.10502.9726
mh22KK-0610.18910.81090.34551.40000.64290.64620.24692.8158
mh22KK-0640.57500.42500.05320.68820.27340.26420.81211.3581
mh22KK-0690.29390.70610.25251.15320.56640.54460.52572.1905
mh22KK-3030.31820.68180.19271.01200.50590.50130.66622.0011

MP: match probability; PD: power of discrimination; PE: power of exclusion; TPI: typical paternity index; Ho: observed heterozygosity; He: expected heterozygosity; p: p-value for Hardy–Weinberg equilibrium test; Ae: effective number of alleles.
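
The relationships among several of these parameters can be sketched in a few lines of Python. The formulas below are the definitions commonly used in PowerStats-style spreadsheets and are an assumption here, since the text only names the spreadsheet; they do reproduce the tabulated PD, TPI, and PE values for the rows checked (for example mh10KK-085 and mh02KK-134). The function names are illustrative.

```python
def power_of_discrimination(mp: float) -> float:
    """PD is the complement of the match probability (assumed definition)."""
    return 1.0 - mp


def typical_paternity_index(ho: float) -> float:
    """TPI = 1 / (2 * H), where H = 1 - Ho is the observed homozygosity.

    Undefined when Ho == 1 (no homozygotes observed).
    """
    homozygosity = 1.0 - ho
    return 1.0 / (2.0 * homozygosity)


def power_of_exclusion(ho: float) -> float:
    """PE = h^2 * (1 - 2 * h * H^2), with h = Ho and H = 1 - Ho."""
    homozygosity = 1.0 - ho
    return ho ** 2 * (1.0 - 2.0 * ho * homozygosity ** 2)


# Row mh10KK-085 of the table: MP = 0.3750, Ho = 0.5000.
print(round(power_of_discrimination(0.3750), 4))  # tabulated PD: 0.6250
print(round(typical_paternity_index(0.5000), 4))  # tabulated TPI: 1.0000
print(round(power_of_exclusion(0.5000), 4))       # tabulated PE: 0.1875
```

Because TPI and PE depend only on the observed heterozygosity under these definitions, loci with a large gap between Ho and He (such as mh06KK-025) show low PE despite a moderate He.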

Allele frequencies of 124 microhaplotypes in the Chinese Han population (N = 256). Genot: allele genotype; Count: allele count; Fre: allele frequency.

Forensic parameters of 122 microhaplotypes in the Chinese Han population (N = 256).

The PD values ranged from 0.0232 to 0.9623, with an average of 0.6799. The PD values for 90 loci were > 0.6, indicating that the individual identification capacity of the panel was high. The PEs for 66 loci were > 0.2, with 0.7855 (mh13KK-218) the highest PE value. Observed heterozygosity (Ho) was 0.0070–0.8952, and expected heterozygosity (He) was 0.0117–0.8656. The Ae values for 28 loci were > 3 (Fig. 5), and for another 23 loci, Ae was 2.5–3. Notably, the Ae values for mh13KK-218 and mh05KK-170 were even higher than 7.
Figure 5

Histogram of the Ae values for the 124 microhaplotypes.

To compare the individual identification capacities of the microhaplotypes and STRs, we summarized the PD and Ae values for the 20 microhaplotypes with the highest Ae values in the 124-plex panel and 20 commonly used forensic STRs (data under review in another manuscript) in Supplementary Table S3. The PD values for the microhaplotypes were 0.8691–0.9623 (0.9036 on average), which were very close to the PD range for STRs, 0.7794–0.9592 (0.9094 on average). The Ae values for the microhaplotypes and STRs were also similar. These data suggest that these 20 microhaplotypes are almost as effective as the commonly used forensic STRs for the identification of individuals.

To examine whether the microhaplotypes located on the same chromosome were linked to each other, we calculated LD. The p-values for pairwise linkage analyses are presented in Supplementary Table S4. Among the 124 microhaplotypes, 28 were linked in 10 pairs or groups (Supplementary Table S5) after correction for multiple testing (p < 0.0000065565). Only the locus with the highest Ae value within each linkage pair or group was used to calculate the combined forensic genetic parameters; the other microhaplotypes within the linkage pairs or groups were excluded. This left 106 independent microhaplotypes (124 − 28 + 10), for which the combined match probability (CMP) and combined power of exclusion (CPE) were calculated to be 5.23 × 10^−66 and 1 − 4.28 × 10^−16, respectively.
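
The arithmetic behind the LD threshold and the combined parameters can be sketched as follows. Two assumptions are made that the text does not state: that the multiple-testing correction is a Bonferroni correction of α = 0.05 over all pairs of 124 loci (which does reproduce the reported threshold), and that the combined values are simple products over independent loci. The three loci in the usage example are taken from the parameter table purely for illustration; they are not the full set of 106.

```python
import math

# Assumed Bonferroni correction: alpha divided by the number of locus pairs.
N_LOCI = 124
ALPHA = 0.05  # assumption; not stated in the text
n_pairs = math.comb(N_LOCI, 2)
ld_threshold = ALPHA / n_pairs

# 28 loci fall into 10 linkage pairs/groups; one locus per group is kept.
n_independent = N_LOCI - 28 + 10


def combined_match_probability(mps):
    """Product of per-locus match probabilities (independent loci)."""
    return math.prod(mps)


def combined_non_exclusion(pes):
    """Returns 1 - CPE, i.e. the product of (1 - PE) over loci.

    The complement is returned because CPE itself is so close to 1
    that it cannot be distinguished from 1.0 in double precision.
    """
    return math.prod(1.0 - pe for pe in pes)


print(n_pairs, n_independent)
print(f"{ld_threshold:.10f}")

# Illustration with three loci on different chromosomes:
# mh13KK-218, mh05KK-170, mh01KK-205 (MP and PE from the table).
mps = [0.0377, 0.0380, 0.0826]
pes = [0.7855, 0.7619, 0.5504]
print(combined_match_probability(mps))
print(combined_non_exclusion(pes))
```

Reporting CPE as 1 − x, as the text does, follows the same logic as returning the complement here: with over a hundred loci the non-exclusion product is on the order of machine epsilon.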

Discussion

Since the concept of microhaplotypes was introduced, their advantages as genetic markers in forensics have been demonstrated progressively. Several research groups have studied microhaplotypes and provided data for different populations. Hiroaki et al. studied 27 multiple-SNP haplotype blocks in a Japanese population[4]. Chen and coworkers presented a novel panel of 26 microhaplotypes with relatively high Ae (>3.0) and short sequence lengths (<50 bp)[17]. Voskoboinik et al. reported a panel of 10 highly polymorphic haplotypes, each containing more than 10 SNPs[10]. However, fewer studies have examined highly multiplexed systems. In this study, we developed a single-tube 124-plex assay for forensic microhaplotypes for use with next-generation sequencing. The sequencing data from the 124-plex panel showed good intralocus and interlocus balance (Figs. 2 and 3), with over 90% of the reads classified as effective (Fig. 1).

Mixture deconvolution is one of the major forensic applications for which microhaplotypes are advantageous, and the excellent intralocus balance of this panel provides a reliable foundation for mixture analyses. Microhaplotypes are expected to provide a better solution than STRs to forensic mixture analyses because they circumvent the interference by stutters[3,18-20]. However, the extent to which microhaplotypes can improve mixture deconvolution has been unclear. Therefore, we undertook parallel mixture experiments and in-depth comparative analyses of CE-based STR and NGS-based microhaplotype genotyping. Our results show that only 38.10% and 4.76% of STR loci effectively analyzed 1:9 and 1:19 mixtures, respectively, whereas 99.19% and 91.94% of the microhaplotypes effectively analyzed the same mixtures (Table 2). The microhaplotypes were superior to STRs in the analysis of forensic mixtures because they avoided not only interference by stutters, but also the dropout of minor contributor alleles. It should be noted that these results were obtained from a single experiment at each mixture ratio and need further verification. Probabilistic genotyping programs, including LRmix[21], STRmix[22], and EuroForMix[23], have been developed. Using semicontinuous or fully continuous models, these programs provide optional solutions for mixed STR profile deconvolution. As noted by Bennett et al.[24], similar probabilistic calculations could also be helpful in mixed microhaplotype data analyses.

To evaluate the capacity of the microhaplotypes to identify individuals and family/clan relationships in a Han Chinese population, we sequenced the DNA of 256 unrelated individuals. A statistical analysis showed that the majority of the microhaplotypes sequenced were highly polymorphic and informative in the Gansu Han population. The CMPs for most commercial forensic STR kits range from 10^−17 to 10^−26 [25-27]. In this study, the CMP for 106 microhaplotypes was 5.23 × 10^−66, which is tens of orders of magnitude lower than those of STR multiplex systems. These data demonstrate that microhaplotypes are powerful genetic markers for the precise identification of individuals. Some less polymorphic microhaplotypes in the Han Chinese population were kept in the 124-plex panel, including 2 markers that showed no genetic diversity. The ancestry inference capacity of these microhaplotypes has been extensively discussed by Kidd et al.[8,15,28,29]. The potential application of the 124-plex panel in ancestry inference awaits further study.

Conclusions

We have developed an NGS-based 124-plex panel of microhaplotypes. Mixture experiments showed that the microhaplotypes are superior to STRs in forensic mixture analysis because they avoid not only interference by stutters, but also the dropout of minor contributor alleles. The DNA of 256 Chinese Han individuals was sequenced with the 124-plex panel. The estimated forensic parameters showed that the 20 microhaplotype loci with the highest Ae values in the 124-plex panel were as efficient as STRs in the identification of individuals, and that CMP for 106 microhaplotypes was 5.23 × 10^−66. These data demonstrate that the 124-plex microhaplotype panel provides an additional tool for forensic applications.

Materials and Methods

DNA samples

Blood samples were collected from unrelated Han Chinese individuals. Written informed consent was given by the blood donors and this work was approved by the Ethical Review Board of the Institute of Forensic Science, Ministry of Public Security of China (Beijing, China). All methods were performed in accordance with the relevant guidelines and regulations. DNA was extracted with the MagAttract M48 DNA Manual Kit (Qiagen, Limburg, Germany), according to the manufacturer’s guidelines. The extracted DNA samples were quantified with the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA) on a Qubit fluorometer (Thermo Fisher Scientific). The female genomic DNA standard 9947A (Promega, Madison, WI, USA) was used in the sensitivity assays. Massively parallel sequencing was performed on a dilution series of genomic samples (1.0, 0.5, 0.2, and 0.1 ng). For the mixture experiments, standard genomic DNAs 9947A and 2800M (Promega) were mixed in ratios of 1:1, 1:3, 1:6, 1:9, and 1:19, to a total amount of 1.0 ng.
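
As a small worked example of the mixture design, the sketch below converts each stated ratio into the amount of each component at a fixed total input of 1.0 ng. The function name is illustrative, and the text does not say which standard was the minor component, so the two amounts are simply labeled by their position in the ratio.

```python
def mixture_amounts(ratio, total_ng=1.0):
    """Split a total DNA input between two components mixed at ratio a:b."""
    a, b = ratio
    parts = a + b
    return total_ng * a / parts, total_ng * b / parts


# Ratios used in the mixture experiments, at 1.0 ng total input.
for ratio in [(1, 1), (1, 3), (1, 6), (1, 9), (1, 19)]:
    first_ng, second_ng = mixture_amounts(ratio)
    print(f"{ratio[0]}:{ratio[1]}  {first_ng:.3f} ng  {second_ng:.3f} ng")
```

At 1:19 the smaller component amounts to 0.05 ng, which is below the lowest input (0.1 ng) tested in the sensitivity series.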

Multiplex amplification

Primers were designed for the 130 microhaplotype loci reported by Kidd et al.[8] with the Primer Premier 5.0 software[30]. After repeated optimization of the primer sequences and the PCR conditions, 124 microhaplotypes were successfully multiplexed in a single reaction system (Table 1). The PCRs were performed in a total volume of 20 μL containing 20 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.6 mM MgCl2, 0.8 mg/mL bovine serum albumin, 0.2% (v/v) Tween 20, 3.2% (v/v) glycerol, 0.02% (w/v) NaN3, 200 mM each dNTP, 2 U of Taq DNA polymerase (Roche, Basel, Switzerland), primer pairs (concentrations indicated in Table 1), and 1 ng of template DNA. The PCR conditions were 95 °C for 11 min, followed by 28 cycles of 30 s at 94 °C, 2 min at 60 °C, and 1 min at 72 °C, with a final elongation step at 60 °C for 60 min.

Library preparation and sequencing

The PCR products were purified with the QIAquick 96 PCR Purification Kit (Qiagen), and libraries were prepared with the TruSeq DNA PCR-Free HT Kit (Illumina, San Diego, CA, USA), according to the manufacturers’ guidelines. The libraries were sequenced on a MiSeq FGx platform (Illumina) using the MiSeq Reagent Kit v2 (Illumina), with a read length of 250 bases.

Data analysis

FASTQ data were generated with the MiSeq FGx Control Software 1.0.15.0 (Illumina). The MHTyper software[31] was employed for microhaplotype allele calling, with the sequencing depth threshold set at 30 reads. The hg19 human genome was used as the reference sequence. The allele frequencies and forensic statistical parameters (match probability, MP; power of discrimination, PD; power of exclusion, PE; typical paternity index, TPI) were calculated with Modified-PowerStat spreadsheet 1.2[32]. Arlequin 3.5[33] was used to calculate the observed heterozygosity, expected heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD). The effective number of alleles (Ae) was calculated with the formula described in a previous publication[3].
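
The Ae formula itself is not reproduced in the text. A minimal sketch, assuming the definition commonly used for microhaplotypes, Ae = 1 / Σ p_i², where p_i are the allele frequencies at a locus, is given below. The expected-heterozygosity function assumes the unbiased gene-diversity estimator, n/(n − 1) × (1 − Σ p_i²) with n = 2N gene copies; this is an assumption about the software's estimator, but it is consistent with the tabulated He and Ae pairs (for example, He = 0.5010 and Ae = 1.9999 for mh10KK-085 with N = 256). The allele frequencies in the usage example are hypothetical.

```python
def effective_number_of_alleles(freqs):
    """Ae = 1 / sum(p_i^2) for the allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs, n_gene_copies):
    """Unbiased gene diversity: n / (n - 1) * (1 - sum(p_i^2))."""
    homozygosity = sum(p * p for p in freqs)
    return n_gene_copies / (n_gene_copies - 1) * (1.0 - homozygosity)


# Hypothetical three-allele locus typed in 256 individuals (512 gene copies).
freqs = [0.5, 0.3, 0.2]
print(round(effective_number_of_alleles(freqs), 4))
print(round(expected_heterozygosity(freqs, 512), 4))
```

Under this definition a locus with k equally frequent alleles has Ae = k, so Ae can be read as the number of equally frequent alleles that would give the same homozygosity.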

CE-based STR genotyping

The GlobalFiler® Kit (Thermo Fisher Scientific) was used for CE-based STR genotyping, according to the manufacturer’s recommendations. An aliquot of PCR product (1 µL) was added to 10 µL of deionized formamide (Thermo Fisher Scientific) containing the internal size standards. All samples were separated on a 3500XL Genetic Analyzer (Thermo Fisher Scientific) using POP™-4 Polymer (Thermo Fisher Scientific) and a 36 cm capillary array (Thermo Fisher Scientific). The GeneMapper® ID-X software v4.0 (Thermo Fisher Scientific) was used for fragment sizing and allele calling.

Supplementary Figures S1–S19 and Supplementary Tables S1–S3 and S5. Supplementary Table S4.
  23 in total

1.  Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows.

Authors:  Laurent Excoffier; Heidi E L Lischer
Journal:  Mol Ecol Resour       Date:  2010-03-01       Impact factor: 7.090

2.  Approaches for identifying multiple-SNP haplotype blocks for use in human identification.

Authors:  Nakahara Hiroaki; Fujii Koji; Kitayama Tetsushi; Sekiguchi Kazumasa; Nakanishi Hiroaki; Saito Kazuyuki
Journal:  Leg Med (Tokyo)       Date:  2015-06-20       Impact factor: 1.376

3.  Evaluating 130 microhaplotypes across a global set of 83 populations.

Authors:  Kenneth K Kidd; William C Speed; Andrew J Pakstis; Daniele S Podini; Robert Lagacé; Joseph Chang; Sharon Wootton; Eva Haigh; Usha Soundararajan
Journal:  Forensic Sci Int Genet       Date:  2017-03-16       Impact factor: 4.882

4.  Genotyping polymorphic microhaplotype markers through the Illumina® MiSeq platform for forensics.

Authors:  Jing Zhu; Meili Lv; Nan Zhou; Dan Chen; Youjing Jiang; Li Wang; Wang He; Duo Peng; Zhilong Li; Shengqiu Qu; Yinji Wang; Hui Wang; Haibo Luo; Gang An; Weibo Liang; Lin Zhang
Journal:  Forensic Sci Int Genet       Date:  2018-11-16       Impact factor: 4.882

5.  Genetic data for PowerPlex 21™ autosomal and PowerPlex 23 Y-STR™ loci from population of the state of Uttar Pradesh, India.

Authors:  Ankit Srivastava; Ramkishan Kumawat; Shivani Dixit; Kamlesh Kaitholia; Divya Shrivastava; Vijay Kumar Yadav; Kriti Nigam; Harsh Sharma; Veena Ben Trivedi; Gyaneshwer Chaubey; Pankaj Shrivastava
Journal:  Int J Legal Med       Date:  2019-01-04       Impact factor: 2.686

6.  Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.

Authors:  Kenneth K Kidd; Andrew J Pakstis; William C Speed; Robert Lagacé; Joseph Chang; Sharon Wootton; Eva Haigh; Judith R Kidd
Journal:  Forensic Sci Int Genet       Date:  2014-07-01       Impact factor: 4.882

7.  Euroforgen-NoE collaborative exercise on LRmix to demonstrate standardization of the interpretation of complex DNA profiles.

Authors:  L Prieto; H Haned; A Mosquera; M Crespillo; M Alemañ; M Aler; F Alvarez; C Baeza-Richer; A Dominguez; C Doutremepuich; M J Farfán; M Fenger-Grøn; J M García-Ganivet; E González-Moya; L Hombreiro; M V Lareu; B Martínez-Jarreta; S Merigioli; P Milans Del Bosch; N Morling; M Muñoz-Nieto; E Ortega-González; S Pedrosa; R Pérez; C Solís; I Yurrebaso; P Gill
Journal:  Forensic Sci Int Genet       Date:  2013-10-31       Impact factor: 4.882

8.  Allele frequencies of 15 STR loci (Identifiler™ kit) in Basque-Americans.

Authors:  Jason Besecker; Gianluca Peri; Michael Davis; Josu Zubizarreta; Greg Hampikian
Journal:  Leg Med (Tokyo)       Date:  2017-12-12       Impact factor: 1.376

9.  A novel bifunctional europium complex as a potential fluorescent label for DNA detection.

Authors:  Pin-Zhu Qin; Cheng-Gang Niu; Min Ruan; Guang-Ming Zeng; Xiao-Yu Wang
Journal:  Analyst       Date:  2010-06-28       Impact factor: 4.616

Review 10.  Microhaplotypes in forensic genetics.

Authors:  Fabio Oldoni; Kenneth K Kidd; Daniele Podini
Journal:  Forensic Sci Int Genet       Date:  2018-10-01       Impact factor: 4.882
