Yilin Liu, Jiao Xu, Miaoxia Chen, Changfa Wang, Shuaicheng Li.
Abstract
BACKGROUND: Short tandem repeats (STRs) serve as genetic markers in forensic settings because they are highly polymorphic in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, and cattle. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the large amount of whole-genome data now available for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that can be applied to multiple species simultaneously.

RESULT: To find a unified STR set, we collected whole-genome sequence data for the species of interest and mapped them to the human reference genome. We then extracted the STR loci across the species. From these loci, we propose an algorithm that selects a subset by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, > 1 − 10^-9, for both individual species and the mixed population, and a random-match probability < 10^-7 for all species involved, indicating that the identified set of STR loci could be applied to multiple species.
Keywords: Individual identification; Short tandem repeats; Whole genome sequencing
Year: 2019 PMID: 31861983 PMCID: PMC6923897 DOI: 10.1186/s12859-019-3246-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Impact of sample size on allele detection and major forensic parameters (PD, HE, MPF)
Fig. 2 The distribution of call rate
Fig. 3 Distribution of PD of remaining loci (η_l ≥ 0.5)
Forensic parameters of selected loci
| CHR | POS | RUL | HE | PD |
|---|---|---|---|---|
| chr17 | 7787390 | 2 | 0.814716312056738 | 0.935450047622492 |
| chr14 | 57279935 | 2 | 0.827094474153298 | 0.932795344883323 |
| chr11 | 57467882 | 2 | 0.828629032258065 | 0.932229995727539 |
| chr1 | 207997377 | 2 | 0.785381285381285 | 0.923859365077692 |
| chr11 | 46142707 | 2 | 0.8 | 0.917120154272911 |
| chr16 | 4322949 | 2 | 0.782472613458529 | 0.913962030606996 |
| chr15 | 98292738 | 2 | 0.777777777777778 | 0.901665702815812 |
| chr3 | 119541091 | 2 | 0.780241935483871 | 0.900781631469727 |
| chr15 | 73889791 | 5 | 0.767195767195767 | 0.891822417742607 |
| chr2 | 39201053 | 2 | 0.747619047619048 | 0.881054955418381 |
| chr2 | 241696826 | 3 | 0.747619047619048 | 0.875768032693187 |
| chr12 | 118588317 | 5 | 0.758241758241758 | 0.873070486988594 |
| chr9 | 14086350 | 4 | 0.734042553191489 | 0.86962890625 |
| chr6 | 163992752 | 6 | 0.719774011299435 | 0.868980555555556 |
| chr5 | 124081853 | 2 | 0.733333333333333 | 0.863633976401387 |
| chr8 | 22619315 | 2 | 0.701149425287356 | 0.856881481481481 |
| chr18 | 72357594 | 2 | 0.683257918552036 | 0.856743570778334 |
| chr2 | 135703320 | 3 | 0.711693548387097 | 0.851266860961914 |
| chr2 | 36777760 | 6 | 0.68974358974359 | 0.8426265625 |
| chr17 | 49255307 | 3 | 0.727272727272727 | 0.842592592592593 |
| chr3 | 114173776 | 2 | 0.727272727272727 | 0.842592592592593 |
| chr3 | 137413014 | 2 | 0.666666666666667 | 0.821603869787472 |
| chr16 | 22092942 | 2 | 0.692307692307692 | 0.814504373177843 |
| chr3 | 114033630 | 3 | 0.674645390070922 | 0.810491491247107 |
| chr4 | 54876122 | 6 | 0.634765294711289 | 0.808673104516968 |
| chr1 | 176522684 | 2 | 0.712121212121212 | 0.805266203703704 |
| chr22 | 36140123 | 4 | 0.67032967032967 | 0.786182840483132 |
| chr1 | 27108339 | 5 | 0.62145390070922 | 0.77700524691358 |
| chr3 | 160219801 | 2 | 0.591666666666667 | 0.757476806640625 |
| chr4 | 17885278 | 2 | 0.546654861535651 | 0.68947775749674 |
| chr3 | 187439732 | 2 | 0.533333333333333 | 0.6144 |
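
The HE and PD columns above are per-locus statistics. As a minimal sketch of how such values are commonly computed and combined, the Python below uses the textbook definitions: expected heterozygosity as one minus the sum of squared allele frequencies, power of discrimination as one minus the sum of squared genotype frequencies, and a combined power of discrimination that assumes independent loci. The function names are hypothetical, and the source may use different estimators (for example, a small-sample correction), so this is not guaranteed to reproduce the tabulated numbers.

```python
from collections import Counter
from math import prod


def expected_heterozygosity(alleles):
    """HE = 1 - sum(p_i^2) over allele frequencies (no small-sample correction)."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def power_of_discrimination(genotypes):
    """PD = 1 - sum(g_j^2) over genotype frequencies; allele order is ignored."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def combined_pd(pd_values):
    """Combined PD across loci, assuming the loci are independent."""
    return 1.0 - prod(1.0 - p for p in pd_values)


# Toy data (not from the source): repeat counts observed at one locus.
toy_genotypes = [(10, 12), (12, 10), (10, 10), (12, 12)]
toy_alleles = [a for g in toy_genotypes for a in g]
toy_he = expected_heterozygosity(toy_alleles)
toy_pd = power_of_discrimination(toy_genotypes)
toy_cpd = combined_pd([0.5, 0.5])
```

Passing the PD column of the table to `combined_pd` gives the combined value under the independence assumption only.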
Evaluation of selected loci on each species
| Species | RMP | CPD | CPE |
|---|---|---|---|
| cat | 5.0615×10^-10 | 0.99999999999993383 | 0.99978049249880108 |
| cattle | 1.0536×10^-8 | 0.99999999998633915 | 0.99794653711292580 |
| dog | 3.9239×10^-10 | 0.99999999999975098 | 0.99970843603702098 |
| goat | 2.3005×10^-8 | 0.99999999997592071 | 0.99857509526114097 |
| horse | 5.0166×10^-8 | 0.99999999977198839 | 0.99625817962597518 |
| human | 3.3695×10^-8 | 0.99999999999140876 | 0.99933064560113372 |
| pig | 6.2908×10^-8 | 0.99999999985080279 | 0.99713785294534030 |
| rabbit | 4.4219×10^-10 | 0.99999999999992795 | 0.99980453489094234 |
| sheep | 5.0696×10^-10 | 0.99999999999990974 | 0.99975253609811832 |
| yak | 1.9169×10^-9 | 0.99999999999823685 | 0.99931237001520157 |
Number of loci selected under different thresholds of RMP (rows) and CPD (columns)
| RMP threshold | 1 − 10^-3 | 1 − 10^-4 | 1 − 10^-5 | 1 − 10^-6 | 1 − 10^-7 |
|---|---|---|---|---|---|
| 10^-3 | 12 | 12 | 14 | 17 | 17 |
| 10^-4 | 16 | 16 | 16 | 17 | 17 |
| 10^-5 | 21 | 20 | 20 | 21 | 21 |
| 10^-6 | 27 | 27 | 27 | 25 | 25 |
| 10^-7 | 31 | 31 | 31 | 31 | 31 |
| 10^-8 | 36 | 36 | 36 | 36 | 36 |
| 10^-9 | 41 | 41 | 41 | 41 | 41 |
| 10^-10 | 46 | 46 | 46 | 46 | 46 |
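
The table shows that stricter thresholds require more loci. To illustrate why, here is a minimal greedy sketch that adds loci in order of decreasing PD until a target is met. It is an illustration only, not the selection algorithm proposed in the paper. The function name `select_loci_greedy` and the parameter `max_miss` (the allowed value of 1 − CPD) are hypothetical, and the loci are assumed to be independent.

```python
from typing import List, Sequence, Tuple


def select_loci_greedy(pd_values: Sequence[float],
                       max_miss: float) -> Tuple[List[int], float]:
    """Pick loci by descending PD until prod(1 - PD_i) <= max_miss.

    Returns the chosen indices and the achieved prod(1 - PD_i), i.e. 1 - CPD
    under an independence assumption. If the target cannot be reached,
    all loci are returned.
    """
    order = sorted(range(len(pd_values)),
                   key=lambda i: pd_values[i], reverse=True)
    chosen: List[int] = []
    miss = 1.0  # probability that none of the chosen loci discriminates
    for i in order:
        if miss <= max_miss:
            break
        chosen.append(i)
        miss *= 1.0 - pd_values[i]
    return chosen, miss


# Toy PD values (not from the source); target CPD >= 0.95, i.e. max_miss = 0.05.
toy_chosen, toy_miss = select_loci_greedy([0.9, 0.5, 0.8], max_miss=0.05)
```

Expressing the target as `max_miss` rather than as a CPD value close to 1 avoids comparing floating-point numbers that differ only in their last few digits.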
Fig. 4 achieved by different numbers of loci
Fig. 5 Number of loci generated for different numbers of species
Forensic parameters of selected loci and loci in CODIS
| | Ours | CODIS |
|---|---|---|
| | 8 | 13 |
| | 4.51×10^-11 | 8.42×10^-10 |
| | 1.35×10^-14 | 4.58×10^-13 |
Fig. 6 Box plot of the common logarithms of RMPs for 10,000 simulated individuals, using CODIS and the loci selected with the proposed method
Fig. 7 Normalized probability distributions of the common logarithm of CPI in trio paternity testing in human
Forensic parameters of loci selected for cattle, goat, and sheep
| Species | RMP | CPD | CPE |
|---|---|---|---|
| cattle | 3.8319×10^-8 | 0.999999999815 | 0.996642777607 |
| goat | 3.4890×10^-8 | 0.999999999908 | 0.998089765874 |
| sheep | 4.6930×10^-10 | 0.999999999995 | 0.999261605878 |