Alessio Boattini, Stefania Sarno, Alessandra M Mazzarisi, Cinzia Viroli, Sara De Fanti, Carla Bini, Maarten H D Larmuseau, Susi Pelotti, Donata Luiselli.
Abstract
In the population genomics era, the study of Y-chromosome variability is still of the greatest interest for several fields ranging from molecular anthropology to forensics and genetic genealogy. In particular, mutation rates of Y-chromosomal Short Tandem Repeat markers (Y-STRs) are key parameters for different interdisciplinary applications. Among them, testing the patrilineal relatedness between individuals and calculating their Time of Most Recent Common Ancestors (TMRCAs) are of the utmost importance. To provide new estimates and to address these issues, we typed 47 Y-STRs (comprising Yfiler, PowerPlex23 and YfilerPlus loci, the recently defined Rapidly Mutating [RM] panel and 11 additional markers often used in genetic genealogical applications) in 135 individuals belonging to 66 deep-rooting paternal genealogies from Northern Italy. Our results confirmed that the genealogy approach is an effective way to obtain reliable Y-STR mutation rate estimates even with a limited number of samples. Moreover, they showed that the impact of multi-step mutations and backmutations is negligible within the temporal scale usually adopted by forensic and genetic genealogy analyses. We then detected a significant association between the number of mutations within genealogies and observed TMRCAs. Therefore, we compared observed and expected TMRCAs by implementing a Bayesian procedure originally designed by Walsh (2001) and showed that the method performs well (up to 96.72% of observed TMRCAs falling within the estimated intervals), especially when using the Infinite Alleles Model (IAM).
Year: 2019 PMID: 31227725 PMCID: PMC6588691 DOI: 10.1038/s41598-019-45398-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. Typed Y-STRs and estimated mutation rates.
| Y-STR | Panel | Mutations (Single) | Mutations (Multi) | Meioses | Val (Single), ×10⁻³ | CI 95% lower (Single) | CI 95% upper (Single) | Val (Multi), ×10⁻³ | CI 95% lower (Multi) | CI 95% upper (Multi) | Ref. FS | Ref. GP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DYS19 | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 3.99 | 2.26 |
| DYS389I | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 5.14 | 4.24 |
| DYS389II-I | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 3.44 | 4.80 |
| DYS390 | Yf/PP/YfP/L | 5 | 5 | 1166 | 4.29 | 0.86 | 8.58 | 4.29 | 0.86 | 8.58 | 1.14 | 2.54 |
| DYS391 | Yf/PP/YfP/L | 2 | 2 | 1166 | 1.72 | 0.00 | 4.29 | 1.72 | 0.00 | 4.29 | 2.84 | 4.52 |
| DYS392 | Yf/PP/YfP/L | 3 | 6 | 1166 | 2.57 | 0.00 | 6.00 | 5.15 | 1.72 | 9.43 | 0.58 | 1.13 |
| DYS393 | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 1.71 | 1.13 |
| DYS385 (a-b) | Yf/PP/YfP/L | 8 | 8 | 1166 | 6.86 | 2.57 | 12.01 | 6.86 | 2.57 | 12.01 | 5.33 | 5.37 |
| DYS437 | Yf/PP/YfP/L | 4 | 4 | 1166 | 3.43 | 0.86 | 6.86 | 3.43 | 0.86 | 6.86 | 1.14 | 3.11 |
| DYS438 | Yf/PP/YfP/L | 0 | 0 | 1166 | — | — | — | — | — | — | 0.57 | 0.28 |
| DYS439 | Yf/PP/YfP/L | 7 | 9 | 1166 | 6.00 | 1.72 | 11.15 | 7.72 | 3.43 | 12.86 | 3.46 | 3.67 |
| DYS448 | Yf/PP/YfP/L | 2 | 2 | 1166 | 1.72 | 0.00 | 4.29 | 1.72 | 0.00 | 4.29 | 0.00 | 1.13 |
| DYS456 | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 4.55 | 7.91 |
| DYS458 | Yf/PP/YfP/L | 12 | 13 | 1166 | 10.29 | 5.15 | 16.30 | 11.15 | 5.15 | 17.15 | 7.97 | 9.04 |
| DYS635 | Yf/PP/YfP/L | 2 | 2 | 1166 | 1.72 | 0.00 | 4.29 | 1.72 | 0.00 | 4.29 | 3.46 | 5.65 |
| GATA H4.1 | Yf/PP/YfP/L | 3 | 3 | 1166 | 2.57 | 0.00 | 6.00 | 2.57 | 0.00 | 6.00 | 2.85 | 1.98 |
| DYS481 | PP/YfP/L | 3 | 3 | 810 | 3.70 | 0.00 | 8.64 | 3.70 | 0.00 | 8.64 | 4.59 | 6.22 |
| DYS533 | PP/YfP/L | 3 | 3 | 810 | 3.70 | 0.00 | 8.64 | 3.70 | 0.00 | 8.64 | 4.62 | 3.96 |
| DYS570 | PP/YfP/RM/L | 9 | 9 | 1166 | 7.72 | 3.43 | 12.86 | 7.72 | 3.43 | 12.86 | 11.92 | 9.89 |
| DYS576 | PP/YfP/RM/L | 10 | 11 | 1166 | 8.58 | 3.43 | 14.58 | 9.43 | 4.29 | 15.44 | 13.90 | 7.91 |
| DYS549 | PP/L | 8 | 8 | 810 | 9.88 | 3.70 | 17.28 | 9.88 | 3.70 | 17.28 | 4.16 | 4.52 |
| DYS643 | PP/L | 0 | 0 | 810 | — | — | — | — | — | — | 1.13 | 1.13 |
| DYS460 | YfP/L | 1 | 1 | 718 | 1.39 | 0.00 | 4.18 | 1.39 | 0.00 | 4.18 | 5.82 | 4.52 |
| DYS518 | YfP/RM/L | 18 | 18 | 1166 | 15.44 | 8.58 | 23.16 | 15.44 | 8.58 | 23.16 | 17.99 | 18.65 |
| DYS627 | YfP/RM/L | 17 | 19 | 1166 | 14.58 | 7.72 | 21.44 | 16.30 | 9.43 | 24.01 | 11.89 | 15.55 |
| DYF387S1 | YfP/RM/L | 13 | 13 | 1166 | 11.15 | 5.15 | 17.15 | 11.15 | 5.15 | 17.15 | 15.52 | 15.26 |
| DYS449 | YfP/RM/L | 9 | 9 | 1166 | 7.72 | 3.43 | 12.86 | 7.72 | 3.43 | 12.86 | 11.75 | 11.59 |
| DYS459 (a-b) | L | 0 | 0 | 718 | — | — | — | — | — | — | 2.30 | 2.54 |
| DYS724 (a-b) | L | 25 | 29 | 718 | 34.82 | 22.28 | 48.75 | 40.39 | 26.46 | 55.71 | — | 29.11 |
| DYS607 | L | 0 | 0 | 718 | — | — | — | — | — | — | — | 2.54 |
| DYS455 | L | 0 | 0 | 718 | — | — | — | — | — | — | 0.00 | 1.41 |
| DYS426 | L | 1 | 1 | 718 | 1.39 | 0.00 | 4.18 | 1.39 | 0.00 | 4.18 | 0.00 | 1.13 |
| DYS454 | L | 0 | 0 | 718 | — | — | — | — | — | — | 0.00 | 0.28 |
| DYS447 | L | 3 | 3 | 718 | 4.18 | 0.00 | 9.75 | 4.18 | 0.00 | 9.75 | 1.74 | 3.67 |
| DYS442 | L | 2 | 2 | 718 | 2.79 | 0.00 | 6.96 | 2.79 | 0.00 | 6.96 | 9.35 | 4.80 |
| DYS464 (a-b-c-d) | L | 17 | 23 | 718 | 23.68 | 13.93 | 34.82 | 32.03 | 19.50 | 45.96 | 27.51 | — |
| YCAII (a-b) | L | 0 | 0 | 718 | — | — | — | — | — | — | — | 1.13 |
| DYS388 | L | 0 | 0 | 718 | — | — | — | — | — | — | 0.00 | 0.28 |
| DYF399S1 | RM | 70 | 74 | 1149 | 60.92 | 47.87 | 74.85 | 64.40 | 50.48 | 79.20 | 77.48 | — |
| DYS526A | RM | 1 | 1 | 1166 | 0.86 | 0.00 | 2.57 | 0.86 | 0.00 | 2.57 | 2.33 | — |
| DYS626 | RM | 9 | 10 | 1166 | 7.72 | 3.43 | 12.86 | 8.58 | 3.43 | 14.58 | 11.84 | — |
| DYS526B | RM | 5 | 6 | 1166 | 4.29 | 0.86 | 8.58 | 5.15 | 1.72 | 9.43 | | — |
| DYS612 | RM | 16 | 19 | 1166 | 13.72 | 7.72 | 20.58 | 16.30 | 9.43 | 24.01 | 14.15 | — |
| DYS547 | RM | 15 | 16 | 1166 | 12.86 | 6.86 | 19.73 | 13.72 | 7.72 | 20.58 | 23.23 | — |
| DYF404S1 | RM | 14 | 17 | 1166 | 12.01 | 6.00 | 18.87 | 14.58 | 7.72 | 21.44 | 12.08 | — |
| DYF403S1a | RM | 29 | 29 | 1152 | 25.17 | 16.49 | 34.72 | 25.17 | 16.49 | 34.72 | 30.59 | — |
| DYF403S1b | RM | 4 | 5 | 1166 | 3.43 | 0.86 | 6.86 | 4.29 | 0.86 | 8.58 | — | |
| Yfiler (Yf) | | 63 | 69 | 18656 | 3.38 | 2.57 | 4.23 | 3.70 | 2.84 | 4.61 | 3.01 | 3.67 |
| PowerPlex Y23 (PP) | | 96 | 103 | 24228 | 3.96 | 3.18 | 4.79 | 4.25 | 3.47 | 5.08 | 3.95 | 4.20 |
| Yfiler Plus (YfP) | | 146 | 155 | 27990 | 5.22 | 4.39 | 6.07 | 5.54 | 4.68 | 6.43 | 5.74 | 6.09 |
| Leuven (L) | | 202 | 221 | 37508 | 5.39 | 4.67 | 6.13 | 5.89 | 5.12 | 6.69 | — | 5.54^ |
| Rapidly Mutating (RM) | | 239 | 256 | 17459 | 13.69 | 11.97 | 15.46 | 14.66 | 12.89 | 16.50 | — | |
| All | | 365 | 398 | 47971 | 7.61 | 6.84 | 8.40 | 8.30 | 7.50 | 9.11 | — | — |
Locus-specific and panel-wise estimates (Val) along with the corresponding 95% confidence intervals (CI) are calculated both with all mutations as single events (Single) and with multi-step mutations as sum of independent events (Multi). Reference values extracted from the literature are also included, with FS = Father-Son pairs[20] and GP = Genealogical Pairs[21]. Significant comparisons (Fisher tests) are highlighted in bold, with *p < 0.05 and **p < 0.005.
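As a worked illustration of how the tabulated rates relate to the raw counts, the sketch below divides the number of mutations by the number of meioses and scales by 10³. The interval method is an assumption: this record does not state how the 95% CIs were computed. Taking the 2.5% and 97.5% quantiles of a binomial distribution with the observed rate happens to reproduce the tabulated bounds for the two loci checked here (DYS19 and DYS390), so that is what `binomial_quantile` does. All function names are hypothetical.

```python
def binomial_quantile(n, p, prob):
    """Smallest k such that P(X <= k) >= prob for X ~ Binomial(n, p)."""
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    pmf = (1.0 - p) ** n          # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < prob and k < n:
        # Recurrence: P(X = k+1) = P(X = k) * (n-k)/(k+1) * p/(1-p)
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)
        k += 1
        cdf += pmf
    return k


def rate_with_interval(mutations, meioses, level=0.95):
    """Return (rate, lower, upper), all scaled by 10^3 and rounded to 2 decimals.

    Assumption: the interval is the pair of binomial quantiles at the
    observed rate; the source does not confirm this is the authors' method.
    """
    p = mutations / meioses
    tail = (1.0 - level) / 2.0
    lower = binomial_quantile(meioses, p, tail)
    upper = binomial_quantile(meioses, p, 1.0 - tail)
    scale = 1000.0 / meioses
    return (round(p * 1000.0, 2), round(lower * scale, 2), round(upper * scale, 2))


# Counts taken from the DYS19 and DYS390 rows (Single columns).
print(rate_with_interval(3, 1166))
print(rate_with_interval(5, 1166))
```

The same division applies to the panel rows, for example 63 mutations over 18656 meioses for Yfiler.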
Figure 1. Overall mutation rates and 95% confidence intervals for the considered Y-STR sets (abbreviations as in Table 1).
Figure 2. Diachronic changes of overall mutation rates for the considered Y-STR sets (abbreviations as in Table 1) on three increasing bins of meioses (7–10, 11–19, >19).
Summary statistics, by panel and for all markers, of linear regression models with TMRCA as a function of the number of observed mutations, together with Spearman rank correlations between the same variables.
| Panel | Single/Multi | Nr. Meioses | Regression coefficient | Regression P-value | R-squared (Multiple) | R-squared (Adjusted) | Spearman Rho | Spearman P-value |
|---|---|---|---|---|---|---|---|---|
| Yfiler | Single | 1941 | 0.04688 | 4.850E-04 | 0.09174 | 0.08458 | 0.3119116 | 3.202E-04 |
| Yfiler | Multi | 1941 | 0.06083 | 7.746E-05 | 0.11610 | 0.10920 | 0.3345687 | 1.065E-04 |
| PP23 | Single | 870 | 0.05334 | 3.833E-02 | 0.05904 | 0.04578 | 0.2725257 | 1.967E-02 |
| PP23 | Multi | 870 | 0.07743 | 6.171E-03 | 0.10090 | 0.08824 | 0.3092618 | 7.760E-03 |
| YfilerPlus | Single | 761 | 0.10090 | 1.144E-03 | 0.16040 | 0.14660 | 0.4202341 | 6.059E-04 |
| YfilerPlus | Multi | 761 | 0.13253 | 8.568E-05 | 0.22510 | 0.21240 | 0.4675362 | 1.118E-04 |
| RM | Single | 1941 | 0.15780 | 1.918E-12 | 0.32410 | 0.31880 | 0.5891791 | 2.062E-13 |
| RM | Multi | 1941 | 0.15760 | 3.798E-10 | 0.26650 | 0.26070 | 0.5365281 | 5.616E-11 |
| Leuven | Single | 761 | 0.19030 | 5.564E-06 | 0.28890 | 0.27720 | 0.4963118 | 3.522E-05 |
| Leuven | Multi | 761 | 0.26460 | 3.761E-07 | 0.34720 | 0.33650 | 0.5458658 | 3.709E-06 |
| All markers | Single | 761 | 0.32620 | 1.918E-09 | 0.44880 | 0.43980 | 0.5763884 | 7.675E-07 |
| All markers | Multi | 761 | 0.39730 | 2.584E-09 | 0.44350 | 0.43440 | 0.5713817 | 1.005E-06 |
Calculations were performed considering all mutations as single events (Single) and multi-step mutations as sum of independent events (Multi). Panel abbreviations as in Table 1.
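To make the columns of this table concrete, the following sketch computes the same kinds of quantities (regression coefficient, multiple and adjusted R-squared, Spearman rho) for one predictor and one response. It is a plain-Python illustration, not the authors' analysis code. The input values are invented toy numbers, the function names are hypothetical, and p-values are deliberately left out.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with R-squared values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = pearson(x, y) ** 2
    adjusted = 1 - (1 - r2) * (n - 1) / (n - 2)  # one predictor
    return slope, intercept, r2, adjusted


# Toy data only: x = observed mutations, y = TMRCA, following the caption.
mutations = [0, 1, 2, 3, 4]
tmrca = [1, 3, 5, 7, 9]
print(linear_fit(mutations, tmrca), spearman(mutations, tmrca))
```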
Performance of the Walsh[32] procedure as the percentage of observed TMRCA values falling within the estimated confidence intervals using the whole dataset (All data) along with Rapidly Mutating (RM) and Leuven (L) panels and two different mutation models (IAM = Infinite Alleles Model; SMM = Stepwise Mutation Model).
| Panel | Model | % TMRCA within estimated CIs |
|---|---|---|
| RM | IAM | 96.72% |
| RM | SMM | 93.44% |
| L | IAM | 91.80% |
| L | SMM | 88.52% |
| All data | IAM | 96.72% |
| All data | SMM | 91.80% |
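The percentages above count how often an observed TMRCA falls inside an estimated interval. The sketch below shows one way such a check could be set up for the IAM case, using the standard infinite-alleles result that a locus still matches after t generations with probability exp(-2μt). It makes several simplifying assumptions that are not taken from this record: a flat prior on a bounded grid of generations, one average mutation rate for all loci, and an equal-tailed interval. Function names are hypothetical.

```python
import math


def iam_posterior(n_loci, n_match, mu, t_max=500):
    """Posterior over t = 1..t_max generations under the IAM.

    Assumptions: flat prior on the grid, same mutation rate mu at every
    locus, and each locus matching with probability exp(-2 * mu * t).
    """
    weights = []
    for t in range(1, t_max + 1):
        q = math.exp(-2.0 * mu * t)
        weights.append(q ** n_match * (1.0 - q) ** (n_loci - n_match))
    total = sum(weights)
    return [w / total for w in weights]


def credible_interval(posterior, level=0.95):
    """Equal-tailed interval (in generations) from a grid posterior."""
    tail = (1.0 - level) / 2.0
    cumulative = 0.0
    lower = None
    upper = len(posterior)  # fallback if rounding keeps the sum below 1 - tail
    for t, p in enumerate(posterior, start=1):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if cumulative >= 1.0 - tail:
            upper = t
            break
    return lower, upper


def coverage(observed, intervals):
    """Percentage of observed values lying inside their (lower, upper) interval."""
    hits = sum(lo <= t <= hi for t, (lo, hi) in zip(observed, intervals))
    return 100.0 * hits / len(observed)


# Illustrative pair: 47 loci, 45 matching, mu = 7.61e-3 (the "All" Single rate).
post = iam_posterior(47, 45, 7.61e-3)
print(credible_interval(post))
```

As an arithmetic aside, the four distinct percentages in the table are consistent with counts of 59, 57, 56 and 54 out of 61 comparisons, although the denominator is not stated here.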
Deviation and parameters of both normal regression models and regressions through the origin, with observed TMRCA values as a function of the expected ones.
| Panel | Model | Statistic | Deviation | Beta (with intercept) | CI_2.5% (with intercept) | CI_97.5% (with intercept) | Pval (with intercept) | R2_M | R2_A | Beta (no intercept) | CI_2.5% (no intercept) | CI_97.5% (no intercept) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RM | SMM | MEAN | 4.73 | 0.25 | 0.16 | 0.33 | 1.28E-07 | 0.38 | 0.37 | 0.47 | 0.42 | 0.53 |
| RM | SMM | MEDIAN | 3.34 | 0.27 | 0.18 | 0.36 | 1.17E-07 | 0.38 | 0.37 | 0.53 | 0.46 | 0.60 |
| RM | SMM | D.MODE | 1.13 | 0.31 | 0.21 | 0.40 | 6.59E-08 | 0.39 | 0.38 | 0.64 | 0.55 | 0.73 |
| RM | IAM | MEAN | 2.87 | 0.35 | 0.23 | 0.46 | 6.81E-08 | 0.39 | 0.38 | 0.60 | 0.54 | 0.67 |
| RM | IAM | MEDIAN | 1.98 | 0.35 | 0.24 | 0.46 | 7.00E-08 | 0.39 | 0.38 | 0.65 | 0.58 | 0.72 |
| RM | IAM | D.MODE | 0.31 | 0.36 | 0.24 | 0.47 | 4.61E-08 | 0.40 | 0.39 | 0.74 | 0.64 | 0.84 |
| L | SMM | MEAN | 4.24 | 0.21 | 0.13 | 0.28 | 1.38E-06 | 0.33 | 0.32 | 0.46 | 0.39 | 0.52 |
| L | SMM | MEDIAN | 3.25 | 0.21 | 0.13 | 0.29 | 1.38E-06 | 0.33 | 0.32 | 0.48 | 0.41 | 0.56 |
| L | SMM | D.MODE | 1.31 | 0.22 | 0.14 | 0.31 | 1.01E-06 | 0.34 | 0.32 | 0.54 | 0.45 | 0.64 |
| L | IAM | MEAN | 2.72 | 0.26 | 0.14 | 0.38 | 7.46E-05 | 0.24 | 0.22 | 0.59 | 0.51 | 0.66 |
| L | IAM | MEDIAN | 1.88 | 0.26 | 0.14 | 0.38 | 7.77E-05 | 0.23 | 0.22 | 0.62 | 0.53 | 0.71 |
| L | IAM | D.MODE | 0.13 | 0.27 | 0.15 | 0.40 | 4.76E-05 | 0.25 | 0.23 | 0.71 | 0.59 | 0.83 |
| All data | SMM | MEAN | 3.07 | 0.30 | 0.21 | 0.39 | 1.63E-08 | 0.42 | 0.41 | 0.56 | 0.49 | 0.62 |
| All data | SMM | MEDIAN | 2.46 | 0.30 | 0.21 | 0.40 | 1.57E-08 | 0.42 | 0.41 | 0.58 | 0.51 | 0.66 |
| All data | SMM | D.MODE | 1.38 | 0.31 | 0.22 | 0.40 | 1.13E-08 | 0.43 | 0.42 | 0.63 | 0.55 | 0.71 |
| All data | IAM | MEAN | 1.11 | 0.47 | 0.31 | 0.62 | 8.43E-08 | 0.39 | 0.38 | 0.77 | 0.69 | 0.85 |
| All data | IAM | MEDIAN | 0.61 | 0.47 | 0.32 | 0.62 | 8.23E-08 | 0.39 | 0.38 | 0.81 | 0.73 | 0.90 |
| All data | IAM | D.MODE | −0.40 | 0.46 | 0.31 | 0.62 | 1.28E-07 | 0.38 | 0.37 | 0.90 | 0.79 | 1.00 |
Calculations were performed for the complete dataset (All data) and the two best-fitting panels (RM and L; abbreviations as in Table 1) and three summary statistics (median, mean and mode). Regression coefficients (Beta) and their 95% confidence intervals are reported for all models, while p-values and R-squared statistics (M = Multiple, A = Adjusted) only for regressions with intercept.
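The two Beta columns come from different model forms, and a small sketch may help show why they can differ so much for the same data. With an intercept, the slope is the covariance of the two variables divided by the variance of the predictor. Through the origin, it is the sum of products divided by the sum of squared predictor values. The numbers below are toy values, not data from the study, and the function names are hypothetical.

```python
def slope_with_intercept(x, y):
    """Least-squares slope of y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_through_origin(x, y):
    """Least-squares slope of y = b * x (line forced through the origin)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Toy values: x = expected TMRCA, y = observed TMRCA, as in the caption.
expected = [1, 2, 3]
observed = [3, 5, 7]  # exactly y = 1 + 2 * x
print(slope_with_intercept(expected, observed))
print(slope_through_origin(expected, observed))
```

Because the toy data have a positive intercept, forcing the line through the origin pushes the slope above 2, which mirrors how the no-intercept Beta values in the table are larger than those fitted with an intercept.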