Kelly M Burkett, Mohan Rakesh, Patricia Morris, Hélène Vézina, Catherine Laprise, Ellen E Freeman, Marie-Hélène Roy-Gagnon.
Abstract
Research on the genetics of complex traits overwhelmingly focuses on the additive effects of genes. Yet, animal studies have shown that non-additive effects, in particular homozygosity effects, can shape complex traits. Recent investigations in human studies found some significant homozygosity effects. However, most human populations display restricted ranges of homozygosity by descent (HBD), making the identification of homozygosity effects challenging. Founder populations give rise to higher HBD levels. When deep genealogical data are available in a founder population, it is possible to gain information on the time to the most recent common ancestor (MRCA) from whom a chromosomal segment has been transmitted to both parents of an individual and in turn to that individual. This information on the time to MRCA can be combined with the time to MRCA inferred from coalescent models of gene genealogies. HBD can also be estimated from genomic data. The extent to which the genomic HBD measures correspond to the genealogical/coalescent measures has not been documented in founder populations with extensive genealogical data. In this study, we used simulations to relate genomic and genealogical/coalescent HBD measures. We based our simulations on genealogical data from two ongoing studies from the French-Canadian founder population displaying different levels of inbreeding. We simulated single-nucleotide polymorphisms (SNPs) in a 1-Mb genomic segment from a coalescent model in conjunction with the observed genealogical data. We compared genealogical/coalescent HBD to two genomic methods of HBD estimation based on hidden Markov models (HMMs). We found that genomic estimates of HBD correlated well with genealogical/coalescent HBD measures in both study genealogies. We described generation time to coalescence in terms of genomic HBD estimates and found a large variability in generation time captured by genomic HBD when considering each SNP. 
However, SNPs in longer segments were more likely to capture recent time to coalescence, as expected. Our study suggests that estimating the coalescent gene genealogy from the genomic data to use in conjunction with observed genealogical data could provide valuable information on HBD.
Keywords: coalescent models; founder populations; genealogical data; homozygosity by descent; most recent common ancestor; simulations
Year: 2022; PMID: 35126470; PMCID: PMC8814340; DOI: 10.3389/fgene.2021.808829
Source DB: PubMed; Journal: Front Genet; ISSN: 1664-8021; Impact factor: 4.599
FIGURE 1. Characteristics of the SLSJ and Montreal study genealogies. (A) Average across probands of the genealogical completeness and inbreeding coefficients calculated at each generation. (B) Violin plots of the distribution of inbreeding coefficients across probands (points are the inbreeding coefficients of the probands).
FIGURE 2. Genomic and genealogical homozygosity by descent (HBD). For each proband, the total length of simulated segments inferred HBD based on IBDLD HBD probabilities of at least 0.5 is plotted against the proband's genealogical inbreeding coefficient. The size of the points is proportional to the number of replicates in which the proband has a segment HBD. (A) SLSJ study; (B) Montreal study.
FIGURE 3. Distributions of inferred homozygosity by descent (HBD). Violin plots of the distributions across probands, averaged across SNPs and simulation replicates, of (A) HBD probabilities estimated from FEstim; (B) HBD probabilities estimated from IBDLD; (C) time to coalescence, in generations, obtained from the observed study and coalescent/gene genealogies.
FIGURE 4. Probability of homozygosity by descent (HBD) from FEstim and IBDLD. Scatter plots of the average (across SNPs and simulation replicates) HBD probabilities estimated by FEstim and IBDLD for (A) the SLSJ study and (B) the Montreal study.
FIGURE 5. Genomic- and genealogical/coalescent-based homozygosity by descent (HBD). Scatter plots with fitted regression lines of the average (across simulation replicates) HBD probabilities estimated from the observed study and coalescent/gene genealogies and from the simulated genomic segments (by FEstim and IBDLD) for (A) the SLSJ study and (B) the Montreal study.
FIGURE 6. Genomic homozygosity by descent (HBD) and generation to coalescence. Scatter plots of genomic HBD probabilities and generations to coalescence from the observed study and coalescent/gene genealogies. Results from the SLSJ study are shown in (A) for FEstim and (B) for IBDLD. Results from the Montreal study are shown in (C) for FEstim and (D) for IBDLD. Boxplots show the distributions of generations to coalescence captured by setting different cutoffs of HBD probability to determine the HBD status of each SNP for each individual.
Summary statistics of the distributions of the time to coalescence (generations) captured by setting different cutoffs of HBD probability (from FEstim or IBDLD) to determine the HBD status of each SNP for each proband.
| HBD prob | SLSJ study: Median | SLSJ study: IQR | SLSJ study: Min | SLSJ study: Max | Montreal study: Median | Montreal study: IQR | Montreal study: Min | Montreal study: Max |
|---|---|---|---|---|---|---|---|---|
| FEstim | | | | | | | | |
| [0.75, 1] | 251 | 441 | 0 | 111,932 | 282 | 445 | 0 | 145,232 |
| [0.5, 0.75) | 959 | 1,347 | 0 | 104,619 | 973 | 1,352 | 0 | 119,188 |
| [0.25, 0.5) | 1,542 | 2,371 | 0 | 123,337 | 1,541 | 2,324 | 0 | 133,965 |
| [0, 0.25) | 8,109 | 11,957 | 0 | 155,952 | 8,071 | 11,859 | 0 | 145,232 |
| IBDLD | | | | | | | | |
| [0.75, 1] | 159 | 300 | 0 | 102,615 | 180 | 285 | 0 | 92,469 |
| [0.5, 0.75) | 461 | 655 | 0 | 82,028 | 445 | 594 | 0 | 69,544 |
| [0.25, 0.5) | 547 | 1,002 | 0 | 76,758 | 542 | 914 | 0 | 82,197 |
| [0, 0.25) | 7,828 | 11,935 | 0 | 155,952 | 7,775 | 11,837 | 0 | 145,232 |
HBD prob: HBD probability cutoffs.
IQR: interquartile range.
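The summary in the table above can be illustrated with a minimal Python sketch. It assumes each SNP of each proband is available as a pair (HBD probability, generations to coalescence), assigns the pair to one of the four probability intervals used in the table, and reports the median, IQR, minimum, and maximum per interval. The function names (`bin_label`, `summarize_by_bin`), the input layout, and the quartile convention (`method="inclusive"`) are assumptions made for illustration; the source does not state how the authors computed these statistics.

```python
from statistics import median, quantiles

# Probability intervals as in the table; only the last one is closed on the right.
BINS = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]


def bin_label(p):
    """Return the label of the HBD-probability interval that contains p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"HBD probability out of range: {p}")
    for lo, hi in BINS:
        is_last = hi == 1.0
        if lo <= p < hi or (is_last and p == hi):
            return f"[{lo:g}, {hi:g}" + ("]" if is_last else ")")


def summarize_by_bin(records):
    """Summarize generations to coalescence within each probability interval.

    records: iterable of (hbd_probability, generations_to_coalescence) pairs,
    one per SNP per proband (hypothetical input layout).
    """
    groups = {}
    for p, generations in records:
        groups.setdefault(bin_label(p), []).append(generations)

    summary = {}
    for label, gens in groups.items():
        if len(gens) >= 2:
            # Quartile convention is an assumption; other conventions give
            # slightly different IQR values on small samples.
            q1, _, q3 = quantiles(gens, n=4, method="inclusive")
            iqr = q3 - q1
        else:
            iqr = 0.0  # a single observation has no spread
        summary[label] = {
            "median": median(gens),
            "iqr": iqr,
            "min": min(gens),
            "max": max(gens),
        }
    return summary


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    toy = [(0.9, 10), (0.8, 20), (1.0, 30), (0.1, 5000), (0.5, 400)]
    for label, stats in summarize_by_bin(toy).items():
        print(label, stats)
```

Keeping the top interval closed at 1 mirrors the `[0.75, 1]` row of the table, so a SNP with an HBD probability of exactly 1 is not dropped.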