Héloïse Gauvin, Jean-François Lefebvre, Claudia Moreau, Eve-Marie Lavoie, Damian Labuda, Hélène Vézina, Marie-Hélène Roy-Gagnon.
Abstract
BACKGROUND: Founder populations have an important role in the study of genetic diseases. Access to detailed genealogical records is often one of their advantages. These genealogical data provide unique information for researchers in evolutionary and population genetics, demography and genetic epidemiology. However, analyzing large genealogical datasets requires specialized methods and software. The GENLIB software was developed to study the large genealogies of the French Canadian population of Quebec, Canada. These genealogies are accessible through the BALSAC database, which contains over 3 million records covering the whole province of Quebec over four centuries. Using this resource, extended pedigrees of up to 17 generations can be constructed from a sample of present-day individuals.
Year: 2015 PMID: 25971991 PMCID: PMC4431039 DOI: 10.1186/s12859-015-0581-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Table 1. Overview of GENLIB functions
| Name | Use |
|---|---|
| gen.genealogy | To create a basic genealogical object |
| gen.lineages | To extract parental lineages from a genealogical object |
| gen.branching | To extract a subset of a genealogical object |
| gen.genout | To output a genealogical object as a data frame |
| gen.founder, gen.half.founder, gen.pro, gen.parent, gen.sibship, gen.children, gen.findFounders, gen.findMRCA | To identify specific individuals (founder, half-founder, proband, parent, sibship, children, common founder, most recent common ancestor) |
| | |
| gen.nomen, gen.nowomen, gen.noind, gen.nochildren | …the number of men, women or individuals in a genealogy and the number of children an individual has |
| gen.min, gen.mean, gen.max | …the minimal, mean or maximal generation at which an individual can be found in a genealogy (first generation is coded 0) |
| gen.depth | …the number of generations in the genealogy |
| gen.completeness | …genealogical data completeness |
| gen.rec | …how many individuals within the specified individual group descend from each specified ancestor |
| gen.occ | …how many different (but not mutually exclusive) paths link an ancestor to a descendant |
| gen.meangendepth | …how deeply rooted the genealogical lineages are |
| gen.implex | …the extent of pedigree collapse within an individual’s genealogy |
| gen.findDistance | …distance between individuals through a specific ancestor |
| gen.find.Min.Distance.MRCA | …the shortest distances between individuals |
| | |
| gen.graph | …the genealogy |
| | |
| gen.phi | …the kinship matrix at specified generations |
| gen.f | …the inbreeding coefficients at specified generations |
| gen.gc | …the genetic contribution of ancestors to individuals |
| | |
| gen.simuProb | To compute the probability that individuals have 0, 1 or 2 copies of a disease allele, given the number of copies carried by their ancestors |
| gen.simuSample, gen.simuSampleFreq | To obtain the number (frequencies) of disease alleles for each individual, taking into account each ancestor’s carrier status |
| gen.simuSet | Same as gen.simuSample, with the option to customize transmission probabilities according to the parent’s and/or subject’s sex |
Note: Additional functions (e.g., to calculate the variance associated with kinship and other measures) are available but not included in the table.
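The structural queries listed in the table (founders, probands, paths from an ancestor, depth) can be illustrated outside R. The following is a minimal Python sketch, not GENLIB's API: the `PEDIGREE` dictionary, the toy individuals, and the function names `founders`, `probands`, `occurrences` and `depth` are all hypothetical, and unknown parents are assumed to be coded 0.

```python
# Toy pedigree (hypothetical data): individual -> (father, mother); 0 = unknown parent.
# Individual 9 is the child of first cousins 7 and 8.
PEDIGREE = {
    1: (0, 0), 2: (0, 0), 5: (0, 0), 6: (0, 0),  # founders
    3: (1, 2), 4: (1, 2),                        # siblings
    7: (3, 5), 8: (6, 4),                        # first cousins
    9: (7, 8),
}


def founders(ped):
    """Individuals with no known parent."""
    return sorted(i for i, (f, m) in ped.items() if f == 0 and m == 0)


def probands(ped):
    """Individuals who never appear as a parent."""
    parents = {p for pair in ped.values() for p in pair if p != 0}
    return sorted(i for i in ped if i not in parents)


def occurrences(ped, ancestor, descendant):
    """Number of distinct parent-child paths linking ancestor to descendant."""
    if descendant == ancestor:
        return 1
    return sum(occurrences(ped, ancestor, p) for p in ped[descendant] if p != 0)


def depth(ped, ind):
    """Number of generations above ind along its longest known lineage."""
    known = [p for p in ped[ind] if p != 0]
    return 0 if not known else 1 + max(depth(ped, p) for p in known)


if __name__ == "__main__":
    print(founders(PEDIGREE), probands(PEDIGREE))
    print(occurrences(PEDIGREE, 1, 9), depth(PEDIGREE, 9))
```

A plain dictionary keeps the sketch short; a real genealogy with millions of records would need an indexed, non-recursive representation.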
Table 2. Formulas of genealogical measures in GENLIB
| Measure |
|---|
| Completeness |
| Implex index |
| Mean genealogical depth |
| Variance of mean genealogical depth |
| Kinship |
| Inbreeding |
| Genetic Contribution |
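The kinship and inbreeding measures can be illustrated with the textbook recursive definition: the kinship of an individual with itself is (1 + F)/2, the inbreeding coefficient F is the kinship between the two parents, and the kinship between two different individuals is the average of the kinships between one individual and each parent of the other. The Python sketch below is an illustration under stated assumptions, not GENLIB's algorithm: the toy `PEDIGREE` is hypothetical, unknown parents are coded 0, and IDs are assumed to be ordered so that parents always have smaller IDs than their children.

```python
from functools import lru_cache

# Toy pedigree (hypothetical data): individual -> (father, mother); 0 = unknown parent.
# Assumption: every parent has a smaller ID than its children.
PEDIGREE = {
    1: (0, 0), 2: (0, 0), 5: (0, 0), 6: (0, 0),
    3: (1, 2), 4: (1, 2),
    7: (3, 5), 8: (6, 4),
    9: (7, 8),
}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient between i and j by the classical recursion."""
    if i == 0 or j == 0:
        return 0.0  # an unknown individual is treated as unrelated
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    if i < j:
        i, j = j, i  # always expand the larger ID, which cannot be an ancestor of the other
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


def inbreeding(i):
    """Inbreeding coefficient of i: the kinship between its two parents."""
    father, mother = PEDIGREE[i]
    return kinship(father, mother)


if __name__ == "__main__":
    print(kinship(3, 4))   # full siblings
    print(inbreeding(9))   # child of first cousins
```

Memoization with `lru_cache` avoids recomputing the same pair repeatedly; on deep genealogies the plain recursion would otherwise grow exponentially.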
Figure 1. Completeness and implex indices for the Quebec genealogical corpus.
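As an illustration of the two indices plotted in this figure, the sketch below uses one common definition: at generation g, completeness is the percentage of the 2^g expected ancestor slots that are filled, and the implex index is the percentage of those slots occupied by distinct ancestors, both averaged over probands. This is a hedged Python sketch on a hypothetical toy pedigree; the exact formulas used by GENLIB may differ, and the function names are made up for this example.

```python
# Toy pedigree (hypothetical data): individual -> (father, mother); 0 = unknown parent.
PEDIGREE = {
    1: (0, 0), 2: (0, 0), 5: (0, 0), 6: (0, 0),
    3: (1, 2), 4: (1, 2),
    7: (3, 5), 8: (6, 4),
    9: (7, 8),
}


def ancestors_at(ped, ind, g):
    """Known ancestors of ind exactly g generations back, with repeats."""
    level = [ind]
    for _ in range(g):
        level = [p for x in level for p in ped[x] if p != 0]
    return level


def completeness(ped, probands, g):
    """Mean percentage of the 2**g ancestor slots that are known at generation g."""
    found = sum(len(ancestors_at(ped, p, g)) for p in probands)
    return 100.0 * found / (len(probands) * 2 ** g)


def implex(ped, probands, g):
    """Mean percentage of the 2**g ancestor slots filled by distinct ancestors."""
    distinct = sum(len(set(ancestors_at(ped, p, g))) for p in probands)
    return 100.0 * distinct / (len(probands) * 2 ** g)


if __name__ == "__main__":
    for g in range(1, 4):
        print(g, completeness(PEDIGREE, [9], g), implex(PEDIGREE, [9], g))
```

In this toy case the two indices separate at generation 3, where the same two founders fill four of the eight slots: the gap between the curves reflects pedigree collapse.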
Figure 2. Cumulative genetic contribution of founders for each population. Plot of the cumulative distribution of genetic contributions of founders for each population in relation to the cumulative proportion of contributing founders, sorted in decreasing order of their genetic contribution. The dashed line presents the hypothetical situation in which all founders of a population contribute equally to the gene pool. ACA: Acadians; GFC: Gaspesian French Canadians; LOY: Loyalists; MON: Montreal; NS: North Shore; QUE: Quebec City area; SAG: Saguenay; PQ: whole sample from the Province of Quebec; Unif: uniform distribution.
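The curve described in this caption can be sketched as follows, assuming the usual definition of genetic contribution as the sum, over all paths from a founder to a proband, of (1/2) raised to the number of generations on the path. The Python code is an illustrative sketch on a hypothetical toy pedigree, not GENLIB's implementation; `genetic_contribution` and `cumulative_contribution` are made-up names.

```python
# Toy pedigree (hypothetical data): individual -> (father, mother); 0 = unknown parent.
PEDIGREE = {
    1: (0, 0), 2: (0, 0), 5: (0, 0), 6: (0, 0),
    3: (1, 2), 4: (1, 2),
    7: (3, 5), 8: (6, 4), 10: (3, 5),
    9: (7, 8),
}


def genetic_contribution(ped, ancestor, ind):
    """Sum over all ancestor-to-ind paths of 0.5 ** (number of generations)."""
    if ind == ancestor:
        return 1.0
    return sum(0.5 * genetic_contribution(ped, ancestor, p)
               for p in ped[ind] if p != 0)


def cumulative_contribution(ped, founders, probands):
    """Points (cumulative share of founders, cumulative share of the gene pool),
    with founders ranked by decreasing total contribution to the probands."""
    totals = {f: sum(genetic_contribution(ped, f, p) for p in probands)
              for f in founders}
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    grand_total = sum(totals.values())
    points, running = [], 0.0
    for rank, (_, gc) in enumerate(ranked, start=1):
        running += gc
        points.append((rank / len(ranked), running / grand_total))
    return points


if __name__ == "__main__":
    for x, y in cumulative_contribution(PEDIGREE, [1, 2, 5, 6], [9, 10]):
        print(x, y)
```

If all founders contributed equally, the points would fall on the diagonal; a curve that rises above it indicates that a few founders account for a disproportionate share of the gene pool.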
Figure 3. Estimated probabilities of sharing one allele IBD versus ancestors’ genetic contributions. Plots of estimated probabilities of sharing one allele identical-by-descent (IBD) from a specific ancestor relative to the product of the genetic contributions of that ancestor to each of the two individuals from A) the Acadian population and B) the Saguenay population. Probabilities that the two individuals share one allele IBD were estimated from 10,000,000 gene-dropping simulations for each shared ancestor. Ancestors are divided into four categories depending on whether they are founders, most recent common ancestors (MRCA), both (MRCA-Founder) or neither of the two (In between). The black line is the identity line, i.e. y = x.
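Gene dropping, as mentioned in this caption, can be sketched in a few lines: the two alleles of the chosen ancestor are labelled, alleles are passed down the pedigree by Mendelian sampling, and the proportion of replicates in which both individuals carry the same labelled allele estimates the IBD-sharing probability. The Python sketch below is a simplified illustration, not the simulation code used for the figure. The toy pedigree and function names are hypothetical, "sharing" is taken to mean at least one labelled allele in common, and IDs are assumed to be ordered so that parents precede children.

```python
import random

# Toy pedigree (hypothetical data): individual -> (father, mother); 0 = unknown parent.
# Assumption: every parent has a smaller ID than its children.
PEDIGREE = {
    1: (0, 0), 2: (0, 0), 5: (0, 0), 6: (0, 0),
    3: (1, 2), 4: (1, 2),
    7: (3, 5), 8: (6, 4),
    9: (7, 8),
}


def gene_drop(ped, ancestor, rng):
    """One replicate: label the ancestor's alleles 1 and 2, everything else 0."""
    genotypes = {}
    for ind in sorted(ped):  # parents are visited before their children
        father, mother = ped[ind]
        if ind == ancestor:
            genotypes[ind] = (1, 2)
        else:
            paternal = rng.choice(genotypes[father]) if father else 0
            maternal = rng.choice(genotypes[mother]) if mother else 0
            genotypes[ind] = (paternal, maternal)
    return genotypes


def prob_share_ibd(ped, ancestor, a, b, n_sims, seed=0):
    """Estimated probability that a and b both carry the same allele of ancestor."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        genotypes = gene_drop(ped, ancestor, rng)
        shared = (set(genotypes[a]) & set(genotypes[b])) - {0}
        hits += bool(shared)
    return hits / n_sims


if __name__ == "__main__":
    # First cousins 7 and 8 through grandfather 1: the exact value is 1/8.
    print(prob_share_ibd(PEDIGREE, 1, 7, 8, n_sims=20000))
```

The number of replicates here is kept small for speed; estimating the very small probabilities that arise in deep genealogies requires far more replicates, which is consistent with the 10,000,000 simulations reported in the caption.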
Table 3. Selected segments shared IBD by two pairs of individuals
| Pair (population) | | Length (cM) | | |
|---|---|---|---|---|
| 408868, 409033 (Acadian) | 1 | 3.5416 | 3.0829 | 4.2469 |
| | 5 | 7.3682 | 5.0353 | 9.0442 |
| | 8 | 10.0267 | 4.0278 | 14.9491 |
| 302710, 302711 (Saguenay) | 3 | 3.3035 | 2.2441 | 4.4612 |
| | 7 | 5.8712 | 4.0323 | 8.4025 |
| | 1 | 12.6098 | 7.1124 | 17.3889 |
Figure 4. Estimated probabilities of IBD sharing for a segment versus one allele. Plots of estimated probabilities of sharing one segment identical-by-descent (IBD) from a specific ancestor relative to the estimated probabilities of sharing one allele IBD from that same ancestor for a pair of individuals from A) the Acadian population and B) the Saguenay population. Probabilities that the two individuals share one allele or one segment IBD were estimated from 10,000,000 (for allele sharing) or 100,000,000 (for segment sharing) simulations for each common ancestor. Ancestors are divided into four categories depending on whether they are founders, most recent common ancestors (MRCA), both (MRCA-Founder) or neither of the two (In between). The solid black line is the identity line and colored lines are simple regression lines between IBD sharing of a segment and IBD sharing of an allele. Three different segment lengths are considered and shown for A: 3.30, 5.87 and 12.61 cM and for B: 3.54, 7.37 and 10.03 cM (Table 3).
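A back-of-the-envelope calculation suggests why segment sharing should fall below allele sharing, and more so for longer segments. Assuming crossovers follow a Poisson process without interference (Haldane's model), a segment of length L cM escapes recombination in one meiosis with probability exp(-L/100), so along a path of m meioses the intact-segment probability is the single-locus probability multiplied by exp(-mL/100). The Python sketch below only evaluates this factor; it is not the simulation method behind the figure, `intact_fraction` is a made-up name, and the number of meioses used in the example is hypothetical.

```python
import math


def intact_fraction(length_cm, meioses):
    """Probability that no crossover falls inside a segment of length_cm
    (in centimorgans) over the given number of meioses, under Haldane's model."""
    return math.exp(-meioses * length_cm / 100.0)


if __name__ == "__main__":
    meioses = 16  # hypothetical path length, for illustration only
    for length_cm in (3.30, 5.87, 12.61):
        print(length_cm, round(intact_fraction(length_cm, meioses), 3))
```

Under this approximation the factor shrinks with both segment length and the number of meioses separating the pair through the ancestor.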