Johannes Rainer, Daniel Taliun, Yuri D'Elia, Cristian Pattaro, Francisco S Domingues, Christian X Weichenberger.
Abstract
Familial aggregation analysis is the first fundamental step in assessing the extent to which a disease has a genetic background. However, software for analyzing the familial clustering of complex phenotypes in very large pedigrees is lacking. Such pedigrees can be used to calculate measures that express trait aggregation at both the family and the individual level, providing valuable guidance when choosing families for detailed follow-up studies. We developed FamAgg, an open source R package that contains both established and novel methods to investigate familial aggregation of traits in large pedigrees. We demonstrate its use and interpretation by analyzing a publicly available cancer dataset with more than 20 000 participants distributed across approximately 400 families.
Year: 2016 PMID: 26803158 PMCID: PMC4866523 DOI: 10.1093/bioinformatics/btw019
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Familial aggregation in the Minnesota Breast Cancer dataset. (A) Scatter plot of −log₁₀(p-values) from the KS test (x-axis) and the GIF test (y-axis) computed for all 426 families. Because the KS test provides a p-value for each affected subject, the lowest p-value in each family is displayed. At a significance level of 0.05 (dashed lines), the GIF test identifies 34 families whereas the KS test identifies 42 families. Filled circles and family identifiers are shown for the 14 families in which both tests are significant. For example, family 432 is top-ranked by both tests: p-value = 1.3 × 10⁻³ and 9.6 × 10⁻⁵ with the GIF and KS test, respectively. Non-significant family clusters are shaded gray. (B) Pedigree of family 13, which is ranked second by the GIF test (p-value = 2.4 × 10⁻³). The family comprises 29 phenotyped members and includes five affected females. If known, the age of cancer onset (cases) or the age at death is indicated below each individual's identifier. For subject 410, S = 1.0 (0.25 × 3 affected sisters + 0.25 × 1 affected daughter), with p-value = 1.3 × 10⁻². Sisters 406, 408 and 409 have equal S = 3 × 0.25 + 0.125 = 0.875 (p-value = 2.4 × 10⁻²), as they are aunts of subject 419. The familial incidence rate of individual 410 is FR = 8.7 × 10⁻³, which is in the top percentile of all computed values in the Minnesota Breast Cancer dataset.
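
The arithmetic behind the S values quoted in the caption can be reproduced with a minimal sketch. It is written in plain Python and does not use the FamAgg API; the `KINSHIP` table and the `kinship_sum` helper are hypothetical names introduced only for illustration. It assumes the standard kinship coefficients implied by the caption (0.25 for a sister or a daughter, 0.125 for a niece) and covers only the sum itself, not the p-values, which the caption reports without describing how they are computed.

```python
from fractions import Fraction

# Assumed kinship coefficients for the relationships named in the caption.
KINSHIP = {
    "sister": Fraction(1, 4),
    "daughter": Fraction(1, 4),
    "niece": Fraction(1, 8),
}


def kinship_sum(affected_relatives):
    """Sum the kinship coefficients between one individual and each
    of that individual's affected relatives (hypothetical helper)."""
    return sum((KINSHIP[rel] for rel in affected_relatives), Fraction(0))


# Subject 410: three affected sisters and one affected daughter.
s_410 = kinship_sum(["sister"] * 3 + ["daughter"])

# Subjects 406, 408 and 409: three affected sisters and one affected niece (419).
s_406 = kinship_sum(["sister"] * 3 + ["niece"])

print(float(s_410), float(s_406))  # 1.0 0.875
```

Exact fractions are used so that the sums match the caption's values without floating-point rounding.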