Ardalan Naseri (1), Kecong Tang (2), Xin Geng (1), Junjie Shi (1), Jing Zhang (1), Pramesh Shakya (2), Xiaoming Liu (3), Shaojie Zhang (4), Degui Zhi (5, 6)
1. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
2. Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA.
3. USF Genomics, College of Public Health, University of South Florida, Tampa, FL, 33612, USA.
4. Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA. shzhang@cs.ucf.edu.
5. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA. degui.zhi@uth.tmc.edu.
6. Center for Precision Health, School of Biomedical Informatics, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA. degui.zhi@uth.tmc.edu.
Abstract
BACKGROUND: The genealogical histories of individuals within populations are of interest to studies aiming to uncover both detailed pedigree information and overall quantitative population demographic histories. However, quantitative analysis of individual genealogical histories has been hampered by incomplete pedigree records and by the lack of objective, quantitative detail in the pedigree information that is available. Although complete pedigree information for most individuals is difficult to track beyond a few generations, it is possible to describe a person's genealogical history using their genetic relatives revealed by identity by descent (IBD) segments: long genomic segments shared by two individuals within a population, which are identical due to inheritance from common ancestors. When modern biobanks collect genotype information for a significant fraction of a population, a person's dense genetic connections can be traced using such IBD segments, offering opportunities to characterize individuals in the context of the underlying populations. Here, we conducted an individual-centric analysis of IBD segments among the UK Biobank participants, who represent 0.7% of the UK population.
RESULTS: We made a high-quality call set of IBD segments over 5 cM among all 500,000 UK Biobank participants. On average, one UK individual shares IBD segments with 14,000 UK Biobank participants, whom we refer to as "relatives." Using these segments, approximately 80% of a person's genome can be imputed. We subsequently propose genealogical descriptors based on the genetic connections of relative cohorts (individuals sharing at least one IBD segment) and show that such descriptors offer important information about one's genetic makeup, personal genealogical history, and social behavior. Through analysis of the counts of relatives sharing segments of different lengths, we identified a group, potentially British Jews, that has a distinct pattern of familial expansion history. Finally, using the enrichment of relatives in one's neighborhood, we identified regional variations in the personal preference for living closer to one's extended family.
CONCLUSIONS: Our analysis revealed genetic makeup, personal genealogical history, and social behaviors at the population scale, opening possibilities for further studies of individuals' genetic connections in biobank data.
Keywords:
Genealogical history; Identity by descent; RaPID; UK Biobank