Fast and Accurate Shared Segment Detection and Relatedness Estimation in Un-phased Genetic Data via TRUFFLE.

Apostolos Dimitromanolakis, Andrew D Paterson, Lei Sun.

Abstract

Relationship estimation and shared segment detection between individuals are important aspects of disease gene mapping. Existing methods are either tailored for computational efficiency or require phasing to improve accuracy. We developed TRUFFLE, a method that integrates computational techniques and statistical principles for the identification and visualization of identity-by-descent (IBD) segments using un-phased data. By skipping the haplotype phasing step and relying instead on a simpler region-based approach, our method is computationally efficient while maintaining inferential accuracy. In addition, an error model corrects for segment break-ups that occur as a consequence of genotyping errors. TRUFFLE can estimate relatedness for 3.1 million pairs from the 1000 Genomes Project data in a few minutes on a typical laptop computer. Consistent with expectation, we identified only three pairs related as second cousins or closer across different populations, while commonly used methods identified a large number of such pairs. Similarly, within populations, we identified far fewer related pairs. Compared to methods relying on phased data, TRUFFLE has comparable accuracy but is drastically faster and produces fewer broken segments. We also identified specific local genomic regions that are commonly shared within populations, suggesting selection. When applied to pedigree data, TRUFFLE detected 1st to 5th degree relationships with 99.6% accuracy. As genomic datasets become much larger, TRUFFLE can enable disease gene mapping through implicit shared haplotypes by accurate IBD segment detection.
Copyright © 2019 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
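
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind phase-free segment detection, not TRUFFLE's actual implementation. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with no missing calls, and uses the standard observation that two individuals sharing at least one allele identical by descent cannot be opposite homozygotes (0 versus 2) at a marker. The function name `candidate_segments` and the parameters `min_markers` and `max_errors` are hypothetical; the bridging of isolated opposite-homozygote sites is a crude stand-in for an error model that prevents genotyping errors from breaking up a segment.

```python
from typing import List, Sequence, Tuple


def candidate_segments(
    g1: Sequence[int],
    g2: Sequence[int],
    min_markers: int = 5,
    max_errors: int = 1,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) marker index ranges over which two
    un-phased genotype vectors are compatible with sharing at least one
    allele, bridging up to `max_errors` isolated opposite-homozygote sites
    per segment (hypothetical tolerance for genotyping error)."""
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")

    # Step 1: maximal runs of markers with no opposite homozygotes (0 vs 2).
    runs: List[Tuple[int, int]] = []
    start = None
    for i, (a, b) in enumerate(zip(g1, g2)):
        if abs(a - b) == 2:  # no allele can be shared at this marker
            if start is not None:
                runs.append((start, i - 1))
                start = None
        elif start is None:
            start = i
    if start is not None:
        runs.append((start, len(g1) - 1))

    # Step 2: merge runs separated by exactly one incompatible marker,
    # treating that marker as a possible genotyping error.
    merged: List[Tuple[int, int, int]] = []  # (start, end, errors bridged)
    for run_start, run_end in runs:
        if merged:
            prev_start, prev_end, errors = merged[-1]
            if run_start - prev_end == 2 and errors < max_errors:
                merged[-1] = (prev_start, run_end, errors + 1)
                continue
        merged.append((run_start, run_end, 0))

    # Step 3: keep only segments spanning enough markers.
    return [(s, e) for s, e, _ in merged if e - s + 1 >= min_markers]


if __name__ == "__main__":
    g1 = [1, 0, 2, 1, 0, 2, 1, 1, 0, 2, 0, 0]
    g2 = [1, 1, 2, 0, 2, 2, 1, 0, 0, 1, 2, 2]
    print(candidate_segments(g1, g2, max_errors=1))  # [(0, 9)]
    print(candidate_segments(g1, g2, max_errors=0))  # [(5, 9)]
```

In the toy example, marker 4 is an isolated opposite-homozygote site. With one tolerated error the two flanking runs are reported as a single segment; with no tolerance the segment breaks in two and the shorter piece falls below the length threshold. A realistic analysis would measure length in genetic distance rather than marker count and would need to handle missing genotypes.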

Keywords:  1000 Genomes Project; IBD; high-throughput; implicit haplotype inference; relationship; segment sharing; software; un-phased genetic data

Year: 2019; PMID: 31178127; PMCID: PMC6612710; DOI: 10.1016/j.ajhg.2019.05.007

Source DB: PubMed; Journal: Am J Hum Genet; ISSN: 0002-9297; Impact factor: 11.025

