
On the association analysis of genome-sequencing data: A spatial clustering approach for partitioning the entire genome into nonoverlapping windows.

Heide Loehlein Fier, Dmitry Prokopenko, Julian Hecker, Michael H Cho, Edwin K Silverman, Scott T Weiss, Rudolph E Tanzi, Christoph Lange.

Abstract

For the association analysis of whole-genome sequencing (WGS) studies, we propose a fast and efficient spatial-clustering algorithm. Existing analysis approaches for WGS data define the tested regions by either sliding or consecutive windows of fixed size along the variants. Compared to sliding-window approaches, a meaningful grouping of nearby variants into consecutive regions is likely to yield a smaller number of tested regions. Compared to consecutive, fixed-window approaches, our approach is more likely to group nearby variants together. Given existing biological evidence that disease-associated mutations tend to cluster physically in specific regions along the chromosome, identifying meaningful groups of nearby variants could thus lead to a power gain for association analysis. Our algorithm defines consecutive genomic regions based on the physical positions of the variants, assuming an inhomogeneous Poisson process, and groups nearby variants together. Because parameters are estimated locally, the algorithm takes the varying variant density along the chromosome into account and provides a locally optimal partitioning of variants into consecutive regions. We discuss the theoretical advantages of our algorithm over existing window-based approaches and demonstrate its performance in a simulation study and in an application to Alzheimer's disease WGS data. Our analysis identifies a region in the ITGB3 gene that potentially harbors disease susceptibility loci for Alzheimer's disease. The region-based association signal of ITGB3 replicates in an independent data set and formally achieves genome-wide significance. Software Implementation: An implementation of the algorithm in R is available at https://github.com/heidefier/cluster_wgs_data.
© 2017 WILEY PERIODICALS, INC.
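
The partitioning idea in the abstract (consecutive, nonoverlapping regions derived from physical positions, with parameters estimated locally) can be illustrated with a short Python sketch. This is a simplified heuristic under stated assumptions, not the authors' R implementation: the function name `partition_variants`, the neighbourhood size `k`, and the threshold `alpha` are hypothetical, and the exponential gap model is only one plausible way to reflect a locally estimated Poisson rate.

```python
import math
from typing import List


def partition_variants(positions: List[int], k: int = 5,
                       alpha: float = 0.05) -> List[List[int]]:
    """Split variant positions into consecutive, nonoverlapping regions.

    Illustrative heuristic only (an assumption, not the published method):
    a gap between neighbouring variants starts a new region when it is
    improbably large under an exponential inter-arrival model whose rate
    is estimated from up to k gaps on either side of the gap under test.
    """
    if not positions:
        return []
    pos = sorted(positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    regions = [[pos[0]]]
    for i, gap in enumerate(gaps):
        # Local neighbourhood of gaps, excluding the gap being tested.
        local = gaps[max(0, i - k):i] + gaps[i + 1:i + 1 + k]
        mean_gap = sum(local) / len(local) if local else gap
        if mean_gap > 0:
            # P(gap >= observed) under Exp(rate = 1 / mean_gap).
            p_tail = math.exp(-gap / mean_gap)
        else:
            # All neighbouring gaps are zero: any positive gap is extreme.
            p_tail = 1.0 if gap == 0 else 0.0
        if p_tail < alpha:
            regions.append([])  # unusually large gap: open a new region
        regions[-1].append(pos[i + 1])
    return regions


# Hypothetical positions: two dense clusters separated by a large gap.
example = [100, 110, 125, 130, 140, 5000, 5010, 5020, 5035, 5040]
regions = partition_variants(example)
print(len(regions), "regions:", regions)
```

Because the rate is re-estimated around every gap, the same absolute distance can separate regions in a variant-dense stretch and be ignored in a sparse one, which is the property the abstract attributes to local parameter estimation.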

Keywords:  WGS data; clustering; genetic association analysis

Year:  2017        PMID: 28318110      PMCID: PMC5525021          DOI: 10.1002/gepi.22040

Source DB:  PubMed          Journal:  Genet Epidemiol        ISSN: 0741-0395            Impact factor:   2.135


  28 in total

1.  Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.

Authors:  Lucia A Hindorff; Praveen Sethupathy; Heather A Junkins; Erin M Ramos; Jayashri P Mehta; Francis S Collins; Teri A Manolio
Journal:  Proc Natl Acad Sci U S A       Date:  2009-05-27       Impact factor: 11.205

2.  Rare-variant extensions of the transmission disequilibrium test: application to autism exome sequence data.

Authors:  Zongxiao He; Brian J O'Roak; Joshua D Smith; Gao Wang; Stanley Hooker; Regie Lyn P Santos-Cortez; Biao Li; Mengyuan Kan; Nik Krumm; Deborah A Nickerson; Jay Shendure; Evan E Eichler; Suzanne M Leal
Journal:  Am J Hum Genet       Date:  2013-12-19       Impact factor: 11.025

3.  A general framework for detecting disease associations with rare variants in sequencing studies.

Authors:  Dan-Yu Lin; Zheng-Zheng Tang
Journal:  Am J Hum Genet       Date:  2011-09-01       Impact factor: 11.025

4.  The Biology of Genomes. Disease risk links to gene regulation.

Authors:  Elizabeth Pennisi
Journal:  Science       Date:  2011-05-27       Impact factor: 47.728

5.  VEGAS2: Software for More Flexible Gene-Based Testing.

Authors:  Aniket Mishra; Stuart Macgregor
Journal:  Twin Res Hum Genet       Date:  2014-12-18       Impact factor: 1.587

6.  The distribution of rare alleles.

Authors:  P Joyce; S Tavaré
Journal:  J Math Biol       Date:  1995       Impact factor: 2.259

7.  Predictive value of platelet activation for the rate of cognitive decline in Alzheimer's disease patients.

Authors:  Konstantinos Stellos; Victoria Panagiota; Andreas Kögel; Thomas Leyhe; Meinrad Gawaz; Christoph Laske
Journal:  J Cereb Blood Flow Metab       Date:  2010-08-18       Impact factor: 6.200

8.  The empirical power of rare variant association methods: results from sanger sequencing in 1,998 individuals.

Authors:  Martin Ladouceur; Zari Dastani; Yurii S Aulchenko; Celia M T Greenwood; J Brent Richards
Journal:  PLoS Genet       Date:  2012-02-02       Impact factor: 5.917

9.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01       Impact factor: 49.962

10.  'Location, Location, Location': a spatial approach for rare variant analysis and an application to a study on non-syndromic cleft lip with or without cleft palate.

Authors:  Heide Fier; Sungho Won; Dmitry Prokopenko; Taofik AlChawa; Kerstin U Ludwig; Rolf Fimmers; Edwin K Silverman; Marcello Pagano; Elisabeth Mangold; Christoph Lange
Journal:  Bioinformatics       Date:  2012-10-08       Impact factor: 6.937

  6 in total

1.  Whole Genome Sequencing Identifies CRISPLD2 as a Lung Function Gene in Children With Asthma.

Authors:  Priyadarshini Kachroo; Julian Hecker; Bo L Chawes; Tarunveer S Ahluwalia; Michael H Cho; Dandi Qiao; Rachel S Kelly; Su H Chu; Yamini V Virkud; Mengna Huang; Kathleen C Barnes; Esteban G Burchard; Celeste Eng; Donglei Hu; Juan C Celedón; Michelle Daya; Albert M Levin; Hongsheng Gui; L Keoki Williams; Erick Forno; Angel C Y Mak; Lydiana Avila; Manuel E Soto-Quiros; Michelle M Cloutier; Edna Acosta-Pérez; Glorisa Canino; Klaus Bønnelykke; Hans Bisgaard; Benjamin A Raby; Christoph Lange; Scott T Weiss; Jessica A Lasky-Su
Journal:  Chest       Date:  2019-09-23       Impact factor: 9.410

2.  Whole-Genome Sequencing in Severe Chronic Obstructive Pulmonary Disease.

Authors:  Dmitry Prokopenko; Phuwanat Sakornsakolpat; Heide Loehlein Fier; Dandi Qiao; Margaret M Parker; Merry-Lynn N McDonald; Ani Manichaikul; Stephen S Rich; R Graham Barr; Christopher J Williams; Mark L Brantly; Christoph Lange; Terri H Beaty; James D Crapo; Edwin K Silverman; Michael H Cho
Journal:  Am J Respir Cell Mol Biol       Date:  2018-11       Impact factor: 6.914

3.  Family-based tests for associating haplotypes with general phenotype data: Improving the FBAT-haplotype algorithm.

Authors:  Julian Hecker; Xin Xu; F William Townes; Heide Loehlein Fier; Chris Corcoran; Nan Laird; Christoph Lange
Journal:  Genet Epidemiol       Date:  2017-11-21       Impact factor: 2.135

4.  Genetic Advances in Chronic Obstructive Pulmonary Disease. Insights from COPDGene. (Review)

Authors:  Margaret F Ragland; Christopher J Benway; Sharon M Lutz; Russell P Bowler; Julian Hecker; John E Hokanson; James D Crapo; Peter J Castaldi; Dawn L DeMeo; Craig P Hersh; Brian D Hobbs; Christoph Lange; Terri H Beaty; Michael H Cho; Edwin K Silverman
Journal:  Am J Respir Crit Care Med       Date:  2019-09-15       Impact factor: 21.405

5.  A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests.

Authors:  Julian Hecker; F William Townes; Priyadarshini Kachroo; Cecelia Laurie; Jessica Lasky-Su; John Ziniti; Michael H Cho; Scott T Weiss; Nan M Laird; Christoph Lange
Journal:  Bioinformatics       Date:  2020-12-26       Impact factor: 6.937

6.  Whole-genome sequencing reveals new Alzheimer's disease-associated rare variants in loci related to synaptic function and neuronal development.

Authors:  Dmitry Prokopenko; Sarah L Morgan; Kristina Mullin; Oliver Hofmann; Brad Chapman; Rory Kirchner; Sandeep Amberkar; Inken Wohlers; Christoph Lange; Winston Hide; Lars Bertram; Rudolph E Tanzi
Journal:  Alzheimers Dement       Date:  2021-04-02       Impact factor: 21.566
