
IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

Mingwei Dai, Jingsi Ming, Mingxuan Cai, Jin Liu, Can Yang, Xiang Wan, Zongben Xu

Abstract

MOTIVATION: Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, a property known as 'polygenicity'. Tens of thousands of samples are often required to achieve adequate statistical power to identify these small-effect variants. However, a research group can often obtain approval to access individual-level genotype data only for a limited sample size (e.g. a few hundred or a few thousand samples). Meanwhile, summary statistics generated by single-variant-based analysis are becoming publicly available, and the sample sizes behind these summary statistics datasets are usually quite large. How to make the most efficient use of these abundant existing data resources remains largely an open question.
RESULTS: In this study, we propose a statistical approach, IGESS, to increase the statistical power to identify risk variants and to improve the accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over methods that take either individual-level data or summary statistics data alone as input. We applied IGESS to perform an integrative analysis of Crohn's disease data from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240 000 variants.
AVAILABILITY AND IMPLEMENTATION: The IGESS software is available at https://github.com/daviddaigithub/IGESS.
CONTACT: zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
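
The summary statistics that the abstract refers to are the per-variant effect estimates, standard errors and z-scores produced by testing one variant at a time. The following is a minimal sketch of such a single-variant analysis in Python with NumPy, assuming a quantitative phenotype, additive 0/1/2 genotype coding and simulated data. The function name `single_variant_summary_stats`, the sample size, the allele frequency and the effect size are illustrative assumptions. It is not the IGESS algorithm and is not taken from the IGESS software.

```python
import numpy as np


def single_variant_summary_stats(genotypes, phenotype):
    """Regress the phenotype on each variant separately (marginal analysis).

    genotypes: (n_samples, n_variants) array of allele counts (0, 1 or 2).
    phenotype: (n_samples,) array of quantitative trait values.
    Returns per-variant effect estimates, standard errors and z-scores.
    Assumes every variant is polymorphic (non-zero variance).
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = X.shape[0]

    # Centring both sides absorbs the intercept of each simple regression.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    sxx = (Xc ** 2).sum(axis=0)          # per-variant sum of squares
    beta = Xc.T @ yc / sxx               # least-squares slope per variant

    # Residual sum of squares of each one-variant model, with n - 2 d.o.f.
    rss = (yc ** 2).sum() - beta ** 2 * sxx
    se = np.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Simulated data (hypothetical): 2000 samples, 50 variants, and only
# variant 0 has a true effect on the phenotype.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(2000, 50))
y = 0.5 * G[:, 0] + rng.normal(size=2000)

beta, se, z = single_variant_summary_stats(G, y)
print("variant with largest |z|:", int(np.argmax(np.abs(z))))
```

Only the three returned arrays, not the genotype matrix `G`, would need to be shared in order to publish results of this kind, which is why summary statistics are easier to obtain than individual-level data.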

Year:  2017        PMID: 28498950      PMCID: PMC5860575          DOI: 10.1093/bioinformatics/btx314

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


Cited by (4 in total)

1.  IGREX for quantifying the impact of genetically regulated expression on phenotypes.

Authors:  Mingxuan Cai; Lin S Chen; Jin Liu; Can Yang
Journal:  NAR Genom Bioinform       Date:  2020-02-19

2.  OmicsON - Integration of omics data with molecular networks and statistical procedures.

Authors:  Cezary Turek; Sonia Wróbel; Monika Piwowar
Journal:  PLoS One       Date:  2020-07-29       Impact factor: 3.240

3.  LPG: A four-group probabilistic approach to leveraging pleiotropy in genome-wide association studies.

Authors:  Yi Yang; Mingwei Dai; Jian Huang; Xinyi Lin; Can Yang; Min Chen; Jin Liu
Journal:  BMC Genomics       Date:  2018-06-28       Impact factor: 3.969

4.  LEP: A Statistical Method Integrating Individual-Level and Summary-Level Data of the Same Trait From Different Populations.

Authors:  Mingwei Dai; Jin Liu; Can Yang
Journal:  Biomed Inform Insights       Date:  2019-10-17
