IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

Mingwei Dai¹,², Jingsi Ming², Mingxuan Cai², Jin Liu³, Can Yang², Xiang Wan⁴, Zongben Xu¹.

Abstract

MOTIVATION: Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, a property known as 'polygenicity'. Tens of thousands of samples are often required to ensure sufficient statistical power to identify these variants. However, a research group can often obtain approval to access individual-level genotype data only for a limited sample size (e.g. a few hundred or a few thousand samples). Meanwhile, summary statistics generated by single-variant analysis are becoming publicly available, and the sample sizes behind these summary statistics datasets are usually quite large. How to make the most efficient use of these abundant data resources remains largely an open question.
RESULTS: In this study, we propose a statistical approach, IGESS, to increase the statistical power of identifying risk variants and improve the accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle genome-wide analysis. Through comprehensive simulation studies, we demonstrate the advantages of IGESS over methods that take either individual-level data or summary statistics alone as input. We applied IGESS to an integrative analysis of Crohn's disease data from WTCCC and summary statistics from other studies. IGESS significantly increased the statistical power of identifying risk variants and improved risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240 000 variants.
AVAILABILITY AND IMPLEMENTATION: The IGESS software is available at https://github.com/daviddaigithub/IGESS
CONTACT: zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2017 PMID: 28498950 PMCID: PMC5860575 DOI: 10.1093/bioinformatics/btx314

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

30 in total

1. BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies.

Authors: Xiang Wan; Can Yang; Qiang Yang; Hong Xue; Xiaodan Fan; Nelson L S Tang; Weichuan Yu
Journal: Am J Hum Genet Date: 2010-09-10 Impact factor: 11.025

Review 2. Heritability in the genomics era--concepts and misconceptions.

Authors: Peter M Visscher; William G Hill; Naomi R Wray
Journal: Nat Rev Genet Date: 2008-03-04 Impact factor: 53.242

3. Genome-wide association analysis by lasso penalized logistic regression.

Authors: Tong Tong Wu; Yi Fang Chen; Trevor Hastie; Eric Sobel; Kenneth Lange
Journal: Bioinformatics Date: 2009-01-28 Impact factor: 6.937

4. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.

Authors: Brendan K Bulik-Sullivan; Po-Ru Loh; Hilary K Finucane; Stephan Ripke; Jian Yang; Nick Patterson; Mark J Daly; Alkes L Price; Benjamin M Neale
Journal: Nat Genet Date: 2015-02-02 Impact factor: 38.330

5. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension.

Authors: Xiaofeng Zhu; Tao Feng; Bamidele O Tayo; Jingjing Liang; J Hunter Young; Nora Franceschini; Jennifer A Smith; Lisa R Yanek; Yan V Sun; Todd L Edwards; Wei Chen; Mike Nalls; Ervin Fox; Michele Sale; Erwin Bottinger; Charles Rotimi; Yongmei Liu; Barbara McKnight; Kiang Liu; Donna K Arnett; Aravinda Chakravati; Richard S Cooper; Susan Redline
Journal: Am J Hum Genet Date: 2014-12-11 Impact factor: 11.025

6. Incorporating group correlations in genome-wide association studies using smoothed group Lasso.

Authors: Jin Liu; Jian Huang; Shuangge Ma; Kai Wang
Journal: Biostatistics Date: 2012-09-17 Impact factor: 5.899

7. Common SNPs explain a large proportion of the heritability for human height.

Authors: Jian Yang; Beben Benyamin; Brian P McEvoy; Scott Gordon; Anjali K Henders; Dale R Nyholt; Pamela A Madden; Andrew C Heath; Nicholas G Martin; Grant W Montgomery; Michael E Goddard; Peter M Visscher
Journal: Nat Genet Date: 2010-06-20 Impact factor: 38.330

8. Regularization Paths for Generalized Linear Models via Coordinate Descent.

Review 9. Developing and evaluating polygenic risk prediction models for stratified disease prevention.

10. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

4 in total

1. IGREX for quantifying the impact of genetically regulated expression on phenotypes.

2. OmicsON - Integration of omics data with molecular networks and statistical procedures.

3. LPG: A four-group probabilistic approach to leveraging pleiotropy in genome-wide association studies.

4. LEP: A Statistical Method Integrating Individual-Level and Summary-Level Data of the Same Trait From Different Populations.

Authors: Mingwei Dai; Jin Liu; Can Yang
Journal: Biomed Inform Insights Date: 2019-10-17