Guo-Bo Chen, Sang Hong Lee, Grant W Montgomery, Naomi R Wray, Peter M Visscher, Richard B Gearry, Ian C Lawrance, Jane M Andrews, Peter Bampton, Gillian Mahy, Sally Bell, Alissa Walsh, Susan Connor, Miles Sparrow, Lisa M Bowdler, Lisa A Simms, Krupa Krishnaprasad, Graham L Radford-Smith, Gerhard Moser.
Abstract
BACKGROUND: Predicting risk of disease from genotypes is increasingly being proposed for a variety of diagnostic and prognostic purposes. Genome-wide association studies (GWAS) have identified a large number of genome-wide significant susceptibility loci for Crohn's disease (CD) and ulcerative colitis (UC), two subtypes of inflammatory bowel disease (IBD). Recent studies have demonstrated that prediction models including only loci significantly associated with disease have low predictive power, and that power can be substantially improved using a polygenic approach.
Keywords: Case-control study; Complex trait; Crohn’s disease; Inflammatory bowel disease; Risk score; SNP array; Ulcerative colitis
Year: 2017 PMID: 28851283 PMCID: PMC5576242 DOI: 10.1186/s12881-017-0451-2
Source DB: PubMed Journal: BMC Med Genet ISSN: 1471-2350 Impact factor: 2.103
Fig. 1. Datasets used in this study. (a) SNP density of iChip and gChip SNPs. The whole genome was partitioned into 0.6 M bins on each chromosome. The middle and inner circles indicate the density of the SNPs on iChip and gChip, respectively. The spikes for iChip depict regions of dense coverage mainly chosen for replication and fine mapping of GWAS loci, while gChip provides a uniform coverage with higher average density. (b) Partitioning of data into sets of increasing sample size and number of SNPs. Samples were split into four subsets with increasing number of individuals and SNPs. The smallest subsets (dotted box) include samples genotyped on both gChip and iChip and SNPs overlapping between chips.
Fig. 2. Comparison of prediction performance of four methods using individuals and SNPs common between gChip and iChip. The sample consisted of 2479 cases and 3440 controls for CD and 2357 cases and 6740 controls for UC. The number of SNPs was 42,534. Prediction accuracy is measured as the area under the curve (AUC), with higher values denoting better performance. Vertical lines display the variation of estimates in 5-fold cross-validation. Prediction models were trained using either disease status (0–1) or disease phenotype adjusted for ancestry (adjusted).
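To make the accuracy measure in the Fig. 2 caption concrete, the following is a minimal sketch of how an AUC can be computed from risk scores and case-control labels. It uses the pairwise (Mann-Whitney) definition: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy scores are illustrative assumptions, not the authors' code or data, and the 5-fold cross-validation spread shown in the figure is not reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case-control comparison.

    scores: predicted risk scores (higher means higher predicted risk)
    labels: 1 for case, 0 for control
    Hypothetical helper for illustration; not taken from the study.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0      # case ranked above control
            elif c == k:
                wins += 0.5      # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy data (assumed): two controls followed by two cases.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to scores that do not separate cases from controls at all, and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so a rank-based formulation would be preferable for cohorts of the size described in the captions.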
Fig. 3. Prediction performance with increasing sample size and SNP density using BayesR. Prediction accuracy is measured as the area under the curve (AUC), with higher values denoting better performance. Prediction models were trained using either disease status (0–1) or disease phenotype adjusted for ancestry (adjusted).
Fig. 4. Distribution of genomic risk scores in UC and CD cases and controls of the ANZ cohort. Kernel density estimates of risk scores in case and control groups, predicted using models trained on IIBDGC samples and iChip.
Fig. 5. Odds ratio of case-control status. Individuals in the independent ANZ cohort were partitioned into 10 groups on the basis of the rank of their predicted risk score from BayesR, EN, GBLUP, and GPRS. The first decile is used as the reference group. The vertical bars denote mean and 95% confidence intervals from 5-fold cross-validation. The discovery populations included 123,437 iChip SNPs and 43,900 and 40,050 individuals for CD and UC, respectively.
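The decile comparison described in the Fig. 5 caption can be sketched as follows: individuals are ranked by predicted risk score, split into 10 equal-sized groups, and the odds of being a case in each group are divided by the odds in the first (lowest-score) group. The function name, the toy data, and the 0.5 continuity correction (Haldane-Anscombe, added so that a group with zero cases or controls does not cause a division by zero) are assumptions for illustration; the caption does not say how the authors handled such groups, and the confidence intervals are not computed here.

```python
def decile_odds_ratios(scores, labels, n_groups=10):
    """Odds ratio of case status per score-rank group, relative to group 1.

    scores: predicted risk scores; labels: 1 for case, 0 for control.
    Hypothetical helper for illustration; not taken from the study.
    """
    n = len(scores)
    # Indices of individuals ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])

    odds = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        cases = sum(labels[i] for i in members)
        controls = len(members) - cases
        # Assumed 0.5 continuity correction to avoid zero cells.
        odds.append((cases + 0.5) / (controls + 0.5))

    reference = odds[0]  # first decile is the reference group
    return [o / reference for o in odds]


# Toy cohort (assumed): 100 individuals whose case fraction rises with score.
toy_scores = list(range(100))
toy_labels = [1 if i % 10 < i // 10 else 0 for i in range(100)]
toy_ors = decile_odds_ratios(toy_scores, toy_labels)
```

By construction the first group has an odds ratio of exactly 1, and a risk score that carries real signal should show odds ratios that increase towards the top decile.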