Benazir Rowe, Xiangning Chen, Zuoheng Wang, Jingchun Chen, Amei Amei.
Abstract
Genome-wide association studies (GWAS) have identified over 100 loci associated with schizophrenia. Most of these studies test genetic variants for association one at a time. In this study, we performed a GWAS of the Molecular Genetics of Schizophrenia (MGS) dataset, which comprises 5334 subjects, using the multivariate Bayesian variable selection (BVS) method Posterior Inference via Model Averaging and Subset Selection (piMASS), and compared our results with the previous univariate analysis of the same dataset. We showed that piMASS can improve the power to detect schizophrenia-associated SNPs, potentially leading to new discoveries from existing data without increasing the sample size. We tested SNPs in groups to allow for local additive effects and used a permutation test to determine statistical significance, so that our results could be compared with those of the univariate method. The previous univariate analysis of the MGS dataset revealed no genome-wide significant loci. Using the same dataset, we identified a single region that exceeded genome-wide significance. The result was replicated in an independent dataset, the Swedish Schizophrenia Case-Control Study (SSCCS). Using the SZGR 2.0 database, we found 63 SNPs from the best-performing regions that map to 27 genes known to be associated with schizophrenia. Overall, we demonstrated that piMASS can discover association signals that would otherwise require a much larger sample size. Our study implies that reanalyzing published datasets with BVS methods such as piMASS may provide more power to discover new risk variants for many diseases without new sample collection, ascertainment, and genotyping.
Year: 2019 PMID: 31745092 PMCID: PMC6863898 DOI: 10.1038/s41537-019-0088-6
Source DB: PubMed Journal: NPJ Schizophr ISSN: 2334-265X
Regions with best association metrics (Pdisc) based on permutation test
| Chr | Region (a) | Start position (b) | End position (b) | Rank (c) | Pdisc (d) | P, validation (e) |
|---|---|---|---|---|---|---|
| 15 | 29 | 83,907,801 | 86,887,657 | 1 | 1.43E−05 | 0.001 |
| 19 | 5 | 15,724,023 | 22,638,628 | 2 | 1.67E−04 | <0.001 |
| 14 | 33 | 86,399,092 | 90,573,122 | 3 | 2.20E−04 | <0.001 |
| 9 | 24 | 34,905,605 | 70,379,322 | 4 | 2.50E−04 | 0.002 |
| 14 | 6 | 29,288,170 | 33,177,081 | 5 | 3.00E−04 | <0.001 |
| 8 | 30 | 53,113,091 | 57,376,926 | 5 | 3.00E−04 | 0.001 |
| 20 | 1 | 9,795 | 2,715,620 | 7 | 3.33E−04 | <0.001 |
| 18 | 15 | 28,642,588 | 33,646,071 | 8 | 5.00E−04 | <0.001 |
| 15 | 28 | 80,260,648 | 85,190,202 | 9 | 5.50E−04 | <0.001 |
| 1 | 36 | 81,955,643 | 85,727,849 | 10 | 5.67E−04 | <0.001 |
| 13 | 41 | 98,395,342 | 101,641,945 | 11 | 6.00E−04 | <0.001 |
| 3 | 15 | 22,010,347 | 25,354,138 | 12 | 7.50E−04 | 0.001 |
(a) Regions were numbered separately on each chromosome, starting from 1.
(b) Start position is the position of the first SNP included in the region; end position is the position of the last SNP included in the region.
(c) Rank is based on the empirical P value calculated from the permutation test using the MGS dataset.
(d) Empirical P value based on 100,000 or fewer permutations using the discovery dataset (MGS).
(e) Empirical P value based on 1000 permutations using the validation dataset (SSCCS).
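The empirical P values in footnotes d and e come from a permutation test. As a minimal sketch, assuming the empirical P value is simply the fraction of permuted statistics at least as large as the observed one, the calculation could look like the following. The function names `empirical_p_value` and `format_p` are hypothetical, and the region statistic itself (for example, one obtained by rerunning the analysis on permuted phenotype labels) is assumed to be computed elsewhere; this is not taken from the authors' code.

```python
def empirical_p_value(observed, permuted_stats):
    """Return (p, exceed): the fraction and the count of permuted
    statistics that are at least as large as the observed statistic.

    Assumption: larger statistics mean stronger evidence of association.
    """
    n = len(permuted_stats)
    if n == 0:
        raise ValueError("need at least one permutation")
    exceed = sum(1 for s in permuted_stats if s >= observed)
    return exceed / n, exceed


def format_p(exceed, n):
    """Format an empirical P value for reporting.

    With zero exceedances, the estimate can only be bounded by the
    resolution of the test, so it is reported as "<1/n".
    """
    if exceed == 0:
        return f"<{1 / n:g}"
    return f"{exceed / n:.2E}"


# Toy example with made-up numbers (not data from the study).
p, exceed = empirical_p_value(3.0, [1.0, 2.0, 3.0, 4.0])
print(p, exceed)          # 0.5 2
print(format_p(0, 1000))  # <0.001
```

Under this assumption, the smallest reportable value is bounded by the number of permutations, which is consistent with the validation column showing `<0.001` for 1000 permutations.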
Fig. 1 Manhattan plot of 1-PIP for the MGS dataset
SNPs with their mapped genes
| Chr | Gene region | SNP | Position (a) | MAF (b) | PIP | C-score (c) |
|---|---|---|---|---|---|---|
| 3 | AC092422.1 (RARB) | rs993804 | 25,070,680 | 0.27 | 0.059 | 5.917 |
| 3 | AC092422.1 (RARB) | rs4858697 | 25,075,091 | 0.46 | 0.044 | 2.97 |
| 13 | NALCN, NALCN-AS1 | rs2044117 | 101,055,958 | 0.13 | 0.124 | 9.444 |
| 13 | NALCN | rs9554752 | 101,073,961 | 0.35 | 0.040 | 1.960 |
| 14 | LOC105370439, LOC105370440 | rs915071 | 31,964,652 | 0.40 | 0.738 | 0.898 |
(a) Position is taken from the NHGRI-EBI GWAS Catalog.
(b) Minor allele frequency (MAF) in the 1000 Genomes Phase 3 combined population.
(c) Average C-score based on the Combined Annotation-Dependent Depletion (CADD) method.
Fig. 2 piMASS genome-wide region-based performance of the SSCCS dataset. The sum of posterior inclusion probabilities (PIPs) for each of the 1244 overlapping regions spanning 22 chromosomes of the SSCCS dataset
Fig. 3 Manhattan plot of 1-PIP for the SSCCS dataset
Fig. 4 piMASS genome-wide region-based performance of the MGS dataset. The sum of posterior inclusion probabilities (PIPs) for each of the 1266 overlapping regions spanning 22 chromosomes of the MGS dataset
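The region-based metric in the figure captions is the sum of PIPs over overlapping regions. The sketch below illustrates one way such sums might be computed for SNPs ordered by position on a single chromosome. The window size, the step, and the function names `overlapping_windows` and `region_pip_sums` are assumptions made for illustration; the source does not state how the regions were delimited.

```python
def overlapping_windows(n_snps, size, step):
    """Return (start, end) index pairs covering n_snps ordered SNPs.

    Windows overlap whenever step < size. The last window is truncated
    at the end of the chromosome. Both parameters are hypothetical.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    windows = []
    start = 0
    while start < n_snps:
        end = min(start + size, n_snps)
        windows.append((start, end))
        if end == n_snps:
            break
        start += step
    return windows


def region_pip_sums(pips, size, step):
    """Sum per-SNP posterior inclusion probabilities within each window."""
    return [sum(pips[s:e]) for s, e in overlapping_windows(len(pips), size, step)]


# Toy PIPs for five SNPs (made-up values, not data from the study).
pips = [0.5, 0.25, 0.125, 0.0, 0.5]
print(overlapping_windows(len(pips), size=4, step=2))  # [(0, 4), (2, 5)]
print(region_pip_sums(pips, size=4, step=2))           # [0.875, 0.625]
```

Because each PIP is the posterior probability that a SNP is included in the model, a regional sum can be read as the posterior expected number of included SNPs in that region, which lets a signal spread across several correlated SNPs still register at the region level.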