Zhongming Zhao, Bradley T Webb, Peilin Jia, T Bernard Bigdeli, Brion S Maher, Edwin van den Oord, Sarah E Bergen, Richard L Amdur, Francis A O'Neill, Dermot Walsh, Dawn L Thiselton, Xiangning Chen, Carlos N Pato, Brien P Riley, Kenneth S Kendler, Ayman H Fanous.
Abstract
Integrating evidence from multiple domains is useful in prioritizing disease candidate genes for subsequent testing. We ranked all known human genes (n=3819) under linkage peaks in the Irish Study of High-Density Schizophrenia Families (ISHDSF) using three different evidence domains: 1) a meta-analysis of microarray gene expression results using the Stanley Brain collection, 2) a schizophrenia protein-protein interaction network, and 3) a systematic literature search. Each gene was assigned a domain-specific p-value and ranked after evaluating the evidence within each domain. For comparison to this ranking process, a large-scale candidate gene hypothesis was also tested by including genes with Gene Ontology terms related to neurodevelopment. Subsequently, genotypes of 3725 SNPs in 167 genes from a custom Illumina iSelect array were used to evaluate the top-ranked vs. hypothesis-selected genes. Seventy-three genes were both highly ranked and involved in neurodevelopment (category 1), while 42 and 52 genes were exclusive to neurodevelopment (category 2) or highly ranked (category 3), respectively. The most significant associations were observed in genes PRKG1, PRKCE, and CNTN4, but no individual SNPs were significant after correction for multiple testing. Comparison of the approaches showed an excess of significant tests using the hypothesis-driven neurodevelopment category. Random selection of similarly sized genes from two independent genome-wide association studies (GWAS) of schizophrenia showed the excess was unlikely by chance. In a further meta-analysis of three GWAS datasets, four candidate SNPs reached nominal significance. Although gene ranking using integrated sources of prior information did not enrich for significant results in the current experiment, gene selection using an a priori hypothesis (neurodevelopment) was superior to random selection.
As such, further development of gene ranking strategies using more carefully selected sources of information is warranted.
Year: 2013 PMID: 23922650 PMCID: PMC3726675 DOI: 10.1371/journal.pone.0067776
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Flowchart of data processing, algorithm for gene ranking and selection, custom-based genotyping and association analysis.
Summary of genes with at least one significant (p<0.01) SNP.
| Gene | Category | Rank | # SNPs | Min p-value |
| | 2 | 954 | 247 | 0.000536 |
| | 1 | 35 | 229 | 0.001321 |
| | 2 | 645 | 381 | 0.001474 |
| | 2 | 549 | 2 | 0.001949 |
| | 1 | 109 | 51 | 0.001975 |
| | 3 | 88 | 2 | 0.001978 |
| | 3 | 98 | 76 | 0.002273 |
| | 3 | 99 | 56 | 0.002886 |
| | 2 | 538 | 74 | 0.003219 |
| | 3 | 96 | 55 | 0.003419 |
| | 3 | 149 | 8 | 0.003559 |
| | 3 | 8 | 67 | 0.004041 |
| | 2 | 291 | 56 | 0.004202 |
| | 3 | 144 | 11 | 0.004317 |
| | 2 | 361 | 4 | 0.004592 |
| | 2 | 862 | 146 | 0.005419 |
| | 1 | 132 | 66 | 0.006312 |
| | 3 | 151 | 47 | 0.007685 |
| | 1 | 65 | 4 | 0.008221 |
| | 3 | 78 | 52 | 0.008672 |
| | 2 | 439 | 15 | 0.009058 |
| | 3 | 101 | 67 | 0.009981 |
Category 1: genes are both highly ranked and involved in neurodevelopment. Category 2: genes are exclusive to neurodevelopment. Category 3: genes are exclusively highly ranked (see details in Materials and Methods).
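The abstract reports that no individual SNP remained significant after correction for multiple testing, but this page does not say which correction was applied. As a rough illustration only, the sketch below assumes a Bonferroni correction over all 3725 genotyped SNPs. That is an assumption, and a conservative one, since SNPs in linkage disequilibrium are not independent tests. The helper names are hypothetical.

```python
# Minimum per-gene p-values copied from the table above (22 genes).
MIN_P_VALUES = [
    0.000536, 0.001321, 0.001474, 0.001949, 0.001975, 0.001978,
    0.002273, 0.002886, 0.003219, 0.003419, 0.003559, 0.004041,
    0.004202, 0.004317, 0.004592, 0.005419, 0.006312, 0.007685,
    0.008221, 0.008672, 0.009058, 0.009981,
]

N_TESTS = 3725  # number of genotyped SNPs, from the abstract
ALPHA = 0.05    # family-wise error rate (assumed, not stated in the source)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def survivors(p_values, alpha, n_tests):
    """Return the p-values that fall below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p for p in p_values if p < threshold]


if __name__ == "__main__":
    print("threshold:", bonferroni_threshold(ALPHA, N_TESTS))
    print("survivors:", survivors(MIN_P_VALUES, ALPHA, N_TESTS))
```

Under this assumed correction the threshold is about 1.3e-5, so even the smallest p-value in the table (0.000536) would not pass, which is consistent with the statement in the abstract.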
Summary of genes and number of SNPs per test category.
| Category | # genes | # SNPs | Highly ranked | Neurodevelopment | Mean p-value | SNPs with p<0.05 (Obs.) | SNPs with p<0.05 (Exp.) | SNPs with p<0.005 (Obs.) | SNPs with p<0.005 (Exp.) |
| 1 | 73 | 1271 | Yes | Yes | 0.521 | 42 | 63.6 | 4 | 6.4 |
| 2 | 42 | 1525 | No | Yes | 0.480 | 103 | 76.3 | 17 | 7.6 |
| 3 | 52 | 929 | Yes | No | 0.488 | 63 | 46.5 | 2 | 4.6 |
| 1+2 | 115 | 2796 | - | Yes | 0.498 | 145 | 139.8 | 21 | 14.0 |
| 1+3 | 125 | 2200 | Yes | - | 0.507 | 105 | 110.0 | 6 | 11.0 |
| All | 167 | 3725 | | | | 208 | 186.3 | 23 | 18.6 |
Category 1: genes are both highly ranked and involved in neurodevelopment. Category 2: genes are exclusive to neurodevelopment. Category 3: genes are exclusively highly ranked (see details in Materials and Methods).
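The expected counts in the table match the number of SNPs multiplied by the significance threshold (for example, 1525 × 0.05 ≈ 76.3 for category 2). The sketch below reproduces that arithmetic and adds a one-sided binomial tail probability as a rough gauge of the excess. The binomial step is an illustrative assumption, not the authors' stated test: it treats SNPs as independent, which linkage disequilibrium violates. Function names are hypothetical.

```python
import math

# (category, number of SNPs, observed p<0.05, observed p<0.005) from the table above
ROWS = [
    ("1", 1271, 42, 4),
    ("2", 1525, 103, 17),
    ("3", 929, 63, 2),
    ("1+2", 2796, 145, 21),
    ("1+3", 2200, 105, 6),
    ("All", 3725, 208, 23),
]


def expected_count(n_snps, alpha):
    """Expected number of SNPs with p < alpha if all p-values are uniform."""
    return n_snps * alpha


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


if __name__ == "__main__":
    for category, n_snps, obs_05, obs_005 in ROWS:
        exp_05 = expected_count(n_snps, 0.05)
        exp_005 = expected_count(n_snps, 0.005)
        tail = binom_upper_tail(n_snps, obs_05, 0.05)
        print(category, round(exp_05, 2), round(exp_005, 2), tail)
```

Because of the independence assumption, the tail probabilities are only indicative; the random gene selection comparison reported in the source is the more appropriate check.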
Comparison of ISHDSF rank and hypothesis based gene selection results to random gene selection in schizophrenia CATIE and GAIN GWAS datasets.
| Category | Simulation method | Observed in ISHDSF | Empirical p, CATIE GWAS (100,000) | Empirical p, CATIE GWAS (Top 500) | Empirical p, GAIN GWAS (100,000) | Empirical p, GAIN GWAS (Top 10,000) | Empirical p, GAIN GWAS (Top 500) |
| All | SNP count | 3741 | 0.00097 | 0.194 | 0.066 | 0.659 | 0.998 |
| All | min p-value | 0.000582 | 0.618 | 0.896 | 0.787 | 0.879 | 0.964 |
| All | # SNPs with p<0.05 | 208 | | 0.208 | 0.158 | 0.767 | 1 |
| All | # SNPs with p<0.005 | 23 | | 0.234 | 0.241 | 0.614 | 0.924 |
| 1 | SNP count | 1271 | 0.079 | 0.655 | 0.389 | 0.806 | 0.892 |
| 1 | min p-value | 0.001975 | 0.729 | 0.872 | 0.859 | 0.920 | 0.956 |
| 1 | # SNPs with p<0.05 | 42 | 0.586 | 0.914 | 0.887 | 0.976 | 0.986 |
| 1 | # SNPs with p<0.005 | 4 | 0.556 | 0.836 | 0.780 | 0.895 | 0.934 |
| 2 | SNP count | 1525 | 0.0022 | 0.104 | 0.017 | 0.121 | 0.267 |
| 2 | min p-value | 0.000582 | 0.214 | 0.349 | 0.319 | 0.398 | 0.445 |
| 2 | # SNPs with p<0.05 | 103 | | | | 0.102 | 0.232 |
| 2 | # SNPs with p<0.005 | 17 | | | | 0.058 | 0.142 |
| 3 | SNP count | 929 | 0.088 | 0.591 | 0.358 | 0.707 | 0.822 |
| 3 | min p-value | 0.003559 | 0.797 | 0.926 | 0.883 | 0.932 | 0.972 |
| 3 | # SNPs with p<0.05 | 63 | | 0.427 | 0.264 | 0.558 | 0.719 |
| 3 | # SNPs with p<0.005 | 2 | 0.701 | 0.896 | 0.837 | 0.914 | 0.954 |
Category 1: genes are both highly ranked and involved in neurodevelopment. Category 2: genes are exclusive to neurodevelopment. Category 3: genes are exclusively highly ranked (see details in text).
We performed simulations by four methods: 1) based on the count of SNPs, 2) based on the minimum p-value, 3) based on the number of SNPs with p<0.05, and 4) based on the number of SNPs with p<0.005.
100,000 simulations (see text).
To reduce bias, simulations were filtered with top 500 or 10,000 SNPs being used (see Materials and Methods).
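The general idea of an empirical p-value from random gene selection can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it draws gene sets of a fixed size uniformly at random from a toy, synthetic GWAS and does not match on gene size or apply the top-500 or top-10,000 filters described in the footnotes. All names (`empirical_p`, `make_toy_gwas`) and the toy data are hypothetical.

```python
import random


def count_below(p_values, threshold):
    """Number of p-values strictly below the threshold."""
    return sum(1 for p in p_values if p < threshold)


def empirical_p(gene_to_pvalues, n_genes, observed, threshold=0.05,
                n_sims=1000, seed=0):
    """Fraction of random gene sets whose count of SNPs with p < threshold
    is at least as large as the observed count."""
    rng = random.Random(seed)
    genes = sorted(gene_to_pvalues)
    hits = 0
    for _ in range(n_sims):
        sample = rng.sample(genes, n_genes)  # random gene set, no replacement
        stat = sum(count_below(gene_to_pvalues[g], threshold) for g in sample)
        if stat >= observed:
            hits += 1
    return hits / n_sims


def make_toy_gwas(n_genes=200, seed=1):
    """Synthetic data: each gene gets 1 to 30 SNPs with uniform p-values."""
    rng = random.Random(seed)
    return {
        f"gene{i}": [rng.random() for _ in range(rng.randint(1, 30))]
        for i in range(n_genes)
    }


if __name__ == "__main__":
    toy = make_toy_gwas()
    print(empirical_p(toy, n_genes=20, observed=25))
```

The same loop works for the other statistics listed above (SNP count, minimum p-value) by swapping the statistic computed on each random set; for the minimum p-value the comparison direction would be reversed, since smaller values are more extreme.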
Four SNPs from the meta-analysis of 66 SNPs using GAIN, nonGAIN, and ISC GWAS datasets.
| Gene | SNP ID | Chr. | Position (bp) | Allele | Meta-analysis p-value | Meta-analysis beta | Meta-analysis s.e. | p (heterogeneity) | I² | p (GAIN) | p (nonGAIN) | p (ISC) | ISHDSF min p |
| | rs2176348 | 2 | 45798033 | A/G | 0.044 | −0.064 | 0.032 | 0.838 | 0 | 0.470 | 0.599 | 0.057 | 0.004 |
| | rs552551 | 10 | 131271915 | C/T | 0.044 | 0.073 | 0.036 | 0.29 | 19.15 | 0.996 | 0.797 | 0.011 | 0.009 |
| | rs2616591 | 3 | 2614861 | C/T | 0.048 | 0.088 | 0.044 | 0.481 | 0 | 0.347 | 0.061 | NA | 0.004 |
| | rs2043534 | 2 | 100847317 | C/T | 0.062 | 0.064 | 0.035 | 0.635 | 0 | 0.258 | 0.808 | 0.081 | 0.003 |
Chr.: chromosome. GAIN, nonGAIN and ISC are three GWAS datasets for meta-analysis. ISHDSF min p was the smallest p-value in the gene from the ISHDSF dataset (this study). NA: this SNP was not analyzed in ISC due to missing genotyping data in samples.
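The meta-analysis columns relate to each other in a standard way: a two-sided Wald p-value follows from beta divided by its standard error, and I² summarizes heterogeneity across studies. The sketch below assumes a fixed-effect, inverse-variance model; the source does not state which model was used, so this is an illustration rather than a reconstruction. Per-study effect sizes are not given on this page, so the inputs in the usage lines are hypothetical.

```python
import math


def wald_p(beta, se):
    """Two-sided p-value for z = beta / se under a standard normal."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis (assumed model).

    Returns (pooled beta, pooled s.e., p-value, Cochran's Q, I-squared in %).
    """
    weights = [1.0 / (s * s) for s in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return beta, se, wald_p(beta, se), q, i2


if __name__ == "__main__":
    # Pooled beta and s.e. for rs2176348 from the table above.
    print(wald_p(-0.064, 0.032))
    # Hypothetical per-study estimates, for illustration only.
    print(fixed_effect_meta([0.1, 0.3], [0.1, 0.1]))
```

For rs2176348, beta = −0.064 and s.e. = 0.032 give z = −2 and a two-sided p of about 0.046, close to the reported 0.044; the small gap is what one would expect from rounding of the tabulated beta and s.e.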