Genome-wide association analysis identifies 13 new risk loci for schizophrenia.

Stephan Ripke¹, Colm O'Dushlaine, Kimberly Chambert, Jennifer L Moran, Anna K Kähler, Susanne Akterin, Sarah E Bergen, Ann L Collins, James J Crowley, Menachem Fromer, Yunjung Kim, Sang Hong Lee, Patrik K E Magnusson, Nick Sanchez, Eli A Stahl, Stephanie Williams, Naomi R Wray, Kai Xia, Francesco Bettella, Anders D Borglum, Brendan K Bulik-Sullivan, Paul Cormican, Nick Craddock, Christiaan de Leeuw, Naser Durmishi, Michael Gill, Vera Golimbet, Marian L Hamshere, Peter Holmans, David M Hougaard, Kenneth S Kendler, Kuang Lin, Derek W Morris, Ole Mors, Preben B Mortensen, Benjamin M Neale, Francis A O'Neill, Michael J Owen, Milica Pejovic Milovancevic, Danielle Posthuma, John Powell, Alexander L Richards, Brien P Riley, Douglas Ruderfer, Dan Rujescu, Engilbert Sigurdsson, Teimuraz Silagadze, August B Smit, Hreinn Stefansson, Stacy Steinberg, Jaana Suvisaari, Sarah Tosato, Matthijs Verhage, James T Walters, Douglas F Levinson, Pablo V Gejman, Kenneth S Kendler, Claudine Laurent, Bryan J Mowry, Michael C O'Donovan, Michael J Owen, Ann E Pulver, Brien P Riley, Sibylle G Schwab, Dieter B Wildenauer, Frank Dudbridge, Peter Holmans, Jianxin Shi, Margot Albus, Madeline Alexander, Dominique Campion, David Cohen, Dimitris Dikeos, Jubao Duan, Peter Eichhammer, Stephanie Godard, Mark Hansen, F Bernard Lerer, Kung-Yee Liang, Wolfgang Maier, Jacques Mallet, Deborah A Nertney, Gerald Nestadt, Nadine Norton, Francis A O'Neill, George N Papadimitriou, Robert Ribble, Alan R Sanders, Jeremy M Silverman, Dermot Walsh, Nigel M Williams, Brandon Wormley, Maria J Arranz, Steven Bakker, Stephan Bender, Elvira Bramon, David Collier, Benedicto Crespo-Facorro, Jeremy Hall, Conrad Iyegbe, Assen Jablensky, Rene S Kahn, Luba Kalaydjieva, Stephen Lawrie, Cathryn M Lewis, Kuang Lin, Don H Linszen, Ignacio Mata, Andrew McIntosh, Robin M Murray, Roel A Ophoff, John Powell, Dan Rujescu, Jim Van Os, Muriel Walshe, Matthias Weisbrod, Durk Wiersma, Peter Donnelly, Ines Barroso, Jenefer M Blackwell, 
Elvira Bramon, Matthew A Brown, Juan P Casas, Aiden P Corvin, Panos Deloukas, Audrey Duncanson, Janusz Jankowski, Hugh S Markus, Christopher G Mathew, Colin N A Palmer, Robert Plomin, Anna Rautanen, Stephen J Sawcer, Richard C Trembath, Ananth C Viswanathan, Nicholas W Wood, Chris C A Spencer, Gavin Band, Céline Bellenguez, Colin Freeman, Garrett Hellenthal, Eleni Giannoulatou, Matti Pirinen, Richard D Pearson, Amy Strange, Zhan Su, Damjan Vukcevic, Peter Donnelly, Cordelia Langford, Sarah E Hunt, Sarah Edkins, Rhian Gwilliam, Hannah Blackburn, Suzannah J Bumpstead, Serge Dronov, Matthew Gillman, Emma Gray, Naomi Hammond, Alagurevathi Jayakumar, Owen T McCann, Jennifer Liddle, Simon C Potter, Radhi Ravindrarajah, Michelle Ricketts, Avazeh Tashakkori-Ghanbaria, Matthew J Waller, Paul Weston, Sara Widaa, Pamela Whittaker, Ines Barroso, Panos Deloukas, Christopher G Mathew, Jenefer M Blackwell, Matthew A Brown, Aiden P Corvin, Mark I McCarthy, Chris C A Spencer, Elvira Bramon, Aiden P Corvin, Michael C O'Donovan, Kari Stefansson, Edward Scolnick, Shaun Purcell, Steven A McCarroll, Pamela Sklar, Christina M Hultman, Patrick F Sullivan.

Abstract

Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offspring trios). We identified 22 loci associated at genome-wide significance; 13 of these are new, and 1 was previously implicated in bipolar disorder. Examination of candidate genes at these loci suggests the involvement of neuronal calcium signaling. We estimate that 8,300 independent, mostly common SNPs (95% credible interval of 6,300-10,200 SNPs) contribute to risk for schizophrenia and that these collectively account for at least 32% of the variance in liability. Common genetic variation has an important role in the etiology of schizophrenia, and larger studies will allow more detailed understanding of this disorder.

Year: 2013 PMID: 23974872 PMCID: PMC3827979 DOI: 10.1038/ng.2742

Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330

Schizophrenia is an idiopathic mental disorder with substantial morbidity, mortality, and personal and societal costs. [1-3] An important genetic component is indicated by a sibling recurrence risk ratio of 8.6, high heritability estimates (0.64 in a national family study, 0.81 in a meta-analysis of twin studies, and 0.23 estimated directly from common SNPs), and prior genomic findings. [4-8] Although the rationale for genomic searches is strong, there are only a handful of robust empirical findings for schizophrenia. Genome-wide linkage studies to date have been inconclusive, [9] and no compelling Mendelian variants have been identified. [8] Eight rare copy number variants of strong effect (genotypic relative risks 4-20) with consistent replication have been described (e.g., 16p11.2 and 22q11.21); however, these associations are generally not disease-specific and can also be associated with autism, mental retardation, or epilepsy. [8] Initial exome sequencing studies have not yet identified specific variants of unequivocal genome-wide significance, [9-13] although larger studies are in progress. Prior GWAS for common variation have yielded statistical evidence for ∼10 genomic regions [8] including the major histocompatibility complex (MHC) [14-16] along with MIR137 and targets of miR-137. [17] The prior studies contained indications that more common variant associations were likely to be discovered with larger sample sizes. [13,17,18] We therefore sought to increase substantially the number of cases using a multistage GWAS.

Results from Sweden

We analyzed genome-wide data in 5,001 schizophrenia cases and 6,243 controls from a population-based sampling frame in Sweden (N=11,244, Table 1). Most subjects (57.4%) have never been previously reported. Following genotyping and imputation with the 1000 Genomes Project Phase 1 reference panel, the genetic data consisted of allelic dosages for 9,871,789 high-quality polymorphic SNPs. Given that this imputation panel is based on >800 chromosomes of European ancestry and includes the detail afforded by genome sequencing, we anticipated increased power in finding and describing association signals. Indeed, we observed 10,201 SNPs and 187 genomic regions with P < 1×10−5 using 1000 Genomes imputation compared with 1,594 SNPs and 133 regions for HapMap3 imputation (counts include only one region from the MHC).

Table 1

Subject characteristics and sample sizes.

Feature	Cases	Controls
Swedish sample characteristics
Male sex	0.595	0.512
Median age at sampling	54 (45-62)	57 (48-65)
Median hospital admissions for SCZ or SAD	7 (3-15)	n/a
Median total inpatient days	243 (81-696)	n/a
Median years from first to last HDR admission	9.7 (2.9-19.5)	n/a
Sample sizes
Swedish subjects (Sw1-6)	5,001	6,243
PGC schizophrenia subjects (excluding Sw1-2)	8,832	12,067
Replication results for up to 168 genomic regions	7,413	19,762
Total subjects	21,246	38,072

Values in parentheses are inter-quartile ranges. The case group had significantly more males (p < 0.0001) and was significantly younger (p < 0.0001) than controls although these differences were not of large magnitude. The higher median age in controls is in the direction of greater confidence in control classification (i.e., controls had greater time at risk for psychiatric hospitalization). Cases tended to have had considerable hospitalizations, inpatient lengths of stay, and years of observation. IQR=interquartile range, SCZ=schizophrenia, SAD=schizoaffective disorder, HDR=Hospital Discharge Register.

All cases and controls are independent. The Swedish sample totals N=11,244, the PGC N=20,899, and the replication samples N=27,175. The Sweden plus PGC meta-analysis is based on N=32,143. The Swedish sample plus PGC plus replication samples total 59,318 (these counts exclude 581 trios).

The resulting λGC was 1.075 and λ1000 (references [19-21]) was 1.013. Quantile-quantile and Manhattan plots are given in Supplemental Figures 5-6. For association with schizophrenia, 312 SNPs met a genome-wide significance threshold of 5×10−8 (reference [22]). These SNPs were in two genomic regions (Supplemental Figure 7): 241 SNPs in the MHC region (chr6:28,502,794-32,536,501, minimum P=4.07×10−11 at rs115939516) and 71 SNPs from chr2:200,715,388-201,040,981 (minimum P=3.33×10−10 at rs35220450). We replicated the MHC association reported in prior studies. [14-17] The chr2 association with schizophrenia is novel, shows highly consistent effects in the Sw1-6 genotyping batches and encompasses C2orf69, C2orf47, C2orf60, and TYW5.
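The relationship between λGC and λ1000 can be illustrated with a short calculation. The sketch below assumes the commonly used rescaling of λGC to an equivalent study of 1,000 cases and 1,000 controls; the text does not spell out the formula, so the function name and the formula itself are assumptions rather than the authors' stated method. With the Swedish sample sizes it gives a value close to the reported 1.013.

```python
def lambda_1000(lambda_gc: float, n_cases: int, n_controls: int) -> float:
    """Rescale a genomic-control lambda to 1,000 cases and 1,000 controls.

    Assumed formula (not quoted in the text):
        lambda_1000 = 1 + (lambda_gc - 1) * (1/n_cases + 1/n_controls) * 500
    The factor 500 arises because 1/1000 + 1/1000 = 1/500.
    """
    scale = (1.0 / n_cases + 1.0 / n_controls) * 500.0
    return 1.0 + (lambda_gc - 1.0) * scale


# Swedish sample: lambda_GC = 1.075 with 5,001 cases and 6,243 controls.
print(round(lambda_1000(1.075, 5001, 6243), 4))
```

Because the inflation of test statistics under polygenic inheritance grows with sample size, the rescaled value is the more comparable quantity across studies of different sizes.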

Sweden + PGC

We re-analyzed the PGC schizophrenia data using 1000 Genomes imputation (8,832 cases and 12,067 controls, excluding Swedish samples). [17] Five regions met genome-wide significance: the MHC locus (chr6:27,261,324-32,610,445, minimum P=2.18×10−10), AS3MT-CNNM2-NT5C2 (chr10:104,635,103-104,960,464, minimum P=4.29×10−10), MAD1L1 (chr7:2,005,747-2,098,238, minimum P=2.40×10−8), RP11-586K2.1 (chr8:89,585,639-89,760,620, minimum P=2.37×10−8), and SNPs near TCF4 (chr18:53,311,001-53,423,307, minimum P=3.00×10−8). We then conducted a meta-analysis of the Swedish and independent PGC schizophrenia samples using the same quality control, imputation, and analysis pipeline. This GWAS meta-analysis of 13,833 schizophrenia cases and 18,310 controls (Table 1) afforded power to detect genotypic relative risks of 1.10-1.14 for reference allele frequencies 0.15-0.85 (power=0.8, α=5×10−8, log-additive model). We evaluated the comparability of the Swedish and PGC studies using sign tests: of 608 SNPs selected from the PGC results with P < 0.0001 and in approximate linkage equilibrium, 62.6% had logistic regression beta coefficients with the same sign in the Swedish results, an observation highly inconsistent with the null (P=2.2×10−10). λGC was 1.186 and λ1000 was 1.012, values consistent with a polygenic pattern of association but not gross inflation due to technical artifacts. [20] Quantile-quantile and Manhattan plots are shown in Supplemental Figure 11 and Figure 1, and genome-wide significance was exceeded by 3,538 SNPs in 12 genomic regions.
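The sign test reported above can be sketched as an exact binomial tail probability. This is an illustration under stated assumptions: the text gives only the percentage (62.6% of 608), so the count of concordant SNPs is rounded to 381, and a one-sided test against a 50:50 null is assumed. The result is of the same order of magnitude as the reported P value, not an exact reproduction of it.

```python
from math import comb


def sign_test_p(n_same: int, n_total: int) -> float:
    """One-sided exact binomial P(X >= n_same) when each sign matches with probability 0.5."""
    tail = sum(comb(n_total, k) for k in range(n_same, n_total + 1))
    return tail / 2 ** n_total


n_total = 608
# Assumed count: 62.6% of 608 rounded to the nearest SNP (the exact count is not given).
n_same = round(0.626 * n_total)
print(n_same, sign_test_p(n_same, n_total))
```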

Figure 1

Manhattan plot of the Swedish and PGC schizophrenia meta-analysis results. The x-axis is chromosomal position and the y-axis is –log10(P). The red line is the genome-wide significance level (5×10−8). Gene locations are indicated.

We used risk score profiling [14,17] to evaluate the capacity of 130K SNPs derived from the PGC to predict case-control status in the Swedish samples. These SNPs were selected for high-confidence and approximate linkage equilibrium but without regard to association P value. As shown in Figure 2, PGC risk scores had a highly significant capacity to predict case-control status in the independent Swedish samples (P values from 10−26 – 10−114). The increased sample size allowed improved risk profile prediction as more of the SNPs in the lower bins are replicable signals. The threshold at which the explanatory power of these risk profile SNPs plateaus has decreased with increasing sample size (PT=0.1 in Figure 2, 0.2 in the PGC report, and no plateau in the ISC study). [14,17] Although the mean risk profiles were highly significantly different between cases and controls, the distributions overlap substantially (Supplemental Figure 9) and are insufficient for diagnostic purposes (area under the receiver operating characteristic curve 0.65). However, these results strongly support the comparability of the Swedish and PGC samples and the validity of the meta-analysis.

Figure 2

Risk score profiling results using the PGC schizophrenia results as the discovery set and the Sweden data as the testing set. The x-axis shows ten P value thresholds (P = 10−4, 10−3, …, 1). The y-axis is the Nagelkerke pseudo R2, the proportion of variance in case-control status explained by the risk score profile. The number atop each vertical bar is the P value for the capacity of the risk score profile to predict case-control status for that P.
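As a rough illustration of risk score profiling, the sketch below computes one subject's score as the sum of log odds ratios from the discovery set weighted by allele dosage, restricted to SNPs whose discovery P value falls at or below a threshold. The data structures, SNP identifiers, and numeric values are hypothetical and chosen only for illustration; the actual analysis pipeline is described in the cited references.

```python
from math import log


def risk_score(dosages, discovery, p_threshold):
    """Sum of log(OR) x allele dosage over SNPs with discovery P <= p_threshold.

    dosages:   {snp_id: dosage of the scored allele (0 to 2)} for one test subject
    discovery: {snp_id: (odds_ratio, p_value)} from the discovery GWAS
    """
    score = 0.0
    for snp, (odds_ratio, p_value) in discovery.items():
        if p_value <= p_threshold and snp in dosages:
            score += log(odds_ratio) * dosages[snp]
    return score


# Hypothetical discovery results and one hypothetical subject.
discovery = {"snpA": (1.10, 1e-5), "snpB": (0.90, 0.03), "snpC": (1.05, 0.40)}
subject = {"snpA": 2.0, "snpB": 1.0, "snpC": 0.0}

# Relaxing the threshold admits more SNPs into the score.
for p_t in (1e-4, 0.05, 1.0):
    print(p_t, round(risk_score(subject, discovery, p_t), 4))
```

In a full analysis, scores computed this way for every test subject would be entered into a regression on case-control status at each threshold, which is what yields the variance-explained values plotted in Figure 2.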

GWAS often omit the X chromosome (chrX). This omission is problematic as chrX is approximately as large as chromosome 8 and is enriched for genes important in brain development. Using a previously described approach, we imputed genotyped chrX SNPs to the 1000 Genomes reference panel. [23] Joint analysis of all subjects as well as males and females separately revealed no association exceeding genome-wide significance. The strongest association (rs12845396, chrX:6,029,533, P=3.46×10−7) was in an intron of NLGN4X (neuroligin 4), a gene previously implicated in mental retardation and autism, and there were multiple possible signals near MECP2 (causal to Rett syndrome, P=9.3×10−6). GWAS results generally do not lie in protein coding regions. [24] A recent report suggested that most SNPs in the NHGRI GWAS catalog [24] were in or in perfect LD with DNase I hypersensitive sites. [25] We thus evaluated whether the Sweden + PGC results had significant overlap with DNase I hypersensitive sites generated as part of the ENCODE project. [26] We did not find evidence of enrichment (Supplemental Table 8 and Supplemental Figure 10). However, this negative result is strongly qualified by the lack of DNase I hypersensitivity data directly relevant to psychiatric disorders.

Sweden + PGC + replication

We then obtained association results for SNPs in 168 genomic regions in six independent samples for a total sample size of over 21,000 cases and 38,000 controls (Table 1). The genomic regions for which replication genotypes were sought were identified using LD clumping defined by LD (r2 > 0.5) and a minimum P < 1×10−5 in the Sweden-PGC meta-analysis. Only one MHC SNP was included. The Sweden-PGC meta-analysis and replication results were highly concordant with 76.3% of the logistic regression beta coefficients having the same direction of effect (sign test P=1.5×10−17). Indeed, of the top 100 SNPs in the Sweden-PGC meta-analysis, 90% had the same sign in the replication results. This result strongly suggests that many more loci will achieve genome-wide significance with further increases in sample size. Table 2 shows the combined results in which 24 regions reached genome-wide significance. As two pairs of these regions overlap (chr1:243Mb and chr5:152Mb), there are associations with schizophrenia in 22 genomic regions. Three additional regions nearly met genome-wide significance (rs4380187 near ZNF804A P=5.66×10−8, rs4523957 in SRR P=5.69×10−8, and rs6550435 near TRANK1 P=5.86×10−8 which also had P=9×10−6 in a bipolar disorder GWAS). [27]

Table 2

Association results for Sweden-PGC meta-analysis, replication samples, and combined analysis

Chromosomal region	kb	SNPs	Index SNP				P-value			OR (SE)
Chromosomal region	kb	SNPs	rs ID	a12	bp	Freq	Sw+PGC	Replication	Combined	Sw+PGC	Replication	Combined
chr6:31,596,138-32,813,768	1217.6	1412	rs114002140	AG	32,431,962	0.763	8.28×10⁻¹⁵	6.93×10⁻²	9.14×10⁻¹⁴	1.213 (0.025)	1.070 (0.037)	1.167 (0.021)
chr10:104,487,871-105,245,420	757.5	362	rs7085104	AG	104,628,873	0.645	1.07×10⁻¹¹	2.10×10⁻³	3.68×10⁻¹³	1.129 (0.018)	1.076 (0.024)	1.110 (0.014)
chr7:1,827,717-2,346,115	518.4	566	rs6461049	TC	2,017,445	0.571	6.17×10⁻¹³	1.85×10⁻²	5.93×10⁻¹³	1.132 (0.017)	1.059 (0.024)	1.107 (0.014)
chr1:98,141,112-98,664,991	523.9	307	rs1198588	AT	98,552,832	0.214	1.92×10⁻⁸	1.91×10⁻⁵	1.72×10⁻¹²	0.889 (0.021)	0.888 (0.028)	0.889 (0.017)
chr12:2,285,731-2,440,464	154.7	129	rs1006737	AG	2,345,295	0.332	8.79×10⁻¹¹	3.76×10⁻³	5.22×10⁻¹²	1.122 (0.018)	1.070 (0.023)	1.103 (0.014)
chr10:18,601,928-18,934,390	332.5	147	rs17691888	AG	18,734,528	0.114	3.86×10⁻⁷	6.09×10⁻⁵	1.27×10⁻¹⁰	0.870 (0.028)	0.842 (0.043)	0.862 (0.023)
chr8:143,297,312-143,410,423	113.1	117	rs4129585	AC	143,312,933	0.439	3.32×10⁻⁸	1.20×10⁻³	2.19×10⁻¹⁰	1.098 (0.017)	1.077 (0.023)	1.091 (0.014)
chr1:73,275,828-74,099,273	823.4	1026	rs10789369	AG	73,824,909	0.383	4.68×10⁻⁷	1.99×10⁻⁴	3.64×10⁻¹⁰	1.091 (0.017)	1.106 (0.027)	1.095 (0.015)
chr11:130,706,918-130,894,976	188.1	269	rs7940866	AT	130,817,579	0.513	1.61×10⁻¹⁰	1.30×10⁻¹	1.83×10⁻⁹	0.896 (0.017)	0.966 (0.023)	0.921 (0.014)
chr5:151,888,959-152,835,304	946.3	79	rs17504622	TC	152,654,479	0.050	6.88×10⁻⁸	1.02×10⁻²	2.65×10⁻⁹	1.250 (0.041)	1.202 (0.072)	1.238 (0.036)
chr19:19,354,937-19,744,079	389.1	294	rs2905424	TC	19,473,445	0.348	5.38×10⁻⁷	1.64×10⁻³	3.44×10⁻⁹	1.092 (0.018)	1.093 (0.028)	1.092 (0.015)
chr2:37,422,072-37,592,628	170.6	10	rs2373000	TC	37,592,628	0.402	9.17×10⁻⁶	1.38×10⁻⁴	6.78×10⁻⁹	1.079 (0.017)	1.108 (0.027)	1.087 (0.014)
chr5:101,581,848-101,870,822	289	367	rs6878284	TC	101,769,726	0.637	1.47×10⁻⁶	1.61×10⁻³	9.03×10⁻⁹	0.917 (0.018)	0.925 (0.025)	0.920 (0.015)
chr3:52,215,002-53,175,017	960	533	rs4687552	TC	52,838,402	0.641	9.31×10⁻⁷	3.23×10⁻³	1.16×10⁻⁸	1.092 (0.018)	1.074 (0.024)	1.086 (0.014)
chr2:145,139,727-145,214,607	74.9	4	rs12991836	AC	145,141,541	0.652	2.25×10⁻⁶	1.30×10⁻³	1.19×10⁻⁸	0.918 (0.018)	0.928 (0.023)	0.922 (0.014)
chr2:200,628,118-201,293,421	665.3	249	rs2949006	TG	200,715,388	0.192	4.67×10⁻⁹	9.18×10⁻²	1.21×10⁻⁸	1.132 (0.021)	1.049 (0.029)	1.102 (0.017)
chr18:52,722,378-52,827,668	105.3	39	rs4801131	TC	52,752,700	0.418	6.46×10⁻⁶	5.27×10⁻⁴	1.22×10⁻⁸	0.926 (0.017)	0.924 (0.023)	0.925 (0.014)
chr2:233,550,961-233,808,241	257.3	197	rs778371	AG	233,743,109	0.719	5.66×10⁻⁷	5.93×10⁻³	1.51×10⁻⁸	0.911 (0.019)	0.935 (0.025)	0.920 (0.015)
chr1:243,593,066-244,025,999	432.9	133	rs14403	TC	243,663,893	0.227	1.35×10⁻⁸	8.34×10⁻²	1.80×10⁻⁸	0.889 (0.021)	0.952 (0.029)	0.910 (0.017)
chr12:123,447,928-123,913,433	465.5	353	rs11532322	AG	123,731,423	0.318	1.37×10⁻⁶	4.77×10⁻³	2.28×10⁻⁸	1.099 (0.020)	1.084 (0.029)	1.094 (0.016)
chr1:243,418,063-243,627,135	209.1	115	rs1538774	CG	243,544,827	0.260	6.11×10⁻⁷	8.38×10⁻³	2.53×10⁻⁸	0.907 (0.020)	0.934 (0.026)	0.917 (0.016)
chr8:89,188,454-89,761,163	572.7	402	rs11995572	TG	89,592,083	0.135	5.39×10⁻⁸	5.02×10⁻²	3.33×10⁻⁸	1.150 (0.026)	1.069 (0.034)	1.120 (0.021)
chr5:60,484,179-60,843,706	359.5	100	rs171748	AG	60,499,131	0.471	1.62×10⁻⁶	5.36×10⁻³	3.78×10⁻⁸	1.084 (0.017)	1.068 (0.024)	1.078 (0.014)
chr5:152,505,453-152,707,306	201.9	8	rs2910032	TC	152,540,354	0.531	8.90×10⁻⁶	1.22×10⁻³	4.12×10⁻⁸	0.928 (0.017)	0.916 (0.027)	0.925 (0.014)
chr2:185,533,580-186,057,716	524.1	50	rs4380187	AC	185,811,940	0.529	5.14×10⁻⁷	1.98×10⁻²	5.66×10⁻⁸	1.089 (0.017)	1.056 (0.024)	1.078 (0.014)
chr17:2,015,612-2,256,111	240.5	252	rs4523957	TG	2,208,899	0.616	3.01×10⁻⁷	2.66×10⁻²	5.69×10⁻⁸	1.096 (0.018)	1.057 (0.025)	1.083 (0.015)
chr3:36,834,099-36,964,583	130.5	66	rs6550435	TG	36,864,489	0.656	1.65×10⁻⁶	8.24×10⁻³	5.86×10⁻⁸	0.917 (0.018)	0.939 (0.024)	0.925 (0.014)

We used LD “clumping” to aggregate association findings into genomic regions. The first three columns describe the genomic regions and the next four columns the index SNP, the SNP with strongest association in the genomic region. The next three columns show the P-values in the meta-analysis of Sw1-6 with the PGC schizophrenia results, the replication samples alone, and the final combined analysis of Sw1-6, PGC, and replication samples. The final three columns show the odds ratios (OR) and standard errors (SE). All locations UCSC hg19.
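To see how the Sw+PGC and replication columns of Table 2 relate to the combined column, the following sketch applies a standard inverse-variance fixed-effect meta-analysis to the two odds ratios for rs1006737. Two assumptions are made that the table legend does not confirm: that the tabulated SE is the standard error of the log odds ratio, and that the combined analysis behaves approximately like a two-group inverse-variance combination (the real analysis pooled many individual studies and trios). Under those assumptions the combined OR and SE land close to the tabulated 1.103 (0.014), and the P value is of similar magnitude to the tabulated one.

```python
from math import erfc, exp, log, sqrt


def combine_fixed_effect(estimates):
    """Inverse-variance fixed-effect meta-analysis.

    estimates: list of (odds_ratio, se_of_log_odds_ratio) pairs.
    Returns (combined odds ratio, combined SE of log OR, two-sided P value).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * log(odds) for w, (odds, _) in zip(weights, estimates)) / sum(weights)
    se = 1.0 / sqrt(sum(weights))
    z = beta / se
    p_two_sided = erfc(abs(z) / sqrt(2.0))  # 2 * (1 - Phi(|z|))
    return exp(beta), se, p_two_sided


# rs1006737 (CACNA1C), Table 2: Sw+PGC and replication OR (SE).
or_comb, se_comb, p_comb = combine_fixed_effect([(1.122, 0.018), (1.070, 0.023)])
print(round(or_comb, 3), round(se_comb, 3), p_comb)
```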

Of these 22 regions (Table 3), five regions have been reported previously as meeting genome-wide significance for schizophrenia (MHC, C10orf26, DPYD-MIR137, SDCCAG8, and MMP16) and two for schizophrenia, bipolar disorder, and a combined phenotype (CACNA1C and ITIH3-ITIH4). [14-17,27-29] For the remaining 15 regions, we now find genome-wide significance for a locus previously implicated only for bipolar disorder (NCAN)[30] along with 14 novel regions.

Table 3

Description of 22 genome-wide significant regions in the combined analysis

Chromosomal region	P-value	Prior GWSIG?	Gene in relation to index SNP	Other genes in genomic region defined by LD	eQTL	Disease associations
chr6:31,596,138-32,813,768	9.14×10⁻¹⁴	SCZ	HLA-DRB9	MHC class II, many other genes, lincRNA	Many	Many
chr10:104,487,871-105,245,420	3.68×10⁻¹³	SCZ	C10orf32-AS3MT	C10orf26 CALHM1 CALHM2 CALHM3 CNNM2 CYP17A1 INA MIR1307 NT5C2 PCGF6 PDCD11 SFXN2 ST13P13 TAF5 USMG5	ACTR1A ARL3 AS3MT C10orf26 C10orf32 C10orf78 NT5C2 TMEM180 TRIM8	GWAS-blood pressure, CAD, aneurysm
chr7:1,827,717-2,346,115	5.93×10⁻¹³	No	MAD1L1	FTSJ2 NUDT1 SNX8	C7orf27 FTSJ2 MAD1L1 NUDT1
chr1:98,141,112-98,664,991	1.72×10⁻¹²	SCZ	(MIR137, 37kb)	DPYD & lincRNA	DPYD	DPYD-MR
chr12:2,285,731-2,440,464	5.22×10⁻¹²	SCZ, BIP	CACNA1C	-	No data	CACNA1C-AUT-Timothy syn Brugada syn 3
chr10:18,601,928-18,934,390	1.27×10⁻¹⁰	No	CACNB2	NSUN6	No data	CACNB2-Brugada syn 4, GWAS-blood pressure
chr8:143,297,312-143,410,423	2.19×10⁻¹⁰	No	TSNARE1	-	No data
chr1:73,275,828-74,099,273	3.64×10⁻¹⁰	No	(ENST00000415686.1, 4kb)	lincRNA	No data
chr11:130,706,918-130,894,976	1.83×10⁻⁹	No	(SNX19, 31kb)	lincRNA	SNX19
chr5:151,888,959-152,835,304 chr5:152,505,453-152,707,306	2.65×10⁻⁹ 4.12×10⁻⁸	No No	ENST00000503048.1	lincRNA (GRIA1)	No data
chr19:19,354,937-19,744,079	3.44×10⁻⁹	BIP	(MAU2, 4kb)	CILP2 GATAD2A GMIP HAPLN4 LPAR2 MIR640 NCAN NDUFA13 PBX4 SUGP1 TM6SF2 TSSK6 YJEFN3	No data	GWAS-lipid levels
chr2:37,422,072-37,592,628	6.78×10⁻⁹	No	QPCT	C2orf56 CEBPZ PRKD3 SULT6B1 lincRNA	No eQTL
chr5:101,581,848-101,870,822	9.03×10⁻⁹	No	SLCO6A1	lincRNA	No data
chr3:52,215,002-53,175,017	1.16×10⁻⁸	SCZ, BIP	ITIH3	ALAS1 ALDOAP1 BAP1 C3orf78 DNAH1 GLT8D1 GLYCTK GNL3 ITIH1 ITIH4 MIR135A1 MIRLET7G MUSTN1 NEK4 NISCH NT5DC2 PBRM1 PHF7 PPM1M RFT1 SEMA3G SFMBT1 SPCS1 STAB1 TLR9 TMEM110 TNNC1 TWF2 WDR82 lincRNA	No data (ITIH1-ITIH3-ITIH4)	GLYCTK-D-glycericaciduria-MR; RFT1-MR; GWAS-adiponectin, height, waist-hip ratio
chr2:145,139,727-145,214,607	1.19×10⁻⁸	No	ZEB2	-	No eQTL	ZEB2-Mowat-Wilson syn-MR
chr2:200,628,118-201,293,421	1.21×10⁻⁸	No	FONG	C2orf47 C2orf60 C2orf69 SPATS2L TYW5 lincRNA	No data	GWAS-osteoporosis
chr18:52,722,378-52,827,668	1.22×10⁻⁸	No	(ENST00000565991.1, 21kb)	lincRNA (TCF4)	No data
chr2:233,550,961-233,808,241	1.51×10⁻⁸	No	C2orf82	GIGYF2 KCNJ13 NGEF	No data
chr1:243,593,066-244,025,999 chr1:243,418,063-243,627,135	1.80×10⁻⁸ 2.53×10⁻⁸	No Yes	AKT3 SDCCAG8	CEP170	AKT3 SDCCAG8
chr12:123,447,928-123,913,433	2.28×10⁻⁸	No	C12orf65	ABCB9 ARL6IP4 CDK2AP1 MIR4304 MPHOSPH9 OGFOD2 PITPNM2 RILPL2 SBNO1 SETD8 lincRNA	ARL6IP4 CDK2AP1 OGFOD2 SBNO1	C12orf65-MR; GWAS-HDL, height, head size
chr8:89,188,454-89,761,163	3.33×10⁻⁸	SCZ	Intergenic	MMP16 lincRNA	MMP16
chr5:60,484,179-60,843,706	3.78×10⁻⁸	No	ENST00000506902.1	ZSWIM6 C5orf43 lincRNA	C5orf43 ZSWIM6

The prior GWSIG column indicates regions reported to meet genome-wide significance for schizophrenia or bipolar disorder. The first gene column shows the gene with respect to the SNP with the strongest association in the interval. Parentheses indicate that a SNP is not within a gene and show the distance to the nearest gene. The second gene column shows the other named genes in the genomic interval. The eQTL column shows SNP-transcript associations with q < 0.05 in peripheral blood. Underlining indicates eQTLs with the SNP with the strongest association. The disease associations column contains data from the NHGRI GWAS catalog, [24] OMIM, [42] and compilations of genes related to autism [72] and mental retardation. [42,73,74] No data means no Affymetrix U219 probesets or low expression in peripheral blood. Abbreviations: GWSIG=genome-wide significant, SCZ=schizophrenia, BIP=bipolar disorder, AUT=autism, MR=mental retardation.

Themes

We highlight four themes from these results (see also Supplemental Table 9). First, these results implicate calcium signaling in the etiology of schizophrenia. As in prior studies of bipolar disorder and schizophrenia, [17,27,28] we found genome-wide significant support for CACNA1C (Cav1.2, P=5.2×10−12 at the intronic SNP rs1006737). Intriguingly, we identified a novel genome-wide significant association for CACNB2 (P=1.3×10−10 at the intronic SNP rs17691888) which encodes the β2 subunit of L-type calcium channels (Cav β2). A gene-set test supported the involvement of calcium channel subunits in the etiology of schizophrenia (Supplemental Table 7). In L-type calcium channels, the α1c subunit forms the transmembrane pore, and directly interacts with the intracellular β2 subunit. [31] The β2 subunit also antagonizes an endoplasmic reticulum retention motif on the α1c subunit to facilitate transport to the plasma membrane. [32] Additional genes with genome-wide significant evidence were implicated based on membership in a proteomic network centered on Cav2 (reference [33]): the protein products of ACTR1A (α-centractin), the divalent metal cation transporter CNNM2 (P=3.7×10−13, chr10:103,009,986-105,512,924), and CACNB2. A broad genomic region containing the calcium binding protein troponin C (TNNC1) also met genome-wide significance (P=1.1×10−8) as well as three calcium homeostasis modulator genes (CALHM1, CALHM2, and CALHM3 in the same chr10 region as CNNM2). The genetics and biology of calcium channels have been the subject of considerable investigation owing to their importance in fundamental neuronal processes and human diseases. L-type voltage-gated calcium channels are involved in learning, memory, and synaptic plasticity, and CACNA1C knock-out mice show notable deficits in long term potentiation. [34-37] Calcium “channelopathies” include mutations in CACNA1C and CACNB2 that cause Brugada syndrome types 3 and 4 (OMIM #611875 and #611876). [38] In addition, Timothy syndrome (OMIM #601005), caused by mutations in CACNA1C, is a multisystem disorder including cognitive impairment and autism spectrum disorder. [39] Although Mendelian disorders are usually characterized by persistent pathological features, Mendelian calcium channelopathies can have episodic phenomena perhaps reminiscent of the episodic nature of psychotic disorders – for example, intermittent hypoglycemia and hypocalcemia in Timothy syndrome (CACNA1C), episodic ataxia (CACNA1A, CACNB4), migraine (CACNA1A), epilepsy (CACNA1H, CACNB4), periodic paralysis (CACNA1S), and malignant hyperthermia (CACNA1S, CACNA2D1). [31,39] GWAS findings for schizophrenia have converged on genome-wide significant evidence for a calcium channel functional complex that has also been implicated in bipolar disorder and autism. These genomic results support increased attention to this pathway, and suggest hypotheses for clinical translation. Multiple approved medications act at calcium channels including some antipsychotics (e.g., pimozide) along with adjuvants for treatment non-response for schizophrenia and bipolar disorder (e.g., the calcium channel blockers verapamil and nifedipine). It is possible that drugs that act on the protein products of CACNA1C and CACNB2 for a different therapeutic indication could be “re-purposed” for the treatment of schizophrenia. For example, there has been at least one clinical trial of the efficacy of isradipine in bipolar disorder (an approved antihypertensive acting at the protein product of CACNA1C, R Perlis, personal communication). In addition, given that many approved antipsychotics increase the cardiac QT interval, genetic variation in calcium channel genes might identify individuals at higher risk of sudden cardiac death. [40,41]

Second, as reported previously, [14-17] the strongest association (P=9.1×10−14) with schizophrenia is in the extended MHC (chr6:25-34Mb), a region of both exceptional importance and complexity. The MHC comprises 0.3% of the genome but contains 1.5% of the genes in OMIM [42] and 6.4% of genome-wide significant SNP associations in the NHGRI GWAS catalog. [24] It is the second most gene-dense genomic region and has high LD over its extent. We speculate that these features (high gene density and strong LD) combined with the polygenicity of schizophrenia lead to the strong association but will also complicate efforts to identify causal variation. Genome-wide significant associations with schizophrenia extend over 7Mb, but Supplemental Figure 12 suggests that larger samples may resolve this association into sub-regions near TRIM26 (tripartite motif containing 26, chr6:30.1Mb) and the HLA-DRB9 unprocessed pseudogene (chr6:32.4Mb, intergenic HLA-DRA – HLA-DRB5).

Third, multiple genomic lines of evidence support a role for MIR137 in the etiology of schizophrenia. We provide increased support for a common variant association located upstream of the MIR137 transcript (P=1.7×10−12, Supplemental Figure 13). Fourteen genes in the regions in Table 3 have miR-137 target sites predicted by TargetScan (v6.2) [43] (C6orf47, HLA-DQA1, TNXB, VARS, C10orf26, CACNA1C, DPYD, CACNB2, TSSK6, NT5DC2, PITPNM2, SBNO1, ZEB2, and PRKD3). Using gene-set analysis, we evaluated whether genes with predicted miR-137 target sites were enriched for smaller association P values. We confirmed the PGC result [17] and extended the finding by showing more robust enrichment in a far larger set of genes with predicted miR-137 target sites (Supplemental Table 7). In addition, our unpublished work shows enrichment for smaller GWAS P-values in genes down-regulated following over-expression of miR-137 in human neural stem cells (Collins, in preparation). Given the role of miR-137 in fundamental neuronal processes, [44-46] these results support investigation of pathways influenced by miR-137 in regard to a role in the pathogenesis of schizophrenia. The SNP with the strongest association to schizophrenia (rs1198588) is 39kb upstream of MIR137, and might regulate the transcription of MIR137. However, this has not been proven experimentally and there is another candidate gene in the region. rs1198588 is in an LD block that includes DPYD (169kb upstream of rs1198588), and rs1198588 is a significant local expression quantitative trait locus (eQTL) with DPYD. We note that DPYD also contains a predicted miR-137 target site. An exome sequencing study reported two putative functional de novo variants in DPYD in cases with schizophrenia. [11]

Fourth, 13 of the 22 regions in Table 3 contain long intergenic non-coding RNAs (lincRNAs). lincRNAs have multiple known or suspected functions including epigenetic regulation and development. [47] Using pathway analysis, [48] there was modest enrichment (P=0.06) for smaller association P values in a conservative set of lincRNAs derived from sequencing of poly-A RNA from multiple tissues. [47] This observation is consistent with a general role for GWAS findings in the regulation of gene expression rather than alteration of protein sequence. eQTLs [49,50] overlap with SNPs implicated by GWAS over all traits [51-53] as well as for specific traits like height, adiposity, cardiovascular risk factors, chemotherapy-induced cytotoxicity, autism, schizophrenia, and Crohn's disease. [54-61] An estimated 55% of eQTL SNPs lie in DNase I hypersensitivity sites (a marker for open chromatin subject to transcriptional regulation) and 77% of SNPs implicated in GWAS are in or in high LD with SNPs in DNase I hypersensitivity sites. [25,62,63]

Genetic architecture

There has been considerable debate about the genetic architecture of schizophrenia. We estimated the proportion of variance in liability to schizophrenia explained by SNPs using GCTA. [64] Traditional genetic epidemiological studies use the phenotypic resemblance of relatives to estimate the proportion of variance in liability using theoretic resemblance assumptions. GCTA uses genome-wide SNP genotypes to calculate the heritability in the population from the identity-by-state relationships for each pair of individuals. Using the PGC schizophrenia data, we previously estimated the SNP heritability of schizophrenia at 0.23 (SE 0.01) using HapMap3 imputation and assuming a population risk of 0.01. [7] Using the same imputation reference and population risk, SNP heritability was substantially higher in the Swedish samples (0.32, SE 0.03) possibly due to the greater phenotypic and genetic homogeneity in the Swedish sample compared to the PGC samples of mixed European ancestry. We obtained a similar estimate of SNP heritability using 1000 Genomes imputed data (0.33, SE 0.03, population risk 0.01). For a population risk of 0.004, [4,65] SNP heritability was 0.26 (SE 0.02) using HapMap3 and 0.27 (SE 0.02) using 1000 Genomes imputation. Partitioning of the SNP-heritability by minor allele frequency is consistent with 80% of the signal reflecting causal variants with MAF > 0.1 (Supplemental Table 5). To complement the GCTA analyses, we also applied ABPA (approximate Bayesian polygenic analysis) [66] to the Sweden + PGC results. Compared to GCTA, ABPA yielded somewhat larger but generally congruent estimates of variance in liability to schizophrenia using HapMap3 data: 0.43 for population risk of 0.01 (95% credible interval 0.38-0.48) and 0.34 for population risk of 0.004 (95% credible interval 0.31-0.37). The Bayesian framework used by ABPA also allows simultaneous estimation of the number of independent SNP loci that contribute to risk for schizophrenia. 
Here, we assume that the number of genome-wide significant SNP associations and the amount of variance they explain in the Sweden + PGC results reflect only partly the underlying genetic architecture of schizophrenia due to inadequate sample size. Using 1000 Genomes results for Sweden + PGC and assuming population risk of 0.01, we estimated that 8,300 independent SNPs contribute to the genetic basis of schizophrenia and that these SNPs account for 50% of the variance in liability to schizophrenia (95% credible intervals 6,300-10,200 for the number of SNPs and 0.45-0.54 for total variance explained). We stress that these estimates must be interpreted in the context of the assumptions of ABPA and the strengths and weaknesses of the input data. Additional analyses (not shown) indicate that most of the signal was derived from SNPs with allele frequencies > 0.1; low-frequency imputed SNPs were not generally inferred to be associated with schizophrenia. Figure 3 compares ABPA estimates of the genetic architecture of schizophrenia and four biomedical diseases. [66] There are similarities across the estimates for these complex traits as all are relatively highly polygenic, and common SNPs explain substantial proportions of variation. However, these results suggest that the genetic architecture of schizophrenia is left-shifted with greater numbers of SNPs with smaller effects.
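
The dependence of the GCTA estimates on the assumed population risk can be illustrated with a short calculation. The sketch below assumes the standard transformation from the observed (case-control) scale to the liability scale, h2_liability = h2_observed × K²(1−K)² / (z² × P(1−P)), where K is the population risk, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. The function names are illustrative and this is not the GCTA implementation. Because the observed-scale estimate and P are fixed for a given sample, only the K-dependent factor changes when a different population risk is assumed.

```python
from statistics import NormalDist


def liability_factor(k: float) -> float:
    """K-dependent part of the observed-to-liability scale transformation."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - k)  # liability threshold for risk K
    z = std_normal.pdf(threshold)            # normal density at the threshold
    return (k * (1.0 - k)) ** 2 / z ** 2


def rescale_h2(h2_liability: float, k_from: float, k_to: float) -> float:
    """Re-express a liability-scale estimate under a different assumed risk."""
    return h2_liability * liability_factor(k_to) / liability_factor(k_from)


if __name__ == "__main__":
    # Swedish estimates reported at population risk 0.01
    for label, h2 in [("HapMap3", 0.32), ("1000 Genomes", 0.33)]:
        print(label, round(rescale_h2(h2, 0.01, 0.004), 2))
```

Under these assumptions, the Swedish estimates of 0.32 and 0.33 at a population risk of 0.01 rescale to approximately 0.26 and 0.27 at a population risk of 0.004, in line with the values reported above.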

Figure 3

The main figure shows the results of ABPA modeling based on the Sweden + PGC results (population risk 0.01). The x-axis is the estimated number of SNPs on a log10 scale, and the y-axis estimates the total variance in liability explained. The results for five conditions are shown: schizophrenia (this analysis, red) and, for comparison, results from a published analysis of myocardial infarction (MI, purple), type 2 diabetes mellitus (T2D, blue), celiac disease (green), and rheumatoid arthritis (RA, teal). [71] The schizophrenia results are based on 1000 Genomes imputation, and the others on HapMap3 imputation. Color intensity reflects the probability density with darker colors indicating higher density. Contour lines show 50% and 95% credible regions for SCZ, and 95% credible regions for the other diseases. The insets depict estimated SNP distributions for the five disorders: (a) distribution of SNPs in terms of the variance in liability explained per SNP and (b) the estimated distribution of SNP genotypic relative risks (GRR). We again stress that multiple qualifiers are essential in interpreting these estimates.

We previously estimated the heritability of schizophrenia in Sweden to be 0.64 (95% CI 0.617-0.675) using a national pedigree sample of 9.0M individuals, [5] and a Danish national pedigree study of 2.6M individuals reported a similar estimate (0.67, 95% CI 0.65-0.71). [5,67] Using the 1000 Genomes data with population risk of 0.01, the variance in liability estimate from GCTA accounts for 52% of the heritability (0.33/0.64) and ABPA accounts for 78% of the heritability (0.50/0.64). Imprecision is inherent to these estimates and future work or the use of a twin meta-analytic estimate of the heritability of schizophrenia (0.81, 95% CI 0.73-0.90) [6] could revise these estimates downward. However, despite the use of different assumptions and methods, these estimates converge on a crucial qualitative implication: causal variants tagged by common SNPs make substantial contributions to the risk for schizophrenia.
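
The proportions quoted above are simple ratios of the SNP-based variance estimates to a reference heritability. The following minimal sketch (variable names are illustrative) reproduces them and shows how the proportions would shift if the twin meta-analytic estimate of 0.81 were used as the denominator instead of the pedigree estimate of 0.64.

```python
def fraction_of_heritability(snp_variance: float, heritability: float) -> float:
    """Share of the reference heritability captured by a SNP-based estimate."""
    return snp_variance / heritability


PEDIGREE_H2 = 0.64  # Swedish national pedigree estimate
TWIN_H2 = 0.81      # twin meta-analytic estimate
SNP_ESTIMATES = {"GCTA": 0.33, "ABPA": 0.50}  # 1000 Genomes, population risk 0.01

for method, variance in SNP_ESTIMATES.items():
    vs_pedigree = fraction_of_heritability(variance, PEDIGREE_H2)
    vs_twin = fraction_of_heritability(variance, TWIN_H2)
    print(f"{method}: {vs_pedigree:.0%} of pedigree h2, {vs_twin:.0%} of twin h2")
```

With the twin estimate as the denominator, the proportions fall to roughly 41% (GCTA) and 62% (ABPA), which is the downward revision referred to above.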

Conclusions

These results provide deeper insight into the genetic architecture of schizophrenia than ever before. We find support for 22 common variant loci (14 novel) that highlight biological hypotheses for further evaluation. Some findings have immediate translational relevance. Larger studies are highly likely to uncover more common variant associations as argued elsewhere. [8,18,68,69] Common variation is an important (and perhaps predominant) genetic contributor to risk for schizophrenia. We estimated that 6,300-10,200 independent and mostly common SNPs contribute to the etiology of schizophrenia. As one gene or structural element could contain multiple independent associations, the number of genes ultimately determined to harbor causal variation for schizophrenia will be smaller, and we expect that these genes will implicate one or more biological pathways fundamental to disease risk. Moreover, these thousands of independent loci appear to account for a considerable fraction of the heritability of schizophrenia. It is possible that the commonly used phrase “missing heritability” lacks precision. Indeed, if thousands of SNPs underlie schizophrenia, a statistical model containing a handful of SNPs is unlikely to account for more than a small fraction of the heritability. [70] Our results imply that the genetic architecture of schizophrenia is not dominated by uncommon variation. However, a balanced plan of attack should include well-powered searches for rare, private, or de novo genetic variation of strong effect given that such variants are probably more tractable to current molecular methods. Power calculations are a fundamental component of the design of genetic studies. However, relatively extensive knowledge of genetic architecture is essential for power calculations to have maximum utility for study planning. 
We used the ABPA estimates of the posterior distribution of genotypic relative risks (Figure 3) to inform power calculations by estimating the numbers of independent loci that could be detected for different sample sizes (Supplemental Table 6 and Supplemental Figure 8). For example, for 60,000 schizophrenia cases and 60,000 controls, ABPA results project that hundreds of independent SNP loci would reach genome-wide significance (mean of 794 SNPs, 95% credible interval 362-1154 SNPs). Thus, for the first time, we now have a clear path to increased knowledge about the etiology of schizophrenia via application of standard, off-the-shelf genomic technologies for elucidating the effects of common variation. We suggest that a relatively thorough enumeration of the genomic loci conferring risk for schizophrenia (the “parts list”) should be a priority for the field. [8] Identifying all loci would surely be an exercise in diminishing returns. However, we propose a goal for the field: identification of the top 2,000 loci (for example) might be sufficient to reveal, confidently and clearly, the biological processes that mediate risk and protection for schizophrenia. Achievement of this goal would provide a strong empirical impetus for targeted biological and genetic research into the precise molecular basis of risk for schizophrenia, stratification of at-risk populations (e.g., psychotic prodrome), and appropriate cellular measures for evaluation of novel therapeutics. As indicated by our findings, greater knowledge of the genetic basis of schizophrenia can converge on increasingly specific neurobiological hypotheses that can be prioritized for subsequent investigation.

Online Methods

Overview

We present here the pre-planned principal analyses for this project. In order to advance knowledge of schizophrenia, a minority of samples were included in prior reports. Genotyping was conducted in six batches (denoted Sw1-Sw6) with total sample sizes of 464, 694, 1498, 2388, 4461, and 2345. Genotypes were generated as sufficient numbers of samples accumulated from the field work in Sweden. The 2009 International Schizophrenia Consortium report contained GWAS data from the Sw1-2 subjects (N=1158, 9.8% of the sample before quality control). [14] The 2011 PGC schizophrenia paper also contained GWAS data from the Sw1-2 subjects plus ∼80 SNPs from Sw3-4 in the replication phase. [17] The 2012 Bergen et al. paper had a particular focus contrasting schizophrenia with bipolar disorder and reported GWAS results from Sw1-4 (N=5044, 42.6% of the full sample). [75] Thus, of the total sample of 11,850 Swedish subjects before quality control (5,351 cases, 6,509 controls), 57.4% have never been reported previously.

Subjects

All procedures were approved by ethical committees at the Karolinska Institutet and University of North Carolina, and all subjects provided written informed consent (or legal guardian consent and subject assent). Sample collection was from 2005-11. Cases with schizophrenia were identified via the Swedish Hospital Discharge Register [76,77] which captures all public and private inpatient hospitalizations. The register is complete from 1987 and augmented by psychiatric data from 1973-86. The register contains ICD discharge diagnoses [78-80] made by attending physicians for each hospitalization. [81-84] Case inclusion criteria: ≥2 hospitalizations with a discharge diagnosis of schizophrenia, both parents born in Scandinavia, and age ≥18 years. Case exclusion criteria: hospital register diagnosis of any medical or psychiatric disorder mitigating a confident diagnosis of schizophrenia as determined by expert review, and included removal of 3.4% of eligible cases due to the primacy of another psychiatric disorder (0.9%) or a general medical condition (0.3%) or uncertainties in the Hospital Discharge Register (e.g., contiguous admissions with brief total duration, 2.2%). The validity of this case definition of schizophrenia is described at length in the Supplement, and validity is strongly supported by clinical, epidemiological, genetic epidemiological, and genetic evidence. Controls were selected at random from Swedish population registers with the goal of obtaining an appropriate control group and avoiding “super-normal” controls. [85] Control inclusion criteria: never hospitalized for schizophrenia or bipolar disorder (given evidence of genetic overlap with schizophrenia), [5,14,86] both parents born in Scandinavia, and age ≥18 years. Of the potential cases and controls who were alive and contactable, refusal rates were higher for cases than for controls (46.7% versus 41.7%). 
However, these proportions compare favorably with modern refusal rates in epidemiology (59% for cross-sectional and 44% for case-control studies), [87,88] and in a recent large Norwegian longitudinal study (58%). [89] For cases, comorbidity with drug/alcohol abuse or dependence did not predict participation nor did any subtype of schizophrenia (e.g., paranoid or disorganized types). The sample was approximately representative of the Swedish populace in regard to county of birth (Supplemental Figure 4).

Genotyping, quality control, and imputation

DNA was extracted from peripheral blood samples at the Karolinska Institutet Biobank. Samples were genotyped in six batches at the Broad Institute using Affymetrix 5.0 (3.9%), Affymetrix 6.0 (38.6%), and Illumina OmniExpress (57.4%) chips according to the manufacturers' protocols (Supplemental Table 3). Genotype calling, quality control, and imputation were done in four sets corresponding to data from Affymetrix 5.0 (Sw1), Affymetrix 6.0 (Sw2-4), and the OmniExpress batches (Sw5, Sw6). Genotypes were called using Birdsuite (Affymetrix) or BeadStudio (Illumina). The quality control parameters applied were: SNP missingness < 0.05 (before sample removal); subject missingness < 0.02; autosomal heterozygosity deviation; SNP missingness < 0.02 (after sample removal); difference in SNP missingness between cases and controls < 0.02; and deviation from Hardy-Weinberg equilibrium (P < 10⁻⁶ in controls or P < 10⁻¹⁰ in cases). After basic quality control, 77,986 autosomal SNPs directly genotyped on all four GWAS platforms were extracted and pruned to remove SNPs in LD (r² > 0.05) or with minor allele frequency < 0.05, leaving 39,239 SNPs suitable for robust relatedness testing and population structure analysis. Relatedness testing was done with PLINK, [90] and pairs of subjects with pi-hat (π̂) > 0.2 were identified and one member of each relative pair removed at random. Principal component estimation was done with the same collection of SNPs. We tested 20 principal components for phenotype association (using logistic regression with batch indicator variables included as covariates) and evaluated their impact on the genome-wide test statistics using λGC [19] after genome-wide association of the specified principal component, and 11 principal components were included in all association analyses. Genotype imputation was performed using the pre-phasing/imputation stepwise approach implemented in IMPUTE2 / SHAPEIT (chunk size of 3 Mb and default parameters). [91,92]
The imputation reference set consisted of 2,186 phased haplotypes from the full 1000 Genomes Project dataset (March 2012, 40,318,245 variants). Evaluation of λGC led to the removal of SNPs with control allele frequencies < 0.005 or > 0.995, imputation “info” values < 0.2, or that were genotyped only in the smallest sample set (Sw1). Given that male sex is a risk factor for schizophrenia, [93] chromosome X imputation was conducted for subjects passing QC for the autosomal analysis (excluding chrX SNPs with missingness ≥ 0.05 or HWE P < 10⁻⁶ in females). Imputation was performed separately for males and females, and gene dosages were tested for association under an additive logistic regression model using the same covariates as for the autosomal analysis. All genomic locations are given in NCBI build 37/UCSC hg19 coordinates.

Statistical analysis

We first analyzed Swedish cases and controls (N=11,244), and then conducted a meta-analysis with the PGC results for schizophrenia to evaluate our results with respect to the world's literature (N=20,899 after removing 954 subjects from Sw1-2). [17] To maximize comparability, the Swedish samples were run through the same analytical pipeline used for the PGC samples. Association testing was carried out in PLINK using imputed SNP dosages and the principal components described above as covariates. [22] Meta-analysis was conducted using an inverse-weighted fixed effects model. [21] To evaluate the comparability of the Swedish results with those from the PGC schizophrenia study, we used sign tests and risk score profiling based on sets of carefully selected SNPs. [17]
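
The meta-analysis step can be sketched as follows. This is a minimal illustration that assumes the fixed-effects model weights each study's log odds ratio by the inverse of its sampling variance (the usual reading of an inverse-weighted fixed-effects model). The function name and the example effect sizes are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def fixed_effects_meta(betas, ses):
    """Combine per-study log odds ratios using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = 2.0 * NormalDist().cdf(-abs(z))  # two-sided P value
    return beta, se, z, p_value


if __name__ == "__main__":
    # Hypothetical effect estimates for one SNP in two studies
    beta, se, z, p_value = fixed_effects_meta([0.10, 0.08], [0.02, 0.03])
    print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p_value:.2e}")
```

Studies with smaller standard errors receive more weight, so the combined estimate lies closer to the more precisely estimated effect.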

Summarizing regional data using “clumping”

Many GWAS findings implicate an extended region containing multiple significant SNPs. These are not independent associations but result from high LD between associated SNPs. It is useful to summarize these associations in terms of the index SNP with the highest association and other SNPs in high linkage disequilibrium with the index SNP. To summarize GWAS findings, we used PLINK settings that retain SNPs with association P < 0.0001 and r² < 0.2 within 500 kb windows.
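
The logic of clumping can be sketched with a simple greedy procedure. This is an illustrative simplification rather than PLINK's implementation: the `Snp` record, the `clump` function, and the `r2` lookup callable are hypothetical names, and pairwise LD is assumed to be supplied by the caller. The default thresholds mirror the settings described above.

```python
from typing import Callable, Dict, List, NamedTuple


class Snp(NamedTuple):
    name: str
    chrom: str
    pos: int    # base-pair position
    p: float    # association P value


def clump(snps: List[Snp], r2: Callable[[str, str], float],
          p_max: float = 1e-4, r2_max: float = 0.2,
          window_bp: int = 500_000) -> Dict[str, List[str]]:
    """Greedy clumping: map each index SNP to the SNPs it absorbs."""
    # Consider only SNPs passing the P threshold, strongest association first
    candidates = sorted((s for s in snps if s.p < p_max), key=lambda s: s.p)
    assigned = set()
    clumps: Dict[str, List[str]] = {}
    for index in candidates:
        if index.name in assigned:
            continue  # already absorbed by a stronger index SNP
        assigned.add(index.name)
        members = []
        for other in candidates:
            if other.name in assigned:
                continue
            nearby = (other.chrom == index.chrom
                      and abs(other.pos - index.pos) <= window_bp)
            if nearby and r2(index.name, other.name) >= r2_max:
                members.append(other.name)
                assigned.add(other.name)
        clumps[index.name] = members
    return clumps
```

Each key of the returned dictionary is an index SNP, so the retained index SNPs have pairwise r² below the threshold with every stronger index SNP inside the window.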

Sign tests

We used sign tests to compare the overall patterns of results between the Swedish and PGC schizophrenia samples. We used the clumping settings above to derive a filtered set of SNPs. Due to the strong signal and high linkage disequilibrium in the MHC, only one SNP was kept from the extended MHC region. We then determined the number of SNPs whose logistic regression beta coefficient signs were the same between two independent samples. Under the null, the expectation is that 50% of the signs of these SNPs will be the same between two independent sets of results. The significance of the observed proportion was evaluated using the binomial distribution. The significance test was done in two ways: selecting SNPs from Sw1-6 results and evaluating the signs in the independent PGC results, and by reversing the procedure (select from PGC, evaluate signs in Sw1-6). Similar results were obtained selecting SNPs for: (a) P < 1×10⁻⁵, (b) P < 1×10⁻⁶, and (c) keeping one SNP every 3 Mb (effectively removing or greatly minimizing the effects of residual linkage disequilibrium).
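
The binomial evaluation described above amounts to a short calculation. The sketch below assumes a one-sided test (the probability of observing at least the given number of concordant signs when each sign matches with probability 0.5); the function name and the counts used in the example are hypothetical.

```python
from math import comb


def sign_test_p(n_same: int, n_total: int) -> float:
    """One-sided binomial P value for >= n_same concordant signs (null 0.5)."""
    tail = sum(comb(n_total, k) for k in range(n_same, n_total + 1))
    return tail / 2 ** n_total


if __name__ == "__main__":
    # Hypothetical example: 70 of 100 SNPs share the direction of effect
    print(f"{sign_test_p(70, 100):.2e}")
```

Exact integer arithmetic is used for the tail sum, so the calculation stays accurate even when the number of SNPs is large.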

Risk score profiles (RPS)

We used RPS [14] as an alternative and complementary way to compare the overall patterns of results from the PGC schizophrenia analysis (discovery sample) with the independent Swedish results (target sample). We began by selecting high-quality, relatively independent SNPs with unambiguous directions of effects: from the PGC imputed results file, we made a subset of results containing SNPs with allele frequency 0.02-0.98 and imputation INFO scores > 0.9. We then removed SNPs in high LD via clumping (i.e., retain all SNPs with r² < 0.25 within 500 kb windows). For RPS, we wished to evaluate SNP effects across the P value spectrum. Again, due to the strong signal and high linkage disequilibrium in the MHC, only one SNP was kept from the extended MHC region. We used the resulting list from the PGC to calculate schizophrenia risk profile scores in the independent Swedish samples using the --score function in PLINK. We did this 10 times using different subsets of the PGC SNPs selected by increasing P value thresholds. From the set of filtered SNPs from the PGC, we evaluated 10 different association P value thresholds: 0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, and 1.0 (i.e., include all SNPs). For each of these 10 sets of SNPs derived from the PGC, the schizophrenia risk profile score (the number of schizophrenia risk alleles weighted by the logistic regression beta) was calculated for each case and control in Sw1-6. Logistic regression was then used to test whether Swedish cases had significantly different burden of schizophrenia risk alleles in comparison to controls (including ancestry principal components as covariates). 
To estimate the proportion of variance of case-control status in the Swedish samples accounted for by the risk profile score from the PGC, we used the difference in the Nagelkerke pseudo R² contrasting a logistic regression model containing the risk profile score plus ancestry covariates with a logistic regression model containing the covariates alone.
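
The two quantities used in this analysis can be written down compactly. The sketch below is illustrative only: the score is computed as a plain weighted sum of risk-allele dosages (PLINK's --score output may be scaled differently), and the Nagelkerke pseudo R² is computed from model log-likelihoods that are assumed to come from logistic regressions fitted elsewhere. All function and argument names are hypothetical.

```python
import math
from typing import Dict


def risk_profile_score(dosages: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Sum of risk-allele dosages weighted by discovery-sample log odds ratios."""
    return sum(beta * dosages[snp] for snp, beta in weights.items()
               if snp in dosages)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo R2 from intercept-only and fitted log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


def variance_explained_by_score(ll_null: float, ll_covariates: float,
                                ll_full: float, n: int) -> float:
    """Difference in pseudo R2 between the full and covariates-only models."""
    return (nagelkerke_r2(ll_null, ll_full, n)
            - nagelkerke_r2(ll_null, ll_covariates, n))
```

Here `ll_null` is the log-likelihood of an intercept-only model, `ll_covariates` that of the model with ancestry covariates alone, and `ll_full` that of the model with the covariates plus the risk profile score.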

Gene-set analysis

One way to understand polygenic associations for a complex trait is to ask whether the implicated genetic variants lie in genes that comprise a biological pathway. Gene-set analysis includes evaluation of genetic variants in genes that are grouped based on their interacting role in biological pathways (biological pathway analysis) and genes that share similar cellular functions (functional gene-set analysis). We used JAG (Joint Association of Genetic variants, http://ctglab.nl/software) to conduct gene-set analyses. This method has previously been applied to the International Schizophrenia Consortium data by Lips et al. [94] JAG tests for the association of specified gene-sets with schizophrenia as applied to individual-level genotype data, which tends to be more powerful than using summary statistics. JAG constructs a test-statistic for each gene-set. JAG includes both self-contained and competitive tests. These two approaches evaluate different null hypotheses. Statistical significance for each test is determined using permutation. First, the self-contained test evaluates the null hypothesis that a defined set of genes is not associated with schizophrenia while accounting for some of the properties of the SNPs being studied (e.g., LD structure). Second, the competitive test evaluates whether a specific set of genes has evidence for stronger associations with schizophrenia than randomly selected sets of control genes (with the latter matched to the former using the same effective number of SNPs per gene-set). Thus, the null hypothesis of the competitive test is that these genes are not more strongly associated than a similar but randomly-selected set of genes. That is, the comparison is to the average degree of association across genes. The principal comparison is the competitive test, and we present self-contained tests for completeness. 
Competitive gene-set tests are more appropriate for a polygenic disease like schizophrenia because they explicitly prioritize gene-sets that show a greater average degree of association, over and above the polygenic background, rather than prioritizing larger but more weakly-enriched gene-sets (as self-contained tests would tend to do).
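
The idea behind a competitive test can be illustrated with a generic resampling sketch. This is not the JAG algorithm: it assumes a per-gene association score is already available, uses the mean score as the gene-set statistic, and matches random gene sets on the number of genes rather than on the effective number of SNPs. The function and argument names are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List


def competitive_p(gene_scores: Dict[str, float], gene_set: List[str],
                  n_perm: int = 999, seed: int = 0) -> float:
    """Empirical P that random gene sets score at least as high as gene_set."""
    rng = random.Random(seed)
    genes = sorted(gene_scores)  # fixed order keeps sampling reproducible
    observed = mean(gene_scores[g] for g in gene_set)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(genes, len(gene_set))
        if mean(gene_scores[g] for g in random_set) >= observed:
            hits += 1
    # Add-one correction so the empirical P value is never exactly zero
    return (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from other genes in the same data set, a gene-set is only flagged when its association exceeds the polygenic background, which is the property emphasized above.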

Replication

We obtained replication association results from six independent samples totaling 7,452 cases, 20,404 controls, and 581 trios (Supplemental Table 4). These subjects are not included in the Swedish samples or in the PGC mega-analysis. [17] The independent samples were from SGENE+, [16] CLOZUK, [29] the Irish Schizophrenia Genomics Consortium, [95] the Psychosis Endophenotype Consortium, [96] and the Multicenter Family Study. [97] After selecting for P < 1×10⁻⁵ in the Sweden and PGC meta-analysis and accounting for linkage disequilibrium, we requested association results for 194 genomic regions.

COLLABORATOR LIST - Multicenter Genetic Studies of Schizophrenia Consortium

Prof Douglas F Levinson MD	dflev@stanford.edu	Psychiatry and Behavioral Sciences, Stanford University, Stanford, California, USA
Prof Pablo V Gejman MD	pgejman@gmail.com	Psychiatry and Behavioral Sciences, NorthShore University HealthSystem and University of Chicago, Evanston, Illinois, USA
Dr Claudine Laurent MD PhD	claudinelaurent54@yahoo.fr	Child and Adolescent Psychiatry, Pierre and Marie Curie Faculty of Medicine and Brain and Spinal Cord Institute (ICM), Paris, France
Prof Bryan J Mowry MD FRANZCP	b.mowry@uq.edu.au	Psychiatry, Queensland Brain Institute and Queensland Centre for Mental Health Research, University of Queensland; Brisbane, Queensland, Australia
Prof Ann E Pulver PhD	aepulver@jhmi.edu	Psychiatry, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
Prof Sibylle G Schwab PhD	sibylle.schwab@uk-erlangen.de	Psychiatry, Friedrich-Alexander University, Erlangen-Nuremberg, Erlangen, Germany
Prof Dieter B Wildenauer PhD	dieter.wildenauer@uwa.edu.au	Psychiatry and Clinical Neurosciences, Western Australian Institute for Medical Research & Centre for Medical Research, The University of Western Australia, Nedlands, Australia
Dr Frank Dudbridge PhD	frank.dudbridge@lshtm.ac.uk	Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
Dr Jianxin Shi PhD	jianxins@mail.nih.gov	Biostatistics, National Cancer Institute, Bethesda, MD, USA
Prof Margot Albus MD	margot.albus@iak-kmo.de	State Mental Hospital, Haar, Germany
Dr Madeline Alexander PhD	mmga1@comcast.net	Psychiatry and Behavioral Sciences, Stanford University, Stanford, California, USA
Prof Dominique Campion PhD	dominique.campion@univ-rouen.fr	INSERM U614, University of Medicine, Rouen, France
Prof David Cohen MD PhD	dcohen55@noos.fr	Child and Adolescent Psychiatry, Pierre and Marie Curie Faculty of Medicine, Institute for Intelligent Systems and Robotics (ISIR), Paris, France
Prof Dimitris Dikeos MD	ddikeos@med.uoa.gr	First Department of Psychiatry, University of Athens Medical School, Athens, Greece
Dr Jubao Duan PhD	jduan69@gmail.com	Psychiatry and Behavioral Sciences, NorthShore University HealthSystem and University of Chicago; Evanston, Illinois, USA
Prof Peter Eichhammer MD PhD	Peter.Eichhammer@medbo.de	Psychiatry, University of Regensburg, Regensburg, Germany
Stephanie Godard	stephanie.godard-bauche@upmc.fr	Psychiatry and Genetics, INSERM, Institut de Myologie, Hôpital Pitié Salpêtrière, Paris, France
Dr Mark Hansen PhD	mhansen@illumina.com	Illumina, Inc., La Jolla, California, USA
Prof F Bernard Lerer MD	lerer@cc.huji.ac.il	Psychiatry, Hadassah-Hebrew University Medical Center, Jerusalem, Israel
Prof Kung-Yee Liang PhD	kyliang@jhsph.edu	Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA
Prof Wolfgang Maier MD	wolfgang.maier@ukb.uni-bonn.de	Psychiatry, University of Bonn, Bonn, Germany
Prof Jacques Mallet PhD	mallet@chups.jussieu.fr	Centre National de la Recherche Scientifique, Laboratoire de Génétique Moléculaire de la Neurotransmission et des Processus Neurodégénératifs, Hôpital Pitié Salpêtrière, Paris, France
Deborah A Nertney	deb_nertney@qcmhr.uq.edu.au	Psychiatry, Queensland Brain Institute and Queensland Centre for Mental Health Research, University of Queensland, Brisbane, Queensland, Australia
Prof Gerald Nestadt MD	gnestadt@jhmi.edu	Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
Dr Nadine Norton PhD	nortonn@cardiff.ac.uk	Psychological Medicine and Neurology, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, Wales, UK
Prof George N Papadimitriou MD	gnpapad@med.uoa.gr	First Department of Psychiatry, University of Athens Medical School, Athens, Greece
Robert Ribble	rcribble@vcu.edu	Psychiatry, VIPBG, VCU, Richmond, Virginia, USA
Dr Alan R Sanders MD	alan.sanders.md@gmail.com	Psychiatry, NorthShore University HealthSystem and University of Chicago, Evanston, Illinois, USA
Prof Jeremy M Silverman PhD	jeremy.silverman@mssm.edu	Psychiatry, Mount Sinai School of Medicine, New York, NY and VAMC, Bronx, New York, USA
Prof Dermot Walsh MD	dwalsh@hrb.ie	The Health Research Board, Dublin, Ireland
Dr Nigel M Williams PhD	williamsnm@cf.ac.uk	Psychological Medicine, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, Wales, UK
Brandon Wormley	bwormley@hsc.vcu.edu	Psychiatry, VIPBG, VCU, Richmond, Virginia, USA
