
Genome-wide association study identifies 16 genomic regions associated with circulating cytokines at birth.

Yunpeng Wang^1,2,3,4, Ron Nudel^1,2, Michael E Benros^1,5,6, Kristin Skogstrand^1,7, Simon Fishilevich^8, Doron Lancet^8, Jiangming Sun^1,2, David M Hougaard^1,7, Ole A Andreassen^3, Preben Bo Mortensen^1,9, Alfonso Buil^1,2, Thomas F Hansen^1,2,10, Wesley K Thompson^1,2,11, Thomas Werge^1,2,12.

Abstract

Circulating inflammatory markers are essential to human health and disease, and they are often dysregulated or malfunctioning in cancers as well as in cardiovascular, metabolic, immunologic and neuropsychiatric disorders. However, the genetic contribution to the physiological variation of levels of circulating inflammatory markers is largely unknown. Here we report the results of a genome-wide genetic study of blood concentration of ten cytokines, including the hitherto unexplored calcium-binding protein (S100B). The study leverages a unique sample of neonatal blood spots from 9,459 Danish subjects from the iPSYCH initiative. We estimate the SNP-heritability of marker levels as ranging from essentially zero for Erythropoietin (EPO) up to 73% for S100B. We identify and replicate 16 associated genomic regions (p < 5 x 10-9), of which four are novel. We show that the associated variants map to enhancer elements, suggesting a possible transcriptional effect of genomic variants on the cytokine levels. The identification of the genetic architecture underlying the basic levels of cytokines is likely to prompt studies investigating the relationship between cytokines and complex disease. Our results also suggest that the genetic architecture of cytokines is stable from neonatal to adult life.


Year: 2020 PMID: 33227023 PMCID: PMC7721185 DOI: 10.1371/journal.pgen.1009163

Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917

Introduction

Circulating inflammatory markers are essential to human health and disease [1]. An important group of small circulating proteins are cytokines. These have important roles in cell signaling in general and in modulating immune function in particular, including inducing and reducing inflammation [2]. Circulating inflammatory cytokines have been implicated in many classes of diseases, including cancers [3], cardiovascular diseases [4], metabolic diseases [5], autoimmune diseases [6] and neuropsychiatric disorders [7]. Their utility goes beyond their explanatory power in disease mechanism: because measuring their blood levels is a simple procedure, they can also serve as diagnostic and predictive markers. For example, they can be used as indicators for obesity and early cancer risk factors [8]. In addition to their involvement in disease, cytokines are also involved in physiological functions, e.g. pain [9], and in mental, cognitive or brain function [10]. The latter point might be extremely important given emerging evidence for the links between immune function and psychiatric disorders [11-13]. Despite their relevance to disease mechanism and their diagnostic power, only a few studies have examined the genetic architecture of circulating inflammatory markers [14-17]. Furthermore, previous studies have mainly used adult samples; thus, it is unclear whether the genetic control of inflammatory markers varies across age groups. Here, we estimate the genetic contribution to variation in circulating cytokine marker levels at birth for: interleukin 8 and 18 (IL8 and IL18), monocyte chemoattractant protein 1 (MCP1, also known as CCL2), thymus and activation regulated chemokine (TARC, also known as CCL17), erythropoietin (EPO), immunoglobulin A (IgA), C-reactive protein (CRP), brain-derived neurotrophic factor (BDNF), vascular endothelial growth factor A (VEGFA) and S100 calcium-binding protein B (S100B). Of note, the genetics of S100B has not been previously studied.

Results

We use data from 12,000 neonatal blood spots collected as part of the Danish iPSYCH Initiative[18], in which the concentrations of ten cytokines were measured using a two-step design with a discovery sample (N = 10,000) and a replication sample (N = 2,000). Five of the ten markers, i.e. BDNF, IL8, IL18, MCP1 and S100B, were measured in both samples. Both discovery and replication samples included subjects tested at birth who later in life had at least one inpatient or outpatient hospital discharge code involving one or more of six psychiatric disorders: schizophrenia, bipolar disorder, depression, autism, attention-deficit/hyperactivity disorder and anorexia[18], as well as a random population sample. Genome-wide genotyping of DNA extracted from neonatal blood spots was accomplished using the Infinium PsychChip v1.0 array in 23 waves (for detailed protocol see Pedersen et al.[18]) and used to impute ~9 million 1000 Genomes Project Phase 3 SNPs. We performed two rounds of strict quality control to remove possible technical artifacts within each wave and across waves, respectively (Materials and Methods). We inferred the ancestry of each subject using both national birth register data and genomic principal component (PC) analysis. Non-Danish subjects were subsequently removed before the genetic association analyses. In total, 8,318 and 1,141 subjects were used in the discovery and replication analyses, respectively (Materials and Methods). Marker levels were log-transformed and age-residualized using a generalized additive model with 5 degrees of freedom (hereafter normalized, Materials and Methods). As expected, we observed both positive and negative correlations among the measured marker levels; correlation coefficients range from -0.06 to 0.43, with positive correlations in the majority of cases (S33 Fig). This suggests a complex regulation mechanism for immune responses.
We first estimated the proportions of the variance of marker levels accounted for by genetic variants (h2SNP) using restricted maximum likelihood[19]. S100B shows the highest h2SNP (0.73), while EPO has h2SNP ≈ 0 (Fig 1A). SNP-heritabilities for the remaining eight markers range from 0.08 (BDNF) to 0.21 (IL18 and VEGFA). For each marker, h2SNP was partitioned across autosomes, revealing that SNP-heritabilities of S100B, CRP, IL18 and IgA predominantly stem from the chromosomes where their coding genes are located (Fig 1B), suggesting strong cis-regulatory mechanisms. In contrast, the analyses suggest dispersed, polygenic trans-regulation for IL8, MCP1 and TARC (Fig 1B).

Fig 1

SNP heritability of circulating protein levels.


The variation of circulating marker levels captured by a. all the genotyped SNPs; b. SNPs on each autosome; and c. polygenic scores computed from an independent sample is shown. Cytokine SNP-heritabilities are shown in a as point estimates with standard errors. These point estimates are also shown in parentheses following the cytokine names in b. The Pearson’s correlation coefficients between polygenic scores and measured protein levels in the discovery sample are stratified by different p value thresholds (pT) of association in the discovery sample (S1, P<1x10-6; S2, P<1x10-5; S3, P<1x10-4; S4, P<0.001; S5, P<0.01; S6, P<0.1; S7, P<0.5; S8, P<1.0). The effect sizes used to compute polygenic scores are derived from Ahola-Olli et al.[17].

We correlated blood marker levels with polygenic risk scores (PGRSs) constructed from effect size estimates reported in a previous, independent study[17]. As shown in Fig 1C, moderately high correlations were observed for VEGFA, IL18 and MCP1 (r = 0.31, p<10−16; r = 0.18, p<10−16; r = 0.19, p<10−16; respectively, using SNPs with association p<10−6 from the independent study), whereas IL8 blood levels and PGRSs are only marginally correlated.

We performed a genome-wide association study for each cytokine using a multiple linear regression model, including the first 6 principal components (PCs), diagnosis of any of the six disorders, genotyping wave indicators and sex as covariates (Materials and Methods). The same model was separately applied to both discovery and replication samples. We did not observe inflation in the resulting association statistics (lambda: min = 0.99, max = 1.03; S1–S10 Figs). Except for BDNF and IL8, we observed a large number of genetic variants significantly associated (P<5x10-9) with the cytokine markers, ranging from 131 for EPO to 3,941 for S100B.
Extreme p values (P < 10−100) are especially common for IL18, S100B and VEGFA (Fig 2A) in line with analyses showing strong cis-regulatory mechanisms for these markers (Fig 1B). As shown in Fig 2B, common variants (minor allele frequency: MAF>20%) make up 56% of all significant variants.
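As a rough illustration of the inflation check mentioned above, the following sketch computes a genomic inflation factor (lambda) from a list of two-sided association p values. It uses the common definition (median observed 1-df chi-square statistic divided by the expected median), which is an assumption here because the text does not state the exact formula used; the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return lambda: median observed chi-square (1 df) over its expected median.

    Assumes every p value is two-sided and lies in (0, 1].
    """
    nd = NormalDist()
    chi2 = []
    for p in p_values:
        # Two-sided p value -> z score (lower tail) -> chi-square with 1 df.
        z = nd.inv_cdf(p / 2.0)
        chi2.append(z * z)
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spread (null-like) p values should give a lambda close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values close to 1, such as the reported range of 0.99 to 1.03, indicate little systematic inflation of the test statistics.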

Fig 2

Distribution of association statistics for inflammation marker levels.

a. The empirical cumulative distribution function of the log10(-log10(P)) for the association of SNPs (P<5x10-9) with each inflammation marker. Colors indicate different markers. b. Distribution of SNPs (P<1x10-9) in different minor allele frequency (MAF) bins is shown for each marker. Colors indicate different MAF intervals. Numbers in the figure show the proportion of SNPs in the region.

We clumped the association signals into 20 independent loci (16 unique loci) indexed by at least one significant SNP (P<5 x 10−9) (Materials and Methods, S11–S30 Figs), associated with one or more markers (Table 1). Out of the 20 associations, four are novel and confirmed in the replication study (P<0.0036, Table 1). The first novel association, with IL18, is at 19p13.2, indexed by rs56195122 (P = 2.4x10-13, MAF = 0.03, replication P = 6.59x10-4, S17 Fig). The SNP rs56195122 is in the first intron of the synaptonemal complex central element protein 2 gene (SYCE2), which is associated with several blood metabolite levels[21] and blood cell related traits[22, 23]. The second locus is associated with MCP1, indexed by rs4493469 (P = 1.62x10-16, MAF = 0.1, replication P = 2.0x10-3, S20 Fig), 27kb upstream of the C-C chemokine receptor type 3 gene (CCR3).

Table 1

Genome wide significant associations with blood inflammatory marker levels.

Marker	SNP	Chr	Region	Pos	A1/A2	Beta	Se	P	VE	INFO	Freq	P repl	Gene
CRP	rs3091244	1	1q23.2	159684665	A/G	0.33	1.89E-2	7.47E-68	5.2E-2	1.01	0.32		CRP
	rs112635299	14	14q32.13	94838142	T/G	-0.41	5.61E-2	3.31E-13	9.02E-3	0.95	0.03		SERPINA1
EPO	rs1130864	1	1q32.2	159683091	A/G	0.11	0.01	4.24E-23	1.02E-2	1.01	0.32		CRP
IgA	rs3094087	6	6p21.33	31061561	T/C	0.10	0.01	1.83E-10	3.12E-3	1.03	0.15		HLA
IL18	rs10891329	11	11q23.1	112009892	T/C	0.32	7.4E-3	1E-300	4.0E-2	0.95	0.32	3.72E-37	IL18
	rs10891268	11	11q23.1	111301044	A/G	6.1E-2	9E-3	1.21E-11	1.26E-3	0.98	0.22	4.46E-4	POU2AF1
	rs56195122	19	19p13.2	13020506	A/G	-0.15	2.08E-2	2.4E-13	1.22E-3	0.99	0.03	6.59E-4	SYCE2
	rs9402686	6	6q23.3	135427817	A/G	5.65E-2	8.3E-3	1.51E-11	1.22E-3	1.0	0.26	0.58	HBS1L
MCP1	rs12075	1	1q23.2	159175354	A/G	0.11	5.1E-3	1.12E-92	5.47E-3	0.98	0.44	1.84E-8	ACKR1
	rs4493469	3	3p21.31	46177992	T/C	-7.01E-2	8.5E-3	1.62E-16	9.18E-4	0.97	0.10	2.0E-3	CCR3
	rs2228467	3	3p22.1	42906116	T/C	-8.19E-2	1.01E-2	6.22E-16	8.35E-4	1.02	0.07	1.24E-5	ACKR2
	rs60200069	10	10q22.1	73503994	T/G	-3.74E-2	5.2E-3	7.82E-13	6.86E-4	0.98	0.43	8.67E-2	CDH23
S100B	rs62224256	21	21q22.3	47887095	A/G	-0.61	1.31E-2	1E-300	0.18	0.98	0.49	9.85E-38	PCNT
	rs28397289	6	6p21.33	31197407	T/C	0.15	1.72E-2	7.67E-19	8.46E-3	0.98	0.24	5.83E-5 (rs4713462)	HLA
TARC	rs115952894	3	3p24.3	16950359	A/G	0.54	2.46E-2	1E-104	2.67E-2	1.0	0.05		PLCL2
	rs2228467	3	3p22.1	42906116	T/C	-0.41	2.09E-2	1.84E-82	2.05E-2	1.02	0.07	^b4.23E-11	ACKR2
	rs10886430	10	10q26.11	121010256	A/G	0.33	1.80E-2	1.23E-75	2.16E-2	0.89	0.11		GRK5
	rs223896	16	16q21	57443146	A/G	-0.12	1.08E-2	2.91E-29	7.10E-3	1.02	0.41	^b1.3E-9	CCL17
VEGFA	rs7767396	6	6q21.1	43927050	A/G	0.22	6.2E-3	3.31E-253	2.36E-2	0.99	0.47	^a3.67E-171(^c4.85E-1284)	intergenic
	rs11789392	9	9p24.2	2694914	T/C	0.13	7.0E-3	1.22E-73	8.09E-3	0.87	0.44	^a4.91E-5	intergenic

a. from Ahola-Olli et al[17]

b. from Suhre et al[16]

c. from choi et al[20].

Genome wide significant SNPs were clumped into 20 independent regions. Information about each region includes the leading SNP (SNP), chromosome (Chr), cytoband (Region), genomic position (Pos, hg19), effect allele (A1), alternative allele (A2), effect size (Beta), standard error (Se), association p value (P), proportion of marker level variance accounted for by the leading SNP (VE), imputation quality score (INFO), association p value in the replication sample or previously reported studies (P repl) and the gene closest to the leading SNP (Gene).

We discovered two novel regions associated with S100B levels in blood (Table 1). The first region at 21q22.3 is indexed by rs62224256 (P<1x10-300, MAF = 0.49, replication P = 9.58x10-38, S23 Fig) located 21kb downstream of the pericentrin gene (PCNT), which encodes a calmodulin-binding protein. Remarkably, the leading SNP accounts for 18% of the variation of S100B level in blood in the discovery sample (Table 1). The A allele of the top SNP rs62224256 is associated with reduced levels corresponding to 0.32 standard deviations (SD = 0.02, P<2x10-16) and explains 14% of S100B variation in the replication sample (Fig 3A). We also discovered that the human leukocyte antigen (HLA) region (build hg19, chr6: 28,477,797–33,448,354) is associated with the variation of circulating S100B, led by rs28397289 (P = 7.67x10-19, MAF = 0.24, S24 Fig). The association of the HLA region with S100B in the replication sample is indexed by another SNP, rs4713462 (replication P = 5.83x10-5, MAF = 0.30).

Fig 3

Prediction of inflammation marker levels by genetic variants.

Additionally, 14 of the 20 loci replicated previously-reported associations (S1 Text and S10 Table).


a. The distribution of the normalized S100B level in the replication sample is shown for the three genotype groups of rs62224256 (0: AA; 1: AG; 2: GG). A simple linear regression line (red) is added to show the trend. b. The Pearson’s correlation coefficients between polygenic scores and normalized S100B level in the replication sample are stratified by different p value thresholds (pT) of association in the discovery sample (S1, P<1x10-6; S2, P<1x10-5; S3, P<1x10-4; S4, P<0.001; S5, P<0.01; S6, P<0.1; S7, P<0.5; S8, P<1.0). Standard errors are shown by the error bars. Stars indicate significant correlations (P<0.00125 = 0.05/40). c. A scatter plot shows the predicted S100B level (normalized, fitted straight line) in the replication sample by SNPs with P<1x10-6 in the discovery sample.

We constructed PGRSs for BDNF, IL18, IL8, MCP1 and S100B, which were measured in both samples, for the replication analysis using the effect estimates from the discovery association. Fig 3B shows Pearson’s correlations between the PGRSs and the corresponding normalized marker levels stratified by different “discovery association strength”. The PGRSs based on SNPs with P<10−6 (S1) were correlated most strongly across all markers except IL8. In contrast, PGRSs constructed with all SNPs (S8), i.e. P<1.0, show no significant correlation except for S100B. The PGRS constructed with SNPs with P<10−6 accounts for 21% of S100B variation in the replication sample (Fig 3C), and the correlation between PGRS and S100B levels is ~0.5. For comparison, an analysis based on a previously-studied discovery sample[17] is shown in S31 and S32 Figs. The observed low correlations between PGRS and IL8 levels (Figs 1C, 3B, S31 and S32 Figs) can be partially explained by the low SNP heritability for IL8 estimated in our sample (Fig 1A).
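The thresholded polygenic scores described above can be sketched as follows: for one subject, effect estimates are multiplied by allele dosages and summed over SNPs whose discovery p value falls below the threshold pT, and the scores are then correlated with marker levels. This is a minimal sketch with toy numbers; the function names and the input layout are assumptions, and it omits steps a real pipeline would need (for example LD handling and allele alignment).

```python
from math import sqrt


def polygenic_score(dosages, betas, p_values, p_threshold):
    """Sum beta * dosage over SNPs with discovery p value below p_threshold."""
    return sum(
        beta * dosage
        for dosage, beta, p in zip(dosages, betas, p_values)
        if p < p_threshold
    )


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy example: three SNPs, one subject (all values are made up).
betas = [0.5, -0.2, 0.1]
p_values = [1e-8, 1e-3, 0.5]
dosages = [2, 1, 0]
strict = polygenic_score(dosages, betas, p_values, 1e-6)   # only the first SNP
lenient = polygenic_score(dosages, betas, p_values, 1.0)   # all three SNPs
```

Computing `pearson_r` between per-subject scores and normalized marker levels at each threshold would give the kind of stratified correlations shown in Fig 3B.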
On the other hand, the fact that the most significant correlations for the other cytokines were achieved by the PGRS with the lowest p-value threshold indicates that the genetic architecture of cytokines may be less polygenic than that of other human complex traits.

The associated loci contain a large number of genome-wide significant SNPs (P<5x10-9, Fig 2A), making it challenging to infer the causal variants for follow-up experimental studies. We performed Bayesian statistical fine-mapping on each associated region[24] (Materials and Methods). For each associated region, we inferred the most probable causal configuration (causal set) assuming at most 3 causal variants per region (Materials and Methods). As shown in Table 2, eleven causal sets include their corresponding leading SNPs, among which 3 are one-variant sets. Nonetheless, 9 causal sets do not contain their corresponding leading SNPs, indicating that the top association signals may be driven by the allelic combination of SNPs in the causal sets (Table 2). Re-analysis assuming at most six causal variants per region did not change the results (S9 Table).

Table 2

Fine mapping of associated regions.

Marker	Leading SNP	log₁₀ (BFc)	SNP	log₁₀ (BF)	PIP	R²	P	P repl	Gene	Enhancer ID	Enh Gene
CRP	rs3091244	93.24	rs3091244	2.56	0.17	1.0	7.47E-68		CRP	GH01G159751 (rs4131568,R² = 0.82; rs12094103,R² = 0.79)	AIM2,CRP, FCRL6,RPL27P2,DUSP23
			rs376195567	2.75	0.24	0.05	3.66E-24		CRP
			rs3093059	2.38	0.11	0.02	2.26E-14		CRP
	rs112635299	10.31	rs112635299	3.6	0.63	1.0	3.31E-13		SERPINA1
EPO	rs1130864	27.74	rs1130864	2.59	0.18	1.0	4.24E-23		CRP	GH01G159751 (rs4131568,R² = 0.83; rs12094103,R² = 0.83)	AIM2,CRP, FCRL6,RPL27P2,DUSP23
			rs16842568	2.28	0.10	0.02	2.06E-6		CRPP1
IL18	rs10891329	362.57	rs10891325	3.08	0.55	0.79	6.19E-263	1.26E-23	SDHD
			rs11214126	9.12	1.0	0.26	2.03E-98	4.84E-14	BCO2
			rs10891343	4.74	0.98	0.55	7.23E-287	8.58E-27	BCO2
	rs10891268	21.25	rs10444327	2.65	0.24	0.0	1.32E-5	0.25	POU2AF1
			rs117369151	2.53	0.29	0.05	1.62E-5	0.16	SIK2
			rs79958943	3.70	0.78	0.12	4.49E-15	2.68E-3	SIK2	GH19G012880
	rs56195122	10.44	rs56195122	2.76	0.28	1.0	2.4E-13	6.59E-4	SYCE2	GH19G012890 (rs2072596,R² = 0.91);GH11G111658 (rs3745647,R² = 0.90)	GCDH,PRS6P25,ZNF709,ZNF136,ZNF788,SIK2,BGT4,C11orf88,MIR34B, MIR34C
	rs9402686	12.10	rs9402686	2.18	0.10	1.0	1.51E-11	0.58	HBS1L
			rs56293029	2.09	0.09	0.93	2.39E-11	0.66	HBS1L
MCP1	rs12075	119.06	rs12075	4.87	0.97	1.0	1.12E-92	1.84E-8	ACKR1
			rs13962	4.64	0.96	0.17	0.013	0.11	ACKR1
			rs72698561	3.55	0.65	0.04	7.85E-17	0.14	CRPP1
	rs4493469	30.68	rs6441947	2.91	0.31	0.03	0.01	0.61	CCR3
			rs11923627	2.40	0.11	0.68	2.73E-15	2.49E-2 (^a5.90E-4)	CCR3
			rs12495098	2.48	0.14	0.01	1.46E-10	1.81E-4 (^a1.54E-20)	CCR3	GH03G046297	CCR2, CCR5,CCR1,CCRL2,TDGF1,LRRC2,FYCO1
	rs2228467	17.78	rs2228467	6.94	1.0	1.0	6.22E-16	1.23E-5 (^a9.19E-20)	ACKR2
	rs60200069	14.15	rs10823838	2.85	0.27	1.0	8.33E-13	7.90E-2	CDH23	GH10G071740	,CCR2,PSAP, DNAJB12
			rs3747858	2.76	0.23	0.03	5.16E-9	0.28	CDH23	GH10G071745	VSIR,CDH23
S100B	rs62224256	657.11	rs11910707	13.25	1.0	0.10	1.26E-205	1.66E-27	PRMT2	GH21G046620	PRMT2, S100B, DIP2A, SPATC1L
			rs55912899	5.22	0.99	0.09	5.89E-129	1.7E-6	PRMT2
			rs2839314	4.07	0.96	0.13	1.99E-240	4.30E-19	DIP2A	GH21G046541	S100B,MCM3AP, SPATC1L,DIP2A, RNU6
TARC	rs115952894	177.69	rs115952894	4.34	0.92	1.0	1E-104		PLCL2
			rs76472873	3.0	0.36	0.0	2.21E-57		PLCL2	GH03G016916	MIR3713, PLCL2
			rs369616361	3.78	0.77	0.14	0.013		PLCL2
	rs2228467	91.10	rs2228467	7.85	1.0	1.0	1.84E-82	^b4.2E-11	ACKR2
			rs115667394	2.76	0.31	0.0	0.02		VIPR1
			rs1427803	2.13	0.10	0.03	4.59E-26		ACKR2
	rs10886430	73.26	rs10886430	13.76	1.0	1.0	1.23E-75		GRK5	GH10G119249	GC10P119246, LOC105378511
			rs10886437	3.45	0.60	0.65	5.05E-51		GRK5
	rs223896	53.73	rs4396523	2.94	0.30	0.11	2.30E-23		CCL17	GH16G057409	CCL17, CIAPIN1, DOK4
			rs223897	2.92	0.29	0.53	3.51E-19		CCL17	GH16G057409	CCL17, CIAPIN1, DOK4
			rs34379253	4.09	0.86	0.03	4.38E-13		CCL17	GH16G057409	CCL17, CIAPIN1, DOK4
VEGFA	rs7767396	278.31	rs9369421	3.70	0.75	0.0	0.001	^a1.70E-2	intergenic	GH06G043953	GC06M043993,LOC105375067
			rs73422214	12.65	1.0	0.07	1.70E-29	^a1.16E-13	intergenic	GH06G043953	GC06M043993,LOC105375067
			rs4481426	3.74	0.77	0.77	4.95E-194	^a1.24E-127 (^c5.25E-1060)	intergenic	GH06G043953	GC06M043993,LOC105375067
	rs11789392	132.60	rs11789392	5.12	0.98	1.0	1.22E-73	^a4.91E-5	intergenic
			rs2219143	4.97	0.97	0.0	10.8E-58	^a3.0E-4	VLDLR	GH09G002620	VLDLR, PIR48978
			rs10812148	3.15	0.38	0.01	4.33E-16	^a0.115	VLDLR-AS1

a. from Ahola-Olli et al[17]

b. from Suhre et al[16]

c. from choi et al[20].

The FINEMAP program was applied to each associated region (500kb left and right of the leading SNP) in Table 1, assuming each region contains at most three causal variants. Abbreviations used: log10(BFc): common logarithm of the Bayes factor for the inferred most probable causal configuration; log10(BF): common logarithm of the Bayes factor for the SNP being in the causal set; PIP: posterior inclusion probability in the causal set; R2: LD r-square of the SNP with the leading SNP in the corresponding region; P: association p value in the discovery sample; P repl: association p value in the replication sample or previous studies; Gene: closest gene to the corresponding SNP; Enh gene: inferred genes regulated by the corresponding enhancer.

Most of the identified genetic variants are located outside of protein-coding regions. We integrated associated loci with public epigenomic datasets[25-28] to infer plausible regulatory mechanisms. Eighteen of the 50 identified leading SNPs implicated by both association and fine-mapping analyses are located in enhancers from GeneHancer, the GeneCards Suite[29] database of human enhancers and their associated genes (Materials and Methods; Table 2 and S2–S8 Tables). We also tested whether cytokine-associated SNPs were enriched in DNase hypersensitive sites, histone modification sites and chromatin states. However, after correcting for multiple testing, no significant enrichment was observed (S34–S36 Figs). In Fig 4, we illustrate the annotation using the 21q22.3 region, indexed by rs62224256, associated with S100B level. The SNPs rs11910707 (P = 1.26x10-205, replication P = 1.66x10-27, log10BF = 13.25, 12kb upstream of PRMT2) and rs2839314 (P = 1.99x10-240, replication P = 4.30x10-19, log10BF = 4.1, 22kb upstream of DIP2A) are the most probable causal variants.
The rs11910707 SNP overlaps with the elite enhancer (Materials and Methods) GH21G046620, and rs2839314 overlaps with GH21G046541. Both enhancers modulate the transcription of the S100B gene (the former through a double-elite association). Moreover, among the genes regulated by at least one of these enhancers are PRMT2, DIP2A, and SPATC1L (Table 2). Thus, the strongest association signal, at rs62224256, is highly likely to be a proxy for the two causal SNPs. As such, the closest gene, PCNT, may or may not play a role in the regulation of circulating S100B.

Fig 4

Annotation of the region indexed by rs62224256 associated with S100B.

The top panel shows the regional plot. P values below 1x10-100 were censored at 1x10-100 for clarity of illustration. Genes located in this region are shown in the middle panel. The sub-region containing rs62224256 is shown approximately zoomed in. Two enhancers are represented by the black and red bars. Genes regulated by the enhancers are underscored by a red line, and by a shaded bar when they are regulated by both enhancers. The log10 Bayes factor (LBF), posterior inclusion probability (PIP) of being included in the causal set and association p value (P) scales are shown in the same order as the SNP rs-numbers. The genomic coordinates (build hg19) of SNPs and enhancers are shown in the lower-left panel.


Discussion

In this study, we investigate the genetic architecture of ten cytokines in whole blood at birth, in a sample of 12,000 individuals, the largest study so far. Our results highlight an important role for regulatory elements in determining levels of circulating inflammatory markers. Importantly, we robustly replicate our findings in an in-house replication sample and by using data from other studies[16, 17]. The latter studies, in contrast to the current study, were based on adult samples, and, therefore, our results suggest that the genetic architecture of cytokines is stable from neonatal to adult life.

Inflammation and conditions associated with it, such as infections and autoimmune diseases, have been implicated in a number of disorders and medical conditions[1], including mental disorders[7, 11, 13]. In the context of the latter type of disorders, studies such as ours could be of great utility; while it has been known for a long time that mental disorders have strong genetic etiologies [30], when it comes to reliable accounts of disease mechanism, our current understanding is very limited compared to not only monogenic disorders, but also other complex disorders such as autoimmune disorders [31, 32]. This is not necessarily due to a lack of significant genetic associations, e.g. for schizophrenia [33]; rather, it could also stem from the difficulty in defining the psychiatric traits. In this respect, leveraging the results of studies such as ours could be useful both for diagnosis and as a future avenue for research; given the links between inflammation, immunity and mental illness, and the properties of some of the inflammatory markers studied here, it could be envisaged that these markers could be used in a way similar to how endophenotypes could be used in psychiatry[34, 35].
Moreover, the intricate genetic architecture identified in this study, which highlights gene regulation, could be informative to molecular studies of psychiatric diseases and other types of diseases. For example, it is likely to prompt studies using e.g. Mendelian randomization[36] to investigate the relationship between inflammatory markers and complex disease.

The main strengths of our study are the large number of markers included, the large sample size, and the replication sample (S37 Fig). The postnatal sampling on days 5–7 renders our findings relatively independent of the child’s behavior and natural environment, which could be considered a major strength. However, it should be noted that the marker levels may, in some cases, be influenced by perinatal complications, diseases and medication administered to the child, as well as by the smoking habits, alcohol consumption, diet, weight and other general life conditions of the mother. Certain peptides, e.g. antibodies, cross the placenta, and neonatal levels in the child therefore reflect those of the mother at birth, which may reduce the power of the study and account for the low heritability of IL8 and BDNF. A possible source of noise in the levels of inflammatory markers is that measurements come from dried, whole blood samples that may not precisely correspond to concentrations measured in plasma or serum in practice. However, our replication of findings from adult samples suggests that these putative biases do not present a serious limitation to the study.

In conclusion, our study sheds some light on the complex genetic architecture of inflammatory markers and highlights the important role of regulatory elements therein. We also show that the mechanisms involved are relatively stable throughout life, by comparing our results to those of studies which used adult samples.
We hope that these results will prompt future studies looking into the links between inflammation and complex diseases and, in particular, that they will contribute to investigations into the mechanisms of mental illness, which have proven difficult to explain from a molecular perspective.

Materials and methods

Sample

The sample was based on complete and consecutive birth cohorts of all singletons born in Denmark between May 1, 1981 and December 31, 2005. Only individuals who were residents in Denmark on their first birthday and who have a known mother (N = 1,536,309) were included. From this group, 78,000 subjects were genotyped in 23 waves by the Broad Institute using the PsychChip version 1. For the discovery sample, 10,000 subjects were randomly selected from the 23 waves of the iPSYCH initiative[18]. For the replication sample 2,000 subjects were chosen from the second wave, excluding the discovery sample (for detailed description of samples see S1 Table).

Cytokine level measurements

The 2,000 samples for the replication analysis were measured using Luminex technology as described by Skogstrand et al.[37, 38]. The second set, the 10,000 samples used for the discovery study, was measured using Meso-Scale technology as described in Skogstrand et al.[39]. Briefly, dried blood spot samples were punched as 3.2 mm disks into PCR plates (Sarstedt, 72.1981.202). 130 μl of extraction buffer (PBS containing 1% BSA and 0.5% Tween-20) was added to each well, and the samples were extracted for 1 hour at room temperature on a microwell shaker set at 900 rpm. The extracts were manually moved to sterile Matrix 2D tubes (Thermo Scientific, 3232) and frozen at -80°C. One (Luminex) or two (Meso-Scale) years later, samples were thawed and analyzed using either in-house Luminex assays or Meso-Scale plates custom-printed for the project. Analyte concentrations were calculated from the calibrator curves on each plate using 5PL (Luminex) or 4PL (Meso-Scale) logistic regression. Analytes falling below the lowest concentration within the working range were assigned that value. The measured levels were first inspected for potential outliers using scatter plots. Then, each marker level was log-transformed and age-residualized using a generalized additive model with 5 degrees of freedom, using the R function ‘gam’. The resulting data were further checked for normality and outliers.
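To make the calibration step more concrete, here is a minimal sketch of how a concentration could be read back from a four-parameter logistic (4PL) calibrator curve and floored at the lowest concentration of the working range. The parameterization, function names and numbers are assumptions for illustration; the fitted curve parameters are taken as given, and the 5PL variant used for Luminex is not shown.

```python
def four_pl(x, a, b, c, d):
    """4PL curve: signal at concentration x.

    a = response at zero concentration, d = response at saturation,
    c = inflection point, b = slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives signal y; y must lie strictly between a and d."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


def concentration(signal, params, lowest_in_range):
    """Back-calculate a concentration and floor it at the working-range minimum."""
    x = inverse_four_pl(signal, *params)
    return max(x, lowest_in_range)


# Hypothetical curve parameters (a, b, c, d) for one plate.
params = (0.0, 1.5, 10.0, 100.0)
signal = four_pl(3.0, *params)
recovered = concentration(signal, params, lowest_in_range=1.0)
floored = concentration(four_pl(0.5, *params), params, lowest_in_range=1.0)
```

Here `recovered` returns the original concentration of 3.0, while `floored` is set to the working-range minimum because the underlying concentration (0.5) falls below it.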

Quality control and imputation

Quality control and imputation were performed for each wave separately. The quality control parameters for retaining SNPs and subjects were: SNP missingness ≤ 0.05 (before sample removal); subject missingness ≤ 0.02; autosomal heterozygosity deviation (| Fhet | ≤ 0.2); SNP missingness ≤ 0.02 (after sample removal); and SNP Hardy-Weinberg equilibrium (P > 10−6). Genotype imputation was performed using the pre-phasing/imputation stepwise approach implemented in IMPUTE2[40]/SHAPEIT2[41] (chunk size of 3 Mb and with default parameters). The imputation reference set consisted of 2,186 phased haplotypes from the full 1000 Genomes Project Phase 3. Only autosomal chromosomes were analyzed. After imputation, we identified SNPs with high imputation quality (INFO ≥ 0.1) and minor allele frequency (MAF > 0.01). Imputed datasets across 22 waves were merged and further quality control measures were applied (min INFO ≥ 0.1 and MAF ≥ 0.01). The best-guess genotypes were called using the parameters INFO ≥ 0.9 and MAF ≥ 0.05. The set of SNPs after linkage disequilibrium pruning (r2 ≥ 0.02) was used for relatedness testing and population structure analysis. PLINK[42] was used for relatedness testing. One random member of each pair of subjects with pi-hat ≥ 0.2 was removed. Principal component analysis was performed using EIGENSOFT[43] with the same collection of autosomal SNPs. After quality control, 8,318 subjects remained in the discovery sample and 1,141 subjects in the replication sample. In total, about 9 million SNPs were used in the association study.
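The SNP-level thresholds listed above can be expressed as simple predicates. The sketch below is only an illustration: the `SnpStats` record, the function names and the toy SNP identifiers are hypothetical, the thresholds are the post-sample-removal and best-guess values quoted in the text, and the actual analysis applied these filters with dedicated genetics software rather than code like this.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    rsid: str
    missingness: float  # fraction of subjects with a missing genotype call
    hwe_p: float        # Hardy-Weinberg equilibrium test p value
    info: float         # imputation quality score
    maf: float          # minor allele frequency


def passes_genotype_qc(s, max_missing=0.02, min_hwe_p=1e-6):
    """Genotype-level filter: low missingness and no strong HWE deviation."""
    return s.missingness <= max_missing and s.hwe_p > min_hwe_p


def passes_best_guess_qc(s, min_info=0.9, min_maf=0.05):
    """Filter used when calling best-guess genotypes from imputed data."""
    return s.info >= min_info and s.maf >= min_maf


# Toy records with made-up identifiers and statistics.
snps = [
    SnpStats("rsA", 0.01, 0.5, 0.95, 0.20),
    SnpStats("rsB", 0.03, 0.5, 0.95, 0.20),   # too many missing calls
    SnpStats("rsC", 0.00, 1e-8, 0.95, 0.20),  # fails HWE
    SnpStats("rsD", 0.00, 0.5, 0.50, 0.20),   # low imputation quality
]
kept_genotype = [s.rsid for s in snps if passes_genotype_qc(s)]
kept_best_guess = [s.rsid for s in snps if passes_best_guess_qc(s)]
```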

SNP heritability, h2SNP

The merged genotypes for the discovery sample were quality controlled using the same parameters as above. Before estimating the heritability, SNPs were thinned with PLINK[42] using the command: --indep-pairwise 100 50 0.2. The first 6 PCs (see next section), genotyping wave indicators and sex were used as covariates in the restricted maximum likelihood-based program BOLT-REML[19]. To estimate per-chromosome SNP heritability, SNPs located on the focal chromosome were removed and the resulting h2SNP estimate was subtracted from the whole-genome estimate.
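The leave-one-chromosome-out subtraction described above amounts to one line of arithmetic per chromosome. The sketch below assumes the whole-genome estimate and the leave-one-out estimates have already been obtained from the REML software; the function name and the example numbers are hypothetical.

```python
def per_chromosome_h2(h2_whole_genome, h2_leave_one_out):
    """Per-chromosome h2SNP as whole-genome h2SNP minus the leave-one-out estimate.

    h2_leave_one_out maps a chromosome to the h2SNP estimated with that
    chromosome's SNPs removed.
    """
    return {
        chrom: h2_whole_genome - h2_without
        for chrom, h2_without in h2_leave_one_out.items()
    }


# Made-up example: most of the heritability disappears when chromosome 21
# is left out, so chromosome 21 is credited with most of the signal.
contributions = per_chromosome_h2(0.73, {21: 0.20, 6: 0.70})
```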

Genome-wide association

Genome-wide association analyses of SNPs with inflammation marker levels were performed separately for the discovery and replication samples using a multiple linear regression model implemented in PLINK[42]. Principal components were computed separately for discovery and replication, and the first 6 principal components were used as covariates, along with sex and wave indicator variables. We chose the first 6 PCs by testing each PC against each cytokine in regression analyses until we reached a PC that was not associated (P > 0.05) with any of the 10 cytokines. Manhattan plots in S1–S10 Figs present the association results. The genomic inflation factors were estimated and are shown in the quantile-quantile plots in S1–S10 Figs. Regional association plots were constructed using LocusZoom[44] (S11–S30 Figs). The phenotypic variance explained by a SNP was estimated as 2p(1 − p)β², where β is the estimated effect and p the allele frequency in the discovery sample.
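As a minimal sketch of the per-SNP variance calculation, the snippet below assumes the commonly used expression 2p(1 − p)β² for an additive effect on a standardized phenotype. The function name is hypothetical and the numbers in the usage note are invented for illustration.

```python
def variance_explained(beta, allele_freq):
    """Phenotypic variance explained by one SNP under an additive model.

    Assumes the phenotype is standardized to unit variance, so the result
    can be read as a proportion.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * allele_freq * (1.0 - allele_freq) * beta ** 2
```

With an invented effect of 0.5 standard deviations per allele at frequency 0.5, the expression gives 2 × 0.5 × 0.5 × 0.25 = 0.125.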

Associated regions and genes

Association results were ‘clumped’ using PLINK based on the linkage disequilibrium structure of the 1000 Genomes Project phase 3 EUR dataset, with parameters --clump-p1 5e-9 --clump-p2 1e-6 --clump-r2 0.1. A distance of 500 kilobases (kb) was used as the inter-region distance threshold. Genes whose genomic coordinates were located within the boundaries of a region were assigned to that region. The SNP with the smallest association p value was taken as the lead SNP of the corresponding region. The associated SNPs were annotated to the closest genes by genomic position using the Ensembl tool VEP[45] (S2–S8 Tables).
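One way to read the region definition above is that clumps on the same chromosome lying within 500 kb of each other are merged, and the SNP with the smallest p value in the merged span becomes the lead SNP. The Python sketch below implements that reading; the merging rule, the tuple layout and the function name are assumptions rather than the authors' code.

```python
def merge_into_regions(clumps, max_gap=500_000):
    """Merge clumps into regions.

    clumps: iterable of (chrom, start, end, lead_snp, lead_p) tuples.
    Clumps on the same chromosome separated by at most max_gap base pairs
    are merged; the lead SNP of a region is the one with the smallest p.
    """
    regions = []
    for chrom, start, end, snp, p in sorted(clumps, key=lambda c: (c[0], c[1])):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start - last["end"] <= max_gap:
            # Extend the current region and keep the most significant SNP.
            last["end"] = max(last["end"], end)
            if p < last["lead_p"]:
                last["lead_snp"], last["lead_p"] = snp, p
        else:
            regions.append({"chrom": chrom, "start": start, "end": end,
                            "lead_snp": snp, "lead_p": p})
    return regions
```

Genes could then be assigned to a region by checking whether their coordinates fall between the region's `start` and `end`.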

Fine mapping

Associated regions were fine-mapped using the FINEMAP[24] program. Regions were defined as genomic segments extending 500 kb on both sides of the most significant SNP in an associated region (P < 5 × 10⁻⁹). Linkage disequilibrium data from the 1000 Genomes Project phase 3 European sample were used in fine mapping. We performed two analyses: the first set the maximum number of causal variants to 3 and the second to 6. S2–S8 Tables list all SNPs with posterior inclusion probability (PIP) > 0.1 for the 3-causal-variant analysis. S9 Table lists all inferred causal SNPs in each region for the 6-causal-variant analysis. The log10 Bayes factors for the causal set (log10BFc) and for each SNP are shown in the tables along with association statistics.
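The window definition and the PIP reporting threshold can be illustrated as follows. This sketch does not run FINEMAP; it only shows how the ±500 kb window and the PIP > 0.1 filter might be applied to inputs and outputs. Both function names and the dictionary layout are hypothetical.

```python
def finemap_window(lead_position, flank=500_000):
    """Return (start, end) of the window around the lead SNP (1-based)."""
    return max(1, lead_position - flank), lead_position + flank


def snps_above_pip(pip_by_snp, threshold=0.1):
    """List SNP identifiers with PIP above the threshold, highest first."""
    kept = [snp for snp, pip in pip_by_snp.items() if pip > threshold]
    return sorted(kept, key=lambda snp: -pip_by_snp[snp])
```

Clamping the window start at 1 is a small design choice made here so that lead SNPs near the start of a chromosome do not yield negative coordinates.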

Enhancer annotation

The associated SNPs were mapped onto genomic enhancer regions from the GeneHancer database (v4.5) [25] using a specially-prepared annotated dataset. The GeneHancer database contains enhancers that were integrated from five enhancer sources (Ensembl[46], ENCODE[47], VISTA[48], dbSUPER[49] and FANTOM[50]) and enhancer-gene connections that are based on five methods (eQTLs[51], eRNAs[50], TF-gene expression correlations, capture-HiC[52], and genomic distance from TSS). Double-elite associations are considered to be more confident annotations and are defined as enhancer-gene connections for which both the enhancer itself and the connection to the gene are supported by at least two sources or methods, respectively.
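The double-elite criterion described above reduces to two counts. The sketch below assumes that each enhancer record carries the set of sources supporting it and that each enhancer-gene connection carries the set of methods supporting it; this record layout and the function name are illustrative and do not reflect the GeneHancer file format.

```python
def is_double_elite(enhancer_sources, connection_methods):
    """True when both the enhancer and its gene connection are 'elite'.

    enhancer_sources: sources supporting the enhancer (e.g. "ENCODE").
    connection_methods: methods supporting the enhancer-gene link
    (e.g. "eQTLs"). Each must have at least two distinct entries.
    """
    return len(set(enhancer_sources)) >= 2 and len(set(connection_methods)) >= 2
```

Using sets means that a source or method listed twice is counted only once.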

Polygenic risk scoring

We computed polygenic risk scores (PGRS) for both the discovery and replication samples. Effect sizes were obtained as follows: for the discovery sample, we used the association results from the previous study[17]; for the replication sample, we used both the association results from the discovery sample and those from the same previous study. The association summary statistics were first filtered by removing SNPs with MAF < 0.05, INFO < 0.8 or a multi-character allele. We then clumped the resulting data based on the 1000 Genomes Project phase 3 EUR linkage disequilibrium structure using PLINK[42] with parameters --clump-p1 1.0, --clump-p2 1.0, --clump-r2 0.1 and --clump-kb 500. The same program was used for scoring each subject in our sample. The correlations between normalized marker levels and PGRS were computed in R using cor.test for Pearson's correlation. The proportion of variance explained for each marker by each PGRS was computed as the square of the Pearson's correlation coefficient.
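The scoring and variance-explained steps can be sketched in a few lines of Python. This is a simplified stand-in, not the PLINK or R code used in the study: the score here is a plain weighted sum of effect-allele dosages (scaling conventions may differ in PLINK, which does not affect the correlation), and all function names are hypothetical.

```python
import math


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over SNPs that have a weight."""
    return sum(weights[snp] * d for snp, d in dosages.items() if snp in weights)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_explained_by_score(scores, marker_levels):
    """Proportion of marker variance explained: squared Pearson correlation."""
    return pearson_r(scores, marker_levels) ** 2
```

Because the squared correlation is invariant to rescaling of the score, summing or averaging the weighted dosages gives the same proportion of variance explained.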

Additional analyses performed. (PDF)

Description of samples. (PDF)

Full annotation results for CRP level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for EPO level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for IL18 level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for MCP1 level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for S100B level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for TARC level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

Full annotation results for VEGFA level using FINEMAP, Ensembl VEP, HaploReg, Enhancer and Sherlock. (XLSX)

FINEMAP results for regions associated with S100B assuming six causal variants. (XLSX)

Replication of SNPs identified by present study with those reported by Ahola-Olli et al. (XLSX)

The Manhattan & qq plots for BDNF level. (PDF)

The Manhattan & qq plots for IL8 level. (PDF)

The Manhattan & qq plots for CRP level. (PDF)

The Manhattan & qq plots for EPO level. (PDF)

The Manhattan & qq plots for IgA level. (PDF)

The Manhattan & qq plots for IL18 level. (PDF)

The Manhattan & qq plots for MCP1 level. (PDF)

The Manhattan & qq plots for S100B level. (PDF)

The Manhattan & qq plots for TARC level. (PDF)

The Manhattan & qq plots for VEGFA level. (PDF)

Region plot for CRP-rs3091244 association. (PDF)

Region plot CRP rs112635299 association. (PDF)

Region plot EPO rs1130864 association. (PDF)

Region plot IgA rs3094087 association. (PDF)

Region plot IL18 rs10891329 association. (PDF)

Region plot IL18 rs10891268 association. (PDF)

Region plot IL18 rs56195122 association. (PDF)

Region plot IL18 rs9402686 association. (PDF)

Region plot MCP1 rs12075 association. (PDF)

Region plot MCP1 rs4493469 association. (PDF)

Region plot MCP1 rs2228467 association. (PDF)

Region plot MCP1 rs60200069 association. (PDF)

Region plot S100B rs62224256 association. (PDF)

Region plot S100B rs28397289 association. (PDF)

Region plot TARC rs115952894 association. (PDF)

Region plot TARC rs2228467 association. (PDF)

Region plot TARC rs10886430 association. (PDF)

Region plot TARC rs223896 association. (PDF)

Region plot VEGFA rs7767396 association. (PDF)

Region plot VEGFA rs11789392 association. (PDF)

Polygenic score for discovery sample based on Ahola-Olli et al. (PDF)

Polygenic score for replication sample based on Ahola-Olli et al. (PDF)

Pearson's correlation among inflammation markers. (PDF)

Enrichment of genetic variants with DNAse hypersensitive sites. (PDF)

Enrichment of genetic variants with histone modification. (PDF)

Enrichment of genetic variants with chromatin state. (PDF)

Power analysis of discovery and replication samples.

(PDF)

1 Jul 2020

Dear Dr Wang,

Thank you very much for submitting your Research Article entitled 'Genome-Wide Association Studies Identify 16 Genomic Regions Associated with Circulating Inflammatory Markers at Birth' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer. In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript).

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.
Yours sincerely,

Caroline Relton, PhD
Associate Editor
PLOS Genetics

Hua Tang
Section Editor: Natural Variation
PLOS Genetics

This is an original and interesting manuscript.
The reviewers have highlighted the need for additional clarification and technical details. It would also be of interest to elaborate on the potential overlap in genetic architecture between adult and neonatal genetic variation associated with cytokines, perhaps through the application of LD score regression.

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: This study by Wang et al. performs a genome wide association studies examining ten cytokines extracted from neonatal blood birth spots. The study is well performed and accompanied by SNP heritability, fine-mapping and through polygenic risk score analysis compares the genetic architecture in adults to levels of cytokines in newborns.

Major comments:

A power calculation has not been presented. This could be performed based on the previous cytokine GWAS by Ahola-Olli et al and indicated whether the numbers are sufficient. This could then indicate the numbers needed for the replication sample.

Additionally, is cytokine data only available 12,000 individuals within the Danish iPSYCH? The split between the discovery and replication samples needs to be explained. This appears not to be random due to the loss of almost half the individuals in the replication sample after removing the non-Danish individuals compared to only a fifth in the discovery. Equally, only half of the cytokines are present in the replication sample. The abstract and discussion is then misleading in terms of number of individuals and number of cytokines included in this study.

There are analysis and figures in the supplementary material that are not referred to in the text at all (S33-38). These analyses would be very interesting to discuss in the results.

The introduction and discussion should be expanded. They could include for example:

Introduction: why choice of those particular markers; what the markers are; merits of performing GWAS; etc.

Discussion: not a proper replication; clinical relevance of knowing an individual’s genetic predisposition to cytokines; PGRSs only correlating for certain cytokines; etc.

In the inflammation marker level measurements, the samples with concentrations falling below the lowest concentration within the working range should be excluded rather than being assigned to that value.

The polygenic risk scores need more of an explanation: it is unclear why only specific ones were generated. I’m not convinced presenting it in terms of pvalue thresholds adds to the conclusions.

There needs to be more explanation about the conclusion of cis regulatory mechanisms from the proportion of SNP heritability from each chromosome and the extreme p values.

Please double check the colour scheme is suitable for colour blind people.

Results:

The individual figures from the supplementary need to be referenced in the results rather than just ‘Appendix S1’.

Line 110 and 111: should be SNP heritabilities.

Line 113: How much of the SNP heritability stems from the coding genes? Because if that explains the SNP heritability then you can’t conclude that there are strong cis-regulatory mechanisms.

Line 114: ‘analyses suggest’ is surplus.

Line 140: ‘clumped’ needs a better explanation

Methods:

The discovery and replication p value needs to be justified.

Is there a justification for only including 6 principal components?

Is there a reason that the chromosomes X and Y were excluded?

Line 269: ‘BOLT-RMEL’ should be ‘BOLT-REML’

Figures:

Figure 1b needs better labelling – it is currently very confusing to understand.

Reviewer #2: This study intended to find genetic contributors for ten inflammation markers by performing GWAS of two samples. They found and replicated 16 associated genomic regions, of which four are novel. Further, they estimated SNP-based heritability ranging from 0 for EPO up to 73% for S100B. Finally, the authors mapped these associated variants to enhancer elements, suggesting a possible transcriptional effect of genomic variants on the inflammatory markers. Overall, this is a well-designed and conducted study with many strengths and merits. However, the major concern of this paper is about its writing. In many parts, description is over simplified, which made this paper hard to follow. It would be important and essential for the authors to expand almost all parts of the paper.

Reviewer #3: The manuscript is well-written and provides important insights into genetics of inflammatory biomarkers. The results contain interesting genetic instruments for use in future Mendelian randomization studies. I have few questions related to manuscript.

1) The biomarkers were measured from dried blood spots. It is well known that for example heparin releases cytokines from receptors, such as ACKR1 (DARC), which serves as cytokine reservoir on red blood cell surface. This release induced by heparin might mask some genetic signals which would be detected if quantification would have been done by using non-heparin treated blood samples. Is there any previous data on how cytokine measures done from dried blood correlates with measures done from plasma or serum?

2) Could you explain why you calculated SNP heritability explained by each chromosome by excluding the pertinent chromosome from SNP heritability estimation and then subtracting the obtained SNP heritability from total SNP heritability explained by all autosomes instead of just calculating SNP heritability for one chromosome at a time?

3) According to Skogstrand et al. inter-assay variability for S100B was over 13%. Were cytokines assayed independently from genotyping batches?

4) S100B quantification was done with in-house developed platform. Do you know what this approach actually measures? Can capture antibody block the binding of detection antibody?

5) The Supplementary tables don't have any foot notes or explanation where the data originated from. Therefore, it is hard to track what data they actually contain. Could you describe these little more specifically on each spreadsheet?

6) According to the methods each genotyping wave was imputed separately. Sample size is an important determinant of phasing accuracy and this accuracy decreases rapidly when sample size drops below 1000 which in turn impairs imputation accuracy. What was the sample size in each genotyping wave?

**********

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Ruth E Mitchell

Reviewer #2: No

Reviewer #3: No

11 Aug 2020

Submitted filename: Wang_et_al_response.docx

29 Sep 2020

Dear Dr Wang,

We are pleased to inform you that your manuscript entitled "Genome-Wide Association Study Identifies 16 Genomic Regions Associated with Circulating Inflammatory Markers at Birth" has been editorially accepted for publication in PLOS Genetics. Congratulations!
Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Caroline Relton, PhD
Associate Editor
PLOS Genetics

Hua Tang
Section Editor: Natural Variation
PLOS Genetics

Comments from the reviewers (if applicable):

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have provided very clear explanations to my comments.

Reviewer #2: The authors have addressed my concerns on this paper.

Reviewer #3: The authors provided satisfactory responses for previous comments. I have no further comments. Thank you for making summary statistics available.

**********

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Ruth E Mitchell

Reviewer #2: No

Reviewer #3: No
30 Oct 2020

PGENETICS-D-20-00241R1

Genome-Wide Association Study Identifies 16 Genomic Regions Associated with Circulating Cytokines at Birth

Dear Dr Wang,

We are pleased to inform you that your manuscript entitled "Genome-Wide Association Study Identifies 16 Genomic Regions Associated with Circulating Cytokines at Birth" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard
PLOS Genetics

On behalf of:
The PLOS Genetics Team

51 in total

Review 1. Cytokines in cancer pathogenesis and cancer therapy.

Authors: Glenn Dranoff
Journal: Nat Rev Cancer Date: 2004-01 Impact factor: 60.716

2. Multiplex assays of inflammatory markers, a description of methods and discussion of precautions - Our experience through the last ten years.

Authors: Kristin Skogstrand
Journal: Methods Date: 2011-10-07 Impact factor: 3.608

Review 3. Inflammation, metaflammation and immunometabolic disorders.

Authors: Gökhan S Hotamisligil
Journal: Nature Date: 2017-02-08 Impact factor: 49.962

4. The Genotype-Tissue Expression (GTEx) project.

Authors:
Journal: Nat Genet Date: 2013-06 Impact factor: 38.330

5. Genetic factors underlying the bidirectional relationship between autoimmune and mental disorders - Findings from a Danish population-based study.

Authors: Xueping Liu; Ron Nudel; Wesley K Thompson; Vivek Appadurai; Andrew J Schork; Alfonso Buil; Simon Rasmussen; Rosa L Allesøe; Thomas Werge; Ole Mors; Anders D Børglum; David M Hougaard; Preben B Mortensen; Merete Nordentoft; Michael E Benros
Journal: Brain Behav Immun Date: 2020-06-11 Impact factor: 7.217

6. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C.

Authors: Borbala Mifsud; Filipe Tavares-Cadete; Alice N Young; Robert Sugar; Stefan Schoenfelder; Lauren Ferreira; Steven W Wingett; Simon Andrews; William Grey; Philip A Ewels; Bram Herman; Scott Happe; Andy Higgs; Emily LeProust; George A Follows; Peter Fraser; Nicholas M Luscombe; Cameron S Osborne
Journal: Nat Genet Date: 2015-05-04 Impact factor: 38.330

7. Connecting genetic risk to disease end points through the human blood plasma proteome.

Authors: Karsten Suhre; Matthias Arnold; Aditya Mukund Bhagwat; Richard J Cotton; Rudolf Engelke; Johannes Raffler; Hina Sarwath; Gaurav Thareja; Annika Wahl; Robert Kirk DeLisle; Larry Gold; Marija Pezer; Gordan Lauc; Mohammed A El-Din Selim; Dennis O Mook-Kanamori; Eman K Al-Dous; Yasmin A Mohamoud; Joel Malek; Konstantin Strauch; Harald Grallert; Annette Peters; Gabi Kastenmüller; Christian Gieger; Johannes Graumann
Journal: Nat Commun Date: 2017-02-27 Impact factor: 14.919

8. Immunity and mental illness: findings from a Danish population-based immunogenetic study of seven psychiatric and neurodevelopmental disorders.

Authors: Ron Nudel; Michael E Benros; Morten Dybdahl Krebs; Rosa Lundbye Allesøe; Camilla Koldbæk Lemvigh; Jonas Bybjerg-Grauholm; Anders D Børglum; Mark J Daly; Merete Nordentoft; Ole Mors; David M Hougaard; Preben Bo Mortensen; Alfonso Buil; Thomas Werge; Simon Rasmussen; Wesley K Thompson
Journal: Eur J Hum Genet Date: 2019-04-11 Impact factor: 4.246

