Bradford B Worrall, Jeroen de Ridder, Sara L Pulit, Joanna von Berg, Sander W van der Laan, Patrick F McArdle, Rainer Malik, Steven J Kittner, Braxton D Mitchell.
Abstract
Ischemic stroke (IS), caused by obstruction of cerebral blood flow, is one of the leading causes of death. While neurologists agree on the delineation of IS into three subtypes (cardioembolic stroke (CES), large artery stroke (LAS), and small vessel stroke (SVS)), several subtyping systems exist. The most commonly used systems are TOAST (Trial of Org 10172 in Acute Stroke Treatment) and CCS (Causative Classification System for Stroke), but agreement between them is only moderate. We compared two approaches to combining the existing subtyping systems into a phenotype suited for a genome-wide association study (GWAS). We used the NINDS Stroke Genetics Network dataset (SiGN; 11,477 cases with CCS and TOAST subtypes and 28,026 controls). We defined two new phenotypes: the intersect, for which an individual must be assigned the same subtype by CCS and TOAST; and the union, for which an individual must be assigned a subtype by either CCS or TOAST. The union yields the largest sample size, while the intersect yields a phenotype with less potential misclassification. We performed GWAS for all subtypes, using the original subtyping systems, the intersect, and the union as phenotypes. In each subtype, heritability was higher for the intersect than for the other phenotypes. We observed stronger effects at known IS variants with the intersect than with the other phenotypes. With the intersect, we identified rs10029218:G>A as a variant associated with SVS. We conclude that this approach increases the likelihood of detecting genetic associations in ischemic stroke.
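The intersect and union definitions above amount to set operations on per-subtype case lists. The following is a minimal sketch, not the authors' pipeline: the individual IDs, the `labels` mapping, and the function name `derive_cases` are hypothetical, and it assumes one CCS label and one TOAST label per individual. The symmetric difference (cases in the union but not in the intersect) is included because it appears in the case-count table.

```python
# Hypothetical labels: individual ID -> (CCS subtype, TOAST subtype).
# None stands for "undetermined"; none of these individuals are real data.
labels = {
    "id1": ("CES", "CES"),  # both systems agree
    "id2": ("CES", "LAS"),  # systems disagree
    "id3": (None, "SVS"),   # undetermined by CCS
    "id4": ("SVS", "SVS"),  # both systems agree
}

def derive_cases(labels, subtype):
    """Return intersect, union, and symmetric-difference case sets for one subtype."""
    ccs = {i for i, (c, _) in labels.items() if c == subtype}
    toast = {i for i, (_, t) in labels.items() if t == subtype}
    return {
        "intersect": ccs & toast,  # same subtype from CCS and TOAST
        "union": ccs | toast,      # subtype from either system
        "symdif": ccs ^ toast,     # subtype from exactly one system
    }

for subtype in ("CES", "LAS", "SVS"):
    print(subtype, derive_cases(labels, subtype))
```

Note that under the union an individual on whom the systems disagree (such as `id2`) counts as a case for two subtypes, which is one way the union trades homogeneity for sample size.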
Year: 2020 PMID: 32047268 PMCID: PMC7316747 DOI: 10.1038/s41431-020-0580-5
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 5.351
Fig. 1 Hypothesized benefit of using the intersect, at a SNP associated with ischemic stroke.
Circles indicate the non-risk allele, and crosses the risk allele. Using a chi-square test (visualized with contingency tables), the measured effect is stronger with a group of cases that is more homogeneous but smaller (intersect, purple) than with a group of cases that is less strictly defined but is larger (union, teal).
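The contingency-table comparison in this caption can be sketched with a plain Pearson chi-square on 2×2 allele-count tables. All counts below are invented for illustration and are not taken from the study; the function names are likewise hypothetical. The toy numbers are chosen so that the smaller case group carries a higher risk-allele frequency, which is the scenario the figure hypothesizes.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Rows: cases, controls. Columns: risk allele, non-risk allele.
# Hypothetical allele counts, for illustration only.
controls = (400, 600)
intersect_cases = (60, 40)   # smaller, more homogeneous case group
union_cases = (100, 100)     # larger, less strictly defined case group

or_intersect = odds_ratio(*intersect_cases, *controls)
or_union = odds_ratio(*union_cases, *controls)
chi_intersect = chi_square_2x2(*intersect_cases, *controls)
chi_union = chi_square_2x2(*union_cases, *controls)

print(or_intersect, or_union)    # 2.25 1.5
print(chi_intersect > chi_union)
```

Whether the smaller group wins in practice depends on how strongly misclassification dilutes the risk-allele frequency in the larger group; with little dilution, the larger sample would give the stronger test statistic.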
Case counts for the different phenotype definitions in the three subtypes.
| | CES | LAS | SVS | Undetermined (‘other’ for CCSp) | Total |
|---|---|---|---|---|---|
| C (CCSc) | 3,000 | 1,565 | 2,262 | 4,574 | 11,401 |
| P (CCSp) | 3,608 | 2,449 | 2,419 | 718 | 9,194 |
| T (TOAST) | 3,333 | 2,318 | 2,631 | 3,479 | 11,761 |
| I (intersect) | 2,219 | 1,328 | 1,548 | Not tested | 5,095 |
| U (union) | 4,502 | 3,495 | 3,480 | Not tested | 11,477 |
| S (symmetric difference) | 2,283 | 2,167 | 1,932 | Not tested | 6,382 |
The control group is always the same group of 28,026 individuals.
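As a quick consistency check on the table, the symmetric-difference counts should equal the union counts minus the intersect counts within each subtype. The sketch below uses only the numbers from the table; the dictionary layout and variable names are illustrative choices.

```python
# Case counts copied from the table: I = intersect, U = union, S = symmetric difference.
counts = {
    "CES": {"I": 2219, "U": 4502, "S": 2283},
    "LAS": {"I": 1328, "U": 3495, "S": 2167},
    "SVS": {"I": 1548, "U": 3480, "S": 1932},
}

for subtype, c in counts.items():
    # Cases in the union but not in the intersect form the symmetric difference.
    consistent = c["U"] - c["I"] == c["S"]
    share = c["I"] / c["U"]  # fraction of union cases on which the systems agree
    print(f"{subtype}: S = U - I holds: {consistent}; intersect/union = {share:.2f}")

# Row totals as reported in the table.
totals = {key: sum(c[key] for c in counts.values()) for key in ("I", "U", "S")}
print(totals)
```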
Fig. 2 Intersect is the most heritable phenotype.
Heritabilities on the liability scale for the six case definitions. Bars indicate the standard error. Note that intersect has a relatively high standard error, due to its lower sample size. a In cardioembolic stroke, intersect is significantly more heritable than all other phenotype definitions (p values for the difference between intersect and all others 3.6e−03 or lower). b In large artery stroke and c small vessel stroke, intersect is significantly more heritable than all other phenotype definitions except CCSc (p values for the difference between intersect and all others except CCSc, 2.7e−03 or lower in LAS, 6.1e−07 or lower in SVS). P values for heritability differences determined by t-test (see Table S5). See Table S4 for numerical values of heritabilities and standard errors. int intersect, symdif symmetric difference.
Fig. 3 Graphical explanation of overlap analysis.
a At a certain absolute z-score threshold Z, all SNPs that have a z-score lower than −Z or higher than +Z in GWAS I are determined (SNPs 1–8 and 9–12). Next, all SNPs that have a z-score lower than −Z or higher than +Z in GWAS II are determined (SNPs 1–8 and 13–16). The number of shared significant SNPs is divided by the union of significant SNPs to calculate the Jaccard index. b We also calculate the Pearson correlation of the z-scores of the shared SNPs.
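The overlap analysis described in this caption can be sketched as follows. The z-scores are synthetic and arranged to mirror the figure's layout (SNPs 1–8 significant in both GWAS, 9–12 only in GWAS I, 13–16 only in GWAS II); the function names and the dictionary representation are assumptions made for illustration.

```python
from math import sqrt

def significant(zscores, threshold):
    """SNPs with z-score lower than -threshold or higher than +threshold."""
    return {snp for snp, z in zscores.items() if abs(z) > threshold}

def jaccard_index(set1, set2):
    """Number of shared elements divided by the size of the union."""
    either = set1 | set2
    return len(set1 & set2) / len(either) if either else float("nan")

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Synthetic z-scores for 16 SNPs in two GWAS (not real data).
gwas1, gwas2 = {}, {}
for i in range(1, 17):
    sign = 1 if i % 2 else -1
    gwas1[f"snp{i}"] = sign * (4.0 if i <= 12 else 1.0)
    gwas2[f"snp{i}"] = sign * (3.5 + 0.1 * i if i <= 8 else (1.0 if i <= 12 else 3.5))

Z = 3.0
sig1, sig2 = significant(gwas1, Z), significant(gwas2, Z)
shared = sorted(sig1 & sig2)

jaccard = jaccard_index(sig1, sig2)
correlation = pearson([gwas1[s] for s in shared], [gwas2[s] for s in shared])
print(len(shared), jaccard)  # 8 0.5
print(correlation)
```

Repeating this over a range of thresholds Z gives one Jaccard index and one correlation per threshold, which is the kind of curve the overlap figures plot.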
Fig. 4 Different phenotype definitions capture different genetic risk factors.
Overlap analysis in cardioembolic stroke. Similarity on the y-axis denotes either correlation (circles) or Jaccard index (triangles). The absolute z-score threshold is plotted on the x-axis. Numbers indicate the number of shared SNPs at Z = 3. a Pairwise comparisons with intersect, b pairwise comparisons with union, c pairwise comparisons with symmetric difference.
Fig. 5 Intersect most often shows the strongest effect at previously identified subtype-specific associations.
Odds ratios for the five LAS-associated SNPs (in purple) and the four CES-associated SNPs (in teal) in the five phenotype definitions. The dotted line indicates an OR of 1 (no effect). Error bars indicate the 95% confidence interval. Intersect shows the strongest effect at five of the nine SNPs (binomial p = 0.0196).
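The binomial p value in this caption can be reproduced under one plausible reading, which is an assumption rather than something the caption states: a one-sided test in which, under the null, each of the five phenotype definitions is equally likely (probability 1/5) to show the strongest effect at a given SNP.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumed null: 1/5 chance per SNP that intersect is the strongest of five definitions.
p_value = binomial_upper_tail(k=5, n=9, p=1 / 5)
print(round(p_value, 4))  # 0.0196
```

That the result matches the reported value supports this reading, although it does not rule out other test setups.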
Summary statistics for the new genome-wide significant SNPs.
| Locus | SNP | Chr | A1 | A2 | Analysis | P | Freq1 | OR | Beta | SE |
|---|---|---|---|---|---|---|---|---|---|---|
| CAMK2D | rs10029218 | 4 | A | G | SVS-intersect | 1.20E−08 | 0.12 | 1.27 | 0.02 | 0.00 |
| | | | | | SVS-rep-EUR | 2.46E−02 | 0.11 | 1.11 | 0.10 | 0.05 |
| | | | | | SVS-rep-TRANS | | 0.13 | 1.12 | 0.11 | 0.04 |
| SH2B3-BRAP-ALDH2 | rs11065979 | 12 | T | C | SVS-CCSp | 9.40E−09 | 0.42 | 1.13 | 0.01 | 0.00 |
| | | | | | SVS-rep-EUR | | 0.43 | 1.08 | 0.08 | 0.03 |
| | | | | | SVS-rep-TRANS | | 0.41 | 1.08 | 0.07 | 0.03 |
| PFH20 | rs11697087 | 20 | A | G | CES-intersect | 3.20E−09 | 0.09 | 1.26 | 0.02 | 0.00 |
| | | | | | CES-rep-EUR | 1.55E−02 | 0.09 | 1.10 | 0.10 | 0.04 |
| | | | | | CES-rep-TRANS | 4.76E−02 | 0.09 | 1.07 | 0.07 | 0.03 |
| 5:114799266 | rs2169955 | 5 | T | C | CES-CCSc | 3.90E−08 | 0.57 | 0.90 | −0.01 | 0.00 |
| | | | | | CES-rep-EUR | 1.48E−02 | 0.56 | 0.95 | −0.05 | 0.02 |
| | | | | | CES-rep-TRANS | 2.22E−02 | 0.56 | 0.96 | −0.04 | 0.02 |
| GNAO1 | rs3790099 | 16 | C | G | CES-CCSp | 4.90E−08 | 0.85 | 0.87 | −0.02 | 0.00 |
| | | | | | CES-rep-EUR | | 0.84 | 0.89 | −0.11 | 0.03 |
| | | | | | CES-rep-TRANS | 1.10E−02 | 0.77 | 0.94 | −0.07 | 0.03 |
Per locus, the SiGN GWAS result is in the first row, in the format ‘subtype-phenotype’. The other rows show results in MEGASTROKE, with ‘subtype-rep-EUR’ for the Europeans-only analysis and ‘subtype-rep-TRANS’ for the trans-ancestry analysis. NB: Beta and SE of the SiGN GWAS and the MEGASTROKE GWAS are not comparable, since they come from linear and logistic regression, respectively; the ORs are comparable. We applied a Bonferroni correction: α = 0.0125 for SVS and α = 0.00625 for CES. Replication p values below the threshold are indicated in bold. Two SNPs in CES-CCSc (rs2169955:C>T and rs62379973:C>G) that lie relatively close together on chromosome 5 (260 kb) fell into two different clumps because their distance is just above the clumping threshold (250 kb), even though they are in LD (r² = 0.52, D′ = 0.87, in a CEU population [31]). Because they are in LD and only slightly farther apart than 250 kb, they were considered to belong to the same locus, and only the strongest association (rs2169955:C>T) was kept.
A1 allele 1, A2 allele 2, Freq1 frequency of allele 1, OR odds ratio, Beta coefficient, SE standard error.
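Two points from the table notes can be illustrated numerically. First, the Bonferroni thresholds are consistent with a family-wise α of 0.05 divided by 4 tests for SVS and 8 tests for CES; those test counts are inferred from the thresholds and are not stated in the notes. Second, for logistic regression the odds ratio is exp(Beta), which holds (up to rounding) for the MEGASTROKE rows but not for the SiGN rows, where Beta comes from linear regression. The function name below is a hypothetical helper.

```python
from math import exp

def bonferroni_alpha(n_tests, family_alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests

# Assumed numbers of tests, chosen to match the reported thresholds.
print(bonferroni_alpha(4))  # 0.0125 (SVS)
print(bonferroni_alpha(8))  # 0.00625 (CES)

# rs10029218, SVS-rep-EUR (logistic regression): Beta = 0.10, reported OR = 1.11.
logistic_or = exp(0.10)
print(round(logistic_or, 2))

# rs10029218, SVS-intersect (linear regression): Beta = 0.02, reported OR = 1.27.
# exp(Beta) does not recover the OR here, so the Beta values are not comparable.
linear_as_or = exp(0.02)
print(round(linear_as_or, 2))
```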