Literature DB >> 34367251

Multi-Level Analyses of Genome-Wide Association Study to Reveal Significant Risk Genes and Pathways in Neuromyelitis Optica Spectrum Disorder.

Ting Li1, He Li2, Yue Li1, Shu-An Dong3, Ming Yi1, Qiu-Xia Zhang1, Bin Feng1, Li Yang1, Fu-Dong Shi1,4,5, Chun-Sheng Yang1.   

Abstract

BACKGROUND: Neuromyelitis optica spectrum disorder (NMOSD) is an inflammatory disease of the central nervous system and it is understandable that environmental and genetic factors underlie the etiology of NMOSD. However, the susceptibility genes and associated pathways of NMOSD patients who are AQP4-Ab positive and negative have not been elucidated.
METHODS: Secondary analysis from a NMOSD Genome-wide association study (GWAS) dataset originally published in 2018 (215 NMOSD cases and 1244 controls) was conducted to identify potential susceptibility genes and associated pathways in AQP4-positive and negative NMOSD patients, respectively (132 AQP4-positive and 83 AQP4-negative).
RESULTS: In AQP4-positive NMOSD cases, five shared risk genes were obtained at chromosome 6 in AQP4-positive NMOSD cases by using more stringent p-Values in both methods (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. Fifty potential susceptibility gene sets were determined and 12 significant KEGG pathways were identified. Sixty-seven biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained from the GO annotations of the 128 pathways identified. In the AQP4 negative NMOSD group, no significant genes were obtained by using more stringent p-Values in both methods (p < 0.05/16,485). The 22 potential susceptibility gene sets were determined. There were no shared potential susceptibility genes between the AQP4-positive and negative groups, furthermore, four significant KEGG pathways were also identified. Of the GO annotations of the 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained.
CONCLUSION: The potential molecular mechanism underlying NMOSD may be related to proteins encoded by these novel genes in complements, antigen presentation, and immune regulation. The new results may represent an improved comprehension of the genetic and molecular mechanisms underlying NMOSD.
Copyright © 2021 Li, Li, Li, Dong, Yi, Zhang, Feng, Yang, Shi and Yang.

Entities:  

Keywords:  Genome-wide association study; Neuromyelitis optica spectrum disorder; gene differential expression; gene sets; pathway

Year:  2021        PMID: 34367251      PMCID: PMC8335167          DOI: 10.3389/fgene.2021.690537

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

Neuromyelitis optica spectrum disorder (NMOSD) is an inflammatory disease of the central nervous system that predominantly affects the spinal cord and optic nerves (Wingerchuk et al., 2015). The estimated prevalence is 0.5 to 10 persons (predominantly women) per population of 100,000 (Hor et al., 2020), especially high incidence in Asia, and with an incidence of 0.445/100,000 (0.433-0.457) in China (Tian et al., 2020). The pathogenic role of anti-aquaporin 4 autoantibody (AQP4-Ab) has been established in previous studies. Complement-dependent cytotoxicity and antibody-dependent cell-mediated cytotoxicity have been reported (Wingerchuk et al., 2007). Progression in NMOSD research has led to the deduction that environmental and genetic factors underlie the etiology of NMOSD, environmental factors including changes in the diet, exposure to infections, stressful life events, seasonal variation and so on (Akaishi et al., 2020; Hor et al., 2020; Rafiee et al., 2020). Recent studies on genome-wide association (GWAS) of NMOSD have amplified the understanding of NMOSD. Previous studies revealed that a common promoter single-nucleotide polymorphisms (SNP) in CYP7A1 has a protective/gene dose-dependent effect on the risk of NMO (Kim et al., 2010). Recently, Estrada et al. (2018) identified two independent SNP (rs1150757 and rs28383224) in the major histocompatibility complex (MHC) region associated with AQP4-Ab positive NMOSD, however, the susceptibility genes and associated pathways of NMOSD patients with AQP4-Ab positive and negative have not been elucidated. As a useful complementary approach for SNP-GWAS, gene-based tests play an increasing role in genetics research. Studies have shown that gene-based tests can be more powerful than SNP-GWAS approach and suitable for integrating the results of pathway tests from GWAS (Liu et al., 2010). Secondary analysis from a NMOSD-GWAS dataset originally published in 2018 (215 NMOSD cases and 1244 controls) was conducted to identify potential susceptibility genes and associated pathways of NMOSD (Estrada et al., 2018). The original data was obtained at the NHGRI-EBI GWAS Catalog[1]. In phase I, more stringent p-Values were used to determine the intersection of two new gene-based tests for the GWAS approach, called Versatile Gene-based Association Study-2 version 2 (VEGAS2) and PLINK, which were used to conduct a gene-based association study in AQP4-positive (p < 0.05/16,532) and negative (p < 0.05/16,485) NMOSD patients, respectively. Additionally, for exploring more potential genes, we used p < 0.05 to get the share genes of these two methods. In phase II, gene sets identified with meta p < 0.05/n (“n” represents the number of genes shared by the VEGAS2 and PLINK tests) were carried forward to the next phase. In phase III, protein-protein association networks were performed by STRING, meanwhile, KEGG pathways and GO analysis were conducted for the NMOSD susceptibility genes to gain further insights into the genes (Figure 1).
FIGURE 1

Flow diagram of the three-phase analysis design. In phase I, two new gene-based tests for the GWAS approach, called VEGAS2 and PLINK, were used to conduct a gene-based association study. In phase II, Gene sets identified with meta p < 0.05/n were carried forward to the next phase. In phase III, protein-protein association networks were performed by STRING. Meanwhile, KEGG pathways and GO analysis were carried out for these NMOSD susceptibility genes. VEGAS2 and PLINK are two kinds of new gene-based tests. “n” represents the number of genes shared by the VEGAS2 and PLINK tests.

Flow diagram of the three-phase analysis design. In phase I, two new gene-based tests for the GWAS approach, called VEGAS2 and PLINK, were used to conduct a gene-based association study. In phase II, Gene sets identified with meta p < 0.05/n were carried forward to the next phase. In phase III, protein-protein association networks were performed by STRING. Meanwhile, KEGG pathways and GO analysis were carried out for these NMOSD susceptibility genes. VEGAS2 and PLINK are two kinds of new gene-based tests. “n” represents the number of genes shared by the VEGAS2 and PLINK tests. In AQP4-positive NMOSD cases, where more stringent p-Values were used in both methods, five shared risk genes were obtained at chromosome 6 (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. Furthermore, we determined 50 potential susceptibility gene sets of AQP4-positive NMOSD, and the top 5 of these genes were the same as the genes obtained using more stringent p-Values after which 12 significant KEGG pathways (p < 0.05) were identified. In the GO annotations of the 128 pathways identified, 67 biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. No significant genes were obtained by using more stringent p-Values in both methods in the AQP4-negative NMOSD group (p < 0.05/16,485). The 22 potential susceptibility gene sets were determined. There were no shared potential susceptibility genes between the AQP4-positive and negative groups, furthermore, four significant KEGG pathways were also identified. Of the GO annotations of 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. In short, the new results may represent significant steps toward defining the genetic mechanism underlying the association of NMOSD.

Materials and Methods

Samples

The original data were obtained from the NHGRI-EBI GWAS Catalog see text footnote 1. The large-scale NMOSD-GWAS dataset that comprises 215 NMOSD cases (132 AQP4-positive and 83 AQP4-negative) and 1244 controls of European ancestry was analyzed. The originally published data in 2018 used 2006 NMO diagnostic criteria that require optic neuritis and transverse myelitis plus two of the following three supportive elements: (1) longitudinally extensive lesions (≥3 vertebral segments in length); (2) magnetic resonance imaging of the brain with normal findings or with findings inconsistent with MS; and (3) NMO-IgG seropositivity. After subjecting the dataset to certain quality-control methods, 7,138,498 autosomal SNPs were available for genetic analysis. Please refer to the original text for more details (Estrada et al., 2018). To investigate whether there are genetic differences between AQP4-positive and AQP4 negative NMOSD, gene-based association study in AQP4-positive and negative NMOSD patients was conducted. Gene-based test The first gene-based test method used was a versatile gene-based association study (VEGAS2[2]) developed by Aniket Mishra and Stuart Macgregor (Mishra and Macgregor, 2015). VEGAS2 distributes SNPs to every gene of 17,787 autosomes in terms of the location on the UCSC Genome Browser hg18 group. The method integrates the overall information of the gene and its SNPs while also illustrating their details such as the sizes of genes and the linkage disequilibrium (LD) between SNPs. ± 50 kb of 5′ and 3′. UTRs, which is defined as the inclusion of SNPs screening within genes. Subsequently, the association p-Values of SNPs within a given gene are converted to upper-tail chi-squared statistics with one degree of freedom (df) (Liu et al., 2010). The statistic of this gene-based test is the total of all χ2 1 df statistics in the given gene. This method uses the multivariate normal simulations to illustrate the LD structure of SNPs within the given gene with reference to the HapMap2 genotype information (Liu et al., 2010). Simulations of 103 are first processed. If the out-coming empirical p-Value is less than 0.1 then 104 simulations will be processed. If the empirical p-Value of 104 simulations is less than 0.001, the process will perform 106 simulations. Due to the calculated reasons, no more simulations will be processed if the empirical p-Value is 0. More stringent p-Values have been used for both methods in AQP4-positive NMOSD cases (p < 0.05/16,532) and in the AQP4-negative NMOSD group (p < 0.05/16,485). For exploring more potential genes, p < 0.05 was used to get the intersection of two methods and an empirical p-Value of 0 from 106 simulations can be explained as p < 10–6, which surpasses p-Value < 2.8 10–6 (≈0.05/17,787(the autosomal genes’ amount)) (Liu et al., 2010). Gene-based, test method based, user-friendly bioinformatics tools, like PLINK which was a set-based analysis, were used and this method combined a meta-analysis process and PLINK to calculate the data of GWAS (Moskvina et al., 2011). The theoretical description of this method is detailed in the previously published literature (Liu et al., 2017). The PLINK is an open-source C/C++ whole-genome association study software[3] (Purcell et al., 2007). Based on the HapMap data available in PLINK, “plink –bfile hapmap_CEU_r23a –set-screen nmo.txt –make-set glist.dat” was conducted for gene-wise calculations (Moskvina et al., 2011). In the code above, “nmo.txt” is the file with SNP IDs and their p-Value in NMO GWAS data, other files are provided by the software (Moskvina et al., 2011). Meta-analysis Fisher’s method was used to integrate the p-Values of common risk genes derived from two gene-based tests. Fisher’s method is a simple meta-analysis approach for significance computation (Kippola and Santorico, 2010). The statistics of Fisher’s method are computed as: In 2-1, i is the number of studies, p is the p-Values of the gene and k is the sum count of studies. This formula follows the chi-squared distribution with 2 kdf. Protein-protein association networks performed by string STRING (version 11) was performed for functional partnerships and interactions between the proteins encoded by the overlapped NMOSD genes by two gene-based tests[4]. The STRING database plans on assembling and integrating the proteins association information (Szklarczyk et al., 2017). The subtypes of associations in STRING are direct (physical) and indirect (functional) interactions. The source of the STRING database includes assembling and evaluating obtainable tested data from known protein complexes and pathways, co-expression study, inter-genomes shared selective signals, scientific article text-mining, and interaction knowledge calculated transfer between organisms (Szklarczyk et al., 2017). The association network of STRING provides much information of proteins including annotations, cross-links, domain structures, 3D protein structures, and their types of interaction (Szklarczyk et al., 2015). The new version (v11) of STRING updates the organisms’ amount to 5,090 and increased the capacity of the input to the genome-wide level (Szklarczyk et al., 2019). In a study of the genome-wide dataset, the subsets can be visualized as association networks and conducted as enrichment analysis. Furthermore, STRING actualized novel classification systems for enrichment analysis in addition to classical systems such as Gene Ontology and KEGG (Szklarczyk et al., 2019). Pathway-based test In the study, widely used online assessment analysis tools were highlighted for enrichment analysis, WebGestalt 2019[5] (Liao et al., 2019). A hypergeometric test was conducted to explore an over-representation of the screened genes among the genes in a given pathway (Wang et al., 2013). In this pathway, the p-Values of more than J disease-related genes were observed and computed by the following formula: In 4-1, m is the total number of target genes related to a given disease, N is the number of reference genes, S is the number of genes in the designated pathways. The pathway selection interval was set between 20 and 300 genes to avoid bias caused by testing an extremely narrow or broad enrichment of genes on the pathway (Liao et al., 2019). According to the result, the pathways with p-Value < 0.05 were regarded as risk pathways.

Results

Gene-Based Test of NMOSD GWAS Dataset

In the AQP4-positive NMOSD cases, 6,804,718 NMOSD SNPs were mapped to 21,275 and 16532 genes, respectively, using VEGAS2 and PLINK. More stringent p-Values were used in both methods, five shared risk genes were obtained at chromosome 6 (p < 0.05/16,532), comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4 (Table 1 and detailed results are listed in Supplementary Table 1). For exploring more potential genes, p < 0.05 was used to get the intersection of two methods after which 961 genes were identified using VEGAS2, and 544 genes using PLINK with a significance of p < 0.05, 315 common genes were revealed by VEGAS2 and PLINK (Supplementary Table 2), of which 50 gene sets passed the Bonferroni-corrected statistical test at a significance of meta p < 1.59 × 10–4 (p = 0.05/315) (Supplementary Table 2). The 50 potential susceptibility gene sets of AQP4-positive NMOSD are shown in Table 2 and the top 5 of these genes were same with the ones obtained with more stringent p-Values (p < 0.05/16,532).
TABLE 1

Susceptibility genes by using more stringent p-Values and the intersection of VEGAS2 and PLINK in AQP4-positive NMOSD.

Gene setPositionnSNPsP-value
VEGAS2
ATP6V1G2chr6:31620699..31621843111E-06
ATP6V1G2-DDX39Bchr6:31530219..31546848201E-06
CFB*chr6:32022003..32027809181E-06
CLIC1chr6:31810689..3181227341E-06
EHMT2*chr6:31956199..31972526161E-06
HLA-DQA1*chr6:32717784..32717784221E-06
MB21D1chr6:73423711..73452297291E-06
MSH5*chr6:31816126..31837338341E-06
MSH5-SAPCD1chr6:31739948..31764850361E-06
SLC44A4*chr6:31942176..31954213441E-06
STK19_1chr6:31971175..31981446101E-06
TNXB_2chr6:32041153..32109338851E-06
SLC17A5chr6:74361136..744196213342E-06
PLINK
AGERchr6:32257794..3225997251.09E-06
BTNL2chr6:32470908..32482618351.43E-06
C6orf10chr6:32368537..324476252131.88E-09
C6orf136chr6:30726885..3072688517.09E-08
C6orf27chr6:31841629..3185056946.22E-07
C6orf48chr6:31912625..3191551941E-08
CFB*chr6:32022003..32027809151.2E-06
EGFL8chr6:32242488..3224248811.9E-08
EHMT2*chr6:31956199..31972526101.81E-09
HLA-DQA1*chr6:32717784..3271778413.29E-09
HLA-DRAchr6:32515687..32520787378.77E-10
MSH5*chr6:31816126..31837338193.78E-10
PBX2chr6:32262263..3226561743.54E-07
PRRT1chr6:32225949..3222594912.21E-08
RNF5chr6:32254470..3225638169.36E-08
SKIV2Lchr6:32035321..32045016176.39E-07
SLC44A4*chr6:31942176..31954213192.82E-06
STK19chr6:32048876..3205543972.24E-13
TABLE 2

Comparison of significant shared gene sets in AQP4-positive NMOSD produced with VEGAS2 and PLINK.

Gene setPositionPLINK
VEGAS2
NSNPP-valueNSNPP-valueP meta
MSH5chr6:31816126..31837338193.78E-10341.00E-061.38E-14
EHMT2chr6:31956199..31972526101.81E-09161.00E-066.31E-14
HLA-DQA1#chr6:32717784..3271778413.29E-09221.00E-061.13E-13
CFBchr6:32022003..32027809151.20E-06181.00E-063.42E-11
SLC44A4chr6:31942176..31954213192.82E-06441.00E-067.77E-11
C6orf10chr6:32368537..324476252131.88E-09140.00271.37E-10
CLIC1chr6:31810689..3181227327.76E-0641.00E-062.06E-10
SLC17A5chr6:74361136..74419621924.78E-063342.00E-062.52E-10
HIST1H1Bchr6:27943197..2794325128.83E-06121.20E-052.54E-09
HIST1H2BLchr6:27883653..2788367622.57E-0562.10E-051.21E-08
HIST1H2ALchr6:27941153..2794115312.66E-0592.30E-051.36E-08
ATP6V1G2chr6:31620699..3162184350.003494111.00E-067.15E-08
ZSCAN16chr6:28200582..2820517287.64E-05145.40E-058.38E-08
BTN3A2chr6:26473574..26486128390.000377892.20E-051.62E-07
EEF1A1chr6:74283039..7428651379.38E-05209.40E-051.72E-07
MTO1chr6:74228188..74266790170.000618782.30E-052.71E-07
VARSchr6:31856799..3187082251.10E-0520.002134.36E-07
C6orf15chr6:31187055..3118807890.000135240.0001814.54E-07
ARSIchr5:149656490..14966045230.000152180.0001975.48E-07
ZFP57chr6:29750488..29756485160.000276170.0001487.37E-07
BTN2A1chr6:26566244..26576639200.003007663.60E-051.84E-06
C2chr6:32003952..32019988120.001923418.30E-052.66E-06
LTAchr6:31648120..3164876350.00106780.0002023.52E-06
HIST1H3Jchr6:27966400..2796640010.00049860.0004673.79E-06
PSORS1C1chr6:31190939..31215712650.0052061496.40E-055.3E-06
GLSchr2:191457280..191537657430.0009381080.0003795.63E-06
CDSNchr6:31190939..31192771170.001358350.0004028.42E-06
APPL1chr3:57238434..57282001170.000839630.0007881.01E-05
NFKBIL1chr6:31623742..31633891120.003355320.0002411.22E-05
ZNF165chr6:28154652..2816347480.003189290.000421.95E-05
ZKSCAN3chr6:28426460..28441375240.008439640.0001752.13E-05
HSPA1Achr6:31891486..3189148610.00095720.001572.16E-05
C6orf203chr6:107458491..107478905170.000463870.004272.79E-05
ZMYND19chr9:139597070..13960289740.000315190.006382.83E-05
EIF3Fchr11:7965996..7973800160.004203350.0004982.95E-05
PSORS1C2chr6:31213289..31215066120.004212160.0006123.57E-05
HRASLS2chr11:63078313..6308697250.003647140.0008464.22E-05
KLHL30chr2:238715095..238725505110.000507540.00694.75E-05
CAPN2chr1:221967442..222030091650.0020962060.001875.27E-05
CDC16chr13:114021570..114055715190.003988900.001035.51E-05
PKN3chr9:130508561..13052228820.001899130.002225.64E-05
TBCBchr19:41301630..4130630250.000416250.0158988.55E-05
TJP2chr9:70980117..710597301080.0070714180.0019.09E-05
TAP2chr6:32897785..32914439650.0069171120.001119.81E-05
RBBP6chr16:24458431..24490907300.005909750.001340.000101
CYP2A6chr19:46042434..4604637330.001634260.005220.000108
PSMB9chr6:32930164..32934963140.002805230.003360.000118
TCP10chr6:167715748..16771681440.004822280.002410.000144
POLA2chr11:64786765..64821494150.003271360.003910.000157
LST1chr6:31662586..3166456060.025970.0004960.000158
Susceptibility genes by using more stringent p-Values and the intersection of VEGAS2 and PLINK in AQP4-positive NMOSD. Comparison of significant shared gene sets in AQP4-positive NMOSD produced with VEGAS2 and PLINK. In the AQP4-negative NMOSD group, 6,423,746 NMOSD SNPs were mapped to 21,170 and 16485 genes, respectively using VEGAS2 and PLINK. More stringent p-Values were used in both methods, no significant genes were determined (p < 0.05/16,485). To explore more potential genes, p < 0.05 was used to get the intersection of two methods after which 988 genes were identified using VEGAS2 and 473 genes using PLINK with a significance of p < 0.05, 305 common genes were identified (Supplementary Table 2) of which 22 gene sets passed the Bonferroni-corrected statistical test at a significance of meta p < 1.63 × 10–4 (p = 0.05/305). The 22 potential susceptibility gene sets of AQP4 negative NMOSD are shown in Table 3. Shared susceptibility genes between the AQP4-positive and negative groups were absent.
TABLE 3

Comparison of significant shared gene sets in AQP4 negative NMOSD produced with VEGAS2 and PLINK.

Gene setPositionVEGAS2
PLINK
NSNPP-valueNSNPP-valueP meta
RGS1chr1:190811722..19081516690.001563250.0002155.35E-06
COL8A2chr1:36338171..3633820424.32E-05150.0155981.02E-05
AGTPBP1chr9:87354033..875450051150.0006114130.001961.75E-05
SYT9chr11:7232814..74463953760.025697430.0000471.77E-05
SGIP1chr1:66772691..669824063740.0029186540.0004611.95E-05
CEACAM4chr19:46817929..46824888100.03613310.0000432.23E-05
INTS5chr11:62177128..6217712810.00050650.005583.89E-05
RAD51AP2chr2:17557392..1756215970.003099360.0009313.97E-05
CDR2Lchr17:70496373..7051301360.000389200.008894.7E-05
ZNF652chr17:44728416..44791811390.0055431480.0007845.8E-05
WDR86chr7:150709151..150736999230.0089851010.0005997.07E-05
HSD11B1chr1:207927291..207973298280.005835760.0009767.45E-05
SYCE1chr10:135217645..135231917540.0019291000.003548.81E-05
SSFA2chr2:182469362..182502779160.003418480.002059.02E-05
WT1chr11:32366048..32412429650.0026591590.002769.41E-05
RAX2chr19:3720253..372089940.001959160.0041E-04
CAPGchr2:85475570..85489706110.009872770.0008510.000107
PIGXchr3:197926529..197946898120.003838750.002280.000111
RPS3chr11:74788237..7479413540.001819310.005640.000128
PGRMC2chr4:129414746..12942770950.003674190.00280.000128
TRAF3IP3chr1:207996202..208020329320.00363610.00330.000148
PTPLAD1chr15:63611672..63657183240.005597890.002310.000158
Comparison of significant shared gene sets in AQP4 negative NMOSD produced with VEGAS2 and PLINK.

Protein-Protein Association Network Analysis of NMOSD Susceptibility Genes

To further verify these findings, a protein-protein interaction network analysis was conducted for AQP4-positive and negative NMOSD potential susceptibility genes. Interestingly, significant connectivity among proteins encoded by these NMOSD susceptibility genes was highlighted according to the protein-protein interaction network in STRING (version 11.0) (Figures 2, 3).
FIGURE 2

Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-positive NMOSD. Network nodes represent the proteins produced by a single, protein-coding gene locus. Edges represent protein-protein associations meant to be specific and meaningful. In STRING, blue edge represents protein-protein associations with known interactions from curated databases, purple edge represents protein-protein associations with known interactions experimentally determined, green edge represents protein-protein associations with predicted interactions with gene neighborhood, red edge represents protein-protein associations with predicted interactions with gene fusions, Navy blue edge represents protein-protein associations with predicted interactions with gene co-occurrence, and other associations with text mining (yellow edge), co-expression (black edge), and protein homology (lavender edge).

FIGURE 3

Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-negative NMOSD.

Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-positive NMOSD. Network nodes represent the proteins produced by a single, protein-coding gene locus. Edges represent protein-protein associations meant to be specific and meaningful. In STRING, blue edge represents protein-protein associations with known interactions from curated databases, purple edge represents protein-protein associations with known interactions experimentally determined, green edge represents protein-protein associations with predicted interactions with gene neighborhood, red edge represents protein-protein associations with predicted interactions with gene fusions, Navy blue edge represents protein-protein associations with predicted interactions with gene co-occurrence, and other associations with text mining (yellow edge), co-expression (black edge), and protein homology (lavender edge). Network of known and predicted interactions between proteins encoded by NMOSD potential susceptibility genes were identified by GWAS of AQP4-negative NMOSD.

Pathway Analysis of NMOSD-GWAS Dataset

In AQP4-positive NMOSD, KEGG pathway analysis was conducted on the 50 potential susceptibility genes, and 12 significant KEGG pathways were identified. (p < 0.05). The 12 KEGG pathways fall into seven categories: (1) Immune disease (n = 2) : systemic lupus erythematosus (SLE) (hsa05322) and Rheumatoid arthritis (RA) (hsa05323), (2) Immune system (n = 2): antigen processing and presentation (hsa04612), complement and coagulation cascades (hsa04610), (3) Infectious disease (n = 4): Staphylococcus aureus infection (hsa05150), Vibrio cholerae infection (hsa05110), Leishmaniasis (hsa05140) and herpes simplex infection (hsa05168), (4) Endocrine and metabolic disease (n = 1): Type I diabetes mellitus (hsa04940), (5) Transport and catabolism (n = 1): Phagosome (hsa04145), (6) Substance dependence (n = 1): Alcoholism (hsa05034), (7) Longevity regulating pathway (n = 1): Longevity regulating pathway (hsa04211) (Table 4). In KEGG, genes associated with immune disease and system include HLA-DQA1, HIST1H2BL, HIST1H2AL, C2, CFB, HSPA1A, TAP2, HIST1H3Jand ATP6V1G2. In GO annotations of the 128 pathways identified, 67 biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. The top 10 of GO annotations are illustrated in Figure 4, and detailed results are listed in Supplementary Table 2. Regarding the GO analysis, regulation of humoral immune response (GO:0002920), protein processing (GO:0016485), regulation of complement activation (GO:0030449), regulation of protein activation cascade (GO:2000257) and complement activation (GO:0006956) were associated with C2 and CFB locus. Histone H3K9 methylation (GO:0051567) and histone methyltransferase activity (H3K27 and H3K9) (GO:0046976 and GO:0046976) were associated with EHMT2 and HIST1H1B locus; phosphatidylcholine metabolic process (GO:0046470), phosphatidylcholine biosynthetic process (GO:0006656) and vitamin transmembrane transporter activity (GO:0090482) were associated with SLC44A4 locus; DNA repair complex (GO:1990391) was associated with MSH5 locus; pathways related to antigen processing, presentation (GO:0002478, GO:0019884, and GO0048002) and MHC protein complex (GO:0042611) are also associated with HLA-DQA1 locus.
TABLE 4

Significant KEGG pathways of AQP4-positive and negative with P < 0.05 by pathway analysis of NMOSD GWAS.

Pathway IDPathway namePathway classificationP-valueRelated genes
AQP4-positive
hsa05322Systemic lupus erythematosusImmune disease6.62E-05HLA-DQA1; HIST1H2BL; HIST1H2AL; C2; HIST1H3J
hsa05150Staphylococcus aureus infectionInfectious disease8.17E-04HLA-DQA1; CFB; C2
hsa04612Antigen processing and presentationImmune system0.002058HLA-DQA1; HSPA1A; TAP2
hsa04940Type I diabetes mellitusEndocrine and metabolic disease0.008931HLA-DQA1; LTA
hsa05110Vibrio cholerae infectionInfectious disease0.011943ATP6V1G2; TJP2
hsa04145PhagosomeTransport and catabolism0.013689HLA-DQA1; ATP6V1G2; TAP2
hsa05134LegionellosisInfectious disease0.014332EEF1A1; HSPA1A
hsa05034AlcoholismSubstance dependence0.021444HIST1H2BL; HIST1H2AL; HIST1H3J
hsa05168Herpes simplex infectionInfectious disease0.023038HLA-DQA1; LTA; TAP2
hsa04610Complement and coagulation cascadesImmune system0.028318CFB; C2
hsa04211Longevity regulating pathwayAging0.03527EHMT2; APPL1
hsa05323Rheumatoid arthritisImmune disease0.035998HLA-DQA1; ATP6V1G2
AQP4-negative
hsa00563Glycosylphosphatidylinositol (GPI)-anchor biosynthesisGlycan biosynthesis and metabolism1.33E-02PIGX
hsa00140Steroid hormone biosynthesisLipid metabolism3.18E-02HSD11B1
hsa00980Metabolism of xenobiotics by cytochrome P450Xenobiotics biodegradation and metabolism4.01E-02HSD11B1
hsa05204Chemical carcinogenesisCancer4.32E-02HSD11B1
FIGURE 4

The top ten GO annotations of AQP4-positive NMOSD, (A) is for biological process, (B) is for molecular function, and (C) is for cellular component.

Significant KEGG pathways of AQP4-positive and negative with P < 0.05 by pathway analysis of NMOSD GWAS. The top ten GO annotations of AQP4-positive NMOSD, (A) is for biological process, (B) is for molecular function, and (C) is for cellular component. In AQP4 negative NMOSD, seven KEGG pathways were available for the analysis of 22 potential susceptibility genes. Four significant KEGG pathways were highlighted (p < 0.05). The 4 KEGG pathways are comprised of four categories: (1) Glycan biosynthesis and metabolism: glycosylphosphatidylinositol (GPI)-anchor biosynthesis (hsa00563), (2) Lipid metabolism: Steroid hormone biosynthesis (hsa00140), (3) Xenobiotics biodegradation and metabolism: metabolism of xenobiotics by cytochrome P450 (hsa00980), (4) Cancer: Chemical carcinogenesis (hsa05204) (Table 4). In GO annotations of the 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. The top 10 results of the GO analysis are illustrated in Figure 5, and detailed results are listed in Supplementary Table 4.
FIGURE 5

The top ten GO annotations of AQP4- negative NMOSD, (A) is for biological process, (B) is for molecular function, and (C) is for cellular component.

The top ten GO annotations of AQP4- negative NMOSD, (A) is for biological process, (B) is for molecular function, and (C) is for cellular component.

Discussion

Progression in NMOSD research has led to the deduction that environmental and genetic factors underlie the etiology of NMOSD. NMOSD genetic risk is still required to be identified to date. A NMOSD-GWAS dataset comprising 215 NMOSD cases (132 AQP4-positive and 83 AQP4-negative) and 1244 control cases of European ancestry which contain 7,138,498 autosomal SNPs was chosen as regards to the present study. As a useful complementary approach to per SNP-GWAS, gene-based tests play many roles in genetics research. Studies have shown that gene-based to be more powerful than the per SNP-GWAS approach, also, they are suitable for integrating the results of pathway tests from GWAS (Liu et al., 2010). Recently, Estrada et al. (2018) identified two independent single nucleotide polymorphisms (SNPs) (rs1150757 and rs28383224) in the MHC region associated with AQP4-Ab positive NMOSD. For further verification and as a supplement to Estrada et al’s. (2018) research based on SNP analysis, our study is based on gene analysis and focused on functional comparison. In the study, more stringent p-Values were used in both methods and five shared risk genes were obtained at chromosome 6 in AQP4-positive NMOSD cases (p < 0.05/16,532), while shared risk genes in the AQP4 negative NMOSD group were absent (p < 0.05/16,485). Further, there were 50 and 22 potential susceptibility genes in AQP4-positive and negative NMOSD, respectively. Combining the protein-protein association network analysis and pathway tests, focus was on susceptibility genes that were not only significant in gene-based tests, but also meaningful in the functional analyses. In AQP4-positive NMOSD group, the five risk genes comprised CFB associated with complements, HLA-DQA1 associated with antigen processing, and EHMT2, MSH5, and SLC44A4 associated with immune regulation. Complement factor B (CFB) localizes to the MHC class III region on chromosome 6 and is a component of the alternative pathway of complement activation. Upon activation of the alternative pathway, CFB is cleaved after which it yields the noncatalytic chain Ba and the catalytic subunit Bb. The active subunit Bb is a serine protease which associates with C3b to form the alternative pathway C3 convertase. Bb is involved in the proliferation of preactivated B lymphocytes, while Ba inhibits their proliferation. Previous studies suggest that CFB may contribute to SLE (Gateva et al., 2009). Similarly, the analysis showed that CFB may be involved in the pathogenesis of NMOSD. Using the KEGG pathways analysis, it was determined that complement and coagulation cascades are associated with C2-CFB locus, previous studies showed AQP4 antibodies and complement-mediated damage are associated with NMOSD. Additionally, terminal complement inhibitor therapy of NMOSD has been proposed and proved effective among AQP4-IgG-positive NMOSD patients (Pittock et al., 2019). Similarly, the GO analysis demonstrated that regulation of humoral immune response (GO:0002920), protein processing (GO:0016485), regulation of complement activation (GO:0030449), regulation of protein activation cascade (GO:2000257) and complement activation (GO:0006956) are associated with C2-CFB locus. Based on molecular function, serine-type endopeptidase activity (GO:0004252), serine-type peptidase activity (GO:0008236) and serine hydrolase activity (GO:0017171) are associated with C2-CFB locus. Hence, our analysis shows that complements associated with C2-CFB locus play vital roles in the pathogenesis of NMOSD. However, in Estrada et al’s. (2018) study, C4 CNV was associated with NMO-IgG+ but not NMO- IgG-, which is slightly different. HLA-DQA1 (MHC class II, DQ alpha 1) gene belongs to a group of MHC genes that encode the HLA-DQ heterodimer. Previous studies indicated that HLA-DQA1 is strongly associated with SLE (Demirci et al., 2016), systemic sclerosis (SSc) (Guo et al., 2016) and anticitrullinated protein antibodies (ACPA)-positive RA (Guo et al., 2019). Estrada et al’s. (2018) study demonstrated that HLA-DQA1 (rs28383224) was shared between AQP4-positive and negative NMOSD, suggesting that at least one common genetic determinant exists for both groups. However, in our study, there were no shared genes between the AQP4-positive and negative group. Given the complexity of the MHC region, larger studies will be needed to determine the role of HLA alleles to understand the effect of these haplotypes on the NMOSD subgroup. In AQP4-positive NMOSD, SLE (hsa05322) was the most significant pathway (p = 6.62E-05) associated with HLA-DQA1, HIST1H2BL, HIST1H2AL, C2, and HIST1H3J locus. Other immune diseases, such as RA (hsa05323) were associated with HLA-DQA1 and ATP6V1G2 locus and it was also revealed that autoimmune diseases may coexist with NMOSD or share common pathological pathways. In previous studies, NMOSD with AQP4-IgG+ has been shown to be associated with a high frequency of autoantibodies and autoimmune diseases including SLE (Asgari et al., 2018), RA, Sjogren’s syndrome (SS) (Pittock et al., 2008), myasthenia gravis (MG) (McKeon et al., 2009), and antiphospholipid syndrome (APS) (Guerra et al., 2018). Regarding Estrada et al’s. (2018) study, NMO-IgG+ is genetically similar to SLE. Additionally, KEGG pathway analysis demonstrated staphylococcus (S.) aureus infection (hsa05150) to be associated with C2, CFB, and HLA-DQA1 locus was the second most significant signal (p = 8.17E-04) in AQP4-positive NMOSD. Previous studies showed that superantigens (SAgs) might be potent T cell activators, and it causes the deregulation of the immune response resulting in the proliferation of autoreactive T cells and the development and/or exacerbation of chronic autoimmune diseases (Foster, 2005). Among bacterial superantigens, the relevance of S. aureus superantigens with MS exacerbation has been depicted (Mulvey et al., 2011; Libbey et al., 2014). Kumar et al. demonstrated the beneficial effect of chronic S. aureus infection in a model of MS through the secretion of extracellular adherence protein, indicating a dual role of S. aureus infection in the pathogenesis of MS (Kumar et al., 2015). Our study revealed that the infection disease pathway might be involved in the pathogenesis of AQP4-positive NMOSD. The MSH5 (MutS homolog 5) gene located in MHC class III region comprises 26 exons and spans 25 kb and is involved in DNA mismatch repair and meiotic recombination (Clark et al., 2013). Prior studies confirmed MSH5 as susceptibility loci for SLE, SS, and MS (Song et al., 2013; Demirci et al., 2016; Shen et al., 2019). Additionally, Fernando et al. determined the presence of risk and protective signals in and surrounding MSH5 (best risk SNP rs3130490; best protective SNP rs409558) in SLE (Fernando et al., 2012). According to the GO analysis, DNA repair complex (GO:1990391) and damaged DNA binding (GO:0003684) associated with MSH5 locus might be involved in the pathogenies of NMOSD. Previous studies showed defective DNA repair in SLE and even in quiescent SLE (Souliotis et al., 2016). Luo et al. revealed that novel autoantibodies were most related to cell death, cell cycle, and DNA repair pathways in SLE (Luo et al., 2019). Clark et al. highlighted important mechanisms of DNA methylation in RA and the wider context of immune dysregulation (Clark et al., 2020). Prior studies identified aberrant expression of apoptosis and DNA damage regulatory genes in MS, suggesting that DNA methylation may be effective in neuronal loss in RRMS (Gökdoǧan Edgünlü et al., 2020). Consistent with previous studies, MSH5 might contribute to the pathogenesis of NMOSD. Euchromatic Histone Lysine Methyltransferase 2 (EHMT2) is a methyltransferase and is involved with the demethylation of histone H3 at lysine 9 (H3K9) (Fiszbein et al., 2016). In vitro studies presented that EHMT2 promoted neuronal and immature oligodendrocyte differentiation was required for oligodendrocyte maturation. In previous studies, the methylation of histone H3K9, catalyzed by EHMT1 and EHMT2, was most depleted in patients with anti-neutrophil cytoplasmic autoantibody (ANCA)-associated vasculitis (AAV) (Yang et al., 2016). Ding et al. presented EHMT2 as a potent positive regulator of FOXP3 expression, playing essential roles in controlling immune responses in autoimmune diseases (Ding et al., 2019). A recent analysis observed higher C3 and lower EHMT2 plasma levels related to increasing brain atrophy in MS patients (Malekzadeh et al., 2019). The EHMT2 inhibitor triggered the inhibition of human diffuse large B-cell lymphoma cell proliferation leading to G1 phase arrest and induced apoptosis via endogenous and exogenous apoptotic pathways (Xu et al., 2021). The histone modification is a critical factor in regulating gene expression. Recent studies in autoimmune diseases suggest that increased expression of TLR2 in CD4+ T cells enhances immune reactivity and promotes IL-17 expression through upregulating H3K4 while downregulating H3K9 tri-methylation level in SLE (Liu et al., 2015). Additionally, Ding et al. discovered that increased B cell lymphoma 6 protein upregulates H3K27 trimethylation and downregulates H3K9/H3K14 acetylation of the MicroRNA-142 promoter in SLE CD4+ T cells (Ding et al., 2020). Araki et al. suggests that the histone lysine methylation (HKM) modifying enzymes results in changes of the gene expression of RA synovial fibroblasts (Araki et al., 2018). With respect to the GO analysis, histone H3K9 methylation (GO:0051567) and histone methyltransferase activity (H3K27 and H3K9) (GO:0046976 and GO:0046976), associated with EHMT2 locus, might be involved in the pathogenies of NMOSD. The SLC44A4 (solute carrier family 44 member 4) gene located at 6p21.33 encodes human thiamine pyrophosphate transporter (TPPT), which is involved in the uptake of choline by cholinergic neurons. The SLC44A4 gene was reported to be associated with ulcerative colitis (UC) susceptibility in the Indian, Japanese, and Chinese populations (Gupta and Thelma, 2016; Gupta et al., 2016; Wu et al., 2017). There were no shared potential susceptibility genes between the AQP4-positive and negative group. Given data limitations and the complexity of the MHC region, Further studies will be needed to understand the genetic and molecular mechanism underlying NMOSD.

Conclusion

In AQP4-positive NMOSD cases, five shared risk genes were obtained at chromosome 6 using stringent p-Values in both methods (p < 0.05/16,532) comprising CFB, EHMT2, HLA-DQA1, MSH5, and SLC44A4. The 50 potential susceptibility gene sets were determined and 12 significant KEGG pathways were identified. In GO annotations of the 128 pathways identified, 67 biological process pathways, 32 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. There were no shared potential susceptibility genes between the AQP4-positive and negative group, also 4 significant KEGG pathways were identified. In the GO annotations of the 165 pathways identified, 99 biological process pathways, 37 cellular-component pathways, and 29 molecular-function pathways with a p-Value of <0.05 were obtained. The potential molecular mechanism underlying NMOSD may be related to proteins encoded by the novel genes in complements, antigen presentation, and immune regulation. The results may represent an improved understanding of the genetic and molecular mechanism underlying NMOSD.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author Contributions

TL, HL, and C-SY conceived and designed the study for NMOSD. TL and HL analyzed the GWAS data and wrote the manuscript. LY and F-DS were responsible for subject guidance. C-SY was responsible for research supervision and manuscript revision. YL, S-AD, MY, Q-XZ, and BF provided technical support. All authors approved the final version for submission.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  51 in total

1.  Genome-wide pathway analysis of a genome-wide association study on multiple sclerosis.

Authors:  Gwan Gyu Song; Sung Jae Choi; Jong Dae Ji; Young Ho Lee
Journal:  Mol Biol Rep       Date:  2012-12-14       Impact factor: 2.316

2.  Staphylococcus aureus harbouring Enterotoxin A as a possible risk factor for multiple sclerosis exacerbations.

Authors:  Michael R Mulvey; Malcolm Doupe; Michael Prout; Christine Leong; Romeo Hizon; Amy Grossberndt; Meghann Klowak; Aneri Gupta; Maria Melanson; Andrew Gomori; Farid Esfahani; Loressa Klassen; Emma E Frost; Michael Namaka
Journal:  Mult Scler       Date:  2011-01-06       Impact factor: 6.312

3.  Eculizumab in Aquaporin-4-Positive Neuromyelitis Optica Spectrum Disorder.

Authors:  Sean J Pittock; Achim Berthele; Kazuo Fujihara; Ho Jin Kim; Michael Levy; Jacqueline Palace; Ichiro Nakashima; Murat Terzi; Natalia Totolyan; Shanthi Viswanathan; Kai-Chen Wang; Amy Pace; Kenji P Fujita; Róisín Armstrong; Dean M Wingerchuk
Journal:  N Engl J Med       Date:  2019-05-03       Impact factor: 91.245

4.  A cross-ethnic survey of CFB and SLC44A4, Indian ulcerative colitis GWAS hits, underscores their potential role in disease susceptibility.

Authors:  Aditi Gupta; Garima Juyal; Ajit Sood; Vandana Midha; Keiko Yamazaki; Arnau Vich Vila; Motohiro Esaki; Toshiyuki Matsui; Atsushi Takahashi; Michiaki Kubo; Rinse K Weersma; B K Thelma
Journal:  Eur J Hum Genet       Date:  2016-10-19       Impact factor: 4.246

5.  Coexistence of myasthenia gravis and serological markers of neurological autoimmunity in neuromyelitis optica.

Authors:  Andrew McKeon; Vanda A Lennon; Anu Jacob; Marcelo Matiello; Claudia F Lucchinetti; Nilufer Kale; Koon H Chan; Brian G Weinshenker; Metha Apiwattinakul; Dean M Wingerchuk; Sean J Pittock
Journal:  Muscle Nerve       Date:  2009-01       Impact factor: 3.217

6.  Evaluation of an approximation method for assessment of overall significance of multiple-dependent tests in a genomewide association study.

Authors:  Valentina Moskvina; Colm O'Dushlaine; Shaun Purcell; Nick Craddock; Peter Holmans; Michael C O'Donovan
Journal:  Genet Epidemiol       Date:  2011-10-17       Impact factor: 2.135

7.  Sequencing of the MHC region defines HLA-DQA1 as the major genetic risk for seropositive rheumatoid arthritis in Han Chinese population.

Authors:  Jianping Guo; Tao Zhang; Hongzhi Cao; Xiaowei Li; Hao Liang; Mengru Liu; Yundong Zou; Yuanwei Zhang; Yuxuan Wang; Xiaolin Sun; Fanlei Hu; Yan Du; Xiaodong Mo; Xu Liu; Yue Yang; Huanjie Yang; Xinyu Wu; Xuewu Zhang; Huijue Jia; Hui Jiang; Yong Hou; Xin Liu; Yin Su; Mingrong Zhang; Huanming Yang; Jian Wang; Liangdan Sun; Liang Liu; Leonid Padyukov; Luhua Lai; Kazuhiko Yamamoto; Xuejun Zhang; Lars Klareskog; Xun Xu; Zhanguo Li
Journal:  Ann Rheum Dis       Date:  2019-04-01       Impact factor: 19.103

8.  The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible.

Authors:  Damian Szklarczyk; John H Morris; Helen Cook; Michael Kuhn; Stefan Wyder; Milan Simonovic; Alberto Santos; Nadezhda T Doncheva; Alexander Roth; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2016-10-18       Impact factor: 16.971

9.  Plasma proteome in multiple sclerosis disease progression.

Authors:  Arjan Malekzadeh; Cyra Leurs; Wessel van Wieringen; Martijn D Steenwijk; Menno M Schoonheim; Michael Amann; Yvonne Naegelin; Jens Kuhle; Joep Killestein; Charlotte E Teunissen
Journal:  Ann Clin Transl Neurol       Date:  2019-07-31       Impact factor: 4.511

10.  WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013.

Authors:  Jing Wang; Dexter Duncan; Zhiao Shi; Bing Zhang
Journal:  Nucleic Acids Res       Date:  2013-05-23       Impact factor: 16.971

View more
  3 in total

Review 1.  Neuromyelitis Optica Spectrum Disorder: From Basic Research to Clinical Perspectives.

Authors:  Tzu-Lun Huang; Jia-Kang Wang; Pei-Yao Chang; Yung-Ray Hsu; Cheng-Hung Lin; Kung-Hung Lin; Rong-Kung Tsai
Journal:  Int J Mol Sci       Date:  2022-07-18       Impact factor: 6.208

Review 2.  Genetics behind Cerebral Disease with Ocular Comorbidity: Finding Parallels between the Brain and Eye Molecular Pathology.

Authors:  Kao-Jung Chang; Hsin-Yu Wu; Aliaksandr A Yarmishyn; Cheng-Yi Li; Yu-Jer Hsiao; Yi-Chun Chi; Tzu-Chen Lo; He-Jhen Dai; Yi-Chiang Yang; Ding-Hao Liu; De-Kuang Hwang; Shih-Jen Chen; Chih-Chien Hsu; Chung-Lan Kao
Journal:  Int J Mol Sci       Date:  2022-08-26       Impact factor: 6.208

3.  Rare variants and HLA haplotypes associated in patients with neuromyelitis optica spectrum disorders.

Authors:  Inna Tabansky; Akemi J Tanaka; Jiayao Wang; Guanglan Zhang; Irena Dujmovic; Simone Mader; Venkatesh Jeganathan; Tracey DeAngelis; Michael Funaro; Asaff Harel; Mark Messina; Maya Shabbir; Vishaan Nursey; William DeGouvia; Micheline Laurent; Karen Blitz; Peter Jindra; Mark Gudesblatt; Alejandra King; Jelena Drulovic; Edmond Yunis; Vladimir Brusic; Yufeng Shen; Derin B Keskin; Souhel Najjar; Joel N H Stern
Journal:  Front Immunol       Date:  2022-10-04       Impact factor: 8.786

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.