Homozygosity Haplotype and Whole-Exome Sequencing Analysis to Identify Potentially Functional Rare Variants Involved in Multiple Sclerosis among Sardinian Families.

Teresa Fazia1, Daria Marzanati1, Anna Laura Carotenuto1, Ashley Beecham2,3, Athena Hadjixenofontos2,3, Jacob L McCauley2,3, Valeria Saddi4, Marialuisa Piras4, Luisa Bernardinelli1, Davide Gentilini1,5.   

Abstract

Multiple Sclerosis (MS) is a complex multifactorial autoimmune disease, whose sex- and age-adjusted prevalence in Sardinia (Italy) is among the highest worldwide. To date, 233 loci have been associated with MS and almost 20% of risk heritability is attributable to common genetic variants, but many low-frequency and rare variants remain to be discovered. Here, we aimed to contribute to the understanding of the genetic basis of MS by investigating potentially functional rare variants. To this end, we analyzed thirteen multiplex Sardinian families with Immunochip genotyping data. For five families, Whole Exome Sequencing (WES) data were also available. Firstly, we performed a non-parametric Homozygosity Haplotype analysis to identify the Region from Common Ancestor (RCA). Then, on these potential disease-linked RCAs, we searched for rare variants shared by the affected individuals by analyzing WES data. We found: (i) a variant (43181034 T > G) in the splicing region on exon 27 of CUL9; (ii) a variant (50245517 A > C) in the splicing region on exon 16 of ATP9A; (iii) a non-synonymous variant (43223539 A > C) on exon 9 of TTBK1; (iv) a non-synonymous variant (42976917 A > C) on exon 9 of PPP2R5D; and (v) a variant (109859349-109859354) in the 3'UTR of MYO16.

Keywords:  Homozygosity Haplotype analysis; Region from Common Ancestor (RCA); Sardinian population; WES data; low-frequency variants; multiple sclerosis; multiplex families; rare variants

Year:  2021        PMID: 34889895      PMCID: PMC8929092          DOI: 10.3390/cimb43030125

Source DB:  PubMed          Journal:  Curr Issues Mol Biol        ISSN: 1467-3037            Impact factor:   2.976


1. Introduction

Multiple Sclerosis (MS) is a complex neurological autoimmune disease, which mainly affects people in early adulthood; for this reason, it is considered the most common cause of neurologic disability in young adults [1,2]. The prevalence of the disease varies across countries: it is high in Europe, with a north-to-south gradient, and lower in Asia and Africa [3]. In Italy, we observed a disease prevalence of 176 per 100,000 inhabitants [4], except in the Mediterranean island of Sardinia, where we found an age- and sex-adjusted prevalence of MS of 330 per 100,000 inhabitants, among the highest reported worldwide, ranging from 217 in the Olbia-Tempio district to 425 in the Ogliastra district [5], with the lowest risk areas being closer to the coast. Although most MS cases occur sporadically, about 20% of the affected individuals are related by family, with first-degree relatives of MS patients at increased risk of disease, thus suggesting that the disease is moderately heritable, with a sibling relative recurrence risk of 6.35 in the Caucasian population [6] and of 31 in the founder population of the Sardinian province of Nuoro [7]. In line with other common, complex disorders, almost 20% of risk heritability is attributable to common genetic variants in the autosomal genome, including 233 unequivocally MS-associated loci identified over the last 15 years by genome-wide association studies (GWAS), comprising 32 loci within the Major Histocompatibility Complex (MHC) [8,9,10,11,12,13], each of which explains only a small fraction of risk [14]. A recent study by the International Multiple Sclerosis Genetics Consortium (IMSGC) [15] provides evidence that 11.34% of risk heritability is explained by low-frequency variants (Minor Allele Frequency (MAF) < 5%); of these, rare variants (MAF < 1%) alone explain 9%.
Most low-frequency variants impact genes that are not detectable by common variants identified by genome-wide association studies (GWAS), and only a small portion of them is in Linkage Disequilibrium (LD) with variants highlighted by GWAS. Many low frequency and rare variant associations, as important sources of unexplained heritability, remain to be discovered [16]. This investigation would require large sample sizes to reach an appropriate statistical power, or alternatively, the use of multiplex families from a founder population for which both genotyping and sequencing data are available. Our study aims at understanding the genetic contribution to MS and suggesting new potential causative variants in families, by contributing to the discovery of new exonic and potentially functional low-frequency variants. To this end, we analyzed multiplex families originating from the genetically homogeneous and isolated population of the Nuoro province of Sardinia for which Immunochip genotyping and whole exome-sequencing (WES) data are available. We followed a two-stage approach. In the first stage, we prioritized candidate regions to be further investigated via a non-parametric Homozygosity Haplotype (HH) analysis, which uses reduced haplotypes composed by homozygous single nucleotide polymorphisms (SNPs) only and deletes all the heterozygous ones. We performed this analysis on thirteen families by exploiting the co-segregation of the disease and genetic variants between affected and unaffected subjects for a genome-wide search of shared autosomal segments. In the second stage, on the promising candidate regions identified in the HH analysis, we searched for the presence of rare variants shared by the affected individuals by analyzing WES data that were available for five families only.

2. Results

2.1. Sample Description

Thirteen multiplex Sardinian pedigrees, containing from three to sixteen MS patients each, were selected for the analysis, for a total of 80 affected (63 with Immunochip genotyping data) and 655 unaffected (220 with Immunochip genotyping data) subjects. Table 1 reports the description of the family data available for the HH analysis. We analyzed a total of 129,448 Immunochip QC-filtered SNPs that had assigned dbSNP refIDs. WES data were available for five families only: three cases and one control for family 61, 10 cases for family 2360, two cases for family 45, four cases for family 4, and five cases for family 5.
Table 1

Description of the family data available for the HH analysis. For each family, the total numbers of affected and unaffected subjects are reported together with the availability of Immunochip genetic data.

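Family | Total N. of Affected | N. of Affected with Genotyping Data | Total N. of Unaffected | N. of Unaffected with Genotyping Data
3 | 6 | 5 | 84 | 29
4 | 5 | 5 | 37 | 23
5 | 8 | 5 | 79 | 22
9 | 9 | 8 | 64 | 22
12 | 3 | 3 | 19 | 3
21 | 5 | 5 | 53 | 19
26 | 6 | 3 | 43 | 13
44 | 3 | 3 | 14 | 9
45 | 6 | 4 | 36 | 15
58 | 3 | 2 | 27 | 14
61 | 7 | 6 | 40 | 17
81 | 3 | 3 | 23 | 8
2360 | 16 | 11 | 136 | 26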

2.2. Identification of RCHHs

The HH statistical analysis was performed for all the 13 families using both the genotyped patients (n = 63) and controls (n = 220), so that the algorithm worked to treat the affected and unaffected members of a family as cases and controls, respectively. As we had Immunochip data [10], obtained with an Illumina Infinium HD custom array designed for the fine mapping of 184 established autoimmune loci, and not a high-density array, we used a cutoff value of 7 cM to search the candidate RCHHs, in order to reduce the risk of false positives and to increase specificity, and we used CEUAnnotation1MDuo (http://www.hhanalysis.com/ (accessed on 15 February 2021)) as an annotation file. We selected regions with a significance level of -log10 (p-value) 1.2, corresponding to a p-value of 0.06, to establish difference between patient and control pools. The choice of this liberal level of significance was driven by our research strategy in which HH analysis represents a step towards prioritizing candidate regions, and thus, towards reducing false negative probability, in order to enable further investigation in the second stage (WES analysis). HH was run to identify Regions with a Conserved Homozygosity Haplotype (RCHHs) that were differently shared among cases and controls (see Table 2). Then, we searched for genes in each significant RCHH, and Figure 1 reports the upset plot in which the number of genes shared between families is graphically represented. The first five bar charts on the top of the figure represent the number of genes in the significant RCHHs found only in each individual family (whose identified number is specified in the left of the figure), e.g., 305 genes that are only in family 2360 but not in the other families, etc. The last three bar charts on the top of the figure report the number of genes shared between families. 
In particular, families 4 and 44 share one gene, i.e., AL833583, and families 3 and 2360 share DQ596042, while families 9 and 44 share the USP16 gene. Interestingly, increased levels of USP16, a deubiquitinase required for chromosomal segregation in mitosis [17], were observed in peripheral CD4+ T cells from patients with distinct autoimmune diseases, and T cell-specific USP16 knockout mice showed a reduced severity of experimental autoimmune encephalomyelitis [18]. Families that are not reported in the upset plot did not share any gene with the other families.
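The relation between the -log10 (p-value) threshold and the p-value quoted above can be checked with a short sketch. The function names are illustrative and are not part of the HH program; the only values taken from the text are the threshold of 1.2 and the scores listed in Table 2.

```python
import math


def neg_log10_to_p(score: float) -> float:
    """Convert a -log10(p-value) score back to the p-value."""
    return 10.0 ** (-score)


def p_to_neg_log10(p_value: float) -> float:
    """Convert a p-value to the -log10 scale used in Table 2."""
    return -math.log10(p_value)


def passes_threshold(score: float, threshold: float = 1.2) -> bool:
    """True when a region's score meets the liberal threshold used in the study."""
    return score >= threshold


if __name__ == "__main__":
    # A threshold of 1.2 is a p-value of about 0.063, reported as 0.06 in the text.
    print(round(neg_log10_to_p(1.2), 3))
    # Example scores as they appear in Table 2.
    for score in (1.20, 1.63, 1.96):
        print(score, passes_threshold(score))
```

Working on the -log10 scale keeps the comparison simple: a larger score means a smaller p-value, so regions are retained when their score is at least 1.2.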
Table 2

List of RCHHs grouped by family. The table reports the number of subjects sharing an RCHH in both pools, its coordinate and the -log10 (p-value) for the difference between the pools. In our study, we set a significance level of -log10 (p-value) 1.2, corresponding to a p-value of 0.06.

Family | No. Subjects Sharing RCHH in Patient Pool | No. Subjects Sharing RCHH in Control Pool | Chr | Start–End (bp *) | Start–End (SNP) | -log10(p-Value)
2360 | 6 out of 11 | 7 out of 26 | 1 | 20024018-20110347 | rs12405947-rs10492999 | 1.20
2360 | 8 out of 11 | 11 out of 26 | 3 | 127665610-128219449 | rs1687462-rs9826526 | 1.23
2360 | 5 out of 11 | 5 out of 26 | 6 | 26287459-32628250 | rs4458680-rs11757159 | 1.25
2360 | 7 out of 11 | 9 out of 26 | 11 | 16548170-17019030 | rs4593976-rs7942085 | 1.20
2360 | 11 out of 11 | 19 out of 26 | 13 | 106416156-106713791 | rs9555302-rs1819243 | 1.30
2360 | 11 out of 11 | 18 out of 26 | 13 | 106716741-106978764 | rs3949948-rs16970623 | 1.47
2360 | 11 out of 11 | 19 out of 26 | 13 | 106979630-106979630 | rs1830754-rs1830754 | 1.30
2360 | 11 out of 11 | 18 out of 26 | 13 | 106979661-108046934 | rs2076766-rs7992149 | 1.47
2360 | 11 out of 11 | 19 out of 26 | 13 | 108048315-108048315 | rs9583266-rs9583266 | 1.30
2360 | 11 out of 11 | 18 out of 26 | 13 | 108050353-108090301 | rs12586075-rs16972849 | 1.47
2360 | 11 out of 11 | 17 out of 26 | 13 | 108090996-108968251 | rs16972855-rs9521415 | 1.63
2360 | 7 out of 11 | 9 out of 26 | 22 | 43217334-43366873 | rs133807-rs138628 | 1.20
61 | 5 out of 6 | 7 out of 17 | 2 | 221458760-223560597 | rs634813-rs1440063 | 1.22
61 | 6 out of 6 | 10 out of 17 | 4 | 181036139-181598064 | rs7655585-rs2727426 | 1.23
61 | 6 out of 6 | 10 out of 17 | 5 | 31060610-31060610 | rs1392428-rs1392428 | 1.23
61 | 5 out of 6 | 7 out of 17 | 6 | 42767957-43333769 | rs394754-rs7752120 | 1.22
61 | 5 out of 6 | 7 out of 17 | 6 | 45282619-46437363 | rs7762957-rs1372567 | 1.22
61 | 4 out of 6 | 4 out of 17 | 6 | 62416537-67861137 | rs213824-rs9294736 | 1.38
61 | 4 out of 6 | 4 out of 17 | 6 | 100591693-100883678 | rs9399393-rs2658132 | 1.38
61 | 5 out of 6 | 7 out of 17 | 7 | 8552281-8768564 | rs1859275-rs10268580 | 1.22
61 | 6 out of 6 | 9 out of 17 | 9 | 8991207-8991239 | rs10511519-rs10816028 | 1.41
61 | 6 out of 6 | 8 out of 17 | 9 | 8991538-8994896 | rs10511520-rs10816029 | 1.61
61 | 6 out of 6 | 9 out of 17 | 9 | 9000336-9395003 | rs7025315-rs10759064 | 1.41
61 | 6 out of 6 | 10 out of 17 | 9 | 9395316-11328859 | rs1475680-rs17788370 | 1.23
61 | 6 out of 6 | 10 out of 17 | 9 | 12662320-13015284 | rs1408801-rs10514822 | 1.23
61 | 5 out of 6 | 7 out of 17 | 9 | 115676064-115712622 | rs6478042-rs7034929 | 1.22
61 | 6 out of 6 | 10 out of 17 | 9 | 134997809-134997809 | rs626713-rs626713 | 1.23
61 | 6 out of 6 | 10 out of 17 | 12 | 122479650-122743242 | rs1706477-rs7137946 | 1.23
61 | 6 out of 6 | 10 out of 17 | 12 | 127998742-128288581 | rs11060036-rs10847807 | 1.23
61 | 5 out of 6 | 7 out of 17 | 16 | 84533519-84542101 | rs305059-rs908988 | 1.22
61 | 5 out of 6 | 6 out of 17 | 20 | 48202462-49044808 | rs6020298-rs6122991 | 1.43
61 | 5 out of 6 | 7 out of 17 | 20 | 49044993-50323395 | rs11904901-rs6068117 | 1.22
61 | 5 out of 6 | 6 out of 17 | 20 | 50326580-50563902 | rs6021835-rs2024650 | 1.43
45 | 4 out of 4 | 7 out of 15 | 1 | 22014570-23032859 | rs2010397-rs4654821 | 1.22
45 | 3 out of 4 | 4 out of 15 | 7 | 54280160-56646646 | rs13438238-rs2634081 | 1.21
45 | 4 out of 4 | 7 out of 15 | 13 | 72551882-73411123 | rs4883922-rs728926 | 1.22
45 | 4 out of 4 | 6 out of 15 | 13 | 73411181-75116077 | rs17288193-rs1006412 | 1.40
44 | 3 out of 3 | 3 out of 9 | 1 | 2289487-2455004 | rs7545940-rs9803764 | 1.24
44 | 3 out of 3 | 3 out of 9 | 1 | 156653770-160351933 | rs2188102-rs1415259 | 1.24
44 | 3 out of 3 | 3 out of 9 | 2 | 138041133-138458405 | rs7563139-rs10199542 | 1.24
44 | 3 out of 3 | 3 out of 9 | 4 | 96765575-103179634 | rs11941922-rs2129294 | 1.24
44 | 3 out of 3 | 3 out of 9 | 5 | 73747-556484 | rs7709758-rs6420045 | 1.24
44 | 3 out of 3 | 3 out of 9 | 5 | 150483977-150816773 | rs2303027-rs17802828 | 1.24
44 | 3 out of 3 | 3 out of 9 | 5 | 154874578-156864954 | rs1295243-rs2277027 | 1.24
44 | 3 out of 3 | 3 out of 9 | 7 | 142903019-142909027 | rs6963381-rs1880560 | 1.24
44 | 3 out of 3 | 3 out of 9 | 8 | 602758-3038563 | rs9314595-rs13261550 | 1.24
44 | 3 out of 3 | 3 out of 9 | 16 | 10909415-11580966 | rs2229321-rs8050461 | 1.24
26 | 3 out of 3 | 3 out of 13 | 1 | 171619320-173691804 | rs6701066-rs860905 | 1.65
26 | 3 out of 3 | 4 out of 13 | 1 | 173695580-174106457 | rs1016815-rs10798418 | 1.40
26 | 3 out of 3 | 4 out of 13 | 14 | 80069595-82872217 | rs1543918-rs17625929 | 1.40
21 | 5 out of 5 | 10 out of 19 | 1 | 14226898-14226898 | rs4579751-rs4579751 | 1.27
21 | 5 out of 5 | 10 out of 19 | 5 | 73478194-73598034 | rs2120729-rs1460812 | 1.27
9 | 4 out of 8 | 4 out of 22 | 11 | 45356859-51450167 | rs717653-rs12291581 | 1.31
9 | 4 out of 8 | 4 out of 22 | 12 | 44020977-44336606 | rs10785572-rs878111 | 1.31
9 | 7 out of 8 | 10 out of 22 | 21 | 27810551-29354016 | rs2830992-rs1064019 | 1.48
5 | 3 out of 5 | 4 out of 22 | 5 | 58726008-59841224 | rs525099-rs40512 | 1.41
5 | 4 out of 5 | 8 out of 22 | 9 | 85066453-85227053 | rs10867967-rs871790 | 1.21
5 | 4 out of 5 | 7 out of 22 | 9 | 85228118-85448505 | rs3860918-rs1052690 | 1.39
5 | 5 out of 5 | 12 out of 22 | 13 | 109106993-109251608 | rs9515092-rs7986346 | 1.22
5 | 5 out of 5 | 11 out of 22 | 13 | 109253434-109607383 | rs11069806-rs9521623 | 1.36
5 | 4 out of 5 | 8 out of 22 | 13 | 109945232-110218960 | rs9555712-rs12865465 | 1.21
4 | 5 out of 5 | 12 out of 23 | 1 | 237070145-237394008 | rs869035-rs1980004 | 1.30
4 | 5 out of 5 | 11 out of 23 | 1 | 237395275-238575190 | rs11808376-rs10495466 | 1.43
4 | 5 out of 5 | 12 out of 23 | 1 | 238576784-238576784 | rs7552602-rs7552602 | 1.30
4 | 5 out of 5 | 11 out of 23 | 1 | 238579605-238713399 | rs12137050-rs9662136 | 1.43
4 | 5 out of 5 | 11 out of 23 | 7 | 142336895-143056687 | rs4236481-rs12540188 | 1.43
4 | 5 out of 5 | 12 out of 23 | 7 | 143059971-143855577 | rs4640977-rs2057868 | 1.30
4 | 5 out of 5 | 11 out of 23 | 7 | 143858588-144083589 | rs17169930-rs7793227 | 1.43
4 | 5 out of 5 | 12 out of 23 | 7 | 144088382-145211344 | rs6954142-rs4601231 | 1.30
4 | 5 out of 5 | 12 out of 23 | 8 | 126600646-126663920 | rs4006563-rs7016867 | 1.30
4 | 5 out of 5 | 12 out of 23 | 8 | 126672376-128144872 | rs4870946-rs1456314 | 1.30
4 | 5 out of 5 | 12 out of 23 | 13 | 109370211-109416832 | rs7323507-rs7984646 | 1.30
4 | 5 out of 5 | 11 out of 23 | 13 | 109419064-109607383 | rs9583447-rs9521623 | 1.43
4 | 5 out of 5 | 11 out of 23 | 14 | 92683815-94120061 | rs12589195-rs12880862 | 1.43
4 | 5 out of 5 | 12 out of 23 | 14 | 94122073-94216494 | rs2069956-rs7148204 | 1.30
4 | 5 out of 5 | 12 out of 23 | 17 | 67527147-67755206 | rs9914764-rs12939271 | 1.30
4 | 5 out of 5 | 12 out of 23 | 21 | 42351204-42525479 | rs17114247-rs881395 | 1.30
4 | 5 out of 5 | 11 out of 23 | 21 | 42525851-42616108 | rs915846-rs691567 | 1.43
4 | 5 out of 5 | 12 out of 23 | 22 | 24549605-24559502 | rs4820658-rs4822661 | 1.30
4 | 5 out of 5 | 12 out of 23 | 22 | 24561312-24579927 | rs17704912-rs2748234 | 1.30
3 | 3 out of 5 | 6 out of 29 | 1 | 171432870-171774723 | rs1234313-rs1461019 | 1.30
3 | 1 out of 5 | 1 out of 29 | 2 | 90959860-91680834 | rs10201040-rs4373803 | 1.23
3 | 5 out of 5 | 14 out of 29 | 3 | 71749225-72031216 | rs864380-rs9861583 | 1.38
3 | 5 out of 5 | 15 out of 29 | 3 | 72031489-72458673 | rs1995453-rs4303823 | 1.27
3 | 5 out of 5 | 14 out of 29 | 3 | 72459498-74148704 | rs6790069-rs1405396 | 1.38
3 | 5 out of 5 | 14 out of 29 | 3 | 74710393-74710393 | rs13073838-rs13073838 | 1.38
3 | 4 out of 5 | 10 out of 29 | 3 | 75820822-76074141 | rs536575-rs4095546 | 1.27
3 | 1 out of 5 | 1 out of 29 | 5 | 69782071-69967168 | rs169717-rs3871460 | 1.23
3 | 4 out of 5 | 10 out of 29 | 6 | 52061483-52205031 | rs6906409-rs6913472 | 1.27
3 | 4 out of 5 | 9 out of 29 | 6 | 52205660-52213626 | rs9395771-rs9382084 | 1.42
3 | 5 out of 5 | 10 out of 29 | 7 | 4275511-4316475 | rs10272180-rs2107834 | 1.96
3 | 5 out of 5 | 11 out of 29 | 7 | 4318943-4930528 | rs2097884-rs2089967 | 1.79
3 | 5 out of 5 | 12 out of 29 | 7 | 4930906-4930906 | rs6947947-rs6947947 | 1.64
3 | 5 out of 5 | 11 out of 29 | 7 | 4934189-5342068 | rs13224720-rs10234709 | 1.79
3 | 5 out of 5 | 12 out of 29 | 7 | 5344404-5549582 | rs13238999-rs2098225 | 1.64
3 | 5 out of 5 | 13 out of 29 | 7 | 5551125-6459404 | rs1725213-rs7810553 | 1.50
3 | 5 out of 5 | 14 out of 29 | 7 | 6461542-7642706 | rs7792987-rs10280185 | 1.38
3 | 5 out of 5 | 15 out of 29 | 7 | 7767212-7831276 | rs17137412-rs12702661 | 1.27
3 | 5 out of 5 | 15 out of 29 | 7 | 150524562-150524562 | rs310586-rs310586 | 1.27
3 | 5 out of 5 | 14 out of 29 | 7 | 150526978-150611097 | rs7458773-rs6953552 | 1.38
3 | 5 out of 5 | 15 out of 29 | 7 | 150613053-150619105 | rs7797007-rs219245 | 1.27
3 | 1 out of 5 | 1 out of 29 | 9 | 40059290-44362584 | rs375972-rs4929023 | 1.23
3 | 1 out of 5 | 1 out of 29 | 9 | 44670536-46992793 | rs12006135-rs7049015 | 1.23
3 | 1 out of 5 | 1 out of 29 | 9 | 65231255-65635106 | rs28533023-rs1480368 | 1.23
3 | 1 out of 5 | 1 out of 29 | 13 | 111553045-114123122 | rs12017986-rs12874290 | 1.23
3 | 1 out of 5 | 1 out of 29 | 20 | 28039018-28259678 | rs7267880-rs6567465 | 1.23

* SNP Position hg18.

Figure 1

Upset plot. In the plot, the intersections of genes between families are visualized. The first five bar charts on the top of the figure represent the number of genes in the significant RCHHs found only in each individual family. The last three bar charts on the top of the figure report the number of genes shared between families.

The Phenolyzer (http://phenolyzer.wglab.org (accessed on 30 March 2021)) and WebGestalt (http://www.webgestalt.org (accessed on 30 March 2021)) tools were used to verify whether the regions contained genes involved in MS-related diseases through their role in the CNS or immune system. We found that: (i) CD244 (highlighted in family 44), CIITA (highlighted in family 44), HLA-DRB1 (highlighted in family 2360) and NFKBIL1 (highlighted in family 2360) are related to rheumatoid arthritis; (ii) BANK1, FCGR2A, FCGR2B and FCGR2C (highlighted in family 44) are related to systemic lupus erythematosus; and (iii) HLA-G (highlighted in family 2360), HNMT (highlighted in family 44), and TNF (highlighted in family 2360) are related to asthma.

2.3. Identification of Pathogenic Variants

WES analysis of available subjects on the regions prioritized by HH analysis was performed to identify putative causative rare variants for MS. Figure 2 reports the pedigree structure of the analyzed families for which we also had WES data, and the identified variants.
Figure 2

Pedigree of families in the study. Squares and circles indicate men and women, respectively. The symbols in black represent the affected members. The squares or circles with a line indicate a deceased individual. The asterisk represents subjects in the pedigree for which WES data were available, while the colored triangles indicate the subjects carrying the variant.

From the FASTQ files, we obtained the VCF files, using hg19 as the reference genome. The VCF files were annotated using wANNOVAR (http://wannovar.wglab.org (accessed on 3 May 2021)), and we searched for variants in the resulting CSV file. Synonymous and benign variants were excluded, while variants whose MAF in the 1000 Genomes (https://www.internationalgenome.org/ (accessed on 10 May 2021)) and gnomAD (https://gnomad.broadinstitute.org (accessed on 10 May 2021)) datasets was less than 0.001 or unknown were included. The aim was to search for rare genetic variants shared between the affected subjects within the genes prioritized via HH analysis. For each family and for each RCHH, the variant, its function and the number of cases and controls sharing the variant are reported in Table 3. Interestingly, three variants, i.e., 43181034 T > G and 50245517 A > C, located in the splicing region on exon 27 of the CUL9 gene and on exon 16 of the ATP9A gene, respectively, and a deletion (43557751-43557832) in the splicing site of the UMODL1 gene, were classified as pathogenic in VarSome (https://varsome.com/ (accessed on 12 October 2021)). The other variants were classified as of uncertain significance.
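The filtering rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout and the field names (`effect`, `clinical_class`, `maf_1000g`, `maf_gnomad`) are hypothetical and do not correspond to actual wANNOVAR column names, the example records are invented, and the sketch assumes that a variant must pass the frequency criterion in both databases.

```python
from typing import Dict, List, Optional

RARE_MAF = 0.001  # frequency cut-off stated in the text


def is_rare(maf: Optional[float]) -> bool:
    """A frequency passes when it is unknown (None) or below the cut-off."""
    return maf is None or maf < RARE_MAF


def keep_variant(variant: Dict) -> bool:
    """Apply the exclusion and inclusion criteria to one annotated variant."""
    if variant["effect"] == "synonymous":
        return False  # synonymous calls are excluded
    if variant["clinical_class"] == "benign":
        return False  # benign calls are excluded
    # Assumption: the variant must be rare or unreported in both datasets.
    return is_rare(variant["maf_1000g"]) and is_rare(variant["maf_gnomad"])


def filter_variants(variants: List[Dict]) -> List[Dict]:
    """Return only the variants that satisfy the criteria."""
    return [v for v in variants if keep_variant(v)]


if __name__ == "__main__":
    # Invented records, used only to exercise the filter.
    records = [
        {"id": "v1", "effect": "nonsynonymous", "clinical_class": "uncertain",
         "maf_1000g": None, "maf_gnomad": 0.0002},
        {"id": "v2", "effect": "synonymous", "clinical_class": "uncertain",
         "maf_1000g": None, "maf_gnomad": None},
        {"id": "v3", "effect": "splicing", "clinical_class": "uncertain",
         "maf_1000g": 0.02, "maf_gnomad": 0.015},
    ]
    print([v["id"] for v in filter_variants(records)])
```

Treating an unknown frequency as passing mirrors the statement that variants with MAF "less than 0.001 or unknown" were retained; a stricter analysis could instead require an observed frequency.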
Table 3

Results of WES analysis. For each family and for each RCHH region, identified by HH analysis, the variant, its function, and the number of cases and controls sharing the variant are reported.

Family | RCHH Region (bp, hg18) | Chr | Start | End | Ref | Alt | Function | Gene | No. of Affected | No. of Unaffected
61 | chr6:42767957-43333769 | 6 | 43181034 | 43181034 | T | G | splicing region on exon 27 | CUL9 | 3 out of 3 | 1 out of 1
61 | chr6:42767957-43333769 | 6 | 43106964 | 43106964 | A | C | non-synonymous variant on exon 9 | PTK7 | 3 out of 3 | 0 out of 1
61 | chr6:42767957-43333769 | 6 | 43223539 | 43223539 | A | C | non-synonymous variant on exon 9 | TTBK1 | 3 out of 3 | 0 out of 1
61 | chr6:42767957-43333769 | 6 | 42976917 | 42976917 | A | C | non-synonymous variant on exon 9 | PPP2R5D | 3 out of 3 | 0 out of 1
61 | chr20:48202462-49044808 | 20 | 49366933 | 49366933 | G | C | non-synonymous variant on exon 3 | PARD6B | 3 out of 3 | 0 out of 1
61 | chr20:49044993-50323395 | 20 | 50245517 | 50245517 | A | C | splicing region on exon 16 | ATP9A | 3 out of 3 | 1 out of 1
61 | chr12:122479650-122743242 | 12 | 123943942 | 123943942 | A | C | intronic variant | SNRNP35 | 3 out of 3 | 0 out of 1
61 | chr2:221458760-223560597 | 2 | 223554057 | 223554057 | T | G | non-synonymous variant on exon 3 | MOGAT1 | 3 out of 3 | 1 out of 1
2360 | chr6:26287459-32628250 | 6 | 29856257 | 29856257 | G | C | non-coding RNA | HLA-H | 6 out of 10 | NA
2360 | chr6:26287459-32628250 | 6 | 29856261 | 29856261 | G | A | non-coding RNA | HLA-H | 6 out of 10 | NA
2360 | chr6:26287459-32628250 | 6 | 29856263 | 29856263 | G | A | non-coding RNA | HLA-H | 6 out of 10 | NA
2360 | chr13:108090996-108968251 | 13 | 108519067 | 108519070 | CTCT | - | 5′UTR | FAM155A (NLF-1) | 9 out of 10 | NA
2360 | chr13:108090996-108968251 | 13 | 109859349 | 109859354 | TGTGTT | - | 3′UTR | MYO16 | 7 out of 10 | NA
4 | chr21:42351204-42525479 | 21 | 43557751 | 43557832 |  |  | deletion on splicing site | UMODL1 | 3 out of 4 | NA
5 | chr13:109945232-110218960 | 13 | 11364877 | 11364877 | - | GGG | insertion in the upstream site | ING1 | 3 out of 5 | NA

NA = No WES data were available for unaffected subjects in the family.

Table S1 reports, for each variant, its frequency in different populations, such as those in 1000 Genomes and gnomAD, and in the Sardinian population [19,20]. Table S1 also reports the results obtained using different prediction algorithms, with the grade of pathogenicity for each variant.

3. Discussion

MS is characterized by a complex multi-factorial nature that involves the interplay of a still non-identified environmental exposure and a genetic predisposition. In recent studies, it was observed that extensive grey matter lesions in the cerebral and cerebellar cortex and hippocampus [21,22,23,24] are involved in the pathology of the disease. As with other multifactorial diseases, MS has been predominantly studied assuming the “common disease-common variant” paradigm via GWASs. The importance and success of the GWASs approach to identifying the loci underlying common disease cannot be overlooked [25]. These breakthroughs, along with both statistical and technological advances, have led to the identification and confirmed association of numerous genetic loci for MS susceptibility [10,11,12,26]. In spite of these advances, only a relatively small proportion of the genetic influences in MS have been uncovered, and much is yet to be understood. It is important to point out that there are alternative hypotheses concerning the genetic architecture of common diseases, including the multiple rare variant hypothesis, which may help to elucidate the so-called missing heritability associated with common complex diseases such as MS [27]. Our study aims at contributing to the understanding of the genetic risk component of MS and the contribution of rare variants to disease risk through a combination of HH analysis—with the aim of prioritizing candidate regions—and WES to deeply explore this candidate region in search of putative causative rare variants in the founder population of Nuoro province (Sardinia). The Sardinian population has a high prevalence of MS and autoimmune disease, and its genetic background is proven to be homogeneous and an outlier in Europe, having been under selective pressure due to malaria, which has been endemic there for centuries. 
Exons are regions of the DNA that are particularly vulnerable to mutations in their sequence: an unexpected mutation may alter the structure and the function of the protein, leading to deleterious biological consequences that may contribute to, or cause, a specific disease. Thus, identifying an exonic genetic variant in a candidate region previously highlighted by HH will help us to better understand the biological processes involved in the disease. Firstly, HH analysis performed on 13 multiplex Sardinian families made it possible to identify significant RCHHs (-log10 (p-value) 1.2). Secondly, in these regions, WES data that were available for 25 subjects belonging to five families were analyzed. Interesting results were found in the following regions.
(i) In the RCHH region on chr6:42767957-43333769, shared by 5 cases and 7 controls of family 61, we identified the variant 43181034 T > G in the splicing region on exon 27 of the CUL9 gene. CUL9 is highly expressed in the brain, particularly in the cerebral cortex [28]. A study [29] using a human cell-derived model to characterize CUL9 in human neuronal development showed that the deletion or depletion of the protein causes the aberrant formation of neural rosettes that are related to the early stage of neurodevelopment. Furthermore, the neuronal transcription factors CUX1 and SOX3 were significantly upregulated in CUL9 knockout neuroepithelial progenitor cells. Fisher et al. [30] analyzed the potential molecular pathways of tissue injury in active cortical MS lesions, and by identifying prominent changes in gene expression, they found genes that are involved in different steps of apoptosis, DNA damage, p53 function, and DNA repair, including CUL9. In the same RCHH region, 43106964 A > C, a non-synonymous variant on exon 9, and rs780764712 were also found in PTK7, a gene involved in the Wnt/planar cell polarity pathway. It is important to note that a PTK7 mutant with a truncated protein caused severe defects in neural tube closure perinatally [31]. In this study, 43223539 A > C, a non-synonymous variant on exon 9 of the TTBK1 gene, and 42976917 A > C, a non-synonymous variant on exon 9 of the PPP2R5D gene, were also found in all the available cases. TTBK1 is a brain-specific tau kinase expressed in the entorhinal cortex and hippocampal regions. TTBK1 transgenic mice showed severe axonal degeneration in the perforant path, which is essential for many forms of memory [32]. TTBK1 is highly expressed in the entorhinal cortex and the perforant path region, two specific brain regions involved in the early stage of Alzheimer’s disease pathology [33], and thus, has a critical role in axonal degeneration. Collapsin response mediator protein-2 (CRMP2) is a downstream target of TTBK1 [32], whose expression induces the accumulation of phosphorylated CRMP2, and it was shown to be involved in the axonal degeneration pathology in MS [34]. PPP2R5D is a regulatory B subunit of Protein Phosphatase 2A (PP2A) and plays a crucial role in normal neuronal development and functioning. Variants of this gene were found to be associated with intellectual disability, autism, and other neurodevelopmental disorders [35]. Mutations in this gene were found in juvenile-onset parkinsonism [36].
(ii) In the RCHH region on chr13:108090996-108968251, shared by 11 cases and 17 controls of family 2360, we identified the variant 109859349-109859354 (deletion of TGTGTT) in the 3′UTR of the MYO16 gene. This variant is also present in 1 case of family 4, in 2 cases of family 45, and in 1 case of family 5. MYO16 is mainly expressed in the central nervous system and seems to be involved in the development and functioning of the nervous system also in adulthood; therefore, alterations in this gene, e.g., SNPs, deletions, or epigenetic modifications, are associated with neurodegenerative and neuropsychiatric disorders [37,38,39]. MYO16 is thus considered an important regulator of neural cell functioning, even if its specific role and molecular mechanisms remain to be elucidated. Interestingly, the TNFSF13B gene is located not far from MYO16, in the chr13:108090996-108968251 region highlighted by HH analysis; it encodes the cytokine and drug target B-cell activating factor (BAFF), whose overexpression is related to autoimmunity [40]. In particular, in [41], a TNFSF13B variant was found to be associated with MS and systemic lupus erythematosus (SLE) through a mechanism that led to an overexpression of BAFF, which, in turn, upregulated humoral immunity.
(iii) In the RCHH region on chr20:49044993-50323395, shared by 5 cases and 7 controls of family 61, we identified the variant 50245517 A > C in the splicing region on exon 16 of the ATP9A gene. ATP9A is a regulator of endosomal recycling and plays an inhibitory role in the release of extracellular vesicles (EV) [42], and many biological processes, such as the immune response, are modulated by proteins, DNA, miRNA, and mRNAs that could be controlled via EV-instigated intercellular communication [43].
Taken together, these results provide new insights into the genetics of MS and a more thorough understanding of the disease biology; future functional studies of the specific gene variants highlighted here may provide hints towards the creation of new and more effective treatments for MS.

4. Materials and Methods

4.1. Sample Collection and Genotyping

MS patients were ascertained through the case register established in 1995 in the province of Nuoro, Sardinia, Italy. Cases were diagnosed according to Poser’s criteria [44]. Thirteen pedigrees, containing from three to sixteen MS patients each, were selected for the analysis. Genotyping data were taken from the Immunochip results of a previous study [13], where the quality control-filtered dataset included 131,497 SNPs.

4.2. HH Analysis

HH analysis, proposed by Miyazawa et al. [45], is an efficient non-parametric tool to detect regions harboring either novel or known mutations; it makes it possible to identify chromosomal segments that are shared by patients and derived from a common ancestor, and that are characterized by the distinct identity of their haplotype. The analysis is based on the concept of identity-by-descent (IBD), in which a DNA segment is defined as IBD in two or more individuals if it is a direct copy of the same ancestral allele; thus, affected subjects who inherit the mutation from a common ancestor share IBD the genetic portion around the mutation, in which there should be no discordant homozygous calls, for both dominant and recessive genes. An HH is a reduced haplotype obtained by removing all the heterozygous SNPs from the sample dataset and leaving only the homozygous ones. The method works by comparing homozygous segments; in this way, there is no need to reconstruct haplotypes. This results in a reduced computational time and a simplification of the analysis process, even for large numbers of SNPs. Specifically, the method makes it possible to compare the number of subjects in the patient pool and in the control pool who share an RCHH region. An RCHH is a region with a given genetic length, chosen on the basis of the study, containing comparable SNPs (compSNPs), i.e., SNPs that are homozygous in both subjects of a pair. The algorithm performs a pairwise comparison of individuals based on the compSNPs present in the pair. A mismatched comparable SNP, also indicated as a discordant homozygous SNP (dhSNP), has discordant homozygous genotypes in the two subjects (e.g., AA vs. BB). The borders of an RCHH are defined by either dhSNPs or by the ends of the chromosome. RCHH regions shared by multiple subjects are used to predict the presence of a Region from a Common Ancestor (RCA) or an IBD [46,47].
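The pairwise comparison described above can be illustrated with a small sketch. It is a simplified reading of the text and not the HH program itself: genotypes are encoded as two-letter strings, the genetic positions in cM are invented, and a segment's length is approximated by the distance between its first and last SNP. All function names are hypothetical.

```python
from typing import List, Tuple


def is_homozygous(genotype: str) -> bool:
    """True for calls such as 'AA' or 'BB'; heterozygous calls are ignored by HH."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def discordant_indices(a: List[str], b: List[str]) -> List[int]:
    """Indices of dhSNPs: both subjects homozygous, with different genotypes."""
    return [
        i for i, (x, y) in enumerate(zip(a, b))
        if is_homozygous(x) and is_homozygous(y) and x != y
    ]


def shared_segments(a: List[str], b: List[str], cm_pos: List[float],
                    min_cm: float) -> List[Tuple[int, int]]:
    """Runs of SNPs free of dhSNPs whose genetic length reaches the cut-off.

    Segment borders are dhSNPs or the ends of the chromosome, as in the text.
    """
    segments = []
    start = 0
    for brk in discordant_indices(a, b) + [len(a)]:
        end = brk - 1
        if end >= start and cm_pos[end] - cm_pos[start] >= min_cm:
            segments.append((start, end))
        start = brk + 1
    return segments


if __name__ == "__main__":
    # Invented genotypes and map positions for two subjects on one chromosome.
    subj1 = ["AA", "AB", "BB", "AA", "BB", "AA"]
    subj2 = ["AA", "AA", "BB", "BB", "BB", "AB"]
    positions_cm = [0.0, 2.0, 4.0, 6.0, 9.0, 15.0]
    print(discordant_indices(subj1, subj2))
    print(shared_segments(subj1, subj2, positions_cm, min_cm=3.0))
```

Because only homozygous calls are compared, no phasing step is needed, which is the source of the computational saving mentioned in the text.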
Given m and n, the numbers of generations separating two affected subjects from a common ancestor, the RCA is calculated as [equation missing in the source]. In consanguineous families or in geographically isolated populations, such as our Sardinian multiplex families, patients suffering from a disease (in our study, MS) share a common ancestor, and the RCAs represent candidate regions in which disease genes can be sought. The RCA is identified by comparing the RCHHs shared within the patient pool and within the control pool [45]. As explained in [46], the algorithm first divides a genetic autosomal region into smaller ones and selects, for each of them, the representative RCHH shared by the largest number of affected subjects within the patient pool. The numbers of individuals sharing each representative RCHH are then counted in both the patient and the control pools and compared with each other. The p-value for the null hypothesis of no difference between these two proportions is calculated according to the standard normal distribution. In the HH approach, a genotyping error may lead to the causative gene being mistakenly excluded. In fact, an RCHH will be truncated when a non-dhSNP changes into a dhSNP due to a genotyping error; the resulting genomic segment will then be smaller than the chosen cut-off length, and thus, the RCHH will not be identified. The algorithm, by means of Markov Chain Monte Carlo (MCMC) simulation, makes it possible to calculate the probability of obtaining dhSNPs due to genotyping errors. If the probability of creating dhSNPs in the region is very low (p-value < 0.001), the HH results are reliable. A critical step in the analysis is the choice of the cut-off value, since there is no optimal cut-off value.
The cut-off needs to be chosen on the basis of the population under study, taking into account that the average genetic length of the RCAs decreases over generations; therefore, when distantly related subjects who share smaller RCAs are analysed, a small cut-off (e.g., 3 cM) is preferable. Another aspect that guides the choice of the cut-off is the array used for genotyping (e.g., low-density vs. high-density). As we only had Immunochip data [10], which were obtained with the Illumina Infinium HD custom array designed for the fine mapping of 184 established autoimmune loci, we chose a conservative cut-off value of 7 cM to search for candidate RCHHs that represent our prioritized regions. This step was taken to reduce the risk of false positives and increase the specificity of the results. In summary, the HH approach has numerous advantages: (a) there is no need to reconstruct haplotypes, since the homozygosity haplotype for each chromosome is uniquely determined; (b) the chromosomal segments in which all polymorphic markers are homozygous are considered to be autozygous segments [48]; (c) if the coefficient of consanguinity for a patient is large as a result of belonging to an inbred family, and the disease is rare, then the probability that the disease-causing gene is located in the shared segment is very high; (d) since HH analysis looks for ancestral segments, both dominant and recessive genes can be detected; (e) the HH approach is robust to genotyping errors. In our analysis, the dataset was prepared using R software [49], and the HH analysis was run using the HH program (http://www.hhanalysis.com (accessed on 15 February 2021)).
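As an illustration of the comparison between the patient and control pools described above, the following Python sketch computes a p-value for the difference between two sharing proportions from the standard normal distribution. The pooled two-proportion z-test and the two-sided p-value are assumptions; the exact statistic implemented in the HH program is not specified here, and the counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_p_value(k_pat: int, n_pat: int,
                           k_ctl: int, n_ctl: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test (assumed form of the comparison).

    k_pat / n_pat: patients sharing the representative RCHH / all patients.
    k_ctl / n_ctl: controls sharing the representative RCHH / all controls.
    Returns (z, two-sided p-value under the standard normal distribution).
    """
    p_pat = k_pat / n_pat
    p_ctl = k_ctl / n_ctl
    pooled = (k_pat + k_ctl) / (n_pat + n_ctl)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_pat + 1.0 / n_ctl))
    if se == 0.0:
        # Nobody or everybody shares the segment: no evidence of a difference.
        return 0.0, 1.0
    z = (p_pat - p_ctl) / se
    # Two-sided tail probability of N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 8 of 10 patients vs. 1 of 10 controls share the RCHH.
z_stat, p_val = two_proportion_p_value(8, 10, 1, 10)
```

With pools this small the normal approximation is rough, so the sketch is meant only to show the arithmetic of the comparison.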

4.3. Screening of Known Causative Genes

We then scrutinized the segments highlighted in the HH analysis with the help of publicly available tools such as Phenolyzer (http://phenolyzer.wglab.org (accessed on 30 March 2021)) and WebGestalt (http://www.webgestalt.org (accessed on 30 March 2021)), in order to search for genes located within the specified regions whose function could plausibly be related to MS.

4.4. Whole-Exome Sequencing Data Generation

All samples were sequenced at the Center for Genome Technology within the University of Miami John P. Hussman Institute for Human Genomics. Library preparation was conducted using the SureSelectXT Human All Exon V4 + UTR kit (Agilent Technologies Inc., Santa Clara, CA, USA). This protocol targets 99% of coding regions in addition to 5′- and 3′-untranslated region sequences. Pre-enrichment libraries were constructed using the SureSelect Low Input reagent kit, and exome enrichment of the DNA library was performed via a hybridization reaction with biotinylated baits from the SureSelect Human All Exon V4 + UTR Enrichment Kit. The prepared DNA libraries were sequenced on the Illumina HiSeq2000 instrument (Illumina Inc., San Diego, CA, USA) with 2 × 100 bp paired-end reads at an average coverage of 80×. Quality controls were applied at the lane and fastq levels. Specifically, a lane was considered successful if Pass Filter was > 90%, with over 250 M reads for the high-output mode. The fraction of reads in each lane assigned to each sample (no set value) and the fraction of bases with a quality score > Q30 for read 1 and read 2 (above 80% expected for each) were also checked. Raw sequencing reads were demultiplexed using Illumina bcl2fastq. In addition, the FastQC tool kit (www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 8 November 2020)) was used to review the base quality distribution, the representation of the four nucleotides, and particular k-mer sequences (adaptor contamination). We used the Genome Analysis Toolkit (GATK) (version 4.1) best-practice pipeline to analyze our WES data. Reads were aligned to the human reference genome (hg19) using the Maximal Exact Matches algorithm in the Burrows–Wheeler Aligner (BWA) [50]. PCR duplicates were removed using the Picard tool (picard.sourceforge.net/). The GATK base quality score recalibrator was applied to correct for sequencing artifacts. 
Variants were called using the GATK HaplotypeCaller algorithm, visually inspected using the Integrative Genomics Viewer (IGV, Broad Institute), and further annotated with ANNOVAR. Variants were categorized as follows: (1) non-synonymous; (2) synonymous; (3) frameshift deletion or insertion; (4) splicing; (5) stop gain or loss; or (6) functional intronic or promoter variants.
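To make concrete how annotated variants can be screened within a candidate RCA, the following Python sketch keeps variants that fall inside a given interval, belong to a chosen set of functional categories, are rare, and are carried by all affected family members. The record layout, category labels, the 1% frequency threshold, and all coordinates and sample names are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical category labels for the potentially functional classes.
FUNCTIONAL = {"nonsynonymous", "frameshift", "splicing", "stopgain", "stoploss"}


def shared_rare_variants(variants: List[Dict],
                         rca: Tuple[str, int, int],
                         affected: Set[str],
                         max_maf: float = 0.01,       # assumed rarity threshold
                         categories: Set[str] = FUNCTIONAL) -> List[Dict]:
    """Return variants inside the RCA that are rare, functional,
    and present in every affected subject."""
    chrom, start, end = rca
    kept = []
    for v in variants:
        in_region = v["chrom"] == chrom and start <= v["pos"] <= end
        is_rare = v["maf"] < max_maf
        is_functional = v["category"] in categories
        shared_by_all = affected <= v["carriers"]     # subset test on sample IDs
        if in_region and is_rare and is_functional and shared_by_all:
            kept.append(v)
    return kept


# Toy records (all values invented for illustration).
toy_variants = [
    {"chrom": "1", "pos": 1500, "category": "splicing", "maf": 0.001,
     "carriers": {"P1", "P2", "P3"}},
    {"chrom": "1", "pos": 1800, "category": "synonymous", "maf": 0.002,
     "carriers": {"P1", "P2", "P3"}},
    {"chrom": "1", "pos": 2100, "category": "nonsynonymous", "maf": 0.20,
     "carriers": {"P1", "P2", "P3"}},
    {"chrom": "1", "pos": 2500, "category": "nonsynonymous", "maf": 0.003,
     "carriers": {"P1", "P3"}},
    {"chrom": "2", "pos": 1600, "category": "stopgain", "maf": 0.001,
     "carriers": {"P1", "P2", "P3"}},
]
candidates = shared_rare_variants(toy_variants, ("1", 1000, 3000), {"P1", "P2", "P3"})
```

Only the first toy record passes all four filters. The category set is a parameter so that other annotation classes, such as UTR variants, can be included when they are of interest.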

5. Conclusions

Our study, which was performed on multiplex Sardinian families and combined the HH approach with WES data analysis, enabled first the identification of disease-linked regions and then the identification of specific rare variants located in these regions. Our study was limited to Immunochip data, which do not cover the entire genome uniformly and allow detailed scrutiny only within specific autoimmune candidate regions. Nevertheless, the results obtained represent an important step in the comprehension of the genetics of MS.