Long-read single molecule real-time (SMRT) sequencing of GBA1 locus in Gaucher disease national cohort from Argentina reveals high frequency of complex allele underlying severe skeletal phenotypes: Collaborative study from the Argentine Group for Diagnosis and Treatment of Gaucher Disease.

Guillermo I Drelichman1, Nicolas Fernández Escobar1, Barbara C Soberon1, Nora F Basack1, Joaquin Frabasil2, Andrea B Schenone2, Gabriel Aguilar3, Maria S Larroudé3, James R Knight4, Dejian Zhao4, Jiapeng Ruan5, Pramod K Mistry5.   

Abstract

Gaucher disease is known for extreme phenotypic diversity that does not show consistent genotype/phenotype correlations. In Argentina, a national collaborative group, Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher (GADTEG), has delineated uniformly severe type 1 Gaucher disease manifestations presenting in childhood with a large burden of irreversible skeletal disease. Here, using long-read single molecule real-time (SMRT) sequencing of the GBA1 locus, we show that the RecNciI allele is highly prevalent and is associated with severe skeletal manifestations in childhood.
© 2021 Published by Elsevier Inc.

Keywords:  BD, bone disease; Bone disease; ERT, Enzyme replacement therapy; GADTEG, The Argentine Group for Diagnosis and Treatment of Gaucher Disease (Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher); GD, Gaucher disease; GL1, Glucosylceramide; Gaucher disease; Genotype phenotype correlation; Mutation analysis

Year:  2021        PMID: 34820281      PMCID: PMC8600149          DOI: 10.1016/j.ymgmr.2021.100820

Source DB:  PubMed          Journal:  Mol Genet Metab Rep        ISSN: 2214-4269


Introduction

Gaucher disease (GD) is a prototype lysosomal storage disease due to bi-allelic mutations in GBA1, which encodes lysosomal acid β-glucosidase (glucocerebrosidase, EC 3.2.1.45) [1]. Deficiency of acid β-glucosidase leads to a progressive accumulation of glucosylceramide (GlcCer) and glucosylsphingosine (GlcSph) in the lysosomes of myeloid cells, most prominently in macrophages [2,3]. Three broad phenotype categories have been classified based on the absence (type 1, GD1: non-neuronopathic, OMIM # 230800) or the presence and severity of early-onset neurodegenerative symptoms (type 2, GD2: acute neuronopathic form, OMIM # 230900; type 3, GD3: chronic neuronopathic forms, OMIM # 231000) [3,4]. In GD1, some patients develop neurodegeneration as adults, manifesting as Parkinson's disease and Lewy Body Dementia [5]. GBA1, located on chromosome 1q21 of GRCh37/hg19 (1q22 in the latest version, GRCh38/hg38), comprises 11 exons and 10 introns spanning 7.6 kb of DNA. Notably, it is located in a highly gene-dense region that harbors seven genes and two pseudogenes within only 85 kb of DNA [6,7] (Fig. 1). There is a highly homologous pseudogene (GBAP1) 16 kb downstream of GBA1, with an exon and intron organization spanning 5.7 kb, similar to GBA1. In fact, the assigned exons of GBAP1 share up to 98% sequence homology with the coding region of GBA1 [6]. Notably, GBAP1 harbors many mutations which, if present in GBA1, cause Gaucher disease. The genomic organization of the GBA locus results in a propensity for gene conversion events, which underlie numerous disease mutations and complex alleles involving GBA1 and GBAP1. These attributes of GBA present significant challenges for the accurate and comprehensive genotyping of patients in the clinic and in large cohort studies.
Fig. 1

Location of GBA1 on Chromosome 1q21 with flanking genes, and the LR-PCR amplicons for SMRT sequencing. The 133kb human GBA1 loci (GRCh38.p13, Chr1q22; NC_000001.11) consist of 15 genes: PKLR (pyruvate kinase L/R, Chr1:155,289,293..155,301,438. Length:12,146nt); HCN3 (hyperpolarization activated cyclic nucleotide gated potassium channel 3, complement Chr1:155,277,427..155,289,848. Length:12,422nt), CLK2 (CDC like kinase 2, Chr1:155,262,868..155,273,504. Length:10,637nt), SCAMP3 (secretory carrier membrane protein 3, Chr1:155,255,981..155,262,360. Length:6,380nt), FAM189B (family with sequence similarity 189 member B, Chr1:155,247,205..155,255,892. Length:8,688nt), GBA1 (glucosylceramidase beta, Chr1:155,234,452..155,244,627. Length:10,176nt), MTX1P1 (metaxin 1 pseudogene 1, complement Chr1:155,230,976..155,234,451. Length:3,476nt), GBAP1 (glucosylceramidase beta pseudogene 1, Chr1:155,213,825..155,227,534. Length:13,710nt), MTX1 (metaxin 1, complement Chr1:155,208,699..155,213,839. Length:5,141nt), THBS3 (thrombospondin 3, Chr1:155,195,588..155,209,180. Length:13,593nt), LOC (THBS3-AS1/LOC105371450, complement Chr 155,196,035..155,200,571. Length:4,537nt), MIR92B (microRNA 92b, complement Chr1:155,195,177..155,195,272. Length:96nt), TRIM46 (tripartite motif containing 46, complement Chr1:155,173,381..155,184,971. Length:11,591nt), MUC1 (mucin 1, cell surface associated, Chr1:155,185,824..155,192,915. Length:7,092nt), KRTCAP2 (keratinocyte associated protein 2, Chr1:155,169,408..155,173,304. Length:3,897nt). GBA1 pseudogene (GBAP1) is approximately 12 kb downstream of GBA1 gene. Red bar, length and location of the six long-range (LR) SMRT amplicons used in this study (primers in Table 1). Lower panel, purified LR-PCR amplicons on 0.75% agarose gel.
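The lengths quoted in the caption follow from the listed coordinates if each interval is read as 1-based and inclusive. The short Python sketch below checks this for four of the genes; the dictionary and function names are illustrative, and the coordinates are copied from the caption.

```python
# Inclusive, 1-based chr1 coordinates (GRCh38) copied from the Fig. 1 caption.
GENES = {
    "PKLR": (155_289_293, 155_301_438),
    "GBA1": (155_234_452, 155_244_627),
    "GBAP1": (155_213_825, 155_227_534),
    "MTX1": (155_208_699, 155_213_839),
}


def interval_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based interval."""
    return end - start + 1


lengths = {name: interval_length(*coords) for name, coords in GENES.items()}
for name, nt in lengths.items():
    print(f"{name}: {nt:,} nt")
```

The same helper can be applied to the remaining genes in the caption.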

Table 1

LR-PCR primers used to amplify the 1q21 GBA1 region for PacBio SMRT.

Primer ID | Primer Sequence (Universal tag + gene specific sequence) | Amplicon Size (bp)
F5704/F8155 | GGTAGGCGCTCTGTGTGCAGCtCGGGGTTGGGATTCGCACT |
R5704 | CCATCTCATATGTAGTACTCTtGATGTCCAGGGGCTGGCAA | 5704
R8155 | CCATCTCATATGTAGTACTCTtGATGTCCAGGGGCTGGCAA | 8155
F6242 | GGTAGGCGCTCTGTGTGCAGCgGCCACACCATGGACAGCTT |
R6242 | CCATCTCATATGTAGTACTCTtTGGGTCCTCCTTCGGGGTT | 6242
F5900 | GGTAGGCGCTCTGTGTGCAGCaGCAGATGTGTCCATTCTCCATGT |
R5900 | CCATCTCATATGTAGTACTCTtTGTCTCCATCCAGCGGGCA | 5900
F6746 | GGTAGGCGCTCTGTGTGCAGCgGTCCACTTTCTTGGCCGGA |
R6746 | CCATCTCATATGTAGTACTCTaACCTATTGCTATGAAAAGGAGCAG | 6746
F8077 | GGTAGGCGCTCTGTGTGCAGCgGACCGACTGGAACCTTGCC |
R8077 | CCATCTCATATGTAGTACTCTgCCAGCACACCCTTAGTGGG | 8077

Each primer has a 5-base padding sequence at the 5’-end (underlined), followed by the barcode sequence and the GBA1 gene-specific sequence (starting from the lower-case nucleotide).
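The primer layout described in this footnote can be unpacked programmatically. The following Python sketch assumes only what the footnote states: the first five bases are padding, the gene-specific part begins at the first lower-case nucleotide, and everything in between is the barcode. The function name `split_primer` is hypothetical.

```python
def split_primer(primer: str) -> tuple[str, str, str]:
    """Split a tagged primer into (padding, barcode, gene_specific).

    Assumes a 5-base padding at the 5' end and a lower-case base marking
    the start of the gene-specific sequence, as stated in the footnote.
    """
    for i, base in enumerate(primer):
        if base.islower():
            tag = primer[:i]
            # The gene-specific part is upper-cased for downstream use.
            return tag[:5], tag[5:], primer[i:].upper()
    raise ValueError("no lower-case base marks the gene-specific start")


padding, barcode, specific = split_primer(
    "GGTAGGCGCTCTGTGTGCAGCtCGGGGTTGGGATTCGCACT"  # F5704/F8155 from Table 1
)
print(padding, barcode, specific)
```

Splitting on letter case keeps the helper independent of the barcode length, which the footnote does not state.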

Currently, more than 600 GBA1 pathogenetic variants have been catalogued (HGMD professional 2020.4; CentoLSD™, https://www.centogene.com/centolsd.html). The most prevalent GBA1 variants are N370S (c.1226A>G; p.Asn409Ser), a founder mutation from Eastern Europe, and L444P (c.1448T>C; p.Leu483Pro), a variant in GBAP1 transferred to GBA1 by gene conversion events that occur recurrently in all populations of the world [7,8]. Delineation of the genotype/phenotype correlations has been the focus of many studies. The N370S mutation is predictive of GD type 1 and is therefore neuroprotective against childhood-onset neurodegenerative disease (GD2 or GD3). However, it does not protect from the late-onset neurodegenerative diseases, Parkinson's disease and Lewy Body Dementia, seen in some patients with GD1. It accounts for more than 70% of pathogenetic variants in Ashkenazi Jewish patients with GD type 1, and ~30% of pathogenetic variants in European non-Jewish GD patients [9]. In contrast, homozygosity for the L444P mutation is strongly associated with GD3, and, when present in the context of a complex allele, it can be associated with the most severe forms of neuronopathic GD [8]. A wide spectrum of GBA1 mutations has been reported, including missense/nonsense, indels, gross deletions/insertions, duplications, alternative splicing, promoter elements, regulatory RNAs, and complex recombinant alleles. GD complex alleles arise from the high homology and the physical proximity between GBA1 and GBAP1, which enable reciprocal as well as nonreciprocal homologous recombination events [10]. To date, more than 20 GBAP1-derived recombinant alleles have been reported, with variable recombination sites spanning intron 2 to exon 11 of GBA1. The most frequently encountered complex alleles are RecNciI and RecDelta55 (c.1263-1317Del55) [11].
Recombination events underlying the RecNciI mutation occur in the region from intron 9 to exon 10 and involve the incorporation of a GBAP1 segment that harbors three variants: L444P, A456P, and the silent change V460V [12]. Therefore, targeted NGS or Sanger sequencing analysis of only L444P (as occurs in many diagnostic panels not using whole gene sequencing) can miss a complex allele, hindering optimal genetic care and confounding genotype/phenotype studies. Taken together, the complexity of the GBA1 locus challenges and confounds precise genotype assignment, which can hamper accurate assessment of individual patients and large cohort studies. A major cause of morbidity and disability in GD1 is complex skeletal disease, manifesting as chronic unrelenting bone pain, avascular osteonecrosis, complex lytic bone lesions and fragility fractures. In Europe and the US, bone disease occurs in GD patients with a frequency of 50-60% [13,14]. In contrast, in the Argentinian GD population, bone involvement is more frequent at diagnosis (71%) and remains predominant after long-term follow-up (69.8%), despite enzyme replacement therapy (ERT) [15]. To understand what types of mutation are associated with bone disease in GD, comprehensive GBA1 analysis is necessary. Previous studies have examined prevalent individual pathogenetic variants and have in general shown that “N370S/other allele variant(s)” is associated with more severe skeletal disease [16]. Hitherto, ascertainment of GBA1 mutations in such studies has relied mostly on screening for common mutations and not on full gene sequencing. Therefore, the nature of the genotype/phenotype correlation with respect to bone disease in GD is not fully understood. The Argentine Group for Diagnosis and Treatment of Gaucher Disease (Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher, GADTEG) was created in 2006.
Our collaborative group is formed by ~70 physicians throughout Argentina to monitor the phenotypic spectrum, natural history, and treatment outcomes in 300 patients across the country. We showed a strikingly high prevalence of skeletal disease in our GD1 population [15]. Until 2017, only 30% of our patients underwent limited genotyping for common pathogenetic variants. Therefore, our cohort of GD patients offers a unique opportunity to understand the genetic contribution of GBA1 variants to the high burden of skeletal disease. Such studies are an essential prelude to unravelling putative modifier genes. Moreover, knowledge of the comprehensive genotype of our patients promises to advance precision medicine for optimal management of Gaucher disease.

Materials and methods

Patients

A total of 192 patients provided informed consent to participate in the study. All patients had enzymatically confirmed diagnoses of Gaucher disease in peripheral blood leucocytes. Patients have been longitudinally followed and comprehensively evaluated for hematological, visceral and bone disease indicators as described previously [14]. Response to ERT (enzyme replacement therapy) has also been carefully documented. DNA samples were processed at Laboratorio “Dr. N. A. Chamoles” in Argentina and sequenced at the Yale Center for Genome Analysis (YCGA). PacBio long-read Single Molecule Real-Time (SMRT) GBA1 deep sequencing was developed using GBA1-specific primers depicted in Fig. 1, covering the GBA1 gene from the 5'-UTR to the 3'-UTR and avoiding amplification of GBAP1. To fully genotype GBA1, a total of six specific LR-PCR amplicons (5.7 to 8.15 kb, spanning 19.4 kb) were designed as shown in Fig. 1. Primers (Table 1) were optimized to yield amplicons in similar amounts to enhance sequencing efficiency and loading capacity for SMRT sequencing. LR-PCR fragments were amplified from 100 ng of genomic DNA using 1× PrimeSTAR GXL polymerase (R050B, Takara Bio USA Inc.) in a 25 μl PCR reaction volume with 200 nM of barcode-tagged primers. Initial denaturation was performed for 8 min at 98 °C, followed by 30 cycles of 10 s at 98 °C, 15 s at 60 °C, and 10 min at 68 °C. Final extension was 10 min at 68 °C. In some cases, 5 μl of 5 M betaine solution (B0300, Sigma-Aldrich USA) was added to the 25 μl LR-PCR reaction to increase PCR efficiency. Pooled PCR products were size selected, purified, and visually inspected on agarose gel. Equal amounts of pooled amplicons were used to generate SMRT libraries and sequenced on SMRT cells in a PacBio RS II system [17] according to the manufacturer's instructions. Briefly, DNA damage in the pooled amplicons was first repaired, followed by end-repair and A-tailing.
Next, PacBio sequencing adaptors were ligated to the pooled amplicons and purified using AMPure PB beads. The final PacBio libraries were then annealed to the sequencing primer, bound to the polymerase, and loaded on the PacBio RS II for sequencing. SMRT phasing was then performed to resolve individual alleles. The genotype was validated in the original patient DNA sample by Sanger sequencing. For Sanger sequencing, four Sanger LR-PCR amplicons were designed (Table 2) covering the whole GBA1 (Table 1). The Sanger LR-PCR enzyme and conditions were the same as for PacBio LR-PCR, except that the extension time was decreased to 3 min. For each patient, four gel-purified Sanger LR-PCR fragments were prepared and stored individually at -20 °C. Each variant detected by SMRT was verified at least two times from both the 5’- and 3’-directions by LR-PCR based Sanger sequencing.
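As a rough planning aid, the programmed hold times of the two LR-PCR protocols can be added up. The Python sketch below uses only the step durations given above. It ignores ramp times between temperatures, and it assumes that the Sanger protocol differs only in the per-cycle extension (3 min instead of 10 min), which is an interpretation of the text and not a stated detail. The function name is illustrative.

```python
def cycling_seconds(initial_s: int, cycles: int, steps_s: tuple, final_s: int) -> int:
    """Total programmed hold time in seconds (ramp times not included)."""
    return initial_s + cycles * sum(steps_s) + final_s


# PacBio LR-PCR: 8 min at 98 C; 30 x (10 s, 15 s, 10 min); 10 min final extension.
pacbio_s = cycling_seconds(8 * 60, 30, (10, 15, 10 * 60), 10 * 60)

# Sanger LR-PCR: assumed identical except for a 3 min per-cycle extension.
sanger_s = cycling_seconds(8 * 60, 30, (10, 15, 3 * 60), 10 * 60)

print(f"PacBio LR-PCR: {pacbio_s / 60:.1f} min")
print(f"Sanger LR-PCR: {sanger_s / 60:.1f} min")
```

Under these assumptions the long extension step dominates the run time of the PacBio protocol.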
Table 2

LR-PCR primers used to amplify GBA1 gene for Sanger sequencing.

Primer ID | Primer Sequence | Amplicon Size (bp)
NA2568F | CCATCCTCTGGGATTTAGGAGC |
NA2568R | GAAGTCAGGGTCCAAAGAAAGGG | 2568
NB2664F | TGCATCCCTAAAAGCTTCGGCTA |
NB2664R | GGTGAGTACTGTTGGCGAGGG | 2664
NC2470F | CTCAAGACCAATGGAGCGGT |
NC2470R | TCGACAAAGTTACGCACCCA | 2470
C1600F | CTTCCTGCAAAGCAGACCTCA |
C1600R | TTGGGCCCAGCTTTCCTAGTC | 1600
Genotyping results were evaluated for correlation with skeletal phenotype, and our genotype distribution was compared with that reported in the Gaucher International Registry data (International Collaborative Gaucher Group, ICGG, https://clinicaltrials.gov/NCT00358943).

Statistical methods

Qualitative variables are expressed as frequency and percentage, while quantitative variables are expressed as mean, median, minimum, and maximum. For univariate analysis, contingency tables were evaluated using the chi-square test or Fisher's exact test, as appropriate. Yates' continuity correction was applied to 2 × 2 contingency tables. Logistic regression models were used for multivariate analysis. The significance level (alpha) was set at 0.05. The multivariate analysis included 4 variables: RecNciI allele, RecNciI/N370S genotype, RecNciI/other genotype and Argentine ancestors. Two models were fitted, each with a different dependent variable. Model I: less severe bone manifestations during follow-up, i.e., bone marrow infiltration and Erlenmeyer flask deformity (EFD). Model II: severe bone manifestations during follow-up, i.e., acute and/or chronic avascular necrosis and bone marrow infarcts. We found that the RecNciI/N370S genotype and heterozygous carriage of a RecNciI allele had a statistically significant association with the outcome in model I (p = 0.017) and in model II (p = 0.004). Argentine ancestors were not a significant variable (p = 0.059) in either of the two models.
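For readers unfamiliar with the univariate tests named above, the following Python sketch shows how a Yates-corrected chi-square test and a two-sided Fisher's exact test can be computed for a 2 × 2 table using only the standard library. It is not the authors' analysis code, and the counts in the example call are hypothetical, chosen only to exercise the functions. The helpers assume that no row or column total is zero.

```python
from math import comb, erfc, sqrt


def yates_chi2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Yates-corrected chi-square statistic and p-value (1 d.f.) for [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher p-value: total probability of tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Hypothetical 2 x 2 counts (not study data): rows = genotype group, columns = lesion yes/no.
chi2, p_chi2 = yates_chi2(3, 1, 1, 3)
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
print(f"chi2 = {chi2:.3f}, p = {p_chi2:.4f}; Fisher p = {p_fisher:.4f}")
```

With counts this small, Fisher's exact test would normally be preferred over the chi-square approximation, which is the kind of case the phrase "as appropriate" usually covers.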

Results

Of a total of 192 samples, 146 (76%) were successfully genotyped by SMRT sequencing. In 46 samples, full GBA1 genotype could not be ascertained in the first round of PacBio SMRT analysis. As the key objective of our study was delineating the comprehensive GBA1 genotype of our cohort, we removed these 46 samples from further analysis. Separately, we are conducting studies to optimize our sequencing strategies to overcome this limitation. All genotypes assigned by SMRT were confirmed by Sanger sequencing in the original gDNA sample.

Pathogenetic variants

The most frequent Gaucher disease mutation in the Argentine cohort is N370S, with 126 patients (86.3%) harboring at least one N370S allele. Notably, this frequency is similar to that reported in the ICGG (June 2014) [13] for Ashkenazi Jewish and non-Jewish European populations (Table 3).
Table 3

N370S allele frequency in Argentine compared with other regions worldwide (ICGG 2014) [13].

Region | Argentina | Europe (1) | Japac (2) | Latin America (3) (without Argentina) | Middle East (5) | North America (4) | Total (without Argentina)
Total genotyping | 146 | 971 | 135 | 497 | 697 | 1854 | 4156
1 N370S allele (Source: ICGG 2014) | 86.3% | 74.8% | 4.4% | 79% | 84.6% | 83.8% | 79.2%

1 Albania, Austria, Balearic Islands, Belgium, Bulgaria, Czech Republic, Denmark, England, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Russia, Serbia, Slovenia, Spain, Switzerland, Turkey, and Ukraine. 2 China, Hong Kong, India, Japan, Korea, Malaysia, Philippines, Taiwan, and Thailand. 3 Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, Guatemala, Mexico (n = 13; 1.4%), Panama, Paraguay, Peru, Suriname, Uruguay, and Venezuela. 4 Canada and United States. 5 Egypt, Israel, Jordan, Kuwait, Oman, Saudi Arabia, and United Arab Emirates.

The second most frequent allele was RecNciI, with 77 patients (52.7%) harboring this complex mutation (Table 4). Notably, this allelic frequency for RecNciI is the highest reported in the literature, including a previous small study from Argentina [18,19,20,21,22,23].
Table 4

Comparison of RecNcil allele frequency in Argentine and other regions.

Country | Argentine [15] | Egypt [16] | Spain [17] | Argentine [18] | India [19] | Brazil [20]
Total genotyping | 146 | 26 | 193 | 31 | 22 | 58
1 RecNciI allele | 52.7% | 13.4% | 0.7% | 21% | 7% | 15.5%

Genotypes

We retained the older mutation nomenclature to facilitate comparison with past studies, i.e., N370S for p.Asn409Ser and L444P for p.Leu483Pro, respectively. The most frequent genotype in our population was N370S/RecNciI. This compound heterozygous genotype occurred in 68 patients (46.6%). This is the highest frequency of the compound heterozygous N370S/RecNciI genotype reported in the literature. In our study of the Argentine GD population, it accounted for 46.6% of genotypes compared to only 1.9% in the International Gaucher Registry (p = 0.001) (Table 5). Interestingly, 14 patients (9.6%) were homozygous for the N370S mutation. An equally prevalent genotype was N370S/L444P, with 14 patients (9.6%) harboring this compound heterozygous genotype. This is similar to other regions worldwide. In 14 patients (9.6%), we found multiple rare or novel mutations (Table 6); in 10 patients, the rare mutation was in compound heterozygous state with N370S and in only one patient with L444P. One patient was homozygous for the F411I mutation, and 6 patients were heterozygous for F411I.
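The two nomenclatures used in this article differ by a constant offset: the pairs given in the text (370 to 409, 444 to 483) are both 39 residues apart. The Python sketch below converts a legacy missense name to the newer protein-level form under that assumption. The function name is hypothetical, and the helper handles simple single-residue substitutions only, not nonsense, frameshift, or complex alleles.

```python
import re

# Offset between legacy and current numbering, inferred from the pairs in the
# text (N370S = p.Asn409Ser, L444P = p.Leu483Pro).
LEGACY_OFFSET = 39

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def legacy_to_protein_name(variant: str) -> str:
    """Convert a legacy missense name such as 'N370S' to 'p.Asn409Ser'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"not a simple missense name: {variant!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_LETTER or alt not in THREE_LETTER:
        raise ValueError(f"unsupported residue code in {variant!r}")
    return f"p.{THREE_LETTER[ref]}{int(pos) + LEGACY_OFFSET}{THREE_LETTER[alt]}"


converted = [legacy_to_protein_name(v) for v in ("N370S", "L444P")]
print(converted)
```

Rejecting anything that is not a plain substitution keeps the helper from silently mis-numbering names such as R163X or the recombinant alleles.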
Table 5

Genotype frequency in Argentine compared with other regions of the world (ICGG 2019)12.

Region | Argentina | Europe (1) | Japac (2) | Latin America (3) (without Argentina) | Middle East (5) | North America (4) | Oceania (6) | Total (without Argentina)
Total genotyping | 146 | 1345 | 139 | 284 | 921 | 2146 | 42 | 4877
N370S/RecNciI (Source: ICGG 2019) | 46.6% | 4.8% | 0% | 2.8% | 0.5% | 0.7% | 4.8% | 1.9%
N370S/L444P (Source: ICGG 2019) | 9.6% | 17.8% | 2.9% | 26.8% | 4.7% | 13.4% | 19% | 13.8%

1 Albania, Austria, Balearic Islands, Belgium, Bulgaria, Czech Republic, Denmark, England, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Russia, Serbia, Slovenia, Spain, Switzerland, Turkey, and Ukraine. 2 China, Hong Kong, India, Japan, Korea, Malaysia, Philippines, Taiwan, and Thailand. 3 Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, Guatemala, Mexico (n = 13; 1.4%), Panama, Paraguay, Peru, Suriname, Uruguay, and Venezuela. 4 Canada and United States. 5 Egypt, Israel, Jordan, Kuwait, Oman, Saudi Arabia, and United Arab Emirates. 6 Australia and New Zealand.

Table 6

Summary of GBA1 genotypes in Argentine GD national cohort.

Genotype (n = 146) | Frequency
RecNciI/N370S | 68 (46.6%)
N370S/N370S | 14 (9.6%)
New (see Table 10) | 14 (9.6%)
RecNciI/F411I | 9 (6.2%)
L444P/N370S | 9 (6.2%)
F411I/F411I | 6 (4.1%)
F411I/N370S | 4 (2.7%)
R285C/N370S | 2 (1.4%)
L444P/R496H | 2 (1.4%)
H255Q-D409H/F411I | 2 (1.4%)
F411I/R48W | 2 (1.4%)
G202R/N370S | 2 (1.4%)
G195W/N370S | 2 (1.4%)
H255Q-D409H/N370S | 1 (0.7%)
R120Q/N370S | 1 (0.7%)
V394L/N370S | 1 (0.7%)
R163X/R463C | 1 (0.7%)
I161N/N370S | 1 (0.7%)
L371V/L444P | 1 (0.7%)
I260T/N370S | 1 (0.7%)
F411I/E233X | 1 (0.7%)
F397S/N370S | 1 (0.7%)
F411I/Y135C | 1 (0.7%)
Total | 146
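The percentages in Table 6 and the RecNciI carrier count quoted in the text can be re-derived from the genotype counts. The Python sketch below transcribes the counts from the table; the variable and function names are illustrative. Carrier status is determined by splitting each genotype label on "/", and the 14 patients grouped under "New" are treated as non-carriers because none of the genotypes listed in Table 10 includes RecNciI.

```python
# Genotype counts transcribed from Table 6.
GENOTYPE_COUNTS = {
    "RecNciI/N370S": 68, "N370S/N370S": 14, "New (see Table 10)": 14,
    "RecNciI/F411I": 9, "L444P/N370S": 9, "F411I/F411I": 6,
    "F411I/N370S": 4, "R285C/N370S": 2, "L444P/R496H": 2,
    "H255Q-D409H/F411I": 2, "F411I/R48W": 2, "G202R/N370S": 2,
    "G195W/N370S": 2, "H255Q-D409H/N370S": 1, "R120Q/N370S": 1,
    "V394L/N370S": 1, "R163X/R463C": 1, "I161N/N370S": 1,
    "L371V/L444P": 1, "I260T/N370S": 1, "F411I/E233X": 1,
    "F397S/N370S": 1, "F411I/Y135C": 1,
}

total = sum(GENOTYPE_COUNTS.values())


def pct(count: int) -> float:
    """Percentage of the genotyped cohort, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Patients carrying at least one RecNciI allele.
recnci_carriers = sum(
    n for genotype, n in GENOTYPE_COUNTS.items() if "RecNciI" in genotype.split("/")
)

print(f"total genotyped: {total}")
print(f"RecNciI carriers: {recnci_carriers} ({pct(recnci_carriers)}%)")
print(f"RecNciI/N370S: {pct(GENOTYPE_COUNTS['RecNciI/N370S'])}%")
```

The tally reproduces the table total and the RecNciI carrier figure reported in the Results.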

Clinical characteristics of N370S/RecNciI genotype patients at diagnosis

The high prevalence of the N370S/RecNciI genotype in our cohort offered an unprecedented opportunity to understand the phenotypic spectrum associated with this combination of mutations. First, all patients had type 1 GD, underscoring the notion that the presence of at least one N370S mutation predicts GD1 and that it is neuroprotective against neuronopathic GD [24]. Patients with the N370S/RecNciI genotype had severe disease, indicated by onset of symptoms at a mean age of 8.9 years; however, they were not diagnosed with Gaucher disease until a mean age of 21.4 years and started ERT at a mean age of 28.1 years. These results reveal a wide gap between onset of symptoms in the pediatric age group and diagnosis in adulthood, and moreover a significant delay from diagnosis to initiation of ERT, in a population that is severely affected. Notably, patients with this genotype had severe visceral and bone disease at diagnosis as well as during follow-up. The numbers of patients with other genotypes were too small to conduct meaningful comparisons. Bone involvement is the most disabling and incapacitating complication of GD1. The major goal of treatment for Gaucher disease is to improve bone health by preventing irreversible complications such as avascular necrosis and fragility fractures, as well as to ameliorate chronic bone pain. In our cohort of N370S/RecNciI Gaucher disease patients, we found a higher prevalence of skeletal disease at diagnosis (71%) as well as during follow-up on ERT (69.8%), compared to the ICGG Registry, which shows bone disease in 59.8% of patients. Moreover, there was a major diagnostic delay, with onset of symptoms in childhood and diagnosis in adulthood. These findings underscore the large burden of irreversible bone complications, such as avascular osteonecrosis, at initiation of treatment (Table 7).
Table 7

Bone manifestations at diagnosis and follow-up in the Argentine population vs. world population (Source: ICGG 2014) [13].

Manifestation | Argentine basal | ICGG basal | Argentine follow-up | ICGG follow-up
Patients, N | 260 | 4625 | 266 | 4625
Bone pain | 65.6% | 49.8% | 41% | 37.8%
Bone crisis | 35.6% | 13.6% | 14.5% | 2.6%
Infiltration | 91% | 87.3% | 89.5% | 80.8%
Erlenmeyer | 89.7% | 73.1% | 87.7% | 71.1%
Infarcts | 60% | 43.8% | 59.6% | 48.9%
Necrosis | 43.6% | 32% | 42.7% | 28%
% Total of bone disease | 71% | 60.8% | 69.8% | 58.8%

N370S/RecNciI genotype and bone manifestations

Comparing the presence of bone manifestations in our RecNciI/N370S genotype patients with those reported by the ICGG (June 2019), we observed that bone manifestations were significantly more frequent in our population not only at diagnosis (61% vs. 38%, p = 0.001), but also after long-term follow-up (median 154 months) (70% vs. 57%, p = 0.001).

Univariate analysis

The presence of bone lesions at diagnosis and follow-up (mean duration of ERT 154 months) was compared between N370S/RecNciI and F411I/RecNciI patients (n = 77) and those with other genotypes (n = 69). The prevalence of bone lesions at diagnosis did not differ between the two genotype groups (p = 0.183). At follow-up, bone lesions were significantly more frequent among patients harboring the RecNciI allele (p = 0.017). Additionally, at follow-up on ERT, both genotypes (RecNciI/N370S and RecNciI/F411I) were significantly associated with bone lesions (p = 0.017 and p = 0.001, respectively). Rates of splenectomy in different genotypes were similar, ~9% (data not shown).

Argentine ancestor analyses in patients harboring RecNciI allele

The preponderance of the RecNciI allele in nearly 50% of our national cohort is the highest of any cohort described to date. We examined the origin of patients using questionnaires designed by experts in population genetics, tracing the origins of the families back to the birthplace of the third generation. Of the respondents, approximately 70% (n = 56) of patients reported at least one Argentine ancestor and 49% (n = 39) had two ancestors from Argentina in one paternal and/or maternal lineage.

Multivariate analysis

Variables included were: 1) RecNciI allele, 2) N370S/RecNciI genotype, 3) RecNciI/other genotype, and 4) Argentine ancestors. Taking solitary bone lesion at follow-up (model I) and severe multiple lesions at follow-up (model II) as dependent variables, we found, in both models, that the N370S/RecNciI genotype and the presence of a RecNciI allele were significantly associated with severe bone lesions at follow-up (p = 0.017) as well as with irreversible severe lesions (necrosis and acute and chronic infarcts) (p = 0.004). Argentine ancestors were not a significant variable (p = 0.059) in either of the two models. Interestingly, we found a significant correlation between the presence of a RecNciI allele and the presence of one Argentine ancestor (p < 0.001) or two Argentine ancestors (p < 0.001) (Table 8). Notably, there was also a significant correlation between the N370S/RecNciI genotype and at least one Argentine ancestor (p = 0.001) or two Argentine ancestors (p = 0.001) (Table 9). There was no significant correlation between history of Argentine ancestors and bone lesions at diagnosis (p = 0.470) or at follow-up (p = 0.549).
Table 8

Association between RecNciI allele and Argentine ancestor.

Argentine ancestry | Without association | With association | P value
One Argentine ancestor, N = 80 | 24 (30%) | 56 (70%) | <0.001
Two Argentine ancestors, N = 50 | 11 (22%) | 39 (48.8%) | <0.001
Table 9

RecNciI/N370S genotype and Argentine ancestors.

Argentine ancestry | Without association | With association | P value
One Argentine ancestor, N = 80 | 20 (29.4%) | 48 (70.6%) | 0.001
Two Argentine ancestors, N = 50 | 17 (34%) | 33 (66%) | 0.001

Novel GD pathogenetic variants

Using our comprehensive genotyping strategy, we found rare GD pathogenetic variants that have not been previously reported in 14 patients (9.6%). We assessed their damaging properties in silico, as shown in Table 10. The phenotype of patients with these pathogenetic variants is severe, with early childhood presentation (mean age at diagnosis: 6.9 years) and almost 64% prevalence of advanced bone disease (Table 10).
Table 10

Novel Pathogenic variants found in PacBio Argentina patients.

Novel Var. | Numbers in GD | ArgGD ID | PolyPhen Score | Position/Change | Clinical Significance | Genotype
L-17SfsX36 | 1 | #31 | | 1:155210462 (Del:AG > A) | Pathogenic | L-17SfsX36/N370S
R48GfsX4 (c.259DelC) | 1 | #186 | | 1:155209724 (Del:CG > C) | Pathogenic | R48GfsX4 (c.259DelC)/N370S
D218A | 1 | #178 | 0.892 | 1:155207361 (T > G) | Uncertain | L444P/D218A
P332L | 1 | #147 | 1 | 1:155206148 (G > A) | Uncertain | P332L/N370S
W348R | 1 | #10 | 0.863 | rs765182863/1:155206101 (A > T) | Pathogenic | L444P/W348R
L372P | 1 | #59 | 0.987 | 1:155205628 (A > G) | Pathogenic | F411I/L372P
Y313H + V375L G > C | 3 | #86, #153, #154 | 0.795 | rs398123528/1:155205620 (C > G) | Pathogenic | N370S/Y313H + V375L G > C
P401R | 1 | #93 | 1 | (rs74598136)/1:155205541 (G > C) | Pathogenic | P401R/N370S
S424R | 1 | #190 | 1 | 1:155205102 (G > C) | Uncertain | S424R/N370S
F426S | 1 | #58 | 1 | 1:155205097 (A > G) | Uncertain | F426S/N370S
S488IfsX38 | 1 | #165 | | 1:155204819 (Ins:G > GTAGC) | Pathogenic | S488IfsX38/N370S
L461P | 1 | #105 | 0.998 | 1:155204992 (A > G) | Uncertain | L461P/N370S

Discussion

Comprehensive GBA1 genotyping remains challenging [10,11,25,26] due to the vast multiplicity of disease mutations and the highly homologous pseudogene in proximity, which harbors numerous disease mutations. Several variants normally present in the GBAP1 sequence, when they occur in GBA1, generally cause severe forms of GD, e.g., L444P, complex alleles with multiple pseudogene variants in tandem, D409H, and the RecDelta55 deletion. The occurrence of such disease mutations around the world has been attributed to the propensity of GBA1 for gene conversion events. Several approaches have been developed to analyze the coding regions of GBA1 by Sanger sequencing of amplicons generated using primers designed to exploit the limited differences between the GBA1 and GBAP1 sequences. While the Sanger sequencing approach is adequate for most clinical applications, its short reads (500 bp to 1000 bp) and low throughput may hinder its application for refined genetic counselling and for the study of large cohorts. Moreover, this standard approach does not allow for phasing of the pathogenic variants. Next generation sequencing (NGS) platforms [27], utilizing short read lengths (25 bp to 400 bp), are also inadequate for the precise identification of GBA1 variants, especially for complex alleles and structural variants (SVs) [11,26]. Third-generation sequencing (TGS) technologies can produce read lengths greater than 10,000 bp (typically 5,000 bp to 20,000 bp), offering the advantage of resolving repetitive and complex genome regions. Superior to short-read sequencing and arrays, long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection on a scale of tens to thousands of samples [28]. Therefore, TGS is a promising tool for identifying recombinant alleles and SVs and for phasing bi-allelic GD mutations.
However, TGS using Nanopore [29] or PacBio [30] has been reported to produce reads of low accuracy. So far, only one algorithm has been designed to exploit GBA1 long-read data; it uses Nanopore technology and was compared with Sanger sequencing results in a relatively small cohort [31]. A high proportion of the RecNciI complex allele was noted in an earlier Argentinian GD report [21]. Our results confirmed the superiority of SMRT over NGS in detecting the RecNciI allele, and over Sanger sequencing in scaled-up bi-allelic sequencing of the whole GBA1 gene. In the Argentinian GD cohort, bone disease is highly prevalent. Advanced bone lesions are frequent both at diagnosis (71%) and at follow-up (69.8%), underscoring the irreversible nature of bone disease such as osteonecrosis. There was a large gap between diagnosis in childhood and initiation of treatment in young adulthood, which may contribute to this finding. Previous studies have demonstrated that treatment has its maximal impact in preventing irreversible bone lesions when ERT is initiated within 2 years of diagnosis, compared with a longer interval between diagnosis and starting ERT [32,33]. There are no published descriptions of a comprehensive genotype/phenotype correlation regarding bone manifestations in GD patients; hence our cohort provides valuable insight. We found that the high frequency of advanced bone lesions in Argentinian GD patients is correlated with an unprecedentedly high frequency of the N370S/RecNciI genotype. Additional contributors to advanced bone disease in the Argentinian GD phenotype include prolonged intervals between childhood onset of disease and diagnosis, and long gaps between diagnosis and initiation of ERT in adulthood, which likely promote irreversible bone lesions such as osteonecrosis and bone marrow fibrosis around focal collections of lipid-laden macrophages, ‘Gaucheromas’.
ERT targets tissue macrophages via the mannose receptor and exhibits very high uptake by the liver and spleen, which hinders delivery to the bone marrow compartment [34,35]. In 2016, the GADTEG identified five unfavorable prognostic factors for advanced bone manifestations of Gaucher disease, namely history of splenectomy, diagnostic delay, a long gap between diagnosis and initiation of ERT, poor compliance, and suboptimal dose of ERT [15]. Our study found that the Argentinian GD population harbors the highest burden of the RecNciI GBA1 mutation ever reported, accounting for 52.7% of all disease alleles. This is higher than the frequencies reported in the ICGG Registry and in a smaller cohort study from Argentina (1.9% and 21%, respectively) [21]. The high prevalence of the RecNciI allele found in our study is the most accurate ascertainment in Argentina to date, enabled by the SMRT strategy used to decipher the GBA locus. Previous studies reporting a lower prevalence of this disease allele may have underestimated it because they screened for common variants only. There are several important findings in our study. First, our patient population harbors a high prevalence of the RecNciI GBA1 complex allele, which is associated with severe early childhood onset of GD1. Second, there are large gaps between the onset of GD1 symptoms and diagnosis, and between diagnosis and initiation of ERT. Therefore, by the time ERT is initiated, there is already a high burden of irreversible bone manifestations, attenuating the impact of therapy in the skeletal compartment. Third, the high prevalence of the RecNciI mutation raises the possibility that the gene conversion event(s) giving rise to this variant disrupt another gene in the tightly packed GBA locus that could affect the skeletal phenotype. This topic merits further investigation for a potential candidate gene for severe skeletal manifestations. Finally, about 10% of the disease variants we found had not been described before, adding to the catalogue of GBA1 mutations.
It is intriguing that we found a strong association of the N370S/RecNciI genotype with Argentinian ancestry. It seems likely that the RecNciI mutation is a founder mutation in the Argentine GD patient population. The Argentinian population is generally said to be composed largely of immigrants, with only a minor proportion having Argentinian ancestors. However, in 2012, Avena et al. [36] demonstrated, after analyzing national surveys and census data, that 65% of Argentinians have European ancestors, 31% have Argentinian ancestors, and 4% have African ancestors. These frequencies vary according to geographical region; for example, the proportion of indigenous Americans is higher in the northwest (60%) and south (50%) [12,13]. Our analysis showed that 54.8% of GD patients have at least one Argentinian ancestor, and 30.5% have two such ancestors in any paternal/maternal lineage. Univariate and multivariate analyses showed a statistically significant correlation between the presence of the RecNciI allele and at least one Argentine ancestor (p < 0.001) or two Argentinian ancestors (p < 0.001). This indicates a strong relationship between the presence of Argentine ancestors and the RecNciI allele as well as the RecNciI/N370S genotype (p = 0.001). This high incidence of Argentinian ancestors may explain the high frequency of the RecNciI allele. A limitation of our study was that, of a total of 192 samples, 146 (76%) were successfully genotyped by SMRT sequencing, whereas full genotypes could not be ascertained for the remaining 46 samples. Nevertheless, having 146 patients comprehensively genotyped enabled a robust genotype/phenotype correlation study. We are currently optimizing and adapting our strategy to achieve successful genotyping in all samples.
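The genotyping yield stated in the limitation above can be checked with a short calculation. Only the two sample counts come from the text; the variable names are illustrative.

```python
# Counts taken from the text: 192 samples in total, 146 fully genotyped.
total_samples = 192
genotyped = 146

# Samples for which full genotypes could not be ascertained.
unresolved = total_samples - genotyped

# Genotyping yield as a percentage of all samples.
yield_pct = 100 * genotyped / total_samples

print(f"{genotyped}/{total_samples} genotyped ({yield_pct:.0f}%), {unresolved} unresolved")
# 146/192 genotyped (76%), 46 unresolved
```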

Conclusions

Complete GBA1 gene sequencing using long-read SMRT sequencing plus short-read validation provides rapid, scaled-up mutation analysis, with essential information that cannot be obtained by Sanger or NGS sequencing alone. In our population, a high frequency of the RecNciI allele (52.7%) and of the RecNciI/N370S genotype (46.6%) was detected, frequencies not captured in the international registry. Using a questionnaire to assess familial origin, we show that 54.8% of Argentinian GD patients have at least one Argentine ancestor and 30.5% have two. Moreover, there was a statistically significant correlation between the presence of a RecNciI allele and one Argentine ancestor (p < 0.001) or two Argentine ancestors (p < 0.001). Our study indicates that the RecNciI allele and the RecNciI/N370S genotype are significantly associated with severe bone manifestations at presentation and during follow-up on ERT (p = 0.017).

Funding

This study was supported by the Yale Center for Genome Analysis (YCGA) and by Sanofi.
  27 in total

1.  The underrecognized progressive nature of N370S Gaucher disease and assessment of cancer risk in 403 patients.

Authors:  Tamar H Taddei; Katherine A Kacena; Mei Yang; Ruhua Yang; Advitya Malhotra; Michael Boxer; Kirk A Aleck; Gadi Rennert; Gregory M Pastores; Pramod K Mistry
Journal:  Am J Hematol       Date:  2009-04       Impact factor: 10.047

2.  The human glucocerebrosidase gene and pseudogene: structure and evolution.

Authors:  M Horowitz; S Wilder; Z Horowitz; O Reiner; T Gelbart; E Beutler
Journal:  Genomics       Date:  1989-01       Impact factor: 5.736

Review 3.  Gaucher disease and other storage disorders.

Authors:  Gregory A Grabowski
Journal:  Hematology Am Soc Hematol Educ Program       Date:  2012

4.  Analysis and classification of 304 mutant alleles in patients with type 1 and type 3 Gaucher disease.

Authors:  V Koprivica; D L Stone; J K Park; M Callahan; A Frisch; I J Cohen; N Tayebi; E Sidransky
Journal:  Am J Hum Genet       Date:  2000-05-04       Impact factor: 11.025

5.  Mutation prevalence among 51 unrelated Spanish patients with Gaucher disease: identification of 11 novel mutations.

Authors:  P Alfonso; A Cenarro; J I Pérez-Calvo; M Giralt; P Giraldo; M Pocoví
Journal:  Blood Cells Mol Dis       Date:  2001 Sep-Oct       Impact factor: 3.039

6.  Replacement therapy for inherited enzyme deficiency--macrophage-targeted glucocerebrosidase for Gaucher's disease.

Authors:  N W Barton; R O Brady; J M Dambrosia; A M Di Bisceglie; S H Doppelt; S C Hill; H J Mankin; G J Murray; R I Parker; C E Argoff
Journal:  N Engl J Med       Date:  1991-05-23       Impact factor: 91.245

7.  Glucosylsphingosine is a key biomarker of Gaucher disease.

Authors:  Vagishwari Murugesan; Wei-Lien Chuang; Jun Liu; Andrew Lischuk; Katherine Kacena; Haiqun Lin; Gregory M Pastores; Ruhua Yang; Joan Keutzer; Kate Zhang; Pramod K Mistry
Journal:  Am J Hematol       Date:  2016-08-08       Impact factor: 10.047

8.  Evaluation of the detection of GBA missense mutations and other variants using the Oxford Nanopore MinION.

Authors:  Melissa Leija-Salazar; Fritz J Sedlazeck; Marco Toffoli; Stephen Mullin; Katya Mokretar; Maria Athanasopoulou; Aimee Donald; Reena Sharma; Derralynn Hughes; Anthony H V Schapira; Christos Proukakis
Journal:  Mol Genet Genomic Med       Date:  2019-01-13       Impact factor: 2.183

9.  Gaucher disease: single gene molecular characterization of one-hundred Indian patients reveals novel variants and the most prevalent mutation.

Authors:  Jayesh Sheth; Riddhi Bhavsar; Mehul Mistri; Dhairya Pancholi; Ashish Bavdekar; Ashwin Dalal; Prajnya Ranganath; Katta M Girisha; Anju Shukla; Shubha Phadke; Ratna Puri; Inusha Panigrahi; Anupriya Kaur; Mamta Muranjan; Manisha Goyal; Radha Ramadevi; Raju Shah; Sheela Nampoothiri; Sumita Danda; Chaitanya Datar; Seema Kapoor; Seema Bhatwadekar; Frenny Sheth
Journal:  BMC Med Genet       Date:  2019-02-14       Impact factor: 2.103

10.  Timing of initiation of enzyme replacement therapy after diagnosis of type 1 Gaucher disease: effect on incidence of avascular necrosis.

Authors:  Pramod K Mistry; Patrick Deegan; Ashok Vellodi; J Alexander Cole; Michael Yeh; Neal J Weinreb
Journal:  Br J Haematol       Date:  2009-09-03       Impact factor: 6.998
