Determining the prevalence of McArdle disease from gene frequency by analysis of next-generation sequencing data.

Mauricio De Castro¹, Jennifer Johnston¹, Leslie Biesecker¹.

Abstract

PURPOSE: McArdle disease is one of the most common glycogen storage disorders. Although the exact prevalence is not known, it has been estimated to be 1 in 100,000 patients in the United States. More than 100 mutations in PYGM have been associated with this disorder. McArdle disease has significant clinical variability: Some patients present with severe muscle pain and weakness; others have only mild, exercise-related symptoms.
METHODS: Next-generation sequencing data allow estimation of disease prevalence with minimal ascertainment bias. We analyzed gene frequencies in two cohorts of patients based on exome sequencing results. We categorized variants into three groups: a curated set of published mutations, variants of uncertain significance, and likely benign variants.
RESULTS: An initial estimate based on the frequency of six common mutations predicts a disease prevalence of 1/7,650 (95% confidence interval (CI) 1/5,362-1/11,108), which greatly deviates from published estimates. A second method using the two most common mutations predicts a prevalence of 1/42,355 (95% CI 1/24,536-1/76,310) in Caucasians.
CONCLUSIONS: These results suggest that the currently accepted prevalence of McArdle disease is an underestimate and that some of the currently considered pathogenic variants are likely benign.

Year: 2015 PMID: 25741863 PMCID: PMC4561039 DOI: 10.1038/gim.2015.9

Source DB: PubMed Journal: Genet Med ISSN: 1098-3600 Impact factor: 8.822

INTRODUCTION

McArdle disease (glycogen storage disease type V) is an inherited disorder of glycogen metabolism that exclusively affects skeletal muscle. It was first described in 1951 by the British physician Brian McArdle, who reported a patient with exercise intolerance who failed to produce lactate. Symptoms consist of rapid fatigue, myalgia, and cramping associated with exercise. There is clinical variability: some patients have mild exercise-related symptoms (fatigue or poor stamina)[1], whereas others have more pronounced proximal muscle weakness[2]. A fatal, rapidly progressive neonatal form with widespread muscle weakness has also been reported[3]. A classic finding in patients with the disease is the rapid improvement of symptoms with rest (the so-called “second-wind phenomenon”).
In mildly to moderately affected patients, the clinical diagnosis requires a high degree of suspicion, especially in older patients in whom the only symptom can be exercise intolerance. The diagnosis is confirmed by identification of biallelic pathogenic variants in PYGM, which encodes muscle phosphorylase and is the only gene known to be associated with McArdle disease[4]. If the results are unclear, muscle biopsy with measurement of phosphorylase enzyme activity can be helpful. A less invasive, recently described method uses antibodies to determine the expression of PYGM in white blood cells[5].
The prevalence of McArdle disease has been reported to be 1 in 100,000 in the US[6], at least 1 in 170,000 in Spain[7], and 1 in 350,000 in the Netherlands[8]. In Spain and the Netherlands, the calculations were based on the number of affected individuals in national McArdle disease registries. Because McArdle disease can cause mild symptoms, an estimate based on ascertainment by clinical presentation to a metabolic disease expert could severely underestimate the prevalence.
Access to exome sequencing data allows us to estimate the prevalence of this disorder based on carrier frequency using the Hardy-Weinberg equilibrium, reducing the bias associated with clinical ascertainment.
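As a minimal sketch of the Hardy-Weinberg relationship this approach relies on (an illustration, not the authors' code): for an autosomal recessive disorder with a combined pathogenic allele frequency q, the expected fraction of affected individuals is q² and the carrier fraction is 2pq, with p = 1 - q. The function name and the example input (the published US estimate of 1 in 100,000) are choices made here for illustration only.

```python
import math

def hardy_weinberg(q: float) -> dict:
    """Expected genotype fractions for a pathogenic allele frequency q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = 1.0 - q
    return {
        "affected": q * q,        # two pathogenic alleles
        "carrier": 2.0 * p * q,   # exactly one pathogenic allele
        "non_carrier": p * p,     # no pathogenic allele
    }

# Working backwards from a published prevalence of 1 in 100,000:
# the implied allele frequency is the square root of the prevalence.
q = math.sqrt(1 / 100_000)
fractions = hardy_weinberg(q)
print(f"allele frequency q = {q:.5f}")
print(f"carrier frequency  = about 1 in {1 / fractions['carrier']:.0f}")
```

Run in the other direction (from an observed allele frequency to q²), the same relationship yields a prevalence estimate that does not depend on affected individuals coming to clinical attention.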

MATERIALS AND METHODS

We evaluated variant call data from the ClinSeq® cohort (n=951) and the NHLBI GO Exome Sequencing Project (ESP) (n=4,297 European American [EA] and 2,201 African American [AA] participants). The ClinSeq® cohort is composed of 951 patients, predominantly of Caucasian descent, ascertained for their family history of cardiovascular disease; participants are otherwise healthy and were not selected for known muscular conditions or symptoms. The ESP cohort is composed of several groups of patients. Most have a personal or family history of cardiovascular or pulmonary disease; some are healthy controls, whereas others are affected with hyperlipidemia, cardiovascular disease, or other associated conditions. Neither cohort was selected for primary muscle disease. We first analyzed variant calls for the PYGM gene in the ClinSeq® database. Materials and methods for the ClinSeq® study are described elsewhere[9]; DNA isolation, library preparation, capture, sequencing, alignment, and base calling were performed as described in previous reports[10]. PYGM variant analysis was performed in VarSifter v1.6[11]. Variants were filtered for mutation type and population frequency. Variants that met population frequency (MAF <0.5% in ClinSeq® and ESP) and quality filters were further classified by cross-referencing them with mutations in the Human Gene Mutation Database (HGMD). The pathogenicity of these variants was evaluated by reviewing publications with clinical, functional, and/or genetic data. To be considered pathogenic, a variant had to be reported in the literature in a patient with classical manifestations of the disease, with compatible ancillary testing (e.g., characteristic muscle biopsy, absent muscle phosphorylase levels, or second-wind phenomenon on treadmill testing) and the identification of biallelic variants in PYGM. The phase of the variants had to be known and appropriate Mendelian segregation confirmed.
For variants not described in the literature, further classification was limited to allele frequency in the general population and in silico model predictions: PolyPhen-2, SIFT[12], and the CADD (Combined Annotation Dependent Depletion) score[13]. Variants that did not meet our criteria for classification as pathogenic but were predicted to be deleterious by all four models and had a MAF <0.5% were considered to be variants of uncertain significance (VOUS). Variants with a MAF >0.5%, or unpublished variants predicted to be benign by one or more in silico models, were considered to be likely benign. The 95% confidence intervals were calculated using the exact binomial method based on the beta distribution, as described by Clopper and Pearson[14]. Variants p.Arg50* and p.Gly205Ser were verified by Sanger sequencing in the ClinSeq® cohort; Sanger validation is not possible for variants in the ESP cohort.
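The classification rules above can be sketched as a small decision function. This is a hedged illustration, not the authors' pipeline: the `Variant` fields and the `classify` name are hypothetical, the in silico calls are assumed to have already been reduced to a deleterious/benign flag per model (no CADD cutoff is stated in the text, so CADD is left out of the examples), and a MAF of exactly 0.5% is treated as above the limit.

```python
from dataclasses import dataclass
from typing import List

MAF_LIMIT = 0.005  # the 0.5% minor-allele-frequency cutoff from the Methods

@dataclass
class Variant:
    name: str
    maf: float                         # minor allele frequency in the cohort
    meets_literature_criteria: bool    # published with phenotype, biallelic, phase known
    predicted_deleterious: List[bool]  # one flag per in silico model (assumed input)

def classify(v: Variant) -> str:
    """Apply the frequency, literature, and in silico rules in that order."""
    if v.maf >= MAF_LIMIT:
        return "likely benign"   # too common in the population
    if v.meets_literature_criteria:
        return "pathogenic"      # literature evidence takes precedence
    if v.predicted_deleterious and all(v.predicted_deleterious):
        return "VOUS"            # rare, unproven, unanimously predicted deleterious
    return "likely benign"       # at least one model predicts benign

# Two rows from Table 1, using carrier counts over 2 x 5,248 alleles and
# only the SIFT and PolyPhen calls.
i513v = Variant("p.I513V", 36 / 10_496, True, [False, False])
r34w = Variant("p.R34W", 1 / 10_496, False, [True, True])
print(i513v.name, "->", classify(i513v))
print(r34w.name, "->", classify(r34w))
```

Checking the literature criteria before the in silico calls is what lets a published variant such as p.I513V remain in the pathogenic group despite benign predictions.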

RESULTS

The ClinSeq® data were evaluated first. Two variants were excluded (p.Thr395Met and p.Arg414Gly) because they were above the frequency limit. We were left with 59/951 ClinSeq® participants who had among them 27 PYGM variants (Table 1). No participant had two minor alleles. Fifteen participants were heterozygous for one of six published mutations. Thirteen participants were heterozygous for 12 VOUS and 31 participants were heterozygous for nine likely benign variants. We then evaluated the ESP dataset for European Americans for the mutations that we identified in ClinSeq®. In the ESP EA dataset, 105 participants were heterozygous for one of the six published mutations. Twenty-six participants were heterozygous for six of the 12 VOUS and 64 participants were heterozygous for one of the nine likely benign variants.

Table 1

Variants evaluated in this study

cDNA	AA change	Number of individuals with variant in ClinSeq®	Number of individuals with variant in EA ESP	Total ClinSeq® + ESP EA	Number of individuals with variant in AA ESP	Variant Classification	Published in the literature	SIFT	Polyphen	CADD
c.148C>T	p.R50*	6	27	33	2	Pathogenic	Yes	LOF	LOF	40
c.613G>A	p.G205S	1	3	4	0	Pathogenic	Yes	DAMAGING	PROBABLY DAMAGING	36
c.1094C>T	p.A365V	2	4	6	2	Pathogenic	Yes	DAMAGING	PROBABLY DAMAGING	23.6
c.1537A>G	p.I513V	1	35	36	3	Pathogenic	Yes	TOLERATED	BENIGN	10.83
c.1805G>A	p.R602Q	1	0	1	0	Pathogenic	Yes	DAMAGING	PROBABLY DAMAGING	36
c.2009C>T	p.A670V	4	35	39	6	Pathogenic	Yes	DAMAGING	PROBABLY DAMAGING	35
c.100C>T	p.R34W	1	0	1	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	35
c.209G>A	p.R70H	1	0	2	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	35
c.482G>A	p.R161H	1	0	0	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	34
c.832C>T	p.R278C	1	0	0	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	22.1
c.848A>G	p.N283S	2	12	14	2	VOUS	No	DAMAGING	PROBABLY DAMAGING	24.2
c.1160G>A	p.R387H	1	0	1	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	31
c.1558C>T	p.R520C	1	2	3	2	VOUS	No	DAMAGING	PROBABLY DAMAGING	22.8
c.1885G>T	p.D629Y	1	1	2	5	VOUS	No	DAMAGING	PROBABLY DAMAGING	24.89
c.2083G>A	p.G695R	1	0	1	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	28.3
c.2446C>T	p.R816C	1	1	2	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	20.9
c.2467C>T	p.R823W	1	1	2	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	19.07
c.2500C>T	p.R834C	1	0	1	0	VOUS	No	DAMAGING	PROBABLY DAMAGING	20.4

To increase power, we combined our results with data from the ESP project, which yielded 5,248 exomes. Although there were no homozygotes for any of these variants in the NHLBI ESP, we could not exclude compound heterozygosity because that database does not provide these data. A total of 27 variants were considered among 59 individuals from the 951 participants in ClinSeq®. Six of these 27 variants have been claimed to be pathogenic in prior publications. These six variants, which were present in a total of 15 participants for a MAF of 0.00789, predict a disease prevalence of 1/16,080 (95% CI 1/5,940-1/51,163). Because the confidence interval of this estimate was so wide, we expanded our dataset by adding the NHLBI ESP EA data, for a total of 5,248 individuals. Between the two datasets, there were a total of 120 participants with one of the six pathogenic variants, for a MAF of 0.0114, which predicts a prevalence of 1/7,650 (95% CI 1/5,362 to 1/11,108). Given the discrepancy with published estimates, we critically evaluated the evidence supporting the pathogenicity of the variants and rank-ordered them from most evidence to least evidence. The p.Arg50* variant was ranked highest because it is present in large numbers of affected individuals as compared with controls and has been shown to undergo nonsense-mediated decay in muscle tissue from patients with McArdle disease[15]. We therefore calculated the predicted disease prevalence based on that variant alone. In the combined ClinSeq® and ESP EA data, the MAF for this variant was 0.00313, which predicts a disease prevalence of 1/101,166 (95% CI 1/51,349 - 1/213,345).
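The six-variant estimate can be retraced with a short, standard-library-only sketch. Two assumptions are made here and are not stated in the text: each carrier contributes exactly one minor allele (so 120 carriers among 5,248 individuals means 120 of 10,496 alleles), and the interval on prevalence is obtained by squaring the Clopper-Pearson bounds on the allele frequency. The helper names are hypothetical, and the exact interval is found by bisection on the binomial tail probabilities rather than by calling a beta quantile function.

```python
import math

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); terms are built in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_1mp = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for i in range(k + 1):
        log_term = (log_n_fact - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_1mp)
        total += math.exp(log_term)
    return min(total, 1.0)

def _root(f, lo=0.0, hi=1.0, iters=80):
    """Bisection for a function that decreases from positive to negative."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else _root(lambda p: binom_cdf(k - 1, n, p) - (1 - alpha / 2))
    upper = 1.0 if k == n else _root(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

carriers, individuals = 120, 5_248
alleles = 2 * individuals
q = carriers / alleles                       # minor allele frequency
q_lo, q_hi = clopper_pearson(carriers, alleles)
print(f"MAF = {q:.4f}, prevalence about 1 in {1 / q**2:,.0f}")
print(f"95% CI about 1 in {1 / q_hi**2:,.0f} to 1 in {1 / q_lo**2:,.0f}")
```

With these inputs the point estimate falls near the reported 1/7,650; the interval should land close to the reported 1/5,362 to 1/11,108 if the assumption about squaring the bounds holds.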
We then took the variant with the next strongest evidence, p.Gly205Ser, added its frequency to that of p.Arg50*, and re-estimated the prevalence of the disease. This variant is located in a region critical for tetramerization of the PYGM enzyme, and mutations in residue 205 have been shown to lead to misfolding of the protein in human cell lines[16]. The MAF of those two variants in the combined dataset was 0.00352, which predicts a disease prevalence of 1/80,478 (95% CI 1/42,407 - 1/162,198). This series of calculations was continued for all six mutations, showing that the previously estimated prevalence of the disease is accounted for by the p.Arg50* variant alone and that the upper 95% confidence interval of our calculations falls to about 1/100,000 when accounting for only three mutations (Figure 1). Indeed, when all six of the published variants identified in ClinSeq® are used, the predicted disease prevalence is far higher than prior estimates. Although there are more than 100 reported PYGM mutations, we calculated a predicted disease prevalence of 1/7,650 (95% CI 1/5,362 to 1/11,108) using only six published mutations.
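The stepwise calculation can be sketched as a running sum over the ranked variants. Only the first two ranks (p.Arg50*, then p.Gly205Ser) are given in the text, so the example stops there; the carrier counts are the combined ClinSeq® + ESP EA totals from Table 1, and the function name and the one-minor-allele-per-carrier assumption are introduced here for illustration.

```python
from itertools import accumulate

N_INDIVIDUALS = 5_248  # ClinSeq (951) + ESP EA (4,297)

def cumulative_prevalence(ranked_counts, n_individuals=N_INDIVIDUALS):
    """Return (variants_included, 1/prevalence) as ranked variants are added."""
    alleles = 2 * n_individuals
    results = []
    for included, carriers in enumerate(accumulate(ranked_counts), start=1):
        q = carriers / alleles           # combined minor allele frequency
        results.append((included, 1.0 / (q * q)))
    return results

# Ranked by strength of evidence, with combined carrier counts from Table 1.
ranked = [("p.Arg50*", 33), ("p.Gly205Ser", 4)]
steps = cumulative_prevalence([count for _, count in ranked])
for (name, _), (_, inverse_prevalence) in zip(ranked, steps):
    print(f"through {name}: about 1 in {inverse_prevalence:,.0f}")
```

The two steps come out within a few units of the reported 1/101,166 and 1/80,478; the small differences are consistent with rounding of the MAF in the text.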

FIGURE 1

Ordinal mutation prevalence. Prevalence estimate with 95% CI starting with the mutation with the most evidence for pathogenicity (p.Arg50*) and subsequently adding published mutations in decreasing order of evidence for pathogenicity.

To provide yet another approach to these estimates, we calculated the prevalence by deriving the total fraction of all other pathogenic alleles using data from affected patients[17]. First, we tabulated the total mutation burden for the two most common mutations: p.Arg50* and p.Gly205Ser. The former is the most common mutation in McArdle disease, although its frequency varies among populations. In the US, p.Arg50* is estimated to account for 63% of alleles among patients with McArdle disease[1,18]. p.Gly205Ser is the second most common mutation in Europe and the US, comprising about 9% of pathogenic alleles. Together, these two alleles should account for 72% of McArdle disease alleles in European Americans. Scaling the combined frequency of both alleles on the assumption that they account for 72% of causative alleles resulted in a predicted prevalence of 1/42,355 (95% CI 1/24,536 - 1/76,310), an interval that does not include the currently estimated prevalence.
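A minimal sketch of this second method, assuming the scaling is done on the allele frequency before squaring: the observed frequency of the two common alleles is divided by their assumed 72% share to approximate the frequency of all pathogenic alleles, and Hardy-Weinberg is then applied. The function name is hypothetical, and the inputs (37 carriers among 5,248 individuals) are taken from Table 1.

```python
def prevalence_from_partial_alleles(carriers: int, n_individuals: int,
                                    allele_share: float) -> float:
    """Prevalence when the observed alleles are only a share of all pathogenic alleles."""
    if not 0.0 < allele_share <= 1.0:
        raise ValueError("allele_share must be in (0, 1]")
    q_observed = carriers / (2 * n_individuals)  # frequency of the counted alleles
    q_total = q_observed / allele_share          # scaled up to all pathogenic alleles
    return q_total ** 2                          # Hardy-Weinberg: affected fraction

# p.Arg50* (33 carriers) + p.Gly205Ser (4 carriers), assumed to be 72% of alleles
prevalence = prevalence_from_partial_alleles(37, 5_248, 0.72)
print(f"predicted prevalence: about 1 in {1 / prevalence:,.0f}")
```

Under these assumptions the sketch gives a value close to, but not identical with, the reported 1/42,355, presumably because of differences in rounding or in the exact inputs used.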

DISCUSSION

These data suggest that McArdle disease is significantly more common among European-derived Americans than the currently accepted 1/100,000 prevalence, and we conclude that the disorder is at least twice as common, in the range of 1/50,000. There are two potential explanations: 1) McArdle disease is underdiagnosed, and/or 2) the penetrance of some of the variants in McArdle disease is overestimated. It is possible that some mutations in PYGM are not fully penetrant, which would lead to an overestimate of the prevalence when it is calculated from combined allele frequencies. We believe this is one of the strengths of the calculations that use only the two most common mutations (p.Arg50* and p.Gly205Ser), which all evidence to date suggests are fully penetrant. That both methods predict a higher frequency supports our thesis. Expressivity should also be considered: were there to be a wider range of expressivity than currently appreciated, there could be many patients who have a very mild form of this disease. This would be just as interesting and important; we suggest that a very mild form of McArdle disease could be present in a patient and not be diagnosed as McArdle disease, yet have significant implications for exercise tolerance. A separate issue to consider is the possibility that many of the variants in McArdle disease are actually benign, which would erroneously increase the calculated prevalence (for instance, the variant p.Ile513Val seems to be just as common as p.Arg50* in certain populations). We do not believe this to be a valid explanation, because our higher prevalence is supported by the method of extrapolating from only two variants that are essentially certain to be pathogenic, which makes the individual pathogenicity assessment of the other variants irrelevant. Nearly all variants other than p.Arg50* and p.Gly205Ser would have to be benign for the 95% CI of our estimates to overlap with the current prevalence estimate, which we think is an unreasonable hypothesis.
It is possible that some mutations in PYGM cause a clinically very mild phenotype of McArdle disease. This has been described for autosomal recessive metabolic disorders such as biotinidase deficiency[19], pyruvate kinase deficiency[17], and Gaucher disease, but not for McArdle disease. Because McArdle disease is a condition with high clinical variability, symptoms can go unrecognized for many years before a diagnosis is made. It is possible that many affected patients develop an aversion to anaerobic exercise that does not limit their lives enough for them to seek a diagnosis, and as such they are not included in current prevalence estimates. There are some limitations to this approach. We assumed that McArdle disease is a monogenic condition and that all pathogenic variants can be accounted for by looking at PYGM. If locus heterogeneity were a possibility for McArdle disease, then the prevalence of the disease would be higher than we are suggesting here. A second limitation is that for the NHLBI-ESP dataset, we are not able to ascertain the phase of the variants. Given that the denominators of our prevalence estimates are larger than the number of individuals in the NHLBI-ESP dataset, few if any biallelic individuals would be expected in it, and we think this is unlikely to be an issue. Finally, it is important to point out the technical limitations of identifying variants from next-generation sequencing data. Insufficient depth of coverage, deep intronic mutations, mutations in the promoter region, and the inability to detect large deletions or duplications would lead to under-ascertainment of pathogenic variants. However, such an error would again make our estimate conservative, and the disease would be more common than we predict. The estimation of disease frequency based on patients who present to specialty clinics is biased towards those with typical, recognizable, and more severe presentations. We predict that as sequencing is applied more widely in the clinic and in larger research cohorts, undiagnosed individuals with biallelic mutations in PYGM will be identified.
This approach of genome-driven ascertainment (as opposed to phenotype-driven ascertainment) mitigates the inherent ascertainment bias towards more severe presentations. It will be important to identify patients by mutations and follow that with clinical research to elucidate the possible associated phenotype, which has been termed hypothesis-generating clinical research[20]. Such identifications will allow a better appreciation of the true spectrum of clinical phenotypes associated with variation in this gene. We predict that a substantial number of such identified individuals will be found to have abnormal biochemistry and exercise tolerance, and that the full delineation of this phenotype will become a component of predictive medicine.

18 in total

1. VarSifter: visualizing and analyzing exome-scale sequence variation data on a desktop computer.

Authors: Jamie K Teer; Eric D Green; James C Mullikin; Leslie G Biesecker
Journal: Bioinformatics Date: 2011-12-30 Impact factor: 6.937

2. Cell models for McArdle disease and aminoglycoside-induced read-through of a premature termination codon.

Authors: Kathryn E Birch; Ros M Quinlivan; Glenn E Morris
Journal: Neuromuscul Disord Date: 2012-07-20 Impact factor: 4.296

3. Profound biotinidase deficiency in two asymptomatic adults.

Authors: B Wolf; K Norrgard; R J Pomponio; D M Mock; J R McVoy; K Fleischhauer; S Shapiro; M G Blitzer; J Hymes
Journal: Am J Med Genet Date: 1997-11-28

4. Secondary variants in individuals undergoing exome sequencing: screening of 572 individuals identifies high-penetrance mutations in cancer-susceptibility genes.

Authors: Jennifer J Johnston; Wendy S Rubinstein; Flavia M Facio; David Ng; Larry N Singh; Jamie K Teer; James C Mullikin; Leslie G Biesecker
Journal: Am J Hum Genet Date: 2012-06-14 Impact factor: 11.025

5. Diagnosis of McArdle's disease by molecular genetic analysis of blood.

Authors: M el-Schahawi; S Tsujino; S Shanske; S DiMauro
Journal: Neurology Date: 1996-08 Impact factor: 9.910

6. Genotypic and phenotypic features of McArdle disease: insights from the Spanish national registry.

Authors: Alejandro Lucia; Jonatan R Ruiz; Alfredo Santalla; Gisela Nogales-Gadea; Juan C Rubio; Inés García-Consuegra; Ana Cabello; Margarita Pérez; Susana Teijeira; Irene Vieitez; Carmen Navarro; Joaquín Arenas; Miguel A Martin; Antoni L Andreu
Journal: J Neurol Neurosurg Psychiatry Date: 2012-01-16 Impact factor: 10.154

7. The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine.

Authors: Leslie G Biesecker; James C Mullikin; Flavia M Facio; Clesson Turner; Praveen F Cherukuri; Robert W Blakesley; Gerard G Bouffard; Peter S Chines; Pedro Cruz; Nancy F Hansen; Jamie K Teer; Baishali Maskeri; Alice C Young; Teri A Manolio; Alexander F Wilson; Toren Finkel; Paul Hwang; Andrew Arai; Alan T Remaley; Vandana Sachdev; Robert Shamburek; Richard O Cannon; Eric D Green
Journal: Genome Res Date: 2009-07-14 Impact factor: 9.043

8. PYGM expression analysis in white blood cells: a complementary tool for diagnosing McArdle disease?

Authors: Noemí de Luna; Astrid Brull; Alejandro Lucia; Alfredo Santalla; Nuria Garatachea; Ramon Martí; Antoni L Andreu; Tomàs Pinós
Journal: Neuromuscul Disord Date: 2014-08-21 Impact factor: 4.296

9. Novel mutations in patients with McArdle disease by analysis of skeletal muscle mRNA.

Authors: I García-Consuegra; J C Rubio; G Nogales-Gadea; J Bautista; S Jiménez; A Cabello; A Lucía; A L Andreu; J Arenas; M A Martin
Journal: J Med Genet Date: 2009-03 Impact factor: 6.318

10. A general framework for estimating the relative pathogenicity of human genetic variants.

Authors: Martin Kircher; Daniela M Witten; Preti Jain; Brian J O'Roak; Gregory M Cooper; Jay Shendure
Journal: Nat Genet Date: 2014-02-02 Impact factor: 38.330
