Manon Boivin, Nicolas Charlet-Berguerand.
Abstract
Microsatellites are repeated DNA sequences of 3–6 nucleotides that are highly variable in length and sequence and have important roles in genome regulation and evolution. However, expansion of a subset of these microsatellites over a threshold size is responsible for more than 50 human genetic diseases. Interestingly, some of these disorders are caused by expansions of similar sequence, size and localization and present striking similarities in clinical manifestations and histopathological features, which suggests a common mechanism of disease. Notably, five identical CGG repeat expansions located in different genes cause fragile X-associated tremor/ataxia syndrome (FXTAS), neuronal intranuclear inclusion disease (NIID), oculopharyngodistal myopathy types 1 to 3 (OPDM1-3) and oculopharyngeal myopathy with leukoencephalopathy (OPML), which are neuromuscular and neurodegenerative syndromes with overlapping symptoms and similar histopathological features, notably the presence of characteristic eosinophilic ubiquitin-positive intranuclear inclusions. In this review we summarize recent findings in NIID and FXTAS, where the causative CGG expansions were found to be embedded within small upstream ORFs (uORFs), resulting in their translation into novel proteins containing a stretch of polyglycine (polyG). Importantly, expression of these polyG proteins is toxic in animal models and is sufficient to reproduce the formation of ubiquitin-positive intranuclear inclusions. These data suggest the existence of a novel class of human genetic pathology, the polyG diseases, and raise the question of whether a similar mechanism may exist in other diseases, notably in OPDM and OPML.
Keywords: RAN translation; microsatellite; neurodegeneration; protein aggregates toxicity; trinucleotide repeat disorders
Year: 2022 PMID: 35295941 PMCID: PMC8918734 DOI: 10.3389/fgene.2022.843014
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Repeat expansion diseases, sorted by their proposed pathogenic mechanism.
| Proposed mechanism | Disease | Gene | Localization | Repeat | Normal size | Pathogenic size | Reference |
|---|---|---|---|---|---|---|---|
| LOF | BSS | | Promoter | CGG | 9–20 | 120–800 | |
| LOF | FXS | | 5′ UTR | CGG | 5–50 | >200 | |
| LOF | FRAXE | | 5′ UTR | CCG | 4–39 | 200–900 | |
| LOF | EPM1 | | 5′ UTR | C4GC4GCG | 2–3 | 30–75 | |
| LOF | GDPAG | | 5′ UTR | GCA | 8–16 | 680–1,400 | |
| LOF | FRDA | | Intron | GAA | 5–34 | 65–1,300 | |
| LOF | XDP | | Intron | C3TCT | absent | 30–55 | |
| polyAla | SPD1 | | Exon | GCG | 15 | 24 | |
| polyAla | BCCD | | Exon | GCN | 17 | 27 | |
| polyAla | HFGS | | Exon | GCN | 12–18 | 18–30 | |
| polyAla | BPES | | Exon | GCN | 14 | 19–24 | |
| polyAla | HPE5 | | Exon | GCN | 15 | 25 | |
| polyAla | EIEE1 | | Exon | GCN | 12–16 | 20–23 | |
| polyAla | MRGH | | Exon | GCN | 15 | 26 | |
| polyAla | CCHS | | Exon | GCN | 20 | 25–29 | |
| polyAla | OPMD | | Exon | GCG | 6–10 | 11–18 | |
| polyQ | SBMA | | Exon | CAG | 9–36 | 38–68 | |
| polyQ | DRPLA | | Exon | CAG | 3–35 | 48–93 | |
| polyQ | HD | | Exon | CAG | 6–35 | 36–200 | |
| polyQ | HDL2 | | Exon | CAG | 6–28 | 41–58 | |
| polyQ | SCA1 | | Exon | CAG | 6–38 | 39–88 | |
| polyQ | SCA2 | | Exon | CAG | 13–31 | 32–500 | |
| polyQ | SCA3 | | Exon | CAG | 12–44 | 55–87 | |
| polyQ | SCA6 | | Exon | CAG | 4–18 | 20–33 | |
| polyQ | SCA7 | | Exon | CAG | 4–33 | 37–460 | |
| polyQ | SCA8 | | Exon | CAG | 15–50 | 74–250 | |
| polyQ | SCA17 | | Exon | CAG | 25–40 | 43–66 | |
| ? | SCA12 | | 5′ UTR | CAG | 4–32 | 43–78 | |
| polyGly | FXTAS | | 5′ UTR | CGG | 5–50 | 55–200 | |
| polyGly | NIID | | 5′ UTR | CGG | 7–60 | 60–200 | |
| ? | FXPOI | | 5′ UTR | CGG | 5–50 | 55–200 | |
| ? | OPML | | LncRNA | CGG | 3–16 | 50–60 | |
| ? | OPDM1 | | 5′ UTR | CGG | 13–45 | 80–130 | |
| ? | OPDM2 | | 5′ UTR | CGG | 12–32 | 70–120 | |
| ? | OPDM3 | | 5′ UTR | CGG | 7–60 | 60–200 | |
| RAN | ALS/FTD | | Intron | G4C2 | 3–25 | >30 | |
| RAN | SCA36 | | Intron | G3C2T | 5–14 | 650–2,500 | |
| RAN | SCA31 | | Intron | G2A2T | variable | 110–760 | |
| ? | CANVAS | | Intron | G3A2 | variable | 400–2,000 | |
| RNA | DM1 | | 3′ UTR | CTG | 5–37 | 50–10,000 | |
| RNA | DM2 | | Intron | CCTG | 11–30 | 50–11,000 | |
| RNA | FECD3 | | Intron | CTG | 5–31 | >50 | |
| ? | FAME1 | | Intron | TTTCA | absent | 440–3,680 | |
| ? | FAME2 | | Intron | TTTCA | absent | >660–730 | |
| ? | FAME3 | | Intron | TTTCA | absent | >660–2,800 | |
| ? | FAME4 | | Intron | TTTCA | absent | >500 | |
| ? | FAME6 | | Intron | TTTCA | absent | >400 | |
| ? | FAME7 | | Intron | TTTCA | absent | >500 | |
| ? | SCA10 | | Intron | TTCTA | 10–32 | 280–4,500 | |
| ? | SCA37 | | Intron | TTTCA | absent | 31–75 | |
LOF, loss-of-function mechanism; polyAla, polyalanine; polyGly, polyglycine; polyQ, polyglutamine; RAN, repeat-associated non-ATG translation; ALS, amyotrophic lateral sclerosis; BCCD, brachydactyly and cleidocranial dysplasia; BPES, blepharophimosis, ptosis and epicanthus inversus; BSS, Baratela-Scott syndrome; CANVAS, cerebellar ataxia, neuropathy and vestibular areflexia syndrome; CCHS, congenital central hypoventilation syndrome; DM1, myotonic dystrophy type 1; DM2, myotonic dystrophy type 2; DRPLA, dentatorubral-pallidoluysian atrophy; EIEE1, early infantile epileptic encephalopathy type 1; EPM1, progressive myoclonus epilepsy type 1; FAME, familial adult myoclonic epilepsy; FECD3, Fuchs endothelial corneal dystrophy type 3; FRAXE, fragile XE syndrome; FRDA, Friedreich ataxia; FTD, frontotemporal dementia; FXPOI, fragile X-associated primary ovarian insufficiency; FXS, fragile X syndrome; FXTAS, fragile X-associated tremor/ataxia syndrome; GDPAG, global developmental delay, progressive ataxia and elevated glutamine; HD, Huntington disease; HDL2, Huntington disease-like 2; HFGS, hand-foot-genital syndrome; HPE5, holoprosencephaly type 5; MRGH, mental retardation with isolated growth hormone deficiency; NIID, neuronal intranuclear inclusion disease; OPDM, oculopharyngodistal myopathy; OPMD, oculopharyngeal muscular dystrophy; OPML, oculopharyngeal myopathy with leukoencephalopathy; SBMA, spinal and bulbar muscular atrophy; SPD1, synpolydactyly type 1; SCA, spinocerebellar ataxia; XDP, X-linked dystonia parkinsonism.
FIGURE 1. CGG repeat expansions cause a spectrum of diseases. (A) Identical CGG repeat expansions embedded within the 5′UTR of different genes cause various neurodevelopmental, neuromuscular and neurodegenerative disorders. (B) Brain sections of individuals with FXTAS or NIID show identical p62- or SUMO-positive intranuclear inclusions. (C) FXTAS, NIID, OPML, and OPDM may belong to a continuum of neuromuscular and neurodegenerative disorders.
FIGURE 2. CGG repeat expansions are translated into polyG proteins in FXTAS and NIID. (A) Scheme of FMR1 indicating FMRpolyG upstream ORF and FMRP main ORF localization. (B) Scheme of NOTCH2NLC indicating uN2CpolyG upstream ORF and NOTCH2NLC (abbreviated N2C) main ORF localization. Initiation codons and stop codons are indicated in red for uORFs and in blue for the main ORFs. Near-cognate initiation codons are indicated in bold yellow. (C) Protein sequences of FMRpolyG and uN2CpolyG show no similarity beyond their polyglycine stretch. (D) Sequences of the putative uORFs embedded within the NOTCH2, NOTCH2NLA, NOTCH2NLB and NOTCH2NLC 5′UTRs. Variant amino acids are indicated in bold. The sequence required for uN2C to interact with KU70/KU80 is indicated in blue.
FIGURE 3. Repeat expansions located in “non-coding” regions are nonetheless translated. (A) CGG repeat expansions embedded in the FMR1 and NOTCH2NLC 5′UTRs are translated into polyglycine-containing proteins in FXTAS and NIID, respectively. (B) G3C2T and G4C2 repeats embedded in the first unspliced intron of the NOP56 and C9ORF72 genes are translated into poly(glycine-proline)- and poly(glycine-alanine)-containing proteins in SCA36 and ALS/FTD, respectively. (C) CAG repeats and G2C4 repeats embedded in the ATXN8, JPH3AS and C9ORF72AS long “non-coding” RNAs are translated into polyglutamine- or poly(glycine-proline)-containing proteins in SCA8, HDL2, and ALS/FTD, respectively. Initiation codons and stop codons of the ORFs containing the pathogenic expansion are indicated in red, while those of the main ORFs are indicated in blue. Near-cognate initiation codons are indicated in bold yellow.
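Whether a CGG repeat yields a polyglycine stretch depends on the reading frame in which it is translated: in the standard genetic code, CGG encodes arginine, GGC glycine and GCG alanine. The following is a minimal Python sketch of that point, assuming a pure, uninterrupted repeat; the helper name `translate_repeat` and the three-entry codon table are illustrative assumptions, not taken from the article.

```python
# Only the three codons that can occur in a pure CGG repeat
# (standard genetic code: CGG = Arg, GGC = Gly, GCG = Ala).
CODON_TO_AA = {"CGG": "R", "GGC": "G", "GCG": "A"}


def translate_repeat(unit: str, n_repeats: int, frame: int) -> str:
    """Translate a pure repeat tract starting at offset `frame` (0, 1 or 2).

    Hypothetical helper for illustration; it ignores start and stop codons
    and any flanking sequence.
    """
    seq = unit * n_repeats
    # Walk the tract in steps of 3, keeping only complete codons.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return "".join(CODON_TO_AA[c] for c in codons)


for frame in range(3):
    print(frame, translate_repeat("CGG", 5, frame))
```

Only the frame that reads the tract as GGC codons gives a run of glycines, which is consistent with the expansions being embedded in upstream ORFs that set that frame.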
FIGURE 4. Repeat expansion range of toxicity in different microsatellite diseases. CGG repeat expansions are pathogenic between ∼60–70 and 200–300 repeats when they are translated into toxic polyglycine-containing proteins, but above 200 repeats when they promote DNA epigenetic changes, promoter silencing and a loss-of-function mechanism, as in FXS. This range of protein toxicity can be compared with that of other translated microsatellite diseases: polyalanine-containing proteins are generally toxic with 10–35 GCN repeats, while polyQ proteins are generally pathogenic with longer CAG expansions (∼40 to 80–200 repeats), with the exception of the short polyglutamine stretch that alters the function of the calcium voltage-gated channel subunit alpha1A (CACNA1A) in SCA6. In contrast, an RNA gain-of-function mechanism, such as titration of the MBNL RNA-binding proteins in DM1, requires much longer repeat expansions.
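The size-dependent switch between mechanisms can be made concrete for the FMR1 5′ UTR CGG repeat, using only the ranges listed in the table (normal 5–50; FXTAS and FXPOI 55–200; FXS >200). This is a rough sketch for illustration, not a diagnostic rule: the function name `classify_fmr1_cgg` is an assumption, and repeat counts that fall between the listed ranges are reported as unlisted rather than guessed.

```python
def classify_fmr1_cgg(n_repeats: int) -> str:
    """Map an FMR1 CGG repeat count to the range given in the table.

    Hypothetical helper; thresholds are copied from the table and are
    not clinical cut-offs.
    """
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats > 200:
        return "FXS range (loss of function)"
    if n_repeats >= 55:
        return "FXTAS/FXPOI range"
    if 5 <= n_repeats <= 50:
        return "normal range"
    # Counts such as 51-54 or below 5 are not covered by the table.
    return "not listed in the table"


for n in (30, 52, 90, 250):
    print(n, classify_fmr1_cgg(n))
```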