
Partners in crime: Proteins implicated in RNA repeat expansion diseases.

Anna Baud¹, Magdalena Derbis¹, Katarzyna Tutak¹, Krzysztof Sobczak¹.

Abstract

Short tandem repeats are repetitive nucleotide sequences widely distributed in the human genome. Their expansion underlies the pathogenesis of multiple neurological disorders, including Huntington's disease, amyotrophic lateral sclerosis and frontotemporal dementia, fragile X-associated tremor/ataxia syndrome, and myotonic dystrophies, known as repeat expansion disorders (REDs). Several molecular pathomechanisms associated with toxic RNA containing expanded repeats (RNAexp) are shared among REDs and contribute to disease progression; however, detailed mechanistic insight into those processes is limited. To deepen our understanding of the interplay between toxic RNAexp molecules and multiple protein partners, in this review, we discuss the roles of selected RNA-binding proteins (RBPs) that interact with RNAexp and thus act as "partners in crime" in the progression of REDs. We gather current findings concerning RBPs involved at different stages of the RNAexp life cycle, such as transcription, splicing, transport, and AUG-independent translation of expanded repeats. We argue that the activity of selected RBPs can be unique or common among REDs depending on the expanded repeat type. We also present proteins that are functionally depleted due to sequestration on RNAexp within nuclear foci and those that participate in RNAexp-dependent innate immunity activation. Moreover, we discuss the utility of selected RBPs as targets in the development of therapeutic strategies. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications; RNA in Disease and Development > RNA in Disease.


Keywords: RAN translation; RNA binding proteins; liquid-liquid phase separation; repeat expansion; short tandem repeats


Substances:
RNA-Binding Proteins
RNA

Year: 2022 PMID: 35229468 PMCID: PMC9539487 DOI: 10.1002/wrna.1709

Source DB: PubMed Journal: Wiley Interdiscip Rev RNA ISSN: 1757-7004 Impact factor: 9.349

INTRODUCTION

Short tandem repeats (STRs), also termed microsatellites, are repeats of 1–8 nucleotide‐long sequences and occur very frequently within different parts of genomes. Some STRs are overrepresented in exons, and depending on the sequence, their frequencies in open reading frames and 5′ and 3′ untranslated regions (UTRs) differ (Kozlowski et al., 2010). Moreover, trinucleotide repeats are more frequent in the coding parts of genes, whereas 5‐ and 6‐nucleotide repeats are more frequent in noncoding parts (Malik, Kelley, et al., 2021). STRs are polymorphic in length and prone to expansion or, to a lesser extent, contraction either in cells of the same individual (somatic instability) or between generations (germline instability; Ashizawa et al., 1992; Paulson, 2018; Trang et al., 2015; Trottier et al., 1994). The origin of STR instability remains incompletely understood, but the possible mechanisms include replication slippage (McMurray, 2010), replication fork stalling (Gadgil et al., 2017), mismatch repair, and bidirectional transcription (Castel et al., 2010). To date, several possible models of pathogenesis caused by STR expansions have been proposed, and in different repeat expansion disorders (REDs), combinations of these pathomechanisms play crucial roles in disease development (extensively reviewed in Malik, Kelley, et al., 2021). At the DNA level, STRs in noncoding parts of genes may enhance the cotranscriptional formation of RNA/DNA hybrids called R‐loops, which are associated with induction of the DNA damage response and gene silencing due to methylation or blockage of transcription (Reddy et al., 2011). At the RNA level, a dominant gain‐of‐function mechanism in which RNA bearing expanded repeats (RNAexp) sequesters RNA‐binding proteins (RBPs), mostly in cell nuclei, impairing their function, has been postulated in many diseases (e.g., A. Mankodi et al., 2000; Miller et al., 2000; Sellier et al., 2010, 2013). 
Moreover, if RNAexp is transported to the cytoplasm, its toxicity can be exerted by the translation of the repeats that do not reside in the canonical AUG‐initiated open reading frame via a process called repeat‐associated non‐AUG (RAN) translation (Zu et al., 2011). Finally, a gain‐of‐function mechanism involving the protein products of genes bearing STRs in their coding sequences is associated with their tendency to form toxic aggregates or activate stress response pathways (Cook et al., 2020; Y. J. Zhang et al., 2018). In all of the mentioned pathomechanisms, processes involving RNAexp molecules can be considered a causative agent from transcription to translation (Box 1).

BOX 1

Gene loss‐of‐function: Repeat expansion in DNA might result in a partial or complete lack of the gene product. Altered gene expression can be caused by transcriptional gene silencing mediated by hypermethylation of the mutated allele or by impaired transcription. For example, in fragile X syndrome, heterochromatinization of the FMR1 locus silences the expression of fragile X mental retardation protein 1 (FMRP). Alternatively, in Friedreich's ataxia, actively transcribed nascent RNA with GAA repeats can form DNA–RNA R‐loop structures, which initiate epigenetic silencing of the FXN gene.

RNA gain‐of‐function: Expanded repeats within (pre‐)mRNAs form stable secondary structures, accumulate in the cell nuclei, attract and trap RNA‐binding proteins, and form ribonucleoprotein complexes called RNA foci. RNA foci are pathological hallmarks of many REDs. For example, in myotonic dystrophy type 1 (DM1), muscleblind‐like proteins (MBNLs) are sequestered to CUG repeat foci, which leads to global alternative splicing abnormalities due to functional depletion of MBNLs.

Protein gain‐of‐function: The presence of expanded repeats in RNA, if transported to the cytoplasm, may result in the production of a mutated variant of the protein containing repeated amino acids, as in the case of polyQ expansion in huntingtin. Moreover, peptides derived from RNAexp can aggregate and exhibit toxic properties. Insoluble protein aggregates are detected in many REDs, for example, polyglycine in FXTAS. On the other hand, in C9‐ALS/FTD, intron retention can result in decreased synthesis of C9orf72 protein, which exemplifies a protein loss‐of‐function mechanism due to repeat expansion.

However, it takes two to tango: in the crime of RED pathogenesis, the protein–RNAexp interaction cannot be neglected. RBPs bind specific RNA sequences and/or their specific secondary/tertiary structures through RNA‐binding domains (RBDs), form ribonucleoprotein (RNP) complexes, and engage in fundamental cell processes, such as the regulation of transcription, mRNA splicing, maturation, transport, stability, cellular localization, and translation (reviewed in Jazurek et al., 2016). The most common RBDs contain either an RNA‐recognition motif (RRM), a heterogeneous nuclear RNP (hnRNP) K homology domain (KH), zinc fingers (ZNFs), an S1 domain, a double‐stranded RBD (dsRBD), or a combination of these domains (reviewed in Lunde et al., 2007). Eukaryotic cells contain hundreds of different RBPs with unique RNA‐binding specificity, many of which have been described as involved in the development of REDs. To date, over 50 human inherited REDs, mainly neurological or neuromuscular diseases, have been described (Figure 1a). 
These typically multisystemic diseases can be inherited in an autosomal dominant manner, for example, Huntington's disease (HD; CAGexp in the protein‐coding region; MacDonald et al., 1993), myotonic dystrophies (DMs; CTGexp in the 3′UTR; Brook et al., 1992), and C9orf72‐linked amyotrophic lateral sclerosis and frontotemporal dementia (C9‐ALS/FTD; G4C2exp in the intron; DeJesus‐Hernandez et al., 2011); an autosomal recessive manner, for example, Friedreich's ataxia (FRDA; GAAexp in the intron of the frataxin FXN gene; Campuzano et al., 1996); or an X‐linked dominant manner, for example, fragile X syndrome (FXS) and fragile X‐associated tremor/ataxia syndrome (FXTAS; CGGexp in the 5′UTR of the fragile X mental retardation 1 FMR1 gene; Tassone et al., 2000). In HD, CAGexp in the coding region of the huntingtin gene (HTT) results in the production of a mutant protein containing a polyglutamine (polyQ) stretch, which is susceptible to misfolding and aggregation and thus underlies disease pathogenesis (Martí, 2016; Persichetti et al., 1995). In FXS, CGGexp exceeding 200 repeats, named full mutation, causes hypermethylation and silencing of the FMR1 promoter (Tassone et al., 2000, 2007). The absence of the synaptic functional regulator FMR1 protein (FMRP) is linked to alterations in brain synaptic plasticity, impairing cognitive functions and resulting in intellectual disability. In FXTAS, the mutant RNA containing 55–200 CGG repeats can sequester some RBPs (Sellier et al., 2010, 2013) or trigger the production of toxic polyglycine (polyG) protein (Todd et al., 2013). The pathological hallmark of FXTAS is the presence of large, ubiquitin‐positive inclusions in the nuclei of neurons and astrocytes (Greco et al., 2002), while the clinical symptoms include intention tremor, cerebellar ataxia, parkinsonism, and brain atrophy (Greco et al., 2006; Hagerman et al., 2001). 
In C9‐ALS/FTD, G4C2exp in the first intron of the C9orf72 gene leads to insufficiency of the product of the mutated gene, the C9orf72 protein (DeJesus‐Hernandez et al., 2011); toxic gain‐of‐RNA function (Xu et al., 2013); and the production of toxic dipeptide repeat proteins (DPRs; Ash et al., 2013; Gendron et al., 2013). Although DM1 is caused by CTGexp in the 3′UTR of the DMPK gene and DM2 by CCTGexp in intron 1 of the nucleic acid‐binding protein‐coding CNBP gene, these two neuromuscular disorders exhibit common pathological symptoms, that is, skeletal muscle weakness, wasting, and cognitive dysfunction (Brook et al., 1992; Liquori et al., 2001). Both CUGexp and CCUGexp are toxic and lead to nuclear sequestration of the Muscleblind‐like (MBNL) proteins, impairing their physiological functions (Fardaei et al., 2001, 2002; Mankodi, 2001; Miller et al., 2000).
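The CGG repeat size ranges quoted above for FMR1 can be read as a simple threshold rule. The following Python sketch is illustrative only: the function name and the returned labels are hypothetical, the boundaries are taken directly from the text (55–200 repeats for FXTAS, more than 200 repeats for the full mutation in FXS), and it is not intended as a diagnostic tool.

```python
def classify_fmr1_cgg(repeats: int) -> str:
    """Map an FMR1 CGG repeat count onto the size ranges quoted in the text.

    The labels are illustrative names, not clinical terminology from the source.
    """
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats > 200:
        # "CGGexp exceeding 200 repeats, named full mutation" (FXS)
        return "full mutation (FXS)"
    if repeats >= 55:
        # "mutant RNA containing 55-200 CGG repeats" (FXTAS)
        return "FXTAS range"
    return "below FXTAS range"


if __name__ == "__main__":
    for n in (30, 55, 120, 200, 201, 800):
        print(n, classify_fmr1_cgg(n))
```

The two disease ranges do not overlap, which reflects the different mechanisms described above: transcriptional silencing of FMR1 in FXS versus toxic RNA and polyG production in FXTAS.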

FIGURE 1

Characteristics of expanded STRs specific for different diseases. (a) Localization and size of STRs within specific gene regions. Expanded STRs, depending on the sequence, are located in different parts of the gene. The size of STR expansion necessary for the development of individual REDs differs between diseases; however, as a rough rule, the longest expansions are located within introns, followed by 3′UTRs, middle‐size expansions within 5′UTRs, and the shortest within exons, and thus within protein‐coding sequences. Here we show representative REDs from the larger group of diseases. (b) Structures formed by RNA. Trinucleotide CNG repeats form RNA hairpin structures, all characterized by high thermodynamic stability, the highest for CGG, next CAG, and the lowest for CUG repeats. Hairpin structures are also formed by CCUG and G4C2 repeats. Moreover, G4C2 and CGG repeats are able to form G‐quadruplex structures. (c) Protein products derived from repeat‐associated non‐AUG (RAN) translation. RAN translation may potentially start and produce proteins in all three reading frames. In the process of RAN translation of trinucleotide repeats, homopolymeric proteins with tracts of single amino acids are biosynthesized. CCUG tetranucleotide repeats are RAN translated to proteins containing tracts of four amino acids, the same in all reading frames. RAN translation of G4C2 hexanucleotide repeats is the source of DPRs composed of repeats of two amino acids, one of which is glycine in all reading frames.

In this review, we focus on the roles of RNAexp and its protein partners in the pathogenesis of selected REDs, C9‐ALS/FTD, FXTAS, HD, and DM, at different stages of the RNAexp life cycle (Table 1). Importantly, the pathomechanisms of these diseases are shared among many other REDs. 
We discuss not only the proteins that play a role in nuclear processing, that is, the transcription and splicing of RNAexp, and those that are sequestered by RNAexp in various disorders, but also proteins that play a role in RNAexp transport, non‐canonical translation, stress response, and phase separation of products of mutant genes. We also discuss these proteins in the context of the development of therapeutic strategies. The described proteins may be considered partners in crime that, together with RNAexp molecules, are responsible for the development and progression of REDs.
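The reading‐frame logic summarized in the Figure 1c caption can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard genetic code, translates only the sense strand, ignores initiation context and frame‐specific efficiency, and the helper names (`translate_frames`, `repeating_unit`) are hypothetical names introduced here for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_frames(unit: str, copies: int = 12) -> dict:
    """Translate a tandem repeat of `unit` in the three sense reading frames."""
    rna = unit.upper().replace("T", "U") * copies
    peptides = {}
    for frame in range(3):
        # Take complete codons only, starting at the frame offset.
        codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
        peptides[frame] = "".join(CODON_TABLE[c] for c in codons)
    return peptides


def repeating_unit(peptide: str) -> str:
    """Return the shortest motif whose repetition reproduces `peptide`."""
    for size in range(1, len(peptide) + 1):
        unit = peptide[:size]
        if (unit * (len(peptide) // size + 1))[:len(peptide)] == peptide:
            return unit
    return peptide


if __name__ == "__main__":
    for unit in ("CAG", "CUG", "CGG", "CCUG", "GGGGCC"):
        motifs = [repeating_unit(p) for p in translate_frames(unit).values()]
        print(unit, motifs)
```

Under these assumptions, CAG repeats give a different homopolymer in each frame (polyQ, polyS, polyA), CCUG repeats give the same four‐residue motif in every frame up to rotation, and G4C2 repeats give three glycine‐containing dipeptide repeats, in agreement with the caption.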

TABLE 1

Proteins implicated in pathomechanisms of repeat expansion diseases (REDs)

Process	RNA^exp	Protein	Protein name	UniProt ID	Effect/method	Reference
Transcription	CAG	SUPT4H1	Transcription elongation factor SPT4	P63272 (human)	KO in mouse reduced production of mutated huntingtin	(C. R. Liu et al., 2012)
	CAG	SPT4‐A	Transcription elongation factor SPT4‐A	P63271 (mouse)	KD (ASO) in mouse selectively reduced mutated mRNA and protein	(H. M. Cheng et al., 2015)
	G4C2; C4G2	SUPT4H1	Transcription elongation factor SPT4	P63272 (human)	KO in yeast and in fly reduced RNA foci and RAN translation	(Kramer et al., 2016)
	G4C2	PAF1, LEO1	RNA polymerase II‐associated factor 1 homolog; RNA polymerase‐associated protein LEO1	Q8N7H5 (human), Q8WVC0 (human)	KD in fly suppressed toxicity, upregulated in patients	(Goodman, Prudencio, Kramer, et al., 2019)
	G4C2	AFF2/FMR2	AF4/FMR2 family member 2	P51816 (human)	KD in fly reduced RNA^exp and RAN protein level, KO in iPSNs reduced level of C9orf72 RNA, RNA foci, and RAN translation products	(Yuva‐Aydemir et al., 2019)
	GGCCUG	SPT4‐A, SPT5	Transcription elongation factor SPT4‐A; Transcription elongation factor SPT5	P63271 (mouse), O55201 (mouse)	KD in Neuro2A cells reduced RNA foci and RAN translation	(Furuta et al., 2019)
Sequestration, co‐localization	CAG	MBNL1	Muscleblind‐like protein 1	Q9NR56 (human)	OE rescued splicing abnormalities in HeLa cells expressing CAG^exp	(Mykowska et al., 2011)
	G4C2	Nucleolin	Nucleolin	P19338 (human)	Nucleolar stress in C9‐ALS/FTD patients	(Haeusler et al., 2014)
	G4C2	Pur alpha	Transcriptional activator protein Pur‐alpha	Q00577 (human)	OE reduced neurodegeneration in fly, cell, and fish models.	(Swinnen et al., 2018; Xu et al., 2013)
	G4C2	hnRNP H	Heterogeneous nuclear ribonucleoprotein H	P31943 (human)	Colocalized with RNA foci in C9‐ALS/FTD patients, missplicing of hnRNP H‐dependent transcript	(Y. B. Lee et al., 2013)
	G4C2	hnRNP H1/F, ALYREF, and SRSF2	Heterogeneous nuclear ribonucleoprotein F, THO complex subunit 4, SRSF protein kinase 2	P52597 (human), Q86V81 (human), P78362 (human)	Colocalized with RNA foci in C9‐ALS/FTD patients	(Cooper‐Knock et al., 2014)
	C4G2	SRSF1, hnRNP A1, hnRNP H/F, ALYREF	SRSF protein kinase 1, Heterogeneous nuclear ribonucleoprotein A1, Heterogeneous nuclear ribonucleoprotein F, THO complex subunit 4	O70551 (human), P09651 (human), P52597 (human), Q86V81 (human)	Colocalized with RNA foci in C9‐ALS/FTD patients	(Cooper‐Knock et al., 2015)
	G4C2	ADARB2	Double‐stranded RNA‐specific editase B2	Q9NS39 (human)	KD decreased RNA foci in iPSNs, colocalized with RNA foci in C9‐ALS/FTD patients	(Donnelly et al., 2013)
	G4C2	ZFP106	Zinc finger protein 106	Q9H2Y7 (human)	OE suppressed neurotoxicity in C9‐ALS/FTD fly model	(Celona et al., 2017)
	G4C2	Matrin‐3	Matrin‐3	P43243 (human)	OE suppressed neurotoxicity in C9‐ALS/FTD fly model	(Ramesh et al., 2020)
	CUG	MBNL1	Muscleblind‐like protein 1	Q9NR56 (human)	OE/ASO blockers complementary to RNA^exp reversed DM1 phenotype in mouse model	(Kanadia et al., 2006; Wheeler et al., 2009)
	CUG	hnRNP H	Heterogeneous nuclear ribonucleoprotein H	P31943 (human)	Colocalized with RNA foci in DM1 patients	(D. H. Kim et al., 2005)
	CCUG	MBNL1	Muscleblind‐like protein 1	Q9NR56 (human)	Colocalized with RNA foci in DM2 patients	(Mankodi, 2001)
	CCUG	hnRNP core protein and snRNP Sm antigen	Small nuclear ribonucleoprotein E	P62304 (human)	Colocalized with RNA foci in DM2 patients	(Perdoni et al., 2009)
	CCUG	RBFOX1	RNA binding protein fox‐1 homolog	Q9NWB1 (human)	Colocalized with RNA foci in DM2 patients	(Sellier et al., 2018)
	CGG	FMRpolyG/PolyG	FMRpolyG	–	Colocalized with RNA^exp, cause neurotoxicity in primary neurons derived from FXTAS mouse	(Asamitsu et al., 2021)
	CGG	hnRNP A2/B1	Heterogeneous nuclear ribonucleoproteins A2/B1	P22626 (human)	OE suppressed neurotoxicity in FXTAS fly model	(Sofola et al., 2007)
	CGG	CELF1	CUGBP Elav‐like family member 1	Q92879 (human)	OE suppressed neurotoxicity in FXTAS fly model	(Sofola et al., 2007)
	CGG	Pur alpha	Transcriptional activator protein Pur‐alpha	Q00577 (human)	OE suppressed neurotoxicity in FXTAS fly model	(Jin et al., 2007)
	CGG	TRA2A	Transformer‐2 protein homolog alpha	Q13595 (human)	Colocalized with RNA^exp foci in FXTAS cellular model and with inclusions in FXTAS mouse model and patients	(Cid‐Samper et al., 2018)
	CGG	DROSHA‐DGCR8	Ribonuclease 3 and Microprocessor complex subunit DGCR8	Q9NRR4 (human), Q8WYQ5 (human)	OE suppressed neurotoxicity in FXTAS cellular model, miRNA processing is impaired in FXTAS patients	(Sellier et al., 2013)
	CGG	SAM68	KH domain‐containing, RNA‐binding, signal transduction‐associated protein 1	Q07666 (human)	Colocalized with RNA^exp foci in FXTAS cellular model, SAM68‐dependent alternative splicing is impaired in FXTAS patients	(Sellier et al., 2010)
Nucleocytoplasmic transport	CAG	U2AF65	Splicing factor U2AF 65 kDa subunit	P26368 (human)	KD in fly enhanced RNA^exp toxicity and accumulation, reduction in HD mouse caused RNA^exp nuclear accumulation	(Tsoi et al., 2011)
	G4C2	RanGAP1	Ran GTPase‐activating protein 1	P46060 (human)	OE suppressed RNA^exp toxicity in C9‐ALS/FTD fly, OE rescued impairment of NCT in C9‐ALS iPSNs	(K. Zhang et al., 2015)
	G4C2; CGG	SRSF1	Serine/arginine‐rich splicing factor 1	O70551 (human)	KD reduced nuclear export of mutated G4C2^exp/CGG^exp transcripts in cells and neurotoxicity in C9‐ALS/FTD iPSNs and in FXTAS and C9‐ALS/FTD fly models	(Hautbergue et al., 2017; Malik, Tseng, et al., 2021)
	G4C2	POM121	Nuclear envelope pore membrane protein POM121	Q8TEM1 (human)	OE rescued expression of nuclear pore components and NCT impairment in C9‐ALS/FTD iPSNs	(Coyne et al., 2020)
	CUG	hnRNP H	Heterogeneous nuclear ribonucleoprotein H	P31943 (human)	KD reduced nuclear retention of RNA^exp in cells	(D. H. Kim et al., 2005)
	CUG	Staufen1	Double‐stranded RNA‐binding protein Staufen homolog 1	O95793 (human)	OE reduced nuclear retention of RNA^exp in cells	(Ravel‐Chapuis et al., 2012)
RAN translation	G4C2; CGG	eIF4A	Eukaryotic initiation factor 4A‐I	P60842 (human)	Inhibition reduced RAN translation in vitro	(Green et al., 2017; Kearse et al., 2016)
	CGG; G4C2	eIF4B and eIF4H	Eukaryotic translation initiation factor 4B and Eukaryotic translation initiation factor 4H	P23588 and Q15056 (human)	KD reduced RAN translation in C9‐ALS/FTD fly model	(Goodman, Prudencio, Srinivasan, et al., 2019; Linsalata et al., 2019)
	CGG	DDX3X	ATP‐dependent RNA helicase DDX3X	O00571 (human)	KD reduced RAN translation in FXTAS fly model	(Linsalata et al., 2019)
	G4C2	DDX3X	ATP‐dependent RNA helicase DDX3X	O00571 (human)	KD enhanced RAN translation in C9‐ALS/FTD fly model and in C9‐ALS/FTD iPSNs	(W. Cheng et al., 2019)
	G4C2; CGG	DHX36	ATP‐dependent DNA/RNA helicase DHX36	Q9H2U1 (human)	KD reduced RAN translation in C9‐ALS/FTD iPSNs	(H. Liu, Lu, Paul, Periz, Banco, Ferré‐D'Amaré, et al., 2021; Tseng et al., 2021)
	CGG	eIF1 and eIF5	Eukaryotic translation initiation factor 1 and Eukaryotic translation initiation factor 5	P41567 and P55010 (human)	Modulated RAN translation in vitro	(Linsalata et al., 2019)
	CAG; G4C2	eIF3F	Eukaryotic translation initiation factor 3 subunit F	O00303 (human)	KD reduced RAN translation in vitro	(Ayhan et al., 2018)
	CGG	5MP	eIF5‐mimic protein (also known as BZW2)	E9PFD4 (human)	OE reduced RAN translation in FXTAS fly model	(Singh et al., 2021)
	G4C2	eIF2D	Eukaryotic translation initiation factor 2D	P41214 (human)	KD reduced RAN translation in C. elegans C9‐ALS/FTD model	(Sonobe et al., 2021)
	G4C2; CAG	RPS25	40S ribosomal protein S25	P62851 (human)	KD reduced RAN translation in C9‐ALS/FTD fly model and in C9‐ALS/FTD iPSNs	(Yamada et al., 2019)
Stress response	G4C2; CAG; CCTG; CAGG	PKR	Interferon‐induced, double‐stranded RNA‐activated protein kinase	P19525 (human)	Inhibition of PKR reduced RAN translation in C9‐ALS/FTD mouse model	(Zu et al., 2020)
	G4C2; CGG	eIF2α	Eukaryotic translation initiation factor 2 subunit 1	P05198 (human)	Enhanced RAN translation when phosphorylated in stress conditions in vitro	(W. Cheng et al., 2018; Green et al., 2017)
	G4C2; CAGG; CCTG	eIF2A	Eukaryotic translation initiation factor 2A	Q9BY44 (human)	KO reduced RAN translation in vitro	(Sonobe et al., 2018; Tusi et al., 2021)
	G4C2	PERK	Eukaryotic translation initiation factor 2‐alpha kinase 3	Q9NZJ5 (human)	Elevated activity in response to accumulation of RAN peptides increases RAN translation	(Zu et al., 2020)
	CUG/CAG	Dicer	Endoribonuclease Dicer	Q9UPY3 (human)	Cleaves RNA repeat regions into ~21 nt fragments	(Krol et al., 2007; Lawlor et al., 2011; Yu et al., 2011)
	CAG/CUG	ADAR1	Double‐stranded RNA‐specific adenosine deaminase	P55265 (human)	Co‐expression of ADAR1 with (CAG/CUG)100 dsRNA rescued repeat‐related pathology in Drosophila	(van Eyk et al., 2019)
	CAG/CUG	TLR	Toll‐like receptors		KD of TLRs in Drosophila decreased the toxicity of CAG/CUG_~100 repeats	(Samaraweera et al., 2013)
MLO formation (Phase separation)	CAG	SRSF2	Serine/arginine‐rich splicing factor 2	Q01130 (human)	Marker of nuclear speckles, colocalized with RNA^exp foci	(Jain & Vale, 2017)
	G4C2	SRSF2	Serine/arginine‐rich splicing factor 2	Q01130 (human)	Marker of nuclear speckles, colocalized with RNA^exp foci	(Jain & Vale, 2017)
	G4C2	G3BP1, Caprin1, USP10, eIF3b, ELAVL1, TIAR	Ras GTPase‐activating protein‐binding protein 1, Caprin‐1, Ubiquitin carboxyl‐terminal hydrolase 10, Eukaryotic translation initiation factor 3 subunit B, ELAV‐like protein 1, Nucleolysin TIAR	Q13283 (human), Q14444 (human), Q14694 (human), P55884 (human), Q15717 (human), Q01085 (human)	Markers of stress granules, condensed in vitro with RNA^exp and lysates from cell lines and mouse brain	(Fay et al., 2017)
	G4C2	FMRP	Synaptic functional regulator FMR1	Q06787 (human)	Marker of transport granules, colocalized with RNA^exp, FMRP‐dependent translation regulation was impaired in C9‐ALS/FTD iPSNs	(Burguete et al., 2015)

Abbreviations: ASO, antisense oligonucleotides; iPSNs, induced pluripotent stem cells‐derived neurons; KD, knockdown; KO, knockout; MLO, membraneless organelles; NCT, nucleocytoplasmic transport; OE, overexpression.


TOXIC RNA MOLECULES

In various REDs, distinct toxic RNAexp molecules are generated. Interestingly, for the development of each RED, a specific “repeat load” is needed. This “repeat load” comprises not only the length of STRs within RNAexp and cellular RNAexp levels but also the specific location, for example, certain cellular compartments and specific cell‐type backgrounds. Thus, some RNAexp molecules are pathogenic when the length of inherited expanded repeats exceeds a few dozen copies, while others are pathogenic only when the number of repeats exceeds hundreds or even thousands of copies (Paulson, 2018). STR length is also related to the pathomechanisms in which RNAexp molecules are involved; for example, RBP sequestration plays a larger role in REDs caused by very long STRs, such as DMs (up to 10,000 CTG or CCTG repeats; Malik, Kelley, et al., 2021; Rohilla & Gagnon, 2017). The toxicity of RNAexp may further increase with the elongation of STRs during the lifespan of patients, manifesting as increased disease severity or a younger age of onset (Paulson, 2018; Swami et al., 2009). RNAexp molecules were shown to form different types of structures with different thermodynamic stabilities, depending on the repeated sequence motif: unstructured single‐stranded RNAs (e.g., AAG repeats), semistable hairpins, fairly stable hairpins, or very stable G‐quadruplexes (Figure 1b). The most common RNAexp molecules in REDs, CNG repeats, follow the thermodynamic stability order CGG > CAG > CUG > CCG (Sobczak et al., 2010). At physiological KCl concentration, pH, and temperature, G4C2exp and CGGexp form a stable parallel G‐quadruplex (Asamitsu et al., 2021; Haeusler et al., 2014). These higher‐order structures show extremely high thermodynamic stability and are directly linked to the pathogenic potential of RNAexp, as they are recognized by different RBPs. Importantly, the structure of RNAexp may be influenced by the presence of nucleotide interruptions in the sequences of repeats. 
In general, these protective elements against STR instability at the genomic level reduce RNAexp toxicity. Interruptions including CGG, CTC, GGC, or CAG triplets in CTGexp occur in ~3–5% of DM1 patients, reduce somatic instability, and may result in atypical or milder symptoms, later ages of onset, and progression (Braida et al., 2010; Cumming et al., 2018; Wenninger et al., 2021). Similarly, AGG interruptions within CGGexp in FMR1 increase the genetic stability of alleles with 45–90 repeats (Nolin et al., 2015; Villate et al., 2020; Yrigollen et al., 2014). The CAA interruptions in CAGexp in SCA2 are linked with delayed disease onset. On the other hand, numerous CCG/CGG interruptions in the CTG/CAGexp of ATXN8, related to spinocerebellar ataxia type 8 (SCA8), increase the thermodynamic stability of the RNA hairpin, change the amino acid composition of RAN proteins, enhancing their toxicity, and thus are associated with increased disease penetrance (Perez et al., 2021). Despite differences concerning their sequence, the length of STRs, structure, and so forth, various RNAexp molecules are engaged in similar pathogenic mechanisms, such as RBP sequestration, impairment of nucleocytoplasmic transport (NCT), or RAN translation. This is related to the natural multivalency of RNAexp molecules, which makes them susceptible to interaction with each other and with multiple RBPs. It further leads to the formation of RNA foci, structures composed of RNAexp and proteins, mainly in the nucleus, that vary in size, frequency, and composition but are common between different REDs (reviewed in N. Zhang & Ashizawa, 2017; Box 2).

BOX 2

The specific repeat load necessary for the development of an individual RED consists of a few components. First, the length of expanded repeats, which may further increase between generations or in different cells within the same organism, due to germline and more robust somatic instability. Second, the biogenesis and cellular stability of expanded repeat‐bearing transcripts. Third, localization of RNAexp in a defined cell type and cellular compartment where it interacts with specific partners. Finally, the sequence itself, which, if enriched in nucleotide interruptions in the repeat tract, changes the potential of the RNAexp molecule for relevant interactions with RBPs or the toxicity of RAN protein products related to their amino acid composition.
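Two of the quantities discussed in this section, the number of repeat units and the presence of interruptions (for example, AGG within a CGG tract), can be extracted from a repeat tract with a short script. The Python sketch below is only an illustration: the function name `scan_repeat_tract`, the returned field names, and the example tract are hypothetical, and it assumes the tract has already been excised in frame from the surrounding sequence.

```python
def scan_repeat_tract(tract: str, unit: str) -> dict:
    """Split `tract` into unit-sized blocks and report interruptions.

    An interruption is any block that differs from the canonical `unit`.
    """
    tract, unit = tract.upper(), unit.upper()
    k = len(unit)
    if k == 0 or len(tract) % k != 0:
        raise ValueError("tract length must be a multiple of the unit length")

    blocks = [tract[i:i + k] for i in range(0, len(tract), k)]
    interruptions = [(idx, b) for idx, b in enumerate(blocks) if b != unit]

    # Longest uninterrupted run of the canonical unit.
    longest = current = 0
    for block in blocks:
        current = current + 1 if block == unit else 0
        longest = max(longest, current)

    return {
        "total_units": len(blocks),
        "interruptions": interruptions,
        "longest_pure_run": longest,
    }


if __name__ == "__main__":
    # Hypothetical CGG tract with two AGG interruptions.
    example = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 10
    print(scan_repeat_tract(example, "CGG"))
```

Reporting the longest pure run alongside the total length reflects the point made above: two tracts with the same number of units can differ in stability and toxicity depending on where interruptions fall.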

TRANSCRIPTION AND SPLICING OF RNA

Transcription of genes containing expanded repeats plays an important role in the pathogenesis of some REDs, as RNAexp can exert toxicity by sequestration of many RBPs as a function of RNAexp load. The different noncanonical secondary structures formed by STR expansions need to be resolved during transcription, impeding transcriptional efficiency (Goodman & Bonini, 2020). Slowed or stalled transcription of STRs can lead to the formation of R‐loops in the gene locus. These nucleic acid structures are RNA/DNA hybrids formed during transcription when nascent RNA hybridizes to the DNA template strand behind elongating RNA polymerase II (Pol II; Thomas et al., 1976). In FXTAS, the GC‐rich 5′UTR of FMR1 is susceptible to cotranscriptional R‐loop formation upon reannealing of the nascent FMR1 pre‐mRNA to the complementary DNA strand because a G‐rich RNA:C‐rich DNA heteroduplex is thermodynamically more stable than the corresponding DNA:DNA duplex (Loomis et al., 2014; Reddy et al., 2011). R‐loops over expanded repeats may form a structural block, directly interfering with Pol II transcription elongation and influencing transcription efficiency (Belotserkovskii et al., 2017; Crossley et al., 2019; Groh et al., 2014). Excessive R‐loop formation in an individual locus can result in DNA double‐strand breaks and activation of the DNA damage response, triggering a series of signaling events that may be pathogenic (Diab et al., 2018; Loomis et al., 2014). In line with this mechanism, the phosphorylated histone variant γH2AX, a DNA damage‐response molecule, was detected in FXTAS patients (Hoem et al., 2011; Iwahashi et al., 2006). Moreover, methylation caused by the formation of atypical DNA/DNA or DNA/RNA structures in the expanded repeats at a given locus was shown to significantly modulate the expression of mutant genes. For example, in FRDA, R‐loops can trigger heterochromatinization, which results in FXN gene silencing (Li et al., 2016). 
Expanded CTG (in DM1), G4C2 (in C9‐ALS/FTD), and CGG (in FXS) repeats were shown to dysregulate the transcription process via inhibition or impairment of Pol II initiation or elongation (Brouwer et al., 2013; Colak et al., 2014; Haeusler et al., 2014). However, at least two protein complexes, the DSIF complex and the PAF1 complex (PAF1C), were shown to promote Pol II transcription at repeat expansion sites (Figure 2a).

FIGURE 2

Nuclear processing and accumulation of RNAexp molecules. (a) Transcription. DSIF and PAF1 complexes promote transcription of repeat expansion regions through inhibition of the formation of DNA secondary structures and R‐loops (described for CAGexp and G4C2exp). (b) Splicing. The majority of pre‐mRNAs with expanded repeats undergo correct splicing; however, in a fraction of mature transcripts, intron retention takes place (described for CCUGexp and G4C2exp). Moreover, the G4C2exp‐containing spliced intron is stabilized in a circular form. Bold line, exon; thin line, intron. (c) Sequestration. RNAexp molecules accumulate in nuclei, where they bind multiple RBPs, sequester some of them, and form RNAexp foci (described for CAGexp, CUGexp, CCUGexp, CGGexp, G4C2exp). (a–c) Arrow with a dotted line, change in place and/or in time; solid lines show induction or inhibition of certain processes.

The DSIF complex, composed of two highly conserved proteins, SUPT4H1 and SPT5H (or the yeast orthologs Spt4 and Spt5), is a transcription elongation factor that regulates Pol II processivity by reducing the efficiency of its dissociation from template DNA (Wada et al., 1998). During transcription, Spt5 interacts with the Pol II coiled‐coil domain and encircles the RNA/DNA hybrid, which may prevent the dissociation of Pol II from the template (Klein et al., 2011; Martinez‐Rucobo et al., 2011). Spt4 interacts indirectly with Pol II via Spt5, and the DSIF complex interacts with the DNA template outside of the transcription bubble (Klein et al., 2011; Martinez‐Rucobo et al., 2011). Spt4 possesses a ZNF domain that probably plays a role in modulating interactions with DNA and stabilizes RNA polymerase/template complexes, preventing Pol II from pausing (Crickard et al., 2016; Wenzel et al., 2008). SUPT4H1 was shown to be required for the transcription of expanded CAG (C. R. Liu et al., 2012) and G4C2 (Kramer et al., 2016) repeats in HD and C9‐ALS/FTD, respectively. 
Deletion of SPT4‐A in mouse striatal neurons expressing Htt containing long (>100) CAG repeats (a mouse model of HD) resulted in reduced synthesis of mutant huntingtin containing long polyQ, thus diminishing its aggregation and toxicity (C. R. Liu et al., 2012). In vivo studies showed that downregulation of SUPT4H1 by delivering antisense oligonucleotides (ASOs) that activate RNaseH‐mediated target RNA degradation into the brains of HD model mice reduced the levels of mRNA and protein from the mutant but not the normal Htt allele (H. M. Cheng et al., 2015). Deletion of Spt4 in C9‐ALS/FTD yeast, C. elegans, and Drosophila models reduced the expression of G4C2exp and C4G2exp transcripts, blocked accumulation of these transcripts into RNA foci, and decreased the levels of RAN‐translated polyglycine‐proline (polyGP; one of three dipeptide repeats (DPRs) produced from G4C2exp; Figure 1c). These findings were also confirmed in fibroblasts derived from C9‐ALS patients (Kramer et al., 2016). The PAF1C is composed of the highly conserved PAF1, LEO1, CDC73, CTR9, and RTF1 proteins and plays a role in the initiation, promoter‐proximal pausing, elongation, and RNA processing/termination stages of transcription (Goodman & Bonini, 2020). Recently, an RNA interference (RNAi)‐based screen in a Drosophila model of C9‐ALS/FTD revealed that the Drosophila orthologs of PAF1 and LEO1 (dPaf1 and dLeo1, respectively) are selectively involved in transcription of the G4C2exp‐containing allele, while downregulation of other PAF1C components affected transcription of both long and short G4C2 tracts. The downregulation of dPaf1 and dLeo1 suppressed the toxicity of repeat expansion in fly tissues (Goodman, Prudencio, Kramer, et al., 2019). Notably, the RNA levels of hPAF1 and hLEO1 were upregulated in post‐mortem cortical tissue from C9‐ALS patients, supporting the link between the PAF1C and G4C2exp in C9‐ALS/FTD (Goodman, Prudencio, Kramer, et al., 2019). 
Recently, the AFF2/FMR2 protein, a component of superelongation complex (SEC)‐like 2, was found to selectively regulate transcription of the C9orf72 allele containing long G4C2 tracts in C9‐ALS induced pluripotent stem cell (iPSC)‐derived neurons (iPSNs; Yuva‐Aydemir et al., 2019). Together, the emerging roles of SUPT4H1 (H. M. Cheng et al., 2015; Furuta et al., 2019; Kramer et al., 2016; C. R. Liu et al., 2012), the PAF1C (Goodman, Prudencio, Kramer, et al., 2019), and AFF2/FMR2 (Yuva‐Aydemir et al., 2019) in the transcription of RNA with expanded STRs suggest that the factors implicated in transcriptional elongation are potential therapeutic targets for REDs. In eukaryotes, splicing plays a major role in cotranscriptional gene expression, and almost 95% of protein‐encoding genes undergo alternative splicing (AS). Previously, it was unclear how expanded repeats present in introns exert toxicity at the RNA and protein levels. While the majority of pre‐mRNA molecules with expanded repeats undergo proper splicing and maturation, GC‐rich expansion leads to intron retention (IR) (Sznajder et al., 2018; Wang et al., 2021) (Figure 2b). This interesting AS event was observed in some of the genes with repeat expansion, and the resulting RNA molecules contained an unprocessed sequence, which failed to be excised from pre‐mRNA and was protected from the 5′‐ and 3′‐ends. IR may be a consequence of spliceosome stalling or abnormal association of RBPs with cis‐regulatory sequences, which, in the case of expanded repeats, can confer structural arrangements (Taylor & Sobczak, 2020). Although IR can be considered a physiological mechanism (Wong et al., 2013), it may also play a relevant role in REDs with intronic GC‐rich sequences. 
For instance, C9orf72 mRNA with an enlarged 5′UTR retaining the G4C2exp intron can accumulate in the nucleus and was detected in the frontal cortex in heterozygous expansion carriers (Mori, Weng, Arzberger, May, Rentzsch, Van Broeckhoven, et al., 2013; Sznajder et al., 2018). Similarly, CCUGexp associated with DM2 was shown to induce the retention of the very long host intron 1 and to elevate the levels of mutant CNBP mRNA. Retention of introns containing CCUGexp has been detected in many DM2 tissues, including skeletal muscle and the frontal cortex of the brain, and in lymphoblastoid cell lines (Sznajder et al., 2018). It was also shown that GC‐rich sequences in DNA, due to their secondary structures, can generally slow the RNA Pol II elongation rate and cause RNA Pol II pausing over retained introns (Veloso et al., 2014). Nevertheless, the trans‐acting regulators of IR related to microsatellite repeat disorders remain to be elucidated and require further exploration. In summary, the transcription and splicing of RNAexp depend on the STR sequence and location of the repeats within the gene. Noncanonical secondary structures, including G‐quadruplexes and R‐loops formed by repeat expansions, impact the transcription rate and/or IR, probably by altering accessibility to trans‐acting factors (Table 1), which ultimately has a significant effect on RNAexp load and toxicity.

SEQUESTRATION OF PROTEINS ON RNA MOLECULES

As mentioned above, many REDs have been linked to pathogenic RNA gain‐of‐function models, also termed RNA toxicity, in which mutant RNAexp molecules accumulate in the nucleus, form aggregates, and sequester specific RBPs within nuclear foci (Figure 2c, Table 1). This in turn leads to functional depletion of these proteins and impairment of their physiological functions (N. Zhang & Ashizawa, 2017). Depending not only on the type and load of expanded repeats but also on the tissue type and availability of RBPs, RNAexp foci can present distinct morphologies and abundance (reviewed in Wojciechowska & Krzyzosiak, 2011). It should be noted that some proteins are indirectly sequestered in a process mediated by the interaction of RNAexp molecules with other protein partners (Yang & Hu, 2016).

Myotonic dystrophies

CUGexp foci were reported for the first time in muscle and fibroblast biopsies from DM1 patients (Taneja et al., 1995), and they have since been observed in smooth muscle (Cardani et al., 2008), the heart (Mankodi et al., 2005), and the brain (Jiang et al., 2004) as well. Foci formation depends on the length of the CUGexp repeat; in biopsies of the vastus lateralis muscle, sporadic inclusions were observed within the 70–100 CUGexp repeat range (Mankodi, 2001). CUGexp repeats fold into very stable RNA hairpin structures, attract MBNL1 and sequester it away from its normal RNA‐binding sites, leading to its functional insufficiency in cells (Miller et al., 2000). MBNL1 is an RBP that recognizes multiple GCs flanked by pyrimidines, and under physiological conditions, the major role of MBNL is the regulation of tissue‐specific AS and polyadenylation (Konieczny et al., 2014). Increased activity of MBNL1 and 2 during tissue development shifts the pattern of target mRNA AS from fetal‐ to adult‐specific, while its downregulation in DM1 has the opposite effect (Konieczny et al., 2014). Missplicing of multiple mRNAs is exacerbated in the development of this progressive disease (a consequence of somatic expansion). Abnormalities in the AS of muscle‐specific chloride channel (CLCN1; Kino et al., 2009), insulin receptor (INSR; Ho et al., 2004), bridging integrator 1 (BIN1; Fugier et al., 2011), and calcium channel voltage‐dependent L type alpha 1S subunit (CACNA1S) cause specific DM symptoms, including myotonia, insulin resistance, and muscle weakness, respectively. MBNL1 was also shown to interact with other RBPs (e.g., hnRNP H, H2, H3, F, A2/B1, K, and L; the probable ATP‐dependent RNA helicases DDX5 and DDX17; and ATP‐dependent RNA helicase A [DHX9]; Paul et al., 2011). Although only a fraction of these interactors colocalizes with CUGexp foci, long CUG repeats alter the stoichiometry of MBNL1‐protein complexes.
Increased levels of hnRNP H, H2, H3, and F and DDX5 dysregulate the AS of many DM1‐specific RNA targets in a manner similar to MBNL1 depletion, showing that the interruption of functional interactions between MBNL1 and other RBPs may contribute to DM1 pathogenesis (Paul et al., 2011). Apart from MBNL1 sequestration, other factors also play crucial roles in DM1 pathogenesis. In DM1 myoblasts, skeletal muscle, and heart tissues, the steady‐state level of CUG‐binding protein (CELF1 aka CUGBP1) is augmented, leading to translation defects and misregulation of the AS of some CELF1 mRNA targets. Although the possible sequestration of CELF1 on mutant RNA containing CUG repeats was suggested two decades ago (Timchenko et al., 2001), other studies argued that CELF1 does not colocalize with CUGexp foci (Fardaei et al., 2001; Jiang et al., 2004; Mankodi et al., 2005) and that its increased level is a result of hyperphosphorylation by protein kinase C, which leads to an increase in protein half‐life and activity (Kuyumcu‐Martinez et al., 2007). Additionally, hnRNP H was shown to bind CUGexp in vivo and colocalize to a limited extent with RNA foci in DM1 patient‐derived myoblasts (D. H. Kim et al., 2005). The level of this protein increases ~2‐fold, probably due to the signaling events downstream of CUGexp (Paul et al., 2006). Interestingly, hnRNP H was also shown to interact with MBNL1 in normal myoblasts in an RNA‐independent manner, and in DM1 myoblasts, elevated levels of MBNL1 increased the colocalization of hnRNP H with RNA foci, suggesting that hnRNP H is recruited to intranuclear DM1 foci by MBNL1 (Paul et al., 2006). DM2 presents a clinical phenotype similar to that of DM1, even though its symptoms are generally milder, and its progression is less severe (Meola & Cardani, 2017). Ribonuclear foci containing CCUGexp were shown to efficiently sequester MBNL proteins (Mankodi, 2001). 
Moreover, the hnRNP core protein and snRNP Sm antigen were found to colocalize with some MBNL1‐CCUGexp foci (Perdoni et al., 2009). Recently, in vitro studies showed that RNA‐binding protein fox‐1 homolog 1 (RBFOX1) directly binds CCUGexp. Although this regulator of mRNA metabolism colocalized with CCUGexp foci in muscle biopsies derived from DM2 patients, its splicing regulatory functions were not impaired (Sellier et al., 2018). Interestingly, RBFOX1 seems to compete with MBNL1 for CCUG repeat‐binding sites, which results in reduced sequestration of MBNL proteins by CCUGexp. This phenomenon, together with the intronic localization of expanded repeats (described above), may explain the fewer splicing alterations in DM2 than in DM1 (Sellier et al., 2018). Additionally, the misregulated AS patterns in DM1 and DM2 are similar, but the extent of these changes is tissue‐specific (Nakamori et al., 2013). Indeed, the muscle weakness associated with DM1 affects distal muscles, while proximal muscles are affected in DM2. Taken together, the imbalance of multiple interacting proteins may cause aberrant splicing patterns and contribute to DM pathogenesis. The sequestration rate depends on the load of CUGexp or CCUGexp (repeat length and expression level of RNA with expanded repeats). Due to unequal somatic instability, the efficiency of protein sequestration differs between distinct tissues or even myofibers. Moreover, MBNL proteins are mobile in RNA foci, and when saturated, they can diffuse freely between RNA foci and the nucleoplasm, which may underlie MBNL sequestration (Sznajder et al., 2016). Multiple therapeutic strategies involving limiting the sequestration or degradation of CUGexp were previously developed (reviewed in Konieczny et al., 2014).

Fragile X‐associated tremor/ataxia syndrome

The pathological hallmark of FXTAS is the presence of solitary, large (2–5 μm), ubiquitin‐positive inclusions in the nuclei of neurons, astrocytes (Greco et al., 2002), and Purkinje cells (Ariza et al., 2016). The number of inclusions is correlated with CGG repeat length, suggesting that toxic RNA gain‐of‐function plays an important role in FXTAS pathogenesis. In FXTAS, the repeat length is limited to approximately 200 repeats; longer expansions cause the development of a completely different neurodevelopmental disease, fragile X syndrome (FXS), in which complete silencing of FMR1 occurs (Heitz et al., 1992). FMR1 mRNA containing CGGexp can sequester one or more RBPs, thus impairing their physiological function. Another possible mechanism in FXTAS is that repeat‐associated non‐AUG (RAN) translation leads to the production of toxic proteins containing monoamino acid tracts, mainly polyG (Figure 1c, described in the chapter on RAN translation), which colocalize with ubiquitin in FXTAS in different regions of brain sections (Todd et al., 2013) and, to a lesser extent, in other tissues (Buijsen et al., 2014). Recently, the direct interaction of CGGexp and polyG was shown to promote polyG aggregation and lead to neuronal dysfunction (Asamitsu et al., 2021). PolyG colocalizes with Hsp70 (Bonapace et al., 2019; Jin et al., 2003; Todd et al., 2013) and lamina‐associated polypeptide (LAP) 2β, eventually leading to abnormal nuclear envelope structure and cell death (Sellier et al., 2017). Together, these data suggest that multiple mechanisms may play a role in FXTAS and that inclusion formation can be triggered by both toxic RNA and RAN translation products. The composition of inclusions has been investigated in a few studies. Inclusions purified from FXTAS patients were shown to be composed of multiple proteins, including lamin and hnRNP A2 (Iwahashi et al., 2006).
Recent studies have identified over 200 proteins enriched within these inclusions in comparison to whole nuclei in FXTAS (Ma et al., 2019). No predominant protein was observed, but over half of the identified proteins were involved in RNA binding, protein turnover, and DNA damage repair. The most enriched proteins included small ubiquitin‐related modifier 2 (SUMO 2), ubiquitin, and p62/sequestosome‐1. Although the mechanism of inclusion formation is still not well understood, these results suggest that these inclusions may be a consequence of protein aggregation that exceeds the threshold of proteasomal degradation (Ma et al., 2019) and that aggregates may contain RNAexp. In accordance with the toxic RNA gain‐of‐function hypothesis, several proteins were found to bind CGGexp. Among them, the cytoplasmic form of hnRNP A2/B1 interacts directly with CGGexp (Jin et al., 2007; Sofola et al., 2007) and mediates the binding of CELF1 (Sofola et al., 2007). Importantly, overexpression of hnRNP A2/B1 or CELF1 suppressed CGGexp toxicity in Drosophila, suggesting their involvement in FXTAS pathogenesis (Jin et al., 2007; Sofola et al., 2007). In Drosophila, the overexpression of ALS‐associated TAR DNA‐binding protein 43 (TDP‐43) partially alleviated neurodegeneration, probably by preventing sequestration of the hnRNP A2/B1 protein homologs Hrb87F and Hrb98De on CGGexp and thus restoring their functions (He et al., 2014). Additionally, the transcriptional activator protein Pur‐alpha (Pur α) was found to bind CGGexp in vitro and was also observed in the inclusions of post‐mortem brain tissues from patients with FXTAS (Jin et al., 2007). The splicing regulator transformer‐2 protein homolog alpha (TRA2A) binds CGGexp in vitro and colocalizes with nuclear CGGexp foci and polyG in FXTAS inclusions. 
Moreover, in cells expressing (CGG)60, TRA2A‐dependent splicing of genes linked to mental retardation (i.e., ACTB) or intellectual disabilities (i.e., DOCK3) was impaired (Cid‐Samper et al., 2018). CGGexp also sequesters the double‐stranded RBP DGCR8 and its partner, ribonuclease type 3 (DROSHA). The enzymatic complex microprocessor, in which these two proteins play a crucial role, is involved in the processing of primary precursors of microRNA (pri‐miRNA), and when impaired in FXTAS, mature miRNA levels are reduced in neuronal cells, leading to neurodegeneration (Sellier et al., 2013). Moreover, colocalization studies showed that the DROSHA‐DGCR8 complex interacts with the splicing regulator KH domain‐containing, RNA‐binding, signal transduction‐associated protein 1 (KHDRBS1, alternative name SAM68), which leads to its aggregation in CGGexp foci and ultimately to the abnormal AS of several target mRNAs (Sellier et al., 2013). Importantly, overexpression of DGCR8, but not SAM68, rescued neuronal cell death induced by the expression of CGGexp in cultured mouse cortical neurons, suggesting that the sequestration of DGCR8 may play a role in the neuronal degeneration observed in FXTAS and that SAM68 does not directly bind CGGexp (Sellier et al., 2010, 2013).

C9orf72‐linked amyotrophic lateral sclerosis and frontotemporal dementia

The presence of nuclear sense G4C2 and antisense C4G2 RNA foci transcribed from G4C2exp in C9orf72, observed in the post‐mortem cerebral cortex and spinal cord tissue of C9‐ALS/FTD patients, significantly augments the complexity of proposed RNA‐mediated toxicity in this disease (DeJesus‐Hernandez et al., 2011). Nuclear foci containing the sense transcript are more abundant in cerebellar granule neurons, while antisense RNA‐containing foci are more common in Purkinje cells and motor neurons (Cooper‐Knock et al., 2015). Multiple RBPs, including Pur α (Xu et al., 2013), Nucleolin (Haeusler et al., 2014), and various hnRNPs (Cooper‐Knock et al., 2015; Haeusler et al., 2014; Y. B. Lee et al., 2013; Mori, Lammich, Mackenzie, Forné, Zilow, Kretzschmar, et al., 2013), double‐stranded RNA‐specific editase B2 (ADARB2; Donnelly et al., 2013), serine/arginine‐rich splicing factor 1 (SRSF1; Cooper‐Knock et al., 2014), THO complex subunit 4 (ALYREF; Cooper‐Knock et al., 2014), ZFP106 (Celona et al., 2017), Matrin‐3 (Ramesh et al., 2020), and essential paraspeckle proteins (Česnik et al., 2019), were shown to interact/colocalize with G4C2exp or C4G2exp foci, yet the colocalization of a candidate protein with RNA foci does not always mirror its impact on the disease phenotype. More detailed studies to assess the role of RNA‐mediated depletion/modification of the activity of these RBPs in C9‐ALS/FTD should be performed. Below, we briefly describe some of the candidate proteins and their significance regarding disease phenotype. In vitro studies showed that Nucleolin, hnRNP U, hnRNP F, and 60S ribosomal protein L7 (RPL7) bind (G4C2)4 RNA, while hnRNP K preferentially binds antisense (C4G2)4 RNA (Haeusler et al., 2014). 
Variable colocalization of Nucleolin with G4C2exp foci was further confirmed in motor cortex tissue and Purkinje neurons, suggesting that Nucleolin sequestration can impair Nucleolin homeostasis and result in nucleolar stress (Cooper‐Knock et al., 2015; Haeusler et al., 2014). In line with this, an increase in nucleolar volume was observed in frontal cortex neurons containing RNA foci, showing that nucleolar abnormalities are linked to RNA toxicity (Mizielinska et al., 2017). The interaction between G4C2exp and Pur α was initially shown in vitro and further confirmed in the mouse Neuro2A cell line and a Drosophila model carrying G4C2exp; Pur α inclusions were also observed in ALS/FTD patients with G4C2exp. Importantly, the overexpression of Pur α in both cell and Drosophila models alleviated the neurodegeneration mediated by G4C2exp (Xu et al., 2013). The colocalization of Pur α with G4C2 RNA foci was also shown in a zebrafish model of C9‐ALS/FTD, in which its overexpression prevented axonal abnormalities induced by toxic RNA (Swinnen et al., 2018). hnRNP H colocalizes with G4C2exp foci, changing its activity, as inclusion of TARBP2 intron 7, normally orchestrated by hnRNP H, was dysregulated in patient brain tissues. Y. B. Lee et al. (2013) suggested that sequestration of this protein may enhance the nuclear retention and aggregation of G4C2exp RNA, resulting in impaired RNA processing and toxicity. Additionally, hnRNP H1/F, ALYREF, and SRSF2 were shown to colocalize in sense RNA foci in cerebellar granule cells and motor neurons derived from C9‐ALS/FTD patients (Cooper‐Knock et al., 2014). Later, the same group reported colocalization of SRSF1, hnRNP A1, hnRNP H/F, ALYREF, and hnRNP K with antisense foci in cerebellar Purkinje neurons derived from C9‐ALS/FTD patients and confirmed the direct interaction of these proteins, with the exception of hnRNP K, with synthetic (C4G2)5 repeats by UV cross‐linking studies (Cooper‐Knock et al., 2015). 
The ADARB2 protein, an RNA editing regulator, was also shown to colocalize with G4C2exp foci in patient tissues (Donnelly et al., 2013). The siRNA‐mediated knockdown of ADARB2 decreased the number of RNA foci in C9‐ALS/FTD iPSNs, and the cells devoid of ADARB2 were more susceptible to the toxic effects of an excitatory neurotransmitter (glutamate) and showed increased cell death. The use of antisense oligonucleotides (ASOs) targeting G4C2exp reduced the nuclear retention of ADARB2 and rescued the glutamate susceptibility phenotype (Donnelly et al., 2013). The ZNF protein ZFP106 was recently reported to bind G4C2 repeats and other RBPs, including TDP‐43 and Fused in sarcoma (FUS), suggesting its potential role in C9‐ALS/FTD pathogenesis (Celona et al., 2017). In line with this, Zfp106‐knockout mice are characterized by motor neuron and muscle fiber degeneration and muscle atrophy, and this phenotype can be reversed by overexpression of ZFP106 in motor neurons. Additionally, overexpression of ZFP106 in a Drosophila C9‐ALS/FTD model suppressed neurotoxicity but did not reduce the expression of G4C2exp RNA, suggesting that this protein may participate in the toxic RNA gain‐of‐function mechanism of C9‐ALS/FTD (Celona et al., 2017). Colocalization of the punctate and diffuse forms of Matrin‐3 with G4C2exp foci was reported in C9‐ALS/FTD iPSNs and post‐mortem C9‐ALS/FTD motor cortex sections and validated by RNA pull‐down assays (Ramesh et al., 2020). The expression of Matrin‐3 in G4C2 Drosophila models improved eye degeneration and suppressed motor deficits but did not change G4C2 mRNA levels. Interestingly, truncation of the Matrin‐3 RBD did not fully inhibit binding to G4C2 RNA, suggesting that this interaction is partly direct but also mediated to some extent through other protein partners (Ramesh et al., 2020). Similar to that in FXTAS, protein sequestration in C9‐ALS/FTD is not limited to RNAexp foci.
RNAexp can be translated bidirectionally, resulting in the production of the DPRs poly‐GP, poly‐GA, poly‐GR, poly‐PR, and poly‐PA (Figure 1c; Ash et al., 2013; Gendron et al., 2013; Mori, Weng, Arzberger, May, Rentzsch, Kremmer, et al., 2013; Zu et al., 2013). The positively charged, hydrophilic poly‐GR repeats were found to form inclusions and colocalize with TDP‐43 in an RNA‐independent manner in motor neurons derived from C9‐ALS/FTD patients, which suggests a role for polyGR in TDP‐43 proteinopathy, a hallmark of the majority of ALS and FTD cases (Saberi et al., 2018). NCT proteins and nucleoporins are also partially colocalized with poly‐GR aggregates, suggesting that sequestration of these proteins can contribute to the cytoplasmic accumulation of TDP‐43 (Cook et al., 2020). Proteomic analysis of polyGA aggregates in HEK293 cells expressing polyGA‐GFP showed enrichment in the UNC119 protein, which is involved in the trafficking of lipidated cargo proteins (May et al., 2014). In summary, various proteins can be sequestered on RNAexp, and as a result, their physiological functions can be impaired in certain REDs (Table 1). Notably, some proteins do not interact with RNAexp directly but rather co‐aggregate with other protein partners during foci formation. To distinguish between toxic protein sequestration and colocalization, mechanistic studies showing impairment of candidate protein functions should be performed. Moreover, the role of the RNA gain‐of‐function model and protein sequestration in the pathogenesis of some REDs should be confirmed, as our current understanding is often based on either in vitro studies or simple disease models, with the exception of DM1. Confirmation of pathogenic protein sequestration in patient‐derived samples would be a valuable source of knowledge on this subject (Box 3).

Protein sequestration: Selective recruitment and immobilization of specific proteins on RNA molecules in membraneless compartments. Physiological sequestration is used in cellular processes to temporarily withdraw proteins from the available pool (e.g., nucleolar sequestration of MDM2 leads to stabilization of p53 in the nucleus, which leads to cell growth arrest or apoptosis). Toxic sequestration is a pathogenic mechanism in which detained proteins are titrated away from their biological targets, leading to irreversible paucity in specific compartments and inhibition of their physiological roles (e.g., sequestration of MBNL1 on CUGexp causes altered splicing of different pre‐mRNA targets, a hallmark of DM1 pathogenesis). Toxic sequestration is often targeted in RED therapies by using therapeutic agents that bind to toxic RNA and thus release sequestered proteins.

Colocalization: Spatial overlap of two probes, for example, protein and RNA, which can be detected by fluorescence microscopy. It should be distinguished from sequestration, as colocalization of protein and RNA does not necessarily lead to the impairment of protein activity.

NUCLEOCYTOPLASMIC TRANSPORT

Impairment of nucleocytoplasmic transport (NCT) is a common pathology in neurodegenerative diseases, including those caused by STR expansion, and a growing body of evidence indicates the great importance of this process in the development of neurodegeneration (extensively reviewed in H. J. Kim & Taylor, 2017). A number of reports have suggested the contribution of toxic proteins such as polyQ proteins or RAN translation proteins to the disruption of NCT (Boeynaems et al., 2016; Cook et al., 2020; Grima et al., 2017; Hayes et al., 2020; Jovičić et al., 2015; K. Y. Shi et al., 2017; Y. J. Zhang et al., 2016). However, some of the articles presented below support the involvement of toxic RNAexp due to the direct binding of proteins that participate in NCT (Figure 3a,b, Table 1).

FIGURE 3

Involvement of RNAexp in nucleocytoplasmic transport (NCT). (a) Impairment of NCT by RNA. Gradient of RanGDP/RanGTP proteins between nucleus and cytoplasm, supported by RanGAP1, enables proper NCT. Binding of G4C2exp to RanGAP1 leads to impaired import of nuclear proteins, exemplified by TDP‐43. (b) Export of RNA. Nuclear export adaptor SRSF1 binds to RNA with G4C2exp and C4G2exp and supports its transport to cytoplasm through NXF1‐dependent pathway. NXF1 and its cofactor NXT1 participate also in export of circular RNAexp (circRNAexp) derived from G4C2exp‐bearing intron lariat. (a,b) Arrow with a dotted line, change in place and/or in time; solid lines show induction or inhibition of certain processes.

RNA with expanded G4C2 repeats exerts toxicity in both the nucleus and cytoplasm; therefore, abnormalities in NCT may significantly contribute to the pathomechanism of C9‐ALS/FTD. Genetic screens of a Drosophila model of this disease revealed that the pathology caused by G4C2 is related to the impairment of NCT and that Ran GTPase‐activating protein (RanGAP, RanGAP1 in humans) is a potent suppressor of G4C2 repeat‐associated neurodegeneration (Freibaum et al., 2015; K. Zhang et al., 2015; Figure 3a). RanGAP1 participates in the conversion of cytoplasmic RanGTP to RanGDP, a process required to maintain the Ran protein gradient between the nucleus and cytoplasm, which is necessary for the correct operation of active transport through the nuclear pore complex (Bischoff et al., 1994; Görlich et al., 1996). Binding of RanGAP to the G‐quadruplex structure formed by RNA with G4C2 repeats led to disruption of the Ran protein gradient in C9‐ALS iPSNs and inhibition of protein import to the nucleus in both a Drosophila model and iPSNs (K. Zhang et al., 2015). This is illustrated by nuclear depletion of TDP‐43 over the course of ALS/FTD (Neumann et al., 2006; K. Zhang et al., 2015).
On the other hand, a recent report showed that an increased concentration of TDP‐43 in the cytoplasm may lead to the formation of liquid droplets, further causing mislocalization of RanGAP1 and other proteins engaged in NCT (Chou et al., 2018; Gasset‐Rosa et al., 2019). G4C2exp repeat‐related neurodegeneration and impairment of NCT were abolished by overexpression of RanGAP or the application of molecules that inhibit its interaction with the repeats (K. Zhang et al., 2015). Among the proteins that directly bind G4C2 repeat RNA are nuclear export adaptors, serine/arginine‐rich splicing factors (SRSFs; Cooper‐Knock et al., 2014; Figure 3b). Further studies showed that knockdown of SRSF1 significantly reduced repeat‐related neurodegeneration in a fly model of C9‐ALS/FTD (Hautbergue et al., 2017). In addition to its splicing regulatory functions, SRSF1 participates in the transport of mRNAs via a nuclear RNA export factor 1 (NXF1)‐dependent pathway (Y. Huang et al., 2003; Müller‐McNicoll et al., 2016). The proposed model assumes that SRSF1 directly binds RNA with expanded G4C2 or antisense C4G2 repeats and supports its export to the cytoplasm driven by NXF1, where it undergoes RAN translation to toxic DPRs (Hautbergue et al., 2017). Importantly, this process specifically concerns expansion‐bearing transcripts, as SRSF1 deficiency reduced only the nuclear export of C9orf72 mRNA that retained an intron with expanded repeats but did not influence the nuclear export of wild‐type C9orf72 mRNA in C9‐ALS iPSNs (Hautbergue et al., 2017). Recently, the NXF1–NXT1 pathway was also shown to regulate the nuclear export of circular RNA containing G4C2exp produced due to defective debranching of the spliced intron of C9orf72 mRNA (Wang et al., 2021). Here, the presence of G‐rich repeats influences the stability of the circular form and its export to the cytoplasm. Interestingly, such circular RNAexp undergoes RAN translation and supports DPR production (Wang et al., 2021). 
Although direct binding was not shown, a recent study suggests that RNA with G4C2exp contributes to the reduction of nuclear envelope pore membrane protein POM 121 (POM121; Coyne et al., 2020). The observed effect was RNAexp‐specific and independent of DPR translation or loss of the C9orf72 protein (Coyne et al., 2020). Deficiency of POM121 resulted in further decreases in a set of nucleoporins, leading to a significant change in nuclear pore complex composition, mislocalization of RanGTPase, and reduced survival of C9‐ALS iPSNs (Coyne et al., 2020).

PolyQ diseases

Similar to the interplay between SRSF1 and G4C2 observed in ALS, RNAs with expanded CAG repeats interact with another nuclear export adapter, splicing factor U2AF 65 kDa subunit (U2AF65; Tsoi et al., 2011). U2AF65 directly binds RNA with expanded but not normal CAG repeats and then NXF1, facilitating the export of RNAexp to the cytoplasm (Tsoi et al., 2011). Knockdown of the Drosophila ortholog of U2AF65 enhanced the nuclear accumulation of RNAexp and RNAexp‐mediated toxicity (Tsoi et al., 2011). In a mouse model of polyQ disease expressing RNA with CAGexp, symptomatic mice showed a significant reduction in U2AF65 in comparison to U2AF65 levels in asymptomatic mice, which confirmed the contribution of this protein to CAGexp‐related pathology (Tsoi et al., 2011). Interestingly, subsequent work suggested that nuclear retention of CAGexp‐bearing RNA results from its binding to MBNL1, leading to impairment of the interaction with U2AF65 and nuclear export by NXF1 (Sun et al., 2015). Some RBPs that contribute to NCT may also affect this process in FXTAS due to their binding to CGGexp RNA. A recent study presented the direct interaction of SRSF1 with these repeats (Malik, Tseng, et al., 2021). In line with the previously shown effects of the SRSF1/G4C2 interaction (Hautbergue et al., 2017), depletion of SRSF1 activity in a cellular model of FXTAS decreased RAN translation and led to an increased abundance of CGGexp‐bearing transcripts in the nucleus (Malik, Tseng, et al., 2021). SRSF1 knockdown rescued repeat‐related toxicity in both C9‐ALS/FTD and FXTAS Drosophila models (Hautbergue et al., 2017; Malik, Tseng, et al., 2021). Interestingly, articles reporting the pathological impact of RNA with G4C2 (Coyne et al., 2020) and CAG (Tsoi et al., 2011) repeats on NCT proteins showed that CGG‐related toxicity did not induce mislocalization of POM121 and was not modulated by U2AF65. 
DMPK transcripts containing very long expanded CUG repeats are almost fully retained in the nucleus (Brook et al., 1992; Davis et al., 1997). Nuclear export of such mRNAs is also modulated by RBPs. hnRNP H was shown to hinder this process (D. H. Kim et al., 2005). In contrast, another CUGexp‐binding protein, the double‐stranded RBP Staufen homolog 1 (Staufen1), was shown to rescue NCT of RNAexp in cellular models of DM1 (Ravel‐Chapuis et al., 2012). In DM2, overexpression of MBNL1 increased RNA foci formation and reduced the production of RAN proteins, which suggests its impact on the nuclear retention of CCUGexp (Zu et al., 2017). In summary, two faces of NCT pathology relate to the interplay between RNAexp and RBPs. The first concerns functional impairment of RBPs engaged in different steps of NCT due to their binding to RNAexp, as exemplified by RanGAP1 and POM121. The second involves the contribution of RBPs to toxicity exerted by different RNAexp molecules in either the nucleus (e.g., CUGexp) or cytoplasm (e.g., G4C2exp) by enabling their retention or transport to the cytoplasm (e.g., hnRNP H or SRSF1, respectively).

RAN TRANSLATION

Repeat‐associated non‐AUG (RAN) translation is a protein synthesis mechanism that, in contrast to canonical translation, does not require an AUG start codon for initiation (Zu et al., 2011). RAN translation was reported for many diseases associated with repeat expansions, such as HD (Bañez‐Coronel et al., 2015), C9‐ALS/FTD (Ash et al., 2013; Mori, Weng, Arzberger, May, Rentzsch, Van Broeckhoven, et al., 2013; Zu et al., 2013), FXTAS (Todd et al., 2013), DM2 (Zu et al., 2017), and others (Banez‐Coronel & Ranum, 2019; Zu et al., 2011). RAN proteins may be produced from sense or antisense transcripts and contain homopolymeric tracts comprised of a single amino acid, such as polyglycine (polyG) or polyalanine (polyA) tracts in the case of trinucleotide repeats (Todd et al., 2013), dipeptide repeats (DPRs), such as polyGA in the case of G4C2exp in C9‐ALS/FTD (Ash et al., 2013; Mori, Weng, Arzberger, May, Rentzsch, Kremmer, et al., 2013; Zu et al., 2013), or tetrapeptide repeats in DM2: polyLPAC and polyQAGR translated from CCUGexp and CAGGexp, respectively (Zu et al., 2017; Figure 1c). RAN products have toxic properties; they mostly tend to aggregate, creating nuclear or cytoplasmic inclusions (Mori, Weng, Arzberger, May, Rentzsch, Kremmer, et al., 2013; Sellier et al., 2017), and sequester other proteins, resulting in their functional depletion (Sellier et al., 2010, 2013). Insights describing the involvement of RAN translation products in the pathogenesis of particular REDs are beyond the scope of this review and were presented elsewhere (Banez‐Coronel & Ranum, 2019). It is worth noting that so far the deleterious properties of RAN peptides were mostly observed in experiments based on artificial cellular or animal models (e.g., Sellier et al., 2017; Sonobe et al., 2021; Todd et al., 2013).
RAN proteins have indeed been identified in several patients' samples (Ash et al., 2013; Bañez‐Coronel et al., 2015; Mori, Weng, Arzberger, May, Rentzsch, Van Broeckhoven, et al., ; Sellier et al., 2017; Todd et al., 2013; Zu et al., 2013, 2017); however, our knowledge about the toxicity of those endogenously expressed peptides is limited (Freibaum & Taylor, 2017). For example, current findings concerning the presence of polyG in FXTAS patients imply that RAN translation of CGGexp may occur at a relatively low level, and it is not well defined to what extent this process contributes to the pathogenesis of this disease (Ma et al., 2019; Salcedo‐Arellano et al., 2020). Additionally, RAN translation can occur when transcripts contain repeats in the normal range, indicating possible regulatory functions of this process under physiological conditions (C. M. Rodriguez et al., 2020). Although RAN translation was first described nearly a decade ago, mechanistic insights into this process remain elusive (Zu et al., 2011; Figure 4a–c). In the majority of cases, initiation of RAN translation follows the canonical cap‐dependent scanning model; however, it was shown that AUG‐independent translation initiates at less favored, so‐called near‐cognate codons (such as CUG, GUG, and ACG) located upstream of or within expanded, structured repeats (Green et al., 2017; Kearse et al., 2016; Tabet et al., 2018). RAN translation of DPRs from G4C2exp was shown to initiate with Met‐tRNA at a CUG start codon in a cap‐dependent manner (Green et al., 2017; Tabet et al., 2018). 
The mechanism of translation initiation from alternative start codons has not been fully elucidated; however, one of the proposed models of CGG RAN translation in FXTAS involves a mechanism in which the 43S preinitiation complex (PIC) scans through the 5′UTR of FMR1 mRNA and encounters steric hindrances such as hairpins and/or G‐quadruplexes formed by expanded repeats, which leads to ribosomal stalling and the initiation of translation at noncanonical codons, resulting in the production of RAN polypeptides (Kearse et al., 2016). Resolving structured RNA seems to be crucial for RAN translation, as it was shown that inhibition of the canonical DEAD‐box helicase eIF4A, which is needed for ribosome scanning, abolished both CGGexp‐ and G4C2exp‐dependent RAN translation (Green et al., 2017; Kearse et al., 2016). It was demonstrated that the costimulatory factors of eIF4A, namely eIF4B and eIF4H, also modulate RAN translation (Linsalata et al., 2019). In addition, in a C9‐ALS/FTD fly model, the Drosophila homologs of eIF4B and eIF4H were shown to modulate the production of polyGR, and eIF4H was found to be significantly downregulated in cerebellar tissue and fibroblasts obtained from patients harboring G4C2exp (Goodman, Prudencio, Srinivasan, et al., 2019). Recently, Drosophila Belle (a homolog of the human DEAD‐box helicase DDX3X) was found to selectively affect the production of RAN peptides (Linsalata et al., 2019). The authors reported that DDX3X binds the 5′UTR of FMR1 independently of its CGG repeats, excluding the possibility that repeat expansion determines the interaction between mRNA and the protein (Linsalata et al., 2019). Knockdown of Belle/DDX3X in an FXTAS fly model, in which expression of CGGexp was directed to the retina using the Gmr‐GAL4 driver and manifests as a severe so‐called rough‐eye phenotype, decreased the production of RAN peptides and ameliorated retinal degeneration, confirming the regulatory role of DDX3X in vivo. 
In contrast, DDX3X depletion upregulated G4C2exp‐specific RAN translation (W. Cheng et al., 2019). This study showed that DDX3X binds G4C2exp repeats and relaxes their secondary structure, thus impeding initiation of the RAN translation process (W. Cheng et al., 2019). This is not surprising, as it was previously shown that the yeast homolog of DDX3X, Ded1p, controls initiation from near‐cognate codons by binding and unwinding RNA secondary structures upstream of alternative codons within 5′UTRs (Guenther et al., 2018). Recently, another RNA helicase, the DEAH‐box protein DHX36, was identified as a new modulator of G4C2 and CGG RAN translation (H. Liu et al., 2021; Tseng et al., 2021; Figure 4b). The authors postulate that, similar to DDX3X, DHX36 unfolds RNA secondary structures formed within G4C2 repeats and thus modulates noncanonical protein synthesis (H. Liu et al., 2021; Tseng et al., 2021). All of the presented evidence indicates that resolving the RNAexp structure by RBPs is a crucial element of RAN translation regulation. Additionally, start codon fidelity seems to contribute substantially to the regulation of RAN translation: a recent screen identified two other translation initiation factors, eIF1 and eIF5, that influence the initiation of RAN protein synthesis in vitro (Linsalata et al., 2019), in line with previously published reports on the contribution of these proteins to start codon selection (Ivanov et al., 2010; Kozel et al., 2016; Loughran et al., 2012; Tang et al., 2017). Recent in vitro studies demonstrated that knockdown of eIF3F selectively downregulates synthesis of RAN peptides derived from expanded CAG repeats (related to SCA8) as well as G4C2 repeats (Ayhan et al., 2018). 
eIF3F is a non‐core component of the eIF3 complex, which is known to regulate canonical and internal ribosome entry site (IRES)‐dependent translation, but its role in RAN translation regulation is yet to be defined (Cate, 2017; Hashem et al., 2013). Further studies demonstrating the role of eIF3 in both canonical and RAN translation initiation on CGGexp indicated that the human oncoprotein eIF5‐mimic protein (5MP) displaces eIF5 through the eIF3c subunit within the PIC and thus affects both modes of translation by increasing the accuracy of translation initiation (Singh et al., 2021; Tang et al., 2017). In addition, it was shown that overexpression of the Drosophila 5MP homolog in FXTAS flies reduced RAN peptide‐mediated toxicity (Singh et al., 2021). Recently, Sonobe et al. (2021) developed a new C. elegans model of C9‐ALS/FTD. The study showed that loss‐of‐function mutations of eIF2D lead to reduced production of polyGA (and, to a lesser extent, polyGP), improving the locomotor function and lifespan of transgenic worms (Sonobe et al., 2021). The authors proposed that eIF2D is required for DPR production and perhaps acts at the initiating CUG codon for polyGA, delivering the Met‐tRNA to the P‐site of the 40S ribosomal subunit (Boivin et al., 2020; Dmitriev et al., 2010; Green et al., 2017; Sonobe et al., 2021; Figure 4a).
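
The initiation model discussed above, in which a scanning ribosome starts at a near‐cognate codon upstream of the repeats, implies that the position of that codon fixes the frame in which the repeat tract is subsequently read. The following minimal Python sketch illustrates this bookkeeping only. The function name `near_cognate_sites` and the 14‐nt leader sequence are hypothetical and were chosen for illustration; they do not correspond to the FMR1 or C9orf72 sequences, and the sketch ignores in‐frame stop codons, RNA structure, and codon context.

```python
# Toy illustration only: the leader sequence below is hypothetical, not a real 5'UTR.
NEAR_COGNATE = ("CUG", "GUG", "ACG")  # near-cognate codons named in the text


def near_cognate_sites(leader):
    """Return (position, codon, repeat_frame) for each near-cognate codon in `leader`.

    `leader` is the RNA upstream of the repeat tract; the repeat is assumed
    to start immediately after it. `repeat_frame` is the offset (0, 1 or 2)
    within the repeat tract at which the first codon lying fully inside the
    repeat begins, for a ribosome that initiated at `position`.
    """
    sites = []
    for pos in range(len(leader) - 2):
        codon = leader[pos:pos + 3]
        if codon in NEAR_COGNATE:
            repeat_frame = (pos - len(leader)) % 3
            sites.append((pos, codon, repeat_frame))
    return sites


toy_leader = "AAGCUGCCAACGUU"  # hypothetical 14-nt leader
for pos, codon, frame in near_cognate_sites(toy_leader):
    print(f"{codon} at position {pos} reads the repeat tract in frame {frame}")
```

In this toy leader, both candidate start sites happen to feed the same repeat frame; in a real transcript, different near‐cognate codons may feed different frames and thus different RAN products.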

FIGURE 4

Processes involving RBPs and RNAexp in cytoplasm. (a) The role of eIF2D in RAN translation initiation. Non‐canonical translation initiation factor eIF2D delivers Met‐tRNA to the P‐site of 40S ribosomal subunit at CUG codon contributing to RAN translation initiation (described for G4C2exp). A (aminoacyl) site, P (peptidyl) site, E (exit) site in the ribosome. (b) Elongation of RAN translation. DHX36 helicase unwinds G‐quadruplexes formed by RNAexp and thus facilitates ribosome processivity and production of toxic homopolymeric proteins or dipeptide repeat proteins (described for G4C2exp and CGGexp). (c) RAN translation initiation upon stress. Stress related to the presence of double‐stranded RNA (dsRNA) formed by RNAexp and RAN proteins activates PKR and PERK kinases, respectively, which catalyze phosphorylation of eIF2α. This, in turn, inhibits eIF2α‐P binding to Met‐tRNA and the formation of preinitiation complex (PIC) with 40S ribosome subunit. Under the stress, eIF2A may take over role of phosphorylated eIF2α‐P, bind Met‐tRNA and participate in translation initiation at near‐cognate start codons (described for G4C2exp and CCUGexp). (d) Stress response caused by RNA. RNAi pathway component, ribonuclease Dicer, cleaves dsRNA formed by RNAexp hairpin or bidirectionally transcribed CUGexp/CAGexp duplex into 21‐mer fragments. Such RNA fragments may next activate TLRs and trigger innate immune response. (e) Stress granules formation. Stress stimuli such as RNAexp and RAN proteins lead to phosphorylation of eIF2α, followed by global translation suppression and formation of stress granules (SG; described for CGGexp and G4C2exp). Moreover, G4C2exp may serve as a core component of SG and promote formation of membraneless organelles composed of different mRNAs and SG protein markers. Chronic stress may entail SG transition towards more solid‐like structures. (f) Phase separation of RNA. 
Homopolymeric RAN protein, polyG, binds to its own RNAexp, which promotes its phase transition from liquid droplets towards gel‐like aggregates (described for CGGexp). (a–c) Arrow with a dotted line, change in place and/or in time; solid lines show induction or inhibition of certain processes.

A genetic screen performed to identify RAN translation‐specific modifiers in a yeast model identified 40S ribosomal protein S25‐A (RPS25A), whose loss selectively inhibited RAN translation from G4C2exp (Yamada et al., 2019). Treatment with ASOs targeting RPS25 in C9‐ALS iPSNs reduced DPR production and significantly increased cell survival. Moreover, upon RNAi‐mediated silencing in C9‐ALS/FTD Drosophila, the efficiency of polyGP synthesis significantly decreased, and the lifespan of the adult male flies was extended. Additionally, in human cell lines, RPS25 silencing led to a decrease in RAN translation from other repeats, namely CAGexp associated with the HTT and ATXN2 genes (related to HD and spinocerebellar ataxia type 2 [SCA2], respectively). The exact function of RPS25 in translation remains to be determined; however, RPS25 was shown to play a crucial role in both cellular and viral noncanonical translation events, such as ribosomal shunting and IRES‐mediated translation (Fuchs et al., 2015; Hertz et al., 2013; Landry et al., 2009; Y. Shi et al., 2016). As RPS25 exhibits specificity toward stable RNA secondary structures, structured G4C2 repeats might attract this protein (Nishiyama et al., 2007). Interestingly, RPS25 is not required for cap‐dependent translation, which might explain why this factor plays an important role in G4C2 RAN translation of the C9orf72 transcript, which can be translated in an IRES‐like, cap‐independent manner (W. Cheng et al., 2018; Wang et al., 2021). Altogether, the presented reports imply that RBPs modulate RAN translation by contributing to multiple translation steps, such as PIC scanning, start codon selection, and unfolding of structured RNAexp. 
Recent findings provide multiple lines of evidence that particular RBPs can modulate RAN translation in various ways, sometimes with opposite effects depending on repeat type, length, and RNA sequence context (Table 1, Box 4).

BOX 4

Repeat‐associated non‐AUG (RAN) translation is a non‐canonical mode of protein synthesis initiated upstream of or within expanded repeats at so‐called near‐cognate start codons such as ACG, CUG, and GUG. RAN‐translated proteins may consist of a single repeated amino acid, such as polyglycine (FXTAS), or of repeated dipeptides (DPRs; C9‐ALS/FTD). These RAN peptides tend to aggregate, interact with both RNAs and other proteins, and thus form toxic intracellular inclusions. The mechanism of RAN translation is yet to be established; however, it is known that processes such as ribosomal scanning, start codon fidelity, RNA secondary structure unwinding, and cellular stress contribute to RAN translation regulation. RAN translation might also be perceived as a combination of RNA and protein gain‐of‐function phenomena, as structured RNAexp acts as a template for the production of toxic RAN peptides, affects the efficiency of RAN translation, and may contribute to the formation of aggregates with RAN products.
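
The relationship between repeat unit and RAN product described in this section follows directly from reading the repeat tract in each of the three reading frames. The short Python sketch below applies the standard genetic code to a pure repeat tract; the function name `repeat_peptides` and the default of 12 repeat units are illustrative choices, and the sketch deliberately ignores start codon selection, flanking sequence, and stop codons outside the repeat.

```python
# Standard genetic code, built from the conventional U/C/A/G codon ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO_ACIDS
    )
}


def repeat_peptides(unit, n_units=12):
    """Translate a pure repeat tract (unit * n_units) in all three frames.

    Returns a dict mapping frame (0, 1, 2) to the encoded peptide string.
    """
    rna = unit * n_units
    peptides = {}
    for frame in range(3):
        codons = (rna[i:i + 3] for i in range(frame, len(rna) - 2, 3))
        peptides[frame] = "".join(CODON_TABLE[codon] for codon in codons)
    return peptides


# Sense-strand repeat units mentioned in the text, plus the DM2 antisense unit.
for unit in ("CGG", "GGGGCC", "CCUG", "CAGG"):
    for frame, peptide in repeat_peptides(unit).items():
        print(f"({unit})n frame {frame}: {peptide[:12]}...")
```

Under these assumptions, a trinucleotide repeat yields a different homopolymer in each frame, a hexanucleotide repeat yields a different dipeptide repeat in each frame, and a tetranucleotide repeat yields the same tetrapeptide repeat in every frame, which matches the polyG/polyA, polyGA/polyGP/polyGR, and polyLPAC/polyQAGR products named above.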

IMMUNE RESPONSE TO RNA AND PRODUCTS OF RAN TRANSLATION

Growing evidence suggests that RNAexp, double‐stranded RNA (dsRNA; e.g., products of bidirectional transcription at the locus of expanded repeats), or RAN proteins themselves can induce an innate immune response. They can act as danger‐associated molecular patterns (DAMPs) and are recognized by pattern recognition receptors (PRRs), including Toll‐like receptors (TLRs) or double‐stranded RNA‐activated protein kinase (PKR), in various types of brain cells. These events might correlate with pathological innate immune system activation, induction of the stress response, and inflammation. The presence of structured RNAs containing different types of repeats, for example, CUGexp, CAGexp, or G4C2exp, and of RAN proteins, for example, polyG or polyGP, was postulated to activate the PKR and PKR‐like endoplasmic reticulum kinase (PERK) pathways, respectively (Figure 4c; W. Cheng et al., 2018; Green et al., 2017; Tian et al., 2000; Zu et al., 2020). PKR activation leads to the phosphorylation of eIF2α, resulting in global translation shutdown but, simultaneously, selective upregulation of RAN translation. Expression of expanded CGG and G4C2 repeats in primary rat cortical neurons triggered eIF2α‐phosphorylation‐dependent stress granule formation. Activation of the integrated stress response enhanced RAN translation and halted global translation, contributing to neurodegeneration in a feed‐forward loop (Green et al., 2017). Cytoplasmic dsRNA derived from G4C2 expansions was detected in the brains of C9‐ALS/FTD patients, and when expressed in a mouse model it triggered type‐I interferon signaling and neuronal death (S. Rodriguez et al., 2021). Several strategies to inhibit eIF2α phosphorylation and thus decrease the accumulation of RAN proteins were successfully applied in both in vitro and in vivo models (W. Cheng et al., 2018; Green et al., 2017; Westergard et al., 2019; Zu et al., 2020). 
When eIF2α activity is abolished, an alternative factor to eIF2, eIF2A, can participate in translation regulation (Komar & Merrick, 2020). Upon eIF2A knockout, synthesis of polyGA from G4C2exp was also shown to be decreased (Sonobe et al., 2018). In addition, eIF2A can also increase the production of RAN proteins translated from both CCUG and CAGG repeats in cellular DM2 models (Tusi et al., 2021). Interestingly, although CGGexp forms hairpins/G‐quadruplexes, it does not activate PKR in vitro or in vivo and is inefficiently cleaved in vitro by Dicer, a major component of the RNAi machinery (Handa et al., 2003). On the other hand, in patient‐derived cells with mutant DMPK or mutant HTT, CUGexp and CAGexp are cleaved by ribonuclease Dicer into short ~21 nt RNA fragments (Krol et al., 2007; Figure 4d). In a Drosophila model, transiently expressed transcripts containing CUGexp and CAGexp were cleaved by Dicer‐2 and loaded onto the Ago2‐RISC complex (Yu et al., 2011). This finding was confirmed in another fly model, where dsRNA of CAG/CUG~100 repeats was produced by endogenous bidirectional transcription and was shown to be processed by Dicer‐2 into CAG7 21‐mers (Lawlor et al., 2011). The primary activity of Dicer is the cleavage of dsRNAs in the RNAi pathway; however, its role in antiviral activity was also reported recently (Montavon et al., 2021). Dicer requires several protein partners to cleave dsRNA in the RNAi pathway, and recent studies revealed that other key components of the RNAi pathway, namely R2D2 and loquacious, are not required for CAG/CUG~100 dsRNA toxicity in Drosophila, in contrast to Dicer and Argonaute (van Eyk et al., 2019). This suggests that Dicer and Argonaute are implicated in an antiviral, RNAexp‐mediated cell death pathway rather than in the RNAi pathway. Adenosine deaminase acting on RNA 1 (ADAR1) is another important player in antiviral immunity, as it edits dsRNA to distinguish viral double‐stranded nucleic acid (non‐self) from endogenous dsRNA (e.g., Alu repeats). 
Depending on the number of repeats, RNAexp can be perceived as “non‐self” and thus trigger an autoinflammatory response. Co‐expression of human hADAR1c or hADAR1i with CAG/CUG~100 in Drosophila rescued the inflammatory phenotype (van Eyk et al., 2019). Whole‐transcriptome analysis of Drosophila expressing CAG/CUG~100 repeats revealed alterations in several pathways of innate immunity (Samaraweera et al., 2013). Upon knockdown of TLRs in Drosophila, the toxicity of CAG/CUG~100 repeats decreased, whereas the opposite effect was observed upon knockdown of autophagy‐related genes. This suggests that activation of the Toll pathway in the presence of dsRNA contributes to neurodegeneration, and that autophagy is required for clearance of pathogenic dsRNA (Samaraweera et al., 2013). Together, many reports show a link between innate immunity, toxic dsRNA, and RAN proteins. These aberrant molecules may induce cellular stress in a feed‐forward loop, leading to global protein translation shutdown but an increase in RAN translation, thus contributing to disease progression. Nevertheless, further studies are needed to better understand the involvement of the immune system in REDs.

PHASE SEPARATION

In recent years, efforts to elucidate the underlying causes of neurodegenerative diseases have turned to the phenomenon of phase separation and its contribution to the formation of protein aggregates and RNA foci (comprehensively discussed in excellent reviews; Alberti & Dormann, 2019; Banani et al., 2017; Hyman et al., 2014; Nedelsky & Taylor, 2019; Polymenidou, 2018; Wolozin & Ivanov, 2019). Since RNA molecules and RBPs are involved in phase separation, alterations in the physiology of this process seem to be particularly relevant for REDs. Briefly, liquid–liquid phase separation (LLPS) is the process in which two fluids demix and one form droplets within the other, as in a mixture of oil and vinegar. In living cells, macromolecules separate from their surroundings to perform specific functions and create assemblies called membraneless organelles (MLOs). Among MLOs in the nucleus are the nucleolus, paraspeckles, and Cajal bodies, whereas MLOs in the cytoplasm are exemplified by processing bodies (P‐bodies), transport granules, or stress granules (SG). An important feature of these structures is their flexible response to changing conditions. The transition of liquid droplets to more solid‐like foci influences the dynamics of their assembly and disassembly. This mechanism, widely investigated in SG, is now considered a source of the toxic foci/aggregates observed in neurodegenerative disorders, including REDs. Proteins predisposed to undergo LLPS are characterized by multivalency and thus the ability to interact with multiple other proteins or RNA molecules or the presence of intrinsically disordered regions, which are fragments with low sequence complexity that enable noncovalent interactions with other macromolecules (reviewed in Shin & Brangwynne, 2017). Many RBPs possess these features and thus tend to undergo LLPS. 
The contribution of some RBPs and their mutant forms to aberrant phase separation in neurodegenerative disorders has been widely investigated (Bakthavachalu et al., 2018; Bolognesi et al., 2019; Pakravan et al., 2021; Patel et al., 2015). Similar to proteins, RNA molecules can undergo phase separation. It was shown that different RNAexp molecules (RNAs bearing CUG, CAG, and G4C2 repeats), which are multivalent by nature, form droplets in isolation in vitro (Jain & Vale, 2017). This phenomenon occurs when the number of repeats exceeds a threshold value similar to that which is pathogenic in particular diseases (Jain & Vale, 2017). Consistent with these observations, RNAexp molecules accumulate in nuclear foci in different REDs. Interestingly, RNAexp droplets in vitro are characterized as a gel because of their solid‐like features (Jain & Vale, 2017). In the cellular environment, the formation of MLOs depends on the interactions of proteins and RNAs (Langdon et al., 2018; Maharana et al., 2018; Van Treeck et al., 2018). As stated above, a number of RBPs, including those associated with LLPS, are bound to different RNAexp (e.g., TDP‐43, FUS, hnRNPs). Surprisingly, however, the interplay between RNAexp and RBPs in the context of aberrant phase separation in REDs has barely been explored. The aforementioned work (Jain & Vale, 2017) concerning the phase separation of RNAexp in vitro also examined RNAexp droplets in living cells. Induction of the expression of RNA with CAGexp, but not with a normal CAG repeat number, resulted in the formation of liquid‐like foci that colocalized with serine/arginine‐rich splicing factor 2 (SRSF2, also known as SC‐35), a marker of nuclear speckles, MLOs involved in splicing regulation (Mintz et al., 1999). Thus, the interaction of CAGexp RNA with nuclear speckle components leads to their retention in the nucleus (Jain & Vale, 2017). 
RNA with G4C2 repeats, but not antisense RNA with C4G2 repeats, formed nuclear foci in a length‐dependent manner in vitro and in living cells (Jain & Vale, 2017). The observed difference may be associated with the secondary structures of those molecules, as the presence of G‐quadruplexes promotes phase separation (Fay et al., 2017; Jain & Vale, 2017; Y. Zhang et al., 2020, Zhang et al., 2019). Moreover, foci driven by the expression of G4C2exp RNA are less dynamic than CAGexp RNA foci but are similarly retained in nuclei and colocalize with nuclear speckles (Jain & Vale, 2017; Y. B. Lee et al., 2013). RNA bearing G4C2 repeats was also shown to drive the formation of SG in a length‐dependent manner both in vitro and in cellulo (Fay et al., 2017; Rossi et al., 2015; Figure 4e). RBPs specific for these MLOs were identified in foci assembled in vitro with the use of G4C2 RNA and lysates derived from different cell lines or the mouse brain (Fay et al., 2017). Condensed SG markers included Ras GTPase‐activating protein‐binding protein 1 (G3BP1), Caprin‐1, ubiquitin carboxyl‐terminal hydrolase 10 (USP10), eukaryotic translation initiation factor 3 subunit B (eIF3b), ELAV‐like protein 1 (ELAVL1), and nucleolysin TIAR (Fay et al., 2017). This process was promoted by the G‐quadruplex structure of G4C2 repeats (Fay et al., 2017). RNAexp bearing CAG and G4C2 repeats may also be components of mRNA transport granules in neurons, as they undergo active transport in neurites (Burguete et al., 2015). G4C2exp RNA molecules interfere with the functions of these MLOs, possibly due to interaction with their RBP components, such as FMRP, leading to defects in neuritic branching (Burguete et al., 2015). Interesting data concerning the interaction of RBPs and RNAexp in the context of aberrant LLPS come from studies of FXTAS‐related CGGexp RNA repeats. As for G4C2exp, overexpression of CGGexp induced the formation of SG positive for the markers G3BP1 and FMRP (Green et al., 2017). 
Computational analyses revealed that FMR1 mRNA carrying CGGexp is a candidate scaffolding RNA component of RNP MLOs (RNP granules) exemplified by SG or P‐bodies (Cid‐Samper et al., 2018). Such scaffolding RNAs are characterized by the presence of secondary structure within UTRs and by a large number of interactions with MLO‐related RBPs, among other characteristics (Cid‐Samper et al., 2018). Indeed, the results of computational analysis, which were further validated using in vitro assays, showed that CGGexp molecules interact with a considerable number of RBPs (Cid‐Samper et al., 2018). A significant portion of the interactors are RBPs, which are predicted to form MLOs (Cid‐Samper et al., 2018). Considering its strong CGGexp‐binding and granule‐forming propensities, the splicing regulator TRA2A was determined to be a potential player in CGGexp‐related toxicity (Cid‐Samper et al., 2018). Its colocalization with polyG‐positive nuclear inclusions was then reported in the brains of an FXTAS mouse model and patients (Cid‐Samper et al., 2018). Surprisingly, polyG, the product of RAN translation of CGGexp embedded in FMR1 mRNA, was recently described as a protein that binds its own encoding mRNA, preferentially the expanded CGG repeats (Asamitsu et al., 2021; Figure 4f). It was shown that both RNA with CGGexp and polyG undergo phase separation in vitro (Asamitsu et al., 2021). Furthermore, the addition of CGGexp RNA to polyG enhanced the phase transition, leading to the formation of gel‐like aggregates in vitro and possibly also in cells (Asamitsu et al., 2021). Taking the presented reports into consideration, one may conclude that the scaffolding properties of CGGexp‐bearing RNA and possibly other RNAexp molecules lead to the formation of droplets containing RBPs and RAN translation products, which then undergo a transition to solid aggregates, the hallmark of neurodegenerative disorders. 
It is worth noting that DPRs, products of G4C2 repeat RAN translation, were shown to contribute to aberrant phase transition and disruption of MLO dynamics and functions (Boeynaems et al., 2017; K. H. Lee et al., 2016). In summary, the RNAexp molecules form foci by themselves and are an attractive platform for interaction with RBPs. The appearance of such unusual molecules changes cell flexibility and modulates MLO assembly and disassembly (Table 1). The presence of different types of inclusions in REDs may be a consequence of this impairment, followed by the liquid‐to‐solid transition of MLOs.

TARGETING RBPs AS POTENTIAL THERAPEUTIC STRATEGIES

For the majority of REDs, targeting RNAexp molecules with oligonucleotide therapeutics or small compounds is a major strategy to combat repeat‐associated toxicity (Scoles & Pulst, 2018). In particular, ASOs were shown to successfully target mutated transcripts harboring expanded repeats and alleviate RNAexp‐mediated toxicity related to HD (Lane et al., 2018; Tabrizi et al., 2019), FXTAS (Derbis et al., 2021), C9‐ALS/FTD (Donnelly et al., 2013; Lagier‐Tourenne et al., 2013; Sareen et al., 2013), DM1 (Jauvin et al., 2017; Wheeler et al., 2009; Wojtkowiak‐Szlachcic et al., 2015), and in in vivo models of other REDs (Scoles & Pulst, 2018). Such an approach seems very promising, as some ASO‐based drugs, such as nusinersen and eteplirsen, which target specific RBP‐binding sites, have already been approved by the Food and Drug Administration (FDA) and contribute to the treatment of neurodegenerative diseases (Goodkey et al., 2018; Lim et al., 2017). Another approach to tackle RNAexp‐related toxicity is the use of small molecules targeting the structural motifs of different RNAexp (Angelbello et al., 2019, 2021; Khan et al., 2019). Current findings concerning this approach are detailed elsewhere (Angelbello et al., 2020; Crunkhorn, 2021); therefore, in this review, we focus on strategies targeting RBPs that contribute to the pathogenesis of REDs. One therapeutic strategy for DM1 is to restore the activity of MBNL proteins, which are sequestered on CUGexp or CCUGexp, to its physiological level, which seems to be a successful approach to correct AS defects and improve the overall phenotype (Chamberlain & Ranum, 2012; Kanadia et al., 2006; Wheeler et al., 2009). Flow cytometry‐based genetic screening allowed the identification of two histone deacetylase (HDAC) inhibitors, ISOX and vorinostat, which increased MBNL1 levels and resulted in the partial correction of alternative splicing in DM1 patient‐derived fibroblasts (F. Zhang et al., 2017). 
A recent screening revealed that the inhibition of cyclooxygenase 1 (COX‐1) by nonsteroidal anti‐inflammatory drugs (NSAIDs) caused demethylation of the MeR2 enhancer and restored MBNL1 mRNA levels in myogenic cells (K. Huang et al., 2020). Another NSAID, phenylbutazone (PBZ) was shown to suppress methylation of an enhancer region in Mbnl1 intron 1 and successfully upregulate MBNL1 levels in cells and a DM1 mouse model (G. Chen et al., 2016). Moreover, PBZ treatment reduced the recruitment of MBNL1 to CUGexp foci in cellulo, ameliorated splicing, mitigated muscle pathology, and improved scores on mouse physical testing (G. Chen et al., 2016). Such results prove the potential value of NSAIDs as potent therapeutic candidates for DM1, especially because those drugs are already a part of the treatment regimens for other neurodegenerative diseases (G. Chen et al., 2016; H. Chen et al., 2005; Hirohata et al., 2005; Uaesoontrachoon et al., 2014). Another strategy to increase MBNL1 levels is the use of antagomiRs to silence miR‐23b and miR‐218, which regulate MBNL1 levels in muscle (Cerro‐Herreros et al., 2018). MBNL1 derepression by antagomiRs rescued missplicing of muscle transcripts and improved functional phenotypes in DM1 mice (Cerro‐Herreros et al., 2018, 2020). Targeting pathologically increased autophagy in DM1 is another therapeutic strategy, as treatment with an anti‐autophagic drug, chloroquine, in human‐derived myoblasts, Drosophila model and DM1 mouse elevated the level of MBNL1, corrected splicing, reduced muscle degeneration, extended the lifespan of the flies, and finally improved the functionality and histopathology of murine muscle tissues (Bargiela et al., 2015, 2019). 
A correlation between low miR‐7 levels and increased autophagy was observed in biopsies from DM1 patients; thus, antagomiR‐7 treatment was applied, which indeed improved myotube diameter and the fusion capacity in differentiating DM1 myoblasts; however, the effect turned out to be independent of MBNL1, suggesting that miR‐7 acts downstream or alongside MBNL1 in the pathogenesis of DM1 (Sabater‐Arcis et al., 2020). One of the pathological phenotypic hallmarks of C9‐ALS/FTD is the presence of an excessive amount of TDP‐43 in SG in the patient brain, which causes toxicity in neurons (Barmada et al., 2010). As stress conditions elicit the phosphorylation of eIF2α regulated by PERK, the small molecule GSK2606414, an inhibitor of PERK, was shown to successfully mitigate TDP‐43‐dependent toxicity, improve mobility in C9‐ALS/FTD flies, and increase the survival rate of neurons (H. J. Kim et al., 2014; Wek et al., 2006). Small‐molecule screening focused on the modulation of SG formation revealed that molecules with planar moieties, such as mitoxantrone, can decrease the accumulation of TDP‐43 and other RBPs in SG (Fang et al., 2019). Another small molecule, nTRD22, was shown to bind the N‐terminus of TDP‐43, affecting RNA‐binding properties and the degradation of TDP‐43 in vitro and in vivo (Mollasalehi et al., 2020). A study conducted in a C9‐ALS/FTD mouse model indicated that inhibiting the stress‐triggered PKR pathway by metformin treatment reduced the level of RAN translation. Metformin prevented the excessive accumulation of RAN peptides and improved the mouse phenotype (Zu et al., 2020). Metformin is already approved by the FDA and serves as a well‐established treatment for type 2 diabetes (Sanchez‐Rangel & Inzucchi, 2017) and is part of the treatment regimens for DM1 (Bassez et al., 2018), FXS (Dy et al., 2018), and HD (Hervás et al., 2017). 
Recent findings demonstrate that targeting another kinase, SRSF protein kinase 1 (SRPK1), which phosphorylates SRSF1, with the compounds SRPIN340 and SPHINX31 significantly suppressed CGGexp and G4C2exp RAN translation in vitro, and that depletion of the Drosophila SRPK1 homolog reduced repeat‐associated toxicity and improved the rough‐eye phenotype in flies expressing both types of repeats (Malik, Tseng, et al., 2021). In summary, the presented findings underline the value of MBNL1 rescue and regulation of autophagy in DM1 therapy by selected drugs, paving the way for further clinical validation. Targeting kinases related to stress response pathways seems encouraging, as the presence of structured RNAexp molecules corresponds with elevated cellular stress, which is common among many REDs. However, although some already approved drugs might be applied to combat neurodegenerative diseases, new approaches to enrich existing therapies are still needed.

CONCLUSION

RNAexp acts as a very attractive platform that supports interactions with a number of RBPs; thus, wherever and whenever RNAexp molecules appear, they distract proteins from performing their functions. To exert toxicity and drive the development of REDs, RNAexp molecules require their “partners in crime” at various stages of their life cycle. At the transcriptional level, Pol II is required for the transcription of GC‐rich repeat sequences. On the one hand, RNAexp impairs Pol II initiation and elongation; on the other hand, the DSIF complex and PAF1C promote Pol II transcription at RNAexp sites and may be potential therapeutic targets for REDs. NCT is impaired in REDs, which leads to an imbalance in homeostasis between the nucleus and cytoplasm. Some evidence indicates that RNAexp molecules bind RBPs involved in NCT, disturbing their functions. On the other hand, the RNAexp/RBP interplay may influence the cellular localization of RNAexp molecules and help them exert toxicity in specific compartments. Thus, it is important to consider potential therapeutic strategies targeting RNAexp or interacting RBPs in the context of pathogenic effects triggered by these molecules in different cellular compartments. Multiple proteins can be sequestered on RNAexp via direct interaction or coaggregation with protein partners during nuclear foci formation, leading to deterioration of their physiological functions. This impairment can often be observed as symptoms of a disease, for example, in DM1, in which the sequestration of MBNL proteins on CUGexp leads to aberrant AS and contributes to the disease phenotype in a CUGexp load‐dependent manner. Importantly, foci formation by RNAexp and sequestered proteins probably changes cell flexibility and impairs LLPS. More studies are needed to decipher the relationship between RNAexp and RBPs that are MLO components, and their contribution to the liquid‐to‐solid transition leading to inclusion formation. 
Moreover, recent findings concerning the influence of CGGexp on the phase transition of its RAN translation product, polyG, have opened up a new and exciting research field related to the convergence of RNA gain‐of‐function and RAN translation/protein gain‐of‐function mechanisms. Toxic RAN peptides derived from non‐AUG‐initiated translation significantly contribute to the pathogenesis of many REDs. Recent findings provide multiple lines of evidence that RBPs play diverse roles in modulating RAN translation, ranging from ribosome positioning and the unwinding of structured RNAexp to the response to stress conditions. The significance of these RBPs makes them promising therapeutic targets, especially because their modulatory properties have been confirmed by in vivo experiments. Over the last decade, many studies have been devoted to rescuing proteins trapped in RNA foci and to modulating proteins involved in the stress response. Emerging studies suggest that RNAexp and RAN proteins can be involved in the activation of the innate immune system. Inflammation can account for pathogenesis in many REDs and may be a possible target for therapeutic intervention. Much attention has been given to testing approved drugs in the context of potential RED therapy; however, recent discoveries have pointed out a number of additional potential targets, whose exploitation will require further medical advancements. Additionally, more research on the development of biomarkers, such as RBPs involved in REDs, is desired to enable accurate diagnosis and monitoring of disease progression and the effectiveness of a given treatment.
3′/5′ untranslated regions; alternative splicing; antisense oligonucleotide; C9orf72‐linked amyotrophic lateral sclerosis and frontotemporal dementia; expanded CAG repeats; expanded CCUG repeats; expanded CGG repeats; expanded CTG repeats; expanded CUG repeats; myotonic dystrophy type 1; myotonic dystrophy type 2; dipeptide‐repeat proteins; Food and Drug Administration; Friedreich's ataxia; frataxin; fragile X syndrome; fragile X‐associated tremor/ataxia syndrome; expanded GGGGCC repeats; Huntington's disease; heterogeneous nuclear ribonucleoproteins; huntingtin; induced pluripotent stem cells; induced pluripotent stem cell‐derived neurons; intron retention; K homology domain; liquid–liquid phase separation; muscleblind‐like proteins; membraneless organelles; nucleocytoplasmic transport; repeat expansion disorders; nonsteroidal anti‐inflammatory drugs; PAF1 complex; processing bodies; phenylbutazone; protein kinase R‐like ER kinase; 43S preinitiation complex; protein kinase R; RNA polymerase II; polyglycine; polyglycine‐alanine; polyglycine‐proline; polyglutamine; repeat‐associated non‐AUG translation; RNA binding domain; RNA binding proteins; RNA containing expanded repeats; ribonucleoprotein complexes; RNA recognition motif; spinocerebellar ataxia; stress granules; short tandem repeats

CONFLICT OF INTEREST

All authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS

Anna Baud: Conceptualization (equal); funding acquisition (equal); writing – original draft (equal). Magdalena Derbis: Conceptualization (equal); visualization (equal); writing – original draft (equal). Katarzyna Tutak: Conceptualization (equal); writing – original draft (equal). Krzysztof Sobczak: Conceptualization (equal); funding acquisition (equal); supervision (equal); writing – review and editing (equal).
