Bruno Almeida, Sara Fernandes, Isabel A Abreu, Sandra Macedo-Ribeiro.
Abstract
Trinucleotide repeat (TNR) expansions are present in a wide range of genes implicated in several neurological disorders and contribute directly to the molecular mechanisms underlying pathogenesis by modulating gene expression and/or the function of the encoded RNA or protein. Structural and functional information on the role of TNR sequences in RNA and protein is crucial for understanding the effect of TNR expansions in neurodegeneration. This review therefore provides a structural and functional view of TNR and encoded homopeptide expansions, with particular emphasis on polyQ expansions and their role in inducing the self-assembly, aggregation, and functional alteration of the carrier protein, which culminate in neuronal toxicity and cell death. Particular attention is given to ataxin-3, the polyQ-containing protein that causes Machado-Joseph disease, providing clues to the impact of polyQ expansion and its flanking regions on ataxin-3 molecular interactions, function, and aggregation.
Keywords: amino acid-repeats; amyloid; microsatellites; protein aggregation; protein complexes; protein structure
Year: 2013 PMID: 23801983 PMCID: PMC3687200 DOI: 10.3389/fneur.2013.00076
Source DB: PubMed Journal: Front Neurol ISSN: 1664-2295 Impact factor: 4.003
Human diseases associated with nucleotide repeat expansions (adapted from Messaed and Rouleau).
| Disease name | Repeat type | Repeat location | Gene product (UniProt identifier, number of residues) | Biological process | Normal repeat length | Disease repeat length | Protein structure determined? |
|---|---|---|---|---|---|---|---|
| Spinal and bulbar muscular atrophy (SBMA) | CAG | Protein coding region (polyQ) | Androgen receptor (P10275, 919 residues) | Transcription, transcription regulation | 9–36 | 38–62 | Residues 20–30 and 671–919 (PDB code 1xow) |
| Huntington’s disease (HD) | CAG | Protein coding region (polyQ) | Huntingtin (P42858, 3142 residues) | Apoptosis | 6–34 | 36–121 | Residues 5–18 (3lrh), Residues 1–17 (2ld0, 2ld2), Residues 1–64 (3io4, 3io6, 3ior, 3iot, 3iou, 3iov, 3iow) |
| Dentatorubral-pallidoluysian atrophy (DRPLA) | CAG | Protein coding region (polyQ) | atrophin 1 (P54259, 1190 residues) | Transcription, transcription regulation | 7–34 | 49–88 | No structural information |
| Spinocerebellar ataxia 1 (SCA1) | CAG | Protein coding region (polyQ) | ataxin 1 (P54253, 815 residues) | Transcription, transcription regulation | 6–39 | 40–82 | Residues 563–693 (1oa8) |
| Spinocerebellar ataxia 2 (SCA2) | CAG | Protein coding region (polyQ) | ataxin 2 (Q99700, 1313 residues) | No associated GO keywords for biological process | 15–24 | 32–200 | Residues 912–928 (3ktr) |
| Spinocerebellar ataxia 3 (SCA3) | CAG | Protein coding region (polyQ) | ataxin 3 (P54252, 364 residues) | Transcription, transcription regulation, Ubl conjugation pathway | 10–51 | 55–87 | Residues 1–182 (1yzb), Residues 222–263 (2klz) |
| Spinocerebellar ataxia 6 (SCA6) | CAG | Protein coding region (polyQ) | CACNA1A, P/Q-type α1A calcium channel subunit (O00555, 2505 residues) | Calcium transport, ion transport, transport | 4–20 | 20–29 | Residues 1955–1975 (3bxk) |
| Spinocerebellar ataxia 7 (SCA7) | CAG | Protein coding region (polyQ) | ataxin 7 (O15265, 892 residues) | Transcription, transcription regulation | 4–35 | 37–306 | Residues 330–401 (2kkr) |
| Spinocerebellar ataxia 17 (SCA17) | CAG | Protein coding region (polyQ) | TATA box binding protein (TBP) (P20226, 339 residues) | Transcription, transcription regulation, Host-virus interaction | 25–42 | 47–63 | Residues 159–337 (1cdw, 1c9b, 1jfi, 1nvp, 1tgh) |
| Multiple skeletal dysplasias (COMP) | GAC | Protein coding region (polyaspartate) | cartilage oligomeric matrix protein (a.k.a. Thrombospondin-5) (P49747, 757 residues) | Apoptosis, cell adhesion | 5 | 4, 6, 7 | Residues 225–757 (3fby) |
| Synpolydactyly (HOXD13) | GCG | Protein coding region (polyA) | homeobox D13 (P35453, 343 residues) | Transcription, transcription regulation | 15 | 22–29 | No structural information |
| Oculopharyngeal Muscular Dystrophy (OPMD) | GCG | Protein coding region (polyA) | Polyadenylate-binding protein 2 (Q86U42, 306 residues) | mRNA processing | 10 | 12–17 | Residues 167–254 (3b4d, 3b4m, 3ucg) |
| Cleidocranial dysplasia (CBFA1) | GCG | Protein coding region (polyA) | Runt-related transcription factor 2 (Q13950, 521 residues) | Transcription, transcription regulation | 17 | 27 | No structural information |
| Holoprosencephaly (ZIC2) | GCG | Protein coding region (polyA) | Zinc-finger protein ZIC 2 (O95409, 532 residues) | Differentiation, neurogenesis, transcription, transcription regulation | 15 | 25 | No structural information |
| Hand-Foot-Genital Syndrome (HOXA13) | GCG | Protein coding region (polyA) | homeobox A13 (P31271, 388 residues) | Transcription, transcription regulation | 18 | 24–26 | No structural information |
| Blepharophimosis/ptosis/epicanthus inversus syndrome type II (FOXL2) | GCG | Protein coding region (polyA) | Forkhead box like 2 (P58012, 376 residues) | Differentiation, transcription, transcription regulation | 14 | 22–24 | Residues 322–328 (2l7z) |
| Infantile spasm syndrome (ARX) | GCG | Protein coding region (polyA) | Aristaless-related homeobox (Q96QS3, 562 residues) | Differentiation, neurogenesis, transcription, transcription regulation | 10–16 | 17–23 | No structural information |
| Myotonic dystrophy type 1 (DM1) | CTG | 3′UTR | Myotonic dystrophy protein kinase (DMPK) (Q09013, 639 residues) | No associated GO keywords for biological process | 5–37 | 90–6500 | Residues 11–420 (2vd5), Residues 460–537 (1wt6) |
| Friedreich ataxia (FRDA) | GAA | Intron | Frataxin (Q16595, 210 residues) | Heme biosynthesis, ion transport, iron storage, iron transport | 6–32 | >200 | Residues 88–210 (1ekg), Residues 91–210 (1ly7), Residues 82–210 (3s4m, 3s5d, 3s5e, 3s5f, 3t3j, 3t3k, 3t3l, 3t3t, 3t3x) |
| Spinocerebellar ataxia 8 (SCA8) | CTG | 3′UTR | Ataxin-8 (a.k.a. protein 1C2; present in SCA8-specific 1C2-positive intranuclear inclusions) (Q156A1, 80 residues) | Cell death | 2–130 | >110 | No structural information |
| Spinocerebellar ataxia 12 (SCA12) | CAG | 5′UTR | Serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B β isoform (Q00005, 443 residues) | Apoptosis | 7–45 | 55–78 | No structural information |
| Huntington disease-like 2 (HDL2) | CAG | Alternative splice isoform 2 – polyA-expansion | Junctophilin 3 (Q8WXH2, 748 residues) | No associated GO keywords for biological process | 6–27 | 51–57 | No structural information |
| FRAXA: fragile X syndrome | CGG | 5′UTR | Fragile X mental retardation 1 protein (Q06787, 632 residues) | Transport, mRNA transport | 6–52 | 230–2000 | Residues 1–134 (2bkd), Residues 216–280 (2fmr), Residues 216–425 (2qnd), Residues 527–541 (2la5) |
| FXTAS: fragile X tremor/ataxia syndrome | CGG | 5′UTR | Fragile X mental retardation 1 protein (Q06787, 632 residues) | Transport, mRNA transport | 6–52 | 59–230 | Residues 1–134 (2bkd), Residues 216–280 (2fmr), Residues 216–425 (2qnd), Residues 527–541 (2la5) |
| FRAXE: fragile X syndrome | CGG | 5′UTR | Fragile X mental retardation 2 protein (P51816, 1311 residues) | mRNA processing, mRNA splicing | 4–39 | 200–900 | No structural information |
UTR, untranslated region.
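The normal and disease repeat lengths in the table above can be read as simple interval lookups. The following is a minimal sketch of that reading, using three rows of the table (HD, SCA1, SCA3); the dictionary layout, the function name `classify_repeat`, and the output labels are illustrative assumptions and do not come from the review. It is not a diagnostic tool: several tabulated ranges touch or overlap (for example SCA6 and SCA8), and counts falling between the two ranges are simply reported as not covered.

```python
# Repeat-length ranges (number of trinucleotide repeats) copied from three
# rows of the table above. The dictionary layout and function name are
# illustrative assumptions, not part of the review.
REPEAT_RANGES = {
    "HD": {"normal": (6, 34), "disease": (36, 121)},
    "SCA1": {"normal": (6, 39), "disease": (40, 82)},
    "SCA3": {"normal": (10, 51), "disease": (55, 87)},
}


def classify_repeat(disease: str, repeats: int) -> str:
    """Place a repeat count in the tabulated normal or disease range."""
    ranges = REPEAT_RANGES[disease]
    low, high = ranges["normal"]
    if low <= repeats <= high:
        return "normal range"
    low, high = ranges["disease"]
    if low <= repeats <= high:
        return "disease range"
    # Counts between or beyond the two ranges are not covered by the table.
    return "outside tabulated ranges"


if __name__ == "__main__":
    for disease, repeats in [("HD", 20), ("HD", 40), ("SCA3", 53)]:
        print(disease, repeats, classify_repeat(disease, repeats))
```

Keeping the ranges as inclusive tuples mirrors how the table states them, so a row can be added by copying its two ranges unchanged.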
Figure 1. Structural variability of proteins encoded by TNR-containing genes. Illustrative domain graphics of the multi-domain structure of proteins associated with polyQ-expansion diseases. All proteins shown are referenced by their name as annotated in UniProt. The protein domains for which information is annotated in the Pfam database are shown as colored boxes with the Pfam family accession code referenced above the domain box. Complete names of domains can be accessed by searching the specific Pfam accession code at http://pfam.sanger.ac.uk/. Numbers below the domain schemes represent amino acid residue numbers. Regions containing the amino acid repeats and regions predicted to form coiled-coils (as annotated in UniProt) are shown, as well as regions with known 3D structure (boxed in red, with PDB accession codes shown). Notice the predominant location of the repeat regions within the N-terminal regions of the proteins.
Figure 2. Structural variability of proteins encoded by TNR-containing genes. Illustrative domain graphics of the multi-domain structure of proteins associated with polyD- and polyA-expansion diseases. All proteins shown are referenced by their name as annotated in UniProt. The protein domains for which information is annotated in the Pfam database are shown as colored boxes with the Pfam family accession code referenced above the domain box. Complete names of domains can be accessed by searching the specific Pfam accession code at http://pfam.sanger.ac.uk/. Numbers below the domain schemes represent amino acid residue numbers. Regions containing the amino acid repeats and regions predicted to form coiled-coils (as annotated in UniProt) are shown, as well as regions with known 3D structure (boxed in red, with PDB accession codes shown). Notice the predominant location of the repeat regions within the N-terminal regions of the proteins.
Figure 3. Structure of proteins/protein domains containing polyQ regions. (A) Cartoon representation of the domain-swapped dimer of chymotrypsin inhibitor 2 with a four-glutamine insertion [(Chen et al., 1999); PDB accession code 1cq4]; dotted lines represent the polyQ linker not visible in the X-ray crystal structure. (B) Cartoon representation of domain-swapped major dimers of ribonuclease A. Inset shows a short segment resembling the polar zipper formed by asparagine residues in the linker region [(Liu et al., 2001); PDB accession code 1f0v]. (C) Surface representation of the Fv fragment of a monoclonal antibody in complex with a polyQ peptide shown as sticks [(Li et al., 2007a), PDB accession code 2otu]. (D) Cartoon representation of the glutamine-rich domain from HDAC4 showing details of the polar interactions (dotted lines) at the oligomer interfaces involving glutamine residues [(Guo et al., 2007), PDB accession code 2o94]. (E) Cartoon representation of the crystal structures of huntingtin exon-1 fragments observed in different crystal forms, highlighting the different orientations of the C-terminal polyQ residues shown as sticks. The 17-glutamine stretch adopts variable conformations in the structures: α helix, random coil, and extended loop [(Kim et al., 2009), PDB accession codes 3io4, 3iow, 3iov, 3iou, 3iot, 3ior, 3io6].
Figure 4. Overview of ataxin-3 structural information. Schematic illustration of ataxin-3 (isoform 2; a.k.a. 3UIM isoform) domain structure highlighting the regions involved in protein–protein interactions. The solution structures of the Josephin domain (PDB accession code 1yzb) and UIMs1-2 (PDB accession code 2klz) are shown colored from N- (blue) to C-terminus (red). JD-, UIM-, NLS-, and polyQ-mediated interactions are represented by blue, red, green, and purple arrows, respectively; blue arrows indicate the location of post-translational modification sites, resulting from the interaction and phosphorylation by CK2 and GSK3. Representative multi-subunit complexes in which ataxin-3 participates are boxed (Li et al., 2002; Matsumoto et al., 2004; Scaglione et al., 2011; Durcan et al., 2012). One of the main questions in the search for ataxin-3 interacting proteins is whether polyQ expansion of the disease protein modulates the binding affinities. Current data indicate that polyQ expansion increases the affinity of ataxin-3 for CHIP (Scaglione et al., 2011), VCP/p97 (Matsumoto et al., 2004; Boeddrich et al., 2006; Zhong and Pittman, 2006), and the transcription regulators p300, CBP, and PCAF (Li et al., 2002) (interactions represented by broken lines). Strikingly, all these interactions are mediated by the flexible tail of ataxin-3, which includes the polyQ tract. Moreover, the transcriptional regulators p300, CBP, and NCOR all contain amino acid repeats.
Figure 5. Overview of the ataxin-3 protein interaction network. Data on the ataxin-3 interactors were obtained by analysis of the Interactome3D (Mosca et al., 2012), MINT (Ceol et al., 2010), and Dr. PIAS (Sugaya and Furuya, 2011) protein interaction databases, and completed with data compiled from current literature on ataxin-3 protein associations obtained with a diverse set of experimental approaches (see complete information in Table 2). Red arrows indicate interactions for which structural data have been obtained, while orange arrows indicate that biophysical data on interaction affinity in vitro are known (Table 2). Broken arrows represent interactions that result from high-throughput interactome analysis and still require detailed biochemical and functional analysis. Proteins are grouped according to their biological role.
Human ataxin-3 associated proteins.
| Ataxin-3 interacting protein (UniProt accession code) | Protein name | Direct interaction? | Interaction domain (ataxin-3) | Interaction domain (partner protein) | Reference |
|---|---|---|---|---|---|
| HHR23A/B (P54725/P54727) | UV excision repair protein RAD23 homolog A/B | Yes, Kd (JD:Ubl) = 12 μM | JD | Ubiquitin-like (Ubl) N-terminal domain | Wang et al. ( |
| Poly-ubiquitin (P0CG48/P0CG47) | Polyubiquitin-C/Polyubiquitin-B | Yes, Kd (atxn3:K48-tetraUb) = 0.2 μM, Kd (atxn3:Ub) = 50 μM | UIMs, JD | K48- and K63-linked Ub (≥4 Ub), K48-linked diUb | Burnett et al. ( |
| Ubiquilin-1 (Q9UMX0) | Protein linking IAP with cytoskeleton 1 | n.d. | n.d. | n.d. | Heir et al. ( |
| NEDD8 (Q15843) | Ubiquitin-like protein Nedd8 | Yes | JD | NEDD8 | Ferro et al. ( |
| Parkin (O60260) | E3 ubiquitin-protein ligase parkin | Yes | JD, UIMs | IBR domain, Ubiquitin-like (Ubl) domain | Durcan et al. ( |
| Ubc7 (P62253) | Ubiquitin-conjugating enzyme E2 G1 | Yes (transient interaction detected using cross-linking reagents) | n.d. | n.d. | Durcan et al. ( |
| p45 (P62195) | 26S proteasome regulatory subunit 8 | Yes | N-terminal atxn3 region (residues 1–133) | n.d. | Wang et al. ( |
| 20S Proteasome (P25786, P25787, P25788, P25789, P28066, P60900, O14818, P20618, P49721, P49720, P28070, P28074, P28072, Q99436) | Proteasome subunits α types 1-7 and β types 1-7 | n.d. | N-terminal atxn3 region (residues 1–150) | n.d. | Doss-Pepe et al. ( |
| CHIP (Q9UNE7) | E3 ubiquitin-protein ligase CHIP | Yes, Kd (atxn3:CHIP) = 2.2 μM, Kd (atxn3:Ub-CHIP) = 0.1 μM | Atxn3 C-terminus (residues 133–357) | CHIP N-terminus | Jana et al. ( |
| VCP/p97 (P55072) | Transitional endoplasmic reticulum ATPase | Yes | Residues 277–281 (includes arginine/lysine-rich NLS) | N domain, residues 1-199 | Hirabayashi et al. ( |
| E4B (O95155) | Ubiquitin conjugation factor E4 B | Yes (with 79Q-ataxin-3) | n.d. | n.d. | Matsumoto et al. ( |
| OTUB2 (Q96DC9) | Ubiquitin thioesterase OTUB2 | n.d. | n.d. | n.d. | Sowa et al. ( |
| USP13 (Q92995) | Ubiquitin carboxyl-terminal hydrolase 13 | n.d. | n.d. | n.d. | Sowa et al. ( |
| KCTD10 (Q9H3F6) | BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 3 | n.d. | n.d. | n.d. | Sowa et al. ( |
| Tubulin dimer (Q71U36/P68363) | Tubulin α-1A, Tubulin β-2B | Yes, Kd (atxn3:tubulin) = 50–70 nM | JD | n.d. | Mazzucchelli et al. ( |
| Dynein (Q9Y6G9) | Cytoplasmic dynein 1 light intermediate chain 1 | n.d. | n.d. | n.d. | Burnett and Pittman ( |
| HDAC6 (Q9UBN7) | Histone deacetylase 6 | n.d. | n.d. | n.d. | Burnett and Pittman ( |
| p300 (Q09472) | Histone acetyltransferase p300 | Yes | PolyQ-containing C terminus of atxn3 (residues 288–354) | n.d. | Li et al. ( |
| CBP (Q92793) | cAMP-response-element binding protein (CREB)-binding protein | Yes | PolyQ-containing C terminus of atxn3 (residues 288–354) | n.d. | Li et al. ( |
| PCAF (Q92831) | p300/CREB-binding protein-associated factor: histone acetyltransferase KAT2B | Yes | PolyQ-containing C terminus of atxn3 (residues 288–354) | n.d. | Li et al. ( |
| Histone H3/H4 (P68431/P62805) | Histone | Yes | JD + UIM1 and 2 (residues 1–288) | n.d. | Li et al. ( |
| HDAC3 (O15379) | histone deacetylase 3 | Yes | n.d. | n.d. | Evert et al. ( |
| NCOR1 (O75376) | Nuclear receptor corepressor 1 | n.d. | n.d. | n.d. | Evert et al. ( |
| MAML3 (Q96JK9) | Mastermind-like protein 3 | n.d. | n.d. | n.d. | Ravasi et al. ( |
| EWSR1 (Q01844) | RNA-binding protein EWS | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| CK2 (P19784) | Casein kinase II subunit α | Yes | n.d. | n.d. | Tao et al. ( |
| GSK3B (P49841) | Glycogen synthase kinase-3 β | Yes | n.d. | n.d. | Fei et al. ( |
| DNM2 (P50570) | Dynamin-2 | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| CDKN1A (P38936) | Cyclin-dependent kinase inhibitor 1 | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| ANXA7 (P20073) | Annexin A7 | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| RPS6KA1 (Q15418) | Ribosomal protein S6 kinase α-1 | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| TK1 (P04183) | Thymidine kinase, cytosolic | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| MKNK1 (Q9BUB5) | MAP kinase-interacting serine/threonine-protein kinase 1 | n.d. | n.d. | n.d. | Vinayagam et al. ( |
| TEX11 (Q8IYF3) | Testis-expressed sequence 11 protein | n.d. | n.d. | n.d. | Lim et al. ( |
| C16orf70 (Q9BSU1) | UPF0183 protein C16orf70 | n.d. | n.d. | n.d. | Lim et al. ( |
| ARHGAP19 (Q14CB8) | Rho GTPase-activating protein 19 | n.d. | n.d. | n.d. | Lim et al. ( |
| PICK1 (Q9NRD5) | PRKCA-binding protein | n.d. | n.d. | n.d. | Lim et al. ( |
Boxes shaded in gray represent associations identified in high-throughput interactome screenings.
Atxn3, ataxin-3; IBR, In Between Ring fingers; JD, Josephin domain; n.d., not determined; NLS, nuclear localization sequence; Ub, ubiquitin; UBA, ubiquitin-associated domain; Ubl, ubiquitin-like domain; UIM, ubiquitin-interacting motif.
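To give a feel for what the dissociation constants in the table imply, the sketch below computes fractional occupancy under a simple 1:1 equilibrium binding model, fraction bound = [L] / ([L] + Kd). This model, the assumption that free ligand is in excess, the function name `fraction_bound`, and the 1 μM example concentration are all illustrative assumptions and are not taken from the review; only the two Kd values (0.2 μM for K48-linked tetra-ubiquitin and 50 μM for mono-ubiquitin) come from the table.

```python
def fraction_bound(free_ligand_um: float, kd_um: float) -> float:
    """Fraction of sites occupied for 1:1 binding at equilibrium.

    Both arguments are in micromolar; free ligand is assumed to be in excess.
    """
    if free_ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_um / (free_ligand_um + kd_um)


# Dissociation constants taken from the table above (micromolar).
KD_K48_TETRA_UB = 0.2
KD_MONO_UB = 50.0

if __name__ == "__main__":
    ligand = 1.0  # hypothetical free ligand concentration, micromolar
    print(fraction_bound(ligand, KD_K48_TETRA_UB))
    print(fraction_bound(ligand, KD_MONO_UB))
```

Under these assumptions, at 1 μM free ligand the occupancy is about 0.83 for K48-linked tetra-ubiquitin and about 0.02 for mono-ubiquitin, which illustrates how strongly the 250-fold difference in Kd favors the poly-ubiquitin chain.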