Monomeric Huntingtin Exon 1 Has Similar Overall Structural Features for Wild-Type and Pathological Polyglutamine Lengths.

John B Warner¹, Kiersten M Ruff², Piau Siong Tan³, Edward A Lemke³, Rohit V Pappu², Hilal A Lashuel¹.

Abstract

Huntington's disease is caused by expansion of a polyglutamine (polyQ) domain within exon 1 of the huntingtin gene (Httex1). The prevailing hypothesis is that the monomeric Httex1 protein undergoes sharp conformational changes as the polyQ length exceeds a threshold of 36-37 residues. Here, we test this hypothesis by combining novel semi-synthesis strategies with state-of-the-art single-molecule Förster resonance energy transfer measurements on biologically relevant, monomeric Httex1 proteins of five different polyQ lengths. Our results, integrated with atomistic simulations, negate the hypothesis of a sharp, polyQ length-dependent change in the structure of monomeric Httex1. Instead, they support a continuous global compaction with increasing polyQ length that derives from increased prominence of the globular polyQ domain. Importantly, we show that monomeric Httex1 adopts tadpole-like architectures for polyQ lengths below and above the pathological threshold. Our results suggest that higher order homotypic and/or heterotypic interactions within distinct sub-populations of neurons, which are inevitable at finite cellular concentrations, are likely to be the main source of sharp polyQ length dependencies of HD.

Year: 2017 PMID: 28937758 PMCID: PMC5677759 DOI: 10.1021/jacs.7b06659

Source DB: PubMed Journal: J Am Chem Soc ISSN: 0002-7863 Impact factor: 15.419

Introduction

Huntington’s disease (HD) is a devastating inherited neurodegenerative disorder that is caused by mutational expansion of a CAG repeat region within the first exon of the huntingtin (Htt) gene.[1] Ages of onset and disease severity are inversely correlated with the length of the CAG repeat expansion. On average, the penetrance and severity at onset increase sharply above a threshold CAG repeat length of 36,[2] although there is considerable variability in the length dependence of the disease phenotype, as quantified in clinical studies.[3] Recent studies have demonstrated the possibility of CAG repeat-length-dependent aberrant splicing that leads to Htt exon 1 spanning transcripts.[4] When translated, these transcripts yield Htt exon 1 encoded protein fragments, referred to hereafter as Httex1. The sequence architecture of Httex1 is modular. The CAG repeat encodes a central polyglutamine (polyQ) domain. This is flanked N-terminally by a 17-residue amphipathic stretch (Nt17) and C-terminally by a 50-residue proline-rich (PR) domain. N-terminal fragments of the Htt protein, including Httex1, are among the smallest proteins that recapitulate HD pathology in mouse models.[5] These fragments form neuronal intranuclear inclusions and are associated with the formation of dystrophic neurites in the cortex and striatum in HD.[5] Additionally, Httex1 and N-terminal fragments of Httex1 with expanded polyQ tracts aggregate and lead to toxicity in cell culture models.[6] The existence of a pathogenic polyQ length threshold for HD has led to the expectation that there should be a sharp conformational change within monomeric Httex1 at and above the pathogenic polyQ length.[7] A direct test of this hypothesis requires atomic-level structural characterization of monomeric Httex1 as a function of polyQ length. These studies have to be performed in the absence of confounding contributions from intermolecular associations. 
However, detailed structural studies of monomeric Httex1 are challenging because of the high aggregation propensity and the polyQ-length-dependent insolubility of Httex1,[8] the repetitive nature of the polyQ and PR domains, and the sequence-encoded preference for conformational heterogeneity.[9] Httex1 molecules are highly insoluble and their solubility limits fall below the micromolar range with increasing polyQ length.[8] This poses serious challenges for interpreting data from methods such as nuclear magnetic resonance (NMR) spectroscopy, vibrational spectroscopy, or small-angle X-ray scattering. These methods require protein concentrations that are in the micromolar to millimolar range. Heterogeneous mixtures of monomers, oligomers, and higher-order aggregates inevitably confound interpretations from structural studies and make it difficult to compare results obtained from different techniques and laboratories. Furthermore, the repetitive nature of the polyQ and PR domains can lead to overlapping signals that are difficult to deconvolve.
To overcome problems posed by the poor solubility of Httex1, solubilizing sequences (e.g., oligolysine tags) or proteins (e.g., GST, MBP) are usually added to the N- or C-terminal ends of Httex1 proteins and model systems.[7,10] To monitor polyQ-mediated conformational changes and aggregation in cellular models of HD, fluorescent proteins such as GFP and YFP are commonly fused to N- and/or C-terminal ends of Httex1.[7] These protein domains, which are typically larger than 20 kDa, are as large as or larger than the Httex1 construct of interest and can have a significant influence on conformational properties as evidenced by their ability to modulate Httex1 solubility and aggregation mechanisms.[7,11] Even the addition of minimally perturbing solubilizing flanking residues (e.g., Lys (n = 1–8)) can lead to substantial alterations of the complex aggregation landscape and phase behavior of Httex1 constructs.[8,10] Bioinformatics predictions, computer simulations,[12] and NMR[13] studies on small fragments suggest that Httex1 molecules are intrinsically disordered. This designation implies that Httex1 molecules are likely to display considerable conformational heterogeneity and lack persistent secondary and tertiary structures. To characterize the conformational ensembles of intrinsically disordered proteins (IDPs) such as Httex1, we need quantitative assessments of intramolecular distances, the amplitudes of conformational fluctuations, and the overall shapes and sizes of molecules. Importantly, such measurements need to be made under conditions where there are no confounding contributions from intermolecular associations. 
Here, we report results from investigations that deploy a novel combination of single-molecule Förster resonance energy transfer (smFRET) measurements on site-specifically labeled semisynthetic Httex1 proteins and atomistic computer simulations that are based on the ABSINTH implicit solvation model and force field paradigm.[14] We deployed a recently developed intein-based expression system that enables the generation of bona fide Httex1 proteins with polyQ repeats below and above the pathogenic threshold (Q15–49).[15] Our investigations quantify the variation of intramolecular distances within Httex1 as a function of polyQ length and provide the first complete structural characterization of monomeric forms of Httex1. The smFRET measurements allow us to characterize the conformational properties of Httex1 constructs in the sub-nanomolar regime, where confounding effects of oligomerization are readily avoided.[16] Our semi-synthetic strategy allowed us to generate Httex1 with polyQ tracts of five different lengths, viz., n = 15, 23, 37, 43, 49. These lengths span the range starting from below and going above the pathological threshold length of 36–37 glutamine residues. We introduced sequential and site-specific donor and acceptor fluorophores to obtain homogeneous dual-labeled Httex1 proteins with consistent localization of the donor and acceptor fluorophores. This enables accurate structural studies of monomeric Httex1 using smFRET measurements. The semi-synthetic strategy also allowed the incorporation of specific post-translational modifications (PTMs) within the Nt17 domain, thus enabling the investigation of the role of PTMs, such as the phosphorylation of Thr 3,[17] in modulating the conformational properties of Httex1. For each polyQ length, we performed three distinct sets of smFRET experiments to obtain quantitative assessments of the intramolecular distances and the global conformational properties of monomeric Httex1. 
The measured smFRET efficiencies were combined with a maximum entropy method[18] to reweight ensembles obtained from atomistic simulations. Our approach yields the first-ever detailed atomistic description of the conformational ensembles for Httex1 as a function of polyQ length. We find that monomeric Httex1 adopts “tadpole-like” conformations characterized by a globular head comprising Nt17 adsorbed on the surface of the polyQ domain and a semi-flexible PR domain that adopts mostly expanded conformations. Additionally, we observed a continuous global compaction of Httex1 as polyQ length increased. This arises from the increased prominence of the globular polyQ domain and does not reflect any special intramolecular interactions among the three domains. Our results negate the hypothesis of a sharp, polyQ length-dependent change in the structure of monomeric Httex1 that emerges from certain classes of computer simulations,[19] although models that invoke sharp structural changes of monomeric Httex1 within oligomers[20] cannot be ruled out. Taken together, our findings provide a structural rationalization for the large variability in the age of onset for a given polyQ repeat length.[2] Importantly, our results suggest that higher order homotypic and/or heterotypic interactions within distinct sub-populations of neurons, as opposed to sharp conformational changes within monomeric Httex1, are likely to be the main source of sharp polyQ length dependencies of HD.

Results

Novel Strategies Yield Constructs for smFRET Measurements

smFRET measurements require the generation of fluorescently labeled molecules. This involves the introduction of a pair of cysteine residues and their covalent modification with donor and acceptor fluorophores via maleimide chemistry. These strategies typically result in a heterogeneous mixture of singly and dually, albeit randomly, labeled proteins,[21] although further preferential labeling can be achieved via kinetic control[22] or through chromatographic separation.[23] We have developed a broadly applicable strategy for sequential, site-specific dual fluorophore protein labeling utilizing cysteine chemistry. We adapted methodologies from peptide science that take advantage of the selective reaction between an N-terminal cysteine and formaldehyde to form a thiazolidine adduct that can be selectively deprotected under mild conditions to allow selective and sequential labeling of cysteine residues.[24] By incorporating this approach into our Ssp-intein-based strategy for producing Httex1 proteins,[15b] we were able to produce site-specific, dual-labeled Httex1 with different polyQ lengths. These Ssp-Httex1 constructs were designed with a fixed cysteine at the N-terminus of Httex1 and a second motile cysteine in the PR domain (Figure 1a). Double cysteine constructs where the N-terminal cysteine residue is thiazolidine protected were obtained by addition of formaldehyde during the Ssp-Httex1 splicing reaction. Following purification of the N-terminally protected construct, the unprotected C-terminal cysteine was rapidly labeled with an acceptor Alexa 594-maleimide probe. For each polyQ repeat length (15, 23, 37, 43, or 49Q) the acceptor was placed at A60C proximal to the polyQ domain, at either P70C or P80C internal to the PR domain, or at the C-terminal P90C residue. Labeling of Httex1 with the acceptor probe was quantitative and site-selective.
Deprotection of the N-terminal thiazolidine was then achieved by treatment with silver triflate under mild conditions.[24] The acceptor-labeled Httex1, with a liberated N-terminal cysteine, was then labeled with the donor Alexa 488-maleimide probe. Site-specifically dual-labeled Httex1 constructs were obtained in high purity as determined by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE), liquid chromatography mass spectrometry (LC-MS), and C8 reversed-phase ultra-performance liquid chromatography (RP-UPLC) (see Supporting Information, Figures S1–S4 and Table S1).

Figure 1

Site-specifically dual-labeled Httex1 library for smFRET measurements. (a) General strategy for sequential labeling of Httex1 at the N-terminal residue and within the C-terminal proline-rich region spanning residues 60–90. Here, Alexa 594 is denoted as AF594, Alexa 488 as AF488, and silver triflate as AgOTf. (b) Unmodified dual-labeled Httex1 constructs labeled with Alexa 488 (green) at the N-terminus and Alexa 594 (red) at the indicated C-terminal position. (c) Semi-synthetic strategy for obtaining dual-labeled Httex1 proteins containing site-specific post-translational modifications within Nt17, e.g., threonine 3 (T3) phosphorylation. (d) pT3-modified dual-labeled Httex1 constructs prepared as described in (c).

To investigate the effect of N-terminal PTMs on the structure of Httex1, we also developed a modified semi-synthetic strategy that allows the site-specific introduction of PTMs and sequential labeling of the protein. This strategy was then used to incorporate a phosphorylated threonine residue (pT3) at position 3 into dual-labeled Httex1 constructs (Figure 1c). We focused on T3 for the following reasons: (1) it is the most common N-terminal PTM;[17] (2) the levels of T3 phosphorylation are inversely correlated with the polyQ repeat length;[17] (3) phosphorylation at T3 induces the most pronounced stabilizing effect on the α-helical conformation of the Nt17 domain of Httex1;[25] and (4) the T3-specific kinases have not yet been identified.[26] As with Httex1, the C-terminal 18–90 fragments were expressed from E. coli and thiazolidine protected following splicing from the Ssp-Intein. The Nt17 fragment was prepared containing an N-terminal thiazolidine, pT3, and a C-terminal thioester by solid-phase peptide synthesis. Following native chemical ligation (NCL), the ligation site cysteine, C18, was masked by treatment with iodoacetamide to generate a glutamine mimetic, rather than being desulfurized to alanine.
This strategy allows for the site-specific incorporation of single or multiple Nt17 PTMs, thus enabling studies to elucidate the effect of these PTMs on the conformational ensemble of Httex1 at the monomeric and oligomeric levels. Using the recombinant and semi-synthetic strategies described above, we prepared a library of 15 site-specifically dual fluorophore-labeled Httex1 constructs from 15 to 49Q and 6 pT3-modified dual-labeled constructs of 23Q and 43Q (Figure 1b,d) that were suitable for smFRET measurements.

PolyQ Repeat Length Dependence of Httex1 Conformations Obtained Using smFRET

We used smFRET to investigate the effect of polyQ repeat length on intramolecular distances within Httex1. The data were used to generate two-dimensional FRET efficiency (EFRET) versus stoichiometry (S) histograms (Figure 2a).[27] Mean EFRET, ⟨EFRET⟩, values were obtained from a two-dimensional Gaussian fit of S versus EFRET histograms (Figure 2b). Given that Httex1 and Httex1-like model systems readily form oligomers or aggregates at micromolar and sub-micromolar concentrations,[8,15b,28] an important and unique advantage of smFRET measurements is that they were performed at sub-nanomolar concentrations, thus mitigating the effect of aggregation and allowing for characterization of the intramolecular distances within monomeric Httex1 as a function of polyQ length. We observed two main populations in the smFRET measurements for dual-labeled Httex1 constructs: a population with an S value of 0.4–0.5 and a donor-only population (no acceptor) with an S value of 1.0. Donor-only populations can arise from dye photophysics and/or incomplete labeling.[16b] Despite rigorous disaggregation,[29] we occasionally observed a third population, even in the picomolar range. Our multi-parameter analysis yielding two-dimensional plots of EFRET and S combined with a burst search algorithm[30] allowed us to separate species within this sub-population, which is most likely due to aggregated species (see Supporting Information, Figure S12). Such a population could emerge from the formation of oligomeric species in the stock solution prior to dilution and additional quenching of the donor Alexa 488. By resuspending the protein in an acetic acid/acetonitrile solvent and serially diluting it into PBS, we were able to minimize Httex1 aggregation and significantly reduce the presence of this third population.

Figure 2

smFRET measurements of Httex1. (a) Two-dimensional EFRET versus S histograms for Httex1 15–49Q with acceptor labeled at position A60C, P70C, P80C, or P90C. (b) ⟨EFRET⟩ values calculated from 2D Gaussian fits of EFRET versus S histograms. A2C† indicates labeling with Alexa488; P90C‡ indicates labeling with Alexa594, and NA denotes positions for which no construct was made. (c) Double logarithmic plot of ⟨EFRET⟩ versus donor–acceptor amino acid spacing for the unmodified Httex1 constructs. Acceptor label positions are indicated as follows: proximal to the polyQ domain (●), within the PR domain (■), and C-terminal (▲). (d) Double logarithmic plot of ⟨EFRET⟩ versus donor–acceptor amino acid spacing for the pT3-modified Httex1 constructs. Unmodified constructs are shown in filled shapes and pT3-modified constructs as open shapes with acceptor positions as indicated previously.

The mean smFRET efficiencies were plotted as a function of the amino acid spacing between the donor and acceptor fluorophores for unmodified Httex1 constructs (Figure 2c) and pT3-modified Httex1 constructs (Figure 2d). We observed a consistent trend with regard to ⟨EFRET⟩ values versus polyQ lengths. For a particular polyQ length, as the sequence spacing between donor and acceptor FRET pairs increased, the ⟨EFRET⟩ became consistently smaller. This trend is preserved upon the introduction of the pT3 modification and we observed a further decrease in the measured ⟨EFRET⟩ values from that of the unmodified Httex1. Additionally, for a given dye pair, ⟨EFRET⟩ decreased with increasing polyQ length.

All-Atom Simulations Are Used To Convert Mean FRET Efficiencies to Inferences Regarding Httex1 Conformations

For IDPs, the general method to convert measured mean EFRET values to estimates of inter-dye distances, r, requires the assumption of a functional form for the inter-dye distance distribution P(r).[23,31] We do not have a priori knowledge of the functional form for P(r) that is applicable for converting ⟨EFRET⟩ to estimates of inter-dye distances. Typically, one uses distributions from the Gaussian chain, worm-like chain, or Flory–Fisk models.[32] In the Gaussian chain model P(r) is parametrized in terms of the inter-dye distance r, the number of peptide bonds (n) between dyes, the distance l = 0.38 nm between consecutive Cα atoms, and lp, a free parameter that measures chain stiffness.[33] However, the assumption of a canonical distance distribution function for P(r) is only applicable if we know that the sequence adopts uniformly expanded or compact conformations.[32a] Such models are inapplicable for sequences that are chimeras of distinct types of conformations.[34] Httex1 is likely to fall in this chimeric class of IDPs, as it is composed of a polyQ region that has been previously shown to adopt compact conformations and a semi-flexible PR domain with two rod-like polyproline segments.[35] Given that the Gaussian chain model only depends on the effective chain stiffness, quantified by lp, we can determine lp for a specific dye pair and use it to calculate ⟨EFRET⟩ values for the remaining dye pairs. Thus, if the relative error between the measured and calculated ⟨EFRET⟩ values were large, then it would suggest that a uniformly scaling model would not describe the protein.[34] Figure 3 compares the relative errors associated with the calculated ⟨EFRET⟩ values extracted using lp obtained from numerical fits of the Gaussian chain model to the measured A60C ⟨EFRET⟩ values. We find that the relative error in the calculated ⟨EFRET⟩ values increases with increasing sequence separation of dyes. The relative errors are as high as 13%.
Comparatively, a protein that is uniformly expanded shows a mean relative error that is typically less than 4%.[33] These results suggest that the scaling determined from the A60C dye pair underestimates the distance between dyes for the remaining dye pairs, and this underestimation increases for longer sequence separations. Such a result is consistent with the N-terminus being more compact than the C-terminus of Httex1, as would be expected if the polyQ domain adopts compact conformations, whereas the PR domain adopts expanded conformations. Overall, this analysis shows that a uniform scaling model cannot describe the conformational distributions of monomeric Httex1. Therefore, we combined experimental results with distance distributions extracted from atomistic simulations to obtain refined, atomic level descriptions of the monomeric ensembles of Httex1 as a function of polyQ length.
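The consistency check described above can be sketched numerically. The snippet below is a minimal illustration and not the authors' analysis code. It assumes a Gaussian chain whose mean-square inter-dye distance is ⟨r²⟩ = 2·lp·n·l (one common parametrization; the text does not spell out the exact form used), with l = 0.38 nm and R0 = 5.6 nm taken from the text. The function names, the bisection bounds, and the example inputs are hypothetical.

```python
import numpy as np

L_BOND = 0.38  # nm, distance between consecutive C-alpha atoms (from the text)
R0 = 5.6       # nm, Förster radius (56 Å in the text)

def gaussian_chain_mean_efret(lp, n_bonds, r0=R0, l=L_BOND):
    """Mean FRET efficiency for a Gaussian chain.

    ASSUMPTION: <r^2> = 2 * lp * n_bonds * l; the paper may use another form.
    """
    mean_r2 = 2.0 * lp * n_bonds * l
    # Uniform grid, so the grid spacing cancels in the ratio of sums below.
    r = np.linspace(1e-4, 10.0 * np.sqrt(mean_r2), 20001)
    p = r**2 * np.exp(-3.0 * r**2 / (2.0 * mean_r2))  # unnormalized P(r)
    e = 1.0 / (1.0 + (r / r0) ** 6)                   # Förster efficiency
    return float(np.sum(p * e) / np.sum(p))

def fit_lp(e_measured, n_bonds, lo=0.01, hi=10.0, tol=1e-8):
    """Find lp (nm) reproducing a measured <E> by bisection.

    Works because <E> decreases monotonically as lp (chain stiffness) grows.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_chain_mean_efret(mid, n_bonds) > e_measured:
            lo = mid  # model too compact: increase stiffness
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_error(e_calc, e_meas):
    """(<E_calc> - <E_meas>) / <E_meas>, as defined in the Figure 3 caption."""
    return (e_calc - e_meas) / e_meas

# Hypothetical usage: fit lp on a reference dye pair, predict another pair.
lp_ref = fit_lp(e_measured=0.60, n_bonds=58)
e_pred = gaussian_chain_mean_efret(lp_ref, n_bonds=88)
```

If a chain scaled uniformly, the predicted efficiencies for the other dye pairs would track the measurements. A relative error that grows with sequence separation, as reported here, signals that one stiffness parameter cannot describe the whole chain.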

Figure 3

Test of the validity of using the Gaussian chain model to extract distances from FRET efficiencies for Httex1 as compared to denatured ubiquitin.[33] Relative error, (⟨EFRETcalc⟩ – ⟨EFRETmeas⟩)/⟨EFRETmeas⟩, between the measured (⟨EFRETmeas⟩) and calculated (⟨EFRETcalc⟩) FRET efficiencies as a function of (|j – i| – |j – i|ref). Here, j is the position of the C-terminal dye and i is the position of the N-terminal dye. |j – i|ref denotes the number of peptide bonds between dyes for the dye pair used to calculate lp. For Httex1 constructs, lp was determined by fitting the measured A60C ⟨EFRET⟩ values using the Gaussian chain model. The calculated ⟨EFRET⟩ values were determined for the other dye pairs by inserting lp into the equation for P(r). As a control, the relative error between measured and calculated ⟨EFRET⟩ values was calculated for ubiquitin in 8 M urea using ⟨EFRET⟩ values from Aznauryan et al.[33] Ubiquitin in 8 M urea (black circles) should follow uniform scaling and thus Gaussian chain models should reasonably approximate the underlying distance distributions for denatured ubiquitin. Here, the K48C-R74C construct was used as the reference construct to calculate lp. The dashed black line denotes the mean relative error for ubiquitin in 8 M urea.

We performed all-atom simulations of Httex1 constructs using the ABSINTH implicit solvation model and force field paradigm.[14] These simulations were performed using unlabeled molecules. However, the smFRET experiments report ⟨EFRET⟩ values calculated for constructs comparing efficiencies between an N-terminal donor and a C-terminal acceptor either proximal to the polyQ domain, within the PR domain, or at the C-terminal residue. In order to compare the simulated ensembles to the experimental results, we had to account for the presence of the dyes and their influence on the simulated conformational ensembles.
A reasonable, albeit minimalist, assumption is that dyes are fully accessible to the solvent.[34] If this were not the case, then the smFRET and fluorescence polarization anisotropy data would have revealed anomalies such as substantially hindered motions of dyes, which they do not. To account for the presence of fluorescent dyes in each of the three different positions for each polyQ length, we added dyes to the simulated ensembles in a post-processing step (see Methods for details).[34] We assume that solvation shells of radius 5 Å delineate the dyes, and inter-dye distances were calculated between the C19 atoms of Alexa 488 and Alexa 594. Our goal was to extract atomic level descriptions of conformational ensembles that are concordant with all three experimental ⟨EFRET⟩ values for each polyQ length. We achieved this using a maximum entropy reweighting method.[18] The procedure attempts to give all simulated conformations similar weights while minimizing the difference between the experimental and simulated observables. Here, for each polyQ length, the experimental observables were the ⟨EFRET⟩ values for the three dye pairs. Using the experimental ⟨EFRET⟩ values, rather than converting these mean efficiencies to mean distances, is advantageous because it limits the use of meta-data that depends on assumptions of the underlying experimental distribution.[34] To convert simulated inter-dye distances to FRET efficiencies, we deployed the Förster approximation for each conformation and generated a distribution of FRET efficiencies for each simulation and dye pair. Using the Förster formula, we calculated the conformation-specific FRET efficiency to be

$$E_{\mathrm{FRET}}(r) = \frac{1}{1 + (r/R_0)^{6}}$$

Here, r is the conformation and position specific distance between the dyes and R0 is the Förster radius, which was set to R0 = 56 Å (see Methods in Supporting Information and recent work[34]).
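The conversion from simulated inter-dye distances to efficiencies can be illustrated in a few lines. This is a minimal sketch, assuming the standard Förster relation E = 1/(1 + (r/R0)^6) with R0 = 56 Å as stated in the text. The function names and the optional per-conformation weights are illustrative and not taken from the authors' code.

```python
import numpy as np

R0_ANGSTROM = 56.0  # Förster radius stated in the text

def forster_efficiency(r, r0=R0_ANGSTROM):
    """Conformation-specific FRET efficiency for inter-dye distance r (Å)."""
    r = np.asarray(r, dtype=float)
    return 1.0 / (1.0 + (r / r0) ** 6)

def ensemble_mean_efret(distances, weights=None, r0=R0_ANGSTROM):
    """Weighted mean efficiency over an ensemble of inter-dye distances.

    weights=None gives every conformation equal weight; a reweighting
    step could pass non-uniform weights instead.
    """
    e = forster_efficiency(distances, r0)
    return float(np.average(e, weights=weights))
```

Because the efficiency is a steep, nonlinear function of distance, averaging per-conformation efficiencies is not the same as evaluating the formula at the mean distance. This is one reason to compare simulated and measured ⟨EFRET⟩ values directly.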

Generating Self-Consistent Conformational Ensembles for Httex1

We analyzed the conformational ensembles obtained at each of the distinct simulation temperatures. We quantified the agreement between calculated ⟨EFRET⟩ values from each of the simulated ensembles and the experimentally measured ⟨EFRET⟩ values. This procedure involved reweighting the conformations at each of the simulation temperatures to maximize the information theoretic entropy while minimizing the deviation between the calculated and measured ⟨EFRET⟩ values. This procedure shows that the extent of reweighting is rather minimal, as quantified by the change in entropy upon reweighting (Supporting Information, Figure S5). These results suggest that the simulations generate ensembles of sufficient accuracy for pursuing a detailed atomistic description of the conformational preferences of Httex1 as a function of polyQ length. We identified 320 K as the lowest simulation temperature for which the ensembles are most representative of the experimental data. For temperatures above 320 K, ensembles show optimal comparisons with the measured ⟨EFRET⟩ values (see Supporting Information, Figure S5). This robustness was preserved for all polyQ lengths examined.
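The reweighting idea can be sketched with a generic textbook form of maximum entropy. The snippet below is an assumption-laden illustration and not the method of ref 18: it enforces the target means exactly (a hard constraint), whereas the procedure described here balances entropy against deviation from experiment. Under the hard constraint, the weights take the form w_i ∝ exp(−Σ_k λ_k E_ik), and the multipliers λ are found by Newton iterations on the convex dual. All names are hypothetical.

```python
import numpy as np

def maxent_reweight(obs, targets, n_iter=100, tol=1e-10):
    """Maximum entropy weights matching target ensemble means.

    obs:     array (n_conformations, n_observables), e.g. per-conformation
             FRET efficiencies for three dye pairs.
    targets: array (n_observables,), e.g. measured mean efficiencies.
    Starts from uniform weights; returns weights that sum to 1.
    """
    obs = np.asarray(obs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    lam = np.zeros(obs.shape[1])
    for _ in range(n_iter):
        logw = -obs @ lam
        w = np.exp(logw - logw.max())  # subtract max for numerical stability
        w /= w.sum()
        mean = w @ obs
        grad = targets - mean          # gradient of the dual objective
        if np.max(np.abs(grad)) < tol:
            break
        centered = obs - mean
        hess = (centered * w[:, None]).T @ centered  # weighted covariance
        hess += 1e-12 * np.eye(hess.shape[0])        # tiny ridge for safety
        lam -= np.linalg.solve(hess, grad)           # Newton step
    return w

def relative_entropy(w):
    """Entropy relative to uniform weights: 0 if unchanged, negative otherwise."""
    w = np.asarray(w, dtype=float)
    nz = w[w > 0]
    return float(-np.sum(nz * np.log(nz * len(w))))
```

A relative entropy close to zero means the weights barely moved from uniform, which is the sense in which the text calls the extent of reweighting minimal.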

Httex1 Adopts Tadpole-like Conformations with a Globular Nt17-PolyQ Head and Semi-flexible Proline-Rich Tail

Figure 4 summarizes our analysis of various conformational features extracted from the reweighted conformational ensembles for Httex1 as a function of polyQ length. The results are shown for the ensembles obtained at 320 K because this is the lowest temperature that yields an entropy change corresponding to less than a kT change in the simulation energy function upon reweighting (see Methods and Supporting Information, Figure S5). The results presented here do not vary substantially across a broad temperature range spanning 310–335 K. We calculated the average distances between all pairs of residues from the reweighted ensembles. This provides a quantitative description of the conformational properties across Httex1 constructs as a function of polyQ length. Figure 4a–e shows the results of this analysis for all polyQ repeat lengths. The hotter the color, the farther two residues are from each other. The defining features of these distance maps are as follows: (1) the general distance preferences are conserved across polyQ repeat lengths and dye positions; (2) the combination of Nt17 and polyQ domains adopts compact conformations, as highlighted by small values for average distances between all pairs of residues within these domains; and (3) the PR domain predominantly adopts extended conformations, although there is a minor, temperature-dependent population characterized mainly by contacts between the flexible linker between polyproline modules of the PR domain and the surface of the polyQ domain (Supporting Information, Figure S6). Overall, these features suggest that Httex1 constructs adopt tadpole-like conformations for all polyQ repeat lengths. The tadpole-like architecture is defined by a globular “head”, consisting of Nt17 adsorbed to polyQ, and a semi-flexible “tail”, which refers to the PR domain.

Figure 4

Conformational properties derived from simulated ensembles that match all three smFRET ⟨EFRET⟩ values for a given polyQ length. (a–e) Distance maps quantify the average distance between all pairs of residues (in Å) for 15Q, 23Q, 37Q, 43Q, and 49Q, respectively. The hotter the color, the farther the average distance between a pair of residues. Tadpole-like architectures consisting of an Nt17-polyQ head and a PR domain tail are observed for all Httex1 constructs. (f–j) Normalized Rg distributions for the reweighted conformational ensembles of 15Q, 23Q, 37Q, 43Q, and 49Q, respectively. Here, Rg is normalized by √N, where N is the number of residues in the construct. Insets depict highly probable conformations that are consistent with a given Rg/√N value. In these snapshots, glutamine is shown in orange, proline in purple, negatively charged residues in red, positively charged residues in blue, hydrophobic residues in black, non-glutamine polar residues in green, and glycine and histidine in pink. (k) Comparison of normalized Rg distributions for all polyQ lengths. As the polyQ length increases a continuous decrease in the distribution of Rg/√N values is observed. This is a result of the increased presence of a globular polyQ domain and is visually observed from the snapshots in panels f–j. (l) The average Rg/√N as a function of polyQ length. Error bars denote the standard error of the mean calculated over three independent simulations. (m) Scaling of the mean size (⟨Rg⟩) of the polyQ domain as a function of polyQ length. The line shows the best fit to the equation ln(⟨Rg⟩) = ln(α) + ν ln(N). Here, ν = 0.36 and α = 2.62 Å. Error bars denote the standard error of the mean for three independent simulations. (n) Probability that Nt17-Qn adopts globular conformations. The probability was calculated from two-dimensional histograms of Rg/N^(1/3) and asphericity, δ. Specifically, the probability was calculated by summing the density within the two-dimensional region defined by 2.5 Å ≤ Rg/N^(1/3) < 3.5 Å and 0 ≤ δ < 0.26. The error bars correspond to the standard error of the mean over three independent simulations. (o) Two-dimensional histogram of Rg/N^(1/3) and δ for the polyQ-PR domains of Httex1 49Q. The red rectangle marks the region that corresponds to globular conformations as defined above. For all polyQ lengths the probability of polyQ-PR domains adopting globular conformations is negligible.

Conformational Properties of Httex1 Change Continuously with PolyQ Length

A prevailing hypothesis in the field is of an abrupt conformational change that accompanies an increase in polyQ repeat length beyond the threshold of 36–37 residues. To test whether the smFRET efficiencies are consistent with an abrupt change in conformational properties, we quantified the distributions of normalized radii of gyration for each polyQ repeat length. In order to put all Httex1 constructs on the same scale, the radius of gyration (Rg) distributions, which quantify the size of the conformations in the simulated ensemble, are normalized by N^0.5. Here, N is the number of residues in the construct. The results from our analysis are shown in Figure 4f–l. These distributions suggest that Httex1 undergoes a continuous global contraction as the polyQ repeat length increases. This contraction arises from the increased prominence of the globular polyQ domain as the polyQ length increases. Fluorescence correlation spectroscopy experiments on Gly-(Gln)-Cys*-Lys2 show that polyQ adopts collapsed conformations.[35a] We asked if this feature is preserved in the context of the native Httex1 constructs. For a uniform collapsed polymer, the ensemble-averaged Rg scales with chain length, N, according to ⟨Rg⟩ = αN^(1/3), where α ≈ 3.0 Å. Figure 4m shows the results of the least-squares regression analysis for ln(N) versus ln(⟨Rg⟩). The fitted exponent ν (the slope) and prefactor α (obtained from the intercept, ln α) are 0.36 and 2.62 Å, respectively. This implies that the polyQ domain maintains its intrinsic preference for globular conformations in the context of Httex1. These globular conformations are likely to be more stable as polyQ length increases because the surface-to-volume ratio decreases as N^(−1/3) as N increases.
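The scaling analysis amounts to a straight-line fit in log-log space, ln⟨Rg⟩ = ln α + ν ln N. The sketch below shows that regression with NumPy. The input values are synthetic and are generated from the reported fit parameters (ν = 0.36, α = 2.62 Å) purely to exercise the code; they are not the simulation data, and the function name is hypothetical.

```python
import numpy as np

def fit_scaling(n_residues, mean_rg):
    """Least-squares fit of ln<Rg> = ln(alpha) + nu * ln(N).

    Returns (nu, alpha); alpha has the same length unit as mean_rg.
    """
    # polyfit returns coefficients highest degree first: [slope, intercept].
    nu, ln_alpha = np.polyfit(np.log(n_residues), np.log(mean_rg), 1)
    return float(nu), float(np.exp(ln_alpha))

# SYNTHETIC example: polyQ lengths from the text, Rg values generated from
# the reported fit (not measured or simulated data).
polyq_lengths = np.array([15, 23, 37, 43, 49], dtype=float)
synthetic_rg = 2.62 * polyq_lengths ** 0.36

nu, alpha = fit_scaling(polyq_lengths, synthetic_rg)
```

An exponent near 1/3 is the signature of a compact globule, whereas values near 0.5 or higher would indicate more expanded chains.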
Importantly, unlike recent simulation results, we do not observe a compaction that leads to values below the canonical exponent of 1/3 or any increases in β-sheet contents that were recently reported for polyQ lengths above the pathological threshold.[19] Inasmuch as our simulation results are concordant with and vetted by experimental data, it appears that abrupt conformational transitions are likely to be low likelihood fluctuations that may or may not be enhanced by intermolecular interactions.[36] However, such low likelihood fluctuations are not the defining intrinsic features of the polyQ-length-dependent conformational properties of monomeric Httex1 constructs and are likely to be discernible through the use of biased sampling methods that mimic the effects of intermolecular interactions.[20a] Previous studies have suggested that Nt17 undergoes a polyQ-mediated expansion that coincides with an adsorption to the polyQ domain for polyQ lengths of greater than ∼20.[12a,37] In order to test whether Nt17 adsorbs on the polyQ domain within Httex1 constructs, we constructed two-dimensional histograms of Rg/N^(1/3) and asphericity, δ, calculated over Nt17-Q. Here, δ = 1 − 3(λ1λ2 + λ2λ3 + λ1λ3)/(λ1 + λ2 + λ3)^2, where λ1, λ2, and λ3 are the eigenvalues of the conformation-specific gyration tensor.[12c,38] When δ ≤ 0.25, conformations are spherical (globular), whereas δ → 1 corresponds to rod-like conformations. For globular conformations, which should be observed if Nt17 adsorbs on the polyQ domain, Rg/N^(1/3) should be approximately 3.0 Å. Summation over the density within the two-dimensional region of 2.5 Å ≤ Rg/N^(1/3) < 3.5 Å and 0 ≤ δ < 0.26 quantifies the probability that Nt17 is adsorbed on the polyQ domain. As shown in Figure n, all polyQ lengths lead to a high degree of adsorption between Nt17 and polyQ, with greater than 70% of the conformations being globular for monomeric Httex1 in the absence of oligomerization. 
We performed a similar analysis over polyQ-PR domains and found that, for all polyQ lengths and dye pairs, a negligible percentage of the conformations were globular. An example two-dimensional histogram for Httex1 49Q is shown in Figure o. Most of the density was observed outside the region that corresponds to globular conformations.
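The globularity analysis can be sketched as follows, assuming the standard asphericity definition δ = 1 − 3(λ1λ2 + λ2λ3 + λ1λ3)/(λ1 + λ2 + λ3)^2 over the gyration tensor eigenvalues. The function names and the equal weighting of positions are illustrative assumptions, not the analysis code used in this work. The sketch avoids an explicit eigenvalue solver by using the tensor invariants: the trace gives λ1 + λ2 + λ3 (which equals Rg²), and the pairwise sum follows from the trace of the squared tensor.

```python
import math

def gyration_tensor(coords):
    """3x3 gyration tensor for a list of (x, y, z) positions, equally weighted."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for c in coords:
        d = [c[k] - center[k] for k in range(3)]
        for a in range(3):
            for b in range(3):
                tensor[a][b] += d[a] * d[b] / n
    return tensor

def rg_and_asphericity(coords):
    """Return (Rg, delta) using invariants of the gyration tensor."""
    t = gyration_tensor(coords)
    trace = t[0][0] + t[1][1] + t[2][2]  # l1 + l2 + l3 = Rg^2
    trace_sq = sum(t[a][b] * t[b][a] for a in range(3) for b in range(3))
    pair_sum = 0.5 * (trace ** 2 - trace_sq)  # l1*l2 + l2*l3 + l1*l3
    delta = 1.0 - 3.0 * pair_sum / trace ** 2
    return math.sqrt(trace), delta

def globular_fraction(conformations, n_residues):
    """Fraction of conformations with 2.5 <= Rg/N^(1/3) < 3.5 and delta < 0.26."""
    hits = 0
    for coords in conformations:
        rg, delta = rg_and_asphericity(coords)
        scaled = rg / n_residues ** (1.0 / 3.0)
        # delta is non-negative by construction, so only the upper bound is
        # tested; this avoids rejecting values such as -1e-17 from round-off.
        if 2.5 <= scaled < 3.5 and delta < 0.26:
            hits += 1
    return hits / len(conformations)
```

As a sanity check, six points placed symmetrically on the coordinate axes give δ = 0 (sphere-like), whereas points on a straight line give δ = 1 (rod-like).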

Discussion

The prevailing hypothesis in the HD field is that sharp changes in conformational properties of monomeric Httex1 accompany the increase in polyQ length beyond the pathological threshold. We tested this hypothesis using data from our smFRET measurements and computational analysis. Our integrative approach yielded the following insights: Httex1 constructs adopt tadpole-like architectures defined by a globular head comprised of Nt17 adsorbed on the polyQ domain and an extended tail comprised of the PR domain. These results do not support a sharp, polyQ-length-dependent structural change within monomeric Httex1. Instead, they support a continuous global compaction with increasing polyQ length that arises due to the increased prominence of the compact polyQ domain. The cellular concentrations and sub-cellular localization of Httex1 are unknown and need to be measured precisely. Estimates in the literature place the cellular concentrations of Httex1 in the nanomolar or sub-nanomolar regime.[39] The sub-nanomolar concentrations used in our experiments are significantly below the critical concentration thresholds that promote aggregation and phase separation in vitro[8] and the solubility limits in cells where deleterious phenotypes such as the impairment of proteostasis networks are manifest.[40] By working at sub-nanomolar concentrations, we were able to rule out a sharp conformational change at the monomer level as the sole reason for the polyQ-length-dependent toxicity threshold observed in HD. Instead, we propose that the continuous increase in the surface area of the polyQ domain with increasing polyQ length leads to changes in homotypic and heterotypic interactions and that these changes engender the polyQ-length-dependent threshold observed in HD. 
By studying the conformational properties of monomeric forms of Httex1 we were able to establish that sharp changes observed in Httex1 aggregation or interaction networks as a function of polyQ length are not a result of sharp, polyQ-length-dependent conformational changes at the monomer level. These results imply that in order to understand the polyQ-length-dependent toxicity observed in HD, future biophysical studies should focus on understanding differences in higher order interactions and conformational transitions mediated by intermolecular interactions as a function of polyQ length. This will require a combination of biased sampling methods to construct the appropriate free energy surfaces impacted by conformational fluctuations as well as advanced experimental methods that probe conformations and fluctuations influenced by the interplay between intra- and intermolecular interactions. A subset of studies, based on the binding of antibodies, also argues against a “structural toxic threshold” model.[41] A continuous increase in binding was observed for two polyQ-targeting antibodies as a function of polyQ length. These results suggest that there is a monotonic increase in the number of surface epitopes rather than a sharp structural change within the Httex1 fusion proteins as the polyQ length increases. Other studies have suggested that the polyQ domain undergoes an increased rigidity transition only above the pathogenic polyQ length threshold.[7] These inferences were based on fluorescence lifetime imaging microscopy FRET experiments conducted in live cells on Httex1 fluorescent protein fusion constructs. As noted earlier, tagged systems generate confounding observations with considerable variability, depending on the tags that are used.

Implications of the Structures of Monomeric Httex1 for Heterotypic Interactions

The surface area of the polyQ globules increases as N^(2/3) with polyQ length N. This increase in polyQ surface area with N should increase the number of surface accessible polyQ sites and enable the emergence of new interactions with proteins in the cellular milieu (Figure ).[42] Such heterotypic interactions might give rise to sharp changes in cellular phenotypes that influence protein quality control, toxicity, and cell death.[42,43] Even though Htt is ubiquitously expressed, medium spiny and striatal neurons are most susceptible to neurotoxicity and degeneration.[44] This suggests that the growing prominence of the polyQ domain within the tadpole-like structure of monomeric Httex1 might elicit toxic, gain-of-function interactions in specific neuronal sub-types, thus giving rise to the appearance of a sharp pathological transition as a function of polyQ length.[42] The key question is if the tadpole-like architecture is sufficient to engender sharp, polyQ length-dependent gain-of-function heterotypic interactions within cells. A recent study provides preliminary support for this hypothesis, showing that the network of protein–protein interactions, with Httex1 at the hub, changes sharply with polyQ length.[42] Importantly, these findings, in cells, were explained using the central tenets of the tadpole-like architecture presented in this work.
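The surface-area scaling quoted above follows from a simple geometric argument. The relations below are a sketch that assumes an idealized globule of uniform density, so that the volume is proportional to the number of residues; the symbols V, R, and A (volume, radius, and surface area) are introduced here for illustration only.

```latex
V \propto N, \qquad
R \propto V^{1/3} \propto N^{1/3}, \qquad
A \propto R^{2} \propto N^{2/3}, \qquad
\frac{A}{V} \propto N^{-1/3}.
```

Under this idealization, going from 23 to 49 glutamines would increase the surface area of the polyQ globule by a factor of (49/23)^(2/3) ≈ 1.66, a smooth rather than abrupt change.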

Figure 5

Proposed influences of tadpole-like monomeric Httex1. The top row shows the proposed impact of monomeric Httex1 on heterotypic interactions. Green, orange, and purple symbols and edges depict interactions of monomeric Httex1 through Nt17, polyQ, and the PR domain, respectively. As polyQ length increases, we propose that the number and strengths of heterotypic interactions can increase, vis-à-vis the wild-type, due to the increased prominence of the globular polyQ domain in the tadpole-like architecture of monomeric Httex1. The bottom row shows the proposed impact of polyQ length on homotypic interactions that drive the aggregation and phase separation of Httex1. The total cellular concentration of Httex1 is denoted as ct. For the wild-type, we propose that ct < cF, where cF is the saturation concentration that has to be crossed to drive the formation of insoluble, fibrillar aggregates.[8] Conversely, polyQ expansions lead to a reversal whereby ct > cF, and hence, depending on the gap between cF and ct, there is an increasing driving force for forming large fibrillar aggregates. The tadpole-like structures of monomeric Httex1 determine the overall bottlebrush architecture of the aggregates,[11a] whereas nucleated conformational changes within Httex1 determine the intermolecular interfaces and the strengths of aggregates, including fibrils.[11a,51]

The large variability in ages of onset for a given pathogenic polyQ length suggests that additional factors, including gain-of-function heterotypic interactions and higher-order homotypic interactions, may be the determinants of HD progression. This idea is consistent with studies suggesting that overexpressed proteins housing Q-rich regions bind mutant Htt and suppress cellular toxicity in yeast.[45] This suppression was proposed to be a result of blocking the interactions between more essential proteins and mutant Htt. 
Wear et al.[46] showed that there was an enrichment in proteins housing long intrinsically disordered regions (IDRs) associated with mutant Htt aggregates. For two representative binding partners, this interaction was dependent on the presence of the IDR, which may suggest that the IDR engages in preferential interactions with expanded polyQ domains. Finally, protein quality-control machineries in striatal neurons, as opposed to cortical neurons, are impaired in response to the expression of mutant Httex1.[44] This suggests that monomeric or soluble forms of Httex1 with expanded polyQ tracts might engage deleteriously, albeit in cell-specific ways, with components of the protein quality-control machineries such as the ubiquitin proteasome system and autophagy.

Implications for the Driving Forces for and Mechanisms of Httex1 Aggregation

Inferences from previous in vitro studies of aggregation kinetics suggest that the rate of nucleation of β-sheet-rich conformations should increase with increasing polyQ length.[37,47] [...] significantly high peptide concentrations.[15b,48] Analysis of Httex1 fibril structure by solid-state NMR showed that Httex1 fibrils adopt a β-hairpin-based polyQ core structure, which requires a minimum of 22 glutamine residues.[49] Our smFRET measurements cannot rule out the possibility of increased β-sheet content within the collapsed polyQ domains of monomeric Httex1.[20] However, our simulation results show negligible secondary structure preferences within polyQ domains, irrespective of polyQ length. The implication is that β-sheet formation is likely to be a rare event that is confronted by high free energy barriers.[20a] The most likely scenario is that the tadpole-like architecture drives the spontaneous formation of lower molecular weight aggregates such as bristled spheres that are characterized by sequestration of the polyQ domains on the interior of the spheres and exposure of the PR domain tails to solvent. 
Nucleated conformational conversion within these bristled spheres[8,35b] likely promotes the templated formation of high-molecular-weight β-sheet-rich fibrils that have bottlebrush architectures stabilized by polyQ cores and PR domains forming the bristles of the brush.[11a] From the standpoint of aggregation, the biophysical basis for the pathological length threshold may well be the lowering of the saturation concentrations for forming bristled spheres and bottlebrush fibrils as the polyQ length increases.[8] The tendency for polyQ peptides to form collapsed conformations is consistent with previous results showing that water is a poor solvent for polyQ, thus explaining the poor solubility of polyQ peptides in aqueous solutions.[12c,35a] This connection between monomeric collapse and solubility was also observed experimentally by Walters et al., who showed that polyQ peptides that underwent monomeric collapse readily formed soluble aggregates.[35b] Increasing the polyQ length leads to more unfavorable interactions between the surface of the collapsed monomer and the surrounding solvent. The driving force of Httex1 aggregation arises from increased intermolecular interactions that minimize glutamine interactions with the surrounding solvent rather than a sharp polyQ-induced structural rearrangement. Furthermore, this suggests that at physiological concentrations longer polyQ domains will have a more pronounced tendency to aggregate, and this could contribute to the pathogenic polyQ length threshold observed in HD[50] (Figure ). If the physiological concentration of Httex1 were designated as ct, then only proteins containing polyQ lengths greater than the pathogenic threshold have a saturation concentration for aggregation lower than ct. This would lead to a sharp polyQ length threshold for the formation of a heterogeneous set of potentially toxic aggregates that are not observed for wild-type polyQ lengths. 
To test this hypothesis, we need accurate measurements of the physiological concentrations of Httex1, as well as the saturation concentrations for aggregation as a function of polyQ length.

Impact of N- and C-Terminal Flanking Sequences of polyQ

The aggregation of Httex1 is dependent not only on polyQ repeat length but also on the presence of Nt17 and PR domains. For a given polyQ length, the presence of Nt17 increases the drive to form large, linear, insoluble aggregates and decreases the solubility of Httex1 constructs.[8,28,37,52] Two models have been proposed in order to describe how Nt17 modulates Httex1 aggregation.[53] In the proximity model, Nt17 drives Httex1 aggregation by increasing the effective local concentration of polyQ through Nt17-dependent helical bundling.[28,37,54] Sahoo et al. showed by fluorescence correlation spectroscopy that Httex1-like constructs with expanded polyQ tracts readily form tetramers. However, when the Nt17 domain is replaced by di-lysine, only monomers are observed. Although these results do suggest that Nt17 is important for modulating Httex1 aggregation, the lack of oligomerization may be a result of the addition of the di-lysine rather than just the removal of the Nt17 domain. Such a result is consistent with previous studies, which show that the addition of Lys (n = 1–8) flanking the polyQ domain can modulate both the degree of collapse within the polyQ domain and the solubility of Httex1-like constructs.[8,35b] In the domain cross-talk model, Nt17 and polyQ inter-domain interactions control the specificity and stability of intermolecular interactions.[53,55] This model suggests that the length of the polyQ domain is the main driver of Httex1 aggregation and Nt17 enhances the formation of linear as opposed to spherical aggregates by providing a surface-adsorbed amphipathic “patch” on polyQ that promotes the formation of linear aggregates by diluting the contacts that lead to spherical aggregates.[8,55] This suggests that as the polyQ repeat length is increased, Nt17, which then makes up a smaller portion of Httex1, should be less effective at modulating polyQ-dependent aggregation. 
This is supported experimentally by the observation of a decrease in fibril formation rates upon removal of the Nt17 domain.[8,52] Given that the degree of Nt17 adsorption is modulated by interactions between the polyQ domain and uncharged residues of Nt17, increasing the charge within Nt17 is likely to reduce the degree of adsorption between the Nt17 and polyQ domains.[55] Recent studies have shown that phosphorylating T3, S13, or S16 in Nt17 reduces the driving forces for forming insoluble aggregates.[17,56] However, whereas the phosphorylation of T3 stabilizes Nt17 helicity in isolation, phosphorylation of S13 and S16 destabilizes Nt17 helicity.[25] Together these results suggest that the degree of cross talk between Nt17 and polyQ and/or the charge within Nt17, rather than the degree of intrinsic helicity within Nt17, is likely to be more important as a modulator of aggregation mechanisms. We hypothesize that phosphorylation of Nt17 residues reduces Httex1 aggregation by (1) increasing the charge within Nt17, which may increase the solubility of Httex1 constructs strictly through a charge effect, as well as reduce intermolecular Nt17 interactions and/or (2) reducing the degree of adsorption of Nt17 on the polyQ domain which modulates the types of aggregates that form, as well as reduces the stability that can be gained from intermolecular interactions between the Nt17 and polyQ domains.[55] In contrast to the effect of Nt17 on Httex1 aggregation, the PR domain, as well as C-terminally truncated versions of the PR domain, increases the solubility and reduces the drive to form fibrils of polyQ-containing constructs.[8,10,57] Our smFRET measurements are consistent with the PR domain being an extended, semi-flexible chain and engaging in relatively few contacts with the Nt17 or polyQ domains. 
The small degree of conformational coupling between the PR domain and the Nt17 and polyQ domains, when compared to the coupling between the Nt17 and polyQ domains, may explain why the PR domain helps to solubilize polyQ-containing constructs whereas the Nt17 domain decreases the solubility of the same constructs. Beyond the intrinsic solubility of the PR domain, limited interactions with the Nt17 and polyQ domain engender conformations in which an excluded volume tail can restrict the ways in which molecules can come together and may further increase the solubility of polyQ-containing constructs. Furthermore, coarse-grained simulations on Httex1-like constructs suggest that flanking regions that show coil-like properties and limited coupling with the polyQ domain preferentially form spherical aggregates which may kinetically hinder the formation of large, linear aggregates.[55]

Summary

The integrative approach deployed here has allowed us to obtain a detailed description of the monomeric forms of Httex1. We propose that as the polyQ length increases, the increased prominence of the polyQ domain leads to increased unfavorable interactions with the surrounding solvent. This, in turn, should lead to an increased drive to form higher order homotypic and/or heterotypic interactions through the polyQ domain. As the formation of intermolecular contacts and higher order oligomeric species appears to be at the crux of HD pathophysiology, it will be crucial to isolate and characterize these higher molecular weight species. Furthermore, identification of binding partners that promote or stabilize non-toxic oligomeric Httex1 species will enable advances in understanding the relationship between Httex1 phase behavior and HD pathophysiology.

Methods

Expression and Purification of Httex1 A2C N-Terminal Thiazolidine Double Cysteine Constructs

Expression and purification were performed with modifications as described by Vieweg et al.[15b] Chemo-competent E. coli ER2566 cells (NEB) were transformed with resulting vectors pTWIN1-His6-Ssp-Httex1-QN-A2C-A60/P70/P80/P90C. Isolated single colonies were inoculated in 500 mL lysogeny broth (LB) (100 μg/mL ampicillin) at 37 °C overnight with 180 rpm shaking. The following morning, 12 L of LB (100 μg/mL ampicillin) were mixed with the overnight culture to obtain an OD600 of 0.05. Cells were grown at 37 °C until an OD600 of 0.1 was reached; the temperature of the incubator was then set to 14 °C. Protein expression was then induced at an OD600 of 0.3 with 0.4 mM IPTG overnight. Cells were harvested by centrifugation (4 °C, 6238g, 8 min) and cell pellets were kept on ice. Cell pellets were resuspended in 50 mL buffer A (50 mM HEPES, 0.5 M NaCl, pH 8.5) containing 0.3 mM PMSF and 1x CLAP. Cells were lysed on ice by sonication (6 min, pulse on 30 s, pulse off 59 s, 70% amplitude) using a vibra cell VCX130 from Sonics. The lysate was cleared by centrifugation (30 min, 4 °C, 27216g). The cleared supernatant was then filtered through 0.45 μm syringe filter membranes and applied to a 5 mL Histrap column (GE Healthcare, 17-5248-02) at a flow rate of 1 mL/min. The column was then washed with 10 column volumes (CV) of buffer B (50 mM HEPES, 0.5 M NaCl) to remove non-specifically bound proteins. The column was then washed with 3 CV of 5% buffer C (50 mM HEPES, 0.5 M NaCl, 0.5 M imidazole). Fusion proteins were then eluted off the Histrap column using a gradient from 4 to 50% buffer C over 50 mL. Elution fractions were analyzed by SDS-PAGE and pooled for splicing. Splicing and in situ N-terminal thiazolidine formation were initiated by addition of 1 mM TCEP and 1 mM formaldehyde and adjusting the pH to 6.8. Splicing was carried out at room temperature (RT) and monitored by SDS-PAGE and analytical C8 reversed-phase ultra-high-performance liquid chromatography (RP-UHPLC). 
For polyQ repeat lengths >37Q, protein was allowed to splice for a maximum of 4 h, while for polyQ repeat lengths ≤37Q, splicing was performed for 12–16 h. Following splicing, samples were filtered through 0.45 μm syringe filter membranes and injected into a preparative C4 reversed-phase high performance liquid chromatography (RP-HPLC) column (00G-4168-P0-AX, Jupiter C4, 10 μm, 300 Å, 21.2 mm i.d. × 250 mm length) pre-equilibrated with 95% buffer D (water with 0.1% trifluoroacetic acid (TFA)) and 5% buffer E (acetonitrile with 0.1% v/v TFA). Spliced Httex1 constructs were eluted using a gradient of 30–40% buffer E over 40 min. Collected fractions were analyzed by liquid chromatography mass spectrometry (LCMS) using a Thermo Scientific LTQ ion trap mass spectrometer and pooled accordingly for lyophilization. Purity of lyophilized protein was assessed by LCMS using a C3 poroshell 300SB 1.0 × 75 mm, 5 μm column from Agilent (method: 5–95% ACN in 5 min, flow rate of 0.3 mL/min, injection volume of 10 μL). LCMS spectra were deconvoluted with MagTran software v. 1.03b from Amgen.

Httex1 Double Labeling

Purified N-terminally thiazolidine protected Httex1 constructs with C-terminal cysteine residues, 2.0 mg, were disaggregated using trifluoroacetic acid/hexafluoroisopropanol (TFA/HFIP) (1:1 v/v) as described by O’Nuallain et al.[29a] Protein was resuspended on ice in 1.0 mL labeling buffer (100 mM Tris pH 7.4, 6 M guanidinium HCl (GdHCl)) for constructs with polyQ ≤ 37Q and mutant labeling buffer (100 mM Tris pH 7.4, 6 M GdHCl, 50 mM trehalose, 0.5 M proline) for constructs with polyQ > 37Q. The pH was quickly adjusted to 7.4 as needed followed by the addition of 1.5 equiv of Alexa594-maleimide and incubated on ice for 15 min. The reaction was monitored by LCMS as previously described. Excess Alexa594-maleimide was removed using a PD-10 desalting column equilibrated with thiazolidine deprotection buffer (5% acetic acid, 5% acetonitrile in water with 0.1% TFA). The protein was then diluted to 5.0 mL with thiazolidine deprotection buffer and kept on ice. Thiazolidine deprotection was initiated by addition of 100 equiv of silver triflate for 15–30 min on ice. The reaction was monitored by LCMS. Upon completion of N-terminal deprotection, the reaction was flash frozen in liquid nitrogen and solvent was removed by lyophilization. The protein was resuspended in labeling buffer with the addition of 10 mM TCEP and incubated on ice for 30 min and then precipitated by addition of 14 mL of cold ethanol and stored at −80 °C overnight. Protein was then pelleted by centrifugation (30 min, 4 °C, 5251g), and the supernatant was discarded. The pellet was washed with 5 mL of cold ethanol and collected by centrifugation. Trace solvent was then removed by lyophilization for 1–2 h. Protein was then disaggregated and excess silver was removed by resuspension in TFA/HFIP (1:1 v/v). Insoluble silver salts were removed by centrifugation and the supernatant was carefully removed. 
The pellet was washed twice with TFA/HFIP (1:1 v/v) and supernatants were combined and dried under a stream of nitrogen. Trace solvent was removed by lyophilization for 1–2 h. Protein was then resuspended in 1.0 mL of either labeling or mutant labeling buffer and labeled with Alexa488-maleimide as described previously for Alexa594-maleimide. Following labeling, excess Alexa488-maleimide was removed by ethanol precipitation as previously described. Prior to final HPLC purification, protein was disaggregated with neat TFA containing a catalytic amount of ammonium iodide to reduce any possible methionine oxidation as described by Christian et al.[58] Following evaporation of TFA, trace solvent was removed by lyophilization. Protein was resuspended in 1.0 mL of either labeling or mutant labeling buffer and directly injected onto a Jupiter 5 μm C4 300 Å 250 × 4.6 mm or Jupiter 5 μm C4 300 Å 250 × 10 mm column. Protein was eluted with a 25–55% gradient of buffer E over 50 min. Collected fractions were analyzed by LCMS for purity and pooled accordingly. Final purity of the doubly labeled protein was assessed by SDS-PAGE, C8 UPLC, and LCMS (see Supporting Information, including Figures S1–S4).

Expression and Purification of Htt18–90 Q18C N-Terminal Thiazolidine Double-Cysteine Constructs

Expression and purification was performed as previously described for full length Httex1 constructs. Spliced Htt18–90 constructs were eluted using a gradient of 20–55% buffer E over 50 min. Collected fractions were analyzed by LCMS and pooled accordingly. Final purity was assessed by C8 UPLC.

Semi-synthesis of Dual-Labeled and T3-Phosphorylated Httex1 Proteins

Htt18–90 Q18Thz double cysteine fragments were labeled with Alexa594-maleimide and subsequently N-terminally deprotected as described previously. Following N-terminal deprotection, native chemical ligation (NCL) was performed as described by Chiki et al.[25] Briefly, labeled protein was dissolved in 800 μL of labeling or mutant labeling buffer containing 100 mM TCEP and 50 mM methoxyamine. The pH was increased to ∼4.0 and the mixture was incubated at room temperature for 30 min. Following brief methoxyamine treatment, 50 mM MPAA was added and the pH of the reaction was adjusted to 6.9. NCL was initiated by addition of 3 equiv of Nbz-thioester peptide. The reaction was incubated at room temperature and monitored by LCMS. Upon completion, protein was purified using a C4 semi-prep HPLC with a linear gradient of 10–45% buffer E over 50 min. Fractions were collected, assayed for purity by LCMS, and pooled accordingly. Protein was then dried by lyophilization. Disaggregation was performed using TFA/HFIP (1:1 v/v) as described previously. Iodoacetamide treatment was then performed to mask Q18C as a pseudo-glutamine by resuspending protein in 1.0 mL labeling or mutant labeling buffer and 1 mM freshly prepared iodoacetamide and 1 mM TCEP. The protein was incubated at room temperature for 15 min or on ice for 30 min. The reaction was monitored by LCMS and upon completion desalted into thiazolidine deprotection buffer as described previously. Following this, protein was N-terminally deprotected, labeled with Alexa488-maleimide, and HPLC purified as described previously. Final protein purity was characterized by SDS-PAGE, C8 UPLC, and LCMS (see Supporting Information, Figure S8).

Single Fluorophore Httex1 Labeling

Httex1 A2C 15–49Q were expressed and purified as previously described but without the addition of formaldehyde during Intein-Ssp splicing. Protein was labeled with Alexa488-maleimide, the donor fluorophore, as described above and excess fluorophore was removed by ethanol precipitation. Donor-only constructs were HPLC purified as previously described. Httex1 A2Thz P90C 15–49Q were prepared as described above. Protein was labeled with Alexa594-maleimide, the acceptor fluorophore, as previously described and excess fluorophore was removed by ethanol precipitation. Acceptor-only constructs were HPLC purified as previously described. Final protein purity was characterized by SDS-PAGE, C8 UPLC, and LCMS (see Supporting Information, Figures S1–S4).

smFRET Measurements

For all smFRET measurements 5–30 μg portions of protein were weighed out and disaggregated by TFA/HFIP (1:1 v/v) as previously described. Protein samples were resuspended at a target concentration of 1 μM in 20% acetonitrile and 20% acetic acid in water with 0.1% TFA. Aliquots were prepared, flash frozen, and stored at −80 °C. Dual-labeled protein samples were diluted to between 50 and 200 pM in Dulbecco’s PBS pH 7.4. Initial measurements were made on samples prior to freezing and replicates were collected from −80 °C stored samples. Data were collected using a custom-built multi-parameter single-molecule spectrometer analogous to that previously described.[59] Single-molecule bursts were identified using a burst search and a threshold of 80 photons was subsequently applied over the donor and acceptor channels.[30a] Leakage of donor fluorescence into the acceptor channel was corrected. Mean FRET efficiencies, ⟨EFRET⟩, and stoichiometry, S, were calculated using a custom-written code in IgorPro (Wavemetrics, Lake Oswego, OR). Measurements were performed in triplicate for all constructs. ⟨EFRET⟩ was calculated as intensity-based FRET efficiencies, where ID and IA are donor and acceptor intensities respectively, and γ is a correction factor dependent upon Httex1 donor (ΦD) and acceptor (ΦA) quantum yields and the detection efficiencies of the donor channel (ηD) and acceptor channel (ηA) (see Supporting Information, including Figures S9–S11). IAdir is the intensity from directly excited acceptor molecules using pulse-interleaved excitation with an orange laser.[27b,31] The two-parameter histograms shown in Figure a are highly reproducible from one run to the next, and this derives, in part, from the purity of the samples.
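The per-burst quantities named above can be sketched in a few lines. The expressions used here are the commonly used intensity-based forms, E = IA/(IA + γID), S = (IA + γID)/(IA + γID + IAdir), and γ = (ΦA ηA)/(ΦD ηD); they are assumed for illustration and are not copied from this work, whose analysis was carried out with custom IgorPro code. The leakage correction with a single coefficient is likewise a hypothetical simplification.

```python
def gamma_factor(phi_d, phi_a, eta_d, eta_a):
    """Assumed correction factor: ratio of acceptor to donor yield times detection efficiency."""
    return (phi_a * eta_a) / (phi_d * eta_d)

def correct_leakage(i_a_raw, i_d, leak_fraction):
    """Subtract donor signal that leaks into the acceptor channel.

    leak_fraction is a hypothetical calibration constant (acceptor-channel
    counts per donor-channel count for a donor-only sample).
    """
    return i_a_raw - leak_fraction * i_d

def fret_efficiency(i_d, i_a, gamma):
    """Intensity-based FRET efficiency for one burst (assumed standard form)."""
    return i_a / (i_a + gamma * i_d)

def stoichiometry(i_d, i_a, i_a_dir, gamma):
    """Stoichiometry from donor-excitation and direct acceptor-excitation signals."""
    donor_excitation_total = i_a + gamma * i_d
    return donor_excitation_total / (donor_excitation_total + i_a_dir)

# Hypothetical burst: photon counts are made up for illustration.
gamma = gamma_factor(phi_d=1.0, phi_a=1.0, eta_d=1.0, eta_a=1.0)
i_a = correct_leakage(i_a_raw=46.0, i_d=60.0, leak_fraction=0.1)
print(fret_efficiency(60.0, i_a, gamma), stoichiometry(60.0, i_a, 100.0, gamma))
```

In this picture, bursts with S near 0.5 carry one donor and one acceptor, which is why the two-parameter E versus S histogram is useful for discarding donor-only and acceptor-only molecules.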

Details of All-Atom Simulations

All-atom simulations of Httex1 constructs were performed with the CAMPARI simulation package (http://campari.sourceforge.net) utilizing the ABSINTH implicit solvation model and force field paradigm.[14,60] Simulations were based on the abs3.2_opls.prm parameter set and were combined with temperature replica exchange in order to enhance sampling. The temperature schedule used was T = [288, 293, 298, 305, 310, 315, 320, 325, 335, 345, 360, 375, 390, 405] K. A total of 6.15 × 10^7 steps were performed for each simulation. Here, a step refers to either a temperature replica exchange swap or a Metropolis Monte Carlo move. The first 10^7 steps were taken as equilibration steps. Observables were collected every 5 × 10^3 steps during the last 5.15 × 10^7 steps of the simulation to use for further analysis. Simulations were performed in droplets with radii of 150 Å. This radius was chosen to ensure against confinement artifacts that arise due to too small a droplet. Excess and neutralizing Na+ and Cl– ions were modeled explicitly with an excess NaCl concentration of 5 mM. The specific sequences used were ATLEKLMKAFESLKSF-Qn-P11-QLPQPPPQAQPLLPQPQ-P10-GPAVAEEPLHRP, where n = 15, 23, 37, 43, and 49. The N- and C-termini were left uncapped for consistency with the experimental constructs. The sequences were simulated without the double cysteine residues used for dye labeling. Dyes were added to the simulated ensembles post facto as described below.
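The sampling protocol above implies a simple step budget, which the following sketch works through. The variable names are illustrative and are not CAMPARI keywords; only the numbers come from the text.

```python
# Replica-exchange temperature schedule in kelvin, as listed in the text.
temperature_schedule = [288, 293, 298, 305, 310, 315, 320, 325,
                        335, 345, 360, 375, 390, 405]

total_steps = 61_500_000          # 6.15 x 10^7 steps per simulation
equilibration_steps = 10_000_000  # first 10^7 steps discarded
sampling_interval = 5_000         # observables collected every 5 x 10^3 steps

# Steps that contribute to the analysis, and the number of stored samples.
production_steps = total_steps - equilibration_steps
n_samples = production_steps // sampling_interval

print(len(temperature_schedule), production_steps, n_samples)
```

The production window of 5.15 × 10^7 steps therefore corresponds to 10,300 collected samples per replica, assuming every sampling interval yields one stored frame.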

Addition of Dyes to Simulated Ensembles

In order to add dyes post facto, our in-house program COCOFRET was used. For each dye pair and polyQ length combination, COCOFRET utilizes the atomistic simulation trajectories, dye rotamer libraries, residue positions at which to add the dyes, and the Förster radius, R0, in order to determine the mean FRET efficiency for each conformation that is consistent with the inclusion of the dye pair. Explicitly, for each conformation, we attempted 100 independent attachments of the Alexa 488 dye at position 2 and the Alexa 594 dye at one of the three C-terminal dye positions. Dye rotamers were randomly chosen from the HandyFRET rotamer libraries and dyes were attached to the protein such that the carbon–sulfur–carbon angle was approximately ideal.[61] A protein + dye conformation was accepted if no steric clashes were observed between the protein and the dye. A steric clash was defined as any protein atom being within the solvation shell of any dye atom. Here, we set the solvation shell of each dye atom to be 5 Å, except for the maleimide atoms, which were set to 2 Å in order to account for the connectivity of the protein and dye. All retained protein + Alexa 488 conformations were combined with all retained protein + Alexa 594 conformations. Conformations of the protein + Alexa 488 + Alexa 594 system were retained if no steric clashes were observed between the dyes. The FRET efficiencies corresponding to these conformations were then calculated using the Förster formula, and the mean and standard error associated with these FRET efficiencies were computed. Distances between dyes were calculated using the positions of the C19 atoms of Alexa 488 and Alexa 594, as defined by the HandyFRET AF488.pdb and AF594m.pdb files (http://karri.anu.edu.au/handy/rl.html), respectively. The mean FRET efficiency was recorded if the standard error was less than 0.005. If this criterion was not met, then the above process was repeated until the standard error was less than 0.005. 
However, if this criterion was not met after 10 trials then a mean FRET efficiency value was not recorded for the given conformation. Given that our goal was to construct simulated ensembles that are consistent with all three mean FRET efficiencies measured for each polyQ length, only conformations that had mean FRET efficiencies for all three dye pairs were kept for use in the reweighting procedure described next.
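The acceptance and averaging steps described above can be illustrated with a minimal Python sketch. This is not the COCOFRET code, which is not shown in the text. The function names, the array layout, and the `sample_distances` callable are hypothetical, and pooling the accepted placements across repeated trials is an assumption about how the "repeat until converged" step is carried out. The Förster radius is left as a parameter because its value is not stated in this section.

```python
import numpy as np

def has_clash(protein_xyz, dye_xyz, shell):
    """Return True if any protein atom lies within the solvation shell of any dye atom.

    protein_xyz: (n_protein, 3) coordinates in Å
    dye_xyz:     (n_dye, 3) coordinates in Å
    shell:       (n_dye,) per-dye-atom shell radii in Å
                 (e.g. 5 for dye atoms, 2 for maleimide linker atoms)
    """
    shell = np.asarray(shell, dtype=float)
    dist = np.linalg.norm(protein_xyz[:, None, :] - dye_xyz[None, :, :], axis=-1)
    return bool(np.any(dist < shell[None, :]))

def forster_efficiency(r, r0):
    """Förster formula: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (np.asarray(r, dtype=float) / r0) ** 6)

def mean_and_sem(distances, r0):
    """Mean FRET efficiency and its standard error over a set of inter-dye distances."""
    eff = forster_efficiency(distances, r0)
    if eff.size < 2:
        return eff.mean(), np.inf  # standard error is undefined for one sample
    return eff.mean(), eff.std(ddof=1) / np.sqrt(eff.size)

def converged_mean(sample_distances, r0, max_sem=0.005, max_trials=10):
    """Repeat dye placement until the standard error drops below max_sem.

    sample_distances: hypothetical callable that performs one round of dye
    attachment attempts and returns the inter-dye distances (Å) of all
    clash-free dye-pair placements as a 1-D array.
    Returns the mean efficiency, or None if not converged after max_trials.
    Assumption: accepted placements are pooled across trials.
    """
    pooled = np.empty(0)
    for _ in range(max_trials):
        pooled = np.concatenate([pooled, np.asarray(sample_distances(), dtype=float)])
        if pooled.size == 0:
            continue  # no clash-free placement found in this round
        mean, sem = mean_and_sem(pooled, r0)
        if sem < max_sem:
            return mean
    return None
```

In this sketch, a conformation for which `converged_mean` returns `None` for any of the three dye pairs would be dropped before reweighting, mirroring the filtering rule stated above.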

Reweighting Simulated Ensembles To Match Mean smFRET Efficiencies

The simulated ensembles were reweighted to match all three mean FRET efficiencies, ⟨EFRET⟩, measured for a given polyQ length using the maximum entropy method of Leung et al.[18] This method maximizes entropy (i.e., tries to give all conformations similar weights) while minimizing the difference between the simulated and experimental ⟨EFRET⟩ values, and it yields a unique global solution. Here, the error in the experimental FRET efficiencies was set to 0.02.[62] To determine the simulated temperature that best matches the experimental results, the decrease from maximum entropy (ΔS) was calculated. This calculation is insensitive to the number of conformations used, which is important given that the number of conformations varies for each temperature and polyQ length combination. Here, ΔS is calculated using

$$\Delta S = -\sum_{i=1}^{n_c} p_i^{\mathrm{post}} \ln\left(\frac{p_i^{\mathrm{post}}}{p_i^{\mathrm{prior}}}\right)$$

Here, $p^{\mathrm{post}}$ is the posterior vector of weights determined from the maximum entropy method and $p^{\mathrm{prior}}$ is the vector of equal weights given to each of the $n_c$ conformations. The mean free energy change is given by kTΔS, where k is the Boltzmann constant and T is the temperature. Thus, if ΔS = −1, then this is equivalent to adding an auxiliary reweighting term to the potential function that contributes 1kT to the overall energy function.
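The reweighting idea and the ΔS diagnostic can be sketched in Python as follows. This is a generic maximum entropy reweighting with a Gaussian error term, solved by a damped Newton iteration on the Lagrange multipliers; it is offered as an illustration under stated assumptions and is not the specific algorithm of Leung et al.[18] The function names and the input layout (one row per conformation, one column per dye pair) are hypothetical.

```python
import numpy as np

def delta_s(p_post, p_prior=None):
    """Relative entropy: ΔS = -sum_i p_post_i * ln(p_post_i / p_prior_i).

    Zero when the weights are unchanged, negative otherwise.
    A uniform prior over all conformations is used if none is given.
    """
    p_post = np.asarray(p_post, dtype=float)
    if p_prior is None:
        p_prior = np.full(p_post.size, 1.0 / p_post.size)
    mask = p_post > 0  # the term 0 * ln(0) is taken as 0
    return -np.sum(p_post[mask] * np.log(p_post[mask] / p_prior[mask]))

def maxent_reweight(E, E_exp, sigma=0.02, n_iter=200, tol=1e-10):
    """Reweight conformations so that weighted mean efficiencies approach E_exp.

    E:     (n_conf, n_obs) per-conformation mean FRET efficiencies
    E_exp: (n_obs,) experimental mean efficiencies
    sigma: assumed Gaussian error on each experimental value
    Weights have the form w_i ∝ exp(-E_i · λ); λ minimizes the convex function
    Γ(λ) = ln mean_i exp(-E_i · λ) + λ · E_exp + 0.5 * sigma^2 * |λ|^2.
    """
    E = np.asarray(E, dtype=float)
    E_exp = np.asarray(E_exp, dtype=float)
    lam = np.zeros(E.shape[1])

    def weights(l):
        logw = -E @ l
        w = np.exp(logw - logw.max())  # shift for numerical stability
        return w / w.sum()

    def gamma(l):
        logw = -E @ l
        m = logw.max()
        return m + np.log(np.mean(np.exp(logw - m))) + l @ E_exp + 0.5 * sigma**2 * (l @ l)

    for _ in range(n_iter):
        w = weights(lam)
        mean = w @ E
        grad = E_exp - mean + sigma**2 * lam
        if np.linalg.norm(grad) < tol:
            break
        centered = E - mean
        hess = (centered * w[:, None]).T @ centered + sigma**2 * np.eye(E.shape[1])
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while gamma(lam - t * step) > gamma(lam) and t > 1e-8:
            t *= 0.5  # backtrack until Γ does not increase
        lam = lam - t * step
    return weights(lam)
```

With weights `w = maxent_reweight(E, E_exp)`, the value `delta_s(w)` measures how far the ensemble had to be perturbed to match experiment. Because the prior is normalized over the same set of conformations, duplicating every conformation leaves ΔS unchanged, which is the insensitivity to ensemble size noted above.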

79 in total

1. A Rigorous and Efficient Method To Reweight Very Large Conformational Ensembles Using Average Experimental Data and To Determine Their Relative Information Content.

Authors: Hoi Tik Alvin Leung; Olivier Bignucolo; Regula Aregger; Sonja A Dames; Adam Mazur; Simon Bernèche; Stephan Grzesiek
Journal: J Chem Theory Comput Date: 2015-12-02 Impact factor: 6.006

2. Aggregation landscapes of Huntingtin exon 1 protein fragments and the critical repeat length for the onset of Huntington's disease.

Authors: Mingchen Chen; Peter G Wolynes
Journal: Proc Natl Acad Sci U S A Date: 2017-04-11 Impact factor: 11.205

3. Single-step detection of mutant huntingtin in animal and human tissues: a bioassay for Huntington's disease.

Authors: Andreas Weiss; Dorothée Abramowski; Miriam Bibel; Ruth Bodner; Vanita Chopra; Marian DiFiglia; Jonathan Fox; Kimberly Kegel; Corinna Klein; Stephan Grueninger; Steven Hersch; David Housman; Etienne Régulier; H Diana Rosas; Muriel Stefani; Scott Zeitlin; Graeme Bilbe; Paolo Paganetti
Journal: Anal Biochem Date: 2009-08-06 Impact factor: 3.365

4. Modulation of polyglutamine conformations and dimer formation by the N-terminus of huntingtin.

Authors: Tim E Williamson; Andreas Vitalis; Scott L Crick; Rohit V Pappu
Journal: J Mol Biol Date: 2009-12-21 Impact factor: 5.469

5. Accounting for dye diffusion and orientation when relating FRET measurements to distances: three simple computational methods.

Authors: Katarzyna Walczewska-Szewc; Ben Corry
Journal: Phys Chem Chem Phys Date: 2014-06-28 Impact factor: 3.676

6. One-pot semisynthesis of exon 1 of the Huntingtin protein: new tools for elucidating the role of posttranslational modifications in the pathogenesis of Huntington's disease.

Authors: Annalisa Ansaloni; Zhe-Ming Wang; Jae Sun Jeong; Francesco Simone Ruggeri; Giovanni Dietler; Hilal A Lashuel
Journal: Angew Chem Int Ed Engl Date: 2014-01-20 Impact factor: 15.336

7. Single molecule study of the intrinsically disordered FG-repeat nucleoporin 153.

Authors: Sigrid Milles; Edward A Lemke
Journal: Biophys J Date: 2011-10-05 Impact factor: 4.033

8. Critical nucleus size for disease-related polyglutamine aggregation is repeat-length dependent.

Authors: Karunakar Kar; Murali Jayaraman; Bankanidhi Sahoo; Ravindra Kodali; Ronald Wetzel
Journal: Nat Struct Mol Biol Date: 2011-02-13 Impact factor: 15.369

Review 9. Proteostasis in striatal cells and selective neurodegeneration in Huntington's disease.

Authors: Julia Margulis; Steven Finkbeiner
Journal: Front Cell Neurosci Date: 2014-08-07 Impact factor: 5.505

10. Pathogenic and non-pathogenic polyglutamine tracts have similar structural properties: towards a length-dependent toxicity gradient.

Authors: Fabrice A C Klein; Annalisa Pastore; Laura Masino; Gabrielle Zeder-Lutz; Hélène Nierengarten; Mustapha Oulad-Abdelghani; Danièle Altschuh; Jean-Louis Mandel; Yvon Trottier
Journal: J Mol Biol Date: 2007-05-18 Impact factor: 5.469
