
Reprogramming cells with synthetic proteins.

Xiaoxiao Yang, Vikas Malik, Ralf Jauch.

Abstract

Conversion of one cell type into another by forcibly expressing specific cocktails of transcription factors (TFs) has demonstrated that cell fates are not fixed and that cellular differentiation can be a two-way street with many intersections. These experiments also illustrated the sweeping potential of TFs to "read" genetically hardwired regulatory information even in cells where they are not normally expressed and to access and open up tightly packed chromatin to execute gene expression programs. Cellular reprogramming enables the modeling of diseases in a dish and the testing of drug efficacy and toxicity in patient-derived cells and, ultimately, could enable cell-based therapies to cure degenerative diseases. Yet, producing terminally differentiated cells that fully resemble their in vivo counterparts in sufficient quantities is still an unmet clinical need. While efforts are being made to reprogram cells nongenetically by using drug-like molecules, defined TF cocktails still dominate reprogramming protocols. Therefore, the optimization of TFs by protein engineering has emerged as a strategy to enhance reprogramming to produce functional, stable and safe cells for regenerative biomedicine. Engineering approaches have focused on Oct4, MyoD, Sox17, Nanog and Mef2c and range from chimeric TFs with added transactivation domains, through designer transcription activator-like effectors that activate endogenous TFs, to reprogramming TFs with rationally engineered DNA recognition principles. Possibly, applying the complete toolkit of protein design to cellular reprogramming can help to remove the hurdles that have thus far impeded the clinical use of cells derived from reprogramming technologies.

Year:  2015        PMID: 25652623      PMCID: PMC4430937          DOI: 10.4103/1008-682X.145433

Source DB:  PubMed          Journal:  Asian J Androl        ISSN: 1008-682X            Impact factor:   3.285


SWITCHING CELL FATES WITH TRANSCRIPTION FACTOR PROTEINS

The notion that organismic development does not stubbornly follow a predetermined path but is a rather plastic process was put forward long before DNA was recognized as the carrier of inheritable information.1 Nearly 100 years later, cellular reprogramming was first demonstrated when live tadpoles arose after transferring the nuclei of frog intestine-derived somatic cells into oocytes.2 This observation led to the realization that even fully differentiated cells must still contain the complete genetic blueprint required to build a whole organism. That transcription factors (TFs) are remarkably powerful drivers of cellular reprogramming, that is, of the conversion of one cell type into another, was first demonstrated by turning a mouse fibroblast into a muscle cell with a cDNA encoding a single TF, MyoD (Myod1).3 Subsequently, a cocktail of four TFs, Krüppel-like factor 4 (Klf4), c-Myc, Sry (sex determining region Y)-box 2 (Sox2) and octamer binding protein 4 (Oct4), was discovered to produce induced pluripotent stem cells (iPSCs) when forcibly expressed in fibroblasts of mouse4 and human.56 Since this seminal work, cellular reprogramming has become a mainstream research activity. Like embryonic stem cells derived from the blastocyst, iPSCs can be passaged in culture indefinitely and, given appropriate growth conditions, can be differentiated into all cell types of the body.7 The latter can straightforwardly be demonstrated by transplanting iPSCs back into blastocysts, but it is a challenge to recapitulate this process in vitro. The MyoD experiment already demonstrated that differentiated cells do not have to be pushed all the way up the “Waddington canal”8 to a completely undifferentiated cell type before subsequent re-specification to an alternative cell fate. Instead, the MyoD conversion of fibroblasts directly to muscle indicates that direct transdifferentiation can be accomplished without a pluripotent intermediate. 
Yet, analogous to the MyoD example, lineage conversion was initially only reported between cell types originating from a similar developmental trajectory such as the interconversion of two immune cell types, B cells (lymphoid cells) to macrophages (myeloid cells),9 exocrine to endocrine pancreatic cells,10 glial cells to neurons,11 brown fat to muscle cells12 and fibroblasts to cardiomyocytes.13 All of these transdifferentiation events are within the same germ layer (i.e. mesoderm to mesoderm or neurectoderm to neurectoderm). Eventually, conversions between cells originating from different germ layers could also be achieved, as fibroblasts as well as hepatocytes could be directly converted into functional neurons with defined TF cocktails.141516 Apparently, even pronounced reprogramming barriers separating very distant cellular states can be crossed with small sets of lineage-specifying TF proteins. The appreciation of this astonishing developmental plasticity sparked a plethora of studies attempting to re-direct cell fates. For example, blood cells,17 endothelial cells,18 hepatocytes,192021 Sertoli cells22 and thymic epithelial cells23 could be successfully generated with TF cocktails. Readers are referred to excellent reviews that discuss the progress of cellular reprogramming and lineage conversions in more detail.24252627282930313233 Remarkably, particularly potent reprogramming factors such as Oct4 were found to be able to induce reprogramming alone in certain cell types and, with small-molecule supplementation, in others.343536 Moreover, a few studies have reported that TFs can be omitted altogether, as induced pluripotent or neuronal cells could be produced solely with small molecules.3738 However, the efficiency of chemical reprogramming is considerably lower than that of TF-based approaches and applications currently remain limited to mice. Hence, TF-based cell lineage conversions continue to be the most effective and versatile approach. 
Therefore, we will focus our discussion on efforts to enhance TF based cell lineage conversions through protein engineering.

ROADBLOCKS ON THE WAY TO FUNCTIONAL CELLS

The excitement sparked by cellular reprogramming is fueled by its promise to lead to new clinical applications. One strategy is to conduct “in vitro clinical trials.”2639 That is, cells obtained from patients through biopsies, blood or urine samples are differentiated into disease-relevant cell types. Next, preselected drugs or drug libraries can be assessed for their toxicity and potential to exert curative effects on those cells. It is hoped that this approach will accelerate personalized therapies, facilitate drug discovery and avoid the prescription of drugs that are toxic to, or ineffective in, certain patient populations. Moreover, reprogramming technologies can be used to model human diseases in a dish. Here, the behavior of cells derived from patients is compared to that of cells from healthy donors. If disease-causing mutations are known, the mutation can be engineered using genome editing technologies and genetically matched isogenic cell lines can be studied. This way, diseases can be understood at an unprecedented depth, cellular pathways can be mapped, biomarkers can be discovered and therapeutic strategies can be developed. Lastly, the holy grail of stem cell research is to produce functional cells that can be transplanted back into patients to remedy degenerative diseases.40 Encouragingly, diseases could be cured through cell therapies in animal models. For example, gene-corrected iPSC-derived hematopoietic progenitors transplanted back into humanized sickle cell anemia mouse models could cure the animals.41 This has led to the hope that diseases caused by deficiencies in well-defined cell types such as type 1 diabetes,42 Parkinson's disease43 and retinal degeneration44 are curable with cell-based therapies. Although hematopoietic stem cells have been used in bone marrow transplants since the 1950s, cell therapies in humans still pose major challenges, and daunting roadblocks remain. 
Most importantly, safety has to be rigorously assessed before transplanting the reprogrammed cells. iPSCs resemble cancer cells in many ways and form teratomas when injected into mice. This poses a significant risk, as incomplete differentiation and remnant pluripotent cells could potentially lead to cancerous growth.4546 Collectively, avoiding insertional mutagenesis, oncogenic TFs and pluripotent reprogramming intermediates could solve this problem. Furthermore, it is often problematic to terminally differentiate cells so that they fully replicate the function of the cells matured in vivo. Cells have to be stable and need to be expandable so that they can be produced in the quantities needed to support transplantation medicine. Ideally, reprogramming strategies should leave the genome unscathed, utilize cells that are genetically matched to the recipient with just the disease-causing loci corrected and produce an epigenetic state identical to that of the tissue-embedded cell they are meant to replace. While optimized factor cocktails, novel culture conditions and small molecule compounds will likely further advance reprogrammed cells toward the clinic, we surmise that the engineering of the reprogramming TFs themselves provides a viable strategy to be further explored.

DESIGNING BETTER PROTEINS

Bioengineering proteins to either enhance their activity or to install completely novel functions has been successfully accomplished in numerous instances. Day-to-day laboratory operations utilize a range of artificially enhanced proteins. Those include DNA polymerases with thoroughly optimized fidelity,474849 proteases with engineered activity and substrate specificity50 and fluorescent proteins with increased brightness.5152 Likewise, protein therapeutics are often rationally improved. In particular, coagulation factors to treat bleeding disorders have been bioengineered in a variety of ways.53 For example, factor IX was engineered to have prolonged activity by fusing it to an Fc fragment54 and the coagulation factor VIIIa was rationally mutagenized for inactivation resistance and optimized secretion profiles.5556 More ambitious goals include the engineering of whole pathways leading to the biotechnological synthesis of new products.5758 What are the methods protein designers use to achieve their engineering goals? A rather simple way is to concatenate functional protein domains or even whole proteins. Examples include fusions of green fluorescent protein with antibody fragments that increase their brightness59 or attaching effector domains such as nucleases to artificial TFs with customized DNA sequence preferences.60 In addition, functional regions such as phosphorylation sites, protease cleavage sites, and signaling sequences can be rationally modified to install desired properties.5556 Most commonly, rational and randomization strategies are combined to achieve the desired results. Knowledge of the protein's structure, sequence conservation and functional insights gained from site-directed mutagenesis experiments can guide the selection of functionally important structural elements. Such elements could be individual amino acids or small sets of amino acids, secondary structure elements or subdomains. 
Frequently, design efforts target catalytic centers, substrate binding pockets or macromolecular contact interfaces. Those elements can then be modified taking biophysical parameters such as charge, size, and hydrophobicity, as well as functional data and sequence information of homologs into account. Yet, rationally predicting how a specific structural modification affects protein activity is a daunting task, as our understanding of the structural basis for protein function remains limited. Therefore, protein designers often subject structural elements earmarked for protein optimization to directed evolution.61 This approach requires a carefully designed randomization strategy, which can include error-prone polymerase chain reaction,62 site-directed mutagenesis with randomized oligos63 and “chimeragenesis,” that is, the recombination of protein fragment libraries.64 Libraries of modified proteins then undergo a screening and selection procedure to identify variants with improved functionality. Selection systems include binding assays such as phage display,65 ribosome display,66 enzymatic assays,67 tests for protein stability,68 genetic complementation combined with phenotypic read-outs69 and in vitro compartmentalization.70 Obviously, selection system design is critical, as desired protein variants would escape detection if the screen cannot rigorously discriminate between enhanced and unwanted variants of the designed protein.61 Remarkably, efforts are being made to design proteins entirely from scratch using fragment libraries of nonnatural peptide sequences with minimal architectural constraints. Given the mind-boggling number of theoretically possible protein sequences, this seems like a Herculean feat. Nevertheless, de novo design has led to the creation of some functional sequences.7172 Thus far, examples of the engineering of TF proteins are still rather rare. 
Here we ask whether the toolkit of protein engineering could be employed to design reprogramming TFs to more effectively engineer cell lineage conversions and to bring progress to regenerative biomedicine.
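
The randomize, screen and select cycle of directed evolution described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the parent sequence, the earmarked positions and the scoring function `toy_score` are hypothetical placeholders and do not represent any real protein or any selection system from the cited studies.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, positions, rng):
    """Randomize one earmarked position (a stand-in for randomized oligos)."""
    pos = rng.choice(positions)
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def toy_score(seq, target):
    """Hypothetical screen: number of positions identical to an arbitrary target."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, positions, target,
                       rounds=20, library_size=50, seed=0):
    """Iterate library generation and selection; return best variant and score history."""
    rng = random.Random(seed)
    best = parent
    history = [toy_score(best, target)]
    for _ in range(rounds):
        # Library generation: single-site variants of the current best sequence.
        library = [mutate(best, positions, rng) for _ in range(library_size)]
        # Selection: keep the top-scoring variant only if it beats the parent.
        candidate = max(library, key=lambda s: toy_score(s, target))
        if toy_score(candidate, target) > toy_score(best, target):
            best = candidate
        history.append(toy_score(best, target))
    return best, history


# Toy run: only positions 2-5 (0-based) are earmarked for randomization.
best, history = directed_evolution("MKTAYIAKQR", [2, 3, 4, 5], "MKWWWWAKQR")
```

The sketch also shows why the screen matters: if `toy_score` could not distinguish improved from unwanted variants, the selection step would have nothing to act on.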

ENGINEERING SYNTHETIC REPROGRAMMING FACTORS

Enhancing reprogramming efficiency with potent transactivation domains

The optimization of reprogramming strategies has been a priority for many laboratories as the original protocol was rather inefficient. Efficiency enhancements could be achieved by supplementing the media,73 altering the factor cocktails,74757677 changing the sequence of factor addition,78 adding small molecules35798081828384 or removing reprogramming roadblocks.8586 In addition, some studies resorted to protein engineering to improve reprogramming. Based on the assumption that reprogramming TFs mainly act by inducing mRNA synthesis of their target genes, several engineering efforts were made to increase the transactivation potential of TFs by fusing them to potent transactivation domains (TADs) to create TAD-TFs.
Table 1

Engineered reprogramming factors

Figure 1

Enhancing reprogramming efficiency with TAD-TF chimeras. (a) Domain structure of VP16, MyoD and YAP from which TADs were derived. (b) Domain structure of chimeric Oct4-TAD proteins demonstrated to potently enhance the reprogramming of somatic cells to iPSCs.9397103 YAP1 was drawn according to Uniprot-ID P46938 and VP16 and MyoD according to Hirai et al.,99 2010. TAD: transactivation domain; bHLH: basic helix-loop-helix domain; POU: Pit1-Oct-Unc-86 related domain; TEAD-IA: interaction region with the TF TEAD; TFs: transcription factors; Oct4: octamer binding protein 4; iPSCs: induced pluripotent stem cells; VP16: viral protein 16; YAP: yes-associated protein.

The viral protein 16 (VP16) is a 490-amino-acid TF protein of the herpes simplex virus with strong transactivating activity. Its potent TAD was mapped and molecularly dissected more than 25 years ago and found to consist of an acidic C-terminal region spanning approximately 80 amino acids.8788 Soon after its discovery, the VP16-TAD was utilized to engineer chimeric TFs with enhanced activity.89 More recently, VP16 has also been utilized to enable cellular reprogramming. When the VP16-TAD was fused to the Xenopus ortholog of the pancreatic and duodenal homeobox1 (Pdx1), the chimeric protein could induce the conversion of liver cells to pancreatic cells in transgenic tadpoles.90 A similar Pdx1-VP16 fusion induced insulin biosynthesis and improved glucose tolerance in mouse diabetic models.9192 In an effort to enhance iPSC formation, the VP16-TAD was fused to pluripotency reprogramming factors.93 In this study, a core fragment of the VP16-TAD (residues 446-490) was attached to the Oct4, Sox2, Klf4 and Nanog TFs, separated by a glycine-rich linker. With the exception of Klf4, the engineered TAD-TFs substantially outperformed the wild-type proteins with regard to both the efficiency and the kinetics of iPSC generation in mouse and human cells. 
Moreover, Oct4-VP16 alone could efficiently reprogram mouse embryonic fibroblasts into germline-competent iPSCs.93 An Oct4 construct containing three C-terminal VP16 copies arranged in tandem exhibited the highest efficiency. A separate study also reported that fusions of VP16 to mouse Oct4, human Oct4 and Xenopus Xlpou91 could support reprogramming as well as rescue Oct4 null ESCs.94 However, the authors did not observe a substantial enhancement in the reprogramming efficiency by the engineered proteins. The differences between the two studies could be caused by variations in the reprogramming conditions and construct design, as different VP16 fragments, linkers, and VP16 copy numbers were used. The potency of the VP16-TAD for iPSC generation was again highlighted when it was shown that fusions of VP16 to the truncated DNA binding high-mobility group (HMG) domain of Sox2 imparted some reprogramming activity to the otherwise inactive HMG fragments.95 Inspired by the remarkable potency of MyoD to single-handedly convert fibroblasts into muscle cells,96 Hirai et al.97 asked whether the TAD of MyoD can enhance reprogramming to pluripotency. MyoD, a TF of the basic helix-loop-helix (bHLH) family,98 contains a ~60 amino acid TAD at its N-terminus.99 The authors generated a series of MyoD TAD fragments and produced chimeric proteins by attaching them to the full-length mouse Oct4 protein. A chimeric protein consisting of a MyoD fragment (residues 1–62) attached to the N-terminus of Oct4 (termed M3O) was found to strongly amplify the number of iPSC colonies in reprogramming assays using mouse and human cells. However, increasing the copy number of the M3 fragment was detrimental to iPSC generation. Notably, neither Klf4 nor Sox2 could be enhanced with the M3 fragment. Rather, M3-Sox2 and M3-Klf4 fusions inhibited reprogramming. Likewise, replacing M3 with the TADs of Gata4, Mef2c, Tax and Tat eliminated the reprogramming activity of Oct4. 
Collectively, the engineered M3O factor appears to support reprogramming in a more specific and context-dependent manner, in contrast to the VP16 fusions that allow a more flexible design.97 It was subsequently found that when culturing the cells in serum-free media at low density, M3O-containing cocktails could enhance iPSC formation to 26% and 7% of the transfected mouse or human cells, respectively.100 Moreover, when tested side-by-side, the M3O construct outperformed Oct4-VP16 fusions.100 Consistently, in a study done by another group, M3O was found to accelerate reprogramming based on a cocktail using modified messenger RNAs.101 Motivated by the success in using the M3 domain to optimize Oct4 activity, Hirai et al.102 went on to ask whether factors involved in cardiac transdifferentiation can be improved through M3 chimeras. While M3 fusions to Gata4, Tbx5 and Hand2 had either no or adverse consequences, M3-Mef2c chimeras led to a 15-fold increase in the number of beating clusters of induced cardiomyocytes (iCMs). Mef2c-VP16 fusions also showed some increase in iCM formation, albeit to only 20% of the M3-Mef2c levels. More recently, the TAD of the yes-associated protein (YAP) was used to engineer iPSC-inducing TFs.103 YAP is a downstream effector of the Hippo signaling pathway and co-activates transcription in concert with TFs of the TEAD family by recruiting histone methyltransferases.104 YAP promotes oncogenesis as well as the self-renewal of ESCs via its potent transactivation activity.105106 Its C-terminal TAD was found to activate reporter genes as potently as the VP16 TAD.107 Zhu et al.103 fused the C-terminal YAP residues 275-489 to the C-termini of Oct4, Sox2 and Nanog to generate the engineered proteins Oy, Sy and Ny. When this cocktail was used for iPSC induction in combination with native Klf4, the reprogramming efficiency rose from <1% to ~40%. 
Furthermore, the reprogramming kinetics were markedly accelerated, with iPSC colonies appearing on the day after switching the transfected fibroblasts to ESC medium. The Sox2-YAP fusion was reported to be the most critical of the three modified TFs for this acceleration. What do the three TADs used to engineer reprogramming TFs have in common? They consist of predominantly acidic and hydrophobic residues, leading to a strongly negative net charge. However, none of the TADs has been structurally characterized, presumably due to their flexible structure in the absence of binding partners. Therefore, the molecular details of how those TADs create a chromatin environment instructive for the mRNA synthesis of nearby genes remain unclear. While a series of TAD interaction partners were previously detected,99108 the affinity and selectivity of those TADs for co-regulators such as p300 or the mediator complex have not yet been studied in a systematic manner. Hence, whether those TADs mediate a general or a TAD-specific mechanism of transcriptional activation awaits further exploration. Collectively, there appear to be no obvious rules for how TAD-TF chimeras should be designed to engineer lineage-converting TFs. Rather, optimal constructs had to be empirically produced for each TAD-TF combination.939497 For example, increasing TAD copy numbers can either boost93 or impede TAD-TF activity.9497109 Further parameters to be optimized include the length of the TAD fragment used, the position of the TADs at either the N- or C-termini of the TFs, and the inclusion of linker sequences.
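
The shared composition of the three TADs (acidic and hydrophobic residues, negative net charge) can be estimated from sequence alone. The sketch below is a deliberately crude approximation: it counts Lys/Arg as +1 and Asp/Glu as -1 and ignores histidine, the termini and pKa shifts. The peptide `toy_peptide` is an arbitrary, hypothetical sequence chosen for illustration; it is not the sequence of VP16, MyoD or YAP.

```python
def net_charge(seq):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E."""
    positives = sum(seq.count(aa) for aa in "KR")
    negatives = sum(seq.count(aa) for aa in "DE")
    return positives - negatives


def hydrophobic_fraction(seq):
    """Fraction of residues belonging to a simple hydrophobic set."""
    hydrophobic = set("AVILMFWY")
    return sum(aa in hydrophobic for aa in seq) / len(seq)


# Hypothetical acidic/hydrophobic peptide, used only to exercise the functions.
toy_peptide = "EEDLFSAEDWLDKV"
charge = net_charge(toy_peptide)            # 1 basic residue, 6 acidic residues
fraction = hydrophobic_fraction(toy_peptide)
```

Applied to real TAD fragments of different lengths, such a calculation would allow candidate fragments to be compared on an equal footing before constructs are built, although it says nothing about co-regulator binding.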

Inducing endogenous reprogramming factors with TAL effectors

Artificial proteins based on C2-H2 zinc-finger proteins (ZFPs), transcription activator-like effectors (TALEs) and RNA-guided clustered regularly interspaced short palindromic repeat (CRISPR) Cas (CRISPR associated) can be designed to target genomic loci with high specificity. Typically, those proteins are constructed to contain nuclease effector domains that enable genome editing at single base-pair resolution (reviewed in references 110 and 111). While off-target effects had been a lurking concern, whole genome sequencing studies demonstrated that unwanted modifications are very rare.112113 Recently, designer TALEs (dTALEs) and ZFPs have also been used to engineer transcriptional activators.114115 TALEs consist of 33-34 amino acid repeat domains with hypervariable residues at positions 12 and 13.116 The identity of the dipeptide at positions 12/13 determines a recognition code (HD = C, NG = T, NI = A, NS = A, C, G or T, NN = A or G), which allows the rational design of dTALEs that recognize DNA sequences of choice.116 As up to 33 TALE repeat domains can be arranged in tandem, genomic loci can be targeted with high precision. Although dTALE design is somewhat more straightforward than ZFP design, initial efforts were undertaken using ZFP-VP16 fusions constructed to target a sequence proximal to the transcriptional start site of Oct4.117 In another attempt, a KRAB domain fused to a designer ZFP could activate endogenous Oct4 in a series of cell lines.118 This was a surprising observation because the KRAB domain conventionally acts as a transcriptional repressor. dTALE-VP16 fusions designed to bind proximal promoter sequences of SOX2, Klf4, c-MYC and Oct4 could activate reporter constructs, but only dTALEs targeting Klf4 and SOX2 could also activate the endogenous genes in 293FT cells.60 Moreover, dTALEs targeting the proximal promoter of Oct4 could activate the gene in NSCs where it is otherwise silenced. 
However, this strategy required the addition of histone deacetylase or DNA methyltransferase inhibitors, suggesting that some chromatin loosening is needed for the dTALE-TF to access its target site.119 Gao et al.120 asked whether dTALE-TFs can replace conventionally used reprogramming TFs by activating their endogenous counterparts. The authors used VP64-TADs (four tandem repeats of VP16) to engineer designer transcription activators (A-dTF) that target distal enhancers of reprogramming factors.120 Indeed, an A-dTF designed to activate endogenous Oct4 could replace exogenous Oct4 and induce iPSCs in combination with c-Myc, Klf4 and Sox2. While reprogrammed cells appeared faster when using the A-dTF, the overall iPSC colony yield was higher when Oct4 was used directly.120 The authors went on to show that an A-dTF targeting a distal Nanog enhancer could convert epiblast stem cells into ESCs. This study provides an elegant proof-of-concept that TALE-based TFs can replace native reprogramming factors. However, as the sole function of dTALE-TFs is to activate endogenous reprogramming factors that would eventually have to finish the job, it remains to be demonstrated whether this method can enhance cell lineage conversions.
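
The repeat-variable dipeptide code quoted above (HD = C, NG = T, NI = A, NS = A, C, G or T, NN = A or G) lends itself to a small lookup table. The following sketch translates an ordered list of dipeptides into the DNA pattern it is expected to recognize and, conversely, proposes one dipeptide per base for a chosen target. The function names and the use of IUPAC ambiguity letters (R for A or G, N for any base) are choices made here for illustration; the sketch ignores practical design constraints not discussed in the text.

```python
# Recognition code as given in the text, written with IUPAC ambiguity letters.
RVD_CODE = {
    "HD": "C",
    "NG": "T",
    "NI": "A",
    "NS": "N",  # A, C, G or T
    "NN": "R",  # A or G
}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def rvds_to_target(rvds):
    """Translate an ordered list of dipeptides into a DNA pattern."""
    return "".join(RVD_CODE[rvd] for rvd in rvds)


def matches(rvds, dna):
    """Check whether a DNA sequence is compatible with the repeat array."""
    pattern = rvds_to_target(rvds)
    dna = dna.upper()
    return len(dna) == len(pattern) and all(
        base in IUPAC[symbol] for symbol, base in zip(pattern, dna)
    )


def design_rvds(dna):
    """Propose one dipeptide per base; NN is used for G although it also accepts A."""
    choice = {"C": "HD", "T": "NG", "A": "NI", "G": "NN"}
    return [choice[base] for base in dna.upper()]


pattern = rvds_to_target(["HD", "NG", "NI", "NN", "NS"])
array_for_gatc = design_rvds("GATC")
```

Because NN tolerates both A and G, an array designed for a G-containing site also matches the corresponding A-containing site, which `matches` makes explicit.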

Engineering chromatin association of reprogramming factors

The TAD-TFs and TFs endogenously activated by A-dTFs will likely engage the genome in the same manner as the native TFs, as the DNA recognition domain is not modified. So far, only a few engineering efforts have focused on protein interfaces involved in DNA recognition that would alter their genomic binding profile. Still, several swap experiments that install new functions and create engineered reprogramming TFs have been successful. The reprogramming pioneer Weintraub provided the first evidence that the lineage conversion activity of reprogramming TFs can be radically interchanged with strategically placed point mutations at the DNA contact interface.121122 Following the seminal discovery that MyoD alone can induce a myogenic program in fibroblasts,3 Weintraub et al. continued to dissect the molecular basis of its specific activity. MyoD belongs to the bHLH family of TFs whose members bind to short palindromic CANNTG E-box motifs as homo- or heterodimers. By adopting a scissor-like architecture, bHLH TFs bind the major groove of the DNA through the basic regions of helix 1 (ref. 98). The E-box can be bound by most bHLH TFs with similar affinity and many amino acids contacting the DNA are highly conserved. Nevertheless, some subtly different sequence preferences have recently been detected which could contribute to distinctive roles of bHLH TFs in cell fate determinations.123 Weintraub compared the DNA recognition and the reprogramming potential of MyoD and E12, another bHLH factor that is ubiquitously expressed and does not trigger myogenesis.121122 By grafting just three amino acids from MyoD into E12, namely N114A and N115T of the basic region and D124K of the linker (MyoD numbering; residues 6, 7 and 17 according to current bHLH numbering conventions), E12 acquired the ability to convert fibroblasts into muscle cells.121122 As the Ala-Thr dipeptide is conserved in myogenic bHLH TFs, this sequence is critical for executing a myogenic gene expression program. 
Surprisingly, the large degree of sequence variation between MyoD and E12 outside the bHLH domain did not contribute to their cell type specific functions. Rather, just three amino acids at the DNA interaction surface specify their functional diversity. Subsequent studies suggested that the Ala-Thr dipeptide affects the conformation of Arg-111 in the basic region and thereby modulates the access of Arg-111 to the major groove of the DNA binding site.124 Those rearrangements at the DNA-binding interface could translate into allosteric events at other interfaces, such as binding sites for chromatin modifiers, and thereby influence the functional consequences of the binding event.125
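
Two notations from the paragraphs above, the CANNTG E-box motif and point substitutions such as N114A, N115T and D124K, can be made concrete with a short sketch. The DNA string and the protein fragment used below are hypothetical placeholders, not MyoD or E12 sequences; the `offset` parameter, which gives the residue number of the first letter of the fragment, is likewise an illustrative assumption.

```python
import re

# Lookahead so that overlapping E-boxes are also reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def find_eboxes(dna):
    """Return (0-based position, site) for every CANNTG match."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(dna.upper())]


def apply_substitutions(seq, subs, offset=1):
    """Apply substitutions like 'N114A'; offset is the number of seq's first residue."""
    residues = list(seq)
    for sub in subs:
        m = SUBSTITUTION.fullmatch(sub)
        if m is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = m.group(1), int(m.group(2)), m.group(3)
        idx = pos - offset
        if residues[idx] != old:
            raise ValueError(f"{sub}: found {residues[idx]} at position {pos}")
        residues[idx] = new
    return "".join(residues)


sites = find_eboxes("GGCAGCTGTTCACCTGA")

# Placeholder fragment numbered 110-125 with N at 114/115 and D at 124.
toy_fragment = "GGGGNNGGGGGGGGDG"
grafted = apply_substitutions(toy_fragment, ["N114A", "N115T", "D124K"], offset=110)
```

Checking the expected wild-type residue before each substitution guards against the numbering mismatches that arise when two conventions (here MyoD numbering versus bHLH domain numbering) are in use.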
Figure 2

Rational engineering of how reprogramming factors read genomes. (a) Dimeric MyoD structure bound to DNA (pdb-ID 1mdy; ref. 98). The DNA is shown as a gray surface plot and the two MyoD molecules forming the dimer as yellow and brown cartoon. Residues engrafted from MyoD into E12 to turn E12 into a myogenic protein are shown as black ball-and-sticks and are highlighted with a dashed oval. (b) Structural models showing heterodimers of Oct4 (light blue) and Sox2 (yellow) on the canonical motif and Oct4-Sox17 (orange) on the compressed motif. Sox residues that determine the discriminative heterodimerization with Oct4 on canonical and compressed motifs are shown as black ball-and-sticks. Transplanting Lys57 from Sox2 into Sox17 alone turns Sox17 into a potent inducer of pluripotency.142 The structural models were constructed as described in ref. 146 based on coordinates derived from pdb-IDs 3f27 and 1gt0.129130145 To the right of the structural cartoons the domain structure of MyoD versus E12 and Sox2 versus Sox17 is compared. The percentages above the domain plots indicate the amino acid identity between the protein pairs in the N-terminal, DNA binding and C-terminal region. The alignment was performed using sequences derived from Uniprot: MyoD: P10085; E12: P15806; Sox2: P48432; Sox17: Q61473. Oct4: octamer binding protein 4.

Our laboratory has made efforts to scrutinize the mechanism by which proteins of the 20-member Sox family recognize their DNA target sites. Confusingly, all Sox proteins bind a near-identical CATTGT-like sequence126127 and engage DNA by binding to the minor groove to induce a 70° kink using a conserved set of amino acids.128129130131132133134135136 How then can individual members select specific gene-sets to initiate characteristic cell fate decisions? 
The DNA binding HMG domain of Sox proteins not only mediates sequence-specific DNA recognition but is also the main determinant of a partner code enabling selective interactions with other TFs.137138139140141 By conducting quantitative electrophoretic mobility shift assays to study the HMG mediated partnership with the Pit1-Oct-Unc-86 (POU) domain of Oct4, we observed different propensities of Sox-family members to heterodimerize with Oct4 on a series of differently configured composite sox-oct binding sites.133142 In particular, an unusual “compressed” element, in which one nucleotide separating the sox and oct half sites is removed, still recruits Sox17/Oct4 dimers, whereas Sox2/Oct4 dimers can no longer assemble. Conversely, the Sox2/Oct4 pair dimerizes markedly better on the canonical motif than the Sox17/Oct4 pair. In the search for the structural basis for these differences, a single amino acid at position 57 of the HMG caught our attention. This residue, a Lys in Sox2 and a Glu in Sox17, shows a high degree of sequence variation amongst paralogous Sox proteins although it occupies a critical position at the Oct4 interaction interface.129130132133 Exchanging this residue between Sox2 and Sox17 to produce Sox17EK and Sox2KE proteins installs highly cooperative dimer formation of the Sox17EK/Oct4 complex on the canonical motif. The wild type Sox2 normally partners with Oct4 in OKSM45 or OSNL (OS plus Nanog and Lin28)6 cocktails to activate the pluripotency circuitry. By contrast, the wild-type Sox17 induces endoderm differentiation when overexpressed in ESCs. 
The activity of the engineered factors was, therefore, studied in iPSC generation assays [142,143]. When we replaced Sox2 with Sox17EK in OSKM cocktails, we could induce iPSCs with improved efficiency in both mouse [142] and human cells [95]. An analogously modified Sox7EK protein showed a similar behavior, whereas Sox4 and Sox18 needed additional TAD engineering for their conversion into reprogramming TFs [95]. Using chromatin immunoprecipitation followed by high-throughput sequencing, we found that Sox17EK and Sox2 show very similar binding profiles when overexpressed in mouse ESCs [144]. Both proteins pair with Oct4 on many genomic loci earmarked with canonical sox-oct motifs. By contrast, Sox17 partners with Oct4 on enhancers containing the compressed motif. Apparently, a single point mutation drastically changed how Sox proteins co-select their target genes by partnering with Oct4. Yet, the converse Sox2KE mutant could not effectively dimerize with Oct4 on either the canonical or the compressed sequence. This puzzle was resolved more recently when a novel Oct4 crystal structure and molecular dynamics simulations suggested an additional discriminatory interaction between residue 46 of Sox proteins and an Oct4-specific helix in the POU linker [145,146]. Indeed, a rationally designed Sox2E46LK57E double mutant now cooperatively dimerizes with Oct4 on the compressed motif. It will be of interest to explore the activity of this engineered Sox factor in lineage conversion experiments. Collectively, the MyoD and Sox17EK examples show that the cell fate conversion potential of reprogramming TFs can be drastically changed with rather minimal modifications at structurally critical interfaces. We surmise that these insights could be utilized to engineer more potent and safer reprogramming TFs. In contrast to the TAD-TF and TALE-TF approaches, TFs with engineered DNA-binding domains likely engage the genome in a new manner (Figure 3). This way, it could be possible to break reprogramming barriers more effectively and to direct cells trapped in a local minimum of the Waddington landscape towards a desired state of differentiation.
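Mutant names such as Sox17EK or Sox2E46LK57E encode substitutions as wild-type residue, position within the HMG domain, and replacement residue (for example K57E). A minimal sketch of how such a name maps onto a sequence is given below. The helper `apply_substitutions` and the placeholder sequence `toy_hmg` are hypothetical: the placeholder is not a real HMG domain and only carries the two residues named in the text at positions 46 and 57.

```python
import re


def apply_substitutions(seq, mutations):
    """Apply substitutions written as <wild-type><position><mutant>.

    Positions are 1-based. A ValueError is raised if the residue found
    at a position does not match the stated wild-type residue.
    """
    residues = list(seq)
    for mut in mutations:
        parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mut)
        if parsed is None:
            raise ValueError(f"cannot parse mutation {mut!r}")
        wild_type, position, mutant = (
            parsed.group(1), int(parsed.group(2)), parsed.group(3)
        )
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


# Placeholder sequence (NOT a real HMG domain): Glu at 46, Lys at 57.
toy_hmg = "A" * 45 + "E" + "A" * 10 + "K" + "A" * 22

if __name__ == "__main__":
    double_mutant = apply_substitutions(toy_hmg, ["E46L", "K57E"])
    print(double_mutant[45], double_mutant[56])
```

Checking the stated wild-type residue before substituting guards against numbering mismatches, for instance when full-length protein numbering is confused with HMG-domain numbering.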
Figure 3

Regulatory outcome of different categories of engineered reprogramming TFs. (a) TALEs coupled with TADs can be engineered to switch on otherwise silenced reprogramming TFs. Targeting the distal enhancer of Oct4 activates its expression. The resulting gene expression program is expected to resemble the wild-type scenario. (b) When potent TADs are fused to reprogramming TFs, the chimeric protein is expected to target genomic regions reminiscent of the unmodified wild-type protein (a). However, the presence of a potent TAD elevates expression levels compared to the wild-type and may also trigger the activation of genes that would otherwise be silent. (c) Rationally placed point mutations within the DNA-binding domain can modify DNA recognition principles and alter the genomic binding profile. The cartoon represents a synthetic Oct4 with modified binding preferences that leads to the activation of genes that the wild-type protein would not switch on. TFs: transcription factors; TALEs: transcription activator-like effectors; TAD: transactivation domain; Oct4: octamer binding protein 4.


OUTLOOK — NOVEL APPROACHES FOR REPROGRAMMING FACTOR DESIGN

To produce cells for clinical applications, the process should be tightly controlled, fast and free of undesired by-products. In particular, reprogramming to pluripotency has witnessed a multitude of studies aimed at improving the efficiency of iPSC generation (excellent reviews by Papp and Plath [28] and Soufi [31]). Initially, iPSC generation was rather slow and only a small number of cells transfected with a cocktail of reprogramming TFs could be reprogrammed [4]. Confusingly, there appeared to be a high degree of randomness in cell populations: by simple chance, a small subset of cells enters a path that then progresses towards pluripotency in a more deterministic fashion [147,148]. Yet, as roadblocks toward pluripotency continue to be removed, fully controlled and efficient iPSC generation could soon be achieved [85,86]. As the quality of iPSCs produced by engineered reprogramming factors was validated by examining their contribution to embryonic development and their capacity for germline transmission, synthetic TFs could still contribute to the ultimate cocktail [93,97,100,103,120]. However, iPSCs are only an intermediary by-product on the way towards transplantation-grade functional cells. To lower the risk of carcinogenesis, a pluripotent intermediate should be avoided altogether or, at a minimum, complete differentiation of formerly pluripotent cells has to be ensured. Reproducibly generating functional cells to cure degenerative diseases will remain a challenge in the years to come. We anticipate that protein engineering techniques can help to overcome reprogramming barriers and better control cell lineage conversions to produce functional cells more safely and with properties more closely matching their in vivo counterparts. While widely used in fields such as enzymology and immunology, protein engineering is still in its infancy in cellular reprogramming. This is partly because of our incomplete understanding of how TFs work. Our structural knowledge is mostly restricted to isolated domains bound to short stretches of DNA. The mechanisms of DNA target site selection and chromatin opening, and the way TFs stimulate mRNA synthesis, remain largely unclear. Nevertheless, the studies highlighted in this review testify to the promise of the approach and warrant further exploration as to whether protein engineering can bring stem cell biology closer to the bedside.

COMPETING INTERESTS

The authors declare that they have no competing interests.
  147 in total

1.  Crystal structure of a POU/HMG/DNA ternary complex suggests differential assembly of Oct4 and Sox2 on two enhancers.

Authors:  Attila Reményi; Katharina Lins; L Johan Nissen; Rolland Reinbold; Hans R Schöler; Matthias Wilmanns
Journal:  Genes Dev       Date:  2003-08-15       Impact factor: 11.361

2.  Acquisition of myogenic specificity by replacement of three amino acid residues from MyoD into E12.

Authors:  R L Davis; H Weintraub
Journal:  Science       Date:  1992-05-15       Impact factor: 47.728

3.  Randomization of genes by PCR mutagenesis.

Authors:  R C Cadwell; G F Joyce
Journal:  PCR Methods Appl       Date:  1992-08

4.  The crystal structure of the Sox4 HMG domain-DNA complex suggests a mechanism for positional interdependence in DNA recognition.

Authors:  Ralf Jauch; Calista K L Ng; Kamesh Narasimhan; Prasanna R Kolatkar
Journal:  Biochem J       Date:  2012-04-01       Impact factor: 3.857

5.  H3K9 methylation is a barrier during somatic cell reprogramming into iPSCs.

Authors:  Jiekai Chen; He Liu; Jing Liu; Jing Qi; Bei Wei; Jiaqi Yang; Hanquan Liang; You Chen; Jing Chen; Yaran Wu; Lin Guo; Jieying Zhu; Xiangjie Zhao; Tianran Peng; Yixin Zhang; Shen Chen; Xuejia Li; Dongwei Li; Tao Wang; Duanqing Pei
Journal:  Nat Genet       Date:  2012-12-02       Impact factor: 38.330

6.  Structural basis for the SOX-dependent genomic redistribution of OCT4 in stem cell differentiation.

Authors:  Felipe Merino; Calista Keow Leng Ng; Veeramohan Veerapandian; Hans Robert Schöler; Ralf Jauch; Vlad Cojocaru
Journal:  Structure       Date:  2014-08-07       Impact factor: 5.006

Review 7.  Mechanisms and models of somatic cell reprogramming.

Authors:  Yosef Buganim; Dina A Faddah; Rudolf Jaenisch
Journal:  Nat Rev Genet       Date:  2013-06       Impact factor: 53.242

8.  Direct reprogramming of fibroblasts into embryonic Sertoli-like cells by defined factors.

Authors:  Yosef Buganim; Elena Itskovich; Yueh-Chiang Hu; Albert W Cheng; Kibibi Ganz; Sovan Sarkar; Dongdong Fu; G Grant Welstead; David C Page; Rudolf Jaenisch
Journal:  Cell Stem Cell       Date:  2012-09-07       Impact factor: 24.633

9.  Coordination of engineered factors with TET1/2 promotes early-stage epigenetic modification during somatic cell reprogramming.

Authors:  Gengzhen Zhu; Yujing Li; Fei Zhu; Tao Wang; Wensong Jin; Wei Mu; Wei Lin; Weiqi Tan; Wenqi Li; R Craig Street; Siying Peng; Jian Zhang; Yue Feng; Stephen T Warren; Qinmiao Sun; Peng Jin; Dahua Chen
Journal:  Stem Cell Reports       Date:  2014-02-27       Impact factor: 7.765

10.  In vivo reprogramming of adult pancreatic exocrine cells to beta-cells.

Authors:  Qiao Zhou; Juliana Brown; Andrew Kanarek; Jayaraj Rajagopal; Douglas A Melton
Journal:  Nature       Date:  2008-08-27       Impact factor: 49.962

