Steven A Benner¹, A Michael Sismour. ¹Department of Chemistry, University of Florida, Gainesville, FL 32611-7200, USA. benner@chem.ufl.edu
Abstract
Synthetic biologists come in two broad classes. One uses unnatural molecules to reproduce emergent behaviours from natural biology, with the goal of creating artificial life. The other seeks interchangeable parts from natural biology to assemble into systems that function unnaturally. Either way, a synthetic goal forces scientists to cross uncharted ground to encounter and solve problems that are not easily encountered through analysis. This drives the emergence of new paradigms in ways that analysis cannot easily do. Synthetic biology has generated diagnostic tools that improve the care of patients with infectious diseases, as well as devices that oscillate, creep and play tic-tac-toe.
Those familiar with academia know that disputes over trademarks can be more intense (and, in a prurient sense, more interesting) than disputes over substance. Synthetic biology has such a dispute in the making.
The title 'synthetic biology' appeared in the literature in 1980, when it was used by Barbara Hobom to describe bacteria that had been genetically engineered using recombinant DNA technology[1]. These bacteria are living systems (therefore biological) that have been altered by human intervention (that is, synthetically). In this respect, synthetic biology was largely synonymous with 'bioengineering'.
In 2000, the term 'synthetic biology' was again introduced by Eric Kool and other speakers at the annual meeting of the American Chemical Society in San Francisco[2]. Here, the term was used to describe the synthesis of unnatural organic molecules that function in living systems. More broadly in this sense, the term has been used with reference to efforts to 'redesign life'[3,4,5]. This use of the term is an extension of the concept of 'biomimetic chemistry', in which organic synthesis is used to create artificial molecules that recapitulate the behaviour of parts of biology, typically enzymes[6]. Synthetic biology has a broader scope, however, in that it attempts to recreate in unnatural chemical systems the emergent properties of living systems[7], including inheritance, genetics and evolution[3,4,5,8]. Synthetic biologists seek to assemble components that are not natural (therefore synthetic) to generate chemical systems that support Darwinian evolution (therefore biological). By carrying out the assembly in a synthetic way, these scientists hope to understand non-synthetic biology, that is, 'natural' biology. This motivation is similar in biomimetic chemistry, where synthetic enzyme models are important for understanding natural enzymes.
More recently, an engineering community has given further meaning to the title. This community seeks to extract from living systems interchangeable parts that might be tested, validated as construction units, and reassembled to create devices that might (or might not) have analogues in living systems[9]. The parts come from natural living systems (that is, they are biological); their assembly is, however, unnatural. Therefore, one engineering goal might be to assemble biological components (such as proteins that bind DNA and the DNA sequences that they bind) to create, for example, outputs analogous to those of a computer.
A common ground between the 'synthetic biology' and engineering communities lies in the global strategy by which scientists come to understand their subject matter, make discoveries and overturn paradigms. Synthesis offers opportunities for achieving these goals that observation and analysis do not. The use of synthesis in a way that complements analysis will be a main theme of this review (Box 1).
Synthetic biology already has many accomplishments to its credit. The effort to generate synthetic genetic systems has yielded diagnostic tools, such as Bayer's branched DNA assay (described in a later section), which annually helps improve the care of some 400,000 patients infected with HIV and hepatitis viruses[10,11,12]. These and other artificial genetic systems now support primitive genetic processes, including replication with the possibility of mutation[13,14], selection[15] and evolution. Synthetic biology has also generated some interesting toys from biomolecular parts, including systems that oscillate[16] and that carry out simple computations[17].
For engineering purposes, parts are most suitable when they contribute independently to the whole. This 'independence property' allows one to predict the behaviour of an assembly. Therefore, it makes sense to structure this review to follow the search for independently interchangeable parts.
This search turns out to be interesting. In molecular science, it is well known that the simplest building units (the atomic parts) do not always contribute independently to the behaviour of a molecular assembly (the whole). In the macroscopic physical world, building units often do, especially if they are designed to do so (as in modular software assembly, for example). Ultimately, synthetic biology succeeds or fails as an engineering discipline depending on where independence approximations become useful in the continuum between the atomic and macroscopic worlds.
As a science, synthetic biology can be evaluated in different ways. By measuring the insights, discoveries and paradigm shifts that are driven by synthetic biology, we ask here whether the synthetic approach has contributed in a way that is not easily possible by analysis alone.
Seeking interchangeable parts: DNA
As described by Watson and Crick 52 years ago, DNA has a modular structure. In a reductionist sense, DNA can be described as two antiparallel strands. Each strand is assembled from four different nucleotide building blocks, which are themselves assembled from sugars, phosphates, and nucleobases. These are, in turn, assembled from carbon, nitrogen, oxygen, phosphorus and hydrogen atoms.
In the Watson–Crick model, nucleotide pairs contribute independently to the stability of a duplex. In reality, this is a good approximation. DNA duplexes can be designed with considerable success by applying just two rules: A pairs with T, and G pairs with C. A second-order model does very well by adding only the effect of adjacent base pairs into the calculation[18]. Although some diversity in nucleic acid structure and function is not captured by such simple rules (for example, that of Z-DNA[19], G quartets[20], and catalytic RNA[21]), most molecular biologists only use this diversity occasionally.
The elegance of the Watson–Crick model has caused most molecular biologists to overlook the chemical peculiarity of such rules. No other molecular system can be described so simply. For example, the behaviour of a protein is generally not a transparent function, linear or otherwise, of the behaviours of its constituent amino acids, even as an approximation. The power of the Watson–Crick rules was nevertheless sufficient to lead to complacency by most of those who learned the double helix structure; molecular recognition in DNA was a 'solved problem'.
This complacency was only dislodged through synthesis of nucleic acids. Starting in the 1980s, some synthetic biologists began to wonder whether DNA and RNA were the only molecular structures that could support genetics on Earth or elsewhere[3,22,23]. Other biologists, seeking technological goals, attempted to replace modules in the DNA structure to create DNA analogues that would, for example, passively enter cells, but could still support the 'A pairs with T, G pairs with C' rule, with the aim of disrupting the performance of intracellular nucleic acids in a sequence-specific 'antisense' way[24].
This antisense idea was simple in cartoon form. The phosphate backbone was thought to be largely responsible for the unsuitability of DNA as a drug: the repeating backbone phosphates prevented nucleic acids from partitioning into lipid phases, an event believed to be essential for molecules to enter cells passively. The phosphate–ribose backbone is also the recognition site for nucleases. This knowledge, and the fact that the Watson–Crick model proposed no particular role for the phosphates in molecular recognition, encouraged the inference that the backbone could be changed without affecting pairing rules.
The effort to synthesize non-ionic backbones changed the established view of nucleic acid structure. Nearly 100 linkers were synthesized to replace the 2′-deoxyribose sugar, starting with the first by the Pitha[25] and Benner[26] laboratories. Nearly all analogues that lacked the repeating charge showed worse rule-based molecular recognition. Even with the most successful uncharged analogues (such as the polyamide-linked nucleic-acid analogues (PNA) created by Nielsen and his group[27]) molecules longer than 15 or 20 building units generally failed to support rule-based duplex formation. In other uncharged systems, the breakdown occurs earlier[28].
This discovery was unfortunate for the antisense industry, but it had a marked effect on our understanding of DNA. The repeating charge in the DNA backbone could no longer be viewed as a dispensable inconvenience. The same is true for the ribose backbone of RNA: although several backbones (such as threose DNA or locked nucleic acids) work as well as or better than ribose[24,29,30], most of the replacements work less well. The backbone is not simply scaffolding to hold the nucleobases in place; it has an important role in the molecular recognition that is central to genetics.
The above example illustrates how synthesis drives discovery and paradigm change. The failure to obtain non-ionic DNA analogues that retain rule-based pairing led scientists to think about the chemical structures that might be needed to support Darwinian evolution.
In particular, a genetic molecule must be able to suffer change (mutation) without markedly changing its overall physical properties. Again, this feature is infrequent in chemical systems (proteins, for example, lack it). But because charge dominates the physical properties of a molecule, a repeating charge should allow appendages (the nucleobases, in the case of DNA and RNA) to be replaced without changing the dominant behaviour of a genetic system[31]. This has led to the suggestion that a repeating charge might be a universal feature of genetic molecules that work in water[31].
Furthermore, the discovery that ribose was one of the better backbone sugars for supporting molecular recognition[24,32] had implications for the origin of life on Earth. In the mid 1990s, Miller had commented that because of the ease with which ribose decomposes as a sugar on heating[33], ribose could not have supported the first genetic system on Earth. The results from synthesis, which indicated that ribose is especially good for genetics, drove efforts to find prebiotic routes to ribose that would overcome its intrinsic instability[34,35].
Synthesis focusing on the nucleobases also generated discoveries. The Watson–Crick pairing rules arise from two rules of chemical complementarity. The first, size complementarity, pairs large purines with small pyrimidines. The second, hydrogen-bonding complementarity, pairs hydrogen-bond donors from one nucleobase with hydrogen-bond acceptors from the other.
If nucleobase pairing were indeed so simple, it should be possible to move atoms around within the nucleobases (on paper) to synthesize unnatural nucleobases that would still pair following rules of size and hydrogen bonding complementarity, but differently from the natural nucleobases. Indeed, by shuffling the hydrogen-bond donating and accepting groups, one can easily generate eight additional synthetic nucleobases, forming four additional base pairs (Fig. 1).
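The shuffling of donor and acceptor groups can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: it assumes that each nucleobase presents three hydrogen-bonding positions, each either a donor (D) or an acceptor (A), and that all-donor and all-acceptor patterns are excluded. The function names and the 'py'/'pu' labels are chosen here for illustration. Under those assumptions, the count agrees with the eight additional nucleobases and four additional pairs described in the text.

```python
from itertools import product


def complement(pattern: str) -> str:
    """Swap donors (D) and acceptors (A) position by position."""
    swap = {"D": "A", "A": "D"}
    return "".join(swap[group] for group in pattern)


def enumerate_pairs(positions: int = 3) -> list[tuple[str, str]]:
    """Pair each small pyrimidine pattern with its complementary purine.

    Assumption: patterns made of donors only or acceptors only are skipped.
    """
    pairs = []
    for combo in product("DA", repeat=positions):
        pattern = "".join(combo)
        if len(set(pattern)) == 1:
            continue  # assumed exclusion of DDD and AAA
        # Size complementarity: pyrimidine (py) always faces purine (pu).
        pairs.append(("py" + pattern, "pu" + complement(pattern)))
    return pairs


pairs = enumerate_pairs()
nucleobases = {base for pair in pairs for base in pair}
additional_pairs = len(pairs) - 2  # two pairs (A:T and G:C) are natural

for pyrimidine, purine in pairs:
    print(pyrimidine, "pairs with", purine)
```

Each pair satisfies both complementarity rules by construction: a pyrimidine always faces a purine, and every donor faces an acceptor.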
Figure 1
Examples of alternative nucleobases.
Parts of the nucleobases of DNA can be used as interchangeable building modules. The blue units are the hydrogen bonding donor (D) collections of atoms. The red units are the hydrogen bonding acceptor (A) collections of atoms. a | The four standard nucleobases are shown. b | Shuffling the hydrogen bond donor and acceptor modules generates eight additional nucleotides, which constitute a synthetic genetic system. These synthetic bases have been used in an artificial genetic system that can support Darwinian evolution. A, adenine; C, cytosine; G, guanine; Pu, purine; Py, pyrimidine; T, thymine.
In this case, synthesis showed that nucleobase pairing is as simple as the Watson–Crick model implies. A synthetic genetic alphabet with up to 12 independently replicatable nucleobase pairs can be supported by an extended set of Watson–Crick rules[36]. Furthermore, a small amount of protein engineering converts natural polymerases into polymerases that accept components of an expanded genetic alphabet in a polymerase chain reaction[14]. This created, for the first time, a synthetic genetic system that can be repeatedly copied, with the level of mutation needed to support adaptation and evolution.
By searching for synthetic systems that could recreate such emergent properties, synthetic biologists have discovered a great deal. For example, it was proposed that DNA polymerases scan the minor groove of a DNA duplex to look for unshared pairs of electrons as a recognition feature[37]. It was likewise proposed that this scanning of the minor groove was essential for the high fidelity of DNA replication. Efforts to obtain polymerases to support the evolution of the artificial genetic system led to the discovery that minor-groove scanning is not an essential feature of all polymerases.
Today, the effort to make a synthetic chemical system that is capable of Darwinian evolution is an important focus of the National Science Foundation's Chemical Bonding Program. Here, the details of the chemical structures of nucleobases that are essential to support genetics have been determined, with the goal of repairing specific chemical problems that limit the use of specific components of an expanded genetic alphabet. For example, several components of an artificial genetic system suffer from epimerization; this has been rectified by adding nitro substituents to the nucleobases[38]. Another component of the artificial system, iso-guanosine, has a minor tautomeric form that cross bonds with thymidine, creating a significant number of mutations in polymerase chain reactions. This defect was remedied by replacing a nitrogen in the structure with a carbon atom[39].
Because it provides rule-based molecular recognition that is orthogonal to the recognition provided by natural DNA, this synthetic genetic system is found today in the clinic. As part of the Bayer VERSANT branched DNA diagnostic assay[40], synthetic biology helps to manage the care of approximately 400,000 patients infected with HIV and hepatitis viruses each year[10,11] (Fig. 2).
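The idea of orthogonal, rule-based recognition can be illustrated with a small Python sketch. It is a deliberately idealized model, not a description of the assay: pairing is treated as all-or-nothing, and the letters J and K are hypothetical placeholders for one synthetic base pair rather than symbols taken from the text. The function names are likewise assumptions made for this example.

```python
# Natural Watson-Crick rules plus one synthetic pair.
# Assumption: "J" and "K" are placeholder symbols for a synthetic pair.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "J": "K", "K": "J"}
NATURAL = set("ATGC")


def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner strand under the pairing table."""
    return "".join(PAIRS[base] for base in reversed(strand))


def hybridizes(probe: str, target: str) -> bool:
    """Idealized test: True only if every position pairs by the rules."""
    return target == reverse_complement(probe)


def can_match_natural(probe: str) -> bool:
    """Could any strand built only from A, T, G and C pair fully with probe?"""
    return set(reverse_complement(probe)) <= NATURAL


natural_probe = "GATC"
synthetic_probe = "GATJ"

print(reverse_complement(synthetic_probe))
print(can_match_natural(natural_probe), can_match_natural(synthetic_probe))
```

In this model, a probe containing a synthetic letter has no perfect partner among natural sequences, which is the sense in which the expanded alphabet is orthogonal to natural DNA.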
Figure 2
Branched DNA assay developed by scientists at Chiron and Bayer Diagnostics.
The target RNA molecule to be detected (the analyte) is attached to the plastic of a microwell (bottom) by the hybridization of the analyte to a series of capture probes. This complex then captures, through hybridization, a target probe, which in turn hybridizes to a pre-amplifier molecule, thereby 'sandwiching' the analyte between the capture probe and the pre-amplifier. The pre-amplifier captures a branched DNA dendrimer (amplifier) that contains several signalling molecules on each branch. As a consequence of the branching, a single analyte assembles a large number of signalling molecules in the microwell. These assays use the expanded genetic alphabet shown in Fig. 1. When standard nucleotides were used to assemble the signalling nanostructure, significant noise was seen, because non-target DNA that was present in the biological sample was captured by the probes in the microwell even in the absence of analyte. Incorporating components of the artificial genetic alphabet in the dendrimer reduced the noise. As a consequence, the assay now helps manage the care of some 400,000 patients annually, detecting as few as eight molecules of the analyte DNA in a sample.
Seeking interchangeable parts: proteins
The synthetic biology of nucleic acids is successful because the repeating charge in the backbone enables the nucleotide parts to be exchanged independently (although we acknowledge the fact that some RNA structures, such as G-rich sequences, are themselves problematic to engineer). Proteins, unfortunately, do not have a repeating charge; engineering them has therefore been more difficult.
Proposals to engineer proteins, for which the interchangeable unit is the amino acid, are as old as recombinant DNA technology[41,42]. This idea was discussed in an engineering context in 1983 by Kevin Ulmer, then director of exploratory research at Genex[43]. In Ulmer's vision, synthetic biologists would first alter the behaviours of proteins by replacing amino acid building modules in the natural proteins. The replacements would come from the standard set of 20 natural amino acids, and would be chosen using primitive design principles to meet specific goals defined by the properties desired in the synthetic protein. Such primitive principles might, for example, place charged residues at the top and bottom of α helices, or strategically alter amino-acid size complementarity in the active site of an enzyme.
Design based on such primitive rules was expected to frequently fail. Failure, however, would drive the development of better design rules[44]. This would generate a cycle, involving the setting of a goal, the replacement of amino acids to create proteins to meet the goal using the improved design rules, followed by success and failure, refinement of the design rules, and the setting of new goals. This process might go on for decades, and perhaps even generate some of the emergent properties that characterize biological systems. This vision remains largely unrealized. The 20 years of experience since Ulmer presented his vision have shown that the behaviour of a protein is not a simple combination of independent contributions from the constituent amino acids[45].
The failure of the independence approximation was, in large part, expected[46]. First, amino acids in a folded polypeptide sequence strongly interact with others, even amino acids distant in the polypeptide chain[47]. More seriously, even the simplest of molecular interactions are poorly understood. Today, chemical theory still cannot retrodict the freezing point of water[48], the solubility of simple salts in water[49], or the packing of crystals of simple organic molecules[50]. Protein folding is, in one view, an aggregate of these particular processes. A theory that cannot manage the particulars is not expected to manage the whole. Nevertheless, the synthetic effort will be crucial for demonstrating and overcoming these limitations of theory.
Serious efforts are under way to improve the computational tools needed to design and engineer proteins[51,52,53]. Some have attempted to improve design principles by examining rotamers of amino acids, where different arrangements of side-chain atoms are used as the building modules, rather than the amino acids themselves[54]. Small protein folds have served as a focus for simplifying design issues[55]. These include elegant examples from the laboratories of Imperiali, Allemann and Mayo.
Today, amino-acid replacement is carried out using a combination of calculation, design, screening, selection, and luck[56]. Even so, the outcome has been positive[57]; many useful enzymes have emerged by means of amino acid replacement, including polymerases used for DNA sequencing[58], reverse transcriptases that use PCR to amplify synthetic genetic systems[14], and enzymes in commercial laundry detergents[59]. But these are far from a synthetic biology that captures the emergent properties of living systems.
Proteins are built from secondary structure units, including the α helix and the β strand[60]. This gave rise to the idea that such secondary structural elements might serve as interchangeable parts to support protein design[61].
Kaiser and his many students have been especially successful in using the amphiphilic helix as the interchangeable building unit. In an amphiphilic secondary structure, hydrophobic and hydrophilic amino-acid side chains are arranged in the sequence so that a hydrophilic side of the unit can face water, while a hydrophobic side can be buried in the protein fold. Such amphiphilic structures are expected to pack spontaneously in water.
Using this strategy, DeGrado et al. designed an artificial polypeptide that reproduced some of the folding and biological properties of melittin, a protein from the sting of a bee, without reproducing its exact sequence[62]. The designed peptide was amphiphilic, and the model proposed that its hydrophobic side buries itself in the hydrophobic membrane of a cell. Amphiphilic helices as units have frequently been exploited since then[63]. Analogous approaches have used the β strand as the architectural module[64].
Several laboratories have worked to create emergent biological properties, including templated replication, by using secondary structural elements as interchangeable building modules. For example, the Ghadiri laboratory designed a peptide by assembling α helical coiled coils to obtain a peptide ligase[65] and a peptide replicator[66] (Fig. 3).
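The amphiphilic-helix idea discussed above lends itself to a short numerical sketch in Python. It is a toy model under stated assumptions, not the design method used by any of the laboratories cited: it assumes an ideal α helix with 3.6 residues per turn (100° per residue), a crude binary split of residues into hydrophobic and hydrophilic, and leucine (L) and lysine (K) as arbitrary representatives of each class. The function names are chosen for illustration. The score is a simplified hydrophobic moment: it is large when hydrophobic residues cluster on one face of the helix and near zero when they do not.

```python
import math

# Assumption: a crude binary classification of residues.
HYDROPHOBIC = set("AVLIMFW")


def hydrophobic_moment(seq: str, degrees_per_residue: float = 100.0) -> float:
    """Length of the mean vector of +1/-1 weights placed around a helical wheel."""
    x = y = 0.0
    for i, residue in enumerate(seq):
        weight = 1.0 if residue in HYDROPHOBIC else -1.0
        angle = math.radians(i * degrees_per_residue)
        x += weight * math.cos(angle)
        y += weight * math.sin(angle)
    return math.hypot(x, y) / len(seq)


def design_amphiphilic(length: int, degrees_per_residue: float = 100.0) -> str:
    """Put leucine (L) on one face of the helix and lysine (K) on the other."""
    return "".join(
        "L" if math.cos(math.radians(i * degrees_per_residue)) > 0 else "K"
        for i in range(length)
    )


designed = design_amphiphilic(18)  # 18 residues = exactly five turns
uniform = "L" * 18                 # no hydrophilic face at all

print(designed)
print(hydrophobic_moment(designed), hydrophobic_moment(uniform))
```

The designed sequence scores far higher than the uniform one, which reflects the point made in the text: what matters for packing is where the hydrophobic residues sit around the helix, not simply how many there are.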
Figure 3
A self-templating system built from peptide units.