Reinventing positive-strand RNA virus reverse genetics.

Abstract

Reverse genetics is the prospective analysis of how genotype determines phenotype. In a typical experiment, a researcher alters a viral genome, then observes the phenotypic outcome. Among RNA viruses, this approach was first applied to positive-strand RNA viruses in the mid-1970s and over nearly 50 years has become a powerful and widely used tool for dissecting the mechanisms of viral replication and pathogenesis. During this time the global health importance of two virus groups, flaviviruses (genus Flavivirus, family Flaviviridae) and betacoronaviruses (genus Betacoronavirus, subfamily Orthocoronavirinae, family Coronaviridae), has dramatically increased, yet these viruses have genomes that are technically challenging to manipulate. As a result, several new techniques have been developed to overcome these challenges. Here I briefly review key historical aspects of positive-strand RNA virus reverse genetics, describe some recent reverse genetic innovations, particularly as applied to flaviviruses and coronaviruses, and discuss their benefits and limitations within the larger context of rigorous genetic analysis.

Keywords: Coronaviruses; Flaviviruses; Molecular virology; Positive-strand RNA viruses; Reverse genetics

Year: 2022; PMID: 35840179; PMCID: PMC9273853; DOI: 10.1016/bs.aivir.2022.03.001

Source DB: PubMed; Journal: Adv Virus Res; ISSN: 0065-3527; Impact factor: 9.938

Abbreviations: BAC, bacterial artificial chromosome; cDNA, complementary deoxyribonucleic acid; CPER, circular polymerase extension reaction; CRISPR, clustered regularly interspaced short palindromic repeats; MAGE, multiplex automated genome engineering; RCA, rolling circle amplification; RdRP, RNA-dependent RNA polymerase; RNA, ribonucleic acid; RO, replication organelle; Rz, ribozyme; VPg, viral protein, genome-linked; YAC, yeast artificial chromosome

Introduction

Positive-strand RNA viruses are incredibly diverse, with over 2300 virus species currently recognized within 60 virus families (more than for any other virus genome type) by the International Committee on the Taxonomy of Viruses (Walker et al., 2021). Some positive-strand RNA viruses, like alphaviruses and flaviviruses, are lipid enveloped, while others, like picornaviruses, are non-enveloped; some have small genomes, like nodaviruses (~4.5-kb), while others have massive genomes, like coronaviruses (~30-kb) (Fig. 1); most have only a single genome segment, while others, like bromoviruses and nodaviruses, have multiple genome segments. Despite these differences in genome size and replication strategy, all positive-strand RNA viruses share common features relevant to understanding their genetics.

Fig. 1

Representative positive-strand RNA virus genomes. Genomes are drawn to scale, showing viral protein expression and polyprotein cleavage strategies, subgenomic RNA transcripts, a programmed ribosomal frameshift, and the location of conserved viral enzymes and structural proteins, as per the Key.

General principles of positive-strand RNA virus replication

Upon entry into cells, positive-strand RNA virus genomes are directly translated by cellular ribosomes (Baltimore, 1971; Nirenberg and Matthaei, 1961; Ofengand and Haselkorn, 1962) to produce the viral proteins needed for viral replication. All positive-strand RNA viruses encode an RNA-dependent RNA polymerase (RdRP) and, in fact, the evolution of this enzyme is central to the phylogenetic classification of RNA viruses (Wolf et al., 2018). Viruses with genomes longer than ~5-kb typically also encode an RNA helicase; viruses with genomes bearing 5′ caps must encode capping enzymes, while those with a 5′ genome-linked viral protein must encode a corresponding VPg protein. In the case of flaviviruses, genome translation produces a single long polyprotein that is processed by cellular proteases (ER-resident signal peptidases) and a viral serine protease (NS2B-NS3) to produce three viral structural proteins (capsid, prM, and E) and seven nonstructural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) (Lindenbach et al., 2020) (Fig. 1). In the case of coronaviruses, the primary translation product is the open reading frame 1a (orf1a) polyprotein, which is cleaved by two virally encoded cysteine proteases (nsp3 and nsp5) into ten nonstructural proteins, nsp1-nsp10 (Fig. 1). These proteins prepare the cell for viral replication by cleaving the viral polyproteins, forming ER-derived replication organelles, and inhibiting host defenses. In addition, a virally encoded ribosomal frameshift reliably produces a larger polyprotein, orf1ab, that is similarly cleaved to produce the viral RNA replication enzymes, nsp12-nsp16 (Perlman and Masters, 2020). Once a core set of viral replication proteins is produced, a subset of these proteins must recruit the genome out of translation and into a viral replication compartment. 
This transition from genome translation to replication is thought to be a critical event, as it is presumed that a single RNA molecule cannot be simultaneously translated in the 5′ to 3′ direction and copied in the 3′ to 5′ direction (Ahlquist et al., 2003; Gamarnik and Andino, 1998). Indeed, live cell imaging of incoming Coxsackie virus B3 genome translation events confirmed that this transition is a rate-limiting step in establishing productive replication (Boersma et al., 2020). Positive-strand RNA virus replication occurs via synthesis of a complementary negative-strand RNA, which serves as a template to produce new viral genomes, often in great excess. Thus, positive-strand RNA viruses use a semi-conservative mode of replication, producing a relatively small number of negative-strands, each of which gives rise to multiple progeny genomes (Baltimore, 1968). Some viruses, such as coronaviruses, also produce or “transcribe” one or more subgenomic RNAs (sgRNAs) that are colinear with the 3′ end of the viral genome, allowing for late gene expression. For coronaviruses, these include separate transcripts encoding structural proteins Spike, Envelope, Membrane, and Nucleocapsid, as well as accessory proteins that modulate virus-host interaction, such as SARS-CoV-2 orf3a. The processes of positive-strand RNA virus genome replication and transcription occur within the cytosol of infected cells, in association with membrane-bound replication organelles (ROs), which have two general morphologies (Nguyen-Dinh and Herker, 2021; Unchwaniwala et al., 2021; Wolff and Barcena, 2021). Some viruses, such as flaviviruses and nodaviruses, form “spherule”-like ROs, sac-like invaginations into larger membranes that feature a neck, such that the RNA-containing spherule interior is contiguous with the cytosol. For other viruses, like coronaviruses and hepaciviruses, ROs are double-membrane vesicles (DMVs) with uncertain topology. 
Many other virus-induced membrane alterations are also observed, and the details of RO structure and function exceed the scope of this review. Nevertheless, I would be remiss if I did not mention that recent advances in cryo-electron microscopy and electron tomography have revealed that both spherules (e.g., nodavirus and alphavirus ROs) and DMVs (coronavirus ROs) feature virus-encoded proteinaceous pores with 6- or 12-fold symmetry that connect the interior of these compartments to the cytosol (Jones et al., 2021; Unchwaniwala et al., 2020; Wolff et al., 2020; Zhang et al., 2021a). Aside from the remarkable conservation of these structures across vast evolutionary distances, what is most germane to this discussion is the fact that the viral proteins needed for RO formation and RNA synthesis are all produced through translation of a positive-strand RNA virus genome, which explains why viral RNAs are infectious. The only notable exception to this is the coronavirus Nucleocapsid protein, which plays an important role in viral RNA synthesis (Almazán et al., 2004); this protein enters cells with the incoming viral genome and is not produced until after RNA replication and transcription have commenced (much like a negative-strand RNA virus nucleocapsid protein). Another important feature of positive-strand RNA virus genetics is that RdRP enzymes have high error rates, typically 10⁻³ to 10⁻⁶ misincorporations per nucleotide synthesized, i.e., roughly one error for every 10³ to 10⁶ nucleotides (Domingo et al., 2021). Thus, for many viruses, nearly every nascent daughter genome contains one or more nucleotide substitutions (Drake and Holland, 1999). Consequently, RNA virus populations resemble a swarm of related genotypes and may lack a clearly defined “wild-type” sequence, fueling speculation that RNA viruses may behave as “quasi-species,” a term coined by Manfred Eigen and Peter Schuster to describe a theoretical primordial replicator with high mutation rates (Eigen and Schuster, 1977). 
It should be noted that the term “quasi-species” is frequently misapplied to describe the mutant swarm observed in RNA virus populations. Rather, this term specifically refers to the concept that a population of interconnected genotypes is a unit of natural selection, rather than any given individual genotype (i.e., a purely population genetics model of evolution). While this evolutionary model remains controversial (Holmes and Moya, 2002), the quasispecies concept has provided a useful framework for understanding the dynamics of RNA virus populations. For instance, one corollary of the quasispecies concept is that RNA viruses replicate very near their mutational limit, and that small increases in mutation rate can drive a virus population to extinction, a phenomenon called “error catastrophe.” Indeed, the nucleoside analog ribavirin was shown to mediate its antiviral effects against poliovirus through lethal mutagenesis (Crotty et al., 2001); this is also the mechanism of action for molnupiravir, an antiviral nucleoside used to treat SARS-CoV-2 (Sheahan et al., 2020). Moreover, further evidence for population-based natural selection comes from observations that diversity itself is important for the in vivo fitness of positive-strand RNA viruses. Specifically, mutant polioviruses with higher replication fidelity were shown to be attenuated in mouse models of poliomyelitis; however, their lethality was restored by using a chemical mutagen to synthetically increase genetic diversity within the input, high-fidelity virus population (Pfeiffer and Kirkegaard, 2005; Vignuzzi et al., 2006). Presumably, the increased diversity within the input population provided the virus a means to adapt to different tissues and host immune responses in vivo. The key points for our discussion are that: (i) genetic diversity is generally high, even within small positive-strand RNA virus populations; and (ii) this diversity can be important for viral phenotypes.
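The claim that nearly every daughter genome carries a substitution follows from simple arithmetic on error rate and genome length. The following is a minimal sketch, assuming a uniform per-nucleotide error rate and independent misincorporations (both simplifications); the function names are illustrative, and the genome lengths are the approximate nodavirus and coronavirus values quoted in the Introduction.

```python
def expected_mutations(error_rate: float, genome_length: int) -> float:
    """Mean number of misincorporations per copied genome (rate x length)."""
    return error_rate * genome_length


def fraction_error_free(error_rate: float, genome_length: int) -> float:
    """Probability that a copied genome carries no misincorporation,
    assuming each nucleotide is copied independently."""
    return (1.0 - error_rate) ** genome_length


if __name__ == "__main__":
    # Approximate genome lengths taken from the text (assumed round numbers).
    genomes = {"nodavirus (~4.5 kb)": 4_500, "coronavirus (~30 kb)": 30_000}
    for name, length in genomes.items():
        for rate in (1e-3, 1e-4, 1e-5, 1e-6):
            mean = expected_mutations(rate, length)
            clean = fraction_error_free(rate, length)
            print(f"{name:22s} rate={rate:.0e}  "
                  f"mean mutations={mean:6.3f}  error-free={clean:.3f}")
```

Under these assumptions, a 30-kb genome copied at 10⁻⁴ errors per nucleotide averages three substitutions per copy, and only about 5% of copies are error-free, which is why a small further increase in mutation rate can push a population toward error catastrophe.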

The origins of positive-strand RNA virus reverse genetics

Because all of the viral proteins needed to initiate RNA replication are produced by translation of a positive-strand RNA virus genome, infectious virus can be rescued by transfecting genomic RNA into permissive host cells (Ada and Anderson, 1959; Gierer and Schramm, 1956). These early observations opened the possibility of performing reverse genetics on positive-strand RNA viruses by directly manipulating viral genomes and observing the phenotypic effects. In a set of seminal experiments, the 3′ poly(A) tail was enzymatically removed from purified poliovirus genomes by using RNase H and poly(dT), which was found to destroy RNA infectivity (Spector et al., 1975; Spector and Baltimore, 1974). Furthermore, some of the very first site-directed mutagenesis experiments ever performed were on the RNA genome of bacteriophage Qβ, by selectively incorporating mutagenic nucleotide triphosphates at defined positions during in vitro synthesis with a purified replicase (Domingo et al., 1976; Flavell et al., 1974, 1975; Sabo et al., 1977). Some of these defined mutants had altered replication fitness in bacterial cells. While direct manipulation of viral RNA is technically challenging and limited in scope, these early efforts did produce useful methods to measure and compare RNA specific infectivity (i.e., the number of infectious units recovered normalized to the amount of input RNA). For instance, in an infectious center assay, cells are transfected with viral RNA, serially diluted, then plated with untransfected cells to seed plaque formation. Each plaque or “infectious center” is inferred to represent one productively transfected cell; plaque counts can therefore be used to estimate both transfection efficiency (% productively transfected cells) and RNA specific infectivity (e.g., PFU/μg RNA), which is generally considered to represent the efficiency with which viral replication is initiated (Ellem and Colter, 1961; Koch et al., 1966). 
Depending on the transfection method, RNA levels may be saturating, and cells may be transfected with more than one RNA; thus, RNA specific infectivity measurements should ideally demonstrate that the number of infectious units scales with the input RNA concentration. As discussed below, optimizing RNA specific infectivity can be particularly important when designing reverse genetics experiments with complex mutant libraries.
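The infectious center calculation can be sketched as follows. This is an illustrative sketch only: the class, its field names, the example numbers, and the two-fold tolerance used to flag saturation are all assumptions for demonstration, not values from the cited studies.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InfectiousCenterAssay:
    rna_ug: float             # input RNA mass used for transfection (ug)
    cells_transfected: float  # number of cells in the transfection
    plaques: int              # plaques counted on the scored plate
    fraction_plated: float    # fraction of transfected cells seeded on that plate

    def infectious_centers(self) -> float:
        """Productively transfected cells, back-calculated from one plate."""
        return self.plaques / self.fraction_plated

    def specific_infectivity(self) -> float:
        """Infectious units per ug of input RNA (e.g., PFU/ug)."""
        return self.infectious_centers() / self.rna_ug

    def transfection_efficiency(self) -> float:
        """Fraction of transfected cells that became infectious centers."""
        return self.infectious_centers() / self.cells_transfected


def is_dose_proportional(assays: Sequence[InfectiousCenterAssay],
                         max_fold_spread: float = 2.0) -> bool:
    """True if specific infectivity is roughly constant across RNA doses,
    i.e., infectious units scale with input RNA (no sign of saturation)."""
    values = [a.specific_infectivity() for a in assays]
    if len(values) < 2 or min(values) <= 0:
        return False
    return max(values) / min(values) <= max_fold_spread


if __name__ == "__main__":
    # Hypothetical dose series: the 10-ug point is saturating.
    series = [
        InfectiousCenterAssay(0.1, 2e6, 52, 1e-2),
        InfectiousCenterAssay(1.0, 2e6, 48, 1e-3),
        InfectiousCenterAssay(10.0, 2e6, 9, 1e-3),
    ]
    for a in series:
        print(f"{a.rna_ug:5.1f} ug: {a.specific_infectivity():.2e} PFU/ug, "
              f"{100 * a.transfection_efficiency():.2f}% cells productive")
    print("linear over low doses:", is_dose_proportional(series[:2]))
    print("linear over all doses:", is_dose_proportional(series))
```

Comparing specific infectivity across a dose series, rather than at a single RNA amount, is what reveals whether a measurement was made in the saturating range.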

Construction and application of functional cDNA clones

Following the discovery of reverse transcriptase (RT) in 1970 and the development of biosafety and ethical guidelines at the Asilomar Conference on Recombinant DNA Technology in 1975, the first positive-strand RNA virus functional cDNA clones, also known as “infectious clones,” were created. Specifically, Taniguchi and colleagues inserted a full-length cDNA of the Qβ genome into a bacterial plasmid and found that Escherichia coli harboring this plasmid spontaneously gave rise to infectious virus (Taniguchi et al., 1978). Subsequently, Racaniello and Baltimore cloned a full-length poliovirus cDNA into bacterial plasmid pBR322, which gave rise to infectious virus after transfecting this DNA into primate cell lines (Racaniello and Baltimore, 1981). Remarkably, neither of these cloned cDNAs incorporated a specific promoter or method to terminate transcription of the viral genome; presumably, replication competent RNAs fortuitously arose via cryptic transcription, termination, and/or RNA degradation. Consistent with the stochastic origin of these transcripts, the initiation of viral replication from these first viral cDNA clones was inefficient. Therefore, the next major improvement was to flank viral cDNAs by a well-defined bacterial promoter at the 5´ end and a unique restriction enzyme site at the 3´ end, allowing for the production of viral RNAs with precisely defined ends and high specific infectivity via run-off in vitro transcription with purified E. coli RNA polymerase (Ahlquist et al., 1984; Ahlquist and Janda, 1984). The use of the multi-component E. coli RNA polymerase was soon supplanted by the use of single-chain bacteriophage SP6, T3, and T7 RNA polymerases, which became commercially available in recombinant form and allow precise initiation of > 1-kb length RNA transcripts from small (≤ 20-bp) promoters (Janda et al., 1987; van der Werf et al., 1986). 
Once broadly applicable reverse genetics methods were in place, infectious clones were produced for numerous positive-strand RNA viruses. Two additional principles that emerged from this expanded clone building era were: (1) for many viruses, the initiation of RNA replication depends on precise generation of the correct 5′ and 3′ ends (Ball, 1995; Ball and Li, 1993; Janda et al., 1987); and (2) given the genetic diversity within RNA virus populations, it is critically important to identify and reconstruct a functional genome sequence. Regarding this latter point, it was not uncommon for full-length cDNA clones to be painstakingly assembled, only to find that the resultant RNA transcripts had very low specific infectivity or were not functional (Kolykhalov et al., 1997; Liu et al., 2003; van Dinten et al., 1997). Most notably, the first successful hepatitis C virus infectious clones were developed only after identifying cDNAs that matched the consensus sequence of the input virus population (Kolykhalov et al., 1997; Yanagi et al., 1997). Presumably, the nonfunctional clones contained deleterious mutations that prevented viral replication. By definition, an infectious clone should be clonal; that is to say, it should be possible to isolate a single viral genotype prior to rescuing the virus. Thus, although positive-strand RNA viruses exist as quasispecies-like swarms, an infectious clone only represents a single sequence. In practice, however, once a virus has been launched from an infectious clone, the error-prone nature of viral replication can quickly regenerate genetic diversity within the rescued virus population. As discussed below, some reverse genetic systems are population-based rather than clonal. The most straightforward use of an infectious clone is to rescue virus with a well-defined genotype. Thus, an infectious clone can serve as a genetic archive, minimizing viral diversity that would arise through serial passage of virus stocks. 
This genetic stability has had special appeal for maintaining live attenuated vaccine strains capable of reverting to virulent phenotypes, such as the oral poliovirus vaccine (Kohara et al., 1986). An infectious clone can also allow researchers to reconstruct and rescue a virus via DNA synthesis, regardless of whether the virus has been previously cultured or is now extinct (Cello et al., 2002; Thao et al., 2020; Tumpey et al., 2005). Infectious clones are commonly used to create reporter viruses that express proteins that can be easily assayed, such as luciferase enzymes or fluorescent proteins, in a replication-dependent manner (Fischl and Bartenschlager, 2013; Hou et al., 2020; Li et al., 2020; Schoggins et al., 2012; Thao et al., 2020; Xie et al., 2020; Zou et al., 2011). Infectious clones can also be used to deliver foreign antigens as live viral vaccines. For instance, the yellow fever virus (YFV) vaccine strain 17D has been used to deliver small antigenic peptides (Barba-Spaeth et al., 2005; McAllister et al., 2000; Tao et al., 2005) or even large protein antigens like SARS-CoV-2 Spike (Sanchez-Felipe et al., 2021). The YFV-17D backbone has also been used to create chimeric live attenuated vaccines that express the structural glycoproteins of other flaviviruses in place of the YFV structural glycoproteins (Lai and Monath, 2003). Since RNA replication can often be decoupled from virus particle assembly, another common modification is to delete viral structural genes to create RNA “replicons” that replicate within cells but do not make infectious virus particles (Kümmerer, 2018). Replicons, particularly those that express reporter enzymes, have been particularly useful for conducting high throughput screening assays with reduced biosafety work practices (He et al., 2021; Kümmerer, 2018; Qing et al., 2010; Ricardo-Lax et al., 2021; Yang et al., 2013; Zhang et al., 2021b). 
Naturally, an infectious clone also allows researchers to make hypothesis-driven changes within a viral genome, such as point mutations, deletions, or insertions, in order to observe the phenotypic results. Individual mutants or small mutant panels can be readily made with even the simplest of systems (Fig. 2A). However, complex mutant libraries (Fig. 2B), such as those derived by saturation mutagenesis of individual residues (Airaksinen et al., 1999); genetic bar-coding of a few residues (Lauring and Andino, 2011; Weger-Lucarelli et al., 2018); deep mutational scanning of many residues (Fowler and Fields, 2014); or genetic footprinting via transposon insertion mutagenesis (Arumugaswami et al., 2008; Fulton et al., 2017), require robust and efficient reverse genetics systems to minimize genetic bottlenecks and ensure that library complexity can be representationally oversampled at every stage of library construction, amplification, and launch. Since launch efficiency is best estimated by RNA specific infectivity, this is a critical parameter to optimize when working with complex libraries, and the outcome should be verified by quantifying the complexity of the resultant library. For example, the Ebel lab used a non-clonal Zika virus (ZIKV) reverse genetic system (described below) to create barcoded libraries with 6.55 × 10⁴ potential genotypes; yet only 18–146 authentic barcodes were initially identified within rescued ZIKV stocks (Aliota et al., 2018; Weger-Lucarelli et al., 2018). After optimizing virus launch conditions, sampling efficiency improved significantly, yielding up to 1.40 × 10⁴ unique ZIKV genotypes (Sexton et al., 2021).
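To illustrate why launch efficiency bounds library complexity, the sketch below estimates how many distinct genotypes are expected after a given number of independent launch events. It assumes every genotype is equally represented and each productive launch event samples one genotype at random (an idealization); the function names are hypothetical. Note that 4⁸ = 65,536 matches the 6.55 × 10⁴ figure quoted above, but treating the barcode as eight fully randomized positions is an assumption made here for illustration.

```python
import math


def barcode_space(random_positions: int, alphabet_size: int = 4) -> int:
    """Number of potential genotypes for fully randomized positions."""
    return alphabet_size ** random_positions


def expected_unique(library_size: int, launch_events: int) -> float:
    """Expected number of distinct genotypes recovered when each launch
    event draws one genotype uniformly at random from the library."""
    if library_size < 2:
        raise ValueError("library_size must be at least 2")
    p_missed = math.exp(launch_events * math.log1p(-1.0 / library_size))
    return library_size * (1.0 - p_missed)


def events_for_coverage(library_size: int, coverage: float) -> int:
    """Launch events needed so the expected fraction of genotypes
    recovered reaches `coverage` (strictly between 0 and 1)."""
    if library_size < 2:
        raise ValueError("library_size must be at least 2")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return math.ceil(math.log1p(-coverage) / math.log1p(-1.0 / library_size))


if __name__ == "__main__":
    size = barcode_space(8)  # assumed 8-nt random barcode
    print("potential genotypes:", size)
    for events in (100, 10_000, size, 3 * size):
        print(f"{events:>7d} launch events -> "
              f"~{expected_unique(size, events):,.0f} unique genotypes expected")
    print("events for 95% coverage:", events_for_coverage(size, 0.95))
```

Under these assumptions, recovering 95% of a 65,536-member library requires roughly three-fold oversampling at the launch step, so a system that yields only a few hundred productive launch events cannot represent the library regardless of how complex the input DNA or RNA pool is.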

Fig. 2

Standard reverse genetics workflows. (A) Construction of a viral cDNA clone, rescue of the wild-type virus, and the construction and testing of a mutant virus. Here, R represents a unique restriction site just downstream of the viral genome, allowing for full-length, run-off transcripts corresponding to the viral genome. (B) Illustration of mutant virus library complexity. To the left is a theoretical mutational scanning analysis (low complexity); in the middle is a theoretical deep mutational analysis (randomization) of a single codon; on the right is a deep mutational scanning analysis, where multiple residues are targeted for randomization.

Finally, infectious clones have made it much easier to study the mechanisms of RNA replication in model systems, including cell-free reactions and cells that are not ordinarily permissive for infection. For instance, the poliovirus genome can be translated, replicated, and packaged into infectious virus particles in in vitro reactions containing cell-free extracts from HeLa cells (Barton and Flanegan, 1993; Molla et al., 1991). A few positive-strand RNA viruses, including brome mosaic virus (plant-infecting), tomato bushy stunt virus (plant), carnation Italian ringspot virus (plant), Flock House virus (insect), and Nodamura virus (insect and mammal) are capable of RNA replication when viral RNAs are expressed in brewer's yeast, Saccharomyces cerevisiae (Janda and Ahlquist, 1993; Panavas and Nagy, 2003; Price et al., 1996, 2005). The power of yeast genetics then enabled the first comprehensive genome-wide screens for host factors required for positive-strand RNA virus replication to be conducted (Hao et al., 2014; Kushner et al., 2003; Panavas et al., 2005).

Flavivirus infectious clones

Yellow fever virus (YFV) was the first human virus to be described (Reed and Carroll, 1902), the first flavivirus to be fully sequenced (Rice et al., 1985), and is the type species and eponymous namesake for a major clade of positive-strand RNA viruses (genus Flavivirus, from Latin flavus, “yellow”; family Flaviviridae; order Amarillovirales, from Spanish amarillo, “yellow”; and phylum Kitrinoviricota, from Greek kítrinos, “yellow”) (Walker et al., 2021). YFV was also the first flavivirus for which a functional infectious clone was assembled, which required considerable effort due to instability of full-length YFV cDNA in bacterial plasmid and λ phage vectors (Rice et al., 1989). What ultimately succeeded was to clone two pieces of the viral genome into separate plasmids, which could be stably maintained in E. coli, cut with restriction enzymes, and ligated to form a full-length cDNA just prior to in vitro transcription (Fig. 3A). Later, after unstable cDNAs for two pestiviruses, classical swine fever virus (CSFV) and bovine viral diarrhea virus (BVDV), were successfully cloned into the low-copy vector pACYC184 bearing a p15A origin of replication (Chang and Cohen, 1978; Mendez et al., 1998; Ruggli et al., 1996), a full-length YFV genome was reassembled in this vector and found to be stably maintained in E. coli (Bredenbeek et al., 2003).

Fig. 3

Solutions to flavivirus and coronavirus cDNA instability. (A) Construction of a stable, two-piece YFV-17D infectious clone. Here, the arrow indicates an SP6 promoter, restriction enzyme A represents ApaI, enzyme B represents AatII, and enzyme R represents XhoI. (B) Construction of a stable, full-length SARS-CoV-2 infectious clone within a YAC via transformation-associated homologous recombination (TAR) cloning. Here, the arrow indicates a T7 promoter, and enzyme R represents EagI. (C) In vitro assembly of non-clonal SARS-CoV-2 cDNAs. Overlapping viral cDNA fragments are amplified; combined with a DNA adaptor encoding an HDV Rz, polyadenylation (pA) signal, and CMV promoter; and circularized via CPER or Gibson assembly to produce DNAs that can be transfected into mammalian cells.

During the 1990s and 2000s, numerous additional flavivirus infectious clones were developed (Aubry et al., 2015). Some cDNA clones, such as dengue 2 virus (DENV2) strain 16681 (Kinney et al., 1997), Murray Valley encephalitis virus (MVEV) strain 1–51 (Hurrelbrink et al., 1999), tick-borne encephalitis virus (TBEV) strain Neudoerfl (Mandl et al., 1997), West Nile virus (WNV) subtype Kunjin (Khromykh and Westaway, 1994), and WNV lineage II (Yamshchikov et al., 2001a, 2001b) were fortuitously stable in standard bacterial vectors, while many others such as DENV1 strain WP74 (Puri et al., 2000), DENV2 strain NGC (Kapoor et al., 1995; Polo et al., 1997), DENV4 (Lai et al., 1991), Japanese encephalitis virus (JEV) strain JaOArS982 (Sumiyoshi et al., 1992), JEV strain K87P39 (Yun et al., 2003), and TBEV strain Hypr (Mandl et al., 1997) exhibited instability and/or a strong preference for low copy number vectors. 
To overcome these problems, some groups used the in vitro ligation strategy first employed for YFV (Mandl et al., 1997; Sumiyoshi et al., 1992), while others utilized bacterial artificial chromosomes (BACs) that are maintained at 1–2 copies/bacterial cell (Yun et al., 2003). Still others developed methods to assemble full-length DENV clones by using homologous recombination in yeast, although the resultant yeast shuttle plasmids still had to be amplified in E. coli to generate enough material for RNA transcription (Polo et al., 1997; Puri et al., 2000). During this period, a few groups set out to identify the molecular basis of flavivirus cDNA instability in bacteria. Yamshchikov and colleagues noticed that spontaneous mutations arose within their JEV infectious clone, giving rise to large bacterial colonies; these mutations mapped to the prM and E regions, suggesting that these regions encoded elements that were toxic for bacteria (Yamshchikov et al., 2001a, 2001b). To overcome this instability, the researchers created a JEV clone that was driven by a 5′ cytomegalovirus (CMV) immediate early gene 1 promoter, contained two synthetic introns to disrupt these regions of instability, and terminated with a 3′ mammalian polyadenylation signal. The resulting clone was stable in E. coli and, after transfection of DNA into mammalian cells, produced infectious virus, presumably as the result of RNA Pol II transcription, 5′ capping, correct splicing, nuclear export, translation, and selective replication of mRNAs with correct 3′ ends. Similarly, Zheng and colleagues identified transcriptionally active, cryptic bacterial promoters within the 5′ one-third of the JEV cDNA; these promoters were only weakly active when bacteria were grown at 25 °C, a condition that also increased plasmid stability (Zheng et al., 2016). 
Consistent with these findings, two groups showed that introducing silent mutations within the viral cDNA to reduce bacterial promoter activity stabilized full-length DENV-2, JEV, and ZIKV cDNA clones (Münster et al., 2018; Pu et al., 2011). It should be noted that the strategy of using the mammalian transcription and RNA processing machinery to produce viral transcripts in the nucleus bears some risks. Cellular mRNAs are preferentially exported from the nucleus only after they are capped, spliced, and polyadenylated (Stewart, 2019). Moreover, since positive-strand RNA viruses evolved to replicate in the cytosol, their genomes often contain fortuitous splice sites, which can lead to the production of nonfunctional, edited transcripts, as well as fortuitous nuclear retention signals, which can prevent export to the cytosol (Palazzo and Lee, 2018). Another concern is in generating the correct 3′ ends. Since flavivirus genomes lack 3′ polyadenylated tails, most Pol II-driven flavivirus infectious clones utilize a 3′ self-cleaving hepatitis delta virus (HDV) genomic or antigenomic ribozyme (Rz) to generate an authentic 3′ end, upstream of the polyadenylated tail; however, this cleavage then removes the tail, a major determinant of mRNA nuclear export (Stewart, 2019). Some might argue that splicing efficiencies can vary, that some non-polyadenylated RNAs still get exported from the nucleus, and that, ultimately, only a single intact viral transcript is needed to initiate viral infection. I remind readers, however, that the efficiency of RNA launch is critical for working with complex libraries (Section 4), and for examining transient events, such as RNA replication phenotypes measured with replicons. Yet very few papers have rigorously examined the quality of positive-strand RNA genome transcripts produced via the nuclear route under conditions where RNA replication was inhibited or blocked (to prevent selective amplification of authentic, virus-like transcripts). 
In one superb study, Schwarz et al. inserted a synthetic intron to stabilize a ZIKV infectious clone, then used RT-PCR and sequencing to examine the efficiency of intron removal from both replication-competent and RdRP-inactive transcripts (Schwarz et al., 2016). In the absence of RNA replication, roughly half of the transcripts sequenced still contained the intron, while the spliced, intron-less transcripts were selectively amplified by viral replication. It is also notable that many Pol II-driven flavivirus infectious clones require longer rescue times (sometimes up to a week) and/or additional passages to reach the same viral titers as virus produced by transfection of RNAs produced by in vitro transcription (Edmonds et al., 2013; Gao et al., 2021; He et al., 2019; Jiang et al., 2015), suggesting that their launch efficiency is lower than that of RNA transfection-based methods. For experiments where efficiency is important (e.g., complex library sampling), I encourage fellow researchers who use DNA transfection-based launch systems to critically analyze the efficiency of RNA processing in the absence of replication, as well as the overall efficiency of virus rescue by quantifying library complexity (Section 4). Finally, I would like to point out that the cDNA-stabilizing benefits of inserting an intron can be realized without having to rely on inherently inefficient DNA transfection-based methods. Liu and colleagues demonstrated that insertion of a group II intron, a self-splicing Rz that can excise itself from in vitro transcripts, stabilized a ZIKV infectious clone (Liu et al., 2017).

Coronavirus infectious clones

With massive genomes of ~30-kb, nearly three times the length of flavivirus genomes (Fig. 1), and containing multiple problematic sequences (from a bacterial cloning standpoint), coronavirus cDNAs were initially quite challenging to work with. Consequently, it was not until 2000 that two groups independently developed the first coronavirus infectious clones. Yount et al. succeeded in stably cloning the transmissible gastroenteritis virus (TGEV) genome as six cDNA fragments in separate bacterial plasmids (Yount et al., 2000). Because each fragment was flanked by restriction sites with uniquely compatible ends, full-length cDNAs could be assembled in vitro by restriction digestion, gel purification of each fragment, and DNA ligation; similar to the first YFV infectious clone (Fig. 3A), these ligation products were then directly used to transcribe viral genomic RNA. Importantly, this team discovered that virus rescue was greatly improved by co-transfection with a synthetic mRNA encoding the TGEV nucleocapsid (N) gene; we now know that this is because coronavirus N protein plays an essential role in RNA synthesis (Almazán et al., 2004; Koetzner et al., 2022; Schelle et al., 2005; Zuniga et al., 2010). While this multi-piece cloning and reassembly strategy was initially inefficient, the Baric lab and others have subsequently improved upon it by using type IIS or IIG restriction sites to force directional ligation of the cDNA fragments; this strategy has succeeded for several coronavirus infectious clones, including SARS-CoV, MERS-CoV, and SARS-CoV-2 (Beall et al., 2016; Becker et al., 2008; Donaldson et al., 2008; Hou et al., 2020; Menachery et al., 2015; Scobey et al., 2013; Xie et al., 2020; Yount et al., 2002, 2003). In contrast to the multi-piece cloning strategy described above, Almazán et al. 
succeeded in creating a full-length TGEV cDNA cloned into a single copy/cell bacterial artificial chromosome (BAC), flanked by a 5´ CMV promoter and 3´ HDV Rz (Almazán et al., 2000). Virus rescue efficiency, however, was quite low, requiring extensive serial passage to amplify virus titers; during this time, mutations accumulated in the rescued virus stock. The authors found that some spliced, presumably nonfunctional, viral transcripts were produced; however, their unspliced counterparts were functional and selectively replicated. Moreover, the authors successfully used this system to demonstrate that the TGEV Spike gene is a major determinant of viral tropism and pathogenesis in vivo. Subsequently, this team showed that the full-length TGEV-containing BAC was greatly stabilized by inserting an intron into either of two locations within the viral cDNA (Gonzalez et al., 2002). Despite persistent stability problems and the low efficiencies of virus rescue with DNA-based launch systems, similar full-length cDNA BAC systems have also been developed for SARS-CoV and SARS-CoV-2 (Almazán et al., 2006; Rihn et al., 2021; Ye et al., 2020). The above successes and setbacks spurred new innovations in cloning full-length coronavirus cDNAs. In one remarkable study, Thiel and colleagues found that a full-length human coronavirus 229E cDNA was stably maintained within a recombinant vaccinia virus DNA genome, which could be purified in sufficient quantities to transcribe high quality infectious transcripts in vitro (Thiel et al., 2001). Subsequently, a full-length avian infectious bronchitis virus (IBV) cDNA was cloned in vaccinia and could be launched by co-infection with a second recombinant poxvirus expressing T7 RNAP, followed by plaque purification of the resultant IBV (Bickerton et al., 2017; Casais et al., 2001). 
One key advantage of the vaccinia-based system is that genetic changes can be introduced in vivo through homologous recombination, which is highly efficient in poxviruses. However, the major disadvantages are that the use of vaccinia virus vectors increases biosafety and biocontainment risks and that the resulting coronavirus must be plaque-purified away from the poxvirus(es) for subsequent analysis. Thiel's group has now pioneered the use of yeast artificial chromosomes (YACs) to stably maintain full-length coronavirus and other viral cDNAs within S. cerevisiae (Thao et al., 2020) (Fig. 3B). As with vaccinia virus, homologous recombination is highly efficient in yeast, allowing for rapid construction and modification of the viral genome. The only minor downside to this approach is that yields of the YAC are often too low for in vitro transcription; however, this can be overcome by in vitro amplification of the isolated YAC via rolling circle amplification (RCA) with phage φ29 DNA polymerase (Ricardo-Lax et al., 2021).
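The directional ligation idea used for multi-fragment coronavirus clones can be illustrated with a toy design check. The sketch below is not the procedure of any study cited here; the junction names and 4-nt overhang sequences are hypothetical. It assumes that fragments ligate in a single order and orientation only if every junction overhang is non-palindromic and differs from every other junction overhang and from its reverse complement.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def check_junctions(overhangs):
    """Return a list of design problems; an empty list means no ambiguity found.

    `overhangs` maps a junction name to the single-stranded overhang
    sequence left at that junction after digestion.
    """
    problems = []
    seen = {}  # overhang (or its reverse complement) -> junction that owns it
    for name, overhang in overhangs.items():
        overhang = overhang.upper()
        rc = revcomp(overhang)
        if overhang == rc:
            # A palindromic overhang can anneal to a copy of itself.
            problems.append(f"{name}: {overhang} is palindromic")
        for key in (overhang, rc):
            if key in seen and seen[key] != name:
                # Same sticky end at two junctions: fragments could mis-join.
                problems.append(f"{name}: {overhang} clashes with {seen[key]}")
                break
        seen[overhang] = name
        seen[rc] = name
    return problems


# Hypothetical junctions for a six-fragment assembly (fragments A to F).
good_design = {"A/B": "ACTG", "B/C": "GGAT", "C/D": "TTCA",
               "D/E": "CAAG", "E/F": "AGCG"}
# Hypothetical flawed design: one palindrome, one reverse-complement clash.
bad_design = {"A/B": "GATC", "B/C": "ACTG", "C/D": "CAGT"}

if __name__ == "__main__":
    print(check_junctions(good_design))
    for problem in check_junctions(bad_design):
        print(problem)
```

A check of this kind only captures sequence-level ambiguity; it says nothing about ligation efficiency or fragment stability in bacteria.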

Non-clonal reverse genetic systems

To overcome the issues of cDNA instability and toxicity in prokaryotic hosts, many groups have developed methods to assemble and amplify viral cDNAs purely in vitro. The first example of this was by Gritsun and Gould, who amplified the tick-borne encephalitis virus genome as two overlapping cDNAs, flanked by an upstream SP6 promoter. These products were then assembled into a full-length cDNA by cross-primed PCR amplification, allowing the researchers to produce functional, full-length transcripts within days (Gritsun and Gould, 1995). Further simplifying this strategy, Aubry et al. skipped the assembly PCR step, transfecting mammalian cells with three overlapping flavivirus cDNA constructs flanked by 5′ CMV promoter and 3′ HDV Rz elements, under the premise that full-length cDNAs would be reassembled via homologous recombination and transiently transcribed (Atieh et al., 2016; Aubry et al., 2014). While successful, this process is inefficient, requiring extensive time and serial passage to rescue virus, which can lead to the accumulation of additional mutations during passage. A few groups have utilized circular polymerase extension reactions (CPER) to assemble functional full-length flavivirus and coronavirus cDNAs flanked by a strong 5′ RNA Pol II promoter and 3′ HDV Rz (Amarilla et al., 2017, Amarilla et al., 2021; Edmonds et al., 2013; Piyasena et al., 2017; Torii et al., 2021). This in vitro, PCR-based method uses cross-priming of target amplicons and a circularizable DNA vector to assemble full-length, nicked circles (Quan and Tian, 2009) that can be directly transfected into mammalian or insect cells to initiate virus rescue. As with other DNA transfection-based methods, this requires extensive virus rescue times and/or passage.
Nevertheless, the CPER method partially represented the variation in an input WNV population (Edmonds et al., 2013) and was efficient enough to partially sample a complex mutant library (12,852 theoretical codon variants) of the ZIKV E glycoprotein with sufficient depth to identify mutations that altered host cell tropism (Setoh et al., 2019). Yet another in vitro cDNA assembly method was pioneered by James Weger-Lucarelli and the Ebel lab, whereby viral cDNA amplicons, flanked by a 5′ CMV promoter and 3′ HDV Rz, are assembled via Gibson assembly reactions into circular DNAs, which can then be amplified by RCA with phage φ29 DNA polymerase (Bates et al., 2021; Kang et al., 2021; Sexton et al., 2021; Weger-Lucarelli et al., 2018). As detailed above (Section 4), the ability of this method to sample complex libraries depends on careful optimization of the transfection and rescue conditions. A key advantage of this method over CPER is that the Gibson assembly products are covalently closed circular DNAs and can therefore be amplified via RCA, whereas CPER produces nicked circles. Thus, this method should be able to yield sufficient cDNA to produce infectious transcripts in vitro, which would allow for more rigorous determination of specific infectivity. A potential advantage of these non-clonal reverse genetics systems is that they should allow virus populations to be accurately represented, perhaps even as a tool to examine the quasispecies nature of RNA virus evolution. However, in their current manifestations, all these methods depend on DNA-based rescue, which, as detailed above, can be prone to genetic bottleneck events. Another concern is that these population-based methods do not allow one to unambiguously define a “wild-type” for rigorous genetic analysis on an isogenic background. Thus, these methods are less than ideal for analyzing the phenotypic effects of individual mutations.
Moreover, in the absence of purifying clonality (i.e., the ability to isolate a single colony), the serial amplification of cDNA populations in vitro may lead to the propagation of PCR- or assembly-induced errors. Thus, cell-free cloning methods have great promise for dealing with viral genomes as populations, but additional rigor is needed to ensure that a specific phenotype is caused by a given genotype.
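To make the bottleneck concern concrete, the following sketch estimates how much of a mutant library survives a rescue step. It is a deliberately simple model and not an analysis from any cited study: it assumes each independent rescue event draws one variant uniformly at random from the library, so the expected fraction of variants recovered after N events is 1 - (1 - 1/L)^N for a library of L variants. The function names are hypothetical; the library size of 12,852 is the theoretical figure quoted above.

```python
import math


def expected_coverage(library_size, rescue_events):
    """Expected fraction of variants seen at least once.

    Assumes every rescue event samples one variant uniformly and
    independently (no fitness differences, no skew in the input library).
    """
    return 1.0 - (1.0 - 1.0 / library_size) ** rescue_events


def events_for_coverage(library_size, target):
    """Smallest number of rescue events whose expected coverage >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / library_size))


if __name__ == "__main__":
    library = 12852  # theoretical codon variants, as quoted in the text
    for events in (1_000, library, 5 * library):
        print(events, round(expected_coverage(library, events), 3))
    print(events_for_coverage(library, 0.95))
```

Under these assumptions, a number of rescue events equal to the library size recovers only about 63% of variants (1 - 1/e), and roughly three library-equivalents are needed for 95% expected coverage. Real libraries are skewed and variants differ in fitness, so these figures are best read as optimistic lower bounds on the sampling required.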

Novel methods and future directions

While the above discussion has focused on reverse genetic approaches that lead to the expression of a functional viral replicase, it should be noted that other aspects of the virus life cycle can also be dissected, independent of RNA replication, through reverse genetic approaches. For instance, the laboratories of Drs. Jennifer Doudna and Melanie Ott identified a packaging signal within the SARS-CoV-2 genome, which allowed them to package reporter gene mRNAs into virus-like particles by co-expression with the viral Spike, Envelope, Membrane, and Nucleocapsid genes (Syed et al., 2021). The authors then used this system to identify Nucleocapsid gene mutations present in SARS-CoV-2 variants of concern that enhance viral RNA packaging and increase virus titers. While I am encouraged by the remarkable progress in cell-free reverse genetics approaches and their potential application to population-based genetics, the current inefficiency of these methods and the inability to work with a genetically well-defined clone on an isogenic background raise concerns that must be addressed. Moreover, all reverse genetic methods discussed in this chapter depend on extensive molecular biology manipulations in vitro. I can't help but wonder if we have this backwards; perhaps it would be easier to maintain viral cDNA clones within permissive host cells and edit them in situ, using in vivo genome editing technologies recently developed by the field of synthetic biology. For instance, homologous recombination and/or CRISPR/Cas9 can be readily exploited to replace regions within large YACs (Ruiz et al., 2019). However, CRISPR/Cas9 has limitations since it depends on creating and repairing double-stranded DNA breaks, which can be toxic when multiple breaks are created simultaneously and which often lead to insertions and deletions (indels) at edited sites.
Two technologies that I am particularly optimistic about are multiplex automated genome engineering (MAGE), which allows precise editing of bacterial genomes, and presumably, BACs (Gallagher et al., 2014; Wang et al., 2009); as well as eukaryotic MAGE (eMAGE), which allows precise editing of the yeast genome, and presumably, YACs (Barbieri et al., 2017). These methods are essentially in vivo forms of site-directed mutagenesis, whereby mutagenic oligonucleotides are used to modify a locus during synthesis of the lagging strand during DNA replication (Fig. 4A). Importantly, these methods are efficient (> 10% gene conversion), combinatorial (i.e., multiple changes can be made simultaneously via multiplexing), and iterative (i.e., edited populations are ready to be edited further). Other promising genome engineering approaches include retron-associated recombineering, which uses the bacterial retron reverse transcriptase to continuously produce mutagenic oligos in vivo (Lopez et al., 2022; Schubert et al., 2021), as well as synthetic base editing enzymes (Gaudelli et al., 2017; Komor et al., 2016; Mok et al., 2020) and prime-editing enzymes (Anzalone et al., 2019). I am also excited by the recent work of Ding et al., who showed that yeast harboring a Sindbis virus cDNA driven by the strong, galactose-inducible Gal1 promoter can produce infectious transcripts and initiate replication upon fusion of the induced yeast with mammalian cells, thereby avoiding the need to isolate a cDNA and transcribe RNA in vitro (Ding et al., 2021). I imagine a near future in which nearly all standard molecular virology may be outsourced to yeast: we will be able to program yeast to create mutants of interest, such as with eMAGE, and to then produce infectious RNAs by feeding them galactose (Fig. 4A).
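The practical meaning of "efficient" and "iterative" can be sketched with a little arithmetic. The model below is an illustration under stated assumptions, not a description of any published MAGE protocol: it assumes each editing cycle converts a given site with a fixed probability p, independently of other sites and earlier cycles, with no reversion and no fitness cost. The function name and parameter values are hypothetical; p = 0.10 is taken from the lower bound quoted above.

```python
def fraction_edited(p, cycles, sites=1):
    """Expected fraction of cells carrying the edit at all `sites`.

    p      : per-cycle, per-site conversion probability (assumed constant)
    cycles : number of iterative editing cycles
    sites  : number of independently targeted sites that must all be edited
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    per_site = 1.0 - (1.0 - p) ** cycles  # site edited in at least one cycle
    return per_site ** sites              # independence across sites


if __name__ == "__main__":
    for cycles in (1, 5, 10, 20):
        single = fraction_edited(0.10, cycles)
        triple = fraction_edited(0.10, cycles, sites=3)
        print(f"{cycles:2d} cycles: 1 site {single:.3f}, 3 sites {triple:.3f}")
```

With these assumptions, ten cycles at 10% per-cycle efficiency would leave about 65% of cells edited at a single site and about 28% edited at three sites at once, which is why iteration matters for combinatorial designs. Actual efficiencies vary by locus, oligo design, and host strain.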

Fig. 4

Proposed synthetic biology approaches to reverse genetics. (A) In vivo editing of the viral cDNA via genome engineering methods, such as MAGE or eMAGE, and induction of infectious transcripts in vivo, such as in yeast. In this case, the arrow represents a Gal1 promoter and ® represents an HDV Rz and pA signal. (B) Engineered biocontainment of host cells stably harboring a viral infectious clone. Red bullets represent amber codons recoded for a synthetic amino acid.

Of course, as new technologies are developed to make it easier to create, modify, and rescue viral cDNAs, molecular virologists must be mindful of the biosafety and biosecurity risks of these new technologies (Mackelprang et al., 2022). Synthetic biology may also be used to address these concerns by engineering unbreakable biocontainment mechanisms into cells harboring the viral cDNA. For instance, MAGE was used to edit the E. coli genome, removing all known amber (TAG) stop codons from 321 bacterial open reading frames (Lajoie et al., 2013). This codon was then reassigned to a synthetic orthogonal tRNA/aminoacyl-tRNA synthetase pair, allowing the TAG codon to specifically code for an unnatural amino acid. This team then used MAGE to incorporate amber codons into multiple essential bacterial genes, rendering bacterial growth dependent on the unnatural amino acid (Rovner et al., 2015). Similar engineering approaches could be used to prevent horizontal gene transfer of viral cDNAs or unauthorized use of a cell-based reverse genetic system (Fig. 4B).
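The first step of the recoding strategy described above, freeing the amber codon by removing it from natural stop positions, can be sketched at the sequence level. This is a toy illustration only: the function names and example sequences are hypothetical, and the choice of TAA as the replacement stop codon is an assumption of the sketch rather than a detail stated in this chapter.

```python
def recode_amber_stop(orf, replacement="TAA"):
    """Swap a terminal TAG stop codon for another stop codon.

    `orf` is a coding sequence from start codon to stop codon, inclusive.
    ORFs that do not end in TAG are returned unchanged (upper-cased).
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] == "TAG":
        return orf[:-3] + replacement
    return orf


def count_amber_stops(orfs):
    """Number of ORFs in the collection that terminate with TAG."""
    return sum(1 for orf in orfs if orf.upper().endswith("TAG"))


if __name__ == "__main__":
    # Hypothetical mini-genome: three short ORFs, two ending in TAG.
    orfs = ["ATGAAATAG", "ATGCTAGGCTAA", "ATGGGGCCCTAG"]
    print(count_amber_stops(orfs))
    recoded = [recode_amber_stop(orf) for orf in orfs]
    print(recoded, count_amber_stops(recoded))
```

Once no natural ORF terminates with TAG, the codon can in principle be reassigned without disrupting native translation termination, which is the property the biocontainment design relies on.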

161 in total

1. Characterization of infectious Murray Valley encephalitis virus derived from a stably cloned genome-length cDNA.

Authors: Robert J Hurrelbrink; Ann Nestorowicz; Peter C McMinn
Journal: J Gen Virol Date: 1999-12 Impact factor: 3.891

2. An infectious clone of the West Nile flavivirus.

Authors: V F Yamshchikov; G Wengler; A A Perelygin; M A Brinton; R W Compans
Journal: Virology Date: 2001-03-15 Impact factor: 3.616

3. Infectious RNA transcribed in vitro from a cDNA copy of the human coronavirus genome cloned in vaccinia virus.

Authors: Volker Thiel; Jens Herold; Barbara Schelle; Stuart G Siddell
Journal: J Gen Virol Date: 2001-06 Impact factor: 3.891

4. Transcription of infectious yellow fever RNA from full-length cDNA templates produced by in vitro ligation.

Authors: C M Rice; A Grakoui; R Galler; T J Chambers
Journal: New Biol Date: 1989-12

5. Replication and single-cycle delivery of SARS-CoV-2 replicons.

Authors: Inna Ricardo-Lax; Joseph M Luna; Tran Thi Nhu Thao; Jérémie Le Pen; Yingpu Yu; H-Heinrich Hoffmann; William M Schneider; Brandon S Razooky; Javier Fernandez-Martinez; Fabian Schmidt; Yiska Weisblum; Bettina Salome Trüeb; Inês Berenguer Veiga; Kimberly Schmied; Nadine Ebert; Eleftherios Michailidis; Avery Peace; Francisco J Sánchez-Rivera; Scott W Lowe; Michael P Rout; Theodora Hatziioannou; Paul D Bieniasz; John T Poirier; Margaret R MacDonald; Volker Thiel; Charles M Rice
Journal: Science Date: 2021-10-14 Impact factor: 47.728

6. Dengue reporter viruses reveal viral dynamics in interferon receptor-deficient mice and sensitivity to interferon effectors in vitro.

Authors: John W Schoggins; Marcus Dorner; Michael Feulner; Naoko Imanaka; Mary Y Murphy; Alexander Ploss; Charles M Rice
Journal: Proc Natl Acad Sci U S A Date: 2012-08-20 Impact factor: 11.205

7. Systematic, genome-wide identification of host genes affecting replication of a positive-strand RNA virus.

Authors: David B Kushner; Brett D Lindenbach; Valery Z Grdzelishvili; Amine O Noueiry; Scott M Paul; Paul Ahlquist
Journal: Proc Natl Acad Sci U S A Date: 2003-12-11 Impact factor: 11.205

8. CReasPy-Cloning: A Method for Simultaneous Cloning and Engineering of Megabase-Sized Genomes in Yeast Using the CRISPR-Cas9 System.

Authors: Estelle Ruiz; Vincent Talenton; Marie-Pierre Dubrana; Gabrielle Guesdon; Maria Lluch-Senar; Franck Salin; Pascal Sirand-Pugnet; Yonathan Arfi; Carole Lartigue
Journal: ACS Synth Biol Date: 2019-10-30 Impact factor: 5.110

9. Increased fidelity reduces poliovirus fitness and virulence under selective pressure in mice.

Authors: Julie K Pfeiffer; Karla Kirkegaard
Journal: PLoS Pathog Date: 2005-10-07 Impact factor: 6.823

10. Infectious DNAs derived from insect-specific flavivirus genomes enable identification of pre- and post-entry host restrictions in vertebrate cells.

Authors: Thisun B H Piyasena; Yin X Setoh; Jody Hobson-Peters; Natalee D Newton; Helle Bielefeldt-Ohmann; Breeanna J McLean; Laura J Vet; Alexander A Khromykh; Roy A Hall
Journal: Sci Rep Date: 2017-06-07 Impact factor: 4.379