Alexander C Pfotenhauer1,2, Alessandro Occhialini1,2, Mary-Anne Nguyen1,2, Helen Scott3, Lezlee T Dice1,2, Stacee A Harbison1,2, Li Li1,2, D Nikki Reuter1,2, Tayler M Schimel1,2, C Neal Stewart2,4, Jacob Beal3, Scott C Lenaghan1,2. 1. Department of Food Science, University of Tennessee Knoxville, 102 Food Safety and Processing Building 2600 River Dr., Knoxville, Tennessee 37996, United States. 2. Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, Tennessee 37996, United States. 3. Intelligent Software and Systems, Raytheon BBN Technologies, Cambridge, Massachusetts 02138, United States. 4. Department of Plant Sciences, University of Tennessee Knoxville, 2431 Joe Johnson Dr., Knoxville, Tennessee 37996, United States.
Abstract
While the installation of complex genetic circuits in microorganisms is relatively routine, the synthetic biology toolbox is severely limited in plants. Of particular concern are the absence of combinatorial analysis of regulatory elements, the long design-build-test cycles associated with transgenic plant analysis, and a lack of naming standardization for cloning parts. Here, we use previously described plant regulatory elements to design, build, and test 91 transgene cassettes for relative expression strength. Constructs were transiently transfected into Nicotiana benthamiana leaves, and expression of a fluorescent reporter was measured from plant canopies, leaves, and protoplasts isolated from transfected plants. As anticipated, a wide range of expression was achieved from the library, from near undetectable for the weakest cassette to a ∼200-fold increase for the strongest. Expression levels measured in plant canopies, individual leaves, and protoplasts were correlated, indicating that any of the methods could be used to evaluate regulatory elements in plants. Through this effort, a well-curated 37-member part library of plant regulatory elements was characterized, providing the necessary data to standardize construct design for precision metabolic engineering in plants.
The growing human population will require
an increase in food
production. Climate change will likely affect our
ability to grow food, and it will be necessary to improve the resilience
of crops to environmental damage.[1] The
rise of global temperature linked to severe drought conditions, loss
of land due to natural disasters, and emergence of more resilient
plant pathogens represent only a few examples of important future
agricultural challenges.[2]

Over the
last few decades, the role of synthetic biology has become
increasingly important to provide alternative solutions to traditional
breeding techniques for crop improvement. Many advancements have been
achieved for improving crops through the installation of alternative
metabolic pathways and high-fidelity enzymes for improved photosynthesis.[3] Additionally, plants have been generated with
increased tolerance to biotic and abiotic stresses.[4,5] To
improve the nutritional quality of foods, transgenic plants have been
developed to increase the level of valuable fatty acids,[6] carotenoids,[7] and
anthocyanins[8] and to decrease the level
of toxic acrylamide compounds.[9] Despite
many of these advancements, the installation of other valuable pathways
such as carbon concentrating mechanisms (CCMs) to improve CO2 fixation[10] and nitrogen-fixing pathways
for autonomous atmospheric nitrogen fixation[11] is still at an early stage of research.

A fundamental aspect
of genetic engineering is the rational design
of transformation vectors to reach the targeted spatio-temporal levels
of transgene expression. The current plant synthetic biology toolbox
comprises endogenous and heterologous genetic elements, as well as
synthetic sequences.[12] These genetic elements
include promoters and untranslated regions (UTRs) at both 5′
and 3′ ends of the coding region. Transgene expression can
be constitutive (in the entire plant or in specific organs and tissues)
or induced by treatment with a cognate molecule acting as a genetic
switch.[13,14] In association with the core promoter that
is involved in transcription initiation, upstream cis-regulatory elements (enhancers and silencers) are present in different
types, numbers, and orientations. These cis-regulatory
elements are involved in the modulation of gene expression through
binding with particular transcription factors.[15] Both 5′ UTRs, also known as leaders, and 3′
UTRs, also known as terminators, are important sites of post-transcriptional
modification. RNA-loop structures of UTRs perform many functions, including
stabilization of the resulting transcript. While the presence of 5′
UTRs is important in the formation of an active translation complex,
sequences at the 3′ end are used for transcription termination and
stabilization of the resulting mRNA.[16,17]

Many
studies have focused on overexpressing transgenes in plant
cells for the production of pharmaceutical proteins and other industrially
relevant compounds.[18,19] While maximization of expression
may be ideal for single enzymes, the installation of complex metabolic
pathways requires fine coordination both within the pathway and throughout
the plant. With regard to precise metabolic engineering, high expression
of all genes is rarely desirable. It is necessary to identify a well-defined
part library of regulatory elements that enables tuning of expression
to control the optimal stoichiometry of genes necessary to coordinate
complex pathways. The part libraries for similar pathway design in
microbial systems are much more mature than in plants, as is the standardization
of nomenclature across organisms. Further, the throughput of microbial
systems allows for combinatorial screening of elements several orders
of magnitude greater than what can be accomplished in plants. Quantitative
analysis of expression patterns from combinations of plant regulatory
elements is critical to enable the breakthroughs envisioned in plant
synthetic biology.

High-throughput screening in plant cells
has been enabled by
rapid assembly of transformation vectors via modular cloning,[20] transient expression in model plants, and gene
expression quantification. Plant tissues can be analyzed using scanning
fluorometry, while single cells (protoplasts) can be analyzed by flow
cytometry.[21] Calibration of flow cytometry
to units of equivalent standard dye molecules (e.g., molecules of
equivalent fluorescein – MEFL) allows reproducible measurements
across laboratories.[22,23] This supports better reproducibility
of results and improved process debugging by new users. Calibrated
measurement also allows estimation of parameters in biologically meaningful
units, which has been used with mammalian cells for high accuracy
predictive modeling[24] and as a guide for
designing improved devices.[25] Here, we
use both scanning fluorometry and flow cytometry to quantify transgene
expression.

In this work, 91 plant expression cassettes were
characterized
at the canopy, leaf, and protoplast level to establish an expression
library that could be used for tunable metabolic engineering. In particular,
regulatory elements were ranked based on the desired expression level
and the interplay between regulatory elements and promoters was demonstrated.
In many cases, the promoter function was directly tied to pairing
with an appropriate 5′ UTR. Further, the importance of careful
curation of the parts sequences and the potential effects of minor
cloning scars on overall expression are discussed. This characterized
library brings plant synthetic biology another step closer toward
generating predictive software that can guide construct design from
a collection of described parts.
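The MEFL calibration described above can be illustrated with a minimal sketch. It assumes a set of calibration bead peaks with known MEFL values and fits a straight line in log-log space. The bead values, the function names, and the linear-in-log model are illustrative assumptions and are not the calibration procedure or software used in this work.

```python
import math

def fit_mefl_calibration(bead_au, bead_mefl):
    """Least-squares fit of log10(MEFL) = slope * log10(a.u.) + intercept."""
    xs = [math.log10(v) for v in bead_au]
    ys = [math.log10(v) for v in bead_mefl]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def to_mefl(au, slope, intercept):
    """Convert an arbitrary-unit fluorescence value to MEFL."""
    return 10 ** (slope * math.log10(au) + intercept)

# Hypothetical bead peaks (arbitrary units) and their assigned MEFL values.
bead_au = [10.0, 100.0, 1000.0, 10000.0]
bead_mefl = [20.0, 200.0, 2000.0, 20000.0]

slope, intercept = fit_mefl_calibration(bead_au, bead_mefl)
sample_mefl = to_mefl(500.0, slope, intercept)
print(f"slope={slope:.3f}, sample={sample_mefl:.1f} MEFL")
```

Once fitted, the same slope and intercept can be applied to every event in a sample, so that values from different instruments or days are reported on a shared scale.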
Results and Discussion
Curation of the Parts Library
The 37-member combinatorial
library comprises 14 common plant active promoters, 7 5′UTRs,
11 3′ UTRs, 5 promoter-5′UTR fusions, and 3 fluorescent
reporters (Figure 1, Tables S1 and S2). All parts were either
obtained from a previously assembled MoClo kit[26] or were produced by gene synthesis. For testing regulatory
elements through single expression cassettes, the constructs were
subdivided into four functional groups: promoters, 3′UTRs,
promoter-5′UTR fusions, and 5′UTRs. The promoter group
was designed to test the activity of promoters, keeping the other
regulatory elements of the expression cassette consistent. The 3′UTR
group was designed similarly, swapping 3′UTRs while keeping
all other elements consistent. The promoter-5′UTR group was
designed to test the activity of promoter-5′UTR fusions along
with two different 3′UTRs, in all possible combinations. The
5′UTR group was designed to test the activity of 5′UTRs
with three different promoters and two different 3′UTRs, in
all possible combinations. For all groups tested, a positive control
was included (P2x35S:5TMVΩ:GFP:335S). This cassette comprised
a 2x35S promoter, a TMVΩ 5′UTR, and a 35S polyA 3′UTR,
all of which are commonly used in plant genetic engineering.
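The "all possible combinations" design of the 5′UTR group can be sketched as a Cartesian product. The naming helper below follows the pattern of the positive control name (P2x35S:5TMVΩ:GFP:335S). The part lists are taken from names that appear elsewhere in this article and are illustrative only; the full library is defined in Table S2, so the count printed here is not a claim about the number of constructs actually built.

```python
from itertools import product

# Illustrative part lists (assumed subset; see Table S2 for the real library).
promoters = ["nos", "35S", "2x35S"]
five_utrs = ["TMV", "BSMV", "5SO", "PVX", "CMV2", "RbcS2", "CMV1"]
three_utrs = ["nos", "gctt3CPMVnos"]

def cassette_name(promoter, five_utr, three_utr, reporter="GFP"):
    """Build a name in the style P<promoter>:5<5'UTR>:<reporter>:3<3'UTR>."""
    return f"P{promoter}:5{five_utr}:{reporter}:3{three_utr}"

# Every promoter paired with every 5'UTR and every 3'UTR.
cassettes = [
    cassette_name(p, u5, u3)
    for p, u5, u3 in product(promoters, five_utrs, three_utrs)
]

print(len(cassettes))
print(cassettes[0])
```

Generating names programmatically keeps the part order and prefixes consistent, which is one way to address the naming standardization problem raised in the abstract.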
Figure 1
Comparative
study of plant genetic regulatory elements. (A) Schematic
representation of DNA constructs and regulatory elements used to modulate
gene expression in plant cells. Indicated in the image are promoters
(P), 5′ untranslated regions (5′UTR), 3′ untranslated
regions (3′UTR), and coding sequences for fluorescent protein
reporters (reporters). The nucleotide sequences for the regulatory
elements and descriptions are provided in Table S1. (B) Schematic representation of the experimental approach
used to test genetic elements in plant cells. The approach involves
(1) canopy Agrobacterium tumefaciens-mediated infiltration of Nicotiana benthamiana with DNA constructs, (2) scanning fluorometric analysis of leaf
tissue, (3) protoplast isolation from the same tissue, and (4) single
cell analysis by flow cytometry.
The geometric means from individual expression
cassettes were compared
based on the geometric mean for the reference construct (P2x35S:5TMVΩ:GFP:335S).
Analysis of all expression cassettes was conducted on both plant leaves
by fluorometry and individual protoplasts by flow cytometry. By analyzing
the same sample at the population and individual cell level, it was
possible to determine the strengths and limitations of each analytical
approach.
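The comparison against the reference construct can be sketched as follows. The example computes the geometric mean of replicate measurements for a cassette and for the reference, then reports the difference in log10 units (decades). The replicate values and the function name are hypothetical, and this is a simplified sketch rather than the analysis pipeline used in this work.

```python
import math
from statistics import geometric_mean

def log10_relative_expression(sample_values, reference_values):
    """Difference, in decades, between sample and reference geometric means."""
    ratio = geometric_mean(sample_values) / geometric_mean(reference_values)
    return math.log10(ratio)

# Hypothetical replicate measurements (e.g., MEFL or CPS) from three plants each.
reference = [1.0e5, 2.0e5, 4.0e5]
cassette = [1.0e4, 2.0e4, 4.0e4]

rel = log10_relative_expression(cassette, reference)
print(f"{rel:+.2f} decades relative to the reference")
```

A negative value means the cassette expresses below the reference, and a positive value means it expresses above it.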
Investigating the Effect of the Promoter Region
Cassettes
using the viral CaMV 35S promoters enable higher expression of the
reporter gene compared to both the nos promoter from Agrobacterium tumefaciens and most other endogenous
plant promoters tested. Out of the three 35S promoters tested, cassettes
using the short (−420 to +6) and long (−1332 to +6)
promoters generated the highest levels of reporter gene expression
(Figure 2). The strong
activity of the 35S promoters is consistent with previous investigations
demonstrating that the upstream −343/–46 region is responsible
for the majority of promoter strength.[27,28] Surprisingly,
both of the single versions of the 35S promoters regulated expression
at approximately half a decade higher than the positive control P2x35S:5TMVΩ:GFP:335S.
This is in contrast with early publications from 1987 that demonstrated
that duplication of enhancer sequences has a considerable effect in
improving promoter strength.[29,30] However, the 35S annotated
in one of these publications contains three SNPs in the distal region
of the promoter, and this distal region is thought to activate the
core promoter.[30] Differences in activity
from these initial studies could also be due to the different lengths
of duplicated regions (−148/–89 vs −343/–89),
cloning artifacts, or the use of different UTRs. More recently, it
was shown that the three 35S promoters used in our study have similar
activity when quantified at the GFP level.[26]
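Because expression differences are reported in decades (log10 units), it can help to convert them to fold changes. The short sketch below does this conversion; the function names are illustrative.

```python
import math

def decades_to_fold(decades):
    """Convert a difference in log10 units to a fold change."""
    return 10 ** decades

def fold_to_decades(fold):
    """Convert a fold change to a difference in log10 units."""
    return math.log10(fold)

half_decade = decades_to_fold(0.5)      # roughly 3.16-fold
library_range = fold_to_decades(200)    # roughly 2.3 decades
print(f"half a decade = {half_decade:.2f}-fold")
print(f"200-fold = {library_range:.2f} decades")
```

Under this conversion, "half a decade higher" corresponds to roughly a threefold increase, and the ∼200-fold range reported for the library spans a little over two decades.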
Figure 2
Comparative
analysis of promoters. Each construct contains a different
promoter (Table S2), while keeping the
5′UTR (TMVΩ leader), reporter gene (GFP), and 3′UTR
(35S CaMV polyA) consistent. Graphs represent scanning fluorometry
and flow cytometry data obtained using intact leaf tissue and protoplasts
isolated from the same tissue. Scanning fluorometry (excitation 475
nm, emission 509 nm) data is expressed as log10 of CPS
(counts per second) values, while flow cytometry (excitation 488 nm,
emission 510/10 bandpass filter) data is expressed as log10 of MEFL (molecules of equivalent fluorescein) values. Negative (plants
transformed with untransformed Agrobacterium tumefaciens) and positive (plants transformed with the P2x35S:5TMVΩ:GFP:335S
construct) controls are indicated with blue and red lines across the
graphs, respectively. Data is represented as the mean ± standard
deviation (SD) of at least three transformed plants per construct.
Other than the monocotyledon spm promoter that
is not active in dicotyledon plants, cassettes using all other promoters
tested provided a detectable range of expression by both scanning
fluorometry and flow cytometry. The flow cytometry data suggested
that the endogenous plant promoters LHB1B2, ls1, cab1, RbcS2B, and LHB1B1 regulated expression at a similar level to P2x35S:5TMVΩ:GFP:335S.
Again, this result is surprising considering the generally accepted
dogma that the 2x35S promoter is the best promoter for high level
transgene expression. Cassettes using promoters from RbcS1B, act2, and nos (−256 to
+20) expressed at the weakest level, at a half decade to a decade
lower than P2x35S:5TMVΩ:GFP:335S. The majority of endogenous
promoters have complex and incompletely characterized structures of upstream cis-regulatory elements, making their activity unpredictable
in different environmental conditions. Therefore, the plant genetic
engineering toolbox will benefit greatly from rational deconstruction
of endogenous promoters, with the ultimate goal of building synthetic
promoters that reach desired levels of constitutive, inducible,
or organ-specific activity.[31,32]
Investigating the Effect of the 3′UTR Region
Even with a strong promoter/5′UTR
combination (P2x35S:5TMVΩ), the choice of 3′UTR has a
strong effect on gene expression (Figure 3). Cassettes with the A. tumefaciens ocs 3′UTR were the strongest, expressing
at a similar level to the P2x35S:5TMVΩ:GFP:335S cassette and
cassettes with the nos 3′UTR. While scanning
fluorometry indicated that the g7 3′UTR may
enhance expression, flow cytometry data did not support
this trend. Constructs using the mas 3′UTR
reduced expression approximately half a decade, while RbcS3C, g7, act2, and H4 3′UTRs showed similar activities, all reducing expression
to a similar level compared to P2x35S:5TMVΩ:GFP:335S. The most
significant reduction in reporter expression was observed for the
3′UTR from the endogenous plant ATPase, which
could not be resolved from the background signal in the flow cytometry
analysis.
Figure 3
Comparative analysis of 3′UTRs. Each construct contains
a different 3′UTR (Table S2), while
keeping the promoter (CaMV 2x35S), 5′UTR (TMVΩ leader)
and reporter gene (GFP) consistent. Graphs represent scanning fluorometry
and flow cytometry data obtained using intact leaf tissue and protoplasts
isolated from the same tissue. Scanning fluorometry (excitation 475
nm, emission 509 nm) data is expressed as log10 of CPS
(counts per second) values, while flow cytometry (excitation 488 nm,
emission 510/10 bandpass filter) data is expressed as log10 of MEFL (molecules of equivalent fluorescein) values. Negative (plants
transformed with untransformed Agrobacterium tumefaciens) and positive (plants transformed with the P2x35S:5TMVΩ:GFP:335S
construct) controls are indicated with blue and red lines across the
graphs, respectively. Data is represented as the mean ± standard
deviation (SD) of at least three transformed plants per construct.
As shown in previous investigations, this work
supports the hypothesis
that choosing the appropriate 3′UTR is necessary for reaching
optimal transgene expression in plant cells.[33] The modulation of the 3′UTR allows an additional level of
stoichiometric control for a multi-component pathway. The cassettes
using the ocs, 35S, and nos 3′UTRs
were the strongest tested. Unlike ocs, other commonly
used 3′UTRs reported in the literature, such as mas, RbcS3C, g7, act2, and H4,[26,34,35] reduced the cassette expression level, representing suboptimal modules
for transgene overexpression applications. However, with regard to
modulating transgene expression, these 3′UTRs could be useful
to reduce expression from cassettes using strong predetermined promoters/5′UTRs.

In addition to demonstrating the importance of the 3′UTR
with regard to gene expression/translation, comparison of the fluorometry
and flow cytometry data for the ATPase 3′UTR
expression cassette revealed a limitation of the current flow cytometry
assay. While reporter gene expression could not be resolved from the
flow cytometry assay, it could be clearly resolved in the fluorometry
analysis. At first, this result was puzzling, as flow cytometry
is far more sensitive than fluorometry; however, upon further examination
of the fluorometry data, it was determined that the relatively low
ratio of transformed to untransformed cells made it more difficult
to resolve low-expressing constructs by flow cytometry than by fluorometry. This could
be due to several technical considerations of the current protoplast
flow cytometry assay: (1) transformed protoplasts may have weakened
plasma membranes compared to untransformed protoplasts and thus may be
more sensitive to shear forces within the flow cell; and (2) protoplasts
are near the size maximum for many flow cytometers and are inherently
susceptible to shearing through the microfluidic nozzles. Imaging
data collected prior to flow cytometry support the hypothesis that
a significant portion of cells are sheared as a result of the microfluidic
environment resulting in an overall decreased signal. Further optimization
of the flow cytometry protocol will likely be necessary to increase
the resolution of flow cytometry analysis. In the current work, the
ability to directly compare the flow cytometry and leaf fluorometry
allowed us to resolve all levels of expression.
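The effect of a low ratio of transformed to untransformed cells can be illustrated with a toy mixture model. The sketch assumes that every transformed cell has the same signal, every untransformed cell sits at the background level, and the population is summarized by a geometric mean over all events. The numbers and the function name are hypothetical, and this is not the gating or analysis used in this work.

```python
import math

def mixed_population_log10_gm(fraction_transformed, signal, background):
    """log10 geometric mean of a two-component cell population.

    The geometric mean is the arithmetic mean in log space, so each
    component contributes in proportion to its share of the cells.
    """
    f = fraction_transformed
    return f * math.log10(signal) + (1 - f) * math.log10(background)

background = 100.0   # hypothetical autofluorescence level
signal = 1000.0      # hypothetical weak cassette, one decade above background

# With only 10% of surviving cells transformed, the population-level shift
# above background shrinks from 1.0 decade to 0.1 decade.
shift = mixed_population_log10_gm(0.1, signal, background) - math.log10(background)
print(f"population shift: {shift:.2f} decades")
```

In this simplified picture, any loss of transformed protoplasts to shearing lowers the transformed fraction further and pushes weak cassettes toward the background.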
Investigating the Effect of Promoter-5′UTR Fusion Regions
In addition to the characterization of 5′UTR modules fused
to well-characterized promoters (Figure 5), constructs were designed to test other
promoter-5′UTR fusions found in the literature (Figure 4). These regulatory elements
were tested in combination with either nos or gctt-3CPMV-nos 3′UTRs. Except for the nonfunctional Nicotiana tabacum cryptic promoter tCUP fused to the minimal
promoter/5′UTR region min35S::TMV Ω, all other cassettes
tested showed a range of detectable activity. The CsVMV promoter-5′UTR
fusion regulated expression at the highest level in this group of
constructs, at nearly half a decade over P2x35S:5TMVΩ:GFP:335S.
This is in agreement with several previous investigations that showed
higher activity from the CsVMV promoter compared to 35S in several
plant systems.[36,37] The flow cytometry data suggests
that the viral PM24 MMV promoter fused to the AlMV 5′UTR regulates
expression at a lower level than the CsVMV fusion, though at the same
level as P2x35S:5TMVΩ:GFP:335S. The UBQ11 promoter-5′UTR
cassette drove the highest level of reporter expression from the endogenous
plant promoter-5′UTR fusions, also at the same level as P2x35S:5TMVΩ:GFP:335S.
The UBQ11 promoter is one of few identified strong endogenous plant
promoters commonly used in genetic engineering.[38] Considering that viral sequences can induce transcriptional
silencing of transgenes,[39] the use of strong
endogenous plant promoters has a distinct advantage for building engineered
plants. Among the others tested, the cassette using the mas promoter-5′UTR reduced expression by approximately half a
decade and the H4 cassette by a full decade, and
the ocs cassette only generated signal close to the
background level. As seen in Figure 4, low-expressing cassettes such as Pocs-5ocs:GFP:3gctt3CPMVnos and Pocs-5ocs:GFP:3nos are undetectable
by flow cytometry, though they show detectable expression by scanning
fluorometry. It is interesting that using the ocs promoter-5′UTR decreases cassette expression, while the ocs 3′UTR greatly increases cassette expression (Figure 3).
Figure 4
Comparative analysis
of promoter-5′UTR fusions in combination
with 3′UTRs of different activities. Each construct contains
a different promoter-5′UTR (Table S2). While the reporter gene (GFP) was kept consistent, two different
3′UTRs (nos 3′UTR and gctt3CPMV-nos fusion) were interchanged downstream from each promoter-5′UTR
fusion. Graphs represent scanning fluorometry and flow cytometry data
obtained using intact leaf tissue and protoplasts isolated from the
same tissue. Scanning fluorometry (excitation 475 nm, emission 509
nm) data is expressed as log10 of CPS (counts per second)
values, while flow cytometry (excitation 488 nm, emission 510/10 bandpass
filter) data is expressed as log10 of MEFL (molecules of
equivalent fluorescein) values. Negative (plants transformed with
untransformed Agrobacterium tumefaciens) and positive (plants transformed with the P2x35S:5TMVΩ:GFP:335S
construct) controls are indicated with blue and red lines across the
graphs, respectively. Data is represented as the mean ± standard
deviation (SD) of at least three transformed plants per construct.
Independent of the testing of activity from all
promoter-5′UTR
combinations, the use of two 3′UTRs with different strengths
resulted in the same high and low trend of expression regulation.
For these 5′UTR/3′UTR pairings, there was no substantial
interaction between the upstream promoter-5′UTR and the downstream
3′UTR, as the addition of the weak 3′UTR (gctt-3CPMV-nos) decreased expression levels for all promoter-5′UTR
cassettes tested by approximately half a decade. Thus, with these
elements, the dynamic range of expression achieved by varying the
5′UTR was considerably larger than when varying the 3′UTR.
It should be noted that comprehensive analysis of a library of 5′UTR/3′UTR
pairings may reveal that certain pairings have either an antagonistic
or synergistic effect on gene expression. Further study on the relationship
between these elements in plant gene expression is clearly warranted
and will help inform cassette design.
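The absence of interaction described above means that, in log space, the 3′UTR contributes a roughly constant offset regardless of the upstream promoter-5′UTR. A minimal sketch of how such an interaction could be checked is shown below. The log10 expression values are hypothetical placeholders chosen to show a uniform half-decade drop; they are not measurements from this study, and the function names are illustrative.

```python
def three_utr_effects(log10_expr, strong="nos", weak="gctt3CPMVnos"):
    """Change in log10 expression per fusion when the weak 3'UTR replaces the strong one."""
    return {
        fusion: values[weak] - values[strong]
        for fusion, values in log10_expr.items()
    }

def interaction_spread(effects):
    """Range of the 3'UTR effect across fusions; 0 indicates a purely additive effect."""
    values = list(effects.values())
    return max(values) - min(values)

# Hypothetical log10 expression values for three promoter-5'UTR fusions.
log10_expr = {
    "CsVMV": {"nos": 6.5, "gctt3CPMVnos": 6.0},
    "UBQ11": {"nos": 6.0, "gctt3CPMVnos": 5.5},
    "mas": {"nos": 5.5, "gctt3CPMVnos": 5.0},
}

effects = three_utr_effects(log10_expr)
spread = interaction_spread(effects)
print(effects)
print(f"spread of 3'UTR effect: {spread:.2f} decades")
```

A spread that is large relative to the replicate standard deviation would point to an antagonistic or synergistic pairing of the kind the text suggests may exist in a larger library.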
Investigating the Effect of the 5′UTR Region
We next modulated expression by changing both the 5′ and 3′UTRs,
while keeping one of three promoter sequences consistent (Figure 5). The nos promoter cassettes expressed at a reduced level compared to both
35S and 2x35S for all combinations tested. As nos is one of the weakest available plant active promoters, many of
the cassettes were undetectable by flow cytometry. Scanning fluorometry
data suggests that the TMV 5′UTR cassette was the weakest,
expressing at approximately a decade and a half below P2x35S:5TMVΩ:GFP:335S,
followed by BSMV, 5SO, and PVX at slightly higher levels. The CMV2, RbcS2, and CMV1 cassettes all expressed at approximately
one decade below P2x35S:5TMVΩ:GFP:335S.
Figure 5
Comparative analysis
of 5′UTRs in combination with promoters
and 3′UTRs of different activities. Per promoter group (nos, 35S and 2x35S, in A-C, respectively), the library of
5′UTRs (Table S2) was tested along
with two 3′UTRs (nos 3′UTR and gctt3CPMV-nos fusion) while keeping the reporter gene (GFP) consistent.
The promoter nos was also tested with or without
three 5′UTRs (TMVΩ, PVXΩ and 5SO) along with a
4 bp (ttcg) variant of 3CPMV-nos 3′UTR (ttcg3CPMV-nos). Graphs represent scanning fluorometry and flow cytometry
data obtained using intact leaf tissue and protoplasts isolated from
the same tissue. Scanning fluorometry (excitation 475 nm, emission
509 nm) data is expressed as log10 of CPS (counts per second)
values, while flow cytometry (excitation 488 nm, emission 510/10 bandpass
filter) data is expressed as log10 of MEFL (molecules of
equivalent fluorescein) values. Negative (plants transformed with
untransformed Agrobacterium tumefaciens) and positive (plants transformed with the P2x35S:5TMVΩ:GFP:335S
construct) controls are indicated with blue and red lines across the
graphs, respectively. Data is represented as the mean ± standard
deviation (SD) of at least three transformed plants per construct.
Cassettes using the 35S promoter and all 5′UTRs other than TMV expressed at a similar
level to P2x35S:5TMVΩ:GFP:335S,
while TMV reduced expression by approximately half a decade. Absence
of a 5′UTR reduced expression by a full decade. Interestingly,
cassettes using the 2x35S promoter and the PVX 5′UTR expressed
approximately half a decade higher than P2x35S:5TMVΩ:GFP:335S.
This is especially interesting considering that the addition of the
PVX 5′UTR to the nos promoter greatly reduced
expression. The BSMV 5′UTR reduced expression slightly, and
the absence of a 5′UTR reduced expression by approximately
half a decade, while all others expressed
at a similar level to P2x35S:5TMVΩ:GFP:335S.

The activity
of all promoters was improved by adding 5′UTRs,
confirming the importance of this element when designing genetic constructs.
The 5′UTR is crucial for stabilizing transcripts, which positively
enhances gene expression.[40,41] However, the degree
of increased expression was not consistent between the promoters.
These results suggest that the addition of 5′UTRs affects promoter
activity differently, and expression from combinations may not be
easily predicted. The enhancer effect of 5′UTRs is not always
linked to the presence of common motifs in their sequence,[42] supporting the hypothesis that the upstream
promoter region may have a synergistic role. Therefore, testing combinations
of heterologous promoters and 5′UTRs is an important aspect
to consider when designing transgene expression cassettes. Similarly
to what is observed in Figure , the effect on activity from different 3′UTRs appears
to be distinct from the upstream sequences. The addition of the weak
gctt-3CPMV-nos repeatedly decreased activity as compared
to the nos 3′UTR.
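Expression differences in this section are quoted in decades, that is, in log10 units of the fluorescence signal. As a reading aid, the minimal Python sketch below converts such differences into linear fold changes. The function names are illustrative and are not part of the analysis pipeline used in this study.

```python
import math


def decades_to_fold(decades: float) -> float:
    """Convert a difference in log10 units ("decades") to a linear fold change."""
    return 10.0 ** decades


def fold_to_decades(fold: float) -> float:
    """Convert a linear fold change to a difference in log10 units."""
    return math.log10(fold)


# Differences of the size quoted in the text.
for label, d in [("half a decade", 0.5),
                 ("one decade", 1.0),
                 ("one and a half decades", 1.5)]:
    print(f"{label}: ~{decades_to_fold(d):.1f}-fold")
```

Under this convention, a half-decade shift corresponds to roughly a 3-fold change, and the ∼200-fold range reported for the library spans about 2.3 decades.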
Common Four Base Pair Scar (GCTT) between the Stop Codon and
the 3′UTR Reduces the Cassette Expression Level
In
order to test the effect of short base pair sequences introduced as
cloning scars between regulatory elements, the activity of 3CPMV-nos 3′UTRs carrying two different
four base pair scars was tested
downstream of four different nos promoter/5′UTR
fusions (5SO, PVX, TMVΩ, or absence of 5′UTR). For this
purpose, either the commonly used GCTT[20,26] or TTCG 3′
scar was inserted between the GFP stop codon and the 3CPMV-nos 3′UTR (Figure A and Table S2). These two
groups were compared to constructs with the nos 3′UTR.
Scanning fluorometry data suggested that cassettes using the gctt-3CPMV-nos reduced expression for all 5′UTRs tested by approximately
one and a half decades compared to P2x35S:5TMVΩ:GFP:335S. The
use of the nos 3′UTR decreased cassette expression
by approximately one and a half decades for all 5′UTRs tested.
Cassettes with ttcg-3CPMV-nos expressed at the highest
levels, at approximately one decade under P2x35S:5TMVΩ:GFP:335S
for the 5SO, PVX, TMV, or absence of 5′UTR. As expected due
to low expression values, flow cytometry was not able to validate
differences between these cassettes.
Due to the consistent change
in expression using GCTT or TTCG 3′ overhangs, we generated
10 additional constructs by swapping the four base pairs between GCTT
and TTCG either one, two, three, or four nucleotides at a time (Figure S1). All constructs contained the nos promoter, TMVΩ 5′UTR, and 3CPMV-nos 3′UTR. The construct using the GTTT overhangs
expressed at the lowest ranked level, which was similar to GCTT. The
highest ranked construct used the TTCG overhangs. Interestingly, all
six overhangs tested that began with a G nucleotide were the lowest
ranked, while all six overhangs beginning with a T were the highest
ranked.
All combinations indicated a strong change in expression levels
due to the presence of different four base pair scars. The ttcg-3CPMV-nos consistently increased activity, while the gctt-3CPMV-nos decreased activity compared to constructs with the nos 3′UTR. Surprisingly, these results indicate that
cloning scars are a critical aspect to consider when designing expression
cassettes. It is possible that the intrinsic secondary loop-structure
and stability of the transcript can be altered through the introduction
of a few base pairs outside of the main 3′UTR sequence. This
is supported by a previous study performed using the 3′UTR
of CPMV RNA-2, which demonstrated that the preservation of an optimal
RNA secondary structure was fundamental to achieve high expression.[17] In that study, the introduction of either point
mutations in critical RNA-Y-loops within the 3CPMV, or the introduction
of a linker in between two 3′UTRs (3CPMV and nos) had a strong effect on the gene expression level.[17]
Our data add one more level of complexity for using 3′UTR
modules, demonstrating that even changing a few base pairs can decrease
the cassette expression level. The introduction of scars in between
regulatory elements is part of both traditional and Golden Gate cloning,
and therefore choosing appropriate restriction sites and overhangs
is fundamental for optimal construct design. With regards to Golden
Gate cloning, the GCTT sequence is a common 3′ overhang used
to design gene of interest modules in many plasmid kits.[20,26] Therefore, reconsidering the use of this particular overhang and
testing alternative sequences is particularly important to achieve
high transgene expression in plant cells. As important is the need
to identify insulator regions that can minimize the effects of scars
on expression. Previous work on E. coli promoters demonstrated a path for using randomized insulators to
reduce the impact of scars on gene expression.[43] Similar designs for plant constructs may help alleviate
this issue to increase the fidelity of engineered plants.
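To illustrate the design space behind the overhang-swapping experiment described above, the following sketch enumerates every four-base sequence obtained by exchanging one to four positions between GCTT and TTCG. It is an illustration only: the function name is hypothetical, and the full enumeration yields 14 intermediates, whereas 10 additional constructs were tested in this study (the tested set is given in Figure S1 and is not reproduced here).

```python
from itertools import combinations


def swap_variants(a: str, b: str) -> dict:
    """Return {k: sorted sequences} where each sequence is `a` with exactly k
    positions replaced by the base found at the same position in `b`."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    variants = {}
    for k in range(len(a) + 1):
        seqs = set()
        for positions in combinations(range(len(a)), k):
            seq = list(a)
            for p in positions:
                seq[p] = b[p]
            seqs.add("".join(seq))
        variants[k] = sorted(seqs)
    return variants


variants = swap_variants("GCTT", "TTCG")
for k, seqs in variants.items():
    print(f"{k} position(s) swapped: {', '.join(seqs)}")

# Group by first nucleotide, the feature that separated low- from
# high-ranked overhangs in the text.
all_seqs = [s for seqs in variants.values() for s in seqs]
print("G-initial:", sorted(s for s in all_seqs if s.startswith("G")))
print("T-initial:", sorted(s for s in all_seqs if s.startswith("T")))

# Size of the complete four-nucleotide overhang space.
print("all possible 4-nt overhangs:", 4 ** 4)
```

Because GCTT and TTCG differ at all four positions, the enumeration contains 16 distinct sequences, half beginning with G and half with T.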
Ranking Genetic Regulatory Elements to Achieve Different Activity
Levels
In addition to the investigation of regulatory elements
in various groupings, the activity of all constructs was ranked against
one another (Figure ). The graph showing the expression of all 85 GFP expressing constructs
vs P2x35S:5TMVΩ:GFP:335S reflects a wide range of expression
levels. Expression ranges from undetectable when using the nonfunctional
tCUP promoter (two decades below), to very high levels when using
the CsVMV promoter (approximately half a decade above). For a number
of commonly used regulatory elements, plant genetic engineering is
largely devoid of clear sequence information and common nomenclature
for elements used in different laboratories. To compound this issue,
it is not uncommon for distinct elements to be identified with the
same name. In our library, three different 35S promoters found in
common databases have been characterized: the short 35S promoter from
the CaMV virus, a longer and mutated form, and a repeated double form.
The three versions might casually be referred to as 35S, though they
may differ in activity strength, especially when used with various
5′UTRs (Figures ). While the greater synthetic biology community
has begun to use standardization of parts nomenclature through software
solutions, such as SBOL,[44] there is currently
no extensive effort to close this gap in plant synthetic biology.
Additionally, the effects from small four base pair scars clearly
play a discernible role in transgene expression and should not be
overlooked (Figure C and Figure S1). Often, plasmids may
be casually interchanged within laboratories, and small differences
between commonly used regulatory elements may be overlooked. This
data demonstrates that deep sequence analysis of genetic modules used
for transgene expression is fundamental to achieve rational construct
design. This is particularly important to study regulatory elements
where small mutations in critical functional regions could drastically
affect their activity.[17,32]
Figure 6
Comparison of scanning fluorometry and
flow cytometry data for
all combinations of genetic regulatory elements. (A, B) Graphs represent
scanning fluorometry and flow cytometry data obtained using intact
leaf tissue and protoplasts isolated from the same tissue. Scanning
fluorometry (excitation 475 nm, emission 509 nm) data is expressed
as log10 of CPS (counts per second) values, while flow
cytometry (excitation 488 nm, emission 510/10 bandpass filter) data
is expressed as log10 of MEFL (molecules of equivalent
fluorescein) values. Negative (plants transformed with untransformed Agrobacterium tumefaciens) and positive (plants transformed
with the P2x35S:5TMVΩ:GFP:335S construct) controls are indicated
with blue and red lines across the graphs, respectively. Data is represented
as the mean ± standard deviation (SD) of at least three transformed
plants per construct. (C) Graph showing the correlation between scanning
fluorometry measurements and flow cytometry data. The R2 correlation value of ∼0.72 is shown in the graph.
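For readers who wish to reproduce this type of comparison with their own measurements, the sketch below computes the R2 value of a simple linear fit between paired log10 CPS and log10 MEFL values. The data points are hypothetical placeholders, not values from this study, and the function name is illustrative.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x
    (equal to the squared Pearson correlation coefficient)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical paired measurements for five constructs (NOT study data).
log10_cps = [4.1, 4.8, 5.3, 5.9, 6.4]
log10_mefl = [5.6, 5.9, 6.6, 6.8, 7.3]

print(f"R2 = {r_squared(log10_cps, log10_mefl):.2f}")
```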
Combining Transgene Cassettes to Achieve Predicted Level of
Expression within a Three-Gene Pathway
As a proof of concept
to demonstrate the ability to modulate the stoichiometry of gene expression
within a three-component pathway, plant cells were co-transformed
with low, medium, and high combinations of regulatory elements to
express GFP, RFP, or BFP reporter cassettes in all possible combinations
(Combo-1–6) (Figure ). Values obtained from both leaves and single cell analyses
were converted from RFP or BFP units into equivalent GFP units using
values from plants co-transfected with three high-expressing constructs
controlling the three reporters. If the regulatory elements can be
applied equivalently to different coding sequences and the expression
cassettes do not interfere significantly with one another, then we
would predict that the expression levels from each color in these
three-color constructs should be close to the expression levels of
the single-color GFP constructs reported above. The observed results
are shown in Figure A–C. Signal specificity and absence of cross contamination
between the three fluorescent channels was confirmed by confocal microscopy
(Figure D). For the
GFP cassettes, the combinations expressed as expected, with low values
obtained from Combo-4 and Combo-5, medium values from Combo-1 and
Combo-2, and high values from Combo-3 and Combo-6. Interestingly,
predicted values from flow cytometry for medium and low expressing
cassettes were slightly less than those achieved from the combinations,
while the values predicted for the high expressing cassette were slightly
greater than those of the combination. A similar trend was seen for
the RFP cassettes. The low, medium, and high expressing combinations
expressed as expected. As with the GFP cassettes, the predicted values
for the low RFP expressing cassettes were slightly less than those
achieved from the combinations, with the RFP medium cassettes indistinguishable
from prediction and the RFP high cassettes consistently below prediction.
In this experimental design, the three constructs (GFP, RFP, and BFP)
were co-transformed on separate plasmids in order to minimize feedback
from one expression cassette to the other; however, some of the variability
between observed vs predicted results could be due to the energetic
load placed on the cells. For example, one may hypothesize that maximal
expression of GFP alone would be higher than when a cell is tasked
with producing two other fluorescent proteins. This would be a simple
mass balance problem, where each cell has a set capacity to produce
heterologous proteins. The production of one protein thus has an effect
on a cell’s ability to produce another protein. In this situation,
it would be anticipated that combinations with lower production capabilities
(low and medium) would be most accurately predicted as they would
be farthest from this maximal threshold. To test this hypothesis,
it would be necessary to test low expressing constructs for all three
fluorescent proteins and validate if the predictions are more accurate.
Perhaps more importantly, in future works, the effect of one expression
cassette on another in binary or ternary expression vectors should
be tested, as this would be the likely scenario for the generation
of transgenic plants.
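The conversion of RFP and BFP readings into equivalent GFP units described in this section can be sketched as a single scaling step. The sketch assumes that, in the reference plants co-transfected with the three high-expressing constructs, each reporter is produced at the same level, so the ratio of reference signals serves as a channel conversion factor applied on the linear scale. The reference values and function name are hypothetical and are not taken from this study.

```python
import math


def gfp_equivalent(value: float, channel: str, reference: dict) -> float:
    """Scale a linear-scale reading from `channel` into GFP-equivalent units.

    `reference` holds the signal of each reporter measured in plants carrying
    the three high-expressing cassettes (assumed to express equally).
    """
    factor = reference["GFP"] / reference[channel]
    return value * factor


# Hypothetical reference signals (NOT study data), linear scale.
reference = {"GFP": 1.0e7, "RFP": 2.0e6, "BFP": 5.0e5}

# A hypothetical medium-expressing RFP cassette read in the RFP channel.
rfp_reading = 4.0e5
converted = gfp_equivalent(rfp_reading, "RFP", reference)
print(f"GFP-equivalent: {converted:.2e} (log10 = {math.log10(converted):.2f})")
```

Once converted, each color can be compared directly with the single-color GFP prediction for the same regulatory element combination.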
Figure 7
Combining transgene cassettes to achieve predicted fluorescence
levels within a three-gene pathway. (A) High (PCsVMV-5CsVMV:FP:3nos, H), medium (P35S:5TMVΩ:FP:3gctt3CPMVnos, M), and low (Pnos:5CMV1:FP:3gctt3CPMVnos, L) regulatory element cassettes (Table S2) were designed to control the expression of either GFP, BFP, or
RFP reporters in all possible combinations. Combination 1 (Combo-1):
H RFP, M GFP, and L BFP; combination 2 (Combo-2): H BFP, M GFP, and
L RFP; combination 3 (Combo-3): H GFP, M BFP, and L RFP; combination
4 (Combo-4): H RFP, M BFP, and L GFP; combination 5 (Combo-5): H BFP,
M RFP, and L GFP; combination 6 (Combo-6): H GFP, M RFP, and L BFP.
(B, C) Graphs represent scanning fluorometry and flow cytometry data
obtained using intact leaf tissue and protoplasts isolated from the
same tissue. Scanning fluorometry (excitation 475, 550, and 400 nm,
emission 509, 574, and 455 nm) data is expressed as log10 of CPS (counts per second) values, while flow cytometry (excitation
488, 561, and 405 nm, emission 510/10, 585/16, and 440/50 bandpass filters)
data is expressed as log10 of MEFL (molecules of equivalent
fluorescein) values for GFP, RFP, and BFP, respectively. Negative
(plants transformed with untransformed Agrobacterium
tumefaciens) and positive (plants transformed with
the P2x35S:5TMVΩ:GFP:335S construct) controls are indicated
with blue and red lines across the graphs, respectively. Data is represented
as the mean ± standard deviation (SD) of at least three transformed
plants per construct. The fluorescence levels of combinations 1–6
are compared with predicted single cassette fluorescence levels (light
green, blue, and red). (D) Confocal images showing the same epidermal N. benthamiana leaf cells transformed with the indicated
construct combinations (Combo-1–6); reference combination
7 (Combo-7): H GFP, H RFP, and H BFP; negative control 1 (NC1): H
GFP only; negative control 2 (NC2): H BFP only; negative control 3
(NC3): H RFP only; and negative control 4 (NC4): leaves transformed
with wild-type A. tumefaciens. Chlorophyll
(Chl), bright-field (BF), and merged images are indicated. Scale bars
= 50 μm.
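The six combinations listed in this caption are the complete set of ways to assign the high, medium, and low cassettes to the three reporters. A minimal sketch confirming this is given here; the enumeration order produced by the code is arbitrary and does not follow the Combo-1 to Combo-6 numbering.

```python
from itertools import permutations

levels = ("H", "M", "L")
reporters = ("GFP", "RFP", "BFP")

# Each permutation assigns one reporter to each expression level.
combos = [dict(zip(levels, p)) for p in permutations(reporters)]

for combo in combos:
    print(", ".join(f"{level} {reporter}" for level, reporter in combo.items()))
print("number of combinations:", len(combos))
```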
All constructs analyzed maintained similar expression levels compared
to predicted single cassette activities, with the exception of both
low and medium expressing cassettes for BFP. As expected from the
single cassette GFP experiments, high BFP expression was achieved
from Combo-2 and Combo-5. Surprisingly, even higher expression was
quantified from the predicted medium expressers Combo-3 and Combo-4.
Though not higher than Combo-2 and Combo-5, high expression was also
detected from the predicted low expressers Combo-1 and Combo-6. In
these particular cases, it is possible that the promoter-5′UTR
in the predicted low and medium expressing BFP cassettes combined
with the downstream mTag-BFP2 coding sequence to
create a motif enhancing transgene expression. It is known that the
nucleotide region surrounding the ATG start codon is important to
create a favorable start codon context, and sequences downstream of it can
also have an enhancing effect on expression and protein production.[45]
The data from the library described here
represents a valuable
tool to estimate and experimentally reproduce a known stoichiometry
of expression for multi-component pathways. This investigation will
accelerate metabolic engineering of complex heterologous pathways
where the expression of a number of genes must be coordinated both
within the pathway and in the endogenous cellular system. However,
as seen with the BFP expressing cassettes, unpredicted motifs may
originate and affect gene expression. Combinations of chosen regulatory
elements with desired coding sequences must be tested and quantified
in order to avoid unpredicted effects.
Analysis of Expression Cassettes in Canopies by Fluorescence-Inducing
Laser Projector (FILP)
The fluorescent signal produced by the
canopy of N. benthamiana plants infiltrated
with low, medium, and high expresser cassettes was further analyzed
using the fluorescence-inducing laser projector (FILP) standoff detection
system.[46] Based on MEFL flow-cytometry
data, the constructs were subdivided into high (MEFL: above 7 units),
medium (MEFL: 6.5–7 units), and low (MEFL: 5–6.5 units)
expressers. Three construct candidates per functional group were selected.
The GFP and endogenous control pigment chlorophyll were imaged at
a distance of 3 m. The GFP signal produced by the plant canopy was
quantified by pixel intensity, and the chlorophyll signal was used
to set the leaf area of each plant analyzed. The images in Figure indicate that the
three functional groups of constructs can be efficiently discriminated
using this standoff detection system. This is supported by the high
correlation (R2 = 0.89) between pixel
intensity values obtained using FILP and the corresponding fluorometric
values obtained from the same leaf tissue. This high correlation is
likely because values obtained using either technique are both representative
of pooled-cell populations of intact tissues. These experiments confirmed
that by modulating genetic regulatory elements, we could alter the
level of fluorescent signal detectable at a distance of 3 m. Standoff
detection is a necessary characteristic for engineered plants that
sense and report the presence of environmental stimuli, termed phytosensors.
It is crucial for these phytosensors to reach distinct expression
levels between situations with or without the presence of the stimuli
of interest. In the new era of plant synthetic biology, phytosensors
will find important applications both in agriculture and human environments
to monitor the presence of pathogens, chemicals, and physical stresses.[14,47]
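The grouping of constructs into high, medium, and low expressers from flow cytometry data can be expressed as a simple threshold rule on log10 MEFL values. The sketch below uses the cutoffs stated in this section; how values falling exactly on a boundary (6.5 or 7) are assigned, and the label for values below 5, are assumptions made for illustration.

```python
def expression_group(log10_mefl: float) -> str:
    """Assign a construct to an expression group from its log10 MEFL value.

    Cutoffs follow the text: high (> 7), medium (6.5 to 7), low (5 to 6.5).
    Boundary handling is an assumption of this sketch.
    """
    if log10_mefl > 7.0:
        return "high"
    if log10_mefl >= 6.5:
        return "medium"
    if log10_mefl >= 5.0:
        return "low"
    return "below range"


# Hypothetical log10 MEFL values (NOT study data).
for value in (7.4, 6.8, 5.9, 4.6):
    print(f"{value:.1f} -> {expression_group(value)}")
```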
Figure 8
Analysis
of the activity of genetic modules in canopies by a fluorescence-inducing
laser projector (FILP). (A) Graphs represent fluorometric data obtained
using intact leaf tissue. Scanning fluorometry (excitation 475 nm,
emission 509 nm) data is expressed as log10 of CPS (counts
per second) values, while FILP (Fluorescence-Inducing Laser Projector)
(excitation 465, 525 nm, emission 525, 680 nm filters) data is expressed
as log10 of pixel intensity. Data of the graph correlating
both methods is also shown (R2: ∼0.89).
Data is represented as the mean ± standard deviation (SD) of
at least three transformed plants per construct. Plants expressing
single high (H1–3: PCsVMV-5CsVMV:GFP:3nos,
P35S:5RbcS2B:GFP:3nos, and P35S:5PVX:GFP:3nos), medium (M1–3:P35S:5RbcS2B:GFP:3gctt3CPMVnos, PM24MMV:5AIMV:GFP:3nos, and PUBQ11-5UBQ11-link:GFP:3nos), and low (L1–3: P35S:5TMV:GFP:3gctt3CPMVnos, PH4:5H4:GFP:3nos, and Pnos:5CMV1:GFP:3gctt3CPMVnos) expression cassettes (Table S2) have been analyzed using the two methods.
(B) FILP images showing N. benthamiana plants expressing high (H), medium (M), and low (L) expression cassettes
along with wild-type controls (W). GFP (green), chlorophyll (red),
bright-field (gray), and merged images are shown. Scale bar: 10 cm.
Conclusions
The correlation between fluorometric values
obtained from plant
tissue and single cell analysis supports the quantified expression
level from tested cassettes. Discrepancies between scanning fluorometry
and flow cytometry data may be explained by the method of collection.
While flow cytometry pools the data from ∼10⁶ individual
cells of various types, our scanning fluorometer collects ≤3 technical
replicates from arbitrary spots on a leaf to quantify expression for
a population of cells. Additionally, it is possible that detection limits
differ between the two instruments. It is evident that our current
methods for flow cytometry do not accurately quantify expression from
weaker constructs. This may seem counter-intuitive, as flow cytometry
is typically more sensitive to low expression levels than bulk measurement
methods but is caused in this case by challenges in the effective
gating of a heterogeneous sample event population. Additional studies
can likely tune the sensitivity for flow cytometry by adjusting sample
preparation and data gating methods in order to better enrich the
events of interest in the assay and thus improve the limit of detection.
Regardless, the relatively fast time to acquire expression data from
leaves is enough to roughly predict values obtained by both standoff
detection of plant canopies and single cell analysis, and even this
rough level of agreement across multiple instruments increases the
degree of confidence that can be taken in the values that are reported.
Furthermore, the calibration of flow cytometry data to units of equivalent
standard dye molecules allows reproducible measurements of the activity
of plant regulatory elements across different laboratories, as well
as meaningful data interpretation and modeling with respect to the
actual molecular biology of the cells. The typical use of arbitrary
or relative units for reporting fluorescence, by contrast, is a source
of high uncertainty when data is compared across different laboratories
or experiments. Likewise, calibrated flow cytometry also provides
a path for linking the counts-per-second of the Fluorolog, pixel intensity
units of standoff detection, and molecular interpretation of phenomena
being quantified.
Building on these results, standard laboratory
equipment such as
a plate reader could likely be used for rapid quantification in labs
without a Fluorolog-3 scanning fluorometer. Likewise, data collection
from flow cytometry for prediction software could be utilized to predict
the canopy expression level necessary for standoff detection. Single
cell analysis allows for large sets of quantitative data to be calculated
from a population of cells. In comparison with the analysis of intact
tissue, flow cytometry allows more robust statistical parameters to be calculated
on ∼10⁶ events. Furthermore, protoplasts extracted
from various tissues reflect data on different cell subpopulations.
This information is important to compile a plant cell atlas, allowing
users to analyze and subdivide cells depending on size, internal complexity,
and fluorescence level. The plant synthetic biology community will
tremendously benefit from a common resource of activity of genetic
modules, allowing more precise, reproducible, and predictable engineering
of plant cells.
Through the use of these three methods, we improved
the available
knowledge regarding activity of regulatory elements and initiated
the process of prediction, from single cell to leaf and canopy expression.
This is important in the context of phenomics, where information at
genetic levels is associated with plant attributes. With regard to engineered
plants used to sense environmental stimuli (phytosensors), the quantified
level of fluorescence from standoff detection is particularly important.
These phytosensors must be optimally designed to prevent false positive
or false negative reporting of potentially harmful detected stimuli.
Through the use of this library, the on/off state of phytosensors
can be properly tuned for realistic function.
The library described
here is only one of the first steps toward
building a foundation of knowledge for plant synthetic biology. Several
additional kinds of studies will help assemble a comprehensive rulebook.
(1) The regulatory elements described in this work clearly do not
encompass all possible modules available for plant transgene expression.
Many other promoters, UTRs, N and C-terminal tags, linkers, and other
types of modules could be designed, assembled, and analyzed. Additionally,
monocotyledon regulatory elements are not typically functional in
dicotyledon plants. Additional studies focusing on monocotyledon parts
will be necessary to expand described libraries to this other group
of flowering plants. (2) Scars resulting from molecular cloning can
have considerable effects on transgene expression. Here, we demonstrated
that four base pair sequences between the stop codon and 3′UTR
have dramatic effects on gene expression. With regard to Golden Gate
cloning, screening of all 256 possible four-nucleotide overhangs used
to link modules could be used to optimize this method. However, utilizing
insulators or modeling approaches, similar to what has been achieved
in other organisms, will provide a more robust solution, especially
when considering the varying effects of different genes of interest
and not just well-characterized reporters. (3) It is known that potentially
deleterious positional effects occur following transgene integration
into the plant genome.[48] It is likely that
positional effects also occur from the spacing and orientation of
multiple transgene cassettes in transformation vectors. Therefore,
the evaluation of respective cassettes’ expressions due to
different structural designs and orientations will be important to
perform precise metabolic engineering. Unpredicted genetic motifs
with strong effects on gene expression can occur when combining different
regulatory elements. The validation of novel combined modules with
unique expression patterns will considerably expand the library of
genetic parts.
There are undoubtedly additional kinds of studies
that would continue
to advance the overall goal of plant synthetic biology. Likely, many
of these studies could be conducted in a manner similar to the one
described here. Regardless, there are improvements necessary for high-throughput
screening of increasingly populated module libraries. Though beneficial
and appropriate for state-of-the-art plant biotechnology, agroinfiltration
of N. benthamiana is hindered by the
need for hands-on labor. PEG-mediated transfection of protoplasts
by robotics seems to be the most likely answer to this high-throughput
bottleneck.[49] Protoplasts can easily be
obtained in large numbers through the generation of cell suspension
cultures.[21] Though established methods
have been developed for robotic transfection of protoplasts, a potential
hurdle is the necessity of concentrated and purified plasmid DNA.[50] To overcome this necessity, protocols could
be potentially established describing the use of PCR products for
transformation, as opposed to plasmids. Nonetheless, agroinfiltration
is a currently accepted method for transient expression of transgenes
in plants and can be immediately conducted to screen module libraries
to progress the molecular toolbox. Studies such as this one will help
the long-term goal of synthetic biology to develop new plants to secure
and sustain a growing human population.
Materials and Methods
Plant Growth Conditions
Nicotiana benthamiana plants were grown on a BK25 potting mix (ProMix) for 4–6
weeks before Agrobacterium tumefaciens-mediated transformation. Plants were kept in a controlled environment
at a constant temperature of 24 °C and a light/dark cycle of
16/8 h, respectively. Irradiation was provided using LED light at
an intensity of ∼350 μmol m⁻² s⁻¹. After transformation, plants were kept under the same
environmental conditions described above.
Construction of Plant Transformation Vectors
Golden
Gate modular cloning was used to assemble plant transformation vectors
as previously described.[20,26] The type IIS restriction
enzymes BsaI-HF-V2 (New England Biolabs, NEB) and BpiI (Anza Invitrogen, Thermo Fisher Scientific) were used
for level-1 and level-2 assemblies, respectively. All modules used
for Golden Gate assembly have been designed and domesticated as previously
described.[20,26] A list of genetic regulatory
elements including promoters, 5′UTRs, 3′UTRs, and coding
sequences used in this study is shown in Table S1. The coding sequences for GFP (mEmerald) (FPbase ID: AD4BK),
BFP (mTag-BFP2) (FPbase ID: ZO7NN), and RFP (mScarlet-I) (FPbase ID:
6VVTK) were gene synthesized through GeneArt (Thermo Fisher Scientific).
Plant expression cassettes for testing regulatory elements were integrated
into the level-2 acceptor plasmid pAGM4723.[26]
Vacuum Infiltration of N. benthamiana Plants
The library of plant transformation vectors generated
in this study was transformed into A. tumefaciens LBA4404 using the freeze-thaw method
as described previously.[51] For plant transformation,
a single bacterial colony carrying the construct of interest was grown
overnight in YEP selective liquid media (10 g/L peptone; 10 g/L yeast
extract; 5 g/L NaCl; 50 mg/L rifampicin; 50 mg/L kanamycin; pH 7)
at 28 °C under vigorous shaking (250 rpm). The day after, 1 mL
of preculture was inoculated in 100 mL of fresh YEP selective liquid
media. After 24 h of growth at the same conditions, 100 μL of
100 mM acetosyringone was added, and then the culture was grown for
an extra hour. Bacterial cells were pelleted by centrifugation at
4000 × g for 10 min at room temperature and resuspended
in infiltration buffer (10 mM 2-(N-morpholino)ethanesulfonic
acid hydrate, MES pH 5.7; 10 mM MgCl2; 100 μM acetosyringone)
to an OD of 0.6 for all single constructs tested. For combination
infiltrations of three different cassettes, the three cultures were mixed at
an OD of 0.2 each, giving a final OD of 0.6. Each plant infiltration required
∼300 mL of bacterial suspension. The suspension was poured
into a Magenta GA-7 vessel (7.7 × 7.7 × 9.7 cm) able
to contain the entire canopy of 4–6-week-old plants. After
submersion in the bacterial suspension, the plants were infiltrated
in a chamber by applying and releasing vacuum 3–6
times until infiltration was complete. Infiltrated leaves were patted dry
to remove excess bacterial suspension, and plants were returned to a controlled
environment and analyzed 72 h post infiltration. A minimum of three
plants were infiltrated per construct.
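The dilution arithmetic behind the combination infiltrations can be sketched as follows. This is only an illustration, assuming that OD scales linearly with dilution (C1·V1 = C2·V2); the stock OD values, the cassette names, and the function name are hypothetical and are not taken from the study.

```python
def dilution_volumes(stock_od, target_od, final_volume_ml):
    """Return (culture_ml, buffer_ml) needed to reach target_od in final_volume_ml."""
    if stock_od < target_od:
        raise ValueError("stock culture is more dilute than the target OD")
    culture_ml = target_od * final_volume_ml / stock_od
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical measured OD values of three resuspended cultures.
stock_ods = {"cassette_A": 1.8, "cassette_B": 2.4, "cassette_C": 1.2}
final_volume_ml = 300.0  # approximate volume needed per plant (from the text)
per_strain_od = 0.2      # each cassette contributes an OD of 0.2 (from the text)

# Volume of each stock culture that contributes OD 0.2 to the final mixture.
culture_volumes = {
    name: dilution_volumes(od, per_strain_od, final_volume_ml)[0]
    for name, od in stock_ods.items()
}
# Infiltration buffer makes up the remaining volume.
buffer_ml = final_volume_ml - sum(culture_volumes.values())
# Combined OD of the mixture, under the linear-dilution assumption.
total_od = sum(stock_ods[n] * v for n, v in culture_volumes.items()) / final_volume_ml

for name, vol in culture_volumes.items():
    print(f"{name}: {vol:.1f} mL")
print(f"infiltration buffer: {buffer_ml:.1f} mL, combined OD = {total_od:.2f}")
```

For a single construct, the same helper can be called with a target OD of 0.6.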
Fluorometric Analysis of Leaves
The scanning fluorescence
spectroscopy apparatus (Fluorolog-3, Jobin Yvon and Glen Spectra,
Edison, NJ) equipped with the FluorEssence Software (HORIBA Scientific,
version 3.8.0.60) was used to quantify the fluorometric signal from
leaf tissue as described previously.[46] The
GFP, RFP, and BFP signals were analyzed at excitation wavelengths
of 475, 550, and 400 nm, and emission wavelengths of 509, 574, and 455
nm, respectively, to collect the maximum emission peaks. Six measurements
per infiltrated plant were collected from the same leaf tissue used
for protoplast isolation. A minimum of three independent plants per
construct was analyzed. Fluorometric data was processed using Microsoft
Excel software as previously described.[46] The results are expressed as mean ± standard deviation of log10 of CPS (counts per second). To ensure that detected fluorescence
was not the direct result of A. tumefaciens, cultures containing constructs with promoters and 5′ UTRs
from A. tumefaciens were checked for
expression on a plate reader (Figure S2).
Protoplast Isolation
The top three fully expanded
leaves of each agroinfiltrated plant were cut into strips of 2–3
mm width using a blade. Leaf tissue was then put in a deep Petri dish
(10 cm diameter and 2 cm height) containing 25 mL of freshly made
enzyme solution (600 mM mannitol; 1 mg/mL BSA; 20 mM KCl; 10 mM CaCl2; 4.6 mM β-mercaptoethanol; 20 mM 2-(N-morpholino)ethanesulfonic acid hydrate, MES pH 5.7; along with 24
μL of Rohament CL, 22 μL of Rohapect 10 L, and 2.2 μL
of Rohapect UF per mL of solution) and incubated in the dark
under gentle shaking (40–50 rpm) at room temperature for 2–3
h. The plates were then shaken at 80 rpm for an additional 5
min to facilitate the release of protoplasts
from digested leaf tissue. The protoplast solution was then filtered
through a 40 μm nylon mesh filter into a clean Petri dish. After
transfer into a 50 mL conical tube, the filtered protoplasts were
centrifuged at 100 × g for 3 min and then the
cell pellet was resuspended in 5 mL of washing solution (600 mM mannitol;
20 mM KCl; 4 mM 2-(N-morpholino)ethanesulfonic acid
hydrate, MES pH 5.7). The protoplast solution was underlaid with 5
mL of 23% sucrose, keeping the two phases separated. After centrifugation
at 100 × g for 3 min, 5–6 mL of live
protoplasts were collected at the sucrose/washing solution interface.
Protoplasts were diluted by adding 8–10 mL of fresh washing solution
in a 15 mL tube, centrifuged again at 100 × g for 3 min, and then the cell pellet was resuspended in 1–2
mL of incubation solution (500 mM mannitol; 20 mM KCl; 4 mM 2-(N-morpholino)ethanesulfonic acid hydrate, MES pH 5.7). Resuspended
protoplasts were then analyzed by flow cytometry.
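The enzyme additions in the digestion solution are given per mL, so the volumes for a batch scale with the number of dishes. A minimal sketch of that scaling follows, assuming 25 mL of enzyme solution per dish as stated above; the function name and the number of dishes are hypothetical.

```python
# Enzyme volumes per mL of solution, taken from the text (in microliters).
ENZYME_UL_PER_ML = {
    "Rohament CL": 24.0,
    "Rohapect 10 L": 22.0,
    "Rohapect UF": 2.2,
}


def enzyme_volumes(n_dishes, ml_per_dish=25.0):
    """Return the volume (in microliters) of each enzyme for n_dishes of solution."""
    total_ml = n_dishes * ml_per_dish
    return {name: ul_per_ml * total_ml for name, ul_per_ml in ENZYME_UL_PER_ML.items()}


# Hypothetical batch: three plants, one dish each.
volumes = enzyme_volumes(n_dishes=3)
for name, microliters in volumes.items():
    print(f"{name}: {microliters:.0f} uL for 75 mL of enzyme solution")
```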
Flow Cytometry and Data Analysis
Flow cytometric analysis
was conducted using an Attune NxT acoustic focusing cytometer equipped
with the manufacturer’s operating software (Life Technologies,
Carlsbad, CA, USA). Protoplasts suspended in WI buffer at a concentration
of ∼1 × 10⁵ cells per mL were analyzed at an
acquisition volume of 750 μL with a flow rate of 500 μL/min.
Forward-scattered (FSC) and side-scattered (SSC) light voltages were
set at 50 and 180 V, respectively. The GFP, BFP, and RFP fluorescent
reporters were excited using 488, 405, and 561 nm lasers with 510/10,
440/50, and 585/16 bandpass filters, respectively. The voltage for
GFP was 200 V, whereas those for BFP and RFP were 300 V.
Flow cytometry data was processed using the TASBE Flow Analytics
software package,[52] following the recommended
practices for gating, background subtraction, and bead-based calibration.
Calibration to MEFL units used the closest-fit match of GFP to the 488 nm excitation and
530/30 filter channel defined for SpheroTech URQP-38-6 K calibration
beads. Data from each sample was fit to a two-component Gaussian mixture,
with the low component interpreted as non-transfected cells and the
high component interpreted as transfected cells. The non-transfected
component was discarded, and statistics for the high component were
reported. Additional details and examples are provided in the Supporting Information, “Flow Cytometry
Data Processing”. The results are expressed as the mean ±
standard deviation of log10 of MEFL (molecules of equivalent
fluorescein).
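The two-component mixture step can be illustrated with a small expectation-maximization fit. This sketch is not the TASBE Flow Analytics code: it is a standalone, standard-library example that assumes the fit is done on log10-transformed MEFL values, uses synthetic data, and uses hypothetical function names.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean (low, high).
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    # Initialize the two components from the lower and upper halves of the data.
    params = [
        (0.5, statistics.fmean(part), max(statistics.pstdev(part), 1e-3))
        for part in (ordered[:half], ordered[half:])
    ]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to the second component.
        resp_second = []
        for x in values:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp_second.append(p1 / (p0 + p1) if (p0 + p1) > 0 else 0.5)
        # M-step: re-estimate weight, mean, and sd of each component.
        new_params = []
        for resp in ([1.0 - r for r in resp_second], resp_second):
            n_k = max(sum(resp), 1e-12)
            mu = sum(r * x for r, x in zip(resp, values)) / n_k
            var = sum(r * (x - mu) ** 2 for r, x in zip(resp, values)) / n_k
            new_params.append((n_k / len(values), mu, max(math.sqrt(var), 1e-3)))
        params = new_params
    return sorted(params, key=lambda p: p[1])


# Synthetic log10(MEFL) values: hypothetical non-transfected and transfected cells.
random.seed(0)
log_mefl = [random.gauss(3.0, 0.2) for _ in range(600)]
log_mefl += [random.gauss(5.5, 0.3) for _ in range(400)]

low, high = fit_two_gaussians(log_mefl)
# Discard the low component and report only the high (transfected) component.
print(f"transfected fraction = {high[0]:.2f}")
print(f"mean log10 MEFL = {high[1]:.2f} +/- {high[2]:.2f}")
print(f"geometric mean = {10 ** high[1]:.3g} MEFL")
```

Fitting in log space means the mean and standard deviation of the high component correspond directly to the geometric mean and geometric standard deviation of the transfected population.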
The fluorescence-inducing laser projector (FILP) was used for imaging
agroinfiltrated N. benthamiana canopies
expressing fluorescent reporters as described previously.[46] The GFP fluorescent signal was acquired with a 100
ms exposure time using a 465 nm laser and a 525 nm emission filter,
while chlorophyll a was imaged with a 100 ms exposure time using
a 525 nm laser and a 680 nm emission filter. Standoff
detection was performed at 3 m from the laser source.
Images were processed using ImageJ 1.41o (National Institutes of Health, Bethesda,
MD, USA), and the same software was used to quantify the fluorescence
intensity from plant canopies. For this purpose, chlorophyll images
were used to set the threshold and automatically identify individual
canopy areas (ROI: region of interest), and then the GFP fluorescence
in these regions of interest was quantified by pixel intensity. The
results are expressed as the mean ± standard deviation of log10 of pixel intensity.
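The masking logic described above (chlorophyll image defines the canopy, GFP is measured inside it) can be sketched in a few lines. This is not the ImageJ workflow itself: it is a simplified illustration that treats all above-threshold pixels as a single ROI rather than identifying individual canopy areas, and the tiny images, threshold value, and function name are hypothetical.

```python
import math
import statistics


def mean_gfp_in_canopy(gfp, chlorophyll, threshold):
    """Mean GFP pixel intensity over pixels whose chlorophyll signal exceeds threshold."""
    selected = [
        g
        for gfp_row, chl_row in zip(gfp, chlorophyll)
        for g, c in zip(gfp_row, chl_row)
        if c > threshold
    ]
    if not selected:
        raise ValueError("no pixels above the chlorophyll threshold")
    return statistics.fmean(selected)


# Tiny hypothetical 3x4 images (arbitrary intensity units).
chlorophyll = [
    [5, 120, 130, 4],
    [6, 140, 150, 3],
    [2, 3, 4, 1],
]
gfp = [
    [9, 800, 1200, 7],
    [8, 1000, 1000, 6],
    [5, 4, 6, 3],
]

canopy_mean = mean_gfp_in_canopy(gfp, chlorophyll, threshold=50)
print(f"mean GFP in ROI = {canopy_mean:.0f}, log10 = {math.log10(canopy_mean):.2f}")
```

Using the chlorophyll channel for the mask keeps the ROI independent of the GFP signal, so weakly expressing canopies are not excluded by the thresholding step.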
Confocal Microscopy
Agroinfiltrated Nicotiana benthamiana leaves were imaged using an
Olympus Fv1000 confocal microscope (Olympus) equipped with argon,
HeNe, and diode lasers. The mEmerald, mTag-BFP2, and mScarlet-I fluorescent
proteins were detected at optimal excitation (Ex)/emission (Em) wavelengths
of 487/509, 399/454, and 569/593 nm, respectively. Chlorophyll was excited
at a wavelength of 561 nm and detected at 682 nm. Digital images were
acquired using the Olympus FV10-ASW Viewer software Ver.4.2a (Olympus)
and processed using ImageJ 1.41o (National Institutes of Health, Bethesda,
MD, USA).
Expression Analysis
We do not use significance testing
methods because our aim is to evaluate quantitative levels of expression,
not to test a hypothesis about differences of distributions. Significance
testing is therefore not applicable: each set of samples may be evaluated
independently in terms of how well its expression level was
determined, as shown through the geometric mean, geometric standard
deviation, and (for flow cytometry) the geometric means for individual
samples. Note also that such statistics have previously been shown
to be effective for modeling and prediction of genetic regulatory
elements.[24,53−55]
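The geometric statistics named above relate directly to the "mean ± standard deviation of log10" values reported for each assay: the geometric mean is 10 raised to the mean of the log10 values, and the geometric standard deviation is 10 raised to their standard deviation. A minimal sketch follows, assuming a sample (n − 1) standard deviation, which the text does not specify; the readings and the function name are hypothetical.

```python
import math
import statistics


def geometric_stats(values):
    """Return (geometric mean, geometric SD, mean of log10, SD of log10)."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric statistics require strictly positive values")
    logs = [math.log10(v) for v in values]
    mean_log = statistics.fmean(logs)
    sd_log = statistics.stdev(logs)  # sample SD; an assumption, not stated in the text
    return 10 ** mean_log, 10 ** sd_log, mean_log, sd_log


# Hypothetical fluorescence readings from three plants (arbitrary units).
readings = [1.0e4, 1.0e5, 1.0e6]
gm, gsd, mean_log, sd_log = geometric_stats(readings)
print(f"log10 mean +/- SD = {mean_log:.2f} +/- {sd_log:.2f}")
print(f"geometric mean = {gm:.3g}, geometric SD = {gsd:.3g}-fold")
```

The geometric standard deviation is a multiplicative factor, so a value of 10 means the readings typically span about one order of magnitude on either side of the geometric mean.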
Author Contributions
A.C.P., A.O., M.-A.N., H.S., C.N.S.,
J.B., and S.C.L. designed
the strategy; A.C.P., A.O., M.-A.N., L.T.D., S.A.H., L.L., D.N.R.,
and T.M.S. collected data; A.C.P., A.O., M.-A.N., and H.S. analyzed
data; A.C.P., A.O., M.-A.N., H.S., C.N.S., J.B., and S.C.L. wrote
the article.
Authors: M X Caddick; A J Greenland; I Jepson; K P Krause; N Qu; K V Riddell; M G Salter; W Schuch; U Sonnewald; A B Tomsett Journal: Nat Biotechnol Date: 1998-02 Impact factor: 54.908
Authors: Tobias Jores; Jackson Tonnies; Travis Wrightsman; Edward S Buckler; Josh T Cuperus; Stanley Fields; Christine Queitsch Journal: Nat Plants Date: 2021-06-03 Impact factor: 15.793
Authors: Benjamin H Weinberg; Jang Hwan Cho; Yash Agarwal; N T Hang Pham; Leidy D Caraballo; Maciej Walkosz; Charina Ortega; Micaela Trexler; Nathan Tague; Billy Law; William K J Benman; Justin Letendre; Jacob Beal; Wilson W Wong Journal: Nat Commun Date: 2019-10-24 Impact factor: 14.919