The process of optimizing the properties of biological molecules is paramount for many industrial and medical applications. Directed evolution is a powerful technique for modifying and improving biomolecules such as proteins or nucleic acids (DNA or RNA). Mimicking the mechanism of natural evolution, one can enhance a desired property by applying a suitable selection pressure and sorting improved variants. Droplet-based microfluidic systems offer a high-throughput solution to this approach by helping to overcome the limiting screening steps and allowing the analysis of variants within increasingly complex libraries. Here, we review cases where successful evolution of biomolecules was achieved using droplet-based microfluidics, focusing on the molecular processes involved and the incorporation of microfluidics to the workflow. We highlight the advantages and limitations of these microfluidic systems compared to low-throughput methods and show how the integration of these systems into directed evolution workflows can open new avenues to discover or improve biomolecules according to user-defined conditions.
The compartmentalization of
phenotype and genotype in living cells is a key requirement for natural
evolution. Since the first experiments using oil and aqueous phases
to generate cell-like compartments in the late nineties,[1] the marriage between droplet-based microfluidics
and directed evolution has become a key technique in the field of
protein and nucleic acid engineering. Droplet-based microfluidics
encompasses a set of microelectromechanical systems able to generate,
steer, manipulate, and analyze water-in-oil droplets inside a microfluidic
chip. Molecular engineering refers to biomolecule optimization methods
to improve a specific property of a protein or a nucleic acid, such
as its catalytic activity, ability to bind a ligand, or thermostability.
From the first experiments carried out in the late 1960s[2] to the Nobel Prize awarded to Frances Arnold
in 2018,[3] research in this field has experienced
exponential growth. Directed evolution consists of three well-defined
steps, variant generation, production, and selection, performed iteratively
until a biomolecule with a set of desired properties is obtained.
In this Review, we highlight the advantages of incorporating microfluidics
into the directed evolution workflow. Additional information on the
directed evolution of novel catalytic functions and improved enzymes
in drops can be found in several excellent recent reviews from the
Hilvert[4] and Hollfelder[5] groups, among others. Here, we focus on the versatility
of the microfluidics technology and how it allows the use of multiple
strategies for assay design, protein expression, and selection.
From Natural to Directed Evolution
Biomolecular evolution can
be described as a path from one functional
biomolecule to another in the space of all possible biomolecular variants,
where each variant has an assigned fitness[6] (Figure 1). In nature, fitness is the ability of an organism to
reproduce in a particular environment and consequently spread its
genes. In the laboratory, the selective pressure and therefore the
fitness are set by the experimenter. Directed evolution is a growing
field in synthetic biology and has the capacity to provide new proteins
or nucleic acids on the basis of predefined industrial or biomedical
needs. It relies on the Darwinian principle of mutation and selection,
where the probability of success is determined by the ability to find
rare optimal variants within a large pool of sequences. All directed
evolution experiments require a measurable activity that acts as a
fitness indicator to drive the selection process “uphill”
within the protein or nucleic acid sequence space, ultimately resulting
in a biomolecule with a desired set of properties. As a result, high-throughput
screening techniques are necessary to achieve directed evolution,
and droplet-based microfluidics now provides a way of overcoming the
limiting step of screening, making it possible to achieve ∼1000-fold
higher throughput than microtiter plate screening.[7] Droplets act as a picoliter-volume reaction vessel to perform
biochemical assays at the single variant level, linking the activity
of the functional molecule often made of amino acids (phenotype) to
its corresponding encoding molecule in the form of nucleic acids (genotype).
This compartmentalization is as crucial for directed evolution as
it is for natural evolution since the encoding molecules and the functional
molecules are distinct with the exception of DNA- and RNA-based molecules
(ribozymes, DNAzymes, and aptamers). Furthermore, working at a picoliter
scale not only allows for higher throughput but also decreases reagent
consumption by a million-fold, thus greatly impacting the cost of
molecular engineering.[8] In the following
sections, we present the various elements that make up a directed
evolution workflow with an emphasis on aspects that are unique to
droplet-based microfluidic setups.
Figure 1
Directed evolution fitness landscape.
A predefined library is used
as a starting point to navigate the genetic diversity landscape, ideally
reaching a local or global fitness maximum after several iterations
of mutagenesis, expression, and selection.
Variant Generation
Any evolutionary process requires genetic diversity
as a starting
point from which an improved variant can emerge. In directed evolution,
genetic diversity is obtained by introducing mutations within the
gene of interest to yield genetic libraries with up to tens of thousands
of variants. Methods to generate genetic diversity are broadly divided
into two groups: random and semirational. Random mutagenesis methods[9] are based on the use of an error-prone polymerase
chain reaction (epPCR),[10] a process whereby
PCR reaction conditions are altered to facilitate the misincorporation
of nucleotides, resulting in randomly incorporated variations along
the sequence. This approach is widely used when the key residues responsible
for a given function are not known. In contrast, semirational design
relies on a priori biochemical or structural knowledge
of the system to create constraints in the design of the mutant library,[11,12] such that residues that influence a biomolecule’s function
are preferentially targeted and consequently the resulting library
is more likely to contain variants with enhanced properties. The most
widely used technique for semirational library design is saturation
mutagenesis, where the targeted residues are randomized to introduce
all possible amino acids or nucleotides in the final product. For
protein evolution experiments, codons are often not fully randomized
and more restricted codon sets are used, such as NNK (N = A/C/G/T,
K = G/T), which covers all possible amino acids and one stop codon.
In addition, one can also use codons to encode a minimal set of amino
acids, representing the main chemical types.[13] The method of choice for generating mutants will depend on prior
knowledge of the key residues of a biomolecule, the size of the gene,
and the screening capacities. As an example, in a NNK library, all
20 amino acids will be represented at each mutated codon, and the
theoretical number of full-length protein variants is $20^{n}$, where n indicates the number of
targeted sites. Therefore, if this is the chosen strategy, one should
carefully choose the residues to mutate, as it will be virtually impossible
to test all of the possible amino acid variants at each position for
most proteins. If the key residues responsible for the function of
a biomolecule are not known or the researcher simply wishes to explore
different evolutionary paths, it may therefore be more effective to
use a randomized library, bearing in mind that, the bigger the biomolecule,
the lower the likelihood that a key residue will be targeted.
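The codon arithmetic above can be checked with a short sketch. It expands a degenerate codon such as NNK into its concrete codons, translates them with the standard genetic code, and evaluates the $20^{n}$ library size for a few values of n. The helper names (`expand`, `summarize`, `protein_library_size`) are hypothetical and chosen only for illustration; the sketch assumes the standard genetic code and ignores codon bias and synthesis errors.

```python
from itertools import product

# IUPAC nucleotide codes needed for this sketch (subset of the full alphabet)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "K": "GT", "Y": "CT", "D": "AGT", "B": "CGT",
}

# Standard genetic code, with codons enumerated in T, C, A, G order at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: aa
    for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(degenerate_codon):
    """Return every concrete codon encoded by a degenerate codon such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]


def summarize(degenerate_codon):
    """Return (number of codons, sorted amino acids encoded, number of stop codons)."""
    codons = expand(degenerate_codon)
    translated = [CODON_TABLE[c] for c in codons]
    amino_acids = sorted(set(translated) - {"*"})
    return len(codons), amino_acids, translated.count("*")


def protein_library_size(n_sites, n_amino_acids=20):
    """Theoretical number of protein variants when n_sites are fully randomized."""
    return n_amino_acids ** n_sites


n_codons, amino_acids, n_stops = summarize("NNK")
print(f"NNK: {n_codons} codons, {len(amino_acids)} amino acids, {n_stops} stop codon(s)")
for n in (1, 2, 4, 8):
    print(f"{n} randomized sites -> {protein_library_size(n):,} protein variants")
```

Because the library grows exponentially with the number of targeted sites, even a handful of fully randomized positions can exceed what a given screen is able to sample, which is why the choice of residues matters.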
Variant Production
Protein variants may be expressed either inside
a host or within
a cell-free system. The use of a host for protein expression is well
established in research and in the pharmaceutical industry[14] and has historically been the most used method
to amplify and produce candidates for directed evolution.[15] The two most common prokaryotic and eukaryotic
hosts for protein production are the model bacterium E. coli and the yeast S. cerevisiae, respectively.
In both cases, a plasmid carrying the gene of interest is transformed
into the host cells for expression by the translation machinery. Cells
are then propagated and induced to express the protein in a variety
of ways, most commonly inside the cytoplasm (Figure 2) but also inside the membrane or secreted
into the periplasm or into the surrounding medium. The advantages
of using a cellular expression system lie in its simplicity and in
the reliance on the cell environment and translational machinery,
which enable the correct folding and post-translational modification
of the target proteins. Moreover, cell expression provides a tighter
genotype–phenotype linkage, which is useful in the context
of directed evolution.
Figure 2
Cell and cell-free protein expression in drops. In vivo expression takes advantage of a host organism (bacteria,
yeast)
for the heterologous expression of a gene or a DNA fragment. In vitro transcription and/or translation systems use the
basic machinery of the cells to produce the protein of interest without
the cell wall.
An alternative to protein expression
inside a host is a cell-free
system, in which an in vitro transcription and translation
system (IVTT) processes the information encoded in a DNA template
into an RNA transcript that is subsequently translated into protein
without the environment of a cell[16] (Figure 2). Cell-free systems
allow the user to have direct control over the transcription and translation
machinery of the cell without the constraints of the cell envelope.
In this way, it is possible to co-translationally produce and solubilize
membrane proteins,[17,18] translate toxic or difficult-to-express
proteins,[19] introduce site-specific labels,
or incorporate nonstandard amino acids[20−23] or other monomers[24] into the polypeptide chain. However, the use
of these systems results in lower yields of protein compared to host
organisms. The first effective cell-free transcription–translation
system dates back to 1961,[25] when Nirenberg
and Matthaei successfully synthesized proteins with the translational
machinery of E. coli in order to decipher the
genetic code, but it was not until the 2000s when the applications
of this technology began to be exploited with notable advances in
the field of protein synthesis,[26] production
of pharmaceutical compounds,[27,28] or screening of protein
and peptide libraries.[29]
Finally,
DNA variant libraries for directed evolution are typically
synthesized by solid-phase synthesis and/or amplified by PCR, while
RNA libraries are generally produced enzymatically from DNA templates
using T7 RNA polymerase.
Once the choices for generating diversity
and expressing variants
are made, the microfluidics workflow must be adapted accordingly (Figure 3). How microfluidics
technologies are implemented into each directed evolution scenario
mainly depends on the expression system used. Common modules included
in a droplet-based microfluidic system are droplet makers[30] and sorters, on the basis of either dielectrophoresis[31] or acoustic waves.[32] In addition to these key modules, various modules allow droplet
manipulation, such as droplet fusion,[33] splitting,[34] and picoinjection,[35] making microfluidics a versatile and adaptable
tool. Commonly, droplets are collected and incubated off-chip and
later reinjected for an end-point measurement. Alternatively, incubation
can take place on-chip in incubation channels when shorter incubation
times are needed.[36] The latter allows precise
control over the reaction’s incubation time because the flow parameters
can be set precisely, making it possible to measure
the reaction at an end point or at a controlled time point, such as
in kinetics measurements. To illustrate this, let us consider a typical
microfluidic workflow for the directed evolution of enzymes with enhanced
properties. In cases where the enzyme of interest is to be produced
in an E. coli host, individual cells expressing
a single enzyme variant are coencapsulated with a substrate for the
chosen assay. If the protein is set to remain within the cytoplasm,
protein induction will be performed off-chip, and the cells will later
be encapsulated together with a lysis agent to provide enzyme accessibility
to the assay. In the case of protein secretion, the order would be
to first encapsulate the cells and later induce them by picoinjection
once they are inside the drops to maintain phenotype–genotype
linking. Alternatively, cells can also be encapsulated in inductive
medium. Finally, if the protein is displayed at the cell surface or
targeted to the periplasm, fewer steps are required, as there is no
need to lyse the cells, and the induction can be carried out prior
to the encapsulation. If, on the other hand, the enzyme is to be
produced using an IVTT system, a DNA library encoding a large number
of variants will be diluted and encapsulated together with a PCR mixture,
such that each drop contains no more than a single DNA molecule. After
performing in-drop PCR amplification, each droplet can be fused with
another drop containing IVTT reagents and incubated, making this
kind of process more complex than cell-based experiments. After
the enzyme of interest has been expressed, substrate is added to perform
the enzymatic assay. Finally, droplets containing cells or the IVTT
mixture are (re)injected into a sorting module to select the desired
variants.
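The requirement that each drop contains no more than a single DNA molecule (or a single cell) can be illustrated with a small calculation. The sketch below assumes that templates are distributed randomly among droplets and therefore follow Poisson statistics, with a mean occupancy `lam` set by the dilution; this statistical model is an assumption added here for illustration and is not taken from a specific study in this Review. The function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a droplet contains exactly k templates at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy(lam):
    """Return fractions of empty, singly and multiply occupied droplets,
    plus the fraction of occupied droplets that hold exactly one template."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    p_multi = 1.0 - p_empty - p_single
    single_among_occupied = p_single / (1.0 - p_empty)
    return p_empty, p_single, p_multi, single_among_occupied


for lam in (0.05, 0.1, 0.3, 1.0):
    empty, single, multi, clonal = occupancy(lam)
    print(f"lam={lam:<4} empty={empty:.3f} single={single:.3f} "
          f"multiple={multi:.4f} single-among-occupied={clonal:.3f}")
```

Under this assumption, stronger dilution improves the genotype–phenotype linkage at the cost of more empty droplets, so a larger number of droplets must be generated and sorted to cover the same library.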
Figure 3
Typical microfluidic workflow and description of the main microfluidic
modules. An initial compartmentalization is followed usually by one
or more picoinjection, fusion, and/or incubation events. Finally,
a sorting step is needed to select the biomolecules of interest. Scale
bars are 50 μm in all the pictures. Insets are reprinted with
permission from open access papers: Beneyton et al. Out-of-equilibrium microcompartments for the bottom-up integration
of metabolic functions. Nat. Commun. 2018, 9, 1–10;[47] Beneyton et al. High-throughput synthesis and screening of functional
coacervates using microfluidics. ChemSystemsChem 2020, 2, e2000022;[48] Schütz, S. S. et al. Rational design
of a high-throughput droplet sorter. Lab Chip 2019, 19, 2220–2232.[60]
Variant Selection
The assessment
of the performance of individual variants is achieved
through an assay that produces a readout signal proportional to the
fitness of the variant. The choice of the assay is crucial in determining
the success of the experiment and is in most cases the limiting factor
for whether directed evolution can be performed. First of all, a
minimal starting activity is required to proceed with a directed evolution
workflow. Second, the readout must be fast and sensitive enough for
the high-throughput screening of the droplets, which flow at
high velocity through microfluidic channels. Third, the diffusion
of the assay components between droplets and into the oil should be
limited.[37,38]
Most assays performed in droplets
rely on a laser-induced fluorescence
readout, such as fluorescence-activated droplet sorting (FADS)[31,32] or adapted versions of commercial fluorescence-activated cell sorting
(FACS).[39,40] This allows for highly sensitive measurements
to be performed in the sub-millisecond time scale down to 1 nM of
product,[41] thus enabling the sorting and
selection of variants in the kilohertz range. However, the implementation
of a fluorogenic assay is not always trivial and has so far been limited
to a narrow range of highly specific reactions that result in the
activation of a fluorophore directly, by using a fluorogenic substrate,[41−43] through a coupled assay,[44,45] or by the release of
a quencher.[40,46] In the first case, typically,
the natural substrate of the enzyme must be chemically modified with
a fluorophore, which could potentially lead to the identification
of an enzyme with improved activity toward the modified fluorogenic
substrate rather than the native substrate. To minimize this risk,
a coupled assay with a fluorophore may be used to keep the substrate
unmodified. This may require an additional enzymatic cascade reaction
to be included in the assay, though it must be ensured that the side
reactions do not interfere with the assayed enzyme. Certainly, these
additional reactions should not be rate-limiting in order to make
sure that the selection pressure is applied to the enzyme of interest.
In the final case, the assay relies on the activation of a fluorophore
by displacing or removing a quencher from the donor–quencher
pair and in this way increasing fluorescence intensity. Finding a
good donor–quencher pair for a substrate is not straightforward.
The substrate needs to be chemically modified with the donor–quencher
pair, which can alter enzymatic activity toward the native substrate,
similar to the use of a fluorogenic substrate. Finally, with this
method, the background signal tends to be higher, which makes the
assay less sensitive.[47,48]
Recent developments in
detection systems compatible with microfluidic
chips have been made in order to extend the range of assays amenable
to droplet microfluidics. Absorption-based methods are more universal
but remain challenging because of the reduced optical path length
of the microfluidic channels, which impacts the sensitivity and reliability
of the measurements. However, successful absorption-based directed
evolution has been achieved at a reduced throughput (300 Hz)[49] with a detection limit of 10 μM of product
through a coupled assay. Another method was recently developed for
high-speed absorbance measurements. The method relies on the phase
shift of light due to the photothermal effect and allows single-point
absorption measurements at rates similar to those used in FADS.[50] Recently, the first directed evolution screen
based on electrochemical measurements[51] and other label-free screening methods have been demonstrated in
droplets. Moreover, light scattering[52] and
image processing[53] could successfully be
applied to the screening of populations based on cell growth with
promising capabilities when coupled to artificial intelligence.[54−56] Finally, efforts were made recently to integrate Raman spectroscopy[57] and mass spectrometry[58,59] into the droplet format.
After detection, droplets are deflected
toward the desired outlet
mostly using electric fields. The design of the electrodes plays a
key role in the optimization of the electric field gradient and in
maximizing droplet displacement.[60] A second
key factor is the size of the droplets to be sorted. The highest sorting
throughput reported was 30 kHz,[61] which
is achieved for rather small droplets (8 pL). However, this throughput
has yet to be reached for biological experiments. The limiting factor
at this point was the data acquisition of the electronic system. Nevertheless,
when investigating, for example, cell proliferation, larger drops
(100 pL to 1 nL) are required to provide a sufficient amount of nutrients
for the longer incubation time.[62] Sorting
of such large drops has been performed at decreased throughputs since
these require a larger deflective force to be displaced, tend to break
easily due to the electric field, or split at the sorting junction
when the flow velocity is too high. Recently, a novel method that
utilizes an array of electrodes that can be triggered sequentially
has been introduced. This method allows one to sort larger drops at
increased throughputs[55] (850 Hz for 1 nL
droplets). The relationship between droplet size and throughput is
shown in Table 1 using
selected studies and their novelties.
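To give a sense of what these sorting rates mean in practice, the following sketch converts a sorting throughput into the time needed to pass a library through the sorter. The rates are those quoted in this section; the library size of 10^7 variants and the `oversampling` parameter (droplets screened per library member, which in practice is greater than one because many droplets are empty or redundant) are hypothetical values chosen only for illustration.

```python
# Sorting rates in Hz, as quoted in this section
REPORTED_RATES_HZ = {
    "fluorescence-activated sorting, 12 pL droplets": 2_000,
    "layered sorting junction, 8 pL droplets": 30_000,
    "sequential electrode array, 1 nL droplets": 850,
    "absorbance-based sorting": 300,
}


def screening_time_hours(library_size, rate_hz, oversampling=1.0):
    """Hours needed to sort library_size * oversampling droplets at rate_hz."""
    return library_size * oversampling / rate_hz / 3600.0


LIBRARY_SIZE = 10**7  # hypothetical library size, for illustration only

for label, rate in REPORTED_RATES_HZ.items():
    hours = screening_time_hours(LIBRARY_SIZE, rate)
    print(f"{label}: {hours:.2f} h")
```

The estimate covers sorting time only; droplet generation, incubation, and reinjection add to the total duration of an experiment.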
Table 1
Achieved Sorting Throughputs for Various Droplet Sizes (a)
| drop size (pL) | sorting throughput (kHz) | novelty | ref |
|---|---|---|---|
| 12 | 2 | fluorescence-activated droplet sorting | [31] |
| 110 | 0.2 | concentric sorting electrode | [63] |
| 8 | 30 | layered sorting junction | [61] |
| 100–1000 | 0.85–4.4 | sequentially addressable electrode array | [55] |
(a) Different technical advances have been implemented in order to enhance the final throughput.
Several standard analytical techniques
have been compartmentalized,
such as PCR,[64] ELISA,[65] MDA,[66] and cell transfection,[67] giving rise to “digital” techniques
and resulting in improvements in accuracy, sensitivity, and throughput.
Further, droplet microfluidics can be integrated with benchtop flow
cytometers for droplet sorting. However, since FACS is not compatible
with the oil phase of droplet microfluidics, additional steps are
necessary to do so. For example, it is possible to create a double
water–oil–water emulsion[40,68,69] or to create hydrogel beads to compartmentalize the
aqueous phase[39,70] to allow droplet sorting. As
described above, directed evolution in droplets is a multistep process
and often requires a complex workflow with numerous handling steps
where temperature and pressure control are vital. Consequently, efforts
to automate these processes have been made, such as developing a platform,
based solely on integrating temperature control, to automate transformation,
culture, and expression of recombinant proteins inside a host microorganism.[71] More recently, the possibility of full automation and
integration of the microfluidic workflow was demonstrated using a
system composed of 3 components: (i) a robotic liquid handler; (ii)
syringe pumps with valves, which can withdraw and pump fluid; (iii)
microfluidic unit operations, such as droplet generation, merging,
and sorting.[72] Such a system can perform
all of the steps required for directed evolution and shows a high
level of flexibility.
Enzyme Engineering by Directed Evolution in Drops
Natural enzymes have been optimized over billions
of years of evolution
to effectively perform an enormous variety of catalytic reactions
in a selective manner. For industrial or therapeutic needs, this process
needs to be accelerated and must be adapted to reactions that do not
normally occur in nature. Enzyme engineering and directed evolution
allow one to improve the natural activity of an existing enzyme,
change its preferred substrate toward an activity of interest, or
even design enzymes by computational methods.[73]
Native Substrates
Evolving enzymes for higher
activity on their natural substrate is challenging. Since the
enzyme has already been improved through the multiple cycles of natural
evolution, it is reasonable to expect, at best, a low magnitude improvement
from the starting molecule in the final hit. However, the high number
of variants that can be sorted using microfluidics increases the chances
of finding a variant with improved
catalytic activity toward its original substrate. Usually, the generation
of genetic variation includes a step of random mutagenesis to cover
different evolutionary pathways, as the positions that need to be
changed in the protein are not always known to the protein engineer.
One example of an enzyme that has been evolved to improve its activity
toward its original natural substrate is phenylalanine
dehydrogenase. In this case, the enzyme, which catalyzes the NAD+-dependent deamination of amino acids, was improved to yield
2.7-fold higher activity than the wild-type.[49] To do so, genetic diversity was first introduced by epPCR followed
by DNA shuffling of the resulting amplicon. The originality of this
work, however, lies in the use of an absorbance detection module for
the subsequent selection of protein variants in contrast with the
FADS approach, which is often used in microfluidics. The protein expression
system of choice was the prevalent cytoplasmic expression, where the
cell is encapsulated with a lysis agent and the drop is used as a
compartment to link the genotype of a variant to its phenotype, usually
expressed as the level of fluorescence.
One interesting alternative
to cytoplasmic cell-expression involves
the generation of hydrogel beads. These beads are surrounded by a
polyelectrolyte shell to link phenotype with genotype, where the compartmentalization
is robustly preserved. They can therefore function as a compartment
containing lysate from a single cell, allowing the use of a benchtop
flow cytometer for droplet sorting. The potential of this system was
demonstrated using phosphotriesterase (PTE), a bioremediation catalyst,
where a single round of mutagenesis by epPCR followed by sorting of
the 0.2% most active variants resulted in the identification of a
variant presenting an 8-fold improvement in kcat/Km for its native substrate,
the pesticide paraoxon, and a 19-fold improvement for the substrate
tetraethyl-O-fluorescein-diphosphate.[39] As expected, the increased activity toward the
non-native substrate was greater than that toward its natural substrate.
In some cases, the product of interest after an enzymatic reaction
can be one of two possible stereoisomers. Enzymes that exhibit high
enantioselectivity are seldom found in nature, and their directed
evolution has been limited by the requirement of a chiral chromatography
step. In order to engineer an esterase with improved enantioselectivity
for the production of pharmaceutically important (S)-profens, a dual channel microfluidic droplet screening system was
developed.[43] This system uses a dual-fluorescence
detection/sorting microfluidic device that allows the evaluation of
two reaction channels to simultaneously screen for improved catalytic
activity and enantioselectivity. Importantly, this system could also
be used to select for additional enzymatic properties such as regioselectivity
or chemoselectivity. After five rounds of mutagenesis and screening,
a variant with 700-fold improved enantioselectivity for the desired
(S)-profens was selected and identified. In this
case, the genetic diversity was generated by both random and rational
mutagenesis with a combination of epPCR, DNA shuffling, and saturation
mutagenesis.
Alternatively, the display of the protein of interest
at the cell
surface rather than in the cytoplasm simplifies the whole workflow
of directed evolution in drops by bypassing the lysis step and allowing
simple DNA recovery by colony regrowth after sorting. In this manner,
the link between genotype and phenotype is strengthened, and such
an approach is typically chosen when using yeast as a host organism
for detection and sorting experiments. In one of the earliest examples
of directed evolution using microfluidics, horseradish peroxidase
(HRP) was displayed at the surface of S. cerevisiae by anchoring it to its cell wall. As mentioned earlier, the improvement
of an already highly efficient enzyme can be challenging, but in this
case, the final protein was an improved mutant with a 10-fold greater
catalytic rate compared to the wild-type enzyme. Notably, the high-throughput
screening system was key in this process, since it allowed the identification
of ∼100 variants at least as active as the wild-type HRP from
a population of ∼10^7, discarding the degenerate
mutations, which are a majority. Genetic diversity was achieved by
combining libraries created by epPCR that target residues along the
whole protein with libraries created by saturation mutagenesis that
target residues closer to the active site. The most active variants
from both libraries were further mutated and screened by a final round
of microfluidics.[41]
More recently,
the same display system was used to enrich a population
of cells expressing glucose oxidase mutants with higher activity compared
to the wild-type enzyme.[74] The library
of mutants was created using site-directed mutagenesis, where the
changes of residues are directed to one specific amino acid, in contrast
with the less targeted site-saturation mutagenesis mentioned in previous
examples. Interestingly, from the top five mutants, three had previously
been discovered by the same group using FACS, while two were isolated
for the first time using microfluidics, including a variant with kcat increased by 2.1-fold, demonstrating the
robustness, sensitivity, and efficiency of the microfluidics strategy.
Successful microfluidic droplet sorting using surface display
in E. coli has also been achieved.[75] In this case, a homodimeric arylsulfatase was
evolved for improved sulfatase activity toward two different substrates,
since the enzyme shows considerable activity toward phosphoester compounds,
apart from its primary activity of catalyzing the hydrolysis of arylsulfates.
Two libraries generated by epPCR were screened, initially against
the first substrate, fluorescein disulfate, and then also against
the second substrate, 4-nitrophenyl sulfate. The experiments resulted
in the identification of 25 unique SpAS1 variants with up to 30-fold
and 6.2-fold improved activity, respectively, after a single round
of mutagenesis.
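Several of the screens described above sort a fixed top fraction of the droplet population (for example, the 0.2% most active variants in the PTE screen). A minimal offline sketch of such a gate is given below. It assumes that one fluorescence value has been recorded per droplet; the signals are synthetic, drawn from a log-normal distribution purely for illustration, and the function names are hypothetical. In an actual sorter the threshold is typically set on signals acquired in real time rather than on a stored list.

```python
import random


def top_fraction_threshold(signals, fraction):
    """Return the signal value at or above which roughly `fraction` of droplets fall."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    ranked = sorted(signals, reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[n_keep - 1]


def sort_droplets(signals, threshold):
    """Return the indices of droplets whose signal meets the sorting threshold."""
    return [i for i, s in enumerate(signals) if s >= threshold]


# Synthetic per-droplet fluorescence values (illustration only, not measured data)
rng = random.Random(0)
signals = [rng.lognormvariate(0.0, 0.5) for _ in range(100_000)]

threshold = top_fraction_threshold(signals, 0.002)
selected = sort_droplets(signals, threshold)
print(f"gate threshold: {threshold:.3f}; droplets selected: {len(selected)}")
```

A stricter gate enriches more strongly for the best variants but recovers fewer droplets per round, so the sorted fraction is usually chosen together with the library size and the number of rounds planned.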
Non-native Substrates
Since minimum
levels of activity
must be detected to start the cycle of mutagenesis, expression, and
selection, a common strategy to identify an enzyme variant with activity
toward a non-native substrate by directed evolution is to improve
an existing promiscuous enzymatic activity. Nearly a decade ago, droplet-based
microfluidics was used to successfully screen a promiscuous sulfatase
with hydrolytic activities toward the non-native substrate phosphonate.
Genetic diversity was generated by epPCR, and the library was expressed
in the cytoplasm of E. coli, followed by
cell lysis. The top 4% of the most active clones, displaying at least
4-fold improved activity, were selected in each of a total of 3 rounds
of sorting. The final candidate presented a 6-fold increase in kcat/Km for the desired
function after purification of the enzyme.[42] This work demonstrated that rare variants with small improvements
in activity could be detected and selected by microfluidic droplet
screening.

When targeting non-native substrates, another common
strategy is to apply semirational library design to directly target
residues in the substrate binding site. This approach was used to
completely remodel the active site of cyclohexylamine oxidase, an
enzyme used in the industrial production of chemicals and active pharmaceutical
ingredients.[76] Genetic diversity was generated
by targeting 8 residues close to the bound cyclohexanone and randomizing
them with either DYT codons (encoding for A, S, T, V, I, and F) or
BYT codons (encoding for A, S, P, V, L, and F), followed by selection
of the top 0.1% of the most active variants for each of 3 consecutive
rounds of directed evolution. Notably, the three most active variants
obtained after the third round of sorting had identical amino acid
sequences despite having been obtained independently. This variant
had five amino acid changes compared to the wild-type and after purification
presented an impressive 960-fold improvement in catalytic efficiency
for the same substrate.

Non-native substrates also include synthetic
substrates, such as
xeno nucleic acid (XNA) polymers. As these synthetic polymers increasingly
show potential for synthetic biology and future applications in molecular
medicine, nanotechnology, and materials science, the development of
efficient synthetic polymerases by directed evolution is gradually
gaining importance. One example is the evolution of a polymerase that
replicates an unnatural genetic polymer composed of repeating units
of α-L-threofuranosyl nucleic acid (TNA) sugars. Based on the
hypothesis that the presence of manganese made the polymerization
nonspecific, directed evolution was used to develop a manganese-independent
TNA polymerase that functions with 99% template-copying fidelity.[40] To achieve this, three key
residues known to affect substrate specificity were altered by saturation
mutagenesis. The resulting enzyme variants were encapsulated in double
emulsion droplets and sorted by FACS based on their ability to elongate
a full-length product, which produced a fluorescent signal by donor–quencher
pair disruption. Presumably, it should be possible to evolve other
polymerase functions provided that the optical detection of the product
can be achieved.
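The amino acid sets quoted above for the DYT and BYT codons used in the cyclohexylamine oxidase library follow directly from the IUPAC degenerate-base code and the standard genetic code. The short Python sketch below expands each degenerate codon and translates it. The helper names (`expand_codon`, `encoded_amino_acids`) are illustrative and not taken from the cited work, and the final diversity estimate assumes, purely for simplicity, that all 8 positions draw from a single six-residue codon set.

```python
from itertools import product

# IUPAC degenerate nucleotide codes (only the symbols needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "D": "AGT",   # any base except C
    "B": "CGT",   # any base except A
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
    "K": "GT",    # keto
}

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate):
    """Return every concrete codon matching a 3-letter degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[ch] for ch in degenerate))]


def encoded_amino_acids(degenerate):
    """Return the sorted one-letter amino acid codes encoded by a degenerate codon."""
    return sorted({CODON_TABLE[codon] for codon in expand_codon(degenerate)})


for name in ("DYT", "BYT"):
    print(name, expand_codon(name), encoded_amino_acids(name))

# Theoretical protein-level diversity if all 8 positions used one six-residue set
# (a simplifying assumption for illustration, not a figure from the source).
print(6 ** 8)
```

Under that assumption the library would contain 6^8 = 1,679,616 protein variants, a size that is impractical for microtiter plates but well within the reach of droplet sorting.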
Computer-Designed Enzymes
Computational
design can
give rise to de novo biocatalysts with a function
not found in nature. However, newly designed enzymes typically show
very low catalytic activity and must subsequently be improved through
directed evolution. A striking example of the successful optimization
of a computer-designed enzyme is that of retro-aldolase, an enzyme
capable of cleaving a specific carbon–carbon bond in a non-natural
substrate using amine catalysis. A retro-aldolase slightly modified
from the original computer design[77] was
used as a starting point and reoptimized using a microfluidics-based
system able to detect enzyme activities as low as kcat/Km = 0.5 M⁻¹ s⁻¹. For this optimization, six libraries were
generated by saturation mutagenesis using NNK codons. The targeted
residues were close to the binding pocket and varied from four to
five simultaneously randomized residues. After a first round of selection,
two variants with >10-fold improved activity were chosen for DNA
shuffling,
yielding a final variant with a 73-fold increase in kcat/Km compared to the initial
enzyme.[78]

The same group had previously evolved the starting-point retro-aldolase
in microtiter plates.[79] Notably, the best variant after 13 rounds of
directed evolution in microplates was not as active as the one obtained
from only two rounds of FADS, highlighting the importance of screening
a higher diversity of variants with a high-throughput method. A year
later, the microtiter plate-evolved enzyme was reoptimized using FADS,
leading to the identification of a new complex catalytic center, which
featured a Lys-Tyr-Asn-Tyr tetrad and was 30-fold more active.[80] Because variants with low activity can be detected
in a high-throughput manner, FADS is a powerful tool for tuning the
properties of computationally designed enzymes.
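The advantage of screening such libraries at high throughput can be made concrete by counting variants. An NNK codon (N = A/C/G/T, K = G/T) spans 32 codons that cover all 20 amino acids plus one stop codon. The sketch below computes the DNA-level and protein-level diversity for four or five simultaneously randomized positions and estimates how many clones would need to be screened to reach a given expected coverage. The coverage formula assumes that every variant is equally likely and sampled independently, which is an idealization and not a claim made in the cited studies; the function names are illustrative.

```python
import math

NNK_CODONS = 4 * 4 * 2   # N at positions 1 and 2, K (G or T) at position 3
AMINO_ACIDS = 20         # NNK covers all 20 amino acids (plus one stop codon)


def library_sizes(n_positions):
    """Return (DNA-level, protein-level) diversity for n randomized NNK positions."""
    return NNK_CODONS ** n_positions, AMINO_ACIDS ** n_positions


def clones_for_coverage(n_variants, coverage=0.95):
    """Clones to screen so that the expected fraction of variants seen reaches `coverage`.

    Idealized model: each variant is equally likely and draws are independent,
    so the expected coverage after N draws from V variants is 1 - exp(-N / V).
    """
    return math.ceil(-n_variants * math.log(1.0 - coverage))


for n in (4, 5):
    dna, protein = library_sizes(n)
    print(n, dna, protein, clones_for_coverage(dna))
```

With four positions the DNA-level library already exceeds one million variants, and 95% expected coverage requires screening roughly three times that number, which illustrates why plate-based screening samples only a small fraction of such libraries.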
Optimization of Ribozymes
Although there are tens of
examples of RNA molecules with catalytic properties (RNAzymes or ribozymes)
in nature, few attempts have been made to improve these biomolecules
using droplet microfluidics. In one of the few studies to date, the
catalytic properties of an X-motif capable of RNA cleavage via an internal phosphoester transfer reaction were improved
using a complex microfluidic workflow.[46] First, a DNA library and PCR reagents were injected into droplets,
and PCR was performed off-chip. Then, the droplets were mixed with
a T7 polymerase-based in vitro transcription mixture.
The authors discovered that T7 RNA polymerase interfered with the
fluorogenic nuclease assay. However, picoinjection of the assay mixture
with a high NaCl concentration allowed them to inactivate the T7 RNA
polymerase and stop the transcription reaction of the ribozyme. Finally,
droplets were selected with a FADS device on the basis of a fluorogenic
RNA substrate comprising a fluorophore and a quencher at the 5′
and 3′ ends, respectively. After 9 rounds of selection, the
catalytic properties of the ribozyme were enhanced 28-fold. Several
mutations could be shown to improve the activity of the ribozyme.
Noncatalytic Biomolecules
Antibody Optimization with Droplet Microfluidics
Antibodies
like immunoglobulin G (IgG) are ∼150 kDa, Y-shaped, globular
proteins that form an essential component of the immune system used
to fight invading pathogens, such as bacteria or viruses. Although
their overall structure is very similar, the region used to specifically
recognize a given epitope on an antigen varies greatly from one antibody
to the next. Substantial advances have been made over the past 20
years in the research, development, and clinical application of therapeutic
monoclonal antibodies.[81] Monoclonal antibodies
have become one of the fastest growing sectors of human therapeutics
for treating various pathologies, such as cancer, inflammation, infections,
or autoimmune diseases.[82] The selection
of antibodies using microfluidic systems provides a low-cost and high-throughput
approach for disease diagnosis, phenotyping of tumor cells, and biomarker
detection.

Although many methods have been devised to screen
for specific antibodies, each with its distinctive advantages and
limitations, the identification of antibodies that bind to cell-surface
receptors or target specific cells remains challenging. Starting from
hybridomas producing nonspecific antibodies, more than 80 000
hybridoma clones secreting antibodies with specific binding properties
to the transferrin receptors on leukemic K562 cells could be selected.[83] Remarkably, very low amounts of IgG were used
per assay (33 fg), and the enrichment of specific hybridoma cells
could be achieved thanks to the selection system. This promising work
could be transposed to further therapeutic antibody discovery. For
instance, a new microfluidic method for single-cell deep phenotyping
of IgG-secreting cells was developed, in which thousands of droplet-encapsulated
cells arranged as a two-dimensional droplet array were screened using
a fluorescence relocation-based immunoassay.[84] A comprehensive step-by-step description of this method was also
published recently.[85]

Historically, one of the most used methods to generate new antibody
variants by directed evolution has been yeast display. When these
methods were combined with microfluidic tools, a substantial increase
in the number of tested variants could be achieved, resulting in a
jump from medium (10² to 10³ variants) to high
throughput (10⁶ to 10⁹). In 2017, Adler et al.[86] combined microfluidics,
yeast single-chain variable fragment (scFv) display, and deep sequencing
to build an alternative to hybridoma-based antibody discovery. With
this system, mouse antibody repertoires could be selected against
the programmed cell-death protein 1 (PD-1), a checkpoint protein used
as a target in cancer immunotherapies. A droplet-based microfluidic
system was used to encapsulate B cells from mice with oligo-dT beads
and a lysis solution. Polyadenylated transcripts released from cells
and bound to the beads were purified from the droplets and injected
into a second emulsion with a multiplexed overlap extension reverse
transcriptase polymerase chain reaction mix. Finally, DNA amplicons
encoding scFv with native pairs of heavy and light chain Ig were generated.
These libraries were used for scFv display and screened by FACS, resulting
in the identification of high-affinity scFvs against human PD-1 immunogen
by deep sequencing. Two rounds of FACS produced populations of scFv
with an average enrichment of 800-fold. Seventeen of these anti-PD-1
binders were synthesized as full-length monoclonal antibodies. Among
them, 15 specifically bound surface-expressed PD-1 in a FACS assay,
while 9 antibodies acted as checkpoint inhibitors. This approach could
further be used to screen for other functional monoclonal antibodies.
Aptamer Development by Directed Evolution in Drops
Aptamers
are biomolecules that possess the capacity of specifically
binding another molecule. Their chemical nature can be peptide based
or nucleic acid based (both DNA and RNA). RNA aptamers named riboswitches
can be found in nature and are involved in the metabolite-dependent
control of gene expression.[87] The discovery
and development of aptamers by SELEX methods started in 1990, when
Tuerk and Gold identified various RNA ligands against T4 DNA polymerase.[88] In another study published the same year, focused on RNA
ligands that bind small organic dyes, the term aptamer was coined by Ellington
and Szostak.[89] Since then, a plethora of
new aptamers have been developed with the SELEX methodology.

Aptamer evolution in droplets took two more decades to become prominent.
Droplet-based systems provide the advantage that every mutant can
be studied individually inside drops, in contrast to classical SELEX
systems. Fluorogenic aptamers are an alternative to classical fluorescent
proteins, such as GFP, and are widely used for biochemistry, cell
biology, and biomedical applications, making the exploration of new
fluorogenic aptamers a particularly promising area of research. In
particular, G-quadruplex RNA aptamers and, in some cases, their corresponding
biosensors have been the focus of extensive optimization, resulting
in the development of highly fluorescent iSpinach[90,91] aptamer-based fluorogenic biosensors (Figure 4a), the Mango III[92,93] aptamer, and
the Gemini–o-Coral fluorogenic dimer.[94] All of these aptamers were improved by directed evolution using
the same microfluidic workflow (Figure 4b). First, a variant library was encapsulated in drops
with no more than one molecule per drop. Second, the encapsulated DNA was amplified
by PCR. Third, the drops containing the amplified variants were reinjected
into a microfluidic device and fused with another drop containing in vitro transcription reagents. Fourth, the fused drops
were collected and incubated to produce RNA, and fifth, the drops
were sorted on the basis of their fluorescence and collected for sequencing.
Step 2 was performed off-chip in a thermocycler. Step 4 can be carried
out either on- or off-chip.
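The condition of no more than one molecule per drop in the first step is generally approached statistically rather than enforced exactly. Assuming that template molecules distribute among droplets according to a Poisson distribution (a standard model for random encapsulation, not a detail stated in the text above), the following sketch estimates droplet occupancy for a chosen mean number of molecules per droplet, `lam`. The `lam` values and function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a droplet contains exactly k molecules at mean occupancy lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def occupancy(lam):
    """Return droplet fractions that are empty, singly occupied, or multiply occupied."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Fraction of occupied droplets that hold exactly one molecule.
        "single_among_occupied": single / (1.0 - empty),
    }


for lam in (0.1, 0.3, 1.0):
    stats = occupancy(lam)
    print(lam, {name: round(value, 4) for name, value in stats.items()})
```

At a mean of 0.1 molecules per droplet, roughly 90% of droplets are empty, but about 95% of the occupied ones contain a single variant. Diluting the library therefore trades throughput for clonality, which is acceptable when droplets are generated and sorted at high rates.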
Figure 4
Directed evolution of fluorogenic aptamers in
drops. (a) G-quadruplex
structure of an iSpinach-based RNA biosensor.[91] The sensor aptamer is in black; the optimized communication module
is in red and the fluorogenic G-quadruplex aptamer, in green. (b)
Microfluidic workflow of the process: First, a gene library is encapsulated
in droplets. After fusion events of all the drops with other drops
containing an IVTT system, the aptamer library is generated. Finally,
the aptamers of interest are selected by FADS and encapsulated again
for another round of selection or analyzed by NGS.
Spinach[95] is an artificial fluorogenic
RNA aptamer whose ligand emits green fluorescence similar to eGFP
upon binding. The screening of Spinach gene libraries led to an improved
Spinach (iSpinach) aptamer[90] thanks to
microfluidic-assisted in vitro compartmentalization.
Two pairs of enrichment rounds were separated by a mutagenesis round.
The main goal was to develop iSpinach mutants with higher thermal
stability and a wider salt tolerance because none of the known DFHBI-binding
aptamers are optimal for in vitro application. On
the basis of this development, a new biosensor capable of detecting
theophylline was built[91] by randomizing
the communication module of the biosensor. This region links the sensor
region with the fluorogenic G-quadruplex aptamer, which in the case
of iSpinach binds the fluorophore 3,5-difluoro-4-hydroxybenzylidene
imidazolinone (DFHBI). After 5 rounds of selection, the fluorescence
of the biosensor was improved by 5-fold.

Like Spinach, Mango is a fluorogenic RNA aptamer, with a complementary
emission wavelength in the red region of the spectrum. It forms a complex
with the thiazole orange derivative TO1-biotin and was originally selected on the basis
of its binding affinity for this ligand by 12
rounds of classical SELEX.[96] Three new
aptamers (Mango II, III, and IV) were developed from Mango I using
microfluidic selection. When 9 rounds of directed evolution were applied,
the fluorescence of the complex could be increased 6-fold.[92] Using a structure-guided library in which only
key residues are mutated, two novel Mango III mutants were also identified:
Mango III (A10U) and iMango III.[93] The
cocrystal structure of the aptamer and ligand shows a pseudoknot-like
base pairing interaction between nucleotides internal and adjacent
to a two-tiered G-quadruplex. The novel mutants are 50% brighter than
eGFP, making them useful for live cell RNA visualization. The most
recent discovery using this method is the complex Gemini–o-Coral,[94] a cell-permeable fluorogenic dimer of self-quenched
sulforhodamine B dyes (Gemini-561) and the corresponding dimerized
aptamer (o-Coral). In order to overcome the limitation of the detection
of low amounts of RNA with Spinach and Mango, a new strategy using
fluorescent quenchers was applied to develop new aptamers with enhanced
fluorescence. A recent study shows that this technology could also
be applied for the development of the recently discovered double analyte
aptamers.[97] Altogether, microfluidics has
had a considerable impact on fluorogenic aptamer research in the past
few years and shows great promise for future investigations in this
field.
Conclusions and Future Perspectives
Directed evolution is a field of growing importance with an increasing
number of successfully improved molecules, from enzymes being improved
for therapeutic applications to those used in self-replicating nucleic
acids. Droplet microfluidics achieves all these processes in an automated
and miniaturized format with a much higher efficiency in time and
cost compared to low-throughput methods. The use of droplets provides
an additional compartment in which to perform chemical assays and
enzymatic reactions while linking the activity of the molecule of
interest to the genetic variant it represents. In this way, as the
assay is not limited to the cell, one can broaden the spectrum of
analytical tools available compared to classical cell sorting approaches
like FACS.

We have highlighted here some of the most recent
advances in the
engineering of functional biomolecules (enzymes, antibodies, aptamers,
and ribozymes) using droplet microfluidics. Over the past years, experimental
procedures have been miniaturized and automated in the droplet format
showing the long-term potential of the technology for molecular engineering.
Improved variants have already been successfully obtained from a variety
of experiments (Table 2). Most of the examples use a cell-based system for the expression
of the molecules. Usually, the choice to express the molecule of interest
in a single-cell organism is merely economic, since commercial
IVTT systems for expressing proteins are currently orders
of magnitude more expensive than a host organism. In the case
of cell display, the choice of this system has an added justification,
using the cells not only as a compartment in which to link phenotype
and genotype but also as a means to replicate the genetic variants
after sorting since lysis is not necessary and the cells can be regrown.
Moreover, the related microfluidic workflow is simpler compared to
the IVTT-involving platform. In addition, since the scaling up needed
for production of a biological protein will usually be done in cell-based
systems, it is convenient to use this system from the beginning of
the experimental workflow and avoid unexpected difficulties.[98] However, due to the more generalized use of
IVTT systems, one can forecast a progressive lowering of the price.
In this sense, the idea of generating in-house methods to produce
cell-free systems is gaining importance. Complete methods for the
whole process and plasmid designs (One-pot system) are already available.[98,99] We foresee that this strategy will be widely deployed in the coming
years.
Table 2
Summary of the Evolved Biomolecules Described in This Review(a)

biomolecule                   selection pressure  no. rounds  expression (E. coli)  expression (other)  library design  reference
Enzymes
phosphotriesterase            fluorescence        1           cytoplasmic           -                   random          (39)
TNA polymerase                fluorescence        1           cytoplasmic           -                   semirational    (40)
peroxidase                    fluorescence        2           -                     S. cerevisiae       semirational    (41)
esterase                      fluorescence        5           cytoplasmic           -                   semirational    (43)
dehydrogenase                 absorbance          2           cytoplasmic           -                   random          (46)
oxidase                       fluorescence        1           cytoplasmic           -                   semirational    (74)
sulfatase                     fluorescence        1           display               -                   semirational    (75)
aldolase                      fluorescence        2           cytoplasmic           -                   semirational    (77)
aldolase                      fluorescence        6           cytoplasmic           -                   random          (78)
Ribozymes
X-motif (RNA)                 fluorescence        9           -                     PCR                 random          (46)
Antibodies
anti-transferrin (K562)       fluorescence        1           -                     hybridoma cells     -               (83)
anti-PD-1                     fluorescence        2           -                     S. cerevisiae       -               (86)
Aptamers
iSpinach                      fluorescence        5           -                     IVTT                semirational    (90)
iSpinach                      fluorescence        5           -                     IVTT                semirational    (91)
Mango III                     fluorescence        9           -                     IVTT                semirational    (92)
Mango III (A10U), iMango III  fluorescence        4           -                     IVTT                rational        (93)
Gemini-561, o-Coral           fluorescence        4           -                     IVTT                semirational    (94)

(a) A total of 8 enzymes, 5 aptamers, 1 ribozyme, and 2 antibodies have been reviewed. We have also dissected
the key components underlying these directed evolution experiments.

It is interesting to highlight
as well that peptides are gaining
importance as therapeutic agents. As their use is becoming common
to treat acute infections, chronic diseases, and even some types of
cancer, droplet microfluidics can help one to discover peptides with
novel properties. Droplet-based strategies could be adopted to identify
antimicrobial peptides,[100] discover novel
antiviral peptides to treat respiratory diseases,[101] or find anticancer peptides.[102]

Research in self-replicating nucleic acids is another field that
is gaining importance. The search for the molecular origins of life
and the construction of a minimal cell have found a versatile toolbox
in droplet-based microfluidics. Research in template-directed self-replicating
systems or replicators can also lead to the development of artificial
ribozymes[103] or control of protocell compartmentalization
and reproduction.[104] Using water-in-oil
emulsions, the replication efficiency of an RNA replicating system
could be increased 30-fold.[105] Encapsulation
of RNA catalysis reactions into droplet coacervates can also have
important implications for early Earth chemistry and protocell research.[106]

Contrary to protein enzymes or ribozymes,
DNA-based enzymes (DNAzymes
or deoxyribozymes) are scarce with only a few examples found in nature,[107] acting mainly as ribonucleases and RNA ligases.
However, the directed evolution of artificial single-stranded DNA
(ssDNA) constructs has been favored due to some of the intrinsic advantages
of ssDNA over RNA. For instance, ssDNA generally has higher chemical
stability compared to RNA due to the absence of the 2′-hydroxyl
group on the ribose moiety. The directed
evolution of these molecules could lead to new biomedical and industrial
applications due to their different chemical nature and versatility,
such as DNA aptamers or DNA–metal nanoclusters for biosensing.

Nonetheless, despite the many examples discussed and the increasing
speed at which the field of microfluidics for directed evolution is
growing, the technology is far from being fully mature and further
improvements are to be expected. For instance, DNA recovery and cell
viability are still challenges to take into account and can be the
limiting step in biological workflows. Moreover, greater access of
nonspecialist laboratories to microfluidic setups is necessary to
maximize the impact of this technology. Microfluidic technology will
most likely gain in miniaturization, automation, and parallelization
capabilities, pushing further the throughputs of selection with the
development of new instruments by technology developers. Furthermore,
the development of novel surfactant formulations may increase the
scope of the assays possible in droplets and solve some of the current
limitations such as the leakage of molecules through the drop or the
interaction of surfactant with the content of the droplet. Finally,
new and emerging approaches in molecular programming will provide
interesting functions to further increase the selection throughputs
and will provide new methods for the biomimetic selection of improved
variants of practical interest.[108] We believe
that the combination of these tools will lead in the future to a whole
new range of approaches for the discovery and improvement of chemicals
of fundamental and practical interest in therapeutics and industrial
applications.