In reverse genetics, a gene's function is elucidated through targeted modifications in the coding region or associated DNA cis-regulatory elements. To this purpose, recently developed customizable transcription activator-like effector nucleases (TALENs) have proven an invaluable tool, allowing introduction of double-strand breaks at predetermined sites in the genome. Here we describe a practical and efficient method for the targeted genome engineering in Drosophila. We demonstrate TALEN-mediated targeted gene integration and efficient identification of mutant flies using a traceable marker phenotype. Furthermore, we developed an easy TALEN assembly (easyT) method relying on simultaneous reactions of DNA Bae I digestion and ligation, enabling construction of complete TALENs from a monomer unit library in a single day. Taken together, our strategy with easyT and TALEN-plasmid microinjection simplifies mutant generation and enables isolation of desired mutant fly lines in the F1 generation.
The generation of mutant organisms with defined genomic modifications is the most efficient way in reverse genetics to elucidate the functions of gene products or unravel gene regulatory mechanisms. This has been successfully exploited in yeast owing to the ability to exchange DNA via homologous recombination (HR) between an endogenous target locus and an exogenously introduced DNA carrying the desired genetic alteration (1,2). In multicellular organisms, however, such targeted HR is rarely induced when homologous foreign DNA is provided, with a few exceptions including mouse embryonic stem cells and the moss Physcomitrella patens (3,4).
A breakthrough in targeted genome manipulation was achieved by the development of the engineered nucleases: zinc-finger nucleases (ZFNs) (5) and, more recently, transcription activator-like effector nucleases (TALENs) (6), which consist of repeats of DNA-binding domains and a Fok I nuclease domain. Because Fok I is only active as a dimer (7), these enzymes can cut genomic DNA only where a pair of engineered nucleases binds nearby in a head-to-head manner. In eukaryotic cells, double-strand breaks (DSBs) stimulate intrinsic DNA repair mechanisms, either error-prone non-homologous end-joining (NHEJ) or error-free HR. In addition to unspecific insertions and deletions mediated by NHEJ, targeted integration of exogenously provided DNA sequences by HR is of substantial value for genome engineering.
TALENs have lately attracted more attention than ZFNs because of the unique and simple modularity of their DNA-binding domains, which confers target specificity (8). Essential to the sequence-specific DNA recognition by TALENs is the contiguous repeat of DNA-binding modules of transcription activator-like effectors (TALE-repeats), which varies in number from 15.5 to 19.5 repeats among most naturally occurring TALEs (9). Each module consists of 34 amino acids and recognizes a single base on DNA.
The base preference of each module is specified by the two residues at the 12th and 13th positions, called repeat variable di-residues (RVDs), and a set of RVD–base codes has been deciphered (10–13). Because the modules that preferentially bind to a unique base have been identified (e.g. NI–adenine, NG–thymine, NK–guanine and HD–cytosine), TALENs with desired sequence specificity can be constructed by arranging these distinctive modules in a specific order. There is, however, a constraint in the selection of TALEN target sites: a thymine must precede the sequence bound by the TALE-repeats. This thymine, often called T at position 0, is essential for the targeting of designer TALEs or TALENs and is recognized by two degenerate modules at the N-terminus (14,15). Nonetheless, designing and constructing tailored TALENs as molecular scissors for a sequence of interest is simple and straightforward.
The utility of TALENs for targeted germ line mutagenesis has been successfully proven in a wide range of model organisms, including nematodes (16), insects (17–21), zebrafish (22–26), Xenopus (27) and some mammals (28,29). In the fruit fly Drosophila melanogaster, based on studies with ZFNs (30), the microinjection of TALEN mRNA into the syncytial embryo has been used to generate targeted mutations in the germ line (18,21). However, the efficiency of TALEN-mediated targeted mutagenesis in Drosophila is generally lower than in other model organisms such as zebrafish or Xenopus, where TALEN mRNAs are delivered at the single-cell stage. Therefore, identification of mutant alleles without a visible phenotype is laborious and requires interrogation of the genomic TALEN target loci. The use of genetic markers has made identification of transgenic mutants and tracking of the transgenes an uncomplicated matter. However, to date, no efforts have been made to use a traceable marker as an indicator of TALEN-induced mutations.
TALEN-mediated direct in-frame introduction of fluorescence genes into endogenous genes has been attempted in cultured cells (31,32) and zebrafish (26). Unfortunately, the utility of these selection markers depends on the expression levels of the fusion proteins.
Here we develop a simple and efficient method of assembling TALENs into an expression vector that can subsequently be used for Drosophila embryo microinjections. We demonstrate that we can induce targeted locus-specific mutations via NHEJ as well as marker gene integration via HR in the Drosophila genome. This traceable marker allows efficient identification of the engineered flies.
MATERIALS AND METHODS
Construction of TALEN backbone plasmid
The TALEN backbone created in this work is a modification of the AvrXa7-FN from B. Yang’s lab (33). Owing to a high homology and related origin of AvrXa7 and AvrXa10 proteins, the truncation produced in this study was modelled after the AvrXa10 truncation of Sun et al. (34). The N3-C1 truncation produced in this study retained 207 amino acids on the N-terminus and 63 amino acids on the C-terminus (Supplementary Figure S1). The nuclear localization signal (NLS) from SV40 T-antigen was added to the N-terminus. The CAT and ccdB selection markers were inserted between N3 and C1 fragments with Mlu I and Aat II. To allow insertion of a tailored TALE-repeat in easy TALEN assembly (easyT) protocol, Bae I sites were created on either side of the CAT–ccdB cassette. The TALEN.N3-C1 backbone was placed under Copia promoter for subsequent expression in Drosophila embryos.
Construction of unit template plasmids
The template DNA sequences for individual monomer units were created by dimerizing each pair of DNA-oligos shown in Supplementary Table S1, and cloned into pGEM-T Easy vector (Promega). The RVDs used were NI for A, NG for T, NK for G and HD for C. Because polymerase chain reaction (PCR) amplification of repetitive sequences is inefficient and amplification of 4-mer repeats is required in the easyT protocol, four types of template DNA sequences (type a, b, c and d) were created for units 2 to 19 without changing the amino acid sequences (Supplementary Figure S2).
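The RVD–base code stated above can be written as a simple lookup table. The following Python sketch is illustrative only: the function names are hypothetical, and the "repeats = RVDs − 0.5" convention is inferred from the repeat counts listed in Table 1 (the final module being a half repeat). It decodes the RVD array of TALEN-w-L from Table 1 into its target sequence.

```python
# RVD-to-base code as stated in the text (NI-A, NG-T, NK-G, HD-C).
RVD_TO_BASE = {"NI": "A", "NG": "T", "NK": "G", "HD": "C"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_for_target(target):
    """Return the RVD for each base of a target sequence (case-insensitive)."""
    return [BASE_TO_RVD[base] for base in target.upper()]


def target_for_rvds(rvds):
    """Return the DNA sequence recognised by a list of RVDs."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


# RVD array of TALEN-w-L, copied from Table 1.
w_left_rvds = "HD NK NK HD HD NI NG HD NI NK NI NI NK NK NI NG HD NG NG".split()
w_left_target = target_for_rvds(w_left_rvds)

print(w_left_target)                         # CGGCCATCAGAAGGATCTT
print(f"{len(w_left_rvds) - 0.5} repeats")   # 18.5 repeats
```

The T at position 0 is not part of the RVD array and is therefore not handled here; a design tool would additionally need to check that a thymine precedes each binding sequence.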
TALEN construction
The TALEN–y pair targeting y (chrX: 253 627–263 693) was constructed with a modular assembly protocol (33), and subsequently cloned into the TALEN N3-C1 backbone. TALEN–w pair (chrX: 2 686 269–2 686 214), TALEN–upd pair (chrX: 18 207 553–18 207 606) and TALEN–wg pair (chr2L: 7 325 600–7 325 658) were created with the easyT protocol. See Supplementary Methods for step-by-step easyT protocols and backup troubleshooting protocols. The TALE-repeats and target DNA sequences for each TALEN pair are shown in Table 1.
Table 1.
TALENs constructed in this work
| TALEN | RVDs | Target sequence | Repeats | Assembly protocol (reference) |
| --- | --- | --- | --- | --- |
| TALEN-y-L | NI NI NG NG HD NG HD HD NG NI NG NG NG NG NI NG NG NG NG NG NG NI NK NK | a a t t c t c c t a t t t t a t t t t t t a g G | 23.5 | Modular assembly (33) |
| TALEN-y-R | HD NI NI NI HD NG NK HD NK NK NG HD HD NI NG NK NG NG NG NI NG NI NG NI | C A A A C T G C G G T C C A T G T T T A T A T A | 23.5 | Modular assembly (33) |
| TALEN-w-L | HD NK NK HD HD NI NG HD NI NK NI NI NK NK NI NG HD NG NG | C G G C C A T C A G A A G G A T C T T | 18.5 | easyT (this study) |
| TALEN-w-R | NG HD NI NG HD NI NK HD HD NK NG HD NG NG HD HD NK NI NK | T C A T C A G C C G T C T T C C G A G | 18.5 | easyT (this study) |
| TALEN-upd-L | NK HD HD HD NK NI HD NI NG NG NG NG NK HD HD | g c c c g a c a t t t t g c c | 14.5 | easyT (this study) |
| TALEN-upd-R | NI HD HD NK NG NI HD NI NK HD HD NK NG HD NG | a c c g t a c a g c c g t c t | 14.5 | easyT (this study) |
| TALEN-wg-L | HD NI NG HD NG NK NI NG NK HD NG NG HD NI HD NI NK NI NI | c a t c t g a t g c t t c a c a g a a | 18.5 | easyT (this study) |
| TALEN-wg-R | NK NI NI NG NG HD HD NG NK NI NI NI HD NG NK NI NI NG HD | g a a t t c c t g a a a c t g a a t c | 18.5 | easyT (this study) |
In each TALEN target sequence, protein-coding sequences are shown in upper case, and intronic or intergenic regions are shown in lower case.
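As a rough consistency check on the three easyT pairs, the spacer between the two binding sequences can be estimated from the genomic coordinates given in the text and the number of RVDs in Table 1. The Python sketch below rests on assumptions that the text does not state: that the coordinates are inclusive, and that they delimit exactly the two binding sequences plus the spacer, without the T at position 0. If the coordinates include both T0 bases, each spacer would be 2 bp shorter.

```python
# (start, end, RVDs in left TALEN, RVDs in right TALEN); coordinates from the
# text, RVD counts from Table 1. The w site is listed in descending order.
EASYT_PAIRS = {
    "TALEN-w": (2_686_269, 2_686_214, 19, 19),
    "TALEN-upd": (18_207_553, 18_207_606, 15, 15),
    "TALEN-wg": (7_325_600, 7_325_658, 19, 19),
}


def spacer_length(start, end, n_left, n_right):
    """Target-site span minus the two binding sequences.

    Assumes inclusive coordinates that exclude the T at position 0.
    """
    span = abs(end - start) + 1
    return span - n_left - n_right


spacers = {name: spacer_length(*site) for name, site in EASYT_PAIRS.items()}
for name, length in spacers.items():
    print(f"{name}: {length}-bp spacer")
```

Under these assumptions the estimated spacers are 18, 24 and 21 bp, all within the 12–32-base spacer range considered in the in silico specificity analysis.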
Construction of Donor Plasmids
The right and left homology arms of donor DNA were separately amplified by PCR from a wild-type fly genome as a template. Both arms and 3xP3-EGFP-SV40 cassette (35) were cloned into pGEM-T Easy vector. The regions used for homologous donor sequences were as follows: Donor-y, chrX: 251 198–255 994; Donor-upd, chrX: 18 205 583–18 209 591; and Donor-wg, chr2L: 7 323 606–7 327 640. On Donor-upd and Donor-wg, the putative activator protein-1 (AP-1)–binding sequences TGA[C/G]TCA at TALEN-upd or TALEN-wg target sites were deleted for further study (TK, unpublished data). The primers used are available on request.
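The putative AP-1–binding consensus TGA[C/G]TCA maps directly onto a regular expression. The following Python sketch locates and deletes the motif in a sequence; the function names and the example sequence are hypothetical and do not correspond to the actual upd or wg loci or to the cloning strategy used for the donor plasmids.

```python
import re

# Putative AP-1-binding consensus from the text: TGA[C/G]TCA.
# The reverse complement of TGACTCA is TGAGTCA, so this single pattern
# covers motif occurrences on both strands.
AP1_MOTIF = re.compile(r"TGA[CG]TCA")


def find_ap1_sites(sequence):
    """Return 0-based start positions of the motif (case-insensitive input)."""
    return [m.start() for m in AP1_MOTIF.finditer(sequence.upper())]


def delete_ap1_sites(sequence):
    """Return the upper-cased sequence with every motif occurrence removed."""
    return AP1_MOTIF.sub("", sequence.upper())


# Hypothetical sequence for illustration, not a real locus.
toy = "ccgtTGACTCAggatcTGAGTCAaa"
print(find_ap1_sites(toy))    # [4, 16]
print(delete_ap1_sites(toy))  # CCGTGGATCAA
```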
Drosophila embryo microinjection
The concentrations of TALEN and donor plasmids used are shown in Tables 2 and 3. Syncytial blastoderm embryos of wild type or w (36) were used for microinjection. To detect TALEN-induced germ line y or w mutants, the eclosed flies (F0) were individually crossed with y or w mutants carrying known lesions, respectively. Because the levels of 3xP3-EGFP expression are influenced by the chromosomal location, identification of the mutants (F1) expressing EGFP in their eyes was performed in a w background. Frequencies of targeted mutagenesis or gene integration were calculated as the frequency of yielders: the number of F0 flies yielding mutant offspring in F1 divided by the total number of fertile F0 flies.
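The frequency of yielders is a simple ratio. As a minimal Python sketch (the function name is hypothetical), the calculation for the y-targeting experiment at 50 ng/µl of each TALEN plasmid, using the counts reported in Table 2, looks like this:

```python
def yielder_frequency(yielders, fertile_f0):
    """Percentage of fertile F0 flies that gave at least one mutant F1 offspring."""
    if fertile_f0 <= 0:
        raise ValueError("need at least one fertile F0 fly")
    return 100.0 * yielders / fertile_f0


# Counts for y targeting at 50 ng/µl of each TALEN plasmid (Table 2).
male = yielder_frequency(19, 44)
female = yielder_frequency(5, 45)
total = yielder_frequency(19 + 5, 44 + 45)

print(f"male {male:.1f}%, female {female:.1f}%, total {total:.1f}%")
# male 43.2%, female 11.1%, total 27.0%
```

Note that the total is computed from the pooled counts and is not the mean of the male and female percentages.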
Table 2.
TALEN-mediated targeted gene mutagenesis
| Gene | Chr. | Recipient | Concentration (ng/µl each) | Frequency in males (a) | Frequency in females (a) | Total frequency (a) |
| --- | --- | --- | --- | --- | --- | --- |
| y | X | Wild type | 25 | 27.6% (8/29) | 8.3% (1/12) | 22.0% (9/41) |
| y | X | Wild type | 50 | 43.2% (19/44) | 11.1% (5/45) | 27.0% (24/89) |
| y | X | Wild type | 100 | 22.4% (11/49) | 6.8% (3/44) | 15.1% (14/93) |
| y | X | Wild type | 250 | 0.0% (0/39) | 0.0% (0/42) | 0.0% (0/81) |
| w | X | Wild type | 50 | 12.5% (3/24) | 0.0% (0/14) | 7.9% (3/38) |
| w | X | Wild type | 100 | 0.0% (0/68) | 0.0% (0/50) | 0.0% (0/118) |
(a) Frequencies of targeted gene mutagenesis were calculated as the frequency of yielders.
Table 3.
TALEN-mediated targeted gene integration
| Gene | Chr. | Homology length of donor DNA (kb) | TALEN concentration (ng/µl) | Donor concentration (ng/µl) | Frequency in males (a) | Frequency in females (a) | Total frequency (a) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| y | X | 5.0 | 50 | 100 | 1.6% (1/63) | 1.5% (1/67) | 1.5% (2/130) |
| upd | X | 4.0 | 50 | 500 | 8.3% (9/108) | 1.6% (3/189) | 4.0% (12/297) |
| upd | X | 4.0 | 50 | 100 | 14.3% (11/77) | 6.7% (5/75) | 10.5% (16/152) |
| wg | II | 4.0 | 50 | 100 | 2.6% (3/114) | 1.4% (1/72) | 2.2% (4/186) |
(a) Frequencies of targeted gene integration were calculated as the frequency of yielders.
Validation of targeted mutagenesis
Genomic DNA of TALEN-mediated mutants was isolated from single flies with the QIAamp DNA Micro Kit (QIAGEN). The TALEN-y and TALEN-w target sites at the y or w loci were amplified and sequenced. TALEN-mediated targeted 3xP3-EGFP integrations were also verified by PCR amplification and sequencing. Primer pairs used in genomic DNA amplification are shown in Supplementary Table 3.
In silico estimation of TALEN specificity as a function of TALEN-binding sequence length
To assess the TALEN target specificity, we used a custom-made R script using different bioconductor packages (37) and multicore (parallel processing of R code on machines with multiple cores or CPUs, Simon Urbanek, version 0.1-7). To estimate target specificity of a TALEN-L/R pair of a given TALE-repeat length, 500 TALEN target sites, which consist of TALEN-L– and -R–binding sequences separated by a spacer sequence ranging from 12 to 32 bases (randomly assigned in each sample), were sampled uniformly from D. melanogaster genome. Here, TALE-repeat lengths ranging from 5 to 20 were considered. In total, 500 samples were independently generated from each of the following chromosomal arms: chr2L, chr2R, chr3L, chr3R, chrX, chr2LHet, chr2RHet, chr3LHet, chr3RHet and chrXHet. The TALEN-L– and -R–binding sequences were subsequently used to generate a set of sequences representing all potential targeting sites. For this purpose, the TALEN-L– and -R–binding sequences are joined to either end of spacer sequences of Ns, which vary in length from 12 to 32; [TALEN-L–binding sequence] + [spacer: N12-32] + [TALEN-R–binding sequence]. In addition, and to account for the targeting sites caused by TALEN-L and TALEN-R self-pairing, [TALEN-L–binding sequence] + [spacer: N12-32] + [TALEN-L–binding sequence] and [TALEN-R–binding sequence] + [spacer: N12-32] + [TALEN-R–binding sequence] were generated. All sequences (63 in total) were aligned to the D. melanogaster reference genome (BDGP Release 5 dm3) requiring a perfect match for both TALEN-binding sequences while ignoring the spacer sequence of Ns. A TALEN target site was considered to be specific if only the initial sequence uniquely aligns to its origin and all other constructed sequences fail to align. Hence, sequence sets exhibiting more than one alignment are considered to be unspecific. 
Using all 500 samples, the TALEN target specificity for a given length and chromosomal arm was calculated as the number of specific TALENs divided by the number of samples. To assess the contribution of genomic repeat regions to unspecific TALEN-binding sites, we subtracted all samples derived from repeat regions of the genome. Repeat regions were identified by Tandem Repeats Finder and RepeatMasker masks, including, for example, simple repeats and transposable elements, contained in the D. melanogaster object provided by the BSgenome bioconductor package.
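The specificity criterion described above can be illustrated with a toy example. The Python sketch below is not the R/Bioconductor script used in this work; it is a simplified stand-in that searches only the forward strand of a small hypothetical "genome" using regular expressions, whereas a real analysis would align against both strands of the reference genome. All function names and sequences are assumptions made for illustration.

```python
import re

SPACER_LENGTHS = range(12, 33)  # spacers of 12 to 32 bases, inclusive


def candidate_sites(left, right):
    """L/R, L/L and R/R sites for every spacer length (3 x 21 = 63 sequences)."""
    pairs = [(left, right), (left, left), (right, right)]
    return [a + "N" * n + b for a, b in pairs for n in SPACER_LENGTHS]


def count_alignments(genome, site):
    """Count perfect matches of a site, with N matching any base (overlaps allowed)."""
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return len(re.findall(pattern, genome))


def is_specific(genome, left, right):
    """Specific if the whole sequence set yields exactly one alignment."""
    hits = sum(count_alignments(genome, s) for s in candidate_sites(left, right))
    return hits == 1


def specificity(genome, pairs):
    """Number of specific TALEN pairs divided by the number of sampled pairs."""
    return sum(is_specific(genome, l, r) for l, r in pairs) / len(pairs)


# Hypothetical toy genome containing one target site with a 15-base spacer.
left, right = "ATTCTCCTATTT", "CAAACTGCGGTC"
toy_genome = "G" * 20 + left + "C" * 15 + right + "G" * 20

print(len(candidate_sites(left, right)))            # 63
print(specificity(toy_genome, [(left, right)]))      # 1.0
print(specificity(toy_genome * 2, [(left, right)]))  # 0.0 (site now occurs twice)
```

Duplicating the toy genome mimics a repeat region: the same pair then aligns twice and is scored as unspecific.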
RESULTS
The length of TALE-repeats required for specific targeting in Drosophila genome
TALENs with desired sequence specificities are constructed by arranging the order of TALE-repeat modules. To achieve target specificity, TALE-repeats have to contain a certain number of modules. Additionally, the high site specificity of DSBs introduced by a TALEN pair is supported in part by the requirement that the Fok I nuclease domain act as a dimer (7). The genome of D. melanogaster is ∼180 Mb in length, consisting of ∼120 Mb of euchromatic sequences and 60 Mb of heterochromatic regions (38). Thus, to determine the length of the TALE-repeats required for minimizing unfavourable off-target DNA cleavages in the D. melanogaster genome, we computationally assessed TALEN-pair–binding specificity as a function of the TALE-repeat–binding sequence length. To this end, we sampled 500 TALEN-L/R pairs (TALEN target sites) of a given TALE-repeat length uniformly from the euchromatin or heterochromatin of each chromosomal arm of D. melanogaster. Subsequently, a set of possible target sites for the TALEN-L/R pair and for TALEN-L or -R self-pairing was constructed and aligned to the D. melanogaster reference genome (Figure 1A). A TALEN target site was considered to be specific if only the initial sequence uniquely aligns to its origin and all other constructed sequences fail to align. The in silico analysis revealed that TALE-repeats longer than 12 bp reached a plateau in TALEN-pair–binding specificity in the euchromatic regions of the genome (Euchromatin in Figure 1B). To assess the contribution of genomic repeat regions to unspecific TALEN-binding sites, we subtracted the samples derived from repeat regions such as simple repeats or transposable elements on the euchromatin [Euchromatin (non-repetitive) in Figure 1B]. On the other hand, more than half of the TALEN-pairs sampled from heterochromatic regions still exhibited off-targets even with a TALE-repeat length of 20 bp (Heterochromatin in Figure 1B).
These results suggested that TALEN-pairs of TALE-repeat longer than 12 bp are required for specific targeting within non-repetitive euchromatic regions of D. melanogaster genome. Accordingly, we constructed the TALEN-pairs recognizing sequences >15 bp (Table 1).
Figure 1.
Target specificity of TALEN-pairs as a function of the binding sequence length. (A) The overview of in silico analysis. Five hundred TALEN-pair target sites were sampled uniformly from euchromatic or heterochromatic regions of the different chromosomal arms. The specificity of a TALEN-L/R pair in each sample was determined by aligning all the potential targeting sites to D. melanogaster reference genome (BDGP Release 5 dm3). Specific TALEN-pairs contain exactly one targeting site within the genome, and hence, TALEN-pairs exhibiting multiple targeting sequences are considered to be unspecific. (B) The probability of specific targeting was calculated as the number of specific TALEN pairs divided by the number of samples. Results obtained for different chromosome arms were summarized using a boxplot representation. To assess the contribution of genomic repeat regions (e.g. simple repeats or transposable elements) to the fraction of unspecific TALEN pairs, the specificity was estimated using only samples localizing outside of annotated repeat regions within the euchromatin.
Development of the easyT protocol
A variety of TALEN assembly protocols are now available (39). However, these protocols rely on extensive plasmid libraries, special reagents or extended cloning schemes requiring many days of work (22,33,40–44). We developed a new easyT protocol that enables us to construct TALENs of up to 18.5 TALE-repeat modules in a single cloning process (Figure 2, Supplementary Figures S1–S5 and Supplementary Methods). Importantly, the activity of the restriction enzyme Bae I at 25°C allowed DNA digestion and ligation to be combined into a single 1-h ‘digation’ reaction, thereby saving time and effort (Figure 2B). In the first step, individual units were assembled into 4-mers in a first digation reaction. The DNA sequence of each unit within a 4-mer was altered to reduce nucleotide sequence homology without changing the amino acid sequence. This enabled efficient PCR amplification of 4-mers regardless of the combination of the four units (Supplementary Figure S2). In a subsequent second digation, the TALE-repeat was assembled into the predigested TALEN backbone plasmid, followed by a conventional cloning procedure. Thus, construction of custom TALENs can be achieved in a single day. Besides the simplicity of the cloning procedures, easyT provides a unique troubleshooting protocol. In the regular Golden Gate-based TALEN assembly protocols (39), the assembled units are no longer replaceable. By shifting the boundaries of assembly units from those of TALE-repeat modules, however, we made it possible to embed unique restriction sites at the borders of every 4-mer unit, enabling users to replace any 4-mer that carries a mutation caused by PCR error or misligation (Supplementary Figure S3).
Figure 2.
Construction of TALENs with the easyT protocol. (A) A schematic representation of a TALEN with a TALE-repeat length of 18.5 modules. The TALE-repeat is assembled from 20 monomer units. The boundaries of monomer units were shifted from those of the TALE-repeat modules. (B) Overview of TALEN cloning. In the first step, four units are assembled into 4-mers in a ‘digation’ reaction. In the second step, 4-mers are PCR-amplified, run on an agarose gel, gel-extracted and concentrated. Finally, 4-mers were assembled into the TALEN backbone plasmid in the second digation reaction. Yellow and blue arrows indicate primers used for 4-mer amplification.
Targeted mutagenesis in Drosophila germ line by microinjection of TALEN expression plasmids
The microinjection of plasmid DNA into Drosophila embryos has been routinely used to generate transgenic flies. Co-injection of a P-element–based transformation vector with a transposase-expressing helper plasmid can efficiently integrate the transgenes into the genome of the germ cell lineage (45). Additionally, the use of plasmid DNA for TALEN introduction eliminates an mRNA synthesis step. Hence, the TALEN plasmids synthesized via easyT were used directly in embryo microinjections. We first tested whether TALEN plasmid DNA microinjection can be used to introduce DSBs in Drosophila embryos by targeting two genes, yellow (y) and white (w), mutations of which result in a visible phenotype. Flies carrying a knockout of y display a characteristic yellow body phenotype, while mutation of w results in reduced or completely lost red eye pigmentation. As shown in Table 2, a series of different TALEN plasmid concentrations were tested for y targeting, with a peak [27.0% (24/89)] in mutation frequency observed at 50 ng/µl concentration for each TALEN plasmid. Interestingly, the mutation frequency was consistently higher in males than in females at every concentration that yielded mutants. For w targeting, 50 ng/µl and 100 ng/µl concentrations were tested, yielding germ line w mutants with 7.9% (3/38) and 0.0% (0/118) frequencies, respectively (Table 2). The optimal TALEN concentration is likely to be a function of the TALEN pair affinity to its recognition sequence, the chromatin status of the target locus and potential off-target sites in the genome, and thus might need to be adjusted for individual TALEN pairs.
The targeted mutations were determined by sequencing the genomic TALEN target region. Among the y mutants generated, 10 randomly selected independent mutant lines each carried NHEJ-induced indels at the target site (Figure 3A, C and D).
Interestingly, targeting of the w gene resulted in a distinctive genomic lesion in all three mutant lines derived from individual microinjected embryos (Figure 3B). The deletion of nine nucleotides from the coding sequence caused reduced red eye pigmentation in mutants (Figure 3E and F) and could be due to an accidental 5-bp microhomology in the spacer sequence between two TALEN-binding sites. In summary, TALEN plasmid microinjection is a convenient alternative to mRNA microinjection for germ line mutagenesis in Drosophila embryos.
Figure 3.
Targeted germ line gene mutagenesis. (A) TALEN-induced mutations in y. The target site was chosen at the junction of exon 2 and intron, as indicated with the scissors mark. Ten y mutants (y) derived from individual F0 flies were sequenced. (B) TALEN-induced mutations in w. The target site encodes the end part of the ATP-binding cassette transporter domain. Three independently generated F1 mutants have the same genomic lesion, a 9-bp deletion. The sequence microhomology marked with asterisks at the target site indicates probable DSB repair through microhomology-mediated end-joining. (C and D) The body colour phenotypes of wild type and y flies, both male, are shown. (E and F) The eye colour phenotypes of wild type and w flies are shown. w is a hypomorphic allele, showing an orange eye colour. In the TALEN target sequences, the DNA aberrations are shown in red and TALEN-binding sites in blue. The intronic and exonic sequences are shown in lowercase and uppercase, respectively. Scale bar represents 1 kb length.
TALEN-mediated targeted gene integration by HR in Drosophila germ line
Next, we investigated TALEN-mediated targeted gene integration via HR between the target locus and an exogenously provided donor DNA. To expand the TALEN application to any region in the genome, we constructed two TALEN-pairs targeting the intergenic region of unpaired (upd) and the wingless (wg) locus, respectively (Table 1). To that end, three donor DNA plasmids were constructed to contain an eye-specific enhanced green fluorescent protein marker gene [3xP3-EGFP (35)] flanked on either side by homology arms (Figure 4A–C). Insertion of the marker gene in y donor DNA disrupts the coding sequence of the y gene. In the other two cases, the insertion results in a deletion of a putative AP-1–binding sequence in the upd and wg donor DNA. Co-injection of donor DNA plasmids (100 ng/µl) along with TALEN plasmids (50 ng/µl each) was carried out in a DNA ligase IV (lig4) mutant background (36), shown previously to have higher efficiency of donor DNA integration via HR (30).
Figure 4.
TALEN-mediated targeted marker gene integration. (A–C) Schematics of TALEN target loci and corresponding donor DNA. TALEN target sites are indicated with scissors. Scale bar in each panel represents 1 kb length. (D–F) Mutant progeny of microinjected embryos with TALEN-mediated targeted 3xP3-EGFP integration. The y fly also shows the expected yellow body colour (D). upd and wg mutants show weaker levels and specific patterns of green fluorescence in the fly eyes (E and F). (G–J) Genomic DNA amplification of y, upd and wg. Red and blue arrows indicate primers used for genomic DNA amplification.
The flies carrying germ line integrations of the marker gene were identified by green fluorescence in eyes of the progeny of microinjected embryos (Figure 4D–F). The fluorescence level and pattern were different for the three genomic locations tested, probably reflecting the influence of surrounding DNA regulatory elements on the transcriptional activity of the marker gene. The triple-plasmid microinjection resulted in a successful introduction of germ line transmitting mutations at each locus: 1.5% (2/130) for y, 10.5% (16/152) for upd and 2.2% (4/186) for wg (Table 3). Using the best TALEN-pair targeting upd, we examined the frequency of HR event with higher concentrations (500 ng/µl) of donor DNA plasmids (Table 3). Unexpectedly, increasing donor DNA concentration resulted in a decrease of frequency of HR-mediated gene integration (4.0%, 12/297). Genomic DNA of EGFP-positive flies was amplified to confirm targeted gene integration (Figure 4G–J). Amplicons of upd and wg were also sequenced to verify that they indeed represent TALEN-mediated targeted gene integration (data not shown). These results demonstrate that targeted mutations via gene integration can be efficiently induced in Drosophila and easily scored with a traceable recombination marker.
DISCUSSION
Introduction of site-specific genetic modifications in a controlled manner has been recognized as an ultimate approach in biosciences to understand the function of a gene product or gene regulation. The development of customizable engineered nucleases such as ZFNs and, more recently, TALENs marked a leap forward for targeted genome editing in a variety of organisms. In this study, we developed a new TALEN synthesis method named easyT. This protocol allows construction of custom TALENs from monomer units in a single day and relies only on standard techniques commonly used in a molecular biology lab. Other assembly methods that permit construction of TALENs in a single day require either solid-phase high-throughput ligation technology or a library of preassembled 4- or 5-mer unit plasmids (40,43,44). Furthermore, an additional feature of easyT is the ability to replace part of the assembled units in 4-mer blocks. The time needed to synthesize a TALEN by a given protocol depends on successful cloning. Many of the currently available TALEN assembly protocols are based on either plasmid- or PCR-based Golden Gate cloning (39), where the correct assembly is confirmed at the end by sequencing. At this point, mutations introduced in the TALE-repeat cannot be fixed, and either more clones are sequenced or the assembly procedure has to be repeated. Therefore, the ability to replace a part of the TALE-repeats is a valuable troubleshooting feature of easyT.
TALEN technology has been quickly and successfully applied to a wide variety of cells and organisms in the past years. To evaluate the utility of TALENs in each model organism, most of the pioneering work has scored the frequency of NHEJ-mediated targeted mutations using genes with known visible phenotypes. In the practical use of TALEN technology, however, it is often necessary to target loci for which the resulting phenotypes are not known.
Because the frequencies of mutant production are generally low in Drosophila (18,21,30), the identification of mutants could be the key step in Drosophila genome engineering. In this work, we demonstrated that TALEN-directed targeted genome alteration through HR with exogenous donor DNA could be an efficient and straightforward approach for mutant generation and identification. In this regard, we demonstrate the utility of a traceable recombination marker introduced by the donor DNA. First, it allows identification of germ line transmitting mutants as soon as the progeny of injected embryos start to hatch. Second, it enables tracking of the mutated alleles for generating hetero- or homozygous stable lines for subsequent experiments. Finally, the easy identification allows more flexibility in the mode of TALEN introduction into Drosophila embryos, and we showed that TALEN plasmid microinjection is a convenient alternative to mRNA microinjection in Drosophila. Based on the limited number of examples in this work and in previous studies (18,21), we speculate that a slight increase in mutant frequency would not substantially improve on the simple identification of mutants by traceable markers, which shifts the focus from TALEN efficiency to the overall convenience of the method. For example, the use of plasmid DNA for TALEN introduction eliminates additional mRNA synthesis steps and possible concerns about the quality of TALEN mRNA during microinjection. Taken together, HR-mediated gene integration with donor DNA bearing a traceable marker appears to be a more favourable and practical strategy.
The utility of TALEN-mediated targeted gene integration can be extended to precise genome engineering by combining it with site-specific recombinase/integrase systems such as flippase (Flp)-FRT, Cre-loxP or PhiC31 (46), and the piggyBac transposon system (47). 
For example, by using a 3xP3-EGFP cassette with Flp/Cre recognition sites flanking the 3xP3 promoter, the 3xP3 promoter can be removed by the appropriate recombinase while leaving EGFP at the target site (Supplementary Figure S6). In this way, it is now possible to tag endogenous gene products with a fluorescent protein regardless of their expression levels.
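The promoter removal described above can be illustrated with a toy model. The sketch below, in Python, represents a cassette as an ordered list of element names and removes the segment between two copies of a recombinase site, leaving one copy behind. The element order `["FRT", "3xP3", "FRT", "EGFP"]` and the function name `excise` are assumptions made for illustration only; the actual construct layout is the one shown in Supplementary Figure S6, and site orientation is ignored here.

```python
def excise(cassette, site):
    """Toy model of recombinase-mediated excision.

    Removes everything between the first two copies of `site` and keeps
    a single copy, as expected for two directly repeated sites.
    """
    first = cassette.index(site)
    second = cassette.index(site, first + 1)
    return cassette[:first + 1] + cassette[second + 1:]

# Hypothetical element order, assumed for illustration only.
cassette = ["FRT", "3xP3", "FRT", "EGFP"]
after = excise(cassette, "FRT")
print(after)
```

In this simplified picture the 3xP3 promoter is lost and EGFP stays at the target site next to a single remaining recognition site.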
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online, including [48].
FUNDING
Deutsche Forschungsgemeinschaft [SPP1356 to T.K. and R.P.]. Funding for open access charge: Eidgenössische Technische Hochschule Zürich.
Conflict of interest statement. None declared.