Siyu Chen1, Wanhua Xie2, Zhiquan Liu1, Huanhuan Shan1, Mao Chen1, Yuning Song1, Hao Yu3, Liangxue Lai4, Zhanjun Li5. 1. Key Laboratory of Zoonosis Research, Ministry of Education, College of Animal Science, Jilin University, Changchun 130062, China. 2. The Precise Medicine Center, Shenyang Medical College, Shenyang, China. 3. Key Laboratory of Zoonosis Research, Ministry of Education, College of Animal Science, Jilin University, Changchun 130062, China. Electronic address: yu_hao@jlu.edu.cn. 4. Key Laboratory of Zoonosis Research, Ministry of Education, College of Animal Science, Jilin University, Changchun 130062, China; CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Guangzhou Regenerative Medicine and Health Guang Dong Laboratory (GRMH-GDL), Guangzhou 510005, China; Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China. Electronic address: lai_liangxue@gibh.ac.cn. 5. Key Laboratory of Zoonosis Research, Ministry of Education, College of Animal Science, Jilin University, Changchun 130062, China. Electronic address: lizj_1998@jlu.edu.cn.
Abstract
CRISPR-Cas9-mediated gene knockout and base-editing-associated induction of STOP codons (iSTOP) have been widely used to abolish the function of a coding gene, although both have been reported to cause side effects. In this study, we propose a novel and practical alternative method referred to as CRISPR Start-Loss (CRISPR-SL), which eliminates gene expression by utilizing both adenine base editors (ABEs) and cytidine base editors (CBEs) to disrupt the initiation codon (ATG). CRISPR-SL has been verified to be a feasible strategy at the cellular and embryonic levels (mean editing efficiencies up to 30.67% and 73.50%, respectively) and in two rabbit models mimicking Otc deficiency (Otc gene) and long hair economic traits (Fgf5 gene).
CRISPR-Cas9, which originally derives from the adaptive immune systems of bacteria and archaea, has been widely applied to engineer and elucidate gene functions [3–5]. CRISPR-Cas9 generates double-strand breaks (DSBs) at target sites, which are repaired by non-homologous end joining (NHEJ) or homology-directed repair (HDR). NHEJ, the predominant repair pathway, can introduce insertion/deletion mutations (indels) of various lengths, which disrupt the targeted genes. Although CRISPR-Cas9-associated gene editing has been successfully utilized in numerous organisms, such as plants [6–9], rabbits, Drosophila, cynomolgus monkeys, mice, and rats, it has been reported to generate p53-dependent cell arrest in human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs), on-target mRNA misregulation in ∼50% of HAP1 cell lines studied, and large deletions and complicated genomic rearrangements in more than 40% of tested loci and two target loci, respectively. Moreover, it has been reported that about one-third of CRISPR-Cas9-mediated gene knockouts contain residual protein at variable levels.
To avoid the DSBs generated by the CRISPR-Cas9 system, base editors (BEs) have been widely used for gene editing in recent studies. BEs, which combine a catalytically impaired Cas9 with natural or laboratory-evolved nucleobase deaminase enzymes, make it possible to introduce targeted C-to-T (CBEs) or A-to-G (ABEs) conversions in genomic DNA without generating DSBs [22–25]. Recently, CBEs have been utilized to target CGA (Arg), CAG (Gln), and CAA (Gln) on the coding strand or ACC on the noncoding strand to create TGA, TAG, or TAA stop codons (iSTOP), an approach that has been commonly applied to disrupt genes and mimic disease-associated nonsense mutations. Its feasibility has been demonstrated in mice, pigs, and rabbits, either to model human diseases or to identify gene function. In this study, we put forward a novel strategy termed CRISPR Start-Loss (CRISPR-SL), which can be induced by both CBEs and the more stringent ABEs. In addition, each of the three bases in the start codon can be edited by either CBEs or ABEs, which theoretically triples the chance of disrupting genes.
Results
CRISPR-SL Can Convert ATG into GTG, ACG, or ATA in Human Cells
CRISPR-SL relies on the ability of both ABEs and CBEs to convert a start codon (ATG) into a non-start codon (GTG, ACG, or ATA). In detail, ABEs were used to target the adenine (A) located on the coding strand (Figures 1A and 1D) or the adenine opposite the thymine (T) on the non-coding strand (Figures 1B and 1D), converting the start codon into GTG or ACG, respectively. Similarly, CBEs were applied to deaminate the cytosine opposite the guanine (G) on the non-coding strand, which converts G into A on the coding strand and produces an ATA non-start codon (Figures 1C and 1D).
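The three conversions described above can be illustrated with a small toy model. This is only a sketch of the strand logic: the function name `edit_codon` and its arguments are hypothetical and not part of the study, and the model ignores editing windows, PAMs, and bystander edits.

```python
# Toy model of CRISPR-SL edits on the start codon, written on the coding strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Deaminase outcomes stated in the text: ABEs give A-to-G, CBEs give C-to-T.
CONVERSION = {"ABE": ("A", "G"), "CBE": ("C", "T")}


def edit_codon(codon, index, editor, strand):
    """Return the coding-strand codon after a single base edit.

    index  -- 0-based position within the codon (coding-strand numbering)
    editor -- "ABE" or "CBE"
    strand -- "coding" or "noncoding": the strand carrying the deaminated base
    """
    substrate, product = CONVERSION[editor]
    coding_base = codon[index]
    # Base actually presented to the deaminase on the chosen strand.
    presented = coding_base if strand == "coding" else COMPLEMENT[coding_base]
    if presented != substrate:
        raise ValueError(f"{editor} cannot edit {presented} on the {strand} strand")
    # Translate the edit back to the coding strand.
    new_base = product if strand == "coding" else COMPLEMENT[product]
    return codon[:index] + new_base + codon[index + 1:]


START = "ATG"
outcomes = {
    "A via ABE (coding strand)": edit_codon(START, 0, "ABE", "coding"),
    "T via ABE (noncoding strand)": edit_codon(START, 1, "ABE", "noncoding"),
    "G via CBE (noncoding strand)": edit_codon(START, 2, "CBE", "noncoding"),
}
for label, codon in outcomes.items():
    print(f"{label}: ATG -> {codon}")
```

Each call models one sgRNA and one editor; requesting an edit that the deaminase cannot perform on the chosen strand raises an error instead of silently returning the unedited codon.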
Figure 1
CRISPR-SL for Generating Loss of Gene Function
(A) Schematic of base-editor-mediated A to G mutation using ABEs. (B) Schematic of base-editor-mediated T to C mutation using ABEs. (C) Schematic of base-editor-mediated G to A mutation. (D) Schematic representation of the architecture of all base editors used for SL in this study.
First, as a proof of concept, seven target sites in five genes were tested in the human embryonic kidney cell line 293T (HEK293T) by co-transfecting BE-encoding and corresponding guide RNA plasmids. Classical ABEmax and rAPOBEC1-BE4max (rA1-BE4max) were harnessed to edit sites with a canonical NGG protospacer-adjacent motif (PAM). rAPOBEC1-NG-BEs (rA1-NG-BEs) and NG-ABEmax were used at sites with NGN PAMs. For sites with the target base located outside the canonical editing window (protospacer positions 4–8 nt), the enhanced activation-induced cytidine deaminase (eAID)-Cas9 fusions (eAID-BEs), which exhibit an editing window of up to 11 nt as determined in our previous study, were used (Figure 2A). In detail, an A to G mutation was generated in Tyr, Otc, and Timeless. The Sanger sequencing chromatograms analyzed by EditR reveal average efficiencies between 11.63% and 30.67% (Figure 2B; Figure S1A). A T to C mutation was produced in Lmna and Pah with mean efficiencies of 17.67% and 21.33%, respectively (Figure 2C; Figure S1B). Only low frequencies of G to A mutations were induced in Tyr (3.95%) and Timeless (4.30%) using conventional BEs, since the target bases lie outside the original editing window (Figure 2D). However, the efficiency increased nearly 3-fold (to 10.93% and 12.50%, respectively) when eAID-BEs were used (Figure 2D; Figure S1C). Furthermore, the top five potential off-targets (≤3 mismatches) for all genomic sites were predicted, and no off-target editing events were detected (Figures S1D–S1F). Taken together, the start codon can be efficiently mutated by rA1-NG-BEs and eAID-BEs, which can expand the genome-wide targetable scope and increase the editing efficiency, respectively.
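As a quick check of the "nearly 3-fold" statement, the fold change can be recomputed from the mean G to A efficiencies reported in the text. The snippet is a minimal sketch; the dictionary layout and the helper name `fold_change` are illustrative choices, not part of the study.

```python
# Mean G-to-A editing efficiencies (%) reported in the text for HEK293T cells.
efficiencies = {
    "Tyr": {"conventional": 3.95, "eAID": 10.93},
    "Timeless": {"conventional": 4.30, "eAID": 12.50},
}


def fold_change(before, after):
    """Ratio of the efficiency after switching editors to the efficiency before."""
    return after / before


fold_changes = {
    gene: fold_change(values["conventional"], values["eAID"])
    for gene, values in efficiencies.items()
}

for gene, fc in fold_changes.items():
    print(f"{gene}: {fc:.2f}-fold")
```

Both ratios fall between 2.7 and 3, which is consistent with the wording used in the text.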
Figure 2
Base-Editor-Mediated CRISPR-SL in HEK293T Human Cells
(A) Summary of target sites and base editors used for CRISPR-SL at the cellular level. (B) Editing efficiencies of the A to G mutation in Tyr, Otc, and Timeless, and their representative Sanger sequencing chromatograms. (C) Editing efficiencies of the T to C mutation in Lmna and Pah, and their representative Sanger sequencing chromatograms. (D) Editing efficiencies of the G to A mutation in Tyr and Timeless and their representative Sanger sequencing chromatograms. PAM region (green) and the start codon are underlined, and target bases are marked in red. Red arrow indicates mutated base in ATG. ∗p < 0.05, Student’s t test. See also Figure S1.
CRISPR-SL Can Convert ATG into GTG, ACG, or ATA in Rabbit Embryos
To verify the CRISPR-SL strategy in rabbit embryos, BE-encoding mRNAs and corresponding in vitro-transcribed single guide RNAs (sgRNAs) were microinjected into zygotes. As shown in Figure 3A and Table 1, seven sgRNAs targeting different bases of the start codon in four genes were selected. The Otc gene encodes a mitochondrial matrix enzyme, mutations of which lead to hyperammonemia. The Tyrp1 gene encodes a melanosomal enzyme, and defects in this gene are the cause of rufous oculocutaneous albinism and oculocutaneous albinism type III. Hbb2 encodes the β polypeptide chain, which functions in the transport of oxygen from the lungs to various peripheral tissues. The Fgf5 gene encodes fibroblast growth factor 5. For each gene, at least one base of the start codon was accessible to CRISPR-SL (Figure 3A). The results showed that A to G mutations were generated in three genes (Otc, Tyrp1, and Hbb2), with average editing frequencies ranging from 15.68% to 51.77% (Figure 3B; Figure S2A). T to C mutations were induced in Fgf5 and Hbb2 with a mean editing efficiency of up to 53.60% (Figure 3C; Figure S2B). G to A mutations were also produced in Fgf5 and Hbb2, with a mean editing efficiency of up to 73.50% (Figure 3D; Figure S2C).
Figure 3
Base Editor-Associated CRISPR-SL in Rabbit Embryos
(A) Summary of target sites and base editors used for CRISPR-SL in rabbit embryos. (B) Editing efficiencies of the A to G mutation in Otc, Tyrp1, and Hbb2, and their representative Sanger sequencing chromatograms. (C) Editing efficiencies of the T to C mutation in Fgf5 and Hbb2, and their representative Sanger sequencing chromatograms. (D) Editing efficiencies of the G to A mutation in Fgf5 and Hbb2, and their representative Sanger sequencing chromatograms. The PAM region (green) and start codon are underlined, and target bases are marked in red. Red arrow indicates mutated base in ATG. See also Figure S2.
Table 1
Summary of Targeted Sites, Embryonic Development, and Base Editors Used in Rabbit Embryos
| Target Sites | No. of Zygotes | No. of Two-Cell (%) [a] | No. of Blastocysts (%) [a] | No. of Mutants (%) [b] | Mean Editing Efficiency (%) [b] | Base Editors |
|---|---|---|---|---|---|---|
| Otc | 21 | 17 (81) | 11 (52) | 10 (91) | 51.77 | ABEmax |
| Tyrp1 | 18 | 15 (83) | 9 (50) | 8 (89) | 41.51 | NG-ABEmax |
| Hbb2-1 | 18 | 13 (72) | 9 (50) | 6 (67) | 15.68 | NG-ABEmax |
| Fgf5 | 20 | 18 (90) | 11 (55) | 11 (100) | 53.60 | ABEmax |
| Hbb2-2 | 19 | 15 (79) | 11 (58) | 10 (91) | 34.22 | ABEmax |
| Hbb2-2 | 13 | 9 (69) | 7 (54) | 5 (71) | 73.50 | rA1-BE4max |
| Fgf5 | 11 | 10 (91) | 7 (64) | 7 (100) | 26.92 | rA1-BE4max |
“Two-Cell” refers to two-cell stage embryos developing from zygotes.
[a] Calculated from the number of zygotes.
[b] Calculated from the number of blastocysts.
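The footnotes define how the percentages in Table 1 are derived, so the table can be checked for internal consistency. The sketch below recomputes each percentage from the counts transcribed from the table; the tuple layout and helper names are illustrative, and rounding to the nearest integer is an assumption about how the reported values were obtained.

```python
# Rows transcribed from Table 1:
# (target, editor, zygotes, two_cell, blastocysts, mutants, reported percentages)
# Reported percentages are (two-cell %, blastocyst %, mutant %).
ROWS = [
    ("Otc", "ABEmax", 21, 17, 11, 10, (81, 52, 91)),
    ("Tyrp1", "NG-ABEmax", 18, 15, 9, 8, (83, 50, 89)),
    ("Hbb2-1", "NG-ABEmax", 18, 13, 9, 6, (72, 50, 67)),
    ("Fgf5", "ABEmax", 20, 18, 11, 11, (90, 55, 100)),
    ("Hbb2-2", "ABEmax", 19, 15, 11, 10, (79, 58, 91)),
    ("Hbb2-2", "rA1-BE4max", 13, 9, 7, 5, (69, 54, 71)),
    ("Fgf5", "rA1-BE4max", 11, 10, 7, 7, (91, 64, 100)),
]


def percent(part, whole):
    """Percentage rounded to the nearest integer (assumed rounding rule)."""
    return round(100 * part / whole)


def recompute(zygotes, two_cell, blastocysts, mutants):
    """Two-cell and blastocyst rates use zygotes; the mutant rate uses blastocysts."""
    return (
        percent(two_cell, zygotes),
        percent(blastocysts, zygotes),
        percent(mutants, blastocysts),
    )


mismatches = [
    (target, editor)
    for target, editor, zygotes, two_cell, blastocysts, mutants, reported in ROWS
    if recompute(zygotes, two_cell, blastocysts, mutants) != reported
]
print("rows that disagree with the reported percentages:", mismatches)
```

An empty list means every reported percentage follows from the counts under the denominators given in the footnotes.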
Start Codon Disruption at Fgf5 to Generate a Long Hair Rabbit Using CRISPR-SL
The high editing rates indicate the practicality of CRISPR-SL in cells and rabbit embryos, which led us to ask whether it could also be applied to disrupt gene function in individual animals. Fibroblast growth factor 5 (FGF5), a member of the FGF family, was selected. It has been reported that targeted disruption of Fgf5 results in an abnormally long hair phenotype. Thus, a single G·C to A·T base pair conversion was designed in the start codon of Fgf5 to disable it (Figure 4A). To obtain Fgf5 start-loss offspring, rabbit zygotes were injected with BE4max-encoding mRNA and the associated sgRNA and then transferred into surrogate mothers (Table 2). Notably, all five pups obtained carried homozygous mutations (100%), as determined by TA cloning and Sanger sequencing (Figures 4B and 4C; Figure S3A). From 1 month of age, the pups developed a long hair phenotype compared with age-matched wild-type (WT) rabbits (Figures 4D and 4E), without a reduction in body weight (Figure 4F). Quantitative real-time PCR and western blot were performed to determine how CRISPR-SL affects mRNA and protein expression. Fgf5 mRNA expression in the Fgf5 start-loss pups was drastically reduced compared with age-matched WT rabbits (Figure 4G), and protein was almost undetectable in the homozygous Fgf5 progeny (Figure 4H). In addition, five potential off-target sites were examined, and no off-target editing event was detected in Fgf5 F0 rabbits (Figure S3B). These results collectively indicate that start codon mutation is sufficient to eliminate the protein and generate the accompanying phenotype in rabbits.
Figure 4
Generation of Fgf5 Disrupted F0 Rabbits Using the CRISPR-SL Strategy
(A) The target sequence at the Fgf5 locus. The target sequence is shown in black; the PAM region is shown in green; the coding sequence (CDS) is indicated by black rectangles; the UTR is indicated by white rectangles. The start codon is marked by a rectangle, and the targeted base is marked in red. (B) Genotypes of F0 rabbits and age-matched WT rabbits were identified by TA cloning. The PAM region is marked in green, the start codon is marked by rectangles, and the substituted nucleotides are marked in red. (C) Representative Sanger sequencing chromatograms of Fgf5 F0 rabbits and WT rabbits. The PAM and start codon are underlined. Red arrow indicates substituted nucleotides. (D) Fgf5−/− rabbits exhibited a long hair phenotype. (E) Staple length of Fgf5 and WT F0 rabbits at different growth stages (n = 3). (F) Body weight of Fgf5 and age-matched WT F0 rabbits at different growth stages. (G) Expression of the Fgf5 gene as determined by quantitative real-time PCR (Het, n = 3; WT, n = 3). (H) Western blot analysis of FGF5 expression. See also Figure S3.
Table 2
Generation of SL Founder Rabbits
| Base Editor | Target Gene | No. of Transplanted Embryos | No. of Offspring | Mutant Ratio: No. of Mutants (%) | Mutant Ratio: No. of Target Mutants (%) |
|---|---|---|---|---|---|
| ABEmax | Fgf5 | 25 | 5 | 5 (100) | 5 (100) |
| rA1-BE4max | Otc | 25 | 7 | 7 (100) | 7 (100) |
Start Codon Disruption at Otc to Generate an Otc Deficiency Rabbit Using CRISPR-SL
In previous reports, ornithine transcarbamylase (OTC) mutation resulted in fatal hyperammonemia, liver fibrosis, and lower body weight. Notably, Otc single nucleotide variants (1A>G, 2T>C, 3G>A) were previously proven to be pathogenic in humans [49–51]. To model OTC deficiency using CRISPR-SL in rabbits, an sgRNA (Otc) converting the A·T base pair in the start codon into G·C was designed (Figure 5A). Rabbit zygotes were injected with ABEmax-encoding mRNA and the corresponding sgRNA and then transferred into surrogate mothers. We obtained seven live pups; one was homozygous (#1), and the rest were heterozygous (Het) (Figures 5B and 5C; Figure S4A). Four pups (#2, #3, #5, #7) died within the first week, and one pup (#1) died 24 h after birth, consistent with previous reports in mice. However, a Het male (#4) with a mutation efficiency of 30% lived to 58 days (Figure 5D). The body weight of the pups was recorded before death and was markedly lower than that of age-matched WT rabbits (Figure 5E). As evidenced by hematoxylin and eosin (H&E) and Masson’s trichrome staining of liver sections, the Het male (#4) developed severe liver fibrosis (Figure 5F). Moreover, the blood test revealed high plasma ammonia in Het offspring (Figure 5G), and no off-target editing events were observed in Otc-deficient rabbits (Figure S4B). Gross morphology of kidney and liver sections and urinalysis implied other pathological abnormalities (Figures S4C–S4E). The phenotypes we observed in the rabbit models were in accordance with those in human and mouse models. The effects of start-loss on mRNA and protein were characterized by quantitative real-time PCR and western blot, respectively. As shown in Figure 5H, Otc mRNA was slightly reduced in Het progeny, since partially mutated individuals retain part of their transcription, and dropped dramatically in the homozygous offspring. Likewise, protein decreased in Het pups, and no protein was detected in the homozygous offspring (Figure 5I), indicating that CRISPR-SL can disrupt the gene and its encoded protein.
Figure 5
Generation of Otc Disrupted F0 Rabbits Using the CRISPR-SL Strategy
(A) The target sequence at Otc locus. The target sequence is shown in black; the PAM region is shown in green; the CDS is indicated by black rectangles; the UTR is indicated by white rectangles. The start codon is marked by a rectangle, and the targeted base is marked in red. (B) Representative Sanger sequencing chromatograms of Otc F0 rabbits and WT rabbits. The PAM and start codon are underlined. Arrow indicates substituted nucleotides. (C) Genotypes of F0 rabbits and age-matched WT rabbits identified by TA cloning. The start codon is marked by a rectangle, and the substituted nucleotides are marked in red. (D) Survival proportions of Otc F0 rabbits and age-matched WT rabbits (Het, n = 7; WT, n = 4). (E) Body weight of Otc and age-matched WT F0 rabbits. (F) H&E and Masson’s trichrome staining of liver sections from WT and Otc rabbits (#4). The blue arrow indicates liver fibrosis. (G) Plasma ammonia of Otc and age-matched WT F0 rabbits (Het, n = 4; WT, n = 4). (H) Expression of the Otc gene determined by quantitative real-time PCR (repeats = 3). (I) Western blot analysis of OTC expression. See also Figure S4.
CRISPR-SL Enables the Disruption of Human Genes on a Genome-wide Scale
To determine whether, despite this limitation, CRISPR-SL could be utilized to disrupt genes on a genome-wide scale, we identified sgRNAs for CRISPR-SL (sgStartloss) for all 26,640 genes reported in the human reference genome (hg19). We first identified all start codons (ATG on the coding strand and CAT on the noncoding strand) with an NGG PAM located at the appropriate distance (12–16 nt) from the targeted base for conventional BEs (Figure 6A). The genome-wide analysis revealed that 60.03% of start codons in the human reference genome can be targeted, making it possible to disrupt 15,993 genes (Figures 6B and 6C). Specifically, 57.57% of targetable start codons can be disrupted by targeting at least two bases, and the remaining 42.43% can be disrupted by targeting only one of A, T, or G in the start codon (Figure S5B).
Figure 6
Genome-Wide Analysis of Genes Accessible to CRISPR-SL Strategy
(A) Workflow utilized to identify CRISPR-SL targetable sites in all open reading frames (ORFs) with CDS coordinates available from the UCSC genome browser. (B) The proportion of human genes that can be edited precisely by CRISPR-SL with conventional base editors recognizing NGG PAM. Sky blue indicates targetable genes; violet indicates inaccessible genes. (C) Number of genes that can be targeted by disrupting A, T, or G in the start codon, respectively, using conventional base editors recognizing NGG PAM. A stands for adenine in the start codon of the targetable gene (yellow); T represents thymine in the start codon of the targetable gene (sky blue); and G stands for guanine (pink). (D) The percentages of genes that can be edited by CRISPR-SL using NG base editors. Sky blue indicates the proportion of the targetable gene, and violet indicates the proportion of the inaccessible gene. (E) Number of genes that can be targeted by disrupting A, T, or G in the start codon, respectively, using NG base editors. Yellow indicates the number of genes with targetable A, blue indicates the number of genes with targetable T, and pink indicates the number of genes with targetable G. (F) Proportion of genes with targetable or inaccessible G. The percent of targetable G (rA1-NG-BE4max) is marked by yellow; the increased percent of targetable G (eAID-NG-BE4max) is marked by dark yellow; the percent of inaccessible G is marked by violet. (G) The percentages of genes which can be edited by CRISPR-SL using NG base editors and eAID-BE4max. Sky blue indicates the proportion of the targetable gene, and violet indicates the proportion of the inaccessible gene. See also Figure S5.
To expand the genome-wide targetable scope, start codons with an NGN PAM at the proper distance were selected. Notably, rA1-NG-BEs remarkably improve the accessible spectrum of the genome to 96.88% (Figures 6D and 6E). The rate of targetable start codons that can be disrupted by editing multiple bases increases from 57.57% to 85.88% (Figure S5B), and those that can only be disrupted by targeting a single base account for merely 14.12% (Figure S5B). To further broaden the genome-wide targetable scope, we identified all start codons (CAT on the noncoding strand) with an NGN PAM located at the appropriate distance (9–19 nt) from the targeted base C for eAID-BE4max, as mentioned above. In comparison with NG-BE4max, eAID-BE4max theoretically increases the proportion of genes with a targetable G in the start codon by 15.32% (Figure 6F), which further increases the genome-wide targetable scope from 96.88% to 99.01% (Figure 6G). The rate of genes that can be disrupted by targeting multiple bases increases from 85.88% to 93.32%. Genes that can be disrupted by targeting only a single base in the start codon account for only 6.68% (Figures S5A and S5B). In general, computational analysis of the human genome indicates that the CRISPR-SL approach can be used to target a majority of genes, and that it is a promising avenue for functional studies without altering gene structures.
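The coverage figures quoted in this section are related by simple arithmetic, which the following sketch makes explicit. All input values are taken from the text; the variable names are illustrative, and the gain from eAID-BE4max is expressed in percentage points as a derived quantity rather than a value stated by the analysis.

```python
TOTAL_GENES = 26_640      # genes in hg19 considered by the analysis
TARGETABLE_NGG = 15_993   # genes reachable with conventional NGG-PAM editors


def pct(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100 * part / whole


ngg_coverage = pct(TARGETABLE_NGG, TOTAL_GENES)

# (genome-wide coverage %, share of targetable start codons editable at
# multiple bases %) as reported for each editor set.
stages = {
    "NGG": (60.03, 57.57),
    "NG": (96.88, 85.88),
    "NG + eAID": (99.01, 93.32),
}

# Start codons editable at only one base are the complement of the multi-base share.
single_base_share = {
    name: round(100 - multi, 2) for name, (_, multi) in stages.items()
}

# Additional coverage contributed by eAID-BE4max, in percentage points.
eaid_gain = round(stages["NG + eAID"][0] - stages["NG"][0], 2)

print(f"NGG coverage: {ngg_coverage:.2f}%")
print("single-base share:", single_base_share)
print(f"gain from eAID-BE4max: {eaid_gain} percentage points")
```

The recomputed single-base shares reproduce the 42.43%, 14.12%, and 6.68% values given in the text.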
Discussion
In this study, we primarily examined the utilization of CRISPR-SL in silencing gene function and showed that it is a feasible alternative to CRISPR-Cas9-associated gene knockout and BE-mediated premature termination. CRISPR-SL has been successfully utilized not only at the cellular and embryonic levels but also in the rabbit, although it is restrained by the editing window and targetable scope. We have used NG-BEs and eAID-BEs to broaden the targetable scope and widen the editing window, which remarkably augments the availability of CRISPR-SL. Beyond the editors used in this study, other BEs could further optimize CRISPR-SL. One could use distinct effector enzymes recognizing different PAMs, such as Spy-mac Cas9, which recognizes a 5′-NAA-3′ PAM, and Cpf1, which requires a 5′-TTTV-3′ PAM, to broaden the targeting spectrum. BEs with a wider editing window, for example BE-PLUS and CP-BEs, as well as the more recent prime editing, can be utilized to expand the targeting scope of CRISPR-SL.
DNA and RNA off-target effects generated by BEs are another limitation of CRISPR-SL, even though no off-target effects were observed in this study. However, the Cas9-dependent DNA off-target mutations, which mainly derive from the effector enzymes, can be minimized by using Cas9 variants with higher DNA specificity or by delivering BEs in ribonucleoprotein form rather than expressing them from DNA plasmids. The Cas9-independent off-target DNA and RNA mutations, which primarily arise from the deaminase domain, can be eliminated by rational mutagenesis of the deaminase domain, as in the representative YE1-CBEs.
We noticed a significant decrease in mRNA expression in both Het and homozygous F0 pups, which is consistent with the results observed in disease cases caused by naturally occurring start codon mutations. Previous reports have demonstrated that other codons can initiate translation, albeit at a low efficiency [60–62]. The GTG codon, the most efficient alternative start codon tested, was produced in Otc start-loss pups, but no protein was detected. We therefore suppose that non-ATG-initiated translation is a rare occurrence that does not hinder CRISPR-SL from inactivating genes. To eliminate the possibility of translation initiating at a non-ATG codon, the recently developed Target-ACE, a novel BE that tethers activation-induced cytidine deaminase (AID) and an engineered tRNA adenosine deaminase (TadA) to a catalytically impaired SpCas9, can be used. With Target-ACE, CRISPR-SL could edit T and G simultaneously with only one sgRNA, converting ATG into ACA, a non-start codon that has not been shown to enable translation. This probably stands a better chance of disrupting genes of interest.
Besides gene silencing, CRISPR-SL could also be used to identify the function of different transcripts with distinct sequences surrounding the start codon. A previous study used BEs to disrupt splice sites so as to control mRNA isoforms. CRISPR-SL could also control mRNA isoforms by disrupting the start codon of unwanted transcripts. In this circumstance, a transcript of interest can be characterized, and its function verified, by disrupting the start codons of the other transcripts. With knowledge of the functions of distinct transcripts, we gain control over the manipulation of transcriptional products. However, transcripts with the same sequence surrounding the start codon cannot be distinguished by CRISPR-SL, as it is impossible to differentiate one transcript from another in this case.
While we were preparing the manuscript, another team released similar research, in which they revealed that of 1,463 genes that cannot be targeted by iSTOP, 1,345 are accessible to i-Silence. Since i-Silence is induced only by ABEs, CRISPR-SL, which combines i-Silence with CBEs, might be able to eliminate more previously inaccessible genes. Thus, CRISPR-SL is expected to expand the toolkit for functional studies and to improve methods for studying gene function when combined with existing approaches.
Materials and Methods
Ethics Statement
New Zealand White rabbits were obtained from the Laboratory Animal Center of Jilin University (Changchun, China). All animal studies were conducted according to experimental practices and standards approved by the Animal Welfare and Research Ethics Committee at Jilin University.
Cell Culture and DNA Transfection
The HEK293T cell line (Life Technologies) was cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum (HyClone), 2 mM GlutaMAX (Life Technologies), 100 U/mL penicillin, and 100 μg/mL streptomycin, and was incubated at 37°C with 5% CO2. Cells were seeded into six-well poly-D-lysine-coated plates (Corning Life Sciences). At 12–15 h after plating, cells were transfected with 5 μL of Lipofectamine 3000 (Thermo Fisher Scientific) using 1,250 ng of BE plasmid and 1,250 ng of guide RNA plasmid. Genomic DNA was extracted 48 h after transfection using the TIANamp genomic DNA kit (Tiangen, Beijing, China) according to the manufacturer’s instructions. DNA sequences containing the target base-editing sites were amplified with the primer pairs listed in Table S1.
Designing sgRNAs
Designing sgRNAs for CRISPR-SL follows five steps. First, locate all initiation codons: ATG in the coding strand and CAT in the noncoding strand. Second, map the coding sequence coordinates to the genomic sequence coordinates. Third, search for a PAM at an appropriate distance from the targeted base in the start codon (e.g., 12–16 nt for conventional BEs). Fourth, annotate each sgStartloss with the gene name. Fifth, BLAST the desired sgRNA against the whole genome of the corresponding species and design primers to amplify the target sequence for analysis.
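As an illustration of the first and third steps, the following minimal Python sketch scans both strands around a known ATG for candidate protospacers. It is not the pipeline used in this study. It assumes an NGG PAM, a 20-nt protospacer, and an editing window at protospacer positions 4–8 counted from the PAM-distal end, which corresponds to 12–16 nt between the targeted base and the PAM. The names `revcomp` and `find_start_loss_guides` are hypothetical, and bystander edits within the window are not considered.

```python
# Hypothetical helpers: NGG PAM and a position 4-8 window are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_start_loss_guides(seq, atg_index, window=(4, 8), guide_len=20):
    """List candidate start-loss guides for the ATG at seq[atg_index]."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("expected ATG at atg_index")

    rc = revcomp(seq)
    # Index of the CAT triplet (the ATG read on the noncoding strand).
    cat_index = len(seq) - (atg_index + 3)

    # (strand, strand sequence, index of target base, editor, edited codon)
    targets = [
        ("coding", seq, atg_index, "ABE", "GTG"),        # A->G on coding strand
        ("noncoding", rc, cat_index + 1, "ABE", "ACG"),  # A->G opposite the T
        ("noncoding", rc, cat_index, "CBE", "ATA"),      # C->T opposite the G
    ]

    guides = []
    for strand, s, base_idx, editor, new_codon in targets:
        for pos in range(window[0], window[1] + 1):
            start = base_idx - (pos - 1)    # protospacer 5' end
            pam_start = start + guide_len   # first base of the PAM
            if start < 0 or pam_start + 3 > len(s):
                continue
            if s[pam_start + 1:pam_start + 3] == "GG":  # NGG PAM
                guides.append({
                    "strand": strand,
                    "editor": editor,
                    "position": pos,
                    "protospacer": s[start:pam_start],
                    "pam": s[pam_start:pam_start + 3],
                    "edited_codon": new_codon,
                })
    return guides
```

For PAM-relaxed editors such as the NG variants, the PAM test would need to be loosened accordingly; candidate guides would still have to be checked against the whole genome as described in the fifth step.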
In Vitro Transcription
The BE4max and ABEmax plasmids were obtained from Addgene (#112093 and #112095). NG-ABEmax, NG-BE4max, and eAID-BE4max plasmids were optimized by our laboratory as previously described. Each plasmid was linearized with NotI, and mRNA was synthesized using an in vitro RNA transcription kit (HiScribe T7 ARCA mRNA kit [with tailing], NEB). The sgRNA oligonucleotides were annealed and cloned into pUC57-sgRNA expression vectors carrying a T7 promoter. The sgRNAs were then amplified, transcribed in vitro using the MAXIscript T7 kit (Ambion), and purified with a miRNeasy mini kit (QIAGEN) according to the manufacturer’s instructions. The sgRNA sequences used in this study are shown in Table 1.
Microinjection of Rabbit Zygotes
The protocol for microinjection of pronuclear-stage embryos has been described in detail in our published protocols. Briefly, a mixture of ABEmax, NG-ABEmax, or BE4max mRNA (200 ng/μL) and sgRNA (50 ng/μL) was microinjected into the cytoplasm of pronuclear-stage zygotes, and the injected embryos were then transferred into surrogate mothers.
Single-Embryo PCR Amplification and Rabbit Genotyping
The injected embryos were collected at the blastocyst stage. Genomic DNA was extracted with an embryo lysis buffer (1% Nonidet P-40 [NP40]) at 56°C for 60 min and 95°C for 10 min in a Bio-Rad PCR amplifier, and the target regions were then amplified by PCR and subjected to Sanger sequencing. For genotyping of newborn rabbits, genomic DNA was extracted from ear clips, amplified by PCR, and subjected to Sanger sequencing and T-A cloning. Twenty positive T-A clones were sequenced by Tiangen. All primers used for detection are listed in Table S1.
Off-Target Editing Analysis
Top potential off-target sites for each target site in HEK293T cells and Otc, Otc F0 rabbits and Fgf5 F0 rabbits were predicted using the online tool Cas-OFFinder (http://www.rgenome.net/cas-offinder/). Off-target sites and the primers used to amplify the corresponding sequences are listed in Table S1. For off-target editing analysis, genomic DNA was extracted from ear clips of newborn rabbits and from transfected HEK293T cells, subjected to Sanger sequencing, and then analyzed with EditR.
H&E and Masson’s Trichrome Staining
The protocol has been described in detail in our published protocols. Briefly, tissues from WT and mutant rabbits were fixed in 4% paraformaldehyde for 48 h, embedded in paraffin wax, and then sectioned for slides. Slides were stained with H&E and Masson’s trichrome and analyzed using a Nikon TS100 microscope.
Quantitative Real-Time PCR
Total RNA was isolated with TRNzol-A+ reagent (Tiangen, Beijing, China) according to the manufacturer’s instructions. cDNA was synthesized from DNase I (Fermentas)-treated total RNA using the BioRT cDNA first-strand synthesis kit (Bioer Technology, Hangzhou, China). Primers used for quantitative real-time PCR are listed in Table S2. Quantitative real-time PCR was performed using the BioEasy SYBR Green I real-time PCR kit (Bioer Technology, Hangzhou, China) with the Bio-Rad IQ5 multicolor real-time PCR detection system. Relative gene expression, normalized to Gapdh, was determined by the 2^(−ΔΔCT) formula. All gene expression measurements were performed three times, and data are expressed as mean ± SEM.
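The relative-expression calculation can be sketched as follows. This is a minimal illustration, not the analysis script used in this study: it takes Gapdh as the reference gene as stated above and assumes that a WT sample serves as the calibrator. The function names and the CT values in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change by the 2^(-ddCT) formula.

    ct_target / ct_ref: CT of the gene of interest and of the reference
    gene (e.g., Gapdh) in the test sample.
    ct_target_ctrl / ct_ref_ctrl: the same two CT values in the
    calibrator sample (assumed here to be WT).
    """
    delta_sample = ct_target - ct_ref             # dCT of the test sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # dCT of the calibrator
    return 2 ** -(delta_sample - delta_control)


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))
```

For example, with illustrative values, `relative_expression(25.0, 20.0, 24.0, 20.0)` gives a ΔΔCT of 1 and therefore a fold change of 0.5; `mean_sem` can then be applied to the fold changes from the three replicate runs.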
Western Blotting
For western blotting, ear tissues from WT and Fgf5 rabbits and liver tissues from WT, Otc, and Otc+/− rabbits were homogenized in 200 μL of radioimmunoprecipitation assay (RIPA) lysis buffer (Beyotime). Protein concentrations were measured by the Bradford method (Bio-Rad). Anti-FGF5 rabbit polyclonal antibody (1:600; Proteintech, catalog no. 18171-1-AP) and anti-Otc rabbit polyclonal antibody (1:400; Abcam, catalog no. ab55914) were used as primary antibodies, and anti-tubulin monoclonal antibody (1:4,000; Proteintech, catalog no. 66240-1-Ig) was used as the internal control.
Statistical Analysis
Editing efficiencies of CRISPR-SL at the cellular and embryonic levels were obtained by analyzing the corresponding Sanger sequencing chromatograms using EditR. Editing efficiencies in founder rabbits were analyzed by T-A cloning. All data are expressed as mean ± SEM and are representative of at least three individual determinations in all experiments. The data were analyzed with t tests using GraphPad Prism 6.0 software. A probability of p < 0.05 was considered statistically significant (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001).
Author Contributions
S.C. and Z.Liu performed the experiments. W.X. conducted the bioinformatics analysis. H.S., Y.S., and M.C. contributed reagents, materials, and analysis tools. Z.Liu, Z.Li, H.Y., and L.L. conceived the idea. Z.Li, H.Y., and L.L. provided funding support. S.C., H.Y., Z.Li, and L.L. wrote the manuscript. All authors reviewed the manuscript.