BEtarget: A versatile web-based tool to design guide RNAs for base editing in plants.

Xianrong Xie1,2, Fuquan Li1, Xiyu Tan1, Dongchang Zeng1, Weizhi Liu1, Wanyong Zeng1, Qinlong Zhu1,2, Yao-Guang Liu1,2.   

Abstract

CRISPR-dependent base editors enable direct nucleotide conversion without introducing double-strand DNA breaks or donor DNA templates, thus expanding the CRISPR toolbox for genetic manipulation. However, designing guide RNAs (gRNAs) for base editors to enable gene correction or inactivation is more complicated than using the CRISPR system for gene disruption. Here, we present a user-friendly web tool named BEtarget dedicated to the design of gRNAs for base editing. It currently supports 46 plant reference genomes and 5 genomes of non-plant model organisms. BEtarget supports the design of gRNAs with different types of protospacer adjacent motifs (PAMs) and integrates various functions, including automatic identification of the open reading frame, prediction of potential off-target sites, annotation of codon changes, and assessment of gRNA quality. Moreover, the program provides an interactive interface for users to selectively display information about the desired target sites. In brief, we have developed a flexible and versatile web-based tool that simplifies gRNA design for base editing. BEtarget is freely accessible at https://skl.scau.edu.cn/betarget/.
© 2022 The Author(s).

Keywords:  ABE, adenine base editor; Base editing; CBE, cytosine base editor; CDS, coding sequence; CRISPR; CRISPR, clustered regularly interspaced short palindromic repeats; DSB, double-strand break; Genome editing; NHEJ, non-homologous end joining; ORF, open reading frame; PAM, protospacer adjacent motif; gRNA design; gRNA, guide RNA

Year:  2022        PMID: 35983232      PMCID: PMC9355906          DOI: 10.1016/j.csbj.2022.07.046

Source DB:  PubMed          Journal:  Comput Struct Biotechnol J        ISSN: 2001-0370            Impact factor:   6.155


Introduction

Clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas) systems are adaptive defense systems that protect bacteria and archaea from invading viruses or plasmids [1], [2], [3]. They act in three general stages. In the adaptive stage, organisms respond to viral or plasmid challenges by integrating short fragments of foreign sequence into the CRISPR locus. In the expression and interference stages, the CRISPR array is transcribed and processed into short crRNAs, which guide Cas proteins to cleave the invading foreign sequence [4]. CRISPR/Cas systems have since been engineered into versatile tools for genome editing [5], [6], [7]. The Cas endonuclease targets a specific sequence through base pairing with the help of a guide RNA (gRNA) and creates a double-strand break (DSB) at the cleavage site [8]. In most eukaryotes, DSBs are predominantly repaired by the error-prone non-homologous end joining (NHEJ) pathway, generally resulting in random nucleotide insertion or deletion mutations [9], [10]. Although NHEJ-dependent gene disruption is efficient, it is frequently hard to achieve the expected accuracy in gene correction [11], [12], [13]. In the presence of a donor template, the homology-directed repair (HDR) pathway at the DSB site enables precise sequence correction or insertion at the target site [5]. However, the low efficiency of HDR in plant cells and the lack of an efficient donor DNA delivery method limit its application in plants [14], [15]. Wild-type Cas9 contains two conserved nuclease domains (RuvC and HNH-like domains), each of which cleaves one strand of the double-helix DNA.
Mutating one of the two critical residues (D10A or H840A) in the nuclease domains generates a Cas9 nickase (nCas9) that cuts only one strand of the target site [16], [17]; if both nuclease domains are mutated, the resulting catalytically dead Cas9 (dCas9) lacks nuclease activity while retaining its ability to bind DNA [18]. Fusing Cas9 variants with different deaminases has generated diverse base editing tools, including the cytosine base editor (CBE) and the adenine base editor (ABE) [19], [20], [21]. The most commonly used base editors are nCas9 or dCas9 fused with a cytidine deaminase (CBE) or an adenosine deaminase (ABE), which trigger cellular mismatch repair and the subsequent base substitution [22], [23]. Guided by an sgRNA, the fusion proteins produce site-specific C-to-T or A-to-G substitutions without producing DSBs or introducing donor DNA templates [22], [23]. Owing to this unique capability for fine-tuned mutagenesis, base editors are widely applied to both model and non-model organisms for genetic manipulation [24], [25], [26], [27]. For example, base substitutions in the codons of an open reading frame (ORF) can cause amino acid substitutions (missense mutations) or inactivate a gene by introducing premature stop codons (nonsense mutations) [11], [12], [28], [29], [30], [31]. Moreover, multiple base substitutions can occur in the editing window, which makes base editors useful for large-scale saturation mutagenesis, regulatory element editing, therapeutic gene correction, and crop improvement [32], [33], [34], [35], [36]. However, the design of gRNAs for base editors is complicated, and several specific criteria must be carefully considered [24], including the preferred editing window, bystander effect, potential codon change, and off-target effect. Thus, a convenient tool for the rapid design of gRNAs for base editors is required, and a few programs are currently available [37].
Based on extensive analysis of genome-wide base editing outcomes in mammalian cells, several machine learning models with corresponding design programs, including BE-Hive, BE-DICT, DeepBaseEditor, and FORECasT-BE, were developed to predict the editing efficiency and the bystander effect [38], [39], [40], [41]. For other organisms, a few gRNA design tools were developed to help researchers rapidly choose appropriate target sites [42], [43], [44], [45], [46]. Because large-scale genetic transformation is difficult, especially in plants, these tools mainly focus on searching for all possible target sites in the target gene and annotating the codon changes in the editing window. For example, CRISPR-BETS and CRISPR-CBEI focus on designing gRNAs for CBE-mediated nonsense mutation [42], [43], CRISPyweb 2.0 is a tool developed only for Streptomyces coelicolor [44], beditor is a specialized tool used to design genome-wide gRNA libraries in cell lines [45], and BE-Designer provides a comprehensive analysis of all possible candidate target sites with useful information, including potential off-target sites for base editors in various organisms [46]. Considering the wide and prospective applications of base editors in plants, we have developed a dedicated web-based tool named BEtarget to aid researchers in the design of gRNAs for genes of interest. It can be accessed directly at https://skl.scau.edu.cn/betarget/ or from our previous CRISPR-GE website [47]. The program provides a user-friendly submission interface that supports fully customizable settings of input parameters and target genes. BEtarget lists all possible target sites in the ORF of a given gene as an interactive table and a graph along with comprehensive information, including amino acid change, potential off-target sites, and quality assessment.

Workflow and implementation

Workflow of BEtarget

BEtarget is a web-based application in a “browser-server” mode. Its front-end web pages are implemented with HTML5, JavaScript, and Bootstrap, and the backend is built on the Django framework. Programs for processing the uploaded data and analyzing sequences are written in Python 3. The workflow of BEtarget is shown in Fig. 1. Overall, it consists of three major steps: (i) the input page accepts user-defined parameters and a target sequence or a gene locus identity (ID) number; (ii) the backend processes the uploaded data, including automatic detection of the ORF, identification of all possible target sites in the ORF, and prediction of potential off-target sites; (iii) the output page retrieves the resulting data in JSON format and displays the results in an interactive table with useful information.
Fig. 1

Overall workflow chart of BEtarget. The program searches all possible target sites in a target gene based on the user-defined parameters and performs a comprehensive analysis of the target sites, including their basic features, potential off-target sites, and changes in codons and amino acids within the editing windows.

Sequence input and setting of parameters

The submission interface consists of two panels: one for parameter settings, including the protospacer adjacent motif (PAM), editing type (CBE or ABE), editing window, and reference genome, and another for target sequence (or gene locus) input (Fig. 2). BEtarget presently supports searching for target sites with commonly used PAMs (including NGG, NG, TTN, and TTTN) or any specific PAM defined by users. Users can manually define the editing window according to the type and efficiency of the base editor used (either CBE or ABE). Currently, 46 plant reference genomes and 5 genomes of non-plant model organisms are provided as references to evaluate potential off-target sites. Users can also select “None” if no reference genome is available for the target, in which case the program does not predict potential off-target sites. Alternatively, users can request the developer to add more desired reference genome(s). In the sequence input panel, three types of input are supported: (i) the genomic sequence with the corresponding intact coding sequence (CDS) of a target gene; (ii) a partial sequence of the target gene; (iii) a gene locus ID of the selected reference genome. When a partial sequence of the target gene, either a genomic sequence or a CDS, is used as input, BEtarget automatically extracts the corresponding genomic sequence and CDS of the gene by invoking a subprogram called GeneCat for gRNA design. The GeneCat tool can also be accessed independently from the website to extract the genomic sequence and corresponding CDS of a gene (Supplementary Fig. 1). Below the “Submit” button, users can choose to check the gene structure (this option is not selected by default).
Fig. 2

Submission page of BEtarget. (A) Settings panel for PAM type, editing type, editing window, and reference genome. (B) Input panel for target sequence (or gene). Three types are supported as input: (i) the genomic sequence with the corresponding CDS of a target gene; (ii) a partial sequence of target gene; (iii) a gene locus ID of the selected reference genome.
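
The PAM-based target search described in this section can be illustrated with a minimal Python sketch. It is not BEtarget's actual code: the function name `find_targets`, the parameters `spacer_len` and `pam_side`, the 20-nt protospacer length, and the handling of PAM orientation (3′ of the protospacer for NGG/NG, 5′ for TTN/TTTN) are assumptions made for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq, pam="NGG", spacer_len=20, pam_side="3prime"):
    """Return sorted (start, strand, protospacer, pam) tuples (hypothetical helper).

    start is the 0-based forward-strand position of the leftmost base
    of the whole site (protospacer plus PAM).
    """
    seq = seq.upper()
    pam_re = "".join(IUPAC[b] for b in pam.upper())
    # A lookahead is used so that overlapping sites are all reported.
    if pam_side == "3prime":   # assumed Cas9-like layout: protospacer, then PAM
        pattern = f"(?=([ACGT]{{{spacer_len}}})({pam_re}))"
    else:                      # assumed Cas12a-like layout: PAM, then protospacer
        pattern = f"(?=({pam_re})([ACGT]{{{spacer_len}}}))"
    site_len = spacer_len + len(pam)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in re.finditer(pattern, s):
            first, second = m.group(1), m.group(2)
            if pam_side == "3prime":
                spacer, pam_seq = first, second
            else:
                spacer, pam_seq = second, first
            # Convert minus-strand match positions back to forward coordinates.
            start = m.start() if strand == "+" else len(seq) - m.start() - site_len
            hits.append((start, strand, spacer, pam_seq))
    return sorted(hits)
```

A user-defined PAM is handled the same way as the built-in ones, because any IUPAC string is translated into a regular expression. In a full tool, the hits would then be filtered to those located in or overlapping the coding regions.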

Data processing

After a task is submitted, BEtarget first preprocesses the uploaded target sequence or gene locus. The ORF must be defined exactly in the given genomic sequence so that the locations of target sites can be identified. When the genomic sequence and intact CDS are uploaded, the program aligns the CDS to the genomic sequence and automatically detects the boundaries of each exon. If a gene locus ID is used as input, the program fetches the genomic coordinates of all transcripts of the input gene from the reference genome database. If users choose to check the gene structure on the submission page, detailed coordinate information and sequence are returned to the check page so that users can confirm the exon/intron structure or select a preferred transcript from multiple alternative transcripts (Fig. 3). Users can make minor adjustments to the exon positions if the detected coordinates are offset; otherwise, the program uses the genomic coordinates it detected. Once the coordinates of the ORF or the transcript are confirmed, all possible target sites are extracted, and only those located in or overlapping with the coding regions are kept for further analysis. Potential off-target sites and their scores are then predicted by invoking the offTarget program in the CRISPR-GE toolkit [47]. Possible secondary structures in which the target pairs with the gRNA scaffold sequence are also analyzed. Finally, possible base substitutions within the editing windows are annotated with the resulting codon changes and corresponding amino acid changes. The resulting data, including the target sites and their associated information, are returned to the results page in JSON format.
Fig. 3

Proofreading of CDS and transcript selection of the target gene. (A) When the genomic sequence with corresponding CDS is used as input, the program automatically detects the coordinates of CDS on the genomic sequence. If users select to check the gene structure at the submission page, BEtarget jumps to the check page, which displays the exon/intron structure of the target gene sequence in which the exons are highlighted in yellow. Coordinates can be modified by clicking the “Adjust CDS” button. (B) When gene locus is used as input, BEtarget displays all possible alternative transcripts of the target gene, and users can select the preferred transcript for target design. The yellow boxes indicate ORFs of the transcript. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
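
The last data-processing step, annotating codon and amino acid changes within the editing window, can be sketched as follows. This is an illustrative example, not the published implementation. The function name `annotate_codon_changes`, the default window of positions 4 to 8 (counted from the PAM-distal 5′ end of a 20-nt protospacer), and the simplification that every editable base in the window is converted at once are all assumptions.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Substitution seen on the sense strand, keyed by editor and target strand.
# A CBE acting on the antisense strand (C-to-T) reads as G-to-A on the sense strand.
EDITS = {
    ("CBE", "+"): ("C", "T"), ("CBE", "-"): ("G", "A"),
    ("ABE", "+"): ("A", "G"), ("ABE", "-"): ("T", "C"),
}


def annotate_codon_changes(cds, spacer_start, strand, editor="CBE",
                           window=(4, 8), spacer_len=20):
    """Return (codon_number, old_codon, new_codon, old_aa, new_aa) tuples.

    cds is an in-frame coding sequence; spacer_start is the 0-based index
    of the leftmost protospacer base on the sense strand (hypothetical API).
    """
    cds = cds.upper()
    ref, alt = EDITS[(editor, strand)]
    edited = list(cds)
    for i in range(window[0], window[1] + 1):
        # Protospacer position 1 is its 5' end, which is the rightmost
        # sense-strand base when the target lies on the minus strand.
        pos = spacer_start + i - 1 if strand == "+" else spacer_start + spacer_len - i
        if 0 <= pos < len(cds) and cds[pos] == ref:
            edited[pos] = alt
    edited = "".join(edited)
    changes = []
    for c in range(0, len(cds) - len(cds) % 3, 3):
        old, new = cds[c:c + 3], edited[c:c + 3]
        if old != new:
            changes.append((c // 3 + 1, old, new,
                            CODON_TABLE.get(old, "X"), CODON_TABLE.get(new, "X")))
    return changes
```

A change whose new amino acid is `*` corresponds to a nonsense mutation, which is the category the results page lets users filter on. Enumerating every combination of edited bases in the window, rather than converting all of them together, would be a natural extension.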

Output and result visualization

BEtarget produces an interactive graph and table for displaying the results (Fig. 4). The graph shows the structure of the target gene, in which the exons are indicated as yellow boxes (Fig. 4A). When the mouse pointer is moved over an exon, the table lists only the candidate target sites in that exon; clicking “Show all target” displays all candidate target sites in the gene. The initial results table lists all candidate target sites and their positions, strands, GC content, potential off-target sites, and corresponding codon changes (Fig. 4B). Possible base substitutions and corresponding amino acid changes within the editing windows, including stop codon mutations, are highlighted in red. Users can choose to display target sites by expected mutation type, either all candidate sites or only those with stop codon mutations (nonsense mutations). By clicking the “see detail” link, users can examine the potential off-target sites and their scores in detail.
Fig. 4

Visualization of the results. (A) The graph shows the structure of a target gene. When the mouse pointer is moved on an exon, the results table lists the target sites present in the exon. When the “Show all target” is clicked on, it displays all candidate target sites in the gene. Users can selectively display target sites with the expected mutation type, either as all candidate sites or only those with stop codon mutations. (B) The table lists candidate target sites and their useful information, including basic features, predicted amino acid changes, potential off-target sites, and scores. Users can select target site(s) (on the left-most side) and click “Show target(s) in the gene sequence” or “Primer design” for these purposes.

Large-scale analyses of base editing outcomes, which would allow editing efficiency to be predicted, are lacking in plants because of inefficient delivery methods. Nevertheless, it is possible to anticipate low-efficiency target sites, including those with very low or high GC content (≤25 % or ≥80 %), poly-T site(s), contiguous base-pairing with the sgRNA sequence, or potential off-target sites with a high score (≥0.7). These candidate low-quality target sites are marked with warning indicators (!, !!, or !!!). Finally, users can select target site(s) (on the left-most side) and click on “Show target(s) in the gene sequence” or “Primer design” (below the candidate targets) for the purposes indicated.
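
The quality criteria listed above can be expressed as a short Python sketch. The GC thresholds (≤25 % or ≥80 %) and the off-target score cutoff (≥0.7) come from the text; the function names, the definition of a poly-T site as a run of four or more T bases, and the rule of one exclamation mark per failed criterion (capped at three) are assumptions, since the text does not specify them. The check for contiguous base-pairing with the sgRNA sequence is omitted here.

```python
import re


def quality_flags(spacer, max_offtarget_score=0.0, poly_t_len=4):
    """Return a list of reasons why a target site may edit poorly (hypothetical helper)."""
    spacer = spacer.upper()
    gc = 100.0 * sum(b in "GC" for b in spacer) / len(spacer)
    flags = []
    if gc <= 25 or gc >= 80:
        flags.append(f"GC content {gc:.0f}%")
    # Assumed poly-T definition: a run of at least poly_t_len consecutive T bases.
    if re.search("T{%d,}" % poly_t_len, spacer):
        flags.append("poly-T run")
    if max_offtarget_score >= 0.7:
        flags.append("high-scoring off-target site")
    return flags


def warning_mark(flags):
    """Map the number of flags to '!', '!!', or '!!!' (assumed mapping)."""
    return "!" * min(len(flags), 3)
```

For instance, a 20-nt protospacer with 55 % GC content, no T run, and a highest off-target score of 0.1 yields no flags and an empty warning mark.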

Conclusion

BEtarget is a user-friendly tool for selecting optimal gRNAs in a given gene for base editing. The program extracts comprehensive information on all candidate target sites in the coding regions, including their basic features, predicted codon changes, potential off-target sites, and annotations. The results are displayed using an interactive graph and table. In this study, BEtarget was compared with similar design tools that focus on searching for target sites for base editing, including CRISPR-BETS [42], CRISPR-CBEI [43], PnB Designer [48], and BE-Designer [46], as summarized in Supplementary Table 1. The comparison showed that BEtarget has the following advantages. Firstly, BEtarget supports gRNA design for current base editors, including ABE and CBE, with no limitation on PAM variants; by modifying the PAM setting, users can define their own PAM type. Secondly, BEtarget is flexible in sequence input: users can input a partial sequence of the target gene, and the program automatically fetches the genomic sequence and corresponding CDS. Thirdly, BEtarget provides an interactive and customizable visualization interface to display information about target sites. In summary, BEtarget is an innovative tool that helps researchers rapidly choose appropriate target sites for base editing.

Author contributions

YL and XX drafted the project. XX, FL, and XT wrote the programs for BEtarget. YL and QZ provided critical comments on the design of BEtarget. WL and DZ tested the programs. XX and FL wrote the manuscript. All authors read and approved the final version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (31991223) and the Major Program of Guangdong Basic and Applied Basic Research (2019B030302006).

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  47 in total

Review 1.  CRISPR/Cas, the immune system of bacteria and archaea.

Authors:  Philippe Horvath; Rodolphe Barrangou
Journal:  Science       Date:  2010-01-08       Impact factor: 47.728

2.  CRISPR-GE: A Convenient Software Toolkit for CRISPR-Based Genome Editing.

Authors:  Xianrong Xie; Xingliang Ma; Qinlong Zhu; Dongchang Zeng; Gousi Li; Yao-Guang Liu
Journal:  Mol Plant       Date:  2017-06-15       Impact factor: 13.164

Review 3.  The Biology of CRISPR-Cas: Backward and Forward.

Authors:  Frank Hille; Hagen Richter; Shi Pey Wong; Majda Bratovič; Sarah Ressel; Emmanuelle Charpentier
Journal:  Cell       Date:  2018-03-08       Impact factor: 41.582

Review 4.  CRISPR-Cas systems for editing, regulating and targeting genomes.

Authors:  Jeffry D Sander; J Keith Joung
Journal:  Nat Biotechnol       Date:  2014-03-02       Impact factor: 54.908

Review 5.  Methods and Applications of CRISPR-Mediated Base Editing in Eukaryotic Genomes.

Authors:  Gaelen T Hess; Josh Tycko; David Yao; Michael C Bassik
Journal:  Mol Cell       Date:  2017-10-05       Impact factor: 17.970

6.  Expanded base editing in rice and wheat using a Cas9-adenosine deaminase fusion.

Authors:  Chao Li; Yuan Zong; Yanpeng Wang; Shuai Jin; Dingbo Zhang; Qianna Song; Rui Zhang; Caixia Gao
Journal:  Genome Biol       Date:  2018-05-29       Impact factor: 13.583

7.  Web-based design and analysis tools for CRISPR base editing.

Authors:  Gue-Ho Hwang; Jeongbin Park; Kayeong Lim; Sunghyun Kim; Jihyeon Yu; Eunchong Yu; Sang-Tae Kim; Roland Eils; Jin-Soo Kim; Sangsu Bae
Journal:  BMC Bioinformatics       Date:  2018-12-27       Impact factor: 3.169

8.  PnB Designer: a web application to design prime and base editor guide RNAs for animals and plants.

Authors:  Sebastian M Siegner; Mehmet E Karasu; Markus S Schröder; Zacharias Kontarakis; Jacob E Corn
Journal:  BMC Bioinformatics       Date:  2021-03-02       Impact factor: 3.169

9.  Correction of the Marfan Syndrome Pathogenic FBN1 Mutation by Base Editing in Human Cells and Heterozygous Embryos.

Authors:  Yanting Zeng; Jianan Li; Guanglei Li; Shisheng Huang; Wenxia Yu; Yu Zhang; Dunjin Chen; Jia Chen; Jianqiao Liu; Xingxu Huang
Journal:  Mol Ther       Date:  2018-08-14       Impact factor: 11.454
