
A guide RNA sequence design platform for the CRISPR/Cas9 system for model organism genomes.

Ming Ma, Adam Y Ye, Weiguo Zheng, Lei Kong.

Abstract

Cas9/CRISPR has been reported to efficiently induce targeted gene disruption and homologous recombination in both prokaryotic and eukaryotic cells. Thus, we developed a Guide RNA Sequence Design Platform for the Cas9/CRISPR silencing system for model organisms. The platform is easy to use for gRNA design with input query sequences. It finds potential targets by PAM and ranks them according to factors including uniqueness, SNP, RNA secondary structure, and AT content. The platform allows users to upload and share their experimental results. In addition, most guide RNA sequences from published papers have been put into our database.

Year:  2013        PMID: 24199189      PMCID: PMC3809372          DOI: 10.1155/2013/270805

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Genetic engineering has long been an active topic in life science research. With the development of gene modification technology, selected genes can be knocked out or knocked down to a lower expression level. The appearance of zinc finger nuclease (ZFN) and transcription activator-like effector nuclease (TALEN) has greatly accelerated progress in this field, but their efficiency is often unpredictable and it is difficult to target selected genes [1-8]. Recently, Cas9/CRISPR has been reported to successfully induce targeted gene disruption and homologous recombination in both prokaryotic and eukaryotic cells, with higher efficiency than ZFN and TALEN [9-13]. Additionally, guide sequences are easier to design, and the Cas9/CRISPR system is easy to use [10]. This novel technology has great potential for application in both research and clinical trials. However, no tool is currently available for guide RNA design for the Cas9/CRISPR silencing system. Although Mali et al. have reported the construction of a unique whole-human-genome guide RNA library covering more than 40% of human exons [9], they did not provide a tool for researchers to design novel target sequences for other model organisms. The existing library also did not take into consideration related influencing factors, such as SNPs, deletions or insertions in the genome, and potential RNA secondary structure.

According to our current understanding of the gRNA maturation process, the secondary structure of the gRNA is crucial for the Cas9-gRNA complex [14]. The 20 bp guide RNA sequence binds the target site in the genome. If most of its bases are involved in RNA loops, binding to the target site would be inefficient, so this factor should be taken into consideration. In addition, the interference efficiency is probably closely related to the melting temperature of the gRNA-DNA hybrid. A relatively high AT content is negatively correlated with off-target effects, so sequences with an extremely low AT percentage are, to some extent, not recommended [9]. We therefore developed an online platform for guide RNA design for the Cas9/CRISPR silencing system (http://cas9.cbi.pku.edu.cn/), with DNA variant information integrated. This tool helps researchers design candidate guide RNA sequences more easily and helps users choose better candidates based on preliminary results.

2. Materials and Methods

Guide RNA sequences and their corresponding efficiencies were manually collected from the literature and stored in our database. To design guide RNAs, we used a Java pipeline of five main steps, served through a Tomcat web server. In the first step, the program finds all candidate sequences matching the N20NGG sequence pattern, where NGG represents the PAM sequence, using Java regular expression matching. In the second step, the program writes all candidate sequences to a FASTA file and runs bowtie 0.12.9 to check whether they map uniquely to the selected model organism's genome [15]. The bowtie parameters were "-f -v 1 -k 10 -l 16 -S": "-f" tells bowtie that the input is a FASTA file, "-v 1" allows at most one mismatch, "-k 10" reports up to 10 good alignments, "-l 16" sets the seed length to 16, and "-S" outputs SAM format. Because the target region is only 23 bp long, bowtie's default seed length of 28 was not suitable for this job, so we set it to 16. We reasoned that the number of mismatches could strongly affect effectiveness, and since this step focuses on checking mapping uniqueness, we looked only for hits with at most one mismatch and output at most 10 hits. The mapping result is parsed in Java. In the third step, if the target genome is human hg19, the program calls tabix 0.2.5 to find any overlapping SNPs or indels reported in dbSNP135 [16-18]. The dbSNP135 VCF file was downloaded from the GATK bundle. In the fourth step, the program predicts RNA secondary structures for the candidate gRNA sequences by calling Vienna RNAfold 2.0.7 with default parameters [19]. In the last step, the program collects all the information for the designed gRNAs and formats it as readable HTML. The AT% and the distance of each variant to the 3′ end of the target region are also calculated. The output gRNAs are sorted by both the number of mapping hits and the number of overlapping SNPs.
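As a rough illustration of the first step, the sketch below searches a query for N20NGG sites on both strands and computes the AT% of each 20-base guide. It is written in Python rather than the Java used by the platform. The function names (`find_candidates`, `at_percent`, `reverse_complement`), the lookahead used to report overlapping sites, and the decision to scan the reverse complement are assumptions made for illustration, not details taken from the original implementation.

```python
import re

# N20NGG: 20 guide bases followed by a 3-base PAM ending in GG.
# The lookahead lets overlapping 23-mers be reported (an assumption).
TARGET = re.compile(r"(?=([ACGT]{20}[ACGT]GG))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def at_percent(guide):
    """Percentage of A/T bases in the guide sequence."""
    return 100.0 * sum(base in "AT" for base in guide) / len(guide)


def find_candidates(query):
    """List (strand, start, guide, pam, at%) for every N20NGG site.

    For the "-" strand, start is the offset on the reverse-complemented
    query, not on the original sequence.
    """
    query = query.upper()
    hits = []
    for strand, seq in (("+", query), ("-", reverse_complement(query))):
        for match in TARGET.finditer(seq):
            site = match.group(1)
            guide, pam = site[:20], site[20:]
            hits.append((strand, match.start(), guide, pam, at_percent(guide)))
    return hits
```

For example, `find_candidates("ATTGGGTGTTCAGGGCAGAG" + "TGG")` reports a single plus-strand site whose guide has an AT% of 45, the value listed for this PVALB guide in Table 1. The `TGG` PAM in this call is a hypothetical placeholder, since the PAM bases are not given in the table.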
Most of the pipeline's running time was spent on bowtie, and sometimes on tabix when there were many target sequences; a single query sequence took roughly three seconds.
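The bowtie call and the hit counting of the second step might look roughly like the following Python sketch. The flags are those quoted in the text above. The index basename, the file name, and the helper names `bowtie_command` and `count_hits` are hypothetical, and the real pipeline parsed the mapping output in Java.

```python
def bowtie_command(index_basename, fasta_path):
    """Build the argument list for bowtie with the parameters quoted in the text.

    index_basename and fasta_path are placeholders chosen by the caller.
    """
    return ["bowtie", "-f", "-v", "1", "-k", "10", "-l", "16", "-S",
            index_basename, fasta_path]


def count_hits(sam_lines):
    """Count reported alignments per candidate name from SAM text lines."""
    hits = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        mapped = 0 if flag & 4 else 1  # SAM flag bit 0x4 marks an unmapped read
        hits[name] = hits.get(name, 0) + mapped
    return hits
```

The argument list can be passed to `subprocess.run` on a machine where bowtie and a genome index are installed. A candidate with exactly one hit maps uniquely under the one-mismatch setting, while a count above one indicates a possible off-target site.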

3. Results and Discussion

Multiple gene sequences can be submitted for batch gRNA design; the workflow of the platform is shown in Figure 1. The results contain the genomic loci of the gRNAs and any SNPs or indels inside them. This helps researchers choose a more unique target candidate and avoid SNPs, insertions, and deletions. Moreover, the platform evaluates all candidates based on their RNA secondary structure and AT content, allowing users to choose better candidates (Figure 2).
Figure 1

Workflow of the guide RNA design platform. Target sequences are searched against the whole genome for uniqueness and then checked for SNP/indel status. Results are listed from top to bottom, with more unique targets and those with fewer SNPs/indels first. The secondary structure of the entire gRNA is also given for reference.

Figure 2

Overview of the platform interface and its functions. (A)–(C) show the functions and the database. (D) shows the strand (sense/antisense) and position of the output sequences on the target sequence. (E) shows uniqueness and SNP/indel status. (F) shows the secondary structure of the mature gRNA.

Recently, Jiang et al. reported that only the first six base pairs near the PAM are of great importance for recognition efficiency in bacteria [20]. It is unknown whether this also holds for eukaryotic, or even mammalian, cells. We will keep updating our algorithm for ranking candidate gRNAs. We validated the platform by analyzing previously reported targets with respect to factors such as uniqueness, SNPs, and bases in loops (Table 1; italic font marks low-efficiency targets). In general, the more unique a gRNA is, and the fewer SNPs and bases in loops it has, the more efficient it is. For the gene PVALB, the first target sequence is 50% more efficient than the other two, since the first has 0 SNPs while the others have 3 or 2 SNPs. The first target sequence also has fewer base pairs involved in RNA secondary structure loops, which should allow it to bind the target genome more readily, whereas the other two each have 9 base pairs in loops. For the gene AAVS1, the first target is more than twice as efficient as the other, since the other has an off-target site in the genome. For the gene VEGFA, the first target is about half as efficient as the other two, since it has 1 SNP while the others have none.
Table 1

Analysis of reported targets in human cells using this platform.

| Target gene | Guide RNA sequence | Mapping and SNPs | bp in loops | AT% | Efficiency | Method | Reference |
|---|---|---|---|---|---|---|---|
| Human PVALB | ATTGGGTGTTCAGGGCAGAG | 1 place matched on genome: chr22:37196884-37196906(+), with 1 SNP: rs12483924 (2 bp to 3′ end) | 6 | 45% | 6.50% | Surveyor | Cong et al. 2013 [10] |
| Human PVALB | GTGGCGAGAGGGGCCGAGAT | 1 place matched on genome: chr22:37196866-37196888(+), with 3 SNPs: rs3484 (18 bp to 3′ end), rs181855770 (10 bp to 3′ end), rs9607383 (9 bp to 3′ end) | 9 | 30% | ND | | |
| Human PVALB | GGGGCCGAGATTGGGTGTTC | 1 place matched on genome: chr22:37196875-37196897(+), with 2 SNPs: rs181855770 (19 bp to 3′ end), rs9607383 (18 bp to 3′ end) | 9 | 35% | ND | | |
| Human AAVS1 | GGGGCCACTAGGGACAGGAT | 1 place matched on genome: chr19:55627117-55627139(−), with 0 SNPs | 8 | 35% | 8.07% | HR | Mali et al. 2013 [9] |
| Human AAVS1 | GTCCCCTCCACCCCACAGTG | 2 places matched on genome: chr19:55627136-55627158(−), with 0 SNPs; chr4:108975634-108975656(+), with 1 SNP: rs115503552 (7 bp to 3′ end) | 7 | 30% | 3.26% | | |
| Human VEGFA | GGGTGGGGGGAGTTTGCTCC | 1 place matched on genome: chr6:43737291-43737313(−), with 1 SNP: rs12210204 (1 bp to 3′ end) | 11 | 30% | 26% | T7EI assay | Fu et al. 2013 [21] |
| Human VEGFA | GACCCCCTCCACCCCGCCTC | 1 place matched on genome: chr6:43738556-43738578(−), with 0 SNPs | 4 | 20% | 50% | | |
| Human VEGFA | GGTGAGTGAGTGTGTGCGTG | 1 place matched on genome: chr6:43737454-43737476(+), with 0 SNPs | 12 | 40% | 49.40% | | |

*ND: not detectable. Italic font marks the low-efficiency gRNAs within the same gene group.
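The ordering described in Materials and Methods (fewer mapping hits first, then fewer overlapping SNPs) can be sketched in Python with the values from Table 1. The `Candidate` type and the `rank` function are hypothetical names. Summing SNPs over all matched places and leaving the bases in loops out of the sort key are simplifying assumptions rather than the platform's documented behavior.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    guide: str
    hits: int        # number of places matched on the genome
    snps: int        # overlapping SNPs, summed over all hits (assumption)
    loop_bases: int  # "bp in loops" column; kept for reference only


def rank(candidates):
    """Sort by number of mapping hits, then by number of overlapping SNPs."""
    return sorted(candidates, key=lambda c: (c.hits, c.snps))


# Values copied from the PVALB and AAVS1 rows of Table 1.
PVALB = [
    Candidate("GTGGCGAGAGGGGCCGAGAT", 1, 3, 9),
    Candidate("GGGGCCGAGATTGGGTGTTC", 1, 2, 9),
    Candidate("ATTGGGTGTTCAGGGCAGAG", 1, 1, 6),
]
AAVS1 = [
    Candidate("GTCCCCTCCACCCCACAGTG", 2, 1, 7),
    Candidate("GGGGCCACTAGGGACAGGAT", 1, 0, 8),
]

for group in (PVALB, AAVS1):
    for candidate in rank(group):
        print(candidate.guide, candidate.hits, candidate.snps)
```

With this key, the guide that showed detectable or higher efficiency in each of these two gene groups is listed first, in line with the trend discussed in the text.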

AT content may not be as crucial a factor as those mentioned above, since the evidence is not yet clear. We therefore list it here only as a consideration for users.

4. Conclusions

Our platform is easy-to-use software for identifying potentially efficient gRNA sites within given sequences for model organisms while avoiding off-target effects and SNPs. The platform also allows users to search existing guide RNA/protospacer sequences and share their results. We have manually extracted most reported gRNA/protospacer sequences into our database for reference and will expand it with newly published work.

References

Laurent Tesson, Claire Usal, Séverine Ménoret, Elo Leung, Brett J Niles, Séverine Remy, Yolanda Santiago, Anna I Vincent, Xiangdong Meng, Lei Zhang, Philip D Gregory, Ignacio Anegon, Gregory J Cost. Knockout rats generated by embryo microinjection of TALENs. Nat Biotechnol. 2011-08-05.

Manuela Villion, Sylvain Moineau. The double-edged sword of CRISPR-Cas systems. Cell Res. 2012-09-04.

Chang Tong, Guanyi Huang, Charles Ashton, Hongping Wu, Hexin Yan, Qi-Long Ying. Rapid and cost-effective gene targeting in rat embryonic stem cells by TALENs. J Genet Genomics. 2012-05-09.

Le Cong, F Ann Ran, David Cox, Shuailiang Lin, Robert Barretto, Naomi Habib, Patrick D Hsu, Xuebing Wu, Wenyan Jiang, Luciano A Marraffini, Feng Zhang. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013-01-03.

Prashant Mali, Luhan Yang, Kevin M Esvelt, John Aach, Marc Guell, James E DiCarlo, Julie E Norville, George M Church. RNA-guided human genome engineering via Cas9. Science. 2013-01-03.

Haoyi Wang, Hui Yang, Chikdu S Shivalila, Meelad M Dawlaty, Albert W Cheng, Feng Zhang, Rudolf Jaenisch. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell. 2013-05-02.

Yanfang Fu, Jennifer A Foden, Cyd Khayter, Morgan L Maeder, Deepak Reyon, J Keith Joung, Jeffry D Sander. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol. 2013-06-23.

Jeffry D Sander, Lindsay Cade, Cyd Khayter, Deepak Reyon, Randall T Peterson, J Keith Joung, Jing-Ruey J Yeh. Targeted gene disruption in somatic zebrafish cells using engineered TALENs. Nat Biotechnol. 2011-08-05.

Nannan Chang, Changhong Sun, Lu Gao, Dan Zhu, Xiufei Xu, Xiaojun Zhu, Jing-Wei Xiong, Jianzhong Jeff Xi. Genome editing with RNA-guided Cas9 nuclease in zebrafish embryos. Cell Res. 2013-03-26.

Wenyan Jiang, David Bikard, David Cox, Feng Zhang, Luciano A Marraffini. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat Biotechnol. 2013-01-29.