
Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases.

Sangsu Bae, Jeongbin Park, Jin-Soo Kim.

Abstract

SUMMARY: The Type II clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system is an adaptive immune response in prokaryotes, protecting host cells against invading phages or plasmids by cleaving these foreign DNA species in a targeted manner. CRISPR/Cas-derived RNA-guided engineered nucleases (RGENs) enable genome editing in cultured cells, animals and plants, but are limited by off-target mutations. Here, we present a novel algorithm termed Cas-OFFinder that searches for potential off-target sites in a given genome or user-defined sequences. Unlike other algorithms currently available for identification of RGEN off-target sites, Cas-OFFinder is not limited by the number of mismatches and allows variations in protospacer-adjacent motif sequences recognized by Cas9, the essential protein component in RGENs. Cas-OFFinder is available as a command-line program or accessible via our website.
AVAILABILITY AND IMPLEMENTATION: Cas-OFFinder is freely available at http://www.rgenome.net/cas-offinder.
CONTACT: baesau@snu.ac.kr or jskim01@snu.ac.kr.

Year:  2014        PMID: 24463181      PMCID: PMC4016707          DOI: 10.1093/bioinformatics/btu048

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Genome editing with engineered nucleases is broadly useful for biomedical research, biotechnology and medicine. Engineered nucleases cleave chromosomal DNA in a targeted manner, and the repair of the resulting double-strand breaks by endogenous systems gives rise to targeted genome modifications in cultured cells, animals and plants. We and others have developed three different types of engineered nucleases: zinc finger nucleases (ZFNs) (Bibikova ; Kim ), transcription activator-like effector nucleases (TALENs) (Kim ; Miller ) and RNA-guided engineered nucleases (RGENs) (Cho ; Cong ; Jinek ; Mali ) derived from the Type II clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system, an adaptive immune response in bacteria and archaea. Unlike ZFNs and TALENs whose DNA specificities are determined by DNA-binding proteins, RGENs use complementary base pairing to recognize target sites. RGENs consist of (i) dual RNA components comprising sequence-invariant tracrRNA and sequence-variable guide RNA termed crRNA [or single-chain guide RNA (sgRNA) constructed by linking essential portions of tracrRNA and crRNA (Jinek )] and (ii) a fixed protein component, Cas9, that recognizes the protospacer-adjacent motif (PAM) downstream of target DNA sequences corresponding to guide RNA. Custom-designed RGENs are produced simply by replacing guide RNAs, making this system easy to access. Unfortunately, RGENs cleave not only on-target sites but also off-target sites that differ by up to several nucleotides from the on-target sites (Cho ; Fu ; Hsu ), causing unwanted off-target mutations and chromosomal rearrangements. These undesired off-target effects raise significant concerns for using RGENs as genome editing tools in diverse applications. To address this issue, researchers must be able to search for potential off-target sites in the genome. 
Sequence alignment tools such as TagScan (Cradick ; Iseli ), Bowtie (Langmead ) or GPGPU-enabled CUSHAW (Liu ) can be used to find potential off-target sites, but they limit the number of mismatched bases allowed and require a fixed PAM sequence. Here we introduce a fast and highly versatile off-target searching tool, Cas-OFFinder. Importantly, Cas-OFFinder is written in OpenCL, an open standard language for parallel programming in heterogeneous environments, enabling operation on diverse platforms such as central processing units (CPUs), graphics processing units (GPUs) and digital signal processors (DSPs).

2 METHODS

2.1 Concept of Cas-OFFinder

Versions of Cas9 derived from three different species have been exploited to edit genes in human cells. These Cas9 proteins recognize different PAM sequences. Cas9 derived from Streptococcus pyogenes (SpCas9) recognizes 5′-NGG-3′ PAM sequences and, to a lesser extent, 5′-NAG-3′. Cas9 from Streptococcus thermophilus (StCas9) (Cong ) and that from Neisseria meningitidis (NmCas9) (Hou ) recognize 5′-NNAGAAW-3′ (W = A or T) and 5′-NNNNGMTT-3′ (M = A or C), respectively. The degeneracy in PAM recognition by Cas9 must be accounted for when searching for potential off-target sites. In the case of SpCas9, Cas-OFFinder first compiles all the 23-bp DNA sequences composed of a 20-bp sequence corresponding to the sgRNA sequence of interest and a 5′-NRG-3′ (R = A or G) PAM sequence (Fig. 1A). Cas-OFFinder then compares all the compiled sequences with the query sequence and counts the number of mismatched bases in the 20-bp sgRNA sequence.
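The PAM filtering and mismatch counting described above can be sketched in Python as follows. This is a minimal illustration, not the Cas-OFFinder implementation (which uses OpenCL kernels): the function names, the forward-strand-only scan and the inclusive `max_mismatches` cutoff are assumptions made for the example. Degenerate PAM bases are expanded with the standard IUPAC nucleotide codes.

```python
# IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "W": "AT", "S": "CG", "N": "ACGT",
}


def pam_matches(site_pam, pattern):
    """True if every base of site_pam is allowed by the degenerate pattern."""
    return len(site_pam) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(site_pam, pattern)
    )


def count_mismatches(guide, protospacer):
    """Number of positions at which the two equal-length sequences differ."""
    return sum(g != p for g, p in zip(guide, protospacer))


def search_forward_strand(sequence, guide, pam="NRG", max_mismatches=3):
    """Scan the forward strand only (a simplification of this sketch).

    Returns (start, site, mismatches) for every window whose PAM fits the
    pattern and whose protospacer has at most max_mismatches mismatches.
    """
    sequence, guide = sequence.upper(), guide.upper()
    width = len(guide) + len(pam)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        protospacer, site_pam = window[:len(guide)], window[len(guide):]
        if not pam_matches(site_pam, pam):
            continue  # cheap PAM filter first, as in the scheme above
        mismatches = count_mismatches(guide, protospacer)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits
```

Passing `pam="NNNNGMTT"` with a 24-nt guide would apply the same logic to the NmCas9 PAM quoted above; a complete search would also scan the reverse-complement strand.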
Fig. 1.

(A) The scheme of Cas-OFFinder. (B) The workflow of Cas-OFFinder. (C) Running time per target site as a function of the number of input target sites via CPU (black squares) and GPU (red circles)


2.2 Workflow of Cas-OFFinder

Cas-OFFinder is composed of two OpenCL kernels (a searching kernel and a comparing kernel) and C++ wrapper parts (Fig. 1B). First, Cas-OFFinder reads genome sequence data files in single- or multi-sequence FASTA format. To read and parse FASTA files, an open-source FASTA/FASTQ parser library is used. Although OpenCL supports various processors, the memory of the devices is not always large enough for large genome data sets. To overcome this memory limitation, wrapper1 divides the genome data into units of the largest possible size allowed by the device memory. These chunks are then loaded into the searching kernel, which compiles all sites in the entire genome that include a PAM sequence. To select these sites rapidly, the searching kernel runs independently on every calculation unit of a processor, i.e. all searching processes on the calculation units run simultaneously. After this step, wrapper2 collects the information about the sites containing PAM sequences and delivers these sequences to the comparing kernel, which counts the number of mismatched bases. As with the searching kernel, all comparing processes on the calculation units run simultaneously. Finally, wrapper3 selects potential off-target sites that have fewer mismatched bases than a given threshold and writes the results into an output file with the following information: chromosome number, position, direction, number of mismatched bases and potential off-target DNA sequences with mismatched bases noted in lowercase letters. These processes are repeated until all the chunks have been processed.
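The chunk-by-chunk workflow above can be sketched serially in Python. This is only an illustration under stated assumptions: the real kernels run in parallel in OpenCL, and the function names, the overlap of `site_len - 1` bases between chunks, the forward-strand-only scan (direction is always "+") and the inclusive mismatch cutoff are choices made for this example rather than details taken from the paper.

```python
def split_into_chunks(sequence, chunk_size, site_len):
    """Yield (offset, chunk) pairs. Consecutive chunks overlap by
    site_len - 1 bases so a site straddling a boundary is still
    examined exactly once (an assumption of this sketch)."""
    if chunk_size < site_len:
        raise ValueError("chunk_size must be at least site_len")
    step = chunk_size - (site_len - 1)
    last_start = max(len(sequence) - site_len, 0)
    for offset in range(0, last_start + 1, step):
        yield offset, sequence[offset:offset + chunk_size]


def is_nrg(pam):
    """SpCas9-style 5'-NRG-3' check: any base, then A or G, then G."""
    return len(pam) == 3 and pam[1] in "AG" and pam[2] == "G"


def find_pam_sites(chunk, guide_len, pam_len, pam_ok):
    """Searching step: local start positions of windows whose PAM passes."""
    site_len = guide_len + pam_len
    return [
        start
        for start in range(len(chunk) - site_len + 1)
        if pam_ok(chunk[start + guide_len:start + site_len])
    ]


def compare_site(guide, protospacer):
    """Comparing step: mismatch count and the site with mismatches lowercased."""
    marked = "".join(p if p == g else p.lower() for g, p in zip(guide, protospacer))
    mismatches = sum(p != g for g, p in zip(guide, protospacer))
    return mismatches, marked


def search_genome(chrom, sequence, guide, max_mismatches, chunk_size,
                  pam_len=3, pam_ok=is_nrg):
    """Process the sequence chunk by chunk and collect output rows of
    (chromosome, position, direction, mismatches, marked site)."""
    sequence, guide = sequence.upper(), guide.upper()
    site_len = len(guide) + pam_len
    rows = []
    for offset, chunk in split_into_chunks(sequence, chunk_size, site_len):
        for start in find_pam_sites(chunk, len(guide), pam_len, pam_ok):
            protospacer = chunk[start:start + len(guide)]
            pam = chunk[start + len(guide):start + site_len]
            mismatches, marked = compare_site(guide, protospacer)
            if mismatches <= max_mismatches:
                rows.append((chrom, offset + start, "+", mismatches, marked + pam))
    return rows
```

Because every window start falls in exactly one chunk, the rows returned do not depend on `chunk_size`; only the peak memory needed per step changes.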

3 RESULTS AND DISCUSSION

To evaluate the performance of Cas-OFFinder, we first chose arbitrary SpCas9 target sites in the human genome and ran Cas-OFFinder with query sequences via CPU (Intel i7 3770K) or GPU (AMD Radeon HD 7870). Notably, running time per target site decreased as the number of target sites increased (Fig. 1C). This result is expected because the searching kernel runs only once for many input targets. Cas-OFFinder was 20× faster on the GPU (3.0 s) than on the CPU (60.0 s) when 1000 target sites were analyzed. We also used Cas-OFFinder to search for potential off-target sites of NmCas9, which recognizes 5′-NNNNGMTT-3′ (where M is A or C) PAM sequences in addition to a 24-bp target DNA sequence specific to guide RNA, in human and other genomes (Table 1). Note that Cas-OFFinder allows mixed bases to account for the degeneracy in PAM sequences.
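The decrease in per-target running time noted above can be illustrated with a simple cost model. The symbols are hypothetical and not from the paper: T_search is the one-off cost of the searching kernel over the genome, t_compare is the comparing cost for one target, and n is the number of input targets. The model ignores file reading and chunk loading.

```latex
\[
t(n) \;=\; \frac{T_{\text{search}} + n\, t_{\text{compare}}}{n}
     \;=\; \frac{T_{\text{search}}}{n} + t_{\text{compare}}
\]
```

Under this assumption the fixed search cost is shared among all targets, so t(n) falls as n grows and approaches t_compare for large n.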
Table 1.

Running time of Cas-OFFinder via GPU to search for NmCas9 potential off-target sites

Data set (size)                 Number of mismatches    Time for 100 targets
H. sapiens genome (3.01 Gb)     1                       76.4 ± 2.0 s
H. sapiens genome (3.01 Gb)     5                       79.9 ± 1.6 s
H. sapiens genome (3.01 Gb)     10                      114.4 ± 3.0 s
M. musculus genome (2.65 Gb)    5                       62.6 ± 2.4 s
D. rerio genome (1.32 Gb)       5                       37.7 ± 3.5 s
A. thaliana genome (116 Mb)     5                       4.8 ± 0.8 s

In conclusion, Cas-OFFinder enables rapid searching for potential off-target sites in any sequenced genome without limiting the PAM sequence or the number of mismatched bases. These features make Cas-OFFinder applicable to ZFNs, TALENs and transcription factors that are prone to off-target DNA recognition.

Funding: National Research Foundation of Korea (2013000718 to J.-S.K.) and the Plant Molecular Breeding Center of Next-Generation BioGreen 21 Program (PJ009081), the National Research Foundation of Korea (2013065262), TJ Park Science Fellowship (to S.B.).

Conflict of Interest: none declared.

  17 in total

1.  Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease.

Authors:  Seung Woo Cho; Sojung Kim; Jong Min Kim; Jin-Soo Kim
Journal:  Nat Biotechnol       Date:  2013-01-29       Impact factor: 54.908

2.  A library of TAL effector nucleases spanning the human genome.

Authors:  Yongsub Kim; Jiyeon Kweon; Annie Kim; Jae Kyung Chon; Ji Yeon Yoo; Hye Joo Kim; Sojung Kim; Choongil Lee; Euihwan Jeong; Eugene Chung; Doyoung Kim; Mi Seon Lee; Eun Mi Go; Hye Jung Song; Hwangbeom Kim; Namjin Cho; Duhee Bang; Seokjoong Kim; Jin-Soo Kim
Journal:  Nat Biotechnol       Date:  2013-02-17       Impact factor: 54.908

3.  A TALE nuclease architecture for efficient genome editing.

Authors:  Jeffrey C Miller; Siyuan Tan; Guijuan Qiao; Kyle A Barlow; Jianbin Wang; Danny F Xia; Xiangdong Meng; David E Paschon; Elo Leung; Sarah J Hinkley; Gladys P Dulay; Kevin L Hua; Irina Ankoudinova; Gregory J Cost; Fyodor D Urnov; H Steve Zhang; Michael C Holmes; Lei Zhang; Philip D Gregory; Edward J Rebar
Journal:  Nat Biotechnol       Date:  2010-12-22       Impact factor: 54.908

4.  A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity.

Authors:  Martin Jinek; Krzysztof Chylinski; Ines Fonfara; Michael Hauer; Jennifer A Doudna; Emmanuelle Charpentier
Journal:  Science       Date:  2012-06-28       Impact factor: 47.728

5.  Multiplex genome engineering using CRISPR/Cas systems.

Authors:  Le Cong; F Ann Ran; David Cox; Shuailiang Lin; Robert Barretto; Naomi Habib; Patrick D Hsu; Xuebing Wu; Wenyan Jiang; Luciano A Marraffini; Feng Zhang
Journal:  Science       Date:  2013-01-03       Impact factor: 47.728

6.  RNA-guided human genome engineering via Cas9.

Authors:  Prashant Mali; Luhan Yang; Kevin M Esvelt; John Aach; Marc Guell; James E DiCarlo; Julie E Norville; George M Church
Journal:  Science       Date:  2013-01-03       Impact factor: 47.728

7.  DNA targeting specificity of RNA-guided Cas9 nucleases.

Authors:  Patrick D Hsu; David A Scott; Joshua A Weinstein; F Ann Ran; Silvana Konermann; Vineeta Agarwala; Yinqing Li; Eli J Fine; Xuebing Wu; Ophir Shalem; Thomas J Cradick; Luciano A Marraffini; Gang Bao; Feng Zhang
Journal:  Nat Biotechnol       Date:  2013-07-21       Impact factor: 54.908

8.  High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells.

Authors:  Yanfang Fu; Jennifer A Foden; Cyd Khayter; Morgan L Maeder; Deepak Reyon; J Keith Joung; Jeffry D Sander
Journal:  Nat Biotechnol       Date:  2013-06-23       Impact factor: 54.908

9.  ZFN-site searches genomes for zinc finger nuclease target sites and off-target sites.

Authors:  Thomas J Cradick; Giovanna Ambrosini; Christian Iseli; Philipp Bucher; Anton P McCaffrey
Journal:  BMC Bioinformatics       Date:  2011-05-13       Impact factor: 3.307

10.  RNA-programmed genome editing in human cells.

Authors:  Martin Jinek; Alexandra East; Aaron Cheng; Steven Lin; Enbo Ma; Jennifer Doudna
Journal:  Elife       Date:  2013-01-29       Impact factor: 8.140

