
WGE: a CRISPR database for genome engineering.

Alex Hodgkins, Anna Farne, Sajith Perera, Tiago Grego, David J Parry-Smith, William C Skarnes, Vivek Iyer

Abstract

The rapid development of CRISPR-Cas9 mediated genome editing techniques has given rise to a number of online and stand-alone tools to find and score CRISPR sites for whole genomes. Here we describe the Wellcome Trust Sanger Institute Genome Editing database (WGE), which uses novel methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks. WGE is open, extensible and can be set up to compute and present CRISPR sites for any genome.
AVAILABILITY AND IMPLEMENTATION: The WGE database is freely available at www.sanger.ac.uk/htgt/wge.
CONTACT: vvi@sanger.ac.uk or skarnes@sanger.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press.

Year:  2015        PMID: 25979474      PMCID: PMC4565030          DOI: 10.1093/bioinformatics/btv308

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

CRISPR-Cas technology is a powerful tool for genome editing that can be applied to virtually any species, from viruses to plants to mammals (Hsu et al.; Mali et al.). The CRISPR-Cas system, exemplified by Cas9 from Streptococcus pyogenes, is an RNA-guided endonuclease that can be targeted to specific sequences by Watson-Crick base pairing between a guide RNA (gRNA) molecule and a 20 bp target sequence adjacent to an obligate NGG protospacer adjacent motif (PAM). By replacing the first 20 bp of the gRNA with the desired target sequence, Cas9 can be re-programmed to induce a double-stranded break at any N(20)NGG site in the genome. The Cas9 endonuclease will tolerate some mismatches in the alignment between the gRNA and target DNA sequence, and off-target damage can occur at other sites in the genome with high sequence similarity to the CRISPR site (Hsu et al.). Therefore, when designing gRNAs for genome editing, it is important to consider potential off-target sites as well as the position of the gRNA site within a gene.

Many web-delivered software solutions are available for choosing highly specific CRISPR sites in vertebrate genomes (Supplementary Table S1). Some require the input of target or gRNA sequences (Bae et al.; Hsu et al.; http://www.benchling.com). Others provide a basic genome browser (http://horizon.deskgen.com; Montague et al.). One can be run locally (Bae et al.), and a few have more involved scoring schemes, both for gRNAs and off-targets (e.g. Heigwer et al.; http://www.benchling.com). Some use a heuristic short-read aligner such as Bowtie (Langmead and Salzberg, 2012) to compute gRNA specificity (Ma et al.; Montague et al.; Sander et al.).

Here we describe WGE (http://www.sanger.ac.uk/htgt/wge). In contrast to existing websites, WGE guides the genome editing process using Genoverse, a fast, open and extensible genome browser. The display of single and paired CRISPR sites is integrated with pre-computed off-target scores and user-controlled filters.
The browser also incorporates Ensembl gene structure, available variation, protein translation and user-generated targeting construct designs. The WGE website (Fig. 1) enables a designer to view and consider CRISPRs in the context of the underlying genomic landscape. CRISPRs and targeting vector designs can be stored and retrieved later by means of a Google login.
Fig. 1.

Display of precomputed CRISPR sites in the Genoverse genome browser (upper panel). A region of the human APOE gene is shown with CRISPR sites (green bars with PAM site in blue) below the annotated gene model. Clicking on individual CRISPR sites returns a popup window showing off-target information (lower panel). Genomic information for the original CRISPR site is in blue. Mismatches in the off-target sequence are in red.

The WGE system is divided into four parts: the CRISPR-Analyser software tool, the WGE database, the WGE website and the Vector Designer (Supplementary Fig. S1). We have used the WGE CRISPR-Analyser software to pre-compute genome-wide off-target potentials of all CRISPR sites within the mouse and human exomes, plus 200 bp of flanking sequence. These results are stored in the WGE database. The storage of CRISPR sites and their off-target scores allows users to rapidly browse the genome and alter filter criteria to select CRISPRs. Website users can also initiate real-time off-target scoring for previously un-scored CRISPRs: the resulting scores are stored and made available to all users. All data on the website are accessible via an API. Our focus is on mouse and human genomes, but the components have been written to make the process of installing and extending to other genomes easy. See, for example, https://github.com/coronin/CRISPR-Analyser, which extends the CRISPR-Analyser software to NGG and NAG PAM sites in the dog genome.

2 Methods

2.1 CRISPR-analyser: CRISPR display and scoring

The WGE CRISPR-Analyser package identifies CRISPR sites by scanning every 23-base window of the reference genome, searching for a CC as the first two bases (indicating a PAM site on the reverse strand) or a GG as the final two bases (PAM site on the forward strand). The CRISPR-Analyser software also provides very fast, genome-wide off-target CRISPR scoring (approximately 3 seconds per CRISPR site). Off-target potential is found by directly comparing the CRISPR sequence to all other possible matches in the genome, with up to 4 bp of mismatch. This is done very rapidly by building an in-memory index of all CRISPR sequences (Supplementary Information Section 2 and Supplementary Fig. S2). In contrast to other packages (e.g. Naito et al.), no other alignment software is used. Using 80 parallel processes, we computed genome-wide off-target scores for all CRISPR sites (PAM = NGG) in the mouse and human exomes (including 200 bp flanks) in less than 2 weeks. In combination with the built-in web server, the software is also fast enough to score moderately small numbers of CRISPR sites in real time for user-generated requests made from the WGE website.
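The site-finding rule and the mismatch comparison described above can be illustrated with a minimal Python sketch. This is not the CRISPR-Analyser implementation: the function names are hypothetical, the comparison is a naive linear scan rather than the in-memory index the package builds, and the sketch assumes that mismatches are counted over the 20 protospacer bases only, read 5'→3' on the strand that carries the NGG PAM.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_crispr_sites(genome):
    """Yield (start, strand, window) for every 23-base window that ends in GG
    (PAM on the forward strand) or starts with CC (PAM on the reverse strand).
    `start` is a 0-based offset into `genome`."""
    genome = genome.upper()
    for start in range(len(genome) - 22):
        window = genome[start:start + 23]
        if window.endswith("GG"):
            yield start, "+", window
        if window.startswith("CC"):
            yield start, "-", window


def protospacer(window, strand):
    """Return the 20 bases 5' of the PAM (assumption: PAM excluded from scoring)."""
    oriented = window if strand == "+" else reverse_complement(window)
    return oriented[:20]


def off_target_summary(query, sites, max_mismatches=4):
    """Count sites whose protospacer differs from `query` at no more than
    `max_mismatches` positions. Naive scan, for illustration only."""
    counts = Counter()
    for _start, strand, window in sites:
        mismatches = sum(a != b for a, b in zip(query, protospacer(window, strand)))
        if mismatches <= max_mismatches:
            counts[mismatches] += 1
    return ", ".join(f"{n}:{counts[n]}" for n in range(max_mismatches + 1))


# Toy example: one forward-strand site in a 33-base "genome".
toy_genome = "AAAAA" + "ACGT" * 5 + "TGG" + "AAAAA"
toy_sites = list(find_crispr_sites(toy_genome))
print(off_target_summary("ACGT" * 5, toy_sites))
```

A linear scan like this costs one comparison per site per query, which is why a genome-wide implementation needs an index or a compact encoding of the sites; the sketch is only meant to make the matching rule concrete.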

2.2 WGE database and WGE website

All CRISPR sites, pre-computed and user-requested off-target scores, as well as user-generated vector designs are stored in the WGE Database (Supplementary Information Section 3). Bulk downloads of the stored CRISPR locations and off-target information are available via the WGE website or by directly querying the WGE database with a REST API (Supplementary Information Section 1). The WGE website is built using Genoverse (www.genoverse.org) and standard web development components (Supplementary Information Section 4).

3 Results and discussion

Users enter the CRISPR-finding part of the WGE website by selecting a species (currently human or mouse) and the marker symbol of the gene to inspect. They are then prompted to select a target Ensembl exon. All possible CRISPR sites and paired sites (Supplementary Fig. S3) are shown on a scrollable Genoverse genome browser view, which incorporates the current gene models from Ensembl (Flicek et al.), protein translation, available variation and any user-generated targeting vector designs (see below). 'Paired' CRISPR sites (Shen et al.) are shown in WGE when CRISPR sites on opposite strands have a separation of up to 30 bp, or an overlap of up to 10 bp.

Our scoring system reports the number of similar sequences in the genome with up to four mismatches (excluding the PAM region), summarized in a simple output string. For example, a score '0:1, 1:0, 2:0, 3:4, 4:56' indicates there is 1 genomic site with 0 mismatches (the CRISPR site itself), no off-target sites with 1 or 2 mismatches, 4 potential off-target sites with 3 mismatches and 56 with 4 mismatches. By clicking on a CRISPR site, off-target information is displayed with a link to a summary report which highlights the bases that differ within the off-target sequences (Fig. 1) and reports their genomic coordinates and genomic context (intergenic, intron, exon). In this way, users can immediately grasp the off-target potential for each CRISPR site. Users can also filter CRISPR sites based on their stored off-target characteristics (Supplementary Fig. S4). Using this interface, hundreds of possible CRISPR sites can be narrowed down and evaluated to select the optimal site(s) for an editing task. Like other CRISPR-finding websites, WGE also allows a user to directly paste in genomic sequence, which is analysed rapidly to show CRISPR sites and their pre-computed off-targets (Supplementary Fig. S5).
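As an illustration of how the off-target summary string and the pairing rule described above might be handled programmatically, the following Python sketch parses a summary, applies a simple threshold filter and tests two sites for pairing. All function names are hypothetical and are not part of the WGE API. The pairing check assumes that separation and overlap are measured between the 23 bp footprints of the two sites (1-based inclusive coordinates) and does not constrain PAM orientation; the text does not specify either point.

```python
def parse_off_target_summary(summary):
    """Turn '0:1, 1:0, 2:0, 3:4, 4:56' into {0: 1, 1: 0, 2: 0, 3: 4, 4: 56}."""
    counts = {}
    for field in summary.split(","):
        mismatches, n_sites = field.strip().split(":")
        counts[int(mismatches)] = int(n_sites)
    return counts


def passes_filter(summary, max_sites_by_mismatch):
    """True if, for each mismatch level in `max_sites_by_mismatch`, the number
    of genomic sites does not exceed the given limit."""
    counts = parse_off_target_summary(summary)
    return all(counts.get(mm, 0) <= limit
               for mm, limit in max_sites_by_mismatch.items())


def is_paired(site_a, site_b, max_separation=30, max_overlap=10):
    """Sites are (start, end, strand) tuples with 1-based inclusive coordinates.
    Assumption: the gap is measured between site footprints."""
    (_, end_left, strand_left), (start_right, _, strand_right) = sorted([site_a, site_b])
    if strand_left == strand_right:
        return False
    gap = start_right - end_left - 1  # negative values mean the sites overlap
    return -max_overlap <= gap <= max_separation


score = "0:1, 1:0, 2:0, 3:4, 4:56"
# Keep sites that are unique in the genome and have no 1- or 2-mismatch hits.
print(passes_filter(score, {0: 1, 1: 0, 2: 0}))
print(is_paired((100, 122, "-"), (133, 155, "+")))
```

Keeping the thresholds in a dictionary keyed by mismatch count mirrors the structure of the summary string, so stricter or looser filters can be expressed without changing the parsing code.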
WGE can be used to design PCR primers for the assembly of gene targeting vectors by Gibson assembly (Gibson et al.) or other similar PCR-based methods. This involves first choosing a target exon, and then adjusting design parameters via a web interface to allow the primer calculations to be run (Supplementary Fig. S6). The resulting targeting vector designs can be bookmarked and edited, and are displayed alongside CRISPR sites in the genome browser (Supplementary Fig. S7).

WGE provides the user with a highly visual method of rapidly designing genome edits using CRISPRs and targeting vectors. We plan to exploit this open and extensible platform to incorporate more genomes as needed, efficiency-based CRISPR scoring strategies and other useful browser tracks.
References (10 of 12 shown)

1.  Enzymatic assembly of DNA molecules up to several hundred kilobases.

Authors:  Daniel G Gibson; Lei Young; Ray-Yuan Chuang; J Craig Venter; Clyde A Hutchison; Hamilton O Smith
Journal:  Nat Methods       Date:  2009-04-12       Impact factor: 28.547

2.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

3.  Efficient genome modification by CRISPR-Cas9 nickase with minimal off-target effects.

Authors:  Bin Shen; Wensheng Zhang; Jun Zhang; Jiankui Zhou; Jianying Wang; Li Chen; Lu Wang; Alex Hodgkins; Vivek Iyer; Xingxu Huang; William C Skarnes
Journal:  Nat Methods       Date:  2014-03-02       Impact factor: 28.547

4.  E-CRISP: fast CRISPR target site identification.

Authors:  Florian Heigwer; Grainne Kerr; Michael Boutros
Journal:  Nat Methods       Date:  2014-02       Impact factor: 28.547

5.  Development and applications of CRISPR-Cas9 for genome engineering (review).

Authors:  Patrick D Hsu; Eric S Lander; Feng Zhang
Journal:  Cell       Date:  2014-06-05       Impact factor: 41.582

6.  ZiFiT (Zinc Finger Targeter): an updated zinc finger engineering tool.

Authors:  Jeffry D Sander; Morgan L Maeder; Deepak Reyon; Daniel F Voytas; J Keith Joung; Drena Dobbs
Journal:  Nucleic Acids Res       Date:  2010-04-30       Impact factor: 16.971

7.  CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites.

Authors:  Yuki Naito; Kimihiro Hino; Hidemasa Bono; Kumiko Ui-Tei
Journal:  Bioinformatics       Date:  2014-11-20       Impact factor: 6.937

8.  Ensembl 2014.

Authors:  Paul Flicek; M Ridwan Amode; Daniel Barrell; Kathryn Beal; Konstantinos Billis; Simon Brent; Denise Carvalho-Silva; Peter Clapham; Guy Coates; Stephen Fitzgerald; Laurent Gil; Carlos García Girón; Leo Gordon; Thibaut Hourlier; Sarah Hunt; Nathan Johnson; Thomas Juettemann; Andreas K Kähäri; Stephen Keenan; Eugene Kulesha; Fergal J Martin; Thomas Maurel; William M McLaren; Daniel N Murphy; Rishi Nag; Bert Overduin; Miguel Pignatelli; Bethan Pritchard; Emily Pritchard; Harpreet S Riat; Magali Ruffier; Daniel Sheppard; Kieron Taylor; Anja Thormann; Stephen J Trevanion; Alessandro Vullo; Steven P Wilder; Mark Wilson; Amonida Zadissa; Bronwen L Aken; Ewan Birney; Fiona Cunningham; Jennifer Harrow; Javier Herrero; Tim J P Hubbard; Rhoda Kinsella; Matthieu Muffato; Anne Parker; Giulietta Spudich; Andy Yates; Daniel R Zerbino; Stephen M J Searle
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

9.  CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing.

Authors:  Tessa G Montague; José M Cruz; James A Gagnon; George M Church; Eivind Valen
Journal:  Nucleic Acids Res       Date:  2014-05-26       Impact factor: 16.971

10.  A guide RNA sequence design platform for the CRISPR/Cas9 system for model organism genomes.

Authors:  Ming Ma; Adam Y Ye; Weiguo Zheng; Lei Kong
Journal:  Biomed Res Int       Date:  2013-10-03       Impact factor: 3.411
