Hayley R Stoneman, Russell L Wrobel, Michael Place, Michael Graham, David J Krause, Matteo De Chiara, Gianni Liti, Joseph Schacherer, Robert Landick, Audrey P Gasch, Trey K Sato, Chris Todd Hittinger.
Abstract
CRISPR/Cas9 is a powerful tool for editing genomes, but design decisions are generally made with respect to a single reference genome. With population genomic data becoming available for an increasing number of model organisms, researchers are interested in manipulating multiple strains and lines. CRISpy-pop is a web application that generates and filters guide RNA sequences for CRISPR/Cas9 genome editing for diverse yeast and bacterial strains. The current implementation designs and predicts the activity of guide RNAs against more than 1000 Saccharomyces cerevisiae genomes, including 167 strains frequently used in bioenergy research. Zymomonas mobilis, an increasingly popular bacterial bioenergy research model, is also supported. CRISpy-pop is available as a web application (https://CRISpy-pop.glbrc.org/) with an intuitive graphical user interface. CRISpy-pop also cross-references the human genome to allow users to avoid the selection of guide RNAs with potential biosafety concerns. Additionally, CRISpy-pop predicts the strain coverage of each guide RNA within the supported strain sets, which aids in functional population genetic studies. Finally, we validate that CRISpy-pop accurately predicts the activity of guide RNAs across strains using population genomic data.
Keywords: CRISPR/Cas9; genome editing; population genomics; sgRNA; yeast
Year: 2020 PMID: 32963084 PMCID: PMC7642938 DOI: 10.1534/g3.120.401498
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Screenshot of the CRISpy-pop homepage (https://CRISpy-pop.glbrc.org/). There are options to search a gene in S. cerevisiae and Z. mobilis, as well as an offsite and custom target search. There are options to select specific strains, the desired PAM site, and the sgRNA length. Users may select the following PAM sites: NGG, NNGRRT, TTTV, NNNNGATT, TTTN, NCC, or NNAGAAW. Additionally, there is an option to search the human genome for perfect matches. CRISpy-pop features a user-friendly, web-based GUI.
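The PAM options listed in the caption use IUPAC ambiguity codes (N, R, W, V). As a minimal sketch of how such a PAM could be matched against a sequence, the Python below expands a PAM into a regular expression and lists candidate spacers. It is not CRISpy-pop's own code: the function names `pam_to_regex` and `find_spacers` are hypothetical, only the given strand is scanned, and the PAM is assumed to lie immediately 3′ of the spacer, as for NGG with Cas9. PAMs that sit on the other side of the spacer are not handled.

```python
import re

# IUPAC nucleotide codes that appear in the PAM list, as regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "W": "[AT]", "V": "[ACG]", "N": "[ACGT]",
}

def pam_to_regex(pam):
    """Translate an IUPAC PAM such as 'NNGRRT' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())

def find_spacers(seq, pam="NGG", length=20):
    """Return (start, spacer, pam_match) for every spacer followed by the PAM.

    Assumption: the PAM is directly 3' of the spacer and only the given
    strand is scanned. The lookahead keeps overlapping candidates.
    """
    pattern = re.compile("(?=([ACGT]{%d})(%s))" % (length, pam_to_regex(pam)))
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]
```

Calling `find_spacers(sequence, pam="NNGRRT", length=20)` on any DNA string returns the candidate spacers for that PAM; a full design tool would also scan the reverse strand and score each candidate.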
Table of the oligonucleotides used. These include the bridge primers for adding the sgRNA sequences to the pKOPIS + sgRNA plasmid, the primers for PCR SOEing to clone the donor DNA, and the primers for PCR and Sanger sequencing.
| Name | Sequence |
|---|---|
| ADE2 Bridge L1 | cgggtggcgaatgggactttACAGTTGGTATATTAGGAGGgttttagagctagaaatagc |
| ADE2 Bridge L2 | cgggtggcgaatgggactttAACAGTTGGTATATTAGGAGgttttagagctagaaatagc |
| ADE2 Bridge H1 | cgggtggcgaatgggactttACTTTGGCATACGATGGAAGgttttagagctagaaatagc |
| ADE2 Bridge H2 | cgggtggcgaatgggactttACGGAGTCCGGAACTCTAGCgttttagagctagaaatagc |
| ADE2 5′ KO For | gatgtccacgacgtctctCAAATGACTCTTGTTGCATGG |
| ADE2 5′ KO Rev | GTATATCAATAAACTTATATAACTTGATTGTTTTGTCCGATTTTC |
| ADE2 3′ KO For | GAAAATCGGACAAAACAATCAAGTTATATAAGTTTATTGATATAC |
| ADE2 3′ KO Rev | cggtgtcggtgtcgtagGTATAATAAGTGATCTTATGTATG |
| ADE2 Conf For | ACCAACATAACACTGACATC |
| ADE2 Conf Rev | TATATGAACTGTATCGAAAC |
| pKOPIS sgRNA For | AACGCGAGCTGCGCACATAC |
| pKOPIS sgRNA Rev | GCGACAGTCACATCATGCC |
| pKOPIS sgRNA Seq For | CACCTATATCTGCGTGTTG |
| pKOPIS sgRNA Seq Rev | GCACGTCAAGACTGTCAAGG |
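In the table above, the four bridge primers share the same lowercase 20-nt flanks and differ only in the uppercase 20-nt sgRNA sequence between them. The sketch below rebuilds a bridge primer from a spacer on that observation. The flank sequences are copied from the table; the helper names `bridge_oligo` and `extract_spacer` are hypothetical, and the interpretation of the flanks as plasmid homology is an assumption that the table itself does not state.

```python
# Flanks shared by all four ADE2 bridge primers, copied from the table.
LEFT_FLANK = "cgggtggcgaatgggacttt"
RIGHT_FLANK = "gttttagagctagaaatagc"

def bridge_oligo(spacer):
    """Place a 20-nt spacer between the two shared flanks."""
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be 20 nt of A, C, G, or T")
    return LEFT_FLANK + spacer + RIGHT_FLANK

def extract_spacer(oligo):
    """Recover the uppercase spacer from a bridge primer written as in the table."""
    return "".join(base for base in oligo if base.isupper())
```

For example, `bridge_oligo("ACAGTTGGTATATTAGGAGG")` reproduces the ADE2 Bridge L1 entry.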
Table of the strains and plasmids used. These include the lab identifier used for each individual strain or plasmid. The strains include the reference, and the plasmids include the sgRNA target sequence
| Strain | Lab Identifier | Reference |
|---|---|---|
| S288C | yHDO554 | |
| K1 | yHEB306 | |
| L1374 | yHDPN448 | |
| SK1 | yHDPN454 | |
| T73 | yHDPN449 | |
| Y55 | yHDPN455 | |
Figure 2. Log2 histograms of sgRNAs found within the 1011-strain set compared to S288C. The upper panel has a bin size of 1; the lower panel has a bin size of 100, except for the larger first bin. To compare sgRNAs designed against S288C, using all non-mitochondrial verified ORFs, with the variation within the 1011-strain set, we calculated the total number of strains in that set that could be targeted by each sgRNA found using S288C as the target. The total number of sgRNAs designed was 661,397. Only 55,875 of the sgRNAs had perfect matches in all 1011 genomes, while the remaining 605,522 target only a fraction of the genomes.
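The strain-coverage count described in this caption can be illustrated with a short sketch: an sgRNA is counted as targeting a strain when the spacer plus an NGG PAM has a perfect match on either strand of that strain's sequence. This is an assumption-laden illustration, not the CRISpy-pop implementation; `is_targeted`, `strain_coverage`, and the `strain_seqs` dictionary (strain name to gene sequence) are hypothetical names.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def is_targeted(seq, spacer, pam_regex="[ACGT]GG"):
    """True if spacer followed by the PAM matches perfectly on either strand."""
    site = re.compile(re.escape(spacer.upper()) + pam_regex)
    seq = seq.upper()
    return bool(site.search(seq) or site.search(reverse_complement(seq)))

def strain_coverage(spacer, strain_seqs):
    """Return (count, sorted strain names) for strains with a perfect match."""
    hits = sorted(name for name, seq in strain_seqs.items()
                  if is_targeted(seq, spacer))
    return len(hits), hits
```

Running `strain_coverage` for every designed sgRNA and tallying the counts would give the kind of per-sgRNA distribution that the histograms summarize.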
Figure 3. Sample output from CRISpy-pop for a search of the ADE2 gene in the S288C genome with the NGG PAM sequence, a spacer length of 20, and cross-referencing against the human genome to ensure that no perfect matches exist for the selected sgRNAs. A, genome viewer output by CRISpy-pop, showing the relative position of each sgRNA within the target gene. B, portion of the sgRNA results table, with each data point for each output sgRNA sequence. C, detailed results for an individual sgRNA, including the identities of targeted and non-targeted strains.
Figure 4. Portions of the ADE2 gene from each strain aligned with the four sgRNAs. The PAM sites are shown in purple. The gene sequences from the strains were extracted and aligned to each other and to the four sgRNA sequences (H1, H2, L1, L2). The single nucleotide polymorphism highlighted in red at position 27 is predicted to prevent the two low-coverage sgRNAs (L1 and L2) from targeting ADE2 in the strains that carry it. Note that sgRNA H2 targets the opposite strand, so its reverse complement is shown in this figure.
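Two operations underlie an alignment like this one: reverse-complementing a guide that targets the opposite strand so it can be displayed against the gene, and locating the positions where a guide and a strain's allele disagree. The sketch below shows both under simple assumptions (equal-length, ungapped sequences); `mismatch_positions` is a hypothetical helper and is not taken from CRISpy-pop.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement, used to display an opposite-strand sgRNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def mismatch_positions(spacer, site):
    """0-based positions where an ungapped, equal-length spacer and site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return [i for i, (a, b) in enumerate(zip(spacer.upper(), site.upper()))
            if a != b]

# The H2 spacer from the oligonucleotide table, shown on the other strand.
H2_DISPLAYED = reverse_complement("ACGGAGTCCGGAACTCTAGC")
```

An empty list from `mismatch_positions` corresponds to a perfect match; a single entry marks a polymorphism of the kind highlighted in the figure.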
Figure 5. CRISpy-pop generated sgRNAs that target ADE2 in a strain-specific manner. Results of the transformation of each strain with each sgRNA are shown. The two strains in red (S288C and SK1) were each predicted to be targeted by all four sgRNAs, and only these two strains had non-zero % knockouts (KOs) with all four sgRNAs. The four remaining strains were predicted to be targeted by the high-coverage sgRNAs (H1 and H2) but not by the low-coverage sgRNAs (L1 and L2), and these four strains had non-zero % knockouts only with the high-coverage sgRNAs. These results align with the strain coverage predictions made by CRISpy-pop. The predicted activity scores ranked H2 < H1 < L2 < L1, which is also consistent with the observed efficiencies.