Heba Z Abid1, Eleanor Young1, Jennifer McCaffrey1, Kaitlin Raseley1, Dharma Varapula1, Hung-Yi Wang1, Danielle Piazza1,2,3, Joshua Mell2,3, Ming Xiao1,3.
Abstract
Whole-genome mapping technologies have been developed as a complementary tool to provide scaffolds for genome assembly and structural variation analysis (1,2). We recently introduced a novel DNA labeling strategy based on the CRISPR-Cas9 genome editing system, which can target any 20-bp sequence. The labeling strategy is particularly useful for targeting repetitive sequences and sequences not accessible to other labeling methods. In this report, we present customized mapping strategies that extend the applications of CRISPR-Cas9 DNA labeling. In the first strategy, we design CRISPR-Cas9 labeling to interrogate and differentiate single-allele differences in NGG protospacer adjacent motifs (PAMs). Combined with sequence motif labeling, this lets us pinpoint single-base differences in highly conserved sequences. In the second strategy, we design mapping patterns across a genome by selecting sets of specific single-guide RNAs (sgRNAs) for labeling multiple loci of a genomic region or a whole genome. By developing and optimizing a single-tube synthesis of multiple sgRNAs, we demonstrate the utility of CRISPR-Cas9 mapping with 162 sgRNAs targeting the 2-Mb Haemophilus influenzae chromosome. These CRISPR-Cas9 mapping approaches could be particularly useful for defining long-distance haplotypes and pinpointing the breakpoints of large structural variants in complex genomes and microbial mixtures.
Year: 2021 PMID: 33231685 PMCID: PMC7826249 DOI: 10.1093/nar/gkaa1088
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Interrogation of individual bases with CRISPR–Cas9 labeling. Yellow lines indicate single molecules. The thick blue bars represent the Nt.BspQI reference map. The narrower blue bar represents the consensus map of combined Nt.BspQI and CRISPR–Cas9 labeling. Red arrows and bases indicate the single-base differences between the two strains. Additional details can be found in Table 1.
Table 1. sgRNA target sequences used for single-base differentiation in Figure 1
| Strains | Locations | Loci | Target sequence | gRNA sequence |
|---|---|---|---|---|
| RR722 | 819899 | 1 | AAAAATTGCTGCATCTTCTTTG | AAAAATTGCTGCATCTTCTT |
| RR3131 | 885289 | 1 | AAAAATTGCTGCATCTTCTTTG | |
| RR722 | 828196 | 2 | AACCATTCAAACGGCGATTGC | AACCATTCAAACGGCGATTG |
| RR3131 | 893590 | 2 | CACTATTCAAACGGCTATTGC | |
| RR722 | 903309 | 3 | AATATCCTTGCCTTGAGAGAA | AATATCCTTGCCTTGAGAGA |
| RR3131 | 968698 | 3 | AATATCCTTGCCTTGAGAGAA | |
The differing bases are shown in red for three locations.
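The PAM-based interrogation described above relies on Cas9 requiring an NGG motif immediately 3' of the 20-nt protospacer, so a single-base change in the PAM can decide whether a site is labeled. The following is a minimal sketch of that check, not the authors' pipeline. The function names and both example alleles are hypothetical and are not taken from Table 1.

```python
import re

def has_ngg_pam(site: str) -> bool:
    """Return True if `site` is a 20-nt protospacer followed by an NGG PAM (23 nt total)."""
    # 20 protospacer bases + the PAM's N position = 21 arbitrary bases, then "GG".
    return re.fullmatch(r"[ACGT]{21}GG", site.upper()) is not None

def predicted_label(site: str) -> str:
    """Very simplified prediction: a site is labeled only if its PAM is intact."""
    return "label" if has_ngg_pam(site) else "no label"

# Hypothetical alleles: identical made-up protospacer, one base differs in the PAM.
PROTOSPACER = "ACGTACGTACGTACGTACGT"   # illustrative 20-nt sequence, not from the paper
allele_a = PROTOSPACER + "TGG"          # intact NGG PAM
allele_b = PROTOSPACER + "TGA"          # PAM disrupted by a single-base change

for name, site in (("allele_a", allele_a), ("allele_b", allele_b)):
    print(name, predicted_label(site))
```

This sketch only inspects the PAM on the given strand; it ignores protospacer mismatches and the reverse strand, both of which a real design tool would need to consider.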
Figure 2. The workflow of sgRNA synthesis. Multiple oligos, each with a promoter sequence (red) and an overlap sequence (green) on either side of the target sequence, are hybridized with a single complementary oligo that shares the overlap sequence.
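The oligo layout in Figure 2 (promoter, then target, then overlap) can be sketched in a few lines. The source does not state which promoter or overlap sequences were used, so the two constants below are assumptions chosen for illustration: the widely used T7 promoter and the first 20 nt of a common sgRNA scaffold. The function name is also hypothetical. The targets are the three gRNA sequences listed in Table 1.

```python
# Assumed sequences, NOT confirmed by the source.
T7_PROMOTER = "TAATACGACTCACTATAG"      # assumption: T7 promoter
OVERLAP = "GTTTTAGAGCTAGAAATAGC"        # assumption: 5' end of a common sgRNA scaffold

def build_template_oligo(target: str,
                         promoter: str = T7_PROMOTER,
                         overlap: str = OVERLAP) -> str:
    """Concatenate promoter + 20-nt target + overlap into one template oligo."""
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError(f"expected a 20-nt ACGT target, got {target!r}")
    return promoter + target + overlap

# gRNA sequences from Table 1.
targets = [
    "AAAAATTGCTGCATCTTCTT",
    "AACCATTCAAACGGCGATTG",
    "AATATCCTTGCCTTGAGAGA",
]

oligos = [build_template_oligo(t) for t in targets]
for oligo in oligos:
    print(len(oligo), oligo)
```

Because every oligo ends in the same overlap, a single complementary oligo can anneal to all of them, which is what makes pooling the oligos in one reaction possible.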
Figure 3. (A) Mapping results of RR722 molecules labeled with the 48 sgRNAs (Supplementary Table S1). The lines in the blue bar (designed reference map of RR722) represent the locations of the 48 sgRNAs on RR722. The yellow lines below the reference are labels with dark green dots representing where labels matched to the reference and light green dots representing labels not found in the reference. (B) Mapping results of RR3131 molecules labeled with the set of 48 sgRNAs (Supplementary Table S1). The lines in the blue bar (designed reference map of RR3131) represent the locations of the 48 sgRNAs on RR3131. The yellow lines below the reference are labels with dark green dots representing where labels matched to the reference map and light green dots representing labels not found in the reference map. The red arrows indicate the off-target labeling.
The off-target labeling of RR3131
| Strains | Locations | Labeling | Target sequence |
|---|---|---|---|
| RR722 | 21722 | | GCTTTTTAGGATATCGTCCC |
| RR3131 | 21698 | off target | GCTTTTTAAGATATCGTCCC |
| RR722 | 59529 | | GCGGTATCCACCCCCACTGC |
| RR3131 | 60913 | off target | GCAGTATCCACCCCCACTGC |
| RR722 | 86065 | | GTTACATTACACACAAACTT |
| RR3131 | 86656 | off target | GTTACATTACACACAAATTT |
| RR722 | 94393 | | GGGGCGTAAATTCTTAACAT |
| RR3131 | 151264 | off target | GGAGCGTAAATTCTTAACAT |
| RR722 | 253327 | | CGAAGGGATAAATATTGCGA |
| RR3131 | 316470 | off target | TGAAGGGATAAATATTGCGA |
| RR722 | 270963 | | TAGCACTTAAAAGAGGAATG |
| RR3131 | 334078 | off target | TGGCACTTAAAAGAGGAATG |
| RR722 | 219206 | | TTGTTTTACGATATAATACG |
| RR3131 | 281336 | no label | TTGTTTTGCGATATAATACG |
| RR722 | 296956 | | TAATCAAGCATTAGATAGCT |
| RR3131 | 359914 | no label | GCGTAAAGCATTAGATAGCT |
Two rows are shown for each of the eight probes that did not have a perfect hit in the RR3131 genome. The upper row is the designed probe, named for its hit location on the RR722 genome. The lower row is the sequence found in the RR3131 strain, named for its location there. Bold indicates a PAM sequence motif (NGG). Red indicates a base that does not match the designed probe. The last two probes did not produce a label seen consistently in the aligned data.
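Since the red highlighting of mismatched bases is not preserved in this text, the mismatch positions can be recomputed directly from the table. The sketch below compares each designed probe (RR722 row) with the corresponding RR3131 sequence and reports the 0-based positions that differ. The function name and dictionary layout are illustrative, and only four of the eight pairs are included for brevity.

```python
def mismatch_positions(probe: str, site: str) -> list[int]:
    """Return 0-based indices where two equal-length sequences differ."""
    if len(probe) != len(site):
        raise ValueError("sequences must have the same length")
    return [i for i, (p, s) in enumerate(zip(probe, site)) if p != s]

# (designed probe on RR722, sequence found in RR3131), keyed by RR722 location.
pairs = {
    21722:  ("GCTTTTTAGGATATCGTCCC", "GCTTTTTAAGATATCGTCCC"),  # off target
    86065:  ("GTTACATTACACACAAACTT", "GTTACATTACACACAAATTT"),  # off target
    219206: ("TTGTTTTACGATATAATACG", "TTGTTTTGCGATATAATACG"),  # no label
    296956: ("TAATCAAGCATTAGATAGCT", "GCGTAAAGCATTAGATAGCT"),  # no label
}

mismatches = {loc: mismatch_positions(p, s) for loc, (p, s) in pairs.items()}
for loc, positions in mismatches.items():
    print(loc, len(positions), positions)
```

For the pairs shown, the off-target sites differ from the probe at a single base, while the last probe differs at four bases near its 5' end.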
Figure 4. sgRNA design flowchart.
Figure 5. Mapping results of RR722 molecules labeled with the 162 sgRNAs (Supplementary Table S2). (A) The lines in the blue bar (designed reference map of RR722) represent the locations of the 162 sgRNAs on RR722. The yellow lines below the reference are labels with dark green dots representing where labels matched to the reference and light green dots representing labels not found in the reference. (B) Alignment results to RR3131.
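The dark-green versus light-green distinction in the mapping figures amounts to asking whether each observed label falls close to a designed reference position. The sketch below illustrates that idea only. The tolerance, the coordinates, and the function name are hypothetical, and real optical-map alignment software scores whole label patterns rather than single positions.

```python
from bisect import bisect_left

def classify_labels(observed, reference, tolerance):
    """Split observed label positions into (matched, unmatched) lists.

    A label counts as matched if a reference position lies within
    `tolerance` bp of it. All inputs are in base pairs.
    """
    ref = sorted(reference)
    matched, unmatched = [], []
    for pos in observed:
        i = bisect_left(ref, pos)
        # Only the nearest reference positions on either side can be in range.
        neighbors = ref[max(i - 1, 0):i + 1]
        if any(abs(pos - r) <= tolerance for r in neighbors):
            matched.append(pos)
        else:
            unmatched.append(pos)
    return matched, unmatched

# Hypothetical coordinates for illustration only.
reference_map = [10_000, 50_000, 90_000]
molecule_labels = [10_300, 49_500, 70_000]

matched, unmatched = classify_labels(molecule_labels, reference_map, tolerance=1_000)
print("matched:", matched)
print("unmatched:", unmatched)
```

Keeping the reference sorted and using a binary search means each label is checked against at most two candidates, which matters once a map holds hundreds of sites per genome.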