Jianying Li, Hakim Manghwar, Lin Sun, Pengcheng Wang, Guanying Wang, Hanyan Sheng, Jie Zhang, Hao Liu, Lei Qin, Hangping Rui, Bo Li, Keith Lindsey, Henry Daniell, Shuangxia Jin, Xianlong Zhang.
Abstract
The CRISPR/Cas9 system has been extensively applied for crop improvement. However, our understanding of Cas9 specificity in Cas9-edited plants is very limited. To identify on- and off-target mutations in an edited crop, we describe whole genome sequencing (WGS) of 14 Cas9-edited cotton plants targeting three genes, together with three negative (Ne) control and three wild-type (WT) plants. In total, 4188-6404 unique single-nucleotide polymorphisms (SNPs) and 312-745 insertions/deletions (indels) were detected in the 14 Cas9-edited plants compared to WT, negative and cotton reference genome sequences. Since the majority of these variations lack a protospacer-adjacent motif (PAM), we demonstrated that most variations following Cas9 editing are due to somaclonal variation and/or pre-existing/inherent variation from maternal plants, not to off-target effects. Of a total of 4413 potential off-target sites (allowing ≤5 mismatches within the 20-bp sgRNA and 3-bp PAM sequences), the WGS data revealed that only four are bona fide off-target indel mutations, validated by Sanger sequencing. Moreover, inherent genetic variation in WT plants can generate novel off-target sites and destroy PAMs, which suggests that great care should be taken when designing sgRNAs to minimize off-target effects. These findings suggest that the CRISPR/Cas9 system is highly specific in cotton plants.
Keywords: CRISPR/Cas9; cotton; off-target; pre-existing/inherent variation; somaclonal variation; whole genome sequencing
Year: 2018 PMID: 30291759 PMCID: PMC6587709 DOI: 10.1111/pbi.13020
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
Figure 1. Whole genome sequencing analysis of on‐ and off‐target mutations in CRISPR/Cas9‐edited cotton plants. Schematic diagram of the whole procedure of the CRISPR/Cas9 system for gene editing in transgenic cotton plants. Whole genome sequencing was applied to 12 plants from the Cas9‐edited T0 generation, targeting three endogenous genes (, and , four T0 plants from each target gene), two plants from the T1 generation, three negative plants (Ne) from the T0 generation and three wild‐type (WT) control cotton plants to detect on‐ and off‐target mutations. The two T1 plants are derived from the same transgenic T0 plant. Cas9‐negative plants contained the edited target site without the T‐DNA (no CRISPR/Cas9 fragment). Cas9‐positive plants contained the edited target site as well as the T‐DNA (with CRISPR/Cas9 fragment). The number in brackets, e.g. AP2 (s1), represents the plant or line number used for WGS.
Figure 2. Whole genome sequencing and Sanger sequencing confirm on‐target mutations. (a–c) Sanger sequencing validates the mutations at six sgRNA sites in three endogenous genes. The orange and red lines represent the sgRNA and PAM sequences, respectively. The labels beside the sequence alignment identify the individual clones, and the label on top gives the on‐target genome coordinate. Black arrows indicate sgRNA direction. (d) Whole genome sequencing analysis at the sgRNA2 target region of in three wild‐type, three negative and four Cas9‐edited plants, viewed in Integrative Genomics Viewer (IGV). The numbers in square brackets, e.g. [0–45], represent the WGS sequence reads supporting the on‐target sites, and the pileup strip shows that Cas9‐edited plants carry a heterozygous 43‐bp deletion in exon 1 of . Red arrows indicate DNA cleavage sites in different Cas9‐edited cotton plants. Orange and red boxes represent the sgRNA sequence and PAM sequence, respectively. The sequence alignments of the other five sgRNA regions in the three endogenous target genes are shown in Figures S2–S4. (e–g) Comparison of WGS and Sanger sequencing mutation frequencies at on‐target sites. ‘d’, ‘i’ and ‘s’ represent the deletion, insertion and SNP genotypes, respectively. The ‘no’ indicates no editing and other mutation types at the on‐target site.
Figure 3. Genome‐wide analysis of variations in Cas9‐edited cotton plants during the tissue culture process. (a) The bioinformatics pipeline for off‐target mutation analysis. (b) Heatmap representing the percentage of each specific mutation type in Cas9‐edited transgenic cotton plants. (c) Length of indels in different Cas9‐edited plants.
The summary of total variations in wild‐type, negative and CRISPR/Cas9‐edited cotton plants
| Plants | Lines vs Ref (SNP) | Lines vs Ref (Indel) | Plants vs Ref/WT (SNP) | Plants vs Ref/WT (Indel) | Plants vs Ref/WT/Ne (SNP) | Plants vs Ref/WT/Ne (Indel) | Private variations (SNP) | Private variations (Indel) |
|---|---|---|---|---|---|---|---|---|
| WT (s79) | 1 211 622 | 149 327 | – | – | – | – | – | – |
| WT (s195) | 1 214 683 | 152 535 | – | – | – | – | – | – |
| WT (s199) | 1 210 509 | 148 567 | – | – | – | – | – | – |
| Negative (s65) | 1 203 206 | 149 636 | 69 242 | 13 080 | – | – | – | – |
| Negative (s66) | 1 217 124 | 148 842 | 62 882 | 13 431 | – | – | – | – |
| Negative (s67) | 1 209 155 | 135 845 | 82 264 | 19 840 | – | – | – | – |
| Cas9‐ | 1 266 777 | 150 156 | 61 648 | 13 870 | 15 210 | 6935 | 4188 | 500 |
| Cas9‐ | 1 284 258 | 150 421 | 68 483 | 14 940 | 18 540 | 7736 | 4893 | 527 |
| Cas9‐ | 1 270 814 | 150 405 | 67 517 | 13 691 | 19 415 | 6707 | 5976 | 549 |
| Cas9‐ | 1 260 704 | 149 367 | 61 976 | 14 292 | 16 704 | 7282 | 4345 | 495 |
| Cas9‐ | 1 262 125 | 149 979 | 62 580 | 14 224 | 32 591 | 7950 | 6096 | 777 |
| Cas9‐ | 1 258 810 | 149 834 | 66 866 | 14 204 | 32 412 | 8040 | 5984 | 859 |
| Cas9‐ | 1 259 373 | 149 594 | 64 312 | 13 508 | 21 368 | 6596 | 6404 | 532 |
| Cas9‐ | 1 268 973 | 150 803 | 59 756 | 14 353 | 17 179 | 7123 | 2807 | 312 |
| Cas9‐ | 1 261 093 | 150 637 | 61 338 | 13 064 | 18 726 | 6316 | 4595 | 484 |
| Cas9‐ | 1 261 356 | 150 166 | 61 323 | 13 599 | 18 845 | 6684 | 5604 | 660 |
| Cas9‐ | 1 269 877 | 151 413 | 66 408 | 14 202 | 25 447 | 6884 | 5839 | 593 |
| Cas9‐ | 1 277 072 | 149 570 | 65 577 | 15 603 | 25 114 | 7852 | 5397 | 745 |
| Cas9‐ | 1 245 699 | 145 824 | 64 627 | 14 312 | 26 249 | 7035 | 4405 | 378 |
| Cas9‐ | 1 264 623 | 149 644 | 60 793 | 14 837 | 22 497 | 7406 | 4660 | 578 |
‘Lines vs Ref’ gives the variations of each plant compared to the TM‐1 reference genome, called using SAMtools and GATK (Table S4). ‘Plants vs Ref/WT’ gives the variations of each plant that remain after comparison with the wild‐type (WT) plants and the TM‐1 reference genome. Similarly, ‘Plants vs Ref/WT/Ne’ gives the variations of each Cas9‐edited transgenic plant that remain after comparison with the WT and negative plants. At sample‐specific variation sites, the three WT plants have the same genotype as the three negative plants but differ from each Cas9‐edited plant. Sample‐specific variations (including tissue culture variations and/or inherent variations and/or Cas9‐edited mutations) were annotated by ANNOVAR (Table S5).
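The successive comparisons summarised in the table note can be pictured as set subtraction. The following is only a minimal sketch under stated assumptions, not the authors' pipeline (which used SAMtools, GATK and ANNOVAR): the function `filter_private`, the variant tuples and the sample names are hypothetical, and "private" is assumed here to mean "not shared with any other edited plant".

```python
from typing import Dict, List, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def filter_private(
    calls: Dict[str, Set[Variant]], wt: List[str], ne: List[str]
) -> Dict[str, Dict[str, Set[Variant]]]:
    """Hypothetical helper: subtract WT, then negative, then other edited plants."""
    wt_union = set().union(*(calls[s] for s in wt))
    ne_union = set().union(*(calls[s] for s in ne))
    edited = [s for s in calls if s not in wt and s not in ne]
    result = {}
    for sample in edited:
        vs_wt = calls[sample] - wt_union          # "vs Ref/WT"
        vs_wt_ne = vs_wt - ne_union               # "vs Ref/WT/Ne"
        others = set().union(*(calls[o] for o in edited if o != sample))
        private = vs_wt_ne - others               # assumed meaning of "private"
        result[sample] = {"vs_wt": vs_wt, "vs_wt_ne": vs_wt_ne, "private": private}
    return result


# Toy data, invented purely for illustration.
v1, v2, v3 = ("A01", 100, "A", "G"), ("A01", 250, "C", "T"), ("D05", 40, "G", "A")
v4, v5, v6 = ("D05", 900, "T", "C"), ("D13", 12, "A", "AT"), ("D13", 77, "GC", "G")
calls = {
    "WT_a": {v1, v2},
    "Ne_a": {v1, v3},
    "Cas9_a": {v1, v2, v3, v4, v5},
    "Cas9_b": {v1, v4, v6},
}
summary = filter_private(calls, wt=["WT_a"], ne=["Ne_a"])
for name, sets in summary.items():
    print(name, {step: len(found) for step, found in sets.items()})
```

Each step can only shrink the set, which matches the pattern in the table where the counts fall from the reference comparison to the private variations.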
Identification of off‐target mutations in Cas9‐edited plants by whole genome sequencing
| Cas9‐edited plants/sgRNA | Cas9 mutations/No. of NGG sites (Ratio%) | Cas9 mutations/No. of NAG sites (Ratio%) | Cas9 mutations/No. of NGA sites (Ratio%) |
|---|---|---|---|
| | 0/441 (0.00) | 0/57 (0.00) | 0/155 (0.00) |
| | 0/765 (0.00) | 0/55 (0.00) | 0/83 (0.00) |
| | 0/683 (0.00) | 0/55 (0.00) | 0/151 (0.00) |
| | 2/182 (1.10) | 0/8 (0.00) | 0/15 (0.00) |
| | 2/341 (0.59) | 0/66 (0.00) | 0/54 (0.00) |
| | 0/884 (0.00) | 0/169 (0.00) | 0/249 (0.00) |
The six sgRNA sequences were aligned to the TM‐1 reference genome using Cas‐OFFinder and BatMis, allowing up to five mismatches, which yielded 3296 (PAM: NGG), 410 (PAM: NAG) and 707 (PAM: NGA) potential off‐target sites (Table S6 and Appendix S1).
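As a quick consistency check, the per‐sgRNA site counts in the table can be summed to recover the totals quoted in the note and the 4413 potential off‐target sites cited in the abstract. The short sketch below uses only the figures from the table; the variable names are illustrative.

```python
# Potential off-target sites per sgRNA (six sgRNAs, in table row order), by PAM type.
sites = {
    "NGG": [441, 765, 683, 182, 341, 884],
    "NAG": [57, 55, 55, 8, 66, 169],
    "NGA": [155, 83, 151, 15, 54, 249],
}
# Confirmed Cas9 mutations at NGG sites, in the same row order.
mutations_ngg = [0, 0, 0, 2, 2, 0]

totals = {pam: sum(counts) for pam, counts in sites.items()}
grand_total = sum(totals.values())
ratios = [round(100 * m / n, 2) for m, n in zip(mutations_ngg, sites["NGG"])]

print(totals)              # per-PAM totals
print(grand_total)         # all potential off-target sites
print(ratios)              # per-sgRNA mutation ratio (%) at NGG sites
print(sum(mutations_ngg))  # confirmed off-target mutations
```

The sums come to 3296, 410 and 707 (4413 overall), and the two non‐zero ratios round to 1.10% and 0.59%, in agreement with the table.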
Figure 4. Identification of potential off‐target mutations in Cas9‐edited cotton plants by whole genome sequencing and Sanger sequencing. (a) Off‐target mutations in the promoter region of the gene. (b) Off‐target mutations in the exon region of the gene. (c,d) Off‐target mutations in non‐coding regions of the gene. The left panel shows the different off‐target mutation types. Mismatched nucleotides (on‐target site vs off‐target site) are marked with ‘x’, and the PAM and off‐target region are marked with red and orange rectangles, respectively. The label at the top of the left panel gives the genome coordinate of each off‐target site. Black arrows represent the sgRNA transcription direction. OFTM denotes the off‐target site mutation in Cas9‐edited cotton plants. The Cas9‐edited MF represents the mutation frequency (MF) in different Cas9‐edited cotton plants compared to WT plants. The MF in the left panel shows the average mutation frequency per plant. The middle panel shows the indel frequency in different Cas9‐edited plants. The right panel shows the Sanger sequencing data at off‐target sites. The stars in the right panel mark the cleavage sites.
Newly generated off‐target sites or PAMs by genetic variations in the WT plants
| Target | | | | | | |
|---|---|---|---|---|---|---|
| sgRNA | sgRNA1 | sgRNA2 | sgRNA1 | sgRNA2 | sgRNA1 | sgRNA2 |
| Off‐target site (NGG) | 4/441 | 7/765 | 12/683 | 4/182 | 4/341 | 8/884 |
| PAM site (NGG) | 0/441 | 3/765 | 1/683 | 0/182 | 0/341 | 1/884 |
| Off‐target site (NAG) | 0/57 | 2/55 | 0/55 | 1/8 | 0/66 | 3/169 |
| PAM site (NAG) | 0/57 | 0/55 | 0/55 | 0/8 | 0/66 | 0/169 |
| Off‐target site (NGA) | 2/155 | 2/83 | 3/151 | 0/15 | 0/54 | 3/249 |
| PAM site (NGA) | 0/155 | 0/83 | 0/151 | 0/15 | 1/54 | 0/249 |
The number before the ‘/’ is the number of novel off‐target sites (or PAM sites) in WT plants, and the number after the ‘/’ is the total number of potential off‐target sites in the reference genome.
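How a single inherent variant can create a new candidate off‐target site, or remove a PAM, can be illustrated with a small sketch. This is a simplified model and not the Cas‐OFFinder or BatMis algorithm: mismatches are counted only within the 20‐bp protospacer, the PAM must match a pattern such as NGG, and the sgRNA, the sites and all function names are invented for illustration.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def has_pam(pam: str, pattern: str = "NGG") -> bool:
    """True if the 3-bp PAM matches the pattern ('N' matches any base)."""
    return len(pam) == len(pattern) and all(
        p == "N" or p == c for p, c in zip(pattern, pam)
    )


def is_potential_off_target(sgrna: str, site: str, max_mm: int = 5,
                            pattern: str = "NGG") -> bool:
    """Simplified rule: PAM matches and the protospacer has <= max_mm mismatches."""
    protospacer, pam = site[:20], site[20:23]
    return has_pam(pam, pattern) and mismatches(sgrna, protospacer) <= max_mm


def apply_snp(site: str, offset: int, alt: str) -> str:
    """Return the site with a single base substituted at the given offset."""
    return site[:offset] + alt + site[offset + 1:]


# Hypothetical sequences, not taken from the cotton genome.
sgrna = "ACGTACGTACGTACGTACGT"
ref_site = "TGCATCCTACGTACGTACGT" + "TGG"   # 6 mismatches: outside the search window

wt_site = apply_snp(ref_site, 0, "A")       # a WT SNP removes one mismatch
pam_lost = apply_snp(wt_site, 21, "A")      # a SNP in the PAM turns TGG into TAG

print(is_potential_off_target(sgrna, ref_site))   # reference: not a candidate
print(is_potential_off_target(sgrna, wt_site))    # WT variant: now a candidate
print(is_potential_off_target(sgrna, pam_lost))   # NGG PAM destroyed
```

Because the candidate list is built from the reference genome, sites like `wt_site` would be missed unless the variants of the actual plant line are taken into account when designing the sgRNA.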
The inheritance of Cas9‐edited mutations from T0 to T1 in AP2 Cas9‐edited plants
| AP2 sgRNA1 | Type | s1 (T0, Cas9+) | s1 (T0, Cas9+) | s20 (T1, Cas9−) | s20 (T1, Cas9−) | s23 (T1, Cas9+) | s23 (T1, Cas9+) |
|---|---|---|---|---|---|---|---|
| | | D13 | D12 | D13 | D12 | D13 | D12 |
| | no | 0 | 0 | 0 | 0 | 0 | 0 |
| | i1 | 0 | 0 | 0 | 0 | 0 | 0 |
| | d2 | 18 | 7 | 0 | 3 | 26 | 11 |
| | d2 | 23 | 41 | 54 | 27 | 11 | 31 |
| | | D13 | A12 | D13 | A12 | D13 | A12 |
| | no | 24 | 22 | 32 | 29 | 0 | 27 |
| | i1 | 0 | 0 | 2 | 3 | 0 | 0 |
| | s3 | 0 | 0 | 0 | 0 | 0 | 0 |
| | d1 | 1 | 13 | 1 | 0 | 21 | 0 |
| | d2 | 0 | 0 | 0 | 0 | 0 | 0 |
| | d3 | 1 | 0 | 0 | 1 | 0 | 0 |