Erica McGrath1, Hyunsu Shin1, Linyi Zhang1, Je-Nie Phue2, Wells W Wu2, Rong-Fong Shen2, Yoon-Young Jang3, Javier Revollo4, Zhaohui Ye5.
Abstract
DNA base editors have enabled genome editing without generating DNA double-strand breaks. Applications of this technology have been reported in a variety of animal and plant systems; however, the editing specificity of base editors in human stem cells has not been studied by unbiased genome-wide analysis. Here we investigate the fidelity of cytidine deaminase-mediated base editing in human induced pluripotent stem cells (iPSCs) by whole-genome sequencing after sustained or transient base editor expression. While base-edited iPSC clones without significant off-target modifications are identified, this study also reveals the potential of APOBEC-based base editors to induce unintended point mutations outside of likely in silico-predicted CRISPR-Cas9 off-targets. The majority of the off-target mutations are C:G>T:A transitions or C:G>G:C transversions enriched for the APOBEC mutagenesis signature. These results demonstrate that cytosine base editor-mediated editing may result in unintended genetic modifications with patterns distinct from those of conventional CRISPR-Cas nucleases.
Year: 2019 PMID: 31767844 PMCID: PMC6877639 DOI: 10.1038/s41467-019-13342-8
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 WGS analysis of iPSC clones after long-term cytidine base editor expression reveals increased mutations. a Schematic of the doxycycline-inducible XL-AncBE4max piggyBac construct inserted into the CL1 line. b Diagrammatic representation of the experimental design for clonal expansion and whole-genome sequencing analysis of iPSCs with or without doxycycline-induced 21-day base editor expression. c Mutation numbers in uninduced (ANC-1 and ANC-2) and induced (AN21-1 and AN21-2) iPSC clones. The numbers are total sequence variations, including indels and single-nucleotide variations, relative to the sequence of the parental CL1 iPSCs. d Fraction of each mutation type in uninduced and induced clones. Source data for c and d are provided in the Source Data file.
Summary of whole-genome sequencing of iPSC clones.
| Duration of editor expression | iPSC clone | Editing target | Form of editor expression | Total variations | C:G > T:A | C:G > G:C | C:G > A:T | A:T > C:G | A:T > G:C | A:T > T:A | Indel | Exonic | Non-synᵉ | Non-syn C>T & C>Gᶠ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Long-term | ANC-1 | – | Integrated (−Dox) | 869 | 254 | 63 | 175 | 48 | 206 | 57 | 66 | 8 | 4 | 2 |
| Long-term | ANC-2 | – | Integrated (−Dox) | 847 | 241 | 69 | 169 | 38 | 214 | 43 | 73 | 6 | 4 | 3 |
| Long-term | AN21-1 | – | Integrated (+Dox) | 7896 | 5612 | 718 | 371 | 167 | 817 | 98 | 113 | 105 | 67 | 50 |
| Long-term | AN21-2 | – | Integrated (+Dox) | 4605 | 3597 | 399 | 275 | 40 | 185 | 46 | 63 | 41 | 28 | 28 |
| Transient transfection | N1ᵃ | – | – (Control Plasmid) | 186 | 46 | 6 | 53 | 14 | 20 | 3 | 44 | 2 | 2 | 0 |
| Transient transfection | N2ᵃ | – | – (Control Plasmid) | 171 | 42 | 3 | 66 | 5 | 12 | 4 | 39 | 3 | 2 | 1 |
| Transient transfection | N3ᵃ | – | – (Control Plasmid) | 145 | 39 | 4 | 36 | 7 | 12 | 6 | 41 | 2 | 2 | 1 |
| Transient transfection | HK31ᵇ | HEK3 | Plasmid | 164 | 55 | 14 | 41 | 6 | 16 | 5 | 27 | 1 | 0 | 0 |
| Transient transfection | HK34ᵇ | HEK3 | Plasmid | 2300 | 2086 | 59 | 71 | 10 | 23 | 6 | 45 | 29 | 20 | 19 |
| Transient transfection | HK36ᵇ | HEK3 | Plasmid | 235 | 83 | 33 | 47 | 5 | 20 | 10 | 37 | 3 | 3 | 3 |
| Transient transfection | HK32Mᶜ | HEK3 | RNA | 242 | 96 | 16 | 59 | 9 | 11 | 9 | 39 | 4 | 2 | 1 |
| Transient transfection | HK33Mᶜ | HEK3 | RNA | 836 | 561 | 144 | 75 | 7 | 15 | 5 | 29 | 12 | 7 | 6 |
| Transient transfection | HK34Mᶜ | HEK3 | RNA | 113 | 40 | 5 | 30 | 2 | 7 | 3 | 26 | 0 | 0 | 0 |
| Transient transfection | EX1M | EMX1 | RNA | 272 | 121 | 17 | 66 | 7 | 14 | 12 | 36 | 3 | 1 | 1 |
| Transient transfection | RF23Mᵈ | RNF2 | RNA | 599 | 437 | 41 | 69 | 5 | 9 | 12 | 26 | 5 | 2 | 2 |
| Transient transfection | RF24Mᵈ | RNF2 | RNA | 1813 | 1573 | 96 | 76 | 13 | 15 | 7 | 33 | 22 | 14 | 12 |
Please see Supplementary Data 1 for a complete annotation of all the mutations.
a, b, c, d: Clones with the same letter note were isolated from the same transfection reaction (e.g. HK31, HK34 and HK36 are individual clones from the same transfection).
e: Including non-synonymous and stop-gain mutations that are located in exonic regions.
f: C:G > T:A and C:G > G:C mutations among the non-synonymous and stop-gain mutations that are located in exonic regions.
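The per-type fractions reported for each clone follow directly from the counts in the table above. The following is a minimal sketch of that arithmetic, using the ANC-1 (uninduced) and AN21-1 (induced) rows; the function name `mutation_fractions` and the dictionary layout are illustrative assumptions, not part of the study's analysis pipeline.

```python
# Counts copied from the summary table for one uninduced and one induced clone.
# The dictionary layout and function name are illustrative assumptions.
CLONE_COUNTS = {
    "ANC-1": {
        "C:G>T:A": 254, "C:G>G:C": 63, "C:G>A:T": 175,
        "A:T>C:G": 48, "A:T>G:C": 206, "A:T>T:A": 57, "Indel": 66,
    },
    "AN21-1": {
        "C:G>T:A": 5612, "C:G>G:C": 718, "C:G>A:T": 371,
        "A:T>C:G": 167, "A:T>G:C": 817, "A:T>T:A": 98, "Indel": 113,
    },
}


def mutation_fractions(counts):
    """Return each mutation type's share of the clone's total variations."""
    total = sum(counts.values())
    return {mutation: n / total for mutation, n in counts.items()}


totals = {clone: sum(c.values()) for clone, c in CLONE_COUNTS.items()}
fractions = {clone: mutation_fractions(c) for clone, c in CLONE_COUNTS.items()}

for clone, per_type in fractions.items():
    share = per_type["C:G>T:A"]
    print(f"{clone}: total={totals[clone]}, C:G>T:A fraction={share:.3f}")
```

The per-type counts sum to the "Total variations" column for both rows, so the fractions for each clone add up to 1.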
Fig. 2 Mutation frequencies in iPSC clones base-edited by transient transfection. a Diagrammatic representation of the experimental design for whole-genome sequencing analysis of iPSCs edited at the HEK3, RNF2 or EMX1 locus after transient transfection with either the plasmid or the RNA form of the base editor. Clones transfected with a GFP-only plasmid vector were used as procedure controls for the analysis. b Mutation numbers in control iPSC clones and base-edited iPSC clones after plasmid or RNA transfection. c Fractions of mutations in control and base-edited iPSC clones. Source data for b and c are provided in the Source Data file.
Fig. 3 Characterization of the unintended mutations and identification of a sequence signature related to APOBEC activity. a Fold change of each type of mutation in base-edited iPSC clones as compared to control clones. The fold change is calculated by dividing the number of mutations of one type detected in a clone by the average number of mutations of that type (indicated on the x-axis) detected in the three control clones N1, N2 and N3. Source data are provided in the Source Data file. b Sequence logos of the conserved bases around the C>T mutations in each iPSC clone after inducible base editor expression. The mutated C is shown at position 0. The overall height of each stack indicates the sequence conservation at that position, while the height of the symbols within the stack indicates the relative frequency of each nucleotide at that position[22]. Sequence logos from position −5 to position 25 are shown in Supplementary Fig. 3. c Sequence logos of the conserved bases around the C>T mutations in iPSC clones edited by transient transfection. d Sequence logos of the conserved bases around the C>G mutations in iPSC clones that each have more than 40 detected C:G>G:C transversions.
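The fold-change calculation described for panel a can be sketched as follows, using the C:G > T:A counts from the summary table for the three control clones and the plasmid-transfected HEK3 clones. The function name `fold_change` and the variable names are assumptions made for illustration; the source does not describe the software used.

```python
from statistics import mean

# C:G>T:A counts taken from the summary table.
# Variable and function names are illustrative assumptions.
control_counts = {"N1": 46, "N2": 42, "N3": 39}
edited_counts = {"HK31": 55, "HK34": 2086, "HK36": 83}


def fold_change(count, controls):
    """Divide one clone's count by the mean count across the control clones."""
    return count / mean(controls)


control_values = list(control_counts.values())
folds = {
    clone: fold_change(count, control_values)
    for clone, count in edited_counts.items()
}

for clone, value in folds.items():
    print(f"{clone}: {value:.1f}-fold vs. control mean")
```

With these counts the control mean is 127/3, so HK34 comes out at roughly 49-fold, while HK31 and HK36 stay within about twofold of the controls.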
Fig. 4 The off-target mutations are randomly distributed. a Chromosomal distribution of C:G>T:A and C:G>G:C mutations in iPSC clones that each have >500 total mutations. Error bars represent standard deviation (n = 2). Source data are provided in the Source Data file. b Numbers of overlapping mutations between iPSC clones edited at either the HEK3 or the RNF2 locus. The numbers of mutations include on-target mutation(s). A complete comparison of mutations among all sequenced iPSC clones is shown in Supplementary Table 4.
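Counting overlapping mutations between clones, as in panel b, amounts to intersecting per-clone sets of variant calls. The sketch below shows one way to do this. The clone names, coordinates and alleles are made-up placeholders, not data from the study, and the variant key (chromosome, position, reference, alternate) is an assumed convention.

```python
from itertools import combinations

# Placeholder variant calls keyed by (chromosome, position, ref, alt).
# These clone names and coordinates are hypothetical, not from the study.
calls = {
    "cloneA": {("chr1", 1000, "C", "T"), ("chr2", 500, "C", "T"), ("chr9", 42, "C", "G")},
    "cloneB": {("chr1", 1000, "C", "T"), ("chr5", 77, "G", "A")},
    "cloneC": {("chr1", 1000, "C", "T"), ("chr9", 42, "C", "G")},
}


def pairwise_overlaps(variant_sets):
    """Count variants shared by each pair of clones via set intersection."""
    return {
        (a, b): len(variant_sets[a] & variant_sets[b])
        for a, b in combinations(sorted(variant_sets), 2)
    }


overlaps = pairwise_overlaps(calls)
for (a, b), shared in overlaps.items():
    print(f"{a} vs {b}: {shared} shared variant(s)")
```

In this toy input every pair shares the variant at chr1:1000, which plays the role of an on-target edit. For clones edited at the same locus, an overlap count at or near the number of on-target edits is what randomly distributed off-target mutations would produce.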