Yonggang Zhang, Gustavo Arango, Fang Li, Xiao Xiao, Raj Putatunda, Jun Yu, Xiao-Feng Yang, Hong Wang, Layne T Watson, Liqing Zhang, Wenhui Hu.
Abstract
BACKGROUND: CRISPR/Cas9 (epi)genome editing has revolutionized the field of gene and cell therapy. Our previous study demonstrated that rapid and robust reactivation of the HIV latent reservoir by a catalytically deficient Cas9 (dCas9)-synergistic activation mediator (SAM) via HIV long terminal repeat (LTR)-specific MS2-mediated single guide RNAs (msgRNAs) directly induces cellular suicide without additional immunotherapy. However, potential off-target effects remain a concern for any clinical application of Cas9 genome editing and dCas9 epigenome editing. After dCas9 treatment, potential off-target responses have been analyzed through different strategies such as mRNA sequence analysis and functional screening. In this study, a comprehensive analysis of the host transcriptome, including mRNA, lncRNA, and alternative splicing, was performed using human cell lines expressing dCas9-SAM and HIV-targeting msgRNAs.
Keywords: CRISPR; Genome editing; HIV; Latency; Off-target; RNA sequencing; Shock and kill; Transcriptome
Year: 2018 PMID: 30200981 PMCID: PMC6131778 DOI: 10.1186/s12920-018-0394-2
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 No difference in the entire RNA transcripts among the three experimental conditions. a Diagram showing the HIV proviral activation by the dCas9-SAM system with msgRNAs targeting LTR_L or LTR_O. b Box plot and density plot for the distribution of transcript expression levels measured by FPKM (averaged within replicates) of the three conditions. The plotted region of the box plot represents the maximum, upper quartile, median, lower quartile, and minimum, respectively, from top to bottom. c Hierarchical clustering of samples based on Pearson correlation coefficient of transcript expression levels for all the pairwise comparisons of the samples
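The sample clustering in panel c rests on pairwise Pearson correlation of transcript expression levels. The following is a minimal sketch of that correlation step only, using the Python standard library. The function names and the toy FPKM values are hypothetical and are not taken from the study, and the clustering itself is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(samples):
    """Map every ordered pair of sample names to its Pearson correlation."""
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}


# Hypothetical FPKM values for three transcripts (illustration only).
toy_fpkm = {
    "LTR_Zero": [10.0, 200.0, 35.0],
    "LTR_L": [11.0, 190.0, 36.0],
    "LTR_O": [9.0, 210.0, 30.0],
}
matrix = correlation_matrix(toy_fpkm)
```

A distance such as 1 minus the correlation could then be passed to any hierarchical clustering routine; which linkage method the authors used is not stated here.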
Distribution of mapped reads in different categories of RNAs in the six samples
| Sample_name | LTR_Zer1 | LTR_Zer2 | LTR_L1 | LTR_L2 | LTR_O1 | LTR_O2 |
|---|---|---|---|---|---|---|
| 3prime_overlapping_ncrna | 159 (0.00%) | 180 (0.00%) | 160 (0.00%) | 171 (0.00%) | 180 (0.00%) | 160 (0.00%) |
| IG_C_gene | 0 (0.00%) | 2 (0.00%) | 1 (0.00%) | 0 (0.00%) | 3 (0.00%) | 1 (0.00%) |
| IG_C_pseudogene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| IG_D_gene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| IG_J_gene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| IG_J_pseudogene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| IG_V_gene | 4 (0.00%) | 2 (0.00%) | 1 (0.00%) | 3 (0.00%) | 1 (0.00%) | 4 (0.00%) |
| IG_V_pseudogene | 0 (0.00%) | 0 (0.00%) | 3 (0.00%) | 1 (0.00%) | 0 (0.00%) | 1 (0.00%) |
| Mt_rRNA | 1318 (0.00%) | 1488 (0.00%) | 1734 (0.00%) | 1496 (0.00%) | 1342 (0.00%) | 1779 (0.01%) |
| Mt_tRNA | 644 (0.00%) | 637 (0.00%) | 692 (0.00%) | 784 (0.00%) | 603 (0.00%) | 668 (0.00%) |
| TEC | 5415 (0.01%) | 5636 (0.01%) | 5036 (0.01%) | 5747 (0.01%) | 5685 (0.02%) | 5276 (0.02%) |
| TR_C_gene | 32 (0.00%) | 21 (0.00%) | 23 (0.00%) | 29 (0.00%) | 36 (0.00%) | 26 (0.00%) |
| TR_D_gene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| TR_J_gene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| TR_J_pseudogene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| TR_V_gene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 1 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| TR_V_pseudogene | 0 (0.00%) | 0 (0.00%) | 2 (0.00%) | 2 (0.00%) | 0 (0.00%) | 2 (0.00%) |
| antisense | 191,551 (0.52%) | 204,790 (0.53%) | 178,062 (0.50%) | 205,243 (0.50%) | 201,319 (0.54%) | 189,642 (0.55%) |
| known_ncrna | 0 (0.00%) | 0 (0.00%) | 1 (0.00%) | 0 (0.00%) | 1 (0.00%) | 1 (0.00%) |
| lincRNA | 738,731 (2.00%) | 761,611 (1.97%) | 706,213 (1.98%) | 702,871 (1.71%) | 742,207 (1.99%) | 702,377 (2.03%) |
| miRNA | 2479 (0.01%) | 2557 (0.01%) | 3497 (0.01%) | 3299 (0.01%) | 1525 (0.00%) | 1430 (0.00%) |
| misc_RNA | 1,612,667 (4.37%) | 1,593,547 (4.12%) | 1,627,962 (4.57%) | 1,960,500 (4.76%) | 1,343,420 (3.59%) | 1,244,791 (3.59%) |
| non_coding | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| polymorphic_pseudogene | 319 (0.00%) | 355 (0.00%) | 320 (0.00%) | 369 (0.00%) | 333 (0.00%) | 312 (0.00%) |
| processed_pseudogene | 10,437 (0.03%) | 10,705 (0.03%) | 9812 (0.03%) | 11,241 (0.03%) | 10,275 (0.03%) | 8946 (0.03%) |
| processed_transcript | 196,988 (0.53%) | 213,355 (0.55%) | 194,373 (0.55%) | 229,395 (0.56%) | 203,313 (0.54%) | 192,191 (0.55%) |
| protein_coding | 32,728,357 (88.74%) | 34,372,393 (88.92%) | 31,562,319 (88.65%) | 36,554,719 (88.74%) | 33,423,746 (89.42%) | 30,949,051 (89.36%) |
| pseudogene | 147 (0.00%) | 166 (0.00%) | 145 (0.00%) | 153 (0.00%) | 172 (0.00%) | 167 (0.00%) |
| rRNA | 32 (0.00%) | 44 (0.00%) | 38 (0.00%) | 62 (0.00%) | 46 (0.00%) | 44 (0.00%) |
| sense_intronic | 1015 (0.00%) | 1032 (0.00%) | 1070 (0.00%) | 1071 (0.00%) | 1035 (0.00%) | 1021 (0.00%) |
| sense_overlapping | 14,067 (0.04%) | 15,375 (0.04%) | 12,765 (0.04%) | 14,949 (0.04%) | 15,117 (0.04%) | 14,014 (0.04%) |
| snRNA | 3417 (0.01%) | 3214 (0.01%) | 2915 (0.01%) | 3858 (0.01%) | 3149 (0.01%) | 3236 (0.01%) |
| snoRNA | 160 (0.00%) | 172 (0.00%) | 136 (0.00%) | 149 (0.00%) | 197 (0.00%) | 170 (0.00%) |
| transcribed_processed_pseudogene | 25,420 (0.07%) | 26,038 (0.07%) | 24,532 (0.07%) | 28,196 (0.07%) | 25,315 (0.07%) | 23,318 (0.07%) |
| transcribed_unitary_pseudogene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| transcribed_unprocessed_pseudogene | 72,052 (0.20%) | 77,671 (0.20%) | 69,487 (0.20%) | 79,027 (0.19%) | 78,124 (0.21%) | 73,283 (0.21%) |
| translated_processed_pseudogene | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| translated_unprocessed_pseudogene | 1 (0.00%) | 0 (0.00%) | 0 (0.00%) | 1 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| unitary_pseudogene | 7892 (0.02%) | 8444 (0.02%) | 7147 (0.02%) | 8507 (0.02%) | 8233 (0.02%) | 7525 (0.02%) |
| unprocessed_pseudogene | 12,070 (0.03%) | 12,228 (0.03%) | 11,432 (0.03%) | 12,660 (0.03%) | 12,177 (0.03%) | 11,856 (0.03%) |
| Others | 1,257,539 (3.41%) | 1,342,963 (3.47%) | 1,185,376 (3.33%) | 1,370,001 (3.33%) | 1,299,971 (3.48%) | 1,201,876 (3.47%) |
Fig. 2 No difference in the lncRNAs among the three experimental conditions. a Predicted lncRNAs based on four coding potential filtering methods. CPC, Coding-Potential Calculator; PFAM, Protein FAMily analysis; PhyloCSF, Phylogenetic Codon Substitution Frequency; CNCI, Coding-Non-Coding Index. b Expression level distribution of the 839 lncRNAs in the six samples (FPKM values are z-score normalized)
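Panel b reports z-score normalized FPKM values. As a rough sketch of what that normalization does for one lncRNA across samples, assuming the population standard deviation is used (the caption does not say which variant was applied), with hypothetical function and variable names:

```python
import statistics


def z_scores(values):
    """Center values on their mean and scale by the population standard deviation.

    Assumption: population SD (pstdev); the source does not specify the variant.
    A constant row has no spread, so it is returned as all zeros.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical FPKM values for one lncRNA in six samples (illustration only).
row = [1.2, 1.4, 0.9, 1.1, 1.3, 1.0]
normalized = z_scores(row)
```

After this step each lncRNA has mean 0 across samples, so a heat map compares relative rather than absolute expression.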
21 genes that are significantly downregulated in the O samples as compared to the Zero and L samples
| Genes | LTR_O_FPKM | LTR_Zero_FPKM | log2(fold), O vs. Zero | LTR_L_FPKM | log2(fold), O vs. L |
|---|---|---|---|---|---|
| HNRNPAB | 6.55 | 38.66 | −2.56 | 43.94 | −2.75 |
| PTP4A2 | 3.62 | 20.44 | −2.50 | 23.02 | −2.67 |
| B4GALT2 | 2.07 | 6.02 | −1.54 | 6.07 | −1.55 |
| C4orf48 | 4.99 | 11.81 | −1.24 | 11.15 | −1.16 |
| TPGS1 | 3.36 | 7.16 | −1.09 | 8.80 | −1.39 |
| HPCAL1 | 4.01 | 8.33 | −1.05 | 8.39 | −1.06 |
| SLBP | 10.55 | 20.53 | −0.96 | 20.64 | −0.97 |
| CITED4 | 3.56 | 6.79 | −0.93 | 7.66 | −1.11 |
| HIST2H2BF | 97.90 | 175.32 | −0.84 | 176.52 | −0.85 |
| TMEM160 | 8.40 | 14.67 | −0.80 | 16.45 | −0.97 |
| HIST2H2AC | 444.08 | 750.59 | −0.76 | 850.05 | −0.94 |
| C17orf89 | 57.82 | 95.66 | −0.73 | 109.36 | −0.92 |
| IER5L | 6.61 | 10.87 | −0.72 | 13.23 | −1.00 |
| CEBPD | 16.45 | 26.54 | −0.69 | 29.57 | −0.85 |
| HIST2H3D | 248.87 | 400.31 | −0.69 | 425.10 | −0.77 |
| HIST1H2AB | 336.25 | 536.04 | −0.67 | 587.46 | −0.80 |
| HIST1H2AM | 468.32 | 743.47 | −0.67 | 809.48 | −0.79 |
| MIF | 128.97 | 200.45 | −0.64 | 247.69 | −0.94 |
| HIST1H4J | 1102.07 | 1656.42 | −0.59 | 1819.34 | −0.72 |
| CYBA | 179.61 | 268.40 | −0.58 | 308.07 | −0.78 |
| HIST1H2AD | 722.53 | 1057.74 | −0.55 | 1177.36 | −0.70 |
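The log2(fold) columns in the table above are consistent with the base-2 logarithm of the LTR_O FPKM divided by the FPKM of the comparison condition. A minimal sketch of that arithmetic, checked against the HNRNPAB row; the function name is hypothetical, and the differential expression tool the authors used may apply additional steps that are not reproduced here:

```python
import math


def log2_fold(treated_fpkm, reference_fpkm):
    """Return log2(treated / reference); both FPKM values must be positive."""
    if treated_fpkm <= 0 or reference_fpkm <= 0:
        raise ValueError("FPKM values must be positive to take a log ratio")
    return math.log2(treated_fpkm / reference_fpkm)


# HNRNPAB row from the table: LTR_O = 6.55, LTR_Zero = 38.66, LTR_L = 43.94.
o_vs_zero = round(log2_fold(6.55, 38.66), 2)  # table reports -2.56
o_vs_l = round(log2_fold(6.55, 43.94), 2)     # table reports -2.75
```

A value of −1 corresponds to a halving of expression, so the largest changes in this table (about −2.5) are roughly a six-fold reduction.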
Fig. 3 Hierarchical clustering of the six samples based on the Jaccard index for SNPs (a) and indels (b)
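The Jaccard index used for this clustering is the size of the intersection of two sets divided by the size of their union. A minimal sketch for two sets of variant calls follows; representing each variant as a (chromosome, position, ref, alt) tuple is an assumption made for illustration, and the variants shown are invented:

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical variant calls for two samples (illustration only).
sample_1 = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr3", 42, "G", "A")}
sample_2 = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr5", 77, "T", "C")}
similarity = jaccard(sample_1, sample_2)  # 2 shared out of 4 distinct variants
```

Using 1 minus the index as a distance gives a matrix that can be fed to a hierarchical clustering routine.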
Fig. 4 Summary statistics of the 12 types of alternative splicing in the six samples. The number of events for each type is log10 transformed
Comparison of differential isoform regulation between the three groups. The genes in bold font are those shared by two pairwise comparisons. The numbers in parentheses are the numbers of events considered for each group comparison
| AS types | L vs. Zero | O vs. Zero | L vs. O |
|---|---|---|---|
| A3SS (# of events) | (7244) | (7143) | (7239) |
| | C8orf22 | ANKRD11 | BMP1 |
| | CLSPN | C11orf48 | NFAT5 |
| | COBLL1 | CNOT2 | |
| | DAK | GNB2L1 | ORMDL1 |
| | JOSD1 | SETMAR | ST5 |
| | | YWHAB | ZNF84 |
| | PIH1D1 | ZNF587 | |
| A5SS (# of events) | (5399) | (5350) | (5407) |
| | C17orf70 | ANGPT1 | HILPDA |
| | MTMR2 | CLEC2D | NBPF11 |
| | NOC2L | LAMA4 | NDUFV2 |
| | RP5-1198O20.4 | NAA60 | NT5C |
| | SMARCC2 | SLC50A1 | OXLD1 |
| | TWF1 | | RP4-583P15.15 |
| | ZNF30 | SRRM1 | |
| | | TBC1D7 | |
| | | | |
| | | VPS52 | |
| | | chr1:32336239:32335947 | |
| MXE (# of events) | (4959) | (4946) | (5006) |
| | | | AKIRIN1 |
| | DBNL | EIF4G2 | DDHD2 |
| | DPY30 | HMGN1 | DEK |
| | MPPE1 | PTRH1 | DPP3 |
| | PLA2G6 | | ELMOD3 |
| | SPDL1 | TMBIM4 | MPV17L |
| | TMBIM6 | | |
| | UQCC1 | chr7:143284899:143284974:+@chr7:143285348 | TCTN1 |
| | WBP1 | | |
| RI (# of events) | (4109) | (4057) | (4084) |
| | | | |
| | MRRF | CAPRIN2 | FANCI |
| | RP11-5A19.5 | CDK5RAP3 | HSD17B4 |
| | RPRD2 | | MTA1 |
| | SMTN | GPS2 | TAB3 |
| | | IMPDH2 | |
| | | QARS | |
| | | SERAC1 | |
| SE (# of events) | (25,942) | (25,835) | (25,969) |
| | C2CD5 | | |
| | CDC42BPA | AGPAT2 | AC124789.1 |
| | DCTD | | ARID1B |
| | DSC3 | | ATG10 |
| | GRB10 | | |
| | | CENPU | |
| | KCTD17 | CMTR2 | BBS1 |
| | LINC00570 | GABPB2 | |
| | MIPOL1 | | BTBD7 |
| | | IMMP1L | CD320 |
| | NCSTN | KDM6A | CD59 |
| | NUMB | KLHL5 | DCTD |
| | PDE4DIP | | LINC00472 |
| | PXK | | |
| | RAB40B | | MIR4435-1HG |
| | SCMH1 | SETD8 | |
| | SPATA20 | SMURF2P1 | RHBDD1 |
| | | | RP4-717I23.3 |
| | TTC23 | | RPS6KB2 |
| | ZNF138 | TINF2 | |
| | ZSCAN21 | | ST20-MTHFS |
| | | TMEM189 | |
| | | | |
| | | TRIP6 | UBE2I |
| | | UBE2I | YDJC |
| | | VWA9 | ZNF639 |
| | | ZNF584 | ZNF678 |
| | | chr7:143284899:143284974 | |
Fig. 5 Sashimi plot showing exon skipping in TOPORS, which exhibits significant differential regulation between the LTR_O group and the control group. The top left panel shows the FPKM of reads that support the corresponding exons and exon junctions in the two LTR_O samples and the two control samples, respectively. The top right panel shows the posterior distribution of Ψ (the fraction of the inclusive isoform), with the red line denoting the estimated Ψ and the grey lines the 95% confidence interval of Ψ. The bottom panel shows the two transcripts that arise from exon skipping, with the exon skipped in the bottom transcript
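Ψ is the fraction of transcripts that include the exon. The figure reports a posterior distribution for Ψ, which the sketch below does not reproduce; it shows only a naive point estimate from read counts, under the simplifying assumption that reads supporting inclusion and reads supporting skipping are directly comparable. The function name and the counts are hypothetical:

```python
def naive_psi(inclusion_reads, exclusion_reads):
    """Naive point estimate of Ψ: inclusion reads over all informative reads.

    This is a simplification for illustration, not the Bayesian estimate
    behind the posterior distribution described in the figure caption.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads; Ψ is undefined")
    return inclusion_reads / total


# Hypothetical counts: 30 reads support inclusion, 10 support skipping.
psi = naive_psi(30, 10)
```

A shift in Ψ between conditions, rather than a change in total expression, is what marks an event as differentially spliced.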