Metagenomic discovery of novel CRISPR-Cas13 systems.

Yanping Hu, Yangcan Chen, Jing Xu, Xinge Wang, Shengqiu Luo, Bangwei Mao, Qi Zhou, Wei Li.

Year:  2022        PMID: 36220832      PMCID: PMC9554183          DOI: 10.1038/s41421-022-00464-5

Source DB:  PubMed          Journal:  Cell Discov        ISSN: 2056-5968            Impact factor:   38.079


Dear Editor, CRISPR-Cas systems are crucial adaptive immune components of microbial resistance against the invasion of mobile genetic elements (MGEs) and serve as the core of current cutting-edge genome engineering technologies[1]. Unlike the widely used Cas9 and Cas12 DNA-editing tools, Cas13 is an RNA-guided, programmable, RNA-targeting single-effector system that enables gene manipulation at the transcriptional level[2]. At present, only four subtypes of Cas13 have been identified[1,3,4]. An expanded catalog of CRISPR-Cas13 systems can provide phylogenetic insights and may offer opportunities for the development of novel RNA-editing tools. By mining bulk metagenomic data (> 10 TB) from various environments, we identified hundreds of orthologs of known and novel Cas13 systems, the latter of which could be classified into five novel subtypes based on protein sequence similarity. Notably, the novel Cas13 systems discovered in this study can be developed into efficient RNA editors and expand the RNA-editing toolbox.
We developed a computational pipeline for the de novo identification of novel Cas13 proteins (Fig. 1a). Initially, putative CRISPR arrays were identified from the sequencing data. Then, 20 kb regions of DNA flanking the CRISPR arrays were extracted for predicting open reading frames (ORFs). Proteins with more than 400 residues were selected for further analyses. A Cas13 library consisting of profile hidden Markov models (HMMs) of all known Cas13a, Cas13b, Cas13c, and Cas13d protein sequences in the NCBI database was subsequently constructed[5]. We reasoned that this library, which includes the features of all known Cas13 proteins, could maximize the possibility of discovering potential novel Cas13 proteins among uncharacterized protein sequences. Identified proteins that lacked two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains were assumed to be incomplete and less likely to be active[6] and were therefore not selected for further analyses. Novel Cas13 proteins were further defined based on the results of phylogenetic analyses[1,4].
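The filtering steps described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the use of a plain regular expression for the R-X4-H HEPN motif in place of a profile-HMM search, and the toy sequences are all assumptions made for the example.

```python
import re

# Assumption: the HEPN catalytic motif is approximated by R-X4-H
# (arginine, any four residues, histidine). The study used profile HMMs;
# the regex is only an illustrative stand-in. The lookahead lets
# overlapping motif occurrences be counted.
HEPN_MOTIF = re.compile(r"(?=(R.{4}H))")

MIN_LENGTH = 400     # the text keeps proteins with more than 400 residues
MIN_HEPN_COUNT = 2   # proteins lacking two HEPN domains were discarded


def count_hepn_motifs(protein: str) -> int:
    """Count (possibly overlapping) R-X4-H matches in a protein sequence."""
    return len(HEPN_MOTIF.findall(protein.upper()))


def passes_cas13_filters(protein: str) -> bool:
    """Apply the length filter and the two-HEPN filter from the text."""
    long_enough = len(protein) > MIN_LENGTH
    has_two_hepn = count_hepn_motifs(protein) >= MIN_HEPN_COUNT
    return long_enough and has_two_hepn


# Hypothetical toy sequences, not real Cas13 proteins.
TOY_CANDIDATE = "M" + "A" * 200 + "RNAAAH" + "A" * 200 + "RNGGGH" + "A" * 50
TOY_SINGLE_HEPN = "A" * 450 + "RNAAAH"
```

In a real pipeline the motif check would be replaced by the HMM-based domain search; the sketch only shows how the two filters combine.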
Fig. 1

Identification of CRISPR-Cas13 system and RNA-degrading capacity of the novel Cas13 protein.

a Description of the bioinformatics pipeline used for identifying novel Cas13 families. b Overlap of novel Cas13 proteins with proteins of previously discovered Cas13 subtypes, which are known or unannotated in the NCBI and metagenomic databases. c Distribution of the number of novel Cas13 subtypes. d Dendrogram constructed using the unweighted pair group method with arithmetic mean (UPGMA) algorithm. From top to bottom, n = 502, 51, 411, 7, 8, 27, 22, 8, 14, 18, and 925. e Maximum likelihood tree of the Cas13 families. f Distribution of the size of the proteins in the different Cas13 subtypes. From left to right, n = 151, 281, 21, 52, 18, 8, 7, 22, 14, 8, and 27. g Schematic diagram depicting the size of the Cas13 proteins and the position of the HEPN motif. h Comparison of the gene knockdown efficiency using crRNAs with 3′- or 5′-DR sequences (means ± s.d.; n = 3 biological replicates; Student’s t-test, ns not significant, ***P < 0.001, ****P < 0.0001). Normalized MFI, mean fluorescence intensity (MFI) relative to the non-targeting condition. i Measurement of ANXA4 mRNA knockdown mediated by the novel Cas13 systems in HEK293T cells using quantitative reverse transcription PCR (RT-qPCR) (means ± s.d.; n = 3 biological replicates). Normalized expression, relative expression was normalized to the non-targeting condition. NT non-targeting crRNA. j Evaluation of the cleavage activity of Cas13 proteins by targeting the mRNAs of endogenous genes using RT-qPCR in HEK293T cells (means ± s.d.; n = 3 biological replicates). Normalized expression, relative expression was normalized to the non-targeting condition. NT non-targeting crRNA. k Comparison of the endogenous gene knockdown activity of Cas13X.1, RfxCas13d, and Cas13e3 (means ± s.d.; n = 15 for each protein; paired t-test, ns not significant). Normalized expression, relative expression was normalized to the non-targeting condition.

To test the accuracy of the bioinformatics pipeline, we first used it to search for previously discovered Cas13 proteins in the NCBI prokaryotic database. As expected, the pipeline successfully identified the previously discovered Cas13 systems (Supplementary Fig. S1a). We next searched for novel Cas13 proteins in metagenomic databases containing data from host-associated, aquatic, and soil environments (Supplementary Tables S1–S6). Compared with the commonly accessible data in the NCBI database, the metagenomic data enabled the identification of a greater number of novel Cas13 proteins (Fig. 1b, c and Supplementary Fig. S1b–d), demonstrating the importance of metagenomic mining. The protein sequences were next subjected to phylogenetic analyses, which showed that the novel Cas13 proteins clustered into seven new branches distinct from those of the known subtypes (Fig. 1d)[1]. Two of these seven clades, Cas13Bt-A and Cas13Bt-B, with an average sequence length of ~800 amino acids (aa), clustered with the previously discovered Cas13X, Cas13Y, and Cas13bt proteins (Fig. 1e)[3,4]. The remaining five clades were designated Cas13e to Cas13i, with protein lengths ranging from 740 to 1300 aa (Fig. 1d–f). Specifically, the average lengths of the Cas13e and Cas13f–i proteins were ~800 and ~1100 aa, respectively (Fig. 1f). Analysis of the CRISPR loci revealed that, with the exception of Cas13g, all the novel Cas13 subtypes lacked conserved adaptation genes, including Cas1 and Cas2 (Supplementary Fig. S1e). We next analyzed the features of the CRISPR arrays. As in other CRISPR-Cas13 systems[7,8], the average lengths of the spacers and direct repeats (DRs) of the novel Cas13 systems are 30 and 36 nt, respectively (Supplementary Fig. S2a, b).
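To make the array statistics concrete, the sketch below splits a CRISPR array into spacers by cutting at each copy of the direct repeat and reports the mean spacer length. It is a simplified illustration: it assumes perfectly identical repeats, and the function names and toy sequences are hypothetical rather than taken from the study, which would have relied on a dedicated array-detection tool.

```python
from statistics import fmean


def split_crispr_array(array_seq: str, direct_repeat: str) -> list[str]:
    """Return the spacers lying between consecutive copies of the repeat.

    Assumption: repeats are exact copies; real arrays often carry
    degenerate repeats that this simple split would miss.
    """
    pieces = array_seq.split(direct_repeat)
    # The first and last pieces are flanks outside the array, not spacers.
    return [piece for piece in pieces[1:-1] if piece]


def summarize_array(array_seq: str, direct_repeat: str) -> dict:
    """Report spacer count, mean spacer length, and repeat length."""
    spacers = split_crispr_array(array_seq, direct_repeat)
    mean_len = fmean([len(s) for s in spacers]) if spacers else 0.0
    return {
        "n_spacers": len(spacers),
        "mean_spacer_len": mean_len,
        "repeat_len": len(direct_repeat),
    }


# Hypothetical toy array: a 36-nt repeat separating two 30-nt spacers.
TOY_DR = "ACGT" * 9
TOY_ARRAY = TOY_DR + "TTGCA" * 6 + TOY_DR + "GGATC" * 6 + TOY_DR
```

Calling `summarize_array(TOY_ARRAY, TOY_DR)` on the toy input gives two spacers with a mean length of 30 nt and a 36-nt repeat, matching the averages quoted in the text.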
Multiple sequence alignment of the DRs revealed that they are highly conserved and have similar predicted RNA secondary structures, resembling the DR sequences of the previously identified Cas13a, Cas13b, and Cas13d systems (Supplementary Fig. S2a, b)[6,9,10]. Using the spacer sequences as queries, we searched the IMG/VR, Genbank-Phage, and Ref-Plasmid databases for potential targets of the natural crRNAs from the CRISPR loci. Positive hits indicated that these CRISPR-Cas13 systems could be active in their hosts and defend against foreign MGEs. We next analyzed the sequence features of the novel Cas13 proteins. Multiple sequence alignment of the novel proteins of each subtype revealed that their HEPN domains were highly conserved, and that the RNXXXH motif was the most conserved, accounting for ~74% of all RXXXXH motifs (Supplementary Fig. S3a, b). We observed that only Cas13a, Cas13c, and Cas13d possessed an elongated N-terminal domain (NTD) (Fig. 1g)[6,9-11]. Existing Cas13 structures revealed that Cas13a and Cas13d contain an NTD, which is the least conserved region in Cas13 proteins and forms a binding channel for the DR region[9,11]; this domain is absent in Cas13b[10]. Interestingly, while the Cas13a, Cas13c, and Cas13d systems use mature crRNAs with a 5′-DR sequence for effective RNA interference, the mature crRNAs of the Cas13b, Cas13Bt-A, and Cas13Bt-B systems contain a 3′-DR sequence[3,4,7,8,12]. Based on this consistency, we asked whether the presence of an NTD in the novel CRISPR-Cas13 systems could be used to predict the optimal crRNA configuration for effective RNA interference. Following this hypothesis, we deduced that the novel Cas13e, Cas13f, Cas13g, Cas13h, and Cas13i systems would cleave RNA more efficiently with a 3′-DR sequence in the crRNA (Supplementary Fig. S4a).
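The NTD-based rule of thumb can be written down as a small lookup. The sketch below is illustrative only: the table simply restates the subtypes named in the text (with the novel subtypes entered according to the deduction above), while the function names and the toy spacer and repeat strings are hypothetical.

```python
# Presence of an elongated N-terminal domain (NTD), as stated in the text.
# Cas13e to Cas13i are entered as lacking an NTD, following the deduction
# that they should use a 3'-DR crRNA.
HAS_NTD = {
    "Cas13a": True,
    "Cas13c": True,
    "Cas13d": True,
    "Cas13b": False,
    "Cas13Bt-A": False,
    "Cas13Bt-B": False,
    "Cas13e": False,
    "Cas13f": False,
    "Cas13g": False,
    "Cas13h": False,
    "Cas13i": False,
}


def predicted_dr_side(subtype: str) -> str:
    """Heuristic: NTD-bearing effectors use a 5'-DR, the others a 3'-DR."""
    return "5prime" if HAS_NTD[subtype] else "3prime"


def build_crrna(spacer: str, direct_repeat: str, subtype: str) -> str:
    """Concatenate repeat and spacer in the predicted orientation (5' to 3')."""
    if predicted_dr_side(subtype) == "5prime":
        return direct_repeat + spacer
    return spacer + direct_repeat


# Hypothetical placeholder sequences, for illustration only.
TOY_SPACER = "GGGAAACCC"
TOY_DR = "UUUU"
```

For example, `build_crrna(TOY_SPACER, TOY_DR, "Cas13d")` places the repeat first, whereas the same call with `"Cas13e"` places it last.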
We employed a mammalian cell-based mCherry disruption system to evaluate the RNA cleavage activity of the novel Cas13 proteins (Supplementary Fig. S4b). Consistent with our speculation, crRNAs with 3′-DR sequences, but not 5′-DR sequences, enabled effective disruption of the mCherry mRNA by Cas13e1, Cas13f1, Cas13g1, Cas13h1, and Cas13i1 (Fig. 1h and Supplementary Fig. S4a). Collectively, the absence of an NTD in a Cas13 protein is reflected in the structure of its cognate mature crRNA, and this rule can guide the use of novel Cas13 proteins with little known structural information for RNA editing in mammalian cells.
The clinical application of Cas13-based RNA-editing systems remains challenging, partly because the large size of RNA editors exceeds the packaging capacity of adeno-associated virus (AAV) vectors. The hypercompact Cas13 proteins discovered in this study can help overcome this limitation. To identify novel small Cas13 systems with high activity, the mCherry reporter system was used for initial screening (Supplementary Fig. S4b, c). The mCherry signals were substantially reduced by the novel Cas13 proteins (Supplementary Fig. S4d). To verify the endogenous gene knockdown activity of the novel Cas13 proteins, two sites in the ANXA4 gene were selected for testing in HEK293T cells (Fig. 1i). Of note, Cas13e3 achieved the highest knockdown activity among all the Cas13 systems screened herein, with efficiencies of 92% and 84% at the two sites (Fig. 1i). We therefore selected Cas13e3 for further characterization owing to its high efficiency and ultrasmall protein size (767 aa). To investigate the optimal spacer length for Cas13e3 editing, we tested two different crRNAs targeting the mCherry mRNA, with spacer lengths ranging from 5 to 50 nt (Supplementary Fig. S5a).
Cas13e3 achieved the highest average knockdown efficiency at the two selected sites with the 27-nt spacer (Supplementary Fig. S5a), which was therefore used for subsequent experiments. We next sought to determine the protospacer flanking sequence (PFS) requirement of Cas13e3. The screening results revealed no PFS preference for Cas13e3 (Supplementary Fig. S5b). In addition, fusion with a nuclear localization signal (NLS) increased the knockdown efficiency of Cas13e3 (Supplementary Fig. S5c). To investigate the knockdown efficiency of Cas13e3 on a larger scale, we tested a total of 15 crRNAs targeting five genes (Fig. 1j). We also tested the activities of Cas13X.1 and RfxCas13d at the same sites for comparison. Notably, Cas13e3 exhibited robust knockdown activity comparable to that of Cas13X.1 and RfxCas13d (Fig. 1j, k). We further investigated the specificity of Cas13e3 on a transcriptome-wide scale. A total of 102, 323, and 133 differentially expressed genes were detected using RNA-Seq for the Cas13X.1, RfxCas13d, and Cas13e3 systems, respectively, indicating that the specificity of Cas13e3 was comparable to that of Cas13X.1, while its off-target effect was lower than that of RfxCas13d (Supplementary Fig. S6a). To compare the collateral RNA cleavage activities of Cas13X.1, RfxCas13d, and Cas13e3, the fluorescence intensity of enhanced green fluorescent protein (EGFP) was measured when targeting the mCherry mRNA[13]. Notably, collateral RNA cleavage activity was not detected for Cas13X.1 or Cas13e3, while RfxCas13d showed pronounced trans-cleavage activity on EGFP transcripts (Supplementary Fig. S6b). In conclusion, these results demonstrated that Cas13e3 is a hypercompact RNA editor with high interference efficiency and low collateral activity.
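The knockdown efficiencies reported above relate to the normalized expression values in the figure by a simple conversion, sketched here. This assumes that relative expression has already been derived from the RT-qPCR data; the function names and the replicate values in the usage note are hypothetical.

```python
from statistics import fmean


def normalized_expression(targeting: list[float], non_targeting: list[float]) -> float:
    """Mean relative expression with a targeting crRNA, divided by the mean
    relative expression in the non-targeting (NT) condition."""
    nt_mean = fmean(non_targeting)
    if nt_mean == 0:
        raise ValueError("non-targeting expression must be non-zero")
    return fmean(targeting) / nt_mean


def knockdown_percent(targeting: list[float], non_targeting: list[float]) -> float:
    """Knockdown efficiency in percent: 100 * (1 - normalized expression)."""
    return 100.0 * (1.0 - normalized_expression(targeting, non_targeting))
```

With three hypothetical replicates of 0.08 against a non-targeting control of 1.0, `knockdown_percent` returns approximately 92, which is how a normalized expression of 0.08 corresponds to a 92% knockdown.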
In this study, we developed a computational pipeline to sensitively discover novel CRISPR systems by constructing a library containing comprehensive Cas13 features. By mining metagenomic data, we identified five novel Cas13 clades. We verified that several of the novel Cas13 families have RNA-degrading capacity in mammalian cells. Importantly, the novel Cas13e3 protein discovered in this study is an ultracompact Cas13 protein and can be developed into an efficient transcriptome editor in mammalian cells. The novel systems identified in this study substantially increase the diversity of CRISPR-Cas13 systems and expand the programmable RNA-editing toolbox.
  12 in total

1.  Two Distant Catalytic Sites Are Responsible for C2c2 RNase Activities.

Authors:  Liang Liu; Xueyan Li; Jiuyu Wang; Min Wang; Peng Chen; Maolu Yin; Jiazhi Li; Gang Sheng; Yanli Wang
Journal:  Cell       Date:  2017-01-12       Impact factor: 41.582

2.  Structural insights into Cas13b-guided CRISPR RNA maturation and recognition.

Authors:  Bo Zhang; Weiwei Ye; Yangmiao Ye; Huan Zhou; Abdullah F U H Saeed; Jing Chen; Jinying Lin; Vanja Perčulija; Qi Chen; Chun-Jung Chen; Ming-Xian Chang; Muhammad Iqbal Choudhary; Songying Ouyang
Journal:  Cell Res       Date:  2018-11-13       Impact factor: 25.617

3.  High-fidelity Cas13 variants for targeted RNA degradation with minimal collateral effects.

Authors:  Huawei Tong; Jia Huang; Qingquan Xiao; Bingbing He; Xue Dong; Yuanhua Liu; Xiali Yang; Dingyi Han; Zikang Wang; Xuchen Wang; Wenqin Ying; Runze Zhang; Yu Wei; Chunlong Xu; Yingsi Zhou; Yanfei Li; Minqing Cai; Qifang Wang; Mingxing Xue; Guoling Li; Kailun Fang; Hainan Zhang; Hui Yang
Journal:  Nat Biotechnol       Date:  2022-08-11       Impact factor: 68.164

4.  Compact RNA editors with small Cas13 proteins.

Authors:  Soumya Kannan; Han Altae-Tran; Xin Jin; Victoria J Madigan; Rachel Oshiro; Kira S Makarova; Eugene V Koonin; Feng Zhang
Journal:  Nat Biotechnol       Date:  2021-08-30       Impact factor: 68.164

5.  Transcriptome Engineering with RNA-Targeting Type VI-D CRISPR Effectors.

Authors:  Silvana Konermann; Peter Lotfy; Nicholas J Brideau; Jennifer Oki; Maxim N Shokhirev; Patrick D Hsu
Journal:  Cell       Date:  2018-03-15       Impact factor: 41.582

6.  RNA editing with CRISPR-Cas13.

Authors:  David B T Cox; Jonathan S Gootenberg; Omar O Abudayyeh; Brian Franklin; Max J Kellner; Julia Joung; Feng Zhang
Journal:  Science       Date:  2017-10-25       Impact factor: 47.728

7.  C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector.

Authors:  Omar O Abudayyeh; Jonathan S Gootenberg; Silvana Konermann; Julia Joung; Ian M Slaymaker; David B T Cox; Sergey Shmakov; Kira S Makarova; Ekaterina Semenova; Leonid Minakhin; Konstantin Severinov; Aviv Regev; Eric S Lander; Eugene V Koonin; Feng Zhang
Journal:  Science       Date:  2016-06-02       Impact factor: 47.728

8.  Structural Basis for the RNA-Guided Ribonuclease Activity of CRISPR-Cas13d.

Authors:  Cheng Zhang; Silvana Konermann; Nicholas J Brideau; Peter Lotfy; Xuebing Wu; Scott J Novick; Timothy Strutzenberg; Patrick R Griffin; Patrick D Hsu; Dmitry Lyumkis
Journal:  Cell       Date:  2018-09-20       Impact factor: 41.582

9.  Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants (Review).

Authors:  Kira S Makarova; Yuri I Wolf; Jaime Iranzo; Sergey A Shmakov; Omer S Alkhnbashi; Stan J J Brouns; Emmanuelle Charpentier; David Cheng; Daniel H Haft; Philippe Horvath; Sylvain Moineau; Francisco J M Mojica; David Scott; Shiraz A Shah; Virginijus Siksnys; Michael P Terns; Česlovas Venclovas; Malcolm F White; Alexander F Yakunin; Winston Yan; Feng Zhang; Roger A Garrett; Rolf Backofen; John van der Oost; Rodolphe Barrangou; Eugene V Koonin
Journal:  Nat Rev Microbiol       Date:  2019-12-19       Impact factor: 60.633

10.  HH-suite3 for fast remote homology detection and deep protein annotation.

Authors:  Martin Steinegger; Markus Meier; Milot Mirdita; Harald Vöhringer; Stephan J Haunsberger; Johannes Söding
Journal:  BMC Bioinformatics       Date:  2019-09-14       Impact factor: 3.169
