Ho-Min Park1,2, Yunseol Park1, Urta Berani1, Eunkyu Bang1, Joris Vankerschaver1,3, Arnout Van Messem4, Wesley De Neve1,2, Hyunjin Shim5.
Abstract
RNA-protein interactions are crucial for diverse biological processes. In prokaryotes, RNA-protein interactions enable adaptive immunity through CRISPR-Cas systems. These defence systems utilize CRISPR RNA (crRNA) templates acquired from past infections to destroy foreign genetic elements through crRNA-mediated nuclease activities of Cas proteins. Thanks to the programmability and specificity of CRISPR-Cas systems, CRISPR-based antimicrobials have the potential to be repurposed as new types of antibiotics. Unlike traditional antibiotics, these CRISPR-based antimicrobials can be designed to target specific bacteria and minimize detrimental effects on the human microbiome during antibacterial therapy. In this study, we explore the potential of CRISPR-based antimicrobials by optimizing the RNA-protein interactions of crRNAs and Cas13 proteins. CRISPR-Cas13 systems are unique as they degrade specific foreign RNAs using the crRNA template, which leads to non-specific RNase activities and cell cycle arrest. We show that a high proportion of the Cas13 systems have no colocalized CRISPR arrays, and the lack of direct association between crRNAs and Cas proteins may result in suboptimal RNA-protein interactions in the current tools. Here, we investigate the RNA-protein interactions of the Cas13-based systems by curating the validation dataset of Cas13 protein and CRISPR repeat pairs that are experimentally validated to interact, and the candidate dataset of CRISPR repeats that reside on the same genome as the currently known Cas13 proteins. To find optimal CRISPR-Cas13 interactions, we first validate the 3-D structure prediction of crRNAs based on their experimental structures. Next, we test a number of RNA-protein interaction programs to optimize the in silico docking of crRNAs with the Cas13 proteins. From this optimized pipeline, we find a number of candidate crRNAs that have comparable or better in silico docking with the Cas13 proteins of the current tools. 
This study fully automates the in silico optimization of RNA-protein interactions as an efficient preliminary step for designing effective CRISPR-Cas13-based antimicrobials.
Keywords: CRISPR-based antimicrobials; Drug design; In silico docking; RNA secondary structure; RNA tertiary structure; RNA–protein interaction; Structural biology
Year: 2022 PMID: 36207756 PMCID: PMC9547417 DOI: 10.1186/s13062-022-00339-5
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 7.173
Fig. 1 Architecture and mechanism of CRISPR-Cas13 systems. Three main stages constitute the CRISPR-Cas13 immune response: adaptation, expression and interference. During the adaptation stage, a complex of Cas proteins binds the invading genome, which is shown as an RNA virus. The bound part of the target RNA is cleaved out and is inserted into the CRISPR array of the prokaryotic genome as a new spacer through a reverse transcriptase. The expression stage involves the transcription of the CRISPR array as a large, single transcript and this pre-crRNA is processed into a mature crRNA containing a target spacer and a flanking repeat. The mechanisms and components involved in the pre-crRNA processing of CRISPR-Cas13 systems have not been experimentally resolved yet. At the last stage of the immune response, the interference stage utilizes the crRNA as a guide to recognize invading genomes based on sequence complementarity, recruiting the complex of Cas proteins. The Cas13a/b/d proteins have two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains of RNase activity, which cleave the target sequence and inactivate the RNA virus
Fig. 2 In silico docking of crRNAs with Cas13 proteins to assess the RNA–protein interactions. This study is divided into two parts: the validation study to optimize the in silico docking of crRNAs and Cas13 proteins, and the candidate study to apply the optimized pipeline of in silico docking to test a list of candidate crRNAs on each Cas13 protein for RNA–protein interactions, as a preliminary step prior to experimental validation
Superimposition performance of the predicted crRNA structure with the ground truth crRNA structure
| RNA 3-D structure prediction program | RNA 2-D structure prediction program | Mean RMSD value (Å) | Standard deviation of RMSD values (Å) |
|---|---|---|---|
| RNAComposer | CentroidFold | 14.5751 | 3.4605 |
| RNAComposer | ContextFold | 15.0601 | 6.7700 |
| RNAComposer | CONTRAfold | 13.0886 | 3.9715 |
| RNAComposer | IPknot | 15.5927 | 4.3685 |
| RNAComposer | MXfold2 | 13.3285 | 3.3359 |
| RNAComposer | RNAfold | 13.8130 | 4.0411 |
| RNAComposer | RNAshapes | 13.4816 | 4.3896 |
| RNAComposer | RNAstructure | 13.5431 | 4.2974 |
| Rosetta | CentroidFold | 11.4316 | 3.9693 |
| Rosetta | ContextFold | 13.2483 | 4.9258 |
| Rosetta | CONTRAfold | 11.8012 | 4.3517 |
| Rosetta | IPknot | 9.9731 | 3.4309 |
| Rosetta | MXfold2 | 11.1416 | 3.8934 |
| Rosetta | RNAfold | 11.9661 | 4.6712 |
| Rosetta | RNAshapes | 11.8083 | 3.5449 |
| Rosetta | RNAstructure | 12.8113 | 5.3425 |
Performance of the RNA 2-D structure prediction programs in combination with RNAComposer or Rosetta when superimposed on the ground truth (GT) 3-D structure of the experimentally validated crRNAs. The mean and the standard deviation of the root-mean-square deviation (RMSD) values were calculated from three replicate runs of PyMOL align across all the predicted structures
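As a rough illustration of the quantity reported in this table, the sketch below computes an RMSD after optimal rigid superposition (the Kabsch algorithm) and then summarizes a set of RMSD values by mean and standard deviation. It is not the procedure used in the study: PyMOL align also performs a sequence alignment and outlier rejection, which are omitted here. The function names, the requirement of matched atom order, and the use of the sample standard deviation (`ddof=1`) are assumptions made for this example.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal rigid superposition.

    Assumes row i of P and row i of Q describe the same atom.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Cross-covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

def summarize(rmsd_values):
    """Mean and sample standard deviation (ddof=1, an assumption) of RMSD values."""
    values = np.asarray(rmsd_values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))

# Toy check: a rotated and translated copy of a structure superimposes exactly.
rng = np.random.default_rng(0)
structure = rng.normal(size=(20, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = structure @ Rz.T + np.array([1.0, 2.0, 3.0])
rmsd_identical = kabsch_rmsd(structure, moved)
```

In practice the coordinate arrays would come from matched atoms of the predicted and GT crRNA structures, and `summarize` would be applied to the RMSD values of one program combination across all structures and replicates.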
Fig. 3 Performance analysis of RNA structure prediction of CRISPR repeats with PyMOL align. Heatmap of the means of the RMSD values by superimposition of each predicted crRNA 3-D structure with the ground truth (GT) structure. The RNA 2-D structure prediction programs are shown on the y-axis, and the PDB name of each Cas13 protein is shown on the x-axis, with the RNA 3-D structure program as (a) RNAComposer and (b) Rosetta. (c) Superimposition of the GT structure 6IV8_B predicted by ContextFold and RNAComposer (best), and 6AAY predicted by ContextFold and RNAComposer (worst). (d) Superimposition of the GT structure 6DTD predicted by RNAstructure and Rosetta (best), and 6IV8_D predicted by RNAstructure and Rosetta (worst). Grey = GT structure; Magenta = predicted structure
Performance analysis of in silico docking of the predicted crRNA structures with the Cas13a proteins
| Cas13 protein structure | Combination | iRMSD | Template | Model |
|---|---|---|---|---|
| 5W1H | CONTRAfold + Rosetta | 12.615 | Free | 9 |
| 5W1H | ContextFold + Rosetta | 12.818 | Free | 7 |
| 5W1H | IPknot + RNAComposer | 12.987 | Used | 3 |
| 5W1I_AB | ContextFold + Rosetta | 12.753 | Free | 1 |
| 5W1I_AB | CONTRAfold + Rosetta | 13.547 | Free | 1 |
| 5W1I_AB | ContextFold + Rosetta | 13.7 | Free | 3 |
| 5W1I_CD | IPknot + RNAComposer | 13.026 | Used | 3 |
| 5W1I_CD | CONTRAfold + Rosetta | 13.41 | Free | 7 |
| 5W1I_CD | ContextFold + RNAComposer | 13.515 | Used | 2 |
| 5WLH | ContextFold + RNAComposer | 6.256 | Free | 1 |
| 5WLH | IPknot + Rosetta | 10.055 | Free | 1 |
| 5WLH | IPknot + RNAComposer | 10.876 | Free | 5 |
The three best docking models of the Cas13a proteins with the best experimental resolution were given in terms of iRMSD when superimposed with the GT structures (5W1H, 5W1I_AB, 5W1I_CD, 5WLH). The Model column refers to the best model given by HDOCK in each in silico docking experiment
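The source does not spell out how the iRMSD was computed. As a hedged illustration only, the sketch below computes a simplified ligand-interface RMSD: it assumes the receptor (Cas13a) has identical coordinates in the docked model and in the GT complex, takes the interface to be the GT crRNA atoms within a cutoff of any receptor atom, and measures the deviation of those atoms without any refitting. The 10 Å cutoff, the function names, and the one-to-one atom correspondence are all assumptions for this example.

```python
import math

def interface_indices(ligand_ref, receptor, cutoff=10.0):
    """Indices of reference ligand atoms within `cutoff` Å of any receptor atom."""
    return [
        i for i, atom in enumerate(ligand_ref)
        if any(math.dist(atom, r) <= cutoff for r in receptor)
    ]

def interface_rmsd(ligand_model, ligand_ref, receptor, cutoff=10.0):
    """RMSD over interface atoms only, in the shared receptor frame (no refitting)."""
    if len(ligand_model) != len(ligand_ref):
        raise ValueError("model and reference must contain the same atoms in the same order")
    idx = interface_indices(ligand_ref, receptor, cutoff)
    if not idx:
        raise ValueError("no ligand atoms within the cutoff of the receptor")
    squared = sum(math.dist(ligand_model[i], ligand_ref[i]) ** 2 for i in idx)
    return math.sqrt(squared / len(idx))

# Toy example: only the first ligand atom lies at the interface,
# so the large displacement of the second atom is ignored.
receptor = [(0.0, 0.0, 0.0)]
ligand_ref = [(1.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
ligand_model = [(1.0, 3.0, 0.0), (50.0, 40.0, 0.0)]
toy_irmsd = interface_rmsd(ligand_model, ligand_ref, receptor)  # 3.0
```

Restricting the sum to interface atoms is what distinguishes this from a whole-ligand RMSD: a pose can score well even if distal parts of the crRNA deviate, provided the contact region matches the GT.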
Fig. 4 In silico docking evaluation of the Cas13a proteins with the best experimental metrics. The iRMSD from the in silico docking experiments of the crRNAs with the Cas13a protein using HDOCK. The 10 best models were retained from HDOCK and the experiments were performed template-free or template-based. Each box represents the results of 60 docking experiments. The 3-D structure below each box shows the GT structure (magenta), the computer-selected best model (blue), and the human-selected best model (green) docked on the corresponding Cas13a protein (grey). Except for 5W1H, the computer-selected best model coincided with the human-selected best model (green)
Fig. 5 Best crRNA candidates for Cas13a protein from in silico docking experiments. The 3-D visualization shows the spatial coordinates of each crRNA candidate model (only 10 best models were considered) after in silico docking with the corresponding Cas13 protein. Each pink dot represents the centre of mass calculated from all atoms of the macromolecular structure model for each crRNA candidate. The GT crRNA is marked as a black dot, and its closest crRNA candidates in terms of Euclidean distance are highlighted as blue dots. The multiple sequence alignment compares the RNA sequences of the GT crRNA with those of the best crRNA candidates. The different shades of blue show the percentage identity, with the identity threshold set to 50%, highlighting variations in the RNA sequences. The 3-D structure shows an example of docking between the receptor (Cas protein) and the ligand (crRNA). The Cas proteins are in grey, the GT crRNA is coloured in magenta, and the best crRNA candidate model is highlighted in green with its identifier given above (GenomeID_CRISPRarray_CRISPRrepeat_modelnumber)
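The distance described in this caption can be sketched as follows: reduce each docked crRNA model to a single centre-of-mass point, then rank candidates by Euclidean distance to the GT crRNA point. This is a minimal sketch, not the authors' code. Mass weighting by element, the small mass table, the `(element, x, y, z)` atom tuples, and the function names are assumptions for this example.

```python
import math

# Approximate standard atomic masses (Da) for the elements found in RNA.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "P": 30.974}

def centre_of_mass(atoms):
    """Mass-weighted centre of an iterable of (element, x, y, z) tuples."""
    total = 0.0
    sx = sy = sz = 0.0
    for element, x, y, z in atoms:
        mass = ATOMIC_MASS[element]
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    if total == 0.0:
        raise ValueError("no atoms given")
    return (sx / total, sy / total, sz / total)

def rank_by_distance(gt_atoms, candidates):
    """Sort candidate models by Euclidean distance of their centre of mass to the GT's.

    `candidates` maps a model name to its list of atoms.
    Returns a list of (name, distance) pairs, closest first.
    """
    gt_point = centre_of_mass(gt_atoms)
    distances = [
        (name, math.dist(gt_point, centre_of_mass(atoms)))
        for name, atoms in candidates.items()
    ]
    return sorted(distances, key=lambda pair: pair[1])

# Toy coordinates (hypothetical, not taken from the study).
gt_atoms = [("C", 0.0, 0.0, 0.0), ("C", 2.0, 0.0, 0.0)]
candidates = {
    "cand_a": [("O", 1.0, 3.0, 4.0)],
    "cand_b": [("N", 1.0, 0.0, 1.0)],
}
ranking = rank_by_distance(gt_atoms, candidates)
```

Because each model collapses to one point, this measure captures where the crRNA sits on the protein but not its orientation, which is presumably why the study pairs it with a visual assessment.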
Best candidate crRNAs from in silico docking with Cas13 proteins
| Cas13 protein GT structure | Best model | Candidate crRNA | Model | Distance (Å) | Cluster | Human expertise | Predicted ΔG (kcal/mol) |
|---|---|---|---|---|---|---|---|
| 5W1H (−14.06 kcal/mol) | RNAC_5W1HvsAP019845_1_1_1 | AP019845_1_1 | 1 | 15.42061 | 8 | 1 | −13.47 |
| 5W1H (−14.06 kcal/mol) | RNAC_5W1HvsAP019845_1_1_5 | AP019845_1_1 | 5 | 15.91598 | 6 | 1 | −13.53 |
| 5W1H (−14.06 kcal/mol) | RNAC_5W1HvsCP002345_2_4_1 | CP002345_2_4 | 1 | 18.63954 | 16 | 1 | −15.16 |
| 5W1H (−14.06 kcal/mol) | RNAC_5W1HvsCP002345_2_9_10 | CP002345_2_9 | 10 | 8.592774 | 3 | 0 | −15.16 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsAP019834_1_1_1 | AP019834_1_1 | 1 | 17.97766 | 3 | 1 | −15.09 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsAP019845_1_1_3 | AP019845_1_1 | 3 | 8.767469 | 16 | 0 | −13.53 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsAP019845_1_1_7 | AP019845_1_1 | 7 | 16.31799 | 3 | 1 | −13.53 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsCP002345_2_6_1 | CP002345_2_6 | 1 | 10.5807 | 3 | 0 | −15.16 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsCP002345_2_7_1 | CP002345_2_7 | 1 | 10.28279 | 3 | 0 | −15.16 |
| 5W1H (−14.06 kcal/mol) | Rose_5W1HvsCP018618_1_1_1 | CP018618_1_1 | 1 | 11.36411 | 3 | 0 | −13.46 |
| 5W1I_AB (−13.93 kcal/mol) | RNAC_5W1IABvsAP019834_2_2_1 | AP019834_2_2 | 1 | 15.41219 | 1 | 1 | −15.09 |
| 5W1I_AB (−13.93 kcal/mol) | RNAC_5W1IABvsAP019845_1_1_1 | AP019845_1_1 | 1 | 15.4393 | 7 | 1 | −13.47 |
| 5W1I_AB (−13.93 kcal/mol) | RNAC_5W1IABvsAP019845_1_1_2 | AP019845_1_1 | 2 | 16.81572 | 15 | 1 | −13.53 |
| 5W1I_AB (−13.93 kcal/mol) | RNAC_5W1IABvsCP091244_12_2_8 | CP091244_12_2 | 8 | 19.48762 | 9 | 1 | −15.09 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsAP019834_1_1_7 | AP019834_1_1 | 7 | 18.04252 | 9 | 1 | −15.09 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsAP019845_1_1_2 | AP019845_1_1 | 2 | 17.05391 | 9 | 1 | −13.53 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsCP002345_2_5_4 | CP002345_2_5 | 4 | 14.53866 | 1 | 1 | −15.16 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsCP002345_2_6_2 | CP002345_2_6 | 2 | 10.42858 | 1 | 0 | −15.16 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsCP002345_2_6_5 | CP002345_2_6 | 5 | 13.87067 | 1 | 0 | −15.16 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsCP002345_2_7_2 | CP002345_2_7 | 2 | 10.77602 | 1 | 0 | −15.16 |
| 5W1I_AB (−13.93 kcal/mol) | Rose_5W1IABvsCP018618_1_1_1 | CP018618_1_1 | 1 | 10.17277 | 1 | 0 | −13.46 |
| 5W1I_CD (−13.68 kcal/mol) | RNAC_5W1ICDvsAP019834_2_2_3 | AP019834_2_2 | 3 | 14.82603 | 18 | 0 | −15.09 |
| 5W1I_CD (−13.68 kcal/mol) | RNAC_5W1ICDvsAP019845_1_1_1 | AP019845_1_1 | 1 | 15.82544 | 19 | 1 | −13.47 |
| 5W1I_CD (−13.68 kcal/mol) | RNAC_5W1ICDvsAP019845_1_1_10 | AP019845_1_1 | 10 | 11.31408 | 18 | 1 | −15.09 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsAP019834_1_1_5 | AP019834_1_1 | 5 | 18.23438 | 2 | 1 | −15.09 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsAP019845_1_1_9 | AP019845_1_1 | 9 | 18.28236 | 2 | 1 | −13.53 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP002345_2_4_4 | CP002345_2_4 | 4 | 10.64548 | 18 | 0 | −15.06 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP002345_2_6_1 | CP002345_2_6 | 1 | 10.06707 | 18 | 0 | −15.16 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP002345_2_6_6 | CP002345_2_6 | 6 | 13.18188 | 18 | 0 | −15.16 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP002345_2_6_7 | CP002345_2_6 | 7 | 10.06989 | 18 | 0 | −15.16 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP002345_2_7_1 | CP002345_2_7 | 1 | 10.96614 | 18 | 0 | −15.16 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP011102_2_1_3 | CP011102_2_1 | 3 | 19.27656 | 18 | 1 | −13.15 |
| 5W1I_CD (−13.68 kcal/mol) | Rose_5W1ICDvsCP018618_1_1_2 | CP018618_1_1 | 2 | 8.83881 | 18 | 0 | −13.46 |
| 5WLH (−14.06 kcal/mol) | RNAC_5WLHvsAP019834_2_2_2 | AP019834_2_2 | 2 | 15.66461 | 2 | 1 | −15.09 |
| 5WLH (−14.06 kcal/mol) | RNAC_5WLHvsAP019845_1_1_1 | AP019845_1_1 | 1 | 13.99831 | 10 | 1 | −13.47 |
| 5WLH (−14.06 kcal/mol) | RNAC_5WLHvsCP002345_2_1_4 | CP002345_2_1 | 4 | 14.43221 | 2 | 1 | −15.07 |
| 5WLH (−14.06 kcal/mol) | RNAC_5WLHvsCP011102_2_2_4 | CP011102_2_2 | 4 | 13.4716 | 2 | 0 | −13.15 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsAP019845_1_1_1 | AP019845_1_1 | 1 | 16.64761 | 2 | 1 | −13.53 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsAP019845_1_1_3 | AP019845_1_1 | 3 | 12.08608 | 2 | 1 | −13.53 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsAP019845_1_1_4 | AP019845_1_1 | 4 | 14.27125 | 2 | 1 | −13.53 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsCP002345_2_6_1 | CP002345_2_6 | 1 | 11.24522 | 2 | 0 | −15.16 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsCP002345_2_6_3 | CP002345_2_6 | 3 | 14.17092 | 2 | 0 | −15.16 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsCP002345_2_7_1 | CP002345_2_7 | 1 | 11.88083 | 2 | 0 | −15.16 |
| 5WLH (−14.06 kcal/mol) | Rose_5WLHvsCP011102_3_2_5 | CP011102_3_2 | 5 | 16.4966 | 2 | 1 | −15.09 |
| 5WTK (−11.68 kcal/mol) | RNAC_5WTKvsAP019845_1_1_1 | AP019845_1_1 | 1 | 23.34363 | 9 | 1 | −13.53 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsAP019845_1_1_1 | AP019845_1_1 | 1 | 3.64798 | 5 | 0 | −13.53 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsAP019845_1_1_2 | AP019845_1_1 | 2 | 1.859212 | 5 | 0 | −13.53 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsAP019845_1_1_3 | AP019845_1_1 | 3 | 3.46364 | 5 | 0 | −13.53 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP002345_2_11_8 | CP002345_2_11 | 8 | 4.103924 | 5 | 0 | −15.06 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP002345_2_5_1 | CP002345_2_5 | 1 | 9.187617 | 5 | 1 | −15.16 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP002345_2_6_1 | CP002345_2_6 | 1 | 2.796534 | 5 | 0 | −15.06 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP002345_2_6_2 | CP002345_2_6 | 2 | 2.4628 | 5 | 0 | −15.06 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP002345_2_8_1 | CP002345_2_8 | 1 | 5.023773 | 5 | 0 | −15.02 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP011102_2_2_1 | CP011102_2_2 | 1 | 6.560322 | 5 | 1 | −13.15 |
| 5WTK (−11.68 kcal/mol) | Rose_5WTKvsCP018618_1_2_8 | CP018618_1_2 | 8 | 5.777393 | 5 | 1 | −13.56 |
| 5XWY (−14.01 kcal/mol) | RNAC_5XWYvsAP019845_1_1_1 | AP019845_1_1 | 1 | 10.34585 | 2 | 0 | −13.53 |
| 5XWY (−14.01 kcal/mol) | RNAC_5XWYvsAP019845_1_1_3 | AP019845_1_1 | 3 | 14.16364 | 6 | 1 | −13.53 |
| 5XWY (−14.01 kcal/mol) | Rose_5XWYvsCP011102_2_1_10 | CP011102_2_1 | 10 | 17.63066 | 2 | 0 | −13.15 |
A list of crRNA candidates for each Cas13a protein with optimal in silico docking, selected using the optimized pipeline and calculating the distance to the GT crRNA. The Model column refers to the best model given by HDOCK in each in silico docking experiment. The Distance column shows the Euclidean distance of each docked model to the GT crRNA. The Human expertise column indicates the following 3-D visual assessment by the human expert:
0 = docked in the same region and in a similar direction as GT
1 = partially docked in a similar region and in a similar direction as GT
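One way to read the table programmatically is to shortlist docked models that the human expert scored 0 and whose predicted ΔG is no higher than the value listed alongside the GT structure, ordered by distance to the GT crRNA. This selection rule is an illustrative assumption, not a criterion stated in the source, and the `DockedModel` type and `shortlist` function are hypothetical names. The rows below are a subset of the 5WTK entries (value listed with the GT structure: −11.68 kcal/mol).

```python
from typing import NamedTuple

class DockedModel(NamedTuple):
    best_model: str
    distance: float      # Euclidean distance to the GT crRNA (Å)
    human_score: int     # 0 = same region and direction as GT, 1 = partial
    delta_g: float       # predicted ΔG (kcal/mol)

def shortlist(models, gt_delta_g, max_human_score=0):
    """Keep models passing the visual check with ΔG <= the GT value, closest first.

    Illustrative filter only; the thresholds are assumptions for this sketch.
    """
    kept = [
        m for m in models
        if m.human_score <= max_human_score and m.delta_g <= gt_delta_g
    ]
    return sorted(kept, key=lambda m: m.distance)

# Subset of the 5WTK rows from the table above.
rows_5wtk = [
    DockedModel("RNAC_5WTKvsAP019845_1_1_1", 23.34363, 1, -13.53),
    DockedModel("Rose_5WTKvsAP019845_1_1_2", 1.859212, 0, -13.53),
    DockedModel("Rose_5WTKvsCP002345_2_6_2", 2.4628, 0, -15.06),
    DockedModel("Rose_5WTKvsCP002345_2_5_1", 9.187617, 1, -15.16),
    DockedModel("Rose_5WTKvsCP011102_2_2_1", 6.560322, 1, -13.15),
]

selected = shortlist(rows_5wtk, gt_delta_g=-11.68)
selected_names = [m.best_model for m in selected]
```

For this subset the filter retains `Rose_5WTKvsAP019845_1_1_2` and `Rose_5WTKvsCP002345_2_6_2`, in that order, since they are the only rows with a human score of 0.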