Kira S Makarova, L Aravind, Yuri I Wolf, Eugene V Koonin.
Abstract
BACKGROUND: The CRISPR-Cas adaptive immunity systems that are present in most Archaea and many Bacteria function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA. Comparative analysis of the Cas protein sequences and structures led to the classification of the CRISPR-Cas systems into three Types (I, II and III).
Year: 2011 PMID: 21756346 PMCID: PMC3150331 DOI: 10.1186/1745-6150-6-38
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Experimentally characterized and predicted functions of the core components of CRISPR-Cas systems
| Family | Experimental evidence | Prediction |
|---|---|---|
| | Metal-dependent deoxyribonuclease; a unique fold consisting of an N-terminal β-strand domain and a C-terminal α-helical domain [ | Involved in integration of spacer DNA into CRISPR repeats. |
| | RNase specific to U-rich regions [ | Facilitates spacer selection and/or integration. Could be involved in further crRNA cleavage. |
| | Single-stranded DNA nuclease (HD domain) and ATP-dependent helicase [ | Cuts DNA during interference; promotes strand separation. |
| | Metal-dependent deoxyribonuclease specific for double-stranded oligonucleotides [ | Cuts DNA during interference. |
| | RecB-like nuclease homolog with three-cysteine C-terminal cluster [ | Might be involved in spacer acquisition. |
| | RAMP [ | Might substitute for Cas6 if catalytically active. Otherwise might be involved in both interference and adaptation stages. |
| | RAMP [ | |
| | RAMP [ | Implicated in interference; binds crRNA; if enzymatically active, might be involved in RNA-guided RNA cleavage. |
| | Subunit of Cascade complex [ | Inactivated Cas10 polymerase-like protein, binds DNA, interacts with HD domain and a RAMP carrying crRNA; could be involved in both interference and spacer selection stages. |
| | Subunit of Cascade (Cmr) complex [ | Same as Cas8, but fused to HD and thus cuts ssDNA; might be involved in strand separation. |
| | Small, mostly α-helical protein, subunit of Cascade complex [ | Specifically binds DNA; might recognize PAM. |
Figure 1. Multiple alignment of Cas7 subfamilies and related families of RAMPs. The multiple sequence alignment includes the conserved blocks identified by HHpred (red box), secondary structure predictions and the secondary structure elements extracted from the crystal structure of Cas7 from S. solfataricus [16]. Secondary structure predictions are shown as follows: 'H' indicates α-helix, 'E' indicates extended conformation (β-strand). The sequences are denoted by their GI numbers and species names. The G-rich loop region of RAMPs is marked by a blue box. The positions of the first and the last residues of the aligned region in the corresponding protein are indicated for each sequence. The numbers within the alignment represent poorly conserved inserts that are not shown. The coloring is based on the consensus shown underneath the alignment; 'h' indicates hydrophobic residues (WFYMLIVACTH), 'p' indicates polar residues (EDKRNQHTS), 's' indicates small residues (ACDGNPSTV).
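The residue classes listed in the Figure 1 caption can be turned into a consensus line with a few lines of Python. This is only a minimal sketch: the three residue sets are taken from the caption, but the 80% threshold, the priority order (h, then p, then s) for residues that belong to more than one class, and the function names `consensus_symbol` and `consensus_line` are illustrative assumptions, not the procedure used to produce the figure.

```python
# Residue classes as listed in the Figure 1 caption.
RESIDUE_CLASSES = {
    "h": set("WFYMLIVACTH"),  # hydrophobic
    "p": set("EDKRNQHTS"),    # polar
    "s": set("ACDGNPSTV"),    # small
}


def consensus_symbol(column, threshold=0.8):
    """Return the first class symbol covering at least `threshold` of the
    non-gap residues in one alignment column, or '.' if none does.

    The threshold and the h > p > s priority are assumptions made for
    illustration only.
    """
    residues = [c.upper() for c in column if c not in "-."]
    if not residues:
        return "."
    for symbol, members in RESIDUE_CLASSES.items():
        hits = sum(1 for r in residues if r in members)
        if hits / len(residues) >= threshold:
            return symbol
    return "."


def consensus_line(alignment, threshold=0.8):
    """Build a consensus string for equal-length aligned sequences."""
    return "".join(consensus_symbol(col, threshold) for col in zip(*alignment))


if __name__ == "__main__":
    # Toy alignment, not taken from the paper.
    print(consensus_line(["LKG", "IRA", "VES"]))
```

Because several residues (for example A, C, T and H) appear in more than one class, the order in which classes are tested changes the result; a different priority would be equally defensible.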
Figure 2. The RRM fold of RAMPs and Cas2. The RRM fold domains of Cas2 and the three major RAMP groups proposed in the text are shown in cartoon representation with their N- and C-termini indicated. In Cas7, the insertions into the core of the RRM fold are shown in a darker shade. In the RAMPs with two RRM fold domains, these are labeled N(-terminal) and C(-terminal), respectively. The distinct C-terminal domains of Cas5 and Cas6f (Csy4) are also shown. In Cas6f, the glycine-rich loop, which is embedded in a beta-hairpin in contrast to the typical helix-strand element, is colored orange. Note the "horizontal" packing of the first helix of the core RRM fold against the four-stranded sheet, which is one of the characteristic structural features of the RAMPs (apparent in Cas7, Cas6, Cas6e and Cas5). The following PDB IDs were used to generate these representations: 2I0X (Cas2); 3PS0 (Cas7); 3I4H (Cas6); 1WJ9 (Cas6e/CasE); 3KG4 (Cas5); 2XLJ (Cas6f/Csy4).
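For readers who want to inspect the same structures, the PDB IDs listed in the Figure 2 caption can be collected in a small lookup table. The sketch below only builds download links and performs no network access; the URL pattern of the RCSB file server and the helper name `rcsb_download_url` are assumptions for illustration and are not part of the original work.

```python
# PDB IDs as listed in the Figure 2 caption, keyed by protein family.
PDB_IDS = {
    "Cas2": "2I0X",
    "Cas7": "3PS0",
    "Cas6": "3I4H",
    "Cas6e/CasE": "1WJ9",
    "Cas5": "3KG4",
    "Cas6f/Csy4": "2XLJ",
}


def rcsb_download_url(pdb_id):
    """Build a download link for one entry.

    Assumes the RCSB file-server pattern
    https://files.rcsb.org/download/<ID>.pdb; check the RCSB documentation
    before relying on it.
    """
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


if __name__ == "__main__":
    for family, pdb_id in PDB_IDS.items():
        print(f"{family}\t{pdb_id}\t{rcsb_download_url(pdb_id)}")
```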
Figure 3. Classification of the RAMPs. The tree-like scheme of RAMP relationships is based on the sequence similarity, structural features and neighborhood analysis described in the text, and should not be construed as a phylogenetic tree. Unresolved relationships are shown as multifurcations and tentative assignments are shown by broken lines. The catalytic activity of some RAMP proteins of the Cas5 and Cas7 groups, which would involve the partially conserved histidines shown in the figure, should be considered a tentative prediction.
Figure 4. Gene content similarity between type I-E and type III-A systems and structural organization of large subunits of different CRISPR-Cas systems of type I and III. A. Genes in the operons for I-E and III-A subtypes are shown by arrows with size roughly proportional to the size of the corresponding gene. Homologous genes are shown by arrows of the same color or hashing. RAMPs are shown by pink or pink hashing. Solid lines connect genes for which homology can be confidently demonstrated, and dashed lines connect genes for which homology is inferred tentatively. The Cascade complex subunits are shown by square brackets. Two previously published domain annotations are included for comparison. B. Domain organization of large subunits of different type I and III CRISPR-Cas systems. Domain size is roughly proportional to the corresponding sequence length. The letter "S" marks the regions that could be homologous to the small subunits of the Cascade complex, which are encoded as separate genes in Type III systems, the I-E subtype and some systems of the I-A subtype.
Figure 5. Structural organization of Cas9 protein families and their homologs. Homologous regions are shown by the same color. Distinct sequence motifs are denoted by the corresponding conserved amino acid residues above the respective domains (when the same conserved amino acid occurs in different motifs, one is marked by an asterisk to avoid confusion).
Figure 6. Unusual CRISPR-Cas systems. A. Type I-C variants with the GSU0054 (or GSU0053) signature gene. B. Type I-F variant. C. Type III variant.
Figure 7. Evolutionary scenario for the origin of CRISPR-Cas systems. Homologous genes are color-coded and identified by a family name (names follow the classification from [20]). Names in bold are proposed systematic names, including those proposed in this work; "legacy names" are in regular font. The signature genes for CRISPR-Cas types are shown within green boxes, and those for subtypes within red boxes. The bold letters above the genes show major categories of Cas proteins: L, large CASCADE subunit; S, small CASCADE subunit; R, RAMP CASCADE subunit; RE, RAMP family RNase involved in crRNA processing (experimentally characterized nucleases are shown by asterisks); T, transcriptional regulator. Genes coding for inactivated (putative) polymerases are indicated by crosses. Major evolutionary events are shown in the corresponding branches. Broken lines denote alternative evolutionary scenarios for the origin of RAMPs.