Kira S Makarova, Vivek Anantharaman, Nick V Grishin, Eugene V Koonin, L Aravind.
Abstract
CRISPR-Cas adaptive immunity systems of bacteria and archaea insert fragments of virus or plasmid DNA as spacer sequences into CRISPR repeat loci. Processed transcripts encompassing these spacers guide the cleavage of the cognate foreign DNA or RNA. Most CRISPR-Cas loci, in addition to recognized cas genes, also include genes that are not directly implicated in spacer acquisition, CRISPR transcript processing or interference. Here we comprehensively analyze sequences, structures and genomic neighborhoods of one of the most widespread groups of such genes, which encode proteins containing a predicted nucleotide-binding domain with a Rossmann-like fold that we denote CARF (CRISPR-associated Rossmann fold). Several CARF protein structures have been determined, but functional characterization of these proteins is lacking. The CARF domain is most frequently combined with a C-terminal winged helix-turn-helix DNA-binding domain and "effector" domains, most of which are predicted to possess DNase or RNase activity. Divergent CARF domains are also found in RtcR proteins, sigma-54-dependent regulators of the rtc RNA repair operon. CARF genes frequently co-occur with those coding for proteins containing the WYL domain with the Sm-like SH3 β-barrel fold, which is also predicted to bind ligands. CRISPR-Cas and possibly other defense systems are predicted to be transcriptionally regulated by multiple ligand-binding proteins containing WYL and CARF domains, which sense modified nucleotides and nucleotide derivatives generated during virus infection. We hypothesize that CARF domains also transmit the signal from the bound ligand to the fused effector domains, which attack either alien or self nucleic acids, resulting, respectively, in immunity complementing the CRISPR-Cas action or in dormancy/programmed cell death.
Keywords: CRISPR; DNA-binding proteins; Rossmann fold; beta barrel; phage defense
Year: 2014 PMID: 24817877 PMCID: PMC4012209 DOI: 10.3389/fgene.2014.00102
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Comparative genomic analysis of CARF domain-containing proteins. (A) Scheme of the relationships between major CARF families, their domain architectures and association with CRISPR-Cas system types. The dendrogram shows the relationship between CARF domain-containing families. The clustering is based on sequence and structure similarity analysis as described under Materials and Methods; unresolved relationships are shown as a multifurcation. The pfam ID or other recognized family description is provided for each of the seven major groups. A typical member of a family (either the locus tag of a representative protein or a pdb identifier) is shown for each terminal node; subfamilies that have not been described previously are underlined. The typical domain architecture is shown for each family. The domain name is shown above the corresponding shape the first time it appears. Brackets indicate that the domain is missing in several proteins of the respective family. In the first column on the right-hand side, the number of proteins in the respective family is indicated, and the number of proteins encoded in the vicinity of cas genes is shown in parentheses. In the second, third and fourth columns, the numbers of genes of each family that are specifically associated with CRISPR-Cas systems of types III-A, III-B, and I are shown (the numbers representing a substantial fraction of the family are highlighted in red). (B) Domain organization of several minor CARF domain-containing families. Designations are as in Figure 1A. (C) Protein families associated with genes encoding CARF domains. The histogram shows how many times each family was identified in the vicinity of CARF domain-containing genes; the scale is shown above the histogram. Only the most frequently co-occurring families outside the set of recognized cas genes are shown. The numbers on the right-hand side reflect the results of a reverse analysis in which the neighborhoods of the genes from each family were analyzed for the presence of cas genes. The total number of genes and the number of genes in the vicinity of known cas genes (in parentheses) are indicated. (D) Association of CARF domains with (predicted) toxin domains in the three types of CRISPR-Cas systems. The histogram shows the co-occurrence of CARF proteins with toxin domains separately for the three CRISPR-Cas system types; the type III systems are additionally partitioned into those that co-occur with type I or type II in the same genome and those that represent the sole instance of CRISPR-Cas in the respective genomes.
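The neighborhood counts summarized in panel (C) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the window size, the flat `(contig, family)` gene list and the toy family labels are all assumptions made for illustration. It tallies, for each CARF gene, which other families occur within a fixed number of gene positions on the same contig.

```python
from collections import Counter


def neighborhood_family_counts(genes, anchor_family, window=5):
    """Count families seen near each anchor gene.

    genes: list of (contig, family) tuples in genomic order, with genes
           of the same contig stored contiguously (hypothetical input format).
    window: number of gene positions inspected on each side of the anchor;
            the value is an assumption, not taken from the paper.
    Each family is counted at most once per anchor neighborhood.
    """
    counts = Counter()
    for i, (contig, family) in enumerate(genes):
        if family != anchor_family:
            continue
        lo = max(0, i - window)
        hi = min(len(genes), i + window + 1)
        seen = set()
        for j in range(lo, hi):
            if j == i:
                continue
            other_contig, other_family = genes[j]
            # Ignore genes on other contigs and other copies of the anchor family.
            if other_contig == contig and other_family != anchor_family:
                seen.add(other_family)
        counts.update(seen)
    return counts


# Toy data with made-up gene order; the labels are illustrative only.
genes = [
    ("contig1", "cas10"), ("contig1", "CARF"), ("contig1", "WYL"),
    ("contig1", "other"),
    ("contig2", "WYL"), ("contig2", "CARF"), ("contig2", "toxin"),
]
counts = neighborhood_family_counts(genes, "CARF", window=2)
for family, n in counts.most_common():
    print(family, n)
```

The reverse analysis described in the caption would, under the same assumptions, call the function with each co-occurring family as the anchor and check how often cas genes appear in the resulting counts.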
Figure 2. Structure of the VC1899 CARF domain. This version of the CARF domain contains none of the elaborations or inserts observed in certain other CARF domains. The predicted active site pocket was identified using a probe of 2 solvent radii or greater (gray mesh), and the predicted ligand-interacting residues of the pocket are also shown.
Figure 3. Comparison of the structures of multiple CARF proteins. The CARF domains of all proteins were aligned and then separated for clarity. The different spatial orientations of the C-terminal domains are shown with respect to the CARF domain. The linker between the CARF domain and the C-terminal domains is colored green, the wHTH or the equivalent domain is rendered in white, and the C-terminal effector domain is colored purple. Inserts within the CARF domain are colored gray and are shown in “wire” representation. A domain of uncertain origin in PF1127 is colored gray and is shown as ribbon.