Nicolás Toro, Francisco Martínez-Abarca, Alejandro González-Delgado.
Abstract
CRISPR (clustered regularly interspaced short palindromic repeats) and associated proteins (Cas) act as adaptive immune systems in bacteria and archaea. Some CRISPR-Cas systems have been found to be associated with putative reverse transcriptases (RT), and an RT-Cas1 fusion associated with a type III-B system has been shown to acquire RNA spacers in vivo. Nevertheless, the origin and evolutionary relationships of these RTs and associated CRISPR-Cas systems remain largely unknown. We performed a comprehensive phylogenetic analysis of these RTs and associated Cas1 proteins, and classified their CRISPR-Cas modules. These systems were found predominantly in bacteria, and their presence in archaea may be due to a horizontal gene transfer event. These RTs cluster into 12 major clades essentially restricted to particular phyla, suggesting host-dependent functioning. The RTs and associated Cas1 proteins may have largely coevolved. They are, therefore, subject to the same selection pressures, which may have led to coadaptation within particular protein complexes. Furthermore, our results indicate that the association of an RT with a CRISPR-Cas system has occurred on multiple occasions during evolution.
Year: 2017 PMID: 28769116 PMCID: PMC5541045 DOI: 10.1038/s41598-017-07828-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Unrooted phylogenetic tree encompassing the diversity of RTs associated with CRISPR-Cas systems. The tree includes 118 RT sequences associated with CRISPR-Cas systems and 419 closely related RT sequences (Methods). Note that the RTs associated with CRISPR-Cas[21] (highlighted with red dots) from Herpetosiphon aurantiacus (GI: 159898445) and Haliscomenobacter hydrossis (GI: 332661943) correspond to a group II intron and a retron/retron-like RT, respectively. The arrow indicates the position of the M. mediterranea (MMB-1) RT. Group II intron classes and varieties are highlighted in color, with their names in black. All group II intron RTs are shaded in light purple. The RT clades associated with CRISPR-Cas loci are highlighted in color, with their names in red. All RTs associated with CRISPR-Cas systems are shaded in light pink. Open circles mark nodes with a FastTree support value ≥0.92. The phyla to which particular RT-CRISPR clades are restricted are indicated.
Distribution of RTs associated with CRISPR-Cas systems.
| Domain | Clade | Taxonomic adscription | Records (a) | Effector complex: Cmr (B-C) | Effector complex: Csm (A-D) | Effector complex: (−) (b) | Type of RT: RT | Type of RT: RT-Cas1 | Type of RT: Cas6-RT-Cas1 |
|---|---|---|---|---|---|---|---|---|---|
| All | | | 110 | 26 | 43 | 41 | 14 | 76 | 20 (c) |
| Archaea | 1 | Euryarchaeota (Methanosarcinaceae) | 5 | 2 | 3 | 0 | 5 | 0 | 0 |
| Bacteria | 2 | Planctomycetia, Bacteroidetes, (Delta, Epsilon) Proteobacteria | 13 | 2 | 4 | 7 | 0 | 13 | 0 |
| Bacteria | 3 | Cyanobacteria | 15 | 4 | 6 | 5 | 0 | 15 | 0 |
| Bacteria | 4 | Planctomycetia, Chlorobi, (Gamma, Delta) Proteobacteria, no-rank phyla | 6 | 0 | 4 | 2 | 2 | 4 | 0 |
| Bacteria | 5 | Cyanobacteria | 8 | 1 | 2 | 5 | 3 | 5 | 0 |
| Bacteria | 6 | Gammaproteobacteria | 3 | 0 | 2 | 1 | 0 | 3 | 0 |
| Bacteria | 7 | (Alpha, Delta) Proteobacteria (d) | 13 | 2 | 4 | 7 | 1 | 12 | 0 |
| Bacteria | 8 (8A, 8B) | 8A: Gammaproteobacteria; 8B: Planctomycetia, (Beta, Gamma, Delta) Proteobacteria | 23 | 6 | 9 | 8 | 0 | 7 | 16 (c) |
| Bacteria | 9 | Chloroflexi | 3 | 0 | 0 | 3 | 3 | 0 | 0 |
| Bacteria | 10 | Actinobacteria | 12 | 5 | 7 | 0 | 0 | 12 | 0 |
| Bacteria | 11 | Bacteroidetes | 4 | 3 | 0 | 1 | 0 | 0 | 4 |
| Bacteria | 12 | Firmicutes | 5 | 1 | 2 | 3 | 0 | 5 | 0 |
(a) Number of representative RTs described in this study (≤85% identity), corresponding to Tables S1–S12.
(b) No records with partial or unknown effector complex associated.
(c) One of the records corresponds to a Cas6-RT fusion gene without a recognizable Cas1 domain.
(d) 92% of the records belong to Alphaproteobacteria (12/13).
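The counts in the table above lend themselves to a quick tabulation. The following is a minimal sketch, not part of the original study: it transcribes the per-clade counts by hand, sums each column, and expresses the three RT architectures as fractions of the records. The names `CLADES`, `ALL_ROW`, `column_totals` and `rt_type_shares` are illustrative, and treating clade 8 as a single entry covering 8A and 8B is an assumption about how the table's merged cells should be read.

```python
# Per-clade counts transcribed by hand from the table above (illustrative names).
# Tuple order: records, Cmr (B-C), Csm (A-D), (-), RT, RT-Cas1, Cas6-RT-Cas1
# Assumption: clade "8" holds the combined counts for subclades 8A and 8B.
CLADES = {
    "1": (5, 2, 3, 0, 5, 0, 0),
    "2": (13, 2, 4, 7, 0, 13, 0),
    "3": (15, 4, 6, 5, 0, 15, 0),
    "4": (6, 0, 4, 2, 2, 4, 0),
    "5": (8, 1, 2, 5, 3, 5, 0),
    "6": (3, 0, 2, 1, 0, 3, 0),
    "7": (13, 2, 4, 7, 1, 12, 0),
    "8": (23, 6, 9, 8, 0, 7, 16),
    "9": (3, 0, 0, 3, 3, 0, 0),
    "10": (12, 5, 7, 0, 0, 12, 0),
    "11": (4, 3, 0, 1, 0, 0, 4),
    "12": (5, 1, 2, 3, 0, 5, 0),
}

# The "All" row as printed in the table.
ALL_ROW = (110, 26, 43, 41, 14, 76, 20)

RT_TYPES = ("RT", "RT-Cas1", "Cas6-RT-Cas1")


def column_totals(clades):
    """Sum every column across clades, in the tuple order given above."""
    return tuple(sum(column) for column in zip(*clades.values()))


def rt_type_shares(row):
    """Fraction of a row's records that fall into each RT architecture."""
    records = row[0]
    return {name: count / records for name, count in zip(RT_TYPES, row[4:7])}


if __name__ == "__main__":
    print("column totals:", column_totals(CLADES))
    for name, share in rt_type_shares(ALL_ROW).items():
        print(f"{name}: {share:.1%}")
```

Read this way, RT-Cas1 fusions account for 76 of the 110 representative records (about 69%), standalone RTs for 14 and Cas6-RT-Cas1 fusions for 20. Calling `rt_type_shares(CLADES["7"])` gives the same breakdown for a single clade.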
Figure 2. Architectures of the genomic loci for the subtypes of CRISPR-Cas systems associated with RTs. A representative operon is shown for types III-B/C and III-A/D for 11 of the 12 RT phylogenetic clades, including CRISPR array (Array) sites. For each representative genome, the corresponding gene locus tag (final digits) is indicated. Homologous genes are color-coded and identified by family (based on the findings of Makarova et al.)[21]. Warm colors correspond to effector genes and cold colors correspond to adaptive genes. RT function is indicated in fuchsia. Ancillary and unknown functions are not color-coded. Gene names follow reported classifications and assignments[21]. When available, both a systematic (above) and a ‘legacy’[11] (below) name are indicated. Only complete loci are shown (no complete operons are available for clade 9). The diagrams are not drawn to scale.
Figure 3. Phylogenetic tree of Cas1 associated with RTs. The phylogenetic reconstruction was carried out with 148 Cas1 proteins. The identified clades were named and colored according to the RT-associated clade. FastTree support values ≥0.92 are indicated at the nodes. The Cas1 protein (unknown subtype) from Arthrospira platensis (GI: 479129287)[21] was used as an outgroup.