Sean Meaden, Ambarish Biswas, Ksenia Arkhipova, Sergio E Morales, Bas E Dutilh, Edze R Westra, Peter C Fineran.
Abstract
CRISPR-Cas are adaptive immune systems that protect their hosts against viruses and other parasitic mobile genetic elements [1]. Although widely distributed among prokaryotic taxa, CRISPR-Cas systems are not ubiquitous [2-4]. Like most defense-system genes, CRISPR-Cas are frequently lost and gained, suggesting advantages are specific to particular environmental conditions [5]. Selection from viruses is assumed to drive the acquisition and maintenance of these immune systems in nature, and both theory [6-8] and experiments have identified phage density and diversity as key fitness determinants [9,10]. However, these approaches lack the biological complexity inherent in nature. Here, we exploit metagenomic data from 324 samples across diverse ecosystems to analyze CRISPR abundance in natural environments. For each metagenome, we quantified viral abundance and diversity to test whether these contribute to CRISPR-Cas abundance across ecosystems. We find a strong positive association between CRISPR-Cas abundance and viral abundance. In addition, when controlling for differences in viral abundance, CRISPR-Cas systems are more abundant when viral diversity is low, suggesting that such adaptive immune systems may offer limited protection when required to target a diverse viral community. CRISPR-Cas abundance also differed among environments, with environmental classification explaining roughly a quarter of the variation in CRISPR-Cas relative abundance. The relationships between CRISPR-Cas abundance, viral abundance, and viral diversity are broadly consistent across environments, providing robust evidence from natural ecosystems that supports predictions of when CRISPR is beneficial. These results indicate that viral abundance and diversity are major ecological factors that drive the selection and maintenance of CRISPR-Cas in microbial ecosystems.
Keywords: CRISPR-Cas; bacteriophages; metagenomics; microbial ecology
Year: 2021 PMID: 34758284 PMCID: PMC8751634 DOI: 10.1016/j.cub.2021.10.038
Source DB: PubMed Journal: Curr Biol ISSN: 0960-9822 Impact factor: 10.834
Figure 1. CRISPR abundance positively correlates with viral abundance
Correlation between relative viral abundance and the read count (per million) of metagenomic reads that mapped to CRISPR array repeats across all samples. The dashed line represents the linear model fit, and the shaded area represents the 95% confidence interval (p < 0.0001 and R² = 0.21).
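The caption above refers to two quantities: a read count normalized per million reads and a linear model fit summarized by R². The following is a minimal sketch of both in plain Python, not the authors' analysis code. The function names `reads_per_million` and `linear_fit` are hypothetical, and the sketch assumes an ordinary least-squares fit of one variable on another with no transformation of either axis.

```python
def reads_per_million(mapped_reads, total_reads):
    """Scale a mapped-read count to reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads * 1_000_000


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared). Assumes x and y have the same
    length and that neither is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear regression, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

In this sketch, `x` would hold per-sample relative viral abundance and `y` the per-sample CRISPR reads per million; the confidence band shown in the figure is not computed here.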
Figure 2. CRISPR abundance varies by environment
Distributions of metagenomic read counts that mapped to CRISPR arrays (read count per million that mapped to a CRISPR array predicted by CRISPRDetect v.3 from assembled contigs) grouped by environmental classification.
(A) Sample sizes and ontology are shown.
(B–D) Samples are grouped using the Earth Microbiome Project ontology (EMPO) at level 1 (B), 2 (C), or 3 (D).
Figure 3. CRISPR abundance positively correlates with viral abundance across environments
Correlations between relative viral abundance and the read count (per million) of metagenomic reads that mapped to CRISPR array repeats per environment type.
(A and B) Environments are categorized according to the EMPO at level 1 (A) or 2 (B).
(C) Samples grouped at EMPO level 3 are divided into significant correlations or non-significant correlations (NS). Dashed lines represent linear model fits, and shaded areas represent 95% confidence intervals.
(D) The number of samples collected in each country with the circle representing samples collected in the Pacific Ocean.
See also Figures S1 and S2 and Data S1.
Figure 4. CRISPR abundance negatively correlates with normalized viral diversity metrics
Correlations between viral diversity (normalized by viral load per sample) and CRISPR abundance (reads per million that map to CRISPR arrays). Panels represent Nei’s diversity index (A), Shannon’s index (B), contig richness (C), or contig evenness (D). (A) represents intra-contig viral diversity while (B)–(D) represent inter-contig viral diversity. Dashed lines represent linear model fits, and shaded areas represent 95% confidence intervals. See also Data S1.
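The inter-contig metrics named in the caption (Shannon's index, contig richness, contig evenness) can be illustrated from a list of per-contig abundances. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, evenness is assumed to be Pielou's J (Shannon's index divided by the natural log of richness), and the normalization by viral load described in the caption is not reproduced because its exact form is not given here. Nei's intra-contig diversity is also omitted.

```python
import math


def richness(abundances):
    """Number of contigs with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)


def shannon_index(abundances):
    """Shannon's index H = -sum(p_i * ln(p_i)) over non-zero abundances."""
    counts = [a for a in abundances if a > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)


def pielou_evenness(abundances):
    """Pielou's evenness J = H / ln(S); an assumed definition of evenness."""
    s = richness(abundances)
    if s < 2:
        # Evenness is undefined for fewer than two contigs; return 0.0 by convention.
        return 0.0
    return shannon_index(abundances) / math.log(s)
```

For example, four equally abundant contigs give a Shannon index of ln(4) and an evenness of 1, while a sample dominated by a single contig gives values near 0.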
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| | Publicly available from GenBank. | GenBank: GCA_900519105.1 |
| | Publicly available from GenBank. | GenBank: GCA_000001215.4 |
| | Publicly available from GenBank. | GenBank: GCA_000001735.1 |
| | Publicly available from GenBank. | GenBank: GCA_000002425.2 |
| | Publicly available from GenBank. | GenBank: GCA_000002985.3 |
| | Publicly available from GenBank. | GenBank: GCA_000313835.1 |
| Metagenomic data used in this study | Publicly available from NCBI SRA database. | Accession list in |
| BBMap | Bushnell et al. | |
| MEGAHIT version 1.1.3 | Li et al. | |
| MIUMS | This study | |
| Metaxa2 | Bengtsson-Palme et al. | |
| Magic-BLAST | Boratyn et al. | |
| CRISPRDetect | Biswas et al. | |
| metaCRISPRDetect | Biswas et al. | |
| blastn | Chen et al. | |
| WGSIM | N/A | |
| VarScan | Koboldt et al. | |
| Metabat | Kang et al. | |
| samtools | Danecek et al. | |
| vegan | Oksanen et al. | |
| HMMER Version 3.2.1 | Eddy | |
| Diamond version v0.8.38.100 | Buchfink et al. | |
| metaGeneMark | Zhu et al. | |
| VirSorter2 version 2.2.2 | Guo et al. | |
| DeepVirFinder version 1.0 | Ren et al. | |
| cd-hit | Li and Godzik | |
| Bwa-mem | Li and Durbin | |