J L Weissman, Philip L F Johnson.
Abstract
A diversity of clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems provide adaptive immunity to bacteria and archaea through recording "memories" of past viral infections. Recently, many novel CRISPR-associated proteins have been discovered via computational studies, but those studies relied on biased and incomplete databases of assembled genomes. We avoided these biases and applied a network theory approach to search for novel CRISPR-associated genes by leveraging subtle ecological cooccurrence patterns identified from environmental metagenomes. We validated our method using existing annotations and discovered 32 novel CRISPR-associated gene families. These genes span a range of putative functions, with many potentially regulating the response to infection.

IMPORTANCE Every branch on the tree of life, including microbial life, faces the threat of viral pathogens. Over the course of billions of years of coevolution, prokaryotes have evolved a great diversity of strategies to defend against viral infections. One of these is the CRISPR adaptive immune system, which allows microbes to "remember" past infections in order to better fight them in the future. There has been much interest among molecular biologists in CRISPR immunity because this system can be repurposed as a tool for precise genome editing. Recently, a number of comparative genomics approaches have been used to detect novel CRISPR-associated genes in databases of genomes with great success, potentially leading to the development of new genome-editing tools. Here, we developed novel methods to search for these distinct classes of genes directly in environmental samples ("metagenomes"), thus capturing a more complete picture of the natural diversity of CRISPR-associated genes.
Keywords: CRISPR; metagenomics; microbial ecology; network
Year: 2020 PMID: 31937679 PMCID: PMC6967390 DOI: 10.1128/mSystems.00752-19
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Prediction and annotation of novel CRISPR-associated (cas) gene families from a globally distributed set of metagenomes. (a) Known CRISPR-associated genes cluster close to each other in the network. This result validates the assumption of label propagation methods that adjacent nodes in the network have similar features. (b) Predicted cas genes (cyan), known cas genes (red), and their neighbors (gray) shown in the network. The full network is not shown due to its size. Sensitive profile-profile searches suggested functional annotations for many of these genes (shown as labels). wHTH, winged helix-turn-helix domain.
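The label propagation idea in the FIG 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `propagate_labels`, the `damping` parameter, and the toy gene names and edges are all hypothetical, and the damped neighbor-averaging rule is one simple variant chosen for illustration. It only shows how a "known cas" label can spread to adjacent nodes in a cooccurrence network so that unannotated genes close to known cas genes receive higher scores.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, n_iter=20, damping=0.5):
    """Spread a 'known cas' score from seed nodes to their neighbors.

    edges:   iterable of (node_a, node_b) undirected cooccurrence links
    seeds:   set of nodes already annotated as cas genes
    damping: hypothetical decay factor so scores fall off with distance
    Returns {node: score}; seed nodes stay clamped at 1.0.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    scores = {node: (1.0 if node in seeds else 0.0) for node in neighbors}
    for _ in range(n_iter):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                updated[node] = 1.0  # known labels are never overwritten
            else:
                mean_nbr = sum(scores[n] for n in nbrs) / len(nbrs)
                updated[node] = damping * mean_nbr
        scores = updated
    return scores


# Toy network (made-up names): two known cas genes, a chain of two
# unannotated genes attached to them, and an unrelated pair.
toy_edges = [
    ("cas1", "cas2"),
    ("cas2", "geneA"),
    ("geneA", "geneB"),
    ("geneC", "geneD"),
]
scores = propagate_labels(toy_edges, seeds={"cas1", "cas2"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy case `geneA`, which is adjacent to a known cas gene, scores higher than `geneB`, while the disconnected pair `geneC` and `geneD` stays at zero. A real analysis would need a principled network construction step and a score threshold, neither of which is specified in this record.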
FIG 2 Differential expression of putative novel cas genes in Sulfolobus islandicus strains LAL14/1 and REY15A during infection (16, 17). (a and b) Of the 6 novel cas genes harbored in these two genomes (a), all 6 were significantly differentially expressed in strain LAL14/1, whereas (b) only 3 were differentially expressed in REY15A. (c) NOG280809 is located near the cas operon Cmr-β in LAL14/1, where it is upregulated, but is much farther away in REY15A, where it is not differentially expressed. Notably, this region contains a number of transposases and appears to have undergone considerable rearrangement. (d and e) The cas operon Cmr-β is upregulated during infection in LAL14/1 (d) but downregulated in REY15A (e). To aid in the comparisons, dashed horizontal lines denote the control uninfected expression level in panels a and b and panels d and e. Points above or below these lines denote upregulation or downregulation, respectively. Multiple instances of gene families are distinguished using letter suffixes (e.g., “_a”). Note that in addition to the use of different host strains, the two data sets use different viruses, measure and analyze expression data differently, and examine different time scales of infection (hours postinfection [hpi] versus days postinfection [dpi]) (16, 17).
Putative novel cas genes and their annotations
| NOG | Putative type | Putative role (regulatory, defense, or membrane/extracellular) | TM |
|---|---|---|---|
| NOG10439 | | ABC transporter | X |
| NOG16349 | | Major facilitator superfamily | X |
| NOG44531 | | PspC | X |
| NOG46784 | | TA | |
| NOG82932 | I-U | | |
| NOG84780 | | TA | |
| NOG85832 | | Type II secretion, peptidoglycan binding | X |
| NOG87308 | I-E | Abi | |
| NOG116663 | III-A | HTH domain | |
| NOG121080 | | Fibronectin, cell wall biogenesis | X |
| NOG121689 | I-D | Winged helix | |
| NOG121881 | I-B | | |
| NOG131471 | III-A | ABC transporter, outer membrane protein assembly | X |
| NOG133718 | | Iron-dependent repressor | |
| NOG138333 | I-E | Secretion | |
| NOG140114 | | Iron-dependent repressor | |
| NOG145673 | I-F | Winged helix | |
| NOG146536 | | Membrane protein | X |
| NOG242488 | I-D | Curli biogenesis/secretion | |
| NOG269516 | | TA/RM | |
| NOG269593 | I-B | Fibronectin, secretion, S-layer, sugar binding | |
| NOG273942 | III-A | | |
| NOG280809 | | Cyclase | |
| NOG296050 | | | |
| NOG300351 | | Membrane protein | |
| NOG309511 | I-E | Accessory Sec system GspB transporter | |
| NOG309759 | I-E | | X |
| NOG312939 | I-U | | X |
| NOG314802 | I | Abi/RM | |
| NOG315893 | I-E | HTH domain | |
| NOG318199 | I-B | | |
| NOG328008 | I-B | | X |
Validated by at least one independent source of information.
Confirmed by multiple neighbors in network.
Abbreviations: TA, toxin-antitoxin; RM, restriction modification; TM, transmembrane.