Gal Ofir1, Sarah Melamed1, Hila Sberro1,2, Zohar Mukamel3,4, Shahar Silverman1, Gilad Yaakov1, Shany Doron1, Rotem Sorek5. 1. Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel. 2. Departments of Medicine and Genetics, Stanford University, Stanford, CA, USA. 3. Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel. 4. Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel. 5. Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel. rotem.sorek@weizmann.ac.il.
Abstract
The evolutionary pressure imposed by phage predation on bacteria and archaea has resulted in the development of effective anti-phage defence mechanisms, including restriction-modification and CRISPR-Cas systems. Here, we report on a new defence system, DISARM (defence island system associated with restriction-modification), which is widespread in bacteria and archaea. DISARM is composed of five genes, including a DNA methylase and four other genes annotated as a helicase domain, a phospholipase D (PLD) domain, a DUF1998 domain and a gene of unknown function. Engineering the Bacillus paralicheniformis 9945a DISARM system into Bacillus subtilis has rendered the engineered bacteria protected against phages from all three major families of tailed double-stranded DNA phages. Using a series of gene deletions, we show that four of the five genes are essential for DISARM-mediated defence, with the fifth (PLD) being redundant for defence against some of the phages. We further show that DISARM restricts incoming phage DNA and that the B. paralicheniformis DISARM methylase modifies host CCWGG motifs as a marker of self DNA akin to restriction-modification systems. Our results suggest that DISARM is a new type of multi-gene restriction-modification module, expanding the arsenal of defence systems known to be at the disposal of prokaryotes against their viruses.
The arms race between prokaryotes and the viruses that infect them, bacteriophages (phages), is driving a continuous and intensive evolution of attack and defence mechanisms1–3. Bacterial defence systems target various stages of viral infection in order to thwart the attack, and are rapidly evolving and diverging to answer the fast evolutionary response of phages to these defence strategies4. The multiple defence strategies of bacteria include surface modifications to prevent adsorption of phages3, restriction-modification (R/M) systems that modify the bacterial genome and degrade unmodified foreign DNA5, abortive infection (Abi) systems that trigger cell death or metabolic arrest upon infection6, CRISPR-Cas systems that memorize viral genetic material as probes to target future infection attempts7,8, and more newly discovered defence systems such as the prokaryotic argonaute9 and BREX10. Multiple lines of evidence suggest that many new defence mechanisms are yet to be discovered2,11.
R/M systems are the most common form of active defence against phages used by bacteria and archaea1,5. Such systems, generally classified into four types, modify the self-genome on specific sequence motifs and degrade DNA in which such motifs are non-modified12. R/M systems contain a restriction endonuclease activity, a DNA modification activity (most commonly methylation, not present in Type IV systems), and target recognition capabilities. These three activities can be encoded on three different protein subunits (as in Type I R/M systems), on two different subunits of restriction and modification, each carrying the target recognition domain (Type III and most of the Type II R/M systems), or on a single polypeptide (Type IIG)5,12.
Many genes involved in bacterial and archaeal defence were shown to be physically clustered in genomic loci termed “Defence Islands”11,13. Genes of unknown function that are enriched in defence islands were hypothesized to be involved in defence as well, a concept that led to the recent discovery of the BREX defence system10. In this study we describe a new cassette of genes that is associated with defence islands in general, and R/M systems in particular. This five-gene cassette, which we found in >350 sequenced genomes of bacteria and archaea, generally appears in two classes in nature, and encodes genes with helicase-related domains, a DNA methyltransferase, and a phospholipase D domain-containing gene. A representative system from Bacillus paralicheniformis 9945a was experimentally demonstrated to confer defence against viruses of all 3 families of tailed phages, through an as yet unknown mechanism of action that involves all five genes, methylates the self-genome on specific sequence motifs, and halts phage propagation in the early stages of infection.
Results
Identification of DUF1998-containing gene cassettes in defence islands
We focused our attention on the protein domain DUF1998 (pfam09369), which has no known function, as this domain was previously found to be enriched in prokaryotic defence islands and hence suggested to participate in anti-phage defence13. As defence systems are frequently encoded by multiple genes working in concert that are co-localized in the genome, we attempted to characterize the genetic context in which DUF1998 genes are found.
To this end we searched the Integrated Microbial Genomes (IMG) database14 for genes whose only annotated domain is DUF1998. Within the 35,893 genomes we scanned, we found 1,369 such genes, encoded in 1,273 different genomes – 3.5% of the scanned genomes (Table S1). We then analysed the immediate genomic vicinity of these genes and found that in the vast majority of cases (1,095/1,369, 80%) the DUF1998-containing gene was preceded by a large (~1,100-1,300 aa) gene with a pfam00271 domain, a domain that is a part of the catalytic core of DExx-box helicases15. Further structural modelling of this protein using Phyre2 (ref 16) confirmed structural homology to DExx-box helicases including the two essential catalytic domains17. The only abundant annotated gene downstream of the DUF1998 gene was a gene with a phospholipase D (PLD) domain (pfam13091), appearing in 370 of the 1,369 cases (27%). The PLD domain is associated with enzymes that manipulate phosphoester bonds, such as kinases, phospholipases and endonucleases18, and was shown to be the catalytic domain of some restriction endonucleases19,20. As the combination of these three genes is the most abundant conserved genetic context of DUF1998, we defined these three consecutive genes as the core of our hypothesized system. We identified 351 such triplets in the analysed genomes (1% of the scanned genomes).
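The proportions quoted above can be re-derived from the raw counts. The short sketch below does only that arithmetic; the constant names are illustrative, and the last entry assumes one core triplet per genome, which is how the 1% figure appears to be computed.

```python
# Counts quoted in the paragraph above; the constant names are illustrative only.
GENOMES_SCANNED = 35_893
GENOMES_WITH_DUF1998 = 1_273
DUF1998_GENES = 1_369
PRECEDED_BY_HELICASE = 1_095
FOLLOWED_BY_PLD = 370
CORE_TRIPLETS = 351


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


REPORTED = {
    "genomes with a standalone DUF1998 gene": percent(GENOMES_WITH_DUF1998, GENOMES_SCANNED),
    "DUF1998 genes preceded by a pfam00271 gene": percent(PRECEDED_BY_HELICASE, DUF1998_GENES),
    "DUF1998 genes followed by a PLD gene": percent(FOLLOWED_BY_PLD, DUF1998_GENES),
    # Assumption: one triplet per genome, as the 1% figure implies.
    "genomes with a core triplet": percent(CORE_TRIPLETS, GENOMES_SCANNED),
}

if __name__ == "__main__":
    for label, value in REPORTED.items():
        print(f"{label}: {value:.1f}%")
```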
An abundant 5-gene cassette represents a putative new defence system
We then characterized the genomic neighbourhood of this 3-gene core, and found that it
was almost always (324/351 of cases, 92%) associated with a gene containing a
DNA methyltransferase domain, marking it as a possible new type of
restriction/modification system. We therefore named this system DISARM (Defence
Island System Associated with Restriction Modification). In most cases, the core
gene triplet was adjacent to a DNA adenine N6 methyltransferase gene (pfam13659)
that is usually annotated in the restriction enzymes database REBASE21 as a putative Type IIG R/M gene. In
these cases, which we define as Class 1 DISARM, the system is comprised of the
core triplet, the methyltransferase, and a fifth gene annotated as COG0553 SNF2
family helicase, containing SNF2-like ATPase (pfam00176) and helicase C-terminal
(pfam00271) domains22. Gene cassettes
containing the Class 1 DISARM system were found in 11 bacterial and 3 archaeal
phyla (Figure 1a; Table 1; Supplementary Table 2).
Figure 1
Two common classes of DISARM systems occur in bacteria and archaea.
(a) Class 1 DISARM systems are composed of the core gene triplet of drmA, a gene with a helicase domain (orange); drmB, DUF1998 domain-containing gene (yellow); and drmC, containing a PLD domain (green). This core gene triplet is preceded by drmD, an SNF2-like helicase (pink) and drmMI, an adenine methylase (peach). RefSeq genome accessions are indicated; system positions appear in Table S2. (b) Class 2 DISARM systems contain, in addition to the core triplet of drmABC, also drmMII, a cytosine methylase (blue). In Bacilli and some haloarchaea the systems also include a ~800 aa gene of unknown function named here drmE (red).
Table 1
DISARM genes and their domain annotations.
Gene      Signature domain    Putative function
drmA      pfam00271           Putative helicase domain
drmB      pfam09369           DUF1998, helicase-associated domain
drmC      pfam13091           Phospholipase D / nuclease domain
drmD      pfam00176           SNF2 family helicase
drmE      None                Unknown function
drmMI     pfam13659           N6-adenine DNA methyltransferase
drmMII    pfam00145           5-cytosine DNA methyltransferase
A second subset of systems did not contain the SNF2 helicase and the Type IIG R/M enzyme pair mentioned above, but instead contained a DNA 5-cytosine methyltransferase (pfam00145). We refer to these systems as Class 2 DISARM. These systems are mostly found in extremophilic bacteria and archaea and in Firmicutes, especially Bacilli. Unlike the larger Class 1 systems, these Class 2 systems are more compact. In most Bacilli and halophilic archaea, the systems contain another protein of unknown function sized ~800aa (Figure 1b).
DISARM confers protection against multiple phage types
We selected a Class 2 DISARM system for experimental validation, as this Class is more compact (spanning an average DNA size of 9.6kb vs 16.5kb spanned by Class 1 systems, see Supplementary Table 2), facilitating easier engineering into a heterologous genome. To test whether the hypothesized DISARM system provides protection against phage infection, we cloned the DISARM locus of Bacillus paralicheniformis 9945a, including the upstream and downstream intergenic regions, into the Bacillus subtilis BEST7003 genome (Figure 2a). We verified in advance that B. subtilis BEST7003 does not contain a DISARM system of its own by searching for homologs of each of the DISARM genes in its genome – none was found. The correct insertion of the system into the Bacillus subtilis genome was verified by whole genome sequencing (Supplementary Figure 1). No change in growth dynamics was observed in the DISARM-containing bacteria compared to the control strain transformed with an empty plasmid (Figure 2b).
Figure 2
DISARM provides protection against phages.
(a) The DISARM locus of Bacillus paralicheniformis ATCC 9945a. Numbers below axis represent position on the B. paralicheniformis genome. Locus tags are provided for each gene. (b) Insertion of the DISARM locus into the Bacillus subtilis BEST7003 genome does not impair growth. Curves show the mean of 2 independent experiments with 3 technical repeats each. Error bars are 95% confidence interval of the mean. (c-e) DISARM provides protection against phi3T, Nf and SPO1 phages. Bacteria were infected at time=0 at multiplicities of infection (MOI) of 0.05, 0.5 and 5. Curves show the mean of 2 independent experiments with 3 technical repeats each. Error bars are 95% confidence interval of the mean. (f) Plaque formation of 7 phages on DISARM-containing strains. Y axis represents concentration of plaque forming units (PFU). Shown is mean of 3 replicates, error bars are SD of the mean. Grey bars represent efficiency of plating (EOP) on DISARM- cells, red bars are EOP on DISARM+ cells.
We then challenged the DISARM-containing B. subtilis with phages from all 3 morphological families of the Caudovirales: the Siphophages phi3T, SPR, SPP1 and phi105, the Myophage SPO1 and the Podophages Nf and phi29. SPO1, SPP1, phi29 and Nf are obligately lytic, while phi3T, SPR and phi105 are temperate phages. Infections were performed at 3 orders of magnitude of Multiplicity of Infection (MOI) – 0.05, 0.5 and 5 phages per bacterium. Our results show that DISARM provided anti-phage protection against all phages, manifested by the delay or absence of culture collapse upon infection with phage (Figure 2c-e, Supplementary Figure 2). To quantify the protection, we measured phage efficiency of plating (EOP) on DISARM-containing bacteria vs. control cells. DISARM provided strong protection against most of the phages, with up to 7 orders of magnitude of protection observed. For two of the phages we observed intermediate levels of DISARM defence, with 2 orders of magnitude against SPO1 and one order of magnitude against SPP1 (Figure 2f).
To test whether the partial protection observed for some of the phages is due to a heritable trait such as resistance mutations or epigenetic modification in a subpopulation of the infecting phages, we isolated Nf and SPO1 phages from single plaques that appeared on DISARM+ cells. These isolated phages did not show increased resistance (measured via plaque assays) against DISARM+ cells as compared to their ancestor phages (Supplementary Figure 3), suggesting that for these phages escape from DISARM+ is not due to genetic or epigenetic traits. Rather, a small proportion of these phages seem to be naturally able to propagate inside at least a fraction of the DISARM+ cell population.
One of the known phage defence paradigms is abortive infection, in which bacteria commit suicide upon infection, thus preventing the completion of the phage replication cycle6. This prevents the release of phage progeny and the spread of infection to neighbouring cells. In this scenario, infection at MOI ≥ 1 is expected to cause the death or bacteriostasis of a large fraction of cells in the culture and an immediate stasis or reduction of OD upon infection. As DISARM-containing cultures infected with MOI = 5 did not stop growing or collapse upon infection (Figure 2c), we infer that DISARM does not provide protection through an abortive infection mechanism.
We also performed transformation efficiency experiments using an episomal Bacillus plasmid. We could not detect a reduced transformation efficiency in DISARM+ cells, suggesting that DISARM does not interfere with DNA import of this plasmid through the natural competence system of B. subtilis (Supplementary Figure 4).
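The abortive-infection argument rests on how many cells are expected to be infected at a given MOI. A minimal sketch of that expectation follows, under two assumptions that are not stated in the text: that the number of phages adsorbing to each cell follows a Poisson distribution with mean equal to the MOI, and that every phage adsorbs. The function name is illustrative.

```python
import math


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells hit by at least one phage.

    Assumption: phages per cell are Poisson-distributed with mean MOI,
    so the uninfected fraction is exp(-MOI).
    """
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    # The three MOIs used in the infection experiments.
    for moi in (0.05, 0.5, 5.0):
        print(f"MOI {moi}: {fraction_infected(moi):.1%} of cells infected")
```

Under these assumptions more than 99% of cells are infected at MOI = 5, so an abortive mechanism would be expected to halt or collapse the culture almost immediately, which is not what Figure 2c shows.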
DISARM allows phage adsorption but prevents phage replication
To test if DISARM protects the bacteria by preventing phage attachment, we compared the rate of phage adsorption to DISARM-containing vs. DISARM-lacking bacteria. No significant difference was observed in the adsorption rates, indicating that DISARM provides protection without hampering phage adsorption (Figure 3a). We further tested if the phage genome is replicated in DISARM-containing cells. We used Illumina sequencing to quantify the amount of phage DNA in comparison to bacterial DNA during infection with phi3T at MOI = 1. As the bacterial genome is not degraded during phi3T infection10, the ratio between phage reads and bacterial reads can be used to quantify the number of phage genome equivalents per infected cell. The results show that while in control cells the phage DNA replicates during the infection process, in DISARM-containing cells it is not replicated but is rather depleted over time in comparison to the bacterial genome (Figure 3b). In addition, phi3T was not able to circularize its genome or form detectable lysogens in DISARM-containing cells (Figure 3c-d). This indicates that DISARM prevents phage DNA replication and lysogeny, and probably also causes phage DNA degradation. Moreover, as DNA circularization occurs soon after injection and is essential for both lytic and lysogenic cycles23, our results indicate that DISARM stops the infection at a very early stage.
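The read-ratio quantification described above can be sketched as follows. This is an illustration rather than the pipeline used in the study: dividing read counts by genome length is an assumption added here so that the ratio reads as genome equivalents, and the genome lengths and read counts in the example are hypothetical placeholders, not measured values.

```python
def genome_equivalents(phage_reads: int, bacterial_reads: int,
                       phage_len: int, bacterial_len: int) -> float:
    """Phage genome copies per bacterial genome copy.

    Assumption: reads are sampled uniformly, so reads per base
    approximate the relative copy number of each genome.
    """
    phage_depth = phage_reads / phage_len
    bacterial_depth = bacterial_reads / bacterial_len
    return phage_depth / bacterial_depth


def normalise_to_first(values: list[float]) -> list[float]:
    """Express a time series relative to its first time point."""
    first = values[0]
    return [v / first for v in values]


if __name__ == "__main__":
    # Hypothetical placeholder lengths and read counts, for illustration only.
    PHAGE_LEN, BACTERIAL_LEN = 100_000, 4_000_000
    time_course = [(1_000, 40_000), (500, 40_000), (250, 40_000)]
    ratios = [genome_equivalents(p, b, PHAGE_LEN, BACTERIAL_LEN)
              for p, b in time_course]
    print(normalise_to_first(ratios))
```

A falling normalised series corresponds to the depletion of phage DNA seen in DISARM-containing cells, while a rising one corresponds to replication in control cells.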
Figure 3
Phage phi3T adsorption and DNA replication in DISARM-containing cells.
(a) Adsorption of phages to DISARM-containing cells (red) is not impaired compared to control cells (grey). After infection of logarithmic stage cultures (OD600=0.3) with phi3T at MOI=1, samples were taken at 5 minutes intervals, and the extracellular (unadsorbed) phage concentration was measured and compared to the initial phage concentrations (Methods). Bars represent mean of 3 experiments, error bars are SEM. (b) Ratio of phage DNA to bacterial DNA during infection. Total DNA was extracted from infected bacteria (MOI=1) at the indicated time points and sequenced using an Illumina sequencer. Y-axis represents relative phage DNA concentrations, compared to bacterial genome equivalents, normalized to the value at t=5 minutes post infection. Each curve represents an independent repeat of the experiment. (c) DISARM prevents lysogeny of phi3T. Agarose gel of multiplex PCR with 3 primer sets, aimed to detect bacterial DNA, phage DNA and lysogen. Lanes are marked with minutes post infection; U lane is the uninfected control. (d) DISARM prevents phage circularization. Outward-facing primers at the edges of the phi3T genome were used to detect phage genome circularization. (e) Schematic representation of fragments amplified in panel d.
To further examine the dynamics of phage DNA decay in DISARM-containing cells, we used a previously established system that allows imaging of phage SPP1 infection24,25. In this system, the SPP1 genome is modified to include a lacO array, and the infected cells express a LacI-CFP fusion protein. Upon phage DNA injection to infected cells, LacI-CFP proteins bind to the phage lacO array resulting in a clear focus. Foci were clearly observed on WT, DISARM-lacking cells, and once established the foci grew in size during phage DNA replication (Figure 4). Foci were also observed in infected DISARM-containing cells, but these foci did not expand and rapidly disappeared during the time course of infection (Figure 4). These results further substantiate that DISARM does not block phage DNA injection into the infected cell, but causes intracellular phage DNA decay.
Figure 4
Fluorescence microscopy of phage DNA in DISARM- and DISARM+ cells.
(a) DISARM-lacking, RFP-expressing cells (red cells) were co-incubated with DISARM-containing (light blue) cells in a microfluidic device that allows visualization of a single bacterial layer (Methods). Both strains express LacI-CFP constitutively. SPP1 phages containing a LacO array (10^5 pfu/µl) were flowed into the device from t=15 minutes to t=45 minutes, and an image was taken every 5 minutes. Upon injection of the phage DNA, a fluorescent focus of LacI-CFP is formed on the LacO array in the phage DNA. White arrows show foci in DISARM-lacking cells, which do not disappear and grow in size through the time course. Foci on DISARM-containing cells appear (full yellow arrowheads) but later disappear (empty yellow arrowheads). Scale bar is 5 µm. Representative results of 2 independent experiments. (b) Quantification of phage foci over time in the microscopy field of which a subsection is shown in panel a. Phage foci appear starting t=25 minutes, and become established in DISARM-lacking cells. Similar numbers of foci appear in DISARM-containing cells, but these foci disappear over time. Shaded area represents the time where phages were continuously flowed in.
Essential components for DISARM anti-phage activity
To map the essential components of the DISARM system, we engineered a series of B. subtilis BEST7003 strains, each containing a DISARM system with a scarless deletion of each of the DISARM genes (Figure 5). These deletions were verified by whole genome sequencing (Supplementary Figure 1). No growth impairment was observed in the deletion strains of drmE, drmA, drmB and drmC as compared to control cells or cells containing the full DISARM system (Supplementary Figure 5). We were not able to obtain a single-gene deletion for drmMII (see below).
Figure 5
Deletion of DISARM components.
(a-c) Deletion of drmE, drmA or drmB abolished
DISARM defence against phi3T. (d) Deletion of drmC
has no effect on the defence against phi3T. (e-f) Deletion of
drmC reduces DISARM protection against Nf and SPO1, but the
deletion strains are still protected compared to the control bacteria. Curves in
a-f are mean of 2 independent experiments with 2 technical
repeats each, error bars represent 95% confidence interval of the mean.
Infections were performed at MOI=0.5.
We then tested the deletion strains by infecting each of them with phi3T, SPO1 and Nf phages. Deletions of drmA (helicase domain), drmB (DUF1998 domain) and drmE (unknown function) abolished DISARM protection against all phages tested, indicating that each of these 3 genes is essential for DISARM activity (Figure 5a-c and Supplementary Figure 6). In contrast, deletion of the PLD domain gene, drmC, had no effect on DISARM protection against phi3T and the ΔdrmC cells remained fully protected against it (Figure 5d and Supplementary Figure 7a-c). However, a reduction in protection efficiency against SPO1 and Nf phages was observed, such that ΔdrmC cells were less resistant than cells containing the full DISARM system, but still more resistant than DISARM-lacking cells (Figure 5e-f). Quantification via plaque assays further showed that the drmC deletion reduced DISARM defence against Nf by 2-3 orders of magnitude (Supplementary Figure 7). Variable protection in ΔdrmC cells was also observed in other phages (Supplementary Figure 7). Therefore, drmC appears to be redundant for defence against some phages, while required for defence against others (see Discussion).
As DISARM contains a predicted C-5 cytosine-specific DNA methyltransferase (drmMII) we performed whole genome bisulfite sequencing to look for its cognate methylation sequence motif. While in control B. subtilis no significant motif for 5-methylcytosine (5mC) was identified, in DISARM-containing cells the motif CCWGG (W=A or T) was methylated in the underlined cytosine (average methylation rate of 82% for sites covered by >5 reads). Bisulfite sequencing further validated that the same motif is methylated in B. paralicheniformis ATCC 9945a, in which the DISARM locus resides naturally. The same methylation motif was found in the genome of Bacillus subtilis subsp. spizizenii str. W23 (ref 26), which also contains a similar Class 2 DISARM system. In that strain, this motif was attributed to the homolog of the DISARM methyltransferase (named M.BsuW23II), to which no cognate restriction enzyme was suggested26.
The above results suggest that drmMII methylates the DNA at CCWGG motifs. Presumably, other components of the DISARM system use non-methylated CCWGG motifs as a marker of foreign DNA akin to other known R/M systems. Consistent with this hypothesis, attempts to clone a drmMII-deleted DISARM system into B. subtilis yielded very low transformation efficiency (Supplementary Figure 8). Whole genome sequencing of three of the resulting transformed colonies showed massive deletions or frameshift mutations in the DISARM locus in addition to the intended deletion of drmMII. This suggests that in the absence of drmMII the DISARM system is toxic to the cells and only cells with a defective DISARM locus can survive. It is likely that in the absence of CCWGG methylation in the bacterial chromosome, the restriction components in the DISARM system attack the chromosome leading to the observed toxicity.
To further examine whether drmMII alone is sufficient for DNA methylation, we cloned this gene under a Pveg constitutive promoter in a WT B. subtilis. Bisulfite sequencing validated that the drmMII-expressing strain is methylated on 97.7% of the reads mapping to CCWGG motifs, validating drmMII as the system’s methylase.
To test whether CCWGG methylation is sufficient to protect phages against DISARM interference, we propagated phi3T on the drmMII-expressing strain, yielding methylated phages. Bisulfite sequencing verified that 75 of the 78 CCWGG sites in phi3T became methylated after propagation in the methylase-expressing cells (Supplementary Figure 9a). The 3 sites that were not modified overlapped with GGCC sites that are known to be methylated by the native methylase M.Phi3TI encoded by phi3T (the same methylase also methylates GCNGC sites27, while an additional methylase, M.Phi3TII, modifies TCGA motifs28). The DISARM+ strain still protected against the modified phages despite their high level of methylation (Supplementary Figure 9b). Moreover, DISARM also provided a high level of protection against phage Nf (Figure 2d) although the genome of this phage (GenBank accession EU622808) is devoid of any CCWGG site, which we also verified by whole genome sequencing of the Nf phage we used. These results suggest that DISARM probably uses an additional, yet unknown mechanism to identify invading phage DNA in addition to the methylation signature.
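Two computations recur in the methylation analysis above: locating CCWGG sites in a genome, and averaging per-site methylation over sites with sufficient read coverage. The sketch below illustrates both under stated assumptions. It is not the analysis code of the study; the function names and the per-site input format (methylated reads, total reads) are hypothetical. Because CCWGG is its own reverse complement, scanning one strand is enough to find every site.

```python
import re

# W = A or T; CCWGG is palindromic, so one strand covers both.
CCWGG = re.compile(r"CC[AT]GG")


def find_ccwgg(seq: str) -> list[int]:
    """Return 0-based start positions of CCWGG sites in seq."""
    return [m.start() for m in CCWGG.finditer(seq.upper())]


def mean_methylation(sites: list[tuple[int, int]], min_reads: int = 5) -> float:
    """Average per-site methylation rate.

    sites holds (methylated_reads, total_reads) per site (hypothetical
    format); only sites covered by more than min_reads reads are used,
    mirroring the '>5 reads' filter quoted in the text.
    """
    rates = [m / t for m, t in sites if t > min_reads]
    if not rates:
        raise ValueError("no site passes the coverage filter")
    return sum(rates) / len(rates)


if __name__ == "__main__":
    toy_sequence = "AACCAGGTTCCTGGCCGGG"  # made-up sequence, not phage data
    print(find_ccwgg(toy_sequence))
    print(mean_methylation([(9, 10), (1, 2), (4, 8)]))
```

Applied to a phage genome, an empty result from `find_ccwgg` corresponds to the situation described for Nf, whose genome carries no CCWGG site.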
Discussion
Our results establish DISARM as a new defence system, providing protection against diverse phages. The DISARM system is widespread in defence islands across the microbial world, and contains 3 core genes (drmA, drmB and drmC) accompanied by a methyltransferase (drmMI or drmMII) and an additional gene (drmD or drmE). The existence of an R/M-related active methyltransferase, the toxicity caused by its deletion, and the depletion of phage DNA during infection (Figures 3b, 4) suggest that DISARM represents a new composition of an R/M system that differs from other known such systems. While the modification module of DISARM is composed of a methyltransferase as in classic R/M systems, the restriction module seems to be unique and requires multiple components. The interaction between these modules and the exact mechanism through which DISARM restricts phage replication remain to be characterized. Moreover, since phage Nf does not contain a single CCWGG motif and is still restricted by DISARM, our results imply that the restriction module of DISARM may not rely solely on the presence of these motifs.
While DISARM was able to strongly protect against most of the phages tested, for two of the phages (SPO1 and SPP1) protection only amounted to 1-2 orders of magnitude. Phage SPO1 is known to have a heavily modified genome, containing hydroxymethyluracil (hmU) in place of thymine29. As CCWGG sites must contain the modified base on one of the strands (W=A or T), it is possible that the modified base somehow partially interferes with DISARM restriction. SPP1, which also partially escapes DISARM protection, has no reported DNA modifications, but only contains one CCWGG motif. It is possible that the lower level of DISARM protection against this phage can be attributed to the single motif, although phi3T modified in all but 3 CCWGG motifs was unable to overcome DISARM (Supplementary Figure 9), as was phage Nf, which completely lacks this motif in its genome.
We have shown that four of the five genes comprising DISARM are absolutely essential for its activity in phage resistance. The fifth gene, drmC, is partially required for defence against SPO1 and Nf, but is redundant against phi3T. The drmC gene has a PLD domain, a domain that was previously shown to be the catalytic nuclease domain in R/M systems such as BfiI and NgoAVII19,20. However, due to its redundancy in defence against most of the tested phages, drmC is unlikely to function as the core restriction endonuclease of the system, and is more likely to take an auxiliary role. Indeed, while drmA is almost always associated with the DUF1998-containing drmB gene, drmC appears associated with this pair in only one third of the cases.
In this study we focused on the most prevalent genomic context of standalone DUF1998-containing genes (drmB) and showed that such domains preferentially occur in the context of DISARM systems. In DISARM, the DUF1998-containing drmB is found immediately downstream of drmA, a putative DExx-helicase (pfam00271). Interestingly, a fusion protein between DExx-helicase and DUF1998, named dpdJ, was recently demonstrated to be involved in the Dpd system, a new R/M system comprised of at least 12 genes that modifies the bacterial DNA with 7-deazaguanine derivatives and possibly restricts unmodified DNA30. In the Dpd system, the abovementioned fusion gene dpdJ is followed by a PLD-containing gene named dpdK. The roles of these genes in the Dpd systems are unknown, but in light of our results it is possible that they take part in the Dpd restriction module similarly to the putative roles of drmABC in DISARM.
In line with the hypothesis that the DISARM components can interact with different modules of other R/M systems, it is worth noting that DISARM systems are often genomically associated with other methyltransferases and putative restriction endonucleases (Supplementary Figure 10). For example, in many cases Class 1 systems are accompanied by an additional predicted cytosine methylase (pfam00145) (Supplementary Figure 10a-b). DISARM systems are also frequently associated with the Res and Mod genes of a Type III R/M system (Supplementary Figure 10a). In other cases, both a cytosine methylase and an HNH endonuclease (pfam13391) accompany Class 1 DISARM systems (Supplementary Figure 10). The association with multiple other restriction systems may reflect the general tendency of defence systems to cluster in defence islands13, or, alternatively, might suggest that the function of the core DISARM genes can be combined with additional R/M modules and provide a synergistic defensive advantage.
Bacterial and archaeal systems that provide defence against foreign genetic elements have demonstrated time and again their profound diversity and rapid evolution. Despite over four decades of extensive research, it seems that the full spectrum of R/M systems in nature is yet to be completely documented. The discovery of DISARM, most probably representing a new kind of R/M system, exemplifies this concept and suggests that additional new systems will be revealed in the future. Better understanding of the arsenal of defensive measures at the disposal of bacteria and archaea will bring us closer to understanding their arms race against their parasites, a major evolutionary driving force shaping prokaryotic genomes.
Materials and Methods
Genomic identification and analysis of DISARM systems
The IMG database14 was searched in June 2016 for genes containing the pfam09369 (DUF1998) domain with no additional domains. The pfam annotations of the neighbouring genes of these DUF1998 standalone genes were retrieved from IMG, and examined in order to identify the most common neighbourhood. DUF1998 genes with a pfam00271 gene upstream and a pfam13091 gene downstream were collected as core DISARM genes and further analysed. DUF1998 genes with only a pfam00271 gene upstream were manually screened to identify possible misannotation of a downstream pfam13091 gene. The pfam00271, DUF1998 and pfam13091 containing genes were termed drmA, drmB and drmC, respectively.
To study the genomic neighbourhood of the identified core DISARM systems, the pfam and COG annotations of 30 genes upstream and downstream of the DUF1998 gene were retrieved from IMG and manually inspected. This led to the discovery of the additional related DISARM genes. Class 1 systems were defined according to the presence of pfam00176 and pfam13659 up to 30 genes upstream or downstream of the DUF1998 gene. Systems that contained a pfam00176 gene and Type III R/M genes (methylase with pfam01555 and restriction nuclease with pfam04851) were defined as Class 1, based on the assumption that the R/M system methylase replaces drmMI. Class 2 systems were defined according to the lack of pfam13659 and pfam00176 and the presence of a pfam00145 gene. In Class 1 systems, the closest gene with pfam13659 relative to the DUF1998 gene was defined as drmMI, and the closest gene with pfam00176 relative to the DUF1998 gene was defined as drmD. In Class 2 systems, the closest gene with pfam00145 was defined as drmMII, and drmE genes were manually curated due to their lack of any annotated domain. 
As pfam13659 was recently cancelled and deleted from the Pfam database, some of the drmMI genes remained with no pfam annotation, but could be identified by their Superfamily SSF53335 (S-adenosyl-L-methionine-dependent methyltransferases) annotation and their size, and so were manually curated as drmMI.
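The class definitions above can be read as a small set of rules over the Pfam domains found near the DUF1998 gene. The following is a minimal sketch of those rules, not the pipeline actually used: the function name and its input (a pre-collected list of Pfam accessions for the genes within 30 positions of the DUF1998 gene) are hypothetical, and the manual curation steps (drmMI genes lacking a pfam annotation, drmE genes) are not modelled.

```python
# Pfam accessions taken from the Methods text above.
DRM_D = "pfam00176"         # drmD (Class 1)
DRM_MI = "pfam13659"        # drmMI, Class 1 methylase
DRM_MII = "pfam00145"       # drmMII, Class 2 cytosine methylase
TYPE_III_MOD = "pfam01555"  # Type III R/M methylase
TYPE_III_RES = "pfam04851"  # Type III R/M restriction nuclease


def classify_disarm(neighbourhood_pfams):
    """Classify a core DISARM locus from nearby Pfam domains.

    `neighbourhood_pfams` is a hypothetical input: an iterable of Pfam
    accession strings for the genes located up to 30 genes upstream or
    downstream of the DUF1998 gene.
    """
    pfams = set(neighbourhood_pfams)
    has_type_iii = TYPE_III_MOD in pfams and TYPE_III_RES in pfams

    # Class 1: pfam00176 together with pfam13659, or with a Type III R/M
    # pair whose methylase is assumed to replace drmMI.
    if DRM_D in pfams and (DRM_MI in pfams or has_type_iii):
        return "Class 1"
    # Class 2: neither pfam13659 nor pfam00176, but pfam00145 is present.
    if DRM_D not in pfams and DRM_MI not in pfams and DRM_MII in pfams:
        return "Class 2"
    # Anything else would need manual inspection.
    return "unclassified"
```

Loci returned as "unclassified" correspond to the cases that, in the procedure described above, would be resolved by manual inspection rather than by rule.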
Cloning of DISARM into B. subtilis BEST7003
A cloning vector for large fragments was constructed by assembling the p15a origin of replication (ori) from pACYCDuet-1 and the amyE integration cassette from plasmid pDR110, kindly provided by Ilana Kolodkin-Gal. The p15a ori31 was amplified using primers OGO174+OGO175 (a list of all primers is provided in Supplementary Table 3). The amyE integration cassette was amplified using primers OGO176+OGO185. The two fragments were assembled and transformed into E. coli cells using Gibson assembly cloning kit (NEB E5510S), and assembled plasmids were verified by restriction pattern and full sequencing.
Bacillus licheniformis (Weigmann) Chester ATCC 9945a was received from ATCC. The species designation for this strain was recently changed to Bacillus paralicheniformis32 with NCBI taxonomy ID 766760. The DISARM locus of B. paralicheniformis 9945a in coordinates 815,730-826,377 (RefSeq NC_021362.1) was amplified using primers Hezi_1_F and Hezi_2_R. The vector backbone was amplified using primers OGO207+OGO208 and the two fragments were assembled using Gibson assembly.
B. subtilis BEST7003 cells were previously kindly provided by Mitsuhiro Itaya. Assembled plasmids were transformed into B. subtilis BEST7003 cells as described by Itaya33.
Scarless deletion strains were constructed by amplification of the DISARM system in two fragments, omitting the desired deletion region, and Gibson assembly with the vector backbone. The vector backbone was generated by primers OGO207+OGO208. PCR fragments used to generate deletion systems were: ΔdrmE – OSM13+SM3, SM4+OGO175; ΔdrmA – OSM13+SM5, SM6+OGO175; ΔdrmB – OSM13+SM2, SM7+OGO175; ΔdrmC – OSM13+SM9, SM10+OGO175; ΔdrmMII – OSM13+SM11, SM1+OGO175. The constructed plasmids were then used for integration of the deletion-containing system into B. subtilis BEST7003. Deletion of each gene included the ORF only without damaging intergenic regions.
Deleted regions were as follows (coordinates on RefSeq NC_021362.1): ΔdrmE – 816,274-818,674; ΔdrmA – 818,671-821,971; ΔdrmB – 822,039-823,752; ΔdrmC – 823,776-824,487; ΔdrmMII – 824,499-825,847. The entire genome of each constructed strain was verified by Illumina whole genome sequencing (Supplementary Figure 1). Sequence analysis for strain verification was performed using breseq34. A control strain containing an empty integration cassette was constructed in parallel, sequenced, and used as a control in the following experiments.
For the construction of the strain expressing drmMII, the ORF of drmMII was amplified from the genomic DNA of DISARM-containing B. subtilis using primers OGO425+OGO426. The backbone of pJMP4 (kindly provided by Jason M. Peters) was amplified using primers OGO423+OGO424, and the two fragments were assembled using Gibson assembly so that the drmMII gene is under the control of the plasmid’s Pveg constitutive promoter.
Phage cultivation
Phages were previously received from the Bacillus Genetic Stock Center (BGSC). The BGSC IDs for the phages used are 1L1 for phi3T, 1P4 for SPO1, 1L56 for SPR, 1L11 for phi105, 1P7 for SPP1 and 1P19 for Nf (1P19 is listed in the BGSC catalog as phage phi29, but upon sequencing and assembly this phage was found to be 100% identical to the reference sequence of phage Nf, GenBank accession EU622808). Phage phi29 was received from DSMZ (DSM 5545). Phages were propagated on B. subtilis BEST7003 (kindly provided by Mitsuhiro Itaya) using the plate lysate method as described by Fortier & Moineau35. Lysate titer was determined using the small drop plaque assay method as described by Mazzocco et al.36.
Phage infection growth curves
Overnight cultures of bacteria were diluted 1:100 into MMB (LB+0.1 mM MnCl2+5 mM MgCl2). Then, 200 µl of the diluted culture were dispensed into wells of 96 well plates and grown at 37 °C with shaking for 1 hour until early log phase. The number of bacterial cells in the culture was calculated according to an OD600 to CFU calibration curve. Then, 20 µl phage lysate was added to the desired MOI and the growth was followed in a TECAN Infinite 200 plate reader with OD600 measurement every 15 minutes at 37 °C with shaking.
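Setting the MOI in this protocol comes down to simple arithmetic: the number of cells in the well (from the OD600 to CFU calibration) fixes how many plaque-forming units the 20 µl of lysate must contain. The sketch below illustrates that calculation only; the function name is hypothetical, the cell density is taken as a given input rather than derived from any particular calibration curve, and the example values are illustrative, not measurements from this study.

```python
def required_lysate_titer(target_moi, cfu_per_ml, culture_volume_ml, lysate_volume_ml):
    """Return the lysate titer (PFU/ml) needed to reach `target_moi`.

    MOI is the ratio of infecting phage particles to bacterial cells, so
    the phage added must equal target_moi * (number of cells in the well).
    """
    if lysate_volume_ml <= 0:
        raise ValueError("lysate volume must be positive")
    cells_in_well = cfu_per_ml * culture_volume_ml
    pfu_needed = target_moi * cells_in_well
    return pfu_needed / lysate_volume_ml


# Illustrative values: 200 µl of culture at an assumed 1e8 CFU/ml,
# infected with 20 µl of lysate at a target MOI of 1.
titer = required_lysate_titer(1.0, 1e8, 0.2, 0.02)
print(f"Dilute the lysate to about {titer:.2e} PFU/ml")
```

In practice the stock lysate would be diluted to the computed titer before the fixed 20 µl volume is added to each well.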
Adsorption assay
15 ml of mid-log bacterial cultures in MMB medium at OD600 of 0.3 were infected with phage at an MOI of 1. During the infection the culture was incubated with shaking at 37 °C. At time points 1, 5, 10, 15, 20, 30, and 40 minutes post infection, 0.5 ml samples were taken and mixed with 100 µl ice-cold chloroform. Samples were vortexed, incubated at 37 °C for 5 minutes, vortexed, incubated on ice for 5 minutes, vortexed again and incubated at room temperature for 40 minutes. Samples were then briefly centrifuged and the phage concentration in the upper aqueous phase was determined by double layer plaque assay using B. subtilis BEST7003 as an indicator strain. As a control, the same amount of phage lysate was mixed with 15 ml MMB without bacteria and a sample was processed through the same stages and measured by double layer plaque assay to determine reference phage concentration.
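The text does not state how the plaque counts were converted into an adsorption value. A common convention, assumed here, is to express the free phage remaining in the aqueous phase as a fraction of the bacteria-free reference sample. The sketch below follows that assumption; the function name and example numbers are hypothetical.

```python
def percent_adsorbed(free_pfu_per_ml, reference_pfu_per_ml):
    """Percentage of phage adsorbed at one time point.

    free_pfu_per_ml: titer of unadsorbed phage in the aqueous phase.
    reference_pfu_per_ml: titer of the control sample without bacteria.
    Assumes adsorption = 100 * (1 - free / reference).
    """
    if reference_pfu_per_ml <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (1.0 - free_pfu_per_ml / reference_pfu_per_ml)


# Illustrative values only: 2e7 PFU/ml left free out of a 1e8 PFU/ml reference.
print(percent_adsorbed(2e7, 1e8))
```

Applying the function to each sampled time point would give an adsorption curve for each strain.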
Escape phages isolation and testing
Overnight cultures of DISARM-containing and DISARM-lacking cells were diluted 1:100 and grown to OD of 0.3. 100 µl of the culture was mixed with 100 µl of phage lysate and incubated at room temperature for 5 minutes. 4 ml of molten top agar (MMB+0.5% agar) were added, vortexed, and poured over an MMB petri dish. Plates were incubated at room temperature overnight. Isolated plaques from the DISARM-containing plates were picked into 100 µl phage buffer (50 mM Tris pH 7.4, 100 mM MgCl2, 10 mM NaCl). Serial dilutions in MMB were performed, and the phages were plated using the small drop plaque assay on DISARM-containing and DISARM-lacking cells.
DNA extraction during infection (used for sequencing, PCR for lysogeny and phage circularization detection)
50 ml of mid-log bacterial culture in MMB was infected with phi3T at MOI of 1, and 5 ml samples were taken immediately after infection (t=0) and at 5, 10, 15, 20, 30, 40 minutes after infection. During the infection, the culture was incubated with shaking at 37 °C. An uninfected control sample was taken before addition of phage. Samples were immediately transferred to ice. Samples were then centrifuged and the pellet was washed 3 times in ice-cold Tris pH 7.4 buffer to remove unadsorbed phages. The washed pellets were frozen in liquid nitrogen. Total DNA was extracted using Qiagen DNeasy Blood & Tissue kit (Qiagen 69504). Detection of phage lysogeny was performed using multiplex PCR as previously described by Goldfarb et al.10. Phage genome was detected using primers PTG83+PTG84, bacterial genome was detected using primers PTG18+PTG29, and the lysogeny junction was detected using primers PTG125+PTG126. Detection of phage circularization was done using primers PTG115+PTG116. To determine the relative abundance of bacterial and phage DNA, Illumina libraries were prepared and sequenced using a modified Nextera protocol as described by Baym et al.37. Reads were aligned to the bacterial reference genome and the phi3T genome (GenBank accession: KY030782) as previously described by Goldfarb et al.10. The numbers of reads aligned to the phage and host genomes at each time point were normalized to the genome sizes to calculate the number of phage genome equivalents per bacterial genome. T=5 minutes was used as a reference point for comparison of phage DNA levels, representing a time point until which phage adsorption continued but no phage replication initiated.
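The normalization described above can be sketched as follows: read counts are divided by genome length to give a per-base read density, and the phage density is divided by the host density. This is an illustration of the arithmetic only; the function names, genome lengths and read counts are hypothetical placeholders, not values from this study.

```python
def phage_genomes_per_host(phage_reads, host_reads, phage_genome_bp, host_genome_bp):
    """Phage genome equivalents per bacterial genome at one time point.

    Read counts are normalized to genome size, so the result is the ratio
    of per-base read densities on the phage and host genomes.
    """
    phage_density = phage_reads / phage_genome_bp
    host_density = host_reads / host_genome_bp
    return phage_density / host_density


def relative_to_reference(ratio_by_minute, reference_minute=5):
    """Express each time point relative to the reference time point
    (t = 5 minutes in the text above)."""
    reference = ratio_by_minute[reference_minute]
    return {minute: ratio / reference for minute, ratio in ratio_by_minute.items()}


# Placeholder numbers for illustration only.
ratios = {
    5: phage_genomes_per_host(1_000, 40_000, 100_000, 4_000_000),
    10: phage_genomes_per_host(500, 40_000, 100_000, 4_000_000),
}
print(relative_to_reference(ratios))
```

A value below 1 at later time points would indicate loss of phage DNA relative to the t=5 reference, while a value above 1 would indicate replication.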
Bisulfite sequencing
Genomic DNA of DISARM-containing B. subtilis BEST7003, control B. subtilis BEST7003, the constitutive drmMII strain, and B. paralicheniformis 9945a, as well as genomic DNA of phage phi3T, was used to construct PBAT libraries, using a modified version of the published protocol38. Briefly, 50 ng of genomic DNA were converted and purified according to the manufacturer’s instructions (EZ DNA methylation lightning MagPrep, Zymo Research), using half of the recommended amount of each reagent. Bisulfite-converted products were subjected to second strand synthesis by Klenow fragment 3’ to 5’ exo- (10 units, M0212L, NEB) and the indexed random nonamer primer (0.8 μM): 5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-INDEX-GGNNNNNNNNN-3’. This primer includes the truncated Illumina P5 adaptor followed by 8 bp internal index. The excess of primer was removed at the end of the reaction by exonuclease I (M0293L, NEB) and the products were purified with 0.8 × beads (Agencourt Ampure XP beads, Beckman Coulter). DNA was denatured for 6 minutes at 95 °C and the second strand was synthesized by Klenow polymerase using the indexed random nonamer primer (0.8 μM) containing the P7 Illumina adaptor: 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-INDEX-CCNNNNNNNNN-3’. The products were purified with 0.8 × beads and the library was generated by 12 cycles of PCR amplification using 2.5 units of GoTaq Hot Start polymerase (M5005, Promega) together with 0.4 μM Illumina Forward PE1.0 primer (5’-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3’) and 0.4 μM pre-indexed Illumina Reverse primer (5’-CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’, where XXXXXX represents the barcode for multiplexing). Amplified libraries were purified with 0.7 × Agencourt Ampure XP beads and were assessed by Qubit dsDNA HS Assay kit (Thermo Fisher Scientific) and Bioanalyzer (Agilent).
The final quality-ensured libraries were pooled and sequenced on the Illumina NextSeq500, generating 4-6M reads per library.
Prior to analysis, adaptor trimming and quality trimming were performed using Cutadapt39. Analysis of bisulfite-modified sequence reads was done using Bismark40. Whole genome cytosine methylation report was generated, and methylated positions were defined as positions with total coverage greater than 5 and methylation ratio greater than 2. The neighborhood of the methylated positions was extracted from the reference genome, and analyzed for a recurrent motif. The positions of all CCWGG motifs in the genomes were extracted from the reference genomes, and the methylation ratio of these positions was extracted from the cytosine methylation report.
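The last step, locating every CCWGG motif and looking up its methylation ratio, can be sketched as below. W is the IUPAC code for A or T, and because CCWGG is its own reverse complement a forward-strand scan finds every site. The function names and the input format (a dictionary from 0-based genome position to methylation ratio, built beforehand from the cytosine methylation report) are assumptions made for this illustration; the sketch reports both forward-strand cytosines of each motif without assuming which one is modified.

```python
import re

# CCWGG with W = A or T (IUPAC). The motif is palindromic, so scanning the
# forward strand is sufficient to locate every site.
CCWGG = re.compile(r"CC[AT]GG")


def find_ccwgg(sequence):
    """Return 0-based start positions of all CCWGG motifs in `sequence`."""
    return [match.start() for match in CCWGG.finditer(sequence.upper())]


def motif_methylation(sequence, ratio_by_position):
    """For each CCWGG motif, look up the methylation ratio of its two
    forward-strand cytosines (motif positions 1 and 2).

    `ratio_by_position` is a hypothetical mapping from 0-based genome
    position to methylation ratio; positions missing from it yield None.
    """
    results = []
    for start in find_ccwgg(sequence):
        first_c = ratio_by_position.get(start)
        second_c = ratio_by_position.get(start + 1)
        results.append((start, first_c, second_c))
    return results


# Toy sequence: CCAGG at position 2 and CCTGG at position 9; CCGGG is not a match.
toy = "AACCAGGTTCCTGGACCGGG"
print(find_ccwgg(toy))
print(motif_methylation(toy, {2: 0.1, 3: 0.95}))
```

Reverse-strand cytosines of each site could be handled the same way by adding the corresponding offsets, provided the methylation report is keyed by strand as well as position.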
Fluorescence microscopy visualization of injected phage DNA
Strains
The thrC::LacI-CFP cassette of strain ET341, kindly provided by Sigal Ben-Yehuda, was amplified from the genomic DNA using primers OGO380+OGO381 and transformed into the BEST7003 and DISARM-containing strains. The control cells were then transformed with a constitutive RFP construct (amyE::Pveg-RFP) using plasmid pJMP4, kindly provided by Jason M. Peters. Phage SPP1 containing an array of 64 repeats of LacO24,25 was kindly provided by Paulo Tavares.
Fluorescence microscopy in a microfluidic device
Overnight cultures of LacI-CFP DISARM-containing and RFP expressing control cells were diluted 1:100 and grown until OD of 0.3. The culture was then diluted again 1:10 and equal amounts of both strains were mixed together. The mixed culture was loaded into a chamber of CellASIC ONIX plate for bacterial cells (Mercury, B04A-03-5PK) according to the manufacturer's instructions, and mounted on a Zeiss Axio Observer Z1 inverted microscope. The cells were grown under a constant flow of MMB medium, and monitored periodically in bright field and RFP channels to monitor cell division for ~1 hour. Imaging then started and was performed in 3 channels: bright field, CFP (filter set 47 HE), RFP (filter set 64 HE). Images were captured every 5 minutes. After 15 minutes, SPP1-LacO phages in MMB (10^5 PFU/µl) were flowed into the chamber for a period of 30 minutes. The infection was followed until the beginning of cell lysis of DISARM-lacking cells was observed.
Image analysis
Analysis was done using the Imaris software (Bitplane). Background was subtracted from CFP and RFP channels. Fluorescent foci were segmented and tracked from the CFP channel using the Imaris spots object. Foci were allocated to DISARM-lacking cells according to RFP level at the same location. Segmentation and tracking were manually corrected. The number of foci within each strain was counted for each frame separately.
Data Availability
The data that support the findings of the study are available in the Supplementary Tables and in accession PRJEB22683 deposited to the European Nucleotide Archive (ENA).