Comparative Genomics of Closely-Related Gordonia Cluster DR Bacteriophages.

Cyril J Versoza1, Abigail A Howell2, Tanya Aftab3, Madison Blanco3, Akarshi Brar3, Elaine Chaffee3, Nicholas Howell3,4, Willow Leach4, Jackelyn Lobatos3, Michael Luca3,5, Meghna Maddineni5, Ruchira Mirji3, Corinne Mitra3, Maria Strasser3, Saige Munig3, Zeel Patel3, Minerva So3, Makena Sy3, Sarah Weiss3, Susanne P Pfeifer1.   

Abstract

Bacteriophages infecting bacteria of the genus Gordonia have increasingly gained interest in the scientific community for their diverse applications in agriculture, biotechnology, and medicine, ranging from biocontrol agents in wastewater management to the treatment of opportunistic pathogens in pulmonary disease patients. However, due to the time and costs associated with experimental isolation and cultivation, host ranges for many bacteriophages remain poorly characterized, hindering a more efficient usage of bacteriophages in these areas. Here, we perform a series of computational genomic inferences to predict the putative host ranges of all Gordonia cluster DR bacteriophages known to date. Our analyses suggest that BiggityBass (as well as several of its close relatives) is likely able to infect host bacteria from a wide range of genera, from Gordonia to Nocardia to Rhodococcus, making it a suitable candidate for future phage therapy and wastewater treatment strategies.

Keywords:  Gordonia; bacteriophage; cluster DR; comparative genomics; host range

Year:  2022        PMID: 36016269      PMCID: PMC9413003          DOI: 10.3390/v14081647

Source DB:  PubMed          Journal:  Viruses        ISSN: 1999-4915            Impact factor:   5.818


1. Introduction

Bacteriophages are one of the most abundant organisms on Earth, infecting a wide range of host bacteria present in almost any environment from common garden soil to volcanic substrates and from freshwater streams to oceans [1]. Among these hosts, members of the order Corynebacteriales—including Gordonia, Mycobacterium, Nocardia, and Rhodococcus—are of particular importance to agriculture, biotechnology, and medicine as the outer membrane of their bacterial cells, which consists of long-chain hydroxylated mycolic acids, frequently leads to complications during the prevention, treatment, and cure of opportunistic pathogens [2]. Moreover, due to the hydrophobic nature of this “mycomembrane”, Corynebacteriales often cause severe problems during wastewater treatment as they can stabilize foams on the surface of aeration tanks during the activated sludge phase [3], which not only complicates sludge management and increases maintenance costs but also poses a health hazard to wastewater treatment plant workers in their aerosolized form [4]. Owing to the growing scarcity of clean water across the globe, treated wastewater serves as an important alternative to freshwater for many nations with more than 35% of agricultural irrigation, 17% of landscape irrigation, and 12% of groundwater recharge in the United States stemming from treated wastewater [5]. However, microbial hazards, such as multi-drug resistant bacterial pathogens, are frequently discharged into sewage systems due to the common usage of antibiotics in animal farms and on crop fields. Consequently, effective wastewater treatment strategies are indispensable to combat environmental and health concerns for farmers and consumers alike [6]. 
Due to their host specificity, lytic bacteriophages have been proposed as promising and environmentally-friendly bacterial treatment and control agents to remove harmful (or otherwise problematic) bacteria—such as gram-positive Gordonia which are associated with both systemic infections in immunocompromised and local infections in immunocompetent individuals [7,8] as well as sludge foaming [9,10]—while maintaining desirable microorganisms in the wastewater. To effectively guide these biological control strategies, bacteriophages and their host ranges (i.e., the bacterial genera and species a bacteriophage is able to infect) must be well-characterized—yet, the diversity of Gordonia bacteriophages remains largely unexplored. As part of a course-based undergraduate research experience at Arizona State University, we computationally inferred putative host ranges of all Gordonia cluster DR bacteriophages known to date to aid the design and improvement of future wastewater treatment strategies.

2. Materials and Methods

Genomic data for Gordonia cluster DR bacteriophages (Supplementary Table S1) were explored using Phamerator [11], and phylogenetic relationships were characterized together with representative Microbacterium, Mycobacterium, and Streptomyces bacteriophages as outgroups (Supplementary Table S2). Specifically, MAFFT v.7 [12] embedded within the EMBL-EBI Bioinformatics Toolkit [13,14] was used to generate a multiple-sequence alignment of the bacteriophage sequences. The resulting alignment was then used to generate a neighbor-joining tree in MEGA X [15] using a phylogeny test with 10,000 bootstrap replicates. Nucleotide sequence relatedness was assessed using Gepard v.2.1.0 [16]. Pairwise average nucleotide identities (ANIs) were calculated using the “Genome Comparison” tool embedded within DNA Master v.5.23.6 and plotted using the ggplot2 package [17] in R v.4.1.0. Following the best practices suggested by Versoza and Pfeifer [18], a combination of exploratory and confirmatory methods was utilized to computationally predict host ranges of the closely-related Gordonia cluster DR bacteriophages. Specifically, putative host ranges were predicted using two machine-learning based prediction tools, CHERRY [19] and PHERI v.0.2 [20], as well as the alignment-free prediction tool WIsH v.1.1 [21], together with genomic data from ten putative bacterial host species spanning three genera (Gordonia, Nocardia, and Rhodococcus) and, as a negative control, Escherichia coli (Supplementary Table S3). All software was executed using default settings.
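To make the notion of pairwise nucleotide identity concrete, the following minimal Python sketch computes percent identity between every pair of sequences in an existing multiple-sequence alignment. It is only a simplified proxy: the algorithm used by the “Genome Comparison” tool in DNA Master is not described here, and the function names (`pairwise_identity`, `identity_matrix`) and the toy sequences are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Percent identity for every unordered pair of aligned sequences."""
    return {
        (x, y): pairwise_identity(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment with made-up sequences (assumption: not real phage data).
toy_alignment = {
    "phage_A": "ATGC-ATGCA",
    "phage_B": "ATGCTATGGA",
    "phage_C": "TTGC-ATGCA",
}
matrix = identity_matrix(toy_alignment)
for pair, identity in matrix.items():
    print(pair, round(identity, 1))
```

Skipping gapped columns is one of several possible conventions; counting gaps as mismatches would give lower identities for sequences that differ by insertions or deletions.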

3. Results

To confirm cluster membership, the genomes of Gordonia cluster DR bacteriophages were investigated. They show a high level of sequence similarity, with the left arm of the genomes mostly encoding well-conserved structural and assembly proteins (including a terminase, portal protein, capsid maturation protein, major capsid hexamer and pentamer proteins, a head-to-tail adaptor, tail assembly protein, tape measure protein, minor tail protein subunits, lysin A, lysin B, and several genes responsible for integration into the host). Of particular interest is the RuvC-like resolvase (Supplementary Figure S1), a Holliday junction resolving enzyme that is a distant relative of the RuvC proteins present in gram-negative bacteria such as Escherichia coli [22]. It closely resembles the RuvC-like endonucleases found in select Siphoviridae and Myoviridae bacteriophages infecting Streptococcus and Lactococcus hosts [23,24], which may hint at a shared evolutionary history. The right arm of the genomes contains non-structural genes (including an exonuclease, DNA helicase, DNA polymerase, and HNH endonuclease). Notably, several cluster DR bacteriophages exhibit a partial toxin/antitoxin (TA) system (Supplementary Figure S2). Prevalent in many archaea and bacteria, TA systems encode a toxin protein and a corresponding antitoxin, in the form of a protein or non-coding RNA, and serve as a defense mechanism against invading bacteriophages [25,26]. As bacteriophages co-evolve with their bacterial hosts [27], adaptations to such defense mechanisms are common [28], allowing bacteriophages to inactivate bacteria-encoded toxins [29,30]. Indeed, the TA system of the cluster DR bacteriophages is homologous to the hicA TA system frequently present in Burkholderia pseudomallei, E. coli, and Pseudomonas aeruginosa [31,32,33].
To elucidate phylogenetic relationships, comparative analyses were performed between all Gordonia cluster DR bacteriophages known to date (Supplementary Table S1). Following Pope and colleagues [34], clustering was based on nucleotide similarity and shared gene content, with bacteriophages sharing at least 35% of genes being grouped into clusters. A neighbor-joining tree confirmed membership in the DR cluster (Supplementary Figure S3a), an assignment that was further supported by both the dot plot analyses (Supplementary Figure S4) and the pairwise average nucleotide identities (Supplementary Figure S5). Interestingly, gene trees of the RuvC-like resolvase (Supplementary Figure S3b) and the hicA-like toxin (Supplementary Figure S3c) do not recapitulate the whole-genome phylogeny; however, it is unclear whether this is due to inconsistent resampling during bootstrapping caused by the short sequence length [35] or to the mosaic architecture of the genome caused by horizontal gene transfer by illegitimate recombination [36,37,38]. Compared to temperate bacteriophages, gene acquisition and gene loss in lytic bacteriophages are less well understood [39]. However, there have been previous reports of gene transfers in T4-like and T7-like bacteriophages [40,41], and lytic bacteriophages with large genomes have been suggested to have acquired genes from donor genomes [42].
Due to their bactericidal nature, bacteriophages are frequently used for a variety of agricultural, biotechnological, and medical applications [43]. To effectively guide the usage of bacteriophages in these areas, their host ranges first have to be determined (see discussion in [18]). To investigate the host ranges of the closely related cluster DR bacteriophages, a combination of exploratory and confirmatory prediction tools was utilized together with a dataset of ten putative bacterial host species and E. coli as a negative control (Supplementary Table S3).
Specifically, the tested host dataset spans three genera of the order Corynebacteriales (Gordonia, Nocardia, and Rhodococcus) that have been implicated in activated sludge foaming in wastewater treatment plants [44]. Using the exploratory method PHERI [20], seven out of nine cluster DR bacteriophages were predicted to infect hosts of the genus Gordonia (Table 1), the exceptions being bacteriophages AnClar and Yago84. To make host range predictions for newly encountered bacteriophages, PHERI utilizes a decision tree classifier of annotated protein clusters of bacteriophages with known hosts. Consequently, bacteriophages will only be predicted to infect a particular host if their protein profile closely matches that of another bacteriophage known to infect that host. As minor tail proteins play an essential role in bacteriophage infection [45], the lack of similarity in the minor tail protein profiles of AnClar and Yago84 compared to those of bacteriophages known to infect Gordonia hosts might explain why neither was predicted to infect the Gordonia genus, despite having been isolated in G. terrae (Supplementary Table S1). In fact, the clades observed within the gene tree of the minor tail protein shared across all cluster DR bacteriophages (Supplementary Figure S3d) reflect the clustering of the bacteriophages with respect to host range, reiterating the importance of tail proteins for host infection. The exploratory method CHERRY [19], a graph convolutional encoder and decoder that relies on a broader range of features (including protein organization, sequence similarity, and k-mer frequency) to predict host ranges, highlights M. smegmatis, G. terrae, and R. hoagii as the three most likely host candidates for all cluster DR bacteriophages (though the prediction scores for the latter two fell below the recommended confidence threshold of 0.9).
Conversely, the confirmatory method WIsH [21]—based on a Markov model that determines the k-mer similarity between bacteriophage and host genomes—predicted G. hydrophobica, G. malaquae, G. rubripertincta, and G. terrae as potential hosts for all nine cluster DR bacteriophages relative to the negative control, E. coli (Figure 1). Moreover, log likelihood values for putative Nocardia and Rhodococcus hosts were comparable to those of Gordonia, suggesting the potential for a much broader host range. Interestingly, BiggityBass exhibits the broadest predicted host range among all cluster DR bacteriophages, spread across five different phyla (Table 1), making it an appealing agent to explore for future wastewater treatment strategies [46].
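The idea of scoring bacteriophage-host pairs with a Markov model can be illustrated with a minimal Python sketch: a fixed-order Markov model is trained on each candidate host genome, and the bacteriophage sequence is then scored by its average log-likelihood under each model. This is not the WIsH implementation; the model order of 2, the add-one pseudocount, the function names, and the toy sequences are all assumptions made for illustration only.

```python
import math
from collections import defaultdict
from typing import Callable, DefaultDict

ALPHABET = "ACGT"
LogProb = Callable[[str, str], float]


def train_markov_model(genome: str, order: int = 2, pseudocount: float = 1.0) -> LogProb:
    """Estimate log P(next base | previous `order` bases) from a host genome."""
    counts: DefaultDict[str, DefaultDict[str, float]] = defaultdict(lambda: defaultdict(float))
    genome = genome.upper()
    for i in range(order, len(genome)):
        context, base = genome[i - order:i], genome[i]
        if base in ALPHABET and all(c in ALPHABET for c in context):
            counts[context][base] += 1.0

    def log_prob(context: str, base: str) -> float:
        # Pseudocounts keep unseen transitions from having zero probability.
        ctx_counts = counts.get(context, {})
        total = sum(ctx_counts.values()) + pseudocount * len(ALPHABET)
        return math.log((ctx_counts.get(base, 0.0) + pseudocount) / total)

    return log_prob


def average_log_likelihood(phage: str, log_prob: LogProb, order: int = 2) -> float:
    """Mean per-base log-likelihood of a phage sequence under a host model."""
    phage = phage.upper()
    total, scored = 0.0, 0
    for i in range(order, len(phage)):
        context, base = phage[i - order:i], phage[i]
        if base in ALPHABET and all(c in ALPHABET for c in context):
            total += log_prob(context, base)
            scored += 1
    return total / scored if scored else float("-inf")


# Toy sequences (assumption: synthetic, not real genomes).
hosts = {
    "gc_rich_host": "GCGCCGGCGCGGCCGCGGCG" * 20,
    "at_rich_host": "ATATTAATATAATTATAATA" * 20,
}
toy_phage = "GCGGCGCGGCCGCGCCGG"

scores = {
    name: average_log_likelihood(toy_phage, train_markov_model(genome))
    for name, genome in hosts.items()
}
best_host = max(scores, key=scores.get)
print(scores, best_host)
```

In this toy setting, the GC-rich phage fragment scores higher under the GC-rich host model. Averaging per base makes scores comparable across sequences of different lengths, and comparing each score against that of a negative control mirrors the use of E. coli in the analysis above.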
Table 1

Putative host ranges as predicted by PHERI. Putative hosts of the nine Gordonia cluster DR bacteriophages included in this study (Supplementary Table S1) predicted by PHERI [20].

Gordonia Arthrobacter Aeromonas Staphylococcus Shigella Corynebacterium Stenotrophomonas
AnarQue
AnClar
BiggityBass
CloverMinnie
Ligma
Mariokart
NHagos
Sour
Yago84
Figure 1

Putative host ranges as predicted by WIsH. Heatmap of log-likelihoods of bacteriophage-host pairs—including nine Gordonia cluster DR bacteriophages (Supplementary Table S1) as well as ten potential bacterial hosts and E. coli as a negative control (Supplementary Table S3)—generated by the host prediction tool WIsH [21]. Higher values correspond to more likely interactions.

In conclusion, computational methods can offer a first glimpse into the putative host ranges of newly discovered bacteriophages; yet, it is important to remember that these methods are predictive by their very nature, and that each computational method has its own advantages and limitations. For example, tools that rely solely on k-mer-based models can lead to an overprediction of host ranges if convergent evolution resulted in similar nucleotide frequency patterns [47], whereas tools that rely on machine learning are inherently limited in their predictions by the bacteriophage-host datasets available for training [18]. Experimental validation through bacteriophage isolation and cultivation remains the “gold standard” in determining bacteriophage host ranges; however, it is not without its own limitations, as not all microbial hosts are amenable to cultivation in the laboratory and, even if they are, results may depend on the conditions under which the experiments were performed [18]. Given the ever-growing knowledge of bacteriophage diversity across the globe, it is our hope that future computational and experimental research will go hand in hand to further explore polyvalent bacteriophages as an interesting study system to gain a better understanding of the molecular and genetic determinants underlying host range.
  54 in total

Review 1.  Rhodococcal systematics: problems and developments.

Authors:  M Goodfellow; G Alderson; J Chun
Journal:  Antonie Van Leeuwenhoek       Date:  1998 Jul-Oct       Impact factor: 2.271

Review 2.  Global phage diversity.

Authors:  Forest Rohwer
Journal:  Cell       Date:  2003-04-18       Impact factor: 41.582

3.  A selective barrier to horizontal gene transfer in the T4-type bacteriophages that has preserved a core genome with the viral replication and structural genes.

Authors:  Jonathan Filée; Eric Bapteste; Edward Susko; H M Krisch
Journal:  Mol Biol Evol       Date:  2006-06-16       Impact factor: 16.240

4.  Insights into the microbial degradation of rubber and gutta-percha by analysis of the complete genome of Nocardia nova SH22a.

Authors:  Quan Luo; Sebastian Hiessl; Anja Poehlein; Rolf Daniel; Alexander Steinbüchel
Journal:  Appl Environ Microbiol       Date:  2014-04-18       Impact factor: 4.792

5.  WIsH: who is the host? Predicting prokaryotic hosts from metagenomic phage contigs.

Authors:  Clovis Galiez; Matthias Siebert; François Enault; Jonathan Vincent; Johannes Söding
Journal:  Bioinformatics       Date:  2017-10-01       Impact factor: 6.937

6.  Genome structure of mycobacteriophage D29: implications for phage evolution.

Authors:  M E Ford; G J Sarkis; A E Belanger; R W Hendrix; G F Hatfull
Journal:  J Mol Biol       Date:  1998-05-29       Impact factor: 5.469

Review 7.  The junction-resolving enzymes.

Authors:  D M Lilley; M F White
Journal:  Nat Rev Mol Cell Biol       Date:  2001-06       Impact factor: 94.444

8.  Exploring the mycobacteriophage metaproteome: phage genomics as an educational platform.

Authors:  Graham F Hatfull; Marisa L Pedulla; Deborah Jacobs-Sera; Pauline M Cichon; Amy Foley; Michael E Ford; Rebecca M Gonda; Jennifer M Houtz; Andrew J Hryckowian; Vanessa A Kelchner; Swathi Namburi; Kostandin V Pajcini; Mark G Popovich; Donald T Schleicher; Brian Z Simanek; Alexis L Smith; Gina M Zdanowicz; Vanaja Kumar; Craig L Peebles; William R Jacobs; Jeffrey G Lawrence; Roger W Hendrix
Journal:  PLoS Genet       Date:  2006-06-09       Impact factor: 5.917

Review 9.  Toxin-antitoxin systems: Biology, identification, and application.

Authors:  Simon J Unterholzner; Brigitte Poppenberger; Wilfried Rozhon
Journal:  Mob Genet Elements       Date:  2013-08-20

Review 10.  Computational Prediction of Bacteriophage Host Ranges.

Authors:  Cyril J Versoza; Susanne P Pfeifer
Journal:  Microorganisms       Date:  2022-01-12