Szymon P Szafrański, Mogens Kilian, Ines Yang, Gesa Bei der Wieden, Andreas Winkel, Jan Hegermann, Meike Stiesch.
Abstract
Aggregatibacter and Haemophilus species are relevant human commensals and opportunistic pathogens. Consequently, their bacteriophages may have a significant impact on human microbial ecology and pathologies. Our aim was to reveal the prevalence and diversity of bacteriophages infecting Aggregatibacter and Haemophilus species that colonize the human body. Genome mining with comparative genomics, screening of clinical isolates, and profiling of metagenomes allowed characterization of 346 phages grouped in 52 clusters and 18 superclusters. Less than 10% of the identified phage clusters were represented by previously characterized phages. Prophage diversity patterns varied significantly for different phage types, host clades, and environmental niches. A more diverse phage community lysogenizes Haemophilus influenzae and Haemophilus parainfluenzae strains than Aggregatibacter actinomycetemcomitans and "Haemophilus ducreyi". Co-infections occurred more often in "H. ducreyi". Phages from Aggregatibacter actinomycetemcomitans preferentially lysogenized strains of a specific serotype. Prophage patterns shared by subspecies clades of different bacterial species suggest similar ecoevolutionary drivers. Changes in frequencies of DNA uptake signal sequences and guanine-cytosine content reflect long-term phage-host coevolution. Aggregatibacter and Haemophilus phages were prevalent at multiple oral sites. Together, these findings should help in exploring the ecoevolutionary forces shaping virus-host interactions in the human microbiome. Putative lytic phages, especially phiKZ-like phages, may provide new therapeutic options.
Year: 2019 PMID: 31201356 PMCID: PMC6776037 DOI: 10.1038/s41396-019-0450-8
Source DB: PubMed Journal: ISME J ISSN: 1751-7362 Impact factor: 11.217
Fig. 1 Diversity of prophages assessed by DNA similarity matrix analysis. a Overview dot plot of 243 prophage sequences arranged in groups, superclusters, and clusters. The dot plot compares the sorted and merged prophage sequences (combined length of 9.5 megabases) on the x-axis with the same collection of sequences on the y-axis. Where the DNA residues of both sequences match, a dot is drawn at the corresponding position. Adjacent dots combine to form lines, and dense groups of lines form black squares that correspond to clusters of similar genomes. The bigger the cluster, the higher the prevalence of prophages belonging to that cluster. The main diagonal represents the alignment of the sequence with itself; lines off the main diagonal but close to it represent similar patterns within closely related phage genomes. Signals farther from the diagonal represent similar patterns within more distantly related phage genomes. The two halves of the graph on either side of the diagonal provide identical information; both were kept for plot clarity. Enlarged sections are shown for phages assigned to the superclusters SuMu-like (b), Mu- and B3-like (c), BcepMu-like (d), MHaA1- and HP1-like (e), and lambda-like (f, g). Clusters (numbers 1–36) are highlighted with black frames. Superclusters and main groups are labeled and indicated by black or gray stripes. DNA coordinates for the merged sequence are given in the plot corners.
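The dot plot construction described in the Fig. 1 legend can be illustrated with a minimal Python sketch. It is not the software used in the study: the word length, the two toy sequences, and the function names `dot_plot` and `render` are assumptions made only for illustration. Matching words between the merged sequence and itself produce the main diagonal, and shared segments between different genomes produce off-diagonal lines.

```python
from collections import defaultdict


def dot_plot(seq_x, seq_y, word=4):
    """Return sorted (x, y) positions where a word of length `word` matches exactly."""
    index = defaultdict(list)
    for i in range(len(seq_x) - word + 1):
        index[seq_x[i:i + word]].append(i)
    dots = []
    for j in range(len(seq_y) - word + 1):
        # .get() avoids creating empty entries for words absent from seq_x
        for i in index.get(seq_y[j:j + word], []):
            dots.append((i, j))
    return sorted(dots)


def render(dots, width, height):
    """Draw the dots as a text grid: '#' marks a match, '.' marks no match."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for x, y in dots:
        grid[y][x] = "#"
    return "\n".join("".join(row) for row in grid)


# Hypothetical toy "prophages": the second shares the segment GCGTACG with the first.
phage_a = "ATGCGTACGTTAGC"
phage_b = "GGCATGCGTACGCA"
merged = phage_a + phage_b  # sorted and merged collection, as in the legend

word = 4
dots = dot_plot(merged, merged, word=word)
n = len(merged) - word + 1
print(render(dots, n, n))
```

Because the same merged sequence is placed on both axes, the result is symmetric about the diagonal, which is why both halves of such a plot carry identical information.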
Characteristics of phage clusters
| Supercluster | Cluster number | Prophages | Assemblies | OrthoANI (mean ± s.d.) | Mash distance (mean ± s.d.) | Predicted tail morphology | Taxonomy of lysogens | Most related phages | Reported phages | Comments | References |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SuMu-like | 1 | 20 | 0 | 93.9 ± 3.3 | 0.048 ± 0.02 | Contractile | | | | | |
| | 2 | 4 | 2 | 92.4 ± 3.7 | 0.05 ± 0.025 | Contractile | | | | | |
| | 3 | 3 | 0 | 100 ± 0.0 | 0.002 ± 0.002 | Contractile | | | | | |
| | 4 | 1 | 1 | – | – | Contractile | | | | | |
| | 5 | 18 | 0 | 90.9 ± 0.1 | 0.01 ± 0.008 | Flexible | | | | DNA homology with phages from clusters 8 and 12 | Gangaiah et al., [ |
| | 37 | 0 | 1 | – | – | Flexible | | | | | Paez-Espino et al., [ |
| Mu-like | 6 | 3 | 0 | 95.1 ± 4.3 | 0.038 ± 0.033 | Contractile | | | | | |
| | 7 | 3 | 0 | 93.9 ± 3.3 | 0.05 ± 0.041 | Contractile | | | FluMu | | Morgan et al., [ |
| | 8 | 4 | 0 | 100 ± 0.0 | 0.001 ± 0.001 | Flexible | | | | DNA homology with phages from clusters 5 and 12 | Gangaiah et al., [ |
| | 9 | 13 | 0 | 99.9 ± 0.1 | 0.001 ± 0.001 | Flexible | | | | | Gangaiah et al., [ |
| | 10 | 3 | 0 | 91.3 ± 6.7 | 0.067 ± 0.052 | Flexible | | | | | |
| | 11 | 1 | 0 | – | – | Flexible | | | | | |
| B3-like | 12 | 11 | 0 | 93.7 ± 6.1 | 0.058 ± 0.05 | Flexible | | | | DNA homology with phages from clusters 5 and 8 | Gangaiah et al., [ |
| | 13 | 6 | 0 | 90.0 ± 6.2 | 0.082 ± 0.051 | Flexible | | | | | |
| BcepMu-like | 14 | 22 | 0 | 98.7 ± 1.7 | 0.021 ± 0.021 | Contractile | | | | | |
| | 15 | 2 | 0 | 100 | – | Contractile | | | | | |
| | 16 | 2 | 0 | 88.5 | – | Contractile | | | | | |
| | 17 | 4 | 2 | 99.2 ± 0.6 | 0.008 ± 0.005 | Contractile | | | | | |
| | 18 | 2 | 0 | 100 | – | Contractile | | | | | |
| MHaA1-like | 19 | 16 | 5 | 93.7 ± 2.4 | 0.054 ± 0.02 | Contractile | | | | | |
| | 20 | 5 | 7 | 92.3 ± 2.8 | 0.054 ± 0.018 | Contractile | | | | | |
| | 38 | 0 | 2 | – | – | Contractile | | | | | Paez-Espino et al., [ |
| HP1-like | 21 | 10 | 3 | 95.5 ± 2.5 | 0.043 ± 0.018 | Contractile | | | HP1, HP2 | | Harm and Rupert, [ |
| | 22 | 3 | 16 | 90.9 ± 1.2 | 0.066 ± 0.019 | Contractile | | | | One of phages has head morphogenesis protein resembling one from | |
| | 39 | 0 | 3 | – | – | Contractile | | | | | Paez-Espino et al., [ |
| | 40 | 0 | 3 | – | – | Contractile | | | | | Paez-Espino et al., [ |
| DIBBI-like | 23 | 17 | 0 | 97.3 ± 1.3 | 0.043 ± 0.017 | – | | | | OrthoANI indicated three outliers, DNA homology with some phages from cluster 31 | |
| Aaphi23-like | 24 | 17 | 0 | 94.2 ± 3.5 | 0.07 ± 0.038 | – | | | | | |
| | 25 | 4 | 0 | 97.1 ± 1.6 | 0.029 ± 0.015 | Contractile | | | Aaphi23, S1249 | | Resch et al., [ |
| | 41 | 0 | 8 | – | – | | | | | Two phages have similar head morphogenesis protein like phages from cluster 24 | Paez-Espino et al., [ |
| 587AP2-like | 26 | 2 | 1 | 99.9 | 0.000 | – | | | | | |
| 1152AP2-like | 27 | 1 | 2 | – | – | – | | | | | |
| | 28 | 3 | 0 | 95.8 ± 3.8 | 0.06 ± 0.051 | Flexible | | | | | |
| | 29 | 1 | 0 | – | – | – | | | | | |
| | 30 | 1 | 0 | – | – | – | | | | | |
| Gifsy2-like | 31 | 28 | 2 | 93.6 ± 4.7 | 0.057 ± 0.036 | Flexible | | Multiple phages | | Some show DNA homology with phages from cluster 23 | |
| | 42 | 0 | 1 | – | – | Flexible | | | | | Paez-Espino et al., [ |
| | 43 | 0 | 5 | – | – | Flexible | | | | | Paez-Espino et al., [ |
| HK97-like | 32 | 1 | 0 | – | – | Flexible | | | | | |
| | 33 | 1 | 1 | – | – | Flexible? | | | | | |
| | 34 | 1 | 0 | – | – | Flexible | | | | | |
| | 35 | 3 | 18 | 93.8 ± 4.7 | 0.046 ± 0.003 | Flexible | | | | | |
| | 44 | 0 | 1 | – | – | Flexible | | | | | Paez-Espino et al., [ |
| P22-like | 36 | 3 | 0 | – | 0.024 ± 0.019 | – | | | | | |
| PA73-like | 45 | 0 | 2 | – | – | Flexible | | | | Potentially defective | Paez-Espino et al., [ |
| unknown 1 | 46 | 0 | 6 | – | – | – | | | | Potentially defective | Paez-Espino et al., [ |
| phiKZ-like | 47 | 0 | 4 | – | – | – | – | | | Genome size of 279 kbp, likely lytic, pseudolysogeny possible | Paez-Espino et al., [ |
| | 48 | 0 | 2 | – | – | – | – | | | Genome size of 241–254 kbp, likely lytic, pseudolysogeny possible | Paez-Espino et al., [ |
| | 49 | 0 | 2 | – | – | – | – | | | Genome size of 220 kbp, likely lytic, pseudolysogeny possible | Paez-Espino et al., [ |
| | 50 | 0 | 1 | – | – | – | – | | | Genome size of 211 kbp, likely lytic, pseudolysogeny possible | Paez-Espino et al., [ |
| unknown 2 | 51 | 0 | 1 | – | – | – | – | | | Genome size of 190 kbp | Paez-Espino et al., [ |
| S13-like | 52 | 0 | 5 | – | – | – | – | | | Genome size of 151–152 kbp, likely lytic | Paez-Espino et al., [ |
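
As a reading aid for the "Mash distance (mean ± s.d.)" column, the sketch below shows how a Mash-style distance relates to k-mer sharing and how a within-cluster mean and standard deviation could be summarized. It is a simplified illustration, not the study's pipeline: it uses exact k-mer sets instead of MinHash sketches, ignores strand canonicalization, and the function names, the default `k=21`, and the toy sequences are assumptions. The distance formula D = (1/k) ln((1 + j) / (2j)), with j the Jaccard index, is the published Mash formula.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def kmers(seq, k):
    """Set of all k-mers in a sequence (forward strand only in this sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_distance(seq_a, seq_b, k=21):
    """Mash-style distance computed from the exact Jaccard index of k-mer sets."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        raise ValueError("sequences are shorter than k")
    j = len(a & b) / len(union)
    if j == 0:
        return 1.0  # no shared k-mers: maximal distance
    return math.log((1 + j) / (2 * j)) / k


def within_cluster_summary(genomes, k=21):
    """Mean and s.d. of all pairwise distances; s.d. is None for a single pair."""
    distances = [mash_distance(x, y, k) for x, y in combinations(genomes, 2)]
    sd = stdev(distances) if len(distances) > 1 else None
    return mean(distances), sd


# Hypothetical toy genomes differing by one substitution (k shortened to fit them).
g1 = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
g2 = "ACGTTGCAAGGCTTAGCCGGATCGATTACG"
print(mash_distance(g1, g2, k=5))
print(within_cluster_summary([g1, g2, g1], k=5))
```

A cluster with only two members yields a single pairwise value and therefore no standard deviation, which is consistent with rows of the table that report a mean without a spread.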
Fig. 2 Diversity of prophages assessed by marker protein-based phylogeny. Genome segments coding for head morphogenesis and DNA packaging are shown for transposable (a), P2-like (b), and lambda-like (c) phages. Conserved domains are indicated and conserved marker proteins are highlighted. Blue frames in b indicate transcription in the opposite direction (from right to left). Phylogenetic trees were constructed for transposable (d), P2-like (e), and lambda-like (f, g, h) phages. Reference sequences were included [32, 33, 99]. The evolutionary history was inferred by using the Maximum Likelihood method. The percentage of trees in which the associated taxa clustered together is shown next to the branches. The trees are drawn to scale, with branch lengths measured in the number of substitutions per site. Trees were collapsed at cluster level. The number and length of studied sequences were as follows: 127 sequences with 324 amino acid positions (d), 55 with 300 aa (e), 63 with 131 aa (f), 37 with 554 aa (g), and 11 with 354 aa (h). Branches are colored to highlight superclusters. Abbreviated host taxonomy is given in brackets for each cluster. Full species names are listed in d.
Fig. 3 Prevalence and diversity patterns of phages across species and subspecies clades. a Number of prophages per genome plotted for the four species that were best represented in the genome database. b Number of prophages per genome plotted for subspecies clades of “H. ducreyi”, H. influenzae, and A. actinomycetemcomitans. Discrete population structure at subspecies resolution was inferred from whole-genome sequences [41–43]. c Prevalence of phage superclusters plotted across the four species. d Rarefaction curves were constructed to assess phage cluster richness from the results of sampling the genomes. Mean values are plotted and error bars represent 95% CIs. The smallest sample size is indicated by a vertical dashed line. Prevalence of phage clusters across subspecies clades was plotted for “H. ducreyi” (e), A. actinomycetemcomitans (f), and H. influenzae (g). Bar graphs and error bars show mean values and 95% CIs, respectively. The number of observations per group is given in brackets. In a and b, significant differences between group means were detected by one-way ANOVA with post hoc Tukey test. In c and e–g, Fisher’s exact test (two-sided) was used to analyze the significance of the association between the presence of phages (grouped in superclusters or clusters) and host clades. Bonferroni correction was applied to compensate for multiple comparisons. The number of observations is given in brackets following the clade name. Symbols ***, **, *, and # indicate p < 0.001, p < 0.01, p < 0.05, and a significant p value prior to Bonferroni correction, respectively.
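The testing scheme named in this legend (a two-sided Fisher's exact test per comparison, followed by Bonferroni correction) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' analysis code: the function names are invented here and the counts are hypothetical, not taken from the study. The two-sided p value sums the hypergeometric probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties with the observed table
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: genomes with / without a given phage cluster,
# in one host clade (first row) versus all other clades (second row).
tables = {
    "clade A": (8, 2, 1, 5),
    "clade B": (3, 1, 1, 3),
    "clade C": (5, 5, 4, 6),
}
raw = {name: fisher_exact_two_sided(*t) for name, t in tables.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
for name in tables:
    # "#" in the legend marks a test significant before but not after correction
    flag = "#" if raw[name] < 0.05 <= adjusted[name] else ""
    print(name, round(raw[name], 4), round(adjusted[name], 4), flag)
```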
Fig. 4 Characterization of Aggregatibacter phages. Prevalence of phages representing three clusters in different A. actinomycetemcomitans serotypes/clades (a) and clinical groups (b). Fisher’s exact test (two-sided) was used to analyze the significance of the association between the presence of phages and either host lineages or diseases. Bonferroni correction was applied to compensate for multiple comparisons. The number of observations is given in brackets following the clade name. Symbols ***, **, *, and # indicate p < 0.001, p < 0.01, p < 0.05, and a significant p value prior to Bonferroni correction, respectively. c Antimicrobial activity of conditioned media from A. actinomycetemcomitans cultures treated with Mitomycin C. Results of the drop spot assay are presented. Sources of conditioned media are listed in rows, while the indicator strains are given in columns. Strains were grouped by prophage type and serotype/clade. Ser. is an abbreviation for serotype/clade. Electron micrographs of an Aaphi23-like phage with a contractile tail and a transposable phage with a flexible tail, both induced from A. actinomycetemcomitans, are shown in d and e, respectively. f Overview dot plot of newly sequenced genomes arranged in clusters. The dot plot compares the sorted and merged sequences of reference prophages on the x-axis with the control and new sequences on the y-axis. More details about the dot plot method can be found in the legend of Fig. 1. Control and new sequences are indicated by black and gray stripes, respectively. DNA coordinates for the merged sequence are given in the corners.
Fig. 5 Prevalence of phage clusters among human microbiomes. Prevalence of phages grouped into superclusters (a) and clusters (b) at three oral sites, as assessed by metagenome analysis. The dataset from [26] was used. The predicted host of each phage is indicated in b. Fisher’s exact test (two-sided) was used to analyze the significance of the association between the phages and sites. Bonferroni correction was applied to compensate for multiple comparisons. The number of observations is given in brackets following the site name. Symbols ***, **, *, and # indicate p < 0.001, p < 0.01, p < 0.05, and a significant p value prior to Bonferroni correction, respectively.
Fig. 6 Classification of viral assemblies from metagenomes. a Comparison between the prophage dataset and the metagenome dataset. The following information is provided (starting from the inner ring): phage group, phage supercluster, phage cluster, size of clusters and corresponding host species in the prophage dataset, and size of clusters and corresponding host species in the metagenome dataset [27]. Assignment of IMG/VR assemblies to clusters is indicated by solid line linkers. b Dendrograms based on full-genome comparison for selected clusters, constructed with VICTOR [30]. New sequences introduced by metagenomes are indicated by circle graphs (same as in a). c The phylogenetic tree for phiKZ-like phages was constructed using sequences of the major capsid protein. The evolutionary history was inferred by using the Maximum Likelihood method. The percentage of trees in which the associated taxa clustered together is shown next to the branches. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.
Fig. 7 Frequencies of DNA uptake signal sequences (USSs) and guanine–cytosine content (GC%) in genome sequences of Pasteurellaceae species and their phages. Frequencies of clade-specific USS variants in both orientations and GC% were measured for genomes from diverse phages and human-associated Pasteurellaceae species. Frequencies are given per 1 Mb (i.e., 10^6 nt) of genome sequence. a Scatter plot showing frequencies of H. influenzae USS (Hin-USS) and Actinobacillus pleuropneumoniae USS (Apl-USS) in genomes of phages from the order Caudovirales [25] on the y- and x-axis, respectively. An enlarged fragment of the plot covering low values is shown in the top right corner. A contingency table summarizing the studied groups is shown in the middle. Fisher’s exact test (two-sided) was used to analyze the significance of the association between the phage groups and the presence of USSs at the given cutoff. Phages characterized by noteworthy values are numbered and labeled. b As in a, but prophages classified in this study are presented. c As in a, but IMG/VR assemblies [27] are presented. d GC content was depicted for selected species grouped in clades based on the genetic relationship inferred from concatenated nucleotide sequences (~2650 nt) of 16S rRNA and three housekeeping (infB, pgi, recA) genes [1]. The subfamily clade labels were colored and abbreviated in square brackets throughout the figure. Genome sequences from either the representative National Center for Biotechnology Information (NCBI) strains or the type strains were retrieved from the NCBI genome database. e Frequencies of Hin- and Apl-USS in the same genomes as in d. f GC content (mean ± s.d.) was depicted for phage clusters gathered in groups and superclusters (the latter in brown throughout the figure). g Frequencies (mean ± s.d., per 1 Mb) of Hin-USS and Apl-USS in prophage sequences grouped as in f. For clusters depicted in f and g, additional information is provided: the cluster size (the number of studied phages, written in parentheses), the predicted or confirmed morphology of the phage tail, written in light blue throughout the figure (C, contractile; F, flexible; ?, unknown), and the abbreviated clade name for the phage host. h Ordination of bacterial species and phage clusters was constructed based on GC% and USS frequencies from d to g. The Bray-Curtis coefficient was calculated between every pair of samples using three variables: ΔGC (i.e., GC content reduced by the minimal GC in the studied dataset), frequency of Hin-USS, and frequency of Apl-USS, each standardized by its maximum (i.e., values were scaled so that the maximum of each of the three variables was always 100). Non-metric multidimensional scaling (nMDS) was used to represent the samples in two-dimensional space. Points were colored based on bacterial clades and phage host (labels starting with “C”). Superimposed is a vector plot for the three variables (in red), with the vector direction for each variable reflecting the Pearson correlations of its values with the ordination axes, and the length giving the multiple correlation coefficient from this linear regression on the ordination points. A 2D-stress of 0.12 was observed. The same ordination was used in i and j. i Supercluster assignment was plotted for all phage clusters. The location of bacterial species is indicated by a gray “x”. j Phage tail morphology for all phage clusters is given. k Mechanism of USS accumulation in prophages. Picture adapted from [52].
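The quantities behind Fig. 7 (GC%, USS frequency per 1 Mb in both orientations, and the Bray-Curtis coefficient on variables standardized by their maximum) can be sketched as follows. Several points are assumptions and not stated in this legend: the 9-nt core motifs used for Hin-USS and Apl-USS, exact-match counting, all function names, and the toy inputs. The nMDS step itself is not implemented here; the resulting dissimilarities would be handed to an ordination tool.

```python
# Assumed 9-nt core motifs; the legend does not spell out the sequences used.
HIN_USS = "AAGTGCGGT"
APL_USS = "ACAAGCGGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count exact occurrences of motif in seq, allowing overlaps."""
    count = start = 0
    while True:
        pos = seq.find(motif, start)
        if pos == -1:
            return count
        count += 1
        start = pos + 1


def uss_per_mb(seq, motif):
    """USS frequency in both orientations, scaled to 1 Mb (10^6 nt)."""
    seq = seq.upper()
    hits = count_overlapping(seq, motif) + count_overlapping(seq, reverse_complement(motif))
    return hits * 1_000_000 / len(seq)


def gc_percent(seq):
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def standardized_profiles(samples):
    """samples: list of (GC%, Hin-USS per Mb, Apl-USS per Mb) tuples.

    GC% is reduced by the dataset minimum (delta GC); each variable is then
    scaled so that its maximum across samples equals 100.
    """
    min_gc = min(s[0] for s in samples)
    rows = [(s[0] - min_gc, s[1], s[2]) for s in samples]
    maxima = [max(r[i] for r in rows) or 1.0 for i in range(3)]
    return [tuple(100.0 * r[i] / maxima[i] for i in range(3)) for r in rows]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative profiles."""
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


# Hypothetical toy values for two samples: (GC%, Hin-USS/Mb, Apl-USS/Mb)
profiles = standardized_profiles([(38.0, 1000.0, 0.0), (42.0, 0.0, 500.0)])
print(profiles, bray_curtis(profiles[0], profiles[1]))
```

Subtracting the minimal GC before scaling keeps the GC axis from being dominated by the baseline that all genomes share, so that differences between samples carry the same weight as differences in USS frequency.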