Ying Liu, Changhan Lee, Fengyang Li, Janja Trček (1), Heike Bähre (2), Rey-Ting Guo (3), Chun-Chi Chen (3), Alexey Chernobrovkin, Roman Zubarev (4), Ute Römling.
1. Faculty of Natural Sciences and Mathematics, Department of Biology, University of Maribor, 2000 Maribor, Slovenia.
2. Research Core Unit Metabolomics, Hannover Medical School, D-30625 Hannover, Germany.
3. State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, P.R. China.
4. Department of Pharmacological & Technological Chemistry, I.M. Sechenov First Moscow State Medical University, Moscow, 119146, Russia.
Abstract
The ubiquitous cyclic di-GMP (c-di-GMP) network is highly redundant, with numerous GGDEF domain proteins acting as diguanylate cyclases and EAL domain proteins acting as c-di-GMP specific phosphodiesterases; these domains constitute two of the most abundant bacterial domain superfamilies. One hallmark of the c-di-GMP network is its pronounced plasticity, as c-di-GMP turnover proteins can rapidly vanish from species within a genus and possess an above-average transmissibility. To address the evolutionary forces underlying c-di-GMP turnover protein maintenance, conservation, and diversity, we investigated a Gram-positive and a Gram-negative species that have each preserved only a single clearly identifiable GGDEF domain protein. Species of the family Morganellaceae of the order Enterobacterales exceptionally show disappearance of the c-di-GMP signaling network, but Proteus spp. have retained one diguanylate cyclase. As another example, in species of the bovis, pyogenes, and salivarius subgroups as well as Streptococcus suis and Streptococcus henryi of the genus Streptococcus, one candidate diguanylate cyclase was frequently identified. We demonstrate that both proteins encompass PAS (Per-ARNT-Sim)-GGDEF domains and possess diguanylate cyclase catalytic activity, and we suggest that they signal via a PilZ receptor domain at the C-terminus of the type 2 glycosyltransferases that constitute the BcsA cellulose synthase and the cellulose synthase-like protein CelA, respectively. Preservation of the ancient link between production of cellulose(-like) exopolysaccharides and c-di-GMP signaling indicates that this functionality is of high ecological importance even where only the last remnants of a c-di-GMP signaling network are maintained in some of today's free-living bacteria.
Signaling
systems couple sensing
and information transmission and amplification in order to adapt physiology
and metabolism to changing external and internal stimuli. Thus, those
modules are highly prone to mutation and/or horizontal gene transfer.
The cyclic dinucleotide (CDN) molecule bis(3′,5′)-cyclic
diguanosine monophosphate (c-di-GMP), identified in 1987 as an allosteric
activator of the cellulose synthase in the bacterium Komagataeibacter
xylinus (previously Gluconacetobacter (Acetobacter) xylinus (G. xylinus)), is the most abundant CDN-based second messenger signaling system
in bacteria.[1] Cyclic-di-GMP regulates a
multitude of fundamental physiological and metabolic processes, such
as single cell motility-to-sessility transition with the promotion
of biofilm formation, chronic versus acute virulence, antimicrobial
and detergent tolerance, cell cycle progression, nutrient acquisition,
electron transfer, and cell morphology.[2] Essential signaling modules of this pathway comprise the GG(D/E)EF
domain with diguanylate cyclase (DGC) activity and the EAL and HD-GYP
domains with phosphodiesterase (PDE) activity. The GG(D/E)EF domain
synthesizes c-di-GMP from two molecules of GTP in a two-step reaction with 5′-pppGpG
as an intermediate and two molecules of pyrophosphate as byproducts.[3] The EAL and HD-GYP domains hydrolyze c-di-GMP
into linear 5′-pGpG and GMP, respectively.[4,5] Numerous
proteins are bifunctional through a combination of GGDEF with EAL/HD-GYP
domains.

In these three superfamilies, catalytic domains have
evolved into
receptors or act through protein–protein interactions.[6] Although an intact GG(D/E)EF motif is usually
an indicator of catalytic activity, the presence of such a motif, and even
of the extended consensus signature motif(s) that are also required,
including the ligands binding
the divalent ions needed for catalysis, guarantees neither
catalytic activity[7,8] nor substrate specificity,[9,10] and vice versa.[11,12]

The activity of DGCs and
PDEs is controlled by a diversity of N-terminal
sensory domains that receive and respond to various signals, such
as oxygen, nucleotide-based small molecules, and light.[13−15] The N-terminal signaling domain most frequently associated
with these proteins is the versatile PAS (Per-ARNT-Sim) domain.[16,17] With diverse primary sequences of fewer than 150 amino acids,
compact PAS domains possess an interior pocket, characteristically built
by five antiparallel β-strands and flanked by a few α-helices,
that hosts a variety of prosthetic groups and ligands, with a few functional
amino acids determining the binding specificity.[16,18]

Furthermore, c-di-GMP signaling is relayed through protein- and RNA-based
receptors such as the PilZ domain, the MshEN domain, the
inhibitory I-site of GGDEF domains, inactive EAL/HD-GYP domains, various
classes of transcription regulators, and distinct RNA aptamers.[6,19−21] PilZ domains, the first c-di-GMP receptors discovered,
are widespread among bacteria.[22,23] As a fundamental mechanism,
the catalytic activity of the cellulose synthase BcsA and other exopolysaccharide
synthases is regulated by PilZ domains.[24] Riboswitches, consisting of a high affinity CDN binding RNA aptamer
and an expression platform located within the 5′-untranslated
region (5′-UTR), respond to c-di-GMP binding with conformational
changes that alter downstream transcriptional termination, translation,
and ribozyme activity.[19,20] Two classes of c-di-GMP responsive
riboswitches, type I and type II, with Genes
for the Environment, for Membranes and for Motility (GEMM) motifs have
been identified in Gram-negative and Gram-positive species, such as Vibrio cholerae, Geobacter metallireducens, and Clostridium difficile. Compared with protein
receptors, which have dissociation constants (Kd) in the low μM range for c-di-GMP, RNA aptamers have
dissociation constants in the nanomolar range.

In many instances,
cyclic di-GMP activates biofilm formation, a
ubiquitous multicellular sessile lifestyle of bacteria, and represses
motility.[25] This physiological regulation
by c-di-GMP is evolutionarily conserved from ancient thermophiles
to human pathogens where biofilm formation contributes to chronic
infections.[26,27]

In this study, we identified
novel DGCs in two pathogens, the Gram-negative Proteus mirabilis UEB50 and the Gram-positive Streptococcus
gallolyticus subsp. gallolyticus UCN34[28] (S. gallolyticus). To our knowledge,
a functional c-di-GMP signaling network has not previously
been reported for those two species. Moreover, we verified the catalytic
activity of the DGCs in vivo by diverse experimental
approaches, for instance, regulation of downstream protein production
by chromosomally integrated c-di-GMP specific Vc1 and Vc2 translational
riboswitches, a rapid cell-lysate-based matrix-assisted laser desorption/ionization
Fourier transform mass spectrometry (MALDI-FTMS)-based screening approach,
and detection of c-di-GMP by standard liquid chromatography-mass spectrometry
(LC-MS/MS) of cell extracts. Combined with phenotypic analyses of
rdar (red, dry, and rough) biofilm formation and motility
in the heterologous host Salmonella typhimurium,
our results demonstrate PMI3101_v and GGDEFUCN34 to be
active DGCs. Bioinformatic and gene synteny analyses predict that
c-di-GMP produced by PMI3101_v and GGDEFUCN34 activates
exopolysaccharide biosynthesis in their native hosts by binding to
a C-terminal PilZ domain of a type 2 glycosyltransferase. Of note,
the c-di-GMP modules in streptococcal species were found to be highly
variable with respect to the location and genetic context with some
modules even containing predicted EAL phosphodiesterases. Collectively,
our results demonstrate the presence of a functional c-di-GMP signaling
network predominantly in species of a phylogenetically distinct branch
of the genus Streptococcus and maintenance of c-di-GMP
regulated cellulose biosynthesis in Proteus spp.,
which poses the question of the ecological importance of the conservation
of those signaling pathways.
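As a rough illustration of the affinity difference noted above between protein receptors (Kd in the low μM range) and RNA aptamers (Kd in the nanomolar range), the following sketch computes fractional receptor occupancy for a simple one-site equilibrium binding model. The binding model, the two Kd values, and the free c-di-GMP concentration are illustrative assumptions and not measurements from this study.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Fractional occupancy for one-site equilibrium binding: L / (L + Kd)."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


# Hypothetical values chosen only to contrast the two affinity ranges.
FREE_C_DI_GMP_NM = 100.0   # assumed free c-di-GMP concentration
KD_PROTEIN_NM = 1000.0     # assumed protein receptor Kd (1 uM)
KD_APTAMER_NM = 1.0        # assumed RNA aptamer Kd (1 nM)

protein_occupancy = fraction_bound(FREE_C_DI_GMP_NM, KD_PROTEIN_NM)
aptamer_occupancy = fraction_bound(FREE_C_DI_GMP_NM, KD_APTAMER_NM)

print(f"protein receptor occupancy: {protein_occupancy:.2f}")
print(f"RNA aptamer occupancy:      {aptamer_occupancy:.2f}")
```

Under these assumed numbers, the aptamer is almost fully occupied while the protein receptor is largely free, which is one way to picture why receptors with different affinities can respond at different c-di-GMP levels.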
Results
Phylogeny of the c-di-GMP Signaling Network in Selected Gram-Negative and Gram-Positive Genera
The c-di-GMP signaling network can
change rapidly, even on a short evolutionary time scale. Within a genus,
species can possess distinct c-di-GMP networks, highly variable in
numbers and types of GGDEF and EAL domain proteins (https://www.ncbi.nlm.nih.gov/Complete_Genomes/c-di-GMP.html). Disappearance of signaling systems including the c-di-GMP signaling
network can be triggered by a substantial lifestyle change toward
a parasitic invasive intracellular lifestyle.[29] Network reduction on a short evolutionary time scale is exemplified
in the human pathogens Shigella spp. and Yersinia pestis, which, in contrast to their close relatives E. coli and Yersinia enterocolitica, possess
a highly reduced c-di-GMP network.[2,30] An extreme
lifestyle adaptation is even manifested as a dramatic reduction of
genome size with a concomitant reduction of all signaling networks.[31] Additional reasons for a dramatic deterioration
of bacterial signaling systems are unknown. In addition, c-di-GMP
turnover proteins, which are preferentially encoded on transmissible plasmids
and enhance conjugative transfer, have a statistically significantly
higher likelihood of being horizontally transferred.[15,32] We are interested in examining the evolutionary forces that cause
dramatic, consistent alterations in the c-di-GMP signaling network.
Within the class of γ-proteobacteria, the Enterobacterales order
consists of the families Enterobacteriaceae, Erwiniaceae, Yersiniaceae,
Pectobacteriaceae, Hafniaceae, Budviciaceae, and Morganellaceae. Cyclic
di-GMP signaling systems of species of the type genera Escherichia, Erwinia, Yersinia, and Dickeya of the first four families, which usually possess
a high density of c-di-GMP turnover proteins, have been well investigated.[2] Examples are Escherichia coli K-12 MG1655 (6.3 c-di-GMP turnover proteins per Mbp (6.3/Mbp) at
a genome size of 4.64 Mbp), Salmonella typhimurium ATCC14028 (4.4/Mbp; genome size: 4.96 Mbp), Klebsiella pneumoniae subsp. pneumoniae (5.3/Mbp; genome size: 5.33 Mbp), Erwinia amylovora ATCC49946 (3.2/Mbp; genome size: 3.8 Mbp), Yersinia enterocolitica subsp. enterocolitica 8081 (4.6/Mbp; genome size: 4.55 Mbp), Serratia marcescens subsp. marcescens Db11 (4.0/Mbp; genome size: 5.11
Mbp), and Dickeya dadantii DSM18020 (5.6/Mbp; genome
size: 4.82 Mbp). Equally, the less investigated genera of the families
Hafniaceae and Budviciaceae possess a significant number of c-di-GMP
turnover proteins (Figure A). In contrast, we noticed that, within the family Morganellaceae,
which consists presently of the eight genera Arsenophonus, Cosenzaea, Moellerella, Morganella, Photorhabdus, Proteus, Providencia, and Xenorhabdus, sequenced
genomes from representative species of all genera, with the exception
of Xenorhabdus nematophila ATCC19061, are consistently
missing functional c-di-GMP turnover proteins (Figure A, Supporting Table S1). Species of the genera Proteus and Cosenzaea represented by, for example, P. mirabilis HI4320, Proteus hauseri ATCC700826, Proteus vulgaris ATCC49132, and Cosenzaea myxofaciens ATCC19692
(WP_066749622.1) have, however, still retained a single GGDEF domain
protein. The family Morganellaceae encompasses predominantly environmental
bacteria with a genome size of 3.8 Mbp or higher, although symbionts
with highly reduced genome size are present, e.g., within the genus Arsenophonus. The reason for the disappearance of the c-di-GMP
network in most species of the family Morganellaceae, and for its
reduced retention in Proteus spp., is not obvious.
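The densities quoted above can be converted back into approximate protein counts by multiplying the density (proteins per Mbp) by the genome size (Mbp). The sketch below does this for the listed reference genomes. Because the published densities are rounded, the resulting counts are approximations derived here for illustration and are not values reported by the study.

```python
# (organism, c-di-GMP turnover proteins per Mbp, genome size in Mbp), as quoted in the text.
GENOMES = [
    ("Escherichia coli K-12 MG1655", 6.3, 4.64),
    ("Salmonella typhimurium ATCC14028", 4.4, 4.96),
    ("Klebsiella pneumoniae subsp. pneumoniae", 5.3, 5.33),
    ("Erwinia amylovora ATCC49946", 3.2, 3.8),
    ("Yersinia enterocolitica subsp. enterocolitica 8081", 4.6, 4.55),
    ("Serratia marcescens subsp. marcescens Db11", 4.0, 5.11),
    ("Dickeya dadantii DSM18020", 5.6, 4.82),
]


def implied_count(density_per_mbp: float, genome_mbp: float) -> int:
    """Approximate protein count implied by a density and a genome size."""
    return round(density_per_mbp * genome_mbp)


implied = {name: implied_count(density, size) for name, density, size in GENOMES}

for name, count in implied.items():
    print(f"{name}: ~{count} c-di-GMP turnover proteins")
```

Against this background of roughly a dozen to a few dozen turnover proteins per genome, a single retained GGDEF domain protein corresponds to a density well below 1/Mbp for a genome of about 4 Mbp.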
Figure 1
Occurrence
and characterization of c-di-GMP turnover proteins in
representative species of the order Enterobacterales. (A) Occurrence
of c-di-GMP turnover proteins in representative species of the order
Enterobacterales. Phylogenetic tree of representative species from
families Budviciaceae, Enterobacteriaceae, Erwiniaceae, Hafniaceae,
Morganellaceae, Pectobacteriaceae, and Yersiniaceae of the order Enterobacterales. Proteus spp. contain a single GGDEF domain protein. The
number of GGDEF/EAL/GGDEF+EAL domain proteins and, in the case of
a single GGDEF domain protein, the domain structure of the GGDEF domain
protein are indicated. The maximum likelihood phylogenetic tree of
the representative species is based on the relatedness of 33 conserved
core genome proteins with >90% amino acid identity. Blue dots indicate
bootstrap values >67%. (B) Characterization of the GGDEF domain
protein
PMI3101_v of P. mirabilis UEB50. Domain structure
of PMI3101_v which is identical to the domain structure of the GGDEF
domain protein from other Proteus spp. (C) Alignment
of the GGDEF protein of PMI3101_v with homologous Proteus spp. GGDEF domain proteins. PAS_9 labels the PAS domain and GGDEF
the DGC domain. The identity of the proteins is indicated in the Supporting Information.
Upon the reduction of genome size due to habitat restriction, signaling
networks are not only reduced but can become (partially) impaired
in functionality. For example, in the human-adapted species Staphylococcus aureus and Staphylococcus epidermidis, the remaining GGDEF domain proteins have lost their catalytic activity,
although they provide physiological functionalities, which manifest,
for instance, through protein–protein interactions.[7,8] The family Streptococcaceae is composed of three different genera, Streptococcus, Lactococcus, and Lactovum. Performing a BLAST search,[33] we discovered that strains of distinct streptococcal species,
among them S. gallolyticus, Streptococcus
infantarius, Streptococcus equinus, Streptococcus lutetiensis, Streptococcus salivarius, Streptococcus vestibularis, and Streptococcus
agalactolyticus as well as the unassigned species Streptococcus suis and Streptococcus henryi (Figure A and data
not shown), encode one GGDEF domain DGC candidate with a similar domain
structure (see below). Those streptococcal species mainly belong to
one of two distinct phylogenetic lineages of the genus Streptococcus, which includes the Bovis, Pyogenic, and Salivarius subgroups (Figure A[34]), suggesting particular evolutionary forces for the maintenance
of a DGC.
Figure 2
Occurrence and characterization of c-di-GMP turnover proteins in
representative species of the genus Streptococcus. (A) Occurrence of GGDEF domain proteins in representative species
of the genus Streptococcus. The domain structure
of the GGDEF domain protein(s) encoded by the respective species genomes
is indicated. Streptococcal phylogenetic subgroups, which contain
GGDEF proteins, are indicated in green, subgroups where species do
not regularly possess GGDEF proteins are in blue, and unassigned species
are indicated in red. The maximum likelihood phylogenetic tree of
representative Streptococcus species is based on
the relatedness of 77 common core genome proteins with >90% amino
acid identity. (B) Characterization of the GGDEF domain protein GGDEFUCN34 of S. gallolyticus UCN34. Domain structure
of GGDEFUCN34 and most distantly related GGDEF domain proteins
of streptococcal species. STRGAL (WP_012961431.1; S. gallolyticus UCN34), STRSUI (WP_079269016.1; Streptococcus suis), STRPAR (WP_037620421.1; Streptococcus parauberis), STRHEN (WP_018163948.1; S. henryi). (C) Alignment
of GGDEFUCN34 with most distantly related GGDEF domain
proteins from selected Streptococcus species. TM
represents a transmembrane helix, PAS_4 the PAS domain, and GGDEF
the DGC domain. The identity of the proteins is indicated in the Supporting Information.
Basic Characteristics of GGDEF Domain Proteins from P. mirabilis and S. gallolyticus Subspecies gallolyticus
As the human pathogen P. mirabilis is
the best-investigated Proteus species,
we decided to assess the GGDEF domain protein of P. mirabilis UEB50,[35] an isolate from a urinary catheter
(Figure B). PMI3101_v,
a homologue of PMI3101 from P. mirabilis HI4320,
differs from PMI3101 by the N133T exchange. However, proteins identical
to PMI3101_v are present in P. mirabilis strains
deposited in the NCBI database.

Alignment of PMI3101 with previously
functionally characterized GGDEF domain proteins indicated that the
GGDEF domain of PMI3101 possesses a conserved RxGGDEF motif, the lysine
that stabilizes the transition state, and all amino acids involved
in substrate binding as identified in the DGC PleD, except for the homologue
of Arg446 (Figure C and Figure [36,37]). No c-di-GMP binding RxxD inhibitory (I)-site
motif is present.[38] A class 9 PAS domain
(residues 25–123) is linked N-terminally to the GGDEF domain
(residues 135–293). The same GGDEF protein domain architecture
is found in P. vulgaris, Proteus columbae, Proteus alimentorum, P. hauseri, and Proteus genomosp. 6 (Figure B and Figure ). Outside of the genus Proteus, homologous
proteins are present in C. myxofaciens (Supporting Table S1). A structural model of the
PAS-GGDEF domain showed the closest structural homology to the PAS
domain of PA0861 (PDB: 5XGD chain A), a P. aeruginosa PAS-GGDEF-EAL
domain protein. An S-helix-like linker region connects the PAS and
GGDEF domains (Supporting Figure S1[39]). Of note, some close homologues of the protein
in species outside the Proteus genus are more complex,
implying that a modulation of domain composition through chromosomal
recombination or convergent evolution of GGDEF domain proteins has
readily occurred (Supporting Figure S2A and B). Although we could demonstrate convergent evolution of GGDEF domains
with the same N-terminal sensing domain within a panel of enterobacterial
species,[40] uncoupling of domain homology
from the identity of the N-terminal sensory domain and downstream
EAL domain occurs readily in natural isolates.[41,42]
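The motif checks described above (presence of an RxGG(D/E)EF active-site motif, absence of an RxxD I-site motif upstream of it) can be sketched as a simple pattern scan. The sequences below are toy examples and not the real PMI3101_v or GGDEFUCN34 sequences, and the width of the upstream search window is an assumption made only for this illustration.

```python
import re

# RxGG(D/E)EF active-site motif and RxxD I-site motif as regular expressions.
ACTIVE_SITE = re.compile(r"R.GG[DE]EF")
I_SITE = re.compile(r"R..D")


def scan_ggdef(seq: str, upstream_window: int = 15) -> dict:
    """Report the first RxGG(D/E)EF match and whether an RxxD motif lies just upstream.

    upstream_window is an assumed search width, not a value taken from the text.
    """
    seq = seq.upper()
    hit = ACTIVE_SITE.search(seq)
    if hit is None:
        return {"motif": None, "position": None, "i_site": False}
    # Search only the residues immediately preceding the active-site motif.
    upstream = seq[max(0, hit.start() - upstream_window):hit.start()]
    return {
        "motif": hit.group(),
        "position": hit.start() + 1,  # 1-based position of the arginine
        "i_site": I_SITE.search(upstream) is not None,
    }


# Toy sequences (hypothetical).
with_i_site = "MKTLRESDLVARYGGEEFAVIL"
without_i_site = "MKTLAESALVARYGGDEFAVIL"

print(scan_ggdef(with_i_site))
print(scan_ggdef(without_i_site))
```

A positive motif scan is only a first filter: as noted in the introduction, an intact motif does not by itself guarantee catalytic activity, which is why the activity is tested experimentally.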
Figure 3
Alignment
of the GGDEF domains from P. mirabilis UEB50 and S. gallolyticus UCN34 with experimentally
verified GGDEF domains. GGDEF domain of PMI3101_v and GGDEFUCN34 and two most distantly related PAS-GGDEF domain proteins from other
species of the same genus (PROVUL, P. vulgaris; PROALI, P. alimentorum; STRHEN, S. henryi DSM19005,
and STRSUI, S. suis) were aligned with selected class
I (catalytically functional, green line), class II (catalytically
functional with C-terminal EAL domain, blue), and class III (catalytically
nonfunctional, red) GGDEF domains.[44] The
determination of the secondary structure is based on the PDB 2WB4 PleD crystal structure.
Functionality of amino acids in light plum, wide turn in protein;
in red, substrate interacting residues; in plum, Mg2+ binding;
in blue, stabilizing the transition state; conserved in green, allosteric
I-site; GG[D/E]EF motif in blue; underlined, salt bridge.[37,38] Star, consistently not conserved in DGC domains of Proteus or Streptococcus. Alignments displayed with ESPript
3.0. TTT and TT indicate strict α- and β-turns. Residues
are colored according to physical-chemical properties. Framed residues
show more than 70% similarity. Hashtag indicates any amino acid of
N/D/Q/E/B/Z, dollar indicates any amino acid of L/M, and percentage
indicates any amino acid of F/Y. Relative accessibility values (acc) are displayed below the consensus sequence.
To investigate a c-di-GMP network in Streptococcus spp., we selected the GGDEF domain protein (WP_012961431.1; named
GGDEFUCN34) of S. gallolyticus UCN34,
which is identical to F5WZ28_STRG1 from S. gallolyticus ATCC43143 (Figure B), suggesting conservation within the species. Alignment of the
GGDEF domain of GGDEFUCN34 with previously characterized
DGC GGDEF domains showed that GGDEFUCN34 invariantly possesses
the RxGGDEF motif and other characteristic conserved signature amino
acids including the PleD substrate binding amino acids except for
Arg446, suggesting that GGDEFUCN34 is a functional
DGC (Figure C and Figure [43]). Furthermore, this protein does not have an RxxD I-site
motif N-terminal of the GGDEF motif.[38] The
full-length protein possesses two transmembrane helices (TM1, residues
16–34; TM2, 44–63) and a group 4 PAS sensory domain
(82–185; pfam08448) N-terminal of the GGDEF domain (135–292;
pfam00990) (Figure B). A structural model of the PAS domain showed again the highest
structural homology to the PAS domain of the P. aeruginosa PAS-GGDEF-EAL domain protein PA0861 (PDB: 5XGD chain A; Supporting Figure S1) with an S-helix-like linker connecting
the signaling with the catalytic GGDEF domain. Close GGDEFUCN34 homologues (>80–90% amino acid sequence identity) are
present
in isolates of other species of the bovis subgroup such as S. equinus, S. lutetiensis, S.
infantarius, and Streptococcus macedonicus (>60% identity), in the salivarius subgroup such as in the probiotic
species S. salivarius and Streptococcus thermophilus and the commensal Streptococcus vestibularis (>90%
identity), and unassigned Streptococcus spp. from
those two subgroups (Figure A). The presence of a GGDEFUCN34 homologue in the
mitis subgroup member Streptococcus rubneri (data
not shown) suggests that lifestyle rather than taxonomic relationship
might determine the presence of a GGDEF domain protein; however, the
prevalence of GGDEF domain proteins
in species of the mitis/mutans/anginosus subgroups requires more detailed
analysis. Furthermore, homologous proteins with the same domain structure,
partially lacking one or both transmembrane helices and with a low
sequence identity, are harbored in pyogenes subgroup members such
as S. parauberis (approximately 25% identity compared
to GGDEFUCN34), S. uberis (25% identity),
and Streptococcus dysgalactiae (18% identity) but
also in S. henryi (44% identity) and S. suis (25% identity), which are not assigned to a subgroup (Figure A[34]). Outside of the genus Streptococcus,
homologues over the entire length of the protein are found in Weissella soli (WP_070230170.1) and Lactococcus spp. (WP_096819183)
(data not shown). Of note, other close homologues of the protein in
species outside the Streptococcus genus are more
complex, implying that a modulation of domain composition through
chromosomal recombination or convergent evolution of GGDEF domain
proteins has readily occurred (Supporting Figure S3A and B). These observations hint at the still largely unexplored
evolutionary plasticity of the c-di-GMP signaling system.
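The identity values quoted above (for example 25% or 44% relative to GGDEFUCN34) are percentages of identical residues in a pairwise alignment. The sketch below shows one common way to compute such a value from two pre-aligned sequences. Counting only columns without gaps is an assumed convention, the alignment tool used in the study may apply a different denominator, and the fragments are toy sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy pre-aligned fragments (hypothetical sequences).
frag_a = "RYGGEEF-AVIL"
frag_b = "RFGGDEFSAV-L"

identity = percent_identity(frag_a, frag_b)
print(f"{identity:.1f}% identity")
```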
Assessment of DGC Catalytic Activity by a Riboswitch-Based Screening System
In order to assess the catalytic functionality of
the two candidate DGCs, we used a previously developed riboswitch-based
system[19,45] to detect alterations in c-di-GMP concentrations in vivo (Figure A). The genome of the pathogenic bacterium V. cholerae encodes two c-di-GMP specific riboswitches, Vc1 and Vc2, located
upstream of the gbpA and VC1722 genes,
respectively. The Vc1 and Vc2 riboswitches have been characterized
as “off” and “on” riboswitches in V. cholerae with c-di-GMP to promote and repress the expression
of the downstream genes, respectively.[19] Nevertheless, both riboswitches consistently functioned as “off”
riboswitches in E. coli TOP10 (Supporting Figure S4), probably due to a remodeled conformation
of the riboswitch upon binding to cellular components in E.
coli. Upon overexpression of PMI3101_v, we observed downregulation
of Vc2 riboswitch mediated β-galactosidase expression, and upon
overexpression of GGDEFUCN34, Vc1 and Vc2 riboswitch dependent
β-galactosidase expression was downregulated (Figure ). In contrast, there was no
effect upon overexpression of the catalytic mutants, PMI3101_vD215A and GGDEFUCN34D273A. These results
suggest that the candidate DGCs produce c-di-GMP when expressed heterologously
in E. coli.
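The readout logic of this assay can be summarized as a small decision rule: with a reporter that behaves as an "off" riboswitch, a candidate shows the pattern expected of an active DGC if the wild-type protein lowers reporter output relative to the vector control while its catalytic mutant does not. The sketch below encodes that rule. The numeric reporter signals and the fold-change cutoff are hypothetical, since the study reads out colony color on X-gal plates rather than a number.

```python
def dgc_call(vector: float, wild_type: float, mutant: float,
             cutoff: float = 0.5) -> bool:
    """Interpret an "off"-riboswitch reporter.

    Returns True if the wild type represses the reporter and the catalytic
    mutant does not. cutoff is an assumed fold-change threshold relative to
    the vector control.
    """
    if vector <= 0:
        raise ValueError("vector control signal must be positive")
    wt_repressed = wild_type / vector < cutoff
    mutant_repressed = mutant / vector < cutoff
    return wt_repressed and not mutant_repressed


# Hypothetical reporter signals in arbitrary units.
active_pattern = dgc_call(vector=100.0, wild_type=20.0, mutant=95.0)
no_response = dgc_call(vector=100.0, wild_type=90.0, mutant=95.0)

print(active_pattern)
print(no_response)
```

Requiring the catalytic mutant to leave the reporter unchanged is what ties the effect to c-di-GMP synthesis rather than to protein overproduction as such.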
Figure 4
Detection of alterations in c-di-GMP levels
by Vc1 and Vc2 riboswitches
for candidate DGCs PMI3101_v and GGDEFUCN34 in E. coli TOP10. (A) The Vc-riboswitch-based c-di-GMP sensors
monitor the in vivo level of c-di-GMP. The riboswitch
has been engineered as the 5′-UTR of the lacZ gene encoding β-galactosidase, which allows monitoring the
c-di-GMP level by differential production of β-galactosidase,
resulting in an altered colony color formation of an oxidized blue
dye precipitate as output in the presence of the substrate X-gal.
Vc1 and Vc2 riboswitches behave as off riboswitches in the E. coli TOP10 strain. Upon lack of c-di-GMP, production
of β-galactosidase is elevated. Upon expression of a DGC, elevated
c-di-GMP levels downregulate production of β-galactosidase.
Response to production of the GGDEF domain proteins PMI3101_v and
GGDEFUCN34 and their catalytic mutants (PMI3101_vD215A and GGDEFUCN34D273A) was evaluated by the
Vc1 (B) and the Vc2 riboswitch (C). The E. coli TOP10
vector control strain shows a light blue colony. Strains were grown
in the presence or absence of l-arabinose as indicated and
incubated at 28 °C for 24 h. Verified protein expression shown
in Supporting Figure S5.
Furthermore, we observed that induction of GGDEFUCN34 with >0.01% l-arabinose had a substantial cytotoxic
effect
on E. coli TOP10, while cytotoxicity was not observed
upon expression of the GGDEFUCN34D273A catalytic
mutant. Moreover, the wild type protein was expressed at a lower level
than the inactive mutant (Supporting Figure S5). Cytotoxicity upon overexpression of DGCs has been observed previously
in the E. coli BL21 background.[38] The selective cytotoxicity upon overexpression of GGDEFUCN34 but not PMI3101_v in E. coli TOP10 implies
a distinct mechanism of action by GGDEFUCN34 due to elevated
c-di-GMP synthesis, catalytic activity of GGDEFUCN34, or
subsequent c-di-GMP binding.
PMI3101_v and GGDEFUCN34 Elevate
the Intracellular
c-di-GMP Concentration
The riboswitch assay is an indirect
approach to assess c-di-GMP levels in vivo. To confirm
that PMI3101_v and GGDEFUCN34 can produce c-di-GMP in vivo, we developed a MALDI-FTMS-based screen as a first-line
evaluation. We overexpressed PMI3101_v, GGDEFUCN34, and
their catalytic mutants in E. coli TOP10 and subsequently
used the crude lysates to assess cyclic dinucleotide production by
MALDI-FTMS. MALDI-FTMS determines the mass-to-charge
ratio by measuring the flight time of the ionized molecules in the
electric field. Because most ions carry a single positive charge,
they are effectively separated by their mass. As no extraction of the
molecule or isolation of the protein is necessary, this experimental
approach can be applied as a rapid screen for DGC activity.
The ionized c-di-GMP, c-di-AMP, and c-GMP-AMP have a mass/charge
(m/z) ratio [M + H]+ of 691.104 (Figure ), 659.114, and 675.107, respectively. A signal corresponding
to c-di-GMP was detectable in the samples from cells expressing PMI3101_v
and GGDEFUCN34, whereas no signal was measurable when the
catalytic mutants and the vector control were expressed (Figure A, Supporting Figure S6), which suggested that c-di-GMP could
be synthesized by PMI3101_v and GGDEFUCN34 in E. coli TOP10. Nonetheless, a signal with an m/z of 691.104 could
also arise from other molecules with the same molecular mass.
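As a rough illustration of how such a peak assignment can be reasoned about, the following Python sketch matches an observed m/z value against the three [M + H]+ reference values quoted above and rescales intensities so that the most intense peak equals 100. The 10 ppm tolerance, the function names, and the example inputs are assumptions made for illustration and are not taken from the instrument settings used here. A match within tolerance only indicates a compatible mass and does not by itself identify the molecule.

```python
# Reference [M + H]+ values as quoted in the text.
REFERENCE_MZ = {
    "c-di-GMP": 691.104,
    "c-GMP-AMP": 675.107,
    "c-di-AMP": 659.114,
}


def ppm_error(observed_mz, reference_mz):
    """Signed deviation of an observed m/z from a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def assign_peak(observed_mz, tolerance_ppm=10.0):
    """Return the closest reference within tolerance, or None.

    tolerance_ppm is an assumed value chosen for illustration only.
    """
    best_name, best_error = None, None
    for name, reference_mz in REFERENCE_MZ.items():
        error = abs(ppm_error(observed_mz, reference_mz))
        if error <= tolerance_ppm and (best_error is None or error < best_error):
            best_name, best_error = name, error
    return best_name


def normalize_to_base_peak(intensities):
    """Rescale intensities so that the most intense peak equals 100."""
    top = max(intensities)
    if top <= 0:
        raise ValueError("at least one intensity must be positive")
    return [100.0 * value / top for value in intensities]
```

A call such as assign_peak(691.104) returns the c-di-GMP label, whereas a value far outside the tolerance returns None, which mirrors the distinction between a detectable signal and no signal.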
Figure 5
Mass spectrometric
analysis of cyclic dinucleotides produced in
cell lysates of E. coli TOP10 upon expression of
PMI3101_v and GGDEFUCN34. (A) MALDI-FTMS analysis of cell
lysates from E. coli TOP10 overexpressing GGDEF domain
proteins PMI3101_v, GGDEFUCN34, and their catalytic mutants.
The m/z spectrum from 685 to 700
is shown. The ion with a m/z of
691.104 was detected from chemically synthesized c-di-GMP and control
lysate overexpressing the DGC AdrA.[46] A
minor peak was detected in the lysate from the E. coli TOP10 vector control pBAD28. Enhanced peaks were seen in the lysates
derived from E. coli TOP10 expressing PMI3101_v and
GGDEFUCN34, but not in lysates from cells expressing the
catalytic mutants PMI3101_vD215A and GGDEFUCN34D273A. Cyclic di-AMP with a m/z of 659.114 is undetectable in all samples. All intensities
are normalized to the most intense peak with its y-value set to 100. GGDEF corresponds to GGDEFUCN34. (B)
Cyclic di-GMP concentrations as measured by LC-MS/MS. PMI3101_v, GGDEFUCN34, and their catalytic mutants were overexpressed in S. typhimurium UMR1.
To this end, we monitored the cyclic dinucleotide activity of PMI3101_v
and GGDEFUCN34 conventionally by LC-MS/MS after extraction
of the molecules. Indeed, we again observed a high concentration
of c-di-GMP upon expression of PMI3101_v and GGDEFUCN34, but concentrations remained unaltered upon expression of the catalytic
mutants (Figure B).
Proteins were expressed in the S. typhimurium UMR1
background (see below) due to the elevated cytotoxic effect of GGDEFUCN34 in E. coli TOP10.
We also purified
PMI3101_v and assessed its enzymatic activity
by an in vitro assay analyzing the product by thin-layer
chromatography (TLC); however, purified PMI3101_v did not show the
expected catalytic activity, as c-di-GMP synthesis was not observed
(data not shown). We conclude that cofactors and/or signals
needed to stimulate the catalytic activity of the DGC are missing in vitro.
The DGCs PMI3101_v and GGDEFUCN34 Promote rdar Biofilm
Morphotype Expression of S. typhimurium
Cyclic di-GMP activates a multicellular behavior of S. typhimurium, rdar biofilm morphotype formation with a characteristic red, dry,
and rough colony morphology on a Congo Red (CR) agar plate due to
expression of extracellular matrix components cellulose and curli
fimbriae.[25,47] This behavior serves as a biologically relevant
read-out for elevated c-di-GMP levels. To investigate whether the
DGCs PMI3101_v and GGDEFUCN34 affect rdar biofilm formation,
we expressed the proteins in S. typhimurium UMR1
grown on CR agar plates at 28 °C. The native PMI3101_v and GGDEFUCN34 but not their catalytic mutants up-regulated the rdar
biofilm morphotype (Figure A), which provided supporting evidence that both proteins are catalytically
active in vivo.
Figure 6
Effect of overexpression of PMI3101_v
and GGDEFUCN34 on biofilm formation and motility in S. typhimurium.[25,47] (A) Overexpression
of PMI3101_v and GGDEFUCN34 (GGDEF) but not overexpression
of the catalytic mutants
up-regulated the rdar morphotype of S. typhimurium UMR1. The mutant strain S. typhimurium MAE50 served
as the negative control for the rdar morphotype. The plate was incubated
at 28 °C for 72 h. Overexpression of PMI3101_v (B) and GGDEFUCN34 (C) down-regulated the apparent swimming motility of S. typhimurium UMR1. UMR1 expressing PMI3101_v displayed
decreased swimming motility compared to the vector control and PMI3101_vD215A. Similarly, GGDEFUCN34 significantly inhibited
swimming of UMR1 compared with the vector control (VC) pBAD28 and
its mutant GGDEFUCN34D273A. Respective genes
were cloned in pBAD28. Bars represent means of three independent experiments
with standard deviation, analyzed by Student's t test. **, p < 0.01 and ***, p < 0.001, respectively.
The DGCs PMI3101_v and GGDEFUCN34 Suppress Motility
of S. typhimurium
The transition from motility
to sessility contributes to the development of multicellular behavior
with motility to be inhibited by c-di-GMP.[25] Thus, we assessed the effect of PMI3101_v and GGDEFUCN34 on flagellar-based motility of S. typhimurium UMR1,
as observed by the apparent motility in a semisolid LB agar plate
at 37 °C (similar results were obtained at 28 °C). Wild
type PMI3101_v (Figure B) and GGDEFUCN34 (Figure C) consistently down-regulated the swimming ability
compared to the positive control S. typhimurium UMR1,
while the corresponding mutants had only a minor effect, which suggested
that the two novel GGDEF domain proteins affect flagella-mediated
motility by producing c-di-GMP. Notably, the catalytic mutant of
GGDEFUCN34 slightly promoted apparent swimming. This marginal
up-regulation of swimming motility by GGDEFUCN34D273A is probably caused by c-di-GMP binding to an alternative site,
as an I-site is lacking.[38] The mutant of
PMI3101_v still suppressed swimming of UMR1 to a small extent compared
with the vector control, which can be explained by residual catalytic
activity of the mutated GGDEF domain. In summary, suppression of motility
by PMI3101_v and GGDEFUCN34 but not their catalytic mutants
again provided supporting evidence that the two proteins are catalytically
active as DGCs in vivo.
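The significance marks in the motility panels derive from a Student's t test on three independent experiments. A minimal Python sketch of such a comparison is given below, assuming a two-sided, equal-variance test with three values per group (4 degrees of freedom). The swim-zone values are invented placeholders and not measurements from this study, the function names are hypothetical, and the critical t values are the standard two-sided table entries for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical t values for 4 degrees of freedom (n = 3 per group),
# ordered from the strictest threshold (p < 0.001) to the loosest (p < 0.05).
CRITICAL_T_DF4 = [(8.610, "***"), (4.604, "**"), (2.776, "*")]


def student_t(group_a, group_b):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


def significance_marks(t_value, df):
    """Translate a t statistic into asterisk marks using the df = 4 table."""
    if df != 4:
        raise ValueError("critical values are tabulated for df = 4 only")
    for critical, marks in CRITICAL_T_DF4:
        if abs(t_value) > critical:
            return marks
    return "ns"


# Placeholder swim-zone diameters (arbitrary units), NOT data from this study.
vector_control = [10.0, 12.0, 11.0]
dgc_expressing = [4.0, 5.0, 6.0]
t_value, df = student_t(vector_control, dgc_expressing)
marks = significance_marks(t_value, df)
```

With only three replicates per group, even a large difference in means needs low scatter to pass the stricter thresholds, which is one reason to report the standard deviation alongside the marks.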
Genomic Context of the
GGDEF Domain Protein in Proteus
mirabilis
Our experiments showed that PMI3101_v
of P. mirabilis and GGDEFUCN34 of S. gallolyticus are bona fide DGCs. As each is the sole DGC encoded
by the respective chromosome, we examined the genomic
context of the genes and their potential physiological targets. Investigating
the gene synteny in P. mirabilis HI4320 (as P. mirabilis UEB50 has not been sequenced), we found that
PMI3101 is embedded into a type-IB-like hybrid cellulose biosynthesis
gene cluster consisting of a bcsOABCD-dgcPMI3101-galU-bcsZ structure (Figure A[48]). Such a gene cluster is invariantly
present in representative isolates of Proteus species
such as P. vulgaris, P. hauseri, P. cibarius, and P. columbae. Specifically,
BcsA and BcsB constitute the highly conserved cellulose synthase holoenzyme
with the inner membrane-spanning catalytic subunit and the associated
N-terminal-membrane-anchored periplasmic subunit, respectively (Figure B[24,49]). The catalytic subunit BcsA, which at 710 aa is shorter than
characterized cellulose synthases because it lacks sequences N-terminal to
the BcsA domain, is only 37% identical to BcsA of S. typhimurium and 27% identical to BcsA of Rhodobacter sphaeroides but displays relevant signature motifs for catalysis and a C-terminal
PilZ domain which contains the conserved RxxxR/DxSxxG amino acid motif
for c-di-GMP binding[22,24,50] (Supporting Figure S7). In addition,
accessory proteins that are not required for synthesis but for translocation and
packing of the cellulose macromolecule are present: the outer membrane
pore BcsC, the periplasmic factor BcsD affecting crystallinity of
cellulose microfibrils, the periplasmic cellulase BcsZ, and the uncharacterized
component BcsO.[48] Intriguingly, a GalU
encoding ORF has been unconventionally inserted downstream of the
DGC ORF. GalU reversibly catalyzes synthesis of UDP-glucose, which
suggests close proximity of substrate synthesis with the cellulose
synthase to readily produce the 1,4-β-glucan cellulose macromolecule.
Of note, a second cellulose biosynthesis gene cluster consisting of bcsG-bcsR-bcsQ-bcsA2-bcsB2-bcsC2 is present at a distant
location on the P. mirabilis chromosome, as
in representatives of other Proteus species, with the
exception of the P. hauseri isolate (Supporting Figure S8). The cellulose synthase
BcsA2, which possesses 53% identity to BcsA of S. typhimurium, 36% identity to BcsA1, and 23% identity to BcsA of R. sphaeroides, harbors a C-terminal PilZ domain for c-di-GMP binding (Supporting Figure S7). Thus, although cellulose
biosynthesis has not been observed in Proteus spp.
(we did not observe cellulose production upon plasmid-based expression
of PMI3101_v in P. mirabilis UEB50, data not shown),
these bioinformatic analyses suggest that P. mirabilis and other Proteus spp. can synthesize cellulose
via two different cellulose or cellulose-like biosynthesis operons
stimulated by c-di-GMP signaling (Figure B; Supporting Figure S8).
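The RxxxR/DxSxxG signature mentioned above can be located in a protein sequence with a simple pattern scan. The Python sketch below is one possible way to do this, with "x" treated as any residue; the function name and the toy peptide are hypothetical and are not derived from the BcsA sequences analyzed here. The presence of both motifs only flags a candidate c-di-GMP binding site and does not demonstrate binding.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
# "x" in the motif names stands for any amino acid.
PILZ_MOTIFS = {
    "RxxxR": re.compile(r"(?=R.{3}R)"),
    "DxSxxG": re.compile(r"(?=D.S.{2}G)"),
}


def find_pilz_motifs(protein_sequence):
    """Return 1-based start positions of each motif in the sequence."""
    sequence = protein_sequence.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(sequence)]
        for name, pattern in PILZ_MOTIFS.items()
    }


# Hypothetical toy peptide for illustration, not a real BcsA sequence.
toy_sequence = "MKRAAERLLDGSAAGTT"
hits = find_pilz_motifs(toy_sequence)
```

For the toy peptide, the scan reports RxxxR starting at residue 3 and DxSxxG starting at residue 10; a real analysis would additionally consider the alignment context, as in Supporting Figure S7.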
Figure 7
Genomic context of the PMI3101 DGC in P. mirabilis. (A) Putative operon structure in P. mirabilis HI4320.
The type-IB-like hybrid cellulose biosynthesis gene cluster consists
of bcsOABCD-dgcPMI3101-galU-bcsZ. As a comparison,
the genomic context in P. vulgaris NCTC13145, P. columbae T60, P. hauseri 15H5D-4a, P. cibarius NZ2, and Proteus genomospecies
6 is shown. (B) Localization and functionality of gene products of
the modified class IB cellulose biosynthesis operon in P.
mirabilis HI4320. The type-IB-like hybrid cellulose biosynthesis
operon consisting of bcsOABCD-dgcPMI3101-galU-bcsZ codes for a seemingly functional cellulose biosynthesis complex.
The core component of the complex is the functional cellulose synthase
consisting of BcsA (catalytic subunit) and BcsB (inner membrane anchored
periplasmic protein), the outer membrane pore BcsC, the BcsD periplasmic
factor required for crystallinity, and the cellulase BcsZ. GalU reversibly
catalyzes the synthesis of UDP-glucose, suggesting direct delivery
of the UDP-glucose substrate to produce the 1,4-β-glucan cellulose
macromolecule. The cytoplasmic DGC potentially directly delivers the
product c-di-GMP to the PilZ domain of the cellulose synthase BcsA
for cellulose biosynthesis.
Genomic Context of the DGC GGDEFUCN34 in S. gallolyticus UCN34
GGDEF (SGGB_RS02220) is the
first gene of a five-gene cluster inserted into the serine tRNA locus
flanked by the trmB locus encoding tRNA (guanosine
(46)-N7) methyltransferase and rimP encoding ribosome
maturation factor in S. gallolyticus UCN34 (Figure A).
Figure 8
Genomic context of the
GGDEFUCN34 DGC in S.
gallolyticus UCN34. (A) Operon structure in S. gallolyticus UCN34 (upper panel). A serine tRNA locus is flanked by the trmB locus encoding tRNA (guanosine (46)-N7) methyltransferase
and rimP encoding ribosome maturation factor. The
GGDEF DGC is the first gene of a five-gene cluster genomic islet inserted
into a tRNAserine locus. The two downstream overlapping
open reading frames encode two type 2 glycosyltransferases: SGGB_RS02225,
which overlaps with SGGB_RS02230, encodes a Dpm1-GtrA hybrid protein,
and SGGB_RS02230 codes for a cellulose synthase-like protein CelA.
The SGGB_RS02235 gene (membrane) encodes an integral membrane protein
with 16 predicted transmembrane helices. Other S. gallolyticus strains can have a variable islet composition with S. gallolyticus BI02 entirely missing the islet. (B) Localization and functionality
of gene products of the genomic islet of S. gallolyticus UCN34. Two type 2 glycosyltransferases, a Dpm1-GtrA hybrid protein
and a cellulose synthase-like protein CelA, are membrane proteins
involved in (exo)polysaccharide synthesis. The C-terminal PilZ domain
of CelA suggests binding of c-di-GMP for regulation of the catalytic
activity. The membrane protein with 16 predicted transmembrane helices
and the 60 amino acid protein are of unknown function.
Downstream of GGDEF are genes coding for two type 2 glycosyltransferases. SGGB_RS02225, which overlaps with SGGB_RS02230, codes for a Dpm1-GtrA hybrid protein.[51,52] Dpm1 is the catalytic subunit of eukaryotic dolichol-phosphate mannose
synthase to which the N-terminal part of SGGB_RS02225 is homologous
(Figure B). The C-terminal
part consists of a GtrA-like protein with four transmembrane helices.
GtrA is involved in flipping of undecaprenyl-phosphate glucose over
the inner membrane in Gram-negative bacteria. Downstream, SGGB_RS02230 codes for a cellulose synthase-like protein
CelA. CelA has <20% identity to all four cellulose synthases and
has only four predicted membrane spanning helices, but it contains
relevant signature amino acids for catalysis with the crystal structure
of BcsA from R. sphaeroides as the best fit (Supporting Figure S7). Intriguingly, CelA has
a C-terminal PilZ domain including signature amino acids for c-di-GMP
binding, which suggests that CelA binds the CDN to regulate its catalytic
activity. The SGGB_RS02235 gene downstream of SGGB_RS02230 encodes an inner membrane protein with 16 predicted
transmembrane helices of unknown function followed by a short coding
sequence. Besides CelA, none of these proteins have an identifiable
c-di-GMP binding site. In summary, CelA is a potential target for
the c-di-GMP signaling pathway in S. gallolyticus UCN34 (Figure B).
We wondered whether all strains of S. gallolyticus possess this c-di-GMP signaling islet. A BLAST search showed that,
of nine entirely sequenced S. gallolyticus isolates,
five possess the islet, while four isolates have a partial islet or
no insertion (Figure A). In other bovis subgroup species such as S. equinus, S. lutetiensis, and Streptococcus sp., the islet is present; however, not all strains within a species
possess this insertion (Supporting Figure S9 and data not shown). While a part of the islet,
but not the DGC, has been retained in S. macedonicus ACA-DC198
and S. infantarius ATCC BAA-102, Streptococcus pasteurianus strains do not possess the islet. However, this bioinformatics analysis
should be considered preliminary for the following reasons: (1) complete
genome sequences are not available for all isolates; (2) sequenced isolates
are scarce for some of the investigated species; and (3) the
isolate collection is biased.
Figure 9
Characterization of the EAL domain protein from
streptococcal species.
Alignment of the EAL domains of S. henryi DSM19005, S. parauberis SPOF3K (EAL1 and EAL2), and S. suis 1080671 (EAL1) with selected experimentally confirmed class I (functional
with conserved signature amino acids, green line), IIa, IIb (functional
with partially deviating signature amino acids, red), and nonfunctional
IIIa and IIIb (blue) EAL domain proteins.[43,53] The secondary structure is based on the PDB 3SY8 RocR crystal structure.
In red, amino acids involved in substrate binding; in blue, amino
acids involved in Mg2+ binding; in green, loop 6 stabilizing
glutamate; and in plum, the catalytic base. Underlined in gray, loop
6; underlined in plum, mutated loop 6 amino acids. Alignments displayed
with ESPript 3.0. TTT and TT indicate strict α- and β-turns.
Residues are colored according to physicochemical properties. Framed
residues show more than 70% similarity. Hashtag indicates any amino
acid of N/D/Q/E/B/Z, dollar indicates any amino acid of L/M, and percentage
indicates any amino acid of F/Y. Relative accessibility values (acc) are displayed below the consensus sequence.
As we have observed a high amino acid sequence variability
among
the GGDEFUCN34 protein homologues, we wondered whether,
outside of bovis subgroup species, (1) the DGC is always found in the
same genetic context and, if so, whether (2) the islet is always integrated
at the same location in the genome.
In the salivarius subgroup,
the operon structure GGDEF-Dpm1/GtrA-CelA-membrane is conserved in S. salivarius isolates; however,
the operon is integrated into a different genomic context (Supporting Figure S10). In the pyogenes subgroup
member S. uberis, an EAL only domain c-di-GMP phosphodiesterase
protein ORF has been integrated downstream of the GGDEF domain again
in another genomic context (Supporting Figure S11A). In S. parauberis strains, four ORFs
coding for an EAL only protein, a hydrolase, the PagC glycosyltransferase,
and another EAL only protein are regularly encoded downstream of the
GGDEF protein ORF (Supporting Figure S11B). Furthermore, in unassigned S. suis, the S. uberis type operon is rearranged with the GGDEF and EAL
protein located downstream of the ORF for the uncharacterized membrane
protein (Supporting Figure S12A). In all
species, the operon is flanked by different genes.
Moreover,
in the two sequenced S. henryi isolates,
the GGDEFUCN34 homologue is found in a different genetic
context (Supporting Figure S12B). Intriguingly,
an open reading frame for a composite PAS-GGDEF-EAL-GGDEF domain protein is located
downstream of the PAS-GGDEF protein open reading frame.
In conclusion, in most of the investigated streptococcal species,
a PAS-GGDEF DGC co-occurs with two type 2 glycosyltransferases followed
by an inner membrane protein and variable accessory genes related
to c-di-GMP signaling and exopolysaccharide biosynthesis, although
the chromosomal location can vary. Of note, the gene cluster Dpm1/GtrA-CelA-membrane is also present outside of streptococcal
species, such as in Pediococcus pentosaceus ATCC25745,
with two GGDEF domain proteins encoded at distant locations
on the chromosome (Supporting Figure S12C). Thus, those genes seem to thrive in Streptococci and related species in various contexts.
Streptococcal EAL Proteins
Are Potentially Catalytically Active
Sequence alignment of
the EAL proteins from the different streptococcal
species identified three distinct proteins, EAL1, EAL2, and PAS-GGDEF-EAL-GGDEF,
within c-di-GMP signaling islets (Figure and data not shown). The presence of the
conserved amino acid motifs required for catalytic activity such as
the homologue of the catalytic base E352GVE of the PDE
RocR[43,44] indicated that at least two of the three
EAL domains possess c-di-GMP phosphodiesterase activity. While conserved
motifs of the EAL domain of PAS-GGDEF-EAL-GGDEF from S. henryi indicated catalytic functionality, both GGDEF domains are highly
degenerated (Figure and data not shown).
Discussion
In this work, we have
gathered experimental and bioinformatics
evidence that Proteus and Streptococcus species possess a functional c-di-GMP signaling network and initially
experimentally characterized two novel DGCs from those Gram-negative
and Gram-positive species. Despite being active enzymes in the context
of a cellulose or cellulose-like operon, the physiological roles of
the DGCs PMI3101_v and GGDEFUCN34 are, however, still undefined.
P. mirabilis is well-known for its extensive flagella-mediated
swimming and swarming motility.[54] Swimming
and swarming motility are stimulated by low c-di-GMP levels in bacteria
and concurrently physically inhibited by cellulose production in S. typhimurium.[25,55] Nevertheless, Proteus spp. possess an apparently functional cellulose biosynthesis
operon with the catalytic subunit of the cellulose synthase BcsA encompassing
a C-terminal PilZ domain receptor with conserved signature amino acid
motifs required for c-di-GMP binding.[22,24] We hypothesize
that cellulose production can be involved in cell aggregation, surface
adhesion, chlorine resistance, or other cellulose-mediated physiology
in one of the diverse habitats where P. mirabilis forms biofilms.[56] P. mirabilis is also a frequent cause of catheter-associated urinary tract infection,
where c-di-GMP mediated cellulose production might be involved in
the modulation of in vivo virulence, as observed
for other bacteria.[57−59]
To our knowledge, reports of a functional c-di-GMP
signaling system
within the family Streptococcaceae have been restricted to investigations of
the effect of extracellular c-di-GMP on biofilm formation.[60] Cyclic di-GMP mediated biofilm formation upon
expression of cellulose-like exopolysaccharides could contribute to
the pathogenesis of S. gallolyticus and related bacterial
species. S. gallolyticus has been associated with
colorectal cancer and causes endocarditis and bacteremia predominantly
in the aged population.[61,62] Further functional
analysis of the DGCs and the c-di-GMP regulatory network in S. gallolyticus, P. mirabilis, and other
streptococcal and Proteus spp. will aid our understanding
of the ecological and clinical impact of c-di-GMP signaling in biofilm
formation and the pathogenesis of infection in these bacterial species.
Interestingly, in S. gallolyticus UCN34, the DGC
colocalizes with two genes coding for type 2 glycosyltransferases,
one of them the cellulose synthase-like protein CelA harboring a C-terminal
PilZ domain. Whether the PilZ domain binds c-di-GMP requires experimental
investigation, but bioinformatic analyses indicate the conservation
of signature amino acid motifs for c-di-GMP binding (Supporting Figure S7). Notably, the PAS-GGDEF domain
proteins encoded by streptococcal species are distinguished by their
remarkably low amino acid sequence similarity, which can be below
30% identity in the animal pathogens S. suis, S. uberis, and S. parauberis compared to S. gallolyticus UCN34. In comparison, the corresponding
type 2 glycosyltransferases CelA and Dpm1-GtrA have evolved much more slowly,
with sequence identities of >52% and >46%, respectively, suggesting
that signal sensing and amplification systems are particularly prone to rapid
evolution in different streptococcal species.
Of note, no readily
recognizable EAL or HD-GYP c-di-GMP specific
PDE was identified in S. gallolyticus and P. mirabilis. This finding is in line with observations
in other bacteria, which also possess a sole functional DGC but no
identified c-di-GMP hydrolyzing enzyme.[29,63] Although conventional
c-di-GMP specific EAL and HD-GYP domain phosphodiesterases are not
present, we cannot entirely exclude the presence of distantly related
variants of those enzymes. Alternatively, other ubiquitous phosphodiesterase
domains such as the HDc domain of the bifunctional ppGpp synthase/hydrolase
SpoT are candidates for such a residual functionality. On the other
hand, surprisingly, the exopolysaccharide operons of the animal pathogens S. uberis, S. parauberis, and S.
suis had one or two EAL domain only proteins integrated into
the c-di-GMP signaling islet. Equally, in S. henryi, a PAS-GGDEF-EAL-GGDEF protein is encoded downstream of the DGC
gene. We assume that, in these cases, the EAL proteins hydrolyze c-di-GMP
as signature amino acid motifs indicating catalytic activity are present.
Previously, a cyclic di-AMP signaling network had been ubiquitously
identified in streptococcal species.[64] Cyclic
di-AMP signaling has been mainly investigated in the human pathogens S. pneumoniae and S. pyogenes, the dental
caries causing S. mutans, and the animal pathogen S. suis, where c-di-AMP signaling controls exopolysaccharide
biosynthesis and biofilm formation, antimicrobial resistance, the
competence status, and regulation of host immunity, among other phenotypes.[65−68] In S. gallolyticus subsp. gallolyticus, c-di-AMP signaling has been shown to promote osmoresistance, alter
cell morphology, and inhibit biofilm formation and host–cell
interactions.[69] There might be cross-talk
between c-di-GMP signaling and other common nucleotide-based signaling
systems, such as the c-di-AMP and ppGpp signaling systems, that affect similar
pathways.[69,70] Of note, the c-di-GMP
signaling system is not found in all strains of the species S. gallolyticus and other species, but a partial or entire
deletion of the c-di-GMP signaling encoding islet and rearrangements
of the islet can occur (Figure A; Supporting Figure S10). This
microheterogeneity indicates a high genomic and potentially phenotypic
plasticity probably governed by the ecological niche. Remarkably,
heterogeneous GGDEF proteins are present in a few strains of S. pyogenes and S. pneumoniae (data not
shown). While the GGDEF proteins in S. pyogenes are
highly degenerate (though one GGDEF only could potentially be functional),
their occurrence in S. pneumoniae needs
to be confirmed. In the ecological niche of the human nose, S. pneumoniae is in an excellent position to constantly
take up genes from environmental species by natural competence and to
test them for suitability in its genomic context.
An initial
characterization of the DGC activity of P. mirabilis and S. gallolyticus proteins came from the assessment
of the response of c-di-GMP responsive riboswitches.[19] Under our experimental conditions, the systems detected DGCs
especially robustly, although one of the aptamer-based c-di-GMP sensors,
despite its nanomolar affinity, was not consistently sensitive.
Previously, a riboswitch-based fluorescent biosensor consisting of
dual aptamers, a c-di-GMP binding aptamer and the spinach aptamer
that can bind to fluorophore 3,5-difluoro-4-hydroxybenzylidene imidazolinone
(DFHBI), was designed to visualize the intracellular CDN level to
detect c-di-GMP turnover proteins.[15] Another
biosensor for monitoring changes in intracellular c-di-GMP level is
based on modulation of fluorescence resonance energy transfer (FRET)
upon c-di-GMP binding to a PilZ domain protein carrying cyan (CFP) and yellow
(YFP) fluorescent protein fusions. Furthermore, we applied a MALDI-FTMS-based
approach to detect intracellular c-di-GMP levels from entire cell
lysates without isolating the compound. An initial assessment of the catalytic
activity of cyclic di-GMP turnover proteins by MALDI-FTMS is especially
useful for proteins that are difficult to purify or that show no detectable in vitro catalytic activity. Our experimental approaches
do not require cutting-edge facilities such as fluorescence microscopy
and fluorescence activated cell sorting.
Moreover, promotion
of rdar morphotype expression combined with
inhibition of motility in S. typhimurium UMR1,[15] as occurred upon overexpression of PMI3101_v
and GGDEFUCN34, is a sensitive in vivo assay to initially assess DGC activity. Furthermore, even though
GGDEFUCN34 does not possess an RxxD I-site,[38] the catalytic mutant slightly up-regulated the
swimming motility of S. typhimurium UMR1, indicative
of a depletion of the second messenger molecule, as occurs upon
c-di-GMP binding. Of note, the GGDEF domain DGC XCC4471 lacking the
I-site can bind a semi-intercalated c-di-GMP dimer.[71] We therefore speculate that the GGDEF domain protein GGDEFUCN34 with a mutated GGAEF site has the ability to bind c-di-GMP.
Thus, physiological assays proved to be relevant tools to initially
validate basic functionality of candidate c-di-GMP signaling network
components.
Conclusion
In conclusion, the c-di-GMP network is more
widespread than previously
anticipated with the production of a c-di-GMP activated cellulose
or cellulose-like macromolecule as a fundamental physiological trait
in distantly related Gram-negative and Gram-positive bacteria. These
readily trackable organisms with a highly reduced c-di-GMP signaling
network will aid in identifying the evolutionary forces that lead
to an expansion versus reduction of this ubiquitous second messenger
signaling system.
Methods
Bacterial Strains, Plasmids, and Growth Conditions
E. coli TOP10 was
grown either in Luria–Bertani
(LB) medium or on an LB agar plate, while S. typhimurium UMR1 (ATCC14028 Nalr), MAE50 (UMR1 ΔcsgD; biofilm negative control), and MAE108 (UMR1 ΔfliC ΔfljB; motility negative control) were grown
on LB without salt agar at 30 or 37 °C. S. gallolyticus UCN34[72] was grown at 37 °C with
5% carbon dioxide in brain heart infusion (BHI) broth (Oxoid) or on
a BHI agar plate. P. mirabilis UEB50[35] was grown in nutrient broth or on a nutrient agar plate
(Difco) at 37 °C. Supplements were ampicillin (100 μg/mL)
(Sigma) for the selection for recombinant strains and l-arabinose
at the indicated concentration for induction of protein production.
All strains and plasmids used in this study are listed in Supporting Table S2.
Riboswitch Construction
The Vc1 c-di-GMP riboswitch
was amplified from the genomic DNA of V. cholerae strain C6706, comprising the region from −240 to +20 bp with respect
to the ORF. The lac promoter was amplified from pUC19. Vc1 and lac promoter were ligated by overlapping
PCR, the fragment digested with SmaI and BamHI (New England Biolabs), and ligated into the translational
reporter vector pRS414 in frame upstream from the ninth codon of the lacZ reporter gene. The genomic DNA was extracted by a GenElute
Bacterial Genomic DNA kit (Sigma-Aldrich).
Subsequently, riboswitches
along with lacZY were amplified from the pRS414 vector.
The PCR products and pGRG25 containing the inducible highly efficient
recombination system for integration into the attTn7 site were digested with PacI and NotI (New England Biolabs).
Digested pGRG25 was dephosphorylated by Antarctic phosphatase (New
England Biolabs) and ligated with the PCR product by T4 DNA ligase
(Roche Life Science). The resulting product was transformed into chemically competent E. coli TOP10 cells, which were grown at 30 °C. Insertion of genes
into the chromosomal attTn7 attachment site was performed
as described.[73] Primers used for cloning
and sequencing are listed in Supporting Table S2.
Alteration of the cytoplasmic c-di-GMP level
affects the expression
of the β-galactosidase encoded by lacZ. An
“on” riboswitch promotes the expression of the lacZ gene upon an elevated level of ligand, whereas an “off”
riboswitch behaves in the opposite way and decreases the expression
of β-galactosidase upon binding of the ligand. The alteration
of β-galactosidase expression is visualized on plates containing
5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside (X-gal),
a substrate for β-galactosidase. The reaction products are galactose
and 5-bromo-4-chloro-3-hydroxyindole, which in subsequent oxidation
steps develops into an insoluble dye.
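The opposite behavior of the two riboswitch classes can be summarized as a small truth table. The Python sketch below is illustrative only: the function name and the qualitative labels "high" and "low" are assumptions introduced here and are not part of the original protocol.

```python
def expected_lacz_expression(riboswitch_type: str, c_di_gmp_elevated: bool) -> str:
    """Qualitative lacZ expression expected from a riboswitch reporter.

    Hypothetical helper: an "on" riboswitch raises expression when the
    ligand is elevated, an "off" riboswitch lowers it.
    """
    if riboswitch_type not in ("on", "off"):
        raise ValueError("riboswitch_type must be 'on' or 'off'")
    if riboswitch_type == "on":
        return "high" if c_di_gmp_elevated else "low"
    return "low" if c_di_gmp_elevated else "high"


if __name__ == "__main__":
    # Print the full truth table for both riboswitch classes.
    for kind in ("on", "off"):
        for elevated in (False, True):
            print(kind, elevated, expected_lacz_expression(kind, elevated))
```

On X-gal plates, "high" corresponds to stronger dye formation and "low" to weaker dye formation, so a DGC is expected to intensify the color with an "on" riboswitch and to reduce it with an "off" riboswitch.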
Cloning of GGDEF Domain Proteins
PMI3101_v and GGDEFUCN34 were
PCR amplified from the genomic DNA of P. mirabilis UEB50 and S. gallolyticus UCN34, respectively.
A hexahistidine-tag (His6-tag) was introduced at the C-terminus
of the open reading frames (ORFs) during PCR. The amplified fragments
were digested by XmaI and XbaI (New England Biolabs) and subsequently
cloned into the corresponding sites of the pBAD28 vector. The site-directed
mutagenesis of the GGDEF motif to GGAEF was performed using a QuikChange
II Site-Directed Mutagenesis Kit (Agilent). Primers are listed in Supporting Table S2.
Assessment of c-di-GMP Synthesis by Riboswitch-Regulated β-Galactosidase Activity
Vectors harboring DGCs, c-di-GMP specific PDEs,
and respective mutants were transformed into chemically competent E. coli TOP10 cells carrying the monitoring riboswitch
construct on the chromosome. Individual colonies were grown overnight
at 37 °C with 200 rpm shaking in LB medium supplemented with
100 μg/mL ampicillin. Cultures were diluted to an OD600 of 0.1 and subsequently grown to an OD600 around 0.6.
Five μL of culture was spotted onto an LB agar plate with 100
μg/mL ampicillin, 80 μg/mL X-gal (Roche), 0–0.1%
(wt/vol) l-arabinose, and, for strains containing the Vc1 riboswitch, 0.25 mM isopropyl β-d-1-thiogalactopyranoside (IPTG).
The plates were incubated at 28 °C, and color development was
monitored up to 72 h.
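The dilution of an overnight culture to a target OD600 follows the usual C1·V1 = C2·V2 relation. The sketch below is a minimal illustration, assuming that OD600 scales linearly with cell density over the measured range; the function name, the overnight OD600 value, and the final volume in the example are hypothetical and not taken from the protocol.

```python
def dilution_volumes(od_measured: float, od_target: float, final_volume_ml: float):
    """Return (culture_ml, medium_ml) needed to reach od_target.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if od_measured <= 0 or od_target <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if od_target > od_measured:
        raise ValueError("cannot dilute to a higher OD than the starting culture")
    culture_ml = final_volume_ml * od_target / od_measured
    medium_ml = final_volume_ml - culture_ml
    return culture_ml, medium_ml


if __name__ == "__main__":
    # Hypothetical example: overnight culture at OD600 = 4.0, diluted
    # to OD600 = 0.1 in a final volume of 5 mL.
    culture, medium = dilution_volumes(4.0, 0.1, 5.0)
    print(f"culture: {culture:.3f} mL, fresh medium: {medium:.3f} mL")
```

Dense overnight cultures are usually measured after a predilution so that the reading stays within the linear range of the photometer; the measured value is then multiplied by the predilution factor before it is passed to the function.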
Phenotypic Assays
The swimming assay
was performed
in 1% tryptone, 0.5% NaCl, 0.3% (wt/vol) agar plates. Three μL
of an overnight culture resuspended in water adjusted to OD600 = 5 was injected into the agar, and the plates were incubated at
37 °C for 6 h. Afterward, pictures of plates were taken with
a Gel Doc XR+ system (Bio-Rad) and the diameter of the swimming zone
was measured.
The rdar biofilm morphotype was assessed on CR-LB
without salt agar plates.[74] A single colony
was picked from an LB agar plate incubated overnight, resuspended
in water at OD600 = 5, and 3 μL spotted onto the
CR plate. After 72 h of incubation at 28 °C, the morphotype of
the colony was observed.
MALDI Fourier Transform Mass Spectrometry
To prepare
the cell lysate for rapid c-di-GMP detection, a single colony was
picked and suspended in 450 μL of LB medium with ampicillin
(100 μg/mL). A 50 μL suspension was transferred into 5
mL of LB with ampicillin and grown overnight at 30 °C with shaking
at 200 rpm. To induce the overexpression of proteins, 0.01% arabinose
was added to the overnight culture, which was further cultured for
4 h at 30 °C. A 4 mL aliquot of the culture was harvested and resuspended in
100 μL of LC-MS CHROMASOLV water (Sigma-Aldrich) at OD600 = 3, supplemented with 0.5 mg/mL lysozyme. The resuspension was incubated
at 24 °C for 1 h followed by two rounds of freeze and thaw (−80
°C 1 h, room temperature 1 h). The lysate was stored at −80
°C until further use.
The α-cyano-4-hydroxycinnamic
acid (α-cyano, CHCA) (Sigma-Aldrich) matrix for MALDI-TOF mass
spectrometry was prepared according to the manufacturer’s instruction.
A 2 μL portion of lysate was mixed with 2 μL of matrix;
the mixture was spotted on a metal plate of the atmospheric pressure
MALDI interface (MassTech, Columbia, MD, USA) and measured on a Q Exactive
(Thermo Scientific) Fourier transform mass spectrometer.
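For orientation, the m/z at which c-di-GMP is expected in a high-resolution spectrum can be estimated from its elemental composition, C20H24N10O14P2 (two GMP units minus two water molecules). The sketch below is a hedged illustration: the text above does not state the ionization polarity or the adduct that was monitored, so the singly protonated and singly deprotonated species computed here are assumptions, and the function names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}
PROTON_MASS = 1.00727647  # mass of H+ (hydrogen atom minus one electron)

# Elemental composition of c-di-GMP.
C_DI_GMP = {"C": 20, "H": 24, "N": 10, "O": 14, "P": 2}


def monoisotopic_mass(formula: dict) -> float:
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def mz_singly_charged(neutral_mass: float, mode: str) -> float:
    """m/z of [M+H]+ ('positive') or [M-H]- ('negative'); assumed adducts."""
    if mode == "positive":
        return neutral_mass + PROTON_MASS
    if mode == "negative":
        return neutral_mass - PROTON_MASS
    raise ValueError("mode must be 'positive' or 'negative'")


if __name__ == "__main__":
    mass = monoisotopic_mass(C_DI_GMP)
    print(f"neutral monoisotopic mass: {mass:.4f}")
    print(f"[M+H]+: {mz_singly_charged(mass, 'positive'):.4f}")
    print(f"[M-H]-: {mz_singly_charged(mass, 'negative'):.4f}")
```

The neutral monoisotopic mass evaluates to approximately 690.095 u. Sodium or potassium adducts, which are common in MALDI spectra of crude lysates, would appear at correspondingly higher m/z and are not covered by this sketch.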
Estimation of c-di-GMP Concentration by LC-MS/MS
The extraction of c-di-GMP from bacterial cells was performed as reported.[75] Overnight cultures from agar-grown colonies
with protein expression induced by 0.1% l-arabinose were
suspended in 500 μL of ice cold extraction solvent (acetonitrile/methanol/water/formic
acid = 2/2/1/0.02, v/v/v/v), pelleted, and resuspended, followed by
boiling for 10 min. Three subsequent extracts were combined and frozen
at −20 °C overnight. The extract was centrifuged for 10
min at 20,800g, evaporated to dryness in a Speed-Vac
(Savant), and analyzed by LC-MS/MS.
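The component volumes for a batch of the extraction solvent follow directly from the stated 2/2/1/0.02 volume ratio. The sketch below is a minimal illustration that treats volumes as additive and ignores volume contraction on mixing; the function name and the batch size in the example are hypothetical.

```python
# Volume ratio of the extraction solvent as given in the protocol.
EXTRACTION_RATIO = {
    "acetonitrile": 2.0,
    "methanol": 2.0,
    "water": 1.0,
    "formic acid": 0.02,
}


def component_volumes(total_volume_ml: float, ratio: dict = EXTRACTION_RATIO) -> dict:
    """Split total_volume_ml among the components according to the ratio.

    Assumes additive volumes (no correction for mixing contraction).
    """
    if total_volume_ml <= 0:
        raise ValueError("total_volume_ml must be positive")
    total_parts = sum(ratio.values())
    return {name: total_volume_ml * parts / total_parts for name, parts in ratio.items()}


if __name__ == "__main__":
    # Hypothetical batch size chosen so that one part equals 10 mL.
    for name, volume in component_volumes(50.2).items():
        print(f"{name}: {volume:.2f} mL")
```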