Purificação Aline Dias da1, Azevedo Nathalia Marins de1, Araujo Gabriel Guarany de1, Souza Robson Francisco de1, Guzzo Cristiane Rodrigues1.
Abstract
The regulation of multiple bacterial phenotypes was found to depend on different cyclic dinucleotides (CDNs) that constitute intracellular signaling second messenger systems. Most notably, c-di-GMP, along with proteins related to its synthesis, sensing, and degradation, was identified as playing a central role in the switching from biofilm to planktonic modes of growth. Recently, this research topic has been expanding, with the discoveries of new CDNs, novel classes of CDN receptors, and the numerous functions regulated by these molecules. In this review, we comprehensively describe the three main bacterial enzymes involved in the synthesis of c-di-GMP, c-di-AMP, and cGAMP, focusing on the description of their three-dimensional structures and their structural similarities with other protein families, as well as the essential residues for catalysis. The diversity of CDN receptors is described in detail along with the residues important for the interaction with the ligand. Interestingly, genomic data strongly suggest that there is a tendency for bacterial cells to use both c-di-AMP and c-di-GMP signaling networks simultaneously, raising the question of whether there is crosstalk between different signaling systems. In summary, the large amount of sequence and structural data available allows a broad view of the complexity and the importance of these CDNs in the regulation of different bacterial behaviors. Nevertheless, how cells coordinate the different CDN signaling networks to ensure adaptation to changing environmental conditions is still open for much further exploration.
Keywords: DAC; GGDEF; SMODS; c-di-AMP; c-di-GMP; cGAMP
Year: 2020 PMID: 32466317 PMCID: PMC7288161 DOI: 10.3390/molecules25102462
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
List of the bacterial c-di-GMP, c-di-AMP, cGAMP, and eukaryotic cGAMP receptors that had their structure solved in complex with their ligand and deposited in the Protein Data Bank (PDB). The Pfam/Rfam and, in some cases, the InterPro domain is described. The residues involved in ligand binding are also described for a representative of each receptor.
| Receptor Class (Pfam/Rfam) | Organism | Receptor Function | Ligand Binding Site | Ref. |
|---|---|---|---|---|
| 3′-5′ c-di-GMP | ||||
| Members of the Transmembrane Protein 173 (TMEM173) family, also known as Stimulator of Interferon Genes (STING), are an important component of the immune system. STING proteins are responsible for regulating the induction of type I interferon via activation of IFN-β gene transcription. | STING proteins interact with c-di-GMP at the protein dimer interface in a perfectly symmetrical manner, increasing the homodimer stability. This binding involves a hydrophilic core that, in the human STING (PDB 4F5D), corresponds to S162, G166, Y167, R238, Y240, S241, N242, E260, T267, and the presence of two Mg2+ ions and two water molecules ( | [ | ||
| [ | ||||
| [ | ||||
| c-di-GMP Riboswitches, also known as GEMM (Genes for the Environment, Membranes and Motility), are structured RNAs located in the 5′-untranslated regions of mRNAs that sense c-di-GMP molecules to regulate expression of downstream genes that could be involved with virulence, motility and biofilm formation. | GEMM Riboswitches interact with c-di-GMP through an uncharacterized motif with high affinity, in the picomolar range, compared with c-di-GMP protein receptors, which have nanomolar to micromolar affinities. In the case of c-di-GMP I Riboswitch (PDB 3IRW) the nucleotides involved in ligand binding are: G14, C15, A16, C17, A18, G19, G21, C46, A47, A48, A49, G50. | [ | ||
| [ | ||||
| [ | ||||
| [ | ||||
| VCA0042 is an important protein for the efficient infection of mice by | This PilZ domain interacts with monomeric c-di-GMP via two main sequence motifs: RxxxR and DxSxxG motifs (PDBID: 2RDE), | [ | ||
| BcsA, Bacterial cellulose synthase A, is a component of a protein complex that synthesizes and translocates cellulose across the inner membrane. The binding of c-di-GMP to a complex of BcsA and BcsB releases the enzyme from an autoinhibited state, generating a constitutively active cellulose synthase. | Most PilZ domains interact with dimeric c-di-GMP, in which one molecule interacts with two main sequence motifs on the β-barrel surface, DxSxxG and RxxxR motifs (PDBID: 5EIY, 5EJ1, 5EJZ, 4P00, 4P02, 5Y6F, 5Y6G, 5VX6, 5KGO, 5EJL, 5XLY, 2L74, 5Y4R, 4RT0, 4RT1). | [ | ||
| YcgR-like proteins, such as the motility inhibitor (MotI) protein, are diguanylate receptors that bind c-di-GMP, acting as a molecular clutch on the flagellar stator MotA to inhibit swarming motility. | [ | |||
| [ | ||||
| [ | ||||
| MapZ in complex with c-di-GMP interacts directly with a chemotaxis methyltransferase, CheR1, and inhibits its activity. In this manner, it regulates chemotaxis in | [ | |||
| The alginate biosynthesis protein Alg44 regulates alginate secretion to promote biofilm formation by sensing dimeric c-di-GMP molecules. | [ | |||
| Unknown function | The ligand is in an unusual trimeric oligomerization state, in which the six guanine bases are oriented almost parallel to each other, | [ | ||
| Proteins containing GGDEF domains are DGCs and some of them are regulated by feedback regulation by interaction of c-di-GMP to their allosteric site (I-site). | Proteins with GGDEF domain act as receptor proteins when c-di-GMP binds their allosteric site via the RxxD motif. | [ | ||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| PelD is a membrane protein in which the cytoplasmic GGDEF domain binds c-di-GMP to regulate the synthesis of the PEL exopolysaccharide. | [ | |||
| The FimX protein regulates twitching motility by sensing c-di-GMP molecules through its EAL domain and regulates the type IV pilus machinery. | Proteins with EAL domain, such as FimX (PDB 4FOK), interact with the c-di-GMP by Q463, F479, L480, R481, S490, P491, M495, D508, R534, E653, F654, Q673, G674, D675 and T680. The A478F479L480 residues belong to a degenerate EAL motif, | [ | ||
| [ | ||||
| [ | ||||
| The transmembrane receptor LapD is a multidomain protein, in which the C-terminal EAL domain binds c-di-GMP to prevent cleavage of the surface adhesin LapA, inhibiting biofilm dispersal. | [ | |||
| [ | ||||
| BldD is a master regulator of cell development. BldD represses the transcription of close to 170 sporulation genes during vegetative growth, controlling morphological differentiation, and also directly controls the expression of antibiotics. | The C-terminal domain of BldD (PDB 5TZD) interacts with a tetramer of c-di-GMP, forming a BldD2-(c-di-GMP)4 complex, by two motifs: R114G115D116 and R125Q126D127D128. The ligand was found as | [ | ||
| [ | ||||
| VpsT is a transcriptional regulator that binds c-di-GMP at its REC domain to control biofilm formation and motility. VpsT is described as a master regulator for biofilm formation and consists of an N-terminal REC domain and a C-terminal HTH domain. | A c-di-GMP2 binds into the VspD interface between two REC domains; the REC dimerization is required for ligand binding. | [ | ||
| ShkA has a pseudoreceiver domain (Rec1) that binds c-di-GMP to allow the autophosphorylation and subsequent phosphotransfer and dephosphorylation of the protein. The c-di-GMP binds to the protein to release the C-terminal domain to step through the catalytic cycle. | c-di-GMP binds to the Rec1-Rec2 linker that contains the DDR motif. The residues involved in ligand binding are: R324, Y338, I340, P342, R344, S347, Q351. In the apo form of the protein, D369, D370 and R371 from the DDR motif, located in a loop, lie inside the c-di-GMP binding site, suggesting that c-di-GMP competes with this protein loop. | [ | |
| MshE is an ATPase associated with the bacterial type II secretion system, homologous to the type IV pilus machinery. | The N-terminal domain of MshE (locus tag VC0405, PDB 5HTL) interacts with c-di-GMP mainly through two similar motifs spaced by five residues. These motifs have a similar sequence, RLGxx(L)(V/I)xxG(I/F)(L/V)xxxxLxxxLxxQ, and the residues involved in ligand binding correspond to R9L10G11 and L25xxxL29xxQ32 for motif I, and R38L39G40 and L54xxxL58xxQ61 for motif II. Other residues also important for ligand binding are: R7, D108 (from the C-terminal ATPase domain), and the main chain of D41. | [ | ||
| BrlR upregulates the expression of multidrug efflux pumps. c-di-GMP activates BrlR expression and enhances its affinity for binding DNA. BrlR has an N-terminus DNA-binding motif (HTH_MerR domain described in the Pfam as MerR domain), and a C-terminus effector-binding domain (GyrI-like domain) linked by a coiled-coil region. | There are two different c-di-GMP binding sites located at the N-terminus of the protein, mainly at the DNA binding domain of each BrlR protomer of the protein tetramer. | [ | ||
| FleQ is a transcription regulator and contains three domains: a central AAA+ ATPase σ(54)-interaction domain, flanked by a divergent N-terminal receiver domain and a C-terminal helix-turn-helix DNA-binding motif. FleQ binds c-di-GMP through its AAA+ ATPase domain at a binding site different from the catalytic pocket. | FleQ binds c-di-GMP at the N-terminal part of the AAA+ ATPase through the L142F143R144S145 motif (R-switch), E330xxxR334 motif, and residues R185 and N186 of the post-Walker A motif KExxxRN. | [ | ||
| Cell cycle kinase CckA is a bifunctional histidine kinase/phosphatase enzyme, mediating both phosphorylation and dephosphorylation of downstream targets. CckA binds c-di-GMP and drives the cell cycle progression by swapping the CckA kinase activity into phosphatase mode. | CckA is a membrane and multidomain protein, in which a catalytically active (CA) domain binds c-di-GMP. The CA domain of cell cycle kinase CckA interacts with c-di-GMP by the residues Y514, K518, W523, I524, E550, H551, H552, H553, H554 and H555. | [ | ||
| 3′-5′ cGAMP or 3′-3′ cGAMP | ||||
| STING regulates the induction of type I interferons via recruitment of protein kinase TBK1 and transcription factor IRF3, activating IFN-β gene transcription. | STING proteins interact with cGAMP at the dimer interface. In the anemone STING (PDBID 5CFM), the residues involved with the ligand interaction are: Y206, R272, F276, R278, and T303 of each protomer of the dimer. Y280 binds the ligand by a water molecule. | [ | ||
| Acts as a transcriptional factor, switching between RNA secondary structures when bound to cGAMP, regulating its own expression. | 3’-5’ cGAMP riboswitches bind cGAMP (PDBID 4YAZ) through the nucleotides G8, A11, A12, U13, A14, C15, A41, A42, G74, C75, and C76. | [ | ||
| [ | ||||
| 2’-3′ cGAMP | ||||
| STING regulates the induction of type I interferons via recruitment of protein kinase TBK1 and transcription factor IRF3, activating IFN-β gene transcription. | STING proteins interact with | [ | ||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| 3′-5′ c-di-AMP | ||||
| STING binds eukaryotic 2’-3′ cGAMP with high affinity compared with bacterial CDNs such as c-di-GMP, c-di-AMP, and 3′-5′ cGAMP. | STING proteins interact with c-di-AMP in a different manner than c-di-GMP, but still at the same dimer interface. In the porcine STING (PDBID 6A03), the amino acids involved with the interaction are: S162, Y167, I235, R232, R238, Y240, and T263. | [ | ||
| [ | ||||
| [ | ||||
| [ | ||||
| RECON (reductase controlling NF-κB) is an aldo-keto reductase and a STING antagonist. It negatively regulates the NF-κB activation that induces the expression of IFN-induced genes. RECON recognizes c-di-AMP by the same site that binds the co-substrate nicotinamide. One AMP molecule (AMP1) of c-di-AMP has essentially the same position as the AMP portion of the NAD+ co-substrate, while another AMP (AMP2) presents a shifted position. | RECON binds c-di-AMP by the residues: E276, E279, N280, L219, and A253 in contact with AMP1, while Y24, Y216, Y55, and L306 are in contact with AMP2. L219, T221, and G217 are also involved in ligand binding. | [ | ||
| Bacterial c-di-AMP is involved in cell wall stress and signaling DNA damage through interactions with several protein receptors and a widespread | [ | |||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| [ | ||||
| PII-like signal transduction protein (PstA) is a c-di-AMP receptor. PII-like proteins are associated with nitrogen metabolism using different pathways. PstA binds c-di-AMP with a | PstA (PDBID 4D3H) forms trimers and binds to c-di-AMP at the interface between two molecules through interactions with the residues N24, R26, T28, A27, F36, L37, N41, G47, F99, and Q108. | [ | ||
| [ | ||||
| [ | ||||
| LIPC forms a tetramer and each c-di-AMP molecule binds at a protein dimer interface at the carboxyltransferase (CT) domain (HMGL-like domain in the Pfam) (PDBID 5VYZ) in a binding site that is not well conserved among pyruvate carboxylases. The residues involved in the interaction are: Q712, Y715, I742, S745, G746, and Q749 from both monomers. The ligand was found as | [ | |||
| [ | ||||
| Potassium transporter A (KtrA) and Bacterial cation-proton antiporter (CpaA) are members of the RCK domain family of proteins (Regulator of conductance of K+) and regulate the cellular potassium conductance. The C-terminal domain (RCK_C or TrkA_C) specifically binds c-di-AMP molecules ( | c-di-AMP binds at the RCK_C domain of KtrA in the interface of a dimer (PDBID 4XTT). The residues involved in the interaction are I163, I164, D167, I168, R169, A170, N175, I176, and P191 from both monomers. R169 and the isoleucine residues (hydrophobic pocket) are well conserved in other species. | [ | ||
| Intracellular pathogen | c-di-AMP binds to the cystathionine β-synthase domain (CBS) of OpuC at the dimer interface. The residues involved in ligand binding are well conserved among OpuCA orthologues and comprise V260, V280, T282, Y342, I355, I357, R358, and A359. | [ | ||
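Several rows of the table describe c-di-GMP binding sites by short sequence motifs (RxxxR and DxSxxG in PilZ domains, RxxD at the GGDEF I-site). The following is a minimal sketch of how such motifs could be located in a protein sequence, assuming that "x" stands for any residue and that a plain regular-expression scan is an acceptable first pass. The function name `find_motifs` and the test sequence are hypothetical and do not come from the review; a motif hit alone does not establish that a protein binds c-di-GMP.

```python
import re

# Motifs quoted in the table; 'x' is taken to mean "any residue".
# The labels and the regex translation are illustrative assumptions.
MOTIFS = {
    "PilZ RxxxR": "RxxxR",
    "PilZ DxSxxG": "DxSxxG",
    "GGDEF I-site RxxD": "RxxD",
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return the 1-based start positions of every motif occurrence."""
    hits = {}
    for name, motif in MOTIFS.items():
        # Replace each 'x' by the regex wildcard and wrap the pattern in a
        # lookahead so that overlapping occurrences are also reported.
        pattern = re.compile("(?=" + motif.replace("x", ".") + ")")
        hits[name] = [m.start() + 1 for m in pattern.finditer(sequence)]
    return hits


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MARAAARKKDASLLGTTRGGD"
toy_hits = find_motifs(toy_sequence)
```

For real analyses, profile-based searches against the Pfam domain models named in the table would be more sensitive than fixed motifs; the regular expressions here only mirror the motif notation used in the text.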
Figure 1. Structural similarities between GGDEF (GG[D/E][E/D]F conserved sequence motif) domains and adenylate/guanylate cyclase, GTP (guanosine triphosphate) cyclohydrolase III and RRM-like palm domain of DNA polymerases. (A) dendrogram showing structures similar to GGDEF domain made with the Dali server [79] (query: PleD PDBID: 2V0N). Each domain is shown in a different color, and the PDBID_chain and the Pfam name are shown for each branch. The conserved fold found in most of these structures is shown in brown, panel (B), overlaid on the GGDEF domain of PleD topology. (C) structural superposition of GGDEF domain of PleD (PDBID: 2V0N) with the other domains shown in the dendrogram (panel A), using the same colors to represent each domain. At the bottom of each structural alignment, the domain’s name and the PDBID code are shown, as well as the chemical reaction performed. DGC: diguanylate cyclase; AC: adenylate cyclase; GC: guanylate cyclase; and FAPy: 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5′-phosphate.
Figure 2. Domain architectures of proteins containing GGDEF, SMODS (Second Messenger Oligonucleotide or Dinucleotide Synthetase), and DAC (di-adenylyl cyclase) domains. The most frequent domain architectures of proteins containing GGDEF (A), SMODS (B) and DAC (C) domains. The analysis was done using a non-redundant dataset (<80% identity) of protein sequences built from sequences retrieved from the NCBI protein database [88]. The names of the domains are based on the Pfam database [89].
Figure 3. Conserved sequences within GGDEF members and their active and I-site pockets. (A) residue frequency in GGDEF domains. Using the Dali server [79], 23 sequences of GGDEF domain structures were used to create a multiple sequence alignment, and the sequence logo was created with the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to PleD of Caulobacter vibrioides (PDBID: 2V0N). Residues colored in red are involved in ligand or magnesium binding (for underlined residues, only the main chain is involved) and those colored in green are located in the I-sites. The GGDEF motif is placed in a red box. On the right, the structure of the GGDEF domain of PleD is shown as a cartoon. The topology of GGDEF is shown below the structure, and the CATH topology name and code are also shown [91]; (B) interaction network between the binding pocket of the GGDEF domain of PleD and the substrate, GTP. At the bottom, the PleD structure in the inactive conformation is shown, with its two inhibitory sites (I-site and I’-site). On the right, the residues involved in the (c-di-GMP)2 interactions at the inhibitory sites are shown in more detail. Gray dotted lines represent hydrogen bonds. The magnesium ions are colored in green. GTP and the protein residues involved in its binding are shown as sticks. Carbons are colored white, oxygens are red, nitrogen atoms are blue, and phosphorus atoms are orange.
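The residue frequencies summarized by a sequence logo can be sketched in a few lines. The example below is an illustrative assumption rather than the procedure used by WebLogo, which additionally scales each column by its information content; it only counts residues per alignment column and ignores gaps. The function name `column_frequencies` and the toy alignment are hypothetical.

```python
from collections import Counter


def column_frequencies(alignment: list[str]) -> list[dict[str, float]]:
    """Per-column residue frequencies of an alignment, ignoring gaps ('-')."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    freqs = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        # An all-gap column yields an empty dictionary.
        freqs.append({res: n / total for res, n in counts.items()} if total else {})
    return freqs


# Hypothetical toy alignment built around the GG[D/E][E/D]F motif.
toy_alignment = ["GGDEF", "GGEEF", "GGDEF", "SGDEF"]
toy_freqs = column_frequencies(toy_alignment)
```

In this toy case the third column contains D in three of four sequences and E in one, which corresponds to the [D/E] position of the motif.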
Figure 4. Structural similarities between the SMODS domain and other nucleotidyltransferase superfamily members. (A) the N-terminal domain SMODS (colored in red) and the C-terminal AGS-C (light pink) of the DncV protein structure are shown as cartoons (PDBID: 4U0M). The SMODS domain is involved in cGAMP synthesis and belongs to the nucleotidyltransferase superfamily (NTS). The NTS fold is characterized by the presence of a minimal conserved core of a mixed β-sheet flanked by α-helices with α1-β1-α2-β2-α3-β3-α4 topology that corresponds to α3-β2-α8-β3-α9-β6 (colored in red), missing the α4 element. Various insertions are observed and are colored in grey (right panel). Members of NTS contain three conserved motifs located at the active site: (i) G[G/S] located at α8, (ii) [D/E]h[D/E] (h indicates a hydrophobic amino acid) located at β3, and (iii) [D/E] located at β6. Two of these motifs are conserved in proteins containing SMODS domains (see Figure 5A). (B) topology of DncV, showing the location of the NTS fold core (bold outlines). The secondary structure elements from SMODS are colored in red and the AGS-C domain in light pink. (C) dendrogram showing structures similar to DncV made with the Dali server [79] (query: DncV, PDBID: 4U03). Each domain is shown in a different color, and the PDBID_chain and the Pfam name are shown for each branch. (D) The conserved fold found in most of these structures, all NTS members, is shown in brown overlaid on the DncV topology. (E) structural superposition of DncV (PDBID: 4U03) with other NTS members found in the panel (C) dendrogram, using the same colors to represent each domain. The protein’s name is shown at the top of each structural alignment.
NF90 is colored in purple (PDBID: 4AT7), PAP is colored in orange (PDBID: 1Q78), cGAS is colored in dark green (PDBID: 5XZE), Utp22 is colored in light green (PDBID: 4M5D), hOAS3.DI is colored in blue (PDBID: 4S3N), TRF4 is colored in light blue (PDBID: 3NYB), EF_0920 is colored black (PDBID: 2NRK), and Pol µ is colored in green (PDBID: 2IHM).
Figure 5. Conserved sequences within SMODS members and their substrate and inhibitor binding pockets. (A) residue frequency in SMODS proteins. Forty-five sequences were used from the Pfam database to create a multiple sequence alignment of the SMODS domain of different proteins using ClustalW [116], and the sequence logo was created using the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to the Vibrio cholerae dinucleotide cyclase DncV (PDBID: 4U03). Residues colored in green are involved in the interaction between the DncV SMODS domain and the folate-like inhibitor, 5-methyltetrahydrofolate diglutamate (5MTHFGLU2) (residues Arg36, Arg40, Arg44, and Asp260, located at the AGS-C domain, are not shown). Residues located at the SMODS domain involved in the catalytic activity are colored in red (residues Ser259, Lys287, Ser301, and Asp348, located at the AGS-C domain, are not shown). The red boxes contain the G(G/S) and Dx(D/E) motifs found in members of NTS. The structure shown on the right belongs to the DncV protein (PDBID: 4U03), in which the AGS-C is colored in salmon and the SMODS domain is colored by secondary structure (β-strands in yellow and α-helices in red). The SMODS domain topology is shown below the structure, and the CATH topology name and code are also shown [91]; (B) interaction network between the DncV binding pocket (active site) and its inhibitory site with substrate and inhibitor molecules, respectively. The substrates GTP and ATP are found bound at the active site, while the folate-like molecule (5MTHFGLU2) binds at the allosteric site, inactivating the protein. The residues interacting with substrate and inhibitor molecules are shown as sticks and colored by element: the inhibitor carbons are colored brown, the magnesium ion is shown as a green sphere, and gray dotted lines represent hydrogen bonds.
Figure 6. Conserved sequences within DAC domain members and their substrate binding pocket. (A) residue frequency in DAC proteins. A total of 609 sequences from the Pfam database were used to create a multiple sequence alignment of the DAC domain of different proteins using ClustalW [116], and the sequence logo was created using the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to the DisA protein from T. maritima (PDBID: 3C1Y). Residues that bind ATP or the magnesium cation are colored in red; underlined residues bind mainly through the main chain. Conserved motifs within DAC members are placed in red boxes: GALI, DGAhh, GxRHRxA, and (V/I)SEE motifs. (B) structure of the DAC domain of CdaA from Listeria monocytogenes (PDBID: 4RV7). The substrate ATP is found bound at the active site. The DAC domain topology is shown below the structure, and the CATH topology name and code are also shown [91]; (C) interaction network between the CdaA binding pocket and the substrate, ATP (PDBID: 4RV7). Gray dotted lines represent hydrogen bonds, the magnesium ion is colored in green, and the ATP and the protein residues involved in its binding are shown as sticks. Carbons are colored white, oxygens are red, nitrogen atoms are blue, and phosphorus atoms are orange.
Figure 7. Some CDN receptors: STING, PilZ, and degenerate EAL domains. (A) multiple sequence alignment of STING proteins in complex with different CDNs: monomeric 3’-5’ c-di-GMP (PDBID: 4F5D), 2’-3’ cGAMP (PDBID: 6A06), 3’-5’ c-di-AMP (PDBID: 6A03), and 3’-5’ cGAMP (PDBID: 5CFM). The residues highlighted in yellow interact directly with the ligand, those in green interact with the ligand via magnesium ions, and those in cyan interact via water molecules; (B) structural superposition of STING proteins (domain TMEM173) in complex with different CDNs bound at the same protein interface. The ligands are colored by element: nitrogen atoms are in dark blue and oxygens are in red. Carbons are colored according to the ligand: 3’-5’ c-di-GMP in orange, 2’-3’ cGAMP in blue, 3’-5’ c-di-AMP in purple, and 3’-5’ cGAMP in green; (C) residues involved in interactions with the different ligands, which are the same ones highlighted in panel (A). The residues are colored by element, with carbon in white, nitrogen in dark blue, and oxygen in red, while the ligands are colored as described in panel (B). Water molecules are shown as blue spheres, while magnesium cations are shown as green spheres. (D) multiple sequence alignment of the PilZ domains that had their structures solved in complex with c-di-GMP molecules. Residues highlighted in yellow are involved in interactions with the ligand. The secondary structure elements shown belong to the PilZ domain of YcgR from E. coli (PDBID: 5Y6F). The motifs conserved within PilZ members are placed in red boxes; (E) some domain organizations found in proteins containing PilZ domains (Alg44, BcsA, and YcgR) and their related functions; (F) structural representation of a PilZ domain as a cartoon (PilZ domain of YcgR, PDBID: 5Y6F). Residues belonging to the “RxxxR” motif are colored in orange, the “DxSxxG” motif is colored in blue, and the β strand 7 is colored in green. The dimeric c-di-GMP is shown as sticks.
The interaction network presented in this figure is shown in more detail in panel (G); (H) multiple sequence alignment of proteins containing degenerate EAL domains (PDBID: 4F48, 4FOK, and 3PJT) and a catalytic EAL domain (PDBID: 3SY8) that had their structure solved in complex with monomeric c-di-GMP. The residues highlighted in yellow interact directly with the ligand, and those in green interact with the ligand via magnesium ions. In the case of RocR, which has a catalytic EAL domain, the residues highlighted in salmon were experimentally demonstrated to be important for catalysis [123]. The consensus “EAL” motif is placed in a red box; (I) interaction network of the binding site of a FimX degenerate EAL domain (PDBID: 4FOK). Gray dotted lines represent hydrogen bonds. The residues and the c-di-GMP molecule are colored by element. The multiple sequence alignments were performed using the CLUSTAL W server [116].
Figure 8. Diversity of cyclic dinucleotides produced by different organisms. All structures were observed within the three-dimensional protein structures deposited in the Protein Data Bank. (A) top panel: two-dimensional representation of different cyclic dinucleotides produced mainly by bacteria, with the exception of the 2’-3’ cGAMP molecule, which is produced in eukaryotic cells by cGAS enzymes. The linkages between the pentoses and the phosphates are shown in green, blue, or red, and the carbons colored in grey are not involved in the phosphate linkage cyclisation. The structure of each CDN was initially downloaded in the SDF format from the CHEBI website [192] and edited using MarvinSketch version 19.18. The PDBID of each CDN is also given for each ligand. Bottom panel: The three main conformations of CDNs found in the protein binding pockets with respect to the base proximity are described as: 1-closed conformation (shaped as a horseshoe), when the two base rings are face-to-face, colored in blue; 2-intermediate conformation (shaped as a boat), when the bases are in neither the closed nor the open conformation, colored in red; and 3-open conformation (shaped as a plate), when the two base rings are far from each other in an elongated conformation, colored in lemon green. Below these structures, a bubble chart shows the frequency of each CDN in the Closed, Intermediate, and Open conformations. c-di-GMP is the only one that has been found in protein structures in different oligomeric states: as a monomer, dimers, trimers, and tetramers. The bubble chart on the right shows the ribose conformations, which can be found in three different configurations, C3’-endo, C2’-endo, or C2’-exo, and the frequency of each of these conformations in the CDNs found in protein binding pockets. * The Gsyn conformation of the base ring in relation to the pentose is found only in one c-di-GMP structure (PDBID: 4FOK).
Panels (B–D) show superpositions of different CDNs, illustrating the heterogeneity of conformations found in the protein and riboswitch binding pockets; (B) top panel: different conformations for c-di-GMP found in different protein binding pockets and riboswitches shown as a superposition between them. 3’-5’ c-di-GMP structures found in degenerate EAL domains are colored in cyan (PDBID: 3HV8, 4F3H, 4F48, 3PJT, 3PJU), in riboswitches are colored in green (PDBID: 3Q3Z, 3MXH, 3MUT, 3MUR, 3MUM, 3IRW, 4YB0, 3IWN), in STING proteins are colored in blue (PDBID: 4EF4, 4EMT, 6RM0, 6S86, 4F9G, 4F5D, 4F5Y, 6A04, 5CFL, 5CFP), in PilZ domains are colored in pink (dimeric c-di-GMP: PDBID: 4ZMN, 5EUH, 3BRE, 3I5C, 1W25, 2WB4, 2V0N, 3TVK, 3I5A, 3IGM, 4URG, 4URS. Trimeric c-di-GMP, PDBID: 4XRN), in the active site of GGDEF domains (monomeric c-di-GMP, PDBID: 4RT1), and in the GGDEF I-site (dimeric c-di-GMP, PDBID: 5EIY, 5EJ1, 5EJZ, 4P00, 4P02, 5KGO, 5EJL, 5VX6, 5Y4R, 5XLY, 2L74, 4RT0, 5Y6F, 5Y6G) are colored in purple, in the T2SSE_N domain is colored in brown (PDBID: 5HTL), in the HATPase_c (CA) domain is colored in blue (PDBID: 5IDM), in the REC domain is colored in yellow (PDBID: 3KLO), and in the Sigma54_activat domain is colored in orange (PDBID: 5EXX). The unusual c-di-GMP oligomeric states found in one GGDEF active site are colored in pink and brown (PDBID: 3QYY), and the one found in the C-terminal domain of BldD is colored in yellow (PDBID: 4OAZ); (C) different conformations for 3’-5’ c-di-AMP found in different protein binding pockets and riboswitches shown as a superposition between them.
3’-5’ c-di-AMP structures found in riboswitches are colored in green (PDBID: 4QK8, 4QK9, 4W92, 4W90, 4QLM, 4QLN, 4QKA), in TrkA_C domains are colored in brown (PDBID: 4YS2, 4YP1, 5F29), in Cyclic-di-AMP receptor domains are colored in yellow (PDBID: 4WK1, 4D3H, 4RWW, 4RLE), in a STING protein is colored in dark blue (PDBID: 6IYF), in an Aldo-keto reductase domain is colored in light blue (PDBID: 5UXF), in a Pyruvate carboxylase domain is colored in red (PDBID: 5VZ0), in the active site of a DAC domain is colored in dark green (PDBID: 3C1Y), and in a CBS domain is colored in orange (PDBID: 5KS7); (D) 3’-5’ cGAMP is found in intermediate and closed conformations in the ligand binding pocket of riboswitches, colored in green (PDBID: 4YAZ and 4YB1), and in a STING protein, colored in blue (PDBID: 5CFM). 2’-3’ cGAMP is found in a closed conformation in the ligand binding pocket of STING proteins, colored in blue (PDBID: 6NT7, 6NT8, 5CFQ, 4LOH, 4LOJ, 5GRM, 4KSY, and 6A06).
Figure 9. Lack of anti-correlation in the distribution of DAC and GGDEF genes per prokaryotic clade. Each dot represents a prokaryotic class, such as Gammaproteobacteria or Bacilli, as defined in the NCBI’s Taxonomy Database. For each class, the number of genomes harboring at least one DAC gene, the number harboring at least one GGDEF gene, and the number harboring both were calculated. If, for a given class, we consider the number of genomes with DACs and the number of genomes with GGDEF, the smaller of these numbers is the maximum number of genomes that could, in principle, carry both genes. That number is shown on the x-axis, while the actual number of genomes carrying both genes is shown on the y-axis. The points lie very close to the diagonal line, indicating that, in most cases, if members of a given lineage carry both DAC and GGDEF, they tend to keep both genes instead of having to choose between them.
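The two quantities plotted in Figure 9 can be expressed compactly. The sketch below assumes that per-class genome counts are already available; the class name `ClassCounts`, its fields, the ratio property, and the numbers in the example are all hypothetical and are not data from the review. A ratio close to 1 corresponds to a point near the diagonal.

```python
from dataclasses import dataclass


@dataclass
class ClassCounts:
    """Genome counts for one prokaryotic class (hypothetical container)."""
    name: str
    with_dac: int    # genomes with at least one DAC gene
    with_ggdef: int  # genomes with at least one GGDEF gene
    with_both: int   # genomes with at least one of each

    @property
    def max_possible_both(self) -> int:
        """x-axis value: upper bound on genomes that could carry both genes."""
        return min(self.with_dac, self.with_ggdef)

    @property
    def cooccurrence_ratio(self) -> float:
        """Observed co-occurrence (y) divided by its upper bound (x)."""
        upper = self.max_possible_both
        return self.with_both / upper if upper else 0.0


# Invented counts, for illustration only.
example = ClassCounts("hypothetical class", with_dac=120, with_ggdef=90, with_both=85)
```

With these invented counts, at most 90 genomes could carry both genes and 85 actually do, so the point would sit just below the diagonal.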