The World of Cyclic Dinucleotides in Bacterial Behavior

The regulation of multiple bacterial phenotypes was found to depend on different cyclic dinucleotides (CDNs) that constitute intracellular signaling second messenger systems. Most notably, c-di-GMP, along with proteins related to its synthesis, sensing, and degradation, was identified as playing a central role in the switching from biofilm to planktonic modes of growth. Recently, this research topic has been expanding, with the discoveries of new CDNs, novel classes of CDN receptors, and the numerous functions regulated by these molecules. In this review, we comprehensively describe the three main bacterial enzymes involved in the synthesis of c-di-GMP, c-di-AMP, and cGAMP, focusing on their three-dimensional structures, their structural similarities with other protein families, and the residues essential for catalysis. The diversity of CDN receptors is described in detail along with the residues important for the interaction with the ligand. Interestingly, genomic data strongly suggest that there is a tendency for bacterial cells to use both c-di-AMP and c-di-GMP signaling networks simultaneously, raising the question of whether there is crosstalk between different signaling systems. In summary, the large amount of sequence and structural data available allows a broad view of the complexity and the importance of these CDNs in the regulation of different bacterial behaviors. Nevertheless, how cells coordinate the different CDN signaling networks to ensure adaptation to changing environmental conditions remains open for much further exploration.


Introduction
In the mid-2000s, the idea emerged that c-di-GMP, cyclic-bis(3'→5')-dimeric GMP, could be a second messenger ubiquitous in bacteria, with proteins containing GGDEF and EAL or HD-GYP domains at the center of this regulation, being involved in the synthesis and degradation of c-di-GMP, respectively [1]. In the following years, different papers showed the central role of c-di-GMP in orchestrating different signaling networks, such as regulation of the flagellar rotor, bacterial motility such as twitching, exopolysaccharide synthesis, and bacterial biofilm formation. Nevertheless, c-di-GMP was first identified in 1987 as an allosteric activator of cellulose synthase in the cellulose-producing bacterium Komagataeibacter (Gluconacetobacter) xylinus [2]. Cellulose synthase was the first c-di-GMP receptor described, and nowadays a wide range of receptors has been identified, including RNA structures known as riboswitches. Therefore, a cyclic dinucleotide neglected in microbiology for 20 years emerged as a regulator of the bacterial cell lifestyle.
Recently, this research area has been expanding, with the discoveries of new intracellular signaling cyclic dinucleotides (CDNs) in bacteria. In 2008, it was demonstrated that bacteria can produce not only c-di-GMP, but also c-di-AMP, cyclic-bis(3'→5')-dimeric AMP, by an enzyme known as DisA that possesses a DAC domain [3]. In 2012, a novel cyclic dinucleotide has been found to be [...] ligand recognition. We also highlight the conformation of the CDNs inside the protein binding pocket. Surprisingly, different kinds of receptors bind CDNs with similar conformations. Additional observations, based on genomic data, suggest that different CDN second messenger systems tend to coexist in many organisms, showing the complexity and the importance of bacterial CDN signaling networks. We explore these resources and present an organization of our current knowledge on this expanding research topic.

GGDEF, SMODS, and DAC Domains Do Not Share Structural Similarities and Probably Perform the Nucleotide Cyclization Catalysis by Different Mechanisms
At the moment, three different classes of prokaryotic proteins are known to synthesize CDN molecules: (i) proteins containing GGDEF domains (Pfam family: PF00990); (ii) CD-NTase enzymes that have the catalytic domain known as SMODS (PF18144) [4]; and (iii) DAC proteins that have a catalytic domain called the DAC domain (DisA_N domain, PF02457).
Proteins containing GGDEF domains synthesize mainly 3'-5' c-di-GMP (c-di-GMP) molecules, while proteins containing SMODS domain synthesize preferentially 3'-5' cGAMP (cGAMP) molecules and proteins containing DAC domain synthesize mainly 3'-5' c-di-AMP (c-di-AMP) molecules. Even though CDN molecules are mainly synthesized by prokaryotic cells, eukaryotic cells also synthesize CDNs such as 2'-3' cGAMP by cGAS enzymes. These three classes of CDN synthetases do not share structural similarities, have different residues involved in substrate binding, and possess different catalytic mechanisms. Therefore, they are not homologs and probably evolved independently to catalyze analogous chemical reactions.
Members of families within the CD-NTase superfamily, such as SMODS and cGAS, often do not share detectable primary sequence similarity but adopt a Pol-β-like nucleotidyl transferase fold, suggesting a common origin followed by divergent evolution [5,6,64,65]. cGAS and enzymes containing a SMODS domain use a single active site to sequentially form two separate phosphodiester bonds and release one cyclic nucleotide product. On the other hand, proteins containing DAC or GGDEF domains require homodimerization to perform catalysis. DACs adopt a unique fold, while GGDEF domains are homologous to adenylyl/guanylyl cyclase catalytic domains and to the palm domain of DNA polymerases; see below [1,22].
Proteins containing GGDEF domains require an accessory domain that senses different signals to regulate GGDEF homodimerization and, consequently, enzymatic activity [66]. Each GGDEF domain binds one molecule of GTP, and dimerization positions the two GTP molecules in an antiparallel manner to enable their condensation into c-di-GMP with the release of two pyrophosphate molecules [67]. Therefore, proteins containing GGDEF domains are Bi Ter (two substrates, three products) enzymes and cannot be described by a Michaelis-Menten model [68,69]. A similar enzymatic mechanism seems to operate in proteins containing DAC domains. In the following sections, the structures of GGDEF, DAC, and SMODS domains and the residues important to their catalysis are described in more detail.
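The Bi Ter stoichiometry described above can be written out explicitly. The scheme below is only a restatement of the two reactions named in the text (two substrates, three products); the arrow labels are shorthand for the catalytically competent dimers, and typesetting it assumes the amsmath package.

```latex
\begin{align*}
2\,\mathrm{GTP} &\xrightarrow{\ \text{GGDEF dimer}\ } \text{c-di-GMP} + 2\,\mathrm{PP_i} \\
2\,\mathrm{ATP} &\xrightarrow{\ \text{DAC dimer}\ } \text{c-di-AMP} + 2\,\mathrm{PP_i}
\end{align*}
```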

GGDEF Domain Structure and Catalysis
GGDEF structure and structural similarities with other protein domains. The GGDEF domain has an overall structure composed of a central five-stranded β sheet surrounded by five α helices [70] and one hairpin (Figure 1B,C). The GGDEF domain has structural similarities to three other catalytic domains: (a) the class III adenylate and guanylate cyclase catalytic domains (Guanylate_cyc, PF00211), (b) the GTP cyclohydrolase III (GCH_III, PF05165), and (c) the palm domain of family Y DNA polymerases (IMS domain, impB/mucB/samB family, PF00817) (Figure 1). All of these families have a similar structural core with a β-α-α-β-β-α-β-α-β topology (Figure 1B), which contains the Alpha-beta Plait topology (β-α-β-β-α-β), as defined by the CATH database [71]. The specific version of the Alpha-beta Plait topology embedded in this group is better known as the RNA Recognition Motif-like fold (RRM-like fold) and corresponds to the so-called "palm domain" shared by archaeo-eukaryotic primases, reverse transcriptases, viral RNA-dependent RNA polymerases and families A, B, and Y of DNA polymerases [72].
The catalytic domains of class III adenylyl cyclase (AC) and guanylyl cyclase (GC) are involved in the conversion of adenosine triphosphate (ATP) to 3'-5' cyclic AMP (cAMP) and in the conversion of guanosine triphosphate (GTP) to 3'-5' cyclic GMP (cGMP), respectively (Figure 1C) [73,74]. Class III AC and GC are well characterized: they are widely present in eukaryotic and prokaryotic cells and perform important functions in many human tissues, being involved in signal transduction [66].
The GCH_III domain (GTP cyclohydrolase III) catalyzes the conversion of GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy) [75]. GCH III catalyzes two modifications of the GTP molecule that involve two hydrolysis reactions, one at the base (a cyclohydrolase activity) and another at the phosphodiester bond (phosphotransferase reaction) that causes the release of a pyrophosphate molecule [75] (Figure 1C). Palm domains recognized by the IMS model of the Pfam database are the catalytic domains of DNA polymerases such as prokaryotic DNA polymerase IV and eukaryotic DNA polymerases eta and kappa [76]. All of them are Family Y DNA polymerases involved in DNA repair and exhibit error-prone behavior [77]. In these enzymes, the palm domain has deoxynucleotidyltransferase activity (Figure 1C).
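The statement that the shared core topology contains the Alpha-beta Plait can be checked mechanically by treating both topologies as ordered lists of secondary structure elements. The sketch below is illustrative only: the function name `embed` is hypothetical, and the greedy left-to-right match it returns is one possible embedding, not necessarily the element-by-element correspondence used by CATH or by the structural alignments.

```python
def embed(motif, topology):
    """Return 1-based positions in `topology` that spell out `motif` in order
    (gaps allowed), using a greedy leftmost match, or None if no embedding exists."""
    positions = []
    start = 0
    for element in motif:
        try:
            idx = topology.index(element, start)
        except ValueError:
            return None  # this element cannot be placed after the previous one
        positions.append(idx + 1)
        start = idx + 1
    return positions


# Topologies as quoted in the text
core = "β-α-α-β-β-α-β-α-β".split("-")   # shared core of GGDEF and its homologs
plait = "β-α-β-β-α-β".split("-")        # Alpha-beta Plait (CATH)

print(embed(plait, core))  # [1, 2, 4, 5, 6, 7]
```

Under this greedy reading, the plait is recovered by skipping one of the two consecutive helices and the last two elements of the core; which helix is actually the insertion is a structural question the sketch does not answer.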
Given their conserved structural similarity to GGDEF domains, the class III adenylyl/guanylyl cyclases (AC/GC), GTP cyclohydrolase III, and the palm domain of DNA polymerases have been shown to be ancient homologous domains [78] that evolved from a common ancestor to perform different biological functions while preserving some core similarities, such as the binding of nucleotides or deoxynucleotides and the release of pyrophosphate or phosphate during the enzymatic reaction.
Residues important to GGDEF catalysis. The GGDEF domains are diguanylate cyclases (DGCs) that convert two molecules of GTP into one molecule of c-di-GMP. The GGDEF active site is thought to be assembled only when two GGDEF domains come together in a manner that permits the nucleophilic attack of the 3' OH groups on the α-phosphate groups of each GTP, leading to the synthesis of one molecule of c-di-GMP and two pyrophosphate molecules [70,80,81]. Therefore, DGCs are Bi Ter enzymes (two substrates, three products) as described above [68,69].
The catalytic activity of GGDEF domains is often regulated by input domains that precede the GGDEF domain, most of which are known or predicted to form dimers or heterodimers and to be sensor domains (Figure 2A). Isolated GGDEF domains have little or no detectable enzymatic activity [37,82] and require the dimerization of the input domain to assemble a catalytically competent GGDEF dimer. Two hypotheses for the regulation of GGDEF activity have been reported. One suggests that the input domain binds its ligand and enhances homodimerization and, consequently, the correct orientation of the GGDEF domains to perform catalysis. The other suggests that the protein is already a homodimer and that, when the input domain binds its ligand, it causes a reorientation of the GGDEF domains into a catalytically competent GGDEF dimer, or vice versa [67]. The signal transduction from the input domain to the GGDEF domain is predicted to be relayed by an S-helix (signaling helix) that connects the two domains and forms a two-helical parallel coiled coil (stalk) in the dimeric form of the protein [67,83]. Some proteins containing GGDEF domains possess a more complex activation mechanism that may involve the formation of higher oligomers [67,84-86].
The GG(D/E)EF motif (glycine, glycine, aspartic or glutamic acid, glutamic acid, and phenylalanine residues) is located in the loop between β2 and β3 (Figure 3A), in which the glutamic acid residue binds the α-phosphate group of the GTP molecule and coordinates one of the cations located in the binding site (Figure 3B). In the case of the PleD GGDEF domain, two magnesium cations are located in the binding site and are coordinated by E370 (from the GG(D/E)EF motif), D327, and the main chain of I328. The PleD residues D344 and N335 bind the guanine base of the substrate, while the side chains of E370, K442, and R446 and the main chains of F330, F331, and K332 bind the phosphate moieties of the GTP molecule (Figure 3B) [80]. The GG(D/E)EF consensus sequence and most of the residues important to catalysis are very well conserved within GGDEF family members (Figure 3A). This includes the D327, N335, and D344 residues, which have been reported to be essential to GGDEF domain activity [70,87].
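As a minimal illustration of how the GG(D/E)EF consensus can be located in a protein sequence, the sketch below scans for the motif with a regular expression. The function name and the toy sequences are hypothetical and are not taken from PleD or any real protein. A match only shows that the motif string is present; it says nothing about catalytic competence, which also depends on the other residues listed above (for example, D327, N335, and D344 in PleD numbering).

```python
import re

# GG(D/E)EF: Gly, Gly, Asp or Glu, Glu, Phe
GGDEF_MOTIF = re.compile(r"GG[DE]EF")


def find_ggdef_motifs(sequence):
    """Return (1-based start position, matched text) for each GG(D/E)EF occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in GGDEF_MOTIF.finditer(seq)]


# Hypothetical toy sequences, for illustration only
canonical_e = "MKTRSLDGGEEFAVLLPNA"   # contains GGEEF
canonical_d = "AAGGDEFAA"             # contains GGDEF
diverged = "AAGGDQFAA"                # motif altered, no match expected

print(find_ggdef_motifs(canonical_e))  # [(8, 'GGEEF')]
print(find_ggdef_motifs(canonical_d))  # [(3, 'GGDEF')]
print(find_ggdef_motifs(diverged))     # []
```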
A subclass of GGDEF domains, called Hybrid promiscuous (Hypr) GGDEF enzymes, synthesizes predominantly cGAMP, but also c-di-AMP and c-di-GMP [41]. The change in substrate specificity seems to be related to the substitution of an aspartate (D344 of PleD, located at α2) by a serine; this is exactly the residue that binds the guanine base of GTP in PleD.
[Figure caption fragment] The analysis was done using a non-redundant dataset (<80% identity) of protein sequences built from sequences retrieved from the NCBI protein database [88]. The names of the domains are based on the Pfam database [89].
[Figure 3 caption] (A) Residue frequency in GGDEF domains. Using the Dali server [79], 23 sequences of GGDEF domain structures were used to create a multiple sequence alignment, and the sequence logo was created with the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to PleD of Caulobacter vibrioides (PDBID: 2V0N). Residues colored in red are involved in ligand or magnesium binding (for underlined residues, only the main chain is involved) and those colored in green are located in the I-sites. The GGDEF motif is placed in a red box. On the right, the structure of the GGDEF domain of PleD is shown as a cartoon. The topology of GGDEF is shown below the structure, and the CATH topology name and code are also shown [91]; (B) interaction network between the binding pocket of the PleD GGDEF domain and the substrate, GTP. At the bottom, the PleD structure in the inactive conformation is shown, with the two inhibitory sites (I-site and I'-site). On the right, the residues involved in the (c-di-GMP)2 interactions at the inhibitory sites are shown in more detail. Gray dotted lines represent hydrogen bonds. The magnesium ions are colored in green. GTP and the protein residues involved in its binding are shown as sticks. Carbons are colored white, oxygens are red, nitrogen atoms are blue, and phosphorus atoms are orange.
Some GGDEF domains have diverged from the canonical GG(D/E)EF amino acid sequence motif and are described as degenerate GGDEF domains, due to the loss of their catalytic activity. These degenerate GGDEF domains can evolve to possess different biological functions, and two examples have been described in the literature: (1) a degenerate GGDEF domain that is a sensor domain and binds GTP to activate the phosphodiesterase activity in the neighboring EAL domain of the Caulobacter crescentus CC3396 protein [38]; and (2) a degenerate GGDEF domain of the Bacillus subtilis YybT protein that has unexpected ATPase activity [92].
Allosteric inhibition in proteins containing GGDEF domains. The DGC activity of proteins containing GGDEF domains is subject to allosteric, noncompetitive product inhibition. A GGDEF dimer contains two symmetrical allosteric sites (I and I' sites), each of which binds a c-di-GMP dimer, (c-di-GMP)2 (Figure 3B). Both sites are formed by four residues: three from one GGDEF molecule, namely the RxxD motif (R359 and D362 of PleD) and an arginine (R390 of PleD), and a fourth, an arginine from the adjacent GGDEF molecule (R313 of PleD). The two (c-di-GMP)2 dimers are expected to crosslink allosteric sites on opposite GGDEF domains, resulting in their immobilization in an inactive orientation [70,80,87,93] (Figure 3B). The RxxD motif and the positively charged residue (R390 in the case of PleD) are conserved in GGDEF members (Figure 3A).

SMODS Domain Structure and Catalysis
SMODS structure and structural similarities with other protein domains. The Vibrio cholerae dinucleotide cyclase (DncV, the gene product of VC0179) has two domains, a SMODS domain located at its N-terminus and an Adenylyl/Guanylyl and SMODS C-terminal sensor domain (AGS-C) [4] at its C-terminus [94]. The first 23 residues of the protein are located in the AGS-C domain, which has a mainly α-helical structure. The SMODS domain has two β-sheets connected by one β-strand (β3). It also has six α-helices that are not part of the interface between the two domains. The two β-sheets are composed of the strands β2-β3-β7-β8-β9 and β3-β6-β5-β (Figure 4A). The substrate binding site is located at the interface between the two domains, where the SMODS β-sheets make close contacts with the substrate (Figure 5A). Proteins containing SMODS domains are also found associated with other domains (Figure 2B) and, in rare cases, can be found in proteins containing two enzymatic domains, a SMODS and a class III AC/GC catalytic domain, both related to the synthesis of cyclic nucleotide second messengers.
DncV has structural similarities with proteins belonging to the nucleotidyltransferase (NTase) fold, a highly diverse superfamily of proteins (Figure 4C,E) [95]. The NTase fold is characterized by the presence of a minimal conserved core of a mixed β-sheet flanked by α-helices (α1-β1-α2-β2-α3-β3-α4) that corresponds to α3-β2-α8-β3-α9-β6 in the DncV protein, which is missing the α4 element (Figure 4A,B). This common core is usually decorated by various additional structural elements depending on the family. The NTase fold core is present in the DncV SMODS domain and various insertions are observed (Figure 4A [...]
[Figure 4 caption fragment] The SMODS domain is involved in cGAMP synthesis and belongs to the nucleotidyltransferase superfamily (NTS). The NTS fold is characterized by the presence of a minimal conserved core of a mixed β-sheet flanked by α-helices with α1-β1-α2-β2-α3-β3-α4 topology that corresponds to α3-β2-α8-β3-α9-β6 (colored in red), missing the α4 element. Various insertions are observed and are colored in grey (right panel). Members of NTS contain three conserved motifs located at the active site: [...]
[...] [102], this domain is found in uncharacterized proteins such as EF_0920 from Enterococcus faecalis; and (viii) the palm subdomain (DNA_pol_B_palm, PF14792) [103] of the DNA polymerase µ (Pol µ) from family X [104], which includes DNA polymerases β, λ, and µ [105] (Figure 4E).
All of these proteins share not only the NTase fold core, but also some secondary structures from the AGS-C domain (Figure 4D), suggesting that the domain interface is conserved in these families. It is worth mentioning that the AGS-C domain shares structural similarities with domains that are commonly associated with the catalytic domain of NTase members, such as the DZF C-terminal domain, OAS1_C, PAP_assoc, and Nrap_D2 (Figure 4E). DZF domains form dimers and heterodimers and are found in proteins involved in gene expression and RNA metabolism, such as NF90, which forms a complex with NF45 and regulates gene expression [106]. Poly(A) polymerase (PAP) is involved in eukaryotic mRNA processing through polyadenylation at the end of the transcription process; PAP uses ATP to add adenosine residues to the 3' end of the mRNA [107]. In metazoans, the cGAS enzyme, which has a Mab_21 domain, binds cytoplasmic double-stranded DNA (dsDNA) to activate synthesis of 2'-3' cGAMP molecules and initiate host innate immune responses. Endogenous or exogenous dsDNA in the cytoplasm, which could come from damaged mitochondria or from invading pathogenic bacteria or viruses, indicates major danger to eukaryotic cells. The cytosolic accumulation of 2'-3' cGAMP activates type I interferon-mediated stress responses via STING and regulates autoimmunity in human cells [108]. Human dsRNA-activated oligoadenylate synthase (OAS), which matches both the NTP_transf_2 and OAS1_C Pfam models, is a mammalian dsRNA sensor whose levels increase during pathogen infections and which activates the synthesis of second messenger 2'-5'-linked RNA molecules to cause RNA decay [109,110].
TRF4 protein, which also contains a region most similar to the NTP_transf_2 model, is part of the TRAMP polyadenylation complex, which recognizes aberrant eukaryotic RNAs and targets them for degradation [111]. Members of DNA polymerase family X, which have a "palm" subdomain, play essential roles in the base-excision repair mechanism, a process that repairs DNA base damage, being responsible for DNA synthesis and 5'-deoxyribose-phosphate (dRP) removal (dRP lyase activity). These enzymes can also be involved in other DNA repair processes such as non-homologous end-joining and lesion bypass [105,112,113]. Utp22, which has a Nrap_D2 domain, forms a complex with Rrp7, and both are present in early precursors of the small ribosomal subunit. Utp22 is a structural building block and apparently lacks any enzymatic activity [97].
Among all the enzymes described, cGAS enzymes are the only ones that share functional similarities with the prokaryotic DncV enzymes. While DncV synthesizes 3'-5' cGAMP molecules, cGAS enzymes synthesize 2'-3' cGAMP. Most of the other enzymes with structural similarity to the DncV SMODS domain have functions related to DNA or RNA processing.
Active site of SMODS domains. DncV synthesizes preferentially cGAMP, but also produces c-di-AMP and c-di-GMP molecules. DncV regulates the expression of more than 80 genes in Vibrio cholerae and its cyclase activity is inhibited by folate-like molecules in vitro [5,94,114]. Orthologs of DncV are found in Gram-negative and Gram-positive bacterial species, and the residues involved in ligand binding and in folate-like molecule binding are conserved among them [94] (Figure 5A). The active site is located at the interface between the SMODS and AGS-C domains, while the folate-like molecule binds on the side opposite the substrate-binding pocket, at the flat face of the protein (Figure 5A). The folate-like molecule binds mainly at the SMODS domain and makes few interactions with the linker between the two domains (Figure 5). The inhibitory site of DncV, which binds folate-like molecules such as 5-methyltetrahydrofolate diglutamate (5MTHFGLU2), is formed by the side chains of Arg36, Arg40, Arg44, Arg108, Trp110, Gln116, Tyr137, Phe204, and Asp260, by the main chains of Phe109, Thr111, and Leu240, and by a hydrophobic pocket formed mostly by the side chains of Leu240 and Val245 (Figure 5B). These residues are not conserved in members of SMODS (Figure 5A).
The active site of DncV is built by nine residues: Ser114, Tyr117, Asp131, Asp133, Arg182, Ser259, Lys287, Ser301, and Asp348. Asp348 and Ser259 interact with the guanine and adenine bases of the substrate, respectively. Tyr117 and Asp133 are involved in the interaction with the ribose of the guanine and the adenine nucleotides, respectively. Arg182 binds the β and γ phosphate groups of the ATP. Tyr117, Ser114, Lys287, and Ser301 interact with the β and γ phosphate groups of GTP. The magnesium ion is coordinated by Asp131, Asp133, and the α, β, and γ phosphate groups of the GTP ligand (Figure 5B). These two aspartic residues belong to the Dh(D/E) motif conserved in members of NTS, as described above, and are key residues in (2'-5') oligoadenylate synthetase (OAS1) and poly(A) polymerase activities [115].
[Figure 5 caption] (A) Forty-five sequences from the Pfam database were used to create a multiple sequence alignment of the SMODS domain of different proteins using ClustalW [116], and the sequence logo was created using the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to the Vibrio cholerae dinucleotide cyclase DncV (PDBID: 4U03). Residues colored in green are involved in the interaction between the DncV SMODS domain and the folate-like inhibitor, 5-methyltetrahydrofolate diglutamate (5MTHFGLU2) (residues Arg36, Arg40, Arg44, and Asp260, located at the AGS-C domain, are not shown). Residues located at the SMODS domain involved in the catalytic activity are colored in red (residues Ser259, Lys287, Ser301, and Asp348, located at the AGS-C domain, are not shown). The red boxes contain the G(G/S) and Dx(D/E) motifs found in members of NTS. The structure shown on the right belongs to the DncV protein (PDBID: 4U03), in which the AGS-C domain is colored in salmon and the SMODS domain is colored by secondary structure (β-strands in yellow and α-helices in red). The SMODS domain topology is shown below the structure, and the CATH topology name and code are also shown [91]; (B) interaction network between the DncV binding pocket (active site) and its inhibitory site with substrate and inhibitor molecules, respectively. The substrates GTP and ATP are found bound at the active site, while the folate-like molecule (5MTHFGLU2) binds at the allosteric site, inactivating the protein. The residues interacting with substrate and inhibitor molecules are shown as sticks and colored by element: the inhibitor carbons are colored brown, the magnesium ion is shown as a green sphere, and gray dotted lines represent hydrogen bonds.

DAC Domain Structure and Catalysis
DAC structure and structural similarities with other protein domains. c-di-AMP is synthesized by DAC enzymes that convert two molecules of ATP into one c-di-AMP and two pyrophosphate molecules. In the case of DisA, the Rv3586 protein from Mycobacterium tuberculosis, c-di-AMP can be synthesized from ATP or ADP [117]. Production of c-di-AMP has been described as essential for the growth of some Gram-positive bacteria because it is involved in crucial cellular activities, such as cell wall metabolism, maintenance of DNA integrity, ion transport, cell division, and cell size control [22,118-120]. Bacillus subtilis encodes three DAC enzymes, DisA, CdaA, and CdaS. Two of them, DisA and CdaA, are constitutively expressed during vegetative growth, while CdaS is required for efficient germination of spores. Other Gram-positive bacteria encode only one DAC protein that is essential for their growth, as observed in Listeria monocytogenes, Streptococcus pneumoniae, and Staphylococcus aureus, making this enzyme a likely target for the design of new inhibitors that may serve as antibiotics against pathogenic Gram-positive bacteria.
At the moment, three structures of proteins containing DAC domains have been solved: DisA from Thermotoga maritima (PDBID 3C1Y) [3]; DisA (named DacB in the UniProt database) from Bacillus cereus (PDBID 2FB5) [121]; and DacA (CdaA-APO Y187A Mutant) from L. monocytogenes (PDBID 6HVN) [122]. As described before, no structural similarities with other domains have been detected so far. The overall DAC domain structure exhibits a globular α/β fold with a slightly twisted central β-sheet, made up of seven mixed parallel and antiparallel β-strands (β1-β7) surrounded by five α-helices (α1-α5), in which the N-terminal helix (α1) can be split in two parts (α1' and α1). Like GGDEF domains, two DAC domains must be correctly oriented to allow the conversion of two ATP molecules into one c-di-AMP molecule and two pyrophosphates. Therefore, DAC domains are also Bi Ter enzymes (two substrates, three products). The catalytic activity of DAC domains may be regulated by input domains (Figure 2C). In the case of DisA from T. maritima (PDB codes 3C23, 3C1Z, and 3C1Y) [3], the protein forms a homo-octamer in which pairs of DAC domains are oriented face to face to allow catalysis. Therefore, in each DisA homo-octamer, there are four potential catalytic sites. Linear DNA or DNA ends do not affect the protein activity, but branched nucleic acids (such as Holliday junctions) strongly suppress the DAC activity of DisA by binding to its C-terminal domain [3].
Active site of DAC domains-In the case of the DAC domain of CdaA from Listeria monocytogenes (PDBID 4RV7), the ATP ligand is located in a well-defined cavity formed by the N-terminus of α4, loop β5-β6, loop β4-α4, and loop α3-β3 (Figure 6C), in which many conserved residues of DAC domains are located: the GALI motif, the GxRHRxA motif, an absolutely conserved serine, the DGAhh motif (h is a hydrophobic residue), and the (V/I)SEE motif (Figure 6A).
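As an illustration of how the motif shorthand used above can be applied to a sequence, the following minimal sketch translates the four DAC motifs into regular expressions and reports where they occur. The hydrophobic residue set, the function names, and the toy sequence are assumptions made for this example and are not taken from the original analysis.

```python
import re

# Conserved DAC motifs as written in the text: "x" is any residue,
# "h" is a hydrophobic residue, "(V/I)" means either V or I.
DAC_MOTIFS = ["GALI", "DGAhh", "GxRHRxA", "(V/I)SEE"]

# Assumption: which residues count as "hydrophobic" is our own choice here.
HYDROPHOBIC = "AVILMFWC"


def motif_to_regex(motif: str) -> str:
    """Translate the motif shorthand into a regular expression string."""
    # "(V/I)" becomes the character class "[VI]".
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    # "x" becomes a wildcard, "h" becomes the hydrophobic character class.
    return pattern.replace("x", ".").replace("h", f"[{HYDROPHOBIC}]")


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of every motif occurrence."""
    seq = sequence.upper()
    hits = {}
    for motif in DAC_MOTIFS:
        # A lookahead lets overlapping occurrences be reported as well.
        regex = re.compile(f"(?={motif_to_regex(motif)})")
        hits[motif] = [m.start() + 1 for m in regex.finditer(seq)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real DAC domain.
    toy = "MKGALIAADGAVLTTGTRHRAAVSEEK"
    for motif, positions in find_motifs(toy).items():
        print(motif, positions)
```

A plain pattern scan like this only flags candidate motifs; it says nothing about whether the residues sit in the structural context described for the CdaA active site.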
The active site of the CdaA DAC domain is thought to be built by 10 residues: the side chains of Leu31, Asp71, Thr102, Arg103, His104, Ser122, and Glu124 and the main chains of Leu88, Gly101, and Glu123 (Figure 6C). Leu31 belongs to the GALI motif, Asp71 belongs to the DGAhh motif, and Thr102, Arg103, and His104 belong to the GxRHRxA motif. The main chain of Leu88 and the side chain of Leu31 interact with the adenine base, Asp71 binds the ribose, and Thr102, Arg103, His104, Ser122, Gly101, and Glu123 bind the phosphate groups of the ATP molecule. Glu124 coordinates one magnesium cation that binds the α and β phosphate groups of the ATP molecule (Figure 6C) [49]. CdaA from L. monocytogenes is active in the presence of Mn2+ or Co2+ but inactive in the presence of Mg2+ ions [122]. However, DisA from M. tuberculosis [117] and DisA from T. maritima are active in the presence of Mg2+ ions.
Figure 6. (A) Residue frequency present in DAC proteins. 609 sequences from the Pfam database were used to create a multiple sequence alignment of the DAC domain of different proteins using ClustalW [116], and the sequence logo was done using the WebLogo server [90]. The sequence shown below the logo and the secondary structure elements belong to the DisA protein from T. maritima (PDBID: 3C1Y). Residues that bind ATP or the magnesium cation are colored in red; underlined residues bind mainly through the main chain. Conserved motifs within DAC members are placed in red boxes: GALI, DGAhh, GxRHRxA, and (V/I)SEE motifs. (B) Structure of the DAC domain of CdaA from Listeria monocytogenes (PDBID: 4RV7). The substrate ATP is found bound at the active site. The DAC domain topology is shown below the structure, and the CATH topology name and code are also shown [91]. (C) Interaction network between the CdaA binding pocket and the substrate, ATP (PDBID: 4RV7). Gray dotted lines represent hydrogen bonds, the magnesium ion is colored in green, and the ATP and the protein residues involved in its binding are shown as sticks. Carbons are colored white, oxygens are red, nitrogen atoms are blue, and phosphorus atoms are orange.

Cyclic Dinucleotide Receptors
Different classes of CDN receptors have been described and are involved in the regulation of a broad range of bacterial behaviors, while, in eukaryotic cells, they are associated with the activation of the innate immune response through interactions with STING proteins. In order to analyze the residues involved in CDN ligand binding and to compare the ligand structure inside the protein pocket, we focused on the CDN receptors with three-dimensional structures solved and deposited in the Protein Data Bank (PDB) (Table 1).
C-di-GMP receptors are the most studied CDN receptors, probably because c-di-GMP was the first CDN identified as a bacterial second messenger. Therefore  (Table 1). For cGAMP, the receptors analyzed in this review are only STING proteins and c-di-GMP I riboswitches.
The function of each CDN receptor, as well as the residues involved in ligand binding, is described in more detail in Table 1. It is notable that most CDN receptors are specific to their ligands, with the exception of receptors involved in mammalian innate immunity, such as STING, which interacts with different CDNs such as c-di-GMP, c-di-AMP, and 3'-5' and 2'-3' cGAMP molecules (Figure 7B). Interestingly, even though the CDNs are chemically different, the STING binding pockets for each kind of CDN are very similar and the residues involved in ligand binding for each CDN are almost the same (Figure 7A,C). This suggests that STING adjusts the ligand binding site for each CDN by placing or removing water molecules or magnesium ions.
STING proteins are localized on the endoplasmic reticulum membrane of eukaryotic cells and are CDN sensors that, when bound, regulate the induction of type I interferons (IFN-α and IFN-β), thus eliciting the intracellular signals of the invasion by bacteria and/or viruses, and activating the innate immune response to attack the pathogen. STING proteins can directly sense the pathogen invasion by interaction with bacterial CDNs (3'-5' c-di-GMP, 3'-5' c-di-AMP or 3'-5' cGAMP) or indirectly by binding to eukaryotic 2'-3' cGAMP through its C-terminal domain (TMEM173, PF15009). It is controversial whether STING binds 2'-5' cGAMP preferentially in relation to other CDNs, or binds all of them with the same affinity [114].
Figure 7 (caption fragment). In the case of RocR, which has a catalytic EAL domain, the residues highlighted in salmon were experimentally demonstrated to be important for catalysis [123]. The consensus "EAL" motif is placed in a red box; (I) interaction network of the binding site of a FimX degenerate EAL domain (PDBID: 4FOK). Gray dotted lines represent hydrogen bonds. The residues and the c-di-GMP molecule are colored by element. The multiple sequence alignments were performed using the CLUSTAL W server [116].
Other important c-di-GMP receptors are proteins containing PilZ domains (PF07238). PilZ domains regulate twitching and swarming motility via the flagellar regulator YcgR protein [7], but proteins containing different domain architectures are associated with other functions, such as the regulation of the synthesis of cellulose by BcsA in Rhodobacter sphaeroides, or chemotaxis by the MapZ protein and alginate secretion by Alg44 to promote biofilm formation in Pseudomonas aeruginosa (Figure 7E and Table 1). The PilZ domain is found associated with different domains that could be sensor domains, such as GAF, Cache, and PAS domains, and with catalytic domains such as GGDEF, EAL, and Peptidase_S8. Therefore, proteins containing a PilZ domain could be classified based on their domain architecture and function into different paralogous families [124,125]. The Pfam database describes 221 different domain architectures containing PilZ domains [89], showing the diversity of signaling networks in which c-di-GMP can be involved and that have not yet been explored.
It is interesting that two proteins containing PilZ domains that are c-di-GMP receptors have been found to be involved in the production of different exopolysaccharides used to build bacterial biofilms: cellulose and alginate. Moreover, another c-di-GMP receptor is also involved in exopolysaccharide production: PelD of Pseudomonas aeruginosa, which regulates the synthesis of the Pel exopolysaccharide (Table 1).
PilZ proteins interact with c-di-GMP through two conserved sequence motifs: the RxxxR and DxSxxG motifs (Figure 7D). In the RxxxR motif, located in a loop at the N-terminal part of the PilZ domain, each arginine interacts with the phosphate group and the base of the ligand. In the case of the DxSxxG motif, the aspartic acid, serine, and glycine residues bind the base and the pentose ring of the c-di-GMP molecule (Figure 7G). Other residues not conserved within members of the PilZ family are also involved in ligand binding, and some of them are located at β-strand 7 of the PilZ protein (Figure 7D,F,G). Some PilZ proteins have lost their canonical c-di-GMP-binding residues and are no longer c-di-GMP receptors but may work as protein-protein adaptors, as happens with the FimX-PilZ-PilB complex that regulates twitching motility in Xanthomonas citri [8]. This ternary complex is an example of a full set of "degenerate" GGDEF, EAL, and PilZ domains, in which GGDEF does not synthesize c-di-GMP, PilZ does not bind c-di-GMP, and the EAL domain does not cleave c-di-GMP but retains the ability to bind it [8].
Degenerate EAL proteins have lost their ability to cleave c-di-GMP to pGpG, but some of them still bind c-di-GMP molecules, changing their function from that of an enzyme to that of a CDN receptor. The residues involved in c-di-GMP interaction are described in Table 1 and Figure 7H,I. The loss of the EAL domain catalytic function seems to be related to a change in the residues important for the coordination of a magnesium cation (Figure 7H,I).
In Xanthomonas citri, Xanthomonas campestris, and Pseudomonas aeruginosa, FimX proteins sense c-di-GMP levels through their degenerate EAL domain and thereby regulate the type IV pilus machinery and twitching motility [8]. LapD from Pseudomonas fluorescens is a transmembrane protein that binds c-di-GMP through its C-terminal degenerate EAL domain to prevent cleavage of the surface adhesin LapA and therefore activates biofilm formation [126].
Different classes of RNA riboswitches sense different kinds of CDNs (Table 1). Riboswitches are structured RNAs located in the 5'-untranslated regions of mRNAs, and some can sense CDN molecules and change their structure to regulate the expression of downstream genes that could be involved in virulence, motility, biofilm formation, cell wall metabolism, synthesis and transport of osmoprotectants, sporulation, and other important biological processes [127,128].
There are three distinct classes of riboswitches that bind specific CDNs and have had their structures solved in complex with their ligand and deposited in the Protein Data Bank: c-di-GMP I riboswitch (RF01051), c-di-GMP II riboswitch (RF01786), and c-di-AMP riboswitch (ydaO-yuaA riboswitch, RF00379). C-di-GMP I riboswitch and c-di-GMP II riboswitch bind c-di-GMP molecules while c-di-AMP riboswitch binds c-di-AMP molecules [28,129,130]. The c-di-GMP I riboswitch was originally annotated as a conserved RNA-like structure of Genes Related to the Environment, Membranes and Motility (GEMM motif) and later another c-di-GMP riboswitch class was identified, the c-di-GMP II riboswitch. They have the same function but do not share any sequence motif or structural similarities. The c-di-AMP riboswitch is one of the most common riboswitches in various bacterial species and is found in the vicinity of genes related to cell wall metabolism, sporulation in Gram-positive bacteria, and other important biological processes [127,128]. These structures reveal that the RNAs use different ways to bind CDNs.
The TetR-like transcriptional factor DarR from Mycobacterium smegmatis was the first c-di-AMP receptor discovered [27]; c-di-AMP stimulates the DNA binding activity of this protein. DarR is a repressor that negatively regulates the expression of its target genes [27]. Another system that interacts with c-di-AMP by a poorly understood mechanism is KdpD/KdpE, which controls potassium uptake in situations where the potassium concentration is extremely low and other uptake systems would not be enough to supply all the potassium the cell requires. In Escherichia coli, there are three systems responsible for potassium uptake, namely, Trk, Kdp, and Kup. In the case of the Trk system, four genes are constitutively expressed and TrkA is the predominant potassium transporter at neutral pH. The Kdp-ATPase system is induced at low potassium concentrations and under conditions of osmotic stress. Kup, formerly TrkD, is activated when TrkA and Kdp activities are not sufficient [131-133]. In Bacillus subtilis, a novel high-affinity transporter, KimA (formerly YdaO), has recently been characterized, and the expression of KimA and KtrAB is negatively regulated by c-di-AMP riboswitches [28]. When the concentration of potassium in the cell is high, the concentration of c-di-AMP increases and inhibits potassium uptake in two ways: by binding to c-di-AMP riboswitches, which prevents the expression of proteins involved in transport, and by direct interaction with the regulatory subunits of KtrAB and KtrCD, which inhibits potassium transport [134]. A similar process seems to happen in Staphylococcus aureus, where c-di-AMP binds to the KtrA protein and to the universal stress protein (USP) domain of the KdpD sensor kinase, inhibiting the expression of Kdp potassium transporter components. In this manner, c-di-AMP appears to be a negative regulator of potassium uptake in different Gram-positive bacteria [60,134].
One of the best understood receptors for c-di-AMP is KtrA, which binds c-di-AMP through its C-terminal domain (RCK_C or TrkA_C), causing inactivation of the KtrA function (Table 1). c-di-AMP binds to the interface of the KtrA homodimer, and the residues involved in the ligand interaction are described in Table 1. Another c-di-AMP receptor is the c-di-AMP receptor domain (PF06153) of the PII-like signal transduction protein, PstA. PstA is a homotrimer and, in each protein interface, one c-di-AMP molecule is bound. The residues involved in ligand binding in PstA are also described in Table 1.
c-di-AMP is also associated with negative control of the aspartate and pyruvate pools through the Lactococcus lactis pyruvate carboxylase, LlPC, and the Listeria monocytogenes pyruvate carboxylase, LmPC, respectively. In both cases, c-di-AMP binds to the pyruvate carboxylase domain (HMGL-like domain in Pfam) (Table 1). LlPC forms a tetramer, and each c-di-AMP molecule binds the protein dimer interface at the carboxyltransferase (CT) domain in a binding pocket containing residues that are poorly conserved among pyruvate carboxylases [135].
The huge repertoire of CDN receptors demonstrates the complexity of CDN signaling networks in bacteria. Additionally, CDNs may regulate different bacterial behaviors at different speeds through regulation of gene transcription by transcriptional factors, protein translation by riboswitches, and directly by regulating the function of different classes of protein.
Table 1. List of the bacterial c-di-GMP, c-di-AMP, cGAMP, and eukaryotic cGAMP receptors that had their structure solved in complex with their ligand and deposited in the Protein Data Bank (PDB). The Pfam/Rfam and, in some cases, the InterPro domain is described. The residues involved in ligand binding are also described for a representative of each receptor.

-5 c-di-
STING proteins interact with c-di-GMP at the protein dimer interface in a perfectly symmetrical manner, increasing the homodimer stability. This binding involves a hydrophilic core that in human STING (PDB 4F5D) corresponds to S162, G166, Y167, R238, Y240, S241, N242, E260, and T267, together with two Mg2+ ions and two water molecules (Figure 7A-C). STING proteins bind monomers of c-di-GMP that are stabilized in the protein pocket in intermediate or closed conformations (Figure 8).
This PilZ domain interacts with monomeric c-di-GMP via two main sequence motifs: RxxxR and DxSxxG motifs (PDBID: 2RDE), Figure 7D,E. The ligand was found as intermediate monomers, Figure 8. [124] R. sphaeroides (5EIY, 5EJ1, 5EJZ, 4P00, 4P02) BcsA, bacterial cellulose synthase A, is a component of a protein complex that synthesizes and translocates cellulose across the inner membrane. The binding of c-di-GMP to the BcsA-BcsB complex releases the enzyme from an autoinhibited state, generating a constitutively active cellulose synthase.
The ligand was found as closed dimers, Figure 8. One PilZ was found to interact with a trimeric c-di-GMP (PDBID: 4XRN), Figure 8B. [149,150] E. coli (5Y6F, 5Y6G) YcgR-like proteins such as the motility inhibitor (MotI) protein are diguanylate receptors that bind c-di-GMP, acting as a molecular clutch on the flagellar stator MotA to inhibit swarming motility. The PilZ domain-containing protein MrkH, also a YcgR-like protein, is a transcriptional regulator that binds c-di-GMP as well as DNA sequences to regulate type 3 fimbriae expression and biofilm formation. YcgR proteins regulate motility and biofilm formation by sensing c-di-GMP. [151] B. subtilis (5VX6) [152] K. pneumoniae.

P. aeruginosa (4XRN) Unknown function
The ligand is in an unusual trimeric oligomerization state, in which the six guanine bases are oriented almost parallel to each other, Figure 8B. [158]  Proteins with GGDEF domain act as receptor proteins when c-di-GMP binds their allosteric site via the RxxD motif.
VpsT is described as a master regulator for biofilm formation and consists of an N-terminal REC domain and a C-terminal HTH domain.
A c-di-GMP dimer binds at the VpsT interface between two REC domains; the REC dimerization is required for ligand binding. Proteins with the REC domain of VpsT (PDB 3KLO) interact with two molecules of c-di-GMP through a K and a W[F/L/M][T/S]R motif that correspond to K120, W131, L132, T133, and R134. The ligand was found as closed dimers, Figure 8. [51]

C. vibrioides (6QRL)
ShkA has a pseudoreceiver domain (Rec1) that binds c-di-GMP to allow the autophosphorylation and subsequent phosphotransfer and dephosphorylation of the protein. The c-di-GMP binds to the protein to release the C-terminal domain to step through the catalytic cycle.
C-di-GMP binds to the Rec1-Rec2 linker that contains the DDR motif. The residues involved in the ligand binding are: R324, Y338, I340, P342, R344, S347, Q351. D369, D370, and R371 from the DDR motif, located in a loop, are inside the c-di-GMP binding site in the apo form of the protein, suggesting that c-di-GMP competes with this protein loop. [175] MshE is an ATPase associated with the bacterial type II secretion system, homologous to the type IV pilus machinery. There are two different c-di-GMP binding sites located at the N-terminus of the protein, mainly at the DNA binding domain of each BrlR protomer of the protein tetramer. Binding site 1 is composed of M1, R31, D35, Y40, and Y270. Binding site 2 is composed of P61, A64, R67, R70, F83, and R86. The ligand was found as closed monomers, Figure 8. FleQ binds c-di-GMP at the N-terminal part of the AAA+ ATPase through the L142F143R144S145 motif (R-switch), the E330xxxR334 motif, and residues R185 and N186 of the post-Walker A motif KExxxRN. The ligand was found as closed dimers, Figure 8.

-5 cGAMP or -cGAMP
STING regulates the induction of type I interferons via recruitment of protein kinase TBK1 and transcription factor IRF3, activating IFN-β gene transcription. cGAS-STING responds to cytosolic DNA via binding to 3'-5'cGAMP.
STING proteins interact with cGAMP at the dimer interface.
In the anemone STING (PDBID 5CFM), the residues involved with the ligand interaction are: Y206, R272, F276, R278, and T303 of each protomer of the dimer. Y280 binds the ligand by a water molecule. The ligand was found as intermediate monomer, Figure 8. [144] c-di-GMP I Riboswitch (RF01051) Geobacter (4YAZ) Acts as a transcriptional factor, switching between RNA secondary structures when bound to cGAMP, regulating its own expression. A human c-di-GMP I Riboswitch mutant (G20A) can also bind cGAMP.

STING (TMEM173, PF15009)
Sus scrofa (6A06) STING regulates the induction of type I interferons via recruitment of protein kinase TBK1 and transcription factor IRF3, activating IFN-β gene transcription. The STING pathway plays an important role in the detection of viral and bacterial pathogens in animals.
STING proteins interact with c-di-AMP in a different manner than c-di-GMP, but still at the same dimer interface. In the porcine STING (PDBID 6A03), the amino acids involved in the interaction are: S162, Y167, I235, R232, R238, Y240, and T263. The ligand was found as closed monomers, Figure 8. [143] N. vectensis (5CFN) [144] H. sapiens (6CFF and 6CY7) [182] Mus musculus (4YP1) [183] Aldo-keto reductase (PF00248) Mus musculus (5UXF) RECON (reductase controlling NF-κB) is an aldo-keto reductase and a STING antagonist. It negatively regulates the NF-κB activation that induces the expression of IFN-induced genes. RECON recognizes c-di-AMP at the same site that binds the co-substrate nicotinamide. One AMP molecule (AMP1) of c-di-AMP has essentially the same position as the AMP portion of the NAD+ co-substrate, while the other AMP (AMP2) presents a shifted position.
[62] Bacterial c-di-AMP is involved in cell wall stress and signaling DNA damage through interactions with several protein receptors and a widespread ydaO-type riboswitch, one of the most common riboswitches in various bacterial species. This riboswitch is found in the vicinity of genes involved in cell wall metabolism, synthesis and transport of osmoprotectants, sporulation and other important biological processes [127,128]. A c-di-GMP I Riboswitch mutant (G20A/C92U, PDB 3MUV) can also bind c-di-AMP.
LlPC forms a tetramer and each c-di-AMP molecule binds at a protein dimer interface at the carboxyltransferase (CT) domain (HMGL-like domain in Pfam) (PDBID 5VYZ) in a binding site that is not well conserved among pyruvate carboxylases. The residues involved in the interaction are: Q712, Y715, I742, S745, G746, and Q749 from both monomers. The ligand was found as intermediate monomers, Figure 8. c-di-AMP binds at the RCK_C domain of KtrA in the interface of a dimer (PDBID 4XTT). The residues involved in the interaction are I163, I164, D167, I168, R169, A170, N175, I176, and P191 from both monomers. R169 and the isoleucine residues (hydrophobic pocket) are well conserved in other species. The ligand was found as closed monomers, Figure 8. [59,183,191] CBS domain (PF00571) L. monocytogenes (5KS7) Intracellular pathogen L. monocytogenes synthesizes and secretes c-di-AMP during growth in culture and also in host cells.  Figure 8. [24]

Conformation of Cyclic Dinucleotides inside the Binding Site of Receptors
The cyclisation between two nucleotides of the most common CDNs involves the formation of a phosphodiester bond that links the C3' of one pentose ring with the C5' of another, resulting in a 3'-5' cyclic dinucleotide. This kind of cyclisation creates a two-fold symmetry between the two pentose rings of the dinucleotide. Only cGAMP has been reported to occur not only with 3'-5' linkages but also in a 2'-3' form that contains two distinct phosphodiester linkages, one between C3' of AMP and the C5'-phosphate of GMP, and the other between the C5'-phosphate of AMP and C2' of GMP (Figure 8A). The dinucleotides can assume different conformations in the binding site of different receptors, which can be described in relation to the base and the ribose conformations. The ribose ring can assume three different configurations: C3'-endo, C2'-endo, or C2'-exo. When taking into account receptor structures in complex with cyclic dinucleotides (Table 1), more than 80% of the ligands have the two pentose rings in C3'-endo, almost 15% have one of the pentoses in C3'-endo and the other in C2'-endo, and only one structure has the two pentoses in the C2'-exo configuration (Figure 8A). Furthermore, the base can assume a syn or anti conformation in relation to the pentose around the N-glycosidic bond, and only one of the structures, the FimX EAL domain from Xanthomonas citri (PDBID: 4FOK) [168], has one of the bases in the syn conformation, which is the less stable state of the molecule. The C3'-endo/C3'-endo conformation is the most representative for the c-di-GMP and c-di-AMP molecules, while cGAMP is preferentially in the C3'-endo/C2'-endo conformation (Figure 8A).
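For readers who want to assign the ribose puckers mentioned above from atomic coordinates, one common route is the pseudorotation phase angle P computed from the five endocyclic torsions. The sketch below assumes the Altona-Sundaralingam convention (ν0 = C4'-O4'-C1'-C2' through ν4 = C3'-C4'-O4'-C1') and the conventional 36° sectors of the pseudorotation wheel; neither is described in this review, and the function names and the synthetic torsions are illustrative only.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Phase angle P (degrees, 0-360) from the five ring torsions in degrees.

    Assumed convention: tan P = ((nu4 + nu1) - (nu3 + nu0))
                                / (2 * nu2 * (sin 36 + sin 72)).
    """
    num = (nu4 + nu1) - (nu3 + nu0)
    den = 2.0 * nu2 * (math.sin(math.radians(36)) + math.sin(math.radians(72)))
    # atan2 keeps the correct quadrant when nu2 is negative.
    return math.degrees(math.atan2(num, den)) % 360.0


def pucker_name(phase_deg):
    """Map P onto the three puckers named in the text; anything else is 'other'."""
    p = phase_deg % 360.0
    if 0.0 <= p < 36.0:
        return "C3'-endo"
    if 144.0 <= p < 180.0:
        return "C2'-endo"
    if 324.0 <= p < 360.0:
        return "C2'-exo"
    return "other"


def ideal_torsions(phase_deg, nu_max=38.0):
    """Synthetic torsions nu_j = nu_max * cos(P + 144 * (j - 2)), for testing only."""
    return [
        nu_max * math.cos(math.radians(phase_deg + 144.0 * (j - 2)))
        for j in range(5)
    ]


if __name__ == "__main__":
    for p in (18.0, 162.0, 342.0):
        recovered = pseudorotation_phase(*ideal_torsions(p))
        print(f"P = {recovered:6.1f} deg -> {pucker_name(recovered)}")
```

The puckering amplitude of 38° used for the synthetic torsions is an arbitrary illustrative value; real structures would supply measured torsions instead.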
The overall conformation of the ligand can be classified into three conformations with respect to the base proximity: (1) the closed conformation (shaped as a horseshoe), in which the two base rings are face-to-face; (2) the open conformation, in which the two base rings are far from each other in an elongated arrangement; and (3) the intermediate conformation (shaped as a boat), in which the bases are in neither the closed nor the open conformation (Figure 8A). The comparison of c-di-AMP conformations in the binding sites of c-di-AMP receptors was described by Chin and collaborators, who concluded that c-di-AMP molecules are bound in two main conformational types, "U-shape" or "V-shape", that correspond to the closed and intermediate conformations, respectively [183]. The comparison of c-di-GMP conformations in the binding sites of c-di-GMP receptors was described in detail by Chou and Galperin [193] and by Schirmer [67]. In both papers, c-di-GMP molecules are found in the protein binding sites in different conformational types ranging from a fully stacked form (closed conformation) to an extended form (open conformation), allowing significant binding flexibility. The c-di-GMP bases may interact with the protein binding site by stacking with arginine or phenylalanine/tyrosine residues through the hydrophobic surface of the base. The c-di-GMP bases may also interact with acidic residues (aspartate or glutamate) through Watson-Crick-edge interactions or with arginine residues through Hoogsteen-edge interactions [193].
The c-di-GMP molecule in solution is found in a fast equilibrium between a monomeric state and a dimer with intercalated bases, with a Kd of about 1 mM under physiological salt conditions [194]. Nevertheless, the intracellular concentration of c-di-GMP is in the µM range, suggesting that free c-di-GMP molecules are monomeric inside cells and that c-di-GMP dimers, even though they are found in some c-di-GMP receptor pockets, are probably not relevant for c-di-GMP signaling [67]. When c-di-GMP is bound to proteins, which include its receptors and the active site of DGC enzymes that contain GGDEF domains, it is found mostly as monomers or dimers, though trimeric and tetrameric structures were also observed in PilZ [158] and the C-terminal domain of BldD proteins [173], respectively (Figure 8B). Interestingly, PilZ is the only domain that binds c-di-GMP in monomer, dimer, and trimer forms, while the EAL domain binds c-di-GMP monomers with the largest conformational divergences. Proteins containing the STING domain and RNA riboswitches bind CDN monomers that share similar conformations (Figure 8B,D). When c-di-AMP and cGAMP are bound to proteins or riboswitches, all of them are found as monomers (Figure 8C,D). Therefore, even though bacteria have a large class of specific CDN receptors, which include not only proteins but also RNAs, the conformations of the ligands at the binding site are surprisingly similar.
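The argument that free c-di-GMP is essentially monomeric at µM concentrations can be checked with a short calculation. This is a minimal sketch that assumes a simple two-state equilibrium 2 M ⇌ D with Kd = [M]²/[D] and the mass balance total = [M] + 2[D]; the exact definition of Kd used in [194] may differ, and the function name is our own.

```python
import math


def dimer_fraction(total_conc_molar: float, kd_molar: float = 1e-3) -> float:
    """Fraction of c-di-GMP molecules that sit in dimers for 2 M <=> D.

    Assumptions: Kd = [M]^2 / [D] and total = [M] + 2 [D].
    """
    # Positive root of 2 [M]^2 / Kd + [M] - total = 0.
    monomer = kd_molar * (math.sqrt(1.0 + 8.0 * total_conc_molar / kd_molar) - 1.0) / 4.0
    dimer = monomer ** 2 / kd_molar
    # Each dimer holds two molecules.
    return 2.0 * dimer / total_conc_molar


if __name__ == "__main__":
    for total in (1e-6, 1e-5, 1e-4, 1e-3):
        print(f"total = {total:.0e} M -> dimer fraction = {dimer_fraction(total):.4f}")
```

Under these assumptions, about 0.2% of the molecules are dimeric at a total concentration of 1 µM, whereas half of them are dimeric when the total concentration equals the 1 mM Kd.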

Distribution of Proteins Containing GGDEF and DAC Domains in Bacteria
Initial reviews of the distribution of DisA homologs across bacterial clades suggested that c-di-AMP would play a more important role in Gram-positive bacteria than in Gram-negative bacteria and that, in general, bacteria would avoid allowing these two signaling networks to co-exist, so as to avoid unintended crosstalk and to easily regulate the balance of these second messengers within the cell [21,22]. Subsequent surveys of the distribution of DAC and GGDEF homologs do not support the idea that DAC homologs are rare among Gram-negative bacteria, as members of lineages such as Cyanobacteria, Spirochaetes, and Deltaproteobacteria often carry both DAC and GGDEF genes, a profile compatible with the complex lifestyles and genomes of these lineages. In addition, among Gram-positives, most members of Firmicutes and Actinobacteria, including model organisms such as Bacillus, Clostridium, Streptomyces, Listeria, and Mycobacterium, produce both signaling molecules and possess a wide array of GGDEF genes, following the general trend of having close to as many genomes with both DAC and GGDEF as possible (see Supplementary Table S1 and Figure 9). The only lineages in which several of the sampled genomes seem to have at least one DAC homolog but no, or very few, recognizable GGDEF homologs are Bacteroidetes and the Archaea. In both lineages, the number of genomes with both DAC and GGDEF falls below 50% of the maximum allowed, i.e., the smaller of the number of genomes carrying DAC and the number carrying GGDEF. Genomic data strongly suggest that there is a tendency for bacterial cells to use both c-di-AMP and c-di-GMP signaling networks simultaneously, which would imply that both the control of their synthesis and turnover and the specificity of their sensors are carefully tuned.
Figure 9. Lack of anti-correlation in the distribution of DAC and GGDEF genes per prokaryotic clade. Each dot represents a prokaryotic class, such as Gammaproteobacteria or Bacilli, as defined in the NCBI's Taxonomy Database. For each class, the number of genomes harboring at least one DAC gene, the number harboring at least one GGDEF gene, and the number harboring both were calculated. If, for a given class, we consider the number of genomes with DACs and the number of genomes with GGDEF, the smaller of these numbers is the maximum number of genomes that could, in principle, carry both genes. That number is shown on the x-axis, while the actual number of genomes carrying both genes is on the y-axis. These numbers are very close to the diagonal line, indicating that, in most cases, if members of a given lineage are carrying both DAC and GGDEF, they tend to keep both genes, instead of having to choose between them.
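The comparison behind Figure 9 can be sketched in a few lines: for each class, the maximum possible number of genomes with both genes is the smaller of the DAC and GGDEF counts, and the observed number is compared against that ceiling. The class names and counts below are hypothetical placeholders, not values from Supplementary Table S1, and the data structure is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CladeCounts:
    name: str
    with_dac: int    # genomes with at least one DAC gene
    with_ggdef: int  # genomes with at least one GGDEF gene
    with_both: int   # genomes with at least one of each


def cooccurrence_ratio(clade: CladeCounts) -> float:
    """Observed co-occurrence divided by the maximum allowed, min(DAC, GGDEF)."""
    max_both = min(clade.with_dac, clade.with_ggdef)
    if max_both == 0:
        return float("nan")  # co-occurrence is undefined if one gene is absent
    return clade.with_both / max_both


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    clades = [
        CladeCounts("class A", with_dac=40, with_ggdef=100, with_both=36),
        CladeCounts("class B", with_dac=50, with_ggdef=10, with_both=3),
    ]
    for clade in clades:
        ratio = cooccurrence_ratio(clade)
        flag = "below 50% of maximum" if ratio < 0.5 else "near the diagonal"
        print(f"{clade.name}: {ratio:.2f} ({flag})")
```

A ratio close to 1 corresponds to a dot near the diagonal of Figure 9, while a ratio below 0.5 corresponds to the behavior described for Bacteroidetes and the Archaea.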

Conclusions
Recently, our knowledge about cyclic dinucleotide second messengers has been expanding with the discoveries of new CDNs. At the moment, three different classes of prokaryotic proteins are known to synthesize CDN molecules: (i) proteins containing a GGDEF domain, which synthesize mainly c-di-GMP; (ii) CD-NTase enzymes, which have the catalytic domain known as SMODS and synthesize mainly cGAMP; and (iii) DAC proteins, which have a catalytic domain called the DAC domain (also described as the DisA_N domain) and synthesize mainly c-di-AMP. These CDN synthetases do not share structural similarities, use different residues for substrate binding, and probably possess different catalytic mechanisms, suggesting that they evolved independently to catalyze similar chemical reactions.
As evidence of the importance and ubiquity of bacterial CDNs, it is interesting to note that mammalian cells evolved to sense bacteria by detecting these molecules to stimulate the immune system to counterattack infections. Thus, the use of CDNs as adjuvants in vaccines has been considered, as they can be used as stimulators of the innate immune system [63,195].
The huge repertoire of CDN receptors and the complexity of CDN signaling networks in bacteria are shown in this review. Surprisingly, different CDNs share conformational similarities even in the pockets of different classes of receptors. CDNs can regulate bacterial behaviors at different speeds, directly regulating protein function for a faster response or, more slowly, by affecting gene transcription or protein translation. Remarkably, different CDN second messenger systems may coexist in many organisms, which would imply that both the control of their synthesis and turnover and the specificity of their sensors are carefully tuned.
Therefore, the new discoveries reviewed in this paper open up questions about how bacteria coordinate the three main bacterial CDNs: are they interconnected to regulate the same bacterial phenotype, or do they act independently? Do bacteria use the three CDNs as second messengers, or is one chosen? Are the CDN signaling pathways conserved in different bacteria? Will other CDNs be discovered to also be second messengers?
Supplementary Materials: The following are available online. Table S1: Distribution of DAC, GGDEF, and SMODS homologs across prokaryotic lineages. This data is the basis for Figure 9.