Characterisation of Variants of Cyclic di-GMP Turnover Proteins Associated with Semi-Constitutive rdar Morphotype Expression in Commensal and Uropathogenic Escherichia coli Strains

Expression of rdar (red, dry, and rough) colony morphology-based biofilm formation in Escherichia coli is highly variable. To investigate the molecular mechanisms of semi-constitutive rdar morphotype formation, we compared their cyclic di-GMP turnover protein content and variability to the highly regulated, temperature-dependent morphotype of the historical and modern ST10 isolates E. coli MG1655 and Fec10, respectively. Subsequently, we assessed the effects of cyclic di-GMP turnover protein variants of the EAL phosphodiesterases YcgG and YjcC and the horizontally transferred diguanylate cyclase DgcX on biofilm formation and motility. The two YcgG variants with truncations of the N-terminal CSS signaling domain were oppositely effective in targeting downregulation of rdar biofilm formation compared to the full-length reference protein. Expression of the C-terminal truncated variants YjcCFec67 and YjcCTob1 showed highly diminished apparent phosphodiesterase activity compared to the reference YjcCMG1655. For YjcCFec101, substitution of the C-terminus led to an apparently inactive enzyme. Overexpression of the diguanylate cyclase DgcX contributed to upregulation of cellulose biosynthesis but not to elevated expression of the major biofilm regulator csgD in the “classical” rdar-expressing commensal strain E. coli Fec10. Thus, the c-di-GMP regulating network is highly complex with protein variants displaying substantially different apparent enzymatic activities.


Introduction
The genomic variability of organisms contributes to their adaptive potential. Microorganisms, particularly bacteria with their small genomes often in multiple copies, are masters of stress survival and adaptation. While a major focus has been on the effects of horizontally transferred accessory elements, which undoubtedly introduce novel features into bacterial genomes in quantum leaps, it has recently been realized that even single nucleotide polymorphisms (SNPs) can have a substantial effect on the cell's physiology, metabolism, and virulence. These SNPs have mainly been explored when they are non-synonymous in open reading frames (ORFs). For example, non-synonymous mutations in a variety of ORFs have been shown to contribute to the enhanced virulence/reduced biofilm phenotype of the emerged African invasive ST313 clone of the gastrointestinal pathogen Salmonella enterica serovar Typhimurium [1]. However, a few studies have also explored SNPs in the more variable intergenic regions and in promoter regions. For example, the insertion/transversion of a single nucleotide in the promoter region of the major rdar [...] displayed highly downregulated or abolished apparent catalytic activity. On the other hand, as overexpression of the diguanylate cyclase DgcX contributed to upregulation of cellulose biosynthesis but did not elevate production of the rdar biofilm regulator CsgD, additional components might contribute to the semi-constitutive rdar morphotype and csgD expression at 37 °C in the commensal strain E. coli Fec101.

Strains and Growth Conditions
All commensal and uropathogenic E. coli strains analyzed in this work have been previously described and genome sequenced, and their cyclic di-GMP turnover networks have been characterized [14]. The E. coli strains expressing the rdar colony morphotype semi-constitutively were Tob1, Fec67, Fec101, B-8638, B-11870, 80//6, and No. 12. Strains were propagated on LB agar or grown on LB without salt plates supplemented with 2% Congo red (CR) and Coomassie brilliant blue G stock solution (CR at 2 mg/mL, Brilliant Blue G at 1 mg/mL in 70% (v/v) ethanol) as described to visualize the rdar biofilm phenotype [19]. If required, 100 µg/mL ampicillin and 0.1% L-arabinose were added to the medium for plasmid propagation and gene expression, respectively. All strains used in the study are listed in Table S1.
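The final dye concentrations in the plates follow from the stock concentrations and the 2% supplement by simple dilution arithmetic. The following is a minimal sketch, assuming that the 2% refers to volume of stock per final volume of medium (v/v); the function names are illustrative and not part of the original protocol.

```python
def final_ug_per_ml(stock_mg_per_ml: float, percent_v_v: float) -> float:
    """Final concentration in µg/mL after adding a stock at percent_v_v of the final volume."""
    return stock_mg_per_ml * 1000.0 * percent_v_v / 100.0


def stock_volume_ml(final_volume_ml: float, percent_v_v: float) -> float:
    """Volume of stock solution (mL) needed for a given final volume of medium."""
    return final_volume_ml * percent_v_v / 100.0


# Stock values from the text: CR at 2 mg/mL, Brilliant Blue G at 1 mg/mL, added at 2%.
# Assumption: the 2% is v/v relative to the final medium volume.
congo_red = final_ug_per_ml(2.0, 2.0)
brilliant_blue = final_ug_per_ml(1.0, 2.0)
print(f"Congo red: {congo_red:g} µg/mL, Brilliant Blue G: {brilliant_blue:g} µg/mL")
print(f"Stock needed for 500 mL medium: {stock_volume_ml(500.0, 2.0):g} mL")
```

Under this assumption, the plates contain 40 µg/mL Congo red and 20 µg/mL Brilliant Blue G.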

Cloning Procedures and Site-Directed Mutagenesis
Genes of interest were amplified and ligated into the pBAD30 vector [20] via XbaI/HindIII restriction sites. Site-directed mutagenesis was performed with Phusion® High-Fidelity DNA Polymerase and an overlap extension PCR approach using 17 cycles to introduce the mutations. A DpnI digest to remove methylated template DNA ensured that only mutated plasmid was transformed. All constructs were verified by sequencing. All plasmids used in the study are listed in Table S2, and the primers are listed in Table S3.

Mutant Construction
Construction of the dgcX chromosomal deletion mutant was performed via λ Red-mediated homologous recombination [21]. In brief, a kanamycin resistance cassette was amplified from the pKD4 template flanked by 40-bp homologous sequences at the beginning and end of the target gene. After electroporation into E. coli Fec101 carrying helper vector pKD46, the mutants were selected on 25 µg/mL and subsequently 50 µg/mL kanamycin, verified by PCR with primers flanking the replaced ORF, and cured of pKD46 by incubation at 42 °C. Primers are listed in Table S3.
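The primer layout implied by this procedure (a 40-nt homology arm taken from the start or end of the target gene, followed by a sequence that primes on the resistance cassette) can be sketched as follows. This is an illustrative sketch only: the toy ORF and the two priming sequences are placeholders invented for the example, not the actual dgcX sequence or the pKD4 priming sites, and the primers actually used are those in Table S3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def deletion_primers(orf: str, fwd_priming: str, rev_priming: str, arm: int = 40):
    """Return (forward, reverse) primers, each = homology arm + cassette priming site.

    orf is the coding strand of the target gene, written 5'->3'.
    """
    orf = orf.upper()
    if len(orf) < arm:
        raise ValueError("ORF is shorter than the requested homology arm")
    # Forward primer: first `arm` nt of the gene, then the cassette priming site.
    forward = orf[:arm] + fwd_priming.upper()
    # Reverse primer: reverse complement of the last `arm` nt, then the priming site.
    reverse = reverse_complement(orf[-arm:]) + rev_priming.upper()
    return forward, reverse


# Toy ORF and placeholder priming sites (hypothetical, NOT the real pKD4 sequences).
toy_orf = "ATG" + "GCTAAC" * 20 + "TAA"
fwd, rev = deletion_primers(toy_orf, "AAAACCCCGGGGTTTTAAAA", "TTTTGGGGCCCCAAAATTTT")
print(fwd)
print(rev)
```

Recombination between the two arms and the chromosome then replaces the sequence between them with the amplified cassette.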

Analysis of rdar Colony Morphology and Type 1 Fimbriae Expression
Congo red LB without salt plates were used to visualize expression of the rdar morphotype. Dye binding to both cellulose and curli fibers stains the bacterial colony [2,9,12]. Strains were streaked for single colonies, or 5 µL of a suspension of OD600 = 5 was spotted onto the agar plates. If applicable, 100 µg/mL ampicillin and 0.1% L-arabinose were added. Development of the colony morphology and color was analyzed at 28 °C and 37 °C and documented by taking photographs at distinct time points for up to 72 h.
To visualize cellulose expression, 5 µL of a suspension of OD600 = 5 was spotted onto LB without salt agar plates supplemented with 50 µg/mL Calcofluor white (Fluorescent Brightener 28, Sigma, St. Louis, MO, USA). Fluorescence was observed under UV light at a wavelength of 365 nm.
Type 1 fimbriae were assessed by the pellicle assay performed in standing culture in LB medium at 37 °C for 48 h. Pellicles were documented by taking photographs from above and as a side view.

Flagella-Based Motility
Strains were inoculated into LB motility plates containing 0.25% agar using a toothpick. If applicable, 100 µg/mL ampicillin and 0.1% L-arabinose were added. Agar plates were incubated at 28 °C and 37 °C for 6 h.

SDS PAGE and Western Blotting
To investigate CsgD production, the strains of interest were grown on LB without salt agar plates for 16-18 h. For the analysis of YcgG synthesis, strains were grown on LB without salt agar plates supplemented with 100 µg/mL ampicillin and 0.1% L-arabinose for 16 h. Five milligrams of cells were harvested and suspended in 1× SDS sample buffer, followed by boiling at 95 °C for 10 min. The samples were separated by denaturing SDS PAGE (4% stacking gel and a 12% (for detection of His-tagged YcgG) or 15% (for CsgD detection) resolving gel). Separated proteins in gels were visualized with a staining solution (0.1% Coomassie Brilliant Blue G-250, 2% (w/v) ortho-phosphoric acid, 10% (w/v) ammonium sulfate (Sigma)) to allow for adjustment of equal protein content, or semidry western blotting was performed at 120 mA for 1 h to transfer the proteins onto a PVDF membrane (Immobilon P; Millipore, Burlington, MA, USA). CsgD protein was detected using a custom-made primary rabbit polyclonal anti-E. coli CsgD peptide antibody [22] and a goat anti-rabbit secondary antibody (Jackson Immuno Research) (both antibodies were diluted 1:3000). His-tagged proteins were detected using 1:1500 diluted mouse anti-Penta-His antibody (Qiagen, Hilden, Germany) and a 1:2000 diluted goat anti-mouse secondary antibody (Jackson Immuno Research). After treatment with Lumi-Light Western Blotting substrate (Roche, Basel, Switzerland), the resulting chemiluminescence was detected via an LAS-1000 detector (Fujifilm, Tokyo, Japan).

Analyzed Protein Sequences
The YjcC sequences aligned in Figure 3 [...] The GGDEF domains investigated in Figure 4 were from the proteins (clockwise direction) Shigella sonnei EGF2695153.1 [...]

Results
We identified two additional candidate genes encoding cyclic di-GMP turnover proteins associated with semi-constitutive rdar morphotype expression, the function of which is potentially altered/modulated by non-synonymous single nucleotide changes in combination with 5′ or 3′ alteration/truncation of the ORFs. These genes were ycgG, coding for a redox-regulated cyclic di-GMP phosphodiesterase previously shown to be involved in regulation of motility [36], and yjcC, encoding a phosphodiesterase (PDE) previously shown to mediate rdar morphotype regulation in E. coli [37,38]. In addition, dgcX, coding for a horizontally transferred diguanylate cyclase (DGC) first identified in the 2011 EAEC outbreak strain [39], has been identified in the commensal strain Fec101.

The N-Terminal Sequence of Truncated YcgG Affects Protein Expression and rdar Morphotype Formation
In E. coli K-12 MG1655, the PDE YcgG consists of an N-terminal redox-responsive CSS domain of around 200 aa [40,41] flanked by transmembrane domains and a C-terminal EAL domain. A full-length YcgG protein is also present in the modern equivalent of E. coli K-12, the commensal strain Fec10, in the commensal strain Fec101, and in the uropathogenic strain B-8638. E. coli Fec101 and B-8638 both express a semi-constitutive rdar morphotype. As previously reported, YcgG from Fec10 and Fec101 possesses the aa substitution Y474N, while YcgG of B-8638 contains six additional aa substitutions [17]. In contrast, in the commensal strains Tob1 and Fec67 and the uropathogenic strains B-11870 (including the clonal variant 80//6) and No. 12, YcgG has a truncated N-terminus, leaving basically only the EAL domain. Notably, the truncated nucleotide regions are not entirely identical: YcgG from E. coli Tob1 possesses a two aa longer and divergent 22 aa stretch at the N-terminus compared to the truncated variant YcgG B-11870, whose N-terminus is congruent with the aa sequence of YcgG MG1655 at that position (Figure 1). Further, YcgG Tob1 and YcgG B-11870 contain aa substitutions compared to YcgG MG1655 (Figure 1). The aa exchange A298V is unique to YcgG Tob1; the S306C, F358Y, E437A, L467F, and Y474N substitutions are present in both YcgG Tob1 and YcgG B-11870; and V504I is unique to YcgG B-11870 (Figure 1, [17]). Furthermore, the ORF of ycgG B-11870 has the rare start codon TTG in contrast to ycgG MG1655 and ycgG Tob1, which have ATG as a start codon. However, changing TTG to the conventional ATG start codon has recently been demonstrated not to substantially alter protein expression in E. coli [42].
To investigate the consequences of the expression of the highly similar, yet distinct proteins with deletion of the CSS sensory domain, we analyzed the effect of expression of YcgG MG1655, YcgG Tob1, and YcgG B-11870 on rdar morphotype formation and production of its major activator CsgD in the semi-constitutive rdar strain E. coli Tob1 (rdar 28 °C/rdar 37 °C) at 28 °C and 37 °C (Figure 2A). To restrict the assessment of functionality solely to the gene product, the coding regions were cloned under the control of the L-arabinose-inducible pBAD promoter in pBAD30 with an identical Shine-Dalgarno sequence. While full-length YcgG MG1655 slightly downregulated rdar morphotype formation and the protein level of the master rdar biofilm regulator CsgD, expression of truncated YcgG B-11870 unexpectedly led to a highly reduced rdar morphotype and low levels of CsgD production (Figure 2B). In contrast, expression of YcgG Tob1 was less effective than YcgG MG1655 in downregulating the rdar colony morphotype and had no visible effect on CsgD synthesis. Notably, the effect of YcgG production on rdar morphotype formation was consistently observed at 28 °C and 37 °C. CsgD levels could only be reliably assessed at 28 °C. Although expression of soluble protein at 37 °C is at the detection limit of western blot, we judged that none of the constructs affected CsgD production at 37 °C (Figure 2B; [17]).
Figure 1. [...] and B-11870 lack the N-terminal CSS signaling domain (indicated in green) compared to K-12. The N-terminal region of truncated YcgG is unique to E. coli Tob1 (black line). (B) Alignment of aa sequences of YcgG from E. coli K-12, Tob1, and B-11870. YcgG proteins from Tob1 and B-11870 are truncated at the N-terminus compared to the K-12 reference protein. The alignment was created with MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/). Amino acid substitutions are indicated in red letters. A star (*) indicates conservation of the amino acid, a colon (:) a conservative substitution, a period (.) a semi-conservative substitution, and a blank a non-conservative substitution.
To investigate whether downregulation of rdar biofilm formation and its major activator is due to altered protein production levels or impaired enzymatic activity of YcgG upon truncation, we detected the production of YcgG via the C-terminal 6xHis-tag by western blotting. While production of YcgG MG1655 wild-type protein was detected at 28 °C and 37 °C, YcgG B-11870 was produced to a significantly greater extent. Notably, production of YcgG B-11870 was increased at 37 °C, suggesting that elevated temperature or decreased proteolysis stabilizes this YcgG variant. Concomitant with the high production of YcgG B-11870, a specific non-biofilm forming, smooth, and white (saw) morphotype emerged at 28 °C. In contrast, production of YcgG Tob1 could not be observed by western blot analysis despite visible effects on rdar colony morphology (Figure 2C). In summary, the effect on rdar morphotype formation and CsgD expression of the YcgG variants strongly correlates with their protein production levels.

YjcC Proteins with C-Terminal and Individual Amino Acid Substitutions Show Diminished Apparent Phosphodiesterase Activity and Downregulation of the rdar Colony Morphotype
The phosphodiesterase YjcC is a major phosphodiesterase in S. typhimurium, determinative of temperature-dependent rdar colony morphotype expression [37]. In E. coli MG1655, yjcC appears to play a more restricted role, with alternative phosphodiesterases being dominant with respect to affecting rdar biofilm colony morphology [38]. We noted variants of the phosphodiesterase YjcC in the semi-constitutive rdar morphotype-expressing strains E. coli Tob1, Fec101, and Fec67 (Figure S3). In strains Tob1 and Fec67, the C to T transition at position 1552 of the yjcC ORF created a stop codon leading to an 11 aa shorter protein (Figure S2). In strain Fec101, the deletion of an adenosine nucleotide at position 1433 led to a frameshift and the substitution of the last 49 C-terminal aa of YjcC with a 44 aa long unrelated sequence (Figure S3). In addition, YjcC proteins of strains Tob1 and Fec67 possess multiple, partly distinct, aa substitutions compared to reference YjcC MG1655, while YjcC Fec101 displays two aa substitutions (P309L and D403E) (Figure S3; [17]). To investigate the impact of the YjcC variants with the altered C-termini, we cloned the ORFs under the control of the same Shine-Dalgarno sequence and with the same extension of the nucleotide sequences 3′ of the stop codon (Table S1). Expression of these constructs and reference YjcC MG1655 in E. coli Tob1 showed that production of YjcC MG1655 substantially downregulated the rdar colony morphotype at 28 °C (Figure 3A). While the partially truncated protein variant YjcC Tob1 and, to a minor extent, YjcC Fec67 still downregulated the rdar morphotype at 28 °C, YjcC Fec101 did not substantially alter the rdar colony morphology compared to the vector control. Notably, YjcC MG1655 and YjcC Fec67 expression also downregulated the rdar morphotype at 37 °C to a minor extent, while expression of YjcC Tob1 displayed a substantially altered, although still downregulated, rdar colony morphology. We therefore must assume that the two aa substitutions that discriminate YjcC Fec67 from YjcC Tob1 are responsible for this different effect on the rdar colony morphotype.
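The relation between the nucleotide positions given above and the affected codons is plain arithmetic, sketched below under the assumption that positions are 1-based and counted from the first nucleotide of the start codon. The helper names and the 15-nt toy ORF are invented for illustration and are not yjcC sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_index(nt_position: int) -> int:
    """1-based codon number that contains a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1


def position_in_codon(nt_position: int) -> int:
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    return (nt_position - 1) % 3 + 1


def residues_before_stop(orf: str) -> int:
    """Number of codons translated before the first in-frame stop codon."""
    orf = orf.upper()
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return len(orf) // 3


def substitute(orf: str, nt_position: int, base: str) -> str:
    """Replace the base at a 1-based position."""
    return orf[:nt_position - 1] + base + orf[nt_position:]


# Positions from the text (assumed 1-based from the start codon).
print(codon_index(1552), position_in_codon(1552))  # codon 518, first base
print(codon_index(1433), position_in_codon(1433))  # codon 478, second base

# Toy ORF: ATG CAA CAA CAA TAA. A C-to-T change at the first base of a CAA codon gives TAA.
toy = "ATGCAACAACAATAA"
mutant = substitute(toy, 7, "T")
print(residues_before_stop(toy), residues_before_stop(mutant))  # 4 and 2
```

A stop at codon 518 leaves 517 residues; this matches the reported 11 aa truncation if the reference protein has 528 residues, a length inferred here from the numbers in the text rather than stated in it.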

YjcC Proteins with Similar C-Terminal and Amino Acid Substitutions Are Present in the Database
The YjcC phosphodiesterase proteins with C-terminal substitutions and truncations are not unique to our strain collection (Figure S2). YjcC proteins with the 11 aa shorter C-terminus and the same or similar aa substitutions, such as YjcC Tob1/YjcC Fec67, compared to YjcC MG1655 are found encoded by a number of isolates among the sequenced E. coli strains including the uropathogenic strain E. coli CFT073. Equally, at least eight human E. coli isolates, including a human commensal, pathogens such as strain EPECa14, animal, and wastewater strains, possess a YjcC with a C-terminal substituted aa sequence highly similar to YjcC Fec101, with the second glutamate of the EVTE motif involved in divalent cation binding missing (Figure S2). Notably, distinct single nucleotide deletions and deletion of several nucleotides led to a similar C-terminal aa sequence substitution in YjcC in these epidemiologically unrelated strains. The phylogenetic tree clearly discriminates among these three distinct classes of YjcC protein variants (Figure 3B). Despite a divergent C-terminus, however, the structural model indicates a similar structure of the C-terminus of YjcC Fec101 compared to YjcC MG1655 (Figure 3C).

The Novel Diguanylate Cyclase DgcX Is Critical for Biofilm Formation at Human Body Temperature
The DGC DgcX, not present in E. coli K-12 and Fec10, was first described as a highly expressed diguanylate cyclase in the E. coli O104:H4 2011 German outbreak strain, and it is considered to be a virulence determinant (Figure 4A; [39]). The genome of the commensal E. coli strain Fec101 showed only minor deviations in the c-di-GMP network and the biofilm components compared to the ST10 reference strains with respect to SNPs and gene rearrangements, but it contains the additional DGC DgcX, a MASE4-GGDEF domain protein [17,39,44]. DgcX was thus identified as a candidate diguanylate cyclase responsible for the observed upregulated rdar colony morphotype at 37 °C in the commensal strain Fec101.
Interestingly, the chromosomal location and gene context of dgcX vary among the strains harboring this gene. Database searches reveal that the O104:H4 outbreak strain and other EAEC strains harbor dgcX and the adjacent inserted prophage between ybhC (position complement 805998-807281 relative to the E. coli K-12 MG1655 genome) and ybhB (position complement 807433-807909). The nature of the prophage as such differs, however [39]. In the environmental strain E. coli SE11, dgcX, along with an adjacent prophage, is inserted between ydaW (position 1422701-1423200) and uspF (position complement 1435185-1435619) [39]. In Fec101, dgcX is present as part of a 9-kb insertion yet at another location between argW and dsdC (Figure S3A). In E. coli K-12 MG1655, the prophage CPS-53/KpLE1 [45] is inserted between argW (position 2466309-2466383 in K-12 MG1655) and dsdC (position complement 2476694-2477629).

Overexpression of DgcX Induces Cellulose Production at 37 °C in the Regulated rdar Colony Morphology Type Strain E. coli Fec10
DgcX harbors a GGDEF domain that is seemingly functional as a diguanylate cyclase, with an intact GGEEF catalytic motif and other conserved aa motifs required for catalytic activity, substrate binding, and metal ion coordination [46,47]. N-terminal of the GGDEF domain, DgcX contains a MASE4 domain [35] with six transmembrane regions (Figure 4A). The closest homologs to DgcX over the entire length of the protein in E. coli K-12 are the catalytically non-functional YeaI (CdgI) and the diguanylate cyclase YcdT (Figures 4B and S3). Compared to other DGCs of E. coli, DgcX is highly expressed at both 28 °C and 37 °C [35].
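Checking a protein sequence for the GG(D/E)EF signature, and stating which motif positions differ in the variants examined in this section, is a simple pattern search. The sketch below is illustrative only: the toy sequence is invented and is not DgcX, and the presence or absence of the motif does not by itself predict catalytic activity.

```python
import re

# Assumption: the signature is searched as the pattern GG[DE]EF.
GGDEF_PATTERN = re.compile(r"GG[DE]EF")


def find_signature_motifs(protein: str):
    """Return (0-based start, motif) for every GG(D/E)EF match in a protein sequence."""
    return [(m.start(), m.group()) for m in GGDEF_PATTERN.finditer(protein.upper())]


def changed_positions(reference_motif: str, variant_motif: str):
    """1-based motif positions at which a variant differs from the reference."""
    if len(reference_motif) != len(variant_motif):
        raise ValueError("motifs must have equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(reference_motif, variant_motif)) if a != b]


toy_protein = "MKTLLGGEEFAAA"  # invented sequence, not DgcX
print(find_signature_motifs(toy_protein))

# Motif variants named in the text, compared with the GGEEF reference.
for variant in ("GGEAF", "GGAAF", "GGAEF"):
    print(variant, changed_positions("GGEEF", variant))
```

The comparison makes explicit that GGAEF changes only the first glutamate (motif position 3), GGEAF only the second (position 4), and GGAAF both.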
To test the effect of DgcX in the Fec101 background, we constructed a dgcX deletion mutant (Figure S3B). Assessment of rdar morphotype expression, however, showed only a minor effect of the dgcX deletion on rdar biofilm formation, while the deletion mutant of the gene for the major biofilm activator csgD displayed a residual pink, dry, and rough (pdar) morphotype indicating cellulose but no curli fiber biosynthesis.
To investigate the effect of dgcX overexpression on rdar morphotype formation, dgcX was cloned with either an N- or C-terminal 6xHis-tag into the cloning vector pBAD30 under the control of an L-arabinose-controlled promoter. DgcX was subsequently expressed in strain E. coli Fec10, which features a highly regulated rdar morphotype at 28 °C and a saw morphotype at 37 °C, which facilitates assessment of the functionality of a diguanylate cyclase [12,18]. Overexpression of dgcX enhanced the rdar morphotype at 28 °C (Figure 4C, left panel) and resulted in a pdar morphotype and enhanced Calcofluor white binding at 37 °C (Figure 4C, right panel), suggesting csgD-independent cellulose production. Upregulation of cellulose expression is dependent on the catalytic activity, as DgcX variants with mutations of the GGEEF motif to GGEAF or GGAAF no longer induced upregulation of rdar morphotype expression, while modest upregulation in colony morphology was still observed in the GGAEF mutant. Remarkably, overexpression of DgcX did not alter production of CsgD in Fec10 at either 28 °C or 37 °C (Figure 5A). This outcome is in contrast to observations in the 2011 E. coli outbreak strain, in which DgcX elevated CsgD production, but cellulose was not synthesized due to a frameshift mutation in the gene encoding the c-di-GMP binding protein BcsE [39]. BcsE has recently been shown to be required for optimal cellulose production in E. coli and S. typhimurium [48]. Moreover, to clearly show the effect of DgcX at 28 °C, we expressed DgcX in the Fec10 ∆csgD strain. Expression of dgcX induced pdar morphotype expression also at 28 °C but to a lesser extent (Figure 5B). Thus, DgcX conditionally contributes to semi-constitutive rdar morphotype expression in Fec101 by upregulation of csgD-independent cellulose expression.
Notably, the effect of DgcX on cellulose expression is remarkably stronger for the N-terminally tagged than for the C-terminally tagged construct, which shows barely visible activity at 28 °C, indicating that the tag either interferes with (at the C-terminus) or promotes (at the N-terminus) enzymatic activity and/or protein stability (Figures 4C and 5B).

Effect of dgcX on Flagella-Based Motility in E. coli Fec101
Cyclic di-GMP signaling regulates the lifestyle transition from sessility (biofilm formation) to motility. We were therefore wondering whether dgcX affects motility. We have previously shown that Fec101 exhibits moderate motility [17]. The corresponding motility assay showed that E. coli Fec101 ΔdgcX has slightly higher motility than the Fec101 wild type at 28 °C and 37 °C ( Figure S3B).
Consistent with the motility phenotype of the dgcX deletion mutant, overexpression of dgcX in Fec101 repressed motility at 28 °C and 37 °C. As expected with the predicted loss of the diguanylate cyclase activity, the GGAAF and GGEAF mutants did not repress motility, while the GGAEF mutant still displayed motility repression, suggesting, as in the case of regulation of rdar colony morphology, at least residual catalytic activity of the DgcX GGAEF variant ( Figure 4D). Thus, in DgcX, unexpectedly, the second glutamate of the GGEEF motif is dominant in determining catalytic activity [46,47].

DgcX Is Restricted to E. coli Species
Previously, dgcX was found to be encoded by the genomes of the EAEC strains LB226692, 2011C-3493, HUSEC041, 55989, 2009EL-2071, 2009EL-2050, ETEC H10407, E24377A, and the commensal strain SE11 [39,44]. A protein BLAST search against the NCBI microbial genomes database revealed the presence of DgcX (>75% query cover and >50% identity of the aa sequence) in more than 1150 strains of the genus Escherichia, mostly E. coli, a few Shigella spp., and Escherichia albertii (Figure 4B; BLAST search performed in June 2023). Thus, the presence of dgcX, although obtained through horizontal gene transfer, is indeed restricted to E. coli. In conclusion, the gene product DgcX is highly conserved and present only in strains of E. coli and closely related species of the Escherichia genus, while the highly similar gene products YcdT and YeaI/CdgI can also be present in related genera (Figures 4B and S3C,D). This outcome suggests recent diversifying radiation of the YcdT/DgcX/YeaI family specifically in E. coli with one of its members, DgcX, subject to extended mobilization. Identity/similarity analyses of the aligned sequences showed, however, that threshold identity/similarity values can be determined to categorize proteins of the YcdT/DgcX/YeaI subfamilies, in line with the above-mentioned threshold values with the potential assignment of founders for novel protein subfamilies (Figure S3C,D).
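The inclusion criterion used for the BLAST search (>75% query cover and >50% aa identity) amounts to a two-threshold filter over the hit list. The following is a minimal sketch with invented hits; the record layout and names are assumptions for illustration and do not reproduce the actual search output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    subject: str
    query_cover: float  # percent of the query covered by the alignment
    identity: float     # percent identical residues


def passes_thresholds(hit: BlastHit, min_cover: float = 75.0, min_identity: float = 50.0) -> bool:
    """True if the hit exceeds both thresholds stated in the text (strictly greater)."""
    return hit.query_cover > min_cover and hit.identity > min_identity


# Invented example hits (hypothetical names and values).
hits = [
    BlastHit("strain_A", 99.0, 98.5),
    BlastHit("strain_B", 80.0, 45.0),  # fails on identity
    BlastHit("strain_C", 60.0, 90.0),  # fails on query cover
    BlastHit("strain_D", 75.0, 75.0),  # fails: cover is not strictly above 75%
]
kept = [h.subject for h in hits if passes_thresholds(h)]
print(kept)
```

Applying both thresholds together excludes hits that align well over only a short part of the query, such as a shared GGDEF domain in an otherwise unrelated protein.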

Discussion
Within a species, biofilm formation can be highly variable; however, the underlying molecular mechanisms in natural isolates remain to be determined. We have started to dissect the molecular basis of the temperature-independent rdar colony morphotype expression of three commensal and three uropathogenic E. coli strains compared to the commensal strain E. coli MG1655 and its recently isolated 'modern' ST10 counterpart E. coli Fec10. Both of these strains express a highly regulated rdar biofilm only at temperatures below 30 °C [17,18]. These analyses subsequently revealed high variability in the enzymes that conduct turnover of the second messenger c-di-GMP associated with differential biofilm expression. In a subset of strains, distinct aa substitutions in the trigger diguanylate cyclase/phosphodiesterase YciR were experimentally coupled with reduced ability to downregulate csgD encoding the master regulator of rdar biofilm formation and, upon introduction of a nonsense mutation, even to upregulate csgD by a truncated FI-PAS-GGDEF protein acting as an apparent diguanylate cyclase [17]. However, additional alterations in alternative cyclic di-GMP turnover proteins might cumulatively contribute to semi-constitutive rdar morphotype expression in E. coli strains. In this work, we investigated the effect of genomic alterations occurring in ycgG and yjcC coding for redox-regulated cyclic di-GMP phosphodiesterases and the effect of the acquisition of dgcX, which codes for a horizontally transferred diguanylate cyclase in the commensal strain E. coli Fec101. DgcX was first identified in the 2011 EAEC outbreak strain [39]. Findings in this work could be extended to investigate the molecular basis of substantially different behaviors of protein variants as well as natural evolution toward potentially novel protein functionalities in their physiological and evolutionary contexts.
For example, the specific feature of yjcC in regulating the temperature dependence of the rdar morphotype in S. typhimurium in contrast to E. coli representatives might be an acquired virulence trait specific for either Salmonella or E. coli, causing intestinal infection.
Although the EAL domain of the CSS-EAL protein YcgG is present in all investigated E. coli strains, the N-terminal part of the protein was truncated in a subset of isolates due to deletion of larger genomic sequences. Indeed, we have also observed a deletion of the CSS domain of YcgG in all strains of the multidrug-resistant pandemic E. coli ST131 clone (unpublished observations). However, the deletion of the N-terminal periplasmic CSS sensory domain left in all cases an ORF encoding an intact EAL phosphodiesterase domain. The deletions, however, also led to distinct protein variants that, upon overexpression, showed opposite performance in production levels and consequently downregulated the rdar morphotype relative to the ST10 E. coli wild-type protein. Thereby only the effect of the ycgG Tob1 variant resulting in a lower protein level is in congruence with the expression of a semi-constitutive rdar morphotype. It is possible that the rare TTG start codon of ycgG Tob1 causes significantly lower protein production. Alternatively, the novel unique 22-aa sequence at the truncated N-terminus and/or, less likely, the distinct aa substitution A298V can affect protein folding, stability, or degradation, causing YcgG Tob1 to be undetectable by standard western blot analyses, although its effect is still visible in the biological assay. However, it remains to be determined how ycgG expression has been altered upon introduction of the novel promoter in vivo in E. coli strain Tob1 (Figure S1D). Although expression of ycgG is almost undetectable in E. coli K-12 [36,50], deletion of a truncated ycgG (c1610), similar to ycgG B-11870, in the E. coli UPEC strain CFT073 increased the levels of the FimA subunit of type 1 fimbriae and enhanced adherence to bladder epithelial cells [51]. Thus, although we did not observe a decrease in type 1 fimbriae-dependent biofilm formation of E. coli Tob1 upon overexpression of the ycgG variants in laboratory assays (Figure S4), truncated ycgG B-11870 might decrease biofilm formation in other strains and/or under alternative conditions, such as in the host. In this context, ycgG in neonatal meningitis-causing E. coli (NMEC) and the uropathogenic isolate UTI89 is involved in the switch from iron acquisition to citrate fermentation via the citrate transporter CitT [52]. Whether the truncation of YcgG is causative for involvement in this metabolic activity has not been investigated.
In summary, the effect of ycgG variants is only partially congruent with enhanced rdar biofilm formation, and the variants might predominantly be involved in the regulation of alternative cyclic di-GMP-affected physiological and metabolic properties. YcgG belongs to a group of five redox-sensing CSS-EAL domain proteins in E. coli [40,41]. Reducing conditions or deletion of the disulfide bond formation system components led to proteolytic cleavage of the YjcC and YlaB proteins, which resulted in reduced catalytic activity of the cytoplasmic EAL-only domain [41]. It is intriguing that the manifested genomic alterations observed here for ycgG in natural commensal and uropathogenic E. coli strains, alternative transcriptional start sites, and additional proteolytic digestion by the periplasmic proteases DegQ and DegP in the model strain E. coli K-12 under laboratory growth conditions led to a similar fragmentation pattern of a CSS-EAL protein [14,17,41,49,51–53].
In theory, at least three fundamentally different modes of regulation of semi-constitutive rdar biofilm expression by cyclic di-GMP turnover proteins can occur: reduced (apparent) phosphodiesterase activity, upregulated diguanylate cyclase activity of a chromosomally encoded diguanylate cyclase, and acquisition of a novel diguanylate cyclase (Figure 6). In the commensal strain E. coli Fec101, the situation might be more complex. The GGDEF-EAL domain protein variant YciR still substantially reduces rdar biofilm formation [17], and as shown in this work, the horizontally introduced diguanylate cyclase DgcX does not enhance production of the biofilm activator CsgD but contributes to rdar biofilm formation by upregulation of cellulose biosynthesis. Indeed, diguanylate cyclases transcriptionally independent of csgD have been shown to exclusively regulate cellulose biosynthesis in E. coli under alternative growth conditions [54]. The dedication of a diguanylate cyclase/phosphodiesterase to preferentially affect biofilm formation, a particular biofilm component, or motility is not uncommon.
In S. typhimurium, YjcC is the dominant phosphodiesterase mediating the temperature-dependent rdar morphotype [37]. In E. coli MG1655, yjcC has a less determinative role [38]. The investigated yjcC variants yjcC Tob1, yjcC Fec67, and yjcC Fec101 showed successively decreased activity. Although YjcC Fec101 seems to be catalytically inactive (as it lacks a glutamate required for divalent cation binding), it remains to be shown whether lack of this phosphodiesterase activity would be sufficient to promote CsgD production in Fec101 at 37 °C. On the other hand, the substitution of the last 49 C-terminal aa with a 44 aa-long unrelated sequence in YjcC Fec101 might have led to a protein with a novel functionality that is not immediately visible by substantial alterations in the rdar colony morphology biofilm type. However, the substantially different appearance of the colony morphology of E. coli Tob1 upon overexpression of yjcC Fec101 at 28 °C indicates expression and a physiological role of yjcC Fec101.

Figure 6. Different modes of regulation of rdar biofilm expression by genetic and environmental factors. Modes of regulating rdar morphotype formation can be at the transcriptional level by mutations in the csgD promoter, leading to altered expression at 28 °C, 37 °C, and 42 °C (I) [2,16]; by different genetic factors including global transcriptional regulators and small RNAs targeting the csgD promoter and beyond (II); by alterations of cyclic di-GMP turnover protein activity and presence, such as altered catalytic activity, including phosphodiesterase and diguanylate cyclase activity; or by deletion and acquisition of novel diguanylate cyclases (III) [10,38,39]. Cyclic di-GMP is a major determinative factor for overriding most other regulatory determinants. CsgD subsequently activates curli expression directly. Cellulose biosynthesis can be regulated independently of csgD-mediated rdar biofilm expression (IV) [4,8,54].

Conclusions
In conclusion, characterization of the effects of variants of cyclic di-GMP specific phosphodiesterases on the downregulation of rdar colony morphotype expression and regulation of motility provided indications for their contributions to the semi-constitutive rdar morphotype, which is seen at high frequency in E. coli strains compared to the closely related species S. Typhimurium.

Supplementary Materials:
The following supporting information can be downloaded at: www.mdpi.com/xxx/s1, Figure S1: Nucleotide sequence of the ycgG region from E. coli K-12 MG1655, commensal Tob1, and uropathogenic B-11870; Figure S2: Multiple alignment of YjcC of E. coli strains MG1655, Tob1, Fec67, and Fec101; Figure S3: Flanking region of the DGC encoding gene dgcX in Fec101; Figure S4: Pellicle assay indicative for the expression of type 1 fimbriae; Table S1: Strains constructed or used in this study; Table S2: Plasmids constructed or used in this study; Table S3: Oligonucleotides used in this study. References [55,56] are cited in the supplementary materials as reference [6] and [7], respectively.
