Contribution of the RgfD Quorum Sensing Peptide to rgf Regulation and Host Cell Association in Group B Streptococcus

Streptococcus agalactiae (group B Streptococcus; GBS) is a common inhabitant of the genitourinary and/or gastrointestinal tract in up to 40% of healthy adults; however, this opportunistic pathogen is able to breach restrictive host barriers to cause disease and persist in harsh and changing conditions. This study sought to identify a role for quorum sensing, a form of cell to cell communication, in the regulation of the fibrinogen-binding (rgfBDAC) two-component system and the ability to associate with decidualized endometrial cells in vitro. To do this, we created a deletion in rgfD, which encodes the putative autoinducing peptide, in a GBS strain belonging to multilocus sequence type (ST)-17 and made comparisons to the wild type. Sequence variation in the rgf operon was detected in 40 clinical strains and a non-synonymous single nucleotide polymorphism was detected in rgfD in all of the ST-17 genomes that resulted in a truncation. Using qPCR, expression of rgf operon genes was significantly decreased in the ST-17 ΔrgfD mutant during exponential growth with the biggest difference (3.3-fold) occurring at higher cell densities. Association with decidualized endometrial cells was decreased 1.3-fold in the mutant relative to the wild type and rgfC expression was reduced 22-fold in ΔrgfD following exposure to the endometrial cells. Collectively, these data suggest that this putative quorum sensing molecule is important for attachment to human tissues and demonstrate a role for RgfD in GBS pathogenesis through regulation of rgfC.


Introduction
Streptococcus agalactiae, or group B Streptococcus (GBS), resides as a commensal in the gastrointestinal and/or urogenital tracts in up to 40% of healthy men and women but is an opportunistic pathogen presenting a threat to newborns, pregnant women, the chronically ill, and the elderly [1]. In neonates, GBS is a leading cause of meningitis and sepsis. Although there has been a reduction in the incidence of neonatal early onset disease (EOD) over the past 30 years [2], GBS is still a major concern in both industrialized and developing nations, and there remain significant gaps in our understanding of the molecular mechanisms of pathogenesis. The identification of features that allow one GBS strain to become more invasive than another is incomplete. Several studies utilizing multilocus sequence typing (MLST), a method targeting seven conserved housekeeping genes [3], have shown that most strains belong to one of four clonal complexes (CCs): 1, 17, 19, and 23. Strains belonging to CC-17, however, have been shown to cause an increased frequency of neonatal at different times. Three serotype III, CC-17 strains (GB00451, GB00546, and GB00097) were used to quantify rgf transcription by growth phase. Mutagenesis was performed in GB00451 (ST-17) and GB00012 (ST-1).
To examine sequence variation, additional rgf operon sequences were extracted from 40 draft genomes sequenced by the J. Craig Venter Institute (Table 1) using the Basic Local Alignment Search Tool (BLAST) available from the National Center for Biotechnology Information (NCBI) with strain O90R as the rgf reference sequence (AF390107.1) [22]. All base locations in the 40 genomes are named relative to the 3320 bp O90R sequence, which begins 94 bp prior to the start of rgfB in the rgf operon. Multiple alignments were performed using the ClustalW algorithm in MegAlign, and a neighbor-joining phylogeny based on p-distance was generated using MEGA6 with bootstrapping [31]. The 40 clinical strains, which were recovered from colonized mothers or young adults, were previously characterized by MLST [7,32]. Although biofilm production was assessed previously using OD595 values ≥1.8 as the cutoff for strong biofilms [33], the relationship between biofilm level and rgf sequence variation was not examined in that work and was therefore evaluated in this study.
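As an illustration of the p-distance measure used for the phylogeny, the following minimal Python sketch computes the proportion of differing sites between two aligned sequences. It is not the MEGA6 implementation: the function name, the toy sequences, and the choice to skip gapped sites (pairwise deletion) are assumptions made only for this example.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped sites are skipped (pairwise deletion)
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 14 bp aligned fragments that differ at a single site.
reference = "ATGAAACGTTTAGC"
variant = "ATGAAACGTTTTGC"
print(p_distance(reference, variant))
```

How gapped sites are treated matters for operons carrying a large deletion, such as the 881 bp rgfC deletion, so the gap-handling option chosen in the phylogenetics software should be checked rather than inferred from this sketch.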
of the plasmid was stimulated by growth at 28 °C without antibiotics in broth for six generations followed by dilution and plating. Single colonies were tested for erythromycin susceptibility to ensure plasmid loss and PCR was performed using primers rgfD_del 5 and 6 to identify a mutant with a gene deletion (GB00451∆rgfD). Complementation of rgfD was completed using the pLZ12 plasmid with a constitutive rofA promoter sequence regulating transcription [36]. For construction, rgfD was amplified from GB00012 with Plz:rgfD F and R, digested with PstI and BamHI enzymes, and ligated into the pLZ12 plasmid. The constructed plasmid was transformed into the DH5α MAX Efficiency Chemically-Competent Cells by Invitrogen™ (Thermo Fisher Scientific, Waltham, MA, USA) and chloramphenicol resistant transformants were identified. The plasmid was extracted and electroporated into GB00451∆rgfD competent cells, and transformants were selected for growth on THA and chloramphenicol (3 µg/mL).

Association Assays
Telomerase-immortalized human endometrial stromal cells (T-HESCs) were decidualized and grown to approximately 50% confluence followed by treatment with 0.5 mM 8-bromo-cyclic adenosine monophosphate (Sigma-Aldrich, St. Louis, MO, USA) for 3-6 days as described [37]. Decidualization was confirmed by examining the expression of prolactin and insulin-like growth factor-binding protein 1. Assays were performed in triplicate at least three times when cells reached 100% confluence. GBS was washed with phosphate-buffered saline (PBS) and resuspended in infection medium (HESC medium with 2% charcoal-treated fetal bovine serum, insulin, human transferrin, and selenous acid without antibiotics) following overnight growth in THB. Host cells were washed three times with PBS and infected with GBS at a multiplicity of infection (MOI) of one bacterial cell per host cell. After 2 h at 37 °C with 5% CO2, samples were taken, diluted, and plated to quantify bacteria (CFU/mL). Each well was washed three times with PBS to remove non-adherent bacteria, and host cells were lysed with 0.1% Triton X-100 (Sigma-Aldrich) for 30 min at 37 °C and mixed to liberate intracellular bacteria. After serial dilution, lysates were plated on THA, incubated overnight at 37 °C, and quantified (CFU/mL). All data were expressed as percentages of the total number of bacteria per well after 2 h.
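The normalization in the last sentence can be sketched as follows. This is a minimal illustration only: the function names and the CFU/mL values are hypothetical and are not data from this study.

```python
def percent_association(associated_cfu_per_ml: float, total_cfu_per_ml: float) -> float:
    """Associated bacteria as a percentage of all bacteria in the well at 2 h."""
    if total_cfu_per_ml <= 0:
        raise ValueError("total CFU/mL must be positive")
    return 100.0 * associated_cfu_per_ml / total_cfu_per_ml


def fold_decrease(reference_percent: float, test_percent: float) -> float:
    """Fold decrease of a test strain relative to a reference strain."""
    if test_percent <= 0:
        raise ValueError("test percentage must be positive")
    return reference_percent / test_percent


# Hypothetical counts: CFU/mL recovered from lysates vs. CFU/mL in the well at 2 h.
wt = percent_association(associated_cfu_per_ml=1.1e4, total_cfu_per_ml=2.0e6)
mutant = percent_association(associated_cfu_per_ml=8.0e3, total_cfu_per_ml=2.0e6)
print(wt, mutant, fold_decrease(wt, mutant))
```

Dividing by the total count in each well after 2 h, rather than by the inoculum, corrects for any difference in growth between strains during the incubation.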

RNA Extraction, Preparation, and Quantitation
RNA was extracted, cDNA was synthesized and transcripts were quantified as previously described [37]. For collection, samples were added to two volumes of RNA Protect (Qiagen, Germantown, MD, USA) and pelleted followed by RNA extraction using the RNeasy Kit (Qiagen). DNA was removed with TURBO™ DNase (Thermo Fisher Scientific) and purified RNA was quantified. For samples exposed to host cells, total RNA was precipitated following TURBO™ DNase treatment and bacterial RNA was separated using the MICROBEnrich™ Kit by Ambion (Thermo Fisher Scientific). Following purification, 1 µg of RNA was used for reverse-transcription with the iScript Reverse Transcription Kit (Bio-Rad), while the iQ SYBR Supermix (Bio-Rad) was used for quantitative RT-PCR (qRT-PCR) in 15 µL reactions with 10 µM (each) of gene-specific primers (Table 2). Products were amplified and quantified using a CFX384 Touch™ Real-Time PCR detection system (Bio-Rad) under the following conditions: 1 cycle of 3 min at 95 °C and 39 cycles of 95 °C for 10 s and 60 °C for 30 s. Relative transcript quantities were calculated using the comparative threshold cycle (CT) method (2^−∆CT) [38] with gyrA as the internal control gene.
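The 2^−∆CT calculation can be sketched in a few lines of Python. The CT values below are hypothetical, and averaging technical replicates before taking the difference is an assumption made for this example rather than a detail stated in the methods.

```python
from statistics import mean


def relative_quantity(ct_target: float, ct_reference: float) -> float:
    """Relative transcript quantity by the 2^-dCT method (dCT = CT target - CT reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical technical triplicates for the target (rgfC) and the reference (gyrA).
ct_rgfc = [24.1, 23.9, 24.0]
ct_gyra = [21.0, 21.1, 20.9]

quantity = relative_quantity(mean(ct_rgfc), mean(ct_gyra))
print(quantity)
```

Because each cycle corresponds to an approximate doubling of product, a target that crosses the threshold three cycles after gyrA has a relative quantity of about 2^−3; the fold difference between two strains is then the ratio of their relative quantities.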

Statistical Analysis
Data shown were either pooled from or were representative of at least three independent experiments performed in triplicate. The t-test was used to compare differences in expression levels across groups of strains, while the paired ratio t-test was used to compare percent association to host cells. The likelihood Chi-square test was used to examine differences in categorical variables. Analyses were performed in GraphPad Prism (version 6.0; GraphPad Software, Inc., La Jolla, CA, USA) and Epi Info™ (CDC, Atlanta, GA, USA). p ≤ 0.05 was considered significant.

Allelic Variation in rgf among Diverse GBS Lineages
Because sequence variation within the rgf operon has been observed [23,39], we compared the O90R rgf reference sequence [22] to 40 rgf sequences from clinical strains representing 14 STs. In all, 39 strains were classified as belonging to five CCs including CC-1 (n = 10), CC-12 (n = 2), CC-17 (n = 7), CC-19 (n = 10), and CC-23 (n = 10); two strains were singletons. Phylogenetic analysis of the complete 3320 bp rgf operon extracted from NCBI resulted in two rgf clusters (Figure 1), which differed based on the presence of an 881 bp deletion within rgfC at position 2328 as well as multiple single nucleotide polymorphisms (SNPs) within both rgfA and rgfC. A total of 21 (52.5%) strains contained the complete rgf operon with an intact rgfC, while the remaining 19 (47.5%) strains contained the 881 bp rgfC deletion.

Figure 1. The evolutionary distances between rgf operon sequences (3320 bp) for 41 strains of different STs were calculated using the p-distance method, which is represented as the number of base differences per site. The bootstrap test (1000 replicates) values are represented at the nodes. The rgfC sequence, which was classified as complete or with an 881 bp deletion, contributed to the clustering observed in the phylogeny. S = singleton.
When stratified by ST, strains of the same ST were more likely to cluster together. The 20 ST-19 and ST-23 strains, for example, clustered together on the tree depending on whether they harbored the complete (n = 4) or deleted (n = 16) version of rgfC. The same was true for CC-1 strains, though one ST-1 strain had the rgfC deletion and clustered separately from the others. By contrast, strains belonging to ST-17 were homogeneous with only three detectable SNPs among all seven ST-17 genomes. These SNPs were located within rgfD (T1115G), rgfA (A1848T), and rgfC (G2338A) and each mutation was exclusive to one of the three different ST-17 strains. Relative to the other STs, two unique non-synonymous SNPs were detected in all of the ST-17 strains. The first SNP (C246A) is located in rgfB and the second (A1131T) is located 54 bp into rgfD. Importantly, the rgfD SNP results in a truncated coding sequence after 17 amino acids due to the introduction of a stop codon. This finding suggests that rgfD may function differently in ST-17 strains relative to strains belonging to other lineages. Mutations within rgfD also resulted in the separate clustering of a ST-12 and ST-19 strain with a complete rgfC near the bottom of the top branch of the phylogeny. Both strains had three unique SNPs, G1044A, C1048T, and A1054G, located 21 bp, 25 bp, and 31 bp into rgfD, respectively, as well as an additional SNP (A3018T) in rgfC that was shared only with the seven ST-17 strains. Only C1048T and A1054G in rgfD represent non-synonymous mutations.

Association between rgf Variation and Biofilm Production
Since allelic variation in the agr system has previously been related to biofilm production in S. aureus [40], we assessed the importance of rgf allelic variation on biofilm phenotypes. Of the 40 clinical strains examined, 13 (32.5%) were previously classified as strong biofilm producers and 27 (67.5%) were weak. Those strains possessing a complete rgfC were not more likely to produce a strong biofilm relative to the strains containing the rgfC deletion (p = 0.83). Among the 21 strains with a complete rgfC, 28.6% (n = 6) were strong biofilm producers relative to 36.8% (n = 7) of strains with the rgfC deletion. It is notable that all but one of the seven CC-17 strains containing the rgfD truncation were classified as weak biofilm producers. Although the strains containing a complete rgfD were 3.4 times more likely to form a strong biofilm, the association was not statistically significant (95% confidence interval: 0.42, 85.59; Fisher's exact p = 0.39), which may be due to the small sample size.
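The odds ratio and Fisher's exact test for the rgfD comparison can be checked with a short standard-library sketch. The 2 x 2 counts are derived from the numbers in the text (13 strong and 27 weak producers overall; one strong and six weak among the seven CC-17 strains with the truncation), under the assumption that the remaining 33 strains are all counted as having a complete rgfD. The function names are illustrative, and the confidence interval is not reproduced here.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value: sum of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Rows: complete rgfD, truncated rgfD; columns: strong, weak biofilm (derived counts).
a, b, c, d = 12, 21, 1, 6
print(round(odds_ratio(a, b, c, d), 2), round(fisher_exact_two_sided(a, b, c, d), 2))
```

With these derived counts, the sketch gives values close to the reported odds ratio of 3.4 and Fisher's exact p of 0.39, which supports the reading of the table used here.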

rgfD-Dependent Expression of the rgf Operon
Since quorum-sensing controlled systems are characterized by increased expression when the extracellular inducer reaches a specific concentration, we sought to quantify rgf expression in a subset of GBS strains. The rgf operon was previously shown to be transcribed polycistronically [22]; therefore, we examined expression of rgfC, the gene encoding the sensor histidine kinase (RgfC), in three ST-17 clinical strains over time. No difference in relative rgfC transcript quantity was observed among the three clinical strains at any of the time points. It is important to note that all three strains contained complete rgf operons with a complete rgfC and genetically identical rgfB and rgfD genes. Next, we deleted rgfD in one of the three CC-17 strains, GB00451 (wild type; WT), in order to compare rgfC expression along the growth curve to the same strain lacking rgfD. Samples were also subcultured to ensure that the bacterial densities were similar at each OD595 value; no difference in colony forming units (CFU) was observed between the WT and GB00451∆rgfD mutant (data not shown). Significantly reduced relative rgfC transcript quantity was observed at all growth points in the GB00451∆rgfD mutant relative to the WT (Figure 2). The largest difference was observed in lag phase (OD595 = ~0.2) with relative expression values of 0.16 ± 0.03 and 0.02 ± 0.01 for WT and mutant, respectively. In early log phase (OD595 = ~0.4), rgfC expression was reduced from 0.13 ± 0.04 in the WT to 0.06 ± 0.01 in the mutant (p = 0.04), and at mid-log phase (OD595 = ~0.6), expression values of 0.13 ± 0.05 and 0.05 ± 0.008 were observed for the WT and mutant, respectively (p = 0.05). At late log phase, the WT had a significantly higher level of rgfC expression (0.17 ± 0.02) compared to the GB00451∆rgfD mutant (0.08 ± 0.03; p = 0.01).
Although expression of rgfC was highest for both strains at stationary phase, the level of expression in the mutant (0.44 ± 0.06) was still significantly lower than in the WT (0.72 ± 0.12; p = 0.03). Complementation of GB00451∆rgfD with the pLZ12 plasmid containing the truncated version of rgfD from GB00451 (WT) was not capable of restoring rgfC expression. To determine whether this result was partly due to the rgfD mutation in the WT strain, complementation with pLZ12-rgfD from GB00012, an ST-1 strain lacking the rgfD truncation, was performed. Importantly, complementation of GB00451∆rgfD with pLZ12-rgfD from GB00012 (Figure 3) was sufficient to restore relative expression of rgfC to 0.15 ± 0.02 at OD595 = 0.4 compared to the empty vector control (0.08 ± 0.1; t-test p < 0.01).
To determine whether RgfD alters expression of other genes that were suggested to be regulated by the rgf operon, the WT and GB00451∆rgfD mutant were examined for expression changes in the gene encoding the fibrinogen binding surface protein (fbsB), which is activated by rgfA/C [23]. Notably, relative fbsB transcript quantity was similar for the WT (0.010 ± 0.003) and GB00451∆rgfD mutant (0.008 ± 0.003) at OD595 = 0.4 as well as OD595 = 0.6 (0.012 ± 0.005 for the WT versus 0.012 ± 0.006 for GB00451∆rgfD). Expression levels in stationary phase (OD595 = 0.8) were slightly more variable, though there was still no significant difference between the WT (0.0007 ± 0.0004) and mutant (0.0009 ± 0.0008).

Role of rgfD in Association with Host Cells and Biofilm Production
Since the rgf operon has been shown to promote binding to host cell components like fibrinogen [23], the ability to associate with T-HESCs was investigated. Interestingly, the GB00451∆rgfD mutant had an average 1.3-fold decrease in the ability to associate with the decidualized endometrial cells compared to the WT. Association with T-HESCs was 0.40% ± 0.03% for GB00451∆rgfD compared to 0.55% ± 0.07% for the WT (ratio t-test p < 0.03) (Figure 4). The empty vector control had an average 1.6-fold reduction in the level of association with T-HESCs compared to GB00451∆rgfD complemented with pLZ12 containing rgfD from GB00012. The association level for the complemented mutant was 0.53% ± 0.11% versus 0.37% ± 0.07% for the empty vector (ratio t-test p = 0.002). It is important to note that even though the trend remained consistent across biological replicates, association percentages varied between experiments. When biofilms were examined, no difference in biofilm production was observed between the WT, GB00451∆rgfD mutant, both complemented mutants, or empty vector control.

Figure 4. rgfD plays a role in association with decidualized endometrial stromal cells. Association percentages for GB00451 (WT) and GB00451∆rgfD with telomerase-immortalized human endometrial stromal cells (T-HESCs) are shown as well as the percentages for GB00451∆rgfD complemented with pLZ12 containing rgfD from GB00012 (pLZ12-GB12rgfD) and the empty vector (pLZ12 only). The histogram represents a single biological replicate with three technical replicates and error bars representing the standard deviation between technical replicates; the assay was performed four times in triplicate with identical trends per assay. * paired ratio t-test p-value < 0.05.
Because the association assays were performed under different conditions from the rgfC expression analysis, we also sought to compare rgfC expression in the WT, rgfD mutant, complemented rgfD mutant, and empty vector control to determine whether differential regulation of the operon was detectable following host cell exposure. Notably, a 22.8-fold reduction in rgfC expression was observed in the GB00451∆rgfD mutant compared to the WT following a 2 h exposure to decidualized T-HESCs; relative transcript levels were 0.0019 ± 0.0014 and 0.043 ± 0.019, respectively (Figure 5). No difference in rgfC expression was observed, however, between the complemented and empty vector controls with relative transcription values of 0.029 ± 0.01 and 0.031 ± 0.01, respectively, following exposure to T-HESCs.

Figure 5. rgfC is upregulated by rgfD following exposure to decidualized endometrial stromal cells. Comparison of the relative rgfC transcript quantity between the GB00451 wild-type (WT) and GB00451∆rgfD mutant following 2 h exposure to telomerase-immortalized human endometrial stromal cells (T-HESCs). Bars represent the standard deviation of three biological replicates. * t-test p-value < 0.05.

Discussion
Because the rgf operon was found to facilitate binding to host cell components and impact virulence in vivo [22,23,29], we sought to better understand the role of the putative autoinducing peptide, RgfD, in phenotypes important for colonization. Similar to prior studies [29,39], we have demonstrated rgf-dependent expression of the rgf operon and have identified genetic variation in the rgf operon genes among a diverse set of GBS strains. The large 881 bp deletion within rgfC is notable given that it was present in almost half of the 40 strains recovered from women with asymptomatic GBS colonization. Since these strains represented multiple STs and were collected from patient populations in different geographic locations and time periods, the presence of this mutation suggests parallel evolution in the rgfC locus. Evidence for gene loss as well as lateral transfer and gene duplication has been described for genes involved in other quorum sensing systems (e.g., Pseudomonas [41]).
The identification of a point mutation within rgfD that was exclusive to the seven clinical strains belonging to CC-17, the lineage most commonly associated with neonatal disease [4], is also noteworthy. The A1131T mutation introduces a premature stop codon within the rgfD open reading frame, resulting in a truncated protein. Because of this mutation, it is possible that rgfD functions differently in ST-17 strains versus strains of other lineages with a complete rgfD. In S. aureus, sequence variation has been observed within the auto-inducing peptide gene (agrD) and was shown to influence activation of the two-component system [42,43]. The AgrD peptide was also found to be post-translationally modified and reduced to a functional eight amino acid peptide [44]. In all phases of growth, we found that the ∆rgfD mutant, which lacks the truncated rgfD, had decreased expression of rgfC, which encodes the sensor histidine kinase and represents the last of the four genes in the polycistronically transcribed operon [22]. These data suggest that this truncated RgfD protein is functional in this CC-17 strain and likely serves as an auto-inducing peptide in conjunction with other factors. This hypothesis is in agreement with agr regulation in S. aureus in which there are several factors affecting expression besides AgrD [26,27]. Furthermore, maximum rgfC expression was observed as the cells entered stationary phase in both the WT and the GB00451∆rgfD mutant, suggesting that other factors can impact transcription of this operon even in the absence of a functional version of RgfD. In the future, additional studies should focus on clarifying the specific regulatory role of rgfD during each growth phase and identifying other factors that contribute to transcription. It is important to note that quorum quenching has also been described to occur following entry into stationary phase in several bacterial species [45,46].
Since complementation with the truncated rgfD from the WT did not restore rgfC expression relative to complementation with a complete rgfD from a different strain (genotype), we further hypothesize that rgfB may be needed for rgfD processing in CC-17 strains containing a truncated RgfD protein. Indeed, the agrBD complex was found to be responsible for activation of agr in S. aureus [42] and hence, future work is required to determine whether extrachromosomal transcription of rgfBD from a CC-17 strain can restore rgfC activity and if certain mutations within rgf can result in altered protein function. Little is known about the structure of the mature RgfD peptide in GBS and virtually nothing is known about how that structure varies across diverse strain types with different rgfD alleles.
The ∆rgfD mutant also had a significant decrease in rgfC expression following exposure to decidualized T-HESCs and in its ability to associate with the T-HESCs; the latter could be restored following complementation with a complete rgfD from the GB00012 strain. Although the decrease in association was modest and the biological relevance is not clear, it was consistently observed and statistically significant. Because we have previously shown that GBS strains of different genetic backgrounds vary in their ability to attach to A549 lung epithelial cells and T-HESCs [37], it is possible that the rgf operon plays a role in regulating distinct adherence factors. These factors may be needed to colonize different tissues inside the host and likely vary across the GBS genotypes. Because rgf was previously found to activate fbsB, the gene encoding one of two fibrinogen binding proteins [23], it is possible that the reduction in host cell association in the CC-17 ∆rgfD mutant is due to a decreased ability to bind fibrinogen or other cell components owing to the lack of rgf activation. Similar findings were observed when both rgfA and rgfC were interrupted in a prior study [23]. Because our prior study demonstrated that only a small fraction (<1%) of associated bacteria invaded host cells [37], however, the association reduction observed in the GB00451∆rgfD mutant cannot be explained solely by fbsB activation or through reduced invasion. Support for this hypothesis also comes from the observation that fbsB expression was not significantly different in the WT and GB00451∆rgfD mutant across the growth phase despite the observed differences in relative transcript levels of rgfC. Nonetheless, it is also possible that host cell exposure does not represent the optimal conditions for rgf activation given the higher level of rgfC expression that we observed during growth at OD595 = 0.4.
Another possibility affecting host cell association is that the CC-17 strains containing the truncated rgfD are less likely to form biofilms due to altered activation of genes important for adherence. Reduced biofilm production has been demonstrated with deletion of the agr operon in S. aureus [47] and our prior study of biofilms in 293 GBS strains showed that strains belonging to ST-17, which more commonly possessed the truncated rgfD in this study, were significantly more likely to form weak biofilms relative to strains from other lineages [33]. Because we also observed an association with weak biofilm production in ST-19 strains, a more comprehensive comparative genomics analysis is warranted. In the present study, most of the ST-19 strains possessed the large deletion within rgfC and hence, it is possible that altered transcription of rgfC combined with a complete rgfD can also impact biofilms. Although there was no association between the rgfC deletion and biofilm production overall, it is notable that only one of the nine CC-19 strains formed a strong biofilm and possessed a complete rgfC; the remaining eight CC-19 strains had the rgfC deletion and formed weak biofilms. Further testing of the different rgf mutants is therefore warranted, particularly those clinical strains with natural mutations, which can enhance understanding of the relationship between sequence variation, rgf activation and colonization.
The only other verified quorum sensing system in GBS involves RovS, an Rgg-type transcriptional regulator, and its activator, a short hydrophobic peptide (SHP); this system has been found in many species of Streptococcus [48,49]. Similar to the rgf system, SHP is post-translationally modified by one or more peptidases and secreted extracellularly [50,51]. Rather than indirectly affecting downstream gene regulation through extracellular recognition, however, the SHP interacts directly with RovS following importation by the Ami oligopeptide transporter [49]. Similar to the rgf operon, this system is autoregulating and affects expression of fibrinogen-binding proteins and host-cell attachment [51]; hence, differential expression of the RovS system could have an impact on our findings and warrants further investigation. Interestingly, the SHP has been linked to persistence [51], while inactivation of rgfC has specifically been shown to induce a disseminating and invasive phenotype [29,39]. Because the SHP system has also been shown to function differently in different media [51], further work should also focus on identifying the optimal conditions for rgf expression and rgf-associated regulatory networks, particularly during the course of an infection. Additional studies that aim to isolate the mature RgfD peptide from cell-free culture supernatants and assess its impact on rgf expression over time are also needed.

Conclusions
Because GBS disease progression can involve transcriptional remodeling in response to changing host environments, quorum sensing offers a potential explanation for the variation in pathogenicity that has been observed between strains belonging to different phylogenetic lineages. Although quorum sensing has been demonstrated to affect pathogenesis for many bacterial species, there are few studies specific to GBS. The work described herein adds to the knowledge of quorum sensing systems in GBS and better defines the role of rgfD, the gene encoding the putative auto-inducing peptide, as a regulator of rgfC and stimulus for host-cell association.

Acknowledgments:
The authors wish to thank Dr. Melody Neely for sharing the pLZ12 plasmid. We would also like to thank Pallavi Singh and Michelle Korir for productive scientific conversations aiding this project. This study was supported in part by the Global Alliance to Prevent Prematurity and Stillbirth (GAPPS) in collaboration with the Bill and Melinda Gates Foundation (project N015615, SDM), while salary support was provided by the USDA NIFA (grant #2011-67005-30004, SDM). Graduate student support was provided by the Thomas S. Whittam Graduate Fellowship, the Rudolph Hugh Graduate Fellowship and the Graduate School at Michigan State University.

Conflicts of Interest:
The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.