Vaginal Microbial Network Analysis Reveals Novel Taxa Relationships among Adolescent and Young Women with Incident Sexually Transmitted Infection Compared with Those Remaining Persistently Negative over a 30-Month Period

A non-optimal vaginal microbiome (VMB) is typically diverse with a paucity of Lactobacillus crispatus and is often associated with bacterial vaginosis (BV) and sexually transmitted infections (STIs). Although the composition of the VMB is well characterized, especially for BV, knowledge remains limited on how different groups of bacteria relate to incident STIs, especially among adolescents. In this study, we compared the VMB (measured via 16S ribosomal RNA gene amplicon sequencing) of Kenyan secondary school girls with incident STIs (composite of chlamydia, gonorrhea, and trichomoniasis) to those who remained persistently negative for STIs and BV over 30 months of follow-up. We applied microbial network analysis to identify key taxa (i.e., those with the greatest connectedness in terms of linkages to other taxa), as measured by betweenness and eigenvector centralities, and sub-groups of clustered taxa. VMB networks of those who remained persistently negative reflected greater connectedness compared to the VMB from participants with STI. Taxa with the highest centralities were not correlated with relative abundance and differed between those with and without STI. Subject-level analyses indicated that sociodemographic (e.g., age and socioeconomic status) and behavioral (e.g., sexual activity) factors contribute to microbial network structure and may be of relevance when designing interventions to improve VMB health.


Introduction
Worldwide, bacterial vaginosis (BV) is the most common cause of vaginal discharge, affecting 23-29% of women in the general population [1]. This is of clinical and public health importance as BV increases the risk of HIV acquisition and is estimated to account for up to 15% of HIV infections in women [2]. BV is also associated with an increased risk of preterm birth and miscarriage [3,4] and increased prevalence and incidence of sexually transmitted infections (STIs) [5][6][7]. The most common curable STIs, Chlamydia trachomatis (CT), Neisseria gonorrhoeae (NG), and Trichomonas vaginalis (TV), disproportionately affect adolescent and young women [8], also contributing to increased risk of HIV acquisition and

Study Design, Participants, and Sample Size
This study uses data from the Cups and Community Health (CaCHe, pronounced "Cash-Ay") study, a prospective cohort study of adolescent girls and young women, which started when they were attending secondary school in Siaya County, western Kenya. Eligibility criteria and details of the study design have been previously reported [12]. Briefly, eligibility in the CaCHe study included attendance at a selected school, being a resident of the study area, provision of assent and parental/guardian consent, and report of established menses (having occurred at least 3 times). Participants who reported being pregnant at baseline screening were excluded. As previously reported in detail, the a priori-determined sample size was calculated to detect a 25% relative difference in cumulative prevalence of BV occurring over 30 months of follow-up, in a design of 6 repeated measurements [12]. After baseline assessment (May through June 2018), planned study visits took place at 6, 12, 18, and 30 months. The 24-month study visit was scheduled to take place in May 2020 and was missed due to the COVID-19 pandemic.

Detection of Bacterial vaginosis and Sexually Transmitted Infections
Participants were asked to take self-collected vaginal swabs at baseline and at each follow-up visit. At baseline, 12-, and 30-month study visits, four vaginal swabs were collected for the assessment of VMB, BV, and STIs, while at the 6- and 18-month visits, two vaginal swabs were collected for the assessment of VMB and BV. The first swab obtained was for 16S rRNA gene amplicon sequencing (VMB) using the OMNIgene Vaginal kits (OMR-130; DNA Genotek™). Air-dried smears from the second swab were Gram-stained and assessed for BV according to Nugent's criteria, with a score of 7-10 defined as BV [13]. A third vaginal swab was used for the detection of Chlamydia trachomatis (CT) and Neisseria gonorrhoeae (NG) using the GeneXpert (Cepheid, Sunnyvale, CA, USA). A fourth swab was used for the detection of Trichomonas vaginalis (TV) using the OSOM TV antigen detection assay (Sekisui, Lexington, MA, USA). STIs (CT, NG, and TV) were treated following Kenyan National guidelines [14], and BV was treated with 2 g of tinidazole once daily for two days [15][16][17]. Treatment was documented for more than 95% of infections detected at each study visit [18].

Data Collection
Sociodemographic data and behavioral practices were collected via a self-completed tablet-based survey in the participant's language of choice (English or DhoLuo), with assistance from study staff if needed. Socioeconomic status (SES) was assessed using abridged questions from the KEMRI health and demographic surveillance system (HDSS) household survey [19] and dichotomized as lower quintiles (1-2) and higher quintiles (3-5). At the school level, water, sanitation, and hygiene (WASH) scores ranged from 0 to 3, with 3 being the highest; a score of 3 reflected available water for handwashing, soap, and an acceptable ratio of girls to acceptable latrines (i.e., those considered in adequate condition for use) [20], which was dichotomized into 0-1 and 2-3 for the analysis. In addition to being asked about sexual activity, participants were also asked if they were forced or threatened to have sex (referred to as coerced sex) and whether they engaged in transactional sex (defined as having sex in exchange for money, items, or favors).

Characterization of Vaginal Microbiome
DNA extraction, library preparation, and sequencing were performed by the Genome Research Core (GRC) at the University of Illinois Chicago. DNA extraction and PCR-based library preparation of bacterial 16S rRNA gene amplicons were performed as described previously [12]. Briefly, libraries were prepared using a two-stage PCR protocol targeting the V3-V4 variable regions of bacterial 16S rRNA genes [21] and sequenced on an Illumina MiSeq instrument using a V3 kit (600 cycle chemistry). Forward and reverse reads were merged using the software package PEAR [22]. The quality- and primer-trimmed sequence data were then processed using a standard bioinformatics pipeline for chimera removal, annotation, and CST typing; this processing was conducted by the University of Maryland Institute for Genomic Science [23]. Subsequently, a biological observation matrix (BIOM) was generated at the lowest taxonomic level identifiable. Vaginal community state types (CSTs) were identified in a reference dataset using the nearest centroid classification (VAginaL community state typE Nearest CentroId clAssifier; VALENCIA) [24]. Putative contaminants were identified and removed using the decontam package in R (version 4.1.3) [25]. Raw sequence data (FASTQ files) were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA), under BioProject identifier PRJNA746243.

Construction of Analytic Data Set
Incident STI was defined as a positive test result for CT, NG, or TV, occurring at the 12- or 30-month visit, preceded by negative STI results. To minimize potential confounding from prior infection and antibiotic treatment, we used only the first incident STI and then stratified the incident STI into three categories: those occurring in the absence of BV, preceded by BV, or simultaneously occurring with BV. If a participant tested positive for an STI at 12 or at 30 months, the analysis utilized VMB data observed at the 12- or 30-month visit, respectively, rather than at prior visits given the long period of time between visits. However, we used all study visits (baseline, 6, 12, 18, and 30 months) to determine whether participants were persistently negative for BV and STI, or whether STI was preceded by BV (Supplementary Table S1). After identifying observations for the analytic sample (n = 180 persistently negative, n = 41 incident STI in the absence of BV, n = 20 incident STIs preceded by BV, and n = 14 incident STI with co-incident BV), there were 3 observations with fewer than 5000 total sequence reads which were not included in analyses (2 observations with STI not preceded by or simultaneous with BV and 1 observation from a participant persistently negative for BV and STI). Data points from those testing negative for BV and STI throughout observation were selected with simple random sampling stratified by time point to match the time point of incident STI (Supplementary Table S1). Since this analysis sought to gain insights into individuals with incident STIs, we did not conduct analyses on individuals with prevalent or incident BV in the absence of incident STIs, and our analyses of prevalent BV and/or STIs have been reported [12].
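The outcome definitions above can be illustrated with a minimal sketch. It is not the authors' code: the function name `classify_participant`, the dictionary inputs, and the handling of ambiguous cases (complete data assumed; any BV before the STI visit takes priority over BV at the STI visit) are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

def classify_participant(
    sti: Dict[int, bool], bv: Dict[int, bool]
) -> Optional[Tuple[str, Optional[int]]]:
    """Assign an outcome state from visit-level results (hypothetical helper).

    sti: visit month -> True if positive for CT, NG, or TV (months 0, 12, 30)
    bv:  visit month -> True if Nugent score 7-10 (months 0, 6, 12, 18, 30)
    Returns (category, month of VMB sample used) or None if not analysed.
    """
    # First STI-positive visit in chronological order; all earlier STI
    # results are therefore negative by construction.
    first_pos = next((m for m in sorted(sti) if sti[m]), None)
    if first_pos is None:
        if not any(bv.values()):
            return ("persistently negative", None)
        return None  # BV without incident STI: not analysed in this study
    if first_pos == 0:
        return None  # positive at baseline is prevalent, not incident
    # Assumption: BV at any earlier visit defines "BV before STI".
    if any(pos for month, pos in bv.items() if month < first_pos):
        return ("BV before STI", first_pos)
    if bv.get(first_pos, False):
        return ("STI with co-incident BV", first_pos)
    return ("STI with no BV", first_pos)
```

For a participant who is STI-negative at baseline and 12 months, STI-positive at 30 months, and BV-positive only at 6 months, the sketch returns `("BV before STI", 30)`, so the 30-month VMB sample would be the one analysed.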

Statistical Analysis
The analysis took place in two steps: (1) microbial co-occurrence network analysis and (2) inferential analysis of participant-level microbial network characteristics in relation to participant-level demographics and behaviors, detailed below.
(1) Microbial Co-Occurrence Network Analysis: We conducted undirected network analysis of taxa to identify nodes and connections that differ between the outcome states: (1) 179 individuals who were persistently negative for BV and STI over 30 months of observation; (2) 39 individuals with incident STI in the absence of BV (i.e., there was no observation of BV preceding or coincident with STI; referred to as "STI with no BV"); (3) 20 individuals with incident STI that was preceded by BV (referred to as "BV before STI"). It has been recommended that a minimal sample of 20-25 observations is used in microbial network co-occurrence analysis [26]; therefore, networks were not constructed or analyzed for the 14 individuals who experienced incident STI and incident BV at the same time. Their demographic and behavioral characteristics, baseline CST, and CST at the time of infection are presented in Supplementary Table S2.
Undirected microbial co-occurrence networks were constructed separately for the three outcome states using the NetCoMi package in R [27], implementing SPRING (semiparametric rank-based approach for inference in the graphical model) for the association measure. SPRING was selected for its advantages in estimating sparse microbial association networks, robustness to misspecification of total cell count estimate, and reliability of network metrics [28]. Prior to relative abundance estimation, data were filtered to retain taxa that contributed at least 0.01% of the total sequence reads and were present in at least 5% of observations in the analysis, resulting in the selection of 54 taxa. Within SPRING, a modified centered log-ratio (mclr) transformation was applied to address zero counts and compositionality. The number of lambda values was set to 100, with 100 repetitions. Using the netAnalyze function of NetCoMi, we report normalized centralities. Clusters of taxa within the network were generated using greedy modularity optimization (cluster_fast_greedy in igraph) [29], which optimizes the modularity score [30].
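As an illustration of the preprocessing described above, the following sketch applies the two filtering thresholds and an mclr transformation to a toy count table. It is not the NetCoMi or SPRING implementation: the function names, the dictionary layout, and the shift constant `c = 1.0` are assumptions. The mclr form used here log-transforms only the non-zero counts, centers them on their own mean, and shifts them to be strictly positive, leaving zeros at zero.

```python
import math
from typing import Dict, List

def filter_taxa(counts: Dict[str, List[int]],
                min_read_share: float = 0.0001,   # 0.01% of all reads
                min_prevalence: float = 0.05) -> Dict[str, List[int]]:
    """Keep taxa meeting both thresholds; counts maps taxon -> reads per observation."""
    n_obs = len(next(iter(counts.values())))
    total_reads = sum(sum(reads) for reads in counts.values())
    kept = {}
    for taxon, reads in counts.items():
        read_share = sum(reads) / total_reads
        prevalence = sum(1 for r in reads if r > 0) / n_obs
        if read_share >= min_read_share and prevalence >= min_prevalence:
            kept[taxon] = reads
    return kept

def mclr(sample: List[float], c: float = 1.0) -> List[float]:
    """Modified centered log-ratio of one sample (c is an assumed shift constant)."""
    logs = [math.log(x) for x in sample if x > 0]
    if not logs:
        return [0.0] * len(sample)
    mean_log = sum(logs) / len(logs)           # log of geometric mean of non-zeros
    eps = abs(min(l - mean_log for l in logs)) + c
    # Non-zero entries become strictly positive; zeros are left at zero.
    return [math.log(x) - mean_log + eps if x > 0 else 0.0 for x in sample]

# Toy table: 20 observations, three hypothetical taxa.
toy = {"A": [1000] * 20, "B": [1] + [0] * 19, "C": [10, 10] + [0] * 18}
retained = filter_taxa(toy)  # "B" falls below the 0.01% read-share threshold
```

Because zeros remain exactly zero while observed counts are shifted above zero, the transformed data preserve the distinction between absent and present taxa, which is what a rank-based association estimator relies on.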
(2) Network Analysis at the Participant Level: We constructed a dissimilarity-based network, in which nodes were participants instead of taxa. Following the Bayesian multiplicative replacement of zeros and CLR transformation, Aitchison's distance was used for the dissimilarity measure. Similarities are used as edge weights, and thus, participants with more comparable microbial network structures are arranged in closer proximity on the network graph. Properties of clustering, eigenvector centrality (the degree to which a node is connected to other highly connected nodes), and betweenness centrality (how often a node lies on the shortest path between two other nodes) were extracted. Eigenvector and betweenness centrality were selected because of their relevance to identify potentially "influential" and "gatekeeper" taxa, respectively. Network properties were compared based on participant characteristics and factors that were applicable at baseline, including intervention assignment and WASH score; and at baseline and follow-up: age, SES, sexual activity, coerced sex, transactional sex, having a boyfriend, vaginal microbiome CST, and STI etiology. Differences in distributions were assessed using the Chi-square test (with Fisher's exact test when cell size n < 5) for categorical variables and Wilcoxon rank sum tests and Kruskal-Wallis tests for continuous variables.
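The dissimilarity measure named above can be sketched as follows. Aitchison's distance is the Euclidean distance between CLR-transformed compositions. This sketch assumes zeros have already been replaced (the Bayesian multiplicative replacement step is not reproduced), and the conversion to a similarity by scaling distances to [0, 1] and subtracting from 1 is an assumption for illustration, not necessarily the conversion the authors' software applied.

```python
import math
from typing import List

def clr(composition: List[float]) -> List[float]:
    """Centered log-ratio; requires strictly positive values (zeros already replaced)."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_distance(x: List[float], y: List[float]) -> float:
    """Euclidean distance between the CLR-transformed compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

def similarity_matrix(samples: List[List[float]]) -> List[List[float]]:
    """Assumed conversion: scale distances to [0, 1], then similarity = 1 - distance."""
    n = len(samples)
    dist = [[aitchison_distance(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]
    d_max = max(max(row) for row in dist) or 1.0  # avoid dividing by zero
    return [[1.0 - dist[i][j] / d_max for j in range(n)] for i in range(n)]

# Three hypothetical participants; the first two differ only in sequencing depth.
participants = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
sim = similarity_matrix(participants)
```

The first two toy participants have a distance of zero because CLR removes the overall scale, which is why this measure suits compositional data with differing read depths.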
Sensitivity Analyses. Because the majority of incident STIs were C. trachomatis, we attempted to construct a subject-level network for the 21 participants with incident CT in the absence of BV, TV, and NG. No optimum number of clusters could be defined; therefore, sparsification of the network to extract subject-level components was not attempted due to the potential unreliability of findings.

Characteristics of Study Sample and Microbiome Composition
Sociodemographic and behavioral practices at baseline did not differ between participants who remained negative for BV and STI throughout the follow-up compared to those with incident STI (Table 1). However, characteristics at the time of the incident STI varied considerably with greater frequency of being sexually active, having a boyfriend, and having ever been pregnant.
Table 1 notes: 1 Not all cells sum to N due to missing data. 2 Ever pregnant was asked only to those who reported being sexually active; hospitalization for pregnancy in the past 6 months also supplemented this, in some cases leading to a number of responses to "ever pregnant" being greater than the number reporting being sexually active.
The most common taxa with the highest mean relative abundance were L. crispatus and L. iners, correlating with CST-I and CST-III, respectively (Figure 1). Overall, the VMB composition was significantly different between participants who remained persistently negative for BV and STIs as compared to those with incident STI, with or without BV (Figure 2A). While the majority of persistently negative participants had VMB of CST-I (L. crispatus dominated), L. crispatus was substantially depleted from the VMB of participants with STI, even in the absence of BV, and was uncommon and at very low relative abundance in the VMB of those with BV prior to or simultaneously occurring with incident STI (Figure 2B,C). Individual taxa presence and relative abundance distribution by BV and STI status are given in Supplementary Table S3.

Results of Microbial Co-Occurrence Network Analysis: Differences in Network Properties and Centralities for Participants with Incident STI Compared to Persistently Negative Participants
In the VMB of 179 participants who were persistently negative for STIs and BV, 3 network component clusters were identified with 1 having 24 taxa, 1 having 2 taxa, and 5 singleton taxa (Table 2). There were 4 network component clusters identified among 39 participants with incident STI and no BV: 2 components with 9 taxa each, 1 with 4 taxa, 1 with 3 taxa, and 6 singleton taxa. In keeping with this, other network metrics (relative largest connected component size, clustering coefficient, positive edge percentage, and average path lengths) also reflected a more connected vaginal microbial network for participants who remained persistently negative for STI and BV as compared to participants with incident STI and no BV. The network properties and differences are stark in the network plots (Figure 3), where the varying taxa of central importance, clusters, and connections are highlighted. Some similar trends were seen, with lesser clustering coefficient and positive edge percentage, for those with BV prior to incident STI in comparison to participants who remained negative for BV and STI throughout follow-up (Supplementary Table S4).
Centralities (Table 3). In the microbial co-occurrence network of individuals who remained persistently negative for STIs and BV, the taxa with the highest eigenvector centralities were as follows (in descending order): Prevotella melaninogenica (present in 4.0%, with a mean relative abundance (RA) of 6.68% among samples where present), Gemella haemolysans/Gemella asaccharolytica (present in 10.2%, with a mean RA of 2.61% where present), Fannyhessea vaginae (Atopobium; present in 13.6%, with a mean RA of 4.02% where present), bacterial-vaginosis-associated bacterium 1 (BVAB1, present in 4.5%, with a mean RA of 6.02% where present), and Sneathia amnii (present in 4.0%, with a mean RA of 3.86% where present).
These were the same 5 taxa with the highest closeness centrality and betweenness centrality, though with Fusobacterium nucleatum (present in 10.2%, with a mean RA of 2.54% where present) rather than BVAB1 for highest betweenness centrality. Notably, these taxa with the highest centralities were not mirrored in the vaginal microbial co-occurrence network of participants with incident STI in the absence of BV. The only taxon with high centrality found in both sub-groups was BVAB1. Taxa with the highest centralities for those with incident STI and no BV were Staphylococcus hominis (present in 23.1%, with a mean RA of 0.73% where present), Fusobacterium equinum (present in 7.7%, with a mean RA of 2.00% where present), Lactobacillus jensenii (present in 17.9%, with a mean RA of 1.94% where present), and Veillonella (present in 20.5%, with a mean RA of 5.64% where present). According to the Jaccard index, the degree (p = 0.017), eigenvector (p = 0.026), and closeness (p = 0.017) centralities exhibited statistically significant differences between the two groups.
In the VMB of 20 participants with BV prior to STI, taxa with the highest centrality values were again strikingly different from participants who remained persistently negative throughout follow-up, though no centrality differences reached statistical significance by Jaccard index p-value, possibly due to the small sample size of those with BV prior to incident STI (Supplementary Table S4). Taxa with the highest centrality among participants with BV prior to incident STI differed from those with incident STI in the absence of BV, and the taxa with the highest values differed across centrality measures.
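To make the Jaccard comparison concrete, the sketch below computes the index for two sets of highly central taxa named in the text. It is illustrative only: the sets are assembled from the taxa listed in the prose rather than from the rule the analysis software used to select the most central nodes, so the value is not one of the reported statistics, and the p-values above are not reproduced.

```python
from typing import Iterable

def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

# Taxa with the highest eigenvector centrality, persistently negative group.
persistently_negative = {
    "Prevotella melaninogenica", "Gemella", "Fannyhessea vaginae",
    "BVAB1", "Sneathia amnii",
}
# Taxa with the highest centralities, incident STI with no BV.
sti_no_bv = {
    "Staphylococcus hominis", "Fusobacterium equinum",
    "Lactobacillus jensenii", "Veillonella", "BVAB1",
}

overlap = jaccard(persistently_negative, sti_no_bv)  # 1 shared taxon of 9 in total
```

A value near 0 means the two groups share few central taxa and a value of 1 means the sets are identical; with only BVAB1 in common, these illustrative sets give 1/9.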

Table 3 notes: 1 Network centrality measures are normalized for within-sample comparison. For each centrality measure, the top 5 taxa for each sub-group are reported and are presented first for the persistently negative sub-group and then for the sub-group of incident STI in the absence of BV. One taxon, BVAB1, is the highest-ranked centrality measure in both sub-groups. Shading is applied to the absolute difference column to facilitate the reading of taxa with greater (darker intensity shading) absolute difference vs. taxa with lower (lighter shading) differences.
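Several of the network properties compared in this section (number of components, singleton taxa, relative size of the largest connected component, and positive edge percentage) can be illustrated on a toy signed network. The helper names and the toy edges below are hypothetical; the sketch shows only how such summaries are defined, not how the analysis software computes them.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, float]  # (taxon, taxon, signed association)

def connected_components(nodes: List[str], edges: List[Edge]) -> List[Set[str]]:
    """Connected components of an undirected graph via depth-first search."""
    adjacency: Dict[str, Set[str]] = {n: set() for n in nodes}
    for u, v, _weight in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components

def summarize_network(nodes: List[str], edges: List[Edge]) -> Dict[str, float]:
    """Summaries analogous to those compared between outcome groups."""
    comps = connected_components(nodes, edges)
    positive = sum(1 for _u, _v, w in edges if w > 0)
    return {
        "components_with_2plus_taxa": sum(1 for c in comps if len(c) > 1),
        "singleton_taxa": sum(1 for c in comps if len(c) == 1),
        "relative_lcc_size": max(len(c) for c in comps) / len(nodes),
        "positive_edge_pct": 100 * positive / len(edges) if edges else 0.0,
    }

# Toy network: six hypothetical taxa, three signed edges.
toy_nodes = ["A", "B", "C", "D", "E", "F"]
toy_edges = [("A", "B", 0.4), ("B", "C", -0.2), ("D", "E", 0.3)]
toy_summary = summarize_network(toy_nodes, toy_edges)
```

In the toy network the largest component holds three of six taxa (relative size 0.5) and two of three edges are positive; a more connected network would show a larger relative component size and fewer singletons.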

Results of Participant-Level Network Analysis: Network Properties Differ by Sociodemographic, Behavioral, and VMB Composition
In subject-level network analysis, nodes are the individual participants, but they are connected based on VMB composition. Therefore, it is unsurprising that subject-level network properties (i.e., centralities and component clusters) vary according to VMB composition (Table 4). However, subject-level network properties also varied by sociodemographic and behavioral factors. Among 179 individuals persistently negative for STIs and BV, betweenness centrality was increased for those with baseline values of higher SES, having experienced coerced sex, and having a boyfriend, while eigenvector centrality was increased among participants assigned to menstrual cup intervention, and those with younger age at follow-up. Eigenvector centrality was greatest among participants with vaginal CST-I (L. crispatus dominated) at baseline and at follow-up, while betweenness centrality was increased among those with vaginal CST-I and CST-III (L. iners dominated) at follow-up. For 39 participants with incident STI and no BV, betweenness centrality did not vary by any factors examined, while eigenvector centrality was highest for those with CST-III at follow-up and those with C. trachomatis etiology. Network component clusters (Figure 4A, Table 5) for participants who were persistently STI- and BV-negative varied by vaginal CST at baseline and follow-up, with the majority (92.3%) of network cluster 2 observations having CST-I. Network cluster 3 was also predominantly CST-I (78.5%), while network cluster 1 was majority CST-III (69.4%) and CST-IV (27.8%). Network cluster 2 participants were also more likely to be younger at follow-up, but no other sociodemographic or behavioral factors varied by network cluster for persistently STI- and BV-negative participants.
Among participants with incident STI without BV (Figure 4B, Table 5), network clusters varied by vaginal CST at follow-up but not at baseline, with cluster 1 being predominantly (87.5%) CST-IV, cluster 2 being majority CST-III (60.9%) and CST-I (30.4%), and cluster 3 being 87.5% CST-III. Network clusters of those with incident STI without BV did not differ by age, SES, or behavioral factors at follow-up, but varied by baseline WASH score, sexual activity, experience of coerced sex, and report of transactional sex. Sexual activity, coerced sex, and transactional sex were more commonly reported by participants in network clusters 1 and 3 and may represent differential exposure to penile microbiomes. Participants in these clusters were also more likely to originate from school areas with lower WASH scores, another indicator of community-level SES.
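The two centralities compared in this section can be illustrated on a small unweighted graph. This is a sketch under simplifying assumptions, not the computation used in the analysis: edge weights are ignored, betweenness is left unnormalized (the study reports normalized values), and eigenvector centrality is obtained by power iteration on the adjacency matrix plus the identity, a shift chosen here so that the iteration also converges on tree-like graphs.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[str, List[str]]  # undirected adjacency lists

def betweenness(adj: Graph) -> Dict[str, float]:
    """Unnormalized shortest-path betweenness (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order: List[str] = []
        preds: Dict[str, List[str]] = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return {v: s / 2 for v, s in score.items()}  # each pair was counted twice

def eigenvector_centrality(adj: Graph, iterations: int = 200) -> Dict[str, float]:
    """Power iteration on (A + I), rescaled so that the largest value is 1."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        top = max(nxt.values()) or 1.0
        x = {v: value / top for v, value in nxt.items()}
    return x

# Toy graph of five hypothetical nodes: a - b - c, with d and e attached to c.
toy_graph: Graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "e"],
                    "d": ["c"], "e": ["c"]}
toy_betweenness = betweenness(toy_graph)
toy_eigenvector = eigenvector_centrality(toy_graph)
```

In the toy graph, node c lies on the shortest path for five node pairs and node b for three, so c is the stronger "gatekeeper"; c also has the highest eigenvector centrality because it is attached to the most, and the best connected, neighbours.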

Discussion
We identified differing key taxa in the VMB networks of adolescent and young women with incident STIs as compared to those who remained negative for BV and STI over a 30-month period of observation. Overall, the VMB showed decreased connectivity in individuals with incident STIs compared to those who remained persistently negative, and the taxa that were most central differed between those with incident STIs compared to those who remained negative. Secondly, participant-level network structure based on VMB composition varied by sociodemographic and behavioral factors, and network clusters correlated with molecular CSTs but were not redundant with them.
The differences in the VMB network structures and properties have implications for bacterial community function. In general, the VMB of those with incident STIs showed decreased connectivity, as reflected in lower measures of clustering, smaller largest connected components, and reduced positive edge percentage. A microbial network with lower connectivity may reflect less "collaboration" or more competition among the taxa [31] and has been associated with pathogenic states [32,33]. The taxa identified as having central importance varied by infection status. Notably, these taxa were not those that were driving overall compositional differences (Figure 2) or with the highest presence and relative abundance (Supplementary Table S3). These key taxa, even low-abundance genera, may have central roles, possibly related to gatekeeping or communication [34]. The highest centrality taxa in the VMB of persistently negative participants have been consistently associated with BV [9]: Fannyhessea vaginae (Atopobium), Prevotella melaninogenica, Gemella, BVAB1, and Sneathia amnii. In the context of a disease-free state and majority with optimal VMB (58% CST-I at baseline and 50% CST-I at follow-up), the identification of these taxa as key players may be revealing their latent pathobiont nature. Conversely, the taxa with the highest centralities in the VMB of participants with incident STI were L. jensenii, Staphylococcus hominis, Fusobacterium equinum, Veillonella, and BVAB1. Like BVAB1, Veillonella species have also been identified in conjunction with BV, having a potential role in weakening the cervicovaginal epithelial barrier [35], and have been associated with BV treatment failure [36]. L. jensenii is the dominant taxon in CST-V, an uncommon CST in our dataset, as in others, and is generally considered beneficial in the vaginal microbiome [34]. There is evidence for a protective role of L. jensenii.
For example, in a study of 220 women of varying race/ethnicity in the United States, Srinivasan et al. observed an inverse relationship between BV and L. jensenii [37], and in laboratory studies, L. jensenii has been shown to inhibit gonococcal adherence to epithelial cells [38]. In the context of our study, the greater centrality of L. jensenii may represent a change in composition or perturbation of L. jensenii homeostasis. To our knowledge, Staphylococcus hominis and Fusobacterium equinum have not been previously associated with BV, STIs, or other VMB-related conditions. In this analysis, they may represent opportunistic colonization in the setting of STI. Longitudinal network studies that incorporate bacterial function along with composition and clinical outcomes will be necessary to disentangle whether and how the centrality of taxa changes as a function of infection.
There was a lower percentage of positive edges in the VMB of participants with incident STIs, especially within the largest connected component. The higher prevalence of positive edges in the VMB of persistently negative participants suggests potentially greater sharing of environmental spaces or conditions or greater sharing of bacterial products [39]. As described by Baquero et al., this collaboration could improve "homeostatic power", enabling the established community to be more resilient to "foreign organisms" [40]. The larger proportion of negative edges in the VMB of those with STI may indicate greater competition among the bacteria. The differences in network metrics in the VMB of those with STI may also suggest a potential underlying environmental imbalance. VMB community perturbation likely occurs prior to STI acquisition and as a result of STI acquisition. We analyzed the network structure at the time of acquisition, given the time interval between STI testing and microbiome assessments. Prospective microbiome-STI studies with frequent sampling would be able to shed more light on the temporal associations.
We are unaware of other vaginal microbial co-occurrence networks related to incident STIs with which we can compare our results. Antibiotic treatment of C. trachomatis has been shown to alter the VMB composition in a potentially non-optimal way [41], and microbial network analysis could help understand this effect by characterizing hub or connection disruptions and VMB restructuring in this context. For example, this might include serial network construction based on sampling of VMB at the time of infection detection, immediately following STI treatment, and again at 4 to 8 weeks, which would allow evaluation of which hubs and articulation points are disrupted alongside antimicrobial treatment of STIs. This would also inform whether the VMB restructures differently in terms of how the taxa interact with each other, building on the knowledge of compositional changes. In conjunction with bacterial function studies, this could provide potent insights into new biotherapeutic avenues for increasing VMB resilience to STIs.
The results of our subject-level microbial network analysis reinforce the knowledge that the VMB is shaped by the environment. Subject trait-driven network characteristics (i.e., the centrality of individuals and network clusters based on microbiome composition) varied by sociodemographic factors and sexual exposures and may represent different sexual networks (i.e., the connections among individuals defined by sexual relationships). These factors could directly affect sexual partner selection (such as age, SES, and proximity) or may represent norms and beliefs around hygiene practices and sexual practices (such as condom use and multiple partners), which drive partner type and selection. These are novel analyses in that they potentially capture a proxy for sexual mixing, and future studies should integrate traditional sexual network analysis with microbiome network analysis to characterize and quantify how they overlap. As demonstrated with a simulation study, Kenyon et al. found that populations with higher heterosexual connectivity had a higher population-level prevalence of BV than did communities with lower sexual network connectivity [42]. This is rational, as the authors summarize that a preponderance of data demonstrates the sharing of the genital microbiota between individuals, the transmission of STIs along sexual networks, and therefore, the transmission of genital microbiota within a sexual network in a similar manner. Consideration and incorporation of the microbiome-sexual network in biobehavioral interventions may contribute to their effectiveness.
For our network analyses, incident STI was a composite of chlamydia, gonorrhea, and trichomoniasis. With just over half of incident STIs being CT, we conducted supplemental network analyses among these participants, which revealed limited information, likely due to the small sample size. As follow-up of our cohort is continuing through 72 months, and the incidence of STIs has increased over time, we may reach a sufficient sample size to be able to detect differences in microbial co-occurrence networks between incident STIs of different etiology. Characterizing the relationship of these pathogens to the contextual microbiota, and ideally also in relation to host immune-related processes, could identify avenues for biotherapeutics and vaccine development [43]. For C. trachomatis, such studies may also shed light on biological factors that influence clearance rather than persistence [44].

Limitations
There were small sample sizes for incident STIs preceded by BV (n = 20), co-incident STIs and BV (n = 14), and for singular incident etiologies (CT alone, TV alone, and NG alone). However, the characterization of VMB is nascent in relation to incident STIs, especially among adolescents and young women. While a body of literature has established VMB compositional differences of women with STIs and/or BV compared to women without infections, our microbial network co-occurrence approach uncovered novel findings that can be examined in larger sample sizes. The number of individuals and the number of observations per individual can influence the community structure [8], and as follow-up of the cohort continues through 72 months, we will have the opportunity to expand analysis, with future plans to examine the temporal stability of these microbial co-occurrence networks. Sexual behaviors were likely underreported, as we previously described [12], because of the associated stigma. Disclosure of such information may result in, or be perceived to result in, potential harm. To minimize this, no identifying information was collected in conjunction with research data, and at the time of data collection, extensive efforts were made to ensure privacy and confidentiality [12]. Regarding dissemination, to minimize potential negative consequences in the community, we do not report the schools involved in the study or the home areas of the participants. As with any longitudinal study, outcomes (i.e., BV or STI) may have occurred and been resolved prior to baseline observation. However, Kenya relies on syndromic management of vaginal discharge, and a high proportion of BV and STI cases are asymptomatic and thus would not have been treated.

Conclusions
Our study identified vaginal taxa with central importance to microbial community network structure, which differ between adolescent and young women who remained persistently negative for STIs and BV in comparison to those who acquired STIs. These key taxa may have an important role in bacterial community communication, competition, homeostasis, and collective function for preventing or permitting infection. Longitudinal studies that combine bacterial function and network analyses, with sexual network and sociodemographic and behavioral information, at acquisition and treatment inflection points could contribute to advancing biotherapeutic and behavioral interventions to disrupt STI transmission.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms11082035/s1, Table S1: Number of tests conducted, infections detected, and treatments, by study visit; Table S2: Characteristics of participants with incident sexually transmitted infection and incident Bacterial vaginosis detected at the same study visit and participants with incident Chlamydia trachomatis in the absence of other STIs and BV; Table S3: Presence and mean relative abundance (RA) of 54 taxa by outcome status; Table S4: Distribution of network properties by outcome: persistently negative for STI and BV and BV prior to incident STI.