Article

Causal Graphical Models for Transition from Healthy Vaginal Microbiota to Bacterial Vaginosis in Pregnant Women

by Maricela García-Avalos 1, Juana Canul-Reich 1,*, Lil María Xibai Rodríguez-Henríquez 2 and Erick Natividad De la Cruz-Hernández 3
1 Academic Division of Science and Information Technology, Universidad Juárez Autónoma de Tabasco, Cunduacan-Jalpa de Mendez Road Km.1, Cunduacán 86690, Mexico
2 Academic Division of Basic Sciences, Universidad Juárez Autónoma de Tabasco, Cunduacan-Jalpa de Mendez Road Km.1, Cunduacán 86690, Mexico
3 Multidisciplinary Academic Division of Comalcalco, Universidad Juárez Autónoma de Tabasco, Rancheria Sur 4th Section, Comalcalco 86658, Mexico
* Author to whom correspondence should be addressed.
BioMedInformatics 2026, 6(3), 32; https://doi.org/10.3390/biomedinformatics6030032
Submission received: 4 March 2026 / Revised: 5 May 2026 / Accepted: 11 May 2026 / Published: 21 May 2026

Abstract

This study developed two Causal Graphical Models (CGMs) to analyze the transitions associated with Bacterial Vaginosis (BV) and to identify key bacterial species at each stage. BV results from an imbalance in the vaginal microbiota, whose composition varies among women and across developmental stages. A previous CGM identified influential bacteria but did not address changes between microbiota states. Here, we extend that framework to capture these associations. Path Analysis, a structural equation modeling method based on observed variables that estimates effects through correlations and covariances, was applied to a dataset of 132 pregnant women (4–24 weeks of gestation) from Tabasco, Mexico, previously collected by third parties during healthy pregnancy campaigns and associated with BV diagnosis. Models were validated using statistical metrics and evaluation by a clinical microbiologist. The first model, representing the transition from normal microbiota (BV−) to an indeterminate state (I), identified Megasphaera Type 1 as significant. The second model, from I to bacterial vaginosis-positive (BV+), identified Atopobium vaginae and Bacterial Vaginosis-Associated Bacterium Type 2 as significant contributors. These findings highlight the importance of the intermediate state in dysbiosis progression and support the use of CGMs for studying microbiome dynamics.

1. Introduction

Bacterial Vaginosis (BV) is a common disruption of the vaginal microbiome in women of reproductive age [1,2,3]. During pregnancy, it is associated with various gynecological and obstetric complications, including miscarriage, preterm delivery, premature rupture of membranes, and postcesarean endometritis [3,4,5]. Globally, BV affects an estimated 5–49% of pregnant women, with variations depending on the region and diagnostic criteria [1,4,6]. Symptoms typically include grayish, watery, and foul-smelling vaginal discharge that often worsens after sexual intercourse [1,3,4].
Several studies aimed at identifying bacteria associated with Bacterial Vaginosis (BV) have employed techniques such as feature selection, clustering, association rules, and Bayesian networks [7,8,9,10]. However, these approaches primarily identify associations or groupings of bacteria and do not explicitly assess their relative influence on BV diagnosis, nor do they analyze relationships across different diagnostic states. Path Analysis (PA), defined in [11,12,13], was applied in [14] to a dataset of pregnant women from the state of Tabasco, Mexico [15], which includes the diagnosis of bacterial vaginosis without Mycoplasma hominis (DxVBNoMh) across three diagnostic categories: normal microbiota (BV−), indeterminate (I), and BV-positive (BV+). That study considered only the BV− and BV+ categories. This analysis produced a Causal Graphical Model (CGM), derived from a structural modeling framework, in which Mycoplasma hominis (Mh), Atopobium vaginae (Av), Gardnerella vaginalis (Gv), Megasphaera Type 1 (MT1), and Bacterial Vaginosis-Associated Bacterium Type 2 (BVAB2) were associated with BV, identifying the bacterial species with the greatest influence on diagnosis.
Building on [14] and using the same dataset [15], this study extends the previous framework by incorporating the indeterminate (I) state. Here, “transitions” refer to structured analytical comparisons between diagnostic states within a cross-sectional framework; they do not imply temporal change. As this is an observational cross-sectional dataset, unmeasured confounding cannot be excluded. Accordingly, two Causal Graphical Models (CGMs), derived from the proposed structural equations, were developed: the first examines the relationship between bacterial profiles and the BV− and I states, and the second examines the relationship between the same bacterial variables and the I and BV+ states. This approach characterizes association patterns between diagnostic states and identifies the bacterial species with the greatest influence at each stage.
In the first model (BV− to I), Av (0.0260), Gv (0.0388), and MT1 (0.3184) were associated with the transition to the intermediate state, with MT1 making the most significant contribution. In contrast, Mh and BVAB2 exhibited negative effects (−0.4400 and −0.0188, respectively), indicating that they do not promote change at this stage. In the second model (I to BV+), Av (0.6260) and BVAB2 (0.3800) emerged as the most influential variables in the progression toward bacterial vaginosis-positive (BV+), with Av predominating. Mh (0.1256) also contributed to the transition, MT1 (0.1164) showed a minor effect, and Gv (−0.0138) exhibited a negligible effect at this stage. These phase-specific patterns show that the relevance of each bacterial agent is state-dependent, supporting the need to explicitly model microbiota transitions to better understand patterns associated with BV+.
The remainder of this article is organized as follows: Section 2 describes the materials and methods used. Section 3 presents the experimental design, highlighting the application of PA for constructing CGMs. Section 4 details the results, emphasizing the bacterial species most influential in the diagnosis of BV. Section 5 discusses the findings in relation to previous studies. Finally, Section 6 summarizes the main conclusions of the study.

2. Materials and Methods

2.1. Spearman Correlation Matrix

To assess the relationships among the bacterial variables analyzed in this study, a Spearman correlation matrix was computed. This method was selected because it evaluates monotonic relationships between variables without assuming normality. Correlation is a statistical measure of the degree of association between two quantitative or ordinal variables. The strength and direction of this relationship are described by the correlation coefficient, whose value lies within the interval [−1, +1]. The closer the coefficient is to ±1, the stronger the association, whereas values near zero indicate a weak or absent relationship [16,17]. The Spearman coefficient evaluates monotonic relationships between variables organized into ranks. The Spearman correlation coefficient $\rho_s$ is defined as follows:
$$\rho_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}, \tag{1}$$
where $d_i$ is the difference between the ranks of each pair of observations, and $n$ is the total number of observations.
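As an illustration of Equation (1), the following minimal Python sketch computes the coefficient from rank differences. It assumes there are no tied values (the rank-difference formula is exact only in that case), and the function names and toy values are hypothetical; they are not taken from the study, which was implemented in R.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman coefficient via the rank-difference formula of Equation (1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    # d_i is the difference between the two ranks of observation i.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical toy values, not taken from the study dataset.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
rho = spearman_rho(x, y)
```

With tied values, average ranks and the Pearson correlation of the ranks would be needed instead of this shortcut.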

2.2. Dataset

This study was based on a dataset comprising 87 variables and 132 instances, collected by a biology researcher from the Universidad Juárez Autónoma de Tabasco during healthy pregnancy campaigns conducted in Tabasco’s rural and urban communities between 2018 and 2020 [15]. The data come from a cross-sectional study in which a single sample per participant was collected, following a standardized sampling protocol [18]. The dataset included sociodemographic data, bacterial presence, Human Papillomavirus information, and bacterial vaginosis (BV) diagnosis. BV diagnosis (DxVB) was operationalized using molecular criteria defined in the original dataset, which was collected by third parties. This approach emphasized the quantitative assessment of lactobacilli, whose density variation enabled the characterization of vaginal microbiota states, distinguishing between equilibrium and dysbiosis. Based on this, three diagnostic categories were defined: normal microbiota (BV−), indeterminate (I), and BV-positive (BV+).
Two diagnostic versions were available: one including and one excluding Mycoplasma hominis (Mh). The DxVBNoMh variable corresponds to the version excluding Mh. During preprocessing, as described in [14], the diagnostic definition excluding Mh was selected, while the alternative definition including it was discarded. This choice was made because the role of Mh in BV development remains unclear, and to avoid overlap between the diagnostic definition and the explanatory variables in the study analysis. Accordingly, Mh was retained as an independent variable. Variables unrelated to BV were excluded in consultation with a clinical microbiologist, ensuring that only attributes relevant to the study objective were retained. This reduction resulted in 72 attributes and 132 observations.
Outliers [19] were identified using boxplots; metabolic variables such as glucose and cholesterol showed atypical values. These observations were retained as they may reflect clinically plausible variability rather than measurement error. Given that the data correspond to real patients, extreme values may represent underlying health conditions. This interpretation was supported by a clinical microbiologist. Excluding these observations could result in the loss of relevant clinical information and may introduce bias into the analysis. Missing values were imputed with the mean for numerical attributes and the mode for categorical attributes [20,21]. This approach was selected due to the relatively small sample size and the low proportion of missing data (approximately 5%), which supports the use of simple imputation methods. Imputation was stratified by BV diagnostic category.
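The stratified imputation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the column names (`dx`, `glucose`, `residence`), the helper `impute_by_group`, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def impute_by_group(rows, group_key, numeric_cols, categorical_cols):
    """Fill None values with the per-group mean (numeric) or mode (categorical)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    filled = [dict(row) for row in rows]  # work on copies, keep the input intact
    for out in filled:
        members = groups[out[group_key]]  # original rows of the same diagnosis
        for col in numeric_cols:
            if out[col] is None:
                observed = [m[col] for m in members if m[col] is not None]
                if observed:
                    out[col] = mean(observed)
        for col in categorical_cols:
            if out[col] is None:
                observed = [m[col] for m in members if m[col] is not None]
                if observed:
                    out[col] = Counter(observed).most_common(1)[0][0]
    return filled


# Hypothetical toy records, not taken from the study dataset.
records = [
    {"dx": "BV-", "glucose": 80.0, "residence": "rural"},
    {"dx": "BV-", "glucose": None, "residence": "rural"},
    {"dx": "BV-", "glucose": 90.0, "residence": None},
    {"dx": "BV+", "glucose": 110.0, "residence": "urban"},
    {"dx": "BV+", "glucose": None, "residence": "urban"},
]
completed = impute_by_group(records, "dx", ["glucose"], ["residence"])
```

Computing the statistics within each diagnostic group keeps an imputed value from being pulled toward the distribution of a different diagnostic category.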
The normality tests (Q–Q plots and Kolmogorov–Smirnov tests [22,23]) indicated that none of the variables met the normality assumption. Consequently, Spearman’s correlation matrix was applied across the three diagnostic classes (BV+, I, and BV−) to identify the variables most strongly associated with BV diagnosis. This analysis revealed five key bacterial species: Mh, Av, Gv, MT1, and BVAB2. The final dataset was reduced to six variables (five bacterial species and the BV diagnosis), maintaining the original 132 instances [14].

2.3. Bacterial Types Associated with BV

In BV, anaerobic bacteria predominate, including both facultative and strict anaerobes, although other species with different oxygen requirements may also be present. Facultative anaerobes can grow in the presence or absence of oxygen, whereas strict anaerobes require oxygen-free environments for their growth [24]. In particular, Mycoplasma hominis (Mh) and Gardnerella vaginalis (Gv) are facultative anaerobes, while Atopobium vaginae (Av), Megasphaera Type 1 (MT1), and Bacterial Vaginosis-Associated Bacterium Type 2 (BVAB2) are strict anaerobes.

2.4. Path Analysis

Structural Equation Modeling (SEM), also known as covariance structure analysis, allows the evaluation of direct and indirect relationships among variables. Within this framework, path analysis (PA) focuses exclusively on observable variables, providing a graphical representation of their interactions [13,25,26]. In this study, the term “causal” refers to directional relationships specified within the SEM/PA framework based on the observed data and the proposed model structure. These relationships should be interpreted as model-based associations rather than interventional causal effects, as causal explanations in observational studies rely on model specification rather than emerging directly from the data [27]. The graphical representation corresponds to SEM path diagrams and should not be interpreted as a causal directed acyclic graph (DAG) in the Pearlian sense.
The graphical representation of path analysis (PA) is structured as follows: squares or rectangles represent observable variables; unidirectional arrows indicate a direct effect of one variable on another within the model; curved double-headed arrows denote an association or covariance between variables; and model parameters are displayed along the corresponding arrows [13,26]. In this approach, variables are classified as endogenous, meaning they depend on other variables within the model, and exogenous, meaning they are not influenced by any other variables [26,28]. Effects can be categorized as direct, indirect (mediated through another variable), or total (the sum of both) [13,26,29,30].
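To make the endogenous/exogenous distinction concrete, the sketch below encodes the two model structures that Section 3, step (d), describes in words and classifies their variables. It is a minimal Python illustration: the dictionary encoding and the helper names are assumptions, and `to_formula_lines` only renders the structure in the `outcome ~ predictors` formula style that the study reports using.

```python
# Each key is an outcome; its list holds the variables with arrows into it.
MODEL_1 = {
    "DxVBNoMh": ["Mh", "Av", "Gv", "MT1", "BVAB2"],
    "Mh": ["Av", "Gv", "MT1", "BVAB2"],
}
MODEL_2 = {
    "DxVBNoMh": ["Mh", "Av", "Gv", "MT1", "BVAB2"],
    "BVAB2": ["Mh", "Av", "Gv", "MT1"],
}


def classify_variables(model):
    """Split variables into endogenous (has incoming arrows) and exogenous."""
    endogenous = set(model)
    predictors = {p for preds in model.values() for p in preds}
    exogenous = predictors - endogenous
    return endogenous, exogenous


def to_formula_lines(model):
    """Render the structure as 'outcome ~ a + b' lines, one per outcome."""
    return [f"{outcome} ~ {' + '.join(preds)}" for outcome, preds in model.items()]


endo_1, exo_1 = classify_variables(MODEL_1)
endo_2, exo_2 = classify_variables(MODEL_2)
lines_2 = to_formula_lines(MODEL_2)
```

In the first structure Mh is endogenous because the other four bacteria point to it; in the second, BVAB2 takes that mediating role and Mh becomes exogenous.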

2.5. Statistical Metrics in the Path Analysis

Statistical metrics allow estimating the model’s fit and predictive capacity relative to the observed data. Table 1 presents the indicators used in PA [25,31] to verify the representation of the underlying structure. The Root Mean Square Error of Approximation (RMSEA) and Standardized Root Mean Square Residual (SRMR) are absolute indices that assess the correspondence between the observed covariance matrix and the model, with lower values reflecting a better fit. In contrast, the Goodness of Fit Index (GFI), Normed Fit Index (NFI), and Comparative Fit Index (CFI) are incremental indices that compare the performance of the model with respect to a reference [31,32].

2.6. Classification of Total Effects in Path Analysis

In path analysis, the magnitude of total effects was classified according to the ranges proposed by Hair et al. [33], considering the absolute value of the coefficients. The ranges used were as follows (see Table 2).

3. Experimental Study

The study was conducted following the steps illustrated in Figure 1: (a) Selected data, (b) stages of diagnosis, (c) correlation matrix, (d) theoretical models, (e) statistical metrics, (f) causal graphical models, and (g) microbiologist evaluation. All procedures were implemented in R (v. 4.3.2, Windows 11) using statistical functions and the packages lavaan (for path analysis) and semPlot (for structural visualization).
(a) Selected Data: The dataset consisted of the bacteria Mycoplasma hominis (Mh), Atopobium vaginae (Av), Gardnerella vaginalis (Gv), Megasphaera Type 1 (MT1), and Bacterial Vaginosis-Associated Bacterium Type 2 (BVAB2), together with the diagnosis of Bacterial Vaginosis without Mycoplasma hominis (DxVBNoMh), distributed across 132 instances. The clinical diagnosis was classified into three categories: normal microbiota (BV−), indeterminate (I), and BV-positive (BV+).
(b) Stages of Diagnosis: To explore transition patterns toward a positive diagnosis of bacterial vaginosis (BV+) and to identify which bacterial profiles have the greatest influence on progression to BV+, path analysis was performed by grouping the diagnosis into two sequential stages. The first stage included cases classified as BV− and I, whereas the second comprised I and BV+ cases. This segmentation enabled modeling of the transitional behavior of bacterial variables, considering the indeterminate state as an intermediate point between normal microbiota and confirmed BV.
(c) Correlation Matrix: As a preliminary step to path modeling, Spearman correlation matrices were calculated between the bacterial species and the diagnosis at each stage, as described above, to identify the microbial profiles most associated with each diagnostic transition. Figure 2 shows the correlations between the bacteria and the diagnosis for the BV− and I stage, where stronger associations correspond to higher absolute correlation values; Mh exhibits the highest positive correlation (0.48). In Figure 3, corresponding to the I and BV+ stage, BVAB2 shows the strongest association with diagnosis (−0.49).
(d) Theoretical Models: Based on the correlation matrices shown in Figure 2 and Figure 3, the theoretical models defined by Equations (2) and (3) were established. In model (2), the symbol ∼ denotes a directional relationship, where the variable on the left is explained by those on the right. The first line specifies that the diagnosis of bacterial vaginosis without the presence of Mycoplasma hominis (DxVBNoMh) is estimated from five bacterial variables: Mh, Av, Gv, MT1, and BVAB2. In subsequent lines, Mh is modeled as a dependent variable influenced individually by each of the other bacteria. This specification was based on the correlation analysis, where Mh showed the strongest association with the diagnosis in this stage. Thus, Mh is treated as a central variable within the model structure to represent potential associations among bacterial variables and the diagnosis. Similarly, model (3) was derived, in which BVAB2 appears as the dependent variable, given its stronger association with the diagnosis at this stage.
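$$\text{Mod 1} \quad \begin{cases} \text{DxVBNoMh} \sim \text{Mh} + \text{Av} + \text{Gv} + \text{MT1} + \text{BVAB2} \\ \text{Mh} \sim \text{Av} \\ \text{Mh} \sim \text{Gv} \\ \text{Mh} \sim \text{MT1} \\ \text{Mh} \sim \text{BVAB2} \end{cases} \tag{2}$$
$$\text{Mod 2} \quad \begin{cases} \text{DxVBNoMh} \sim \text{Mh} + \text{Av} + \text{Gv} + \text{MT1} + \text{BVAB2} \\ \text{BVAB2} \sim \text{Mh} \\ \text{BVAB2} \sim \text{Av} \\ \text{BVAB2} \sim \text{Gv} \\ \text{BVAB2} \sim \text{MT1} \end{cases} \tag{3}$$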
(e) Statistical Metrics: Model (2) and model (3) were evaluated using the statistical metrics shown in Table 1. The results for both models are presented in Table 3, where identical values were observed. These results indicate that the proposed theoretical models meet the established goodness-of-fit criteria, showing consistency across the evaluated metrics.
(f) Causal Graphical Models: The causal graphical models (Figure 4 and Figure 5) were derived from the previously described models, illustrating the relationships between bacterial variables and diagnosis. These models allowed the identification of the bacterial species with the greatest influence in each diagnostic transition. For simplicity in the R implementation, some variable names were abbreviated: DxVBNoMh as DVB and BVAB2 as BVA. The total effect was calculated as the algebraic sum of the direct and indirect effects, with the latter obtained by multiplying the coefficients along the connecting arrows. In the path diagram (Figure 4), Av shows a direct effect on DVB of −0.04 and an indirect effect mediated by Mh. The indirect effect was computed as $(-0.15)(-0.44) = 0.066$, resulting in a total effect of $-0.04 + 0.066 = 0.0260$. This reflects both a direct effect and a mediated relationship through Mh.
Similarly, in the path diagram (Figure 5), Mh shows a direct effect on DVB of 0.08 and an indirect effect mediated by BVA. The indirect effect was calculated as ( 0.12 ) ( 0.38 ) = 0.0456 , yielding a total effect of 0.08 + 0.0456 = 0.1256 . This reflects the combination of direct and mediated effects through BVA. The same procedure was applied to the other variables, allowing for an assessment of their individual and mediated contributions to DVB. In Table 4, the total effects derived from Figure 4 and Figure 5 are presented. Table 5 and Table 6 present the classification of total effects and their biological interpretation. The classification was performed following the ranges proposed by Hair et al. [33], considering the absolute magnitude of the coefficients. The direction of each effect (positive or negative) was analyzed in terms of its potential clinical relevance to the diagnosis of BV. This allows the integration of statistical interpretation with the biological relevance observed at each stage of the model.
(g) Microbiologist Evaluation: The CGMs (Figure 4 and Figure 5) were presented to a clinical microbiologist for interpretation and validation of the relationships identified in the transitions between normal microbiota (BV−), indeterminate state (I), and BV-positive (BV+), including the identification of the most influential bacteria at each stage. The expert assessed the relationships by considering their biological plausibility, clinical coherence, and alignment with the literature. Analysis of both stages suggests a progressive microbiological pattern toward the characteristic dysbiosis of BV. In the early phase (BV− to I), MT1 is associated with the transition, whereas Mh shows no relevant participation. In the advanced stage (I to BV+), BVAB2 and Av emerge as predominant markers of the transition toward a clinical BV+ state, reflecting a dysbiotic profile. At this point, Mh changes role, co-occurring with other species in the development of established dysbiosis. Gv exhibits a variable role that depends on the state of the microbiological profile.
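The total-effect calculation described in step (f) can be sketched as a sum over directed paths. The following minimal Python example uses only the three Figure 5 coefficients quoted in the text (Mh to DVB 0.08, Mh to BVA 0.12, BVA to DVB 0.38); the edge-dictionary encoding and the function name `total_effect` are assumptions made for illustration, and the recursion assumes the diagram has no cycles.

```python
def total_effect(coeffs, source, target):
    """Sum, over every directed path source -> target, the product of its coefficients.

    coeffs maps (from_variable, to_variable) to a path coefficient.
    Assumes the path diagram is acyclic.
    """
    total = 0.0
    for (src, dst), beta in coeffs.items():
        if src != source:
            continue
        if dst == target:
            total += beta  # direct effect
        else:
            # indirect effect: this arrow times everything downstream of dst
            total += beta * total_effect(coeffs, dst, target)
    return total


# Coefficients quoted in the text for the I to BV+ diagram (Figure 5).
figure_5_subset = {
    ("Mh", "DVB"): 0.08,
    ("Mh", "BVA"): 0.12,
    ("BVA", "DVB"): 0.38,
}

mh_total = round(total_effect(figure_5_subset, "Mh", "DVB"), 4)
bva_total = round(total_effect(figure_5_subset, "BVA", "DVB"), 4)
```

Here `mh_total` reproduces the 0.1256 reported for Mh, and `bva_total` equals the direct coefficient because BVA has no mediated route to DVB in this structure.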

4. Results

Data from pregnant women who participated in a healthy pregnancy campaign in Tabasco (2018–2020), tested for BV, were analyzed. The dataset used in the study comprises 132 instances and includes five bacterial species: Mycoplasma hominis (Mh), Atopobium vaginae (Av), Gardnerella vaginalis (Gv), Megasphaera Type 1 (MT1), and Bacterial Vaginosis-Associated Bacterium Type 2 (BVAB2), all associated with the DxVBNoMh diagnosis, which is classified into three categories: normal microbiota (BV−), indeterminate (I), and bacterial vaginosis-positive (BV+). To analyze bacterial behavior in relation to diagnostic progression toward BV+, two stages were considered: BV− to I, and I to BV+. Path analysis was used to construct theoretical models, evaluated using standard fit indices (RMSEA = 0.00, SRMR = 0.00, GFI = 1.00, NFI = 1.00, CFI = 1.00). Although these values meet conventional criteria, they result from the models being just-identified (zero degrees of freedom) and should not be interpreted as evidence of superior fit. Given the sample size (n = 132), model complexity was kept low to ensure stable parameter estimation. Based on these models, CGMs were generated to visualize relationships among bacteria and BV diagnosis, and their interpretation was supported by evaluation from a clinical microbiologist.
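The just-identified status mentioned above can be checked by counting parameters. The sketch below is a minimal Python illustration under an assumed counting convention: the exogenous variables covary freely, each endogenous variable has one residual variance, and there are no residual covariances. The function name `path_model_df` is hypothetical.

```python
def path_model_df(n_regressions, n_endogenous, n_exogenous):
    """Degrees of freedom of a path model with observed variables only.

    Assumed convention: free variances and covariances among exogenous
    variables, one residual variance per endogenous variable, and no
    residual covariances.
    """
    n_observed = n_endogenous + n_exogenous
    # Unique variances and covariances available in the data.
    moments = n_observed * (n_observed + 1) // 2
    free_parameters = (
        n_regressions
        + n_endogenous
        + n_exogenous * (n_exogenous + 1) // 2
    )
    return moments - free_parameters


# Each model: 5 arrows into the diagnosis plus 4 into the mediator,
# 2 endogenous variables (diagnosis, mediator) and 4 exogenous bacteria.
df_model = path_model_df(n_regressions=9, n_endogenous=2, n_exogenous=4)
```

Under this convention both structures use all 21 available moments, so `df_model` is zero. That is why the fit indices take their boundary values and are identical for the two models.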
The total effects of each bacterial species on diagnoses were estimated from the CGMs, showing variations across transition stages. During the shift from BV− to I, MT1 (moderate, 0.3184), Av (weak, 0.0260), and Gv (very weak, 0.0388) exhibited positive effects, suggesting their association with early stages of vaginal imbalance. In contrast, Mh showed a notable negative effect (−0.4400) and BVAB2 a weak negative effect (−0.0188), indicating no contribution to changes at this stage.
In the transition from I to BV+, Mh showed a positive effect (0.1256), suggesting its association with the dysbiosis pattern, while MT1 remained positively associated (0.1164). Av showed a strong positive effect (0.6260), followed by BVAB2 (0.3800), both associated with progression toward more advanced dysbiosis. Gv displayed a slight negative effect (−0.0138), suggesting less consistent involvement. These results highlight bacterial profiles with potential relevance for BV diagnosis. Clinical evaluation supported that the CGMs provide a useful representation of bacterial relationships during the transition to BV+, supporting the use of model-based causal approaches in microbiological research. PA across diagnostic stages allowed the visualization of model-based causal relationships and the magnitude of bacterial effects associated with BV progression. Av and BVAB2 emerged as the main variables associated with the transition toward BV+, with Av showing the strongest effect within the model. These findings support the potential relevance of bacterial profiles and highlight the utility of CGMs for representing microbiological relationships.

5. Discussion

The present study provides new insights into the microbial patterns associated with BV in pregnant women, a population of particular clinical relevance due to their susceptibility to obstetric complications. By applying path analysis (PA) to generate CGMs, it was possible to visualize model-based associations and quantify the relationships of specific bacterial species across diagnostic transitions. Comparing diagnostic categories in this way reveals stage-dependent patterns of association between bacterial species and BV diagnosis, showing how microbial relationships vary across transitions and contributing to the understanding of their involvement in BV+. Although the models allow the visualization of structured relationships, the findings should be interpreted as associative rather than causal: they reflect relationships between bacterial species and BV diagnosis across stages without establishing causal direction.
During the transition from BV− to I, MT1, Av, and Gv showed positive associations, suggesting a possible link with early stages of vaginal imbalance. In contrast, Mh and BVAB2 showed negative associations, suggesting limited involvement at this stage. These findings indicate that dysbiosis may not occur uniformly across species but rather through selective microbial patterns. In the subsequent transition from I to BV+, the pattern changed. Mh shifted from a negative to a positive effect, suggesting a potential link with BV+, whereas MT1 maintained a moderate positive association. Av and BVAB2 showed strong positive associations, supporting their relevance as markers associated with BV+. In contrast, Gv showed a weak negative association, suggesting less consistent involvement. Overall, these findings highlight the relevance of Av and BVAB2 in relation to BV+, with Av showing the strongest association within the model.
These results are consistent with previous studies identifying Av and BVAB2 as key markers associated with vaginal microbiota imbalance and clinical BV states, supporting their potential relevance in diagnostic contexts [3,14,34]. Moreover, CGMs proved useful for representing both direct and indirect relationships among bacterial species, supporting their value as visual and analytical tools for exploring microbial patterns.
A limitation of this study is its cross-sectional design, which limits the ability to establish temporal relationships among the microbial interactions represented in the models. Additionally, the variable selection was based on the strength of association with the diagnosis, prioritizing those variables showing the highest correlation. Although this approach allowed the identification of the most relevant bacterial profiles, other clinical, behavioral, or sociodemographic factors available in the dataset were not included in the models. Therefore, potential confounding effects cannot be completely ruled out, and the results should be interpreted within the scope of the selected variables. Future studies could incorporate these variables to further refine the understanding of the observed relationships.
Although the sample size (n = 132) is modest for structural equation modeling, it represents real clinical data from a specific population, providing valuable insight into microbial patterns associated with BV. A parsimonious model specification was adopted to support stable parameter estimation. Therefore, the findings should be interpreted with caution, and future studies with larger samples are needed to validate these results. Overall, this study contributes to the understanding of associations between specific bacterial species and BV, highlighting bacterial profiles with potential relevance for diagnosis and providing a methodological framework for future investigations into microbial dysbiosis.

6. Conclusions

This study identified differential patterns of association among five bacterial species across diagnostic stages of BV in pregnant women. Statistically validated CGMs showed that the transition from normal microbiota to the indeterminate state was characterized by associations involving MT1 (0.3184), Av (0.0260), and Gv (0.0388). In the subsequent transition to BV+, Mh showed a positive association (0.1256) and MT1 remained positively associated (0.1164), while Av (0.6260) and BVAB2 (0.3800) consolidated as the most strongly associated species, with Av showing the highest magnitude of association in the model.
These findings are consistent with previous studies and support the potential relevance of Av and BVAB2 as biomarkers associated with BV in pregnant women. They also support the utility of path analysis for generating CGMs that enable visualization of bacterial interactions and exploration of microbial dysbiosis patterns. Importantly, the results were evaluated by a clinical microbiologist, supporting their microbiological and diagnostic interpretation. Future work will involve expanding the dataset to include larger and more diverse populations, enabling the validation of these findings across different clinical contexts. Longitudinal analyses are needed to better understand the temporal dynamics of Av and BVAB2 during pregnancy and to further explore their associations with obstetric outcomes.

Author Contributions

Conceptualization, M.G.-A. and J.C.-R.; methodology, M.G.-A.; software, M.G.-A.; validation, J.C.-R., L.M.X.R.-H. and E.N.D.l.C.-H.; formal analysis, M.G.-A., J.C.-R. and L.M.X.R.-H.; investigation, M.G.-A.; resources, M.G.-A.; data curation, M.G.-A. and J.C.-R.; writing—original draft preparation, M.G.-A.; writing—review and editing, J.C.-R. and L.M.X.R.-H.; visualization, J.C.-R. and L.M.X.R.-H.; supervision, L.M.X.R.-H.; project administration, J.C.-R.; funding acquisition, none received. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the Secretariat of Science, Humanities, Technology, and Innovation (SECIHTI) through scholarship number 842958.

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the use of secondary data provided by DAMC-UJAT, with no direct involvement of human participants by the authors.

Informed Consent Statement

Informed consent for participation was not required as per institutional policy, since the authors did not collect data directly from human subjects.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to data privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BV: Bacterial Vaginosis
BV−: Normal Microbiota
BV+: Positive Bacterial Vaginosis
Mh: Mycoplasma hominis
Av: Atopobium vaginae
MT1: Megasphaera Type 1
Gv: Gardnerella vaginalis
BVAB2: Bacterial Vaginosis-Associated Bacterium Type 2
DxVB: Diagnosis of Bacterial Vaginosis
DxVBNoMh: Diagnosis of Bacterial Vaginosis without the presence of Mycoplasma hominis
RMSEA: Root Mean Square Error of Approximation
SRMR: Standardized Root Mean Square Residual
GFI: Goodness of Fit Index
NFI: Normed Fit Index
CFI: Comparative Fit Index
PA: Path Analysis
SEM: Structural Equation Modeling
DAG: Directed Acyclic Graph
CGMs: Causal Graphical Models
UJAT: Universidad Juárez Autónoma de Tabasco

References

  1. Sethi, N.; Narayanan, V.; Saaid, R.; Ahmad Adlan, A.S.; Ngoi, S.T.; Teh, C.S.J.; Hamidi, M.; on behalf of WHOW Research Group. Prevalence, risk factors, and adverse outcomes of bacterial vaginosis among pregnant women: A systematic review. BMC Pregnancy Childbirth 2025, 25, 40.
  2. Das, S.; Basanti, N.; Singh, Y.A.; Kulnu, N. Pregnancy outcome in women with bacterial vaginosis. Int. J. Reprod. Contracept. Obstet. Gynecol. 2024, 13, 922–925.
  3. Mendling, W.; Palmeira-de-Oliveira, A.; Biber, S.; Prasauskas, V. An update on the role of Atopobium vaginae in bacterial vaginosis: What to consider when choosing a treatment? A mini review. Arch. Gynecol. Obstet. 2019, 300, 1–6.
  4. Morelli, I.; Gamboa, S. Vaginosis bacteriana en el embarazo: últimos avances hasta la fecha. Rev. Méd. Sinerg. 2022, 7, e838.
  5. Mina-Ortiz, J.B.; Franco-Macias, M.O.; Santana-Mariscal, L.A.; Garcia-Ortega, M.G. Impact on maternal and fetal health of adolescent pregnant women with bacterial vaginosis. J. Sci. MQR Investig. 2024, 8, 5241–5264.
  6. Park, F.J.; Rosca, A.S.; Cools, P.; Chico, R.M. Prevalence of bacterial vaginosis among pregnant women attending antenatal care in low- and middle-income countries between 2000 and 2020: A systematic review and meta-analysis. Reprod. Female Child Health 2024, 3, e99.
  7. Pérez-Gómez, J.F.; Canul-Reich, J.; De-La-Cruz-Hernández, E. Combination of rankings as a method for biomarker identification of bacterial vaginosis. Res. Comput. Sci. 2020, 149, 915–927. Available online: https://dblp.org/rec/journals/rcs/Perez-GomezCD20 (accessed on 15 July 2025).
  8. Hernández, H.; Canul, J.; Cruz, E. An agglomerative hierarchical clustering approach to identify coexisting bacteria in groups of bacterial vaginosis patients. Intell. Data Anal. 2023, 27, 583–611.
  9. Cruz, F.; Canul, J.; Rivera, R.; Cruz, E. Impact of data balancing a multiclass dataset before the creation of association rules to study bacterial vaginosis. Intell. Med. 2024, 4, 188–199.
  10. Bautista Hernández, C.M.; Canul Reich, J.; López Ramírez, C.; De la Cruz Hernández, E. Un modelo de Red Bayesiana para datos cualitativos de Vaginosis Bacteriana en mujeres embarazadas. Ideas Cienc. Ing. 2024, 2, 62–77.
  11. Sucar, L.E. Probabilistic Graphical Models: Principles and Applications, 2nd ed.; Springer: Cham, Switzerland, 2020.
  12. Sagaró, N.; Zamora, L. Métodos gráficos en la investigación biomédica de causalidad. Rev. Electrón. Dr. Zoilo Mar. Vidaurreta 2019, 44. Available online: https://revzoilomarinello.sld.cu/index.php/zmv/article/view/1846 (accessed on 3 August 2025).
  13. Brand, Y. Modelos de Ecuaciones Estructurales: Conceptos y Aplicaciones. Ph.D. Thesis, Universidad Nacional de Colombia, Bogotá, Colombia, 2021. Available online: https://repositorio.unal.edu.co/bitstream/handle/unal/80064/1059705148.2021.pdf (accessed on 16 August 2025).
  14. García-Avalos, M.; Canul-Reich, J.; Rodríguez-Henríquez, L.M.X.; De la Cruz-Hernández, E.N. Causal Graphical Model of Bacterial Vaginosis in Pregnant Women. Diseases 2025, 13, 375.
  15. Abundis, E.M.; Hernandez-Landero, F.; Escobar-Calderon, G.; Gomez-Crisostomo, N.; Contreras-Paredes, A.; de la Cruz-Hernandez, E. Gene expression of cardiovascular risk markers in mononuclear cells of pregnant women in relation to plasma leptin and homocysteine levels: A cross-sectional study. Int. J. Gynaecol. Obstet. 2024, 165, 350–360.
  16. Apaza, E.; Cazorla, S.; Condori, C.; Arpasi, F.; Tumi, I.; Yana, W.; Quispe, J. Correlación de Pearson o Spearman en caracteres físicos y textiles de la fibra de alpacas. Rev. Inv. Vet. Perú 2022, 33, e22908.
  17. Mendivelso, F.; Rodríguez, M. Prueba no paramétrica de correlación de Spearman. Rev. Méd. Sanitas 2021, 24, 42–45.
  18. Sanchez, E.; Contreras, A.; Martinez, E.; Garcia, D.; Lizano, M.; Cruz, E. Molecular epidemiology of bacterial vaginosis and its association with genital micro-organisms in asymptomatic women. J. Med. Microbiol. 2019, 68, 1373–1382.
  19. Bouza, C. Análisis Exploratorios de Datos Univariados Para la Ciencia de los Datos; ResearchGate: Berlin, Germany, 2023.
  20. Alwateer, M.; Atlam, E.; Abd, M.; Ghoneim, O.; Gad, I. Missing data imputation: A comprehensive review. J. Comput. Commun. 2024, 12, 53–75.
  21. Rashid, W.; Gupta, M.K. A perspective of missing value imputation approaches. In Advances in Computational Intelligence and Communication Technology: Proceedings of CICT 2019; Springer: Singapore, 2021; Volume 1086, pp. 307–315.
  22. Luzuriaga Jaramillo, H.A.; Espinosa Pinos, C.A.; Haro Sarango, A.F.; Ortiz Román, H.D. Histograma y distribución normal: Shapiro–Wilk y Kolmogorov–Smirnov aplicado en SPSS. LATAM Rev. Lat. Cienc. Soc. Humanid. 2023, 4, 596–607.
  23. Platas, V. Contrastes de Normalidad. Ph.D. Thesis, Universidad de Santiago de Compostela, Santiago de Compostela, Spain, 2021. Available online: https://minerva.usc.gal/rest/api/core/bitstreams/c8efafac-12ac-446a-9443-bb3ca39fa5f0/content (accessed on 2 September 2025).
  24. Tortora, G.J.; Funke, B.R.; Case, C.L. Microbiology: An Introduction, 14th ed.; Pearson: San Francisco, CA, USA, 2021; Available online: https://www.pearson.com/en-us/subject-catalog/p/microbiology-an-introduction/P200000005614 (accessed on 11 September 2025).
  25. Lepera, A. Introducción a los modelos de ecuaciones estructurales y su implementación en R mediante un ejemplo. Rev. Investig. Model. Matemáticos Apl. Gestión Econ. 2021, 1, 15–37.
  26. Civelek, M.E. Essentials of Structural Equation Modeling; Zea E-Books; University of Nebraska–Lincoln: Lincoln, NE, USA, 2018; Available online: https://digitalcommons.unl.edu/zeabook/64 (accessed on 28 September 2025).
  27. Cortés, F. Observación, causalidad y explicación causal. Perfiles Latinoam. 2018, 26, 1–20.
  28. Barbeau, K.; Sarr, F.; Boileau, K.; Smith, K. Path analysis in Mplus: A tutorial using a conceptual model of psychological and behavioral antecedents of bulimic symptoms in young adults. Quant. Methods Psychol. 2019, 15, 38–53.
  29. Manzano, A. Introducción a los modelos de ecuaciones estructurales. Investig. Educ. Méd. 2017, 7, 67–72.
  30. Flora, D.; Crone, G.; Bell, S. Effect size interpretation in structural equation models. Struct. Equ. Model. 2025, 32, 1069–1076.
  31. Jordan Muiños, F.M. Valor de corte de los índices de ajuste en el análisis factorial confirmatorio. Rev. Investig. Psicol. Soc. 2021, 7, 66–71.
  32. Castillo, A.; Ruiz, A. Entre lo observable y lo latente: Modelos de ecuaciones estructurales e investigación social. Rev. Reflex. 2025, 104, 1–27. Available online: https://www.scielo.sa.cr/pdf/reflexiones/v104n2/1659-2859-reflexiones-104-02-186.pdf (accessed on 3 October 2025).
  33. Hair, J.F.; Hult, G.T.M.; Ringle, C.M.; Sarstedt, M. Manual de Partial Least Squares Structural Equation Modeling (PLS-SEM), 2nd ed.; OmniaScience: Barcelona, Spain, 2019.
  34. Castro-Castrillo, C.; Duarte-Artavia, C. Bacterial vaginosis: The role of Atopobium vaginae and other anaerobic bacteria. Cienc. Lat. Rev. Científica Multidiscip. 2025, 9, 1831–1841.
Figure 1. Methodology of the study.
Figure 2. Correlation matrix between DxVBNoMh (diagnostic categories: BV− and I) and bacterial presence. Color intensity represents the strength of the correlation, with blue indicating positive values and red indicating negative values. Numerical values correspond to correlation coefficients, where larger absolute values indicate stronger associations. Only the upper triangular matrix is shown for clarity.
Figure 3. Correlation matrix between DxVBNoMh (diagnostic categories: I and BV+) and bacterial presence. Color intensity represents the strength of the correlation, with blue indicating positive values and red indicating negative values. Numerical values correspond to correlation coefficients, where larger absolute values indicate stronger associations. Only the upper triangular matrix is shown for clarity.
Figure 4. Causal graphical model depicting the relationships between bacterial species and diagnosis in the BV− and I categories. Arrows represent directed relationships, and the values next to the arrows correspond to standardized coefficients. Positive and negative signs indicate the direction of the association. Solid lines denote direct effects, while dashed lines represent covariations between variables.
Figure 5. Causal graphical model depicting the relationships between bacterial species and diagnosis in the I and BV+ categories. Arrows represent directed relationships, and the values next to the arrows correspond to standardized coefficients. Positive and negative signs indicate the direction of the association. Solid lines denote direct effects, while dashed lines represent covariations between variables.
Table 1. Statistical metrics to assess model fit in path analysis.

| Metric | Description | Expected value |
| --- | --- | --- |
| RMSEA | Root Mean Square Error of Approximation | <0.05 |
| SRMR | Standardized Root Mean Square Residual | <0.05 |
| GFI | Goodness of Fit Index | ≥0.95 |
| NFI | Normed Fit Index | ≥0.95 |
| CFI | Comparative Fit Index | ≥0.95 |
Table 2. Classification of total effect magnitudes according to Hair et al. [33].

| Absolute coefficient value | Classification |
| --- | --- |
| ≥0.50 | Very strong |
| 0.30–0.49 | Strong |
| 0.10–0.29 | Moderate |
| <0.10 | Very weak |
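The bands in Table 2 can be read as a simple threshold rule on the absolute value of a coefficient. The following is a minimal sketch of that reading, not the authors' implementation: the function name `classify_effect` is hypothetical, and it assumes that values falling between the printed bands (for example 0.295) belong to the lower band, i.e. the cut points are 0.10, 0.30 and 0.50.

```python
def classify_effect(coefficient: float) -> str:
    """Label a coefficient with the Table 2 band, using its absolute value.

    Assumption: cut points at 0.10, 0.30 and 0.50, so a value such as 0.295
    (between the printed bands) is assigned to the lower band.
    """
    magnitude = abs(coefficient)  # the sign gives direction, not strength
    if magnitude >= 0.50:
        return "Very strong"
    if magnitude >= 0.30:
        return "Strong"
    if magnitude >= 0.10:
        return "Moderate"
    return "Very weak"


# Two coefficients reported for the BV- to Indeterminate stage.
print(classify_effect(0.0260))   # Av
print(classify_effect(-0.0188))  # BVAB2
```

Both calls return "Very weak", because each magnitude is below 0.10.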
Table 3. Metric values for model (2) and model (3).

| Indicator | Value | Ideal value | Good fit |
| --- | --- | --- | --- |
| RMSEA | 0.00 | <0.05 | Yes |
| SRMR | 0.00 | <0.05 | Yes |
| GFI | 1.00 | ≥0.95 | Yes |
| NFI | 1.00 | ≥0.95 | Yes |
| CFI | 1.00 | ≥0.95 | Yes |
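The "Good fit" column of Table 3 follows from comparing each reported value against the cut-off listed in Table 1. As an illustrative sketch only (the names `THRESHOLDS`, `REPORTED` and `check_fit` are hypothetical and do not come from the study's code), the comparison can be written as:

```python
import operator

# Cut-offs from Table 1: error indices must fall below 0.05,
# fit indices must reach at least 0.95.
THRESHOLDS = {
    "RMSEA": (operator.lt, 0.05),
    "SRMR": (operator.lt, 0.05),
    "GFI": (operator.ge, 0.95),
    "NFI": (operator.ge, 0.95),
    "CFI": (operator.ge, 0.95),
}

# Values reported in Table 3 for both models.
REPORTED = {"RMSEA": 0.00, "SRMR": 0.00, "GFI": 1.00, "NFI": 1.00, "CFI": 1.00}


def check_fit(values: dict) -> dict:
    """Return, per metric, whether the value satisfies its Table 1 cut-off."""
    return {
        name: compare(values[name], cutoff)
        for name, (compare, cutoff) in THRESHOLDS.items()
    }


for metric, ok in check_fit(REPORTED).items():
    print(f"{metric}: {'Yes' if ok else 'No'}")
```

With the Table 3 values every metric passes, matching the "Yes" entries in the table.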
Table 4. Total effects of bacteria on DxVBNoMh (DVB) at each stage.

| Attribute | BV− and Indeterminate | Indeterminate and BV+ |
| --- | --- | --- |
| Atopobium vaginae | 0.0260 | 0.6260 |
| Mycoplasma hominis | 0.4400 | 0.1256 |
| Gardnerella vaginalis | 0.0388 | 0.0138 |
| Megasphaera Type 1 | 0.3184 | 0.1164 |
| Bacteria Associated with Bacterial Vaginosis Type 2 | 0.0188 | 0.3800 |
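Table 4 reports total effects. In path analysis, a total effect is conventionally obtained by summing, over every directed path from a variable to the outcome, the product of the standardized coefficients along that path (the direct effect plus all indirect effects). The sketch below illustrates that rule on a made-up three-node graph. The node names `X`, `M`, `Y`, the function `total_effect` and all coefficients are hypothetical; they are not values from the fitted models.

```python
def total_effect(paths: dict, source: str, target: str) -> float:
    """Sum the products of coefficients over all directed paths source -> target.

    `paths` maps each node to {child: coefficient}. The graph must be acyclic.
    """
    if source == target:
        return 1.0  # empty path: multiplicative identity
    return sum(
        coefficient * total_effect(paths, child, target)
        for child, coefficient in paths.get(source, {}).items()
    )


# Hypothetical DAG: X -> Y directly (0.2) and indirectly through M (0.5 * 0.4).
example_paths = {
    "X": {"M": 0.5, "Y": 0.2},
    "M": {"Y": 0.4},
}

direct = example_paths["X"]["Y"]
total = total_effect(example_paths, "X", "Y")
print(f"direct={direct:.2f} indirect={total - direct:.2f} total={total:.2f}")
```

Here the indirect contribution (0.5 × 0.4 = 0.20) is added to the direct one (0.20), so the total effect of `X` on `Y` is 0.40.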
Table 5. Interpretation of the total effects of each bacterium on the BV− and Indeterminate diagnosis.

| Bacterium (total coefficient) | Impact on the diagnosis | Contextual interpretation |
| --- | --- | --- |
| Av (0.0260) | Very weak | Av begins to show a very slight influence, possibly acting as a companion organism in the initial stages of imbalance. |
| Mh (−0.4400) | Moderate | Mh shows an effect indicating that its presence does not contribute to the transition to the indeterminate state and may even delay it. |
| Gv (0.0388) | Very weak | Gv has a minimal impact on this initial transition, indicating that its presence alone is insufficient to significantly alter the vaginal state. |
| MT1 (0.3184) | Moderate | MT1 contributes more to the shift from a healthy to an indeterminate state, indicating a possible association with alterations in the vaginal microbiota. |
| BVAB2 (−0.0188) | Very weak | Its effect indicates that it is not yet involved in the dysbiosis shift. |
Table 6. Interpretation of the total effects of each bacterium on the Indeterminate and BV+ diagnosis.

| Bacterium (total coefficient) | Impact on the diagnosis | Contextual interpretation |
| --- | --- | --- |
| Av (0.6260) | Strong | Av has the strongest effect at this stage, reinforcing its role as a strongly associated marker of a positive diagnosis. |
| Mh (0.1256) | Weak | Mh changes sign compared to the previous stage, indicating that it may become involved in more advanced phases of vaginal imbalance. |
| Gv (−0.0138) | Very weak | Gv does not contribute to the change from the indeterminate state to a positive diagnosis. |
| MT1 (0.1164) | Weak | MT1 still contributes but with less weight. It participates in the progression, although its impact decreases compared with that of the previous stage. |
| BVAB2 (0.3800) | Moderate | BVAB2 is strongly associated with the progression toward BV+, confirming its role as a predominant indicator of vaginal dysbiosis. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

García-Avalos, M.; Canul-Reich, J.; Rodríguez-Henríquez, L.M.X.; Cruz-Hernández, E.N.D.l. Causal Graphical Models for Transition from Healthy Vaginal Microbiota to Bacterial Vaginosis in Pregnant Women. BioMedInformatics 2026, 6, 32. https://doi.org/10.3390/biomedinformatics6030032
