Identifying Predictive Bacterial Markers from Cervical Swab Microbiota on Pregnancy Outcome in Women Undergoing Assisted Reproductive Technologies

Background and aims: Failure of the embryo to implant causes about three-fourths of lost pregnancies. The female genital tract microbiota has been associated with Assisted Reproductive Technologies (ART) outcomes. The objective of this study was to analyze the microbiota of human cervical swabs and to correlate these findings with ART outcomes. Materials and Methods: In this study, 88 cervical swabs were collected from women with various causes of infertility at the beginning of their ART protocols. After microbial DNA extraction, the V3–V4 variable regions of the 16S rRNA gene were amplified and sequenced on the Illumina MiSeq platform. PEnalized LOgistic Regression Analysis (PELORA) was performed to identify clusters of bacterial populations with differential abundances between patients with unfavorable and favorable pregnancy outcomes. Results: We identified a core of microorganisms at lower taxonomic levels that were predictive of women's pregnancy outcomes. Statistically significant differences were identified at the species level for Lactobacillus salivarius and Lactobacillus rhamnosus, among others. Moreover, Lactobacillus crispatus and Lactobacillus iners, whose abundance was respectively increased and decreased in the favorable group as compared to the unfavorable group, fell within the core of microorganisms associated with a positive ART outcome. Although the predominance of lactobacilli is generally considered advantageous for ART outcome, we found that Bifidobacterium (together with the other lactobacilli) was also more abundant in the favorable group. Discussion: The cervix is colonized by microorganisms that can play a role in ART outcomes, as seen by an overall decrease in embryo attachment rates and pregnancy rates in both fertile and infertile women. If confirmed in a larger cohort, the abundance of these bacteria could be useful not only as a marker of unfavorable pregnancy outcome but could also open the way to new interventional strategies based on genital tract microbiota manipulation in order to increase pregnancy rates in women undergoing assisted reproductive technologies.


Introduction
Although around 95% of the trillion microorganisms constituting the microbiota reside within the gut, the remaining 5% are located at other body sites, including the genital tract. The vaginal microbiota is characterized by a low bacterial diversity and a high relative abundance of Lactobacillus species, whose dominance is even more conspicuous during pregnancy [1]. Vaginal microbiota composition is strongly influenced by genetic, environmental, individual and lifestyle factors [2]. Recent evidence indicates that the composition of the cervicovaginal microbiota plays a role in pregnancy outcome and can be linked to adverse obstetric outcomes such as preterm birth, a leading cause of neonatal morbidity and mortality worldwide [3,4]. Moreover, the microorganisms residing within the female reproductive tract have been associated with Assisted Reproductive Technologies (ART) outcomes [5]. Improving the outcome of patients undergoing ART has been one of the most important clinical challenges in this field ever since the first live birth took place in 1978 [6]. Over the years, numerous studies have investigated factors related to ART failure, for instance sperm quality and female age. Besides the role of viral infections interfering with pregnancy outcome, as previously published [7–9], another factor has over time joined those mentioned above: the microbial composition of the female reproductive tract [10,11]. Moreno et al. were among the pioneers in demonstrating, in 2016, the presence of endometrial microorganisms associated with ART outcomes [12], giving rise to a line of research aimed at improving the reproductive health of women with a focus on the cervicovaginal microbiota. The cervicovaginal microbiota is characterized by a high abundance of the Lactobacillus genus, some species of which, such as L. crispatus, L. gasseri and L. jensenii, are able to produce lactic acid and hydrogen peroxide (H2O2) in the female reproductive tract, inhibiting the growth of other bacteria and viruses [13]. Several studies have classified the cervicovaginal microbiota of reproductive-age women into six groups named Community State Types (CSTs). Four are characterized by a predominant species: Lactobacillus crispatus (CST I), Lactobacillus gasseri (CST II), Lactobacillus iners (CST III) and Lactobacillus jensenii (CST V). The remaining two, CST IV-A and CST IV-B, are made up of a wide range of anaerobic and facultative bacteria, such as Gardnerella, Megasphaera, Atopobium and Prevotella [3,14]. In particular, CST IV-A differs from CST IV-B in its higher abundance of bacterial vaginosis-associated bacterium 1 (BVAB1), a species associated with common vaginal disorders and belonging to the order Clostridiales [15]. To date, only a limited number of studies on the relationship between female genital tract microbiota and pregnancy outcome are available. The cervical swab is the current method for obtaining the biological matrix used to analyze the microbiota of the cervix, because it is minimally invasive. Given the anatomical structure of this site, one of the advantages of the cervical swab is the reduced risk of cross-contamination at sampling time [16]. Our study aimed to characterize the cervical swab microbiota of patients undergoing ART and to detect clusters of bacteria that were predictive (i.e., with differential abundance levels) of unfavorable and favorable pregnancy outcomes, respectively.

Study Population
Cervical swabs were collected from 90 women diagnosed with different types of infertility, immediately before oocyte pick-up and before external and internal disinfection, from November 2020 to May 2021, before they underwent assisted reproductive technology protocols at the outpatient division of Nuova Ricerca Hospital. All patients provided their signed informed consent. Ethical approval was obtained from the review board of the Human Ethics Committee of Nuova Ricerca Hospital under C.E. approval number 001\2020. Pregnancies were initially diagnosed by serum HCG and then confirmed as clinical pregnancies by ultrasound visualization of a gestational sac with heartbeat. Of the 90 patients, 2 tested positive after embryo implantation but miscarried 8 weeks later and were therefore excluded from the analysis. The remaining 88 patients were divided into two groups according to ART outcome: 39 women became pregnant, while the remaining 49 did not. Exclusion criteria were: vaginal infection and antibiotic use in the 30 days before the ART protocol, no previous pregnancy, previous history of pelvic inflammatory disease (PID), body mass index above 30, age above 40 and a preimplantation genetic test (PGT) positive for genetic diseases. Patients' clinical data are described in Table 1.

Sample Collection and DNA Extraction
A cervical swab was collected from each study participant in a tube containing a DNA stabilization buffer (Copan, Brescia, Italy, cat. n. 608C). After centrifugation at 7500 rpm for 10 min, total DNA was extracted using the QIAamp DNA blood and tissue Kit (Qiagen, Milan, Italy, cat. n. 69504) following the manufacturer's instructions. At the end of the isolation protocol, DNA was checked for concentration and purity and stored at −80 °C until use.

Statistical Analysis
Demographic and clinical characteristics of patients with unfavorable and favorable pregnancy outcomes were reported as mean ± standard deviation (SD) or median along with interquartile range (i.e., first-third quartiles) for continuous variables and as observed frequencies (and percentages) for categorical variables. For each continuous variable, the assumption of normality was checked by means of quantile-quantile (Q-Q) plots and the Shapiro-Wilk test. Comparisons between groups were performed by two-sample t-test and Chi-square test (or Fisher exact test, as appropriate) for continuous and categorical variables, respectively. Stacked bar charts were used to show the cervical swab microbiota composition (i.e., mean relative abundance %) at phylum, family, genus and species levels in patients with unfavorable and favorable pregnancy outcomes. To identify clusters of bacterial populations such that the linear combination of their abundances was differential between patients with unfavorable and favorable pregnancy outcomes, the PEnalized LOgistic Regression Analysis (PELORA) was performed [19]. This algorithm is mainly used to find predictive gene signatures from microarray data by using supervised grouping techniques. To this purpose, a standardized Z-score of each bacterium's relative abundance (%) was computed as follows: as a first step, the abundance was logit transformed (i.e., computing the natural logarithm of the ratio between the relative abundance proportion and its complement) and, as a second step, the logit-transformed variable was standardized by subtracting its mean and dividing by its SD. When the relative abundance was exactly 0%, the logit transformation could not be performed for that value; to overcome this issue, such a percentage was replaced by 0.001% for the computation of the Z-score only. Using the PELORA algorithm, multiple clusters of bacterial populations can be detected. Each cluster has the characteristic that its centroid (i.e., the mean of the Z-scores of all identified bacteria within the cluster) is significantly higher (or lower) in one of the two compared groups (i.e., patients with unfavorable and favorable pregnancy outcomes). Two free parameters must be set by the user in the PELORA algorithm: the number of centroids and the penalty parameter (λ). The number of centroids was set to vary between one and two, because we were mainly interested in detecting no more than two informative pathways for each scenario, whereas different values of λ (0, 1/32, 1/16, 1/8, 1/4, 1/2, 1) were evaluated by performing 200 bootstrap resamplings of the data and recording the overall misclassification rate. For each specific scenario, the penalty parameter that achieved the lowest median misclassification rate (across the bootstrap samples) was chosen. Comparisons between Z-score means were assessed by two-sample t-test. Scatter plots (or box plots) of the Z-scores computed at cluster centroids, as well as heatmaps of the relative abundance (%) of the bacteria identified by PELORA within each cluster, were shown at phylum, family, genus and species levels. Two-sided p-values < 0.05 were considered statistically significant. All statistical analyses and plots were performed in the computing environment R (R Development Core Team 2008, version 4.1, packages: supclust, ggplot2, gridExtra).
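The transformation described above can be illustrated with a minimal sketch. It is written in standard-library Python rather than in R, and it is not the authors' code or the supclust implementation: the function names, the use of the sample SD and the input values are assumptions made for illustration only. It covers the logit transform with the 0.001% replacement, the standardization, the per-sample cluster centroid, and the choice of λ by lowest median bootstrap misclassification rate.

```python
import math
import statistics

ZERO_REPLACEMENT_PCT = 0.001  # from the text: an abundance of 0% is replaced by 0.001%


def logit_percent(abundance_pct):
    """Natural log of p / (1 - p), where p is the relative abundance as a proportion.

    An abundance of exactly 100% is also undefined; the text does not say how
    that case is handled, so it is left unhandled here.
    """
    if abundance_pct == 0:
        abundance_pct = ZERO_REPLACEMENT_PCT
    p = abundance_pct / 100.0
    return math.log(p / (1.0 - p))


def z_scores(abundances_pct):
    """Logit-transform one taxon across samples, then subtract the mean and divide by the SD."""
    logits = [logit_percent(a) for a in abundances_pct]
    mean = statistics.mean(logits)
    sd = statistics.stdev(logits)  # sample SD: an assumption, the text does not specify
    return [(x - mean) / sd for x in logits]


def cluster_centroid(z_by_taxon):
    """Per-sample centroid: the mean Z-score over all taxa assigned to one cluster."""
    return [statistics.mean(sample) for sample in zip(*z_by_taxon.values())]


def pick_penalty(rates_by_lambda):
    """Return the lambda whose bootstrap misclassification rates have the lowest median."""
    return min(rates_by_lambda, key=lambda lam: statistics.median(rates_by_lambda[lam]))


# Hypothetical abundances (%) for two taxa measured in four samples.
cluster = {
    "taxon_a": z_scores([0.0, 5.0, 20.0, 50.0]),
    "taxon_b": z_scores([1.0, 2.0, 10.0, 30.0]),
}
centroids = cluster_centroid(cluster)
```

The cluster membership itself is what PELORA searches for; the sketch only shows how a centroid is obtained once the taxa of a cluster are known.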

Sample Characteristics
Since it is known that bacteria may influence pregnancy outcome, the study participants undergoing ART were classified in two subgroups, according to whether they had a favorable or unfavorable pregnancy outcome at the end of the study. Clinical/pathological and demographic characteristics of these two subgroups of patients undergoing ART are summarized in Table 1. The two groups were homogeneous for all the examined characteristics except for Oligo-Astheno-Teratozoospermia (OAT) distribution (p = 0.002).

Comparison of Cervical Fluid Microbiota Composition between Patients with Favorable or Unfavorable ART Outcome
In order to assess whether the cervical fluid microbiota discriminates between patients undergoing ART with a favorable or unfavorable outcome, its composition in the two cohorts of patients was analyzed by 16S rRNA gene sequencing. A total of 12,897,869 quality-filtered read pairs were obtained from 88 study participants, with an average of 146,566 read pairs per sample (SD ± 69,272). Figure 1 reports the cervical fluid bacterial communities at the phylum, family, genus and species level detected in the 49 patients with unfavorable and the 39 patients with favorable pregnancy outcome. As expected, Firmicutes was the most abundant phylum, accounting for about 82.2% and 73.5% of all bacteria in the unfavorable and favorable groups, respectively, with no significant difference between them. The other most abundant phylum was Actinobacteria. Consequently, Bifidobacteriaceae and Lactobacillaceae were the predominant families in both groups. At the genus level, a significant increase of Bifidobacterium was detected within the favorable group, while the unfavorable group was characterized by a significantly increased presence of Atopobium. Of note, at the species level, the relative abundance of Lactobacillus iners was increased within the unfavorable group, together with that of Atopobium vaginae, as discussed below.
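Figure 1 reports mean relative abundances per group and pools every taxon whose mean is below 1% into an "Others (<1%)" category. A minimal Python sketch of that summary step follows; the function names and the abundance values are hypothetical and are not taken from the study data or the authors' code.

```python
def mean_relative_abundance(samples):
    """Average each taxon's relative abundance (%) over samples; a taxon missing from a sample counts as 0%."""
    taxa = sorted({taxon for sample in samples for taxon in sample})
    return {t: sum(s.get(t, 0.0) for s in samples) / len(samples) for t in taxa}


def pool_rare_taxa(mean_abundance_pct, threshold_pct=1.0, label="Others (<1%)"):
    """Keep taxa at or above the threshold and sum the remaining ones under a single label."""
    kept = {t: v for t, v in mean_abundance_pct.items() if v >= threshold_pct}
    pooled = sum(v for v in mean_abundance_pct.values() if v < threshold_pct)
    if pooled > 0:
        kept[label] = pooled
    return kept


# Hypothetical group of two samples (values in %, not study data).
group = [
    {"Lactobacillus": 90.0, "Bifidobacterium": 9.5, "taxon_x": 0.5},
    {"Lactobacillus": 70.0, "Bifidobacterium": 29.5, "taxon_x": 0.5},
]
summary = pool_rare_taxa(mean_relative_abundance(group))
```

Pooling after averaging, as done here, matches the caption's wording ("mean relative abundance less than 1%"); pooling per sample before averaging would give a different chart.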

PELORA Algorithm Identified Bacterial Populations Associated with Favorable or Unfavorable Pregnancy Outcome
Based on the relative abundances generated by taxonomic analyses, the PELORA algorithm was performed to identify clusters of bacterial populations that best discriminate patients with favorable from those with unfavorable pregnancy outcome. The list of the bacteria detected by the algorithm within each cluster is reported in Table 2.

Table 2. Results from PEnalized LOgistic Regression Analysis (PELORA). The PELORA algorithm identified clusters of bacterial populations such that the linear combination of their abundances (Z-scores) is differential between patients with unfavorable and favorable pregnancy outcomes, respectively.

Taxa Level; Cluster; Cluster centroid Z-score (means), Mean ± SD: 0.085 ± 0.207; −0.107 ± 0.140; <0.001
Abbreviations: IQR, interquartile range (i.e., first-third quartiles); SD, standard deviation; Absent, all values are 0%.
* This is a one-element cluster: the Proteobacteria represents a cluster in itself and therefore its Z-score mean corresponds to the centroid Z-score mean.
Standardized Z-score: as a first step, the relative abundance (%) of each bacterium was logit transformed (so that values can theoretically range from negative to positive infinity) and, as a second step, the Z-score was computed by standardizing the transformed variable (i.e., taking the variable values, subtracting its mean and dividing by its SD). The centroid is calculated as the average of Z-scores for all those variables selected within each cluster.
# All p-values were derived from the parametric two-sample t-test on Z-scores, with the exception of those marked as "§", which were instead derived from the Mann-Whitney U test. The latter was performed when one of the two groups had no variance (i.e., when the group has all values equal to 0%, denoted as "Absent").
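The footnote to Table 2 describes a rule for choosing between the two-sample t-test and the Mann-Whitney U test. The sketch below, in standard-library Python, shows that rule together with the two test statistics. It is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the pooled-variance (Student) form of the t statistic is an assumption since the text does not say whether equal variances were assumed, and p-values are deliberately not computed.

```python
import math
import statistics


def choose_test(abundance_x, abundance_y):
    """Footnote rule: Mann-Whitney U if one group is all 0% ("Absent"), otherwise the t-test."""
    if all(v == 0 for v in abundance_x) or all(v == 0 for v in abundance_y):
        return "mann-whitney"
    return "t-test"


def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance (assumed form; Welch's version is the alternative)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))


def mann_whitney_u(x, y):
    """U statistic for x: each (x, y) pair with x > y counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical relative abundances (%): the taxon is absent in the second group.
present = [0.2, 0.0, 0.4]
absent = [0.0, 0.0]
test_name = choose_test(present, absent)
u_stat = mann_whitney_u(present, absent)
```

In the paper the t-test is applied to Z-scores, whereas "Absent" is defined on the raw percentages, which is why `choose_test` takes abundances as input.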

Figure 1. Cervical swab microbiota composition (i.e., mean relative abundance %) at phylum (A), family (B), genus (C) and species (D) levels in patients with unfavorable and favorable pregnancy outcome. All bacteria with mean relative abundance less than 1% are included in the "Others (<1%)" category.

At the phylum level, two clusters were detected, but no significant differences were found either in the cluster centroids or in any bacterium included within each cluster. At the family level, only one cluster (which included 4 bacteria) was detected, showing a significant increase in the abundance of unkn. Alphaproteobacteria (c), Yersiniaceae and Streptococcaceae in the favorable group as compared to the unfavorable group (p < 0.001, p < 0.007 and p < 0.036, respectively). Furthermore, the Z-scores of the cluster centroid were significantly different between the two groups (p < 0.001). At the genus level, two clusters (which included 11 and 7 bacteria, respectively) were detected. The first showed a statistically significant increase in the abundance of unkn. Alphaproteobacteria (c) (p < 0.001), Yersinia (p < 0.009), Streptococcus (p < 0.035), Bifidobacterium (p < 0.011), Enterobacter (p < 0.008), unkn. Sphingomonadaceae (f) (p < 0.002) and Micrococcus (p < 0.001), as well as in the cluster's centroid (p < 0.001), in patients with a favorable pregnancy outcome with respect to those with an unfavorable one. The second cluster showed a statistically significant increase only in the abundance of Peptoniphilus (p < 0.025), as well as in the cluster's centroid (p < 0.001), in patients with an unfavorable pregnancy outcome with respect to those with a favorable one. At the species level, two clusters (which included 22 and 20 bacteria, respectively) were detected. The first showed a statistically significant increase in the abundance of unkn. Alphaproteobacteria (c) (p < 0.001), unkn. Serratia (g) (p < 0.001), Lactobacillus psittaci (p < 0.045), unkn. Bifidobacterium (g) (p < 0.019), Streptococcus anginosus (p < 0.004), Yersinia pseudotuberculosis (p < 0.029), Lactobacillus casei (p < 0.010), unkn. Sphingomonadaceae (f) (p < 0.002) and unkn. Micrococcus (g) (p < 0.001), as well as in the cluster's centroid (p < 0.001), in patients with a favorable pregnancy outcome with respect to those with an unfavorable one. The second cluster, instead, showed a statistically significant increase in the abundance of unkn. Firmicutes (p) (p < 0.033), Anaerococcus prevotii (p < 0.023), Peptoniphilus lacrimalis (p < 0.041) and Peptoniphilus timonensis (p < 0.041), as well as in the cluster's centroid (p < 0.001), in patients with an unfavorable pregnancy outcome with respect to those with a favorable one. It is of note that the latter two bacteria were found to be completely absent among patients with the favorable outcome. The distribution of Z-scores computed at cluster centroids is graphically represented in Figure 2 at the different taxa levels, showing that two clusters, each formed by a linear combination of specific microorganisms residing within the cervical fluid, were able to clearly discriminate (except at the phylum level) patients with unfavorable and favorable pregnancy outcomes. Specifically, patients with an unfavorable outcome are characterized by lower Z-scores from cluster 1 and higher Z-scores from cluster 2, whereas those with a favorable outcome are characterized by higher Z-scores from cluster 1 and lower Z-scores from cluster 2. At the family level, where a single cluster was present, the centroid Z-scores of patients with an unfavorable outcome were significantly lower than those of patients with a favorable one. Moreover, the heatmaps reported in Figure 3 show the relative abundance of each of the microorganisms detected within each cluster at the phylum (A), family (B), genus (C) and species (D) level for each recruited subject in the unfavorable and favorable groups, respectively.


Discussion
The vaginal microbiota has been investigated in several studies highlighting its importance in maintaining a healthy female reproductive system. Differences in the composition of the bacteria that populate the vaginal tract can be the cause of vaginal infections or dysfunctions [20]. Although no statistical significance was observed for tubal pathology, it is of note that it was present in 6.8% of all subjects, with only one case (2%) in the unfavorable group and five (12.8%) in the favorable group (Table 1), underlining the importance of a core microbiota involved in the pregnancy rate. The known importance of the microbiota residing within the reproductive system for reproductive health led us to analyze whether microbiota composition may influence the outcome of ART cycles. For the same reason, previous studies analyzed how the prevalence of particular bacteria can influence ART outcomes such as the implantation rate or abortion rate [21]. Moreover, in the last few years bacteria have been detected in the highest parts of the female reproductive system, such as the uterus and ovaries [5]. In this upper part of the reproductive system, the composition of the microbiota is different from the vaginal one. In particular, the uterine microbiota is characterized by a lower diversity in terms of number of bacterial species as compared to the vaginal microbiota [22]. Unlike the vagina, the upper reproductive tract was considered sterile and was little investigated until several years ago. Different studies then proved that all components of the upper reproductive tract, i.e., the uterus, fallopian tubes and ovaries, are populated by different species of bacteria [21–23]. The bacterial population of the upper reproductive tract has proved to be less abundant than the vaginal one but contains a richer variety of different species. Lactobacillus species are still the most abundant, but less so than in the vaginal tract. The bacteria that populate the uterus have been proposed to be responsible for protecting the endometrium from infection and modulating its function [24]. The uterine microbiota can be analyzed to understand how it influences embryo implantation. Nevertheless, samples of endometrial tissue are hard to obtain and the procedure is invasive for patients. For these reasons, studies that analyze the uterine microbiota are still few in number. The purpose of this study was to analyze the spectrum of species that populate the uterine cervix in patients undergoing ART. Cervical swabs, in fact, could give a result more representative of the uterine cervix than a vaginal sample. It was already reported that a microbiota poorly dominated by lactobacilli is correlated with a higher probability of failure of in vitro fertilization (IVF) treatment [12]. Previous studies used different methods to analyze microbiota composition and different techniques to obtain samples. For these reasons, more studies still have to be carried out to understand how the microbiota of the female reproductive tract could influence ART outcomes. The discovery of an "ideal" core of microorganisms able to increase the chances of implantation could lay the groundwork for therapies that modulate bacterial composition. We found a core of microorganisms, listed in Table 2, whose abundance or scarcity is associated with a favorable or unfavorable ART outcome. For instance, the increased abundance of Atopobium vaginae within the unfavorable ART outcome group can be explained by the fact that this microorganism is associated with bacterial vaginosis [25], which reduces the rate of pregnancy success [26,27]. Moreover, our data showed that Lactobacillus crispatus and Lactobacillus iners increased and decreased, respectively, in the favorable group as compared to the unfavorable group. As reported in a previous study, their abundance over a certain limit is also important for the ART outcome [10]. Lactobacilli are fundamental as they lower the vaginal pH through the production of lactic acid, generating an unfavorable habitat for many pathogens [28]. Based on this rationale, it has been suggested that colonizing the reproductive tract with different species of Lactobacillus to achieve a "healthy" profile, through the administration of H2O2-producing L. crispatus, could enhance the success rate of ART [29]. However, although the predominance of lactobacilli is generally considered to be advantageous for ART outcome [10], we found that Bifidobacterium (together with the other lactobacilli) was also more abundant in the favorable group. Indeed, it has been suggested that bifidobacteria contribute to a healthy vaginal microbiota [25] and are associated with a lower risk of preterm birth [26]. Our data suggest that cervical swab microbiota profiles could be useful not only to detect markers of unfavorable pregnancy outcome, if confirmed in larger cohorts, but also in paving the way for new interventional strategies based on genital tract microbiota manipulation in order to increase pregnancy rates in women undergoing assisted reproductive technologies.
Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of the Human Ethics Committee of Nuova Ricerca Hospital under C.E. approval number 001\2020 and registered on 20 October 2020. All patients provided their signed informed consent.
Informed Consent Statement: Written informed consent has been obtained from the patients to publish this paper.
Data Availability Statement: Data regarding this study are available upon reasonable request.