Ovarian Morphology in Non-Hirsute, Normo-Androgenic, Eumenorrheic Premenopausal Women from a Multi-Ethnic Unselected Siberian Population

Polycystic ovary syndrome (PCOS) is a highly prevalent disorder in women, and its diagnosis rests on three principal features: ovulatory/menstrual dysfunction, clinical and/or biochemical hyperandrogenism, and polycystic ovarian morphology (PCOM). Currently, data on age- and ethnicity-dependent features of PCOM remain insufficient. We aimed to estimate ethnicity- and age-dependent differences in ovarian volume (OV) and follicle number per ovary (FNPO) in a healthy, medically unbiased population of Caucasian and Asian premenopausal women, who participated in the cross-sectional Eastern Siberia PCOS epidemiology and phenotype (ESPEP) study (ClinicalTrials.gov ID: NCT05194384) in 2016–2019. The study population consisted of 408 non-hirsute, normo-androgenic, eumenorrheic premenopausal women aged 18–44 years. All participants underwent a uniform evaluation including a review of their medical history and a physical examination, blood sampling, and pelvic ultrasonography. The statistical analysis included non-parametric tests and the estimation of the upper normal limits (UNLs) by 98th percentiles for OV and FNPO. In the total study population, the upper OV percentiles did not differ by ethnicity or age group. By contrast, the UNL of FNPO was higher in Caucasian women than in Asian women, and women aged <35 years demonstrated a higher UNL of FNPO compared to older women. In summary, these data suggest that the estimation of FNPO, but not OV, should take into account the ethnicity and age of the individual in estimating the presence of PCOM.


Introduction
One of the criteria of polycystic ovary syndrome (PCOS) in the majority of cases is the polycystic structure of the ovaries [1]. The international guideline for the assessment and management of PCOS patients, published in 2018 [1] and updated in 2023 [2,3], proposed considering the Rotterdam 2003 criteria for polycystic ovarian morphology (PCOM) as the standard during PCOS diagnosis, and it simultaneously stressed the need to take into account racial and age characteristics.
Previously, ethnic variations in follicle number and/or ovarian volume (OV) have been demonstrated by some authors in different populations. For example, in Chinese women, who showed a smaller ovarian volume and a lower number of follicles than women in the European population, the following values were proposed as sufficient criteria for diagnosing PCOM: ≥6.3 cm³ and ≥10 follicles [3,4]. In Turkish women, ovarian estimates were also lower than those of the Western female population, with the following threshold criteria for PCOM: an ovarian volume of 6.43 cm³ and a number of follicles ≥ 8 [5]. According to the current guideline for the assessment and management of PCOS patients, for PCOM diagnosis, the ovarian volume is a much more reliable indicator than the number of follicles [1,2]. Nevertheless, in the population of Korean women, the number of follicles was considered to be a more significant criterion for polycystic disease than the volume of the ovary, due to the smaller volume of the ovaries in the Asian race [6,7].
Age-related processes in women involve a reduction in the number of growing antral follicles. The volume of the ovaries and the number of follicles depend on the stage of the reproductive period, reaching a maximum value in adolescence, with a gradual decrease in adulthood and a fast decrease at the age of menopause [8–12]. At the same time, the number of follicles decreases faster than OV. Regarding PCOM, in women aged ≥35 years, the prevalence of polycystic ovaries is 7.8% vs. 21% in younger women [7,8].
According to the international guidelines on the diagnosis and management of patients with PCOS, transvaginal ultrasonography should be performed in the early follicular phase of the natural cycle or after withdrawal bleeding caused by pharmaceuticals. Currently, the criteria for PCOM in women aged 18–35 years are as follows: ≥20 follicles in at least one ovary and/or ovarian volume ≥ 10 mL, without the presence of dominant follicles, cysts, or corpus luteum [1,2,13]; this approach to follicle number estimation is applicable if a transducer above 8 MHz is used. However, in clinical practice, as well as in epidemiological studies, equipment with a sensor frequency of 4–8 MHz is still widely used [1,2,13].
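The criterion quoted above can be written as a simple rule. The following is a minimal sketch rather than a clinical tool: the function name and parameters are hypothetical, the thresholds are the ones quoted in the text for women aged 18–35 years, and it assumes that the volume criterion, like the follicle criterion, is applied per ovary and that ovaries with dominant follicles, cysts, or a corpus luteum have already been set aside.

```python
def meets_pcom_criteria(fnpo_per_ovary, volume_ml_per_ovary,
                        fnpo_threshold=20, volume_threshold_ml=10.0):
    """Return True if at least one ovary reaches either threshold.

    fnpo_per_ovary       -- follicle counts, one entry per assessed ovary
    volume_ml_per_ovary  -- ovarian volumes in mL, one entry per assessed ovary
    Thresholds default to the values quoted in the text (assumption: both
    criteria are evaluated per ovary).
    """
    many_follicles = any(n >= fnpo_threshold for n in fnpo_per_ovary)
    enlarged = any(v >= volume_threshold_ml for v in volume_ml_per_ovary)
    return many_follicles or enlarged


# Made-up illustrative values, not study data.
print(meets_pcom_criteria([12, 21], [6.0, 7.5]))  # follicle criterion met in one ovary
print(meets_pcom_criteria([12, 15], [6.0, 7.5]))  # neither criterion met
```

Passing the thresholds as parameters makes it easy to substitute population-specific cut-offs, which is the question this study addresses.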
As previously shown in epidemiological studies, diagnostic criteria for PCOM based on ovarian volume (OV) and follicle number per ovary (FNPO) could be determined using different approaches: (a) by performing receiver operating characteristic (ROC) curve analyses (which report the diagnostic power of a parameter to distinguish between diseased and non-diseased conditions and propose thresholds that balance test sensitivity and test specificity) [14] or (b) by means of cluster analysis in a large population-based unselected cohort. To establish diagnostic criteria for polycystic ovarian morphology, some authors also utilized the upper (95th–98th) percentiles in a well-characterized cohort of women with regular predictable menstrual cycles of 21–35 days in length and no clinical and/or biochemical evidence of hyperandrogenism (HA), recruited from the same population and examined in a similar manner as the study subjects [14].
In general, the data on age- and ethnicity-dependent diagnostic criteria for PCOM remain insufficient and may vary in different geographical zones. In this study, we aimed to estimate the upper percentiles for OV and FNPO in an unselected sub-population of healthy, non-hirsute, normo-androgenic, eumenorrheic premenopausal Eastern Siberian women to determine the need for ethnicity- and age-dependent diagnostic criteria for PCOM.

Materials and Methods
Study design and population. Study subjects were recruited during the cross-sectional institution-based prospective Eastern Siberia PCOS epidemiology and phenotype (ESPEP) study (ClinicalTrials.gov ID: NCT05194384), conducted in two major areas of Eastern Siberia (the Irkutsk region and the Buryat Republic, Russian Federation) from March 2016 to December 2019, as previously described [15,16]. In brief, a total population of 1490 premenopausal women underwent a mandatory annual employment health assessment. Of these, all women aged 18–44 years who were found to be completely healthy constituted the study population (Figure 1). The inclusion criteria for the current study population included a history of regular, predictable menstrual cycles of 21–35 days in length and no clinical signs of HA and/or elevated testosterone and/or DHEAS levels and/or FAI [16]. Women with a BMI < 18 or ≥30 kg/m², premature ovarian failure (based on history or due to elevated FSH), treated and untreated hyperprolactinemia (based on history or increased prolactin level > 727 mIU/mL), untreated thyroid disorder (based on history or TSH level > 4 mIU/mL), and 21-hydroxylase deficient non-classical adrenal hyperplasia (based on increased 17-hydroxyprogesterone (17-OHP) > 7.0 nmol/L) were excluded. The main characteristics of the study population, including their socio-demography, menstrual and reproductive history, anthropometry, vital signs, and hormone profiles, in total and by ethnicity, are presented in the Supplement (Table S1).
Study Protocol. As previously described, subjects were evaluated consecutively by trained personnel and by means of questionnaires, anthropometry, vital signs, and gynecological examination. Anthropometry measurements included height, weight, and waist circumference (WC). Their body mass index (BMI) was calculated as weight (kg)/height² (m²). Hirsutism was defined by the modified Ferriman-Gallwey (mFG) visual hirsutism score scale [15,17].
Hormonal analyses. Blood samples were obtained in the morning and analyzed for serum hormones, including total testosterone (TT), DHEAS, sex hormone-binding globulin (SHBG), and prolactin.
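The numeric exclusion cut-offs and the BMI formula given in this section can be illustrated with a short screening sketch. This is an illustration under stated assumptions, not the study's actual procedure: the `Participant` class, its field names, and `passes_numeric_screen` are hypothetical, laboratory values are assumed to be in the same units as the cut-offs quoted in the text, and history-based exclusions as well as the FSH criterion (for which no threshold is given) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # Hypothetical record layout; units follow the cut-offs quoted in the text.
    weight_kg: float
    height_m: float
    prolactin: float       # compared against the cut-off of 727
    tsh: float             # compared against the cut-off of 4
    ohp17_nmol_l: float    # 17-OHP, compared against 7.0 nmol/L


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def passes_numeric_screen(p):
    """True if none of the numeric exclusion cut-offs is triggered.

    Covers only BMI, prolactin, TSH, and 17-OHP; history-based exclusions
    and the FSH criterion are not modelled here.
    """
    return (
        18 <= bmi(p.weight_kg, p.height_m) < 30
        and p.prolactin <= 727
        and p.tsh <= 4
        and p.ohp17_nmol_l <= 7.0
    )


# Made-up example values, not study data.
example = Participant(weight_kg=60, height_m=1.65, prolactin=300, tsh=2.0, ohp17_nmol_l=3.0)
print(round(bmi(example.weight_kg, example.height_m), 1), passes_numeric_screen(example))
```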
Pelvic ultrasound (U/S) was performed by experienced specialists who were trained to conduct the U/S scans uniformly, with the appropriate inter-observer variability. We used a Mindray M7 system (Mindray Bio-Medical Electronics Co., Shenzhen, China) with a transvaginal probe (5.0–8.0 MHz). Ovarian volume was determined by the formula for a prolate ellipsoid (length × width × height × 0.523) [18].
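As a worked illustration of the prolate ellipsoid formula, the sketch below computes the volume from three orthogonal diameters. The function name and the example dimensions are hypothetical; diameters in centimetres give a volume in cm³ (equivalently mL). The coefficient 0.523 used in the text is a rounded value of π/6.

```python
import math


def ovarian_volume_cm3(length_cm, width_cm, height_cm, coefficient=0.523):
    """Prolate ellipsoid volume: length x width x height x 0.523."""
    return length_cm * width_cm * height_cm * coefficient


# Made-up dimensions for illustration only.
print(ovarian_volume_cm3(3.0, 2.0, 1.5))

# The coefficient is a rounded form of pi/6.
print(round(math.pi / 6, 4))
```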
Statistical analysis. As previously reported, sample size calculations for the total population of the ESPEP study were based on the following formula: $n = z_{1-\alpha}^{2}\,P(1-P)/D^{2}$, where n = individual sample size, $z_{1-\alpha}$ = 1.96 (when α = 0.05), P = assumed PCOM prevalence for an unselected population according to previously published data, and D = absolute error. The minimum sample size was calculated taking the prevalence as 33% [8,9] (P = 0.33) and the absolute error as 5% (D = 0.05). Data were collected using Research Electronic Data Capture (REDCap) [19].
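The sample size formula can be evaluated directly with the stated inputs (z = 1.96, P = 0.33, D = 0.05). The sketch below is only an arithmetic illustration: the function name is hypothetical, and rounding up to the next whole participant is an assumption, since the computed value is not reproduced in the text.

```python
import math


def min_sample_size(p, d, z=1.96):
    """n = z^2 * P * (1 - P) / D^2, rounded up (rounding is an assumption)."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Inputs quoted in the text: prevalence 33%, absolute error 5%.
print(min_sample_size(0.33, 0.05))  # 340
```

With these inputs the unrounded value is about 339.75, so roughly 340 participants would be the minimum under this formula.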
Managing missing data: In our research dataset, there were two types of missing data: missing completely at random (MCAR) and missing at random (MAR). We recorded all missing values with labels of "N/A" to make them consistent throughout our dataset. When analyzing the dataset, we used the pairwise deletion method.
The results of the Kolmogorov-Smirnov test for normality showed that the analyzed continuous variables were non-normally distributed. Therefore, for continuous variables, we used Mann-Whitney non-parametric tests. Kruskal-Wallis ANOVA and z-criteria were used to compare proportions and categorical variables. A p-value < 0.05 was considered statistically significant. To compare the 95th, 97.5th, and 98th percentiles, we analyzed 95% confidence intervals (95% CIs). For the construction of 95% CIs, we utilized the bootstrap percentile method. The overlap of 95% CIs was used to judge statistical significance when comparing two estimates: if the two 95% CIs did not overlap, we considered the 95th–98th percentiles significantly different [14].
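The percentile comparison described above can be sketched as follows. This is a minimal illustration of a bootstrap percentile interval and a non-overlap check, not the authors' actual code: all function names, the number of resamples, the interpolation rule for percentiles, and the random seed are assumptions.

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def bootstrap_percentile_ci(values, q=98, n_boot=2000, level=95, seed=0):
    """Bootstrap percentile CI for the q-th percentile of `values`.

    Resamples with replacement, recomputes the percentile each time, and
    takes the central `level`% of the resulting distribution.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=len(values))
        stats.append(percentile(resample, q))
    tail = (100 - level) / 2
    return percentile(stats, tail), percentile(stats, 100 - tail)


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
```

Under the rule used in this study, two upper percentiles would be treated as significantly different only when `intervals_overlap` returns `False` for their 95% CIs. This is a conservative comparison, because intervals can overlap even when a formal test of the difference would reach significance.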

Results
After exclusions, 444 healthy women (285 Caucasians, 123 Asians, and 36 of mixed ethnicity) were eligible to be included in the study population. Taking into account the low number of women of mixed ethnicity, we further excluded these from the study, leaving 408 women for analysis (Figure 1).
The mean age of the study population with regular predictable menstrual cycles and no clinical or biochemical evidence of hyperandrogenism was 34.32 ± 5.96 years. Women of Caucasian and Asian ethnicity were comparable by age, anthropometric characteristics, and marital status, although these groups exhibited some differences in respect of education and occupation. Participants of Asian origin demonstrated a lower mFG score, within a normal range. Regarding serum FSH, LH, TSH, 17-OHP, and AMH levels, no statistically significant differences were detected. At the same time, prolactin levels were significantly higher (within normative ranges) in Asians as compared to Caucasian women. When studying the impact of ethnicity on androgens, we found that TT, DHEAS, and FAI values in the study population were significantly lower in Asians than in Caucasian women, but they were in the normative ranges for the multi-ethnic Siberian population [16]. At the same time, Asians showed lower levels of SHBG (Table S1).
In all study participants, we analyzed OV and FNPO for ovaries that satisfied the following criteria: (a) an absence of follicles and/or cysts greater than 9 mm in diameter, (b) an absence of corpus luteum, and (c) an absence of ovarian masses. In total, among the healthy premenopausal women from the unselected population (n = 408), the data for 563 ovaries were eligible for further investigation. For these ovaries, we performed a descriptive statistical analysis and determined the 98th percentiles for OV and FNPO.
All estimates of OV and FNPO are shown in Tables 1 and 2 in totals, by ethnicity and age. As presented in the tables, the upper percentiles of both OV and FNPO were calculated for the total group and for subgroups of Caucasian and Asian ethnicity aged <35 and ≥35 years. Based on the calculation of 95% CIs for these percentiles and on the analysis of their overlap, we compared UNLs (the 98th percentiles). In the total study population, the upper OV percentiles did not differ by ethnicity or age group; by contrast, the UNL of FNPO was higher in Caucasian women than in Asian women (Tables 1 and 2). At the same time, women from the study sub-population aged <35 years demonstrated a higher UNL of FNPO compared to older women; this was observed mainly in the Caucasian women. In the Caucasian group, we found a higher 98th percentile for FNPO in women younger than 35 years, whereas the upper percentiles calculated for FNPO in women of Asian ethnicity did not vary by age (Table 2).

Discussion
Ovarian morphology (specifically, OV and FNPO) is one of the key characteristics of polycystic ovaries; however, ultrasound features of PCOM are not rare and are observed in up to 16–25% of healthy women with regular menstrual cycles [8]. As previously reported, the proposed thresholds for FNPO and OV were not similar for populations of different ethnicity (19-30) (Table 3). Nevertheless, establishing the ethnicity-specific diagnostic criteria of polycystic ovarian morphology is still challenging. In our study, we based our research on determining the upper (98th) percentile of both OV and FNPO in non-hirsute, normo-androgenic, eumenorrheic premenopausal women from a multi-ethnic unselected Siberian population. Our data demonstrated that, in terms of means, OV was increased in Caucasian women as compared to Asians (6.58 ± 2.36 vs. 5.69 ± 2.09); however, for upper percentiles, the difference was not statistically significant, due to the overlap of 95% CIs. For OV, the upper percentiles determined for Caucasians in our study were comparable to the same estimate in the population-based study conducted by Lujan et al. (2013) [30] in the United States and Canada. At the same time, according to the previously reported data based on ROC analysis, UNLs for OV in French Caucasians [25], Indian women [26], women from the United States and Iceland [27], and Turkish and Vietnamese women [5,28] were lower than our estimates. However, most of these studies were performed in relatively small hospital-based samples (Table 3).
Regarding FNPO, we demonstrated a substantially higher threshold of FNPO at upper percentiles in women of Caucasian origin as compared to Asians. In our study, the UNL of FNPO for Caucasians was slightly higher (15 vs. 12) as compared to that which had previously been demonstrated [23,25,26,29]. The study in [5] presented significantly lower values for FNPO for the control group of Caucasians as compared to our data.
Our estimates of FNPO for the Asian reference group were consistent with data previously published by Chen et al. (2008) [22], who used ultrasound equipment of a similar class as that used in our study.Alternatively, our data, as well as findings from the Chinese study mentioned above, differ from those reported for a Thai Asian population by Wongwananuruk et al. (2018) [31].
Changes in the follicle number and OV with age were previously observed in general populations [20,32–34]. Moreover, the development of age-specific diagnostic criteria for PCOM in women of reproductive age has been recommended by the international guidelines [1,2]. Nevertheless, the data regarding age-dependent thresholds for ovarian morphology remain insufficient [20,27,29]. In our study, in the total population of healthy women, the upper percentiles for FNPO in women aged <35 years and ≥35 years differed significantly (15 vs. 12), mainly due to the Caucasians (15 vs. 13). At the same time, we did not find age-dependent differences in FNPO in the Asian subgroup (11 vs. 10, respectively).
Study strengths. Importantly, our study benefited from the fact that we identified study subjects with regular, predictable menstrual cycles and no clinical or biochemical evidence of hyperandrogenism (reference group) in a representative unselected, medically unbiased, multi-ethnic population of women living in the same geographical and socioeconomic conditions. We consider the Eastern Siberian population as an appropriate model for the purpose of studying the ethnicity-dependent aspects of ovarian morphology in Caucasians and Asians. All study participants were well phenotyped, with the exclusion of any factors that could influence their PCOM characteristics. In addition, all measurements (FNPO, OV) were accomplished by only three specialists who were trained to conduct the U/S scans uniformly with the same machines to eliminate potential bias. Therefore, the proposed criteria could be useful and convenient for diagnosing PCOM.
Study limitations. While large, our study cohort is still relatively limited in size, and the results should be confirmed in much larger populations. The use of U/S equipment with probes of ≤8 MHz is one of our study limitations as well. At the same time, mid-range equipment is most commonly utilized in standard clinical practice, and we suggest that our data on OV and FNPO thresholds are still valid, although the use of probes of >8 MHz is highly recommended [2]. We were also not able to estimate ultrasound characteristics in women of mixed Caucasian/Asian ethnicity, due to the insufficient number of participants.

Conclusions
In this study, we found that, in the total study population of healthy women, OV based on the upper percentiles did not depend on ethnicity, whereas estimates of FNPO were significantly higher in Caucasians as compared to Asians. We did not find age-dependent differences for OV estimates in the total study population. By contrast, for FNPO, we demonstrated higher upper percentiles in women aged <35 vs. women ≥35 years old. Our data suggest that the estimation of PCOM should take into account the ethnicity and age of the individual.

Figure 1. Flow diagram of selection of healthy premenopausal women among the participants in the cross-sectional Eastern Siberia PCOS epidemiology and phenotype (ESPEP) study.


Table 1. Ovarian volume (OV) and follicle number (FNPO) in healthy women from unselected population.

Table 2. Ovarian volume (OV) and follicle number (FNPO) in healthy women from unselected population by age.

Table 3. Thresholds for follicle number and ovarian volume proposed by different authors.