1. Introduction
Fetal growth and fetal growth defects play a crucial role in human development [1]. Normal fetal development during gestation not only indicates a healthy pregnancy but also informs the post-natal and long-term health of neonates [2]. Birth weight has been identified as a strong marker of health risk throughout postnatal life [3].
Fetal growth is influenced by various factors, including hormones, growth factors, gestational conditions, environmental factors, and fetal pathology [4,5,6]. Additionally, maternal diseases occurring during pregnancy, such as diabetes mellitus, hypertensive disorders, and thyroid diseases, can also impact fetal growth [7,8]. Prenatally, the evaluation of fetal nutritional status is predominantly based on the estimation of fetal weight through ultrasound.
Once the fetal weight has been estimated by ultrasound, the value is compared with a reference curve. This comparison classifies the fetus as small, appropriate, or large for its gestational age and sex, depending on the percentile in which it falls on the reference curve used. Currently, there are essentially two types of fetal growth reference curves. The first and most widely used type is empirical and based on population studies. A good example is INTERGROWTH-21st. This standard was built on the principle that it can be applied universally to all populations, provided the data come exclusively from healthy, well-nourished mothers who experienced a normal pregnancy. The data were collected prospectively in a standardized manner across eight countries. The standard posits that any downward variation in growth and birthweight attributed to maternal size or ethnic origin indicates stunting and malnutrition and should not be adjusted for [9,10].
The other type of fetal growth curve is the customized one. These curves aim to classify growth as normal or abnormal by considering the growth potential of the specific fetus being studied. To achieve this, the percentile is calculated from maternal variables known to influence growth potential, such as height, weight at the beginning of pregnancy, and ethnicity. A good example of this second type is the GROW model, which, in some studies, has performed better than INTERGROWTH-21st in identifying small-for-gestational-age fetuses [11].
After birth, neonatal nutritional status can be assessed by integrating various measurements and anthropometric indices, including length, weight, head circumference, and combinations of these, such as the head circumference to length ratio (HC/L) [12], the arm circumference to head circumference ratio, the ponderal index (PI) [13], or the body mass index (BMI). Understanding fetal nutritional status enables early intervention, potentially correcting growth defects and thereby reducing perinatal and neonatal complications [14]. However, existing growth curves, which only consider fetal weight, do not consistently predict fetal and neonatal nutritional status accurately [15]. Our study aims to develop an optimized model that better identifies fetuses with impaired nutritional status. To achieve this, we propose modeling not only fetal weight but also fetal length. By establishing the relationship between these parameters, we can calculate the fetal BMI, which may correlate more closely with nutritional status. We then compare the performance of our model with INTERGROWTH-21st and GROW, two well-established models frequently used in routine clinical practice across numerous countries.
2. Materials and Methods
This retrospective cohort study includes all healthy pregnant women whose pregnancies and deliveries were attended at the Obstetrics and Gynecology Department of Puerto Real University Hospital between January 2011 and December 2021. Pregnancies with chronic hypertension, diabetes mellitus, or thyroid diseases were excluded, as were births at a gestational age of less than 37 weeks or more than 42 weeks.
Statistical analysis:
We developed a multivariate linear regression model to predict neonatal weight at 40 weeks of gestation, following the methodology outlined in our previous work [16]. To account for potential bias associated with obesity, maternal weight was adjusted for pregnancies where the maternal BMI exceeded 30 kg/m². The correction involved calculating the maternal weight based on a BMI of 30 kg/m². Similarly, maternal weight was adjusted for mothers with a BMI below 18.5 kg/m² by calculating their weight for a BMI of 18.5 kg/m².
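The weight correction described above can be sketched as follows. This is a minimal illustration, not the study's code (the analysis was performed in R); the function name and argument names are hypothetical, and only the two BMI limits (18.5 and 30 kg/m²) come from the text.

```python
def adjusted_maternal_weight(weight_kg: float, height_m: float,
                             bmi_low: float = 18.5,
                             bmi_high: float = 30.0) -> float:
    """Return the maternal weight to feed into the regression model.

    If the BMI lies outside [bmi_low, bmi_high], the weight is replaced
    by the weight that corresponds to the nearest limit at the same height.
    """
    bmi = weight_kg / height_m ** 2
    if bmi > bmi_high:
        return bmi_high * height_m ** 2
    if bmi < bmi_low:
        return bmi_low * height_m ** 2
    return weight_kg  # BMI within range: weight is left unchanged
```

For example, a mother of 1.60 m weighing 100 kg (BMI about 39 kg/m²) would enter the model with the weight corresponding to a BMI of 30 kg/m² at that height.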
Additionally, we developed a multivariate linear regression model to estimate the theoretical length that a specific fetus should reach at 40 weeks of gestation, using maternal variables (age, weight, and height) and fetal sex as predictors.
We calculated the estimated fetal weight at each gestational age using the proportionality curve proposed by Gardosi et al. [17], incorporating the coefficient of variation of fetal weight in our population. The formula is shown in Equation (1).
Equation (1): Weight proportion formula. GA stands for gestational age, in weeks.
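As an illustration of how a proportionality curve and a coefficient of variation combine into a customized percentile, consider the sketch below. It assumes that weight at a given gestational age is normally distributed around the expected value with a standard deviation of CV times that value, which is our assumption for the example. The proportion is passed in as a number because the coefficients of Equation (1) are given in Appendix A and are not reproduced here. All function names are hypothetical, and the CV used in the usage note is a placeholder, not the value from our population.

```python
from statistics import NormalDist


def expected_weight_at_ga(term_weight_g: float, proportion: float) -> float:
    """Scale the predicted weight at 40 weeks by the proportion
    (0 to 1) that the proportionality curve gives for a gestational age."""
    return term_weight_g * proportion


def weight_percentile(observed_g: float, expected_g: float, cv: float) -> float:
    """Percentile of an observed weight, assuming a normal distribution
    with mean expected_g and standard deviation cv * expected_g."""
    z = (observed_g - expected_g) / (cv * expected_g)
    return 100.0 * NormalDist().cdf(z)
```

With a placeholder CV of 0.11, a fetus whose estimated weight equals the expected weight falls on the 50th percentile, and one that is about 1.28 standard deviations lighter falls near the 10th.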
The length at each week of gestation was calculated using the Abduljalil formula for both male and female fetuses [18] (Equations (2) and (3)).
Equation (2): Height formula for male newborns. GA stands for gestational age in weeks.
Equation (3): Height formula for female newborns. GA stands for gestational age in weeks.
The method we used to estimate fetal length at any gestational age is similar to the one employed by Gardosi et al. for modeling fetal weight throughout gestation [17]. First, starting from the Abduljalil formula and through a polynomial regression analysis, we developed our own proportionality curve for fetal length (Equations (4) and (5)).
Equation (4): Length proportion formula for male fetuses. GA stands for gestational age in weeks.
Equation (5): Length proportion formula for female fetuses. GA stands for gestational age in weeks.
These formulas allow us to calculate what proportion of the length at 40 weeks of gestation corresponds to each specific gestational age.
By multiplying the theoretical neonatal length predicted by our regression model at 40 weeks of gestation by the proportion corresponding to a given gestational age, we can calculate the theoretical length that the fetus should have at that gestational age.
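The scaling step, and the BMI that follows from the expected weight and length, can be sketched as below. This is an illustrative example under stated assumptions: the length proportion is supplied as a number between 0 and 1 (the coefficients of Equations (4) and (5) are in Appendix A), BMI is taken in its usual form of kilograms per square meter, and the function names are hypothetical.

```python
def length_at_ga(term_length_cm: float, length_proportion: float) -> float:
    """Theoretical length at a gestational age: predicted length at
    40 weeks times the proportion given by the length curve."""
    return term_length_cm * length_proportion


def fetal_bmi(weight_g: float, length_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in grams and length in cm."""
    weight_kg = weight_g / 1000.0
    length_m = length_cm / 100.0
    return weight_kg / length_m ** 2
```

For instance, a weight of 3500 g and a length of 50 cm correspond to a BMI of 14 kg/m².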
In relation to INTERGROWTH-21st (IG21st) and GROW, newborns were classified according to their birth weight as follows:
Small for gestational age (SGA): Birthweight below the 10th percentile.
Adequate for gestational age (AGA): Birthweight between the 10th and 90th percentiles.
Large for gestational age (LGA): Birthweight above the 90th percentile.
We found no established classification based on fetal or neonatal BMI. For this study, we therefore considered a fetus or newborn SGA when its BMI was below the 10th percentile of our customized BMI curve, appropriate for gestational age when it was between the 10th and 90th percentiles, and large for gestational age when it was above the 90th percentile.
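The same three-way rule applies to all methods once a percentile is available, whether it comes from birth weight (IG21st, GROW) or from the customized BMI curve. A minimal sketch, with a hypothetical function name and the cutoffs exposed as parameters:

```python
def classify_by_percentile(percentile: float,
                           low: float = 10.0,
                           high: float = 90.0) -> str:
    """Return 'SGA', 'AGA', or 'LGA' for a weight or BMI percentile.

    Values exactly on a cutoff are treated as AGA, matching
    'below the 10th' and 'above the 90th' in the text.
    """
    if percentile < low:
        return "SGA"
    if percentile > high:
        return "LGA"
    return "AGA"
```

Keeping the cutoffs as parameters makes it straightforward to evaluate thresholds other than the traditional 10th and 90th percentiles.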
Nutritional status at birth was evaluated by calculating the neonatal ponderal index (PI) (Equation (6)).
Equation (6): Ponderal index.
When the PI was below the 10th percentile, the newborn was classified as undernourished. Conversely, when the PI was above the 90th percentile, the newborn was classified as overnourished.
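For illustration, the sketch below uses the commonly cited Rohrer form of the ponderal index, 100 × weight (g) / length (cm)³. This is an assumption for the example; Equation (6) in Appendix A remains the definition used in the study. The 10th and 90th percentile values are passed in as arguments because they come from reference tables that are not reproduced here, and the function names are hypothetical.

```python
def ponderal_index(weight_g: float, length_cm: float) -> float:
    """Rohrer's ponderal index: 100 * weight (g) / length (cm) cubed."""
    return 100.0 * weight_g / length_cm ** 3


def nutritional_status(pi: float, p10: float, p90: float) -> str:
    """Label a newborn from its PI and the reference 10th and 90th
    percentile values for its gestational age (supplied by the caller)."""
    if pi < p10:
        return "undernourished"
    if pi > p90:
        return "overnourished"
    return "normal"
```

A newborn of 3500 g and 50 cm has a PI of 2.8 under this definition.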
For undernutrition, we counted a true positive when the method (customized BMI, GROW, or IG21st) classified the fetus as SGA. For overnutrition, we counted a true positive when the method classified the fetus as LGA.
Subsequently, receiver operating characteristic (ROC) curves were constructed by calculating the true positives, false positives, true negatives, and false negatives at each percentile cutoff, in steps of 0.2 percentile points. These values were therefore computed for cutoffs ranging from the 99.5th to the 0.5th percentile.
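The cutoff sweep can be sketched as follows for the undernutrition curve. This is a minimal sketch under stated assumptions, not the study's R code: a fetus is test-positive when its percentile is below the cutoff, the reference label is the PI-based classification, and the area is obtained with the trapezoidal rule. Function and variable names are hypothetical.

```python
def roc_points(percentiles, is_undernourished, cutoffs):
    """Return (false positive rate, sensitivity) for each cutoff.

    A fetus is test-positive when its percentile is below the cutoff.
    """
    points = []
    for cutoff in cutoffs:
        tp = fp = tn = fn = 0
        for pct, truth in zip(percentiles, is_undernourished):
            positive = pct < cutoff
            if positive and truth:
                tp += 1
            elif positive and not truth:
                fp += 1
            elif not positive and truth:
                fn += 1
            else:
                tn += 1
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((fpr, sens))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule,
    with the (0, 0) and (1, 1) corners added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Cutoffs from the 0.5th to the 99.5th percentile in steps of 0.2.
CUTOFFS = [round(0.5 + 0.2 * i, 1) for i in range(496)]
```

For the overnutrition curve, the same sweep applies with the test-positive condition reversed (percentile above the cutoff) and the overnutrition label as reference.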
Differences in classifications for SGA and LGA were determined using the chi-square test. Sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio were calculated to assess how well each method identified nutritional status. The differences in their predictive capacities for nutritional status were evaluated using the DeLong test.
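The four metrics follow directly from the counts in a 2×2 table. The sketch below uses their standard definitions; the function name and the counts in the usage note are hypothetical and are not results from this study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); infinite if there are no false positives.
    lr_pos = sensitivity / (1 - specificity) if fp else float("inf")
    # LR- = (1 - sensitivity) / specificity; infinite if there are no true negatives.
    lr_neg = (1 - sensitivity) / specificity if tn else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }
```

With illustrative counts of 30 true positives, 10 false negatives, 90 true negatives, and 10 false positives, the sensitivity is 0.75, the specificity is 0.90, the positive likelihood ratio is 7.5, and the negative likelihood ratio is about 0.28.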
The statistical analysis of the data was performed with the software R 4.1.3 [19].
All the equations referenced in the study are presented in Appendix A.
4. Discussion
The new BMI curve appears to predict neonatal undernutrition more effectively than GROW or IG21st, with a significantly higher sensitivity of 0.83 compared with 0.31 and 0.3 for GROW and IG21st, respectively. Similar results were observed for overnutrition when evaluating LGA fetuses, where our model demonstrated a sensitivity of 0.83, while GROW and IG21st yielded 0.12 and 0.37, respectively. Sensitivity is a valuable metric for evaluating each classification system because the aim is to identify at-risk gestations and the resulting treatment is noninvasive. However, when comparing different methodologies, it is useful to consider additional metrics that take false positives into account.
For this purpose, we applied the diagnostic odds ratio (DOR). Our model demonstrated a statistically superior DOR for identifying both undernutrition and overnutrition, with values of 49.41 and 55.29, respectively. The closest DOR was that of IG21st for the identification of LGA, at 7.84. The poor capability of GROW to identify overnutrition in our population was unexpected, especially considering that this model is derived from an estimated weight. However, our model calculates weight using linear regression based on anthropometric and clinical variables, while GROW uses a polynomial regression based solely on gestational time.
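For reference, the DOR is the odds of a positive test among affected newborns divided by the odds of a positive test among unaffected newborns, which equals the ratio of the positive to the negative likelihood ratio. A minimal sketch using the standard definition, with hypothetical function names and illustrative counts that are not taken from our data:

```python
def dor_from_counts(tp: int, fp: int, tn: int, fn: int) -> float:
    """Diagnostic odds ratio from a 2x2 table: (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        return float("inf")  # undefined without a continuity correction
    return (tp * tn) / (fp * fn)


def dor_from_rates(sensitivity: float, specificity: float) -> float:
    """Same quantity expressed through sensitivity and specificity."""
    return (sensitivity / (1 - sensitivity)) * (specificity / (1 - specificity))
```

Because the DOR combines sensitivity and specificity in a single number, a method with high sensitivity obtained at the cost of many false positives does not automatically achieve a high DOR.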
These results show the capabilities of these models when using the traditional 10th percentile to classify SGA and the 90th percentile to classify LGA. Nevertheless, previous work suggests that these weight percentiles are not necessarily optimal, and at least 30% of undergrowth might be expected to fall within the normal weight range [7]. It is therefore appropriate to study the identification capabilities of each system at different percentiles. The most common way to display and compare such results is the ROC curve. Each point on this curve shows the sensitivity and specificity obtained at one percentile cutoff for one method. We created one curve for overnutrition and another for undernutrition to facilitate comparison and interpretation.
The area under the ROC curve (aucROC) shows that IG21st and GROW have similar predictive capabilities for both undernutrition and overnutrition, with values of 0.8 for undernutrition and 0.74 for overnutrition. The curves also illustrate that the points around 30% sensitivity are not optimal for these methods, suggesting that they could benefit significantly from a custom percentile cutoff. Our customized BMI curve, however, shows a superior aucROC of 0.95 compared to the other methods.
The fact that, in our sample, the coefficient of variation of BMI is almost three times smaller than that of weight could explain, at least in part, the improvement in the aucROC. If we are better able to predict fetal BMI, and BMI has less random variance, we are likely to achieve better results, assuming that BMI and weight are both good indicators of nutritional status.
More innovative strategies to assess fetal growth and newborn nutritional status are needed, as it has been established that SGA and severe SGA rates vary considerably depending on the growth standard used [20]. Although there is previous work on determining nutritional status through SGA, none, to our knowledge, has used ROC curves to evaluate its behavior at different percentiles. A sensitivity of 15% and a specificity of 93% are expected, yet in our population we observed a sensitivity of 30% and a specificity of 95% for weight-based methods.
Recent models utilizing multivariable logistic regression have yielded a sensitivity of 63% and a specificity of 81% [21]. Other authors have achieved better results by customizing the Hadlock weight with a custom coefficient of variation and a custom percentile, which provided an aucROC of up to 0.864. However, the same authors achieved an aucROC of 0.880 using multivariable logistic regression [15]. Our study provides a superior aucROC of 0.95 using a similar approach, with a custom percentile and a custom coefficient of variation, but also incorporating a custom estimated BMI.
The main limitation of our study is the criteria we used to establish undernutrition and overnutrition, for which we used ponderal index tables from previous work. We have found several different ways to assess neonatal nutritional status, but none of them has become a gold standard. Without a gold standard, it is possible that, under other criteria for undernutrition or overnutrition, our BMI curve would not demonstrate such superior performance over GROW or IG21st. A future prospective study providing insights into this classification and the performance of our model, as well as an external validation of our model in different populations, would therefore be of interest. Additionally, our study focuses on the performance of fetal BMI in predicting neonatal nutritional status.
The resources available to obstetricians for assessing fetal health are quite limited. Normal growth can serve as a reliable indicator of health. Similarly, growth alterations, whether deficient or excessive, can signal the presence of a pathological process. Currently, the assessment of growth alteration is primarily based on the estimation of fetal weight, which is complemented by other clinical observations such as the volume of amniotic fluid and the study of uteroplacental circulation using Doppler. These findings inform both the monitoring of the pregnant woman and decision-making processes. We believe that a method with greater sensitivity to detect fetuses with growth alterations could help optimize both monitoring and treatment. In hypertensive disorders of pregnancy, increased sensitivity to detect small-for-gestational-age fetuses could lead to earlier interventions. In cases of diabetes, better identification of large-for-gestational-age fetuses could help optimize metabolic control and even determine whether to end the pregnancy. However, these assumptions should be explored in future research, and further studies evaluating the ability of fetal BMI to predict poor perinatal outcomes may be necessary.