Prediction of Late-Onset Small for Gestational Age and Fetal Growth Restriction by Fetal Biometry at 35 Weeks and Impact of Ultrasound–Delivery Interval: Comparison of Six Fetal Growth Standards

Small-for-gestational-age (SGA) infants are at increased risk of adverse perinatal outcomes (APOs). In this work, we assess the ability of the ultrasound-estimated percentile weight (EPW) at 35 weeks of gestational age to predict late-onset SGA and APOs according to six growth standards, and whether the ultrasound–delivery interval influences the detection rate. To this end, we analyzed a retrospective cohort of 9585 singleton pregnancies. EPWs at 35 weeks were calculated according to the customized Miguel Servet University Hospital (MSUH) and Figueras standards and the non-customized MSUH, Fetal Medicine Foundation (FMF), INTERGROWTH-21st, and WHO standards. For a 10% false positive rate, the detection rates for SGA ranged between 48.9% with the customized Figueras standard (AUC 0.82) and 60.8% with the non-customized FMF standard (AUC 0.87). Detection rates for SGA by ultrasound–delivery interval (1–6 weeks) were higher as the interval decreased. APO detection rates ranged from 27.0% with the FMF standard to 7.9% with the Figueras standard. In conclusion, the ability of EPW at 35 weeks to predict SGA is good for all standards, and slightly better for non-customized standards. The APO detection rate is significantly greater for non-customized standards.


Introduction
Screening for fetal growth abnormalities is an essential component of antenatal care, and fetal ultrasound plays a key role in the assessment of these conditions [1][2][3]. Small-for-gestational-age (SGA) infants, those with a birth weight below the 10th percentile

Study Design
This was a retrospective cohort study of births attended at the Miguel Servet University Hospital (MSUH) between March 2012 and December 2016. The inclusion criteria were as follows: live singleton pregnancies followed at MSUH from the first trimester of gestation; fetal ultrasound assessment at a gestational age of 35 weeks (range 34-36 weeks); and deliveries between 37 and 42 weeks of gestational age of fetuses without stillbirth associated with malformations or chromosomal abnormalities. Of the 19,310 consecutive deliveries attended in our hospital during the study period, the 9585 cases that fulfilled the inclusion criteria, including availability of the data needed to estimate percentile weights by each standard, were considered for the analysis. The selection of study participants is detailed in Figure 1. The last menstrual period was adjusted by first trimester ultrasound [30]. Universal ultrasound screening was performed at 35 weeks (range 34-36 weeks) at the Ultrasound and Prenatal Diagnosis Unit using either a Voluson 730 Expert, E6, or E8 ultrasound machine (General Electric Healthcare, Zipf, Austria) or an Aloka Prosound SSD-5000 (Hitachi Aloka Medical Systems, Tokyo, Japan). This scan is routinely performed in all pregnancies at our center to increase the detection of fetal growth alterations [5].
Estimated fetal weight (EFW) was calculated with the formula on which each standard was built. We used Hadlock et al.'s [19] formula, which combines biparietal diameter, cephalic and abdominal circumference, and femur length, for the MSUH, Figueras [...]
For the calculation of the EPW, we collected maternal age and body mass index (BMI) at the beginning of pregnancy, parity, maternal and paternal height, maternal ethnic origin, smoking habits, infant gender, birth weight, and ultrasound EFW.
We also collected perinatal outcomes in order to analyze APOs in SGA infants at delivery, defined as the occurrence of any of the following: a 5-min Apgar score < 7, instrumental or cesarean delivery for non-reassuring fetal status, arterial cord blood pH < 7.10, or stillbirth.

Estimated Percentile Weight
EPWs were calculated according to six growth standards, customized and non-customized (NC), including population, population-customized, and international references. For the customized standards, the methodologies of Hadlock et al. [18] and Gardosi et al. [23] were used for (1) the MSUH standard customized for parity, age, BMI, maternal height, paternal height, and fetal gender, built using a modified version of Hadlock et al.'s growth charts adjusted to our population, with a coefficient of variation that changes with gestational age (Saviron-Cornudella et al. [16]); and (2) the Barcelona Clinic Hospital standard (Figueras et al. [15]). For the NC standards, we used (3) an NC version of the MSUH standard (Saviron-Cornudella et al. [16]); (4) the international INTERGROWTH-21st standard [20,21], a multilevel mixed model whose main characteristic is that it includes pregnant women without pathology; (5) the international WHO fetal growth standard [17]; and (6) the FMF local growth multilevel mixed model (Nicolaides et al. [22]).
To assess ultrasound weight measures in the third trimester, EPWs were estimated between 34 and 36 weeks of gestational age. The WHO EPW was calculated by interpolation of the 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles.
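The interpolation step for the WHO standard can be sketched as follows. This is a minimal illustration, assuming simple linear interpolation between the tabulated percentile weights and clamping outside the 5th-95th range; the text does not state which interpolation scheme or tail handling was used. The function name and the table row are hypothetical and are not WHO values.

```python
from bisect import bisect_right

# Percentile levels tabulated for the WHO standard, as listed in the text.
LEVELS = [5, 10, 25, 50, 75, 90, 95]


def interpolate_percentile(efw, weights_at_levels, levels=LEVELS):
    """Linearly interpolate the percentile of `efw` from tabulated weights.

    `weights_at_levels` holds the weight (g) at each level for one
    gestational age, in increasing order. Values outside the table are
    clamped to the lowest or highest tabulated level (an assumption).
    """
    if efw <= weights_at_levels[0]:
        return float(levels[0])
    if efw >= weights_at_levels[-1]:
        return float(levels[-1])
    i = bisect_right(weights_at_levels, efw)  # index of first weight above efw
    w_lo, w_hi = weights_at_levels[i - 1], weights_at_levels[i]
    p_lo, p_hi = levels[i - 1], levels[i]
    return p_lo + (p_hi - p_lo) * (efw - w_lo) / (w_hi - w_lo)


# Hypothetical table row for one gestational age (NOT WHO values).
example_row = [2000, 2100, 2300, 2500, 2700, 2900, 3000]
print(interpolate_percentile(2200, example_row))
```

In this toy row, an EFW of 2200 g lies halfway between the weights tabulated for the 10th and 25th percentiles, so the interpolated EPW falls halfway between those two levels.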
As a gold standard for the analysis, SGA was defined as a birth weight below the 10th percentile, using a growth reference for the Spanish population based on 9362 birthweights [31]. We did not focus our analysis on intrauterine growth-restricted fetuses (IUGRs). As we did not perform Doppler ultrasound universally (only in cases of estimated fetal weight < 10th percentile), we did not study the subgroup of SGA fetuses at delivery with altered Doppler ultrasound. This is because a significant percentage of SGA fetuses at delivery did not present an estimated fetal weight <10th percentile by ultrasound.

Statistical Analysis
Data were descriptively analyzed using medians and interquartile ranges for continuous variables, and absolute and relative frequencies for categorical variables. The ability of EPW provided by the six standards to predict SGA was analyzed using the area under the receiver operating characteristic curve (AUC) [32]. Sensitivity (detection rate) was established for false positive rates (FPR) of 5, 10, 15, and 20%. The percentile threshold point corresponding to the FPR values was also calculated. AUCs were compared using the DeLong test, and sensitivities through a proportion comparison test.
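The two summary measures described above can be sketched as follows. This is an illustrative Python sketch on toy data, not the R code used in the study: the AUC is computed through its rank (Mann-Whitney) interpretation, and the detection rate is read off at the EPW threshold that keeps the false positive rate at or below a target. Function names and data are hypothetical, and the DeLong comparison is not reproduced here.

```python
def auc_low_score_positive(sga_epw, non_sga_epw):
    """AUC as the probability that an SGA case has a lower EPW than a
    non-SGA case, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for s in sga_epw:
        for n in non_sga_epw:
            if s < n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(sga_epw) * len(non_sga_epw))


def detection_rate_at_fpr(sga_epw, non_sga_epw, fpr):
    """Return (threshold, detection rate) when screening positive means
    EPW < threshold and at most a fraction `fpr` of non-SGA screen positive."""
    ranked = sorted(non_sga_epw)
    k = min(int(fpr * len(ranked)), len(ranked) - 1)  # false positives allowed
    threshold = ranked[k]
    detected = sum(1 for s in sga_epw if s < threshold)
    return threshold, detected / len(sga_epw)


# Toy EPW values (percentiles); purely illustrative, not study data.
sga = [2, 4, 6, 9, 15, 30]
non_sga = [5, 12, 20, 25, 35, 40, 50, 60, 70, 80]

print(auc_low_score_positive(sga, non_sga))
print(detection_rate_at_fpr(sga, non_sga, 0.10))
```

The threshold returned by `detection_rate_at_fpr` plays the role of the "percentile threshold point corresponding to the FPR values" mentioned in the text: it is the EPW cutoff that would have to replace the nominal 10th percentile to reach the chosen false positive rate.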
In addition, we built logistic regression models to estimate the OR and 95% confidence interval that correspond to an increase of 1% in the EPW at 35 weeks, as a predictor for SGA at delivery, performing a subanalysis for different ultrasound-delivery intervals (1-6 weeks).
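As a hedged illustration of how such an odds ratio is read, the sketch below converts a logistic regression coefficient and its standard error into an OR with a Wald 95% confidence interval. The coefficient and standard error are hypothetical values, not study results, and the text does not say whether Wald or profile-likelihood intervals were used.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval for a one-unit increase
    in the predictor (here, a 1-point increase in EPW)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def odds_ratio_for_change(beta, delta):
    """Odds ratio implied by a change of `delta` percentile points in EPW."""
    return math.exp(beta * delta)


# Hypothetical coefficient and standard error (NOT estimates from the study).
beta, se = -0.05, 0.002
print(odds_ratio_with_ci(beta, se))
print(odds_ratio_for_change(beta, 10))
```

An OR below 1 per percentile point means that the odds of SGA at delivery fall as the EPW rises; because the model is linear on the log-odds scale, the effect of a larger change is obtained by multiplying the coefficient before exponentiating.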
We analyzed the diagnostic ability of the EPW 10th percentile and SGA birthweights to detect the following adverse perinatal outcomes: 5-min Apgar score < 7, instrumental delivery for non-reassuring fetal status (NRFS), cesarean delivery for NRFS, arterial cord blood pH < 7.10, and stillbirth. Comparison between APOs predicted by standards was performed using a proportion test.
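A comparison of two detection rates can be sketched with a pooled two-proportion z-test, shown below. This is one common form of a "proportion test" and is an assumption: the text does not specify the exact variant used, and the sketch treats the two groups as independent, which is a simplification. The counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    x1/n1 and x2/n2 are the detected counts and totals in each group.
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Hypothetical counts: 54 of 200 versus 16 of 200 outcomes detected.
z, p = two_proportion_z_test(54, 200, 16, 200)
print(z, p)
```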
Analyses were performed using the R programming language, version 3.6.2 (The R Foundation for Statistical Computing, Vienna, Austria) [33].
Table 1 shows the descriptive characteristics of the pregnant women, and also displays medians and the 10th (P10) and 90th (P90) percentiles among groups for the six studied standards for EFWs by ultrasound at 35 weeks (range from 34+0 to 36+6 weeks). The WHO and FMF standards show an underestimation of the expected median value (50%) by ultrasound (median values 43.1%, P10-P90 range 7.5-74.9, and 37.6%, P10-P90 range 2.7-89.9, respectively), while the Figueras standard shows an overestimation by ultrasound (median value 59.3%, P10-P90 range 18.1-93.5). EPW distributions are detailed in Figure 2, where a comparison of the percentage of SGA is shown for each standard. The rate of SGA at birth in our cohort was 9.4% (n = 902). Regarding APOs, Table 2 shows that SGA deliveries included 21.6% (n = 139) APOs, 28.6% (n = 12) 5-min Apgar scores < 7, 19.9% (n = 32) instrumental deliveries for NRFS, 26.8% (n = 71) cesarean deliveries for NRFS, 17.7% (n = 45) neonatal acidemia (cord blood pH < 7.10), and 26.3% (n = 5) stillbirth.
Figure 3 illustrates the receiver operating characteristic curve comparison and AUC for the prediction of SGA at delivery by ultrasound at 35 weeks. Figure 4 displays the results of the logistic regression model with the ORs and 95% CIs to predict SGA by ultrasound at 35 weeks, according to the standards.
The p-values of the comparisons of the standards' AUC values and sensitivities for a 90% specificity are shown in Table 4. The FMF and the non-customized MSUH standards showed no statistically significant differences between them, and both showed greater SGA prediction ability than the INTERGROWTH-21st, Figueras, and WHO standards. Moreover, the INTERGROWTH-21st and WHO standards showed significant differences from the customized MSUH and Figueras standards. Finally, the customized standards did not show differences between them.
Regarding APO prediction by EPW < 10, Table 2 shows that the FMF and WHO standards reached the greatest detection rates (27.0% and 17.4%, respectively), with statistically significant differences between them and the rest of the standards. No statistically significant differences were detected in any of the possible comparisons between the non-customized MSUH (11.8%), INTERGROWTH-21st (10.7%), customized MSUH (9.6%), and Figueras (7.9%) standards, with the sole exception of the significant difference between the non-customized MSUH and Figueras standards. The instrumental deliveries for NRFS, cesarean deliveries for NRFS, and neonatal acidemia APOs might explain those differences. The p-values of all comparisons are given in Table 4.
Table 5 displays the AUCs and sensitivities for different FPRs to predict SGA by ultrasound-delivery interval (range 1-6 weeks). The observed results show higher detection rates as the interval decreases. Figure 5 shows the prediction of SGA cases, by standard and by ultrasound-delivery interval (1-6 weeks), for a 10% false positive rate. Figure 6 displays the odds ratios and 95% confidence intervals of the standards for predicting SGA by ultrasound-delivery interval (range 1-6 weeks). Figure 7 illustrates the receiver operating characteristic curve comparison of fetal growth standards for the prediction of SGA newborns according to the ultrasound-delivery interval (range 1-6 weeks).

Principal Findings
We have demonstrated the utility of EPW by ultrasound at 35 weeks (range 34+0-36+6 weeks) as a predictor of SGA fetuses at delivery at term. After adjusting the percentile threshold points, the growth standards showed a similarly good predictive ability for SGA fetuses, but with a significant advantage for the non-customized MSUH and FMF standards, and a disadvantage for both the customized MSUH and Figueras standards.
In our results, we found an underestimation of the 10th percentile by ultrasound at 35 weeks (range 34-36 weeks) with the WHO (7.5%) and FMF (2.7%) standards, and an overestimation with the Figueras standard (18.1%); we therefore conclude that these standards are poorly calibrated for our study population. The MSUH (NC, 11.9%; customized, 12.2%) and INTERGROWTH-21st (12.7%) standards fit the 10th percentile better, with minimal error, probably because of the exclusion of premature deliveries.
When we analyzed the APO-predictive ability of the six standards using an EPW < 10th percentile at 35 weeks of gestational age, the non-customized FMF and WHO standards showed the greatest diagnostic ability, with statistically significant differences from the rest of the standards. The main reason for this lies in the greater proportion of fetuses with an EPW < 10th percentile under the FMF (21.2%) and WHO (12.6%) standards. In any case, with similar proportions of EPW < 10, the non-customized MSUH and INTERGROWTH-21st standards show a better APO-predictive ability than the customized MSUH and Figueras standards. A previous study did not find any significant differences between the customized and non-customized standards when analyzing the predictive ability of EPW to detect APOs; by contrast, using EPW > 90th percentile, we detected significant differences [34].
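The two calibration figures discussed above (the observed 10th percentile of the EPW distribution, and the proportion of fetuses with an EPW below 10) can be computed as in the following sketch. It assumes a nearest-rank quantile definition, which may differ from the one used in the study, and the EPW values are hypothetical. Under a well-calibrated standard, both quantities would be close to 10.

```python
import math


def empirical_quantile(values, q):
    """Nearest-rank quantile: the smallest observed value such that at
    least a fraction q of the data lies at or below it."""
    ranked = sorted(values)
    rank = max(1, math.ceil(q * len(ranked)))
    return ranked[rank - 1]


def fraction_below(values, cutoff=10.0):
    """Fraction of EPWs strictly below the nominal percentile cutoff."""
    return sum(1 for v in values if v < cutoff) / len(values)


# Hypothetical EPWs under a standard that assigns low percentiles too often.
epws = [1, 2, 3, 4, 5, 6, 8, 9, 12, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 95]
print(empirical_quantile(epws, 0.10))  # observed P10 of the EPW distribution
print(fraction_below(epws, 10.0))      # share of fetuses flagged as EPW < 10
```

In this toy example the observed P10 is far below 10 and the share of fetuses flagged as EPW < 10 is far above 10%, which is the pattern the text describes for a standard that underestimates percentiles in the study population.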

Prediction by Fetal Biometry and Ultrasound-Delivery Interval
There is no international consensus on performing a universal ultrasound in the third trimester; two international guidelines, those of the Royal College of Obstetricians and Gynaecologists (RCOG) [35] and the American College of Obstetricians and Gynecologists (ACOG) [36], do not recommend universal ultrasound to detect fetal growth anomalies. Sovio et al. [37], however, found that universal third trimester ultrasound in nulliparous women, compared with selective ultrasound, tripled the detection of SGA < P10 infants, and could identify FGR fetuses at increased risk of neonatal morbidity.
The EPW at third trimester ultrasound after 32 weeks has been shown to be a good predictor (AUC > 0.85) of SGA at delivery in several studies, although with limited detection rates for late-onset SGA births [4,38,39]. By gestational age at scan, the detection rate of SGA at delivery is approximately 52% for ultrasound at 33-34 weeks and approximately 60% at 36-37 weeks (FPR of 10%) [40][41][42]. According to several studies, therefore, detection is higher the later the ultrasound is performed [14,43,44]. In our case, the predictive capacity for SGA at delivery by ultrasound at 35 weeks is also limited for the six growth standards, and generally, a shorter ultrasound-delivery interval is correlated with better prediction rates. In any case, the 10th percentile cutoff by ultrasound at 35 weeks has only a moderate ability to predict SGA at delivery.

Prediction by Fetal Biometry and Ultrasound-Delivery Interval: Comparison of Standards
Blue et al., in 2018 [45], compared the RCOG and ACOG standards for the detection of SGA at delivery, with a mean gestational age at birth of 37.7 weeks and ultrasounds performed in the previous 2 weeks, and showed that both standards had a moderate predictive capacity (AUCs of 0.78 and 0.76, respectively). In another study by Blue in 2019 [46], the Hadlock and INTERGROWTH-21st standards for the detection of SGA, with deliveries at 37 weeks on average and ultrasound in the previous two weeks, showed good predictive capabilities (AUC > 0.90), with optimal percentile cutoff points of 15% for the Hadlock standard and 22% for the INTERGROWTH-21st standard. Neither study is directly comparable to ours: although they show minimal differences in SGA prediction regardless of the standard used, neither of them studied customized standards.
In two studies by Odibo et al. in 2018 [47] and 2019 [48], using the same sample for the three standards compared (INTERGROWTH-21st, a local customized standard, and the Hadlock standard), a moderate predictive capacity for SGA at delivery was achieved (AUCs of 0.67, 0.62, and 0.69, respectively), although with ultrasound performed between 26 and 36+6 weeks and an average ultrasound-delivery interval of 6.7 weeks, which also differs from our study. Reboul et al., in 2017 [49], found that the Hadlock and the customized Gardosi standards had a moderate predictive capacity for SGA at delivery (AUCs of 0.768 and 0.708, respectively), with somewhat higher detection rates for the Hadlock standard, although ultrasound was performed at 32 weeks on average, earlier than in our study, which could explain the lower predictive capacity.

Clinical and Research Implications
In clinical practice, calibrating the growth standard in the reference population before clinical use, both for ultrasound and for delivery weights, matters more than the choice of standard. The physiological and non-pathological characteristics of each population are those that will allow us to calibrate the standard to be used.
Several factors limit the ability of third trimester ultrasound to predict SGA and FGR at delivery, and some of them are unavoidable, especially the systematic error of ultrasound at the time of EFW calculation [50]. Taking the current studies on the timing of the third trimester ultrasound and the ultrasound-delivery interval together with our comparative study of standards, we can affirm that the timing that best predicts SGA cases is the one closest to delivery; however, we cannot delay ultrasound universally to 37 weeks, since we would not detect early FGRs. As we are not currently able to make that prediction, it will continue to be the subject of future research.
According to our results, it would be appropriate to raise the ultrasound-estimated weight percentile cutoff point above 10 for fetal growth control. The 10th percentile has been shown to be insufficient, with low predictive capacity for SGA at delivery; fetuses that are potentially IUGR even before delivery can therefore escape control, which increases their morbidity and mortality. Our recommendation for the third trimester ultrasound, between 35 and 36 weeks, could be to raise the cutoff point at least from the 10th to the 20th percentile for strict control of fetal growth.

Strengths and Limitations of the Study
Our study has several strengths, including a large sample size of close to 10,000 pregnancies. Ultrasound measurements were performed in routine clinical practice; thus, weight estimations were concentrated over specific weeks of gestational age. The limitations of our investigation are that our data came from a single hospital and that its retrospective nature could limit the generalization of our results. Furthermore, the ultrasound information was available to the obstetricians, which could have biased the management of the pregnancies. A small percentage of deliveries were inductions of labor or cesarean sections scheduled because of IUGR, and these could act as confounding factors in the study. Similarly, cases of early termination due to other causes have not been taken into account.

Conclusions
In summary, even with limited detection rates, the growth standards showed a similarly good predictive ability for SGA at delivery by ultrasound at 35 weeks, with a statistically significant improvement when non-customized standards were used. Generally, a shorter ultrasound-delivery interval was correlated with better prediction rates for small-for-gestational-age cases across the different standards. When focusing on the use of EPW < 10th percentile at week 35 for the prediction of APOs, non-customized standards also demonstrated an advantage over customized standards.