1. Introduction
Adverse birth outcomes (such as low birth weight or preterm birth) remain a major public health concern because of their well-documented association with neonatal mortality and with both short- and long-term morbidity [1]. In 2015, about one million deaths among children under 5 years old were related to preterm birth complications [2]. Some studies have investigated the relation between preterm birth (PTB), small for gestational age (SGA) and neonatal mortality [3,4] and suggested that SGA was associated with increased risks of stillbirth and neonatal mortality [5].
Several authors have proposed that adverse birth outcomes could be related to unfavorable fetal growth conditions, which could, in turn, have long-term health consequences in adulthood [6,7,8,9], particularly cardiovascular disease [9]. This is what Barker named “the fetal origins hypothesis” in 1995 [10]. More precisely, Barker explained that small length and weight at birth and disproportion in head size could constitute markers of a lack of nutrients or oxygen at particular stages of gestation, thus increasing the risk of coronary heart disease in adulthood, as demonstrated by Osmond et al. in 1993 [11].
In this context, a variable named Favorable Fetal Growth Conditions (FFGC) has been defined in order to capture a variety of (un)favorable conditions related to environmental, genetic or epigenetic factors, dimensions that are all recognized to play a crucial role in prenatal development. Since the first scientific studies initiated in the 1990s, little attention has been devoted to investigating the existence of such a multi-dimensional variable. Recently, Bollen et al. investigated the possibility that the FFGC could exist and introduced it as a latent variable in a structural equation model (SEM) fitted to a cohort of Filipino infants [12]. This type of statistical model has the main advantage of allowing the integration of several outcomes instead of considering each in separate models. Their findings confirmed that the FFGC latent variable mediated the effects of maternal characteristics on several birth outcomes; the model with the latent variable (which captures the indirect effect of maternal characteristics) fitted the data better than a model without it (which assumes a direct effect of maternal characteristics). More recently, a study supported the existence of an FFGC latent variable in a population of children born in the United States (North Carolina and Pennsylvania) [13]. Replication efforts adopting the same strategy are needed, given that, as discussed by Camerota et al., “robustness is especially important for research on the fetal origins hypothesis, given its possible lifelong implications for human health and development” [13].
During the last decade, it has been documented that neighborhood characteristics may modify the health impact of risk factors, in terms of both the significance and the magnitude of the coefficients [14,15]. To our knowledge, no study to date has investigated whether the neighborhood socioeconomic level modifies the relationship between known risk factors and the FFGC latent variable, or the consequences of FFGC for health events.
In this context, one objective of our study was to replicate the approach developed by Bollen et al. (2013) [12] in a population of newborns in Paris, France, and to explore whether the model that included the FFGC latent variable fitted the data better. An additional objective was to investigate the differential effect of the FFGC latent variable according to the parental socioeconomic level measured at the scale of the residential census block.
2. Material and Methods
2.1. Study Area
According to the 2006 national census, Paris, the capital of France, has a population of about 2,250,000 inhabitants and records about 30,000 births per year. The year 2006 was chosen because it was the census year closest to the newborn health data. The small-area level used for the analysis is the census block (called IRIS by INSEE, the French National Statistics Institute). These units, the smallest for which socio-demographic data from the census are available, were constructed to be as homogeneous as possible in terms of population size and socioeconomic profile. The city of Paris is subdivided into 992 census blocks with a mean population of 2199 inhabitants and a mean area of 0.11 km².
2.2. Individual Data Source
Newborn health data are available from the first birth certificate registered by the Maternal and Child Care department of Paris (named PMI for ‘Protection Maternelle et Infantile’) [14]. This certificate is completed by the health professional before discharge from the maternity unit, within the 8 days following birth, and then sent to the PMI local unit. This database includes all newborns delivered between January 2008 and December 2011. In accordance with the rules regarding protection of personal data, the residential postal address of each mother at the time when the certificate was completed was geocoded at the IRIS level. From this database, we extracted several parental and newborn characteristics defined below.
2.3. Parental Characteristics
Several maternal characteristics were collected, including age (in years), parity (the number of times that a woman had given birth to a fetus with a gestational age >20 weeks), level of education and occupational status. Occupational status was the only available characteristic for the father. Newborns delivered at a very preterm gestational age (<20 weeks) were excluded from the analysis because such babies have only a very small chance of surviving.
We created three age groups of women according to thresholds defined in the literature [13]: women aged less than 20 years (younger) and above 35 years (older) are known to be at greater risk of adverse birth outcomes than women aged between 20 and 35 years (the referent age group). The initial quantitative parity variable was recoded as a binary variable (first pregnancy, yes or no: FirstP). The mother's level of education was categorized into four groups: low (primary school), intermediate (secondary), middle (‘baccalauréat’ diploma), and higher education (chosen as the referent group given the high number of such mothers in Paris). The occupational status of the mother and of the father was categorized into two groups, employed versus not employed (i.e., students or unemployed parents), coded unemployedM and unemployedF, respectively.
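The coding described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function name `code_parental_characteristics`, its argument names and the education labels are hypothetical. Two assumptions are made explicit in the comments: ages of exactly 20 and 35 years fall in the referent group, and a first pregnancy is taken to mean no previous birth.

```python
def code_parental_characteristics(mother_age, previous_births, mother_education,
                                  mother_employed, father_employed):
    """Return the binary indicators used as exogenous variables.

    Hypothetical helper: argument names and education labels are
    illustrative assumptions, not taken from the birth certificate format.
    """
    education_levels = ("primary", "secondary", "bac", "higher")
    if mother_education not in education_levels:
        raise ValueError(f"unknown education level: {mother_education!r}")
    return {
        # Age groups: 20-35 years is the referent group (both indicators 0).
        "younger": int(mother_age < 20),
        "older": int(mother_age > 35),
        # Assumption: a first pregnancy means no previous birth.
        "FirstP": int(previous_births == 0),
        # Education dummies: higher education is the referent group.
        "primary": int(mother_education == "primary"),
        "secondary": int(mother_education == "secondary"),
        "bac": int(mother_education == "bac"),
        # Occupational status: 1 = not employed (student or unemployed).
        "unemployedM": int(not mother_employed),
        "unemployedF": int(not father_employed),
    }


example = code_parental_characteristics(
    mother_age=37, previous_births=0, mother_education="bac",
    mother_employed=True, father_employed=False,
)
```

With this coding, a mother in every referent category (aged 20 to 35, not a first pregnancy, higher education, both parents employed) has all indicators equal to 0.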
2.4. Newborn Characteristics
Three birth outcomes were considered in this study: birth weight (BW) in kilograms, birth length (BL) in centimeters and gestational age (GA) in weeks. Because birth outcomes are known to differ by sex (for instance, mean birth weight is higher for boys than for girls), infant sex was included as a confounder (Girl), with male infants as the referent group.
2.5. Neighborhood Characteristics
To characterize the neighborhood where mothers lived during pregnancy, we chose an estimate of socioeconomic deprivation at the census block scale. The index, based on previous work, was constructed using a principal component analysis (more details are given in Lalloué et al. [16]). A total of 15 socioeconomic and demographic variables collected by the National Institute of Statistics and Economic Studies (INSEE, 2012), those most correlated with the first principal component, were selected and linearly combined to estimate the socioeconomic deprivation index for each census block of the city of Paris. The index was categorized into 10 groups according to the deciles of its distribution (named Socio-Economic Status: from SES1, the least deprived census blocks, to SES10, the most deprived census blocks). We chose 10 classes of deprivation in order to better capture the wide range of socioeconomic inequalities existing in Paris. Thanks to our very large sample size, 10 categories constituted a good compromise between keeping enough statistical power and grouping census blocks that are homogeneous in terms of level of deprivation.
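The decile categorization can be illustrated with a short Python sketch. It assumes that the deprivation index has already been computed for each census block and that a higher value means a more deprived block; the function name `assign_deprivation_deciles` and the toy block identifiers are hypothetical, and the principal component step itself is not reproduced here.

```python
def assign_deprivation_deciles(index_by_block):
    """Map each census block to a class from 'SES1' (least deprived) to 'SES10'.

    Assumption: a higher index value means a more deprived block, and ties
    are broken by block identifier so that the result is reproducible.
    """
    ranked = sorted(index_by_block, key=lambda block: (index_by_block[block], block))
    n_blocks = len(ranked)
    classes = {}
    for rank, block in enumerate(ranked):
        decile = rank * 10 // n_blocks + 1  # integer in 1..10
        classes[block] = f"SES{decile}"
    return classes


# Toy data: 20 hypothetical blocks whose index grows with the block number.
toy_index = {f"block{i:02d}": float(i) for i in range(20)}
toy_classes = assign_deprivation_deciles(toy_index)
```

Rank-based deciles give classes of nearly equal size in terms of census blocks, which is what makes a stratified analysis with ten strata workable.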
A major goal of the current study was to test whether BW, BL, and GA are associated with a common latent variable named the Favorable Fetal Growth Conditions (FFGC), as concluded in Bollen et al. [12], or are three distinct outcomes that have independent relationships with predictors. In the former case, we would conclude that a model with a latent FFGC variable (indirect-effects model) fits the data better than a direct-effects model. We then stratified the analysis by level of socioeconomic deprivation in order to investigate whether the FFGC always plays the same role, and to the same extent, in each socioeconomic category.
Direct Effects Model (Unmediated Effects). The base SEM (Model 1) is depicted in Figure 1. In this model, each predictor has a direct effect on the observed birth outcomes (birth weight: BW, birth length: BL, and gestational age: GA). All the observed predictor variables are exogenous (FirstP, younger, older, Primary, Secondary, bac, unemployedM and unemployedF; see the section Parental Characteristics for a definition of these variables). The exogenous variables are allowed to correlate with one another, as indicated by the long bar with the short arrows connecting them, except the variable ‘girl’; indeed, there is no hypothesis justifying a possible correlation between the sex of the newborn and, for instance, the occupational status or the level of education of the parents. The arrows drawn from the set of exogenous observed variables to the set of health outcomes indicate the direct influence of one variable on another. Finally, in this model, we allowed the errors in BW, BL, and GA to correlate with each other; this means that there may be a residual association between the three health outcomes that is not captured by the exogenous variables introduced in the model.
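In equation form, Model 1 can be sketched as below. The notation is introduced here for illustration only and is an assumption about how the path diagram translates into equations: for newborn $i$, $y_{ij}$ is outcome $j$ (BW, BL or GA), $x_{i1},\dots,x_{i8}$ are the eight parental indicators, $g_i$ is the girl indicator, and $\varepsilon_{ij}$ are the outcome errors.

```latex
\begin{aligned}
y_{ij} &= \alpha_j + \sum_{k=1}^{8} \beta_{jk}\, x_{ik} + \delta_j\, g_i + \varepsilon_{ij},
  \qquad j \in \{\mathrm{BW}, \mathrm{BL}, \mathrm{GA}\},\\
\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ij'}) &= \psi_{jj'}
  \quad \text{(freely estimated for } j \neq j'\text{)}.
\end{aligned}
```

Each outcome therefore has its own set of coefficients $\beta_{jk}$, and the free error covariances $\psi_{jj'}$ absorb whatever association between the outcomes the predictors leave unexplained.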
Indirect Effects Model (FFGC as a Mediator). In the second model (Model 2), a latent variable was added to Model 1 in order to consider possible mediation of the effects of the exogenous variables on BW, BL, and GA, as proposed by Bollen [17]. In Figure 2, the latent variable FFGC is represented by an oval and the observed variables by rectangles. As in previous studies, to assign a scale to the latent variable, we fixed to 1 the path between the latent variable FFGC and one outcome, birth weight (the BW variable having the highest standard deviation). In this model, we hypothesized that the latent variable FFGC constituted an unobserved measure of a blend of favorable or unfavorable conditions for fetal growth, which simultaneously affected BW, BL and GA. The exogenous variables may affect the FFGC variable differently, which could in turn increase or decrease BW, BL and GA. Only the girl variable had no direct effect on FFGC, because it is an intrinsic characteristic of the baby, whereas the other variables characterize the environment of fetal growth. Modeling the direct effect of the girl variable on the three outcomes allowed us to take into account the greater vulnerability of the male fetus. In contrast with Model 1, this model does not allow error correlations between the outcomes (no residual relationship among them), on the grounds that possible associations between them are already captured by their common dependence on the FFGC latent variable. However, as in the first model, all the exogenous observed variables were allowed to correlate with one another, except the variable ‘girl’.
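Model 2 can be written out in the same spirit. The notation below is an illustrative assumption, not taken from the original figures: $\eta_i$ denotes the latent FFGC of newborn $i$, $x_{i1},\dots,x_{i8}$ the eight parental indicators, $g_i$ the girl indicator, $\zeta_i$ the disturbance of the latent variable, and $\varepsilon_{i1}, \varepsilon_{i2}, \varepsilon_{i3}$ the outcome errors.

```latex
\begin{aligned}
\eta_i &= \sum_{k=1}^{8} \gamma_k\, x_{ik} + \zeta_i,\\
\mathrm{BW}_i &= \alpha_{1} + 1 \cdot \eta_i + \delta_{1}\, g_i + \varepsilon_{i1},\\
\mathrm{BL}_i &= \alpha_{2} + \lambda_{2}\, \eta_i + \delta_{2}\, g_i + \varepsilon_{i2},\\
\mathrm{GA}_i &= \alpha_{3} + \lambda_{3}\, \eta_i + \delta_{3}\, g_i + \varepsilon_{i3},\\
\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ij'}) &= 0 \quad (j \neq j').
\end{aligned}
```

Under this formulation, the effect of a parental characteristic $x_k$ on an outcome is entirely indirect: $\gamma_k$ for BW (whose loading is fixed to 1), $\lambda_2 \gamma_k$ for BL and $\lambda_3 \gamma_k$ for GA. This constraint is what makes Model 2 more parsimonious than a model with one free coefficient per predictor and outcome.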
Modified FFGC Latent Variable Model (Model 3). Like Camerota and Bollen [13], we explored empirical modifications to Model 2 by adding omitted paths detected by statistical tests provided by the statistical software (SAS, using PROC CALIS). Based on these statistics, we only considered modifications that were plausible and theoretically justifiable. The retained paths were the direct paths from FirstP to GA, from FirstP to BW, from unemployedF to BW, from unemployedM to GA and from primary to GA; these additional paths are represented in Figure 3 with blue arrows.
2.6. Statistical Analysis
Imputation of missing values. The descriptive analysis revealed high rates of missing values for the occupational status of the mother and of the father, as well as for the level of education of the mother (varying between 30% and 40%). As the missing values were not randomly distributed over the study area, we imputed the data based on the census data collected in the city of Paris by the National Institute of Statistics and Economic Studies (INSEE, 2012). In each census block, we randomly attributed an education level and an occupational status to mothers and fathers with missing values, in order to obtain comparable individual and neighborhood distributions in terms of education level and occupational status.
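One way to picture this imputation is the Python sketch below. It is an assumption about the general mechanism, not the exact procedure used in the study: each missing value is drawn at random from the category proportions of the mother's census block. The function name `impute_by_block`, the record layout and the toy proportions are all hypothetical.

```python
import random


def impute_by_block(records, block_distributions, field, rng):
    """Fill missing `field` values by sampling from the census-block distribution.

    records: list of dicts with keys 'block' and `field` (None when missing).
    block_distributions: {block: {category: proportion}} from census data.
    Returns a new list; observed values are left untouched.
    """
    imputed = []
    for record in records:
        completed = dict(record)
        if completed[field] is None:
            distribution = block_distributions[completed["block"]]
            categories = list(distribution)
            weights = [distribution[c] for c in categories]
            completed[field] = rng.choices(categories, weights=weights, k=1)[0]
        imputed.append(completed)
    return imputed


# Toy data with hypothetical block identifiers and proportions.
census = {
    "A": {"primary": 0.1, "secondary": 0.2, "bac": 0.2, "higher": 0.5},
    "B": {"primary": 0.0, "secondary": 0.0, "bac": 0.0, "higher": 1.0},
}
births = [
    {"block": "A", "education": "bac"},
    {"block": "A", "education": None},
    {"block": "B", "education": None},
]
completed_births = impute_by_block(births, census, "education", random.Random(0))
```

Because imputed values are drawn from block-level proportions, they necessarily tie individual characteristics more closely to the neighborhood profile, which matters when interpreting analyses stratified by neighborhood.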
Statistical procedure. First, we considered all the newborns in the statistical analysis. Then, we explored potential changes in the model estimates when taking into account the deprivation of the neighborhood where the mother lived. To do so, we stratified the analysis by the ten classes of the neighborhood socioeconomic deprivation index to analyze
- (i) whether Model 2 and/or Model 3 fit better than Model 1 in each sub-group, separately, and
- (ii) whether the signs, the magnitude and the p-values of the regression coefficients related to FFGC varied across sub-groups (from the most deprived census blocks to the most advantaged).
Statistical indicators. We used several statistical indicators to assess overall model fit: the SRMR (Standardized Root Mean Square Residual), the RMSEA (Root Mean Square Error of Approximation) and Bentler's CFI (Comparative Fit Index), as recommended by O’Rourke and Hatcher [18]. The closer [1-RMSEA] and the CFI are to 1, the better the model fit; a CFI close to 0.94 has been suggested as indicating a good fit [18]. The closer the SRMR is to 0 (or when it is lower than 0.055), the better the model. An additional statistical indicator was used to compare SEM models: the Bayesian information criterion (BIC); a difference of 10 or more in BIC between two models suggests evidence in favor of the model with the lowest BIC [19].
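These decision rules can be expressed compactly. The Python sketch below is only an illustration of the cut-offs quoted in the text: the function names `good_fit` and `compare_bic` are hypothetical, treating "close to 0.94" as "at least 0.94" is a simplifying assumption, and the BIC values in the example are invented for illustration and are not results of the study.

```python
def good_fit(cfi, srmr, cfi_threshold=0.94, srmr_threshold=0.055):
    """Apply the CFI and SRMR cut-offs quoted in the text.

    Assumption: 'CFI close to 0.94' is read as CFI >= 0.94.
    """
    return cfi >= cfi_threshold and srmr < srmr_threshold


def compare_bic(bic_a, bic_b, min_difference=10.0):
    """Return 'a' or 'b' for the model favored by BIC, or None if the
    absolute difference is below `min_difference`."""
    if abs(bic_a - bic_b) < min_difference:
        return None
    return "a" if bic_a < bic_b else "b"


# Hypothetical fit statistics, for illustration only (not results of the study).
preferred = compare_bic(bic_a=1250.0, bic_b=1180.0)
```

Here `preferred` is `"b"`, because model b has the lower BIC and the gap exceeds 10; with a gap below 10, the function returns `None` to signal that BIC alone does not discriminate between the two models.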
All the models were estimated in SAS using full-information robust maximum likelihood (MLR) as our estimator.
4. Discussion
The fetal environment, which is known to be related to pregnancy outcomes [20,21], is a complex notion that cannot be observed directly through a single, simple measure. While it is well documented that several factors play a crucial role in fetal development [22], we could only use parental characteristics in this analysis. Introducing them in an SEM, we tested the hypothesis that an underlying, unobserved latent variable, namely the FFGC variable, could partially express the effects of this blend of parental characteristics on BW, BL and GA. Parental characteristics, in particular level of education and occupational status, have been reported to be associated with fetal growth [7]. Indeed, Silva et al. explained that the maternal level of education reflects material resources (because it partially determines occupational status and income level) and non-economic social dimensions, including the level of knowledge about health [7], which could affect birth weight, head circumference and other birth measures. The mechanisms through which parental traits may be indirectly related to adverse birth outcomes remain unclear. However, we may formulate a hypothesis. These parental characteristics may reflect (1) poor living conditions [23], which could relate to airborne exposure of the fetus, in association with poor housing [24], living near high-traffic areas or workplace air pollution, (2) unhealthy food [25] (due to a low level of education and/or to limited resources [7,26]), which could damage or slow fetal growth through a lack of essential nutrients [27], and (3) insufficient pregnancy screening [23], which could reduce the chance of early detection of infections and other adverse situations. All these factors are separately known to reduce gestational age, birth weight and birth length, which represent different dimensions of fetal growth and might be captured by the favorable/unfavorable fetal growth conditions latent variable. Hence, as recommended by Camerota and Bollen [13], it is important to replicate the indirect model proposed by Bollen et al. [12] in different populations and settings to accumulate evidence for the existence of an FFGC latent variable. According to the fetal origins hypothesis, impairment of birth conditions may have lifelong implications for human health and development [28].
The main objective of our study was to apply this novel approach to another population, here newborns in the city of Paris (2008–2011). One original aspect of our study was to investigate whether the results found in the overall population concerning the latent variable could vary according to the level of socioeconomic deprivation measured at the scale of the mothers' residential census blocks. Comparing fit statistics from the model in which parental characteristics directly affected BW, BL, and GA (Model 1) with those from the model in which these effects were mediated by FFGC (modified FFGC latent variable model, Model 3), we confirm that the latter provided a better fit to the data.
One limitation, highlighted by Bollen et al. [12] in their study, was related to the collected data: birth weight, birth length and gestational age were self-reported by the mothers, and this could introduce bias in the measures of association. In 2016, Camerota and Bollen [13] sought to overcome this limitation and replicated their study with independent measures of the birth outcomes: one self-reported by the mother and the other reported by health professionals, as in our study. They confirmed that the indirect model (including the FFGC latent variable) fit the data better, in line with our finding. Thus, the results appear robust to how information on birth outcomes was collected.
Most interestingly, we stratified our analysis according to the neighborhood socioeconomic index estimated at the census block scale to test (i) whether the FFGC model was a better fit in each sub-group, and (ii) whether the patterns of signs and significance of the regression coefficients predicting FFGC differed across sub-groups. Our study found that a model with a latent FFGC variable fit the data better than a model without it, whatever the class of neighborhood socioeconomic deprivation. Hence, our study suggests that the existence of the latent variable characterizing favorable fetal growth conditions was not undermined by the level of neighborhood socioeconomic deprivation. However, our findings reveal that the strength, the direction and the level of significance of the associations between the exogenous variables and the FFGC differed according to the level of neighborhood deprivation. Therefore, our study suggests that the mechanisms and/or the strength of the mediating effect of the FFGC on adverse birth outcomes may be influenced by the level of socioeconomic deprivation; further investigations are needed to gather additional scientific evidence.
The strengths and limitations of this study should be addressed here. The main strength of our study lies in the databases we used, with a high rate of completeness of the birth certificates (93% on average) and a large population size, resulting in high statistical power when stratifying the statistical analysis by neighborhood deprivation category. However, one limitation related to the stratified analysis is the high number of missing values, which we imputed using information available at the census block scale. In doing so, we increased the correlation between the distribution of individual characteristics and that of the neighborhood. Hence, we decreased between-subject variability within census blocks, especially among the most deprived census blocks, in which the rate of missing values was twice as high as in the most privileged ones. However, our results consistently reveal significant relations with the FFGC variable; this may suggest that the imputed values did not have a substantial influence on this main result, but rather on the measures of association between the exogenous variables and the FFGC variable and on their level of significance. Additionally, our study has the advantage of considering three birth outcomes together in the same analysis, whereas the majority of epidemiological studies investigated each one separately even when data on several outcomes were available. A methodological approach that takes into account the correlation between birth outcomes is more realistic, given, for instance, that a short gestational age reduces birth weight and length. While structural equation models have existed for a long time [29,30,31], to date, few studies have applied them to birth outcomes [12,13]. We encourage new research conducted in the perinatal field to adopt this approach.
The main limitation of our study is the absence of certain maternal characteristics, such as maternal height and weight (or body mass index), smoking habits or alcohol consumption during pregnancy; these are all known risk factors for adverse birth outcomes. However, we took into account individual and neighborhood socioeconomic information, which is documented to be related to unhealthy behaviors; thus, we believe that we partially captured these missing risk factors through these socioeconomic data, and thus reduced the resulting bias. An additional limitation of this study concerns the difficulty of extrapolating our findings to other French areas. Indeed, Île-de-France is one of the French regions where the proportion of the population with a high level of education is the highest (https://www.insee.fr/fr/statistiques/1288219). This may have influenced the association with the latent variable and needs to be investigated in other areas with various socioeconomic profiles.
In addition, in our study, only one variable, occupational status, was available to characterize the socioeconomic position of the father. In future research, it would be valuable to consider the level of education of both the father and the mother in the same analysis.