Spatiotemporal Contextual Uncertainties in Green Space Exposure Measures: Exploring a Time Series of the Normalized Difference Vegetation Indices

Environmental health studies on green space may be affected by contextual uncertainties originating from the temporality of environmental exposures and by how the spatial context is delimitated. The Normalized Difference Vegetation Index (NDVI) is frequently used as an outdoor green space metric capturing the chlorophyll content in the vegetation canopy. This study assessed (1) whether residential NDVI exposures vary over time, and (2) how these time series of NDVI scores vary across spatial context delimitations. Multi-temporal NDVI data for the period 2006–2017 for the Netherlands were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite platform. Annual NDVI exposures were determined across multiple buffer sizes (i.e., 300, 600, and 1000 m) centered on a random sample of 10,000 Dutch residential addresses. Besides the descriptive statistics, pairwise Wilcoxon tests and Fligner–Killeen tests were used to determine mean and variance differences in annual NDVI scores across buffer widths. Heat maps visualized the correlation matrices. Significance levels were adjusted for multiple hypotheses testing. The results indicated that annual NDVI metrics were significantly correlated but their magnitude varied notably between 0.60 to 0.97. Numerous mean and variance differences in annual NDVI exposures were significant. It seems that the disparate buffers (i.e., 300 and 1000 m) were less strongly correlated, possibly because variance heterogeneity is reduced in larger buffers. These results have been largely consistent over the years and have passed Monte Carlo-based sensitivity tests. In conclusion, besides assessing green space exposures along different buffer sizes, our findings suggest that green space–health studies should employ NDVI data that are well-aligned with epidemiological data. Even an annual temporal incompatibility may obscure or distort green space–health associations. Both strategies may diminish contextual uncertainties in environmental exposure assessments.


Introduction
It is increasingly recognized that the human body responds to the environments to which it is exposed [1][2][3]. The assessment of health-threatening and health-promoting environmental exposures has received fresh impetus from progress in earth observation coupled with geographic information systems (GIS) leading to high-quality environmental information [4,5]. To date, satellite images represent the gold standard for capturing conditions of the natural environment (i.e., vegetation) in an objective manner [5][6][7].
One prominent environmental factor is green space-namely forests, parks, etc.-which is capable of reducing harm caused by exposure to physical and mental health stressors [2,8]. Studies found that being surrounded by outdoor residential green space has beneficial effects for diverse population groups and on multiple health outcomes, such as adiposity [9], birth outcomes [10], suicide [11], physical activity [12], children's brain development and cognitive function [13], the elderly's mortality [14], and depression [15]. Pathways connecting green space and health include the mitigation of harmful exposures (e.g., air pollution, noise), the promotion of physical activity, and the support of restoring capacities [8].
Vegetation metrics [7] derived from remote sensing images such as the Normalized Difference Vegetation Index (NDVI) [16,17] are widely used to describe the distribution and amount of green space in population-based epidemiological studies [14,18,19]. However, findings that relate NDVI to health outcomes are not always consistent. For example, while [13] reported beneficial effects of green space on brain development, another study found an insignificant direct effect of residential NDVI extracted from the Moderate Resolution Imaging Spectroradiometer (MODIS) on major depressive disorders and psychological distress [20].
While there are manifold reasons for these contradictory findings (e.g., different research designs, study site specific differences), contextual uncertainties also contribute, to a certain extent, to the heterogeneity of findings on green space and health outcomes [21]. First, the temporality of environmental exposures contributes to contextual uncertainties. Green space varies naturally over time, but ambiguities can also arise artificially as a result of applied data processing [22]. Despite such temporal variations in NDVI scores, some studies [20,23] defend the application of NDVI data that are misaligned with epidemiological data. However, empirical results supporting such practices are lacking. Since longitudinal studies incorporating people's life courses and their exposure to (temporally weighted) time series of green space are receiving increasing attention [9,24], well-aligned time series of NDVI data are crucial to circumvent exposure misclassifications.
Spatial contextual uncertainties arise due to the diverse ways in which the spatial context that people belong to is defined. Several studies have shown that assigning exposures based on administrative areas (e.g., postal codes) leads to conceptual and methodological flaws (for details, see [3]). Refined green space assessments can be achieved by focusing only on the vicinity of an individual's actual residential location as represented through GIS-based buffers. As the "true" health influencing spatial context is unknown, the ways to determine an adequate buffer size are still under debate. Consensus has been reached, however, that different buffers sizes need to be explored to validate the stability of health-NDVI associations [25,26].
Motivated by these challenges, this study addressed spatiotemporal contextual uncertainties in exposure to outdoor NDVI for a large sample of addresses in the Netherlands. The first objective was to examine whether NDVI exposure measures vary significantly over time. Such analyses empirically underpin common empirical practices whereby temporally misaligned indicators are used as substitutes in assessing health-green space correlations. The second objective was to examine how variations in the spatial context-here defined as buffers with different sizes centered on residential addresses-lead to temporally misaligned exposure measures.
The empirical findings contribute to a better understanding of contextual uncertainties about NDVI scores. As spatial and temporal contextual uncertainties jointly affect green space assessments, this study puts both scale issues and the temporal alignment of green space exposures centrally. The results may assist those carrying out future studies to decide whether misaligned vegetation indices could serve as substitutes. To the best of our knowledge, no previous study has addressed these elementary but important issues, which have the potential to result in biased green space exposure assessments.

Study Area
This analysis was conducted in the Netherlands as part of the NEEDS study (www.needs.sites.uu.nl). The Netherlands was selected as a case study due to its diverse geography, which ranges from rural areas with much green space in the north of the country, to one of Europe's most populated and urbanized areas in the middle of the country (i.e., the Randstad area).

Address Data
Like previous studies [23,27], the analysis was carried out at an address level. Address locations stem from the Dutch cadaster (version September 2017), which contains 9.1 million addresses. We extracted those for residential function and mixed use. From these 8.7 million addresses, a random sample of 10,000 locations was drawn; each had the same selection probability. Such a sample size corresponds to epidemiological studies. Figure A1 in the Appendix shows the distribution of the sampled addresses.

Remote Sensing-Based NDVI Data
The widely implemented satellite-derived Normalized Difference Vegetation Index (NDVI) was used to measure the level of greenness [17]. Such a multispectral vegetation index reflects the fact that chlorophyll absorbs the red light (i.e., 0.6-0.7 µm) and the mesophyll leaf structure scatters the near-infrared (0.7-1.1 µm) [16,28,29]. The NDVI is composed of the near-infrared radiation minus visible radiation divided by near-infrared radiation plus visible radiation [17]. The NDVI ranges between −1 and +1, where negative values refer to non-biomass (e.g., clouds, water, and snow) and values around 0 represent rocks and bare soil. The higher the positive values, the higher the amounts of biomass. To reduce the likelihood of biased exposure determination, values <0 (representing water bodies) are filtered for outdoor green space assessments [23].
We focused on MODIS NDVI satellite-based measures due to their global coverage as pre-processed time series while being less computationally intensive than Landsat or other higher resolution data. Annual MODIS NDVI 16-day composites (June 25-July 10, with the product ID MOD13Q1) in the hierarchical data format were acquired for the Netherlands spanning the period 2006-2017 [7,29]. Within the 16-day acquisition period, the algorithm to pre-process the MODIS data selected pixels with the lowest cloud coverage, a low view angle, as well as having the highest NDVI values [29]. Data were obtained through the online MODIS data repository (https://modis.gsfc.nasa.gov/data/dataprod/mod13.php). As vegetation follows a seasonal cycle with the highest amount of green during the summer, we consistently used annual data for June/July. To match with other available data, the original spatial resolution of 250 meters was down-scaled to 100 meters using bilinear interpolation. After clipping and re-projecting the data to the national Dutch reference system, the data were entered into a multilayer image stack, and the recommended scaling factor of 0.0001 was applied [29]. Figure 1 illustrates an example of the NDVI time series for Houten and its surroundings.

GIS-Based Exposure Assessment
To represent the spatial environmental context, we utilized a buffer approach using GIS [25,27]. Circular buffers with radii of 300, 600, and 1000 m were centered on each sampled address location. Smaller buffer sizes reflect the immediate surroundings, while 1000 m buffers capture the wider surroundings. In addition, such circular buffers allow for the exploration of scale effects and are more suitable for long-term analyses because they are independent of auxiliary data (i.e., the road network), which also change over time but are rarely available. Next, we determined the average NDVI values for every address across the three buffer sizes for each year. These proxy measures for green space exposure are attached to each location.

Statistical Analyses
Methods included descriptive statistics describing the central tendency including the mean, standard deviation, etc., to summarize the residential NDVI scores. Individual NDVI trajectories for each address and year were displayed through time series plots. Non-parametric correlations after Spearman coupled with a heat map [30] were implemented to assess associations across the different buffer sizes as well as their correlations over time. Pairwise Wilcoxon rank sum tests were used to test whether NDVI scores differ statistically over the years and across different buffer sizes [31]. Non-parametric Fligner-Killeen tests, which are robust against non-normally distributed values, were conducted to test whether the variances across the annual NDVI scores and buffer sizes differ [32].
Since we conducted a large number of tests, the multiple testing problem arose, namely the probability of false positives increased (i.e., finding differences by chance). To handle multiple hypotheses testing in the applied statistics, the p-values were adjusted following Holm's procedure [33]. Finally, because the sample characteristics could vary by chance, sensitivity tests were performed. A Monte Carlo procedure [34] was used to examine whether the initial sample mean varied significantly compared to 100 alternative random samples, each 1000 addresses large, drawn from all residential addresses (i.e., the initial 10,000 addresses were excluded). The annual mean and standard deviation (SD) of the resulting reference distribution was compared to the initial sample per year. The analyses were carried out in the R environment for statistical computing [35]. Table 1 together with Figure 2 summarize the annual NDVI scores aggregated across the sample.

Results
The boxplots confirm what the measures of central tendency indicated, namely that the NDVI values fluctuate slightly between 2006 and 2017. For example, the NDVI values for 2010 seem to be lower compared to the other years. There seems to be a minor overall temporal trend of increasing NDVI scores since 2006. Regression lines added to the boxplots confirm this impression. Similar patterns were observed when cross-comparing the different buffer sizes.  Because such aggregated analyses hide individual temporal patterns, the individual time series of the average 16-days NDVI composite values per address location were also examined. Figure 3 displays the time series plots. While the overall plots do not indicate striking patterns, an individual investigation showed some striking temporal changes of the NDVI scores per location. The core of the analyses was, as summarized in Figure 4, assessing the NDVI correlations over time and across buffer sizes. While all correlations were highly statistically significant (p < 0.05), even after adjusting for multiple testing, some patterns were recognizable. For example, the 1000 m buffers were less strongly correlated than the 600 and 300 m buffers. Further, it seemed that some years were less associated, although still statistically significant. The years 2007 and 2016 were, independent of the examined buffer size, less associated with the other years of NDVI data. This was in line with Figure 1, whereas deviations in the NDVI patterns were recognizable. Even annual differences of a single year may result in some correlation differences. The results of the pairwise mean comparisons between years with corrections for multiple testing are given in Table 2. As indicated by the p-values, several years of our sampled NDVI values have different means, independent of which buffer size is considered. Moreover, the Fligner-Killeen tests to assess the homogeneity of variances rejected the null hypothesis of equal variances over the years and buffer sizes (all p < 0.001). Our final sensitivity tests, using 100 random samples of 1000 address samples and comparing mean NDVI scores, indicated no statistically significant differences. The initial mean NDVI score was well-situated in the simulated distribution of mean NDVI values plus/minus one standard deviation. Figure A2 in the Appendix illustrates the result for NDVI values for 2014 and the 600 m buffer.

Buffer (m)
Year

Discussion
Accurately assessing the level of outdoor green space surrounding people's residential homes is crucial for epidemiological studies. Due to its cost-effectivity and large coverage of high-quality time series, the NDVI is commonly used for regional or even nationwide studies [14,19]. To limit contextual uncertainties-a fundamental methodological problem [21]-it is important to have NDVI data that are temporally well-aligned with the epidemiological data, although it is not common practice [20,23]. Such data mismatch may translate into biased outcomes, that is, lead to an over-or underestimation of the NDVI effect size in regressions. This study shed light on the often ignored circumstance of temporally incompatible green space data by exploring annual time series of NDVI exposures derived from buffers centered on a large sample of residential addresses randomly distributed across the Netherlands.

Interpretation of the Results
The results showed that, as expected, individual years of NDVI scores were significantly correlated; however, a few years were less strongly correlated than others. These differences were further confirmed statistically through mean and variance differences in NDVI exposures. Our findings question the practice of having a temporally poorly-aligned linkage of green space and epidemiological data. For example, Rugel et al. [20] used survey data for 2012 while the NDVI was for 2014-2016. At least partly due to temporally incompatible data, an insignificant green space effect on mental health outcomes was reported by that study. It could be that the level of green space changed, which translated into less accurate exposure assessments. However, the NDVI differences could also emerge due to mixed pixels, misregistration of the satellite location, etc. (see [16] for a discussion).
There is no consensus on the buffer size to delineate the residential context [25,26]. Our tested buffer widths followed the literature, reflecting the closer and extended residential environment. The results show that the NDVIs for different buffer widths over time are highly positively correlated, ranging from 0.60 to 0.97. A similar range of correlations for MODIS NDVIs was reported elsewhere [26]. Furthermore, there seems to be a tendency for the disparate buffers to be less correlated, which is in concordance with a study in New York City [26]. An explanation could be that smaller buffers have a pronounced variance while NDVI differences average out in larger buffers. Roughly, these results are consistent over the time series of NDVI values. Utilizing a Monte Carlo approach, these results passed sensitivity tests. It can be ruled out that the initial sample represented an extreme case. Given these results, it is recommended not to use temporally incompatible NDVI and epidemiological survey data. To reduce temporal uncertainties, the date should be aligned as closely as possible with the survey data. Without having conducted the obligatory tests, we speculate that discrepancies may be attenuated when cumulative green space exposures over time (or time-weighted ones), rather than purely cross-sectional metrics, are employed. Here, systematic investigations across different study designs, health outcomes, study sites, etc. require further attention.
Because MODIS NDVI maps are retrieved from daily images to generate 16-day composite measures, exact temporal matches are possible for most parts of the world, which is challenging for other popular sensor platforms such as Landsat. The promising temporal granularity of MODIS images is challenged by a moderate spatial resolution of 250 meters and a minimum mapping size of detectable vegetation. This may be problematic for urban areas where small-scale green spaces exist, which may lead to an over-or underestimation of the true green space exposure. Besides these technological constraints, it is debatable whether green space measures from downward-facing satellites (including all platforms) represent accurately how people on the ground perceive green space [36,37]. When comparing neighborhood-based NDVI scores and street view-based green space for Beijing (China), no significant associations were found across both metrics [37]. Such overhead-view assessments only capture the available vegetation; they do not provide insights into the quality, accessibility, and actual use of these areas [20,38].
The application of green space metrics is more complex than it seems at first sight due not only to different sensor platforms (e.g., MODIS, Landsat) but also to complications ensuing from the variety of measures proposed. While the NDVI correlates with expert ratings [39], the enhanced vegetation index (i.e., an alternative vegetation index), for example, is designed to overcome methodological limitations of the NDVI including the elimination of atmospheric and canopy background effects contaminating standard NDVI while having better sensitivity at higher vegetation levels [7,29] (for an application, see [20]). Others implemented classified land use information [15,27,40]-which is challenging because of a varying number of land use categories or sensitivity to the underlying image-based classification algorithm-or used measures such as the leaf area index [41] and the vegetation continuous fields tree cover [42,43]. However, our results are of more general interest because the MODIS NDVI was found to be highly correlated with alternative vegetation measures such as the leaf area index [41].

Study Limitations
There are numerous limitations to this analysis. Firstly, the present analytical study design is restricted. While imitating real-word ego-centric exposure assessments, the study did not evaluate the actual consequences of misaligned green space measures on physical and mental health outcomes across different settings (e.g., countries, cities and rural areas, spatial contexts). Studies that rule out incompatible green space data and that do not lead to biased regression coefficients are advised. Secondly, although down-scaling increased the number of cells per buffer while limiting the influence of edge effects, we cannot exclude that this procedure artificially over-or underestimated the average NDVI values. Furthermore, when interpreting the results, it is critical to stress that all NDVI values given, within the threshold buffer distance, contribute equally (i.e., mean of NDVI cells per buffer). A refinement could be distance-based weighting schemes because cells close to homes are more important than those further away. Thirdly, notwithstanding that NDVI was found to be fairly constant across satellite platforms [6], only MODIS NDVI measures were explored. Thus, the results should not be uncritically transferred to other platforms (e.g., RapidEye, Sentinel 2). Such high-resolution sensors support more precise inner-urban analyses. Fourthly, for the sake of simplicity, we assumed the residential home location to be the source of exposure, as did others [19,23]. This conventional conceptualization leaves aside exposures along people's day-to-day mobility, which is relevant for more accurate exposure assessments [3,21]. Finally, as only a large, representative sample for the Netherlands was investigated, the generalization of findings could be questioned.

Conclusions
This paper addressed contextual uncertainties originating from both the temporality of environmental exposures and how the spatial context is delimitated. The empirical results show that annual NDVI scores derived from different buffer sizes differed significantly in their means and variances. Furthermore, within the range of 0.60 to 0.97, the magnitude of the correlations varied notably over the years and buffer sizes.
These findings are vital for exposure assessments that often enrich epidemiological data with temporally incompatible NDVI data, thus potentially obscuring green space-health associations. Even a misalignment of a single year may distort correlations. To mitigate contextual uncertainties, it is advised to integrate temporally well-aligned green space data. While the present study took a first step toward an empirical basis addressing spatiotemporal contextual uncertainties, future studies should systematically research the consequences for correlation studies. Acknowledgments: I appreciate the reviewers' critical and constructive comments on the manuscripts, which contributed to the quality of this publication. I thank the IJERPH for providing a waiver for the article processing charge.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. Appendix A Figure A1. Sampled address locations. The red dot in the center indicates the location shown in Figure 1.