Standardized Green View Index and Quantification of Different Metrics of Urban Green Vegetation

Urban greenery is considered an important factor in sustainable development and in people's quality of life in the city. Although ways to measure urban greenery have been proposed, the characteristics of each metric have not been fully established, rendering previous research vulnerable to changes in greenery metrics. To make estimation more robust, this study aims to (1) propose an improved indicator of greenery visibility for analytical use (standardized GVI; sGVI), and (2) quantify the relation between sGVI and other greenery metrics. Analyzing a data set for Yokohama city, Japan, we show that the sGVI, a weighted form of the Green View Index (GVI) aggregated to an area, mitigates the bias caused by densely located measurement sites. Also, by comparing sGVI and NDVI at city block level, we found that sGVI captures the presence of vegetation better in the city center, whereas NDVI is better at capturing vegetation in parks and forests. These tools provide a foundation for assessing the effect of vegetation in urban landscapes in a more robust manner, enabling comparison on any arbitrary geographical scale.


Introduction
According to the United Nations [2018], the proportion of the world population living in cities is expected to increase from 55% in 2018 to 68% by 2050. As cities become more densely populated, sustainable development is more indispensable than ever in order to address challenges such as mitigating climate change and enhancing the quality of life of citizens.
Green metrics occupy an essential part in these arguments, since the method by which urban greenness is measured largely affects association analyses. Conventionally, accessibility to green spaces registered in land use data is often employed as a representation of urban greenery, but this approach does not fully capture people's exposure to green vegetation: street trees are ignored, and the differing vegetation profiles of green spaces are not considered. In this respect, satellite images provide an objective estimate of the presence of vegetation through the Normalized Difference Vegetation Index (NDVI). However, NDVI does not represent people's perception of green vegetation because of its top-down viewpoint, whereas people on the street usually see vegetation horizontally or see the canopy by looking upward (Li et al. [2015b], Larkin and Hystad [2019]).
Recently, in order to measure eye-level visibility of green vegetation, the Green View Index (GVI) was developed (Yang et al. [2009], Li et al. [2015b]) and has been used in various studies (Li et al. [2015a, 2016], Villeneuve et al. [2018], Lu [2019]). Although this index makes it possible to capture people's perception of greenness from a specific site on a street, it has been overlooked that the aggregation of such site-based GVI at area level (e.g. block, census tract, administrative boundary) is sensitive to the site distribution. Spatial aggregation of the site-based GVI must therefore be studied to allow a more robust discussion of the association between urban greenness and other factors.
In order to tackle this problem, this study proposes an aggregation method called standardized GVI (sGVI). By using Voronoi tessellation, the GVI sites are weighted quasi-inversely to the density of the sites in the area of aggregation. The characteristics of NDVI and sGVI are also compared, since understanding them is crucial, especially when applying these metrics to analytical studies. Even though the different perspectives of NDVI and GVI have been pointed out based on their moderate correlation (Larkin and Hystad [2019]), their relation at a spatially aggregated level must also be examined, given that such aggregation is common in association studies focusing on urban greenery. This article is organized as follows: after related studies are reviewed in Section 2, the methodologies of the green metrics are presented (Section 3). The application of the newly proposed green metric and of other conventional metrics, together with the results, is discussed in Section 4 and Section 5. Finally, Section 6 concludes this article.
The amount of greenness in a given area is traditionally quantified by land use data with green coverage or by the Normalized Difference Vegetation Index (NDVI), derived from satellite imagery and the use of infrared light (James et al. [2015], Gascon et al. [2016]).
To account for greenness underestimated by land use data or the NDVI, such as urban forests, Yang et al. [2009] proposed the Green View Index, which makes use of colored pictures to assess street-level visibility of green vegetation. This index is further elaborated in Li et al. [2015b], by developing an automated program to estimate the visibility of greenness using the Google Street View API. Their work has been applied to cities around the world, with the computed results showcased in a website 2 .
However, the accuracy of green vegetation detection using the GVI was lacking, as artificial green objects can be erroneously classified as green vegetation (Li et al. [2015b]). To improve the accuracy, the use of advanced image recognition technologies such as semantic segmentation (Cai et al. [2018]) and deep convolutional neural networks (Cai et al. [2019]) has been explored.
Variants of the GVI have been proposed. Chen et al. [2019] proposed the panoramic GVI (PGVI), which processes the entire panorama image at a site and calculates the proportion of green objects in terms of pixels. Apart from street-level visibility, the Floor Green View Index (Yu et al. [2016]) measures green patches seen from a building floor, using LiDAR and 3D modeling data for buildings and NDVI for vegetation. The floor GVI focuses on the visibility of greenness from a building, without considering physical interaction with vegetation. Table 1 summarizes the previously proposed indices to capture greenness, although their scopes differ from one another.

Application of green metrics
GVI is often studied alongside social, economic, and physiological factors. For social aspects, the survey conducted by Villeneuve et al. [2018] found that GVI was positively associated with participation in summer recreational activities. For economic aspects, Li et al. [2016] demonstrated the correlation between economic inequality and environmental inequality in terms of accessibility to urban green vegetation, as measured by GVI 3 . For physiological aspects, GVI has been associated with physical activities (Li and Ghosh [2018], Lu [2019]) and geriatric depression.
When compared with other factors, GVI must be aggregated to the size of the area containing the relevant data 4 . This is because GVI is measured at street-level sites; if aggregated at the area level, for instance block, census tract, and other administrative boundaries, the area will have several GVI values, which may be different from each other. However, there exists no consensus on the aggregation method: some studies used the median (Li et al. [2016]), while others used the mean (Lu et al. [2018], Li and Ghosh [2018]). As is discussed later in this article, the estimation by aggregation is affected by the method of aggregation. It is thus important to conceive a more robust aggregation method.
Furthermore, the reasons behind the characteristics shown by different greenery metrics have not been thoroughly understood. A survey conducted by Villeneuve et al. [2018] found that GVI (and not the NDVI) was positively associated with participation in recreational activities during the summer. However, it is not clear why this was the case. Ye et al. [2019] compared greenness measured by NDVI and a modified version of GVI considering accessibility, and found that greenness in well-developed neighborhoods tends to be underestimated by NDVI. Understanding these different characteristics is crucial for deciding how to apply GVI at the city level alongside other metrics such as NDVI.
Given these gaps, this study aims to establish a method for such area-based study that mitigates bias from spatial distribution of GVI sites. Also, we try to characterize this new metric in comparison to other existing metrics.

Methodology
This section provides definitions of green vegetation metrics that are used in this study.

Green View Index
The Green View Index (GVI) was originally proposed by Yang et al. [2009] and developed by Li et al. [2015b] as a metric to measure the visibility of green vegetation in the landscape. In this study, following Li et al. [2015b], we process images extracted from Google Street View (GSV) for each site and calculate the Green View Index using this formula:
$$\mathrm{GVI} = \frac{\sum_{i=1}^{n} \mathit{Area}_{g_i}}{\sum_{i=1}^{n} \mathit{Area}_{t_i}} \times 100 \tag{1}$$
where $n$ is the number of images for each site, set to 6 in this study 5 . $\mathit{Area}_{g_i}$ is the number of green pixels in the image for the $i$th direction, and $\mathit{Area}_{t_i}$ is the total number of pixels in the image for the $i$th direction. The vertical view angle is fixed to 0°, which is parallel to the horizontal line.
2 http://senseable.mit.edu/treepedia
3 But Li et al. [2016] found that there is no significant environmental disparity among racial/ethnic groups in general, in terms of accessibility to urban green vegetation.
4 Note that statistical data is often spatially aggregated in order to protect individual privacy.
Green pixels are identified based on the RGB color band. A major limitation of this approach is the potential confusion of natural and artificial green objects: trucks painted green might be classified as green pixels, for instance. Some solutions have been proposed, including the use of image recognition technology such as semantic segmentation (Cai et al. [2018]) and deep convolutional neural networks (Cai et al. [2019]). However, as the precision of object recognition lies beyond the scope of this study, we consider the implementation of such advanced techniques future work.
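As an illustration of the pixel-ratio idea, the following minimal sketch counts green pixels over the images of one site and divides by the total pixel count, scaled to a percentage. The rule in `is_green` (green channel larger than both red and blue) is a simplifying assumption made for this example, not the classification procedure of Li et al. [2015b]; the function names and the toy images are hypothetical.

```python
def is_green(r, g, b):
    # Assumed rule for this sketch: a pixel counts as green when its
    # green channel exceeds both the red and the blue channel.
    return g > r and g > b


def green_view_index(images):
    """Return the share of green pixels (in percent) over all images of a site.

    images: list of images, one per viewing direction; each image is a
    list of (r, g, b) tuples.
    """
    green = sum(sum(1 for px in img if is_green(*px)) for img in images)
    total = sum(len(img) for img in images)
    return 100.0 * green / total


# Two tiny 4-pixel "images" standing in for two viewing directions.
toy_images = [
    [(10, 200, 10), (200, 200, 200), (90, 120, 60), (30, 30, 30)],
    [(0, 0, 255), (20, 90, 40), (255, 0, 0), (128, 128, 128)],
]
toy_gvi = green_view_index(toy_images)  # 3 green pixels out of 8

# The limitation discussed above: a green-painted truck passes the same test.
truck_pixel_is_green = is_green(20, 140, 60)
```

A colour rule of this kind cannot distinguish foliage from green paint, which is why segmentation-based approaches are mentioned as future work.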
Using metadata collected from the Google Street View API, it is possible to know the months in which the images were taken. Since our interest lies in urban greenery, we defined "green months" for our study area as the period from April to October. Images taken outside of the "green months" are not utilized for analysis.
In addition to limiting images to green months, we implemented the functionality of specifying year of image via the Google Maps JavaScript API (Li et al. [2016]). This allows us to use images of the designated year when available, thus mitigating the bias of temporal fluctuation.
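The selection rule described in the last two paragraphs can be sketched as a small filter over image metadata. The sketch assumes that each candidate image comes with a capture date string of the form "YYYY-MM"; the dictionary keys `pano_id` and `date`, the function names, and the fallback to the closest year when the designated year is unavailable are assumptions for illustration, not a description of the published code.

```python
GREEN_MONTHS = range(4, 11)  # April to October, inclusive


def parse_year_month(date_str):
    # Expects a string such as "2019-05" (assumed metadata format).
    year, month = date_str.split("-")[:2]
    return int(year), int(month)


def select_panorama(candidates, preferred_year=2019):
    """Pick one image per site, or None if no image falls in the green months.

    candidates: list of dicts with hypothetical keys 'pano_id' and 'date'.
    """
    in_season = [
        c for c in candidates
        if parse_year_month(c["date"])[1] in GREEN_MONTHS
    ]
    if not in_season:
        return None
    # Prefer the designated year; otherwise fall back to the closest year
    # (the fallback rule is an assumption of this sketch).
    return min(
        in_season,
        key=lambda c: abs(parse_year_month(c["date"])[0] - preferred_year),
    )


example = [
    {"pano_id": "a", "date": "2019-01"},  # designated year, but winter
    {"pano_id": "b", "date": "2017-05"},  # green month, older image
    {"pano_id": "c", "date": "2019-08"},  # green month, designated year
]
chosen = select_panorama(example)
```

Sites for which `select_panorama` returns `None` would be treated as missing, which is the situation discussed under future work.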
The code used in this study to calculate GVI is available on GitHub.

Standardized Green View Index
GVI is a site-based metric: it measures the visibility of surrounding greenery at a geographical point. However, in practice it is often aggregated to an area level (block, census tract, administrative boundary, etc.), in order to associate the index with other socioeconomic factors. Previous studies (cf. Li et al. [2016], Lu et al. [2018]) implicitly assumed that each site of GVI calculation has equal importance, and hence simply took the mean of the GVI scores of the sites located in a given area. However, this assumption may not hold, especially if sites are not evenly distributed in space. A heterogeneous distribution of GVI points will lead to a biased estimation when aggregated to an area level, as densely located points will contribute to the aggregated value to a greater extent.
Figure 1 illustrates an example of such biased estimation. Sites with low GVI are densely located in the upper part of the area, while sites with high GVI are sparsely located in the lower part. Taking the mean of these sites skews the GVI value towards that of the denser part (upper part of the figure), resulting in a biased aggregation at the area level.
In order to mitigate the bias resulting from point density, we propose a Standardized Green View Index (sGVI) to calculate GVI for a given area. The sGVI is a weighted aggregation of GVI scores in a study area. It considers how road segments are located in the area when calculating the area-level value. The idea is to represent the GVI of an area by the expected value of GVI with respect to the total road length inside the zone. In other words, sGVI is the expected value of GVI when the site for calculation is randomly chosen on the road network of the area. The mathematical formulation of sGVI is as follows:
$$\mathrm{sGVI} = \sum_{j} \frac{l_j}{l}\,\mathrm{GVI}_j \tag{2}$$
where $j$ is a point of GVI calculation in the zone, $\mathrm{GVI}_j$ is the GVI score at point $j$, $l_j$ is the total length of the links that point $j$ is associated with, and $l$ is the total length of all links in the zone.
The association of point $j$ with links ($l_j$) is defined by the Voronoi tessellation of the points: the set of link fragments overlapped by the Voronoi cell of point $j$ is associated with point $j$. The procedure of this association is illustrated in Figure 2.
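The length-weighted aggregation can be sketched without a geometry library by cutting each road segment into short pieces and assigning every piece to its nearest site, which approximates clipping the roads by the Voronoi cells. This is a minimal sketch under stated assumptions, not the implementation used in the study: roads are assumed to be straight segments already clipped to the zone, coordinates are planar, and the names `sgvi` and `step` are hypothetical.

```python
import math


def sgvi(sites, gvi, roads, step=1.0):
    """Approximate the road-length-weighted mean of site GVI values.

    sites: list of (x, y) site coordinates
    gvi:   list of GVI values, one per site
    roads: list of ((x1, y1), (x2, y2)) straight segments inside the zone
    step:  maximum length of the pieces a segment is cut into
    """
    lengths = [0.0] * len(sites)  # road length associated with each site
    for (x1, y1), (x2, y2) in roads:
        seg_len = math.hypot(x2 - x1, y2 - y1)
        n = max(1, math.ceil(seg_len / step))
        piece = seg_len / n
        for k in range(n):
            t = (k + 0.5) / n  # midpoint of the k-th piece
            mx, my = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
            # Nearest site = the Voronoi cell this midpoint falls into.
            j = min(
                range(len(sites)),
                key=lambda i: math.hypot(sites[i][0] - mx, sites[i][1] - my),
            )
            lengths[j] += piece
    total = sum(lengths)
    weights = [length / total for length in lengths]
    return sum(w * g for w, g in zip(weights, gvi)), weights


# One 100 m road with three clustered low-GVI sites and one isolated
# high-GVI site (illustrative values only).
toy_sites = [(5, 0), (10, 0), (15, 0), (70, 0)]
toy_gvi = [2.0, 2.0, 2.0, 20.0]
toy_roads = [((0, 0), (100, 0))]

value, weights = sgvi(toy_sites, toy_gvi, toy_roads, step=0.5)
plain_mean = sum(toy_gvi) / len(toy_gvi)
# value is about 12.35, while plain_mean is 6.5
```

In this toy case the isolated site is associated with 57.5 m of the 100 m road, so it dominates the weighted value, whereas the plain mean is pulled towards the three clustered sites.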

[Figure 2: Association of road fragments with GVI sites by Voronoi tessellation; the example cells show weights (road fragments / all roads) of 21% and 4.5%]
Given a set of sites and an area, the Voronoi tessellation divides the area into cells so that any point in the area belongs to the cell of its closest site 6 . Once the tessellation is defined, the roads are overlaid on it and cut by each cell. Then, for each cell, the proportion of the road fragments contained in the cell over all the road fragments in the area is calculated in terms of road length. These proportions act as weights that are quasi-inversely related to the density of sites, since densely located sites are each associated with shorter road fragments.
One of the advantages of using the Voronoi tessellation over other methods is that every part of the roads is automatically associated with one site, without duplication. Given that the sites for GVI evaluation are located on road segments, all the cells in the tessellation contain, by definition, at least a fragment of a road segment. In addition, since the cells do not overlap each other, it is ensured that one fragment of a road segment is associated with only one site. This property may not hold with other methods based on network configuration: for instance, if one decides to associate a site with nearby road segments, it is possible that two points on different road segments have the same distance from the site, and an additional rule of attribution is then needed in order to avoid duplicated association. The Voronoi tessellation, in turn, requires only the sites, the road segments and the boundary, and no other parameter is needed. As Equation 2 indicates, quasi-inverse weighting of GVI sites in terms of site density can thus be achieved in a way that is simple to apply in practice.

Comparison of sGVI and other green metrics
In order to explore the relation between sGVI and other green metrics, the following metrics were calculated in the study area of Yokohama city (see the next section for more detailed description): (1) sGVI, (2) GVI (mean), (3) GVI (median), and (4) NDVI. (1) sGVI is the expected value of GVI in the area, when a site is randomly chosen on the road network of the area. (2) GVI (mean) is an aggregation of GVI in the area by mean (Lu et al. [2018], Li and Ghosh [2018]), and (3) GVI (median) is an aggregation by median (Li et al. [2016]). (4) NDVI is an aggregation of NDVI in the area by mean 7 .
For comparison, the correlation among these four indicators is calculated. Then, the pair with the lowest correlation coefficient is further examined in terms of spatial distribution and regression analysis. The green metrics were computed on an Intel Core i9 CPU at 2.4GHz with an implementation in Python.
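The pairwise comparison can be sketched with a small rank-correlation routine. The example below implements Spearman's rank correlation (the coefficient reported later in this article) as the Pearson correlation of average ranks; the function names and the toy values are hypothetical and do not reproduce any result of the study.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# Toy zone-level values for the four metrics (invented for illustration).
metrics = {
    "sGVI": [3.0, 8.0, 12.0, 20.0, 25.0],
    "GVI_mean": [2.5, 9.0, 11.0, 18.0, 27.0],
    "GVI_median": [2.0, 7.0, 12.5, 15.0, 26.0],
    "NDVI": [0.30, 0.10, 0.25, 0.40, 0.35],
}
pairs = {
    (a, b): spearman(metrics[a], metrics[b])
    for a, b in combinations(metrics, 2)
}
lowest_pair = min(pairs, key=pairs.get)  # pair to examine further
```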

Normalized Difference Vegetation Index
While the GVI measures visible green vegetation at eye level, the Normalized Difference Vegetation Index (NDVI) quantifies the top-down green coverage using satellite imagery. The NDVI makes use of the fact that green vegetation reflects near-infrared light more than red light in the visible spectrum. The formula of NDVI is as follows:
$$\mathrm{NDVI} = \frac{\mathit{NIR} - \mathit{Red}}{\mathit{NIR} + \mathit{Red}} \tag{3}$$
where $\mathit{NIR}$ is near-infrared light and $\mathit{Red}$ is red light. The value is normalized to [−1, 1], with a larger value signifying more abundant green vegetation.
The images used to calculate NDVI were retrieved from Level-2A data of Sentinel-2 with 10m resolution via the Copernicus Open Access Hub. Four images were retrieved for the study period 8 , and in order to mitigate seasonal effects, the mean across these images was taken for each pixel. The cloud cover ratio of these images was smaller than 5%.
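The per-pixel calculation and the averaging over acquisition dates can be sketched as follows. This is a minimal sketch under stated assumptions: each scene is represented as a plain list of (NIR, Red) reflectance pairs in the same pixel order, the reflectance values are invented, and returning 0.0 when both bands are zero is a convention chosen for this example only.

```python
def ndvi(nir, red):
    # Normalized difference of the near-infrared and red bands.
    denom = nir + red
    if denom == 0:
        return 0.0  # assumed convention for empty pixels in this sketch
    return (nir - red) / denom


def mean_ndvi_per_pixel(scenes):
    """Average NDVI over several scenes, pixel by pixel.

    scenes: list of scenes; each scene is a list of (nir, red) pairs and
    all scenes list the pixels in the same order.
    """
    n_pixels = len(scenes[0])
    return [
        sum(ndvi(*scene[p]) for scene in scenes) / len(scenes)
        for p in range(n_pixels)
    ]


def zone_mean(values):
    # Area-level NDVI as the mean over the pixels of a zone.
    return sum(values) / len(values)


# Two toy scenes with two pixels each (invented reflectances).
toy_scenes = [
    [(0.6, 0.2), (0.3, 0.3)],
    [(0.5, 0.1), (0.2, 0.2)],
]
pixel_means = mean_ndvi_per_pixel(toy_scenes)
zone_value = zone_mean(pixel_means)
```

Averaging per pixel before aggregating to the zone keeps every pixel equally weighted, which is possible here because all scenes share the same grid.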

Case study: Yokohama city

Study area
The green vegetation metrics defined in the previous section are applied to two wards (Nishi ward and Kanagawa ward) of Yokohama city, Japan. Located in the south of the Tokyo metropolitan region, Yokohama city has its central business district in Nishi ward, and peripheral residential areas in the western part of Kanagawa ward. This city structure allows us to study how the metrics behave under different land use patterns. The geographical scale of analysis in this study is the Chome level (the Japanese name for a city block (Gao and Asami [2007])), since this is the smallest level of division for statistical data in Japan.
The area and population of the study area are shown in Table 4 in Appendix A. The location of the study area is illustrated in Figure 3 (Yokohama city [2020]).
7 Note that NDVI covers some areas where GVI sites cannot be located, since GVI sites are located on streets. This may lead to an unfair comparison.
8 The dates were 9 March, 8 May, 5 August and 6 October, 2019.

Descriptive analysis
This section describes basic statistics of the data sources used, namely GVI from Google Street View imagery and NDVI from satellite imagery.
For GVI, 7780 sites are selected in total, and six corresponding images per site are retrieved via the Google Street View API. The sites are located every 100m along a link in the road network, and it is ensured that intersections have at least one site (if images satisfying criteria of season and year are available). 84% of the sites turned out to have images taken in 2019 (see Figure 4). This mitigates the time fluctuation, which has been pointed out in previous studies (Li et al. [2015b], Ye et al. [2019]). Also, the month of image taking is limited to the period from April to October, with the majority (83%) of images being taken in April or May.
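The placement of sites at regular intervals along a link can be sketched as below. The sketch assumes planar coordinates in metres and treats both end nodes of a link as intersections that always receive a site; the function name `sites_along_link` and these conventions are assumptions for illustration, not the sampling code of the study.

```python
import math


def sites_along_link(coords, spacing=100.0):
    """Place sites at the start node, every `spacing` metres, and the end node.

    coords: list of (x, y) vertices of one link, in metres.
    """
    sites = [coords[0]]
    next_at = spacing   # distance along the link of the next site
    travelled = 0.0     # length of the segments already processed
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        while seg > 0 and next_at <= travelled + seg:
            t = (next_at - travelled) / seg
            sites.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            next_at += spacing
        travelled += seg
    if sites[-1] != coords[-1]:
        sites.append(coords[-1])  # make sure the end node has a site
    return sites


straight = sites_along_link([(0, 0), (250, 0)])
bent = sites_along_link([(0, 0), (150, 0), (150, 150)])
```

For the 250 m straight link this yields sites at 0, 100, 200 and 250 m; for the bent 300 m link the site at 300 m coincides with the end node, so no extra site is added.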

Figure 4: Year and month of GSV images
As for NDVI, four satellite images were retrieved and the mean value for each mesh was calculated 9 . The study area has in total 307,335 (70,273 (Nishi) + 237,062 (Kanagawa)) meshes with 10m spatial resolution. Figure 9 in Appendix A illustrates the geographical distribution of NDVI. Figure 5 shows the histograms of NDVI and GVI for the study area, and the descriptive information is shown in Table 2.
9 Note that this aggregation will not produce spatial bias, because every satellite image has the same spatial resolution.

Standardized Green View Index correcting spatial bias
Example of biased aggregation
Figure 1 showed an example of biased estimation of area-level aggregation of GVI resulting from heterogeneous site density. In the case of Figure 1, the mean and median values of GVI are 8.38 and 3.53 respectively. If a researcher intends to study the relationship between these green metrics and socioeconomic factors in the area, it is evident that the choice of metric will largely affect the results of the analysis.
Admittedly, this situation does not occur in every study area, as the difference between the mean and median values of GVI depends on the (numerical) distribution of GVI. For instance, if the GVI values in an area follow a normal distribution, the choice between mean and median will be unimportant, since the two values are expected to be similar for a symmetric distribution. However, when the distribution is long-tailed, such as a power law, the difference between mean and median is not always negligible, and the importance of defining a representative value increases.
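A short numerical illustration of this point, using invented values rather than data from the study area: for a symmetric sample the mean and the median coincide, while a single large value in a long-tailed sample pulls the mean far above the median.

```python
import statistics

# Invented GVI-like values for two hypothetical zones.
symmetric = [8, 9, 10, 11, 12]
long_tailed = [1, 1, 2, 2, 3, 4, 40]

sym_mean = statistics.mean(symmetric)
sym_median = statistics.median(symmetric)

tail_mean = statistics.mean(long_tailed)      # 53 / 7, roughly 7.6
tail_median = statistics.median(long_tailed)  # 2

gap_symmetric = sym_mean - sym_median
gap_long_tailed = tail_mean - tail_median
```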
Even if the researcher has complete knowledge of the GVI distribution in the study area (which is divided into sub-areas), it is possible that the distributions that each sub-area follows are not identical. From this viewpoint, a feasible solution to mitigate this bias is to consider the weight of each site depending on their representativeness in the sub-area, which can be realized by sGVI.
Bias correction by sGVI
The sGVI of the area shown in Figure 1 was 11.4, as sGVI places a greater emphasis on the sites with high GVI in the lower part. In other words, when an arbitrary point on the network is chosen, the expected value of GVI there is 11.4. This result is also more intuitive, since the vegetation in the lower part covers a larger surface area and should therefore affect the evaluation more.
The indicator is expected to behave as a proxy for the representative value of an area, not only in areas with a heterogeneous distribution of sites but also in areas with a homogeneous distribution of sites. This is because, with such homogeneously located sites, sGVI weights each point quasi-equally. If the GVI values of such an area follow a normal distribution, the estimation by sGVI will be close to the mean and the median of GVI in the area. Such robustness to the distribution of sites and to the variation of GVI is an advantage of sGVI, which has not been explored in previous studies.

Comparison of different green metrics
In order to further understand the characteristics of the proposed indicator, sGVI, we first analyze the statistical relation between sGVI and NDVI. In this analysis, the Chome zones where sGVI is 0 are excluded, because a value of 0 results from either a lack of available images on Google Street View or an absence of streets. With this pre-processed data set, Spearman's rank correlation 10 between the two metrics was calculated (Table 3).
Table 3: Correlation matrix of green metrics
Figure 6 illustrates the scatter plot of sGVI and NDVI as well as the regression line of NDVI on sGVI at Chome level, whose coefficient of determination was 0.57. From this result, we see that factors other than sGVI account for more than 40% of the variation in NDVI. Similar results were observed between GVI and NDVI in previous studies (Larkin and Hystad [2019], Ye et al. [2019]). This raises the question of what these other influencing factors are.
Second, to explore this question, the geographical distribution of the estimated values was examined. Figure 7 visually compares sGVI and NDVI calculated at Chome level. Even though the scales of sGVI and NDVI are not directly comparable, there is an observable tendency for NDVI to give more weight to the vegetation in the north-west part of the study area. The spatial distribution of the residual error from the regression analysis is also illustrated in Figure 8. Under the assumption that the relation between sGVI and NDVI can be modeled by a linear function, NDVI in the north-west part, where vegetation is more present, is larger than what sGVI predicts, and NDVI in the south-east part, where buildings are more dominant, is smaller than the prediction by sGVI. Given that the north-west part is mainly a residential area with parks and gardens, and that the south-east part is the central business district with a limited amount of vegetation along streets, this result indicates that sGVI captures the green vegetation in urbanized areas more than NDVI does. In other words, the presence of buildings makes it difficult to estimate the amount of green vegetation from the top-down viewpoint.
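The regression step can be sketched with ordinary least squares on zone-level values. The functions below fit NDVI as a linear function of sGVI, then return the coefficient of determination and the per-zone residuals; a positive residual means that NDVI is larger than the linear prediction from sGVI, a negative one that it is smaller. The function names and the toy numbers are hypothetical and unrelated to the results reported above.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared_and_residuals(x, y):
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot, residuals


# Toy zone-level values (invented): sGVI as predictor, NDVI as response.
toy_sgvi = [2.0, 5.0, 9.0, 14.0, 20.0]
toy_ndvi = [0.05, 0.12, 0.30, 0.28, 0.45]

r2, residuals = r_squared_and_residuals(toy_sgvi, toy_ndvi)
zones_above = [i for i, r in enumerate(residuals) if r > 0]  # NDVI above prediction
```

Mapping the residuals by zone, as in Figure 8, then shows where the two metrics disagree.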

Spatial aggregation of weighted points
In geography, "everything is related to everything else, but near things are more related than distant things" (Tobler [1970]). The spatial distribution of objects must therefore be considered when arguing at a spatial level. From this viewpoint, simple aggregation of GVI values in a given area by mean implies that every site is assumed to be related equally to every other site, which is clearly false. If two sites are close, the images taken at the two sites are likely to capture the same tree, whereas two distant sites do not have anything in common in their images. It is thus necessary to consider the spatial proximity of each site in order to estimate green vegetation, not for a site (point), but for a certain area.
This study proposed standardized GVI (sGVI), which is able to consider the heterogeneous relation among sites. Explicitly considering physical existence of road network, sGVI can mitigate a biased estimation caused by simply aggregating the mean value.
It is also possible to imagine other weighting schemes, especially using tools of spatial statistics. For instance, smoothing techniques such as Kernel Density Estimation (KDE) may be an option. However, KDE by definition gives more weight to densely located sites, whereas the problem of aggregating GVI over an area requires the opposite: giving less weight to densely located sites. For this reason, this method is not adopted in this study, but it may be a possible direction for future study.
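The direction of the KDE effect can be checked on a one-dimensional toy example. The sketch below evaluates a Gaussian kernel density estimate at each site position; the site positions, the bandwidth and the function name are assumptions chosen for illustration only.

```python
import math


def gaussian_kde_at(x, points, bandwidth):
    """Gaussian kernel density estimate at position x (one dimension)."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
    )


# Three clustered sites and one isolated site along a street (metres).
site_positions = [5.0, 10.0, 15.0, 70.0]
densities = [gaussian_kde_at(p, site_positions, bandwidth=10.0)
             for p in site_positions]

clustered_density = densities[1]  # site in the middle of the cluster
isolated_density = densities[3]   # isolated site
```

The clustered site receives a clearly higher density than the isolated one, so using the density directly as a weight would amplify the bias that the length-based weighting is meant to reduce.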

Different perspective: NDVI and GVI groups
It has been pointed out that NDVI and GVI capture different aspects of green vegetation, due to their different viewpoints.
NDVI has a top-down viewpoint due to its use of satellite imagery, which makes it possible to capture horizontal extension of green vegetation better than other perspectives. Knowing that vegetation grows in a way that gives leaves maximum sunlight, it is reasonable to represent the existence of vegetation by NDVI. Nevertheless, it should be noted that there is green vegetation on building walls, especially in the city center, which is not fully captured by NDVI.
On the other hand, GVI has a street-level viewpoint. This is closer to human perception, but has several limitations. First, vegetation hidden behind objects is not considered. Second, the estimation of GVI is limited to streets, where pavement necessarily appears in the images, thus lowering the estimated value. Third, the canopy formed by tall trees may be only partially perceived by people on the street, since the canopy is placed at the margin of the field of view, which is fixed in the horizontal direction. Lastly, densely located sites may be auto-correlated, because the same tree may be observed more than once, especially if the sites are very close to each other.
These points themselves are consistent with the standpoint that GVI measures the visibility of urban green vegetation, but they will lead to inconsistency when GVI is employed as a proxy for the existence of vegetation. The usage of indicators deriving from GVI (including sGVI) should thus be limited to measuring the visibility of green vegetation.

Insight for further analytical work
Multiple green vegetation metrics have been proposed, some of which are treated in this study. However, differences among them have not been fully understood. The result of this study, namely the comparison of NDVI and GVI, implies that the innate characteristic of each metric must be more carefully considered when associated with other factors.
Taking the example of a physiological study, green vegetation is expected to have positive effects on physical health outcomes (Larkin and Hystad [2019]). Nevertheless, when the causal relation between the presence of green vegetation and health outcomes is considered, there will be at least two paths: one is optical, and the other is atmospheric/olfactory/bacterial. The former corresponds to the effect of just "seeing" greenery, whereas the latter corresponds to the effect of more direct interaction between the human body and vegetation, such as the quality of air, the smell of plants and the presence of certain bacteria. GVI and FGVI (Yu et al. [2016]) are appropriate for measuring the former, while NDVI is more suitable for evaluating the latter.
For statistical analysis, attention must also be paid to the aggregation method. The presence of several mediators between green vegetation and physiological outcomes has been noted, but spatial aggregation of site-based metrics has not been discussed. The proposed method of sGVI mitigates the bias from the heterogeneous distribution of measurement points of GVI.
Associations between these green vegetation metrics and other factors must be discussed carefully, with the properties of each method in mind.

Conclusion
People's exposure to green vegetation is often associated with societal, economic or physiological factors, but it has been overlooked that the aggregation method of green metrics may generate bias in area-based estimation. This study implemented GVI (Li et al. [2015b]), extended with the ability to designate the time of image capture, and proposed a new metric, standardized GVI (sGVI), which considers the density of sites for GVI calculation. We found that sGVI mitigates such density-led bias, and expect that it is more robust to a heterogeneous spatial distribution of sites than simple aggregation by mean or median. Also, it was shown that NDVI gives more weight to green vegetation in residential areas with parks and gardens, while sGVI captures more vegetation in urban areas where buildings are dominant. Therefore, for further analysis associating green vegetation with other factors, especially in urban areas, it is advisable to employ sGVI, since it mitigates bias from the spatial distribution of sites and captures eye-level greenery in a more sensitive manner than NDVI.
For future work, there are two major issues to be considered. Firstly, the heavy computational load of Voronoi tessellation, especially when calculating sGVI in a large area, is not ideal. While keeping the idea of giving more weight to sparsely located sites, another direction with a lighter computational load must be explored.
The second issue is the treatment of missing points when estimating sGVI. Since not every site has images that satisfy the given conditions such as month and year, robustness to missing points must be considered in greater detail. While it is possible to mitigate this by placing sites at smaller intervals (20m instead of 100m, for example), it is not cost-efficient in terms of either time or money 11 .
During the SARS-CoV-2 epidemic, people have been asked to reduce social contact by keeping distance from each other. Such a situation may increase the importance of spaces outside buildings, and understanding the roles of the components of outdoor space is thus crucial. From this viewpoint, sGVI provides a solid foundation for association studies, which will eventually contribute to the design of public spaces in the context of sustainable development.
A Miscellaneous information about the study area