Mapping Seasonal Tree Canopy Cover and Leaf Area Using Worldview-2 / 3 Satellite Imagery: A Megacity-Scale Case Study in Tokyo Urban Area

: This study presents a methodology for developing a high-resolution (2 m) urban tree canopy leaf area inventory in di ﬀ erent tree phenological seasons and a subsequent application of the methodology to a 625 km 2 urban area in Tokyo. Satellite remote sensing has the advantage of imaging large areas simultaneously. However, mapping the tree canopy cover and leaf area accurately is still di ﬃ cult in a highly heterogeneous urban landscape. The WorldView-2 / 3 satellite imagery at the individual tree level (2 m resolution) was used to map urban trees based on a simple pixel-based classiﬁcation method. The comparison of our mapping results with the tree canopy cover derived from aerial photography shows that the error margin is within an acceptable range of 5.5% at the 3.0 km 2 small district level, 5.0% at the 60.9 km 2 municipality level, and 1.2% at the 625 km 2 city level. Furthermore, we investigated the relationship between the satellite data (vegetation index) and in situ tree-measurement data (leaf area index) to develop a simple model to directly map the tree leaf area from the WorldView-2 / 3 imagery. The estimated total leaf area in Tokyo urban area in the leaf-on season (633 km 2 ) was twice that of the leaf-o ﬀ season (319 km 2 ). Our results also showed that the estimated total leaf area in Tokyo urban area was 1.9–6.2 times higher than the results of the moderate-resolution (30 m) satellite imagery. and S.H.; investigation, Y.K. and S.H.; resources, Y.K. and S.H.; data curation, Y.K. and S.H.; writing—original draft preparation, Y.K.; writing—review and editing, S.H. and A.T.; visualization, Y.K. and S.H.; supervision, Y.K. and A.T.; project administration, Y.K.; funding acquisition, Y.K. All authors have


Introduction
Urban vegetation, particularly trees, provides numerous benefits to human well-being (ecosystem services) [1,2]; trees improve air quality by removing pollutants from the atmosphere and mitigate the heat island effect by providing direct shade and by transpiration [3]. However, trees are also responsible for the formation of ozone (O 3 ) and fine particulate matter (PM 2.5 ) through the emission of biogenic volatile organic compounds (BVOC) via leaf surface [4,5]. To maximize the urban forest benefits for smart sustainable city development, the negative impacts of the tree canopy on air quality should be assessed [6].
Statistically sound data on the urban forest structure are required to properly assess the effect of the balance of the services/disservices on urban air quality. The magnitude of plant-atmosphere processes is strongly correlated with the total leaf area [7]. In practice, field surveys and aerial photographs are

WordView-2 and WorldView-3 (WorldView-2/3) Datasets
Cloud-free WorldView-2/3 images (DigitalGlobe, Inc., USA) acquired during different tree phenology periods were used in this study (Table 1). In Tokyo, April to September are considered spring and summer months belonging to the "leaf-on season" when most trees develop a green canopy, while October to March are autumn and winter months belonging to the "leaf-off season" when the leaves of deciduous trees begin changing color and falling. As the image acquisition dates were selected based on the data availability and cloudless sky conditions over the site, one pair of images for both leaf-on and leaf-off seasons were not available for the same year.
The WorldView-2/3 images were subjected to atmospheric and geometric corrections before processing and classification. Each multispectral digital number image was converted to Top of Atmosphere (TOA) radiance based on the radiometric calibration parameters and the standard correlation formula [14,15]. For each band, surface reflectance was generated based on the sensor response function and specific atmospheric conditions. The datasets were orthorectified and projected to the WGS-84 UTM Zone 54N system.

WordView-2 and WorldView-3 (WorldView-2/3) Datasets
Cloud-free WorldView-2/3 images (DigitalGlobe, Inc., Westminster, CO, USA) acquired during different tree phenology periods were used in this study (Table 1). In Tokyo, April to September are considered spring and summer months belonging to the "leaf-on season" when most trees develop a green canopy, while October to March are autumn and winter months belonging to the "leaf-off season" when the leaves of deciduous trees begin changing color and falling. As the image acquisition dates were selected based on the data availability and cloudless sky conditions over the site, one pair of images for both leaf-on and leaf-off seasons were not available for the same year.
The WorldView-2/3 images were subjected to atmospheric and geometric corrections before processing and classification. Each multispectral digital number image was converted to Top of Atmosphere (TOA) radiance based on the radiometric calibration parameters and the standard correlation formula [14,15]. For each band, surface reflectance was generated based on the sensor response function and specific atmospheric conditions. The datasets were orthorectified and projected to the WGS-84 UTM Zone 54N system.  The green plots illustrate the LAI field measurement points. Aerial photograph inspection in (A2 and B2) and field investigation in (A3 and B3) were conducted to assess the accuracy of tree mapping results from WorldView-2/3 imagery. Satellite imagery from DigitalGlobe Products. (a, A1, b, and B1-B3) WorldView2 © 2020 DigitalGlobe, Inc., a Maxar company. (a, A2, and B3) WorldView3 © 2020 DigitalGlobe, Inc., a Maxar company.

Tree Canopy Extraction from Background
After merging the WorldView-2/3 images and performing radiometric calibration, we separated the land cover into categories. A multiple stepwise masking procedure using the bi-modal histogram threshold method was employed to extract the tree crowns from the remaining background [16]. Figure 3 illustrates the segmentation procedure that consists of six steps. Here, a pixel-based classification was performed [17].
First, we selected the training sample pixels from 12 categories: eight non-vegetation (soil, sand, asphalt/concrete, high-rise building, low-rise building, water, shaded non-vegetation, and high Normalized Difference Vegetation Index (NDVI) non-vegetation), and four vegetation (tree canopy, dense grass, sparse grass, and shaded vegetation) categories. More than 1500 training samples per image were chosen almost equally for each land category, fulfilling the accuracy requirement of the decision tree methods for land cover classification [18]. Training samples were randomly selected in the study area to avoid spatial bias.
After the training samples were obtained, we used the normalized difference index ( ) as a classifier to distinguish each pixel into different land categories. Bands A and B are, respectively, the reflectance data of the WorldView-2/3 spectral bands. We produced a pixel number histogram of the NDI values calculated from two different land categories. All 64 possible band combinations of the eight spectral WorldView-2/3 bands were tested The green plots illustrate the LAI field measurement points. Aerial photograph inspection in (A2 and B2) and field investigation in (A3 and B3) were conducted to assess the accuracy of tree mapping results from WorldView-2/3 imagery. Satellite imagery from DigitalGlobe Products. (a, A1, b, and B1-B3) WorldView2 © 2020 DigitalGlobe, Inc., a Maxar company. (a, A2, and B3) WorldView3 © 2020 DigitalGlobe, Inc., a Maxar company.

Tree Canopy Extraction from Background
After merging the WorldView-2/3 images and performing radiometric calibration, we separated the land cover into categories. A multiple stepwise masking procedure using the bi-modal histogram threshold method was employed to extract the tree crowns from the remaining background [16]. Figure 3 illustrates the segmentation procedure that consists of six steps. Here, a pixel-based classification was performed [17].
First, we selected the training sample pixels from 12 categories: eight non-vegetation (soil, sand, asphalt/concrete, high-rise building, low-rise building, water, shaded non-vegetation, and high Normalized Difference Vegetation Index (NDVI) non-vegetation), and four vegetation (tree canopy, dense grass, sparse grass, and shaded vegetation) categories. More than 1500 training samples per image were chosen almost equally for each land category, fulfilling the accuracy requirement of the decision tree methods for land cover classification [18]. Training samples were randomly selected in the study area to avoid spatial bias.
After the training samples were obtained, we used the normalized difference index (NDI = (Band A − Band B)/(Band A + BandB)) as a classifier to distinguish each pixel into different land categories. Bands A and B are, respectively, the reflectance data of the WorldView-2/3 spectral bands. We produced a pixel number histogram of the NDI values calculated from two different land categories. All 64 possible band combinations of the eight spectral WorldView-2/3 bands were tested to determine an appropriate band from the two sets (Bands A and B) for obtaining a bi-modal (two-peak) NDI histogram. Subsequently, a splitting variable (threshold) around the bottommost point (valley) of the bi-modal NDI histogram was determined manually by visual inspection. The NDI values of 64 different band combinations were tested at each classification step, and the classification performance was optimized by trial-and-error. Tree canopy land cover map was produced based on the best result.
Since we were not successful at distinguishing the vegetation type in the shaded area, we calculated the total area of the shaded tree canopy cover using the ratios of the vegetation categories as follows: where T shadow and T sunlit are the percent tree canopy cover (%) in shadow and sunlit areas, respectively; V shadow and V sunlit are the percent vegetation cover (%) in shadow and sunlit areas, respectively. We assumed that the ratios of the vegetation categories in shadow and sunlit areas are approximately the same in a large area.
Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 20 to determine an appropriate band from the two sets (Bands A and B) for obtaining a bi-modal (twopeak) NDI histogram. Subsequently, a splitting variable (threshold) around the bottommost point (valley) of the bi-modal NDI histogram was determined manually by visual inspection. The NDI values of 64 different band combinations were tested at each classification step, and the classification performance was optimized by trial-and-error. Tree canopy land cover map was produced based on the best result. Since we were not successful at distinguishing the vegetation type in the shaded area, we calculated the total area of the shaded tree canopy cover using the ratios of the vegetation categories as follows: where Tshadow and Tsunlit are the percent tree canopy cover (%) in shadow and sunlit areas, respectively; Vshadow and Vsunlit are the percent vegetation cover (%) in shadow and sunlit areas, respectively. We assumed that the ratios of the vegetation categories in shadow and sunlit areas are approximately the same in a large area.

Reference Data of Tree Canopy Cover
Vegetation maps obtained from the field surveys and aerial photographs were used as a reference to test the accuracy of the output of our WorldView-2/3 image classification analysis.

Field Investigation
We surveyed the vegetation cover in an urban park of 0.3 km 2 , which is located in Koto-ward in December 2016 (Figure 2, B3). Hardcopies of the WordView-2 images acquired in both the leaf-on and leaf-off seasons were brought to the field to locate the tree crown and also to confirm the tree types. We classified the trees with height > 2 m as "tall" and < 2 m as "small." We manually outlined the vegetation in the WordView-2 image and created a 2D polygon map of eight vegetation types (six tree types and dense/sparse grass). The image analysis was conducted using ArcGIS (Ver. 10.3.1; Environmental Systems Research Institute, Inc.).

Reference Data of Tree Canopy Cover
Vegetation maps obtained from the field surveys and aerial photographs were used as a reference to test the accuracy of the output of our WorldView-2/3 image classification analysis.

Field Investigation
We surveyed the vegetation cover in an urban park of 0.3 km 2 , which is located in Koto-ward in December 2016 (Figure 2, B3). Hardcopies of the WordView-2 images acquired in both the leaf-on and leaf-off seasons were brought to the field to locate the tree crown and also to confirm the tree types. We classified the trees with height >2 m as "tall" and <2 m as "small." We manually outlined the vegetation in the WordView-2 image and created a 2D polygon map of eight vegetation types (six tree types and dense/sparse grass). The image analysis was conducted using ArcGIS (Ver. 10.3.1; Environmental Systems Research Institute, Inc., Redlands, CA, USA).

Aerial Photograph Interpretation
We used one scene of the true color GEOSPACE digital aerial photographs with 0.25 m spatial resolution acquired in March 2013. The photograph covered 3.0 km 2 of a dense urban setting with many high-rise buildings around the Tokyo metropolitan government office in Shinjuku-ward ( Figure 2, B2). We manually created 2D polygons of each vegetation land cover category (tree canopy, grass, and shaded vegetation) in the aerial photograph by visual inspection using ArcGIS.

Tokyo Green and Water Coverage Ratios (Tokyo GWC-Ratio Data) Based on Aerial Photograph Interpretation
The green and water coverage ratios in Tokyo special wards (our study area) have been surveyed every five years since 2003 by the Tokyo metropolitan government. This data (called Tokyo GWC-ratio data hereafter) were created to calculate and monitor the ratio of the total area of parks, water surfaces, and ground surfaces covered by greenery in the entire region. The land cover map of tree canopy and grass were surveyed in the leaf-on season. We used the 2013 data to assess the accuracy of our image classification results obtained for the leaf-on season only.
The method adopted to analyze the green coverage data is based on the extraction of features of interest (tree canopy and grass) from aerial photographs by visual inspection. The aerial photographs covered all Tokyo special wards (approx. 625 km 2 ) with the ground resolution of no less than 0.2 m. The photographs for the 2013 data (and also 2018 data) were taken in November of the preceding year for both datasets. Tree canopy and grass areas were delineated by visual inspection by enlarging the photographs to a scale of 1:1000 so that a 1 m 2 object on the ground could be visible in a 1 mm 2 area on the computer screen. Therefore, every single tree and a patch of grass larger than 1 m 2 were delineated. The areas of tree and grass were digitized as a polygon layer and saved as a shape file in ArcGIS. In case the ground object was difficult to identify due to some cloud and shadow interruptions, the analyst conducted an on-site visit for confirmation. Cultivated areas of farmlands, and algae and weed in rivers and lakes were excluded.
The shape of the tree canopy in streets and parks is relatively constant due to the forest management by local authorities. According to the Tokyo GWC-ratio data, there is only a small difference (−0.2%) between the 2013 to 2018 data on the ratio of tree canopy coverage in the special wards. Thus, it is likely that no significant tree canopy size and leaf area changes can occur from the date of the satellite imagery data (between 2013 and 2017) to the date of the reference data (December 2016 for the field investigation, March 2013 for the aerial photograph interpretation, and November 2012 for the 2013 Tokyo GWC-ratio data). This was also proved in the field survey and aerial photographs; trees were mostly identified at the same position as in the WordView-2/3 images with similar crown sizes.

Approximate Model for the Relationship between Leaf Area Index (LAI)-Vegetation Indices (VI)
To infer LAI from the WordView-2/3 imagery, we developed an approximate model to evaluate the relationship between the vegetation index (VI) and in situ leaf area index (LAI; m 2 m −2 ) measurement data. Then, this model was used to calculate the pixel-based tree canopy LAI from the corresponding VI data. The LAI is the total one-sided leaf area per unit ground surface [19].
Several studies have developed LAI prediction models using vegetation indices (VI) for a variety of vegetation cover, including grasslands and trees [20,21]. The NDVI = ((NIR − red)/(NIR + red) ), which is the most widely used VI, quantifies the response of the vegetation to near-infrared reflectance (NIR) and red absorption [22,23]. The red and NIR of the WordView-2/3 data correspond to spectral bands Band 5 and Band 7, respectively. The NDVI is recognized as the most stable vegetation index for estimating the LAI [24,25], yet the methodological limitation of the VI and notably the NDVI were shown by Pinty et al. [26]. Thus, in this study, four different additional vegetation indices, which have also been widely applied in the literature to derive LAI, were examined ( Table 2). The red edge of the WordView-2/3 data correspond to spectral band, Band 6. The VI obtained with different extraction methods was tested to evaluate the relationship between the VI and LAI measurement data. We tested Remote Sens. 2020, 12, 1505 8 of 20 two VI-extraction window sizes (1, 3 × 3) close to the LAI measurement point, i.e., a VI value acquired from (1), which is the closest pixel to the LAI measurement point, and (2) a block of nine (3 × 3) pixels constituted with the closest pixel to the LAI measurement point at the center and its surrounding eight pixels. For the 3 × 3 extraction window, we tested two values of each VI (mean and maximum) for comparison with the LAI data at the center of those nine pixels.
The LAI-VI approximate model was then developed for both the leaf-on and leaf-off seasons. We optimized the model parameters to achieve 1:1 ratio between the estimated LAI and the observed LAI to avoid overestimation. Table 2. Vegetation indices considered in this study that concern prediction of leaf area index (LAI).

Vegetation Indices Algorithm Abbreviation Reference
Normalized Difference Vegetation Index

In Situ Tree LAI Measurement
We measured the tree canopy LAI (m 2 m −2 ) for Ginkgo biloba, Cornus florida, Prunus spp., Acer buergerianum, Zelkova serrata, Platanus spp., Cinnamomum camphora, Lithocarpus edulis, Morella rubra, Magnolia kobus as these are the 10 species most commonly planted in Tokyo special wards. Several other species found in the field were also measured. The LAI data was collected from evergreen and deciduous trees for the leaf-on season, and from evergreen trees for the leaf-off season. A LAI-2200 plant canopy analyzer (LI-COR Biosciences Inc., Lincoln, NE, USA) was used for the tree canopy LAI measurement in the field (Supplementary Material Figure S1). LAI measurement points were deployed over three areas in Figure 2 (A1 and B1, A2 and B2, A3 and B3) for the leaf-on and leaf-off seasons, respectively. The exact location of the LAI measurement points was recorded by GPS to allow subsequent extraction of the nearby VI value from satellite images. A field survey was conducted in each area in 2018, in the same month when the satellite image of each respective area was photographed. A total of 386 and 268 tree canopy LAI data points were collected in the leaf-on and leaf-off seasons, respectively.
All the LAI measurements were conducted on either cloudy days, soon after the sunrise, or soon before the sunset, to avoid measurement errors, which may be caused by direct sunlight [31]. Moreover, the operator always stood between the sensor and the sun to conduct measurements under the same light conditions to reduce measurement bias. To shield the hemispherical sensor of the plant canopy analyzer from the operator and any objects except trees, we only used the data obtained within a front azimuth angle between 0 • and 30 • and a vertical zenith angle between 0 • and 43 • in accordance with the procedure suggested by Dufrêne and Bréda [32]. Each LAI data point represents an average of five measurements taken under a single tree canopy, and additional one measurement of above-canopy radiation taken in the nearby open space as a reference. Incidentally, it is likely that the in situ LAI data include some contribution of woody elements, so-called plant area index, rather than an actual leaf area index [7]. Thus, we measured the LAI of leafless trees (i.e., the woody element only) in winter to evaluate the degree of overestimation in the LAI data obtained by the plant canopy analyzer.
All the statistical analyses in this study were performed using MATLAB (Ver. 2018b, The MathWorks, Inc., Natick, MA, USA) and BullCurve for Excel (Ver. 3.4.3, Social Survey Research Information Co., Ltd., Tokyo, Japan).

LAI Estimates Using Landsat5-TM and Landsat8 Satellite Data
We also estimated the tree canopy leaf area using Landsat5-TM and Landsat8 imagery (30 m resolution). For this purpose, we adopted a previously reported LAI−NDVI relationship: Remote Sens. 2020, 12, 1505 9 of 20 LAI = 0.419e NDVI/0.270 for Landsat5-TM imagery derived from a broad-leaved forest area in Japan [33], and LAI = 3.440e NDVI − 3.380 for Landsat8 imagery derived from an urban park tree canopy in South Korea [34]. Landsat5-TM imagery was acquired on 11 October, 2010, and Landsat8 imagery was acquired on 27 October, 2016. The season of image acquisition was chosen to coincide with the WorldView-2 imagery, which was acquired on 27 October 2016, for a more accurate comparison. Cloudless images were selected. The Landsat scenes were obtained from the United States Geological Survey (USGS) Earth Explorer website (https://earthexplorer.usgs.gov/). The multispectral digital numbers were converted using the conversion coefficients provided by the Landsat metafiles to obtain Top of Atmosphere (TOA) spectral reflectance. Band 4 (Red) and Band 5 (NIR) of the Landsat scene were used to calculate the NDVI for the LAI estimates.

Determination of Masking Thresholds and Band Combinations for Tree Canopy Extraction
In total, five land cover classes (tree, grass, shaded vegetation, non-vegetation, and shadow/water) were produced following the stepwise masking approach and determining the ideal NDI band combinations and thresholds (see Section 2.3). Table 3 shows the optimized two-band combinations for obtaining the bi-modal NDI histogram and thresholds for each classification step. The bi-modal NDI histograms obtained for each satellite imagery is shown in the Supplementary Material section in Figure S2 for the leaf-on season, and Figure S3 for the leaf-off season. First, the land surface was separated with the NDI calculated from Band 5/Band 7, namely NDVI. In areas with a high NDVI, some pixels were non-vegetation pixels. Thus, in the second step, those non-vegetation pixels in the high NDVI areas were excluded using the NDI calculated from Band 5/Band 6. In the third step, the NDI calculated from Band 1/Band 3 clearly separated the tree canopy and grass, using the same threshold value (0.2) for both the leaf-on and leaf-off seasons. In the fourth step, the dense grass and sparse grass was separated. In the fifth and sixth steps, the vegetation was extracted from the area of low NDVI pixels that was obtained in the first step. Here, to distinguish the vegetation and non-vegetation in the shadow area, the threshold value was a key factor that strongly affected the classification accuracy. Hence, we manually determined an optimum threshold value for each satellite image that produced the best classification result for extracting vegetation data in the shadow area. We were not successful in distinguishing the types of vegetation (tree canopy and grass) in the shadow area because of the weak spectral capability. Table 3. Band combinations and threshold variables determined for image classification.

Accuracy Assessment
Figures 4 and 5 present the resultant classification maps of vegetation (tree, grass, and shaded vegetation) using the pixel-base classification method in the leaf-on and leaf-off seasons, respectively. The mapping results we obtained from WordView-2/3 (Figure 4(a3,b3), and Figure 5(a2,b2)), agree with the vegetation maps generated from aerial photographs (Figure 4(a2,b2), and Figure 5(a1)), and field investigations ( Figure 5(b1)). The overall difference of the tree canopy cover was less than 5.5% (Table 4). However, those two sites show only a small area and do not therefore fully represent the land cover in the entire study area. Thus, we made a comparison across municipalities for a comprehensive accuracy assessment. The tree canopy cover was found to be in good agreement with the Tokyo GWC-ratio data showing ±5% difference overall and 1.2% difference for the entire Tokyo special-ward area (Supplementary Table S1). Figure 6 shows a summary of the overall classification accuracy. The classification result achieved nearly 1:1 ratio against the validation data from field investigation and aerial photograph interpretation. Thus, the classification accuracy was nearly equal in the study area and sufficient for a comprehensive assessment of the tree canopy cover with a margin of difference of 5.5% at 3.0 km 2 small district level, 5.0% at 60.9 km 2 municipality level, and 1.2% at 625 km 2 city level (Supplementary Table S1), compared to aerial photograph interpretation.       Leafon 11.9% -8.4% 3.5% Leafoff 7.5% 6.8% -0.7%   Figure 6. Image classification performance for tree canopy cover estimation. The plots represent the ratio of tree canopy cover (%) estimated for various areas (two test areas, each of 23 municipalities in Tokyo special-wards, and all Tokyo special wards) in comparison with the respective verification data derived from the field survey and aerial photographs.

Tree Canopy LAI Mapping from VI
The measured LAI from the three sites ranged between 1.9 and 10.7 m 2 m −2 (N = 386, mean ± SD = 5.4 ± 2.0) for the leaf-on season and between 1.8 and 9.7 m 2 m −2 (N = 268, mean ± SD = 4.6 ± 1.3) for the leaf-off season, respectively. The LAI data followed a normal distribution for both the leaf-on season (χ 2 = 23.6, 2 df, p < 0.001) and the leaf-off season (χ 2 = 18.1, 2 df, p < 0.001). Among the four VI extraction methods examined herein (Section 2.5), the maximum VI selected within a block of nine pixels (3 × 3) showed the highest correlation with the LAI data (Supplementary Material Figure S4, a11-a15 for the leaf-on season, and b11-b15 for the leaf-off season). The Red Edge NDVI (RE−NDVI) was not correlated with LAI. The relationship between the maximum VI (except RE−NDVI) and the observed LAI data was equal during the leaf-on seasons (Spearman's rank correlation coefficient, R = 0.44), yet the NDVI showed a slightly better relationship during the leaf-off season (Spearman's rank correlation coefficient, R = 0.34 for NDVI, and R = 0.30 for WDRVI and EVI2).
As an example of the LAI−VI relationship, Figure 7 (a1 and b1) shows the relationship between the maximum NDVI and the observed LAI data for each tree species. The LAI values converge to the value of 2 when the NDVI values decrease. This result agrees with our observation for the leafless trees during the leaf-off season that showed an LAI value of 1.8 ± 0.4 m 2 m −2 (mean ± SD, N = 39). Thus, the LAI estimates derived using the plant canopy analyzer include an overestimation of 1.8 m 2 m −2 caused by the misdetection of the woody elements (i.e., tree trunk and branches) as leaves. Accordingly, we subtracted this overestimation from the observed LAI data and obtained the LAI−NDVI approximate model for the leaf-on season (  (Figure 7, a1 and b1). The LAI was saturated at an NDVI of 0.8 for both the Figure 6. Image classification performance for tree canopy cover estimation. The plots represent the ratio of tree canopy cover (%) estimated for various areas (two test areas, each of 23 municipalities in Tokyo special-wards, and all Tokyo special wards) in comparison with the respective verification data derived from the field survey and aerial photographs.

Tree Canopy LAI Mapping from VI
The measured LAI from the three sites ranged between 1.9 and 10.7 m 2 m −2 (N = 386, mean ± SD = 5.4 ± 2.0) for the leaf-on season and between 1.8 and 9.7 m 2 m −2 (N = 268, mean ± SD = 4.6 ± 1.3) for the leaf-off season, respectively. The LAI data followed a normal distribution for both the leaf-on season (χ 2 = 23.6, 2 df, p < 0.001) and the leaf-off season (χ 2 = 18.1, 2 df, p < 0.001). Among the four VI extraction methods examined herein (Section 2.5), the maximum VI selected within a block of nine pixels (3 × 3) showed the highest correlation with the LAI data (Supplementary Material Figure S4, a11-a15 for the leaf-on season, and b11-b15 for the leaf-off season). The Red Edge NDVI (RE−NDVI) was not correlated with LAI. The relationship between the maximum VI (except RE−NDVI) and the observed LAI data was equal during the leaf-on seasons (Spearman's rank correlation coefficient, R = 0.44), yet the NDVI showed a slightly better relationship during the leaf-off season (Spearman's rank correlation coefficient, R = 0.34 for NDVI, and R = 0.30 for WDRVI and EVI2).
As an example of the LAI−VI relationship, Figure 7 (a1 and b1) shows the relationship between the maximum NDVI and the observed LAI data for each tree species. The LAI values converge to the value of 2 when the NDVI values decrease. This result agrees with our observation for the leafless trees during the leaf-off season that showed an LAI value of 1.8 ± 0.4 m 2 m −2 (mean ± SD, N = 39). Thus, the LAI estimates derived using the plant canopy analyzer include an overestimation of 1.8 m 2 m −2 caused by the misdetection of the woody elements (i.e., tree trunk and branches) as leaves. Accordingly, we subtracted this overestimation from the observed LAI data and obtained the LAI−NDVI approximate model for the leaf-on season (LAI = 0.1e NDVI/0.179 ) and the leaf-off season (LAI = 0.1e NDVI/0.188 ) (Figure 7, a1 and b1). The LAI was saturated at an NDVI of 0.8 for both the seasons, and thus, the applicability of these models was uncertain at an NDVI > 0.8. Hence, we replaced the NDVI values > 0.8 with 0.8 to estimate the LAI. The same was true for the other VI, where the VI values appeared to saturate at −0.15 for the WDRVI1, 0.2 for the WDRVI2, and 1.6 for the EVI2, respectively (see Supplementary Material Figure S4, a11-a15 for the leaf-on season, and b11-15 for the Remote Sens. 2020, 12, 1505 13 of 20 leaf-off season). Therefore, the maximum VI value for each LAI−VI model was set to that respective saturation VI value (Supplementary Material Table S2). Figure 7(a2,b2) shows the observed LAI data in association with the estimated LAI using the LAI−NDVI approximate model as an example of the accuracy achieved from the LAI−VI model. The 1:1 ratio between the observed and estimated LAI is an expected outcome because the LAI−NDVI approximate model was adjusted manually to obtain this result (for detail see Section 2.5). Even some plots in Figure 7(a2,b2) were out of the 95% prediction interval, large errors were not caused by a particular tree species. Similarly, there was no significant difference between the estimated LAI and observed LAI for most tree species (Figure 7c). The root-mean-square errors (RMSEs) for the estimated LAI for leaf-on/leaf-off seasons were 2.11/2.41, 2.54/2.30, 2.35/1.93, and 2.34/1.86 for LAI−NDVI, LAI−WDRVI1, LAI−WDRVI2, and LAI−EVI1, respectively (Supplementary Material Table S2). Figure 8b presents an example of the LAI map generated using the LAI−NDVI approximate model. Here, we calculated LAI in pixel i by sequentially executing the following code through all the pixels defined as tree canopy using the image classification: where, A and B are the specific coefficients for each LAI−VI model as shown in Supplementary Material Table S2. The maxVI (i) value is the maximum VI within a block of nine pixels with pixel i at the center. As a result, the estimated total leaf area in Tokyo special wards in the leaf-on season was twice that in the leaf-off season. Table 5 presents the summary of the estimated results for the entire area and the subarea illustrated in Figure 8 with the dense tree canopy covering most of the ground. For both areas, the total leaf area for the WorldView-2/3 results was higher than the Landsat5-TM and Landsat8 results. The same was true for the entire Tokyo special-ward area: the total leaf area estimated from WorldView-2/3 was 1.9-6.2 times higher than the Landsat5-TM and Landsat8 results (Table 5).
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 20 seasons, and thus, the applicability of these models was uncertain at an NDVI > 0.8. Hence, we replaced the NDVI values > 0.8 with 0.8 to estimate the LAI. The same was true for the other VI, where the VI values appeared to saturate at −0.15 for the WDRVI1, 0.2 for the WDRVI2, and 1.6 for the EVI2, respectively (see Supplementary Material Figure S4, a11-a15 for the leaf-on season, and b11-15 for the leaf-off season). Therefore, the maximum VI value for each LAI−VI model was set to that respective saturation VI value (Supplementary Material Table S2). Figure 7 (a2 and b2) shows the observed LAI data in association with the estimated LAI using the LAI−NDVI approximate model as an example of the accuracy achieved from the LAI−VI model. The 1:1 ratio between the observed and estimated LAI is an expected outcome because the LAI−NDVI approximate model was adjusted manually to obtain this result (for detail see Section 2.5). Even some plots in Figure 7 (a2 and b2) were out of the 95% prediction interval, large errors were not caused by a particular tree species. Similarly, there was no significant difference between the estimated LAI and observed LAI for most tree species (Figure 7c) Table S2). Figure 8b presents an example of the LAI map generated using the LAI−NDVI approximate model. Here, we calculated LAI in pixel i by sequentially executing the following code through all the pixels defined as tree canopy using the image classification: where, A and B are the specific coefficients for each LAI−VI model as shown in Supplementary Material Table S2. The maxVI (i) value is the maximum VI within a block of nine pixels with pixel i at the center. As a result, the estimated total leaf area in Tokyo special wards in the leaf-on season was twice that in the leaf-off season. Table 5 presents the summary of the estimated results for the entire area and the subarea illustrated in Figure 8 with the dense tree canopy covering most of the ground. For both areas, the total leaf area for the WorldView-2/3 results was higher than the Landsat5-TM and Landsat8 results. The same was true for the entire Tokyo special-ward area: the total leaf area estimated from WorldView-2/3 was 1.9-6.2 times higher than the Landsat5-TM and Landsat8 results ( Table 5).    Table 5. Comparison of the total tree canopy leaf area (km 2 ) estimated in Figure 8 and all Tokyo special wards from the satellite images (WorldView-2/3, Landsat5-TM, and Landsat8) and the respective LAI−VI models.

Satellite Imagery Date
Entire Region of Figure 8 Tree Colony Region in Figure  8 All   Table 5. Comparison of the total tree canopy leaf area (km 2 ) estimated in Figure 8 and all Tokyo special wards from the satellite images (WorldView-2/3, Landsat5-TM, and Landsat8) and the respective LAI−VI models.

Satellite Imagery Date
Entire Region of Figure 8 Tree Colony Region in Figure 8 All

Discussion
In this study, we aimed to develop a method to identify the small patches of individual trees in a highly urbanized landscape at the mega-city scale, which is seldom studied. We selected the entire urban area of Tokyo (special wards consisting of 23 municipalities) as the study area and utilized the high-resolution WorldView-2/3 satellite imagery to map and estimate the leaf area of the tree canopies. Land coverage of trees by pixel-based image classification was estimated with less than 5.5% difference from the test areas where the field surveys and aerial photograph interpretation were performed. At the municipality scale, the tree canopy cover was estimated with ± 5% difference from the Tokyo GWC-ratio data derived from the aerial photograph interpretation.
Once the tree canopy was identified, we developed a simple method to estimate their leaf area. The results revealed a relationship between the in situ LAI measurement data and the three VI data (NDVI, WDRVI, and EVI2, except the RE−NDVI) from the WorldView-2/3 imagery. The maximum VI selected within a block of nine pixels (3 × 3) was the only one to produce a significant relationship with the LAI data. The others (a VI value at the nearest pixel to the LAI measurement point and the average VI of the nine-pixel (3 × 3) window with the LAI measurement point at the center pixel) were not correlated with the LAI (see Supplementary Material Figure S4). The major reason for their weak LAI−VI relationships can be the effect of geolocation uncertainties in the satellite images and the uneven illumination on the tree crown caused by the shaded and unshaded areas. The rounded shape of an individual tree crown may cause variations in canopy illumination. These variations can be detected in high spatial resolution satellite imagery (2 × 2 m) and may have a significant influence on spectral reflectance. Shadows have lower reflectance intensity than unshaded areas and can cause the VI values to become lower than their actual values. Hence, if the shaded area is included within a block of 3 × 3 pixels, an average VI value for the window of nine pixels will be unreasonably low. This can lead to the weak LAI−VI relationships. However, using the maximum VI value within a block of 3 × 3 pixels means that the VI at the center of the nine pixels is more likely to be chosen from the unshaded area of the tree crown and, thus, it becomes less affected by the uneven illumination effect. Hence, we used the LAI−VI relationship based on the maximum VI to calculate the LAI.
We investigated the LAI−VI relationships in two seasons (leaf-on and leaf-off seasons) because there was a possibility that these relationships are not uniform over the year and might fluctuate seasonally with the phenological stages of the vegetation [35,36]. In fact, Qiao et al. [25] used the NDVI from MODIS satellite imagery and found significant LAI−NDVI relationships during the vegetation's growing and declining periods, whereas a poor relationship was observed during the flourishing period. We focused only on the two phenophases, namely the vegetation non-active period during January-February (except one satellite imagery acquired on April 4) and the vegetation active period during September-November (corresponding approximately to the "declining period" [25]), and thus, the other phenological transition stages such as periods of growing and flourishing were neglected in this study. This might be the reason why the LAI−VI relationships in our results were not visibly affected by the phenology because the flourishing period (the most uncertain period for the LAI−VI relationships) was not considered. Additional work is warranted to evaluate more cases in different seasons for a comprehensive and more accurate assessment.
Using the obtained LAI−VI approximate model, we estimated the tree leaf area over Tokyo's 23 special wards and also for the area shown in Figure 8 that can be characterized as a typical urban environment with the built-up area including parks and tree-lined streets. The obtained leaf area from the WorldView-2/3 imagery was higher than the results suggested in [33,34] (Table 5). The problem with the lower NDVI values arising from mixed pixels (i.e., a pixel that encompasses vegetation with different physical properties) was previously documented [37]. Given the relatively larger pixel size of the Landsat imagery (30 × 30 m) compared to the WorldView-2/3 imagery (2 × 2 m), the mixed pixel effect was strong in the highly heterogeneous urban environment and thus lowered the VI and the LAI values accordingly. In fact, the LAI−NDVI relationships proposed by Hoshi et al. [33] and Kimm et al. [34] were targeted for broad-leaved trees with continuous canopy, and thus, their applicability to estimate the leaf area in a highly urbanized landscape with small patches of individual trees was uncertain. On the contrary, for the same reason, the gap between the results was smaller for the estimates in the dense tree canopy region (Table 5).
This study only focused on extracting the tree canopy cover and did not consider the differences in species and vegetation types in the shadow area. In addition, the problem of saturation of VI at high LAI also needs to be addressed (notably the NDVI in Figure 7 a1 and b1, and for the other VI see Supplementary Material Figure S4). The VI data are derived from the two-dimensional picture, and therefore, the VI, notably the NDVI, is known to be less sensitive to high LAI values when the vegetation is dense [21,24,38]. Thus, the use of VI poses a challenge because the leaf area of an overlapping canopy and a tall tree is difficult to quantify accurately using the LAI−VI relationship. There is a need for robust algorithms to fully evaluate the leaf area in the pixels with high VI values. Although NDVI is one of the most reliable index for estimating LAI [24,39,40], some other vegetation indices tested in this study (RE−NDVI, WDRVI, and EVI2) are also proposed with their own advantages. Several new vegetation indices such as the near-infrared reflectance of vegetation (NIRv) [41] are reported to possess sufficient sensitivity in high-density vegetation areas. Another uncertainty in the LAI−VI relationship is the effect of understory vegetation. During the LAI measurement, the view angle was skywards; thus, the vegetation below the sensor (≈1.5 m high in this study) was not recorded and, therefore, excluded from the ground truth. In a closed dense canopy, the influence of understory vegetation can be neglected, but in a sparse canopy, understory vegetation can influence the VI value, which is observed from the sky above [42,43]. In fact, a recent study showed that the relationship between the LAI and several vegetation indices (including NDVI) improved substantially when the observation points with understory vegetation were excluded [44]. The extent of the effect of understory vegetation on this study's findings should be addressed in a future study to provide more accurate and consistent LAI estimates.
Our future work will focus on the unsolved issues to produce a more complete leaf area inventory for the Tokyo urban area. Using multi-temporal data is known to offer a promising increase in the success rate of mapping urban trees and/or species composition [12,45]. As the WorldView-2/3 satellite can provide images with near-daily revisit times, future research will also include the evaluation of the multi-temporal datasets to improve estimation accuracy by mitigating the uncertainties in the seasonal variations in the LAI−VI relationship caused by the differences in phenology and species type.

Conclusions
To the best of our knowledge, only a few studies have conducted the accuracy assessment of tree canopy cover maps derived from high-resolution (2 m) satellite imagery in mega cities. Our results can be summarized as follows: 1.
The WorldView-2/3 imagery has the ability to map tree canopy cover in a highly heterogenic urban environment with acceptable accuracy. The error margin of the tree canopy cover can reach 5.5% at the 3.0 km 2 small district level, 5.0% at the 60.9 km 2 municipality level, and 1.2% at the 625 km 2 city level, compared to the values based on aerial photograph interpretation.

2.
LAI was estimated with acceptable accuracy for the leaf-on season (September-November, corresponding to a vegetation declining period in autumn) and the leaf-off season (January-February, corresponding to a vegetation non-active period in winter) using the LAI−VI relationships.

3.
The estimated LAI and the total leaf area in Tokyo urban area was 1.9-6.2 times higher than the results from the Landsat5-TM and Landsat8 images using the LAI−NDVI relationships reported in a previous study.
Timely and accurate acquisition of the information on the status and structural changes in urban forests is crucial to develop strategies for sustainable city development and air quality improvement. The method developed in this study offers a practical solution to estimate the tree leaf area in megacity-scale regions in different tree phenology seasons. Maps of the leaf area at the individual tree scale can improve the understanding of the effect of the ecosystem services/disservices on urban systems, and enhance the efficiency of green infrastructure investment and implementation.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/12/9/1505/s1, Table S1: Comparison of the vegetation land cover for each 23 municipalities in Tokyo special wards estimated by aerial photograph interpretation (Tokyo GWC-ratio data), and WorldView-2/3 imagery classification (this study). Figure S1: LAI measurement using LAI-2200, Plant Canopy Analyzer. Measurements were made below the tree canopy at the breast height with the sensor facing upward. Figure S2: Bi-modal histograms used to obtain the threshold values for image classification (leaf-on season). The histograms are normalized with the total number of each classification category for each satellite imagery. Figure S3: Bi-modal histograms used to obtain the threshold values for image classification (leaf-on season). The histograms are normalized with the total number of each classification category for each satellite imagery. Figure S4: Relationships between the VI and the observed/estimated LAI for the leaf-on season (upper panels: a) and the leaf-off season (lower panels: b).