Comparative Assessment of Two Vegetation Fractional Cover Estimating Methods and Their Impacts on Modeling Urban Latent Heat Flux Using Landsat Imagery

Quantifying vegetation fractional cover (VFC) and assessing its role in heat fluxes modeling using medium resolution remotely sensed data has received less attention than it deserves in heterogeneous urban regions. This study examined two approaches (Normalized Difference Vegetation Index (NDVI)-derived and Multiple Endmember Spectral Mixture Analysis (MESMA)-derived methods) that are commonly used to map VFC based on Landsat imagery, in modeling surface heat fluxes in urban landscape. For this purpose, two different heat flux models, Two-source energy balance (TSEB) model and Pixel Component Arranging and Comparing Algorithm (PCACA) model, were adopted for model evaluation and analysis. A comparative analysis of the NDVI-derived and MESMA-derived VFCs showed that the latter achieved more accurate estimates in complex urban regions. When the two sources of VFCs were used as inputs to both TSEB and PCACA models, MESMA-derived urban VFC produced more accurate urban heat fluxes (Bowen ratio and latent heat flux) relative to NDVI-derived urban VFC. Moreover, our study demonstrated that Landsat imagery-retrieved VFC exhibited greater uncertainty in obtaining urban heat fluxes for the TSEB model than for the PCACA model.


Introduction
The urban heat island (UHI) is a phenomenon that leads to increased air or surface temperatures in urbanized areas compared to temperatures in surrounding rural areas.Recently, there has been an increasing trend in studies that quantify the urban heat island response to global environmental change, requiring the surface heat flux in energy balance models of cities.To enhance the understanding of energy and water exchanges across urban landscape environments, Grimmond et al. [1] and Grimmond et al. [2] compared 33 urban energy balance models, and the results showed that the models were most capable of modeling net all-wave radiation and least capable of modeling latent heat flux.However, most of them are numerical microclimate models and not compatible with the application of large-scale remote sensing data, considering the complex inputs of meteorological data and urban surface geometric characteristics.
To quantify global and regional energies, remote sensing data have been coupled with numerous physical models to enhance modeling accuracy [3,4].Various physical models, such as the big-leaf theory models [5][6][7] and Surface Energy Balance System model [8,9], were developed to delineate surface energy components across urban landscape environments.Moreover, the two-source energy balance models, in which heat fluxes are calculated separately for each surface component (i.e., bare land and vegetation), were established to delineate the surface energy balance with relatively satisfactory accuracies [10][11][12][13][14][15].In particular, Zhang et al. [16] and Zhang et al. [17] developed a more convenient Pixel Component Arranging and Comparing Algorithm (PCACA) model based on the two-layer model using a combination of land surface temperature (LST) and the vegetation fractional cover (VFC) space.Kuang et al. [18] achieved using PCACA model to quantify the effects of urban land-cover on surface heat flux regulation in Beijing.Recently, some improved temperature decomposition methods based on PCACA model were further proposed [19][20][21] to estimate regional heat fluxes.
VFC is a key input to remote sensing models of land surface energy balance in heterogeneous areas because the vegetation coverage influences energy exchange processes and heat fluxes of land surfaces.Considering surface energy balance models in urban landscape environments, sub-pixel heterogeneities in vegetation cover distributions can significantly impact model partitioning of available energy due to the nonlinearities inherent in land-atmosphere interactions.With respect to some remote sensing-based urban thermal studies [14,22], the influence of vegetation coverage on urban heat fluxes is generally evident in the determination of the associated heat flux component using linear or non-linear combinations of each fraction of urban land cover (mainly urban vegetation and non-vegetation).The effects of spatial heterogeneity resulting from mixed pixels of vegetation cover and other land covers over heterogeneous regions on the surface heat fluxes have been assessed by some studies using coarse resolution satellite imagery such as Moderate Resolution Imaging Spectrometer (MODIS) and Advanced Very High Resolution Radiometer (AVHRR).Gibson et al. [23] described the uncertainty related to VFC in the derivation of evapotranspiration using the Surface Energy Balance System model.Kustas and Norman [24] reported that heat energy balance modeling tended to be affected by landscape heterogeneity due to mixed pixel effects.However, satellite-based quantification of the impact of sub-pixel VFC on energy budgets at the urban scale using medium resolution imagery (i.e., Landsat) has received less attention than it deserves, particularly concerning different energy flux models.
Remote sensing provides a useful data source for quantifying VFC over large areas; however, many remote sensing data are not fine enough to retrieve VFC with very satisfactory results over regional landscapes [25].Based on the resolutions of most operational satellite sensors such as Landsat and ASTER, spatial resolution limitations are evident in the context of characterizing VFC because complex urban landscape environments generally exhibit a mix of urban vegetation cover and impervious surfaces.To retrieve urban vegetation fractional coverage, the dimidiate pixel model has been performed with the image bands (or band math), such as the Normalized Difference Vegetation Index (NDVI) [26][27][28] and variations [29,30].NDVI-derived method was extensively used to determine the VFC factor in urban heat flux assessments over regional areas [18,[31][32][33].However, satellite image-based vegetation indices were found to have low correlations with the VFC in heterogeneous landscapes such as dry lands or urban areas due to the sensitivity of vegetation to surface environmental conditions [34,35].Spectral Mixture Analysis (SMA) is a sub-pixel classification technique, in which the spectrum collected by a sensor is assumed to be a linear or nonlinear combination of the spectra of components within the studied pixel level [36].Linear SMA was efficient in estimating the percentage of ground cover, such as vegetation cover and impervious surfaces [37][38][39][40].While these two methods (NDVI-derived and SMA-derived) were applied to urban environments to obtain VFC, very few studies have attempted to quantitatively evaluate their relative accuracies and associated roles in estimating urban heat fluxes [18,41].
Considering VFC is a key input parameter in the energy balance model and the effects of remote sensing VFC on the urban heat flux estimation have not been fully assessed, this research evaluated how the application of two different VFC retrieval models (NDVI-derived method and Multiple Endmember Spectral Mixture Analysis (MESMA)-derived method) affected the accuracies of the model estimates of surface heat fluxes in a heterogeneous urban environment.For this purpose, two urban heat fluxes models that used MESMA-derived and NDVI-derived VFC as inputs, the TSEB and PCACA models, were implemented across the urban landscape of Beijing, China.Moreover, inter-comparable studies of how the two different remote sensing-based urban heat flux models (TSEB and PCACA) with different physical basis depend on VFC were also conducted in our research.Thus, the Bowen ratio and latent heat fluxes were calculated based on four methods: (1) TSEB combined with NDVI-derived VFC; (2) TSEB combined with MESMA-derived VFC; (3) PCACA combined with NDVI-derived VFC; and (4) PCACA combined with MESMA-derived VFC.

Study Area
Beijing (located at 39 • 28 N-41 • 05 N and 115 • 25 E-117 • 30 E) is the capital city of China (Figure 1) and selected for our research.This city exhibits high northwestern and low southeastern terrain and is surrounded by mountainous areas to the west and plains to the east and south.Beijing has experienced significant urban sprawl in recent decades.Quantification of urban thermal environments in Beijing is of great significance for sustainable urban development and residential comfort.By quantifying and comparing remote sensing-based energy balance models over Beijing, we can better understand the key surface factors that drive inter-model discrepancies.

Study Area
Beijing (located at 39°28′N-41°05′N and 115°25′E-117°30′E) is the capital city of China (Figure 1) and selected for our research.This city exhibits high northwestern and low southeastern terrain and is surrounded by mountainous areas to the west and plains to the east and south.Beijing has experienced significant urban sprawl in recent decades.Quantification of urban thermal environments in Beijing is of great significance for sustainable urban development and residential comfort.By quantifying and comparing remote sensing-based energy balance models over Beijing, we can better understand the key surface factors that drive inter-model discrepancies.1).

Remote Sensing Data
To quantify the urban heat fluxes, one cloud-free Landsat 5 TM image (22 September 2009) of Beijing was used.In addition, a QUICKBIRD image that covered the Beijing region was collected on 1 October 2009 and employed as a reference dataset for the identification of field data.To retrieve the atmospheric profile for estimating Landsat LST, we collected TERRA/MODIS remote sensing images (MOD021KM and MOD03, obtained online at http://daac.gsfc.nasa.gov/data/dataset/MODIS/),which have the same acquisition date as the Landsat TM imagery.

Ground Data
To validate the remote sensing-retrieved heat fluxes, eddy flux measurements of the latent heat flux (LE), sensible heat flux (H) and Bowen ratio (H/LE) were obtained from the work of [18].In their study, LE and H recorded at five sites were used, as shown in Table 1.The observation instruments were installed approximately 20-30 m above the ground, with average values collected at half-hour intervals.Meteorological data from 20 meteorological stations in Beijing were collected based on [18].
In addition, eight field observations (BA1-BA8, in Table 1) collected in Beijing region during 2001 to 2010 were used for further accuracy comparison analysis.More detailed descriptions about the experiment sites can be found in [42,43].To obtain the critical coefficients for the NDVI-derived VFC retrieval method and evaluate the precision of VFC images retrieved from Landsat TM, 150 sample plots were randomly selected in Beijing, and the corresponding reference VFC values were digitized using high-resolution QUICKBIRD imagery and Google Earth™ imagery.Specifically, reference fractions of urban vegetation abundance were obtained by manually digitizing 150 sample plots in the high-resolution imagery.Eight sites and field-measured VFC data from [18] were also used in this study.Among the 158 selected sites, 38 sites were used to retrieve the required coefficients for the NDVI-derived approach, and the other 120 sites were used to validate the Landsat TM VFC.
In addition, subsets of 10 and 23 sample plots located at the centers of pure water bodies and pure dry, bare lands (impervious surfaces and dry bare soil) were identified from the high-resolution QUICKBIRD and Google Earth™ images.Their corresponding Landsat TM LSTs were used to calibrate the input parameters of the PCACA model.

Image Pre-Processing
The collected Landsat images were first geo-referenced to the Universal Transverse Mercator (UTM) coordinate system using high-resolution Google Earth™ images.The RMSEs of the rectification were all less than one pixel for all the Landsat images.The atmospheric correction was applied to the visible and near infrared bands of Landsat imagery using the Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module [44] in ENVI 4.8 software.
In this study, the land surface temperature (LST) was derived from the Landsat thermal infrared band, as described by Jiménez-Muñoz and Sobrino [45].Land surface emissivity, which is necessary for retrieving LST, was obtained with NDVI thresholds method [46].The accuracy of Landsat derived LST was assessed with the 26 field observations listed in Table 1.It is observed that a better agreement existed between the estimates and the measurements, and the Landsat estimation achieved an RMSE of 1.57K and a MAE of 0.68 K.
Surface land cover maps were generated using image classification of Landsat TM/ETM images.This study utilized a non-parametric supervised classification based on support vector machine classifier [47].Six predominant land cover types (i.e., forest, crops/grassland, bare soil, lawn, impervious surfaces and water bodies) were identified.High-resolution data sources including QuickBird and Google Earth™ were used to assess the classification results.The statistical measures involving overall accuracy (OA) and Kappa statistic (KC) were calculated.All of the images achieved acceptable results with the OA higher than 85% and KC higher than 0.90, meaning that the resultant Landsat classifications are reliable for further analysis.

Method Description
The NDVI-derived model and MESMA-derived model were employed to predict surface VFC.Two models (TSEB model and PCACA model) with different physical basis were then used to investigate the urban heat flux.The overall flowchart is given in Figure 2.
Remote Sens. 2017, 9, 455 5 of 20 In addition, subsets of 10 and 23 sample plots located at the centers of pure water bodies and pure dry, bare lands (impervious surfaces and dry bare soil) were identified from the high-resolution QUICKBIRD and Google Earth™ images.Their corresponding Landsat TM LSTs were used to calibrate the input parameters of the PCACA model.

Image Pre-Processing
The collected Landsat images were first geo-referenced to the Universal Transverse Mercator (UTM) coordinate system using high-resolution Google Earth™ images.The RMSEs of the rectification were all less than one pixel for all the Landsat images.The atmospheric correction was applied to the visible and near infrared bands of Landsat imagery using the Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module [44] in ENVI 4.8 software.
In this study, the land surface temperature (LST) was derived from the Landsat thermal infrared band, as described by Jiménez-Muñoz and Sobrino [45].Land surface emissivity, which is necessary for retrieving LST, was obtained with NDVI thresholds method [46].The accuracy of Landsat derived LST was assessed with the 26 field observations listed in Table 1.It is observed that a better agreement existed between the estimates and the measurements, and the Landsat estimation achieved an RMSE of 1.57K and a MAE of 0.68 K.
Surface land cover maps were generated using image classification of Landsat TM/ETM images.This study utilized a non-parametric supervised classification based on support vector machine classifier [47].Six predominant land cover types (i.e., forest, crops/grassland, bare soil, lawn, impervious surfaces and water bodies) were identified.High-resolution data sources including QuickBird and Google Earth™ were used to assess the classification results.The statistical measures involving overall accuracy (OA) and Kappa statistic (KC) were calculated.All of the images achieved acceptable results with the OA higher than 85% and KC higher than 0.90, meaning that the resultant Landsat classifications are reliable for further analysis.

Method Description
The NDVI-derived model and MESMA-derived model were employed to predict surface VFC.Two models (TSEB model and PCACA model) with different physical basis were then used to investigate the urban heat flux.The overall flowchart is given in Figure 2.

NDVI-Derived Method
The present study calculated the VFC with a simple and extensively used model, which obtains the VFC using a predefined function of the NDVI values.This model can be described as follows: where NDV I v and NDV I s are the NDVI values of fully vegetated areas and bare land, respectively.NDV I v and NDV I s are required coefficients and generally depend on the specific research region.
The coefficient k is a function associated with the vegetation distribution.k is set to 1 in this study, as suggested by [18,22,26,27].
In the retrieval of VFC by the NDVI-derived method, mean Landsat TM NDVI values at 23 selected bare land locations (8 bare soil and 15 impervious surfaces) were computed and considered as NDV I s , and the value was set to 0.06.To estimate NDV I v , Equation ( 1) was implemented to fit the VFC values collected from high-resolution data and their corresponding values of NDV I − NDV I s at 38 sample sites (introduced in Section 2.2.2).The fitted value of NDV I v − NDV I s = 0.60, and NDV I v was set to 0.66.It should be mentioned that NDV I v across the Beijing region was somewhat low in present study mainly due to the complex surface characteristics.Similar results were also found in some prior studies that focused on the VFC derivations over urban regions or desert regions, such as NDV I v = 0.60, in Beijing, China [18]; NDV I v = 0.63, in north China [48]; and NDV I v = 0.61, in central New Mexico, USA [27].Therefore, the value of NDV I v reported by our study was reasonable and acceptable.

MESMA-Derived Method
Multiple endmember spectral mixture analysis (MESMA) [49][50][51] is an improved and widely used algorithm based on the linear SMA.Unlike other simple linear spectral mixture analyses, MESMA can unmix each pixel with different combinations of represented endmembers.The main advantage of MESMA is that it is able to overcome the limitation of using the fixed types and numbers of endmembers to model all pixels.MESMA mainly involves three steps.In the first step, the representative endmembers containing unique spectral reflectance were selected.Three major representative categories of urban land cover types (vegetation, soil and impervious surfaces with high-albedo and low-albedo) were identified based on VIS theory [52].Water and wetland area were not considered and masked out in the pre-process procession due to the non-availability of MESMA model for water regions.Then, with these collected endmembers, a linear spectral unmixing analysis with full abundance constraints (Equations ( 2) and ( 3)) was performed to estimate fractional land covers in an iterative manner.
where k is the number of endmembers, f k is the fraction of endmember k, and R k is the reflectance of endmember k.
In the process of implementing MESMA, a series of simple linear spectral mixture analysis models were used to diversify the combinations of endmembers.For each pixel, one-, two-, and three-endmember combinations were used to determine the optimal candidate model.An optimal candidate model generates a realistic range of fractions (0-100%) and does not exceed an RMSE threshold (0.025).The corresponding estimated results of MESMA were combined into one image with several associated bands.Each band represents the areal fraction of each endmember of vegetation, bare soil and impervious surfaces.VFC can be obtained as the sum of all the abundance bands of vegetation.
This study established a spectra library consisting of 28 spectra.Specifically, some major candidate imagery endmembers were first identified using the Pixel purity index approach in combination with an N-dimensional visualizer.Then, the optimal representative endmembers were chosen from the diverse collected endmembers.The high-albedo and low-albedo impervious surface categories each included 6 distinct endmember spectra, and the vegetation (grassland/lawn and forest) and bare soil categories included 6/5 and 5 spectra, respectively.The reflective spectra of the collected endmembers are displayed in Figure 3.With the selected endmember spectra, MESMA was implemented to obtain VFC.In the process of MESMA, a large number of combination models were applied to unmix each pixel in a cyclical manner.The associated details of MESMA model parameterization are shown in Table 2.
Remote Sens. 2017, 9, 455 7 of 20 candidate model generates a realistic range of fractions (0-100%) and does not exceed an RMSE threshold (0.025).The corresponding estimated results of MESMA were combined into one image with several associated bands.Each band represents the areal fraction of each endmember of vegetation, bare soil and impervious surfaces.VFC can be obtained as the sum of all the abundance bands of vegetation.This study established a spectra library consisting of 28 spectra.Specifically, some major candidate imagery endmembers were first identified using the Pixel purity index approach in combination with an N-dimensional visualizer.Then, the optimal representative endmembers were chosen from the diverse collected endmembers.The high-albedo and low-albedo impervious surface categories each included 6 distinct endmember spectra, and the vegetation (grassland/lawn and forest) and bare soil categories included 6/5 and 5 spectra, respectively.The reflective spectra of the collected endmembers are displayed in Figure 3.With the selected endmember spectra, MESMA was implemented to obtain VFC.In the process of MESMA, a large number of combination models were applied to unmix each pixel in a cyclical manner.The associated details of MESMA model parameterization are shown in Table 2.

Data Type of Combination Number of Models Total
Landsat TM one-endmembers 28 two-endmembers 313 2087 three-endmembers 1746

TSEB Model
An advanced two-source urban heat flux algorithm [14,15] is applied to delineate heterogeneous urban areas.This model can decompose the heat flux within a mixed pixel into sub-component heat fluxes of vegetation and non-vegetation.

TSEB Model
An advanced two-source urban heat flux algorithm [14,15] is applied to delineate heterogeneous urban areas.This model can decompose the heat flux within a mixed pixel into sub-component heat fluxes of vegetation and non-vegetation.
For single observation angle satellite data, it is impractical to accurately separate the component surface temperatures of the vegetation and the impervious surface within the complex mixed pixels.Thus, the sensible heat flux H is obtained using an alternative effective resistance approach: Remote Sens. 2017, 9, 455 8 of 20 where ρ is the air density, C p is the specific heat of air at constant pressure, T s is the remote sensing surface temperature from Landsat imagery and T a is the atmospheric temperature.R a_veg and R a_non−veg are the aerodynamic resistance values of vegetated and non-vegetated areas, respectively, which are retrieved using the following equation: where Z m is the height of wind measurements, Z h is the height of humidity measurements, d is the zero-plane displacement height, Z 0m is the roughness length governing momentum transfer, Z 0h is the roughness length governing transfer of heat and vapor, k is the von Karman's constant and u is the wind speed (m s −1 ) at a given height based on meteorological data.R s is the resistance to heat flow in the boundary layer immediately above the soil surface and can be computed using the following equation: where a is the free convective velocity, b is a coefficient that represents the typical soil surface roughness, and u s is the wind speed over the soil surface at a height of 0.05-0.2m.As suggested by Kustas and Norman [53] and Weng, Hu, Quattrochi and Liu [14], a and b are set to 0.004 m/s and 0.012, respectively.LE is given using the following equation: where LE veg and LE non−veg are the latent heat fluxes in vegetated and non-vegetated areas, respectively.These variables were computed as follows: where e a is the atmospheric water vapor pressure in hPa, e o is the saturation vapor pressure in hPa and is calculated using Equations ( 9) and (10) [5,7].γ is the psychometric constant.
In Equations ( 9) and (10), T is the atmospheric temperature in kelvin and r s_veg and r s_non−veg are the stomata resistance values in vegetated and non-vegetated areas, respectively.r s_veg and r s_non−veg were obtained using the simple approach reported by Kato and Yamaguchi [5].
By dividing sensible heat flux H by latent heat flux LE, we can obtain the Bowen ratio.
In the TSEB model, the required parameters involving roughness lengths, including d, Z 0m and Z 0h , have important influences on the results.Typical values of d, Z 0m and Z 0h for specific surface land types are alternatively used in this study.The reference classifications of the surface types were obtained using the SVM method.Based on existing studies in the Beijing [54,55], the required parameters are listed in Table 3.

PCACA Model
The PCACA model uses a combination of LST versus VFC trapezoid method and a Bowen ratio energy balance method to partition surface temperature and surface net radiation of mixed pixels, subsequently estimating various heat components.
As a method that is based on the trapezoid approach, PCACA assumes that the LSTs of mixed pixels all exist within the space constructed by VFC and LST.In Figure 4, the wet edge of the trapezoid is related to surface conditions of maximum LE (or potential evapotranspiration).In contrast, the dry edge represents minimum LE (or zero evapotranspiration).Accordingly, if the endpoints of the two edges are selected and the associated positions are determined, we can fix the shape and structure of the trapezoid.Furthermore, the Bowen ratio and surface heat flux can be calculated.
In urban regions, LST sw and LST sd represent the endpoints of water bodies and dry, bare lands, respectively.LST vd and LST vw represent dry, full canopy green spaces and water-saturated soils, respectively.As described by previous studies [16,17,56], the Bowen ratio of each pixel can be retrieved through the linear interpolation between wet and dry edges of the trapezoid using the following equation: where LST min and LST max are the intersections between VFC and the wet edge and dry edge, respectively.LST min and LST max in Figure 4 can be represented by the following equations: where k dry and k wet are the slopes of the dry and wet edges.Substituting LST min and LST max in Equation ( 12) into Equation ( 13) yields the following expression.
To determine appropriate LST min and LST max for the PCACA method, we must identify appropriate wet and dry edges in the complex urban regions using the Landsat-based VFC/LST spaces.Selecting accurate hot and cold endpoints is difficult because of the constraint information derived from one thermal band.In this study, we did not focus on the procedures used to identify extreme pixels.Following previous studies [16,17], LST sd and k dry were retrieved using LST values from representative bare land in the study area, and LST sw and k wet from the Landsat images were based on the LST values of several water districts in the study regions.
To determine appropriate min LST and max LST for the PCACA method, we must identify appropriate wet and dry edges in the complex urban regions using the Landsat-based VFC/LST spaces.Selecting accurate hot and cold endpoints is difficult because of the constraint information derived from one thermal band.In this study, we did not focus on the procedures used to identify extreme pixels.Following previous studies [16,17], Then, based on Equations ( 1) and ( 12), LE can be obtained using Equation (16).
In this study, n R and G are estimated according to the work of [6,14].In the implementation of the PCACA model, 10 water bodies in Beijing were selected, and their mean Landsat TM LSTs were calculated.We considered the water surface temperature to be LST sw = 20.0 ± 0.
In this study, R n and G are estimated according to the work of [6,14].

Performance of VFC Estimations
The resultant VFCs were validated using the independent dataset (N = 120) that is introduced in Section 2.2.2.The VFC maps derived from the two remote sensing methods are shown in Figure 5, and the scatter plots of estimated VFC versus reference VFC are given in Figure 6.Through checking the accuracy of NDVI-derived VFC, an adjusted R 2 of 0.86 was obtained.The regression analysis displayed an adjusted R 2 of 0.91 based on the results of the MESMA-derived method.For comprehensive analysis, urban landscape was subdivided into two categories: pixels were classified as less-developed areas when the VFC were larger than or equal to 30%, whereas those pixels of less than 30% were considered developed areas.The statistical results for the overall study area and two categories are illustrated in Table 4.The RMSE-based and MAE-based accuracies achieved using the MESMA-derived approach were 1.61% and 1.03% better, respectively, than those of the NDVI-based approach.
displayed an adjusted R 2 of 0.91 based on the results of the MESMA-derived method.For comprehensive analysis, urban landscape was subdivided into two categories: pixels were classified as less-developed areas when the VFC were larger than or equal to 30%, whereas those pixels of less than 30% were considered developed areas.The statistical results for the overall study area and two categories are illustrated in Table 4.The RMSE-based and MAE-based accuracies achieved using the MESMA-derived approach were 1.61% and 1.03% better, respectively, than those of the NDVI-based approach.The resultant VFCs were approximately similar for the NDVI-derived and MESMA-derived methods in non-urban regions.Specific differences in the estimated VFCs obtained using these two methods were observed in urban regions.As shown in Figure 5, the spatial VFC patterns of trees and grasslands located in medium-high VFC areas (i.e., urban parks and residential areas) matched relatively well between the two methods.However, some small patches of grass/shrubs and trees located in low VFC areas, e.g., in the city center and the southern portions of built-up regions, were confused with impervious surfaces and resulted in obvious differences in estimated VFCs using the The resultant VFCs were approximately similar for the NDVI-derived and MESMA-derived methods in non-urban regions.Specific differences in the estimated VFCs obtained using these two methods were observed in urban regions.As shown in Figure 5, the spatial VFC patterns of trees and grasslands located in medium-high VFC areas (i.e., urban parks and residential areas) matched relatively well between the two methods.However, some small patches of grass/shrubs and trees located in low VFC areas, e.g., in the city center and the southern portions of built-up regions, were confused with impervious surfaces and resulted in obvious differences in estimated VFCs using the two methods.The better performance of the MESMA-derived method relative to that of the NDVI-derived method is also displayed by the red circles in scatter plots in Figure 6.
Remote Sens. 2017, 9, 455 12 of 20 two methods.The better performance of the MESMA-derived method relative to that of the NDVIderived method is also displayed by the red circles in scatter plots in Figure 6.The performance of VFC retrieval was examined by comparing the frequency plots of estimated VFC values based on the NDVI-derived and MESMA-derived models.Figure 7 shows that the frequency profiles of the modeled VFC results are similar in non-urban regions, with a total bias of only 0.92.Conversely, the discrepancy between the NDVI-derived and MESMA-derived histograms in urban regions was relatively large, with a total bias of 3.12, that the effectiveness of urban VFC retrieval varied between the two approaches.These deviations agree with the results of the conducted accuracy evaluations and scatter plots in Figure 6.

Performance of Urban Heat Flux Estimations
The in situ measurements from the five sites (B1-B5) listed in Table 1 were used to validate the Landsat TM-retrieved Bowen ratio and latent heat flux.The Bowen ratio maps derived from two remote sensing heat flux models are shown in Figure 8.The LE maps derived from remote sensing heat flux methods are shown in Figure 9.There are obvious spatial differences between the model results using the different retrieved VFCs as driving parameters, particularly in urban regions where some residential evaporation occurred.
Figure 10a,b illustrates the statistical results for all the five sites put together, based on the TSEB model using NDVI-derived VFC and MESMA-derived VFC as the driving parameter, respectively.The statistical results for all the five sites put together, based on the PCACA model using NDVIderived VFC and MESMA-derived VFC as the driving parameter, respectively, are illustrated in the Figure 10c,d.Furthermore, the differences in heat flux maps based on the PCACA model using the two different VFC sources were relatively similar (see Figure 10g,h), while the TSEB model yielded some substantial differences (see Figure 10e,f).

Performance of Urban Heat Flux Estimations
The in situ measurements from the five sites (B1-B5) listed in Table 1 were used to validate the Landsat TM-retrieved Bowen ratio and latent heat flux.The Bowen ratio maps derived from two remote sensing heat flux models are shown in Figure 8.The LE maps derived from remote sensing heat flux methods are shown in Figure 9.There are obvious spatial differences between the model results using the different retrieved VFCs as driving parameters, particularly in urban regions where some residential evaporation occurred.
Figure 10a,b illustrates the statistical results for all the five sites put together, based on the TSEB model using NDVI-derived VFC and MESMA-derived VFC as the driving parameter, respectively.The statistical results for all the five sites put together, based on the PCACA model using NDVI-derived VFC and MESMA-derived VFC as the driving parameter, respectively, are illustrated in the Figure 10c,d.Furthermore, the differences in heat flux maps based on the PCACA model using the two different VFC sources were relatively similar (see Figure 10g,h), while the TSEB model yielded some substantial differences (see Figure 10e,f).model using NDVI-derived VFC and MESMA-derived VFC as the driving parameter, respectively.The statistical results for all the five sites put together, based on the PCACA model using NDVIderived VFC and MESMA-derived VFC as the driving parameter, respectively, are illustrated in the Figure 10c,d.Furthermore, the differences in heat flux maps based on the PCACA model using the two different VFC sources were relatively similar (see Figure 10g,h), while the TSEB model yielded some substantial differences (see Figure 10e,f).The spatial patterns of LE are also illustrated along with a histogram that shows the frequency distribution of LE values within the model domain.As shown in Figure 11, different models exhibited different capabilities of capturing the frequency distribution of urban LE estimation.The discrepancies in LE distributions between the two studied models, in terms of both overall magnitude and spatial differences, were larger in urban regions relative to those in non-urban regions.The distributions of the modeled LE results are similar in non-urban areas, with overall bias values of 7.3 Wm −2 and 4.0 Wm −2 based on the TSEB model and PCACA model, respectively.However, a larger difference was observed in urban regions, with overall bias values of 58.8 Wm −2 and 13.9 Wm −2 based on the TSEB model and PCACA model, respectively.The deviations in the estimated urban LE obtained from both TSEB and PCACA models could be attributed to the differences in the response to urban VFC retrieval.For the TSEB approach, VFC was a critical factor that dominated the pixel-level-combined heat flux.While for the PCACA model, these uncertainties were caused by the spatial context between VFC and LST.This further demonstrated that accurate VFC data is particularly important in enhancing the estimation of urban heat fluxes.Similar finding was observed in the study of [57], which used Surface Urban Energy and Water Balance model to delineate neighborhood scale energy components and demonstrated the importance of vegetative cover in simulating urban energy terms.
level-combined heat flux.While for the PCACA model, these uncertainties were caused by the spatial context between VFC and LST.This further demonstrated that accurate VFC data is particularly important in enhancing the estimation of urban heat fluxes.Similar finding was observed in the study of [57], which used Surface Urban Energy and Water Balance model to delineate neighborhood scale energy components and demonstrated the importance of vegetative cover in simulating urban energy terms.It is observed that RMSE value was 63.1 Wm −2 for TSEB model when using the NDVI-derived VFC, while this value was 58.3 Wm −2 for PCACA model.The MAE value was 58.6 Wm −2 for TSEB model when using NDVI-derived VFC; and this value was 53.1 Wm −2 for PCACA model.On the other hand, when using MESMA-derived VFC, the RMSE of LE was 57.3 Wm −2 for TSEB model, while this value was 54.6 Wm −2 for PCACA model.The MAE value was 50.8 Wm −2 for TSEB model when using MESMA-derived VFC; and this value was 48.2 Wm −2 for PCACA model.
We also checked the determination coefficients (R 2 ) of the fields with respect to the combination of B1-B5 sites and BA1-BA8 sites.The R 2 was 0.66 and 0.70 for TSEB model when using the NDVI-derived VFC and MESMA-derived VFC, respectively.Concerning the PCACA model, the R 2 values for the resultant LE values were 0.72 and 0.73 for the NDVI-derived VFC and the MESMA-derived VFC respectively.Overall, results demonstrated that LE was produced with obvious errors when using the NDVI-derived VFC while this was improved when using the SMA-derived VFC.
We also checked the determination coefficients (R 2 ) of the fields with respect to the combination of B1-B5 sites and BA1-BA8 sites.The R 2 was 0.66 and 0.70 for TSEB model when using the NDVIderived VFC and MESMA-derived VFC, respectively.Concerning the PCACA model, the R 2 values for the resultant LE values were 0.72 and 0.73 for the NDVI-derived VFC and the MESMA-derived VFC respectively.Overall, results demonstrated that LE was produced with obvious errors when using the NDVI-derived VFC while this was improved when using the SMA-derived VFC.

Discussion
When combined with medium-resolution spatial and spectral features, MESMA-derived method achieved more accurate VFC values in complex urban regions than NDVI-derived method.In particular, the NDVI-derived approach obviously underestimates low VFC areas as compared to estimates using the MESMA approach, especially in complex urban regions where partial cover conditions are more prevalent.This underestimation is mainly associated with the complexity of bare land (bare soil and impervious surfaces) spectra.For the sparsely vegetated areas, the performance of the NDVI-derived method decreases because NDVI is more sensitive to vegetation spectra over high-level bare land areas.This finding is supported by previous studies [58,59] that reported relatively lower vegetation coverages in metropolitan areas (Phoenix and Manaus) would result in estimation difficulties.It should be stressed that for a substantial part of urban thermal studies, the NDVI-derived method was commonly used due to its simplicity and ease of implementation.However, based on the above context, MESMA-derived VFC was found more prospective in studies of urban thermal environments.
Through the investigation we could see that the LE differences between two VFC sources were obvious in local urbanized regions, especially in areas with high-intensity impervious surface pixels (low-intensity urban vegetation) for which only a smaller fraction of the energy translates into

Discussion
When combined with medium-resolution spatial and spectral features, MESMA-derived method achieved more accurate VFC values in complex urban regions than NDVI-derived method.In particular, the NDVI-derived approach obviously underestimates low VFC areas as compared to estimates using the MESMA approach, especially in complex urban regions where partial cover conditions are more prevalent.This underestimation is mainly associated with the complexity of bare land (bare soil and impervious surfaces) spectra.For the sparsely vegetated areas, the performance of the NDVI-derived method decreases because NDVI is more sensitive to vegetation spectra over high-level bare land areas.This finding is supported by previous studies [58,59] that reported relatively lower vegetation coverages in metropolitan areas (Phoenix and Manaus) would result in estimation difficulties.It should be stressed that for a substantial part of urban thermal studies, the NDVI-derived method was commonly used due to its simplicity and ease of implementation.However, based on the above context, MESMA-derived VFC was found more prospective in studies of urban thermal environments.
Through the investigation we could see that the LE differences between two VFC sources were obvious in local urbanized regions, especially in areas with high-intensity impervious surface pixels (low-intensity urban vegetation) for which only a smaller fraction of the energy translates into evaporation and transpiration.In some regions such as the city center and the southern portions of built-up areas in Beijing, LE was detected using the MESMA-retrieved VFC and was ignored based on the NDVI-retrieved VFC.This can be attributed to the fact that VFC was more likely to be underestimated using the NDVI-derived method in sparsely vegetated areas dominated by light impervious surfaces and soil.In particular, when NDVI-derived VFC was used as input data, latent heat flux was seriously underestimated and even considered as non-existent over high-level impervious surfaces (VFC less than 10%).
Clear dissimilarities were also observed in the frequency distributions of LE estimates between those of TSEB and PCACA models using the same VFC in urban and non-urban regions.Moreover, the discrepancies in the TSEB model-retrieved heat flux values using MESMA VFC versus NDVI VFC inputs were nearly one to two times larger than those of the PCACA model.This may be because that PCACA model could partially counteract the effects of miscalculated VFC on LE estimations using the constraint of spatial context of them.The results of our study are supported by some previous studies [18,60], which have reported that the VFC-LST methodology could be used in complex urban scenes and produce acceptable results.
On the other hand, although previous studies have focused on the validation and applicability of remote sensing-based urban heat flux models, there is no comparative analysis regarding different urban heat flux models.In present study, PCACA model, a typical VFC-LST based trapezoid model that does not require a large number of relative parameters which cannot be retrieved by remote sensing, can be more suitable to delineate urban surface LE and with higher accuracy.In addition, while the TSEB and PCACA models provided acceptable performances with the available remote sensing and meteorological data, the applications of these models needs to be further assessed by invalidating with other experimental sites (i.e., URBANFLUXES (http://urbanfluxes.eu/)) or comparing with other urban energy models.Some effectively neighborhood scale urban climates models, such as [61], should be considered in further work.

Conclusions
This study investigated the fractional abundance of urban vegetation based on the heterogeneous pixels of Landsat imagery.A comparative analysis between NDVI-derived and MESMA-derived urban VFC illustrated that the latter achieved more accurate VFC values in complex urban regions.Moreover, MESMA-derived VFC could result in more accurate urban LE estimates relative to NDVI-derived VFC when used as input to both the TSEB and PCACA models.Our study suggests that the MESMA-retrieved VFC, which has not been sufficiently investigated in previous urban thermal environment studies, should be given more attention.
When using the different remote sensing-based VFC driver maps, PCACA model produced smaller differences on the output LE maps over both urban and non-urban regions.However, obvious bias was observed using the TSEB model in both urban and non-urban regions, particularly in the former.Accordingly, PCACA model may be an alternative for remote sensing-based urban heat flux studies that focus on quantitative comparative analyses in complex study regions.

Figure 1 .
Figure 1.The left is the location of study areas in the China.The right map is the Landsat 5 TM true color composition imagery of Beijing, overlaid with the field measurements and meteorological station sites (listed in Table1).

2. 2 . 1 .
Remote Sensing Data To quantify the urban heat fluxes, one cloud-free Landsat 5 TM image (22 September 2009) of Beijing was used.In addition, a QUICKBIRD image that covered the Beijing region was collected on 1 October 2009 and employed as a reference dataset for the identification of field data.To retrieve the atmospheric profile for estimating Landsat LST, we collected TERRA/MODIS remote sensing images (MOD021KM and MOD03, obtained online at http://daac.gsfc.nasa.gov/data/dataset/MODIS/),which have the same acquisition date as the Landsat TM imagery.One Landsat scene available from 22 September 2009 may be insufficient, therefore ten Landsat TM/ETM images collected on 17 April 2001, 12 April 2002, 6 July 2004, 6 May 2005, 22 May 2005, 27

Figure 1 .
Figure 1.The left is the location of study areas in the China.The right map is the Landsat 5 TM true color composition imagery of Beijing, overlaid with the field measurements and meteorological station sites (listed in Table1).

Figure 2 .
Figure 2. The overall flowchart of this study.The solid blue box represents the input data.From top to bottom, the dashed red boxes illustrate image pre-processing, model implementation and analysis of results.

Figure 2 .
Figure 2. The overall flowchart of this study.The solid blue box represents the input data.From top to bottom, the dashed red boxes illustrate image pre-processing, model implementation and analysis of results.

Figure 3 .
Figure 3. Illustration of the spectra endmembers with different numbers of clusters: (a) 11 for vegetation; (b) 5 for bare soil; (c) 6 for high-albedo impervious surfaces; and (d) 6 for low-albedo impervious surfaces.

Figure 3 .
Figure 3. Illustration of the spectra endmembers with different numbers of clusters: (a) 11 for vegetation; (b) 5 for bare soil; (c) 6 for high-albedo impervious surfaces; and (d) 6 for low-albedo impervious surfaces.
sd LST and dry k were retrieved using LST values from representative bare land in the study area, and sw LST and w et k from the Landsat images were based on the LST values of several water districts in the study regions.

Figure 4 .
Figure 4.The trapezoid of vegetation fractional cover/land surface temperature (VFC/LST) space, in which v and s denote vegetated areas and bare ground, respectively, and d and w denote extremely dry and extremely wet areas, respectively In the implementation of the PCACA model, 10 water bodies in Beijing were selected, and their mean Landsat TM LSTs were calculated.We considered the water surface temperature to be o 20.0 0.5 C.=

Figure 4 .
Figure 4.The trapezoid of vegetation fractional cover/land surface temperature (VFC/LST) space, in which v and s denote vegetated areas and bare ground, respectively, and d and w denote extremely dry and extremely wet areas, respectively.
5 o C. Average values of 23 Landsat TM LSTs of built-up, pavement and dry, bare soil regions in the study area were used to determine the dry surface temperature LST sd = 38.2± 0.9 o C. When obtaining values of LST sw and LST sd , k dry and k wet were estimated through fitting Equation (13) using a combination of VFC and LST.k dry and k wet were set to −4.76 and 0.51, respectively, in our study.With these calculated coefficients, the Bowen ratio was estimated from VFC and LST in each pixel using the following formula.β = −0.51× VFC − (20.0 − LST) −4.76 × VFC + (38.2 − LST)

Figure 5 .
Figure 5.The VFC maps based on: (a) Normalized Difference Vegetation Index (NDVI)-derived method; and (b) Multiple Endmember Spectral Mixture Analysis (MESMA)-derived method.Detailed VFC maps in the urban region (blue box) based on: (c) NDVI-derived method; and (d) MESMA-derived method.

Figure 5 .
Figure 5.The VFC maps based on: (a) Normalized Difference Vegetation Index (NDVI)-derived method; and (b) Multiple Endmember Spectral Mixture Analysis (MESMA)-derived method.Detailed VFC maps in the urban region (blue box) based on: (c) NDVI-derived method; and (d) MESMA-derived method.

20 Figure 7 .
Figure 7. Normalized frequency plots of the two estimated VFCs in: (a) urban regions; and (b) nonurban regions (Note that the x-axis is from 0.02 to 0.98).

Figure 7 .
Figure 7. Normalized frequency plots of the two estimated VFCs in: (a) urban regions; and (b) non-urban regions (Note that the x-axis is from 0.02 to 0.98).

Figure 11 .Figure 11 .
Figure 11.Frequency plots of estimated LE data based on the NDVI-derived and MESMA-derived models: (a,c) urban region; and (b,d) non-urban region.
Endmember Spectral Mixture Analysis (MESMA)-derived method) affected the accuracies of the model estimates of surface heat fluxes in a heterogeneous urban environment.For this purpose, two urban heat fluxes models that used MESMA-derived and NDVI-derived VFC as inputs, the TSEB and PCACA models, were implemented across the urban landscape of Beijing, China.Moreover, intercomparable studies of how the two different remote sensing-based urban heat flux models (TSEB and PCACA) with different physical basis depend on VFC were also conducted in our research.Thus, the Bowen ratio and latent heat fluxes were calculated based on four methods: (1) TSEB combined with NDVI-derived VFC; (2) TSEB combined with MESMA-derived VFC; (3) PCACA combined with NDVI-derived VFC; and (4) PCACA combined with MESMA-derived VFC.

Table 1 .
Descriptions of ground observation sites and field observations collected at Landsat passing.

Table 2 .
Details of the MESMA model parameterization.

Table 2 .
Details of the MESMA model parameterization.

Table 3 .
Parameters used for surface coverage types.

Table 4 .
Comparisons of VFC estimation accuracy for NDVI-derived and MESMA-derived methods.