Using a Vegetation Index-Based Mixture Model to Estimate Fractional Vegetation Cover Products by Jointly Using Multiple Satellite Data: Method and Feasibility Analysis

: Remote sensing fractional vegetation cover (FVC) requires both ﬁner-resolution and high-frequency in climate and ecosystem research. The increasing availability of ﬁner-resolution ( ≤ 30 m) remote sensing data makes this possible. However, data from different satellites have large differences in spatial resolution, spectral response function, and so on, making joint use difﬁcult. Herein, we showed that the vegetation index (VI)-based mixture model with the appropriate VI values of pure vegetation ( V v ) and bare soil ( V s ) from the MODIS BRDF product via the multi-angle VI method (MultiVI) was feasible to estimate FVC with multiple satellite data. Analyses of the spatial resolution and spectral response function differences for MODIS and other satellites including Landsat 8, Chinese GF 1, and ZY 3 predicted that (1) the effect of V v and V s downscaling on FVC estimation uncertainty varied from satellite to satellite due to the positioning differences, and (2) after spectral normalization, the uncertainty (RMSDs) for FVC estimation decreased by ~2.6% compared with the results without spectral normalization. FVC estimation across multiple satellite data will help to improve the spatiotemporal resolution of FVC products, which is an important development for numerous biophysical applications. Herein, we proved that the VI-based mixture model with V v and V s from MultiVI is a strong candidate. downscaling was produced based on Landsat and HJ satellites [31], the 30 m V v and V s have better positioning consistency with FVC from Landsat 8 than ZY 3 and GF 1. As for consistency (R 2 ), FVC_2 for Landsat 8, ZY 3, and GF 1 all have lower consistency than that of FVC_1. The downscaled endmembers increase the heterogeneity of the estimated FVC results. FVC_1 within the same 500-meter resolution pixel only reﬂects the difference of the 30 m NDVI from Landsat 8, ZY 3, or GF 1. When downscaled endmembers are used, the FVC_2 differences also include differences in land cover types. This shows that the downscaled endmembers have different accuracy under different land cover types, but the overall trend is good (i.e., RMSD decrease). Contributions: Conceptualization, W.S. and X.M.; methodology, W.S.; software, W.S. and B.Z.; validation, W.S. and T.Z.; formal analysis, L.W.; investigation, T.Z.; resources, B.Z. and J.Z.; data curation, B.Z. J.Z.; writing—original preparation, W.S.; writing—review W.S., T.Z. visualization, W.S. T.Z.; X.M., administration,


Introduction
Remote sensing fractional vegetation cover (FVC) is one of the most important products in describing the vegetation coverage for climate, ecosystem, land degradation, and desertification [1,2]. It is defined as the ratio of the vertically projected area of vegetation to the total surface extent [3][4][5][6][7]. Until now, most published FVC products are in coarse resolution (≥250 m). However, for application in a city ecosystem, agriculture, and soil erosion over basin areas, monitoring heterogeneous land surfaces by using coarse-resolution imageries can easily cause information absence [8,9] and requires higher resolution information. Besides this, high-frequency satellite data has great value for dynamic monitoring of rapid changes on the Earth's surface, such as timely crop monitoring.
The good news is that more finer-resolution (≤30 m) remote sensing data are freely available, such as Landsat 8 (30 m), Chinese GF 1 (16 m), and Sentinel 2 (10 m), making it possible to develop finer-resolution land surface products at high temporal frequency.
Synergies between Landsat 8 Operational Land Imager (OLI) and Sentinel 2 Multispectral Imager (MSI) data are promising to fulfill the community's needs for high-temporal resolution images at a finer resolution, which can provide dense global observations at a nominal revisit interval of 2-3 days [10]. Besides this, China has launched a series of Earth resources and environment monitoring satellites, such as the HJ (HuanJing, which means the environment in Chinese; 30 m; 31 days for global coverage) series, GF (GaoFen, which means high spatial resolution in Chinese; 16 m; 41 days for global coverage) series, ZY (ZiYuan, which means resource in Chinese; 5.8 m; 59 days for global coverage) series. Those satellite data can have finer spatial resolutions for detailed surface process monitoring. Therefore, making full use of multiple finer-resolution satellite data can help improve the spatiotemporal resolution of FVC products and will help to improve the practicality of FVC products. However, different satellite sensors have different spatial resolution and spectral response functions, and there is limited research on using these data in combination to fulfill the needs for global high-temporal resolution remote sensing products.
Furthermore, after analyzing the present algorithms available for finer-resolution FVC estimation [11,12], most algorithms are data-based, which are not extendable to other satellite sensors. The neural network is sensor dependent and needs a large amount of training data [11,13], which is not always available, and the transferability of both network and training data needs to be discussed. The relative vegetation abundance algorithms scaled by maximum and minimum vegetation index (VI) values and spectral mixture analysis algorithms need the spectra and VI at subpixel scale for end members are widely used for FVC application and production [1,[14][15][16]; however, the endmember information are often data-based and required as given information [17]. Thus, there is a need to provide a method that can joint use multiple satellite data.
The recently proposed Multi-angle vegetation index (MultiVI) method can be a candidate for the solution [18]. The MultiVI method facilitates the acquisition of end members in a VI-based linear mixture model for FVC estimation, where the end members (V v and V s ) are quantitatively derived from publicly available Moderate Resolution Imaging Spectroradiometer (MODIS) Bidirectional Reflectance Distribution Factor (BRDF) products without using any other prior knowledge [18]. By taking the Normalized Difference Vegetation Index (NDVI) as an example, the key end members, NDVI for pure vegetation (V v ) and bare soil (V s ), are two byproducts of this method and can be applied to generate FVC via the VI-based mixture model. The 500 m resolution V v and V s from MODIS can be downscaled according to land cover product, which makes it available to generate FVC in various spatial resolutions with multiple satellite data.
However, the generalizability of this 500-meter resolution V v and V s from MODIS should be discussed, whether they are suitable for all satellite data and capable of the estimation of FVC from multiple satellite data. Research has shown that the spectral response function does influence the band reflectance and VIs [19,20]. Although the difference between the spectral response functions of different satellite sensors has a negligible effect on the estimation accuracy of vegetation structure parameters [19,20], few studies have addressed the uncertainty and feasibility of using information from one satellite (e.g., V v and V s from MODIS) to determine vegetation structure parameters (e.g., FVC) from other satellites (i.e., when applying the V v and V s from MODIS to other satellite data, does the spectral response function difference affect the accuracy of FVC estimation?).
Herein, we proposed that the VI-based mixture model which obtains V v and V s from MODIS via the MultiVI method is capable of estimating FVC by jointly using multiple finer-resolution satellite data (i.e., Landsat 8, GF 1, and ZY 3). We examine the uncertainty and feasibility of the FVC estimation process. Our objectives are: (i) analyzing the necessity of MODIS V v and V s downscaling for finer-resolution FVC estimation; and (ii) assessing uncertainty due to spectral response function differences for FVC estimation with different satellite data.

Study Areas and Field Measurements
Herein, we selected the Saihanba National Park (SNP; Figure 1) for the validation of finer-resolution FVC with field measurements. The SNP area is 666 km 2 , which is in Hebei Province, China, and consists of 43.2% grassland, 44.0% forest, 9.5% desert and swamp, 3.0% cropland, and 0.3% residential land. The forest is artificial and is dominated by birch and larch. The land cover has a certain heterogeneity but is relatively stable. Field measurements were collected for the primary vegetation types in the SNP.

Study Areas and Field Measurements
Herein, we selected the Saihanba National Park (SNP; Figure finer-resolution FVC with field measurements. The SNP area is 666 Province, China, and consists of 43.2% grassland, 44.0% forest, 9. 3.0% cropland, and 0.3% residential land. The forest is artificial and and larch. The land cover has a certain heterogeneity but is relativ urements were collected for the primary vegetation types in the SN FVC was measured in sampling plots of 45 m × 45 m by us 2015, to obtain the reference FVC for the validation [21][22][23]. Duri digital camera was set up at a height of 1.5~2 m from the ground stick, and it was directed downward when photographing. Give top-down direction was used to capture low vegetation under whereas a bottom-up direction was used to capture the underside ditionally, the forest FVC was calculated as Equation (1), in which f while fdown is for FVC underneath the tree crown. Digital images diagonals of the plot, and about 20 images were taken for each pl ages were processed by using an automatic and shadow-resistant a FVC) with an uncertainty of less than 0.025 [24]. In all, 25 sites captured from 27 June to 13 September 2015 were used for evalu sites were measured multiple times during this period).  FVC was measured in sampling plots of 45 m × 45 m by using a digital camera in 2015, to obtain the reference FVC for the validation [21][22][23]. During the experiment, the digital camera was set up at a height of 1.5~2 m from the ground with the help of a long stick, and it was directed downward when photographing. Given the forests in SNP, a top-down direction was used to capture low vegetation underneath the tree crown, whereas a bottom-up direction was used to capture the underside of the tree crown. Additionally, the forest FVC was calculated as Equation (1), in which f up is for tree crown FVC while f down is for FVC underneath the tree crown. Digital images were taken along two diagonals of the plot, and about 20 images were taken for each plot. The digital FVC images were processed by using an automatic and shadow-resistant algorithm (SHAR-LABFVC) with an uncertainty of less than 0.025 [24]. In all, 25 sites with 35 measurements captured from 27 June to 13 September 2015 were used for evaluation herein (i.e., some sites were measured multiple times during this period).

Finer-Resolution NDVI
To analyze the uncertainty and feasibility of the VI-based mixture model and V v and V s from MODIS via MultiVI in finer-resolution FVC estimation, three satellite data were getting involved and compared, being: (i) Landsat 8 OLI surface reflectance product [25]; (ii) Chinese GF 1 satellite wide-field-of-view (WFV; 16 m) data; (iii) Chinese ZY 3 satellite multi-spectral camera (MUX; 5.8 m) data. Table 1 lists the temporal information of these satellite data. An atmospheric correction based on the dark object method was applied to GF 1 data [26]. A 10-day mean temporal composition was applied to GF 1 data to remove outliers. The Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) module in the Environment for Visualizing Images (ENVI; Exelis, Inc., Boulder, CO, USA) software was used for the atmospheric correction of ZY 3 data. Both GF 1 and ZY 3 surface reflectance were simply averaged aggregated to 30-meter resolution, which was the same as Landsat 8, and NDVI was calculated for FVC estimation.

Spectral Library
The spectral band ranges and spectral response differences among MODIS, Landsat 8, GF 1, and ZY 3 ( Figure 2) were analyzed before applying the V v and V s from MODIS to the other three finer resolution satellite data. Canopy spectra were simulated based on the threedimensional (3D) radiative transfer (RT) simulation framework, LESS (large-scale remote sensing data and image simulation framework over heterogeneous 3D scenes) [27]. The RT leaf model, PROSPECT-D [28], built-in LESS was used to simulate different leaves' spectra. High, middle, and low levels of dry matter (Cm), chlorophyll (Cab), and anthocyanin (Anth) were considered. Table 2 lists the details of the parameters for PROSPECT-D. Canopy structure was defined by Leaf Area Index (LAI), Leaf inclination Angle Distribution (LAD), and crown shape. For canopy spectra simulation, the scene LAI was set at 6, which represented a very dense vegetation scene. The shape of leaf in all scenes was set as disc; details are listed in Table 3. When simulating the canopy spectra, the effects of soil reflectivity were also considered. The soil spectra were from the global spectral libraries [29,30]. In all 486, canopy spectra were simulated and 4439 soil spectra were used for spectral normalization. the field measurements.

Spectral Library
The spectral band ranges and spectral response differences 8, GF 1, and ZY 3 ( Figure 2) were analyzed before applying the V the other three finer resolution satellite data. Canopy spectra wer three-dimensional (3D) radiative transfer (RT) simulation fram remote sensing data and image simulation framework over h [27]. The RT leaf model, PROSPECT-D [28], built-in LESS was u leaves' spectra. High, middle, and low levels of dry matter (Cm anthocyanin (Anth) were considered. Table 2 lists the details of SPECT-D. Canopy structure was defined by Leaf Area Index (LA Distribution (LAD), and crown shape. For canopy spectra simul set at 6, which represented a very dense vegetation scene. The s was set as disc; details are listed in Table 3. When simulating th fects of soil reflectivity were also considered. The soil spectra wer libraries [29,30]. In all 486, canopy spectra were simulated and 44 for spectral normalization.

V v and V s Downscaling
A 30 m resolution global land cover dataset (GlobeLand 30) in 2010 [31] was used to downscale V v and V s from MODIS in 500-meter resolution. GlobeLand 30 divides the land surface into 10 types, including 6 vegetation types (i.e., cultivated land, forest, grassland, shrubland, wetland, and tundra) and 4 unvegetated types (i.e., water bodies, artificial surfaces, bare land, and permanent snow and ice). Herein, we combined the wetland, water bodies, and permanent snow and ice into 1 type, all named water bodies, since they all have V s below 0. As for MODIS, V s via MultiVI does not consider the water background and it is always greater than 0; this type was considered in the downscaling process, but no FVC estimation was performed. Thus, 8 land cover types (i.e., cultivated land, forest, grassland, shrubland, tundra, artificial surfaces, bare land, and water bodies) were used for V v and V s downscaling.
V v /V s for each MODIS pixel is assumed as the combination of V v /V s of all land cover types in this pixel area. Take the proportion of each land cover type (k) in the MODIS pixel as the weight (f ), 500 m V v /V s can be decomposed to 30-meter resolution according to Equation (2).
where, V v,modis and V s,modis is the V v and V s for a single 500 m MODIS pixel, V v,k and V s,k is the V v and V s for a single 30 m land cover type (k) pixel, m is the number of land cover type in this MODIS pixel area. A 3 × 3 sliding window with 1 MODIS pixel step was used to solve the equation, and the result was set as the solution for all the 30 m land cover type pixels in the center MODIS pixel area. Equation (3) shows an example of how to obtain V v,k in MODIS pixel (x,y) using a sliding window. For obtaining V s,k , it is the same as that of V v,k , except that V v,modis is replaced by V s,modis .

Spectral Normalization
Spectral normalization was applied to V v and V s from MODIS to match spectral settings of Landsat 8, GF 1, and ZY 3. Canopy and soil spectra described in Sec. II.C was transformed into red and near-infrared (NIR) bands reflectance according to the spectral response functions of each satellite sensor ( Figure 2). Normalized coefficients for V v and V s were obtained based on canopy and soil NDVI, respectively. A simple linear model (Equation (4)) was performed on Landsat 8 and MODIS, GF 1 and MODIS, and ZY 3 and MODIS, respectively.
where, i is for Landsat 8, GF 1, and ZY 3; NDVI vegj,i is the NDVI of i calculated from canopy spectrum j; NDVI vegj,modis is the NDVI of MODIS calculated from canopy spectrum j; a v,i and b v,i are normalized coefficients for V v of i. In all 486 canopy spectra were used for a v,i and b v,i estimation; thus, n is 486 in Equation (4). The estimation of the normalized coefficients for V s of i (i.e., a s,i and b s,i ) are similar to V v , except for changing the spectra to the soil. When estimating a s,i and b s,i , n was 4439, which means 4439 soil spectra. V v and V s for Landsat 8, GF 1, and ZY 3 were calculated by applying the normalized coefficients to V v and V s from MODIS.

FVC Production
VI-based mixture model (Equation (5)) was used to estimate consistent FVC products from multiple satellite data. The central latitude/longitude of each pixel for all satellites were used to match V v and V s from MODIS with NDVI from Landsat 8, ZY 3, and GF 1. To analyze the necessity of MODIS V v and V s downscaling for finer-resolution FVC estimation, FVC (FVC_1 in Figure 3) was estimated with 500-meter and 30-meter endmembers, respectively. To assess uncertainty due to spectral band ranges and spectral response function differences from different satellite sensors, FVC was also estimated with endmembers before (FVC_2 in Figure 3) and after (FVC_3 in Figure 3) spectral normalization. The necessity of MODIS V v and V s downscaling, uncertainty of spectral normalization, and consistency of FVC estimation from different satellites were evaluated by comparing with field-measured FVC.
function differences from different satellite sensors, FVC was also estimated with endmembers before (FVC_2 in Figure 3) and after (FVC_3 in Figure 3) spectral normalization. The necessity of MODIS Vv and Vs downscaling, uncertainty of spectral normalization, and consistency of FVC estimation from different satellites were evaluated by comparing with field-measured FVC.

Necessity Analysis for Vv and Vs Downscaling
The necessity of downscaling the endmembers, Vv and Vs, from MODIS was analyzed by comparing the uncertainty of FVC estimation based on the endmembers at 500 m and 30 m, respectively. Figure 4 shows that FVC from Landsat 8 has lower uncertainty with 30 m Vv and Vs (i.e., RMSD for FVC_2 is 0.124) than with 500 m (i.e., RMSD for FVC_1 is 0.134). However, for ZY 3 and GF 1, the uncertainties are not much different. The RMSDs for FVC_1 and FVC_2 from ZY 3 are 0.117 and 0.119, while for FVC_1 and FVC_2 from

Necessity Analysis for V v and V s Downscaling
The necessity of downscaling the endmembers, V v and V s , from MODIS was analyzed by comparing the uncertainty of FVC estimation based on the endmembers at 500 m and 30 m, respectively. Figure 4 shows that FVC from Landsat 8 has lower uncertainty with 30 m V v and V s (i.e., RMSD for FVC_2 is 0.124) than with 500 m (i.e., RMSD for FVC_1 is 0.134). However, for ZY 3 and GF 1, the uncertainties are not much different. The RMSDs for FVC_1 and FVC_2 from ZY 3 are 0.117 and 0.119, while for FVC_1 and FVC_2 from GF 1 are 0.094 and 0.102, respectively. The FVC estimation uncertainty of using finer-resolution V v and V s depends on the positioning accuracy of the input data (i.e., finer-resolution NDVI). Since the land cover dataset (GlobeLand 30) used for downscaling was produced based on Landsat and HJ satellites [31], the 30 m V v and V s have better positioning consistency with FVC from Landsat 8 than ZY 3 and GF 1. As for consistency (R 2 ), FVC_2 for Landsat 8, ZY 3, and GF 1 all have lower consistency than that of FVC_1. The downscaled endmembers increase the heterogeneity of the estimated FVC results. FVC_1 within the same 500-meter resolution pixel only reflects the difference of the 30 m NDVI from Landsat 8, ZY 3, or GF 1. When downscaled endmembers are used, the FVC_2 differences also include differences in land cover types. This shows that the downscaled endmembers have different accuracy under different land cover types, but the overall trend is good (i.e., RMSD decrease).

Uncertainty Analysis for Spectral Normalization
The uncertainty of FVC estimation caused by the spectral difference between finerresolution NDVI from Landsat 8, ZY 3, and GF 1 and Vv and Vs from MODIS were ana-

Uncertainty Analysis for Spectral Normalization
The uncertainty of FVC estimation caused by the spectral difference between finerresolution NDVI from Landsat 8, ZY 3, and GF 1 and V v and V s from MODIS were analyzed by comparing finer-resolution FVC with field measurement (Table 4). We compared situations of no spectral normalization (i.e., original in Table 4), only normalized V s , only normalized V v , and normalized both V v and V s (i.e., normalized all in Table 4), respectively. Results show that all Landsat 8, ZY 3, and GF 1 have slight improvement after spectral normalization (i.e., on average, the RMSD of FVC estimated with normalized V v and V s decreased~2.6% compared with the FVC estimated with the original V v and V s ). Since 30 of the 35 field measurements have FVC > 0.5 (Figure 4), which are more sensitive to V v during production [32], thus only normalized V v seems to have the lowest uncertainty for Landsat 8 and ZY 3. That is to say when using the VI-based mixture model to estimate consistent finer-resolution FVC products from Landsat 8, ZY 3, and GF 1, a simple spectral normalization of V v and V s can be considered. Table 4. Uncertainty (RMSD) analysis for spectral normalization by comparing finer-resolution FVC (FVC_3) with field measurement. Original: no spectral normalization; Normalized V s : only did spectral normalization for V s ; Normalized V v : only did spectral normalization for V v ; Normalized All: did spectral normalization for both V v and V s .

Satellite
Original

Accuracy Analysis for FVC by Comparing with Traditional VI-Based Linear Mixture Model
The accuracy of FVC estimated with the process of FVC production in this study was compared with the traditional VI-based linear mixture model. The downscaled MODIS V v and V s after spectral normalization were used for the process of FVC production in this study (i.e., FVC_3 was used for comparison). V v and V s for the traditional VI-based model were obtained based on the statistic method provided by Zeng et al. [14]. According to Zeng et al. [14], V v is the NDVI value at the 75th percentile of the cumulative distribution histogram for cultivated land, forest, and grassland, and 90th percentile for shrubland and artificial land, while V s is the constant value, 0.05, for all land cover types. Considering that Landsat 8 has the highest recognition among all and is the best data match with the GlobeLand 30 land cover product, Landsat 8 on 31 July 2015 was selected for this comparison. Figure 5 shows the FVC map estimated by the proposed method ( Figure 5a) and the traditional VI-based linear mixture model (Figure 5b), respectively. The spatial distribution of FVC in both Figure 5a,b is very similar, except that the texture of Figure 5a is clearer. The accuracy of Figure 5a,b was checked by using the field measurements around 31 July 2015. The result ( Figure 6) shows that the FVC estimated by the process in this study has less uncertainty (RMSD = 0.110) than the traditional VI-based model (RMSD = 0.149). The lower consistency (R 2 ) is also caused by the heterogeneity of downscaled V v and V s . tion of FVC in both Figure 5a,b is very similar, except that the texture of Figure 5a is clearer. The accuracy of Figure 5a,b was checked by using the field measurements around 31 July 2015. The result ( Figure 6) shows that the FVC estimated by the process in this study has less uncertainty (RMSD = 0.110) than the traditional VI-based model (RMSD = 0.149). The lower consistency (R 2 ) is also caused by the heterogeneity of downscaled Vv and Vs.

Applicability of the Vv and Vs Downscaling Method
In order to analyze the applicability of the land cover-based method used herein, Figure 7 shows the distribution of pixel het area. The number of different land cover types in each 500-meter M to represent the heterogeneity. Herein, most pixels have over tw types but less than six, which makes Equation (3) solvable.

Applicability of the V v and V s Downscaling Method
In order to analyze the applicability of the land cover-based V v and V s downscaling method used herein, Figure 7 shows the distribution of pixel heterogeneity in the SNP area. The number of different land cover types in each 500-meter MODIS pixel was used to represent the heterogeneity. Herein, most pixels have over two kinds of land cover types but less than six, which makes Equation (3) solvable.

Applicability of the Vv and Vs Downscaling Method
In order to analyze the applicability of the land cover-based Vv and Vs downscaling method used herein, Figure 7 shows the distribution of pixel heterogeneity in the SNP area. The number of different land cover types in each 500-meter MODIS pixel was used to represent the heterogeneity. Herein, most pixels have over two kinds of land cover types but less than six, which makes Equation (3) solvable. After grouping the field plots based on Figure 7, the uncertainties of estimated FVC from Landsat 8, ZY 3, and GF 1 due to land cover heterogeneity are shown in Figure 8.
Here, finer-resolution FVC estimated with downscaled Vv and Vs (FVC_2) was used (Figure 4b). The uncertainty was also presented by absolute error (FVC_2-field-measured FVC). The land cover heterogeneity was grouped into three groups: (1) the number of land cover types is less than three but equal to or greater than two (2 ~ 3); (2) the number of land cover types is less than four but equal or greater than three (3 ~ 4); (3) the number of land cover types is equal or greater than (>4). In Figure 8, the red median marks for both Landsat 8 and ZY 3 FVC are all above the dark red dash zero line in all three groups, which means that they both overestimated the FVC. Due to the small image width of ZY 3, areas with high heterogeneity (>4) are not covered. While for GF 1, both the median and mean are below zero, which means that it underestimated the FVC. The differences in the After grouping the field plots based on Figure 7, the uncertainties of estimated FVC from Landsat 8, ZY 3, and GF 1 due to land cover heterogeneity are shown in Figure 8. Here, finer-resolution FVC estimated with downscaled V v and V s (FVC_2) was used (Figure 4b).
The uncertainty was also presented by absolute error (FVC_2-field-measured FVC). The land cover heterogeneity was grouped into three groups: (1) the number of land cover types is less than three but equal to or greater than two (2~3); (2) the number of land cover types is less than four but equal or greater than three (3~4); (3) the number of land cover types is equal or greater than (>4). In Figure 8, the red median marks for both Landsat 8 and ZY 3 FVC are all above the dark red dash zero line in all three groups, which means that they both overestimated the FVC. Due to the small image width of ZY 3, areas with high heterogeneity (>4) are not covered. While for GF 1, both the median and mean are below zero, which means that it underestimated the FVC. The differences in the uncertainty of the estimated FVC hardly change with the heterogeneity (Figure 8), which shows that the surface heterogeneity does not affect the accuracy of the process.  When using a different land cover dataset with a different classification system, the size of the sliding window may need to be adjusted. Furthermore, when it is hard to match finer-resolution Vv, Vs, and finer-resolution NDVI accurately, or land cover product matching the finer-resolution NDVI is not available, particularly in areas that have undergone rapid land cover changes, the coarse-resolution Vv and Vs which represent the aver- When using a different land cover dataset with a different classification system, the size of the sliding window may need to be adjusted. Furthermore, when it is hard to match finerresolution V v , V s , and finer-resolution NDVI accurately, or land cover product matching the finer-resolution NDVI is not available, particularly in areas that have undergone rapid land cover changes, the coarse-resolution V v and V s which represent the average situation of a wide range (herein 500 m × 500 m) and have the closest observation time is better for FVC estimation based on VI-based model. In this study, the positioning accuracy between Landsat 8 and finer-resolution V v and V s was less than one pixel, while ZY 3 and GF 1 were worse based on visual interpretation (i.e., within 2~3 pixels).

Spectral Analysis for Multiple Satellite Sensors
The spectral difference among MODIS, Landsat 8, ZY 3, and GF 1 were compared based on NDVI. Figure 9a shows that the vegetation NDVI (i.e., calculated NDVI based on vegetation spectra described in Sec. II.C) from Landsat 8 and ZY 3 are very similar, while vegetation NDVI from GF 1 looks different. Although ZY 3 has a wide band range, ZY 3 also has a high response in the spectral range, whereas Landsat 8 has a high response, especially in the NIR band ( Figure 2); thus, they have very similar vegetation NDVI. As for soil NDVI (i.e., calculated NDVI based on soil spectra described in Sec. II.C), Landsat 8 soil line has the most similarity with MODIS (i.e., has the slope 0.996 very close to 1), while the slopes for ZY 3 and GF 1 are 0.796 and 0.776, respectively (Figure 9b). The influence of the broadband reflectivity of ZY 3 and GF 1 on NDVI is mostly reflected in the soil NDVI. Different from previous research [19,20], herein, we applied the spectral normalization to Vv and Vs from MODIS to make it match the NDVI from Landsat 8, ZY 3, and GF 1, since Vv and Vs in the VI-based model determine the benchmark and boundary of the FVC estimation [2]. Table 5 lists the original and normalized Vv and Vs. The average Vv for Landsat 8 and ZY 3 are larger than MODIS since the slope of normalized coefficients is less than 1 (Figure 9a). This may be due to the wider high NIR response range of Landsat 8 and ZY 3 than MODIS; the spectral band range (width) with response >0.9 is 854~875 (21) nm for Landsat 8, 775~871 (96) nm for ZY 3, and 847~864 (17) nm for MODIS ( Figure  2). Although GF 1 also has a wide spectral band range, the peak of the NIR spectral response for GF 1 (i.e., 774 nm) is less than others (MODIS: 857 nm, Landsat 8:859 nm, ZY 3: 807 nm; Figure 2). The normalized Vv for GF 1 is less than MODIS. For Vs, all normalized Vs are less than MODIS. There is less difference in the change rate and reflectivity of soil spectra between red and NIR than vegetation. The difference in Vs is mainly caused by the spectral band range.  Different from previous research [19,20], herein, we applied the spectral normalization to V v and V s from MODIS to make it match the NDVI from Landsat 8, ZY 3, and GF 1, since V v and V s in the VI-based model determine the benchmark and boundary of the FVC estimation [2]. Table 5 lists the original and normalized V v and V s . The average V v for Landsat 8 and ZY 3 are larger than MODIS since the slope of normalized coefficients is less than 1 (Figure 9a). This may be due to the wider high NIR response range of Landsat 8 and ZY 3 than MODIS; the spectral band range (width) with response > 0.9 is 854~875 (21) nm for Landsat 8, 775~871 (96) nm for ZY 3, and 847~864 (17) nm for MODIS ( Figure 2). Although GF 1 also has a wide spectral band range, the peak of the NIR spectral response for GF 1 (i.e., 774 nm) is less than others (MODIS: 857 nm, Landsat 8:859 nm, ZY 3: 807 nm; Figure 2). The normalized V v for GF 1 is less than MODIS. For V s , all normalized V s are less than MODIS. There is less difference in the change rate and reflectivity of soil spectra between red and NIR than vegetation. The difference in V s is mainly caused by the spectral band range.

Prospect of FVC Estimation by Joint Using Multiple Satellite Data
FVC from finer-resolution satellite data, such as Landsat series, sentinel 2, Chinese GF series, and so on, has great value for dynamic monitoring of crop, city, and hydrological basins [8,9,33]. However, the missed temporal information can hinder applications of FVC that require near-daily or multi-day imagery, flood response, vegetation phenology identification, and forest disturbance detection [8][9][10]33,34]. Only using the temporally sparse time-series satellite data is, therefore, unsuitable for global monitoring of rapid changes and rapid phenology changes. The FVC estimation process proposed in this study can make full use of multiple finer and even high-resolution satellite data. By joint using multiple satellite data for FVC estimation, it is expected to improve the spatial and temporal resolution and applicability of FVC products. The high-frequency and highspatial-resolution FVC product is meaningful for climate, ecosystem, land degradation, and desertification research [1,2].
In this study, GF 1 (RMSD = 0.099) has the lower uncertainty than Landsat 8 (RMSD = 0.121) and ZY 3 (RMSD = 0.117; Table 4). However, the uncertainty in FVC estimation is influenced not only by the spatial resolution, spectral band range, and spectral response function differences but also by observation geometry. Both Landsat 8 and ZY 3 have nadir view zenith angle (VZA), while GF 1 has a side-swing ability of ≤35 • , so its VZA changes. Based on a study with the SAIL bidirectional canopy reflectance model coupled with the PROSPECT leaf optical properties model (PROSAIL), the effects of changes in the VZA caused by side sway are found to have a greater impact on reflectance and NDVI than that caused by the spectral response function [19]. Thus, the FVC from GF 1 estimated by the process herein has less stability than FVC from Landsat 8 and ZY 3. For FVC estimation, nadir observation is needed, GF 1 observation with large VZA should be excluded in production.

Conclusions
Herein, we provided and analyzed a process of FVC production that is capable of jointly using multiple satellite data. The process is based on the VI-based mixture model with the two key endmembers for pure vegetation (V v ) and bare soil (V s ) from MODIS via the MultiVI method. It is supposed to be able to produce high-frequency and high-temporal FVC products, since multiple finer-resolution satellite data (i.e., Landsat 8, ZY 3, GF 1) can be used to achieve high frequency. The inconsistent spatial resolution between V v and V s from MODIS and finer-resolution satellite data and the difference in spectral band range and spectral response function are analyzed. Results shows that FVC from Landsat 8 (RMSD = 0.121), ZY 3 (RMSD = 0.117), and GF 1 (RMSD = 0.099) has uncertainty~0.11 with downscaled and spectral normalized V v and V s . The necessity of V v and V s downscaling depends on the positioning accuracy of the finer-resolution satellite data. When the positioning accuracy is worse (i.e., greater than one pixel herein), the coarse-resolution V v and V s have less uncertainty during FVC estimation. After spectral normalization, the uncertainty (RMSD) for FVC estimation decreases by~2.6%. Therefore, the VI-based mixture model with V v and V s from MODIS via MultiVI is flexible in producing FVC at finer resolution and shows potential for the generation of high-frequency large-area products.