Effect of the Partitioning of Diffuse and Direct APAR on GPP Estimation

: Accurate estimation of gross primary productivity (GPP) is necessary to better understand the interaction of global terrestrial ecosystems with climate change and human activities. Light use efﬁciency (LUE)-based GPP models are widely used for retrieving several GPP products with various temporal and spatial resolutions. However, most LUE-based models assume a clear-sky condition, and the inﬂuence of diffuse radiation on GPP estimations has not been well considered. In this paper, a diffuse and direct (DDA) absorbed photosynthetically active radiation (APAR)-based method is proposed for better estimation of half-hourly GPP, which partitions APAR under diffuse and direct radiation conditions. Firstly, energy balance residual (EBR) FAPAR, moderate resolution imaging spectroradiometer (MODIS) leaf area index (LAI) (MCD15A2H) and clumping index (CI) products, as well as solar radiation records supplied by FLUXNET2015 were used to calculate diffuse and direct APAR at a half-hourly scale. Then, an eddy covariance-LUE (EC-LUE) model and meteorological observations from FLUXNET2015 data sets were used for obtaining corresponding LUE values. A co-variation relationship between LUE and diffuse fraction was observed, and the LUE was higher under more diffuse radiation conditions. Finally, the DDA-based method was tested using the half-hourly FLUXNET GPP and compared with half-hourly GPP calculated using total APAR (GPP_TA). The results indicated that the half-hourly GPP estimated using the DDA-based method (GPP_DDA) was more accurate, giving higher R 2 values, lower RMSE and RMSE* values (R 2 varied from 0.565 to 0.682, RMSE ranged from 3.219 to 12.405 and RMSE* were within the range of 2.785 to 8.395) than the GPP_TA (R 2 varied from 0.558 to 0.653, RMSE ranged from 3.407 to 13.081 and RMSE* were within the range of 3.321 to 9.625) across FLUXNET sites within different vegetation types. This study explored the effects of partitioning the diffuse and direct APAR on half-hourly GPP estimations, which demonstrates a higher agreement with FLUXNET GPP than total APAR-based GPP.


Introduction
The gross primary productivity (GPP) of vegetation is considered the most critical component of terrestrial ecosystems; it quantifies the conversion efficiency of carbon dioxide (CO 2 ) to vegetation biomass and acts as an indicator of the absorption ability of atmospheric CO 2 . GPP plays a key role in the estimation of the global carbon budget and sustainable development of terrestrial ecosystems [1][2][3]. Therefore, it is necessary to accurately estimate GPP to better understand the status of the terrestrial ecosystem as influenced by atmospheric conditions and human activities [4].
In recent years, several GPP models have been proposed for retrieving global and regional GPP data sets, including the process-based models and the data-driven models. Light use efficiency (LUE)-based models are widely used in the data-driven area, such as CASA [5], MOD17 algorithm [6], Vegetation Photosynthesis Model (VPM) [7,8], and The FLUXNET data set [11,30] is available at http://fluxnet.fluxdata.org/, accessed on 6 November 2021, and it contains several ecosystem fluxes based on the eddy covariance method. The eddy covariance method is currently the standard method used by biometeorologists to measure fluxes of trace gases between ecosystems and the atmosphere. Fluxes are measured by computing the covariance between the vertical velocity and target scalar mixing ratios at each individual node (site). In this paper, data from FLUXNET2015 data sets were selected containing GPP (umol CO 2 /m 2 /s), incident shortwave radiation (SW_IN, W/m 2 ), diffuse incident shortwave radiation (SW_DIF), air temperature (TA, • C), and vapor pressure deficit (VPD, hPa). The FLUXNET GPP was derived from the difference between ecosystem respiration (RECO) and net ecosystem CO 2 change (NEE) measurements. The SW value was used to obtain PAR by multiplying it by a constant (0.48) [31]. Meteorology measurements of TA and VPD were used to calculate the light use efficiency (LUE)-constrained factor based on the EC-LUE model [32]. In total, FLUXNET sites containing diffuse radiation observations representing different vegetation types were selected in this study (Table 1) in order to develop a diffuse and direct APAR-based (DDA) GPP model.

EBR FAPAR Product
The fraction of absorbed photosynthetically active radiation (FAPAR) is a key variable in describing the exchange of energy, mass, and momentum fluxes between the surface and atmosphere in global agricultural and ecological models [33,34]. Most global FAPAR products, such as MCD15A2H, SeaWiFs, MERIS and Copernicus Global Land products [35][36][37][38], correspond to FAPAR under black-sky conditions at the satellite overpass time only. A method based on the energy balance residual (EBR) principle was used to retrieve the white-sky and black-sky FAPAR using the MODIS broadband VIS albedo (white-sky and black-sky) product (MCD43A3), the LAI product (MCD15A2H), and the clumping index (CI) product [39]. The EBR FAPAR product [39] can be downloaded freely from the National Earth System Science Data Center (http://www.geodata.cn/, accessed on 19 December 2021). The EBR FAPAR product provides an eight-day compositing period with a spatial resolution of 500 m. The validation results showed R 2 values of 0.917 and 0.909 and RMSE values of 0.088 and 0.012, respectively, for black-sky and white-sky conditions corresponding to VALERI data sets [40].

LAI Product
The MCD15A2H product from MODIS collection V006 is an eight-day composited data set of the leaf area index (LAI) and the fraction of absorbed photosynthetically active radiation (FAPAR) with a spatial resolution of 500 m [41]. The MODIS LAI/FAPAR algorithm consists of a main look-up table (LUT)-based procedure that exploits the spectral information contained in the MODIS red and NIR bands, and a back-up algorithm that uses the empirical relationships between the NDVI and canopy LAI, and FAPAR [42]. In the main algorithm, the observed and modeled spectral directional reflectance at the red and NIR bands were compared for a suite of canopy structures and soil patterns, and the mean values of LAI and FAPAR were recorded as retrievals for each pixel [42].

Clumping Index Product
Clumping index (CI) is a critical canopy structural parameter characterizing the degree of the leaf grouping within a canopy. The CI value of vegetation canopy can be retrieved utilizing a linear relationship between the CI and the normalized difference between hotspot and dark spot (NDHD) angular index [43]. MODIS BRDF products were obtained using a hotspot-adjusted RossThick-LiSparse Reciprocal (RTLSR) model [44]. CI retrievals from the MODIS bidirectional reflectance distribution function (BRDF) products (MCD43A1 and MCD43A2) based on linear CI-NDHD equations proposed by Chen et al. [43] were proposed by Jiao et al. [44]. The main CI algorithm was designed to retrieve CIs in a closed interval (0.33, 1.00), and a back-up algorithm was designed to reprocess these outlier CI retrievals. Finally, a data set with a temporal resolution of eight days and spatial resolution of 500 m was obtained. The validated data sets indicated higher accuracy of R 2 values of 0.80 and 0.72, RMSE values of 0.07 and 0.12, respectively, for the main algorithm and back-up algorithm [44].

EC-LUE Model
The EC-LUE model is used to calculate the potential LUE, which is reduced by nonoptimal temperature or water stress [32]: where C s , T s and W s are the downward-regulation scalars attenuating the maximum LUE for the respective effects of atmospheric CO 2 concentration, temperature, and moisture of different vegetation types. The three scalars vary between 0 and 1 with smaller values indicating a stronger negative impact. Min denotes the minimum value of T s and W s . The effect of the atmospheric CO 2 concentration on GPP is determined by the following equations [45,46]: where ϕ is the CO 2 compensation point in the absence of dark respiration (ppm), C i is the leaf internal CO 2 concentration, C a is the atmospheric CO 2 concentration, and χ is the ratio of leaf internal to atmospheric CO 2 concentration, which can be estimated as follows [47,48]: where ε is a parameter related to the carbon cost of water, which determines the sensitivity of VPD to χ; K is the Michaelis-Menten coefficient of Rubisco; and η * is the viscosity of water relative to its value at 25 • C [49].
where P 0 is the partial pressure of O 2 , and K c and K 0 are the Michaelis-Menten constants for CO 2 and O 2 [48]:  (8) where T a is the air temperature (unit: K) and R is the molar gas constant (8.314 J/mol/K). T s is estimated based on the equation developed for the Terrestrial Ecosystem Model (TEM) [50]: where T min , T max and T opt are the minimum, maximum, and optimum air temperatures (8 • C) for photosynthetic activity, respectively. If the air temperature falls below T min or increases beyond T max , T s is set to zero. In this study, T min and T max are set to 0 and 40 • C, respectively, while T opt is determined as 20.33 • C using nonlinear optimization [9]. The impact of moisture on photosynthesis (W s ) is defined as follows: VPD 0 is the half-saturation coefficient of the VPD constraint equation (k Pa), which varies with vegetation types (Table 2):

Diffuse and Direct APAR (DDA)-Based Method for Half-Hourly GPP Estimation
The diffuse and direct APAR (DDA)-based model was originally derived from the concept of radiation conversion efficiency [6,52], which further partitions diffuse and direct APAR in the half-hourly GPP estimation: where APAR direct and APAR diffuse are the absorbed photosynthetically active radiation (APAR) under direct and diffuse sky conditions. α is an enhancement factor for correcting the increase of LUE due to the clouds under diffuse sky conditions. Zhang et al. [10] found that α slightly varied, ranging from 1.01 (crop) to 1.29 (wetland) across various vegetation types (Table 3). If the influence of diffuse light on LUE is not considered, half-hourly GPP can be determined using the EC-LUE model directly [53]: where GPP TA is the half-hourly GPP calculated which do not considered the diffuse lights, APAR total is the total APAR which concludes the diffuse and direct APAR. APAR direct and APAR diffuse are partitioned using the EBR FAPAR product [39]: where K d is the diffuse fraction derived from the ratio of FLUXNET2015 half-hourly observations of diffuse incoming shortwave radiation (SW_DIF, W/m 2 ) and incoming shortwave radiation (SW_IN, W/m 2 ), 0.48 is a constant used for transforming the incoming shortwave radiation to PAR [31], FAPAR where FAPAR t 0 direct denotes the instantaneous FAPAR value at 10:30 a.m. with a temporal resolution of one day. A non-linear method of spline interpolation was used for interpolating the eight-day EBR FAPAR [39] to daily scales in order to match the FLUXNET meteorological data at half-hourly scale. τ t i is the canopy directional transmittance at different times within a day per 30 min, and τ t 0 is the transmittance at 10:30 a.m., which can be determined using the gap fraction model [54][55][56]: where k is the leaf extinction coefficient, G(θ) is the projection of unit foliage area on the plane perpendicular to the sun incident direction, LAI is the leaf area index, CI is the clumping index, and θ t i is the solar zenith angle (SZA) at different times within a day per 30 min, and θ t 0 is the SZA value at 10:30 a.m. Here, G(θ) was set using a measured leaf inclination angle (LIA) dataset [57] matched with average leaf inclination angle (ALA) of different leaf inclination angle distribution functions (LADF) [58] for calculating G values towards different vegetation types in the selected FLUXNET sites (Table 4). Then, according to the matched LADF, the G functions can be calculated as follows [59,60]: where θ is solar zenith angle (SZA), θ L is leaf inclination angle, f (θ L ) is the leaf inclination angle distribution function and ϑ = cos −1 (cosθcosθ L ). The leaf extinction coefficient at visible band is dominated by its chlorophyll content [61]. Under natural conditions, the chlorophyll content normally varies from 20 to 80 µg/cm 2 according to the LOPEX'93 [62] and ANGERS [62] datasets. According to the simulations using PROSPECT-5 model [61,63], the leaf extinction coefficient varies from 0.79 to 0.94. However, no leaf chlorophyll content dataset was available for these FLUXNET sites, the leaf extinction coefficient (k) was given by a fixed value of 0.88 (the mean value with a chlorophyll content ranging from 20 and 80 µg/cm 2 ) [39].

Validation Plan for GPP Estimated by the Diffuse and Direct APAR (DDA)-Based Method
Direct validation based on FLUXNET GPP variable (GPP_NT_ VUT_REF, GPP estimated from the nighttime partitioning method) in comparison with the total APAR-based GPP (GPP_TA), which do not consider the diffuse lights was conducted in this paper.
Statistical metrics containing the determination coefficient (R 2 ), root mean square error (RMSE), and a corrected RMSE (RMSE*) were calculated as follows: where GPP cal is the half-hourly GPP retrieved using either the DDA method or the total APAR-based method; GPP FLUX is the half-hourly GPP observations derived from the FLUXNET2015; the over-bars denote the mean values of different GPP items; and n is the number of samples. Constant a and b are the slope and intercept of the linear function of the estimated GPP and observed GPP, a corrected RMSE value (RMSE*) used here is intended for removing the influence of the scale difference between satellite data and in-situ data as well as the systematic error of LUE-based model in order to accurately express the performance of partitioning of diffuse and direct APAR in GPP estimations.
In addition, a t test for the half-hourly GPP_DDA and GPP_TA was applied to evaluate the difference in partitioning the diffuse and direct APAR in the GPP estimation: where x and y are the mean value of samples; s x and s y are the standard deviations of the samples; n and m are number of samples. The t test returns a p-value to denote the probability that the observed test statistic is as extreme as or more extreme than the observed value under the null hypothesis, which represents the discrepancy between different data sets. The significance level was set as 0.05.

Temporal Variations of LUE at FLUXNET Sites
Daily variations of LUE and diffuse fraction were analyzed firstly to determine the relationship between LUE and diffuse radiation. Here, data from FLUXNET2015 and the EBR FAPAR product were used for deriving LUE values at a half-hourly scale. Data within the peak-growth seasons were selected for analysis. Figure 1 shows the variations of LUE and diffuse fraction at different times during a single day. Both LUE and diffuse fraction exhibited high values during sunrise and sunset times while lower values were observed around noontimes, which indicates a co-variation between LUE and diffuse fraction at a daily scale.
In addition, the relationship between LUE and diffuse fraction was examined across the FLUXNET sites within the whole year (Figure 2). For a better analysis, we firstly divided the diffuse fraction into ten levels from the lowest to the highest values (0.1 intervals). The results showed that LUE increased with increasing diffuse fraction for all vegetation types, which indicated the significant enhancement effect of diffuse radiation toward LUE [28,29]. In addition, the relationship between LUE and diffuse fraction was examined across the FLUXNET sites within the whole year (Figure 2). For a better analysis, we firstly divided the diffuse fraction into ten levels from the lowest to the highest values (0.1 intervals). The results showed that LUE increased with increasing diffuse fraction for all vegetation types, which indicated the significant enhancement effect of diffuse radiation toward LUE [28,29].  Figure 3 exhibits half-hourly FLUXNET GPP responses to diffuse and direct APAR at FLUXNET sites. The results in Figure 3 indicated a co-variation between GPP observations and APAR values under both diffuse and direct radiation conditions but with different rates, where GPP exhibited higher rates in diffuse radiation conditions. From the  Figure 3 indicated a co-variation between GPP observations and APAR values under both diffuse and direct radiation conditions but with different rates, where GPP exhibited higher rates in diffuse radiation conditions. From the statistical results in Table 5, the slopes of the linear functions are extremely higher under diffuse radiation conditions (shown as magenta points, slopes varied from 0.097 to 0.266) than direct radiation conditions (shown as blue points, slopes varied from 0.032 to 0.221), which approximately three times the difference. Although the enhancement of R 2 between the diffuse APAR and GPP than direct APAR was relatively small, obvious increase of slope values in the linear fitting functions (the variation coefficient of slope is up to 203.125% in the EBF site) can totally indicate a stronger response to GPP under diffuse radiation conditions. This was consistent with the results found by Zhou et al. [64]. Moreover, halfhourly GPP values were higher under diffuse light conditions than direct light conditions for the same amount of radiation, which further manifested the enhancement effect of LUE under diffuse radiation conditions [26,65].

Half-Hourly GPP Estimation by Diffuse and Direct APAR (DDA)-Based Method
In order to explore the effects of partitioning the diffuse and direct APAR on the half-hourly GPP estimation, site GPPs were collected from the FLUXNET2015 data sets. Meanwhile, the total APAR-based method was used as a comparison method. Figure 4 shows the validation of half-hourly GPP estimated by the diffuse and direct APAR-based model (GPP_DDA) and the total APAR-based GPP (GPP_TA) against the FLUXNET GPP (observed GPP) in six sites covering different vegetation types during the entire year. The results showed that GPP estimated using the DDA method presented higher R 2 and lower RMSE values (R 2 varied from 0.565 to 0.682, RMSE values were within the range of 3.219 to 12.405) against observed GPP than GPP_TA (R 2 varied from 0.558 to 0.653, RMSE values ranged from 3.407 to 13.081), which indicated that the partition of diffuse and direct APAR in the GPP estimation can reduce the uncertainties when using big leaf models. In addition, a corrected RMSE value (see Section 2.7) was calculated for removing the scale difference between the satellite data and in-situ data as well as the systematic error of the big leafbased model. Table 6 summarized the statistic metrics of DDA-based GPP and TA-based GPP with the observed GPP, concluding R 2 , RMSE, RMSE* and the variation coefficient of RMSE* values. From the result of the RMSE* values, obvious decrease of RMSE* values of the DDA-based GPP can be found in all vegetation types (the variation coefficient of RMSE* is varied between −11.224% and −23.015%). Moreover, a t test between these two models was used to indicate the difference between the DDA method and the total APAR-based method. The p-values were all less than 0.05 under a significance level of 0.05, which showed a significant difference in partitioning diffuse and direct APAR in the GPP estimation compared to the total APAR-based method. Besides, for a more intuitive validation of the DDA-based method, different levels of diffuse fraction are distinguished for calculating RMSE values between the DDA-based GPP and the observed GPP. Similarly to Figure 2, diffuse fraction was firstly divided in to ten groups, ranging from 0 to 1 with 0.1 intervals and the levels of diffuse fraction corresponds to 1 to 10 accordingly. Next, RMSE values between the DDA-based GPP and the observed GPP were calculated under different levels. Figure 5 shows the RMSE variation under different levels of diffuse fraction in different vegetation types over six FLUXNET sites. A significant decline of the RMSE values can be noticed with the increase of diffuse fraction owning averaged R 2 values of 0.659, which indicated that the DDA-based method can well improve the accuracy of GPP estimation due to the influence of diffuse radiation and confirm the necessity of partitioning diffuse APAR on GPP estimations.

Limitations of the Big Leaf Models
In Figure 4, higher R 2 and lower RMSE values for the diffuse and direct APAR-based GPP compared to the FLUXNET GPP can be noticed, indicating that the partition of diffuse radiation in the GPP estimation is quite necessary. However, the enhancement of model performance was quite limited. The reason may be that the proposed diffuse and direct APAR-based (DDA) method was essentially a big leaf model, which treats the whole canopy as a big uniform leaf and does not separate the leaves into shaded and sunlit parts. Therefore, these simplifications may introduce systematic errors into the calculation of GPP [66][67][68]. Such big leaf model limitations might have significantly limited the effects of the partitioning by diffuse and direct APAR on the GPP estimation.

Uncertainties from LUE Simplifications
Due to of the lack of LUE observations under both direct and diffuse radiation conditions, LUE calculated from the EC-LUE model was assumed as LUE under clear-sky conditions, and obtained LUE values under diffuse radiation conditions afterwards multiplied the enhancement factor (α) [69]. This may certainly distort GPP estimations, but the point of this paper was to explore the effects of partitioning the direct and diffuse APAR in GPP estimations using big leaf models. Therefore, such simplification was reasonable here.

Scale Difference between Remote Sensing Data and In Situ Data
For model validation and comparison, FLUXNET GPP was used in this paper. From the scatter plots in Figure 4, scatters below the 1:1 line can be observed for DBF, SAV, GRA, and CRO sites, which indicates underestimations of both the direct and diffuse radiation (DDA) method and total APAR-based method. This may be a scale difference between the in situ data and remote sensing data. On the one hand, the in situ GPP data observed by FLUXNET was influenced by multiple factors, such as climate conditions (temperature, moisture, winds, etc.) and atmosphere conditions (clouds, aerosols, particles, etc.). On the other hand, the remote sensing data used in this paper included FAPAR and clumping index products, which represent the pixel information with a range of 500 m × 500 m, while the in situ GPP data represent a single point. Furthermore, the vegetation types at FLUXNET sites were not actually totally homogenous, and the corresponding remote sensing pixels may not have represented the real value of the actual vegetation types. Therefore, the inconsistency in the in situ data and remote sensing data introduced uncertainties into model performance and validation results.

Uncertainties of the Remote Sensing Data
LAI, FAPAR, and CI satellite data were input to calculate the half-hourly FAPAR values across the FLUXNET sites in this paper. The quality of these inputs determined the accuracy of the GPP estimations. The MODIS LAI product (MCD15A2H), EBR FAPAR product, and CI product were used rather than measured data to drive the DDA model and the total APAR-based model. Such an application might have introduced uncertainties into the GPP estimations. In order to match the temporal scale of observations across the FLUXNET sites, a linear interpolation method was applied firstly for the EBR FAPAR product to obtain daily FAPAR and then to achieve half-hourly FAPAR records. Such an interpolation method might have introduced errors into the inputs of the GPP models. Additionally, LAI and CI data were used at temporal resolutions of eight days and a month, which might also have introduced errors into the GPP calculation at a half-hourly scale. In the future, high temporal and spatial resolution FAPAR, LAI and CI data should be used to constrain errors as well as for model construction and validation.

Contribution of Diffuse Radiation to GPP Estimations
The process of photosynthesis can be influenced by multiple environmental factors such as radiative conditions, vapor pressure deficit (VPD), and temperature [24,70]. The enhancement effect of LUE due to the diffuse radiation has been widely reported [24,28,29]. An approximate 110% enhancement of LUE under cloudy-sky days over clear-sky days was found by Choudhury [29]. Here, enhancement factors for different vegetation types derived from FLUXNET data by Zhang et al. [10] were used to correct LUE under diffuse radiation conditions.
Meanwhile, environmental factors such as VPD and major volcanic eruptions are also related to the diffuse radiation conditions [24,70]. Overall, the combined factors can enhance the photosynthesis under diffuse radiation conditions. Therefore, the partition of diffuse and direct radiation in GPP estimations is quite necessary.

Conclusions
In this paper, a diffuse and direct APAR-based (DDA) method is proposed to explore the effects of partitioning diffuse and direct APAR on half-hourly GPP estimations. The main conclusions are as follows: (1) LUE increased with increasing diffuse fraction for all vegetation types, which showed the enhancement effect of diffuse radiation on LUE. Moreover, a diurnal co-variation of LUE and diffuse fraction was observed for all vegetation types at FLUXNET sites, which further demonstrated the necessity of partitioning diffuse and direct APAR in half-hourly GPP estimations. (2) Half-hourly GPP increased with increasing diffuse and direct APAR but at different rates. Half-hourly GPP presented higher growth rates under diffuse conditions, with obvious increase of slope values (the variation coefficient of slope is up to 203.125%), which indicated the significant contribution of diffuse radiation to the process of vegetation photosynthesis. (3) Half-hourly GPP estimated using the DDA-based method showed higher R 2 , lower RMSE and RMSE* values (R 2 varied from 0.565 to 0.682, RMSE ranged from 3.219 to 12.405 and RMSE* were within the range of 2.785 to 8.395) against the FLUXENET GPP than the GPP_TA (R 2 varied from 0.558 to 0.653, RMSE ranged from 3.407 to 13.081 and RMSE* were within the range of 3.321 to 9.625), which suggested a better performance by partitioning the diffuse and direct APAR in half-hourly GPP estimations when using big leaf models.