Photosynthetically Active Radiation and Foliage Clumping Improve Satellite-Based NIRv Estimates of Gross Primary Production

: Monitoring gross primary production (GPP) is necessary for quantifying the terrestrial carbon balance. The near-infrared reﬂectance of vegetation (NIRv) has been proven to be a good predictor of GPP. Given that radiation powers photosynthesis, we hypothesized that (i) the addition of photosynthetic photon ﬂux density (PPFD) information to NIRv would improve estimates of GPP and that (ii) a further improvement would be obtained by incorporating the estimates of radiation distribution in the canopy provided by the foliar clumping index (CI). Thus, we used GPP data from FLUXNET sites to test these possible improvements by comparing the performance of a model based solely on NIRv with two other models, one combining NIRv and PPFD and the other combining NIRv, PPFD and the CI of each vegetation cover type. We tested the performance of these models for different types of vegetation cover, at various latitudes and over the different seasons. Our results demonstrate that the addition of daily radiation information and the clumping index for each vegetation cover type to the NIRv improves its ability to estimate GPP. The improvement was related to foliage organization, given that the foliar distribution in the canopy (CI) affects radiation distribution and use and that radiation drives productivity. Evergreen needleleaf forests are the vegetation cover type with the greatest improvement in GPP estimation after the addition of CI information, likely as a result of their greater radiation constraints. Vegetation type was more determinant of the sensitivity to PPFD changes than latitude or seasonality. We advocate for the incorporation of PPFD and CI into NIRv algorithms and GPP models to improve GPP estimates.


Introduction
As one of the principal processes controlling land-atmosphere interaction, gross primary productivity (GPP) is a key component of the terrestrial carbon balance [1,2].Continuous and accurate monitoring of GPP is necessary for quantifying how the dynamics of carbon cycles at regional to global scales respond to climate variability, and for addressing related issues including the terrestrial carbon sink, vegetation dynamics and ecosystem vulnerability [3][4][5], above all in the context of climate change.Forecasted changes in temperature, precipitation and atmospheric CO 2 are likely to lead to changes in terrestrial productivity [6].
The near-infrared reflectance of vegetation (NIRv) has been proposed as a new approach for estimating GPP [7,8].NIRv captures the fraction of NIR reflectance attributable to vegetation and addresses the confounding effects of leaf area, background brightness and the distribution of the photosynthetic capacity with depth in canopies.It has been shown to correlate well with GPP throughout the world's biomes [7][8][9].
The NIRv approach does not take into account radiation and light-use efficiency, since it assumes that plants allocate resources efficiently, so this measure of plants' investment in capturing light should correlate with their capacity to fix CO 2 [7,8].However, radiation plays an important role in linking vegetation greenness to photosynthesis [10] and, despite not usually being the most important limiting factor, radiation does, in fact, constrain vegetation growth over 27% of the Earth's vegetated surface area [11].In the light-use efficiency (LUE) concept developed by Monteith [12], GPP acts as a function of the photosynthetically active radiation (PAR), the fraction of absorbed PAR (fPAR) by plants and the conversion efficiency of absorbed light energy (ε).Recently, NIRVP [13,14], defined as the product of the near-infrared reflectance of vegetation and incoming PAR, has been shown to be closely correlated with GPP on different spatial and temporal scales [13,15,16].The benefits of adding incoming PAR to NIRv are also evident in the work of Jiang et al. [17].
The effect of radiation on GPP varies with latitude and the time of year.In boreal and temperate European forests, radiation and temperature are the main drivers of the latitudinal gradient and interannual patterns in GPP [18,19].Carbon assimilation in many boreal coniferous forest ecosystems has been shown to be light-limited in autumn [20,21].Regarding seasonality, radiation can also affect GPP due to its effect on the phenology of carbon uptake caused by the radiation constraints on photosynthesis, since radiation is a major factor limiting photosynthetic activity during the end of the season [11,22].The carbon uptake period is also sensitive to changes in radiation in evergreen forests [23].
Radiation use is related to canopy structures and levels of foliage organization, which vary between vegetation cover types [24].The clumping index (CI) is a measure of the level of foliage grouping within canopy structures relative to a random distribution [24,25].Leaf organizational structures exist for various ecological reasons, and affect radiation interception and distribution within canopies and, consequently, carbon uptake [26].Neglecting the CI increases the uncertainty of GPP estimates [26,27].In fact, the estimation of GPP from sun-induced fluorescence (SIF) improved after accounting for the canopy structure effects [28].
We aimed to explore how radiation as a modulator of NIRv could improve GPP estimates, especially at high latitudes and in the latter part of the year, given that photosynthetic photon flux density (PPFD) drives the GPP of the green biomass.We also hypothesized that the incorporation of information on foliar clumping would provide more accurate estimates of radiation distribution and, therefore, of its use in the canopy.Using eddy covariance flux measurements to test these hypotheses, we compared the ability of a model based solely on NIRv with the ability of two other models-a combination of NIRv and PPFD, and a combination of NIRv, PPFD and CI-to better estimate GPP in situ and, thus, analyse the need to include the structural information of foliage organization in GPP models.

Carbon Flux
We downloaded meteorological and CO 2 flux data collected by eddy covariance towers from the FLUXNET2015 Dataset Tier 1 (https://fluxnet.org/data/fluxnet2015-dataset/,accessed on 25 February 2023) available from 2000 to 2014.We used the daily mean radiation data estimated from average half-hourly records (SW_IN) and daily values of GPP based on methods of daytime partitioning (GPP_DT_VUT_USTAR50) [29].
accessed on 25 February 2023) available from 2000 to 2014.We used the daily mean radiation data estimated from average half-hourly records (SW_IN) and daily values of GPP based on methods of daytime partitioning (GPP_DT_VUT_USTAR50) [29].

Remotely Sensed Data
We used the MCD43A4 Version 6 dataset obtained from the MODIS instrumentation aboard the Terra and Aqua satellites available from 2000 to 2014 in the different sites.This data set contains daily reflectance at a 500 m spatial resolution for the MODIS bands 1 to 7. The reflectance was corrected for atmospheric effects and was nadir bidirectional reflectance distribution function (BRDF)-adjusted to simulate the observations collected from a nadir view.Data were provided by NASA's LP DAAC portal (https://doi.org/10.5 067/MODIS/MCD43A4.006).
The near-infrared reflectance of the vegetation (NIRv) was calculated as the product of the NIR nadir normalized top-of-the-canopy reflectance (ρNIR) and the normalized difference vegetation index (NDVI) [7]: and ρred and ρNIR are the reflectance values for MODIS band 1 (620-670 nm) and band 2 (841-871 nm), respectively.

Clumping Index (CI)
The canopy CI is defined as a ratio of the effective leaf area index (LAIe) to the leaf area index (LAI) [30], with LAIe being the LAI value that would produce the same indirect gap measurement as observed assuming a simple random distribution of the foliage [31].The values of the CI for each vegetation type were obtained from He et al. [24] and are as follows: ENF = 0.53, DBF = 0.7, EBF = 0.66, GRA = 0.75, OSH = 0.71, WET = 0.67, WSA = 0.76 and MF = 0.69.In addition, we also used the 8 days 500 m clumping index product (LIS-CI-A1) [32,33], interpolated at daily steps to match with the remote sensing and carbon flux data.The method for interpolation was cubic interpolation.Cubic interpolation estimates values between known data points by fitting a third-degree polynomial using the four closest points to the interpolation date.

GPP Models
We trained three linear models using the whole dataset: We then checked the performance of each model after grouping the different sites into three latitudinal ranges-low (30-45 • ), middle (45-53 • ) and high (53-62 • ) latitudes • -given how much the irradiance depends on latitude.
Since radiation seems to mostly affect vegetation productivity in autumn [28], we also compared the performance of the models after adding radiation to NIRv both before and after the summer solstice (21 June).

Statistical Analyses
All data treatment and analyses were conducted using R statistical software (version 3.2.5)[34].The modelling and assessment of the predictive performance of the models was conducted using the k-fold cross-validation approach [35] (k = 10; 10% of the data was left for validation purposes at each validation run) with "lmStepAIC" from the R package caret [36].We used the coefficient of determination (R 2 ), the root mean square error (RMSE), a measure of uncertainty, and the mean error (bias; mean differences between predicted GPP and measured GPP) to measure the accuracy of the estimates of the linear regression models.

Results
The comparison between ground-measured and satellite-estimated GPP using the different models shows that the GPP estimates improved after adding radiation information to the NIRv-GPP model (from R 2 = 0.57, RMSE = 2.51, Akaike information criterion (AIC) = 49,522 to R 2 = 0.62, RMSE = 2.36, AIC = 46,212), and even more so after adding the CI for each biome (R 2 = 0.73, RMSE = 1.97,AIC = 36,622) (Figure 2).The improvement was lower (R 2 = 0.69, RMSE = 2.23, AIC = 37,756) (Figure S1) when using the 8 days 500 m LIS-CI-A1 product in the model instead of using a fixed CI value per biome type from He et al. [24].

Statistical Analyses
All data treatment and analyses were conducted using R statistical software (version 3.2.5)[34].The modelling and assessment of the predictive performance of the models was conducted using the k-fold cross-validation approach [35] (k = 10; 10% of the data was left for validation purposes at each validation run) with "lmStepAIC" from the R package caret [36].We used the coefficient of determination (R 2 ), the root mean square error (RMSE), a measure of uncertainty, and the mean error (bias; mean differences between predicted GPP and measured GPP) to measure the accuracy of the estimates of the linear regression models.

Results
The comparison between ground-measured and satellite-estimated GPP using the different models shows that the GPP estimates improved after adding radiation information to the NIRv-GPP model (from R 2 = 0.57, RMSE = 2.51, Akaike information criterion (AIC) = 49,522 to R 2 = 0.62, RMSE = 2.36, AIC = 46,212), and even more so after adding the CI for each biome (R 2 = 0.73, RMSE = 1.97,AIC = 36,622) (Figure 2).The improvement was lower (R 2 = 0.69, RMSE = 2.23, AIC = 37,756) (Figure S1) when using the 8 days 500 m LIS-CI-A1 product in the model instead of using a fixed CI value per biome type from He et al. [24].The addition of PPFD information to NIRv improved the GPP estimates for all except two sites and for all vegetation cover types except grasslands and woody savanna (Figure 3).This improvement in the estimates after adding PPFD information was significantly correlated with the CI (Figure 3); the higher the foliage organization (lower CI), the greater the improvement in terms of correlation and certainty (higher increase in R 2 and greater decrease in RMSE, respectively).ENF, the vegetation cover type with the lowest CI (i.e., the highest foliage organization), was the cover type that improved the most after adding the PPFD information (Figure 3).The addition of PPFD information to NIRv improved the GPP estimates for all except two sites and for all vegetation cover types except grasslands and woody savanna (Figure 3).This improvement in the estimates after adding PPFD information was significantly correlated with the CI (Figure 3); the higher the foliage organization (lower CI), the greater the improvement in terms of correlation and certainty (higher increase in R 2 and greater decrease in RMSE, respectively).ENF, the vegetation cover type with the lowest CI (i.e., the highest foliage organization), was the cover type that improved the most after adding the PPFD information (Figure 3).
When adding the CI information for each biome to the NIRv + PPFD model, the bias decreased, above all in ENF, for all vegetation cover types except WET (the improvement ranged from −1.97 in ENF to −0.06 in EBF; see Figure 4 and Figure S2).
The GPP estimates improved at all latitudes after adding the radiation information, although bias increased, and no differences were detected between latitudes (Figure S2).After adding CI information, both uncertainty and bias decreased in all latitudinal groups.Of the vegetation cover types, ENF contributed most to the decrease in dispersion in all latitudinal ranges after adding the CI information (Figure S2).
Finally, we checked the performance of each model when the data of evergreen needleleaf forests-the most sensitive vegetation cover type to PPFD and CI-were separated in the three latitudinal ranges.The improvement in the GPP estimates was similar at all latitudinal ranges when PPFD was added to the model.Similarly, the improvements in the evergreen needleleaf GPP estimates when the CI information was added were not latitudinally dependent, and the bias decreased from 2.05 to 0.09 as evaluated over the whole dataset (Figure S3).The predictive ability of the three models fell as the latitude decreased.We found no differences in the improvements to the GPP estimates after adding radiation before and after the summer solstice (Table S1). between the ground-measured GPP and the satellite-estimated GPP after adding PPFD information to the NIRv model (GPPnirppfd vs. GPPnirv) in black for the different studied sites (Table 1) (CI calculated as the average annual value per site from the CI product LIS-CI-A1), and in red for the different vegetation cover types (CI for vegetation cover type as in He et al. [24]): evergreen needleleaf forest (ENF), evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), mixed forest (MF), open shrubland (OSH), grassland (GRA), woody savanna (WSA) and wetland (WET).
When adding the CI information for each biome to the NIRv + PPFD model, the bias decreased, above all in ENF, for all vegetation cover types except WET (the improvement ranged from −1.97 in ENF to −0.06 in EBF; see Figures 4 and S2).The GPP estimates improved at all latitudes after adding the radiation information, although bias increased, and no differences were detected between latitudes (Figure S2).After adding CI information, both uncertainty and bias decreased in all latitudinal groups.Of the vegetation cover types, ENF contributed most to the decrease in dispersion in all latitudinal ranges after adding the CI information (Figure S2).
Finally, we checked the performance of each model when the data of evergreen needleleaf forests-the most sensitive vegetation cover type to PPFD and CI-were separated in the three latitudinal ranges.The improvement in the GPP estimates was similar at all latitudinal ranges when PPFD was added to the model.Similarly, the improvements in the evergreen needleleaf GPP estimates when the CI information was added were not latitudinally dependent, and the bias decreased from 2.05 to 0.09 as evaluated over the whole dataset (Figure S3).The predictive ability of the three models fell as the latitude decreased.We found no differences in the improvements to the GPP estimates after adding radiation before and after the summer solstice (Table S1).

Adding PPFD and CI to NIRv to Remotely Assess GPP
The addition of information on radiation to NIRv improved the GPP estimates for different vegetation cover types.Only in grasslands did radiation not improve the estimates, although this was expected given that canopy development in grasslands occurs in parallel to photosynthetic activity [37].In addition, changes in GPP are often linked to water availability rather than to radiation, since grasslands are often water-constrained [38,39].
The improvement in GPP after adding radiation to NIRv was related to foliage organization.When PPFD was included in the model, the improvement increased with decreasing CI (higher leaf organization).This is due to the light distribution and its use inside the canopy.If leaves are more clumped on shoots, branches and crowns, more light can penetrate the canopy, making it less light-saturated and more sensitive to light changes.The incorporation of a variable of vegetation structure, such as CI, into the NIRv + PPFD algorithm improved the global GPP estimate in terms of uncertainty and bias.NIRv addresses the distribution of the photochemical capacity of the complete canopy

Adding PPFD and CI to NIRv to Remotely Assess GPP
The addition of information on radiation to NIRv improved the GPP estimates for different vegetation cover types.Only in grasslands did radiation not improve the estimates, although this was expected given that canopy development in grasslands occurs in parallel to photosynthetic activity [37].In addition, changes in GPP are often linked to water availability rather than to radiation, since grasslands are often water-constrained [38,39].
The improvement in GPP after adding radiation to NIRv was related to foliage organization.When PPFD was included in the model, the improvement increased with decreasing CI (higher leaf organization).This is due to the light distribution and its use inside the canopy.If leaves are more clumped on shoots, branches and crowns, more light can penetrate the canopy, making it less light-saturated and more sensitive to light changes.The incorporation of a variable of vegetation structure, such as CI, into the NIRv + PPFD algorithm improved the global GPP estimate in terms of uncertainty and bias.NIRv addresses the distribution of the photochemical capacity of the complete canopy since plants allocate their photosynthetic capacity to be able to fully exploit captured sunlight, and so the photochemical capacity of the complete canopy is directly related to the performance of the topmost leaves in the canopy [40].However, the effects of the changes in PPFD differ depending on how the radiation is distributed throughout the canopy as a result of differences in structure and leaf distribution.Leaf structure and distribution also affect the reabsorption intensity and the scattering directions of PPFD within the canopy [41,42] and, therefore, its reflectance and NIRv.Thus, the addition of CI information for each vegetation type corrected for the effects of the PPFD distribution in the canopy on GPP and improved the global GPP estimates.The incorporation of vegetation structure and foliage organization into GPP models should be advantageous given their key role in light use efficiency.

Vegetation Cover Types
The vegetation cover type with the greatest improvement in GPP estimation after the addition of CI information was evergreen needleleaf forests.In this biome, radiation limitation was found to have the greatest effect on productivity (16%) [19].Evergreen ecosystems tend to have greater radiation constraints [11] and, after temperature, radiation is the most limiting climatic variable in this type of forest [43].The greater sensitivity of ENF to PPFD variations could be related to the facts that conifer canopies are particularly effective in distributing light throughout the canopy due to the clumping of needles around stems [44] and that the leaf area index (LAI) in needleleaf canopies is higher than in other vegetation cover types [45,46].The total leaf area supported by a given crown is the most basic structural property affecting the amount of absorbed solar radiation and plant gas exchange [47][48][49], although it is also important to take into account how leaves are distributed.Of all vegetation types, evergreen needleleaf forests have the lowest CI [24].They are highly organized, with structures composed of shoots of needles, branches, whorls, tree crowns and tree stands [50].This organization allows more radiation to penetrate into the canopy and, thus, permits a greater use of light.The structure of these forests increases both canopy photosynthesis-since more light can reach the lower levels of the canopy-and the contribution of non-light-saturated shaded leaves to whole canopy rates, thereby resulting in a higher GPP than the GPP estimated considering only the green biomass.Thus, including CI in the model corrected for this effect, and values for ENF fitted the general model better.
Our study area was limited to northern latitudes, and no tropical areas were included.Radiation also plays an important role in tropical forests since it is the main factor limiting CO 2 uptake during the rainy season.Seasonal and interannual variation in light availability may also limit CO 2 uptake [51].Light may significantly regulate net ecosystem exchange and carbon storage in this biome, and possible changes in the distribution of cloud cover associated with climate change could cause significant future changes in carbon storage [52].Tropical forests account for 32-36% of global terrestrial net primary production [53].Unfortunately, given the frequent cloud cover in tropical areas, it is often difficult to obtain high-quality remote sensing data from tropical biomes in the spectral domain involved in NIRv computation and in the optical spectra in general.
We did not analyse water stress effects on GPP, so caution should be taken when using the corrected NIRv in drought-prone areas.

Latitudinal Ranges
The improvement in the GPP estimates in ENF after the addition of PPFD was independent of latitude (Figures S2 and S3).This suggests that the vegetation type is more determinant in the sensitivity to PPFD changes than latitude, and that PPFD already intrinsically takes into account latitudinal variation.The importance of PPFD in driving GPP changes at middle and high latitudes [19] may, in part, be related to the dominance of ENF at these latitudes.However, the effect of greater radiation limitation with latitude on GPP [11] should also be considered.At high latitudes, radiation limits vegetation productivity, since incoming solar radiation is restricted to the summer season and high cloud cover may greatly reduce the radiation received by the vegetation and, thus, limit vegetation productivity.At middle and low latitudes, a combination of either temperature and radiation or temperature and water availability limits the net primary productivity [43].The addition of radiation information also improved the GPP estimates at low and middle latitudes, where temperature and water can be limiting factors.The fact that the models perform worse with decreasing latitude in the ENF sites (Figures S2 and S3) could be related to the interference of drought or temperature stresses.

Seasonality
The dynamics of carbon uptake can also be sensitive to the seasonality of incoming radiation.Light is a key factor exerting control on photosynthesis in autumn [11,22].Radiation can play an important role in regulating the end of the growing season in temperate and boreal ecosystems [23].Greater autumn radiation in evergreen forests results in a later end date for net carbon uptake and is associated with higher autumn and annual net ecosystem productivity [23].Some studies have shown that carbon assimilation in many boreal coniferous forest ecosystems is light-constrained in autumn [20,21].Greater radiation may enable more net carbon gain in autumn as the days become shorter and radiation is often limited [54].Therefore, we would expect more improvement in the GPP estimates after the addition of radiation in autumn since it is in this season that radiation seems to play a more important role [22].However, we found no differences in the improvements in the GPP estimates after considering radiation before and after the summer solstice (Table S1).In fact, radiation has been found to be limiting both at the start of the season and end of the season in temperate Europe.This result reinforces the value of our model, which uses NIRv, PPFD and CI to remotely assess daily GPP throughout the whole year.

Conclusions
Our results demonstrate that the addition of daily radiation information (photosynthetic photon flux density, PPFD) and the clumping index (CI) for each vegetation cover type to the NIRv algorithm improves its ability to estimate GPP.The improvement is related to foliage organization, given that the foliar distribution in the canopy (CI) affects radiation distribution and use, and radiation drives productivity.Evergreen needleleaf forest is the vegetation cover type with the greatest improvement in GPP estimation after the addition of CI information.Vegetation type is more determinant in the sensitivity to PPFD changes than latitude or seasonality.Our results highlight that foliar organization and tree structure play an important role in productivity.We advocate for the incorporation of PPFD and CI into NIRv algorithms to improve GPP estimates and GPP models.

Figure 1 .
Figure 1.Sites used in this study.

Figure 2 .
Figure 2. Relationship between the measured GPP and the GPP estimated from (i) NIRv, (ii) NIRv + PPFD and (iii) NIRv + PPFD + CI, for the whole dataset.CI used in the model was the CI per vegetation cover type from He et al. [24].Data correspond to daily mean data from 26 sites from 2000 to 2014.The red line corresponds to the linear regression and the black line to the 1:1 line.Different colours identify the vegetation cover types.

Figure 2 .
Figure 2. Relationship between the measured GPP and the GPP estimated from (i) NIRv, (ii) NIRv + PPFD and (iii) NIRv + PPFD + CI, for the whole dataset.CI used in the model was the CI per vegetation cover type from He et al. [24].Data correspond to daily mean data from 26 sites from 2000 to 2014.The red line corresponds to the linear regression and the black line to the 1:1 line.Different colours identify the vegetation cover types.

Figure 3 .
Figure 3. Relationship between the clumping index and the variation in R 2 (∆R 2 ) and RMSE (∆RMSE)between the ground-measured GPP and the satellite-estimated GPP after adding PPFD information to the NIRv model (GPPnirppfd vs. GPPnirv) in black for the different studied sites (Table1) (CI calculated as the average annual value per site from the CI product LIS-CI-A1), and in red for the different vegetation cover types (CI for vegetation cover type as in He et al.[24]): evergreen needleleaf forest (ENF), evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), mixed forest (MF), open shrubland (OSH), grassland (GRA), woody savanna (WSA) and wetland (WET).

Figure 3 .
Figure 3. Relationship between the clumping index and the variation in R 2 (∆R 2 ) and RMSE (∆RMSE)between the ground-measured GPP and the satellite-estimated GPP after adding PPFD information to the NIRv model (GPPnirppfd vs. GPPnirv) in black for the different studied sites (Table1) (CI calculated as the average annual value per site from the CI product LIS-CI-A1), and in red for the different vegetation cover types (CI for vegetation cover type as in He et al.[24]): evergreen needleleaf forest (ENF), evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), mixed forest (MF), open shrubland (OSH), grassland (GRA), woody savanna (WSA) and wetland (WET).

Figure 4 .
Figure 4. Variation in absolute bias between the ground-measured GPP and the satellite-estimated GPP after adding biome CI information to the NIR and PPFD model (GPPnirppfdci vs. GPPnirppfd) for the different vegetation cover types.

Figure 4 .
Figure 4. Variation in absolute bias between the ground-measured GPP and the satellite-estimated GPP after adding biome CI information to the NIR and PPFD model (GPPnirppfdci vs. GPPnirppfd) for the different vegetation cover types.