Using Solar-Induced Chlorophyll Fluorescence Observed by OCO-2 to Predict Autumn Crop Production in China

Abstract: The remote sensing of solar-induced chlorophyll fluorescence (SIF) has attracted considerable attention as a new monitor of vegetation photosynthesis. Previous studies have revealed the close correlation between SIF and terrestrial gross primary productivity (GPP), and have used SIF to estimate vegetation GPP. This study investigated the relationship between the Orbiting Carbon Observatory-2 (OCO-2) SIF products at two retrieval bands (SIF757, SIF771) and the autumn crop production in China during the summer of 2015 on different timescales. Subsequently, we evaluated how well the optimal model developed for 2015 estimated the autumn crop production of 2016. In addition, the OCO-2 SIF was compared with the moderate resolution imaging spectroradiometer (MODIS) vegetation indices (VIs) (normalized difference vegetation index, NDVI; enhanced vegetation index, EVI) for predicting the crop production. All the remotely sensed products exhibited the strongest correlation with autumn crop production in July. The OCO-2 SIF757 estimated autumn crop production best (R² = 0.678, p < 0.01; RMSE = 748.901 ten kilotons; MAE = 567.629 ten kilotons). SIF monitored the crop dynamics better than VIs, although the performances of VIs were similar to SIF. The estimation accuracy was limited by the spatial resolution and discreteness of the OCO-2 SIF products. Our findings demonstrate that SIF is a feasible approach for crop production estimation and is not inferior to VIs, and suggest that accurate autumn crop production forecasts can be obtained with the SIF-based model one to two months before the harvest. Furthermore, the proposed method can be widely applied with the development of satellite-based SIF observation technology.


Introduction
China is an agricultural country with the largest population in the world: it holds only 7% of the earth's cropland resources but needs to feed 22% of the world's population [1]. Precise prediction of crop production is therefore very important for China and for other countries in a similar situation [2,3]. Autumn crops are the food crops planted in spring or summer and harvested in autumn (e.g., middle rice, late rice, corn, sorghum, millet, sweet potato, and soybeans), and they are a main food source in China [4]. Therefore, large-scale and accurate estimation of

OCO-2 SIF Products
We used the OCO-2 SIF data, which are freely available from https://disc.gsfc.nasa.gov and cover September 2014 to the present. The OCO-2 instrument is a three-channel grating spectrometer that records high-resolution spectra of the O2 A-band (757-775 nm) and two other bands. Fraunhofer lines at 758.8 and 770.1 nm can be used for the fluorescence retrieval via the FLD (Fraunhofer Line Discrimination) algorithm. The former is generally referred to as the 757 nm window and the latter as the 771 nm window [24,26,28,36]. OCO-2 therefore provides mid-day SIF retrievals at both 757 and 771 nm. The mid-day SIF products were converted to daily SIF by applying the daily correction factor included in the SIF Lite product. OCO-2 observes in three modes: nadir, glint, and target. It typically alternates between the nadir and glint viewing modes. In the nadir mode, the instrument views the ground directly below the spacecraft. In the glint mode, the instrument tracks the location near which sunlight is directly reflected [29]. The target mode is turned on infrequently, when the satellite passes over ground validation sites; in this mode, a large number of temporally continuous measurements are made at different viewing zenith angles (VZA) [37]. The repeat frequency of the special target observation mode is approximately 16 days. Details of the retrieval, which is based on the IMAP-DOAS (Iterative Maximum A Posteriori-Differential Optical Absorption Spectroscopy) preprocessor and is also efficient in removing low and thick clouds, and of the OCO-2 SIF product can be found in Frankenberg et al. [26,28]. In this study, we used the SIF Lite product (V8r), aggregated as daily files with a spatial resolution of 1.29 × 2.25 km².

MODIS Products
MODIS VIs and land cover type products were used in this study. These data were freely provided from (https://e4ftl01.cr.usgs.gov).
The NDVI and EVI are two commonly used VIs for monitoring vegetation conditions, and both have significant relationships with crop production [38]. We used monthly NDVI and EVI derived from MOD13A3 data, with 1 km spatial resolution. The MOD13A3 algorithm ingests all the 16-day 1 km products that overlap the month and employs a weighted temporal average. The two indices are defined as:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \tag{1}$$

$$\mathrm{EVI} = G \times \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + C_1 \times \mathrm{Red} - C_2 \times \mathrm{Blue} + L} \tag{2}$$

where Red (620-670 nm), NIR (841-876 nm), and Blue (459-479 nm) are the surface bidirectional reflectance factors for MODIS bands 1, 2, and 3, respectively; L is the canopy background adjustment that corrects for the nonlinear, differential transfer of NIR and red radiation through a canopy; C1 and C2 are the coefficients of the aerosol resistance term (which uses the blue band to correct for aerosol influences in the red band); and G is a gain or scaling factor. The coefficients adopted for the MODIS EVI algorithm are L = 1, C1 = 6, C2 = 7.5, and G = 2.5 [39].
The MODIS land cover type product (MCD12Q1) provides annual data that characterize five global land cover classification systems from 2001 to the present, with a spatial resolution of 500 m. The land cover product that we used is based on the IGBP global vegetation classification scheme.
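As an illustration of how the two indices follow from the band reflectances, the sketch below evaluates the standard NDVI and EVI definitions with the MODIS coefficients quoted above. The function names and the reflectance values are hypothetical and chosen only for demonstration; they are not taken from the MOD13A3 product.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from surface reflectances."""
    return (nir - red) / (nir + red)


def evi(nir: float, red: float, blue: float,
        G: float = 2.5, C1: float = 6.0, C2: float = 7.5, L: float = 1.0) -> float:
    """Enhanced vegetation index; defaults are the MODIS coefficients."""
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)


# Hypothetical reflectances for a green canopy (illustrative values only).
nir, red, blue = 0.45, 0.05, 0.03
print("NDVI:", round(ndvi(nir, red), 3))
print("EVI: ", round(evi(nir, red, blue), 3))
```

In practice the monthly MOD13A3 layers already contain both indices, so a calculation like this is only needed when working from the reflectance bands directly.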

Autumn Crop Production Data of China
The autumn crop production data of each province in China are available on the official website of the National Bureau of Statistics (http://data.stats.gov.cn). At the time of this study, the statistics on this website had been updated to 2016. We obtained the autumn crop production data of each province, excluding Hong Kong, Macau, and Taiwan, for 2015 and 2016. Autumn crop production can reach up to 60 million tons in areas with large cropland (e.g., Heilongjiang) and is less than one million tons in areas with small cropland (e.g., Beijing, Hainan, and Shanghai).

Analysis
According to the definition of autumn crops, the final production is most closely related to the summer season. The satellite-derived data from June to August of 2015 and 2016 were extracted as follows. First, we extracted the "SIF 757 nm" and "SIF 771 nm" values (henceforth denoted as SIF757 and SIF771) where the land cover type was cropland (IGBP index 12) and removed the negative values, which may affect the results. The SIF data were converted to a grid map and resampled to 500 m spatial resolution. Subsequently, the VIs data were resampled to the same spatial resolution and masked with the cropland data to obtain the cropland VIs data. The cropland data of 2015 and 2016 were obtained from the MCD12Q1 product of the respective year. Zonal statistics were then computed on these satellite-derived data for the 31 provinces of China at monthly and seasonal scales in ArcGIS 10.1 (ESRI, Redlands, CA, USA) to obtain the mean pixel value and the number of cropland pixels for each province.
The converted value was the mean pixel value multiplied by the number of cropland pixels. These converted values are henceforth denoted as SIF757', SIF771', EVI', and NDVI'. Their relationships with the autumn crop production statistics were evaluated at monthly and seasonal timescales using the following equation:

$$\mathrm{Value}'_i = \overline{\mathrm{Value}}_i \times n_i \tag{3}$$

where $\mathrm{Value}'_i$ and $\overline{\mathrm{Value}}_i$ represent the converted satellite-derived product value and the mean satellite-derived product pixel value for each province, respectively; $n_i$ represents the number of cropland pixels for each province; and $i$, which ranges from 1 to 31, indexes the provinces for which statistical data were available and over which SIF data were acquired.
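The conversion described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ArcGIS workflow used in the study: the function names and the sample values are hypothetical, and the pixel count is assumed to be the provincial cropland pixel count from the land cover mask.

```python
from statistics import fmean


def province_mean(values):
    """Mean of the retrievals in one province, ignoring negative values."""
    valid = [v for v in values if v >= 0]
    if not valid:
        raise ValueError("no valid retrievals for this province")
    return fmean(valid)


def converted_value(mean_value, n_cropland_pixels):
    """Scale the provincial mean pixel value by the cropland pixel count."""
    return mean_value * n_cropland_pixels


# Hypothetical cropland SIF values for one province (one negative retrieval).
sif_values = [0.4, -0.1, 0.6, 0.5]
mean_sif = province_mean(sif_values)
print(converted_value(mean_sif, 12000))
```

Multiplying by the pixel count turns an intensity-like mean into a quantity that scales with cropland area, which is what makes it comparable with provincial production totals.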
To compare the potential of OCO-2 SIF and MODIS VIs in estimating autumn crop production, the correlations between the converted OCO-2 SIF products (SIF757', SIF771'), the converted MODIS VIs (EVI', NDVI'), and the government's autumn crop production statistics were evaluated at monthly and seasonal scales. The SIF data were spatially discrete points, because the viewing modes alternate from orbit to orbit. We calculated the monthly SIF values from the points acquired in June, July, and August, respectively, and the seasonal SIF value from all the points acquired during the summer (June, July, and August). Since the VIs were monthly raster data, we obtained their monthly values directly and calculated the seasonal value as the average of the June, July, and August data.
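The two seasonal aggregation rules differ slightly, which the following sketch makes explicit: SIF points are pooled over the whole summer, whereas the monthly VI rasters are averaged month by month. The dictionary layout, month labels, and numbers are hypothetical.

```python
from statistics import fmean

SUMMER = ("Jun", "Jul", "Aug")


def sif_seasonal_mean(points_by_month):
    """Pool all SIF points from June to August, then average once."""
    pooled = [v for month in SUMMER for v in points_by_month[month]]
    return fmean(pooled)


def vi_seasonal_mean(monthly_values):
    """Average the three monthly VI values with equal weight."""
    return fmean([monthly_values[month] for month in SUMMER])


# Hypothetical inputs: months with more SIF overpasses weigh more when pooled.
sif_points = {"Jun": [0.2, 0.4], "Jul": [0.8], "Aug": [0.6, 0.6, 0.6]}
ndvi_monthly = {"Jun": 0.3, "Jul": 0.8, "Aug": 0.6}
print(sif_seasonal_mean(sif_points))
print(vi_seasonal_mean(ndvi_monthly))
```

Pooling weights each month by its number of overpass points, so a month with sparse coverage contributes less to the seasonal SIF value than it would under an equal-weight monthly average.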
The correlations between SIF757', SIF771', EVI', NDVI', and autumn crop production were evaluated using the coefficient of determination (R²). Root mean square error (RMSE) and mean absolute error (MAE) [6,40] were used to evaluate the performance of crop production estimation. The indicators were calculated using the following equations:

$$R^2 = \frac{\left[\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})(y_i - \bar{y})\right]^2}{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{4}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \tag{5}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{6}$$

where $n$ is the number of provinces used for validation; $\hat{y}_i$ is the SIF757', SIF771', EVI', or NDVI' value (Equation (4)) or the estimated autumn crop production (Equations (5) and (6)); $y_i$ is the government's autumn crop production statistic (ten kilotons); and overbars denote means over the $n$ provinces. All of the statistical analyses were performed using SPSS Statistics 22 (IBM, Chicago, IL, USA), Origin2017 (OriginLab, Northampton, MA, USA), and MATLAB R2017b (MathWorks, Natick, MA, USA). Generally, the optimal result is the one with the maximum R² and the minimum RMSE and MAE.
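For readers who want to reproduce the three indicators outside SPSS or MATLAB, a minimal pure-Python sketch follows. It assumes that R² is the squared Pearson correlation between the two series, which is the usual choice for a simple linear relationship; the function names and sample values are hypothetical.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def rmse(y_obs, y_est):
    """Root mean square error between observed and estimated values."""
    n = len(y_obs)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(y_obs, y_est)) / n)


def mae(y_obs, y_est):
    """Mean absolute error between observed and estimated values."""
    n = len(y_obs)
    return sum(abs(e - o) for o, e in zip(y_obs, y_est)) / n


# Hypothetical production statistics and estimates (ten kilotons).
observed = [100.0, 200.0, 300.0]
estimated = [110.0, 190.0, 330.0]
print(r_squared(estimated, observed), rmse(observed, estimated), mae(observed, estimated))
```

RMSE penalizes large provincial errors more heavily than MAE, so reporting both gives a sense of whether a few provinces dominate the overall error.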
In addition, the relationships between SIF757', SIF771', EVI', NDVI', and the autumn crop production statistics were modeled based on the most relevant month or season data in 2015. For each product, the model with the highest R² among June, July, August, and the whole summer was selected as the most relevant model and was used to estimate the 2016 autumn crop production. The model with the best performance was then analyzed in detail.
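The calibrate-then-apply procedure can be sketched as an ordinary least-squares line fitted to one year and evaluated on the next. The snippet is a simplified stand-in for the statistical software used in the study; the function names and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply a fitted line to new predictor values."""
    return [slope * a + intercept for a in x]


# Hypothetical calibration year: converted SIF values vs. production statistics.
sif_2015 = [1.0, 2.0, 3.0, 4.0]
production_2015 = [520.0, 1010.0, 1490.0, 2020.0]
slope, intercept = fit_line(sif_2015, production_2015)

# Hypothetical validation year: only the predictor is needed to estimate production.
sif_2016 = [1.5, 2.5, 3.5]
print(predict(sif_2016, slope, intercept))
```

Keeping the coefficients fixed between years is what makes the second year an independent test of the model rather than a second fit.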

Relationships of OCO-2 SIF and MODIS VIs with Autumn Crop Production at Monthly and Seasonal Scales
We compared the relationships between SIF757', SIF771', EVI', NDVI', and the government's autumn crop production statistics at monthly and seasonal scales. The results for the 2015 summer show that the converted OCO-2 SIF data and MODIS VIs were strongly correlated with the government's autumn crop production statistics (Figure 2). In general, SIF771' (R² = 0.548-0.716, p < 0.01) was more strongly correlated with the government's autumn crop production statistics than SIF757' (R² = 0.526-0.692, p < 0.01). In addition, EVI' (R² = 0.628-0.664, p < 0.01) and NDVI' (R² = 0.666-0.672, p < 0.01) had equally strong correlations with the government's autumn crop production statistics.
The strongest relationships of SIF757' (R² = 0.692, p < 0.01), SIF771' (R² = 0.716, p < 0.01), EVI' (R² = 0.664, p < 0.01), and NDVI' (R² = 0.672, p < 0.01) with autumn crop production were observed in July (Figure 2b). The mean R² values of SIF757' and SIF771' were 0.643 and 0.662, and those of EVI' and NDVI' were 0.650 and 0.668, respectively. Moreover, during the summer of 2015 the correlation between SIF' and autumn crop production was higher than that of VIs' at both monthly and seasonal scales, except in June (Table 1). In addition, the correlation between VIs' (EVI' and NDVI') and autumn crop production did not vary greatly among months, whereas the correlation between SIF' (SIF757' and SIF771') and autumn crop production varied greatly.
According to Figure 2 and Table 1, SIF' and VIs' in July had the strongest correlation with the autumn crop production statistics in the summer of 2015. Four models can therefore be chosen to estimate autumn crop production (Table 2). The p-values of the fitted models were lower than 0.01, which indicates a significant relationship between SIF757', SIF771', EVI', NDVI', and the government's autumn crop production statistics and supports the predictive capability of these models. Growing evidence suggests that vegetation photosynthesis (gross primary productivity, GPP) and SIF are linearly related [41][42][43][44][45]. Moreover, terrestrial GPP drives the terrestrial food chain [46]. Therefore, we assumed that SIF and production are linearly related and selected the linear model to predict the autumn crop production. The linear models for July (Table 2) were used to estimate autumn crop production in each province in 2016 (Figure 3).
Apart from 2015, 2016 was the only year for which the OCO-2 SIF product, the MODIS VIs products, and the government's autumn crop production statistics could be matched. The robustness of the coefficients derived from these production estimation models was evaluated for each province in 2016 using RMSE and MAE. The scatterplots between the production statistics and estimated productions demonstrated satisfactory results (Figure 3). The autumn crop productions estimated by each model were close to the government's autumn crop production statistics in 2016 (Table 3). Moreover, the performances of SIF757', SIF771', EVI', and NDVI' were comparable. The best performance was observed for the autumn crop productions estimated using SIF757' in July (R² = 0.678, p < 0.01; RMSE = 748.901 ten kilotons; MAE = 567.629 ten kilotons).
The OCO-2 SIF products performed better than the MODIS VIs products when estimating the autumn crop production, and SIF757' performed best. To identify the constraints that affect the estimation accuracy of SIF, we analyzed the SIF757' estimates in detail (Figure 4). We calculated the deviation percentage of the autumn crop production estimated using SIF757' in 2016 and the percentage of cropland that was covered by SIF data in each province. The minimum percentage of cropland covered by SIF data was 3.762% (Anhui) and the maximum was 4.524% (Heilongjiang).
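The two diagnostics used in this analysis can be written down explicitly. The text does not give their formulas, so the sketch below assumes the deviation percentage is the absolute difference between estimated and observed production relative to the observed value, and the coverage percentage is the share of cropland pixels that contain SIF data; the function names and numbers are hypothetical.

```python
def deviation_percent(estimated, observed):
    """Absolute deviation of the estimate, as a percentage of the observation."""
    return 100.0 * abs(estimated - observed) / observed


def coverage_percent(n_sif_pixels, n_cropland_pixels):
    """Share of cropland pixels that contain SIF data, in percent."""
    return 100.0 * n_sif_pixels / n_cropland_pixels


# Hypothetical province: production in ten kilotons, pixel counts from the masks.
print(deviation_percent(1500.0, 1000.0))
print(coverage_percent(40, 1000))
```

Under this definition, a deviation above 100% means that the estimate is more than double the reported production.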
The lower the percentage of cropland covered by SIF data, the higher the deviation percentage of the autumn crop production estimated using SIF757'. Meanwhile, in Beijing, Hainan, Qinghai, Tianjin, Ningxia, and Xinjiang, where the autumn crop production is less than approximately five million tons and the elevation is higher than 1000 m, the deviation percentage of the autumn crop production estimated using SIF757' exceeded 100% (Figure 4a). In addition, in Sichuan, Chongqing, Hunan, Shaanxi, and other provinces with strongly undulating terrain and fragmented cropland, the autumn crop production estimated using SIF757' was also not very accurate (Figure 4).
Production obs (ten kilotons) and production est (ten kilotons) represent production observation and production estimation, respectively.

Potential of OCO-2 SIF and MODIS VIs in Estimating the Autumn Crop Production of China
The main goals of this study were (1) to explore the relationship between the OCO-2 SIF data and the government's autumn crop production statistics and establish a simplified model for estimating autumn crop production using OCO-2 SIF; (2) to compare the performance of SIF757', SIF771', EVI', and NDVI' in estimating the autumn crop production; and (3) to analyze the factors affecting the estimation accuracy. This study is thus a new attempt to use satellite remotely sensed SIF data for crop production estimation.
Crop yield can be estimated using remotely sensed GPP [47]. Additionally, GPP based on GOME-2 SIF has been used to predict crop yield in a previous study [46]. Thus, GPP and crop yield (production) are related. On the other hand, SIF is mainly determined by absorbed photosynthetically active radiation (APAR) [30,48], which is the key link between SIF and GPP [45]. We therefore expect OCO-2 SIF data to be closely related to autumn crop production. The spatial resolution of the MODIS VIs is finer than that of OCO-2 SIF and the OCO-2 SIF product is spatially discrete, which would suggest that VIs have an advantage over SIF. However, the autumn crop production predicted by SIF' was more accurate than that predicted by VIs', with higher R² and lower RMSE and MAE. This indicates that SIF' is more effective than VIs'.
In addition, the R² between SIF' and the crop production statistics was 0.548-0.716 for SIF771' and 0.526-0.692 for SIF757' (Table 1), with mean values of 0.662 and 0.643, respectively (Table 1). SIF771' was thus more closely related to the autumn crop production than SIF757'. However, SIF757' performed better than SIF771' when estimating autumn crop production, which suggests that SIF757 is more sensitive to photosynthesis than SIF771 in this study region. The result is consistent with another study, which showed that SIF757 has a stronger correlation with tower GPP than SIF771 [24]. This could be explained by 771 nm falling farther from the peak emission of the SIF spectrum, so that SIF757 has higher retrieval precision than SIF771 [28,30].
The crop production statistics were highly correlated with EVI/NDVI data during the leaf constant period. In this study, the autumn crop production statistics were more strongly correlated with SIF data in July, which falls within the leaf constant period, than in the other months. This finding can be explained by the strong photosynthesis during the leaf constant period. In addition, according to previous studies, July represents the peak growing season for autumn crops [31], which explains why the strongest correlation between SIF and production occurs in July. SIF757' performed best when estimating the autumn crop production in China (R² = 0.678, p < 0.01; RMSE = 748.901 ten kilotons; MAE = 567.629 ten kilotons). This performance is better than that of satellite Earth Observation (EO) time-series products for estimating wheat, barley, and grain maize production in Europe (approximately R² = 0.583) [49]. Moreover, it is close to that of crop phenology combined with EVI2 (the MODIS two-band Enhanced Vegetation Index) and NDVI for estimating maize and soybean production in America (approximately R² = 0.69-0.70) [16].
In addition, the correlations between VIs' (EVI' and NDVI') and autumn crop production differed little among months, whereas the correlations between SIF' (SIF757' and SIF771') and autumn crop production differed considerably. This result suggests that SIF can monitor crop dynamics better than VIs, which is in good agreement with previous results [50][51][52][53] showing that SIF is able to precisely track the seasonality of photosynthesis, whereas NDVI is insensitive to seasonal changes in photosynthesis. This finding further indicates that it is feasible to estimate autumn crop production using satellite-derived SIF products, and that their performance is better and more reliable than that of VIs.

Limitation and Uncertainty
In this study, we determined that predicting autumn crop production using the OCO-2 SIF product is feasible and not inferior to using MODIS VIs, and that SIF757 performs best. However, some uncertainties and limitations remain. First, the SIF values of each province were obtained by point averaging because the OCO-2 SIF product is spatially discrete. In some provinces, few SIF data points fell on cropland, so the mean pixel value of the SIF product may not represent the SIF value of the entire cropland, which may make the SIF values inaccurate. Figure 4 shows that the higher the percentage of cropland covered by SIF data, the higher the accuracy. This further illustrates that the discreteness of the data influences the estimation accuracy. Second, although the OCO-2 SIF product that we used has the highest spatial resolution available today, it is still relatively coarse. At this resolution, the SIF value of each pixel is a mixture of cropland and other vegetation types, so the SIF values that we used are inaccurate, which leads to an inaccurate estimation. By comparing the spatial distribution of Chinese cropland topography with the deviation percentage of the autumn crop production estimated using SIF757 (Figure 4), we found that the accuracy of autumn crop prediction is higher in low-elevation provinces with large farm sizes and flat cropland (e.g., Anhui, Hebei, and Henan). This is also an effect of the coarse spatial resolution. Moreover, the production of most provinces is between 500 and 4000 ten kilotons, which leads to higher estimation accuracy in these provinces. In addition, the spatial distribution of autumn crops is important for accurately estimating the crop yield. However, such data are still lacking, and it remains hard to distinguish different crop types over a large area using satellite remote sensing. Therefore, this study pooled all the autumn crops together.
In fact, the relationships between SIF and crop production may vary among the different autumn crops due to the influence of the canopy structure, photosynthetic pathways, light energy utilization rate, and C-N carbohydrate conversion capacity, which may constrain the accuracy in estimating autumn crop yields. In addition, the study period (two years) is short because of the limited availability of the SIF product and statistical data, which may lead to inaccurate results. Nevertheless, we believe that the results are significant, since the p-values of the fitted models were lower than 0.01.
The results can potentially be generalized to other regions and crops because the SIF generation process is the same: SIF is generated during photosynthesis regardless of the region or crop, which makes it possible to use SIF to estimate crop production. The growing attention that researchers pay to satellite-derived SIF is promoting its development. New or future products and missions (e.g., GOSIF; Tropospheric Monitoring Instrument, TROPOMI; Fluorescence EXplorer, FLEX) with finer spatial resolution and continuous data could be used in similar studies of crop estimation.
If possible, our future studies will cover a larger region and more crops and will consider more factors.

Conclusions
This study explored the potential of the OCO-2 SIF product in estimating autumn crop production in China. SIF is generated during the photosynthesis process, which leads to a close connection between SIF and crop production. The results showed that OCO-2 SIF757 in July had great potential in estimating autumn crop production. The estimation accuracy was limited by the spatial resolution and discreteness of the satellite-derived SIF product. SIF performed better than VIs in estimating autumn crop production, and SIF could monitor crop dynamics more effectively than VIs. With the development of continuous SIF products, such as GOSIF (0.05 degree) [30], TROPOMI (7 km) [54], and FLEX (300 m) [55], we believe that SIF will provide increasingly accurate crop production estimates and will be widely used in more studies related to terrestrial ecosystems and carbon cycles.