Research on Improving the Accuracy of SIF Data in Estimating Gross Primary Productivity in Arid Regions

Coupling solar-induced chlorophyll fluorescence (SIF) with gross primary productivity (GPP) for ecological function integration research presents numerous uncertainties, especially in ecologically fragile and climate-sensitive arid regions. Therefore, evaluating the suitability of SIF data for estimating GPP and the feasibility of improving its accuracy in the northern region of Xinjiang is of profound significance for revealing the spatial distribution patterns of GPP and the strong coupling relationship between GPP and SIF in arid regions, and for achieving the goal of “carbon neutrality” in arid regions. This study is based on multisource SIF satellite data and GPP observation data from sites in three typical ecosystems (cultivated and farmlands, pasture grasslands, and desert vegetation). Two precision improvement methods (canopy and linear) are used to couple multiple indicators to determine the suitability of multisource SIF data for GPP estimation.


Introduction
Gross primary productivity (GPP) is the comprehensive product of vegetation fixing CO2 through photosynthesis [1]. As an important factor in terrestrial carbon cycle research, it plays a crucial role in the biosphere [2]. Under the goal of "carbon neutrality", precise measurement and estimation of GPP is not only of profound significance for understanding the mechanisms of the carbon cycle [3,4] but also plays an important role in determining the comprehensive response of CO2 to climate change. Currently, the main methods for measuring and estimating GPP include model simulations, ground-based observations, and satellite-based remote sensing. Model simulations primarily refer to methods based on light use efficiency (LUE) models [5,6]. Owing to differences in underlying surfaces, variations in vegetation structure, and the comprehensive impact of climate change, the structure and parameters of LUE models are difficult to construct and quantify accurately, which limits this method. Ground-based observations primarily refer to the eddy covariance (EC) measurement technique [7]. The EC method measures the net ecosystem exchange (NEE) through flux towers [8] and then decomposes NEE into GPP and ecosystem respiration (Reco) [9]. Because EC measurements cover a limited area, they must be upscaled for large-scale GPP estimation [10,11]. The upscaled GPP also has certain limitations, such as incomplete driving factors, excessive parameters, and constraints from the original EC flux towers. Therefore, to achieve precise measurement and estimation of GPP and to promote the realization of the "carbon neutrality" goal, it is necessary to further explore other GPP measurement or alternative methods.
Chlorophyll fluorescence is a light signal produced when chlorophyll molecules move from a low-energy state to a high-energy state and then return to the low-energy state. The wavelength of this fluorescence is approximately in the range of 650-800 nm, and it is also referred to as solar-induced chlorophyll fluorescence (SIF). SIF encompasses the spectra contributed by both Photosystem I and Photosystem II. Compared with traditional vegetation indices, SIF can better reveal the dynamic changes and carbon cycling processes of GPP. SIF has become one of the hot research topics in the field of vegetation remote sensing [12]. For example, Gao et al. [13] conducted linear and spatiotemporal analyses using real-time tower-measured GPP from FLUXNET in 2015 and two types of SIF (CSIF and GOSIF), demonstrating that both showed a positive correlation across different spatiotemporal scales. Wang et al. [14] applied SIF and GPP to characterize the spatiotemporal features and regulatory factors of terrestrial ecosystems in China from 2007 to 2018. They compared the spatiotemporal consistency of SIF and GPP between arid and nonarid regions by analyzing different climatic zones. Qiu et al. [15] characterized the response changes of SIF and GPP to drought processes under arid conditions. Wei et al. [16] coupled SIF and GPP satellite products and applied multiple indicators to demonstrate the reduced lag effect of GPP in grasslands in arid regions. Liu et al. [17] demonstrated the advantages of using SIF and GPP data with multiple indicators to represent the water storage sensitivity of desert vegetation in arid and semi-arid regions. Wang et al. [18] integrated remote sensing data with multiple indicators and showed that SIF is more effective in representing the GPP of various systems in arid areas than the normalized difference vegetation index and vegetation optical depth.
Given the increasingly important role that SIF satellite products play in GPP estimation, many scholars have begun using models, machine learning, and other methods to generate SIF satellite products with different spatiotemporal characteristics from different platforms. The satellite platforms mainly include Orbiting Carbon Observatory-2 (OCO_2), Sentinel-5P, MODIS, and GOME-2, and the products mainly include CSIF [19,20], RTSIF [21,22], GOSIF [23], SIF_OCO2_005 [24], and others. With the generation of various SIF satellite products, the comparative analysis of the applicability of different products for GPP estimation in different ecosystems has become a new research topic. However, existing studies mostly focus on single ecosystems. For instance, An et al. [25] assessed the consistency of five SIF satellite products for rubber plantation ecosystems on Hainan Island. Dang et al. [26] verified the feasibility of using SIF instead of GPP to explore the mechanisms affecting humid, arid, and semi-arid regions. Furthermore, because SIF itself only accounts for 2-3% of solar radiation [27], it is necessary to reduce spatial resolution to eliminate certain noise during measurement, resulting in a relatively low spatial resolution of existing SIF satellite products. For instance, the GOME-2 sensors mounted on the MetOp-A/B satellites can only obtain monthly scale SIF data with a resolution of 0.5° [28]. Therefore, although SIF can serve as an effective proxy for vegetation photosynthesis and be used for GPP estimation, the overall applicability of multisource SIF satellite products in GPP estimation across different regions and ecosystems requires further comparative validation. The feasibility of improving the accuracy of SIF satellite products (considering various influencing factors, such as different underlying surfaces) and achieving effective GPP estimation requires further comparative research, especially in typical ecosystems in arid regions of China.
The Xinjiang Uygur Autonomous Region is located in the inland northwest of China, far from the ocean, with extremely low precipitation, making it part of China's arid region. It encompasses typical ecosystems such as pasture grasslands, cultivated and farmlands, and coniferous forests, as well as fragile ecosystems like desert vegetation, natural sand dunes, and desert-oasis transition zones, making it exceptionally sensitive to climate change on both a national and global scale. For this typical arid region, which SIF satellite product is most suitable for estimating GPP? Which improvement method can effectively enhance the accuracy of SIF satellite products in this region? Do the improved SIF satellite products conform to the comprehensive variation characteristics of GPP in arid regions? Resolving this series of questions is crucial for achieving the "carbon neutrality" goal in arid regions. Unfortunately, these questions have not yet received clear answers.
Given this, this study takes the northern region of Xinjiang as the research area and utilizes indirectly observed GPP data from three typical ecosystems in this region (cultivated and farmlands, pasture grasslands, and desert vegetation). Spatial characteristics, linear regression parameters, GPP sensitivity to influencing factors, and GPP/SIF values under different weather conditions are selected as evaluation criteria. These criteria are used to comprehensively assess the overall applicability of four continuously updated SIF satellite products (CSIF, RTSIF, GOSIF, SIF_OCO2_005) for GPP estimation in typical arid ecosystems. Subsequently, the feasibility of employing canopy and linear accuracy improvement methods to enhance the accuracy of the most suitable SIF satellite product for GPP estimation is verified. Finally, the changes in spatial characteristics of SIF data before and after improvement are revealed, indirectly reflecting the spatial distribution patterns of GPP in arid regions. These studies will provide reliable empirical evidence for the applicability and accuracy improvement of SIF satellite products for GPP estimation in arid regions. Moreover, they fill the gap in the coupling research of SIF satellite products and GPP in this localized arid region of northern Xinjiang, laying a theoretical foundation for further research on GPP influencing mechanisms in Xinjiang and the entire arid region.

Study Area
The Xinjiang Uygur Autonomous Region (35 1a) is located in northwest China. Its northern end lies in the Altay Mountains, its southern end lies in the Kunlun Mountains, and the Tianshan Mountains run through its central area. It covers a total area of 1.6649 million km² [29,30] and is part of China's arid region. Because the region is far from the ocean and surrounded by mountains, it has large temperature differences and is dry with little rain. The annual average temperature ranges from about −4.9 to 14.9 °C [31]. The rivers and lakes in the region (such as the Tarim River and Abi Lake) are mainly fed by glacial meltwater. Owing to its unique geographical location, the region has diverse land surfaces, including desert vegetation, pasture grasslands, cultivated and farmlands, coniferous forests, and other underlying surfaces, and its ecosystems are highly typical.
This study focuses on the northern region of Xinjiang as the research area. Three typical ecosystems in this region (cultivated and farmlands, pasture grasslands, and desert vegetation) are selected for the analysis of the coupling between SIF satellite products and GPP in arid regions.

Site Data
The in situ observation data for this study were obtained from the Land-Atmospheric Interaction Observation Stations constructed by the Institute of Desert Meteorology, China Meteorological Administration, Urumqi. There were three stations (Table 1), each equipped with EC systems, radiation observation systems, and gradient tower systems. The EC system consisted of an open-path CO2/H2O infrared gas analyzer (LI7500, Li-Cor, Lincoln, NE, USA) and an ultrasonic three-dimensional anemometer (CSAT3, Campbell Scientific, Logan, UT, USA). The ultrasonic anemometer can accurately measure pulsating wind speed and acoustic virtual temperature in three different directions, with measurement accuracies of ±4.0 cm/s and ±2.0 cm/s. The data acquisition frequency is 10 Hz/20 Hz and the data output interval is 30 min.
The open-path LI7500 infrared gas analyzer provides accurate measurements of CO2 concentration and water vapor density in the atmosphere, with measurement accuracies of ±0.01 mmol/mol and ±0.15 mmol/mol. The data output interval is also 30 min.
The data used in this study were continuous observation data from 2020, with observation times synchronized to the local time. All measured data were output through a data collector (CR3000, Campbell Scientific, Logan, UT, USA) at time intervals of 10 s, 1 min, 30 min, and 1 h. The acquisition frequency of the radiation and gradient observation systems was 1 Hz, whereas that of the EC system was 10 Hz.

Satellite Data
By comprehensively comparing the SIF satellite products that are currently the most widely used and have been proven to have good applicability [13,16–18,25], we preliminarily selected the SIF satellite products generated from the two most widely used satellite platforms (OCO_2 and Sentinel-5P). These SIF satellite products have complete temporal coverage, suitable temporal/spatial resolutions, and a coverage range that matches the research area and time period. The selection includes four satellite products: CSIF [19,20], RTSIF [21,22], GOSIF [23], and SIF_OCO2_005 [24] (Table 2).
The leaf area index (LAI) data for this study were obtained from the HIQ-LAI satellite product data set on the Google Earth Engine platform. This data set was created by Yan et al. [37] using the spatiotemporal informative component analysis (STICA) algorithm, which reanalyzed nearly 22 years of MODIS C6.1 LAI products. The data set has an 8-day temporal scale, and each year includes 46 TIFF format files at a resolution of 500 m.
The land use data for this study were sourced from the Chinese Academy of Sciences Resource and Environment Science Data Registration and Publishing System. This data set was produced by Xu et al. [38] through manual visual interpretation of Landsat 8 remote sensing images. The data set comprises 25 secondary types and offers raster data at resolutions of 1000 m, 100 m, and 30 m; this study used the data at a resolution of 30 m.
This study used the 2020 observation data from the aforementioned satellite products.

Other Auxiliary Data
To improve the accuracy of the SIF data based on the canopy method, various auxiliary parameter data sets were applied concurrently (Table 3).
The final data collection frequency for EC used in this study is 10 Hz, with an output interval of 30 min. The data collection frequency for radiation and gradient observations is 1 Hz, with an output interval of 30 min. All original flux observation data are initially in the TOB1 format, which can be converted to the TOB3/5 format for preliminary operations by the LoggerNet 4.0 software (Campbell Scientific, Logan, UT, USA). Subsequently, the EddyPro 7.1 software was employed for data processing, including outlier removal [45], time lag correction [46], coordinate sequence rotation [47], frequency response correction [48], and sonic virtual temperature and density corrections [49], thereby obtaining flux data with a time step of half an hour.
Owing to uncontrollable factors such as instrument damage and abrupt weather changes, the preliminarily processed flux data may still contain missing or discontinuous records. Therefore, the half-hourly flux data were further gap-filled using the Max Planck online interpolation tool, yielding high-quality, continuous half-hourly flux data. (The tool was developed by the Max Planck Institute for Biogeochemistry and can be obtained at: https://www.bgc-jena.mpg.de/bgi/index.pHp/Services/REddyProcWeb, accessed on 18 September 2023).
In the gap-filled data, the CO2 flux represents NEE, while GPP and Reco [50] need to be calculated using nighttime and daytime data-splitting methods [7,51]. The calculation formula is as follows:

$$\mathrm{GPP} = R_{\mathrm{eco}} - \mathrm{NEE}$$

Because SIF is a product of photosynthesis and is therefore emitted only in daylight, the nighttime values are removed from the half-hourly GPP flux data based on local sunshine time, and the remaining values are converted into an 8-day average GPP data set. The final unit of this data set is gC/m²/day. (Sunshine times can be obtained at: https://richurimo.bmcx.com/xinjiangweiwuerzizhiqu__time__2020_02__richurimo/, accessed on 18 September 2023).
The unit conversion formula is as follows:
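As an illustration of such a conversion, the following minimal sketch assumes that the half-hourly GPP values are expressed in µmol CO2 m⁻² s⁻¹ (a common EC output unit that is not stated explicitly above) and that nighttime records have already been removed. The function names and the integration over daytime half-hours are assumptions made for this example, not the documented procedure of this study.

```python
MOLAR_MASS_C = 12.011          # g of carbon per mol of CO2
SECONDS_PER_HALF_HOUR = 1800   # length of one flux averaging interval

def daily_gpp_gc(half_hourly_umol):
    """Integrate daytime half-hourly GPP (umol CO2 m-2 s-1) to gC m-2 day-1."""
    # umol m-2 s-1 * s -> umol m-2; * 1e-6 -> mol m-2; * 12.011 -> gC m-2
    return sum(f * SECONDS_PER_HALF_HOUR * 1e-6 * MOLAR_MASS_C
               for f in half_hourly_umol)

def eight_day_means(daily_values):
    """Average daily GPP over consecutive 8-day windows.

    The last window may contain fewer than 8 days.
    """
    means = []
    for start in range(0, len(daily_values), 8):
        window = daily_values[start:start + 8]
        means.append(sum(window) / len(window))
    return means

if __name__ == "__main__":
    # Hypothetical day with 24 daytime half-hours at a constant 10 umol m-2 s-1
    print(daily_gpp_gc([10.0] * 24))
    print(eight_day_means([1.0] * 8 + [3.0] * 4))
```

Integrating each half-hour separately, rather than scaling a daily mean, keeps the result correct when the number of daytime records changes with day length.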

Satellite Data Processing
Because the satellite products differ in temporal cycles, temporal/spatial resolutions, coverage, and storage formats, comprehensive processing is required. Firstly, the iterator tool of the ArcGIS 10.8 software was used for batch format conversion to create TIFF format satellite product data sets. Secondly, the same iterator tool was used for mask clipping to obtain coverage consistent with the study area. Subsequently, the ArcGIS 10.8 software was used for resampling to obtain a 2020 satellite product data set with a consistent spatial resolution (0.05°) in the study area. Finally, the ArcPy 10.8 tool and the MATLAB R2022a software were used for batch extraction of the SIF satellite product raster attribute values corresponding to each site. We consider a SIF satellite product suitable for arid regions if the spatial attributes of SIF in each ecosystem are reasonable and its annual averages are close to the measured GPP.
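The site-extraction step can be illustrated with a minimal pure-Python stand-in for the ArcPy/MATLAB workflow. The sketch assumes a north-up raster stored as a list of rows whose upper-left corner is at (`lon_min`, `lat_max`), with the 0.05° cell size used after resampling. The function names, the toy grid, and the site coordinates are hypothetical.

```python
import math

def cell_index(lon, lat, lon_min, lat_max, cell_size=0.05):
    """Return the (row, col) of the grid cell that contains a site."""
    col = math.floor((lon - lon_min) / cell_size)
    row = math.floor((lat_max - lat) / cell_size)  # rows run north to south
    return row, col

def extract_site_value(grid, lon, lat, lon_min, lat_max, cell_size=0.05):
    """Read the raster value at a site, failing loudly outside the extent."""
    row, col = cell_index(lon, lat, lon_min, lat_max, cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("site lies outside the raster extent")
    return grid[row][col]

if __name__ == "__main__":
    # Hypothetical 2 x 2 SIF grid whose upper-left corner is at 80.0 E, 50.0 N
    sif_grid = [[0.10, 0.26],
                [0.03, 0.12]]
    print(extract_site_value(sif_grid, 80.07, 49.98, 80.0, 50.0))
```

Reading the containing cell (nearest-neighbour extraction) avoids interpolating across ecosystem boundaries, which matters when a site sits close to a land-cover transition.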

Method for Applicability Verification
To verify the applicability of multisource SIF satellite products for GPP estimation in arid regions, linear regression fitting analysis is conducted using measured GPP data from each site and their corresponding SIF data. The data used for these calculations have complete and stable time series, a clearly defined number of variables, and outliers excluded. The adjusted R² is used for verification and evaluation because it guards against overfitting, bias, and inconsistency in model predictions. We consider a SIF satellite product suitable for arid regions when R² is greater than 0.6 in desert vegetation areas and greater than 0.8 in the cultivated and farmland and pasture grassland areas. In the parameter calculation, n represents the number of samples, y_i represents the observed values, and ŷ_i represents the predicted values, which are computed from the fitted slope and intercept.
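The regression metrics described above can be sketched as follows. This is a minimal example that assumes ordinary least squares with a single predictor and the standard definitions of R², adjusted R², and RMSE. The exact expressions used in this study are not reproduced in the text, and the function names and sample values are hypothetical.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_metrics(x, y, n_predictors=1):
    """Return slope, intercept, R2, adjusted R2, and RMSE of the linear fit."""
    slope, intercept = linear_fit(x, y)
    predicted = [slope * xi + intercept for xi in x]
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalises the number of predictors relative to sample size
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    return {"slope": slope, "intercept": intercept,
            "r2": r2, "adj_r2": adj_r2, "rmse": rmse}

if __name__ == "__main__":
    sif = [1.0, 2.0, 3.0, 4.0]   # hypothetical SIF values
    gpp = [1.0, 2.0, 2.0, 3.0]   # hypothetical measured GPP values
    print(fit_metrics(sif, gpp))
```

With 8-day composites, a single site-year contributes at most 46 samples, so the adjustment term is small but not negligible when products are compared on R² differences of a few hundredths.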

Method for Response Degrees Verification
To verify the responsiveness of the multisource SIF satellite products to the main influencing factors of GPP in arid regions, Pearson correlation analysis was used to calculate their correlation, and t-tests were performed to determine the confidence level. The data used for these calculations have complete and stable time series, a clearly defined number of variables, and outliers excluded, which avoids bias and inconsistency in the analysis. We consider a SIF satellite product applicable in arid regions when it exhibits a high correlation with the influencing factors in the different ecosystems. The parameter calculation formula is as follows:

$$r = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}}$$

where n represents the sample size, x_i represents the satellite values, y_i represents the observed values of influencing factors, and x̄ and ȳ are their respective means.
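A minimal sketch of this correlation test is given below, assuming the standard Pearson coefficient and the usual t statistic with n − 2 degrees of freedom. The function names and sample values are hypothetical, and the resulting t value would still have to be compared with a critical value from a t table to obtain the confidence level.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        # A perfect correlation gives an unbounded statistic
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

if __name__ == "__main__":
    sif = [1.0, 2.0, 3.0, 4.0]    # hypothetical satellite SIF values
    par = [1.0, 2.0, 2.0, 3.0]    # hypothetical influencing-factor values
    r = pearson_r(sif, par)
    print(r, t_statistic(r, len(sif)))
```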

Method for GPP/SIF Verification
To verify the changes in the light distribution-sensitive diagnostic indicator (GPP/SIF) under different weather conditions, the Clear Sky Index (CI) method proposed by Gu et al. [52] was used to calculate the weather index of the study area in 2020. Subsequently, the weather conditions were classified using the weather condition classification standard proposed by Okogue et al. [53] (Table 4). Finally, the GPP/SIF method derived by Yang et al. [54] was used to calculate the light distribution-sensitive diagnostic values. We consider a SIF satellite product suitable for arid regions when the GPP/SIF ratio in each ecosystem fluctuates around the 1:1 line without major abrupt changes. In the parameter calculation, R_sc is the solar constant, β is the solar zenith angle, φ is the latitude of the study area, δ is the solar declination angle, and ω is the hour angle.
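The sketch below shows one common way to compute a clearness-type index from the quantities listed above. It assumes that the index is the ratio of measured global radiation to extraterrestrial radiation on a horizontal surface, that the solar constant is 1367 W/m², and that declination follows Cooper's approximation. These choices, the function names, and the example inputs are assumptions, and the class thresholds of Table 4 are deliberately not reproduced.

```python
import math

SOLAR_CONSTANT = 1367.0  # W m-2, assumed value for R_sc

def solar_declination(day_of_year):
    """Solar declination in radians (Cooper's approximation)."""
    return math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)

def extraterrestrial_irradiance(day_of_year, lat_deg, hour_angle_deg):
    """Extraterrestrial irradiance on a horizontal surface (W m-2)."""
    phi = math.radians(lat_deg)
    delta = solar_declination(day_of_year)
    omega = math.radians(hour_angle_deg)  # 0 at solar noon, 15 degrees per hour
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    eccentricity = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    return max(0.0, SOLAR_CONSTANT * eccentricity * cos_zenith)

def clear_sky_index(measured, day_of_year, lat_deg, hour_angle_deg):
    """Ratio of measured to extraterrestrial irradiance; None when the sun is down."""
    r0 = extraterrestrial_irradiance(day_of_year, lat_deg, hour_angle_deg)
    return measured / r0 if r0 > 0 else None

def gpp_sif_ratio(gpp, sif):
    """GPP/SIF diagnostic value; None when SIF is zero."""
    return gpp / sif if sif != 0 else None

if __name__ == "__main__":
    # Hypothetical midsummer noon reading of 800 W m-2 at 45 N
    print(clear_sky_index(800.0, 180, 45.0, 0.0))
```

Grouping the 8-day GPP/SIF ratios by the resulting index classes then allows the ratio to be compared across clear, cloudy, and overcast conditions for each ecosystem.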

Method for SIF Precision Improvement
To further explore the optimal accuracy of SIF data in three typical ecosystems in arid regions, we employed two methods to improve the precision of RTSIF satellite products, and then repeated the linear regression fitting analysis process.
Method 1: Accuracy improvement method based on canopy. This method decomposes the incident sunlight into three parts as follows: zero-order canopy transmittance (denoted as t_0), canopy interception rate (denoted as i_0), and escape probability (denoted as f_esc). Among them, a portion of the sunlight intercepted by the canopy is absorbed (denoted as a) and a portion is scattered (denoted as s). When scattering occurs, collisions occur between the canopies, resulting in a recollision rate (denoted as p). Based on the relationships between these factors and incorporating other parameters (Table 3), the accuracy of SIF data is improved. In the parameter calculation, n represents the average number of interactions between the solar radiation and leaf surfaces, R_obs is the canopy spectral bidirectional reflectance, and λ is the wavelength. Other parameter information can be found in Table 3.
Method 2: The linear deviation accuracy improvement method [55,56]. This method uses the slope (a) and intercept (b) of the best regression-fitting equation (y = ax + b) between the measured GPP and SIF satellite data to comprehensively eliminate bias, thereby improving the accuracy of the SIF data. The calculation formula is as follows:
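A minimal sketch of a linear bias correction of this kind is shown below. It assumes that y is the measured GPP, x is the satellite SIF, and the corrected value is obtained by applying the fitted line to each SIF value. The direction of the correction, the function names, and the sample values are assumptions for illustration, and the canopy-based Method 1 is not sketched here.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope (a) and intercept (b) for y = a * x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def linear_bias_correction(sif_values, a, b):
    """Map SIF values onto the GPP scale with the fitted line (assumed form)."""
    return [a * s + b for s in sif_values]

if __name__ == "__main__":
    sif = [0.1, 0.2, 0.3]   # hypothetical 8-day RTSIF values at one site
    gpp = [0.5, 0.9, 1.3]   # hypothetical measured GPP for the same periods
    a, b = linear_fit(sif, gpp)
    print(linear_bias_correction(sif, a, b))
```

Because the coefficients are fitted per site, a correction of this form removes the mean offset and scale difference but leaves the scatter around the line, and therefore R², unchanged.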

GPP Variation at the Sites
From Figure 2, it can be seen that the monthly average GPP of each site first increased and then decreased during the growing season (March to October), whereas the period outside the growing season showed a relatively flat trend. The overall variation within the year manifested an inverted "U" shape, and the monthly average GPP of each site during the year was sorted according to the underlying surface conditions as follows: cultivated and farmland area > pasture grassland area > desert vegetation area.
Among these, the minimum monthly average GPP value (0.0003 gC/m²/month) of Ulan Usu Station (cultivated and farmland area) occurred in December, followed by a sharp increase driven by cultivation; after reaching a peak (0.35 gC/m²/month) in July, the monthly average GPP value decreased month by month. The overall growth rate ranks first among all sites. The monthly average growth rate of GPP at Ulastai Station (pasture grassland area) is second. The overall trend of monthly average GPP at this station is similar to that of Ulan Usu Station, with the peak value (0.14 gC/m²/month) also occurring in July. However, Ulastai Station is located in the hinterland of the Tianshan Mountains, with a relatively high altitude (2036 m) and a relatively low temperature. As a result, its monthly GPP over the year is slightly lower than that of Ulan Usu Station, and its minimum value (0.011 gC/m²/month) appears in December.
In addition, the annual variation trend of the monthly average GPP at the Kelameili Station (desert vegetation area), which has relatively low vegetation coverage, is consistent with that of the above stations. However, due to the short growth cycle of short-lived vegetation, the monthly average GPP in the desert vegetation area is significantly lower. The overall monthly average GPP ranks last among the stations, with the minimum value (0.0018 gC/m²/month) occurring in January and the peak (0.057 gC/m²/month) occurring in July.

Analysis of Applicability
From Figure 3, it can be seen that there are differences in the linear regression fit between the RTSIF (Figure 3a), CSIF (Figure 3b), SIF_OCO2_005 (Figure 3c), and GOSIF (Figure 3d) satellite products and their corresponding station GPP data in arid regions. In the pasture grassland area, RTSIF demonstrated the highest R² fitting value (0.85), followed by CSIF (0.84), and GOSIF showed the lowest (0.41). The order of R² fitting values was the same for the cultivated and farmland area. In the desert vegetation area, the highest R² fitting value was for RTSIF (0.62), and the lowest was for SIF_OCO2_005 (0.36). Additionally, the RMSE and SD also differ among the satellite products on the different underlying surfaces. RTSIF showed the smallest RMSE and SD in the pasture grassland area (0.01 and 0.11, respectively), followed by CSIF (0.01 and 0.13, respectively), and GOSIF showed the largest (0.01 and 24.46, respectively). In the cultivated and farmland area and the desert vegetation area, the order of the RMSE and SD was the same as that in the pasture grassland area.
Furthermore, by comparing the R² fitting values of the four satellite products on three typical underlying surfaces with the linear regression optimal value (optimal value is 1), it was found that the difference between the R² fitting value and the optimal value was less than 0.38 for RTSIF, less than 0.41 for CSIF, less than 0.49 for SIF_OCO2_005, and less than 0.64 for GOSIF. This indicates that there are differences in the applicability of multisource SIF satellite products for GPP estimation in arid regions. The overall ranking of their applicability is RTSIF > CSIF > SIF_OCO2_005 > GOSIF, with RTSIF having an overall significance greater than 0.5 at the 95% confidence level, indicating that RTSIF satellite products have the best suitability.

Analysis of Spatial Features
From Figure 4, it can be seen that there are differences in the spatial distribution characteristics of the annual average values of RTSIF, CSIF, SIF_OCO2_005, and GOSIF satellite products in arid regions. The RTSIF, CSIF, and SIF_OCO2_005 satellite products exhibited reasonably distributed spatial patterns with distinct attribute features. The highest values are predominantly distributed in regions with high vegetation cover, such as the Altai Mountains and the northern and southern slopes of the Tianshan Mountains, whereas the lowest values are found in areas with low vegetation cover, such as the Gurbantunggut Desert and the eastern Gobi Desert. Specifically, in the pasture grassland area, the annual mean values are 0.10 mW/m²/nm/sr (RTSIF), 0.12 mW/m²/nm/sr (CSIF), and 0.13 mW/m²/nm/sr (SIF_OCO2_005). In the cultivated and farmland area, the values are 0.26 mW/m²/nm/sr (RTSIF), 0.18 mW/m²/nm/sr (CSIF), and 0.24 mW/m²/nm/sr (SIF_OCO2_005). In the desert vegetation area, the values are 0.03 mW/m²/nm/sr (RTSIF), 0.01 mW/m²/nm/sr (CSIF), and 0.04 mW/m²/nm/sr (SIF_OCO2_005). The annual mean values of RTSIF satellite products are closer to the observed GPP data at the stations.
In contrast, the spatial distribution of the GOSIF satellite products is unreasonable in arid regions, with indistinct attribute features and an overall overestimation tendency. The rationality of the attribute features of these products in arid regions is ranked as follows: RTSIF > CSIF > SIF_OCO2_005 > GOSIF. The RTSIF satellite products most effectively reflect the spatial features of GPP in the study area.

Analysis of the Impact Factor Responsiveness
Figure 5 shows that the responsiveness of the RTSIF, CSIF, SIF_OCO2_005, and GOSIF satellite products to the main influencing factors of GPP (photosynthetically active radiation (PAR), soil temperature (Tsoil), and air temperature (Tair)) varies among the ecosystems in arid regions. In the desert vegetation area (Figure 5a), RTSIF exhibited the highest responsiveness to Tsoil, PAR, and Tair, followed by CSIF, and GOSIF showed the weakest response. In the pasture grassland area (Figure 5b), RTSIF showed the highest responsiveness to PAR, followed by SIF_OCO2_005, while GOSIF exhibited the weakest response. For Tsoil and Tair, the highest responsiveness was observed for SIF_OCO2_005, followed by RTSIF, with GOSIF showing the weakest response. In the cultivated and farmland area (Figure 5c), the highest responsiveness to PAR and Tair was observed for RTSIF, followed by SIF_OCO2_005, with GOSIF exhibiting the weakest response. For Tsoil, the highest responsiveness was observed for SIF_OCO2_005, followed by RTSIF, with GOSIF exhibiting the weakest response.
These results indicate that the responsiveness of the four satellite products to the main influencing factors varies greatly among the ecosystems in arid regions. The overall responsiveness to Tsoil is ranked as follows: SIF_OCO2_005 > RTSIF > CSIF > GOSIF, whereas the overall responsiveness to Tair and PAR is ranked as follows: RTSIF > SIF_OCO2_005 > CSIF > GOSIF. RTSIF shows an overall significance greater than 0.5 with the influencing factors in the different ecosystems, indicating that the RTSIF satellite products have the highest overall responsiveness.
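As an illustration of how such a responsiveness ranking could be derived, the sketch below computes a Pearson correlation between one SIF series and each influencing factor and then sorts the factors by the result. The text does not state which statistic underlies Figure 5, so the use of Pearson's r is an assumption, and the function names and the monthly values are hypothetical placeholders rather than data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient; pairs with a missing value (None) are skipped."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

def responsiveness(sif, drivers):
    """Correlation of one SIF series with each driver series (dict name -> values)."""
    return {name: pearson_r(sif, series) for name, series in drivers.items()}

# Hypothetical monthly values, for illustration only (not taken from the study).
sif = [0.02, 0.05, 0.11, 0.18, 0.21, 0.17, 0.09, 0.04]
drivers = {
    "PAR": [20, 35, 60, 85, 95, 80, 50, 30],
    "Tair": [-5, 3, 12, 20, 24, 21, 11, 2],
    "Tsoil": [-2, 2, 10, 18, 23, 22, 13, 5],
}
scores = responsiveness(sif, drivers)
ranking = sorted(scores, key=scores.get, reverse=True)  # strongest response first
```

Running the same function once per SIF product and per ecosystem would give a table that can be compared across products in the way described above.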

Analysis of GPP/SIF Values under Different Weather Conditions
Figure 6 shows that the GPP/SIF values in the study area differ under different weather conditions and ecosystems. In the desert vegetation area (Figure 6a), SIF_OCO2_005 and GOSIF exhibited severe overestimation under clear, overcast, and cloudy conditions, while CSIF showed greater overestimation under clear conditions and greater underestimation under overcast and cloudy conditions. RTSIF fluctuated around the 1:1 line, indicating reasonably accurate values. In the pasture grassland area (Figure 6b), SIF_OCO2_005, GOSIF, and CSIF all demonstrated significant overestimation under clear, overcast, and cloudy conditions, whereas RTSIF remained relatively close to the 1:1 line. In the cultivated and farmland area (Figure 6c), SIF_OCO2_005 generally exhibited significant underestimation under clear, overcast, and cloudy conditions, while GOSIF tended to overestimate, and CSIF showed a balance between overestimation and underestimation across different weather conditions. RTSIF maintained fluctuations around the 1:1 line.
These results indicate that the GPP/SIF values of the four SIF satellite products were overestimated or underestimated to different degrees under different ecosystem and weather conditions in arid regions. SIF_OCO2_005 and CSIF showed varying degrees of severe overestimation and underestimation, whereas GOSIF demonstrated significant overestimation, and RTSIF maintained relatively reasonable values. The GPP/SIF values ranked overall as follows: RTSIF > CSIF > SIF_OCO2_005 > GOSIF. The RTSIF satellite products exhibited the highest overall rationality in GPP/SIF values.

SIF Data Accuracy Improvement Analysis
A comparison between multisource SIF satellite products and site data revealed that the RTSIF satellite products performed better in estimating GPP than the other satellite products in arid regions, but there is still considerable room for improvement in the applicability of RTSIF across different ecosystems. To advance research on GPP in arid regions, it is necessary to further refine the RTSIF data. Therefore, we applied two correction methods to improve the RTSIF data in arid regions and evaluated the applicability of the two methods across the three ecosystems. Ultimately, the optimal improvement method for each ecosystem was determined, revealing the overall spatial distribution characteristics of SIF in arid regions and indirectly elucidating the spatial characteristics of GPP.

Analysis of Canopy-Based Accuracy Improvement
Figure 7 shows that the R² fitting values after improvement with the canopy method are equal for the cultivated and farmland area (Figure 7a) and the pasture grassland area (Figure 7b), with both reaching 0.90, close to the optimal value. The overall significance is greater than 0.5 with a confidence interval of 95%. Compared to before improvement, the R² values increased by 0.05 (pasture grassland area) and 0.06 (cultivated and farmland area), respectively. Additionally, the overall improvement ranges of the MB, RMSE, and SD at each site were 0.0009~0.0013, 0.0012~0.0065, and 0.0348~0.0864, respectively. The overall fitting parameter errors were improved to varying degrees, with the cultivated and farmland area showing the largest overall improvement and the pasture grassland area the smallest. This indicates that the canopy method has good applicability in areas with high vegetation coverage in arid regions.
However, owing to the short growth cycle and small leaf area of the short-lived vegetation on the underlying surface of the Kelameili Station (desert vegetation area), the LAI in some areas cannot be accurately measured during the corresponding time period, so the canopy method cannot reliably improve the accuracy of SIF data for this type of underlying surface. This indicates that the canopy method is highly applicable in areas with dense vegetation in arid regions, whereas its applicability is weaker in areas with sparse vegetation, indicating certain limitations overall.
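The fit statistics reported in this section (R², MB, RMSE, and SD) can be sketched as follows. The definitions used here are the common ones and are assumptions, since the formulas are not given in this part of the text: MB is the mean of the errors, RMSE is the root of the mean squared error, SD is the sample standard deviation of the errors, and R² is the squared Pearson correlation of a simple linear fit. The function name `fit_metrics` and the sample values are hypothetical.

```python
import math

def fit_metrics(estimated, observed):
    """Return R2, MB, RMSE and SD for paired estimated and observed values."""
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    errors = [e - o for e, o in zip(estimated, observed)]
    mb = sum(errors) / n                                  # mean bias
    rmse = math.sqrt(sum(d * d for d in errors) / n)      # root mean square error
    sd = math.sqrt(sum((d - mb) ** 2 for d in errors) / (n - 1))  # spread of the errors
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    sxy = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    sxx = sum((e - mean_e) ** 2 for e in estimated)
    syy = sum((o - mean_o) ** 2 for o in observed)
    r2 = (sxy * sxy) / (sxx * syy) if sxx > 0 and syy > 0 else float("nan")
    return {"R2": r2, "MB": mb, "RMSE": rmse, "SD": sd}

# Hypothetical values for illustration only (not data from the study).
observed = [0.05, 0.10, 0.20, 0.25, 0.15]
before = [0.03, 0.09, 0.15, 0.22, 0.10]
after = [0.05, 0.11, 0.19, 0.24, 0.14]
change = {k: fit_metrics(after, observed)[k] - fit_metrics(before, observed)[k]
          for k in ("R2", "MB", "RMSE", "SD")}
```

Comparing the metrics before and after a correction, as in `change`, corresponds to the "improvement ranges" quoted for each site.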

Analysis of Linear-Based Accuracy Improvement
Figure 8 shows that the R² fitting values after improvement with the linear method are ranked by underlying surface as follows: pasture grassland area (Figure 8b) > cultivated and farmland area (Figure 8c) > desert vegetation area (Figure 8a), with corresponding values of 0.87, 0.85, and 0.75, respectively. All values were greater than 0.70 and close to the optimal value, indicating that more than 70% of the variability of the dependent variable can be explained by the linear method, even in the underestimated desert vegetation area. In addition, compared with before the improvement, the R² fitting values of the cultivated and farmland area, the pasture grassland area, and the desert vegetation area increased by 0.01, 0.02, and 0.13, respectively, showing a significant overall improvement effect.
In addition, the overall improvement ranges of the MB, RMSE, and SD at each site were 0.0001 to 0.0011, 0.0003 to 0.0146, and 0.0126 to 0.0799, respectively. The fitting parameters showed different degrees of enhancement, with the greatest improvement observed in the desert vegetation area and relatively minor improvements in the cultivated and farmland area and the pasture grassland area. The enhanced SIF data can therefore better reflect the characteristics of GPP changes.
Overall, compared to before the improvements, the SIF data are now more closely aligned with the measured GPP data. The parameters in the underestimated desert vegetation area showed the greatest overall improvement with the linear method, whereas the cultivated and farmland area and the pasture grassland area exhibited a relatively smaller overall improvement. This suggests that while the linear method is applicable across different ecosystems, it still has limitations when used to improve the accuracy of SIF data in arid regions.
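A linear correction of this kind can be sketched as an ordinary least-squares fit of a reference series on the satellite series, followed by applying the fitted slope and intercept to the satellite values. This is a minimal sketch under stated assumptions: the exact formulation of the linear method is not given in this section, the reference series is assumed to be a site-based quantity expressed on the same scale as SIF, and the names `fit_linear` and `apply_linear` as well as the sample values are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x is constant; the slope is undefined")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def apply_linear(x, slope, intercept):
    """Apply the fitted correction to a series of satellite values."""
    return [slope * a + intercept for a in x]

# Hypothetical values for illustration only (not data from the study).
satellite_sif = [0.010, 0.020, 0.030, 0.045, 0.025]
site_reference = [0.014, 0.026, 0.037, 0.055, 0.031]
slope, intercept = fit_linear(satellite_sif, site_reference)
corrected_sif = apply_linear(satellite_sif, slope, intercept)
```

Because the fit needs only paired values and no canopy variable, a correction of this form remains usable where LAI is unavailable, which is in line with the broader applicability described above.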

Comparative Analysis of Accuracy Improvement Based on Canopy and Linear Methods
Figure 9 shows that the linear accuracy improvement method can enhance the accuracy of RTSIF data across different ecosystems in arid regions, with overall R² values of 0.75 or higher after improvement. However, the canopy accuracy improvement method is only applicable to areas with a higher leaf area index in arid regions, and it cannot achieve improvement in regions with low vegetation cover. Interestingly, for the pasture grassland area and the cultivated and farmland area, the R² fitting values obtained with the linear method are lower than those obtained with the canopy method, by 0.03 and 0.05, respectively. This indicates that while the linear method is suitable for various ecosystems in arid regions, its effectiveness in areas with high vegetation cover is not as pronounced as that of the canopy method.
The differences in the error fitting parameters after improvement with the two methods are listed in Table 5. Compared to the canopy method, using the linear improvement method for the pasture grassland area and the cultivated and farmland area results in increased MB, RMSE, and SD errors, with error increases ranging between 0~0.0021, 0.0009~0.0211, and 0.0065~0.0671, respectively. Conversely, for the desert vegetation area, using the linear improvement method results in relatively small MB, RMSE, and SD errors, with error values of 0.0002, 0.0029, and 0.0110, respectively, and the improvement effect is relatively significant. This further emphasizes that while the linear method has broader applicability than the canopy method in arid regions, its effectiveness is relatively weaker in areas with a higher leaf area index.
In summary, the linear method can enhance the accuracy of SIF data across different ecosystems in arid regions and further reduce errors between parameters after improvement. However, its effectiveness in areas with high vegetation cover is slightly weaker than that of the canopy method. Therefore, the canopy method can be utilized to improve the accuracy of SIF data in regions with high vegetation cover in arid regions, whereas the linear method can be used, as a supplement, to enhance the accuracy of SIF data for other types of underlying surfaces.
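The combination rule described in this section (canopy method where vegetation cover is high, linear method as a supplement elsewhere) can be sketched as a simple dispatch on land cover. Everything specific in the sketch is an assumption made for illustration: the class labels, the requirement that an LAI value be available, and in particular the two placeholder correction functions, whose coefficients are invented and do not represent the formulas used in this study.

```python
# Land-cover classes for which the canopy method is assumed to be usable
# (hypothetical labels for the cultivated/farmland and pasture grassland areas).
CANOPY_CLASSES = {"cropland", "grassland"}

def select_method(land_cover, lai):
    """Pick 'canopy' for high-cover classes with a usable LAI, otherwise 'linear'."""
    if land_cover in CANOPY_CLASSES and lai is not None:
        return "canopy"
    return "linear"

def improve(sif, land_cover, lai, canopy_fn, linear_fn):
    """Apply whichever correction the selection rule chooses for this pixel or site."""
    if select_method(land_cover, lai) == "canopy":
        return canopy_fn(sif, lai)
    return linear_fn(sif)

# Placeholder corrections with invented coefficients, for illustration only.
def example_canopy(sif, lai):
    return sif * (1.0 + 0.01 * lai)

def example_linear(sif):
    return 1.02 * sif + 0.001

improved = [
    improve(0.26, "cropland", 3.0, example_canopy, example_linear),
    improve(0.14, "grassland", 1.5, example_canopy, example_linear),
    improve(0.03, "desert", None, example_canopy, example_linear),
]
```

Falling back to the linear method when LAI is missing mirrors the limitation noted for the desert vegetation area, where LAI cannot be measured accurately.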

Spatial Analysis of SIF Data before and after Accuracy Improvement
The spatial characteristics of the seasonal average values of the RTSIF satellite data before improvement (for example, March to May is spring) are shown in Figure 10 (left column). High values of SIF data are primarily distributed in the cropland area on the northern and southern slopes of the Tianshan Mountains, as well as in the pasture grassland area of the Tianshan and Altai Mountains. Conversely, low values are predominantly found in regions with a smaller leaf area index, such as the Gurbantunggut Desert (desert vegetation area) and the eastern Gobi Desert.
Combining land use data with the two accuracy improvement methods, the final improvement results for the RTSIF satellite data are shown in Figure 10 (right column). The overall improvement in the mean values of the SIF data across quarters was 0.11% (0.037% for the cultivated and farmland area, 0.028% for the pasture grassland area, 0.016% for the desert vegetation area, and 0.025% for other areas). The most significant improvement was observed in spring and summer (improvement rate of 0.071%). Specifically, in spring, as snow begins to melt and vegetation (including short-lived vegetation) begins to grow, the contribution of SIF increases, which is consistent with the overall enhancement of SIF attributes in the study area after improvement. In particular, significant improvements were observed in regions with a higher leaf area index, such as the northwestern part of the Altai Mountains and both slopes of the Tianshan Mountains. The attributes of the Gurbantunggut Desert also showed a synchronized enhancement, with results falling within a reasonable attribute range for arid regions [57,58].
In summer, most regions have completed snowmelt, and vegetation continues to grow as solar radiation increases. The contribution of SIF further increases, which is consistent with the overall enhancement of SIF attributes in the study area after improvement. In particular, regions with higher vegetation cover, such as the Altai Mountains and both slopes of the Tianshan Mountains, experience a further increase in the enhancement rate. The attributes in desert areas and similar regions remain within a reasonable range for arid regions [59-61].
In autumn and winter, snow begins to accumulate in most regions and vegetation growth slows as overall solar radiation decreases; the contribution of SIF decreases, and the overall change is not significant. This aligns with the overall lower and less pronounced enhancement of the SIF attributes in the study area after improvement. The western part and both slopes of the Tianshan Mountains are the areas where the enhancement of SIF attributes is more pronounced during autumn and winter.
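The seasonal averaging behind Figure 10 can be sketched as a grouping of monthly values into quarters. The text only states that March to May is spring; assigning June to August to summer, September to November to autumn, and December to February to winter is an assumption that follows the usual meteorological convention. The function name and the input values are hypothetical.

```python
# Month-to-season mapping; only "March to May = spring" is stated in the text,
# the remaining assignments are assumed.
SEASON_OF_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}

def seasonal_means(monthly):
    """Average a dict of month number -> value into a dict of season -> mean."""
    sums = {}
    counts = {}
    for month, value in monthly.items():
        season = SEASON_OF_MONTH[month]
        sums[season] = sums.get(season, 0.0) + value
        counts[season] = counts.get(season, 0) + 1
    return {season: sums[season] / counts[season] for season in sums}

# Hypothetical monthly SIF means for one pixel, for illustration only.
monthly_sif = {1: 0.01, 2: 0.01, 3: 0.03, 4: 0.08, 5: 0.15, 6: 0.24,
               7: 0.26, 8: 0.20, 9: 0.12, 10: 0.06, 11: 0.02, 12: 0.01}
season_means = seasonal_means(monthly_sif)
```

Applying the same grouping to the data before and after improvement gives the two columns that are compared season by season.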
Additionally, after improvement, the annual average SIF (GPP) value for the underlying surfaces in the northern region of Xinjiang was 0.13 mW/m²/nm/sr (0.26 mW/m²/nm/sr for the cultivated and farmland area, 0.14 mW/m²/nm/sr for the pasture grassland area, 0.034 mW/m²/nm/sr for the desert vegetation area, and 0.11 mW/m²/nm/sr for other areas).

The comprehensive evaluation of multisource SIF data for GPP estimation in arid regions shows that the overall applicability of the four satellite products is ranked as follows: RTSIF > CSIF > SIF_OCO2_005 > GOSIF (based on spatial characteristics, responsiveness to GPP influencing factors, GPP/SIF values under different weather conditions, and other criteria). The significance of RTSIF is greater than 0.5, with a confidence interval of 95%. This ranking arises because the sensors and generation principles of the four SIF satellite products differ. Among them, CSIF, GOSIF, and SIF_OCO2_005 are all derived from the OCO-2 sensor. This sensor was launched earlier and has a narrow spectral band (757-775 nm) [62], so the generated satellite data lack some physiological and physical significance.
In contrast, RTSIF is derived from the TROPOMI sensor on Sentinel-5P, which was launched later and has a broader spectral band (735-785 nm) [21,22], so the resulting SIF data set captures the underlying mechanisms in more detail. Furthermore, CSIF, GOSIF, and SIF_OCO2_005 all rely on the MODIS data set, but MODIS exhibits a noticeable lag effect [63], resulting in a significant deviation from GPP and thus reducing the inversion accuracy to some extent. RTSIF, on the other hand, builds on the complete sensor band and is generated with different weather conditions taken into account [21,22], which improves the inversion accuracy to some extent.
Additionally, the applicability of RTSIF in GPP estimation varied across the three ecosystems in the study area. The pasture grassland area exhibits the highest R² value of 0.85, with a significance greater than 0.5 and a confidence interval of 95%, indicating the best suitability for GPP estimation. This is attributed to the relatively stable growth cycle of the grassland in Xinjiang, where the green-up period occurs between days 110 and 150 each year, the growing season falls between days 140 and 160, and senescence appears between days 270 and 290. The overall growth cycle of the grassland is stable, with no significant interannual fluctuations in SIF contribution and relatively stable attribute characteristics. Consequently, the suitability of SIF data for GPP estimation in this area was slightly higher than that in the cropland area, which has a larger leaf area index. This finding is consistent with the conclusions of Dong Tong [64], who used machine learning and model construction methods to study the spatiotemporal and phenological characteristics of grasslands in Xinjiang over the last 20 years.
In the cultivated and farmland area, for example where maize and cotton are grown, GPP and SIF are much higher than those of forests and grasslands, with an overall high measurability. However, because of the shorter growth cycle of crops and the significant reduction in SIF contribution during the nongrowing season, the suitability of SIF data for GPP estimation is slightly lower than in the pasture grassland area, with an R² fitting value of 0.84, although the significance is also greater than 0.5 with a confidence interval of 95%. This is consistent with the conclusion of Chen Xin [65], who used multiple crops for global farmland GPP estimations.
The carbon sink in the desert vegetation area is primarily generated through nonphotosynthetic processes, with complex controlling factors and trends. This conclusion is drawn from research by Yang Fan et al. [66], who used comparative experiments to demonstrate CO₂ characteristics in the Taklimakan Desert. It implies that the overall measurability of SIF in the desert vegetation area is relatively low, which results in a weaker applicability of SIF data for GPP estimation on these underlying surfaces. In this study, in the desert vegetation area (Gurbantunggut Desert), the GPP during the growing season of short-lived vegetation shows a trend of increasing initially and then decreasing. Outside the growing season, the interannual variation of GPP shows an inverted "U" shape and indicates a carbon sink. The interannual variations of GPP and SIF are relatively complex, with small overall accumulations of GPP and contributions of SIF. This further validates the aforementioned research conclusion and is consistent with the findings of Gulinur et al. [67] regarding CO₂ fluxes in the Gurbantunggut Desert.
This study validated the applicability of SIF data for GPP estimation in arid regions. The findings are crucial for the subsequent comprehensive estimation and feature analysis of GPP based on SIF data in Xinjiang and in arid regions more broadly.

Analysis of GPP Estimation Accuracy Based on Improving SIF
The results of the accuracy improvement of SIF data for arid regions indicate that the linear method can enhance SIF data accuracy for different underlying surfaces (the R² values for the cultivated and farmland area, the pasture grassland area, and the desert vegetation area increased by 0.01, 0.02, and 0.13, respectively). However, the canopy method is only applicable to regions with a higher leaf area index (the R² values for the cultivated and farmland area and the pasture grassland area increased by 0.06 and 0.05, respectively) and cannot achieve improvement in areas with low vegetation cover. The adaptability of the canopy method was validated in a study by Yin Yueqiang [68], who considered the canopy as a factor for global SIF data accuracy improvement, ultimately reconstructing six sets of SIF data at different resolutions. The adaptability of the linear method was confirmed in a study by Wang Yu et al. [69], where linear methods were used to improve the accuracy of five solar radiation reanalysis data sets in the eastern Gobi Desert in Xinjiang, eliminating the impact of errors in the reanalysis data on radiation assessments. In addition, in the desert vegetation area, the error reduction values of MB, RMSE, and SD based on the linear improvement method were 0.0001, 0.0007, and 0.0126, respectively. However, when the linear method was applied to the pasture grassland area and the cultivated and farmland area, the R² fitting value decreased and the other fitting parameters increased overall, with the error increases of the MB, RMSE, and SD ranging between 0.0006~0.0011, 0.0003~0.0146, and 0.0323~0.0799, respectively. This further indicates that the linear method has a wider applicability than the canopy method in arid regions, but its improvement effect is relatively weak in areas with high vegetation coverage. Therefore, the canopy and linear improvement methods can be applied alternately to improve the accuracy of SIF satellite products, and the two methods can then be integrated to ultimately achieve an accuracy improvement of SIF data in arid regions.
Furthermore, in conjunction with land use data, SIF data improvement was conducted using the canopy and linear methods, with an overall improvement rate of 0.11% (0.037% for the cultivated and farmland area, 0.028% for the pasture grassland area, 0.016% for the desert vegetation area, and 0.025% for other areas). After improvement, areas with higher vegetation cover, such as cropland on the northern and southern slopes of the Tianshan Mountains and pasture grassland in the Tianshan and Altai Mountains, exhibited the highest attribute values. Conversely, lower values were mainly distributed in areas with a smaller leaf area index, such as the Gurbantunggut Desert (desert vegetation area) and the eastern Gobi Desert.
Additionally, the postimprovement annual average SIF (GPP) value for the underlying surfaces in the northern region of Xinjiang is 0.13 mW/m²/nm/sr. Because SIF serves as an effective proxy for GPP, the spatial distribution of the improved SIF better reflects the GPP distribution characteristics in arid regions and further reveals the strong coupling relationship between GPP and SIF in these areas. The rationality of the attribute values after improvement aligns with the conclusions of numerous studies in arid regions. For example, Li Yue et al. [70] conducted remote sensing monitoring of grassland GPP on the Mongolian Plateau based on SIF data, showing that the annual average values for various grasslands range from 0.11 to 3.48 gC/m² (0.14 mW/m²/nm/sr in this study); Yan Zhirong et al. [71] studied the spatiotemporal distribution of vegetation GPP in China from 2007 to 2018 based on SIF data, indicating that the annual average values for the desert vegetation area (based on latitude and longitude division) range from 0 to 0.1 gC/m² (0.034 mW/m²/nm/sr in this study); and Song Lian [72] conducted a comparative study on the high-temperature stress mechanisms of crops based on SIF data, showing that the annual values for various crops range from 0.22 to 4.42 gC/m² (0.26 mW/m²/nm/sr in this study).

Innovation, Limitations, and Prospects
The innovation of this study lies in coupling multiple evaluation indicators (linear regression parameters, satellite spatial characteristics, GPP influencing factor responsiveness, and GPP/SIF values under different weather conditions) to comprehensively compare and analyze the applicability of four commonly used SIF satellite products for GPP estimation in arid regions, and identify which SIF satellite product is most suitable for GPP estimation in arid regions.Additionally, it verifies the feasibility of canopy and linear accuracy improvement methods for SIF accuracy improvement in arid regions based on the most suitable SIF satellite product.By integrating land use data, it also reveals the spatial distribution patterns of SIF (GPP) in arid regions.These results have not been clearly demonstrated in previous studies, thus our research fills the gap in the coupling studies of SIF and GPP in arid regions and lays a theoretical foundation for achieving the "carbon neutrality" goal in these areas.
The limitation of this study is that the climate change in Xinjiang is complex and there are some uncontrollable influencing factors, which may have a certain impact on the comprehensive assessment of applicability.Other methods for improving the accuracy of SIF satellite products in arid regions need to be further developed and validated.There are often biases and uncertainties in data measurement, which result in some abnormal and outlier data, as well as incomplete and nonstationary time series data.These data have a certain impact on the operability of statistical analysis, reasonable estimation of parameters, and effective analysis of dependent variables.Although we have comprehensively processed these data in detail and reasonably, there are still subtle impacts that are inevitable.Additionally, the research results may also be influenced by the limitations of data resolution and remote sensing technology, requiring careful consideration in the application of the results.
Building on the current results, we will further explore models suitable for GPP inversion in arid regions and couple the improved SIF with these models to invert GPP and analyze its spatial and temporal patterns in arid regions. We will also use measured data on influencing factors such as PAR, Tair, and Tsoil, together with satellite data, to comprehensively analyze the spatial and temporal characteristics of the factors and mechanisms that influence GPP in arid regions.

Conclusions
This study used multiple indicators to evaluate the applicability of multisource SIF satellite products for GPP estimation in arid regions, applied two methods to improve the accuracy of the most suitable SIF satellite product, and analyzed the spatial characteristics of GPP indirectly reflected by the SIF data before and after improvement. The main conclusions are as follows:
(1) The interannual variation of the monthly mean GPP in arid regions shows an inverted "U" shape, with peaks occurring in June and July. During the growing season (March to October), GPP first increases and then decreases, while in the nongrowing season (November to February), GPP fluctuations are not significant.
(2) The overall suitability ranking of the multisource SIF satellite products for GPP estimation in arid regions is RTSIF > CSIF > SIF_OCO2_005 > GOSIF. This ranking helps reveal the spatial and temporal patterns of the terrestrial ecosystem carbon cycle in arid regions by coupling multiple factors and provides new approaches for constructing carbon reduction policies in arid regions.
(3) When improving the accuracy of SIF satellite products in arid regions, the canopy improvement method and the linear improvement method need to be used in combination. This provides a practical basis for a more comprehensive and more accurate analysis of the spatial and temporal characteristics of carbon sources and sinks in arid region terrestrial ecosystems, which is of great significance for achieving "carbon neutrality" in arid regions.
(4) Based on land use data, the spatial characteristics of the SIF data obtained through the two methods show a high correlation with vegetation coverage, and the annual mean value of the SIF data for each surface after improvement is approximately 0.13 mW/m²/nm/sr.

Practical Applications
Based on the research conclusions, the practical applications of this study in arid regions are mainly reflected in the following aspects:
(1) The interannual variation characteristics of GPP revealed here can be directly referenced in the subsequent construction of the carbon cycle system in arid regions, thereby avoiding unreasonable interannual variations.
(2) The SIF satellite product identified as most suitable for GPP estimation can be directly applied in subsequent analyses of the spatial and temporal patterns of carbon storage in arid regions based on GPP, an important factor of carbon sources and sinks, thus avoiding repeated comparative validation.
(3) The methods identified for improving the accuracy of SIF satellite products can be directly applied to other SIF satellite products in arid regions, thus avoiding repeated exploration and analysis.
(4) The spatial characteristics of GPP indirectly reflected by SIF can serve as a direct basis for constructing accurate carbon reduction policies to achieve "carbon neutrality" in arid regions, thus avoiding discrepancies between practice and reality.

Figure 1. (a) Specific locations of the Tianshan Mountains, Ulan Usu Station, Ulastai Station, and Kelameili Station in Xinjiang. (b) Schematic representation of the elevation of the study area. (c) Schematic representation of the land use types in the study area.

Figure 2. Interannual variation of monthly average GPP at each site in 2020 (excluding nighttime values).

Figure 3. Linear regression fits of the 2020 GPP data from the three sites against the corresponding site data of each SIF satellite product: (a) CSIF; (b) RTSIF; (c) SIF_OCO2_005; (d) GOSIF.

Figure 4. The spatial distribution characteristics of annual mean values of multisource SIF satellite products.

Figure 5. Responsiveness of multisource SIF satellite products to major influencing factors of GPP (** indicates significance at the 0.5 level). (a) Kelameili Station, desert vegetation area. (b) Ulastai Station, pasture and grassland area. (c) Ulan Usu Station, cultivated land and farmland area.

Figure 7. Linear fitting graph of the 2020 GPP data and the corresponding RTSIF station data for each station after improvement based on the canopy method. (a) Ulastai Station, pasture and grassland area. (b) Ulan Usu Station, cultivated land and farmland area.

Figure 8. Linear fitting diagram between the 2020 GPP data of each station and the corresponding RTSIF station data after improvement based on the linear method. (a) Kelameili Station, desert vegetation area. (b) Ulastai Station, pasture and grassland area. (c) Ulan Usu Station, cultivated land and farmland area.

Figure 9. The R 2 fitting values for various sites based on two accuracy improvement methods: canopy and linear.

Figure 10. Changes in spatial characteristics of quarterly average values before and after the improvement of SIF satellite product data. (a1-d1) The spatial variation characteristics of the mean values of each season before improvement, (a1) for spring, and so on. (a2-d2) The spatial variation characteristics of the mean values of each season after improvement, (a2) for spring, and so on.

Table 1. Basic information of different observation stations.

Table 2. Basic information on four widely used SIF satellite products.

Table 3. SIF data accuracy improvement auxiliary parameter data set.

Table 4. Study area weather division results.

Table 5. The difference in error fitting parameters for various sites based on two improvement methods.
