1. Introduction
Carbon storage in terrestrial ecosystems plays a vital role in regulating the global carbon cycle and mitigating climate change. Understanding its spatial and temporal dynamics is crucial for developing effective climate mitigation strategies. Utah’s diverse ecosystems and semi-arid climate provide an ideal setting to investigate climate–carbon interactions, particularly in response to changing environmental conditions.
Advancements in remote sensing, particularly through Landsat data, have transformed carbon storage analysis by enabling long-term monitoring of vegetation productivity and biomass accumulation. Integrating satellite-derived data with key climatic variables such as temperature, precipitation, humidity, and solar radiation offers a more comprehensive perspective on carbon dynamics, particularly in arid and semi-arid regions where ecosystem responses to climate variability are complex and less studied.
As carbon emissions continue to rise, Carbon Capture and Sequestration (CCS) remains a critical component of climate mitigation efforts. Predicting carbon storage potential and understanding its interactions with climatic factors are essential for optimizing sequestration strategies and guiding sustainable land management policies. The application of machine learning techniques to large-scale environmental datasets provides powerful tools for modeling these complex relationships. However, challenges remain in refining their accuracy and reliability, particularly in heterogeneous landscapes such as Utah. This study aims to address these gaps by leveraging remote sensing data and advanced statistical modeling to enhance the predictive capacity of carbon storage assessments in semi-arid ecosystems.
The MODIS MOD17 algorithm is a widely used remote sensing-based model for estimating terrestrial Gross Primary Production (GPP) and Net Primary Production (NPP) [1,2,3]. Initially developed to monitor global carbon cycles, MOD17 has been extensively applied in ecological studies to assess carbon dynamics across diverse ecosystems [4,5,6]. GPP measures the total carbon fixed by plants through photosynthesis, while NPP represents the remaining carbon after accounting for plant respiration [7].
GPP and NPP variability is largely influenced by climatic conditions, land cover, disturbances, and land-use changes [8]. Understanding the distinction between GPP and NPP is crucial for evaluating ecosystem productivity and carbon sequestration potential, as NPP determines the carbon available to sustain higher trophic levels within ecosystems [9].
Effective rangeland management requires addressing productivity heterogeneity across spatial and temporal scales [10]. Traditional field-based monitoring methods capture localized vegetation dynamics but lack broader applicability for large-scale assessments [11,12].
Remote sensing provides an efficient alternative by enabling continuous monitoring of vegetation productivity over vast areas. Integrating satellite imagery with meteorological and land cover data improves GPP and NPP estimation, linking these metrics to key ecological processes such as solar radiation absorption and light use efficiency [1,7,13,14]. Since primary production supports essential ecosystem services, remote sensing-based models offer critical insights for sustainable rangeland management [1,2,4].
The MOD17 model utilizes MODIS data to estimate global GPP and NPP at a 500 m resolution from 2000 onward, making it valuable for broad-scale analyses [15]. However, its applicability in rangelands is limited due to its simplified representation of vegetation diversity.
GPP and NPP variability is influenced by multi-scale ecological processes and human disturbances, which, though minor individually, can cumulatively impact ecosystem function. Effective monitoring and land management require integrating both large- and fine-scale processes to capture these dynamics [16].
GPP and NPP cannot be directly observed at broad scales and require models incorporating biophysical and atmospheric factors [17,18]. A Landsat-derived 30 m resolution product based on MOD17 improves spatial detail and extends temporal coverage from 1984 to the present, enhancing rangeland monitoring [19]. However, satellite pixel estimates still aggregate sub-pixel vegetation dynamics, limiting precision [20].
MOD17, though widely used, relies on coarse inputs (0.5° meteorological data, 500 m FPAR, LAI, and land cover) and biome-level parameters, simplifying ecological variation across regions [1,21,22]. These trade-offs between spatial resolution, temporal coverage, and computational constraints affect its suitability for detailed conservation efforts [23,24].
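The light-use-efficiency logic underlying MOD17-type models can be sketched as follows. This is a minimal illustration under stated assumptions, not the operational algorithm: the function names are hypothetical, and the default parameter values (maximum efficiency and the temperature and VPD ramp limits) are placeholders rather than entries from the MOD17 biome look-up table.

```python
def ramp(value, low, high):
    """Linear ramp from 0 at `low` to 1 at `high`, clipped to [0, 1]."""
    frac = (value - low) / (high - low)
    return min(1.0, max(0.0, frac))


def daily_gpp(srad_mj, fpar, tmin_c, vpd_pa,
              eps_max=0.0009,                # kg C per MJ of APAR (placeholder)
              tmin_lo=-8.0, tmin_hi=12.0,    # degC ramp limits (placeholders)
              vpd_lo=650.0, vpd_hi=5000.0):  # Pa ramp limits (placeholders)
    """Daily GPP (kg C/m2/day) from a light-use-efficiency formulation."""
    par = 0.45 * srad_mj                     # PAR taken as 45% of shortwave radiation
    apar = fpar * par                        # radiation absorbed by the canopy
    t_scalar = ramp(tmin_c, tmin_lo, tmin_hi)           # cold limitation
    vpd_scalar = 1.0 - ramp(vpd_pa, vpd_lo, vpd_hi)     # dryness limitation
    return eps_max * t_scalar * vpd_scalar * apar


# Unstressed day: 20 MJ/m2/day of shortwave radiation and half of PAR absorbed.
gpp_unstressed = daily_gpp(srad_mj=20.0, fpar=0.5, tmin_c=12.0, vpd_pa=650.0)
```

Because a single set of parameters is applied to every pixel of a biome, two rangeland sites with different species composition receive the same efficiency, which is the simplification discussed above.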
NPP, as a critical ecosystem service, quantifies the carbon assimilated into plant biomass through photosynthesis over time and reflects the energy available across trophic levels. Rangelands generally have a lower NPP than other ecosystems, such as forests. However, NPP in rangelands is often highly heterogeneous, providing valuable forage at multiple scales and supporting a greater number of herbivores than any other terrestrial biome [2,4].
The NPP/GPP ratio, also known as carbon use efficiency, is a crucial parameter that reflects the allocation of carbon between NPP and autotrophic respiration (Ra), the two fundamental components of GPP. This ratio indicates the fraction of total assimilated carbon that is incorporated into new tissues, representing the efficiency with which ecosystems sequester carbon from the atmosphere into terrestrial biomass [25,26].
In carbon cycle research, the NPP/GPP ratio is considered a critical ecosystem property. It is widely used to estimate GPP from NPP or, conversely, to determine Ra as the difference between GPP and NPP [27,28,29].
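The three relationships described above (carbon use efficiency as NPP/GPP, Ra as GPP minus NPP, and GPP recovered from NPP and an assumed ratio) can be written as a short sketch. The function names and the example values are illustrative assumptions, not data from this study.

```python
def carbon_use_efficiency(gpp, npp):
    """Fraction of assimilated carbon retained as new biomass (NPP/GPP)."""
    if gpp <= 0:
        raise ValueError("GPP must be positive")
    return npp / gpp


def autotrophic_respiration(gpp, npp):
    """Ra as the difference between GPP and NPP."""
    return gpp - npp


def gpp_from_npp(npp, cue):
    """Estimate GPP from NPP when a carbon use efficiency is assumed."""
    if not 0 < cue <= 1:
        raise ValueError("CUE must lie in (0, 1]")
    return npp / cue


# Hypothetical annual fluxes in g C/m2/yr, chosen only for illustration.
example_gpp, example_npp = 1000.0, 450.0
example_cue = carbon_use_efficiency(example_gpp, example_npp)
example_ra = autotrophic_respiration(example_gpp, example_npp)
```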
The MODIS MOD17 algorithm is widely used in remote sensing for large-scale monitoring of ecosystem productivity. This study presents an approach aimed at improving the application of the MOD17 model in rangeland ecosystems. While MOD17 is an important model for estimating Gross Primary Productivity (GPP) and Net Primary Productivity (NPP), its coarse spatial resolution and biome-level parameterization do not adequately capture ecosystem diversity. Therefore, in this study, Landsat data are used to provide higher resolution and long-term monitoring, enhancing the model’s accuracy in rangeland ecosystems. The novelty of this study lies in addressing these limitations of the MOD17 model, enabling more precise and comprehensive monitoring for rangeland management. Compared with traditional field-based monitoring methods, this approach covers larger areas more efficiently and contributes to ecosystem productivity and carbon cycle monitoring.
Estimations of climate data vary across gridded models, with differences primarily arising from the interpolation methods used and the assumptions made for interpolation [30]. Gridded meteorological data are derived from weather station measurements and combined with climate models such as PRISM to provide complete spatial coverage. To fill gaps between stations, gridded data rely on interpolation techniques. The accuracy of such climate data depends on three key factors: (1) the precision of precipitation, temperature, and PET estimates compared to MET station data; (2) regional variations in climate; and (3) the accuracy of mean value versus extreme estimates [31,32].
GridMET, which integrates precipitation data from NLDAS-2 (NASA Land Data Assimilation Systems) and regional climatic effects from PRISM, operates at a 4-km spatial resolution. While this scale provides broad regional climate patterns, it is too coarse to capture microclimates effectively [31,33,34]. The hybrid interpolation method employed by GridMET combines multiple climatic data sources, including daily averaged MET station data (surface radiation, wind velocity, PET), regional-scale reanalysis (simulations based on physical models), gridded PRISM climate data, and daily gauge-based precipitation from NLDAS-2. This approach allows for downscaling and interpolation of climate information across diverse landscapes.
With these refinements, GridMET demonstrates minimal mean bias in temperature estimations, except under extreme temperature conditions [30]. The debiasing process for NLDAS precipitation data relies on PRISM’s 30-year monthly normals (1981–2010), where aggregated NLDAS monthly values are compared to PRISM data to develop scaling factors that adjust the raw daily NLDAS data accordingly. Since PRISM incorporated data from several of the same weather station networks used in this study, the validation comparisons may reflect inflated accuracy due to shared data sources. However, it is important to note that GridMET only uses PRISM’s monthly climate normals for bias correction, rather than daily data. In fact, GridMET’s accuracy was independently validated against some of PRISM’s input data in the foundational study for GridMET [35].
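The monthly scaling idea described above can be illustrated with a short sketch. It assumes a simple ratio correction (fine-scale monthly normal divided by the coarse product's monthly mean, applied to each daily value of that month); the function names and numbers are hypothetical, and the operational GridMET procedure may differ in detail.

```python
import numpy as np


def monthly_scaling_factors(prism_normals, nldas_monthly_means):
    """One multiplicative factor per calendar month (12 values)."""
    prism = np.asarray(prism_normals, dtype=float)
    nldas = np.asarray(nldas_monthly_means, dtype=float)
    # Months with no coarse-product precipitation are left unscaled.
    safe = np.where(nldas > 0, nldas, 1.0)
    return np.where(nldas > 0, prism / safe, 1.0)


def debias_daily(daily_precip, months, factors):
    """Scale each daily value by the factor of its calendar month (1-12)."""
    daily = np.asarray(daily_precip, dtype=float)
    month_index = np.asarray(months) - 1
    return daily * factors[month_index]


# Illustrative normals (mm/month): the coarse product is too dry in January only.
prism = np.array([60.0] + [40.0] * 11)
nldas = np.array([50.0] + [40.0] * 11)
factors = monthly_scaling_factors(prism, nldas)
corrected = debias_daily([5.0, 0.0, 10.0], [1, 1, 2], factors)
```

Because only monthly normals enter the correction, the day-to-day sequence of wet and dry days comes entirely from the coarse product, which is consistent with the point made above about PRISM's limited role.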
Ref. [36] defines microclimate as a localized region influenced by spatial and temporal interactions across heterogeneous landscapes and the effects of spatial heterogeneity on biotic and abiotic processes [37,38]. However, gridded data inherently homogenize terrain within each grid cell, assigning uniform meteorological values (such as PET, precipitation, and temperature) to all points within the cell. This simplification leads to GridMET overestimating mean precipitation values, particularly in topographically complex regions like the Great Basin, while underestimating extreme precipitation events. Additionally, because terrain influences are not considered, GridMET’s estimates of wind speed are often skewed [30,31].
Machine learning techniques, especially gradient tree boosting (GTB), have gained significant attention in various fields due to their powerful performance. GTB, particularly in classification tasks, has been successful across a range of problems, from ranking tasks to rate prediction problems [39,40]. Since its inception, advancements in the technique have further improved its effectiveness, making it an essential tool in predictive modeling. It is often considered a strong alternative to other models such as Random Forest (RF) and Classification and Regression Trees (CART), offering competitive performance across different domains. CART models, although simple and non-parametric, are highly versatile and can be used for both classification and regression tasks. For example, in classification, a CART model could predict whether an individual has prostate cancer based on variables such as PSA levels, age, and other factors. In regression tasks, it might predict the likelihood of an individual having prostate cancer. The output of a CART model is typically a decision tree, which provides a clear, interpretable path from input variables to a decision or prediction [41,42,43,44].
Support Vector Machines (SVMs), developed by Vapnik and colleagues in the 1990s, are also significant in machine learning, especially for binary classification tasks. The goal of SVM is to identify the optimal separating hyperplane in a high-dimensional feature space, optimizing the generalization performance of the model [45,46]. In many applications, Artificial Neural Networks (ANNs) have gained popularity due to their non-linear nature and their ability to model complex relationships. ANNs are computational systems inspired by the biological neural networks of the human brain. They consist of interconnected neurons that process information through a series of hidden layers. An input vector is passed through the hidden layers using an activation function, which is continuous, differentiable, and bounded. The error between the predicted and actual values is then propagated backward through the network, updating the model’s weights through training mechanisms like the generalized delta rule [47,48,49]. This combination of machine learning techniques, from SVMs to ANNs and CART models, offers powerful tools for tackling a wide range of classification and regression problems, enabling improved predictive accuracy in various scientific and technological domains.
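As a minimal illustration of the delta-rule weight update mentioned above, the sketch below trains a single sigmoid neuron (no hidden layers) on a toy logical-AND problem. The data, learning rate, and number of epochs are arbitrary choices for illustration; in a multi-layer network the same error-times-derivative term is propagated backward layer by layer.

```python
import numpy as np


def sigmoid(z):
    """Continuous, differentiable, bounded activation function."""
    return 1.0 / (1.0 + np.exp(-z))


def train_neuron(inputs, targets, learning_rate=1.0, epochs=5000):
    """Batch delta-rule training of one sigmoid neuron on squared error."""
    weights = np.zeros(inputs.shape[1])
    bias = 0.0
    for _ in range(epochs):
        output = sigmoid(inputs @ weights + bias)
        # Delta term: prediction error times the derivative of the sigmoid.
        delta = (targets - output) * output * (1.0 - output)
        weights += learning_rate * (inputs.T @ delta)
        bias += learning_rate * delta.sum()
    return weights, bias


# Toy problem: logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)
W, B = train_neuron(X, T)
PRED = sigmoid(X @ W + B)
```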
This study aims to investigate carbon storage dynamics in Utah over a 30-year period (1991–2020), focusing specifically on how various climatic and environmental factors influence carbon sequestration across diverse ecosystems. The research maps carbon storage in relation to key environmental variables such as precipitation, temperature, evapotranspiration, soil moisture, and fire risk. Additionally, this study examines the interactions between these climatic factors and land use changes, which have significantly impacted ecosystem transformations in Utah in recent decades.
By integrating remote sensing data with advanced statistical modeling techniques, this study provides a novel approach to understanding the complex relationships between climate, vegetation, and carbon storage in semi-arid regions. Unlike previous studies, which primarily focused on localized or short-term carbon sequestration patterns, this research employs a long-term, region-wide analysis, offering a more comprehensive perspective on carbon storage dynamics. Understanding these interactions is crucial for predicting future carbon sequestration capacity, especially as climate change progresses.
The findings contribute to a broader understanding of carbon dynamics in arid and semi-arid regions, while also informing land management practices and climate adaptation strategies designed to enhance carbon sequestration potential. Ultimately, the results will support the development of climate-informed carbon capture strategies, enabling decision-makers to implement more efficient carbon management approaches and facilitate the transition towards a low-carbon future.
  2. Materials and Methods
  2.1. Study Area and Data Sources
This study was conducted in the state of Utah, USA, a region characterized by diverse ecosystems, including forests, wetlands, grasslands, and agricultural zones. The study area’s climatic variability and land-use changes provide an ideal setting to analyze carbon storage dynamics over time. Remote sensing data and environmental variables were used to assess spatial and temporal variations in carbon sequestration across these ecosystems.
Primary data sources included satellite imagery, climate records, and environmental indices. Landsat imagery was employed to map carbon storage and analyze its relationship with climatic factors. Meteorological data, including precipitation, temperature, humidity, and solar radiation, were obtained from regional weather stations and modeled climate datasets such as GRIDMET.
  2.2. Carbon Storage Mapping
Carbon storage maps were developed using remote sensing techniques that incorporated vegetation indices and biomass estimation models. These maps were derived by integrating remote sensing data with climatic and environmental variables through spatial analysis techniques within the Google Earth Engine (GEE) platform and Python 3.11.
  2.3. Climatic and Environmental Variables
Key climatic and environmental variables influencing carbon storage were analyzed, including:
- Precipitation (pr): Daily average precipitation (mm/d) from meteorological datasets.
- Temperature (tmmn, tmmx): Minimum and maximum temperature (K) to assess thermal conditions.
- Humidity (rmax, rmin): Maximum and minimum relative humidity (%) to evaluate moisture availability.
- Solar Radiation (srad): Shortwave solar radiation (W/m2) as an energy source for photosynthesis.
- Evapotranspiration (eto, etr): Grass and alfalfa reference evapotranspiration (mm/d) to quantify atmospheric water demand.
- Fire Risk (bi): Fire susceptibility based on the Burning Index (BI).
- Specific Humidity (sph): Specific humidity (kg/kg) as a measure of atmospheric moisture content.
  2.4. Data Preprocessing and Spatial Analysis
The collected data were standardized to ensure consistency across different scales. Statistical filtering techniques were applied to remove outliers and prevent biases. A correlation analysis was performed to identify significant relationships between carbon storage and climatic variables. Desirability profiles were developed to visualize optimal conditions for carbon sequestration, while a correlation heatmap was generated to illustrate the relationships among key variables.
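The preprocessing steps described above can be sketched with NumPy. The text does not state which standardization or filtering method was applied, so the z-score and interquartile-range (IQR) rule used here are assumptions, and the function names are illustrative.

```python
import numpy as np


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def iqr_mask(values, k=1.5):
    """True for values inside [Q1 - k*IQR, Q3 + k*IQR]; False marks outliers."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)


def correlation_matrix(columns):
    """Pearson correlations for a dict mapping variable name to 1D data."""
    names = list(columns)
    data = np.vstack([np.asarray(columns[name], dtype=float) for name in names])
    return names, np.corrcoef(data)


# Toy example: the value 100 is flagged, the remaining values are kept.
keep = iqr_mask([1, 2, 3, 4, 100])
```

The matrix returned by `correlation_matrix` is the quantity a correlation heatmap displays; rows and columns follow the order of `names`.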
  2.5. Time-Series Analysis
Time-series analysis was conducted to evaluate changes in carbon storage, net primary productivity (NPP), and gross primary productivity (GPP) from 1991 to 2020. This analysis focused on identifying temporal trends and the influence of climatic variables such as temperature, evapotranspiration, and fire risk. Special emphasis was placed on peak carbon storage years, particularly 2015, to understand the factors contributing to variations in sequestration capacity.
  2.6. Model Development and Evaluation
To predict carbon storage, five machine learning and statistical models were employed:
- Random Forest (RF): An ensemble learning method that enhances predictive accuracy through multiple decision trees. RF is well-suited for capturing complex nonlinear relationships and handling large datasets with high-dimensional features. It is particularly effective in identifying important variables and interactions between them. 
- Gradient Tree Boost (GTB): A boosting technique that minimizes prediction errors sequentially. GTB is chosen for its ability to improve prediction accuracy by focusing on errors made by previous trees, making it particularly effective for capturing patterns in heterogeneous data. 
- Artificial Neural Networks (ANN): A deep learning approach capable of capturing nonlinear relationships. ANNs are chosen for their ability to model complex interactions in high-dimensional data, especially where traditional models might fail to identify hidden patterns. 
- Support Vector Machines (SVM): A kernel-based method for classification and regression. SVM is used due to its strong theoretical foundation in high-dimensional spaces and its effectiveness in situations where the number of features exceeds the number of samples. 
- Multiple Regression (MR): A statistical model quantifying the linear relationships between carbon storage and climatic variables. MR is included for its interpretability and ability to assess the linear dependency between predictors and the target variable. 
The performance of the models was assessed using several statistical metrics, including R-squared (R2), adjusted R-squared (Adj. R2), F-score, and accuracy. These metrics were employed to evaluate the models’ goodness of fit and their predictive capabilities.
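For reference, the two goodness-of-fit metrics named above follow standard definitions and can be computed as in the sketch below. The function names and the toy values are illustrative and do not come from this study.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Toy observations and predictions for a model with one predictor.
r2_example = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
adj_r2_example = adjusted_r_squared(r2_example, n_samples=4, n_predictors=1)
```

Adjusted R2 is the more informative of the two when models with different numbers of climatic predictors are compared, since plain R2 never decreases when a predictor is added.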
  2.7. Data Processing
The study area was delineated using the Food and Agriculture Organization (FAO) boundaries dataset, filtering the area of interest (AOI) to Utah’s administrative boundary. This allowed us to focus on the specific geographic region of interest for carbon storage analysis. Carbon storage potential was assessed using two key datasets: Net Primary Production (NPP) and Gross Primary Production (GPP) from the Landsat-based University of Montana (UMT) NTSG datasets. These datasets were selected due to their high spatial resolution and reliable estimates of primary production over time, which are essential for understanding the carbon flux in ecosystems.
A novel metric, NPP8, was calculated by integrating annual GPP and NPP values (approximately 1808.061 kg/m2 and 149.178 kg/m2, respectively). This metric was developed to quantify the combined contributions of GPP and NPP to carbon storage, providing a more comprehensive assessment of primary production in the region. The use of NPP8 allows for the incorporation of both gross and net carbon sequestration, offering a deeper insight into the ecosystem’s carbon storage capacity and dynamics.
All data processing was conducted using Google Earth Engine, which provided a powerful platform for handling large satellite datasets. The appropriate preprocessing steps were applied for each satellite dataset, including atmospheric correction, cloud masking, and spatial resampling, to ensure data consistency and quality. Additionally, time-series data from the Landsat and MODIS datasets were harmonized to create a consistent temporal framework. Preprocessing steps such as normalization, handling missing values, and outlier detection were performed to ensure the integrity and reliability of the datasets.
  2.8. Meteorological Data and Spatial Analysis
Meteorological data from GRIDMET were incorporated to evaluate precipitation, temperature, and vapor pressure deficit (averages of approximately 0.0147198 mm, 281.9334424 K, and 0.8186154 kPa, respectively). The data were processed through filtering, summing, and averaging to extract relevant climatic variables. All spatial analyses were conducted at a 500-m resolution, and regional summaries were computed using zonal statistics. Data processing, including calculations and visualizations, was performed within Google Earth Engine and Python.
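The zonal-statistics step can be illustrated outside Google Earth Engine with a small NumPy sketch. It assumes that zones are encoded as non-negative integer labels on the same grid as the data and that missing pixels are stored as NaN; the function name is hypothetical.

```python
import numpy as np


def zonal_mean(values, zones):
    """Mean of `values` within each integer zone label, ignoring NaN pixels."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones).ravel()
    valid = ~np.isnan(values)
    sums = np.bincount(zones[valid], weights=values[valid])
    counts = np.bincount(zones[valid])
    # A zone label with no valid pixels yields NaN rather than raising.
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


# 2 x 2 toy raster: zone 0 covers the top row, zone 1 the bottom row.
raster = [[1.0, 2.0], [3.0, float("nan")]]
labels = [[0, 0], [1, 1]]
means = zonal_mean(raster, labels)
```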
  2.9. Model Validation and Future Research
The effectiveness of the multiple regression model in predicting carbon storage was assessed using statistical metrics such as the F-statistic, p-value, and R2. Variable selection techniques were applied to optimize model performance and reduce overfitting. Future studies may explore integrating spatial and time-series modeling approaches to refine carbon sequestration predictions further.
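The regression diagnostics listed above can be reproduced for any predictor matrix with ordinary least squares. The sketch below uses NumPy only and returns R2 and the overall F-statistic; the data are invented for illustration, and the p-value is omitted because it requires an F-distribution routine from a statistics library.

```python
import numpy as np


def fit_ols(predictors, response):
    """Fit y = b0 + X b by least squares; return coefficients, R2, and F."""
    X = np.asarray(predictors, dtype=float)
    y = np.asarray(response, dtype=float)
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])    # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Overall F-test: explained variance per predictor over residual variance.
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return coef, r2, f_stat


# Invented single-predictor example with a near-linear response.
coef, r2, f_stat = fit_ols([[1], [2], [3], [4], [5]],
                           [2.1, 3.9, 6.2, 7.8, 10.1])
```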
This methodological framework enables a comprehensive assessment of carbon storage potential and offers valuable insights into the effectiveness of different modeling approaches for climate-driven carbon capture analysis.
  3. Results
This study provides a comprehensive analysis of carbon storage dynamics across Utah, integrating climatic, environmental, and spatial data to assess key drivers of carbon sequestration. By leveraging multiple modeling approaches and geospatial analyses, the findings reveal significant spatial and temporal variations in carbon storage, influenced by precipitation, temperature, humidity, and fire risk. The study also evaluates the predictive performance of different machine learning models, highlighting the effectiveness of Random Forest (RF) and Artificial Neural Networks (ANN) in estimating carbon stocks. Additionally, the temporal trends from 1991 to 2020 underscore the impact of climate variability on net primary productivity (NPP) and gross primary productivity (GPP). These insights emphasize the importance of adaptive carbon management strategies to enhance sequestration potential and mitigate climate change effects.
Changes in Ecosystem Productivity, Carbon Storage, and Climatic Variables Between 1991 and 2020 are presented in Figure 1. The data from this period reveal the relationships between carbon storage, ecosystem productivity, and climatic variables. During this time, the average amount of stored carbon generally increased but showed a declining trend after peaking at 0.232 kg/m2 in 2015, reaching 0.163 kg/m2 in 2020. Similarly, net primary productivity (NPP) and gross primary productivity (GPP) reached their highest levels in 2015 before decreasing. NPP increased from 1651 kg C/m2 in 1991 to 2310 kg C/m2 in 2015 but then declined to 1627 kg C/m2 in 2020. This suggests that ecosystem productivity and carbon sequestration weakened after 2015.
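The size of these changes can be expressed as relative differences between the reported endpoint values. The short calculation below uses only the figures quoted in the text; the variable and function names are illustrative.

```python
def percent_change(start, end):
    """Relative change from `start` to `end`, in percent."""
    return 100.0 * (end - start) / start


# Values as reported in the text.
carbon_2015, carbon_2020 = 0.232, 0.163               # kg/m2
npp_1991, npp_2015, npp_2020 = 1651.0, 2310.0, 1627.0

carbon_decline = percent_change(carbon_2015, carbon_2020)   # about -29.7 %
npp_rise = percent_change(npp_1991, npp_2015)               # about +39.9 %
npp_decline = percent_change(npp_2015, npp_2020)            # about -29.6 %
```

On these figures, stored carbon and NPP both fell by roughly 30% between 2015 and 2020, and NPP in 2020 ended slightly below its 1991 value.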
Analysis of climatic variables indicates a clear warming trend. The average maximum temperature rose from 288.56 K in 1991 to 290.50 K in 2020, while the average minimum temperature increased from 273.62 K to 275.18 K over the same period. Along with rising temperatures, grass reference evapotranspiration (eto) and alfalfa reference evapotranspiration (etr) also increased, indicating greater evaporative demand. Eto rose from 3.30 mm/d in 1991 to 3.61 mm/d in 2020, while etr increased from 4.50 mm/d to 5.12 mm/d, suggesting increased water loss and heightened drought stress in ecosystems.
Additionally, vapor pressure deficit (VPD) and the Energy Release Component (ERC), a fire risk indicator, showed an increasing trend. VPD increased from 0.74 kPa in 1991 to 0.93 kPa in 2020, indicating a drier atmosphere with greater evaporative demand. Similarly, ERC, which represents fire potential, rose from 45.03 in 1991 to 63.05 in 2020. Notably, 2012 and 2020 recorded the highest ERC values, indicating a higher risk of wildfires. Overall, these findings highlight the impacts of climate change on ecosystem productivity and the carbon cycle. While carbon storage and productivity increased until 2015, they declined afterward, likely due to rising temperatures, increased evaporation, and drought effects. The increase in VPD and ERC further suggests that ecosystems face greater risks of drought and wildfires. These trends indicate a disruption of ecological balance and significant changes in the carbon cycle driven by climate change.
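To put the reported climatic changes on a common footing, the endpoint values can be converted to per-decade rates and to degrees Celsius. This is a simple endpoint calculation from the 1991 and 2020 values quoted in the text, not a regression trend over all years, and the function names are illustrative.

```python
def change_per_decade(start_value, end_value, start_year=1991, end_year=2020):
    """Average change per 10 years between two endpoint values."""
    return 10.0 * (end_value - start_value) / (end_year - start_year)


def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


tmax_rate = change_per_decade(288.56, 290.50)   # about 0.67 K per decade
tmin_rate = change_per_decade(273.62, 275.18)   # about 0.54 K per decade
vpd_rate = change_per_decade(0.74, 0.93)        # about 0.066 per decade
erc_rate = change_per_decade(45.03, 63.05)      # about 6.2 index units per decade
tmax_1991_c = kelvin_to_celsius(288.56)         # about 15.4 degC
```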
The relationship between the average carbon amount (approximately 0.1816850 kg C/m2 on average) and various climate variables was examined. A multiple regression analysis was conducted, and the distribution of variables was visualized through histograms in Figure 2. Additionally, predicted values and goodness-of-fit profiles were analyzed, while spatial distributions were mapped. This section provides a detailed interpretation of the obtained visualizations.
The histogram analysis revealed that precipitation (Average pr, mm/d) is predominantly concentrated at low values. Higher precipitation levels generally contribute to increased soil moisture and vegetation productivity, leading to greater carbon storage. However, excessive rainfall can result in soil erosion and carbon loss, highlighting the complexity of direct and indirect effects on carbon levels. The maximum relative humidity (Average rmax, %) variable showed a high concentration at elevated values, indicating a generally humid environment in the study area. Increased humidity supports vegetation biomass preservation and organic matter accumulation, suggesting a potential positive correlation between relative humidity and carbon storage. Conversely, minimum relative humidity (Average rmin, %) was observed to be concentrated at lower values. A decrease in minimum relative humidity, particularly during dry periods, can increase evapotranspiration rates, leading to soil carbon depletion. However, humidity levels below certain thresholds may reduce microbial activity, slowing carbon release.
The specific humidity (Average sph, kg/kg) histogram exhibited a right-skewed distribution, indicating high atmospheric moisture levels that could enhance photosynthesis rates. However, whether the relationship between specific humidity and carbon storage is linear requires further investigation. The shortwave solar radiation (Average srad, W/m2) displayed a symmetric distribution. As a critical factor in photosynthesis, higher solar radiation levels may enhance carbon sequestration capacity. However, when combined with rising temperatures, excessive radiation can increase evaporation and lead to soil carbon losses. The analysis of wind direction (Average th, degrees) suggested no strong direct relationship with carbon levels, though indirect influences through atmospheric transport and soil moisture distribution are possible.
The minimum temperature (Average tmmn, K) was primarily concentrated within the 250–300 K range. Lower temperatures slow down organic matter decomposition, contributing to carbon preservation in soils. However, extremely low temperatures may limit plant growth, negatively impacting long-term carbon accumulation. In contrast, maximum temperature (Average tmmx, K) varied between 275 and 325 K. Higher temperatures can enhance plant respiration, increasing carbon emissions while simultaneously reducing soil moisture and degrading organic carbon stocks. The wind speed (Average vs, m/s) distribution was concentrated at lower values. Higher wind speeds can contribute to soil erosion and carbon loss, although this effect is less pronounced in areas protected by vegetation cover.
The fire risk index (Average bi) demonstrated an inverse relationship with carbon stocks, as regions with higher fire risk exhibited lower carbon levels. This finding supports the well-established impact of wildfires in burning organic matter and releasing stored carbon into the atmosphere. The evapotranspiration variables (Average eto and etr) influence carbon storage capacity by regulating soil moisture availability. While excessive evapotranspiration can limit carbon sequestration, moderate levels can support plant growth and contribute positively to the carbon cycle.
The prediction values and goodness-of-fit profiles were evaluated using multiple regression models, providing insights into the effects of independent variables on carbon levels. The findings indicate that temperature, humidity, and fire risk significantly influence carbon stocks. By identifying optimal conditions, ecological regions with high potential for carbon storage can be prioritized for conservation efforts. The spatial distribution maps were used to visualize variations in carbon amounts across the study area.
- Red zones represent areas with high carbon loss, often due to wildfires, erosion, or human activities. 
- Green zones indicate regions with high carbon storage, primarily forested ecosystems or dense vegetation areas. 
- Yellow zones correspond to moderate carbon levels and are highly sensitive to climate fluctuations. 
The analysis of these maps suggests that sustainable land management strategies could enhance carbon storage capacity in targeted regions.
Overall, this analysis highlights the significant impact of climate variables on carbon dynamics. Precipitation, temperature, humidity, and fire risk emerge as key determinants of carbon distribution. The spatial analysis provides valuable insights for identifying critical areas for carbon preservation and enhancement. Future studies incorporating long-term datasets could further improve the understanding of carbon dynamics, contributing to more effective climate mitigation strategies.
Analysis of Predicted Values and Desirability Profiles is presented in Figure 3. This section provides a detailed examination of the Profiles for Predicted Values and Desirability, analyzing the relationships between independent variables and carbon storage capacity. This analysis serves as a crucial tool for understanding how the model predicts carbon values and how different variables influence these predictions based on predefined suitability criteria. Profiles for Predicted Values and Desirability is a visualization technique commonly used in multivariate regression models to assess the effects of independent variables (climatic factors) on dependent variables (carbon storage capacity). This graph illustrates how changes in different variables impact carbon levels and identifies conditions under which maximum suitability is achieved. Typically, the vertical axis represents carbon storage, while the horizontal axis represents independent variables, such as precipitation, humidity, temperature, and solar radiation. By examining the trends of predicted values for each variable, the model’s behavior and key influencing factors are visualized.
A detailed analysis of independent variables reveals their varying impacts on carbon storage. Precipitation (pr) plays a crucial role in soil moisture balance and the biochemical cycling of carbon. The analysis indicates that at low precipitation levels, carbon storage is generally low. As precipitation increases, carbon levels rise, but beyond a certain threshold, this increase stabilizes or slightly declines. This suggests that excessive rainfall may lead to soil erosion and loss of organic matter, potentially limiting carbon sequestration. Therefore, identifying optimal precipitation ranges is essential for developing effective land management strategies.
Relative humidity (rmax and rmin) also exhibits significant relationships with carbon storage. Higher maximum relative humidity (rmax) is associated with higher carbon levels, while minimum relative humidity (rmin) presents a more complex trend. At low rmin values, carbon levels decrease, but they stabilize beyond a specific threshold. These findings indicate that adequate humidity levels enhance microbial activity in the soil, accelerating carbon accumulation. Similarly, specific humidity (sph), which represents the absolute water vapor content in the air, is positively correlated with carbon storage. At low sph levels, carbon storage is reduced, whereas moderate to high sph levels promote significant carbon accumulation. This underscores the importance of maintaining adequate humidity levels to sustain carbon sequestration. Implementing appropriate irrigation and land-use management strategies can help maintain soil moisture balance.
Solar radiation (srad), a key factor in photosynthesis, directly influences carbon sequestration. At low srad levels, carbon storage is reduced, while moderate solar radiation enhances carbon accumulation. However, very high radiation levels lead to a plateau or slight decline in carbon storage. This suggests that beyond an optimal radiation threshold, drought and excessive heat stress may negatively impact ecosystem productivity. Thus, managing solar exposure in agricultural and forestry systems is crucial to maintaining optimal carbon sequestration rates. Temperature (tmmn and tmmx) also significantly influences carbon dynamics. At low temperatures, microbial activity declines, slowing down organic matter decomposition and reducing carbon storage. In contrast, moderate temperatures promote plant growth and enhance carbon sequestration capacity. However, extremely high temperatures lead to increased evapotranspiration and soil moisture loss, ultimately reducing carbon levels. This analysis highlights the importance of optimal temperature conditions in maximizing carbon storage potential.
The desirability profiles further reveal the ideal conditions for carbon sequestration. The highest desirability values (Desirability = 1.0) indicate the most favorable combinations of precipitation, temperature, humidity, and solar radiation for maximizing carbon storage. Conversely, low desirability values correspond to conditions that negatively impact carbon sequestration. The findings suggest that moderate precipitation, optimal temperatures, and high humidity levels contribute to maximum carbon sequestration potential, while extreme drought or excessive moisture may hinder carbon accumulation. Additionally, fire-prone regions show lower carbon storage potential than areas with low fire risk, emphasizing the need for fire prevention measures in carbon conservation strategies. Taken together, the predicted values and desirability profiles provide insight into how carbon storage responds to varying climatic and environmental conditions. The optimal conditions for carbon sequestration include moderate precipitation, high relative humidity, suitable temperatures, and balanced solar radiation levels, whereas extreme heat, low humidity, high evaporation rates, and wildfire susceptibility pose challenges to carbon conservation efforts. To mitigate carbon loss, implementing sustainable irrigation techniques, wildfire prevention measures, and climate-resilient land management policies is essential. These findings provide a valuable framework for future climate modeling and land management studies, contributing to more effective strategies for carbon sequestration and climate change adaptation.
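To make the notion of a desirability score concrete, the following is a minimal sketch of a "larger is better" desirability transform in the commonly used Derringer-Suich form. It is an assumption that the desirability profiles in Figure 3 follow this form; the text does not state which transform was used. The function name, the bounds `low` and `high`, and the predicted carbon values are hypothetical and serve only as an illustration.

```python
def desirability_larger_is_better(y, low, high, s=1.0):
    """Map a predicted response y onto [0, 1] when larger values are preferred.

    Values at or below `low` score 0, values at or above `high` score 1,
    and values in between are scaled, with exponent `s` shaping the curve.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** s


# Hypothetical predicted carbon values (kg/m^2) along one profile,
# e.g. for increasing precipitation; these are NOT values from the study.
predicted_carbon = [0.5, 1.0, 2.0, 3.0, 2.5]

scores = [
    desirability_larger_is_better(y, low=0.5, high=3.0)
    for y in predicted_carbon
]

# Index of the setting with the highest desirability.
best_index = max(range(len(scores)), key=scores.__getitem__)
```

In this toy profile the fourth setting reaches a desirability of 1.0, and the score drops again for the last setting, mirroring the plateau-then-decline pattern described for precipitation and solar radiation.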
The correlation heatmap in Figure 4 visualizes the relationships between environmental variables, contributing to a better understanding of ecological processes. When examining the relationships between amount of C (kg/m²) and other environmental factors, strong positive correlations are observed with humidity variables (Average rmax, Average rmin, Average sph). This suggests that carbon accumulation tends to increase in more humid environments. Conversely, a negative correlation is observed with solar radiation (Average srad), indicating that lower solar radiation levels may be associated with higher carbon amounts.
Although the relationships between temperature variables (Average th, Average tmmn, Average tmmx) and carbon amount are relatively weak, the negative correlation with maximum temperature (Average tmmx) is notable. This suggests that rising temperatures may lead to a decrease in carbon storage. Additionally, no strong relationship is observed between evaporation and transpiration variables (Average eto, Average etr, Average vpd) and carbon amount, though a slight positive correlation is found with average evaporation (eto).
The relationships between environmental variables also provide important insights into ecological processes. Strong negative correlations are found between humidity variables and solar radiation, while inverse relationships between temperature and humidity are also evident. Notably, the burning index (Average bi), a fire risk index, exhibits a positive correlation with temperature variables, highlighting the influence of temperature on fire risk. Overall, these results indicate that carbon accumulation increases under conditions of high humidity, low temperature, and low solar radiation. A more detailed analysis of the spatial and temporal variations of these variables will be crucial for understanding ecosystem carbon storage capacity, providing valuable insights for future research.
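The pairwise values behind a correlation heatmap of this kind can be sketched as follows, assuming Pearson correlation coefficients were used (the text does not name the coefficient). The helper names and the small data table are hypothetical; the numbers are illustrative only and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise Pearson correlations."""
    names = list(columns)
    return {
        a: {b: pearson_r(columns[a], columns[b]) for b in names}
        for a in names
    }


# Hypothetical values, deliberately simple so the signs are easy to check.
data = {
    "carbon": [1.0, 2.0, 3.0, 4.0],
    "sph": [2.0, 4.0, 6.0, 8.0],   # rises with carbon
    "srad": [8.0, 6.0, 4.0, 2.0],  # falls as carbon rises
}

corr = correlation_matrix(data)
```

Each cell of the heatmap corresponds to one entry of `corr`; a positive entry such as `corr["carbon"]["sph"]` would be drawn in the warm end of the color scale and a negative entry such as `corr["carbon"]["srad"]` in the cool end.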
  3.1. Evaluation of Multiple Regression Model Results
In this study, multiple regression analysis was applied to assess the relationship between amount of C (kg/m²) and various climate variables, including precipitation, humidity, solar radiation, temperature, wind speed, fire risk indices, and evapotranspiration metrics. To evaluate the model’s predictive power and robustness, key statistical metrics such as R-squared (R²), adjusted R-squared (adjusted R²), F-statistic, and accuracy were analyzed. Table 1 presents the performance comparison between multiple regression and alternative machine learning models such as Random Forest, Gradient Tree Boost, Artificial Neural Networks (ANN), and Support Vector Machine (SVM).
The multiple regression model yielded an R² value of 0.83, indicating that the independent variables explain 83% of the variance in amount of C. However, the adjusted R² of 0.63 suggests that some independent variables may not be contributing meaningfully to the model, potentially leading to overfitting. The F-statistic of 4.14 and the p-value of 0.0067 (p < 0.05) confirm that the overall model is statistically significant, meaning that the selected climate variables jointly have a meaningful association with amount of C (kg/m²).
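The gap between R² and adjusted R² follows from the standard penalty for the number of predictors relative to the sample size. The sketch below implements the textbook formulas for adjusted R² and the overall F-statistic of an ordinary least squares model with an intercept. The sample size `n` and predictor count `p` used here are hypothetical illustration values; the section does not report them.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    dof = n - p - 1
    if dof <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / dof


def overall_f_statistic(r2, n, p):
    """F-statistic testing whether all p slope coefficients are zero."""
    dof = n - p - 1
    if dof <= 0 or p <= 0:
        raise ValueError("need p > 0 and n > p + 1")
    return (r2 / p) / ((1.0 - r2) / dof)


# R^2 is the value reported in the text; n and p are ASSUMED for illustration.
r2 = 0.83
n_hypothetical = 25
p_hypothetical = 13

adj = adjusted_r2(r2, n_hypothetical, p_hypothetical)
f_stat = overall_f_statistic(r2, n_hypothetical, p_hypothetical)
```

With these assumed inputs, `adj` is about 0.63 and `f_stat` is about 4.13, which shows how a model with many predictors and few observations can combine a high R² with a much lower adjusted R².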
When compared with machine learning models, Random Forest achieved the highest R² value (0.95), followed by Gradient Tree Boost (0.87), ANN (0.87), and SVM (0.87). However, despite its high R², Random Forest has a significantly lower adjusted R² (0.75), indicating that the model might be overfitting the training data. The ANN model demonstrated a higher accuracy score (0.80), suggesting that it might be more effective in capturing complex relationships compared to the other models. Meanwhile, SVM and multiple regression showed the lowest accuracy (0.50), highlighting potential limitations in capturing nonlinear dependencies.
The results indicate that while multiple regression provides a solid baseline model, alternative non-linear approaches like Random Forest or ANN may offer improved predictive performance. The relatively low adjusted R² in multiple regression suggests that some independent variables may be redundant, and variable selection techniques such as stepwise regression or LASSO regression could enhance model efficiency. Additionally, incorporating spatial analysis, feature engineering, and time-series modeling could further refine predictions and mitigate overfitting. Future studies should explore hybrid modeling approaches that integrate regression with machine learning algorithms to enhance accuracy and robustness in carbon storage assessments.
Figure 5 presents the SHAP (SHapley Additive exPlanations) values for the Random Forest (RF) model. This analysis highlights the importance and direction of the impact of each feature on the model predictions. The findings indicate that “Average th” and “Average sph” are the most influential features in the model. “Average th” shows a negative impact on predictions at low values, while contributing positively at higher values. Similarly, “Average sph” demonstrates a strong positive impact at higher values. “Average tmin” also plays a significant role in the model, whereas features like “Average pr” and “Average eto” exhibit limited influence. Notably, “Average fm100” and “Average fm1000” have both positive and negative impacts, suggesting a more complex relationship with the model predictions. Overall, this analysis provides critical insights into how environmental variables influence the model’s predictive performance and underscores the need for further investigation of key features like “Average th” and “Average sph”. These findings are valuable for better understanding the decision-making processes of the Random Forest model and identifying variables to prioritize in future studies.
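The idea behind SHAP values can be illustrated with an exact Shapley computation on a very small model. This is a sketch under stated assumptions, not the procedure used for Figure 5: the toy prediction function, the feature values, and the baseline are hypothetical, and features outside a coalition are simply set to their baseline value, which is a simplification of how SHAP tooling treats a Random Forest.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this suits toy cases only.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(x):
    """Hypothetical stand-in for a fitted model (NOT the study's model)."""
    return 2.0 * x["sph"] - 1.0 * x["srad"] + 0.5 * x["sph"] * x["pr"]


instance = {"sph": 3.0, "srad": 2.0, "pr": 4.0}
baseline = {"sph": 1.0, "srad": 1.0, "pr": 2.0}

phi = shapley_values(toy_model, instance, baseline)
```

The contributions in `phi` sum to the difference between the prediction for `instance` and the prediction for `baseline`, which is the additivity property that lets a SHAP plot be read as a decomposition of each prediction. A positive entry pushes the prediction above the baseline and a negative entry pushes it below.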
Five different carbon storage maps derived from remote sensing techniques and multiple regression analyses are analyzed in detail. These maps, presented in Figure 6, illustrate the spatial variations of carbon stocks and their relationships with climate variables, fire risk, evapotranspiration, and soil moisture. The general carbon distribution map provides an overview of the spatial patterns of carbon storage across the study area. Areas with low carbon content are predominantly found in regions with intensive agricultural activities, sparse vegetation, or a history of wildfires. In contrast, forested areas, natural grasslands, and wetlands, which accumulate significant amounts of organic matter, exhibit higher carbon storage. Identifying these patterns is crucial for effective carbon management, as it allows for the prioritization of restoration efforts in low-carbon areas.
The climate-based carbon distribution map examines the impact of variables such as precipitation, temperature, humidity, and evaporation on carbon storage. Low-precipitation and high-temperature regions tend to have significantly reduced carbon levels due to accelerated organic matter decomposition. In contrast, humid areas with moderate temperatures retain higher carbon stocks, as photosynthetic activity is more efficient in these conditions. However, excessive rainfall can lead to increased soil erosion and organic matter loss, negatively impacting carbon storage. These findings highlight the need for water management strategies, particularly in arid regions, to mitigate carbon losses. The fire risk index (Burning Index, BI) and carbon distribution map reveal the impact of wildfires on carbon stocks. Areas with a history of frequent fires exhibit significantly lower carbon levels, emphasizing the destructive role of fire in carbon sequestration. In high-risk zones, implementing controlled burning techniques, enhancing soil moisture retention, and preserving vegetation cover can help minimize carbon losses.
The evapotranspiration and carbon distribution map explores the relationship between plant transpiration, soil evaporation, and carbon storage. Regions with high evapotranspiration rates generally have lower carbon levels, while areas with moderate or low evapotranspiration tend to store more carbon. This highlights the critical role of water conservation policies in maintaining carbon stocks and suggests that reducing excessive water loss can enhance carbon retention. Finally, the soil moisture and carbon distribution map demonstrates the direct correlation between soil moisture content and carbon storage. Areas with low soil moisture show reduced carbon levels, whereas regions with higher moisture content retain more carbon. This underscores the importance of sustainable water management strategies in agricultural and natural ecosystems to enhance carbon sequestration.
The carbon amount distribution map, using a 0–1 scale with blue representing low values and red indicating high values, provides further insights into the spatial distribution of carbon intensity. Red areas, which represent the highest carbon density, are primarily concentrated in regions with significant carbon emissions or natural carbon storage potential. These may include industrial zones, urban centers, and dense vegetation ecosystems. For instance, Salt Lake City and its surroundings likely exhibit high emissions due to industrial activities, whereas forested and wetland regions contribute to natural carbon sequestration. Additionally, Utah’s fossil fuel reserves, particularly coal and oil extraction sites, may also appear as red zones due to their substantial carbon outputs. In contrast, blue areas represent regions with low carbon storage capacity, including arid deserts, drylands, and sparsely vegetated ecosystems where biomass accumulation is minimal.
From a policy perspective, high-carbon areas (red zones) should be prioritized for emission reduction strategies, such as the implementation of carbon capture and storage (CCS) technologies in industrial and energy production centers. Meanwhile, low-carbon areas (blue zones) may require restoration and afforestation efforts to enhance their carbon sequestration potential. The preservation and expansion of natural carbon sinks, such as forests and wetlands, should be a key priority for sustainable land management. Future land-use policies should incorporate these spatial analyses to develop region-specific climate resilience strategies, ensuring both carbon mitigation and ecosystem stability across Utah.
  3.2. Limitations
While this study provides valuable insights into the use of Landsat-enhanced MOD17 for monitoring rangeland ecosystems, several limitations must be acknowledged. The assumptions inherent in the MOD17 algorithm, particularly the relationship between vegetation indices and primary productivity, may introduce a degree of uncertainty, especially when applied to regions with distinct environmental conditions. Furthermore, although this study focuses on Utah’s ecosystem, caution should be exercised when extrapolating the findings to other regions. Ecosystem-specific factors, such as climate, vegetation types, and soil properties, can influence the model’s applicability in different geographic areas.
Additionally, satellite resolution plays a critical role in carbon storage estimates. While Landsat data offer higher spatial resolution than MODIS, they still have limitations in capturing fine-scale variations within the landscape, which may affect the accuracy of carbon sequestration estimates. Future research may benefit from integrating data from multiple satellite platforms with higher spatial and temporal resolution to mitigate these uncertainties and enhance the model’s robustness across diverse ecosystems. To further improve model interpretability, future studies will explore the use of advanced visualization techniques such as SHAP (SHapley Additive exPlanations) values, which can provide more transparent insights into the factors influencing model predictions. These techniques will help enhance the understanding of the underlying patterns in the data and improve the model’s accessibility to users, making it easier to interpret and trust in decision-making contexts.
  4. Discussion
The results of this study offer valuable insights into the carbon dynamics of Utah over the past three decades, highlighting the complex interplay of natural and anthropogenic factors influencing carbon storage. Spatial patterns of carbon storage, as indicated by NPP8, show significant variations across the region, with forested areas exhibiting greater carbon storage potential due to higher productivity, while arid and semi-arid regions demonstrate lower carbon storage capacity. This pattern aligns with existing literature, which emphasizes the critical role of vegetation cover and precipitation in determining carbon sequestration across different ecosystems. The lower carbon storage potential in water-limited regions, such as those found in Utah, underscores the importance of water availability in maintaining carbon storage capacity.
Temporal trends in Net Primary Productivity (NPP) and Gross Primary Productivity (GPP) further reinforce the sensitivity of carbon dynamics to climatic fluctuations. Interannual variations in carbon uptake are closely tied to precipitation patterns, with years of higher precipitation correlating with increased productivity and greater carbon sequestration. Conversely, periods of drought were associated with reduced carbon storage, emphasizing the vulnerability of ecosystems, particularly in arid and semi-arid regions, to prolonged dry conditions. These findings are consistent with previous research that highlights the significant impact of climate variability, especially in water-limited environments, on carbon cycling.
The observed correlations between climate variables, particularly precipitation and temperature, and carbon storage in Utah suggest that future climatic changes could substantially alter the region’s carbon balance. As climate change continues to affect regional hydrology, a decrease in carbon sequestration capacity may occur, potentially contributing to a positive (self-reinforcing) feedback loop that exacerbates global warming. The risk of reduced carbon uptake due to changing precipitation patterns and rising temperatures highlights the need for strategies that promote ecosystem health and resilience to mitigate the impacts of climate change on carbon storage.
The comparison of multiple regression with machine learning models (Random Forest, Gradient Tree Boost, ANN, and SVM) provides additional insight into modeling carbon storage dynamics. While multiple regression (R² = 0.83) effectively captured the linear relationships between carbon storage and climate variables, the higher performance of machine learning models, particularly Random Forest (R² = 0.95), suggests that non-linear interactions play a crucial role in determining carbon sequestration. Furthermore, the ANN model demonstrated a higher accuracy (0.80), indicating its potential to improve predictions. These findings suggest that future research should explore hybrid modeling approaches, incorporating both statistical and machine learning techniques, to refine carbon storage predictions and account for complex ecological interactions.
Additionally, the spatial analysis of carbon storage revealed critical patterns that can inform sustainable land management strategies. The amount of C distribution map, which visualizes carbon density variations, reinforces the importance of identifying priority areas for conservation and carbon sequestration enhancement. High-carbon areas (depicted in red) correspond to regions with high vegetation productivity, while low-carbon areas (in blue) represent zones with lower biomass, such as arid lands or intensively managed agricultural fields. These findings highlight the need to implement restoration strategies in low-carbon areas, enhance vegetation cover, and improve soil moisture retention to optimize carbon sequestration potential.
The study also underscores the importance of localized research to deepen our understanding of how climate, vegetation, and carbon dynamics interact in semi-arid regions. Future research efforts should aim to incorporate long-term climate projections and various land-use scenarios to refine carbon sequestration predictions under different climate change conditions. Additionally, integrating remote sensing technologies with ground-based observations will enhance the accuracy and spatial resolution of carbon estimates, providing a more comprehensive understanding of carbon storage in these ecosystems.
Finally, these findings emphasize the need for sustainable land management practices and effective climate adaptation strategies to maintain the role of ecosystems in mitigating global carbon emissions. As climate change continues to pose significant challenges to ecosystem services, it is critical to implement adaptive management practices that promote carbon sequestration, restore degraded lands, and preserve vegetation cover. The results of this study provide valuable information for policymakers, land managers, and conservationists by highlighting key drivers of carbon storage dynamics and offering insights into targeted mitigation strategies. By maintaining and enhancing carbon storage capacity, we can contribute to global efforts in mitigating climate change and ensuring long-term environmental sustainability.
  5. Conclusions
In conclusion, this study highlights the critical role of climate and land use in shaping carbon dynamics in Utah over the past 30 years. The results demonstrate significant spatial and temporal variations in carbon storage, influenced by factors such as vegetation type, precipitation, and temperature. The findings underscore the vulnerability of carbon sequestration to climate variability, particularly in arid regions, where droughts have a pronounced impact on productivity.
This study provides a comprehensive analysis of the relationship between climatic variables and carbon storage across Utah, utilizing machine learning and statistical models to enhance carbon capture assessments. The findings indicate that Random Forest and Artificial Neural Networks (ANN) outperform other models, with Random Forest excelling in explanatory power (R² = 0.95) and ANN demonstrating high predictive accuracy (0.80). These models offer critical insights for policymakers and researchers in designing effective carbon sequestration strategies aligned with regional climatic conditions.
The analysis also reveals the complex relationships between climatic factors, such as precipitation, temperature, solar radiation, and humidity, and their effects on carbon retention. In particular, the study highlights the negative impact of extreme climatic conditions, such as excessive heat, drought, and fire risks, which reduce carbon stocks and disrupt ecosystem productivity. These findings emphasize the urgency of implementing effective climate adaptation strategies to mitigate the impacts of climate change on carbon dynamics.
The study underscores the importance of integrating advanced modeling techniques into carbon capture and sequestration (CCS) initiatives to optimize mitigation efforts. The strong correlations between precipitation, humidity, and carbon storage highlight the necessity of climate-informed CCS strategies to enhance carbon retention in terrestrial ecosystems. Furthermore, the identification of high-sequestration regions provides a foundation for implementing targeted interventions, ensuring efficient resource utilization while maximizing climate benefits.
The results of the multiple regression model provide a strong statistical foundation for understanding the influence of climate variables on carbon storage. However, the study also highlights the need for model optimization, as certain variables did not contribute significantly to the model’s explanatory power. Future research should focus on refining predictive models by incorporating remote sensing data, improving spatial resolution, and integrating socio-economic factors into carbon management frameworks.
Ultimately, the study advocates a holistic approach to carbon management that includes sustainable land use practices, water conservation strategies, and fire prevention measures. By prioritizing the restoration of low-carbon areas and improving land management policies in high-risk regions, significant strides can be made in enhancing carbon sequestration and contributing to global climate change mitigation efforts. By leveraging data-driven approaches, decision-makers can develop sustainable policies that enhance carbon capture efforts, ultimately contributing to global climate resilience. Moving forward, continued research and the application of advanced modeling techniques will be essential to refining carbon management strategies and ensuring the long-term sustainability of carbon storage in the face of a changing climate.