Article

High-Resolution Soil Surface Moisture Projections for European Perennial Crops: A Machine Learning Framework Integrating Sentinel-1 and CMIP6 Climate Scenarios

1
Centre for the Research and Technology of Agro-Environmental and Biological Sciences (CITAB), Institute for Innovation, Capacity Building, and Sustainability of Agri-Food Production (Inov4Agro), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
2
Chemistry Center of Vila Real (CQVR), University of Trás-os-Montes and Alto Douro (UTAD), Quinta de Prados, P.O. Box 1013, 5001-801 Vila Real, Portugal
3
Instituto Universitario de Investigación en Olivar y Aceites de Oliva, University of Jaén, 23071 Jaén, Spain
4
Department of Agricultural, Forest and Food Sciences, University of Torino, 10095 Grugliasco, Italy
5
Institute of Agriculture, Department of Soil Science, Warsaw University of Life Sciences—SGGW, Nowoursynowska 159, 02-776 Warsaw, Poland
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(12), 1902; https://doi.org/10.3390/rs18121902 (registering DOI)
Submission received: 15 April 2026 / Revised: 1 June 2026 / Accepted: 5 June 2026 / Published: 9 June 2026

Highlights

What are the main findings?
  • Vineyards emerged as the most predictable perennial crop system (R² ≈ 0.87), while olive groves showed the lowest predictive performance (R² ≈ 0.63–0.68), reflecting crop-specific differences in surface moisture sensitivity linked to rooting depth and drought adaptation strategies.
  • Projections under the high-emission SSP5-8.5 scenario indicate soil moisture declines of 8–24% by 2041–2070, with historically wetter Living Labs (LLs) experiencing the largest absolute losses despite greater initial moisture buffers.
What are the implications of the main findings?
  • High-resolution, crop-specific projections enable targeted climate adaptation strategies for perennial agricultural systems at farm and landscape scales, supporting precision irrigation and drought risk management.
  • The results suggest that coarse-resolution climate models may underestimate future soil drying in European agricultural regions, underscoring the necessity of kilometre-scale downscaling methodologies for reliable impact assessments.

Abstract

Soil surface moisture (SSM) is a critical indicator of agricultural drought, yet high-resolution projections under climate change remain scarce. This study develops a machine learning framework to predict and project SSM at 1 km resolution across five European Living Labs (LLs), encompassing vineyards, olive groves, and fruit tree systems. Historical Sentinel-1 SSM observations (2014–2024) were used to train ensemble models (Random Forest, XGBoost, ExtraTrees, LightGBM) incorporating climate variables, soil texture, topography, and land use. Tree-based models achieved R² values of 0.63–0.87. Vineyards showed the highest predictability (R² ≈ 0.87), reflecting their sensitivity to short-term atmospheric demand and surface water availability, whereas olive groves were the least predictable (R² ≈ 0.63–0.68), consistent with deeper rooting systems and greater drought buffering capacity. When forced with bias-corrected CMIP6 projections under SSP1-2.6 and SSP5-8.5 for 2041–2070, models indicate minimal changes under SSP1-2.6 but pronounced SSM declines of 8–24% under SSP5-8.5, with historically wetter regions experiencing the largest absolute losses. SHAP analysis confirmed precipitation and potential evapotranspiration as dominant predictors across all crops. This framework provides spatially explicit, crop-relevant SSM projections to support climate adaptation in European agricultural landscapes.

1. Introduction

Climate change is intensifying the frequency, duration and severity of droughts globally. Rising temperatures and altered precipitation regimes disrupt the water balance, enhancing evapotranspiration and soil desiccation [1]. These impacts are critical in Mediterranean and semi-arid regions, where droughts frequently threaten agricultural productivity and ecosystem stability [2]. Climate projections consistently suggest that such conditions will worsen throughout the 21st century, highlighting the urgency of quantifying how soil water availability will evolve under future scenarios [3].
Among hydrological state variables, Soil Surface Moisture (SSM) plays a central role as an indicator of land–atmosphere interactions [4]. Due to its high temporal variability, SSM is a key indicator of short-term drought dynamics for precision agriculture and land-surface modelling [5]. However, the strong spatial heterogeneity of soil, vegetation, and topographic features poses major challenges for consistent monitoring and prediction of SSM at a practical spatial resolution [6]. Currently available SSM datasets can be broadly classified by the approach used to generate them, namely physically based modelling and observation-driven or machine learning (ML) methods. Physically based products, such as ERA5-Land [7], the Global Land Data Assimilation System (GLDAS) [8], and Copernicus Land Monitoring Service (CLMS) reanalyses [9], simulate soil moisture through land-surface parameterisations constrained by atmospheric forcing. These provide long records, but their 9–25 km spatial resolution and biases over heterogeneous terrain limit locally actionable assessments [10]. Assimilation-based systems, such as the NASA Soil Moisture Active Passive Level 4 product (SMAP L4) [11], improve temporal stability, but scale mismatch remains a central bottleneck for farm-relevant applications. In contrast, observation-driven and ML approaches infer SSM from multi-sensor remote sensing and environmental covariates, typically integrating Sentinel-1 backscatter, SMAP/SMOS brightness temperature, optical indices, land surface temperature, soil texture, and terrain information [12]. These frameworks can better capture non-linear controls and deliver finer spatial detail (1–5 km). Operational examples include the ESA CCI Soil Moisture dataset [13], which provides a multi-decadal global record on a 0.25° grid (~25 km), and the Copernicus Sentinel-1 Surface Soil Moisture product, at 1 km grid spacing across Europe [14].
Building on these foundations, recent advances have generated hybrid and high-resolution datasets that combine physical consistency with data-driven flexibility. Xu et al. (2024) [15] introduced GSSM-10, a multimodal deep learning ensemble that combines SMAP, Sentinel-1 and ERA5 inputs to generate SSM on a 10 m to 1 km grid. Zhu et al. (2025) [16] proposed a fusion framework integrating microwave brightness temperature with optical reflectance to estimate soil moisture at 1 km, while Rabiei et al. (2025) [17] coupled Sentinel-1, SMAP, and digital soil maps within deep neural networks to retrieve both surface and subsurface moisture, improving cross-variable physical consistency. Progress has also been made in interpretability, as Nikraftar et al. (2025) [18] developed interpretable ML approaches to diagnose dominant controls on soil moisture. Extending beyond historical mapping, Feng et al. (2024) [19] merged CMIP6 projections with ML downscaling to simulate soil moisture under SSP scenarios at 0.25° grid resolution. Additional work demonstrated the value of model ensembles and temporal learning for soil moisture forecasting and drought indicators [20].
Despite these advances, key limitations remain. Most ML-based soil moisture studies focus on historical periods. Applying such models to future conditions raises two problems: predictors shift outside their training distribution, and climate-model biases propagate into the projections, while bias-correction strategies are often not transferable across time. As a result, high-resolution, bias-adjusted SSM projections that are spatially detailed and climatically robust remain scarce. To address this gap, this study develops an ML framework to predict and project SSM at 1 km resolution for five European Living Labs (LLs), focusing on agricultural areas dominated by vineyards, olive groves, and temperate fruit trees. Historical Sentinel-1 SSM observations (2014–2024) are used as training targets [14]. The trained models are subsequently forced with bias-corrected CMIP6 climate projections under SSP1-2.6 and SSP5-8.5 to generate mid-century (2041–2070) SSM projections. By combining Sentinel-1 spatial detail with climate-model temporal consistency, the framework provides crop-specific, high-resolution projections suitable for drought-risk assessment and adaptation planning across contrasting European agro-climatic contexts, while attributing SSM variability to individual environmental drivers such as precipitation and evapotranspiration.

2. Materials and Methods

2.1. Study Areas and LLs

This study was conducted in five European LLs, formally defined within the framework of the LivingSoiLL (Healthy Soil to Permanent Crops LLs, https://livingsoill.eu/, accessed on 4 June 2026) Horizon Europe project, which aims to investigate soil health, climate impacts, and sustainable land management across representative agricultural regions in Europe. LLs were delineated by the project consortium to capture contrasting environmental, climatic, and agricultural conditions (Figure 1). The five targeted LLs are located in the North of Portugal/North of Spain (Galicia) (LL1: Luso-Galician), South of Spain (LL2: Andalusian), Northwestern Italy (LL3: Northwestern Italy—Piemonte), France (LL4: Loire Valley and Beaujolais) and Poland (LL5: Grójec), collectively spanning Mediterranean, temperate oceanic, and continental climate zones. The selected LLs encompass agricultural landscapes dominated by vineyards, olive groves, and temperate fruit trees, which are among the most climate-sensitive perennial cropping systems in Europe, playing a key role in regional water use and drought vulnerability, as well as being socioeconomically relevant in their corresponding regions.
LL1 (Luso-Galician) covers the north of Portugal and Galicia (northwestern Spain). This Living Lab features a heterogeneous agricultural mosaic, including vineyards, olive groves and fruit tree orchards. According to the Köppen–Geiger climate classification [21] (Table 1), LL1 is mostly characterised by hot-summer (Csa) or warm-summer (Csb) Mediterranean climates, with mild, wet winters and dry summers, resulting in strong seasonality in both precipitation and evapotranspiration. LL2 (Andalusian) is situated in southern Spain and is predominantly covered by extensive olive groves and fruit tree plantations. It is generally classified as hot-summer Mediterranean (Csa) and is characterised by high annual temperatures, low and irregular precipitation, and prolonged summer droughts. LL3 (Northwestern Italy) is largely dominated by vineyard-based agricultural systems. Its prevailing climate is humid subtropical (Cfa), featuring relatively evenly distributed precipitation throughout the year, warm summers, and moderate seasonal temperature variability. LL4 (Loire Valley and Beaujolais) comprises two separate regions in central and eastern France and is primarily associated with vineyard cultivation. Both regions are predominantly classified as temperate oceanic (Cfb), with moderate temperatures, regular precipitation, and limited summer water stress. Lastly, LL5 (Grójec) is located in central Poland and is dominated by intensive temperate fruit tree (apple) production systems. The region falls within a humid continental climate (Dfb), characterised by cold winters, warm summers, and strong seasonal contrasts in both temperature and precipitation. Precipitation there is mainly concentrated during late spring and summer, associated with convective rainfall events, while winter precipitation is lower and often falls as snow.

2.2. Data Collection and Processing

Data collection followed a workflow integrating multi-source datasets into a consistent SSM prediction framework (Figure 2). Historical climate data were downscaled to a 1 km grid and aggregated monthly to match the resolution of the SSM data. Land-use data were incorporated to characterise vegetation types, whereas soil texture data were used to estimate soil physical and hydraulic properties. Topographic variables (elevation, slope, and aspect), derived from a digital terrain model (DTM) with 30 arc-second resolution, were included to capture terrain-driven controls on soil water redistribution and drainage processes. All static variables were resampled to the same spatial grid (1 km) and masked consistently across the LLs. Satellite-derived surface soil moisture observations at 1 km resolution, obtained from Sentinel-1 products, were used as the target variable for model training and validation. Quality control procedures were applied to remove invalid observations, reduce the influence of outliers, and ensure temporal continuity. The final modelling dataset was constructed by spatially and temporally joining climate, land-use, soil texture and topographic predictors with corresponding SSM observations, resulting in a spatio-temporal dataset suitable for ML applications.
In the Model Development step, multiple ML algorithms, including ExtraTrees (ET), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Linear Regression (LR) and Random Forest (RF), were trained and evaluated using the historical dataset (2014–2024). Model validation used a temporally consistent strategy to prevent information leakage and to assess predictive performance under realistic conditions. Feature selection and importance analysis identified the main drivers of SSM variability across different agro-climatic contexts.
To assess future soil moisture dynamics (Future Projection step), the trained models were then forced with bias-corrected and downscaled CMIP6 climate projections under different Shared Socioeconomic Pathways (SSPs). This approach enabled the generation of high-resolution SSM projections, consistent with future climate conditions, while preserving the relationships learned from historical observations.

2.2.1. Climate Data

Historical climate predictors (2014–2024), used for model training, were derived from the ERA5-Land reanalysis dataset, which provides global hourly meteorological fields at 9 km resolution and is widely used in hydrological, agro-climatic, and soil moisture studies. In this study, key atmospheric drivers of SSM dynamics were extracted, including precipitation, near-surface air temperature, surface downward shortwave radiation (RSDS), and potential evapotranspiration (PET). ERA5-Land variables were aggregated from hourly to monthly values to reduce high-frequency noise and to match the temporal resolution adopted for modelling. Specifically, monthly accumulated precipitation, monthly mean air temperature, and monthly mean surface solar radiation downwards (W m−2) were computed using standard variable-specific aggregation procedures. PET was estimated using the Hargreaves method, i.e., calculated using the minimum, maximum, and mean air temperatures following the original formulation [22], which has been widely applied in data-scarce environments and large-scale hydro-climatic studies [23].
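The temperature-based PET estimate described above can be sketched as follows. This is a minimal illustration that assumes the widely used Hargreaves-Samani form, PET = 0.0023 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin), with extraterrestrial radiation Ra computed from latitude and day of year using the FAO-56 expressions. The function names and the daily time step are choices made for this example and are not taken from the study.

```python
import math


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation Ra in MJ m-2 day-1 (FAO-56 expressions)."""
    gsc = 0.0820  # solar constant, MJ m-2 min-1
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Sunset hour angle; the clamp keeps acos defined at very high latitudes
    cos_ws = -math.tan(phi) * math.tan(delta)
    ws = math.acos(max(-1.0, min(1.0, cos_ws)))
    return (24.0 * 60.0 / math.pi) * gsc * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def hargreaves_pet(tmin, tmax, tmean, lat_deg, day_of_year):
    """Hargreaves-Samani PET in mm day-1; temperatures in degrees Celsius."""
    # 0.408 converts MJ m-2 day-1 into mm day-1 of equivalent evaporation
    ra_mm = 0.408 * extraterrestrial_radiation(lat_deg, day_of_year)
    temp_range = max(tmax - tmin, 0.0)  # guard against inverted inputs
    return 0.0023 * ra_mm * (tmean + 17.8) * math.sqrt(temp_range)
```

For a mid-July day at roughly 41° N with Tmin = 15 °C, Tmax = 30 °C and Tmean = 22.5 °C, this sketch gives a PET of about 6 mm per day. Monthly PET would then follow by summing the daily values.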
All climate layers were harmonised to a common 1 km grid using variable-specific procedures. Air temperature and precipitation were downscaled using a statistical, topography-aware framework based on the CHELSA method, incorporating elevation-dependent lapse rates and orographic effects to preserve fine-scale climatic gradients [24]. In contrast, RSDS was resampled to 1 km by bilinear interpolation, reflecting its smoother spatial variability at monthly timescales. All variables were reprojected to WGS84 (EPSG:4326) to ensure consistency with Sentinel-1 SSM and ancillary predictors.
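The two harmonisation routes can be illustrated with a deliberately simplified sketch. It assumes a constant lapse rate of −6.5 K per km for the elevation adjustment of temperature, whereas the CHELSA method derives its corrections in a more elaborate way, and it uses plain bilinear interpolation on a small 2-D list for the smoother radiation field. Both helper functions are hypothetical and only show the principle.

```python
def lapse_rate_adjust(t_coarse, z_coarse, z_fine, lapse_rate=-0.0065):
    """Shift a coarse-cell temperature to a fine-cell elevation.

    lapse_rate is in K per metre; the default is an assumed constant,
    not a value reported in the study.
    """
    return t_coarse + lapse_rate * (z_fine - z_coarse)


def bilinear(grid, row, col):
    """Bilinearly interpolate a 2-D list at a fractional (row, col) index.

    Assumes a grid of at least 2 x 2 cells and 0 <= row <= nrows - 1,
    0 <= col <= ncols - 1.
    """
    r0 = min(int(row), len(grid) - 2)
    c0 = min(int(col), len(grid[0]) - 2)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr
```

With these defaults, a 1 km pixel lying 300 m above the mean elevation of its coarse cell is cooled by about 2 K, which is the kind of fine-scale gradient that simple interpolation alone would miss.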
For the historical baseline (1981–2010) and future climate (2041–2070) forcing, the climate data were obtained from the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP-CMIP6) dataset [25], which provides daily CMIP6 projections at 0.25° (~25 km), already with bias correction implemented. To represent climate model uncertainty, four CMIP6 GCMs were selected: GFDL-ESM4 [26], IPSL-CM6A-LR [27], MPI-ESM1-2-HR [28], and UKESM1-0-LL [29]. Two contrasting SSPs were used to span plausible trajectories: SSP1-2.6 (mitigation) and SSP5-8.5 (high emissions) [30]. Daily precipitation and air temperature were processed following a similar downscaling approach to ERA5-Land and subsequently aggregated to monthly values. PET, for both baseline and future periods, was recalculated using the same Hargreaves methodology [22], ensuring methodological consistency between training and projections, as well as allowing PET changes to reflect projected temperature variability and change.
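The daily-to-monthly aggregation of the projection data could look like the sketch below. It assumes that daily precipitation arrives as a flux in kg m-2 s-1 and air temperature in kelvin, as is conventional for CMIP-style files, so both are converted before precipitation is summed and temperature is averaged. The record layout and output keys are hypothetical.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400.0


def monthly_aggregate(daily_records):
    """Aggregate daily (year, month, pr_flux, tas_kelvin) tuples by month.

    Returns {(year, month): {"pr_mm": monthly total, "tas_c": monthly mean}}.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0])
    for year, month, pr_flux, tas_k in daily_records:
        entry = acc[(year, month)]
        entry[0] += pr_flux * SECONDS_PER_DAY  # kg m-2 s-1 -> mm per day
        entry[1] += tas_k - 273.15             # kelvin -> degrees Celsius
        entry[2] += 1
    return {
        key: {"pr_mm": e[0], "tas_c": e[1] / e[2]}
        for key, e in sorted(acc.items())
    }
```

Precipitation is accumulated while temperature is averaged, mirroring the variable-specific aggregation used for the ERA5-Land training data.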

2.2.2. Land-Use, Soil Texture and Topographic Data

Land-use (Land-Use Harmonisation), soil texture (Harmonised World Soil Database v2.0), and topographic variables (Digital Terrain Model—WorldClim v2.1) were incorporated as static predictors to represent key surface and subsurface controls on SSM dynamics. These variables capture spatial heterogeneity in vegetation cover, soil hydraulic properties, and terrain-driven processes that strongly modulate water infiltration, retention, redistribution, and evaporative losses, particularly at fine spatial scales.
Land-use information for the delineation of vineyards, olive groves, and fruit tree orchards within each Living Lab was derived from the CORINE Land Cover dataset 2018 [31], while the Land-Use Harmonisation (LUH2) [32] dataset was used to provide consistent land-use categories for model training across historical and future periods. LUH2-based land-use classes were spatially harmonised to a common 1 km grid and used to ensure that model training and projections were restricted to relevant agricultural areas. These perennial cropping systems exhibit distinct canopy structures, rooting depths, and management practices, which influence soil moisture availability and its temporal persistence. Soil physical properties were obtained from the Harmonised World Soil Database v2.0 [33], providing spatially consistent information on soil texture classes based on the United States Department of Agriculture (USDA) classification scheme. Soil texture was used as a proxy for key hydraulic characteristics, including infiltration capacity, water-holding capacity, and drainage behaviour, which govern the partitioning of precipitation into infiltration, runoff, and evaporation. These properties are particularly important for interpreting spatial contrasts in SSM responses under different climatic and land-use conditions. Topographic variables (Elevation, slope, and aspect) were obtained from a DTM from the Shuttle Radar Topography Mission (SRTM), available in the WorldClim v2.1 database [34]. All these layers were spatially harmonised to a common 1 km grid and consistently masked across all LLs.

2.2.3. Soil Surface Moisture Reference Data

The Sentinel-1 Copernicus Surface Soil Moisture product [14] was used as reference SSM data for model training and validation over the period 2014–2024. This Copernicus Global Land Service product provides near-surface soil moisture estimates at 1 km resolution across Europe, representing relative moisture conditions of the upper soil layer (approximately the top 5 cm), expressed as a percentage of saturation (0–100%). The retrieval is derived from Sentinel-1A and Sentinel-1B C-band SAR backscatter acquired in Interferometric Wide Swath mode with VV polarisation. As an active microwave system, Sentinel-1 is largely insensitive to cloud cover and illumination, enabling consistent retrievals under a wide range of atmospheric conditions and providing frequent revisit coverage across Europe.
Retrieval uses the TU Wien change-detection algorithm, scaling instantaneous backscatter against historical dry/wet references. The product includes radiometric calibration, terrain correction, and normalisation for incidence angle effects, along with quality flags and masks to identify retrievals affected by water bodies, low sensitivity, complex topography, frozen or snow-covered soils, and anomalous dry or wet conditions. Although daily composites are routinely provided, valid observations typically occur every 1.5–4 days, depending on the latitude and orbital geometry.
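The core scaling step of a change-detection retrieval can be sketched as follows. This illustrates only the linear scaling between dry and wet references. The operational product additionally applies incidence-angle normalisation, masking and quality flagging, none of which are reproduced here, and the function name and decibel inputs are assumptions made for the example.

```python
def change_detection_ssm(sigma0_db, dry_ref_db, wet_ref_db):
    """Scale backscatter (dB) between dry and wet references to 0-100 %."""
    span = wet_ref_db - dry_ref_db
    if span <= 0:
        raise ValueError("wet reference must exceed dry reference")
    relative = 100.0 * (sigma0_db - dry_ref_db) / span
    # Values outside the historical range are clipped to the valid interval
    return max(0.0, min(100.0, relative))
```

A backscatter value halfway between the two references therefore maps to 50% of saturation, which is why the product expresses relative rather than volumetric moisture.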
The 2014–2024 record, although limited by the operational availability of the Sentinel-1 product, already spans contrasting wet years and two of the most severe European droughts of recent centuries (2018 and 2022) [35,36], thus sampling conditions close to the deficits projected for mid-century.
Sentinel-1 SSM quality was evaluated against International Soil Moisture Network (ISMN) stations [37] (Figure 3). The product performed well over agricultural landscapes but less well over dense vegetation and mountainous terrain.

2.2.4. Data Pre-Processing and Dataset Creation

Agricultural areas (vineyards, olive groves, fruit trees) were delineated using QGIS 3.40 by intersecting Living Lab boundaries with CORINE Land Cover 2018 [31]. Subsequently, a shapefile containing spatial centroids was generated for each crop type within each Living Lab. Each centroid represents a 1 km² pixel and includes a unique identifier (UID), crop type, and its corresponding geographic coordinates (latitude and longitude). This centroid-based strategy provides spatial consistency and computational efficiency while preserving representativeness. Using this centroid shapefile, values of all climate, soil texture, land-use and topographic predictor features were systematically extracted from the pre-processed datasets. Before extraction, all datasets were reprojected to a common coordinate reference system (WGS84, EPSG:4326), resampled to a spatial resolution of 1 km, and temporally aggregated to monthly values, covering the period from January 2014 to December 2024. Monthly aggregation was adopted to reduce short-term variability and noise associated with daily extremes, while ensuring temporal consistency between predictor variables and the soil surface moisture target reference data.
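Extracting predictor values at the centroids amounts to looking up the grid cell that contains each point. The sketch below assumes a regular latitude/longitude grid stored as a 2-D list with rows running north to south. The study performed this step with GIS tooling, so the function and its parameters are purely illustrative.

```python
import math


def sample_at_centroid(grid, lat, lon, lat_max, lon_min, cell_deg):
    """Return the value of the grid cell containing (lat, lon), or None.

    grid rows run north to south and columns west to east; lat_max and
    lon_min are the coordinates of the grid's north-west corner.
    """
    row = math.floor((lat_max - lat) / cell_deg)
    col = math.floor((lon - lon_min) / cell_deg)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None  # centroid falls outside the raster extent
```

Returning None for out-of-extent centroids makes it straightforward to drop incomplete records during the subsequent quality control.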
Quality control removed missing values and unreliable pixels, retaining only observations with complete predictor–SSM correspondence. The final historical modelling dataset comprises 23 predictor features, including lagged variables from the two preceding months (Table A1, Appendix A), and a total of 3,691,609 observations, spanning multiple crops across European LLs. All predictor features were spatially and temporally aligned with the corresponding SSM observations, forming the basis for the training, validation, and evaluation of the soil surface moisture prediction models developed in this study.
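Lagged predictors from the two preceding months can be built per centroid as in the following sketch. It assumes each record is a dictionary carrying a `uid`, `year` and `month` alongside the predictor values. The `_lag1` and `_lag2` suffixes are hypothetical names, and rows without a complete lag history are dropped, in line with the requirement of complete predictor correspondence.

```python
def add_lags(records, variables, lags=(1, 2)):
    """Attach lagged copies of the given variables to monthly records.

    records: list of dicts with 'uid', 'year', 'month' and predictor values.
    Rows for which any requested lag is unavailable are omitted.
    """
    # Index by (uid, absolute month number) so gaps and year changes are handled
    index = {(r["uid"], r["year"] * 12 + r["month"]): r for r in records}
    out = []
    for r in records:
        t = r["year"] * 12 + r["month"]
        previous = [index.get((r["uid"], t - lag)) for lag in lags]
        if any(p is None for p in previous):
            continue
        row = dict(r)
        for lag, prev in zip(lags, previous):
            for var in variables:
                row[f"{var}_lag{lag}"] = prev[var]
        out.append(row)
    return out
```

Looking lags up by absolute month number rather than by row position avoids silently pairing non-consecutive months when a centroid has missing observations.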
In addition to the historical training dataset, projection datasets were constructed to support SSM simulations under climate change scenarios. These datasets include the same set of predictor features used during model training but exclude the SSM target feature, as they are intended exclusively for forward simulations. They were generated for the historical baseline (1981–2010) and for the future period (2041–2070) under SSP1-2.6 and SSP5-8.5, separately for each driving global climate model considered in this study (GFDL-ESM4, IPSL-CM6A-LR, MPI-ESM1-2-HR, and UKESM1-0-LL). All predictor variables were processed using the same spatial and temporal harmonisation procedures applied to the historical data, ensuring full consistency between the training and projection datasets and enabling robust, scenario- and model-specific projections of SSM across all LLs.

2.3. Machine Learning Models

Multiple ML models were implemented to predict SSM from climate, land-use, soil texture, and topographic predictors. The selected models span a range of complexity levels, from linear approaches to non-linear ensemble methods, allowing a comprehensive assessment of their ability to capture both simple and highly non-linear relationships between environmental drivers and SSM dynamics. All models were trained using the same set of predictor features and were applied consistently across vineyards, olive groves, and temperate fruit tree crops.

2.3.1. Baseline Model: Linear Regression

An LR model was implemented as a baseline reference to provide a transparent and interpretable standard for SSM prediction. LR assumes linear relationships between predictors and SSM, thus enabling direct attribution of changes to individual drivers. Although limited for non-linear interactions, it provides a baseline for evaluating advanced ML approaches. LR also facilitates physical interpretability through the signs and magnitudes of its coefficients, linking statistical relationships to hydrological processes. By contrasting the predictive skill of LR with that of non-linear ensemble models, the analysis quantifies the extent to which soil moisture dynamics are governed by linear versus non-linear controls across crops. Hence, the LR model serves both as a methodological baseline and as a diagnostic tool for interpreting the added complexity captured by tree-based and boosting algorithms.

2.3.2. Tree-Based and Ensemble Models

To account for the non-linear and interaction-driven nature of soil moisture processes, several tree-based and ensemble ML models were employed [38,39]. Random Forest (RF) is an ensemble learning method based on bootstrap aggregation of decision trees, in which predictions from multiple decorrelated trees are averaged to reduce variance and improve robustness against overfitting [40]. RF is well-suited to handling high-dimensional datasets and correlated predictors and has been successfully applied in environmental and hydrological studies where climate, soil, and topography jointly influence soil moisture dynamics [41]. ExtraTrees (ET) extends the RF framework by introducing additional randomisation in both feature selection and split thresholds. This enhanced randomness further reduces model variance and improves generalisation, especially in large datasets with complex and noisy relationships [42]. ET is computationally efficient and has been shown to perform well in soil moisture estimation tasks where predictor interactions are highly heterogeneous [43,44]. XGBoost is a gradient boosting algorithm that builds decision trees sequentially, with each new tree correcting the errors of the previous ensemble. XGBoost is designed to optimise predictive performance through regularisation, shrinkage, and efficient handling of missing values. Its ability to model complex non-linear relationships and feature interactions makes it particularly suitable for capturing threshold effects and extreme soil moisture conditions [45]. LightGBM is a gradient boosting framework optimised for large-scale datasets. Unlike traditional level-wise tree growth, LightGBM employs a leaf-wise growth strategy that prioritises splits with the highest loss reduction, resulting in faster training and improved accuracy. Its computational efficiency and scalability make it especially appropriate for high-resolution spatio-temporal datasets, such as those used in this study [46].
Table 2 presents a summary of the ML models used in this study, grouped by learning strategy (bagging and boosting), and their main corresponding hyperparameter settings. All configurations were applied consistently across olive groves, vineyards, and fruit tree systems.

2.4. Model Training and Validation Strategy

The ML models were trained using the historical dataset covering the period from January 2014 to December 2024, which combines monthly climate features, land-use, soil texture, and topographic predictors and Sentinel-1-derived soil surface moisture observations.
Before model training, an exploratory correlation analysis was conducted to assess potential multicollinearity among predictors and to characterise the interrelationships between climatic, topographic, and soil variables for each crop type. The correlation matrices (Appendix A, Figure A1, Figure A2 and Figure A3) reveal distinct correlation patterns across vineyards, olive groves, and fruit trees, reflecting the different agro-climatic settings of these systems. These crop-specific correlation structures informed the interpretation of feature importance and provided a context for understanding how climatic and environmental drivers interact differently across agricultural systems. The predictor set includes climate variables, lagged terms, soil texture, and topographic features that share physical interdependencies (e.g., temperature-based metrics and PET are manifestations of common radiative and atmospheric processes), leading to inherent multicollinearity. The correlation analysis confirmed moderate to strong correlations among climatic predictors, but all were retained as they capture complementary aspects of SSM dynamics. This multicollinearity is effectively handled by the tree-based ensemble algorithms used in this study, whose random feature subsampling at each split decorrelates individual trees and prevents any single collinear variable from dominating predictions [47]. In addition to conventional empirical risk minimisation, model training also incorporated a Group Distributionally Robust Optimisation (Group DRO) strategy, which emphasises the performance of the worst-performing predefined group rather than optimising only the average loss across all training samples [48]. This is particularly advantageous in heterogeneous environmental datasets, as it reduces the dominance of majority or easier groups, improves worst-group performance, and promotes more balanced generalisation across contrasting environmental conditions and agricultural contexts [48]. 
Moreover, because real-world environmental prediction tasks are often affected by distribution shifts across space, time, and environmental settings, Group DRO provides an additional layer of robustness against such shifts by discouraging the model from relying excessively on patterns that are only valid for the most represented conditions [49].
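The idea behind Group DRO can be illustrated with the exponentiated-gradient weight update that is commonly used for it. The text does not specify how the strategy was combined with the tree-based learners, so this sketch covers only the group-weighting step. The step size, the group definitions and the function names are assumptions.

```python
import math


def update_group_weights(weights, group_losses, step_size=0.1):
    """Increase the weight of groups with higher loss, then renormalise."""
    raw = [w * math.exp(step_size * loss)
           for w, loss in zip(weights, group_losses)]
    total = sum(raw)
    return [r / total for r in raw]


def robust_loss(weights, group_losses):
    """Weighted objective that emphasises poorly predicted groups."""
    return sum(w * loss for w, loss in zip(weights, group_losses))
```

Repeating the update shifts weight towards the worst-performing group, so the weighted objective approaches the worst-group loss instead of the plain average. One possible way to pass such weights to a tree ensemble would be as per-sample weights, although this is only one of several options.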
Temporal validation (training on earlier periods, testing on later) avoided information leakage and assessed generalisation to unseen conditions. Where applicable, temporal cross-validation was also used to evaluate model stability across successive time windows and to reduce sensitivity to a single train-test partition.
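A temporally ordered hold-out and an expanding-window cross-validation can be sketched as below. The cut-off year and the minimum training length are placeholders, since the text does not report the exact partition, and records are assumed to be dictionaries with a `year` field.

```python
def temporal_split(records, last_train_year):
    """Train on years up to last_train_year, test on all later years."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test


def expanding_window_folds(years, min_train_years=3):
    """Yield (train_years, test_years) with the test year always in the future."""
    ordered = sorted(set(years))
    for i in range(min_train_years, len(ordered)):
        yield ordered[:i], [ordered[i]]
```

Because every test year lies strictly after all training years, no future information leaks into training, and each successive fold indicates how stable model skill remains as more history becomes available.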
Performance was evaluated separately for each crop type to account for differences in canopy structure, rooting depth and hydro-climatic sensitivity. This crop-specific evaluation allows a more detailed understanding of how model skill varies across contrasting agricultural systems and avoids masking differences that would arise from pooling all crops into a single assessment. To quantify predictive performance, two complementary metrics were used: the coefficient of determination (R²) and the mean absolute error (MAE). R² measures the proportion of variance in SSM explained by each model, whereas MAE measures the average magnitude of prediction errors in the units of SSM (percentage of saturation). The combined use of these metrics ensures a balanced evaluation of model accuracy, robustness, and practical applicability across crops and LLs.
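Both metrics follow directly from their standard definitions, as the short sketch below shows. The sample values used in the note after the code are invented purely to illustrate the arithmetic.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For observed values of 40, 50, 60 and 70% and predictions of 42, 48, 63 and 69%, the MAE is 2.0 percentage points and R² is 0.964. The example shows how the two metrics report error magnitude and explained variance separately.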

2.5. Feature Importance and Model Interpretation

Feature importance analysis identified dominant SSM controls by weighting climatic, topographic, soil and land-use contributions for each crop type, enabling a direct comparison among vineyards, olive groves, and fruit trees while linking predictive performance with physical plausibility.
Crop-specific interpretation is essential, as climatic and environmental controls vary across agricultural systems. For example, vineyards in temperate regions may exhibit stronger sensitivity to seasonal radiation and evapotranspiration demand, whereas olive groves and fruit tree systems in Mediterranean environments may respond more strongly to precipitation deficits and atmospheric water demand. Where applicable, SHAP (SHapley Additive exPlanations) [50] analysis was used to provide a more detailed and transparent interpretation of model behaviour. SHAP values quantify the marginal contribution of each predictor to individual model predictions and therefore allow both the ranking of variables by importance and the interpretation of the direction of their effects on SSM. In contrast to conventional impurity-based importance scores, SHAP supports a more robust comparison of predictor influence and facilitates the identification of non-linear responses, threshold behaviour, and interactions among predictors.
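The idea of a marginal contribution can be made concrete with a brute-force Shapley computation on a toy model. This sketch is not the SHAP library and not the explainer used in this study: it enumerates all feature orderings, replaces absent features with a single baseline value, and uses a hypothetical response function with made-up coefficients.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features not yet added keep their baseline value."""
    names = list(instance)
    contributions = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            value = model(current)
            contributions[name] += value - previous  # marginal contribution
            previous = value
    return {name: total / len(orderings) for name, total in contributions.items()}


def toy_ssm_model(x):
    """Hypothetical response: rainfall wets, PET dries, plus one interaction."""
    return (40.0 + 0.10 * x["pr"] + 0.05 * x["pr_lag1"]
            - 0.08 * x["pet"] + 0.0002 * x["pr"] * x["pet"])


baseline = {"pr": 50.0, "pr_lag1": 50.0, "pet": 100.0}  # reference month
instance = {"pr": 80.0, "pr_lag1": 60.0, "pet": 120.0}  # month being explained

phi = shapley_values(toy_ssm_model, instance, baseline)
print(phi)
print(sum(phi.values()), toy_ssm_model(instance) - toy_ssm_model(baseline))
```

The contributions sum to the difference between the explained prediction and the baseline prediction, which is the additivity property that makes the values comparable across predictors. The sign of each value gives the direction of the effect.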
By combining model-based importance metrics with physically grounded interpretations, the analysis strengthens the scientific credibility of the modelling framework and supports a process-based discussion of SSM controls across contrasting agro-climatic systems.

2.6. Historical (1981–2010) and Future (2041–2070) Climate Projections of SSM

To assess the response of SSM to climate change, the trained ML models were driven with bias-corrected and downscaled CMIP6 climate projections derived from the NEX-GDDP-CMIP6 dataset. This dataset provided daily projections of key climatic drivers, including precipitation, air temperature, and radiation-related variables, which were subsequently processed to maintain full methodological consistency with the training dataset. All projected climate features were spatially harmonised to the same 1 km grid used for model calibration, and monthly predictors were derived using the same aggregation and preprocessing workflow adopted for the training period.
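Table A1 lists PET as a Hargreaves estimate computed from tasmax and tasmin for both the training and projection periods. The sketch below shows the standard Hargreaves form under stated assumptions: extraterrestrial radiation is supplied by the caller in MJ m−2 day−1 (for example from latitude and day of year), and the monthly total is approximated as the daily value times the number of days. The function names and input values are hypothetical, and the exact implementation used in this study is not given in the text.

```python
import math


def hargreaves_pet_daily(tmax_c, tmin_c, ra_mj_m2_day):
    """Hargreaves PET in mm per day from daily temperature extremes."""
    tmean = (tmax_c + tmin_c) / 2.0
    ra_mm = 0.408 * ra_mj_m2_day            # MJ m-2 day-1 to mm of water per day
    temp_range = max(tmax_c - tmin_c, 0.0)  # guard against inverted inputs
    return 0.0023 * ra_mm * (tmean + 17.8) * math.sqrt(temp_range)


def hargreaves_pet_monthly(tmax_c, tmin_c, ra_mj_m2_day, days_in_month):
    """Approximate monthly PET (mm per month) from monthly mean inputs."""
    return hargreaves_pet_daily(tmax_c, tmin_c, ra_mj_m2_day) * days_in_month


# Hypothetical summer month.
print(hargreaves_pet_monthly(tmax_c=30.0, tmin_c=16.0, ra_mj_m2_day=40.0, days_in_month=31))
```

Applying the same function to the reanalysis and to the CMIP6 temperature fields keeps the PET predictor consistent between the training and projection periods.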
The projection framework preserved the calibrated ML structure, ensuring SSM changes reflect climatic forcing differences rather than methodological changes. The projection analysis considered a historical baseline period (1981–2010) and a future mid-century period (2041–2070), allowing a direct comparison between past and projected soil moisture conditions. In addition, two contrasting emissions pathways were analysed: SSP1-2.6, representing a low-emission mitigation scenario, and SSP5-8.5, representing a high-emission pathway associated with stronger warming and intensified hydro-climatic stress. This dual-scenario design assesses both SSM change magnitude and agricultural system sensitivity to different emission trajectories. To explicitly account for structural uncertainty in climate forcing, projections were generated using four CMIP6 global climate models (GFDL-ESM4, IPSL-CM6A-LR, MPI-ESM1-2-HR, and UKESM1-0-LL). The resulting multi-model framework enables a more robust characterisation of future SSM by reducing dependence on a single climate model and by capturing a range of plausible future hydro-climatic responses.
Projected SSM changes were analysed from both a temporal and a spatial perspective. Temporally, the analyses focused on shifts between the historical and future periods, as well as differences between SSP1-2.6 and SSP5-8.5. Spatially, the 1 km projections allowed the identification of heterogeneous drying or relative persistence patterns within and among the five LLs, highlighting localised hotspots of increased soil moisture loss and contrasting responses across crop systems. This high-resolution, scenario-based framework constitutes one of the main scientific contributions of the study, as it enables the generation of spatially explicit and crop-relevant SSM projections for climate change impact assessment, drought-risk analysis, adaptation planning and risk reduction across European agricultural landscapes, thereby contributing to the long-term environmental and socioeconomic sustainability of these key agrarian value chains.

3. Results

3.1. Comparative Evaluation of the Performance of Regression Models in Predicting Soil Surface Moisture

All tree-based ML models outperformed the LR baseline, confirming that SSM was governed by non-linear relationships that were not adequately captured by a simple linear formulation. In all crop systems, RF, XGBoost, ET, and LightGBM combined substantially higher coefficients of determination (R2 between 0.63 and 0.87) with lower prediction errors (MAE between 5.28 and 9.18) than linear regression, whose R2 did not exceed 0.55 and whose MAE exceeded 10 in all cases, highlighting the clear advantage of ensemble-based approaches for SSM estimation (Figure 4).
Among the analysed crops, vineyards showed the highest predictive performance, with tree-based models consistently reaching R2 values of about 0.86 to 0.87 and MAE values close to 5.3 to 5.5, clustering in the upper-right, low-error region of the model-performance space. Within this group, RF and LightGBM provided the best trade-off between explained variance and error for vineyards, closely followed by ET and XGBoost, whereas the LR baseline yielded a markedly lower R2 of 0.55 and a much higher MAE of 10.31 and is clearly isolated from the ensemble models in Figure 4. This indicates that vineyard SSM was the most predictable of the three crop systems and that its hydro-climatic controls were effectively captured by the ML models.
By comparison, fruit trees showed intermediate performance, with tree-based models achieving R2 values around 0.63 to 0.65 and MAE near 8.4 to 8.6; in this case, LightGBM and RF slightly outperformed the remaining ensemble models, although differences among them were modest. Olive groves displayed the lowest absolute predictive performance among the ML models (R2 ≈ 0.63 to 0.68; MAE ≈ 8.6 to 9.2), with ET and XGBoost generally providing the most accurate predictions, while LR remained confined to very low R2 (0.27 to 0.33) and very high MAE (above 11.9) for both fruit trees and olive groves (Figure 4).
The relative gains over LR were also substantial and consistent across all crops. The heatmaps of performance improvement show that R2 increased by approximately 56 to 152%, whereas MAE decreased by approximately 27 to 49% relative to the baseline, depending on the crop and model, with most pairwise comparisons against LR reaching statistical significance (Figure 5a,b). The strongest reductions in MAE were observed for vineyards, while the largest relative improvements in R2 occurred in olive groves, reflecting the particularly weak performance of the linear model for these crops. Although differences among the four tree-based models were relatively small, ET and RF tended to present the most favourable positions across crops, indicating robust predictive performance and good generalisation potential.
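The relative gains can be reproduced with simple arithmetic. The sketch below assumes that gains are expressed as a percentage of the LR baseline value, which the text does not state explicitly. The example pairs the vineyard LR scores reported above (R2 of 0.55, MAE of 10.31) with the best ensemble values quoted in the text (R2 of about 0.87, MAE of 5.28); attributing that lowest MAE to vineyards is an assumption.

```python
def relative_r2_gain(r2_model, r2_baseline):
    """Percentage increase in R2 relative to the baseline model."""
    return 100.0 * (r2_model - r2_baseline) / r2_baseline


def relative_mae_reduction(mae_model, mae_baseline):
    """Percentage decrease in MAE relative to the baseline model."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


print(round(relative_r2_gain(0.87, 0.55), 1))
print(round(relative_mae_reduction(5.28, 10.31), 1))
```

With these inputs, the R2 gain is close to 58% and the MAE reduction close to 49%, which falls within the ranges reported above.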

3.2. Feature Contributions Through Shapley Additive Explanations—SHAP Values

The SHAP analysis showed that the ML models were driven by physically meaningful predictors and that the relative contribution of each feature varied across crop types. Across all three crops, the dominant controls were consistently related to water input (precipitation) and atmospheric water demand (PET and RSDS), while lagged variables also played an important role, indicating that antecedent climatic conditions contributed to present-month SSM.
For olive groves, the most influential predictor was precipitation, followed by RSDS and lagged precipitation and radiation terms (Figure 6). The prominence of both current precipitation and precipitation lag-1 indicates that SSM in olive systems responds not only to immediate rainfall events but also to short-term antecedent moisture conditions. Radiation-related predictors also ranked highly, suggesting that evaporative demand and surface energy availability are central to soil drying in these predominantly Mediterranean environments. Although the DTM was less influential than the climatic predictors, its recurrent selection indicates that topography still exerts a secondary but consistent control on SSM through its effects on water redistribution and local storage. In olive groves, higher-elevation areas were associated with higher predicted SSM, likely because they also coincide with zones of greater precipitation (Figure 6).
In vineyards, the slight decline in predicted SSM with increasing elevation suggests that steeper upper-slope positions may enhance runoff and reduce near-surface water retention, whereas lower and more convergent positions may favour moisture accumulation (Figure 7). This topographic effect should be interpreted as site-specific, as soil moisture patterns depend on the combined influence of terrain, water inputs, and local soil properties [51,52]. In terms of predictor ranking, PET emerged as the dominant driver for vineyards, followed by precipitation, precipitation lag-1, and other lagged hydro-climatic variables (Figure 7). This pattern points to a strong coupling between soil moisture and atmospheric evaporative demand in vineyard systems. High PET values were associated with lower model outputs, while higher precipitation values generally contributed positively to predicted SSM, indicating a physically coherent balance between water loss and water supply. The importance of lagged precipitation further suggests that vineyard SSM retains a short hydrological memory, with antecedent wetness influencing moisture availability in subsequent months.
For fruit trees, precipitation again ranked as the leading predictor, followed by PET, RSDS, and lagged precipitation and PET (Figure 8). Compared with vineyards, fruit tree systems displayed a more balanced influence of rainfall, evaporative demand, and radiation, suggesting a mixed climatic control rather than a single dominant mechanism. As in the other crop types, the repeated occurrence of lagged variables indicates that SSM is partly controlled by short-term persistence in hydro-climatic conditions rather than by instantaneous forcing alone.

3.3. Spatial Patterns and Projected Changes in SSM Under Historical and Future Scenarios

Pronounced spatial heterogeneity in baseline SSM was observed across the five LLs, with historical median values ranging from 39.1% in LL2 vineyards to 53.3% in LL3 vineyards (Figure 9). The Iberian systems (LL1 and LL2) occupied the drier end of the gradient, with median SSM values of 40–46% across crops, whereas central and eastern European systems (LL3–LL5) exhibited higher baseline moisture conditions (47–53%). This distribution confirms a pronounced climatic gradient from drier southwestern to wetter central and eastern European agricultural systems.
Under SSP1-2.6, raincloud distributions remained largely overlapping with historical baselines, indicating modest changes in central tendency (typically −0.3 to −2.0%). The only exception was LL5 fruit trees, which showed a slight increase (+0.7%), consistent with the near-complete distribution overlap observed in Figure 9.
Under SSP5-8.5, pronounced leftward distribution shifts indicated widespread drying across all crop–LL combinations, with median SSM declining from the historical 40–53% range to 26–34% (Figure 9). Absolute reductions ranged from −8.1% (LL2 olive groves) to −24.3% (LL3 fruit trees), with historically wetter systems experiencing the largest absolute losses. This pattern of severe drying under high-emission scenarios contrasts sharply with the stability observed under SSP1-2.6.
Under the SSP1-2.6 scenario (2041–2070), changes in SSM relative to the historical baseline (1981–2010) were generally small. The ensemble-median SSM heatmap indicates that most crop–Living Lab combinations experienced only minor declines, typically close to zero and rarely exceeding approximately 2% in magnitude (Figure 10). In practical terms, this suggests that under the lower-emission pathway, mean mid-century SSM remains broadly similar to the historical baseline, with only limited deviations in most agricultural systems. A small positive anomaly was observed for fruit trees in LL5, indicating that the low-emission scenario does not produce a uniformly negative response across all regions and crops. By contrast, the SSP5-8.5 scenario produced a strong and spatially widespread decline in SSM across almost all crop types and LLs. Relative to the historical baseline, projected reductions ranged from moderate to severe, with the largest absolute losses concentrated in fruit-tree and vineyard systems located in historically wetter LLs (Figure 10). The strongest declines were observed in LL3 and LL4, where fruit trees and vineyards showed the most pronounced drying signal, while LL5 also exhibited substantial losses in fruit tree systems. In LL1, all three crop systems showed marked declines, indicating that even already dry southwestern environments are projected to become substantially drier under stronger warming. Although LL2 showed comparatively smaller reductions, the drying signal remained consistent across all crop types.
An important result is that the strongest absolute reductions were not necessarily found in the historically driest systems, but rather in several of the historically wetter LLs. This indicates that regions with a larger present-day soil moisture buffer may undergo the greatest absolute losses under severe warming, leading to a partial convergence toward drier future conditions across the study area. This suggests that future agricultural drought risk may expand beyond traditionally dry regions into areas that currently benefit from relatively stable moisture conditions. In other words, the future signal is not simply an amplification of present-day dryness, but a spatially differentiated redistribution of soil moisture deficits.
The robustness of these projected changes across the four-member CMIP6 ensemble is examined in Appendix B (Figure A4), which reports the pixel-level ensemble median together with the inter-model range and the per-Living Lab mean of each individual GCM. The ensemble-based confidence range is small relative to the magnitude of the projected decline under SSP5-8.5, indicating that the drying signal is consistently reproduced by all four climate models, whereas the near-zero changes under SSP1-2.6 also remain stable across the ensemble.
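The ensemble statistics referred to here can be sketched as follows for a single pixel or Living Lab. The four model names are those listed in Section 2.6, but all SSM values are hypothetical and the function names are illustrative.

```python
from statistics import median


def ssm_change(future_by_gcm, historical_by_gcm):
    """Future minus historical SSM for each climate model."""
    return {gcm: future_by_gcm[gcm] - historical_by_gcm[gcm] for gcm in future_by_gcm}


def ensemble_summary(delta_by_gcm):
    """Median, extremes, range, and sign agreement across the ensemble."""
    values = list(delta_by_gcm.values())
    return {
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "all_negative": all(v < 0 for v in values),
    }


# Hypothetical SSM values (%) for one location under a high-emission scenario.
historical = {"GFDL-ESM4": 50.0, "IPSL-CM6A-LR": 50.0,
              "MPI-ESM1-2-HR": 50.0, "UKESM1-0-LL": 50.0}
future = {"GFDL-ESM4": 30.0, "IPSL-CM6A-LR": 28.0,
          "MPI-ESM1-2-HR": 31.0, "UKESM1-0-LL": 27.0}

print(ensemble_summary(ssm_change(future, historical)))
```

A range that is small relative to the median change, together with full sign agreement, is the kind of evidence that supports describing a drying signal as robust across models.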
The projected SSM changes are broadly consistent with the climatic anomalies shown in the precipitation and PET maps (Figure 11a–d), although the response varies among LLs according to the local balance between water input and atmospheric evaporative demand. Under SSP1-2.6, precipitation changes are spatially heterogeneous, with modest decreases across much of southern and western Europe, but localised increases in some central and eastern sectors (Figure 11a). This contrast is particularly relevant for LL5 (Grójec), which lies within a zone of positive precipitation anomalies under SSP1-2.6. In this case, the additional water input appears to partly compensate for the concurrent rise in PET, helping to explain why LL5 is the only system showing a slight increase in SSM under SSP1-2.6 (from 47.1% to 47.8%, +0.7%), rather than the small declines observed in the other LLs. By contrast, LL1-LL4 are in regions where precipitation is either weakly reduced or does not increase sufficiently to offset the higher evaporative demand, resulting in stable-to-slightly declining SSM under the low-emissions scenario. Under SSP5-8.5, negative precipitation anomalies become more widespread across the Iberian Peninsula, France, and large parts of the Mediterranean domain, while PET increases strongly and coherently across the entire European domain (Figure 11b,d). This indicates a continent-wide intensification of atmospheric evaporative demand, which becomes the dominant control on future SSM decline. Even in regions where precipitation changes are weak or spatially mixed, the strong PET increase promotes faster depletion of surface soil moisture. Accordingly, the widespread drying projected under SSP5-8.5 reflects the combined effect of reduced or only weakly changing precipitation and sharply enhanced evaporative demand. 
Importantly, the LL5 contrast under SSP1-2.6 shows that SSM responses are not controlled by temperature-driven PET alone: where precipitation increases under moderate forcing, local moisture gains can still be maintained. However, under stronger warming, the PET signal becomes sufficiently large to override these local precipitation benefits, producing a generalised drying tendency across all LLs. Although this analysis focuses on annual median conditions, seasonal drying during crop growth periods may be even more pronounced and deserves further investigation.

4. Discussion

This study developed an ML framework that bridges high-resolution satellite observations and future climate projections by integrating historical Sentinel-1 surface soil moisture data (2014–2024) with bias-corrected CMIP6 scenarios. The consistent superiority of tree-based ensemble methods over linear regression (R2 = 0.63–0.87 versus <0.55) is consistent with the broader soil-moisture literature showing that SSM dynamics are governed by complex, non-linear interactions among hydro-climatic forcing, soil properties, vegetation, and topographic controls, for which tree-based methods are particularly effective [38,41,44]. The higher predictability of vineyards relative to olive groves is also physically plausible, since vine water status is highly responsive to short-term atmospheric demand and soil water availability, whereas olive systems are more drought-adapted and can partially buffer surface drying through crop-specific ecophysiological traits [53,54]. In that regard, while olive groves typically exhibit the greatest rooting depth to maintain transpiration under high atmospheric demand, vineyards and fruit trees often feature more concentrated root zones in the top 40–100 cm, making their evapotranspiration response more sensitive to surface water deficits [55]. This crop-specific behaviour is also consistent with the depth represented by the target variable. The Copernicus surface soil moisture (SSM) product characterises moisture in the top few centimetres of soil [14], whereas active root water uptake in perennial crops extends over deeper and crop-dependent layers. In apple orchards, water uptake is typically concentrated within 0–60 cm, with a dominant contribution from the 0–40 cm layer [56,57]. Grapevines also exhibit strong sensitivity to short-term soil water availability and atmospheric demand, despite their ability to access deeper water sources under drought conditions [58,59]. 
In contrast, olive trees show greater drought acclimation and root-system plasticity, allowing them to exploit deeper soil moisture and buffer short-term surface drying [60,61]. These differences help explain why shallow moisture anomalies are more directly reflected in vineyards and fruit tree systems, while in olive groves the importance of antecedent precipitation likely reflects the redistribution of water availability within the soil profile.
SHAP analysis further reinforces the physical plausibility of the framework by indicating that the models capture meaningful hydrological controls rather than artificial or spurious correlations. The dominance of precipitation, PET, RSDS, and lagged variables suggests that the framework reproduces both the balance between water inputs and atmospheric demand, as well as the short-term memory of near-surface moisture conditions. This interpretation is consistent with recent interpretable ML analyses of SSM drivers and with observational evidence that soil moisture exhibits temporally varying memory controlled by meteorological forcing [18,62].
External evidence also supports the robustness of high-resolution projections. Previous work has shown that increasing climate-model resolution can intensify projected soil drying and shift seasonal deficits earlier in the year through improved representation of circulation patterns, precipitation biases, and land–atmosphere feedback [63]. In this context, the stronger local drying hotspots identified by our 1 km framework are consistent with the view that kilometre-scale downscaling can reveal hydro-climatic contrasts that are smoothed in coarser-resolution assessments. This is particularly relevant in Europe and the Mediterranean, where CMIP ensembles project amplified warming, widespread drought intensification, and a growing role of atmospheric evaporative demand in shaping future water stress [3,64,65].
The projections reveal a marked scenario-dependent redistribution of water deficits. While SSP1-2.6 keeps SSM close to baseline conditions, SSP5-8.5 drives a generalised convergence toward drier states, with the largest absolute losses occurring in historically wetter LLs (LL3, LL4, and LL5). This pattern suggests that under stronger warming, existing hydro-climatic buffers are progressively eroded as increasing evaporative demand combines with precipitation deficits. Such an interpretation is consistent with European evidence that soil-moisture drought emerges from the combined effects of precipitation shortfalls and enhanced PET, and that stronger soil moisture–temperature coupling can further intensify drying under warmer conditions [65,66]. Nonetheless, the implications for agricultural management are crop-specific. In olive groves, intensified warming is likely to increase vulnerability during key reproductive stages, supporting adaptation through regulated deficit irrigation, improved soil water conservation, and drought-tolerant cultivars [53]. For vineyards, the strong SSM–PET coupling suggests that increasing evaporative demand may become a dominant driver of moisture stress, favouring precision irrigation, canopy management, and drought-resilient rootstocks [54,67,68]. For fruit tree systems, which show the strongest declines in some LLs, the projections are consistent with broader evidence that temperate fruit and nut production is highly sensitive to warming, altered phenology, and water stress. In these systems, improved irrigation planning, mulching or soil cover, and stress-resilient rootstocks are especially relevant [69,70,71]. The contrasting behaviour of LL5 under SSP1-2.6 further indicates that local precipitation anomalies can temporarily offset rising evaporative demand, though this buffering effect becomes clearly insufficient under SSP5-8.5 [65,66].
This framework, therefore, enhances scientific robustness by combining kilometre-scale Sentinel-1 information with Group Distributionally Robust Optimisation [48] and temporally consistent validation. Nevertheless, some limitations remain: models trained under present-day conditions may not fully represent non-stationary relationships under future climates, particularly for extremes outside the training distribution. Future work should therefore test hybrid strategies that combine robust ML, dynamic vegetation representations, and process-based land-surface modelling to improve extrapolation skill, uncertainty quantification, and operational relevance for agricultural decision support.

5. Conclusions

This study developed an ML framework to predict and project SSM at 1 km resolution across five European LLs, encompassing vineyards, olive groves, and fruit tree systems. Tree-based ensemble methods consistently outperformed linear approaches, with RF and LightGBM achieving the highest accuracy for vineyards (R2 ≈ 0.87), while ET and XGBoost were generally the most accurate for olive groves and LightGBM and RF performed slightly better for fruit trees (R2 ≈ 0.63–0.68 across both crops). By integrating satellite observations with climate projections, the framework provides a kilometre-scale, crop-specific assessment of future soil surface moisture across multiple European agro-climatic zones. Precipitation, PET, and RSDS emerged as the dominant predictors across all crop systems, corroborating the physical plausibility of the models.
Under SSP5-8.5, projected SSM declines ranged from 8 to 24% across all LLs by 2041–2070, with the most severe reductions in historically wetter regions, namely LL3 and LL4. The SSP1-2.6 scenario showed minimal changes, highlighting the benefit of mitigation pathways that limit anthropogenic radiative forcing. These findings are consistent with high-resolution climate model assessments, which indicate that coarse-resolution models may underestimate future soil drying in European agrarian regions.
The framework demonstrates strong potential for supporting climate adaptation in perennial cropping systems. The 1 km resolution projections enable targeted interventions at farm to landscape scales and can inform regional water management, irrigation planning, and climate-resilient agricultural strategies, while the crop-specific insights facilitate tailored management strategies. Future work should integrate dynamic vegetation models and deep learning architectures to capture non-stationary climate–soil relationships and enhance operational applicability for agricultural decision support.

Author Contributions

Conceptualization, N.G. and J.A.S.; methodology, N.G. and H.F.; software, N.G. and H.F.; validation, N.G., H.F. and J.A.S.; formal analysis, N.G. and A.F.; investigation, N.G., H.F., A.F., F.P., L.F.F., J.P.M., C.C., L.P., J.M.J., S.N. and J.J.; resources, J.A.S. and F.P.; data curation, N.G. and A.F.; writing—original draft preparation, N.G.; writing—review and editing, N.G., H.F., A.F., F.P., L.F.F., J.P.M., C.C., L.P., J.M.J., S.N., J.J. and J.A.S.; visualisation, N.G.; supervision, J.A.S. and H.F.; project administration, J.A.S.; funding acquisition, J.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by National Funds by FCT—Portuguese Foundation for Science and Technology, under the projects UID/04033/2025: Centre for the Research and Technology of Agro-Environmental and Biological Sciences (https://doi.org/10.54499/UID/04033/2025) and LA/P/0126/2020 (https://doi.org/10.54499/LA/P/0126/2020). We also thank the project LivingSoiLL (GA 101157502) funded by Horizon Europe and the project STrengthS4WineChaiN (NORTE2030-FEDER-01786100).

Data Availability Statement

The original contributions presented in this study are included in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. The 23 predictor features used by the ML models. Five climatic variables (precipitation, tasmax, tasmin, rsds, pet) are each entered as the current month and the two preceding months (lag-1 and lag-2), totalling 15 features. Eight non-time-varying predictors (3 topographic, 1 edaphic, 4 land-use fractions) complete the predictor set.
Feature nr | Category | Feature | Units | Temporal | Historical Source (2014–2024) | Future Source (2041–2070) | Ref.
1–3 | Climatic | Precipitation (pr) | mm month−1 | Monthly | ERA5-Land | NEX-GDDP-CMIP6 | [25,72]
4–6 | Climatic | Maximum air temperature (tasmax) | °C | Monthly | ERA5-Land | NEX-GDDP-CMIP6 | [25,72]
7–9 | Climatic | Minimum air temperature (tasmin) | °C | Monthly | ERA5-Land | NEX-GDDP-CMIP6 | [25,72]
10–12 | Climatic | Surface downward shortwave radiation (rsds) | W m−2 | Monthly | ERA5-Land | NEX-GDDP-CMIP6 | [25,72]
13–15 | Climatic | Potential evapotranspiration (pet, Hargreaves) | mm month−1 | Monthly | Computed from ERA5-Land tasmax/tasmin | Computed from NEX-GDDP-CMIP6 tasmax/tasmin | [25,72]
16 | Topographic | Elevation (dtm) | m | Static | WorldClim 2 (1 km) | (same static field) | [34]
17 | Topographic | Slope | degrees | Static | Computed from dtm | (same static field) | [34]
18 | Topographic | Aspect | degrees | Static | Computed from dtm | (same static field) | [34]
19 | Edaphic | USDA soil texture class | categorical | Static | Harmonized World Soil Database v2.0 (FAO/IIASA), USDA classification | (same static field) | [33]
20 | Land use | Cropland fraction | 0–1 | Static | LUH2 (Land-Use Harmonization v2) | (same static field) | [32]
21 | Land use | Forest fraction | 0–1 | Static | LUH2 | (same static field) | [32]
22 | Land use | Grassland fraction | 0–1 | Static | LUH2 | (same static field) | [32]
23 | Land use | Urban fraction | 0–1 | Static | LUH2 | (same static field) | [32]
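The 15 climatic features described in the caption of Table A1 (five variables, each at the current month, lag-1 and lag-2) can be assembled as in the sketch below. It assumes one chronologically ordered list of monthly records per grid cell; the feature naming pattern, the function name, and the sample values are hypothetical.

```python
CLIMATE_VARS = ("pr", "tasmax", "tasmin", "rsds", "pet")


def add_lags(monthly_rows, variables=CLIMATE_VARS, max_lag=2):
    """Return one feature dict per month that has max_lag preceding months.

    monthly_rows must be in chronological order for a single grid cell.
    """
    features = []
    for i in range(max_lag, len(monthly_rows)):
        row = {}
        for var in variables:
            row[var] = monthly_rows[i][var]  # current month
            for lag in range(1, max_lag + 1):
                row[f"{var}_lag{lag}"] = monthly_rows[i - lag][var]
        features.append(row)
    return features


# Four hypothetical months for one grid cell.
months = [
    {"pr": 80.0, "tasmax": 14.0, "tasmin": 6.0, "rsds": 90.0, "pet": 35.0},
    {"pr": 60.0, "tasmax": 17.0, "tasmin": 8.0, "rsds": 140.0, "pet": 60.0},
    {"pr": 30.0, "tasmax": 22.0, "tasmin": 11.0, "rsds": 200.0, "pet": 95.0},
    {"pr": 10.0, "tasmax": 28.0, "tasmin": 15.0, "rsds": 260.0, "pet": 140.0},
]
lagged = add_lags(months)
print(len(lagged), len(lagged[0]))
```

The first two months of each series are dropped because their lagged values are undefined. The eight static predictors would then be appended to each row to complete the 23-feature set.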
Figure A1. Spearman correlation matrix for olive groves showing correlation strength/direction (circle size/colour) and significance levels.
Figure A2. Spearman correlation matrix for vineyards showing correlation strength/direction (circle size/colour) and significance levels.
Figure A3. Spearman correlation matrix for fruit trees showing correlation strength/direction (circle size/colour) and significance levels.

Appendix B

Figure A4. Projected change in surface soil moisture (ΔSSM, percentage points) between 2041–2070 and 1981–2010, by Living Lab under SSP1-2.6 (green) and SSP5-8.5 (terracotta), for (a) vineyards, (b) olive groves and (c) fruit trees. Diamonds: pixel-level ensemble median. Shaded boxes and whiskers: min–max across the four CMIP6 GCMs. Open circles: per-LL mean for each individual GCM.

References

  1. OECD. Global Drought Outlook: Trends, Impacts and Policies to Adapt to a Drier World; OECD Publishing: Paris, France, 2025; ISBN 978-92-64-54069-9. [Google Scholar]
  2. Claro, A.M.; Fonseca, A.; Fraga, H.; Santos, J.A. Future Agricultural Water Availability in Mediterranean Countries under Climate Change: A Systematic Review. Water 2024, 16, 2484. [Google Scholar] [CrossRef]
  3. Cos, J.; Doblas-Reyes, F.; Jury, M.; Marcos, R.; Bretonnière, P.-A.; Samsó, M. The Mediterranean Climate Change Hotspot in the CMIP5 and CMIP6 Projections. Earth Syst. Dyn. 2022, 13, 321–340. [Google Scholar] [CrossRef]
  4. Entekhabi, D.; Njoku, E.G.; O’Neill, P.E.; Kellogg, K.H.; Crow, W.T.; Edelstein, W.N.; Entin, J.K.; Goodman, S.D.; Jackson, T.J.; Johnson, J.; et al. The Soil Moisture Active Passive (SMAP) Mission. Proc. IEEE 2010, 98, 704–716. [Google Scholar] [CrossRef]
  5. Albergel, C.; Dorigo, W.; Reichle, R.H.; Balsamo, G.; de Rosnay, P.; Muñoz-Sabater, J.; Isaksen, L.; de Jeu, R.; Wagner, W. Skill and Global Trend Analysis of Soil Moisture from Reanalyses and Microwave Remote Sensing. J. Hydrometeorol. 2013, 14, 1259–1277. [Google Scholar] [CrossRef]
  6. Lei, F.; Crow, W.T.; Shen, H.; Su, C.-H.; Holmes, T.R.H.; Parinussa, R.M.; Wang, G. Assessment of the Impact of Spatial Heterogeneity on Microwave Satellite Soil Moisture Periodic Error. Remote Sens. Environ. 2018, 205, 85–99. [Google Scholar] [CrossRef] [PubMed]
  7. Muñoz-Sabater, J.; Dutra, E.; Agustí-Panareda, A.; Albergel, C.; Arduini, G.; Balsamo, G.; Boussetta, S.; Choulga, M.; Harrigan, S.; Hersbach, H.; et al. ERA5-Land: A State-of-the-Art Global Reanalysis Dataset for Land Applications. Earth Syst. Sci. Data 2021, 13, 4349–4383. [Google Scholar] [CrossRef]
  8. Rodell, M.; Houser, P.R.; Jambor, U.; Gottschalck, J.; Mitchell, K.; Meng, C.-J.; Arsenault, K.; Cosgrove, B.; Radakovich, J.; Bosilovich, M.; et al. The Global Land Data Assimilation System. Bull. Am. Meteorol. Soc. 2004, 85, 381–394. [Google Scholar] [CrossRef]
  9. Joint Research Centre (JRC). Copernicus Land Monitoring Service (CLMS); Joint Research Centre (JRC): Brussel, Belgium, 2025. [Google Scholar]
  10. Bierkens, M.F.P. Global Hydrology 2015: State, Trends, and Directions. Water Resour. Res. 2015, 51, 4923–4947. [Google Scholar] [CrossRef]
  11. Reichle, R.H.; De Lannoy, G.J.M.; Liu, Q.; Ardizzone, J.V.; Colliander, A.; Conaty, A.; Crow, W.; Jackson, T.J.; Jones, L.A.; Kimball, J.S.; et al. Assessment of the SMAP Level-4 Surface and Root-Zone Soil Moisture Product Using In Situ Measurements. J. Hydrometeorol. 2017, 18, 2621–2645. [Google Scholar] [CrossRef]
  12. Teixeira, A.C.; Bakon, M.; Lopes, D.; Cunha, A.; Sousa, J.J. A Systematic Review on Soil Moisture Estimation Using Remote Sensing Data for Agricultural Applications. Sci. Remote Sens. 2025, 12, 100328. [Google Scholar] [CrossRef]
  13. Dorigo, W.; Wagner, W.; Albergel, C.; Albrecht, F.; Balsamo, G.; Brocca, L.; Chung, D.; Ertl, M.; Forkel, M.; Gruber, A.; et al. ESA CCI Soil Moisture for Improved Earth System Understanding: State-of-the Art and Future Directions. Remote Sens. Environ. 2017, 203, 185–215. [Google Scholar] [CrossRef]
  14. Surface Soil Moisture 2014-Present (Raster 1 Km), Europe, Daily-Version 1. Available online: https://sdi.eea.europa.eu/catalogue/srv/api/records/e934b15f-7d48-4c6d-a9c6-6484488aa58f (accessed on 20 November 2025).
  15. Xu, N.; Daccache, A.; Ahmadi, A. GSSM-10 (Global 10-m Surface Soil Moisture) Derived from Multi-Sensor Data and Ensemble Learning. Earth Syst. Sci. Data 2024. [Google Scholar] [CrossRef]
  16. Zhu, Z.; Zhang, R.; Fang, B.; Kim, H.; Nguyen, H.H.; Lakshmi, V. A Novel Soil Moisture Evaluation Framework Incorporating Brightness Temperature and a High-Resolution 1 Km Summer Brightness Temperature Dataset. GISci. Remote Sens. 2025, 62, 2491169. [Google Scholar] [CrossRef]
  17. Rabiei, S.; Babaeian, E.; Grunwald, S. Deep Learning-Based Short- and Mid-Term Surface and Subsurface Soil Moisture Projections from Remote Sensing and Digital Soil Maps. Remote Sens. 2025, 17, 3219. [Google Scholar] [CrossRef]
  18. Nikraftar, Z.; Parizi, E.; Saber, M.; Boueshagh, M.; Tavakoli, M.; Esmaeili Mahmoudabadi, A.; Ekradi, M.H.; Mbuvha, R.; Hosseini, S.M. An Interpretable Machine Learning Framework for Unraveling the Dynamics of Surface Soil Moisture Drivers. Remote Sens. 2025, 17, 2505. [Google Scholar] [CrossRef]
  19. Feng, D.; Wang, G.; Wei, X.; Amankwah, S.O.Y.; Hu, Y.; Luo, Z.; Hagan, D.F.T.; Ullah, W. Merging and Downscaling Soil Moisture Data From CMIP6 Projections Using Deep Learning Method. Front. Environ. Sci. 2022, 10, 847475. [Google Scholar] [CrossRef]
  20. Tefera, M.L.; Zeleke, E.B.; Pirastru, M.; Melesse, A.M.; Seddaiu, G.; Awada, H. Satellite-Based Machine Learning for Soil Moisture Prediction and Land Conservation Practice Assessment in West African Drylands. Remote Sens. 2025, 17, 3651. [Google Scholar] [CrossRef]
  21. Beck, H.E.; McVicar, T.R.; Vergopolan, N.; Berg, A.; Lutsko, N.J.; Dufour, A.; Zeng, Z.; Jiang, X.; van Dijk, A.I.J.M.; Miralles, D.G. High-Resolution (1 Km) Köppen-Geiger Maps for 1901–2099 Based on Constrained CMIP6 Projections. Sci. Data 2023, 10, 724. [Google Scholar] [CrossRef]
  22. Hargreaves, G.H.; Allen, R.G. History and Evaluation of Hargreaves Evapotranspiration Equation. J. Irrig. Drain. Eng. 2003, 129, 53–63. [Google Scholar] [CrossRef]
  23. Gao, P.; Mu, X.-M.; Wang, F.; Li, R. Changes in Streamflow and Sediment Discharge and the Response to Human Activities in the Middle Reaches of the Yellow River. Hydrol. Earth Syst. Sci. 2011, 15, 1–10. [Google Scholar] [CrossRef]
  24. Karger, D.N.; Lange, S.; Hari, C.; Reyer, C.P.O.; Conrad, O.; Zimmermann, N.E.; Frieler, K. CHELSA-W5E5: Daily 1 Km Meteorological Forcing Data for Climate Impact Studies. Earth Syst. Sci. Data 2023, 15, 2445–2464. [Google Scholar] [CrossRef]
  25. NASA Center for Climate Simulation (NCCS). NEX-GDDP-CMIP6 2026; NASA Center for Climate Simulation (NCCS): Greenbelt, MD, USA.
  26. Dunne, J.P.; Horowitz, L.W.; Adcroft, A.J.; Ginoux, P.; Held, I.M.; John, J.G.; Krasting, J.P.; Malyshev, S.; Naik, V.; Paulot, F.; et al. The GFDL Earth System Model Version 4.1 (GFDL-ESM4.1): Overall Coupled Model Description and Simulation Characteristics. J. Adv. Model. Earth Syst. 2020, 12, e2019MS002015. [Google Scholar] [CrossRef]
  27. Boucher, O.; Servonnat, J.; Albright, A.L.; Aumont, O.; Balkanski, Y.; Bastrikov, V.; Bekki, S.; Bonnet, R.; Bony, S.; Bopp, L.; et al. Presentation and Evaluation of the IPSL-CM6A-LR Climate Model. J. Adv. Model. Earth Syst. 2020, 12, e2019MS002010. [Google Scholar] [CrossRef]
  28. Müller, W.A.; Jungclaus, J.H.; Mauritsen, T.; Baehr, J.; Bittner, M.; Budich, R.; Bunzel, F.; Esch, M.; Ghosh, R.; Haak, H.; et al. A Higher-Resolution Version of the Max Planck Institute Earth System Model (MPI-ESM1.2-HR). J. Adv. Model. Earth Syst. 2018, 10, 1383–1413. [Google Scholar] [CrossRef]
  29. Sellar, A.A.; Jones, C.G.; Mulcahy, J.P.; Tang, Y.; Yool, A.; Wiltshire, A.; O’Connor, F.M.; Stringer, M.; Hill, R.; Palmieri, J.; et al. UKESM1: Description and Evaluation of the UK Earth System Model. J. Adv. Model. Earth Syst. 2019, 11, 4513–4558. [Google Scholar] [CrossRef]
  30. Riahi, K.; van Vuuren, D.P.; Kriegler, E.; Edmonds, J.; O’Neill, B.C.; Fujimori, S.; Bauer, N.; Calvin, K.; Dellink, R.; Fricko, O.; et al. The Shared Socioeconomic Pathways and Their Energy, Land Use, and Greenhouse Gas Emissions Implications: An Overview. Glob. Environ. Change 2017, 42, 153–168. [Google Scholar] [CrossRef]
  31. European Environment Agency. CORINE Land Cover 2018 (CLC2018); European Environment Agency: Copenhagen, Denmark, 2018.
  32. Hurtt, G.C.; Chini, L.; Sahajpal, R.; Frolking, S.; Bodirsky, B.L.; Calvin, K.; Doelman, J.C.; Fisk, J.; Fujimori, S.; Klein Goldewijk, K.; et al. Harmonization of Global Land Use Change and Management for the Period 850–2100 (LUH2) for CMIP6. Geosci. Model Dev. 2020, 13, 5425–5464. [Google Scholar] [CrossRef]
  33. Harmonized World Soil Database Version 2.0; FAO: Rome, Italy; International Institute for Applied Systems Analysis (IIASA): Laxenburg, Austria, 2023; ISBN 978-92-5-137499-3.
  34. Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-km Spatial Resolution Climate Surfaces for Global Land Areas. Int. J. Climatol. 2017, 37, 4302–4315. [Google Scholar] [CrossRef]
  35. Drought in Europe–August 2022–GDO Analytical Report; Publications Office of the European Union: Luxembourg, 2022.
  36. Büntgen, U.; Urban, O.; Krusic, P.J.; Rybníček, M.; Kolář, T.; Kyncl, T.; Ač, A.; Koňasová, E.; Čáslavský, J.; Esper, J.; et al. Recent European Drought Extremes beyond Common Era Background Variability. Nat. Geosci. 2021, 14, 190–196. [Google Scholar] [CrossRef]
  37. Bauer-Marschallinger, B.; Paulik, C. Copernicus Global Land Service: Surface Soil Moisture 1 Km–Product User Manual; Copernicus Global Land Service: Vienna, Austria, 2019. [Google Scholar]
  38. Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792. [Google Scholar] [CrossRef]
  39. Nordhausen, K. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, Jerome Friedman. Int. Stat. Rev. 2009, 77, 482. [Google Scholar] [CrossRef]
  40. Biau, G.; Scornet, E. A Random Forest Guided Tour. Test 2016, 25, 197–227. [Google Scholar] [CrossRef]
  41. Taheri, M.; Bigdeli, M.; Imanian, H.; Mohammadian, A. An Overview of Machine-Learning Methods for Soil Moisture Estimation. Water 2025, 17, 1638. [Google Scholar] [CrossRef]
  42. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely Randomized Trees. Mach. Learn. 2006, 63, 3–42. [Google Scholar] [CrossRef]
  43. Sudhakara, B.; Bhattacharjee, S. High-Resolution Soil Moisture Estimation: A Case Study in Coastal India. J. Indian Soc. Remote Sens. 2025, 53, 2647–2665. [Google Scholar] [CrossRef]
  44. Lamichhane, M.; Mehan, S.; Mankin, K.R. Soil Moisture Prediction Using Remote Sensing and Machine Learning Algorithms: A Review on Progress, Challenges, and Opportunities. Remote Sens. 2025, 17, 2397. [Google Scholar] [CrossRef]
  45. Ibrahem Ahmed Osman, A.; Najah Ahmed, A.; Chow, M.F.; Feng Huang, Y.; El-Shafie, A. Extreme Gradient Boosting (Xgboost) Model to Predict the Groundwater Levels in Selangor Malaysia. Ain Shams Eng. J. 2021, 12, 1545–1556. [Google Scholar] [CrossRef]
  46. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
  47. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  48. Sagawa, S.; Koh, P.W.; Hashimoto, T.B.; Liang, P. Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization. arXiv 2020, arXiv:1911.08731. [Google Scholar] [CrossRef]
  49. Fontem, B.; Ji, R. Distributionally Robust Optimization with Generalized Total Variation Ambiguity Sets. Eur. J. Oper. Res. 2026, 328, 894–911. [Google Scholar] [CrossRef]
  50. Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A Novel Approach to Explain the Black-Box Nature of Machine Learning in Compressive Strength Predictions of Concrete Using Shapley Additive Explanations (SHAP). Case Stud. Constr. Mater. 2022, 16, e01059. [Google Scholar] [CrossRef]
  51. Qiu, Y.; Fu, B.; Wang, J.; Chen, L. Soil Moisture Variation in Relation to Topography and Land Use in a Hillslope Catchment of the Loess Plateau, China. J. Hydrol. 2001, 240, 243–263. [Google Scholar] [CrossRef]
  52. Magdić, I.; Safner, T.; Rubinić, V.; Rutić, F.; Husnjak, S.; Filipović, V. Effect of Slope Position on Soil Properties and Soil Moisture Regime of Stagnosol in the Vineyard. J. Hydrol. Hydromech. 2022, 70, 62–73. [Google Scholar] [CrossRef]
  53. Fraga, H.; Moriondo, M.; Leolini, L.; Santos, J.A. Mediterranean Olive Orchards under Climate Change: A Review of Future Impacts and Adaptation Strategies. Agronomy 2021, 11, 56. [Google Scholar] [CrossRef]
  54. Rienth, M.; Scholasch, T. State-of-the-Art of Tools and Methods to Assess Vine Water Status. OENO One 2019, 53. [Google Scholar] [CrossRef]
  55. Machado, B.D.; Magro, M.; de Souza, D.S.; Rufato, L.; Kretzschmar, A.A. Study on the Growth and Spatial Distribution of the Root System of Different European Pear Cultivars on Quince Rootstock Combinations. Rev. Bras. Frutic. 2018, 40. [Google Scholar] [CrossRef]
  56. Zheng, L.; Ma, J.; Sun, X.; Guo, X.; Cheng, Q.; Shi, X. Estimating the Root Water Uptake of Surface-Irrigated Apples Using Water Stable Isotopes and the Hydrus-1D Model. Water 2018, 10, 1624. [Google Scholar] [CrossRef]
  57. Aguzzoni, A.; Engel, M.; Zanotelli, D.; Penna, D.; Comiti, F.; Tagliavini, M. Water Uptake Dynamics in Apple Trees Assessed by an Isotope Labeling Approach. Agric. Water Manag. 2022, 266, 107572. [Google Scholar] [CrossRef]
  58. Savi, T.; Petruzzellis, F.; Martellos, S.; Stenni, B.; Dal Borgo, A.; Zini, L.; Lisjak, K.; Nardini, A. Vineyard Water Relations in a Karstic Area: Deep Roots and Irrigation Management. Agric. Ecosyst. Environ. 2018, 263, 53–59. [Google Scholar] [CrossRef]
  59. Gambetta, G.A.; Herrera, J.C.; Dayer, S.; Feng, Q.; Hochberg, U.; Castellarin, S.D. The Physiology of Drought Stress in Grapevine: Towards an Integrative Definition of Drought Tolerance. J. Exp. Bot. 2020, 71, 4658–4676. [Google Scholar] [CrossRef]
  60. García-Tejera, O.; López-Bernal, Á.; Orgaz, F.; Testi, L.; Villalobos, F.J. Are Olive Root Systems Optimal for Deficit Irrigation? Eur. J. Agron. 2018, 99, 72–79. [Google Scholar] [CrossRef]
  61. Brito, C.; Dinis, L.-T.; Moutinho-Pereira, J.; Correia, C.M. Drought Stress Effects and Olive Tree Acclimation under a Changing Climate. Plants 2019, 8, 232. [Google Scholar] [CrossRef] [PubMed]
  62. Orth, R.; Seneviratne, S.I. Analysis of Soil Moisture Memory from Observations in Europe. J. Geophys. Res. Atmos. 2012, 117, D15115. [Google Scholar] [CrossRef]
  63. van der Linden, E.C.; Haarsma, R.J.; van der Schrier, G. Impact of Climate Model Resolution on Soil Moisture Projections in Central-Western Europe. Hydrol. Earth Syst. Sci. 2019, 23, 191–206. [Google Scholar] [CrossRef]
  64. Vicente-Serrano, S.M.; Peña-Angulo, D.; Beguería, S.; Domínguez-Castro, F.; Tomás-Burguera, M.; Noguera, I.; Gimeno-Sotelo, L.; El Kenawy, A. Global Drought Trends and Future Projections. Philos. Trans. A Math. Phys. Eng. Sci. 2022, 380, 20210285. [Google Scholar] [CrossRef]
  65. Manning, C.; Widmann, M.; Bevacqua, E.; Van Loon, A.F.; Maraun, D.; Vrac, M. Soil Moisture Drought in Europe: A Compound Event of Precipitation and Potential Evapotranspiration on Multiple Time Scales. J. Hydrometeorol. 2018, 19, 1255–1271. [Google Scholar] [CrossRef]
  66. Mondal, S.K.; An, S.-I.; Min, S.-K.; Jiang, T.; Su, B. Enhanced Soil Moisture–Temperature Coupling Could Exacerbate Drought under Net-Negative Emissions. npj Clim. Atmos. Sci. 2024, 7, 265. [Google Scholar] [CrossRef]
  67. Santos, J.A.; Fraga, H.; Malheiro, A.C.; Moutinho-Pereira, J.; Dinis, L.-T.; Correia, C.; Moriondo, M.; Leolini, L.; Dibari, C.; Costafreda-Aumedes, S.; et al. A Review of the Potential Climate Change Impacts and Adaptation Options for European Viticulture. Appl. Sci. 2020, 10, 3092. [Google Scholar] [CrossRef]
  68. Hannah, L.; Roehrdanz, P.R.; Ikegami, M.; Shepard, A.V.; Shaw, M.R.; Tabor, G.; Zhi, L.; Marquet, P.A.; Hijmans, R.J. Climate Change, Wine, and Conservation. Proc. Natl. Acad. Sci. USA 2013, 110, 6907–6912. [Google Scholar] [CrossRef]
  69. Osorio-Marín, J.; Fernandez, E.; Vieli, L.; Ribera, A.; Luedeling, E.; Cobo, N. Climate Change Impacts on Temperate Fruit and Nut Production: A Systematic Review. Front. Plant Sci. 2024, 15, 1352169. [Google Scholar] [CrossRef]
  70. Martínez, J.-P.; Sagredo, B.; Moreno, M.Á. Editorial: Using Rootstocks in Crops and Fruit Trees to Mitigate the Effects of Climate Change and Abiotic Stress. Front. Plant Sci. 2024, 15, 1479317. [Google Scholar] [CrossRef] [PubMed]
  71. Ananthakrishnan, S.; Sharma, J.C.; Sharma, N.; Kumar, S.; Shankar, S.V.; Ranjha, R.; Lalkhumliana, F.; Sharma, K.; Aravinthkumar, A. Mulching and Irrigation Strategies for Climate Resilient Apple Cultivation in High-Density Orchards. Sci. Rep. 2025, 15, 17125. [Google Scholar] [CrossRef] [PubMed]
  72. Muñoz Sabater, J. ERA5-Land Hourly Data from 1950 to Present 2019; Climate Data Store: Bologna, Italy, 2025. [Google Scholar]
Figure 1. (a) Geographical location of the five selected European LLs (red rectangles). (b–f) Spatial distribution of dominant agricultural systems: (b) LL1 (Luso-Galician), (c) LL2 (Andalusian), (d) LL3 (Northwest Italy), (e) LL4 (Loire Valley and Beaujolais), and (f) LL5 (Grójec).
Figure 2. Schematic workflow of the study: (1) collection and integration of climate, land-use and soil, topographic, and Sentinel-1 SSM data; (2) development and validation of ML models (Linear Regression, ExtraTrees, LightGBM, Random Forest, and XGBoost); (3) generation of future SSM projections using downscaled CMIP6 climate scenarios under different SSPs; and (4) application of projected outputs to drought risk assessment, agro-hydrological modelling, and climate adaptation planning.
Figure 3. Location of ISMN stations and correlation with Sentinel-1 SSM estimates (2014–2024).
Figure 4. Comparison of regression and tree-based ML models for monthly SSM prediction across the three crop types.
Figure 5. Relative performance gain of tree-based models (Random Forest, XGBoost, ExtraTrees, LightGBM) vs. linear regression across crop types (olive groves, fruit trees, vineyards). (a) Percentage increase in R² relative to linear regression; (b) percentage reduction in MAE relative to linear regression. Asterisks denote statistical significance (* p < 0.05, ** p < 0.01, *** p < 0.001).
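
The two quantities plotted in Figure 5 can be written out as a short sketch. The function names are hypothetical, the formulas are the usual relative-change definitions implied by the caption, and the example scores are invented rather than taken from the study. The star thresholds follow the caption.

```python
def r2_gain_pct(r2_model, r2_linear):
    """Percentage increase in R^2 relative to the linear-regression baseline."""
    return 100.0 * (r2_model - r2_linear) / r2_linear


def mae_reduction_pct(mae_model, mae_linear):
    """Percentage reduction in MAE relative to the linear-regression baseline."""
    return 100.0 * (mae_linear - mae_model) / mae_linear


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used in the caption."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Invented scores, for illustration only.
print(r2_gain_pct(0.75, 0.50))       # tree-based R^2 vs. linear R^2
print(mae_reduction_pct(3.0, 4.0))   # tree-based MAE vs. linear MAE
print(significance_stars(0.03))
```

Which statistical test produced the p-values is not given in the caption, so only the mapping to asterisks is shown.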
Figure 6. SHAP summary (a) and dependence plots (b–e) for olive groves, showing top predictors and effects of precipitation, rsds and lagged terms.
Figure 7. SHAP summary (a) and dependence plots (b–e) for vineyards, showing top predictors and effects of PET, precipitation and lagged terms.
Figure 8. SHAP summary (a) and dependence plots (b–e) for fruit trees, showing top predictors and effects of precipitation, PET, rsds and lagged terms.
Figure 9. Raincloud plots of ensemble-median SSM (% across GCMs) for the historical (1981–2010) and future (2041–2070, SSP1-2.6 and SSP5-8.5) scenarios, shown per Living Lab and crop. (a) LL1, (b) LL2, (c) LL3, (d) LL4 and (e) LL5, each panel showing the distributions for the crops present in that Living Lab.
Figure 10. Ensemble-median SSM changes (%) by Living Lab and crop type under SSP1-2.6 and SSP5-8.5.
Figure 11. Precipitation (a,b) and PET (c,d) changes (2041–2070 vs. 1981–2010) under SSP1-2.6 (a,c) and SSP5-8.5 (b,d).
Table 1. Description of the five European LLs: region, main crops, and climate classification.

| Living Lab | Region | Main Crops | Köppen–Geiger | Climate Signature |
|---|---|---|---|---|
| LL1 (Luso-Galician) | N Portugal and Galicia (Spain) | Olive groves; fruit trees; vineyards | Csa; Csb | Mild wet winters; dry summers |
| LL2 (Andalusian) | Southern Spain | Olive groves; fruit trees; vineyards | Csa | Hot; low and irregular rainfall |
| LL3 (Northwest Italy) | Northwestern Italy | Fruit trees; vineyards | Cfa | Rainfall year-round; warm summers |
| LL4 (Loire Valley and Beaujolais) | Central and eastern France | Fruit trees; vineyards | Cfb | Moderate; regular precipitation |
| LL5 (Grójec) | Central Poland | Fruit trees | Dfb | Cold winters; warm summers |

Csa: Hot-summer Mediterranean; Csb: Warm-summer Mediterranean; Cfa: Humid subtropical; Cfb: Temperate oceanic; Dfb: Humid continental.
Table 2. Machine learning models and hyperparameter configuration.

| Learning Strategy | Model | Hyperparameters |
|---|---|---|
| Bagging | Random Forest (RF) | n_estimators = 400; max_depth = None; min_samples_leaf = 1 |
| Bagging | Extra Trees (ET) | n_estimators = 400; max_depth = None; min_samples_leaf = 1 |
| Boosting | XGBoost (XGB) | n_estimators = 400; max_depth = 6; learning_rate = 0.05 |
| Boosting | LightGBM (LGBM) | n_estimators = 400; learning_rate = 0.05; max_depth = -1; num_leaves = 64 |
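
As a rough illustration of how the Table 2 configuration might be expressed in code, the sketch below stores the listed hyperparameters in a dictionary and instantiates the corresponding regressors from scikit-learn, XGBoost and LightGBM when those packages are installed. The choice of these three libraries, the `build_models` helper and the `random_state` argument are assumptions made for the example; the table does not say which implementations or seeds were used.

```python
import importlib

# Hyperparameters as listed in Table 2.
HYPERPARAMS = {
    "RF": {"n_estimators": 400, "max_depth": None, "min_samples_leaf": 1},
    "ET": {"n_estimators": 400, "max_depth": None, "min_samples_leaf": 1},
    "XGB": {"n_estimators": 400, "max_depth": 6, "learning_rate": 0.05},
    "LGBM": {"n_estimators": 400, "learning_rate": 0.05,
             "max_depth": -1, "num_leaves": 64},
}

# Assumed library implementations for each model (not stated in the table).
ESTIMATORS = {
    "RF": ("sklearn.ensemble", "RandomForestRegressor"),
    "ET": ("sklearn.ensemble", "ExtraTreesRegressor"),
    "XGB": ("xgboost", "XGBRegressor"),
    "LGBM": ("lightgbm", "LGBMRegressor"),
}


def build_models(random_state=0):
    """Instantiate every regressor whose package can be imported.

    `random_state` is an assumption added for reproducibility of the sketch.
    """
    models = {}
    for name, (module_name, class_name) in ESTIMATORS.items():
        try:
            estimator_cls = getattr(importlib.import_module(module_name), class_name)
        except (ImportError, OSError):
            continue  # package not available in this environment; skip it
        models[name] = estimator_cls(random_state=random_state, **HYPERPARAMS[name])
    return models


models = build_models()
print(sorted(models))
```

Each returned object follows the usual `fit(X, y)` / `predict(X)` interface, so the four models could be trained on the same predictor matrix and compared against a linear-regression baseline.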
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

APA Style

Guimarães, N., Fraga, H., Fonseca, A., Pacheco, F., Fernandes, L. F., Moura, J. P., Carlos, C., Pereira, L., Jurado, J. M., Negri, S., Jonczak, J., & Santos, J. A. (2026). High-Resolution Soil Surface Moisture Projections for European Perennial Crops: A Machine Learning Framework Integrating Sentinel-1 and CMIP6 Climate Scenarios. Remote Sensing, 18(12), 1902. https://doi.org/10.3390/rs18121902