1. Introduction
In agricultural landscapes, the availability of water for irrigation can be a key driver for rapid changes in land use patterns, enabling the transition from non-irrigated to irrigated crops. The construction of dams normally has a significant impact on the landscape, through both direct and indirect mechanisms [
1,
2]. The availability of water may lead farmers to shift to high-value crops, with the prospect of increasing their economic revenue. However, the economic benefit may vary due to the area of influence of the dam [
3]. Land use changes can include the conversion of forest land areas into grassland [
4], replacement of non-irrigated crops by irrigated crops, substitution of low-intensity by high-intensity crops, or the establishment of large-scale farms [
5].
The ability to monitor land use land cover (LULC) changes in agricultural landscapes is essential to support natural resource management, spatial planning, and environmental and conservation management. An effective monitoring procedure needs to have a high discrimination level, making it capable of identifying specific crops. In the US, mapping of crops has been available nationwide since 2008, based on the classification of remote sensing imagery of the cropland at 30 m resolution [
6]. The legend of the classification covers 109 different crops, including 84 annual (single and double crops) and 25 permanent, and additional classes for forest and shrubland. In Europe, the CORINE Land Cover (CLC) inventory is the main reference for land use monitoring, but it does not focus on crop mapping. The low spatio-temporal resolution of CLC, based on a minimum mapping unit of 25 ha and 6-year update cycles, results in important limitations for effective monitoring of areas with high variability, such as agricultural landscapes. Its thematic classification is not sufficient to document different agricultural practices [
7].
In 2021, the European Commission’s Joint Research Centre (JRC) published one of the first attempts to map agricultural crops across Europe, the EU Crop Map [
8], which was a snapshot for 2018 based on Sentinel-1 imagery. However, the classification used in this study was primarily focused on annual crops, while permanent crops were grouped together with woody vegetation into a single class “woodlands and shrubland”. A new version of the EU Crop Map was released for the year 2022, also incorporating Sentinel-2 data [
9] but maintaining the classification scheme. Overall, the 10 m spatial resolution of this initiative meant a significant improvement in detail compared to other initiatives, which often relied on lower-resolution datasets, but it still does not discriminate between permanent crops.
Other LULC maps exist for Europe. The Land Use/Cover Area frame Survey (LUCAS), a large-scale systematic survey, based on in situ observations, was first conducted in 2001, over a 2 km grid across EU member states [
10]. However, the classification scheme used in LUCAS cannot discriminate among permanent crop types. The EU Crop Map used LUCAS as one of the sources for the model-training process. Sainte Fare Garnot et al. [
11] and the Land Use, Land Use Change and Forestry (LULUCF) inventory [
12] rely on an even more detailed legend at national level. The former proposes a promising spatio-temporal encoder, applied to 20 classes of crops relevant to subsidy allocations in France, which outperformed other state-of-the-art methods. The latter focused on supporting greenhouse gas (GHG) reporting. However, neither provides detailed classification schemes that distinguish permanent crop types at the species level.
In southern Portugal (Alentejo region), construction of the Alqueva dam was completed in 2002. It became one of the largest artificial lakes in Europe, with a storage capacity exceeding 4150 million m³ of water over an area of 25,000 ha. This development project aimed to ensure a public water supply for agriculture, industry, and human consumption, as well as the production of clean energy, and boost the regional tourist sector. The Alqueva dam controls the main reservoir, although the development project now includes more than 70 satellite dams or reservoirs, connected by over 2000 km of canals and pipelines, as part of the irrigation area (EFMA). The socio-economic impacts of the project are evident today, particularly in agriculture. Approximately 130,000 ha are currently irrigated, with an additional 35,000 ha planned, bringing the total irrigated area to nearly 165,000 ha. The availability of water supports a large number of irrigated crops in this area. According to Empresa de Desenvolvimento e Infra-estruturas do Alqueva (EDIA), the public company that manages the Alqueva Multipurpose Project, data from registered beneficiaries within the irrigation perimeter indicate that the proportion of irrigated land occupied by permanent crops increased from 75% in 2018 to 82% in 2021. This area, totaling approximately 90,000 ha, is predominantly composed of olive groves (71,000 ha), followed by nut crops, particularly almond groves, which have expanded notably in recent years to cover 23,860 ha. Vineyards are the third most prevalent permanent crop, albeit at a much smaller scale, with a total area of 5990 ha [
13].
The transformation of the agricultural landscape is observed not only in the shift to permanent crops, but also in the adoption of more intensive plantation systems. Recent olive groves can be planted under two systems: potted or intensive olive groves (HD), with planting densities between 200 and 600 trees per hectare and managed for medium-sized canopies, and hedged or super-intensive olive groves (SHD), with densities of 1000 to 2500 trees per hectare and managed as smaller trees forming continuous hedges. A third cultivation method, the traditional olive grove, is typically practiced without irrigation and is characterized by extensive and widely spaced planting. This system still occupies a significant area within the EFMA, but its yields fall below the species’ potential, due to the advanced age of the trees and the use of non-optimized cultural practices. Consequently, changes in plantation systems are anticipated for this crop. However, intensive olive farming has been shown to have negative biodiversity impacts in the region [
14]. Other studies also indicate that intensive olive groves may lead to impoverished habitat quality and other ecological impacts due to high input of phytosanitary treatments, fertilizers and water supply [
15]. Given the marked differences between these three olive grove management systems, this study will attempt to distinguish them in the classification of permanent crops.
Mapping LULC in agricultural landscapes through satellite images and machine learning has been extensively addressed in the scientific literature. Open data policies adopted by space agencies such as the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA) have revolutionized large-scale mapping by providing access to high-quality, low-cost imagery from programs such as Landsat and Sentinel. In addition, freely accessible platforms, such as the Sentinel Application Platform (SNAP), or cloud-based platforms such as Google Earth Engine (GEE) [
16], allow us to apply machine learning techniques to produce land use maps with remarkable precision [
17]. Studies are performed with a single satellite mission [
8,
18,
19], but also combining missions from different agencies [
20,
21,
22]. This study is particularly inspired by research that integrates multiple missions, as highlighted in several works [
23,
24,
25] due to the consensus that combining data from multiple missions enhances the precision of the results obtained. Moreover, the latest version of the EU Crop Map [
9] already incorporates data from two missions (Sentinel-1 and Sentinel-2).
This study contributes to improving the EU Crop Map through several methodological advances. First, we combine multiple ground truth sources to compile a comprehensive reference dataset of over 25,000 labeled points, representative of Mediterranean permanent crops, which is made publicly available on Zenodo [
26]. Second, we use a Random Forest (RF) classifier with input variables specifically relevant for distinguishing olive groves with distinct cultivation systems and vineyards, as these classes exhibit spatial patterns critical for accurate classification. To this end, we integrate both optical (Sentinel-2) and radar (Sentinel-1) imagery and include variables measuring spatial variability through texture analysis. Third, we analyze feature importance, not addressed in [
6], to understand the relative contributions of both sensors for class prediction. Finally, following previous work [
27], we explore optimal sample size requirements not only for overall classifier performance but specifically for each permanent crop class, identifying which land cover classes benefit most from extensive reference datasets. This study also adds to the corpus of knowledge already existing on crop classification in the Mediterranean region using Sentinel-2 [
28,
29].
Hence, to meet the objectives of this study, we explored the integration of data from two ESA missions, Sentinel-1 and Sentinel-2, associated with the use of the supervised machine learning algorithm Random Forest in order to refine the legend of the EU Crop Map. The main goal is to disaggregate the “woodland and shrubland” class into a finer permanent crop classification. We developed a methodology to differentiate between permanent crops, based on factors such as plantation intensity, and classify them, while showing the potential of combining satellite information from the Sentinel-1 and Sentinel-2 missions for land use mapping production.
The second goal of this study is to evaluate two key aspects of the labeled input data used to train the model: the quality of the features used, and the quantity of labeled samples required to achieve good model precision. These assessments are intended to support both model performance evaluation and the potential for sample size reduction. To develop and test the proposed methodology, we selected as a pilot region the Alqueva multipurpose development area (EFMA)—a landscape marked by rapid changes in agricultural use—where accurately distinguishing between different types of permanent crops is particularly important.
2. Materials and Methods
2.1. Study Area
This study was conducted in southern Portugal, in the area of influence of the Alqueva Dam, in areas within the EFMA classified as “woodland and shrubland” by the EU Crop Map (
Figure 1). The landscape of the study area is predominantly covered by Quercus woodlands, mostly cork oaks (
Quercus suber) and holm oaks (
Quercus rotundifolia). At the forestry level, two other relevant species for the area are stone pine (
Pinus pinea) and eucalyptus (
Eucalyptus globulus). In addition, agriculture plays a crucial role in the region, with a diverse range of cultivated crops. These can be broadly classified into annual crops and permanent crops, with olive groves being the most prominent, and almond groves becoming increasingly present, although other types of orchards and vineyards should also be considered. Among olive groves, traditional, high-density (HD), and super-high-density (SHD) cultivation methods are used.
2.2. Data Sources and Labeling Ground Truth
Several data sources were used to identify and classify the reference samples. The area of interest (AOI) was defined by the boundaries of EFMA. The classification was based on Portugal’s land parcel identification system (SIP), as defined by IFAP’s 2018 SIP [
30]. Within the AOI, this layer was intersected with areas classified as “Woodland and shrubland type of vegetation” in the EU Crop Map 2018 [
8]. The result was aggregated into SIP polygons (hereafter referred to as plots) for gathering and organizing additional reference data. Finally, plots were filtered to retain only those larger than 1 ha, to ensure adequate spatial homogeneity. This resulted in 6349 plots to be classified.
We used the year 2018 as a reference for the classification procedure. Each plot was manually classified based on the following sources:
Orthophoto images in Google Earth Pro [
31];
2018 orthophotos made available by Direção-Geral do Território (DGT) [
32];
Google Street View functionality [
33];
2018 iSIP occupancy polygons file, made available by IFAP [
30];
Polygon file from Carta de Uso e Ocupação do Solo (COS 2018), developed by DGT [
34].
The last source, COS 2018, was produced by extensive photointerpretation over orthophotos, being, for this reason, a particularly important high-quality land use map for Portugal. However, similarly to CLC, it has spatio-temporal resolution limitations, because its minimum unit area is 1 ha, and it is produced with a typical periodicity of 5 years.
We defined a new classification scheme for this study, in order to allow the discrimination between traditional, high-density (HD) and super-high-density (SHD) olive groves. A class was created for each of these land use types. The other individually classified permanent crops were almond groves and vineyards. All other permanent crops, which are much less significant in the study area, have been grouped into a single class named “Other permanent crops”. Forests were grouped in a class “Forest”, as they are not a focus in this study. Finally, a class named “Other occupations” aggregates land covers that do not fit into any of the previous classes (
Table 1). Occupations like water surfaces, wetlands, bare lands, and built-up areas are masked in the product provided by the EU Crop Map.
We chose to carry out the classification task of the 6349 plots manually in order to guarantee the best possible quality of the training data for the machine learning model. With the final legend set up, random training points were collected at a density of approximately 4 points per 100 ha from the plots, resulting in a total of 25,398 samples, distributed among the classes as shown in
Table 1. The dataset is available at Zenodo [
26].
2.3. Satellite Data Sources
The satellite data used in this work were obtained for the year 2018 from two European Space Agency (ESA) missions: Sentinel-1 (S1) and Sentinel-2 (S2). The missions differ significantly in the type of information they collect. Sentinel-1 provides radar backscatter data, which are sensitive to surface structure and roughness, while Sentinel-2 captures information from the optical spectrum. A more detailed description of each mission is provided below.
Sentinel-1: In 2018, this ESA mission comprised two satellites: Sentinel-1A, launched in 2014, and Sentinel-1B, launched in 2016. As an active sensor system, Sentinel-1 does not depend on sunlight reflected by the Earth’s surface, unlike optical sensors such as S2; instead, it emits its own energy to acquire data. Sentinel-1 provides a spatial resolution of up to 5 m and a temporal resolution of 6 days. This mission operates in the C band, with a center frequency of 5.405 GHz, corresponding to a wavelength of 5.55 cm, which enables image acquisition regardless of the presence of clouds. The Interferometric Wide swath (IW) acquisition mode was selected, which records VV and VH polarizations, representing vertically transmitted signals received vertically and horizontally, respectively [
5]. Sentinel-1 products are available at three processing levels. In this study, we used Level-1 with Ground Range Detected (GRD), and Single Look Complex (SLC) images [
35].
Sentinel-2: In 2018, this mission also consisted of a constellation of two satellites, Sentinel-2A launched in 2015 and Sentinel-2B in 2017. Unlike Sentinel-1, Sentinel-2 is a passive optical sensor that requires sunlight and cloud-free conditions for image acquisition. Multispectral information is obtained across 13 different bands with spatial resolutions ranging from 10 m to 60 m, and central wavelengths spanning 443 nm to 2190 nm, encompassing the visible, near-infrared, and shortwave infrared regions of the electromagnetic spectrum. Each satellite completes a cycle in 10 days, resulting in a mission time resolution of 5 days. ESA provides Sentinel-2 imagery at five different processing levels: Level 0, Level 1A, Level 1B, Level 1C, and Level 2A. In this work, we used Level 2A, which provides atmospherically corrected Surface Reflectance (SR) products derived from Level-1C products [
26].
2.4. Pre-Processing Satellite Data
To build the satellite dataset, monthly composites were created for the full 2018 calendar year, temporally matching the ground truth reference data. Including a full year ensures representation of all phenological stages in permanent crops. Additionally, to capture differences in phenological development between crops, five normalized difference indices were calculated as described below.
The methodology adopted in this study is depicted in
Figure 2. Three different platforms were adopted for processing and visualization: Google Earth Engine [
16], which is free for research purposes, and includes pre-processed satellite datasets available as collections; Jupyter Notebook v6.5.3 [
36], which is an IDE to run python v3.11.1 scripts, where the classification model was developed; and QGIS [
37], which allows easy visualization and handling of the geospatial files resulting from this methodology.
Satellite data were acquired through the GEE platform using the script available in the GitHub repository referenced in the Data Availability Statement. A time series spanning the entire calendar year 2018 (January–December), was collected to ensure that all phenological phases of the target crop species were captured, which is expected to improve the classification model performance. Given that permanent crops are the primary focus, the probability of significant land occupation changes occurring within individual parcels is minimal, ensuring that seasonal spectral variations in these crops are adequately represented.
The processing of Sentinel-1 (S1) data follows closely the methodology of the EU Crop Map 2018 [
8]. In GEE, the Level-1 Ground Range Detected (GRD) was selected. No terrain correction was applied, since most crops in the area are typically found in flat areas [
8]. For each scene, three products are acquired from S1: the VH and VV polarizations and the cross-polarization ratio (VHVV). The details of the processing are presented in
Appendix A.
The main difference from the EU Crop Map 2018 methodology lies in the temporal compositing strategy. We created monthly composites for the period January to December 2018, whereas the EU Crop Map employed 10-day composites. Given that permanent crops generally exhibit less pronounced phenological changes than annual crops, a lower temporal resolution was considered sufficient. This approach yielded three layers per month (VH, VV, and VHVV), totaling 36 Sentinel-1 layers.
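The monthly compositing step can be sketched outside GEE with NumPy. This is only an illustrative sketch under stated assumptions: the scene values and dates are synthetic, the median is an assumed aggregation statistic (the actual processing is described in Appendix A), and the VHVV product is taken as VH minus VV on the assumption that backscatter is expressed in dB.

```python
import numpy as np

def monthly_composites(values, months, n_months=12):
    """Aggregate per-scene values into one composite per calendar month.

    values: array of shape (n_scenes, rows, cols), backscatter in dB.
    months: array of shape (n_scenes,), month (1-12) of each scene.
    Returns an array of shape (n_months, rows, cols); months without
    any scene are left as NaN.
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    out = np.full((n_months,) + values.shape[1:], np.nan)
    for m in range(1, n_months + 1):
        sel = months == m
        if sel.any():
            # Median over the scenes of month m (assumed statistic).
            out[m - 1] = np.median(values[sel], axis=0)
    return out

# Synthetic example: 5 scenes per month over a 4 x 4 pixel window.
rng = np.random.default_rng(0)
months = np.repeat(np.arange(1, 13), 5)
vv = rng.normal(-10.0, 2.0, size=(months.size, 4, 4))
vh = rng.normal(-17.0, 2.0, size=(months.size, 4, 4))
vhvv = vh - vv  # a ratio in linear units is a difference in dB (assumption)

# Three products x 12 months = 36 Sentinel-1 layers.
s1_stack = np.concatenate(
    [monthly_composites(p, months) for p in (vh, vv, vhvv)], axis=0
)
print(s1_stack.shape)
```

The resulting stack has one layer per product and month, which matches the count of 36 Sentinel-1 layers given above.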
Sentinel-2 (S2) Level-2A imagery was processed using the GEE platform. All scenes intersecting the EFMA were collected, without cloud cover filtering. Instead, to remove noise from cloud cover, we used the Cloud Score+ S2_HARMONIZED dataset [
38]. After cloud removal, we calculated monthly composites for the period January–December 2018. In this case, the median of each pixel in bands B3, B4, B8, B8A, B11, and B12 was calculated, following a procedure similar to Fatchurrachman et al. [
23]. Five normalized difference indices were then computed: the Normalized Difference Vegetation Index (NDVI) [
39], the Normalized Difference Built-up Index (NDBI) [
40], the Normalized Difference Water Index (NDWI) [
41], the Normalized Burn Ratio (NBR) [
42], and the Normalized Burn Ratio 2 (NDMIR) [
43]. Remaining gaps due to persistent cloud cover were filled using linear interpolation between the monthly composites [
44,
45]. The final Sentinel-2 dataset comprised 60 layers (five indices multiplied by 12 months).
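The index calculation and the gap-filling step can be illustrated with a short NumPy sketch. The band pairs assigned to each index below are assumptions based on common definitions, since the text lists the bands and the indices but not the pairing; the reflectance values and the monthly series are invented for illustration.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    safe = np.where(denom == 0, 1.0, denom)  # avoid division by zero
    return np.where(denom == 0, np.nan, (a - b) / safe)

# Assumed band pairs (common definitions, not confirmed by the text).
INDEX_BANDS = {
    "NDVI": ("B8", "B4"),
    "NDBI": ("B11", "B8"),
    "NDWI": ("B3", "B8"),
    "NBR": ("B8", "B12"),
    "NDMIR": ("B11", "B12"),
}

def compute_indices(bands):
    """Compute the five normalized difference indices from a band dict."""
    return {
        name: normalized_difference(bands[p], bands[q])
        for name, (p, q) in INDEX_BANDS.items()
    }

def fill_gaps_linear(series):
    """Linearly interpolate NaN gaps in a 1-D monthly series.

    Gaps at the start or end take the nearest valid value, because
    np.interp does not extrapolate.
    """
    series = np.asarray(series, dtype=float)
    valid = ~np.isnan(series)
    if not valid.any():
        return series
    idx = np.arange(series.size)
    return np.interp(idx, idx[valid], series[valid])

# Hypothetical surface reflectance values for one vegetated pixel.
bands = {"B3": 0.06, "B4": 0.05, "B8": 0.35, "B8A": 0.36, "B11": 0.20, "B12": 0.10}
indices = compute_indices(bands)
print(round(float(indices["NDVI"]), 2))  # 0.75

# Hypothetical monthly NDVI series with cloud-induced gaps.
ndvi_series = np.array([0.40, np.nan, 0.60, 0.70, np.nan, np.nan, 0.40])
filled = fill_gaps_linear(ndvi_series)
print(filled)
```

Applying the same two functions to every pixel and month would give five index layers per month, consistent with the 60 Sentinel-2 layers reported.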
Permanent crops and forested areas often exhibit high intraclass variability at the 10 m scale, which represents a challenge for pixel-based Random Forest classification, as it lacks the spatial context provided by neighboring pixels [
46,
47,
48]. To address this, a texture analysis was conducted using a 3 × 3 neighborhood window, with the GEE function ee.Image.reduceNeighborhood. The standard deviation kernel (stdDev) was computed for the annual mean of each of the spectral products (VV, VH, and VHVV from S1; NDVI, NDBI, NDWI, NBR, and NDMIR from S2). Instead of applying this function to all 96 pre-processed layers, it was applied to the mean values of these products to avoid excessive data redundancy. This step generated eight additional layers, yielding a total of 104 classification inputs for classifying the study area.
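As a rough local analogue of the texture step, the sketch below computes a 3 × 3 moving-window standard deviation on an annual mean layer with NumPy. It is not the GEE implementation: the edge padding, the use of the population standard deviation, and the synthetic NDVI stack are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def focal_std(image, size=3):
    """Standard deviation within a size x size moving window.

    The image is edge-padded so the output keeps the input shape
    (the padding mode is an assumption of this sketch).
    """
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return windows.std(axis=(-2, -1))  # population standard deviation

# Synthetic monthly NDVI stack: 12 months over a 6 x 6 pixel window.
rng = np.random.default_rng(1)
ndvi_monthly = rng.uniform(0.2, 0.8, size=(12, 6, 6))

# Texture is computed on the annual mean, not on each monthly layer.
ndvi_annual_mean = ndvi_monthly.mean(axis=0)
ndvi_texture = focal_std(ndvi_annual_mean)
print(ndvi_texture.shape)

# Feature count: 3 S1 products and 5 S2 indices, monthly, plus 8 textures.
n_features = 3 * 12 + 5 * 12 + (3 + 5)
print(n_features)  # 104
```

Computing the texture on eight annual means rather than on all 96 monthly layers keeps the added feature count at eight, as described above.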
2.5. Model Development
In this study we adopted a supervised machine learning approach, based on the algorithm Random Forest (RF). The following methodological steps were adopted in the model development.
2.5.1. Supervised Classification
Supervised classification was carried out in Jupyter IDE, using the scikit-learn module’s [
49] implementation of Random Forest [
50], one of the most widely used machine learning algorithms for land cover classification [
25]. The code is available in GitHub (see Data Availability Statement).
The dataset comprises 104 features (36 S1 layers, 60 S2 layers, and 8 texture layers) extracted for 25,398 locations from the satellite image stack exported from GEE. The dataset was divided into a train set (80%) and a test set (20%). Hyperparameter tuning was performed using a grid search with cross-validation on the train set. The parameters, their value ranges, and the selected values for each parameter are presented in
Table 2.
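The split and tuning procedure can be sketched with scikit-learn as follows. This is a minimal sketch rather than the study's code: the data are synthetic, the parameter grid is hypothetical (the actual ranges are those of Table 2), and the stratified split and the 3-fold cross-validation are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the labeled feature table.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=4, random_state=0
)

# 80% train, 20% test; stratification is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical grid; the real parameters and ranges are listed in Table 2.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}

# Grid search with cross-validation on the train set only.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# The best estimator is refit on the full train set by default.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the test set out of the grid search means that the reported accuracy is not influenced by the hyperparameter selection.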
A single classification model was developed to classify the EFMA area. Model performance and feature importance were evaluated before generating the final classification map.
2.5.2. Feature Importance Determination
Feature importance was assessed using two standard Random Forest metrics, Mean Decrease Impurity (MDI) and Mean Decrease Accuracy (MDA), which provide complementary perspectives on feature relevance [
51]. MDI measures each feature’s contribution to reducing impurities across all nodes in the decision trees, with the total importance normalized to sum to 1 across all features [
51,
52]. In contrast, MDA evaluates the impact of each feature on model accuracy by randomly permuting its values and measuring the resulting decrease in overall accuracy (OA) relative to the baseline model. Unlike MDI’s relative rankings, MDA provides absolute scores for each feature, ranging between 0 and 1 [
51]. Given that these two metrics assess feature importance differently and can yield varying results, both were applied to enable a comparative analysis. The feature_importances_ attribute and the permutation_importance function from scikit-learn were used for this purpose.
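The two metrics can be obtained from a fitted scikit-learn model as sketched below. The data are synthetic, and evaluating the permutation importance on the test set with 10 repeats is an assumption of this sketch, not a detail confirmed by the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: 10 features, of which 4 are informative.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# MDI: impurity-based importance, normalized to sum to 1 over all features.
mdi = model.feature_importances_

# MDA: mean drop in accuracy when one feature is randomly permuted.
mda = permutation_importance(
    model, X_test, y_test, scoring="accuracy", n_repeats=10, random_state=0
)
mda_mean = mda.importances_mean

# Rank features by each metric to compare the two perspectives.
print("MDI ranking:", mdi.argsort()[::-1][:4])
print("MDA ranking:", mda_mean.argsort()[::-1][:4])
```

Comparing the two rankings side by side is one simple way to see where the metrics agree and where they diverge.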
2.6. Validation and Comparison with COS 2018
Model performance was validated on the test set using overall accuracy (OA) and F1-score; these two metrics were also employed in the EU Crop Map 2018 [
8]. OA measures the proportion of correctly classified samples. F1-score is the harmonic mean of Precision and Recall (also termed User’s Accuracy (UA) and Producer’s Accuracy (PA), respectively), ranging from 0 to 1, where 1 represents the highest classification score. A confusion matrix was generated to identify commission and omission errors. The EU Crop Map 2018 reported an overall accuracy of 76.1% across all 19 land use classes. The “Woodland” class achieved a User’s Accuracy of 0.813, a Producer’s Accuracy of 0.969, and an F1-score of 0.884. These values were therefore adopted in the present study as reference thresholds for evaluating classification performance.
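The relation between the confusion matrix and the accuracy metrics can be made explicit with a small NumPy sketch. The three-class matrix below is hypothetical and serves only to illustrate the calculation; rows are taken as reference classes and columns as predicted classes.

```python
import numpy as np

def accuracy_metrics(cm):
    """Compute OA and per-class UA, PA and F1 from a confusion matrix.

    cm[i, j] is the number of samples of reference class i that were
    predicted as class j. Every class is assumed to have at least one
    reference sample and one prediction.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)
    oa = correct.sum() / cm.sum()
    pa = correct / cm.sum(axis=1)  # recall; 1 - PA is the omission error
    ua = correct / cm.sum(axis=0)  # precision; 1 - UA is the commission error
    f1 = 2 * ua * pa / (ua + pa)   # harmonic mean of UA and PA
    return oa, ua, pa, f1

# Hypothetical confusion matrix for three classes.
cm = np.array([
    [50, 5, 5],
    [10, 30, 0],
    [0, 0, 50],
])
oa, ua, pa, f1 = accuracy_metrics(cm)
print(round(oa, 3))    # 0.867
print(np.round(ua, 3))
print(np.round(pa, 3))
print(np.round(f1, 3))
```

In this example the second class has a PA of 0.75 and a UA of about 0.857, giving an F1-score of 0.8, below either of the other classes despite an OA of about 0.867.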
The classification map was compared with the 2018 Carta de Uso e Ocupação do Solo (COS 2018) [
34], a reference classification for mainland Portugal. However, COS 2018 follows specific cartographic criteria that were not applied here, including a minimum unit area of 1 ha, and therefore could not be used as reference data. Nevertheless, it was used to identify classification errors through spatial intersection of both maps. The accuracy metrics PA, UA, and F1-score were calculated using only areas equal to or above 1 ha to match COS 2018’s minimum mapping unit. The correspondence between the COS 2018 classes and this study’s land cover legend is presented in
Appendix B,
Table A1.
2.7. Effect of Sampling Size Reduction on the Accuracy
To test the effect of sample size on classification accuracy, we adopted an approach similar to Moraes et al. (2021) [
19], progressively reducing the train set and evaluating its impact on classification performance. The train set was systematically reduced to 75%, 50%, 40%, 30%, 20%, 10%, and 5% of its original size, training a new Random Forest model at each stage. By comparing the performance of these models on the same test set, we aimed to identify the extent to which the train set could be reduced without significantly compromising classification accuracy.
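The reduction experiment can be sketched as a simple loop over training fractions. The sketch below uses synthetic, imbalanced data and makes two simplifying assumptions that are not taken from the text: subsets are drawn by simple random sampling, and the hyperparameters are kept fixed across iterations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the labeled samples.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, n_classes=3,
    weights=[0.6, 0.3, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

fractions = [1.0, 0.75, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
rng = np.random.default_rng(0)
results = []
for frac in fractions:
    # Random subset of the train set (assumed sampling scheme).
    n = int(round(frac * len(X_train)))
    idx = rng.choice(len(X_train), size=n, replace=False)
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train[idx], y_train[idx])
    # Every model is evaluated on the same, untouched test set.
    pred = model.predict(X_test)
    results.append({
        "fraction": frac,
        "n_train": n,
        "OA": accuracy_score(y_test, pred),
        "F1": f1_score(y_test, pred, average=None, labels=[0, 1, 2],
                       zero_division=0),
    })

for r in results:
    print(r["fraction"], r["n_train"], round(r["OA"], 3), np.round(r["F1"], 2))
```

Reporting the per-class F1-scores next to OA makes it possible to see whether small classes degrade earlier than the dominant class as the train set shrinks.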
3. Results
3.1. Feature Importance
Feature relevance for the permanent crop classification model was assessed using the Mean Decrease Impurity (MDI) and Mean Decrease Accuracy (MDA). The dataset of S1 layers (monthly VV, VH, and VHVV) and S2 layers (monthly NDVI, NDBI, NDWI, NBR, and NDMIR), plus the standard deviation kernel of the annual mean of each product, totaled 104 features (layers). MDI and MDA metrics were used to identify which time periods of the year provided the most discriminative spectral information for land cover classification.
Figure 3 shows the monthly dispersion of features performance, for both MDI and MDA metrics. The detailed values are provided in
Appendix C,
Figure A1 and
Figure A2, respectively. The most important variables are NDBI and NDMIR, as shown in
Figure 3, for most of the year, except for March, July, and November. The results also indicate that kernel standard deviations measuring spatial heterogeneity can contribute significantly to classification performance.
3.2. Model Validation
Model performance of the Random Forest model on the test set was assessed using overall accuracy (OA) and F1-score (
Table 3). A confusion matrix (
Table 4) was also used to provide a detailed breakdown of classification errors.
At the LULC class level, forest (class 7) achieved the highest F1-score (0.96) and accounted for more than half of the test set (3037 samples). For classes other than “forest”, classification performance declined moderately, with a more pronounced decrease observed for other permanent crops and other occupations, which recorded F1-scores of 0.48 and 0.40, respectively. The vineyard class also exhibited lower-than-average performance (F1-score of 0.71), whereas the remaining permanent crop classes consistently achieved F1-scores between 0.78 and 0.90.
The confusion matrix (
Table 4) provides a detailed breakdown of classification errors, facilitating the identification of classes that were most difficult to classify. A clear example is the forest class, which, despite its high UA, PA, and F1-score, exhibits a pronounced attractive effect, drawing a considerable number of misclassified samples from other classes. This effect disproportionately affects classes with smaller sample sizes, such as vineyards, traditional olive groves, and other occupations, thereby reducing their F1-scores. However, due to its large sample representation, forest classification remains largely unaffected, even though it records the highest commission score among all classes.
As expected, two classes show low metrics: “other permanent crops” and “other occupations”. These are refuge classes that aggregate a mixture of land uses other than those identified individually in the adopted legend, and thus have high feature variability. However, vineyards also reveal a low commission score, showing the challenges in capturing the characteristics of this crop with satellite imagery. This may be due to the insufficient spatial resolution of the imagery, as well as the marked phenological changes that the crop shows during the calendar year.
3.3. Effect of Sampling Size in the Model’s Accuracy
A key objective of this study was to develop a practical and accessible methodology that minimizes the effort required for reference sample collection. A training dataset of 25,398 samples was compiled through significant manual effort to ensure sufficient reference data for achieving high model accuracy and reliability. This comprehensive training dataset enabled the development of a robust model that, once trained, can be applied to future Sentinel imagery processing without the need for repeated sample collection. To assess the minimum sample size required to maintain comparable accuracy, a sensitivity analysis was conducted by progressively reducing the training dataset to 75%, 50%, 40%, 30%, 20%, 10%, and 5% of its original size, while keeping the test set constant for consistency.
Figure 4 illustrates the impact of training set reductions on OA, the class-specific F1-scores, and the weighted average F1-score for permanent crops.
Appendix D provides details: the training set reduction proportions (
Table A2), the hyperparameters used in each model iteration (
Table A3) and the resulting changes in overall accuracy (OA), class-specific F1-scores (%), and the weighted average F1-score for permanent crops (
Table A4).
Figure 4 shows that OA remained remarkably stable, even when the train set was reduced to 5% (1270 samples), with the lowest recorded OA of 0.83, which is still an acceptable accuracy value by standard practice. Forest and HD olive grove maintained high F1-scores of 0.92 and 0.81, respectively, suggesting that large sample sizes of imbalanced sets contribute to classification robustness. These findings align with Moraes et al. [
19], where a 90% sample reduction (only 50 training units) preserved overall classification quality. However, OA is heavily influenced by the accuracy of the most frequent class, in this case Forest, which maintained high accuracy despite training set reductions. Therefore, examining each class individually is essential, rather than relying solely on the OA or F1-score.
Most of the permanent crop classes were less stable than the Forest and HD olive grove. The weighted average F1-score for these classes declined gradually with training set size reduction, reaching 0.66 at the smallest training set size. Almond groves and SHD olive groves showed more pronounced declines, both reaching a final F1-score of 0.65.
The steepest performance declines occurred in the vineyard, traditional olive grove, other permanent crops, and other occupations. Traditional olive groves dropped to a score of 0.32, with sharp decline beyond the 20% threshold. Vineyards experienced the steepest drop overall, reaching 0.18 at 5% training data. Other permanent crops declined markedly beyond the 40% threshold, ultimately reaching zero. The Other Occupations class followed a similar trend but, unlike Other permanent crops, stabilized at 0.10 rather than dropping to zero.
3.4. Mapping Permanent Crops
Figure 5 presents the permanent crops map for the irrigation area of the Alqueva, produced by the Random Forest classification model (10 m resolution raster layer available, see [
26]). The map expands the EU Crop Map within the EFMA, adding classification discrimination of permanent crops, in the areas designated as “Woodland and shrubland type of vegetation” to the original map.
Figure 5 shows that the largest contiguous areas of permanent crops align closely with the hydro-agricultural schemes (
Figure 1, AH in operation), as expected, given that most permanent crops require irrigation. These regions are predominantly composed of homogeneous plots, with olive groves being the most dominant. At this scale, super-high-density (SHD) olive, high-density (HD) olive, almond groves, and traditional olive groves can be distinguished.
Detailed inspection of the high-resolution raster output reveals that HD olive classification aligns closely with ground truth observations. However, commission errors frequently occur for the SHD olive class, as indicated by the confusion matrix (
Table 4) and illustrated in
Figure 5b. Like HD olives, SHD olive groves form large contiguous areas associated with extensive farms; however, several omission errors were identified where orchards overlap with access roads (
Figure 5b). The map successfully identified major almond groves, though smaller patches exhibited notably poor classification performance. Visual inspection also revealed discontinuities in the vineyard class (
Figure 5c), likely resulting from model omission errors for this class.
To identify potential discrepancies between the model result and the COS 2018 LULC classification,
Table 5 presents the spatial overlap between each land use class area in the final classification map and COS 2018. The table summarizes the results for the four most representative COS 2018 classes. For each class mapped in the present study, the distribution of COS 2018 classes within its boundaries was determined to assess the degree of correspondence between the two classifications. Complete results are presented in
Supplementary Materials (Table S1).
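The overlap calculation described above amounts to a cross-tabulation of two co-registered label layers. The sketch below is only an illustration under stated assumptions: both maps are assumed to be already resampled to the same pixel grid and flattened into equal-length lists, the function name `overlap_percentages` is hypothetical, and all pixel values are invented. Apart from “Holm oak SAF”, the reference class names are placeholders rather than actual COS 2018 legend entries.

```python
from collections import Counter, defaultdict


def overlap_percentages(mapped_labels, reference_labels):
    """For each mapped class, return the share (%) of its pixels
    that falls within each reference class."""
    if len(mapped_labels) != len(reference_labels):
        raise ValueError("Both layers must cover the same pixels")

    counts = defaultdict(Counter)
    for mapped, reference in zip(mapped_labels, reference_labels):
        counts[mapped][reference] += 1

    result = {}
    for mapped, reference_counts in counts.items():
        total = sum(reference_counts.values())
        # most_common() orders reference classes from largest to smallest share
        result[mapped] = {
            reference: 100.0 * n / total
            for reference, n in reference_counts.most_common()
        }
    return result


# Toy flattened rasters, invented for illustration.
model_map = ["Vineyard"] * 4 + ["Forest"] * 4
reference_map = [
    "Vineyards", "Vineyards", "Vineyards", "Olive groves",
    "Holm oak SAF", "Holm oak SAF", "Other forest", "Pastures",
]

overlap = overlap_percentages(model_map, reference_map)
for mapped_class, shares in overlap.items():
    print(mapped_class, shares)
```

Because every pixel represents the same area, pixel shares translate directly into area shares; at 10 m resolution each pixel covers 0.01 ha.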
Table 6 presents the PA, UA, and F1-score of these comparisons between the predicted occupation by the model and the COS 2018 class.
The dominant COS 2018 classes within each mapped category are consistent with the expected land uses, although correspondence varies across classes. For the “Forest” class, the most prevalent COS 2018 subclass, “Holm oak SAF”, accounts for only 23% of the classified area. However, when all COS 2018 forest subclasses are combined, correspondence rises to 72.96% (
Supplementary Materials, Table S1).
The three olive groves classes show strong agreement with COS 2018. However, COS 2018 does not differentiate between traditional, high-density (HD), and super-high-density (SHD) systems, grouping nearly all olive groves into a single category. Therefore, the observed correspondence confirms the crop type (olive) but provides no validation of the cultivation system classification.
The “Vineyard” class showed 72% correspondence with the equivalent COS 2018 class. The “Other Occupations” class showed lower correspondence (67.04%), meaning that 32.96% of its area (approximately 6787 ha) actually corresponds to land use classes relevant to this study, representing a notable discrepancy. Finally, the “Other Permanent Crops” class recorded only 48.78% correspondence, which is consistent with the poor performance of this class observed in earlier analyses.
F1-scores for the model’s classification relative to COS 2018 classes are around 0.60 for vineyards, olive groves, and forest, the classes with the largest representation. However, while vineyards and olive groves exhibit high PA and lower UA, forests show the inverse pattern, indicating that the model tends to overclassify samples in the forest class. Classes with lower representation show much lower F1-scores, not only due to reduced model precision, but also because complete alignment between this study’s classification and COS 2018 is not achievable.
4. Discussion
Timely monitoring of land use and land cover (LULC) changes is crucial for effective natural resource management, spatial planning, and conservation efforts in agricultural landscapes. This requires sufficient discrimination between annual and permanent crops in classification schemes. With freely accessible satellite imagery now available, this can be achieved if classification models are trained with adequate datasets representing regional crops and land cover diversity. Several studies have successfully integrated Sentinel-1 and Sentinel-2 data for crop classification [
53,
54], but these lack the discrimination of permanent crops, forests and shrublands.
4.1. Feature Importance Analysis
This study developed a classification model focused on permanent crops in a Mediterranean agricultural landscape. The dominant crops present in the study area are olive, almond, and vineyards. Among olive orchards, three cultivation intensities were distinguished: traditional, high-density, and super-high-density orchards. The model included 96 features representing monthly values for one full year, with feature importance assessed using MDI and MDA metrics.
Figure 3 suggests that no single month stands out significantly for either metric, as expected given that several Sentinel-2 normalized indices reflect vegetation growth activity throughout the year. For the MDI metric, August, September, and October were the only three consecutive months with mean values exceeding 0.01. This period, marking the summer to autumn transition, is typically characterized by prolonged drought in the study area. However, since most target crops are irrigated perennials, they retain vigorous spectral signatures that contrast sharply with the non-irrigated herbaceous vegetation. This contrast may enhance spectral separability in Sentinel-1 and Sentinel-2 imagery, potentially improving classification accuracy.
The two most relevant features for classification were the NDBI and NDMIR indices. Contrary to Veloso et al. and Meroni et al. [
55,
56], who emphasized the value of the Cross-Polarization Ratio (VHVV), our analysis found this to be the least relevant feature, displaying the lowest MDI and MDA values. The results also indicate that texture analysis contributed positively to model performance, as NDVI_stdDev and NDWI_stdDev ranked among the highest performing features in both the MDI and MDA analyses.
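For readers unfamiliar with the MDA metric, one common way to estimate it is by permutation: the values of one feature are shuffled across samples and the resulting drop in accuracy is recorded. The sketch below illustrates that idea only and is not the study's implementation. The names `mean_decrease_accuracy` and `toy_predict` are hypothetical, the data are invented, and the rule-based `toy_predict` stands in for a trained Random Forest.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(labels)


def mean_decrease_accuracy(predict, rows, labels, n_repeats=10, seed=0):
    """Permutation-based MDA: average accuracy drop after shuffling
    one feature column at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j permuted.
            permuted = [
                row[:j] + [value] + row[j + 1:]
                for row, value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    # Stand-in for a trained classifier: it only looks at feature 0.
    return "irrigated" if row[0] > 0.5 else "rainfed"


# Toy samples with two features each, invented for illustration.
rows = [
    [0.9, 0.1], [0.8, 0.7], [0.7, 0.3], [0.6, 0.9],
    [0.4, 0.2], [0.3, 0.8], [0.2, 0.5], [0.1, 0.6],
]
labels = [toy_predict(row) for row in rows]  # baseline accuracy is 1.0

mda = mean_decrease_accuracy(toy_predict, rows, labels)
print(mda)
```

In this toy setting, shuffling the feature that the classifier ignores leaves accuracy unchanged, so its MDA is zero, while shuffling the feature it relies on lowers accuracy. Random Forest implementations often compute the metric on out-of-bag samples instead of a separate evaluation set.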
4.2. Model Assessment and Interpretation
The classification model achieved an OA of 0.91 on the test set, aligning with previous remote sensing studies [
44,
45]. This performance exceeds the evaluation thresholds defined in this study—0.76 for overall accuracy and 0.89 for the “Woodland” class F1-score—based on the EU Crop Map [
8]. The model demonstrated high predictive performance for the permanent crops widely adopted within the Alqueva irrigation system, namely HD and SHD olive groves and nut crops, supporting the need for continued monitoring of agricultural landscape changes.
A notable pattern observed in the accuracy metrics was the systematically lower PA (recall) compared to UA (precision), indicating that omission errors prevail. The exceptions were forest and HD olive grove, which exhibited a higher PA than UA, indicating prevalent commission errors. The lower-PA pattern is particularly evident in other permanent crops and other occupations, contributing to their lower F1-scores. Similarly, while vineyards, traditional olive groves and SHD olive groves achieved relatively high F1-scores, they also exhibited a substantial UA-PA difference. Misclassification between SHD olive and HD olive groves resulted in a lower PA for SHD olives (0.72), affecting the final classification map. Additionally, the other permanent crops class showed no discernible misclassification patterns in the test set plots, reinforcing the model’s difficulty in identifying these crops. Notably, this class exhibited high omission errors, but produced no commission errors.
The classification map shows that HD and SHD olive grove areas align closely with irrigation infrastructure. While this is expected given that these intensive cultivation systems depend on water availability, it also demonstrates the strong influence of irrigation systems on land use change. Globally, studies have shown increased cropping frequency and intensification in areas affected by irrigation dams compared to rainfed control areas [
57].
4.3. Comparison with Other LULC Maps
The implemented methodology enables rapid updates to LULC mapping for agricultural landscapes across the country. It is valuable to compare this approach with COS 2018 [
34], the existing 2018 classification for the same region based on orthophoto interpretation. However, this comparison cannot serve as formal model validation because several factors, such as the minimum mapping unit defined in COS, were not controlled. Nevertheless, it provides a valuable means of evaluating cartographic quality, particularly since COS 2018 is a national reference for land use and land cover in mainland Portugal. Overall, most mapped categories show general consistency with COS 2018 classes, though correspondence varies: forests and vineyards exhibit relatively high alignment (around 73%), olive groves match at crop type level only, while “Other Occupations” and “Other Permanent Crops” show lower correspondence (67% and 49%, respectively). Accuracy metrics from the map comparison reveal that the best-performing classes (vineyards, olive groves, and forests) achieved F1-scores around 0.60. However, omission and commission error patterns differed, with vineyards and olive groves showing low omission but high commission errors, while forests exhibited the inverse pattern. Overall, the relatively low F1-scores likely result from differences in spatial granularity between the two classification methods. For the remaining classes (almond groves, other permanent crops, and other occupations), this comparison is further complicated by the difficulty of establishing consistent mappings between the two classification systems.
4.4. Assessing the Effect of Sampling Reduction
An important consideration in machine learning applications is determining an appropriate training set size. This analysis was enabled by a comprehensive reference dataset compiled for the EFMA and shared as open data [
26]. Our findings indicate that while some classes remain stable with as little as 5% of the original training data, others (particularly vineyards, traditional olive groves, other permanent crops, and other occupations) show significant accuracy loss and require larger training datasets. Overall accuracy remained high (≥0.83) even with only 5% of the training data, but class-specific performance varied substantially. Forests and HD olive groves retained strong performance, whereas vineyards, traditional olive groves, other permanent crops, and other occupations suffered severe F1-score declines (0.18, 0.32, 0, and 0.10, respectively), indicating greater sensitivity to reduced training set size. These declines resulted primarily from a drastic drop in PA, while UA remained consistently high, except for other permanent crops. This effect is particularly evident in the other occupations class, where UA exceeded 0.47 while PA never exceeded 0.26, illustrating severe classification imbalances.
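To make the training set reduction concrete, the sketch below draws progressively smaller subsets while keeping class proportions. It is a minimal illustration under assumptions, not the study's procedure: the text here does not state how the reduced subsets were drawn, so the per-class (stratified) sampling, the function name `stratified_subsample`, and the class sizes are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_subsample(samples, fraction, seed=0):
    """Keep `fraction` of the samples of every class (at least one each).

    samples is a list of (sample_id, class_label) tuples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample_id, label in samples:
        by_class[label].append(sample_id)

    subset = []
    for label, ids in by_class.items():
        keep = max(1, round(len(ids) * fraction))
        subset.extend((sample_id, label) for sample_id in rng.sample(ids, keep))
    return subset


# Toy reference set, invented for illustration: one large and one small class.
samples = [(i, "Forest") for i in range(200)]
samples += [(200 + i, "Vineyard") for i in range(20)]

for fraction in (1.0, 0.4, 0.2, 0.05):
    subset = stratified_subsample(samples, fraction)
    counts = Counter(label for _, label in subset)
    print(f"{fraction:.0%}: {dict(counts)}")
```

In this toy setting, the small class is left with a single training unit at 5%, which gives an intuition for why poorly represented classes are the first to lose accuracy as the training set shrinks.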
The uneven rate of accuracy reduction across classes is expected. Classes experiencing the greatest decline correspond to those least represented in the dataset, which results in poor decision boundaries in Random Forest algorithms. As model fitting prioritizes overall accuracy, underrepresented classes are negatively impacted. Furthermore, these classes correspond to land uses with higher textural and spectral variability. Vineyards present discontinuous vegetative surfaces that mix vegetation and background, requiring processing methods not employed in this study [
58]. Traditional olive groves share this characteristic and can additionally be confused with forests, which may absorb pixels from this class, as observed. Other permanent crops are inherently a heterogeneous class combining different crop types. Finally, for smaller classes, even a small number of misclassifications has a disproportionate impact on performance metrics. These results underscore the importance of detailed per-class accuracy assessment in multiclass classifications, particularly when sample sizes are imbalanced [
59].
4.5. Study Limitations and Potential Use
The methodology developed in this study enables discrimination not only between permanent crops but also between cultivation systems and intensification levels within the same crop. The machine learning model was trained on major Mediterranean permanent crops, including olive groves, almond groves, and vineyards, using a large reference dataset compiled for southern Portugal. The approach covered a full calendar year with monthly composites to capture complete phenological cycles and integrated textural data from radar (Sentinel-1) with spectral data from optical sensors (Sentinel-2). By utilizing free and open satellite data, GEE, and Python scripts, the methodology can be applied to permanent crop mapping in other Mediterranean regions. The study also demonstrated sample size optimization, reducing the effort required for future applications while maintaining acceptable accuracy.
However, several limitations must be considered when implementing this methodology in other regions. Class imbalance significantly affects underrepresented classes, resulting in lower performance and higher sensitivity to training set reduction. This underscores the importance of assessing per-class performance rather than relying solely on overall accuracy. Confusion between spectrally similar classes was observed, as in the case of traditional olive groves and forests, indicating limitations in the textural and spectral resolution provided by Sentinel missions. This may also relate to different establishment stages of permanent crops, which create patchiness and spatial variability, particularly in rapidly changing agricultural landscapes driven by new irrigation infrastructure. Another limitation of the study is the single-year (2018) temporal scope, which may not adequately represent inter-annual variability in crop phenology and management practices.
For successful transferability to other regions, the following recommendations are provided: (1) compile region-specific reference datasets representing local crop varieties and management practices; (2) perform sensitivity analysis of feature importance to identify critical phenological periods for the target region; (3) assess and address potential class imbalance bias in the training data; and (4) validate model performance across multiple years to account for inter-annual variability.
5. Conclusions
Water availability from dam construction can be an important driver of agricultural landscape change, as observed in the Alqueva region, home to one of Europe’s largest artificial lakes. This has promoted a shift from annual to permanent crops in the Mediterranean agricultural landscape, with environmental, economic, and social implications. The first continent-wide crop mapping data product for Europe based on remote sensing, the EU Crop Map 2018, did not distinguish permanent crops from woodlands. Extending its classification to include permanent crops is therefore of particular interest in irrigation systems such as Alqueva. Permanent crops dominate the irrigated areas within the Alqueva irrigation system, and their distinct characteristics compared to forests and annual crops necessitate accurate differentiation. This study developed a methodology to discriminate between forest and permanent crops from Sentinel imagery, and to distinguish among individual permanent crop types by crop species and, for olive groves, by cultivation method.
The classification model achieved 91% overall accuracy, demonstrating strong performance in distinguishing permanent crops, forests, and other occupations. F1-score analysis indicates that the model effectively identified almond and olive groves, and differentiated olive grove cultivation methods (F1-score ≥ 0.78). Performance was slightly lower for vineyards (0.71) and significantly weaker for other permanent crops (0.48). When compared with COS 2018, the classification shows strong overall alignment, though several inconsistencies and classification mismatches were identified.
A key challenge in machine learning applications is determining an adequate training set size. Our results suggest that while some classes maintain stability with only 5% of the original training set, others (particularly vineyards, traditional olive groves, other permanent crops and other occupations) experienced severe accuracy degradation and require larger training datasets. To support future research, the comprehensive reference dataset compiled for the EFMA is publicly available [
26].
As the EU Crop Map continues to evolve, and given the importance of permanent crops in this region, this study provides a foundation for complementing its information. While results indicate room for improvement, future advances, such as feature selection optimization and alternative training data collection methods, could enhance the approach for integration with upcoming EU Crop Map versions.