1. Introduction
Accurate monitoring of Land Use and Land Cover (LULC) dynamics underpins a broad spectrum of scientific and applied disciplines. Changes in LULC alter fundamental ecosystem processes, including hydrological regulation, carbon sequestration, soil erosion dynamics, and biodiversity habitat structure, and directly affect the provision of ecosystem services such as water purification, flood attenuation, and landscape connectivity [1,2,3]. These biophysical consequences make LULC mapping an essential input for territorial planning, natural hazard assessment, environmental vulnerability analysis, and sustainable resource management at local and regional scales [2]. At the international level, LULC information also supports progress toward the Sustainable Development Goals (SDGs) and informs climate mitigation strategies advanced by the Intergovernmental Panel on Climate Change (IPCC) [2]. Over the past decade, the convergence of open-data policies and cloud-based processing has fundamentally transformed Earth observation capabilities in four specific dimensions: (i) free and open access to decades of archival satellite imagery (Landsat from 1972, Copernicus Sentinel from 2014–2015); (ii) increased spatial resolution, with Sentinel-2 providing 10 m multispectral data and Sentinel-1 providing 10 m SAR data; (iii) higher revisit frequency, with the dual-satellite constellations Sentinel-1A/1B and Sentinel-2A/2B achieving 5–6-day repeat cycles; and (iv) scalable cloud-based geospatial processing through platforms such as Google Earth Engine (GEE), which enables petabyte-scale multi-temporal analyses without the computational and storage barriers of local data processing [3].
Despite these advances, the production of systematic LULC maps in mountainous ecosystems remains severely constrained, even when machine-learning classifiers are employed [4,5]. Persistent cloud cover creates substantial data gaps in optical time series, while complex topography introduces radiometric distortions that degrade model performance [6]. Such distortions arise because terrain shadows reduce spectral signal quality on slopes shielded from direct solar illumination, while differential illumination caused by slope aspect produces divergent spectral responses for identical land-cover classes. Furthermore, sensor viewing geometry over abrupt relief generates geometric displacements that compromise spatial correspondence between image pixels and surface features.
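The aspect-dependent illumination effect described above is commonly expressed through the cosine of the local solar incidence angle. The following minimal sketch is illustrative only and is not part of the study's workflow; the function name and the sun and slope angles are assumptions chosen to make the contrast visible.

```python
import math

def illumination(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the local solar incidence angle on a tilted surface."""
    s = math.radians(slope_deg)
    z = math.radians(sun_zenith_deg)
    d = math.radians(sun_azimuth_deg - aspect_deg)
    return math.cos(z) * math.cos(s) + math.sin(z) * math.sin(s) * math.cos(d)

# Hypothetical scene: sun at 60 degrees zenith, due north (azimuth 0), 30 degree slopes.
sun_facing = illumination(30, 0, 60, 0)    # slope faces the sun
shaded = illumination(30, 180, 60, 0)      # slope faces away from the sun
flat = illumination(0, 0, 60, 0)           # horizontal reference surface
```

Under these assumed angles, the same land cover receives very different direct irradiance on the two slopes, which is the mechanism behind the divergent spectral responses mentioned in the text.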
This observational deficit has produced a critical geographic and thematic bias within the national literature. A recent systematic review [7] shows that LULC research in Chile is disproportionately concentrated in the central region, whereas areas such as Aysén and Magallanes account for less than 2% of published studies. This gap is particularly concerning given that Chilean Patagonia represents a globally significant natural laboratory, characterized by a pronounced west–east ecological gradient encompassing temperate forests, shrublands, steppes, and wetlands [8,9]. Moreover, the limited monitoring efforts conducted to date have primarily focused on forest dynamics, leaving substantial gaps in the characterization of transitional land-cover classes. These classes are frequently misclassified in official static inventories and global products [10,11].
This scarcity of studies is not compensated for by the available global or national cartographic products. While initiatives such as ESA WorldCover and the CONAF land registry provide a first-rate thematic reference, their operational utility in the Aysén Basin is limited for specific technical reasons. The CONAF land registry, while constituting the most detailed national reference, was not conceived as a systematic monitoring tool: its updates lack standardized periodicity at the national level, and its methodology has changed over time in ways that undermine comparability across editions. For their part, global products such as ESA WorldCover have documented thematic limitations in characterizing transitional classes in ecosystems of high structural complexity [11,12]. Furthermore, none of these products was designed to provide reproducible operational guidelines on which combination of data sources is needed to classify land cover in mountain environments with persistent cloud cover, or on how much each source contributes individually. This absence of an evidence-based methodological framework for sensor selection in cloud-prone mountain environments represents a critical operational gap that motivates the present study.
To overcome these limitations, it is necessary to move beyond traditional optical-only approaches. The combined use of robust statistics derived from the temporal distribution (e.g., medians and percentiles) has become an effective strategy for mitigating atmospheric noise and capturing phenological variability in optical time series, even under limited observation availability [13]. Nevertheless, exclusive reliance on optical data remains a vulnerability at austral latitudes [14]. Consequently, the integration of Synthetic Aperture Radar (SAR), particularly Sentinel-1, emerges as a critical complementary solution by providing information on physical structure and surface roughness that is independent of atmospheric conditions [15]. In environments characterized by abrupt relief, however, effective SAR use requires the explicit incorporation of topographic variables, not only to contextualize the radar signal as a function of terrain geometry but also to represent ecologically relevant altitudinal gradients [16].
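To make the idea of distribution-based temporal statistics concrete, the sketch below computes a median and two percentiles for a single pixel's time series while skipping cloud-masked dates. It is a plain-Python illustration rather than the study's Google Earth Engine implementation; the function names, the NDVI values, and the 25th/75th percentile levels are assumptions.

```python
from statistics import median

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def temporal_stats(series, qs=(25, 75)):
    """Median and percentiles of one pixel's series, skipping masked (None) dates."""
    valid = [v for v in series if v is not None]
    if not valid:
        return None
    stats = {"median": median(valid)}
    for q in qs:
        stats[f"p{q}"] = percentile(valid, q)
    return stats

# Hypothetical NDVI series for one pixel; None marks cloud-masked dates.
ndvi = [0.2, None, 0.3, 0.8, None, 0.7, 0.4]
stats = temporal_stats(ndvi)
```

Because the statistics are computed only from valid observations, the descriptors remain defined even when several dates are lost to cloud, which is the property the text relies on.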
The present study makes three specific methodological contributions that distinguish it from the standard Sentinel-1/Sentinel-2 fusion workflows prevalent in the LULC literature [5,17,18]. The dominant paradigm in multi-sensor LULC mapping concatenates optical bands, SAR backscatter channels, and DEM-derived variables into a single feature stack and trains a classifier on the combined system. This approach improves overall accuracy relative to single-sensor baselines but cannot quantify how much each data source contributes to the observed gain. While ablation-style sensor comparisons have been applied in isolated contexts, such as tropical monsoon environments [19] and selectively logged tropical forests [16], no prior study has structured a full progressive modular ablation framework in a cloud-prone subpolar Andean basin.
This study departs from the stacking paradigm in three ways. First, the systematic progressive ablation design (A → A + P → A + T → A + R → A + P + T + R) isolates the marginal and synergistic discriminatory capacity of each thematic block under fixed experimental conditions, enabling source-attributable performance reasoning that is directly informative for data acquisition decisions. Second, by explicitly comparing annual versus seasonal SAR temporal aggregation strategies within this framework, the study documents and physically explains a counterintuitive divergence between point-based statistical accuracy and cartographic spatial coherence, a finding not previously reported for multi-sensor LULC mapping in cloud-prone Andean mountain environments. Third, the study derives an explicit operational recommendation: the use of annual SAR compositing as a temporal low-pass filter that suppresses dielectric transients while preserving structural land-cover signatures. This recommendation is absent from existing Sentinel-1/2 fusion guidelines for high-latitude, data-scarce environments and is directly implementable in Google Earth Engine using freely available Copernicus data.
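The ablation design can be thought of as a fixed set of feature blocks combined into five configurations. The sketch below shows one way to enumerate them; the block labels follow the text (A = seasonal optical, P = percentiles, T = topography, R = radar), while the short feature lists and the name `NDSI_son` are hypothetical placeholders, not the study's actual predictor set.

```python
# Feature names are illustrative; each block would hold many more predictors.
BLOCKS = {
    "A": ["B11_son", "B12_jja", "NDSI_son"],
    "P": ["NDSI_p75", "NBR_p75"],
    "T": ["elevation", "slope"],
    "R": ["VV", "VH", "angle"],
}

CONFIGS = ["A", "A+P", "A+T", "A+R", "A+P+T+R"]

def features_for(config):
    """Concatenate the feature names of every block named in a configuration."""
    return [name for block in config.split("+") for name in BLOCKS[block]]

# Every configuration shares block A, so differences in accuracy can be
# attributed to the added block(s) when all other settings are held fixed.
design = {config: features_for(config) for config in CONFIGS}
```

Keeping the classifier, samples, and validation split identical across `design` entries is what would make the comparison source-attributable.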
Within this context, the present study implements a systematic ablation experimental design in the Aysén River Basin to quantify the relative and complementary contributions of the optical, topographic, and radar domains to LULC classification. Three working hypotheses are formulated, linking sensor physical properties to landscape structure:
Phenological Hypothesis (H1): Distribution-based percentile metrics are expected to capture the amplitude of phenological signals more effectively than measures of central tendency, enabling the discrimination of spectrally similar land covers with contrasting intra-annual dynamics (e.g., deciduous versus evergreen vegetation). This hypothesis is considered supported if the A + P configuration shows an improvement in Macro-F1 relative to the optical baseline model (A), together with consistent class-level gains in spectrally dynamic vegetation classes.
Structural Hypothesis (H2): The inclusion of SAR backscatter from Sentinel-1 is hypothesized to provide information orthogonal to optical reflectance, facilitating land-cover discrimination based on surface roughness and volumetric structure, particularly for classes with contrasting architectures (e.g., urban areas versus bare soil). This hypothesis is considered supported if the A + R configuration shows an improvement in Macro-F1 relative to the optical baseline model (A), particularly for structurally complex classes, as reflected in class-level performance gains.
Topo-Ecological Hypothesis (H3): Topographic variables are anticipated to act as environmental proxies of the altitudinal gradient, constraining the spatial probability of class occurrence and reducing thematic confusion in Andean transition zones (e.g., vegetation, snow, and wetland distributions). This hypothesis is considered supported if the A + T configuration shows an improvement in Macro-F1 relative to the optical baseline model (A), particularly for classes constrained by topographic gradients, as reflected in class-level performance gains.
Against this background, the present study pursues three specific objectives: (1) to quantify the marginal and synergistic contributions of optical, SAR, topographic, and phenological data to LULC classification accuracy in a cloud-prone Andean mountain basin through a controlled ablation design; (2) to evaluate the trade-off between statistical accuracy metrics and cartographic spatial coherence under different SAR temporal aggregation strategies; and (3) to derive explicit, reproducible operational guidelines for multi-sensor data integration in data-scarce, high-cloud environments.
By addressing the absence of source-attributable, evidence-based guidelines for sensor selection in cloud-prone mountain environments, a gap not previously filled by existing Sentinel-1/2 fusion studies in subpolar Andean basins, this study provides land management agencies and geospatial practitioners with a reproducible, open-source framework directly applicable to analogous environments in the Southern Hemisphere. The following sections describe the study area, data, and experimental design in detail.
3. Results
3.1. Global Performance of the Ablation Models
Overall LULC classification performance improved progressively as additional variable blocks were incorporated into the seasonal optical baseline model (A). Table 4 summarizes the global accuracy metrics obtained for each experimental configuration.
The optical baseline model (A), based exclusively on seasonal spectral information, achieved an Overall Accuracy (OA) of 89.2%, with a κ coefficient of 0.871, a Balanced Accuracy (BA) of 86.1%, and a Macro-F1 of 80.5%. Although these values indicate strong overall performance, the 8.7 pp gap between OA (89.2%) and Macro-F1 (80.5%) in the baseline model is a direct consequence of class-area imbalance: dominant classes such as Water, Snow, and Native Forest are well-separated optically, inflating OA while masking the substantially lower performance of minority classes such as Natural Grasslands/Shrublands (F1 = 48.4%) and Bare Soil/Alluvial Beaches (F1 = 53.4%). This structural bias makes Macro-F1 the more informative primary metric for evaluating the ablation configurations, as it captures gains in the minority classes where each additional data block provides its most significant discriminatory benefit, gains that OA would systematically underreport.
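The gap between OA and Macro-F1 under class imbalance can be reproduced with a toy confusion matrix. The counts below are invented for illustration and do not come from the study; the functions are a minimal sketch of the standard definitions.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples (diagonal over total)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def f1_per_class(cm):
    """Per-class F1 for a matrix with reference rows and predicted columns."""
    scores = []
    for k in range(len(cm)):
        tp = cm[k][k]
        ref_total = sum(cm[k])                   # reference samples of class k
        pred_total = sum(row[k] for row in cm)   # samples predicted as class k
        denom = ref_total + pred_total
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(cm):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = f1_per_class(cm)
    return sum(scores) / len(scores)

# Invented counts: a dominant, well-separated class and a small class
# of which half the samples are missed.
cm = [
    [180, 0],
    [10, 10],
]
oa = overall_accuracy(cm)
mf1 = macro_f1(cm)
```

In this toy case OA is dominated by the large class, while Macro-F1 is pulled down by the minority class, mirroring the pattern described for the baseline model.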
The subsequent inclusion of multi-temporal percentiles (A + P) yielded modest but consistent improvements relative to the baseline, with gains of +0.4 percentage points in OA and +1.2 percentage points in Macro-F1. This result indicates that intra-annual information contributes to stabilizing spectral responses across land covers, although its isolated impact on overall performance remains limited. In contrast, adding topographic variables (A + T) produced more pronounced improvements, particularly in metrics sensitive to class-level performance, such as BA (+1.3 percentage points) and Macro-F1 (+3.8 percentage points). This pattern suggests enhanced discrimination of land covers conditioned by topographic gradients.
The largest single contribution, however, was observed with the incorporation of radar information (A + R). Relative to the baseline model, A + R improved OA by 2.5 percentage points, κ by 0.028, and Macro-F1 by 3.8 percentage points, highlighting the substantial role of SAR data in separating spectrally similar and structurally complex land covers.
Finally, the Full model (A + P + T + R) integrated all data sources and achieved the best overall performance across all evaluated metrics, with an OA of 92.5%, a BA of 89.0%, and a Macro-F1 of 86.0%. The cumulative gain of +5.5 percentage points in Macro-F1 relative to the optical baseline demonstrates that multisensor integration is both highly effective and synergistic. The contributions of topography and radar are not redundant but complementary, correcting misclassification in different underrepresented classes.
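The cumulative gains can be checked directly from the metrics stated in the text for the baseline and Full models. The helper name below is an assumption; the differences are absolute, in percentage points, not relative percentages.

```python
# Metrics as stated in the text, in percent.
baseline = {"OA": 89.2, "BA": 86.1, "Macro-F1": 80.5}   # model A
full = {"OA": 92.5, "BA": 89.0, "Macro-F1": 86.0}       # model A + P + T + R

def gains_pp(reference, candidate):
    """Absolute differences in percentage points, rounded to one decimal."""
    return {m: round(candidate[m] - reference[m], 1) for m in reference}

gains = gains_pp(baseline, full)
```

The Macro-F1 entry of `gains` corresponds to the +5.5 percentage points cited above; the OA and BA differences follow from the same subtraction.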
To visualize the marginal impact of each variable domain, Figure 3 presents the relative gains with respect to the baseline model.
Taken together, Figure 3 highlights the progressive and incremental nature of the observed performance gains, underscoring the dominant contribution of topographic information and, in particular, radar data, as well as the cumulative effect achieved by the Full model. These results establish the quantitative basis for the detailed class-wise analysis presented in the following subsection.
3.2. Class-Wise Metrics and Confusion Patterns
The class-wise analysis reveals that the impact of the different variable blocks varies markedly across land-cover categories, as summarized in Figure 4.
The persistent classification challenges observed in several land-cover classes have distinct physical explanations rooted in spectral mixing, structural similarity, and phenological overlap. Natural Grasslands/Shrublands, the most challenging class across all configurations (F1 = 48.4% in model A; F1 = 70.4% in the Full model), is subject to three simultaneous sources of confusion: (i) spectral overlap with Forage Grassland in the NIR–Red reflectance space, as both classes exhibit similar canopy greenness signals during the growing season; (ii) structural similarity with sparse Steppe vegetation along the west–east aridity gradient; and (iii) high intra-class heterogeneity. This transitional class encompasses a continuum from pioneer shrub patches to dense Nothofagus antarctica thickets, producing a wide and internally overlapping spectral distribution.
Bare Soil/Alluvial Beaches (F1 = 53.4% in model A; 76.0% in the Full model) is confused primarily with Rocky Terrain, due to shared high reflectance and minimal vegetation cover, and secondarily with Urban surfaces, whose mineral substrates produce similar high-reflectance signatures in the SWIR bands. Forage Grassland (F1 ≈ 81.0% in the Full model) experiences phenological confusion with Natural Grasslands during the austral winter (JJA), when managed pastures enter dormancy and become spectrally indistinguishable from surrounding natural herbaceous vegetation. Finally, Wetlands (F1 ≈ 83.0% in the Full model) are confused with Water bodies in permanently inundated sectors, and with Natural Grasslands in seasonally saturated mallines (wet meadows and Sphagnum bogs), where the ecological transition is gradual rather than spatially discrete, generating mixed pixels at the spatial resolutions of the sensors used.
Classes with well-defined spectral signatures, such as Water, Snow/Ice, and forest covers, exhibit high F1 values (≥90%). In contrast, classes characterized by greater spectral and structural heterogeneity show substantially lower performance, most notably Natural Grasslands/Shrublands (48.4% F1) and Bare Soil/Alluvial Beaches (53.4% F1), followed by Forage Grassland (72.9% F1) and Wetlands (75.2% F1). For Natural Grasslands/Shrublands, the combination of high PA and very low UA indicates a predominance of commission errors, consistent with over-assignment of this class.
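The link between high PA, low UA, and commission error can be illustrated with a small confusion matrix. The counts are invented and the function name is an assumption; rows are taken as reference classes and columns as predicted classes.

```python
def users_producers_accuracy(cm, k):
    """UA and PA of class k for a matrix with reference rows and predicted columns."""
    tp = cm[k][k]
    predicted = sum(row[k] for row in cm)   # everything labelled as class k
    reference = sum(cm[k])                  # everything that truly is class k
    ua = tp / predicted if predicted else 0.0
    pa = tp / reference if reference else 0.0
    return ua, pa

# Invented counts: class 1 is over-assigned, absorbing 30 samples of class 0.
cm = [
    [70, 30],
    [2, 18],
]
ua, pa = users_producers_accuracy(cm, 1)
commission_dominated = pa > ua
```

Here most true samples of class 1 are found (high PA), but many pixels labelled as class 1 belong elsewhere (low UA), which is the signature of over-assignment noted for Natural Grasslands/Shrublands.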
The inclusion of multi-temporal percentiles (A + P) results in limited and class-specific improvements, with a clear increase for Forage Grassland (+6.6 percentage points in F1) and a moderate improvement for the Urban class (+2.9 percentage points in F1). For the remaining categories, changes are minor and do not substantially alter the confusion patterns observed in the optical baseline model.
In contrast, the topographic block (A + T) produces more consistent gains for relief-conditioned classes, particularly Bare Soil/Alluvial Beaches (+15.6 percentage points in F1), Natural Grasslands/Shrublands (+13.7), and Rocky Terrain (+7.4), as well as moderate improvements for Wetlands (+6.3). Classes that were already well classified remain relatively stable.
Similarly, the addition of radar information (A + R) contributes to reducing persistent confusion among structurally complex classes. Under this configuration, Natural Grasslands/Shrublands show the largest relative gain in F1 (≈+16 percentage points), while the Urban class (≈+6 percentage points) and Forage Grassland (+4 percentage points) exhibit moderate improvements. Bare Soil/Alluvial Beaches show a smaller gain (≈+3 percentage points), whereas Wetlands display an intermediate increase (≈+4 percentage points). In contrast, Water and Snow/Ice remain virtually unchanged, consistent with their high separability already achieved using optical information.
Finally, the Full model (A + P + T + R) consolidates the observed improvements, achieving high F1 values for most classes and substantially reducing the performance gaps of the baseline model. Nevertheless, Natural Grasslands/Shrublands remains the lowest-performing class (F1 ≈ 70%), followed by Bare Soil/Alluvial Beaches (F1 ≈ 76%) and Forage Grassland (F1 ≈ 81%).
3.3. Variable Importance
Variable importance analysis using the Random Forest classifier allowed for the identification of the most influential predictors in each model configuration and an evaluation of how their hierarchy shifts across the incremental scheme. To facilitate comparison between configurations, Figure 5 presents the 15 variables with the highest relative importance for each case, estimated using the Mean Decrease in Gini Index.
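Gini-based importance accumulates the impurity reduction achieved each time a variable is used to split a node, averaged over the trees of the forest. The sketch below computes that reduction for a single split on hypothetical samples; it illustrates the quantity being aggregated and is not the study's Random Forest implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Impurity reduction obtained by splitting samples at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical samples: elevation (m) separates the classes, the band does not.
labels = ["wetland", "wetland", "snow", "snow"]
elevation = [200, 350, 1400, 1600]
band = [0.30, 0.10, 0.20, 0.40]

elevation_gain = gini_decrease(elevation, labels, 800)
band_gain = gini_decrease(band, labels, 0.25)
```

A variable that repeatedly yields large reductions, like `elevation` in this toy case, ends up high in the importance ranking.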
In the seasonal optical baseline model (A), the importance hierarchy is dominated by short-wave infrared (SWIR) bands, specifically B11_son and B12_jja. These are accompanied by spectral indices associated with snow, moisture, and the contrast between vegetated and non-vegetated surfaces (NDSI, NBR, NDWI, NDBI and BSI). This pattern indicates that the classification relies primarily on spectral contrasts related to surface moisture conditions, seasonal snow presence, and the differentiation of land cover states.
The incorporation of multi-temporal percentiles (A + P) maintains the SWIR bands as pillars but introduces statistical metrics of intra-annual extremes. In particular, the high percentiles of indices related to snow and surface/vegetation cover state (such as NDSI_p75 and NBR_p75) gain greater relevance. This suggests that the persistence or maximum intensity of certain events throughout the year provides critical information that complements the data contained in seasonal medians.
In the model with topography (A + T), a marked shift in the importance hierarchy is observed, with elevation and slope occupying the top-ranking positions and significantly outperforming individual spectral predictors. Although seasonal optical variables remain in the list, their relative weight decreases. This pattern confirms the foundational role of relief and the altitudinal gradient in the spatial distribution of land covers within the study area.
Analysis of the radar-integrated model (A + R) shows that the importance hierarchy is headed by the acquisition geometry variable (angle), followed by VV and VH backscatter intensities, and derived SAR indices sensitive to structure and scattering, such as PRVI and NPRVI. These variables exceed the importance of the optical bands, which remain in intermediate ranking positions. This pattern suggests that, in this configuration, the primary contribution of SAR is linked to capturing the geometric and structural information of the terrain.
Finally, in the Full model (A + P + T + R), the set of most influential variables is dominated by physical landscape descriptors, with elevation and slope among the highest-weighted predictors, alongside the acquisition geometry variable (angle). In this context, there is a prominent presence of SAR variables within the top 10, including PRVI, NPRVI, VV, VH, CR, and RVI. Optical variables appear starting from the seventh position in the ranking (e.g., B12_djf and B11_son). Taken together, this pattern highlights the integrated contribution of topographic gradients, observation geometry, and SAR descriptors in achieving the highest global performance. For further details, the complete list of the top 20 variables for each model configuration is provided in Appendix D.
3.4. Spatial Consistency of the Mapping
The spatial distribution of land use and land cover produced by the best-performing configuration (Full model, A + P + T + R) for the entire study area is shown in Figure 6. At the basin scale, the map adequately reproduces the main ecological and land-use gradients characteristic of the region, capturing the transition from steppe and shrubland in the eastern sector, through an intermediate agricultural–forestry mosaic, to evergreen forests in the western sector, as well as the location of the main urban centers.
(a) Case 1: Lacustrine–steppe transition in the Lago Misterioso sector (Figure 7).
In the reference image (Figure 7), the interface between native forest and forest plantations is clearly distinguishable, associated with the reddish tones of native forest during its autumn phenological phase and the more homogeneous texture and regular geometry of evergreen plantations. This pattern is consistently reproduced in the classification maps.
In the optical baseline model (A) and the A + P configuration, persistent commission errors are observed for the Urban class, reflected in low User’s Accuracy values (UA = 72.0% and 75.4%, respectively), together with an over-assignment of the Wetlands class at higher elevations, consistent with UA = 73.4% in both configurations. These confusions are visually expressed as spurious Urban and Wetlands assignments in areas where such covers are not expected according to ground reference information.
The incorporation of topographic variables (A + T) leads to a clear reduction in wetlands at higher elevations, with an increase in UA to 81.6% and in F1 to 81.5%, consistent with the visual removal of spurious wetlands on slopes and steep terrain. In this configuration, greater spatial stability is also observed for classes such as Steppe and Bare Soil; however, this improvement does not translate into a reduction in commission errors for the Urban class, whose UA decreases to 68.8%.
The inclusion of radar information (A + R) produces a clear reduction in Urban commission errors, with UA increasing to 77.6% and F1 to 86.8%, consistent with a more accurate delineation of lacustrine–terrestrial transitions. For Wetlands, improvements are more moderate (UA = 80.4%; F1 = 79.5%).
Finally, the Full model (A + P + T + R) consolidates the observed reductions, achieving UA = 88.9% and F1 = 93.0% for the Urban class, and UA = 83.8% and F1 = 83.1% for Wetlands, while classes that were already well characterized (e.g., Water, Native Forest, and Forest Plantations) remain stable (F1 ≥ 93%). Overall, these results are reflected in enhanced spatial stability of lacustrine–terrestrial interfaces within the analyzed area.
(b) Case 2: Urban–fluvial environment of the city of Puerto Aysén (Figure 8).
This case examines the spatial consistency of the mapping in a complex urban–fluvial setting characterized by compact urban areas, active alluvial plains, riparian wetlands, and bare-soil surfaces linked to fluvial bars (see Figure 8 for a spatial comparison across all model configurations).
In configurations A and A + P, persistent commission errors are observed for the Urban class, manifested as spurious expansion into riparian sectors and non-built surfaces, consistent with the low UA values reported (72.0% and 75.4%, respectively). These confusions are mainly concentrated along urban–fluvial interfaces and in transitions with Bare Soil/Alluvial Beaches and Wetlands. The incorporation of topography (A + T) contributes to greater spatial coherence of the fluvial corridor and adjacent alluvial surfaces but does not reduce Urban commission errors (UA = 68.8%), indicating that topographic information stabilizes the geomorphological context without directly discriminating urban cover.
In contrast, the A + R configuration shows a clear reduction in Urban commission errors, with UA increasing to 77.6% and F1 to 86.8%, reflected in a more precise delineation of the urban footprint. For Wetlands, improvements are moderate (UA = 80.4%; F1 = 79.5%), associated with a progressive stabilization of their spatial distribution. The Full model (A + P + T + R) consolidates these improvements, achieving the highest accuracy values for the Urban class (UA = 88.9%; F1 = 93.0%) and enhanced spatial stability along urban–riparian interfaces.
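Since F1 is the harmonic mean of UA and PA, the Producer's Accuracy implied by a reported UA and F1 pair can be recovered by rearranging F1 = 2·UA·PA / (UA + PA). The sketch below applies this to the Full-model values quoted in the case studies; the resulting PA values are derived from rounded inputs and are not figures reported in the text, and the function name is an assumption.

```python
def producers_accuracy_from(ua, f1):
    """Solve F1 = 2 * UA * PA / (UA + PA) for PA (all values in percent)."""
    return f1 * ua / (2 * ua - f1)

# UA and F1 for the Full model as stated in the text.
urban_pa = producers_accuracy_from(88.9, 93.0)
wetlands_pa = producers_accuracy_from(83.8, 83.1)
```

Under this reading, the Urban class in the Full model would combine a high implied PA with a lower UA, so the residual Urban error would still lean toward commission rather than omission.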
3.5. Sensitivity to the Temporal Aggregation of SAR Variables
As a complementary analysis, the sensitivity of the integrated model (A + P + T + R) to the temporal aggregation scheme of SAR variables was evaluated by comparing the annual aggregation used in the main experimental design with an alternative seasonal aggregation.
From a quantitative perspective, the SAR seasonal-aggregation variant yielded additional gains in global metrics relative to the Full model with annual aggregation (OA = 92.5%; Macro-F1 = 86.0%), reaching an OA of 92.9% and a Macro-F1 of 89.5%. These values correspond to increases of +0.4 and +3.5 percentage points, respectively.
However, qualitative inspection of spatial consistency reveals contrasting behavior. Figure 9 illustrates this effect in the urban–periurban environment of Coyhaique, comparing the optical reference (Figure 9A) with the Full model using annual SAR aggregation (Figure 9B) and its seasonal-aggregation variant (Figure 9C). While annual aggregation preserves a compact urban delineation that is consistent with the reference, seasonal aggregation induces spurious expansion of the Urban class into rural and periurban areas.
A similar, though less pronounced, pattern is observed for Forage Grassland, which exhibits increased spatial fragmentation under seasonal aggregation.
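The contrast between the two aggregation schemes can be illustrated on a single pixel. In the sketch below, a structurally stable surface shows a transient backscatter drop in one season; the backscatter values, season keys, and the cause of the drop are hypothetical, chosen only to show how the two composites respond.

```python
from statistics import median

# Hypothetical VV backscatter (dB) for one pixel, three acquisitions per season.
# The winter (jja) values carry a transient drop, e.g. from a wet or frozen surface.
vv_by_season = {
    "djf": [-9.8, -10.1, -10.0],
    "mam": [-10.2, -9.9, -10.0],
    "jja": [-14.0, -13.5, -10.1],
    "son": [-10.0, -9.7, -10.3],
}

# Seasonal aggregation: one median per season, so the transient survives as a feature.
seasonal = {season: median(values) for season, values in vv_by_season.items()}

# Annual aggregation: one median over all acquisitions, which absorbs the transient.
annual = median(v for values in vv_by_season.values() for v in values)

seasonal_range = max(seasonal.values()) - min(seasonal.values())
```

With these assumed values, the seasonal features differ by several dB for an unchanged surface, whereas the annual median stays close to the stable level. This is one plausible reading of why seasonal SAR features can add discriminative signal at validation points while also injecting spatial noise into the map.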
4. Discussion
The results demonstrate that the synergy between multi-seasonal optical data, multi-temporal SAR observations, and topographic variables significantly enhances LULC classification in complex Andean ecosystems. The performance of the Full model (OA: 92.5%; Macro-F1: 86.0%) confirms that multisensor integration is not merely additive but genuinely complementary, particularly in mitigating the effects of class imbalance (Table 4). Beyond achieving high statistical accuracy, the contribution of this study lies in the empirical quantification of the marginal gains provided by different data domains through a systematic ablation design, and in the evaluation of annual SAR median composites as a robust alternative to seasonal aggregations for maintaining cartographic coherence in cloud-prone mountain environments.
4.1. Multisensor Synergy and Model Performance
The integration of optical, radar, and topographic data proved decisive for accurate LULC mapping within the complex orography of the Aysén basin. Overall Accuracy increased progressively from 89.2% in the optical baseline model to 92.5% in the Full model. This performance gain is especially relevant when contrasted with national-scale mapping efforts in Chile, where a systematic decline in accuracy toward austral latitudes has been reported due to persistent cloud cover and complex terrain [14]. Rather than focusing solely on surpassing previously reported regional performance levels, these results demonstrate how the integration of multiple data domains resolves classification ambiguities that are otherwise difficult to address using single-sensor approaches. Moreover, the observed gain of +5.5 percentage points in Macro-F1 is consistent with previous studies conducted in heterogeneous landscapes, which have shown that the fusion of optical and radar information systematically improves the discrimination of complex land-cover classes compared to single-source approaches [19,63]. Taken together, these results highlight that the ablation-based design provides a systematic framework to disentangle the individual and combined contributions of optical, SAR, and topographic variables in complex mountainous environments.
Regarding the optical domain, the extreme cloud persistence characteristic of high-latitude or humid mountain environments poses a significant challenge for maintaining seasonal spectral integrity. To ensure spatial continuity, we implemented a hierarchical compositing strategy in which an annual median was used as a pixel-level fallback to fill residual gaps. While temporal interpolation is often applied to reconstruct phenology [64], its use in data-scarce environments is constrained by the limited availability of temporally consistent observations required for reliable reconstruction [65]. Under such conditions, interpolation may introduce phenological trajectories not directly supported by observations when large temporal gaps are present, highlighting the challenges of reconstructing temporally consistent signals in cloud-prone environments [66].
By prioritizing observed reflectance values (i.e., the annual median) over interpolated estimates, this approach maintains the physical consistency of the input data. This strategy is consistent with compositing approaches widely used in Google Earth Engine and large-scale land-cover mapping, where median-based composites are commonly employed to reduce cloud-related noise and ensure spatial continuity in heterogeneous landscapes [67,68].
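The hierarchical compositing rule can be sketched for a single pixel as follows. This is a plain-Python illustration of the fallback logic described above, not the Google Earth Engine code used in the study; the function name, season keys, reflectance values, and source labels are assumptions.

```python
from statistics import median

def seasonal_composite(observations, season):
    """Seasonal median for one pixel, falling back to the annual median if the season is empty."""
    in_season = [value for s, value in observations if s == season]
    if in_season:
        return median(in_season), "seasonal"
    all_values = [value for _, value in observations]
    if all_values:
        return median(all_values), "annual_fallback"
    return None, "no_data"

# Hypothetical cloud-free reflectances for one pixel; no valid winter (jja) scene exists.
obs = [("djf", 0.21), ("djf", 0.25), ("mam", 0.30), ("son", 0.27), ("son", 0.23)]
summer = seasonal_composite(obs, "djf")
winter = seasonal_composite(obs, "jja")
```

Both outputs are medians of observed values, so no interpolated reflectance enters the feature stack. Returning a source label alongside the value is a design choice of this sketch; a flag of that kind would allow the share of fallback pixels to be counted.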
A minor limitation of this approach is that the frequency of gap filling from the annual composite was not explicitly quantified. However, as the same compositing strategy was applied consistently across all experimental configurations, relative comparisons between models remain unaffected. Taken together, these elements support the consistency of the optical domain under conditions of persistent cloud cover.
Although optical data (Sentinel-2) effectively captured phenological variability, as reflected by the dominance of SWIR bands (B11, B12) and snow- and vegetation-related indices in the baseline model, this information alone was insufficient to differentiate classes with similar spectral responses but distinct geometric or structural configurations. In this context, the incorporation of Sentinel-1 SAR backscatter (A + R configuration) yielded the largest marginal performance gain (+2.5 percentage points in OA).
This improvement is attributed to the ability of radar data to introduce descriptors sensitive to surface roughness and three-dimensional canopy organization, thereby facilitating the separation of structurally complex classes such as urban areas, bare soil, and shrublands [19].
The robust contribution of these different data domains is further supported by the variable-importance ranking (
Figure 5). The inclusion of multiple spectral indices introduces a degree of redundancy, as some predictors are derived from similar spectral bands and capture related surface properties. However, the Random Forest algorithm mitigates the impact of multicollinearity through its random feature selection at each node [
43,
69], which reduces the dominance of correlated predictors within the ensemble.
In this study, importance values reflect the relative contribution of feature groups within a controlled ablation framework rather than strictly independent physical drivers. The consistent prominence of non-collinear features, such as elevation, slope, and SAR-derived metrics, indicates that their contribution is not an artifact of redundancy: they remain highly influential despite the high dimensionality of the optical block.
4.2. The Role of Topography and SAR in Structural Discrimination
Our findings highlight topography as a key structuring factor of the landscape. The dominance of elevation and slope in the variable-importance ranking (
Figure 5) is consistent with the environmental gradients of the Aysén River Basin, where altitude governs wetland distribution and the upper treeline. However, the topographic information available to the model is fundamentally constrained by the 30 m native resolution of the SRTM product. Although these data were resampled to 10 m for multisensor integration, resampling does not add geomorphological detail. Consequently, the topographic variables in this study represent broad environmental gradients rather than fine-scale terrain features, which should be considered when interpreting results in areas of extreme topographic fragmentation.
In this context, the inclusion of topographic variables (A + T) effectively corrected commission errors associated with wetlands on steep slopes (
Figure 7), a recurrent issue in classifications based exclusively on optical information, where topographic shadows or surfaces with high soil moisture can mimic wetland spectral signatures [
70].
However, the results also indicate that topography, while a robust predictor for natural land covers, is insufficient for urban discrimination in the complex environments of Patagonia. Under the A + T configuration, relief variables introduced a systematic bias, reducing the User’s Accuracy (UA) of the Urban class from 72.0% to 68.8% (
Figure 4). This limitation is consistent with findings from national-scale mapping efforts in Chile, where austral topographic complexity hampers land-cover separation even when digital elevation models and multi-temporal optical data are integrated [
14]. This behavior can be explained by geographic covariance: a low slope acts as a positional predictor of settlements, yet it is not exclusive to them. In the Aysén River Basin, urban centers such as Puerto Aysén and Coyhaique are located on fluvial terraces and alluvial plains, a geomorphological niche they share with forage grasslands and sand bars. The effect can be interpreted more rigorously within the framework of decision-tree learning. In Random Forest classifiers, predictor variables contribute to classification performance according to their ability to reduce class impurity during node splitting, typically quantified through the Gini gain [
43].
When different land-cover classes occupy overlapping regions of the feature space, such as urban areas and forage grasslands within similar elevation and slope ranges, splits based on topographic variables produce child nodes with comparable class compositions, resulting in limited impurity reduction. Consequently, these variables exhibit limited discriminative capacity within homogeneous geomorphological domains.
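The impurity argument can be made concrete with a small worked example. The sketch below computes the Gini gain of two candidate splits on a node containing equal numbers of urban and forage-grassland samples; the class counts are hypothetical and chosen only to contrast a split that leaves the class mix almost unchanged with one that separates the classes.

```python
def gini(counts):
    """Gini impurity of a node given per-class sample counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_gain(parent, left, right):
    """Impurity decrease obtained by splitting `parent` into `left` and `right`."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted

# Hypothetical node with [urban, forage grassland] sample counts
parent = [100, 100]

# Split on slope: both classes occupy the same flat terraces,
# so each child keeps nearly the same class composition as the parent
slope_gain = gini_gain(parent, left=[52, 48], right=[48, 52])

# Split on a structural descriptor (e.g., a SAR metric) that separates the classes
sar_gain = gini_gain(parent, left=[90, 10], right=[10, 90])

print(f"slope split gain: {slope_gain:.4f}")
print(f"SAR split gain:   {sar_gain:.4f}")
```

With these counts the slope-based split reduces impurity by about 0.0008, against about 0.32 for the structural split, so a tree would rarely select slope at such a node.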
In contrast, topographic variables remain highly informative across altitudinal gradients where classes are naturally stratified (e.g., snow, forest, and steppe). This behavior is consistent with the variable-importance ranking (
Figure 5), where elevation and slope dominate under stratified conditions, but their relative importance decreases when structurally informative variables (e.g., SAR metrics) are introduced to resolve class ambiguities in low-relief areas. Within this framework, SAR and topographic variables address different sources of classification error: SAR enhances discrimination of structurally similar classes, whereas topography resolves terrain-induced spectral ambiguities.
This bias is resolved by incorporating Sentinel-1 SAR signals (A + R). Radar sensitivity to surface roughness and double-bounce scattering mechanisms enables effective separation of built infrastructure from natural substrates, as documented in previous SAR–optical fusion studies for urban mapping [
59,
71]. Furthermore, the high importance of the local incidence angle variable in the Full model reflects its role as a terrain-compensating predictor: by exposing the combined effect of sensor look geometry and local slope to the classifier, the model implicitly accounts for terrain-induced radiometric variations without requiring an explicit prior normalization step. Because the annual median composite aggregates both ascending and descending acquisitions, the angle value at each pixel converges toward a terrain-driven central tendency rather than a pass-specific acquisition geometry, mitigating potential artifacts associated with systematic orbital effects [
28].
4.3. Spatial Consistency Versus Statistical Metrics
The sensitivity analysis of SAR temporal aggregation revealed a critical discrepancy between statistical performance metrics and cartographic quality. Although the seasonal-SAR variant of the Full model numerically outperformed annual aggregation (Macro-F1: 89.5% vs. 86.0%; OA: 92.9% vs. 92.5%), visual inspection demonstrated that this metric gain masked a substantial degradation in spatial coherence, manifested as spurious urban expansion and increased class fragmentation at landscape ecotones (
Figure 9). This paradox arises from two concurrent causes: a physical one, whereby seasonal composites aggregate insufficient acquisitions to suppress transient dielectric anomalies induced by precipitation and snowmelt events characteristic of the western Patagonian hydrological cycle [
72], generating radiometric confusion concentrated at class boundary zones; and a methodological one, whereby the point-based validation protocol samples pixels exclusively from homogeneous polygon interiors, remaining spatially blind to the geometric fragmentation occurring at ecotones. The annual median composite, by aggregating a full hydrological cycle of approximately 40–60 acquisitions per pixel, suppresses stochastic moisture-driven backscatter variance while preserving the structural signature of each class [
66], prioritizing cartographic coherence over marginal gains in point-based statistical metrics. Such radiometric ambiguity between built-up surfaces and natural substrates with high surface roughness or rocky components is a well-documented challenge in Sentinel-1-based mapping, particularly in heterogeneous and topographically complex environments where distinct land covers may produce similar signal responses [
28]. These findings support a methodological recommendation: the use of temporally aggregated SAR features, particularly annual composites, as a robust strategy to improve spatial consistency under persistent cloud cover conditions.
This effect can be interpreted in light of the physical sensitivity of C-band SAR backscatter to short-term variations in surface moisture, dielectric properties, roughness, and vegetation structure [
55,
72]. In dynamic environments, transient increases in surface moisture following precipitation or snowmelt can modify the effective dielectric properties of natural surfaces and alter the backscatter response, occasionally producing signals comparable to those of structurally complex or built environments.
In this context, the use of seasonal aggregations increases the likelihood that the classifier incorporates transient radiometric variability linked to short-term environmental conditions rather than to the permanent structure of land covers. Methodological studies on SAR preprocessing have emphasized that decisions regarding temporal aggregation and radiometric stabilization directly affect the robustness of derived products and the suppression of spurious artifacts [
27]. By contrast, the annual median composite acted as a robust temporal filter, smoothing transient noise and prioritizing the geographic consistency of permanent structures over marginal gains in global statistical metrics.
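The filtering behavior of the annual median can be illustrated with a toy backscatter series. The sketch below is an assumption-laden example, not a reproduction of the study data: the dB values, the four acquisitions per season, and the timing of the wet events are hypothetical.

```python
from statistics import median

# Hypothetical VV backscatter (dB) for one bare-soil pixel.
# Dry-state values sit near -14 dB; wet events raise the signal to about -8 dB.
dry, wet = -14.0, -8.0
summer = [dry, dry, dry, dry]
autumn = [dry, wet, dry, dry]
winter = [wet, wet, dry, wet]   # precipitation and snowmelt dominate this season
spring = [dry, dry, wet, dry]

seasons = {"summer": summer, "autumn": autumn, "winter": winter, "spring": spring}

# Seasonal composites: each median is computed from only four acquisitions
seasonal_medians = {name: median(values) for name, values in seasons.items()}

# Annual composite: the median is computed from all sixteen acquisitions
annual_median = median(summer + autumn + winter + spring)

print(seasonal_medians)
print(annual_median)
```

In this example the winter composite takes the wet-event value because transient acquisitions form the majority of that short window, whereas the annual composite retains the dry-state value that characterizes the permanent surface.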
4.4. Persistent Challenges in Natural Grasslands
The Natural Grasslands/Shrublands class represented the primary challenge within the classification scheme, exhibiting the lowest performance in the Full model (F1 = 70.4%). In the seasonal optical baseline model, this class showed systematic overestimation, associated with its high spectral and phenological similarity to forage grasslands, which limits discrimination based solely on optical reflectance [
69].
The incorporation of topographic and radar information reduced commission errors, increasing User’s Accuracy (UA = 67.3%). However, this improvement was accompanied by reduced sensitivity, reflecting a characteristic trade-off for transitional land covers with high internal heterogeneity. From a biophysical perspective, this limitation can be explained by phenological overlap in the optical domain [
73] and by the limited ability of C-band SAR to resolve fine-scale structural differences. At the pixel scale, the volumetric scattering response of sparse shrublands is comparable to that generated by dense, managed grasslands [
70].
This suggests that pixel-level spectral and structural information may be insufficient to fully resolve these highly heterogeneous transitional zones. In this context, approaches that incorporate explicit spatial context, such as GEOBIA or deep learning architectures, may help capture textural and neighborhood patterns that are not represented in pixel-based features. These methods have demonstrated potential for addressing persistent confusion among land-cover classes with similar physical signatures [
70,
74].
While these limitations highlight the challenges associated with class-specific discrimination, it is also important to consider the implications of the experimental design. The use of a balanced sampling scheme (1500 samples per class) represents a methodological trade-off that helps ensure adequate representation of all land-cover categories during model training. In heterogeneous landscapes such as the Aysén basin, where minority but relevant classes (e.g., urban areas or wetlands) occupy a small fraction of the territory, proportional-to-area sampling could lead to dominance of majority classes, reducing the model’s ability to learn discriminative spectral–structural characteristics of underrepresented categories [
58,
75,
76].
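A balanced draw of this kind can be sketched as follows. The candidate pool, class names, and the helper `balanced_sample` are hypothetical; only the target of 1500 samples per class comes from the study design.

```python
import random

def balanced_sample(labels, per_class, seed=0):
    """Return indices with at most `per_class` randomly drawn samples for each class."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    selected = []
    for label, indices in sorted(by_class.items()):
        k = min(per_class, len(indices))  # small classes contribute all they have
        selected.extend(rng.sample(indices, k))
    return selected

# Hypothetical pool of candidate pixels dominated by forest
labels = (["forest"] * 9000 + ["grassland"] * 3000
          + ["wetland"] * 1600 + ["urban"] * 1500)

chosen = balanced_sample(labels, per_class=1500)
counts = {c: sum(1 for i in chosen if labels[i] == c) for c in set(labels)}
print(counts)
```

Under proportional-to-area sampling, the same pool would yield six forest samples for every urban one; the balanced draw gives every class equal weight during training.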
From a validation perspective, while the polygon-level partition strategy prevents direct pixel-level leakage, it does not fully eliminate inherent spatial autocorrelation between neighboring units. However, since all model configurations in this ablation study were consistently trained and evaluated under the same sampling and validation framework, the relative performance gains quantified remain robust and comparable. This integrated design supports the interpretation that the observed synergistic contributions are primarily associated with the information content of the multi-sensor data rather than differences in class prevalence or spatial artifacts, particularly in contexts where overall accuracy (OA) may be insufficient to represent classification performance under imbalanced conditions [
77].
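The gap between OA and class-balanced metrics under imbalance can be shown with a two-class confusion matrix. The counts below are hypothetical and serve only to illustrate why Macro-F1 is reported alongside OA.

```python
def overall_accuracy(cm):
    """Fraction of correctly classified samples (trace over total)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def macro_f1(cm):
    """Unweighted mean of per-class F1; rows = reference, columns = predicted."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # omission errors for class k
        fp = sum(cm[r][k] for r in range(n)) - tp  # commission errors for class k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical imbalanced test set: 950 forest samples, 50 urban samples
cm = [
    [940, 10],   # reference forest
    [30, 20],    # reference urban
]

oa = overall_accuracy(cm)
mf1 = macro_f1(cm)
print(f"OA = {oa:.3f}, Macro-F1 = {mf1:.3f}")
```

Here OA reaches 0.96 even though the minority class has an F1 of only 0.50, which Macro-F1 (about 0.74) makes visible.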
5. Conclusions
The ablation design showed that multisensor fusion helps overcome the limitations of optical remote sensing in complex Andean landscapes and is therefore strongly recommended in such settings. Although derived from a single basin and a single year of observation, this finding is consistent with the physical constraints imposed by persistent cloud cover and rugged terrain that characterize high-latitude mountain environments more broadly. Beyond improving global performance, this approach allowed for the decoupling and quantification of contributions from complementary information domains. While the central tendency baseline proved insufficient for capturing compositional heterogeneity (Macro-F1 = 80.5%), the inclusion of phenological metrics (P) refined vegetation discrimination, and the subsequent integration of topographic (T) and radar (R) variables generated critical, non-redundant gains (+3.8 points each), acting as geomorphological filters and structural descriptors. The integrated model (A + P + T + R) maximized global performance (OA = 92.5%; Macro-F1 = 86.0%), validating the hypotheses regarding the necessary complementarity between phenological dynamics (H1), physical structure (H2), and topographic landscape context (H3).
Methodologically, this study cautions against optimizing statistical metrics at the expense of the spatial plausibility of cartographic products. Although seasonal SAR aggregation improved numerical metrics, it introduced noise and geometric artifacts; in contrast, annual composites operated as robust temporal regularizers, prioritizing cartographic coherence over marginal metric gains. Nonetheless, challenges persist in transitional classes such as Grasslands/Shrublands (F1 ≈ 70%), suggesting that approaches incorporating explicit spatial context, such as GEOBIA or deep learning techniques, may improve class separability by better capturing spatial patterns in heterogeneous landscapes. From the perspective of potential operational implementation, the developed workflow, based entirely on open data (Copernicus Sentinel-1 and Sentinel-2 imagery and the SRTM elevation model) and cloud-based GEE processing, constitutes a cost-effective and scalable framework for the systematic production of LULC maps in vast and inaccessible regions. Its reliance on freely available and globally consistent data sources enhances its reproducibility. Confirming its robustness under operational conditions will nevertheless require multi-year validation to assess the temporal stability of classification accuracy under varying hydrological and phenological conditions. The approach shows strong potential for transfer to land management agencies operating in analogous mountain environments characterized by persistent cloud cover, where it could facilitate frequent updates of land cover data for watershed management, fire monitoring, and climate change adaptation planning, provided that its performance is first evaluated in other basins or comparable environmental contexts.
It is important to acknowledge that the findings presented here are based on a single case study conducted in the Aysén River Basin using 2021 imagery. While the results demonstrate robust performance and consistent behavior across multiple feature configurations, the broader applicability of the proposed framework should be further evaluated in additional geographic and environmental contexts. Future research should therefore assess its reproducibility and classification performance in other cloud-prone mountain basins of Chilean Patagonia and comparable high-latitude environments.