1. Introduction
Wetlands are strategic ecosystems that play a crucial role in maintaining biodiversity and hydromorphic vegetation. They function as natural buffers against floods, filter pollutants, and provide habitat for endemic species. However, accelerated urban growth has transformed these ecosystems, leaving only small remnants within highly urbanized zones [1]. In addition, planning policies and road construction have fragmented wetlands, reducing vegetation cover, water bodies, and, consequently, the resilience of these ecosystems [2].
In Latin America and the Caribbean, wetlands located within or near cities are highly threatened by rapid population growth and informal settlements that invade hydraulic buffer zones and areas designated for environmental management [3]. Weaknesses in territorial planning and public policy exacerbate their degradation, with impacts not only on ecological processes but also on economic activities and spatial organization. As a result, mangroves, lagoons, and marshes have suffered severe transformations due to their proximity to large urban centers [4]. For example, in Lima (Peru), 203 hectares of wetlands disappeared between 1990 and 2013, compromising their ecological functions. In Chile, the Aconcagua–Concón wetland has been deteriorating since 2015 due to urban pressure and extreme climatic events [5].
In this context, photointerpretation and satellite imagery have proven useful for monitoring wetland transformations. Specifically, the Normalized Difference Vegetation Index (NDVI), derived from near-infrared and visible red bands, has been widely applied to classify land cover and distinguish natural from artificial zones [6]. NDVI provides a powerful tool for regional-scale monitoring, given the accessibility of remote sensing data and its applicability across diverse Latin American contexts.
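As a minimal illustration of the index itself, NDVI is conventionally computed per pixel as (NIR − Red) / (NIR + Red). The sketch below assumes reflectance values already scaled to 0–1. The function name `ndvi` and the sample values are hypothetical and are not taken from this study's processing chain.

```python
def ndvi(nir, red):
    """Return the Normalized Difference Vegetation Index for one pixel.

    nir, red: reflectance in the near-infrared and red bands (assumed 0-1).
    Returns None when both bands are zero, where the ratio is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical 2x2 patch of reflectance values, stored as (nir, red) pairs.
patch = [
    [(0.50, 0.10), (0.30, 0.20)],
    [(0.05, 0.08), (0.00, 0.00)],
]

# Apply the index pixel by pixel; the empty pixel maps to None.
ndvi_patch = [[ndvi(nir, red) for nir, red in row] for row in patch]
```

Values close to 1 indicate strong near-infrared reflectance relative to red, which is typical of dense green vegetation, whereas values near or below zero are typical of water or bare surfaces.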
Similarly, spectral water indices are essential for detecting water bodies and quantifying surface moisture based on reflectance in near-, mid-, and shortwave infrared bands [6,7]. Among them, the Normalized Difference Water Index (NDWI) is widely used in wetland studies. For instance, urban wetlands in China have been analyzed with NDWI to assess water body variations under anthropogenic and climatic influences [8]. In Colombia, Pedraza et al. [9] used NDWI to track seasonal water fluctuations in the Torca–Guaymaral wetland, while in Kenya, the Automated Water Extraction Index (AWEI) has been applied to analyze historical water-level variations in Lake Baringo, incorporating factors such as land use change, surface temperature, erosion, sedimentation, and precipitation [3].
More recently, the integration of spectral indices with advanced machine learning (ML) methods has significantly expanded monitoring and predictive capacities. Algorithms such as Random Forest (RF) and cellular automata (CA) allow land-use dynamics to be simulated at the pixel level, capturing spatiotemporal processes of change. RF stands out for its robustness against noise, its ability to handle high-dimensional datasets, and its capacity to evaluate variable importance. For example, RF has been applied with vegetation indices to classify wetlands and produce 30 m classification maps of Gansu Province for eight periods between 1987 and 2020, achieving an average overall accuracy (OA) of 96.0% and a Kappa coefficient of 0.954 [11]. The marsh wetland type exhibited the highest average user’s accuracy (UA) and producer’s accuracy (PA), at 96.4% and 95.2%, respectively. When combined with CA, particularly with neural networks used to calibrate transition rules, these ML methods achieve high accuracy in projecting future scenarios [10,12].
Several studies illustrate the effectiveness of this approach in wetland conservation. At Khinjhir Lake, Pakistan, a spatiotemporal analysis (2000–2020) combining spectral indices, Random Forest, and CA–Markov modeling revealed an 11% reduction in water bodies and a 30% loss of moderately to highly vulnerable zones, with projections indicating further decline [13]. Likewise, in Cauca, Colombia, the application of K-Means and Random Forest with NDVI/NDWI enabled the identification of five wetland complexes, including 25,929.39 ha of high-vegetation swamps and 1795.51 ha of low-vegetation cover [14].
In Bogotá, the urgency of such approaches is clear: wetlands have lost 84.52% of their area in the 21st century due to urban growth and road infrastructure development [15]. The Capellanía wetland, despite being part of Bogotá’s Main Ecological Structure and receiving international protection under the Ramsar Convention, remains highly vulnerable. Infrastructure projects, such as Avenida La Esperanza and Avenida Ferrocarril de Occidente, have already fragmented its connectivity, while the planned Avenida Longitudinal de Occidente (ALO) threatens to eliminate nearly one-third of its area, severely affecting its hydrological dynamics and vegetation [16].
Given this situation, the present study seeks to monitor the Capellanía wetland using spectral indices and machine learning algorithms to predict future scenarios and evaluate vegetation cover changes under urban expansion pressures.
3. Results
3.1. Diagnosis of the Capellanía Wetland
The initial diagnosis, based on NDVI and NDBI indices (2024–2025), indicates that the Capellanía wetland maintains dense vegetation cover in its central and southwestern sectors, dominated by grasslands and shrubs, particularly near ecological trails. The wetland also preserves permanent and seasonal water bodies, which sustain ecological connectivity.
In contrast, the peripheral zones exhibit sparse vegetation or bare soil, coinciding with areas of high probability of cover loss as indicated by the NDBI. These areas are subject to intense anthropogenic pressure from surrounding urban, road, and industrial developments. This situation highlights the vulnerability of ecosystem edges and underscores the urgent need for restoration and management actions to mitigate progressive vegetation cover loss (Figure 4).
3.2. Fieldwork Results
Validation of the vegetation cover maps through field observations (April 2024) showed a consistent correspondence with vegetation index values (NDVI; Figure 5). A clear spectral gradient was identified: high NDVI values (>0.6) corresponded to dense vegetation areas, intermediate values (0.2–0.4) were associated with grasslands and herbaceous cover, while low or negative values (<0.1) coincided with water bodies.
During in situ verification, wetland indicator species such as Schoenoplectus californicus and Typha angustifolia were documented, along with invasive taxa such as Ulex europaeus. Characteristic wetland birds were also recorded, including the Tropical Kingbird (Tyrannus melancholicus) and Vermilion Flycatcher (Pyrocephalus rubinus). These observations provided ecological validation of the assigned spectral classes.
However, widespread anthropogenic pressures were also observed, including landfills, drainage systems, and extensive livestock grazing, which explain the atypically low NDVI values recorded along wetland edges and buffer zones. Collectively, these results confirm the robustness of the automated spectral classification and quantitatively complement the satellite analysis with direct botanical, zoological, and geospatial evidence, consolidating the foundation for subsequent predictive modeling.
3.3. Spectral Correlation Matrix Between Moisture and Vegetation Indices
The correlation matrix (Table 4) shows a very strong negative relationship between NDVI and NDWI (−0.99), a moderate negative correlation between NDVI and MNDWI (−0.57), and a low positive correlation between NDVI and NDMI (0.29). This indicates that areas with higher vegetation density tend to have lower surface moisture. Although its correlation with NDVI is low, NDMI was retained in the model because it provides complementary information about canopy and soil moisture, which is useful for algorithms such as Random Forest that capture non-linear interactions.
These results are complemented by the SHAP (SHapley Additive exPlanations) technique (Figure 6), which identified NDWI and MNDWI as the variables with the highest influence on predictions, followed by NDMI. The seasonal variables (month_sin and month_cos) showed a smaller but non-negligible contribution, indicating that seasonality also affects the spectral response of vegetation, albeit to a lesser extent.
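The names month_sin and month_cos suggest a cyclic encoding of the acquisition month. The following sketch shows one common way to build such features, assuming a 12-month period with January placed at angle zero. The exact convention used in this study is not stated, so the offset and the helper name `encode_month` are illustrative assumptions.

```python
import math


def encode_month(month):
    """Map a calendar month (1-12) to a point on the unit circle.

    Returns (month_sin, month_cos). January is placed at angle zero,
    which is an assumption made for this sketch.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)


# December and January end up as close to each other as January and
# February, which a plain integer month (12 vs. 1) would not capture.
dec_jan = math.dist(encode_month(12), encode_month(1))
jan_feb = math.dist(encode_month(1), encode_month(2))
```

Encoding the month as two coordinates lets a tree-based model treat the end and the start of the year as neighbors instead of extremes.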
3.4. Multitemporal Analysis of NDVI (2013–2032)
Figure 7 illustrates the temporal evolution of vegetation cover in the Capellanía wetland between 2013 and 2032. Initially, bare soil and sparse vegetation dominate, with localized water presence. From 2021 onwards, a progressive transition is observed, characterized by an increase in moderate vegetation, which exceeds 50% of the area during several periods. Starting in 2022, there is a notable increase in dense vegetation, reaching approximately 70% of the total cover by 2032.
In this context, the strong negative correlation observed between NDVI and NDWI (−0.99) does not pose a methodological issue; rather, it reflects an expected dynamic in urban wetlands. While the increase in NDVI indicates vegetation recovery, the concurrent decrease in NDWI signals a relative reduction of water-covered surfaces. This result highlights the competitive relationship between water and vegetation within the ecosystem.
The evolution of the Shannon diversity index (H′) in the Capellanía wetland between 2013 and 2032 shows marked variability in the early years (values ranging from 0.2 to 1.3), reflecting phases dominated by bare soil or sparse vegetation, alternating with periods of higher heterogeneity. From 2021 onwards, a sustained increase in structural diversity is observed, associated with the expansion of moderate vegetation, reaching values between 0.8 and 1.0.
In the projected period (2025–2032), the index tends to stabilize, although with a slight decline toward 2032, indicating the consolidation of moderate and dense vegetation cover, but with a reduced relative balance among categories. These results suggest a process of partial wetland recovery, in which vegetation cover strengthens while structural diversity gradually declines (Figure 8).
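For reference, the Shannon index over cover classes is H′ = −Σ pᵢ ln pᵢ, where pᵢ is the share of the area in class i. The sketch below assumes the natural logarithm and uses hypothetical class areas. It is an illustration of the formula, not the study's own computation.

```python
import math


def shannon_index(areas):
    """Shannon diversity H' from a list of class areas (any consistent unit).

    Classes with zero area contribute nothing, following the usual
    convention that p * ln(p) tends to 0 as p tends to 0.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    h = 0.0
    for area in areas:
        if area > 0:
            p = area / total
            h -= p * math.log(p)
    return h


# Hypothetical areas (ha) for four cover classes in two contrasting years.
balanced_year = shannon_index([25.0, 25.0, 25.0, 25.0])  # even mix of classes
dominated_year = shannon_index([2.0, 3.0, 25.0, 70.0])   # two classes dominate
```

Under the natural-log assumption, four equally represented classes give the maximum H′ = ln 4 ≈ 1.39, and the index falls as one or two classes come to dominate, which is the pattern described for the projected period.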
The calculation of the Compound Annual Growth Rate (CAGR) between 2013 and 2032 (Table 5) reflects a transition from degraded covers toward increased vegetation presence. There is a sustained decrease in water and bare soil, while moderate vegetation shows a slight but consistent growth. The most notable case is dense vegetation, for which the compound growth rate is mathematically undefined (it diverges) because the class starts from zero in 2013 and consolidates by 2032. These figures confirm a reduction of degraded areas (bare soil and sparse vegetation) and a significant increase in moderate and dense vegetation.
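The CAGR follows the standard definition (V_end / V_start)^(1/n) − 1, where n is the number of years. The sketch below uses hypothetical values and returns None when the starting value is zero, since the ratio cannot be evaluated in that case. The function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year).

    Returns None when start_value is zero, because the ratio
    end_value / start_value is undefined in that case.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if start_value == 0:
        return None
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical cover values (percent of area) over a 19-year span.
declining_class = cagr(40.0, 5.0, 19)  # negative rate: the class shrinks
emerging_class = cagr(0.0, 70.0, 19)   # None: the class starts from zero
```

For a class that is absent in the first year, reporting the absolute gain in area or in percentage points is a more informative summary than a growth rate.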
3.5. Multitemporal NDVI Mapping (2013–2032)
Figure 9 shows the spatial distribution of vegetation classes at different dates and projections for 2025 and 2032. Each map is classified into four categories: bare soil/non-vegetated (brown), sparse vegetation (yellow), moderate vegetation (light green), and dense vegetation (dark green).
2013–2019: Dominance of bare soil and sparse vegetation, especially in the northern and eastern sectors of the wetland, with small patches of dense vegetation concentrated in the southern sector.
2020: Critical scenario characterized by strong internal fragmentation, an increase in water bodies, and a reduction in vegetation cover.
2022–2024: Transition phase with a decrease in degraded areas and an increase in moderate and dense vegetation, particularly toward the central and southwestern sectors.
2025–2032 (projections): Predominance of moderate and dense vegetation, with almost complete disappearance of degraded categories, reflecting ecological recovery and consolidation of the ecosystem.
This spatial sequence supports the numerical results and Shannon index trends, showing that the wetland is moving toward higher vegetation cover, although with progressively greater structural homogeneity by 2032.
The multitemporal mapping is likewise consistent with the numerical results shown in Table 4 and the NDVI graphs, illustrating how the wetland is trending toward higher vegetation cover by 2032.
3.6. Urban Simulation Scenarios (With and Without ALO)
Scenarios II and III were analyzed using cellular automata (MOLUSCE), and the results are described below:
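Scenario II (with ALO): Non-urbanized areas decrease from 705.6 ha (2013) to 356.85 ha (2025), a loss of 33.8 percentage points of the total study area (a 49.4% decrease relative to 2013), while urbanized areas increase from 324.09 ha to 672.30 ha, a gain of 33.8 percentage points (Table 6 and Table 7).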
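The 33.8 figure can be reproduced as a change in each class's share of the total study area, which differs from the relative change of the class itself. The sketch below uses the four areas reported above. The helper names are hypothetical.

```python
def relative_change_pct(start, end):
    """Percent change of a quantity relative to its own starting value."""
    return 100.0 * (end - start) / start


def share_change_pp(start, end, total_start, total_end):
    """Change in the share of a total, in percentage points."""
    return 100.0 * (end / total_end - start / total_start)


# Areas in hectares as reported for Scenario II (2013 -> 2025).
non_urban_2013, non_urban_2025 = 705.6, 356.85
urban_2013, urban_2025 = 324.09, 672.30

# The two yearly totals differ by roughly half a hectare in the reported data.
total_2013 = non_urban_2013 + urban_2013
total_2025 = non_urban_2025 + urban_2025

non_urban_pp = share_change_pp(non_urban_2013, non_urban_2025, total_2013, total_2025)
urban_pp = share_change_pp(urban_2013, urban_2025, total_2013, total_2025)
non_urban_rel = relative_change_pct(non_urban_2013, non_urban_2025)
urban_rel = relative_change_pct(urban_2013, urban_2025)
```

Read this way, the shift of about 33.8 percentage points corresponds to a relative loss of roughly 49% of the non-urbanized area and a relative gain of roughly 107% in urbanized area.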
The transition matrix for Scenario II with the ALO project shows a land-use change dynamic where unbuilt land (State 0) has a 41.4% probability of being developed in the next period, reflecting high urbanization pressure. Conversely, already developed land (State 1) has an unusually high probability (29.1%) of reverting to unbuilt status, suggesting notable instability or reversibility in the development process under the modeled conditions.
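To make the two reported probabilities concrete, the sketch below builds the implied two-state transition matrix and iterates it. It assumes a first-order, time-homogeneous Markov chain, and the initial shares are hypothetical. The length of one period is not stated in the text, and this is not a reproduction of the MOLUSCE simulation, which also applies spatial neighborhood rules.

```python
# Transition probabilities reported in the text for Scenario II.
P_BUILD = 0.414   # unbuilt (state 0) -> developed (state 1)
P_REVERT = 0.291  # developed (state 1) -> unbuilt (state 0)

# Each row must sum to 1, so the diagonal follows from the values above.
P = [
    [1.0 - P_BUILD, P_BUILD],
    [P_REVERT, 1.0 - P_REVERT],
]


def step(shares, matrix):
    """Advance the (unbuilt, developed) shares by one period."""
    n = len(shares)
    return [sum(shares[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def project(shares, matrix, periods):
    """Apply the transition matrix repeatedly for a number of periods."""
    for _ in range(periods):
        shares = step(shares, matrix)
    return shares


def stationary_two_state(p01, p10):
    """Long-run shares of a two-state chain: (p10, p01) / (p01 + p10)."""
    total = p01 + p10
    return [p10 / total, p01 / total]


start = [0.5, 0.5]  # hypothetical initial shares, not taken from the study
after_three = project(start, P, 3)
long_run = stationary_two_state(P_BUILD, P_REVERT)
```

With these two probabilities the chain settles at about 59% developed and 41% unbuilt, so the high reversion probability noted above is what keeps the long-run developed share well below 100%.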
Figure 10 shows comparative results for the two projected scenarios in 2032 using the MOLUSCE cellular automata model. In the scenario without the ALO (left), large green areas correspond to stable or consolidating vegetation, with smaller red areas indicating high risk of vegetation loss. In contrast, the scenario with the ALO (right) shows a notable increase in red areas, particularly in the northeastern sector, reflecting a significant reduction in stable vegetation cover.
These results confirm that the ALO acts as a catalyst for urbanization processes, exerting greater pressure on wetland vegetation and increasing the risk of loss in adjacent areas. These findings highlight the urgent need to implement mitigation and restoration measures should the road project be executed.
The comparison between the spectral NDVI projection for 2032 and the cellular automata simulation based on NDBI, including the ALO route (Figure 10), shows strong spatial convergence in identifying areas at risk of ecological degradation. Both models highlight the northeastern sector of the wetland as critical, characterized by high NDBI and low NDVI values, indicating a direct relationship between urban intensification and vegetation loss.
Although the methodological approaches differ—NDVI projections via linear regression and Random Forest do not explicitly account for the ALO, while the cellular automata model incorporates it as a spatial variable—both models consistently indicate urban expansion as the main driver of ecological deterioration. This convergence reinforces the validity of the results and underscores the value of NDBI as an early indicator of anthropogenic pressure, particularly when integrated with vegetation indices in vulnerable urban wetlands.
4. Discussion
Machine learning-based approaches have become essential tools for wetland analysis. This study implements a methodology combining the Random Forest algorithm and Markov chains for vegetation cover analysis, complemented by cellular automata (via the MOLUSCE module) for prospective spatial simulation. The analysis relies on the calculation of vegetation, moisture, and urbanization spectral indices, selected for their demonstrated effectiveness in managing high-dimensional datasets and robustness in discriminating spectrally similar classes [50].
The relevance of these methods is supported by applications in settings facing analogous pressures, such as Barranquilla, Pakistan, Lima, and Concepción, where they proved effective in quantifying wetland degradation driven by urban expansion [5,10,14].
It is important to note that this study was conducted exclusively for the urban Capellanía wetland in Bogotá, so its findings reflect the specific biophysical, climatic, and anthropogenic conditions of this ecosystem. While the proposed approach (spectral indices, Random Forest, MOLUSCE) proved effective in modeling vegetation dynamics and urban expansion, generalization to other wetland types, such as coastal, alpine, or rural wetlands, requires additional validation that considers the peculiarities of each environment and necessitates fieldwork [25]. Wetland classification from remote sensing data is often challenging due to seasonal vegetation dynamics [29].
Additionally, the MOLUSCE model, based on cellular automata and historical data, does not incorporate external variables such as policy decisions or socioeconomic changes; therefore, its projections should be interpreted with caution. This limitation aligns with findings from arid wetlands, such as those in Xinjiang, China, where Random Forest achieved high accuracy (99%, Kappa = 0.92) but exhibited constraints in transferring the model to other scenarios [38].
4.1. Analysis of Wetland Vulnerability Metrics
The Random Forest model for the Capellanía wetland exhibited outstanding performance, with a coefficient of determination (R²) of 0.991, a root mean square error (RMSE) of 0.0214, and a mean absolute error (MAE) of 0.0127. These results demonstrate the model’s high predictive capacity and the close agreement between estimates and observed values, providing rigorous quantitative validation for assessing wetland vulnerability, consistent with studies conducted in the Sindh province [13]. Recent research confirms that Random Forest is highly accurate across diverse contexts, including marshes and coastal wetlands, due to the integration of multiple spectral indices [43,48].
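The three reported metrics have standard definitions, sketched below for a set of observed and predicted values. The function name `regression_metrics` and the sample numbers are hypothetical, and the sketch does not reproduce the study's validation split.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE and MAE for paired observed/predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical NDVI observations and model predictions.
example = regression_metrics([0.10, 0.35, 0.62, 0.80], [0.12, 0.33, 0.60, 0.83])
```

RMSE and MAE are expressed in the units of the target (here NDVI), so they are best read against the range of the index, while R² expresses the share of variance in the observations that the predictions account for.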
Urban areas adjacent to the wetland were validated using the cellular automata algorithm, yielding a moderate Kappa coefficient of 0.640. This is consistent with studies modeling Lake Baringo (Kenya), where Kappa values above 0.45 were reported. Cellular automata provide higher sampling efficiency and capture spatially complex interactions between urban growth drivers and wetland dynamics [3], and the present results support that capability.
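Cohen's Kappa compares the observed agreement between a simulated and a reference map with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it from a confusion matrix. The matrix values are hypothetical and are not the validation counts of this study.

```python
def cohens_kappa(confusion):
    """Cohen's Kappa from a square confusion matrix.

    confusion[i][j] is the number of pixels of reference class i
    that the simulation assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of pixels on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(k)) / total

    # Chance agreement: product of row and column marginals, per class.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / (total * total)

    if p_expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical two-class (non-urban, urban) validation counts.
kappa = cohens_kappa([[40, 10], [5, 45]])
```

Because Kappa discounts chance agreement, it is lower than overall accuracy whenever one class dominates the map, which is worth keeping in mind when comparing values across study areas.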
4.2. Comparison of Machine Learning and Simulation Models in Land-Use Change
Combining Random Forest and cellular automata enriched the analysis of territorial dynamics. NDVI-based vegetation cover projections accurately identified areas directly influencing the wetland, confirming the utility of Random Forest in urban environments. Studies in Sindh highlighted its robustness in vulnerable ecosystems [13]. Other studies have also demonstrated Random Forest’s ability to integrate spectral indices (NDVI, NDWI, NDBI) and improve predictions in urban and wetland contexts [51].
Concurrently, MOLUSCE successfully captured neighborhood effects and local spatial dynamics, demonstrating its potential for evaluating urban growth scenarios. Simulations for the Capellanía wetland, including the projection of the ALO, indicate that this infrastructure could threaten ecological integrity, increasing pressure on vegetation. This finding aligns with studies such as “Urban Wetland Trends in Three Latin American Cities” (Rojas et al., 2020), which identify urban expansion as a critical threat to metropolitan wetlands [5]. Additionally, Cuéllar & Pérez (2023) [19] applied a hybrid model (Markov-ANN-FLUS) to 15 urban wetlands in Bogotá, projecting a nearly 25% reduction by 2034 and highlighting Capellanía as one of the ecosystems most affected by urbanization. These findings support our results and emphasize the need to consider road infrastructure scenarios like the ALO in urban wetland planning and conservation. Similar predictive approaches have been successfully applied at Lake Baringo using neural networks and cellular automata [3].
Overall, these discussions demonstrate that the results obtained for the Capellanía wetland are consistent with global and regional trends. Despite significant surrounding urban pressure, there is a clear trend toward vegetation cover recovery and progressive consolidation of the wetland core, underscoring the relevance of implementing comprehensive planning and ecological restoration measures. Simultaneously, urban simulation projections confirm that infrastructure projects, such as the ALO, can dramatically accelerate vegetation loss in adjacent areas. These conclusions, supported by both current remote sensing data and recent scientific literature, provide a solid foundation for management, conservation, and decision-making in urban wetlands facing similar ecological characteristics and anthropogenic pressures.
4.3. Structural Diversity and Ecological Resilience (Shannon Index Analysis)
Shannon index values correspond with observations in the Capellanía Wetland Environmental Management Plan (2008) [20], which noted a loss of structural diversity due to water body drying and the predominance of degraded covers [1]. The marked variability observed between 2013 and 2020, with index declines approaching zero, reflects this ecosystem fragility and the temporal dominance of categories such as bare soil or sparse vegetation.
From 2021 onwards, the increase in moderate vegetation and the consequent recovery of diversity align with the 2023 Management Report of the RDH Capellanía, which highlights improvements in water quality, vegetation restoration processes, and strengthened community monitoring [2,19]. However, index stabilization around intermediate values (0.8–1.0) toward 2032 indicates that, although vegetation cover recovers, structural diversity tends to decrease due to the dominance of moderate and dense vegetation.
This finding suggests that the ecosystem is moving toward a more homogeneous state, which, while indicative of recovery, limits long-term ecological resilience. Therefore, restoration actions should aim to preserve cover heterogeneity, ensuring not only increased vegetation cover but also structural diversity that supports the ecological stability of the wetland.
5. Conclusions
The results demonstrate a clear transition in vegetation cover in the Capellanía Wetland between 2013 and 2032. Degraded classes (bare soil and sparse vegetation) decreased from over 80% to less than 10%, while moderate and dense vegetation increased steadily to exceed 90% by the end of the period. Water cover remained low but stable. These changes reflect a progressive recovery process and consolidation of vegetation, in contrast with the loss trends observed in other Latin American urban wetlands, such as the Ciénaga de Mallorquín.
Methodologically, this study demonstrates the value of integrating a multitemporal Random Forest model (R² = 0.991; RMSE = 0.0214; MAE = 0.0127) with spatial simulations based on cellular automata in MOLUSCE. This approach proved effective in anticipating ecological trajectories, identifying critical areas, and guiding targeted restoration strategies. The strong negative correlation between NDVI and NDWI (−0.99), along with vegetation consolidation in the core and potential loss along the northeastern edges, underscores the need for specific interventions to maintain ecological connectivity.
Urbanization scenario analysis confirms that the construction of the ALO would act as a catalyst for urban expansion, drastically reducing non-urbanized areas and increasing pressure on vegetation. This finding provides critical input for strategic environmental assessments and highlights the necessity of integrating spatial modeling into road and territorial planning in Bogotá.
Overall, this study provides quantitative, spatial, and prospective evidence to inform conservation policies and sustainable urban planning. The results advocate for preventive, integrated, and data-driven approaches to ensure the protection of strategic wetlands like Capellanía, which perform essential ecological functions within the urban matrix.
Finally, a limitation of this study is the absence of integrated climatic and socioeconomic variables due to resolution and data quality constraints, presenting an opportunity for model expansion in future research. It is also recommended to explore participatory management scenarios and incorporate in situ data to strengthen decision-making and promote more effective environmental governance in urban wetlands across Latin America.