1. Introduction
Understanding urban transformation is a fundamental aspect of contemporary urban planning, given the interaction between physical changes in land use and the social, economic, and cultural dynamics that influence the identity of cities [1, 2, 3]. A deep understanding of these processes is essential for striking an appropriate balance between urban growth and the preservation of cultural identity [4, 5]. In this context, the conservation of cultural heritage has become a significant challenge in areas with historical and industrial legacies, where pressures from urban development and real estate speculation threaten built heritage [6, 7, 8, 9]. Heritage protection is part of the international commitments made by states, especially within the framework of the Sustainable Development Goals (SDGs), most notably SDG 11, which aims to promote inclusive, safe, resilient, and sustainable cities [10].
In this sense, those responsible for the conservation of historical heritage primarily rely on instruments such as protection, conservation, or development control zoning. These tools delineate areas of historical, landscape, or architectural value and impose restrictions on permitted uses [11]. They aim to balance the preservation of built heritage with the dynamics of urban transformation through incentives, regulations, and restrictions that guide land use [12]. However, their effectiveness depends on the institutional capacity to anticipate patterns of change and understand the factors influencing transformation [13, 14].
Among the main risks to conservation are unplanned real estate expansion, pressure on areas of historical value, and morphological fragmentation of the urban fabric due to the opening of new road infrastructure or changes in permitted uses [15, 16]. These processes are expressed spatially through changes in land use, which makes the analysis of these changes a tool for heritage planning [17]. In particular, planners require interpretable spatial evidence that allows them to prioritize the driving factors associated with land transformation [18]. Recent studies have shown that heritage protection areas do not always guarantee morphological stability, as their impact depends on physical variables and the degree of regulatory compliance [19]. Notably, the opening of new roads is one of the most consistent drivers of change, increasing pressure on heritage environments and buffer zones [20, 21].
To address these challenges, it is essential to have accessible and understandable tools for urban planners and decision-makers [4, 5]. In recent years, various approaches have been developed to identify and quantify the drivers of urban transformation. These factors encompass biophysical, socioeconomic, institutional, and morphological dimensions, and their interactions shape both the magnitude and direction of territorial change processes. Additionally, quantitative methods have been incorporated to establish causal or associative relationships between explanatory variables and spatial transition patterns. Among these, traditional statistical approaches (such as logistic regression, principal component analysis, and geographically weighted regression) stand out for their ability to estimate the individual effects of each variable on territorial change [22, 23]. Simultaneously, machine learning-based techniques, such as Random Forest, Support Vector Machines, and Neural Networks, have demonstrated high predictive performance in complex and nonlinear contexts [24, 25].
However, most of these techniques have focused on urban expansion processes rather than changes within the existing urban heritage. Among the few that address intra-urban transformations, even fewer consider a detailed breakdown of land-use categories or the specific phenomena of change in historical urban contexts. Another limitation concerns the low interpretability of many machine learning models when compared with classical statistical methods. These problems restrict their usefulness for planning in urban heritage contexts, where understanding causal relationships at a detailed level and ensuring traceability of results are essential [26, 27, 28].
It is important to note that in this work, the term “interpretability” is used in the sense of intrinsic interpretability, understood as the model’s capacity to transparently show how the variables produce the result, without requiring external post-processing techniques [29, 30]. From this perspective, our analysis compares logistic regression, recursive partitioning (RPART), and random forests (RF) against widely recognized criteria of structural interpretability (model transparency, variable-effect traceability, complexity, and communicability), thereby allowing for a rigorous evaluation of the degree of explainability inherent in each approach.
According to this definition, logistic regression and RPART are considered fully interpretable models: the former because it offers coefficients and marginal effects directly attributable to each predictor, and the latter because it generates hierarchical partitioning rules that can be easily understood and communicated by a human analyst. In contrast, Random Forest models are not considered intrinsically interpretable. However, the literature places them in an intermediate category between transparent and completely opaque models. Although their internal structure is difficult to inspect, they provide global measures of variable importance that allow for a partial understanding of their functioning [29, 31, 32, 33].
The aim of this study is to address a persistent gap in the literature: the lack of methodological comparisons evaluating the interpretive utility of different models for tackling specific heritage planning and conservation problems. In historical urban contexts, where decisions must transparently justify why certain areas are more vulnerable to transformation than others, the interpretability of the model is as important as its accuracy. Therefore, this work not only compares the predictive capacity of three techniques (logistic regression, RPART, and Random Forest) but also critically examines how each translates territorial factors into actionable information for urban managers and conservationists. The purpose is to determine how the different forms of explanation offered by these models can support the identification of heritage risks, the evaluation of regulatory effectiveness, and the prioritization of interventions in historic urban environments.
The results show that urban transformation in heritage contexts follows complex and nonlinear spatial patterns, making models like Random Forest particularly suitable for simulating scenarios and capturing highly complex interactions. In contrast, RPART is the most appropriate model for identifying heritage risk zones, as it produces explicit rules and thresholds that are easily linked to zoning criteria. Logistic regression, for its part, is particularly useful for evaluating the effectiveness of regulatory instruments, as it quantifies the marginal effect of each variable and makes it possible to analyze whether protected areas are fulfilling their intended purpose. Finally, the factor ranking provided by Random Forest serves as a valuable tool for prioritizing interventions, revealing which territorial elements most clearly differentiate conservation and urban renewal processes.
The main contribution of this study is to provide distinct analytical tools that support heritage decision-making by linking spatial modeling with the specific needs of planners. In line with UNESCO’s Recommendation on Historic Urban Landscapes, the study prioritizes understanding the factors driving urban transformation, which facilitates the selection of the most appropriate model for each specific problem. This approach enables a more direct connection between analytical results and heritage management, optimizing risk identification, regulatory assessment, and intervention planning in historic urban environments.
The rest of the article is organized as follows: Section 2 characterizes the theoretical foundations of the models; Section 3 describes the proposed methodology; Section 4 presents the main findings, including the performance evaluation through cross-validation, the analysis of probability maps to examine spatial patterns, and the comparative interpretation of model outputs. This is followed by Section 5, which discusses the implications of the results, and Section 6, which summarizes the main conclusions and contributions of the study.
3. Methodology
The present study adopts a comparative approach to evaluate the explanatory capacity and predictive accuracy of three widely used methods in spatial land-use change modeling: Logistic Regression, RPART, and Random Forest. The methodological design is structured in three stages:
Application of the models to an urban case study, followed by a quantitative comparison of prediction metrics. Model validation was performed using a five-fold stratified cross-validation (5-fold CV) scheme applied to the training set, which represented 80% of the total observations. In each iteration, the Logistic Regression, RPART, and Random Forest models were retrained, and performance metrics were calculated: accuracy, sensitivity, specificity, precision, F1 score, balanced accuracy, and Kappa coefficient. The decision threshold was determined from the Youden index (J), obtained using the ROC curve [47], which selects the cutoff that maximizes the sum of sensitivity and specificity. This procedure allowed for more robust estimates of each model’s predictive capacity, reducing the variance associated with a single data partition [48]. Subsequently, the model with the best average performance was retrained using the entire training set and evaluated on the remaining 20% (test set) to estimate its final performance and generalizability. Additionally, the assumption of spatial independence was assessed using Moran’s I applied to the residuals of the Logistic Regression model to detect potential spatial patterns not captured by the predictors.
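The threshold selection described above can be illustrated with a minimal sketch. It is written in plain Python rather than the R packages used in the study, and the function name `youden_threshold` and the toy labels and scores are assumptions made only for illustration. Each distinct predicted probability is tried as a cutoff, and the one with the largest J = sensitivity + specificity − 1 is kept.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_threshold, best_j = None, -1.0
    # Each distinct predicted probability is a candidate cutoff;
    # a cell is predicted as "change" when its score is >= the cutoff.
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy example (hypothetical values): 1 = change, 0 = no change.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]
threshold, j = youden_threshold(y_true, scores)
print(threshold, round(j, 3))
```

On these toy values the selected cutoff is well below the conventional 0.5, which is the situation in which a Youden-based threshold changes the classification of borderline cells.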
Analysis of the probability maps to identify the areas most likely to experience land-use transition, based on the estimates from each model. In this stage, the probability maps obtained from each model are generated and compared as representations of the spatial distribution of transformation risk. The analysis focuses on identifying consistent territorial patterns, contrasting differences in the locations of vulnerable areas, and evaluating the degree of coherence between the methods. The objective is to determine the extent to which these predictions provide useful and reliable information for decision-making in heritage urban planning, especially in identifying critical areas where interventions, regulations, or protective measures are needed.
Analysis of the explanatory capacity of each model, focusing on the interpretation of the driving variables. In this stage, a systematic examination is conducted of how each model represents the influence of the drivers of change. In the case of Logistic Regression, the analysis focuses on the coefficients and odds ratios, which quantify the direction and magnitude of the effect of each territorial variable. For RPART, the focus is on the hierarchical rules and critical thresholds that structure the decision tree, revealing specific combinations of normative and morphological conditions under which land transformation occurs. In the Random Forest model, attention is paid to the relative importance of the variables, allowing for the prioritization of the factors that best distinguish between stable areas and sectors undergoing urban renewal processes. This comparative approach allows for contrasting the consistency between methods, identifying which factors exert the greatest influence on territorial transformation, and evaluating their relevance for the design of conservation and heritage zoning instruments. Overall, the analysis seeks to maximize the interpretability of the models and extract robust evidence to strengthen decision-making in land management.
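As a brief illustration of how a logistic regression coefficient is read as an odds ratio, the following sketch exponentiates a coefficient and builds an approximate 95% Wald interval. The coefficient and standard error are hypothetical values chosen for the example, not results from the study, and the helper name `odds_ratio` is likewise an assumption.

```python
import math


def odds_ratio(beta, std_error, z=1.96):
    """Odds ratio and approximate 95% Wald interval for one logit coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


# Hypothetical coefficient for a binary protection-zone indicator.
or_value, lower, upper = odds_ratio(beta=-0.693, std_error=0.10)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
```

An odds ratio below 1 whose interval also stays below 1 would indicate that cells inside the zone have lower odds of change than cells outside it, holding the other predictors fixed.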
All processing and analysis were performed in R (v4.3.1) using RStudio, with the rpart [37], randomForest [49], lulcc [50], and pROC [51] packages for model fitting, evaluation, and visualization.
Figure 4 illustrates the methodological flow of the land-use change modeling process. It begins with raster maps of observed land use from 1992 and 2019, which are stacked and reclassified along with the explanatory factors (Distance to the Historic Network; Distance to Beach; Historic Conservation Zone; Sectional Zone of the Coastal Border; Historic Monument Zone).
After spatial alignment, the layers are combined to generate the input dataset. This dataset is divided into 80% for training and 20% for testing. To evaluate model performance, a cross-validation scheme was applied to the training set. Subsequently, the metrics were estimated on the test set. This procedure allowed for the quantification of predictive capacity using widely accepted indicators in spatial modeling: overall accuracy, sensitivity, specificity, precision, F1 score, balanced accuracy, and the Kappa coefficient. Together, these metrics facilitate the evaluation of both the model’s overall performance and its ability to accurately distinguish between stable areas and areas undergoing transformation. The applied models (Logistic Regression, RPART, and Random Forest) facilitate the generation of probability maps and the analysis of the importance of variables, coefficients, significance, and odds ratios associated with land-use changes. The complete algorithm is formally presented and explained in Appendix A: Algorithm A1 (input, step 1 and step 2) and Algorithm A2 (step 3, step 4, step 5, step 6 and output).
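The indicators listed above all derive from the four cells of a binary confusion matrix. The sketch below, in plain Python with hypothetical counts, shows one way to compute them; the function name `binary_metrics` and the example counts are assumptions for illustration and do not reproduce the study's results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute the listed indicators from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # share of real change cells detected
    specificity = tn / (tn + fp)   # share of stable cells kept as stable
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical counts for a "change" vs "no change" classification.
m = binary_metrics(tp=40, fp=10, fn=20, tn=130)
print({name: round(value, 3) for name, value in m.items()})
```

With imbalanced classes, accuracy can remain high while Kappa and balanced accuracy drop, which is why the three are usually reported together.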
3.1. Data
The study area corresponds to the Bellavista neighborhood, located in the city of Tomé, Chile. This urban enclave of industrial origin preserves a significant collection of heritage buildings linked to the former textile industry. The territory has undergone substantial transformations in its urban morphology, driven by deindustrialization, real estate pressure, and the reconfiguration of public spaces. Due to its high concentration of historically significant elements and the intensity of recent dynamics of change, Bellavista constitutes a representative case for analyzing land-use transition patterns in heritage urban contexts.
The data used in this study come from the digitization and analysis of aerial orthophotos corresponding to the years 1992 (see Figure 5) and 2019 (see Figure 6), following the methodology previously documented in [52]. In this work, a cartographic database of land uses was developed through photointerpretation, complemented by cadastral information and field verifications. The unit of analysis corresponds to regular 1 × 1 m cells, on which a binary layer was built that distinguishes between change and no change in land use.
For the purposes of this analysis, the original classes were reorganized into a simplified typology composed of four main categories. First, the “Historical” class encompasses residential, commercial, amenity, and circulation uses established before 1992, which form the neighborhood’s heritage base. Second, the “Historical Open Space” class covers open spaces of historical origin (such as beaches, courtyards, and green areas); despite their informal nature, they play a structuring role in the urban configuration. Third, the “Historical Productive” class comprises industrial uses that are either disused or undergoing conversion, traditionally linked to the local economy. Finally, all uses emerging after 1992 (including new housing, recent commercial development, informal settlements, and coastal retaining structures) were grouped into the “New” category. The complete correspondence between the original uses and their reclassification is presented in Table 1.
Figure 7 illustrates the spatial evolution of land use in the Bellavista neighborhood between 1992 and 2019, as well as the precise locations of the cells that underwent transformations during this period. A significant expansion of the areas classified as “new use” is evident, concentrated mainly in sectors previously occupied by industrial buildings or historic open spaces. This dynamic reflects a reurbanization process that has altered the original structure of the heritage fabric, moving towards greater densification and increasing functional diversification of the territory.
Figure 8 reinforces this trend by showing, in percentage terms, a 24% increase in the area classified as “new use” between 1992 and 2019, accompanied by a significant reduction in areas designated for historical productive uses. This redistribution of land use suggests a reconfiguration of the urban landscape, where the obsolescence of industrial land has progressively led to its conversion to residential or mixed uses.
Figure 9 allows us to disaggregate the destination of the original urban land, revealing the proportion of each use category recorded in 1992 that was absorbed by new functions in 2019. It can be observed that a considerable portion of the land classified as “Historical Productive” land and “Historical Open Space” was transformed into recent residential or urban areas, highlighting the high vulnerability of industrial heritage to the pressures of urban development.
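The binary change layer and the disaggregation of 1992 classes by their 2019 destination can both be derived by comparing the two classified grids cell by cell. The following sketch shows the idea on a toy grid; the single-letter class codes (H, O, P, N) and the function names are assumptions for illustration, and the values are not taken from the study area.

```python
from collections import Counter


def change_mask(before, after):
    """1 where the land-use class differs between the two dates, else 0."""
    return [
        [int(b != a) for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]


def transition_shares(before, after):
    """Share of each initial class that ends up in each final class."""
    pairs, totals = Counter(), Counter()
    for row_b, row_a in zip(before, after):
        for b, a in zip(row_b, row_a):
            pairs[(b, a)] += 1
            totals[b] += 1
    return {pair: count / totals[pair[0]] for pair, count in pairs.items()}


# Hypothetical codes: H = Historical, O = Historical Open Space,
# P = Historical Productive, N = New.
grid_1992 = [["H", "H", "P"], ["O", "P", "P"]]
grid_2019 = [["H", "H", "N"], ["O", "N", "P"]]

mask = change_mask(grid_1992, grid_2019)
shares = transition_shares(grid_1992, grid_2019)
print(mask)
print(round(shares[("P", "N")], 3))
```

In this toy grid, two of the three productive cells become new uses, which is the kind of proportion a class-by-class breakdown reports.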
3.2. Driving Variables
The explanatory variables used in this study are based on the methodological proposal developed in [52], which aims to model urban dynamics in heritage territories, such as the Bellavista neighborhood of Tomé. This area has been the scene of various heritage activation processes and conflicts surrounding land use and the symbolic value of space, especially since the deindustrialization of the textile complex [53, 54, 55].
Based on this background and the territorial approach adopted, five driving variables of change have been incorporated, selected for their relevance in the literature and their ability to capture the factors that influence the urban transitions observed in the study area:
1. Distance to the “Historical Traffic Network”. Defined as the Euclidean distance between each cell of the model and the foundational road layout of the Bellavista sector. This structured network, inherited from the original industrial design, plays a central role in the current urban configuration and has been identified as a key axis in the functional transformation processes of the neighborhood [52].
2. Distance to the beach. This corresponds to the Euclidean distance from each cell to the coastline, an area of high real estate appreciation and subject to social disputes over its public use. As [54] warns, the seafront has been the scene of tensions between tourism, residential, and conservation interests, which directly impact the local dynamics of territorial transformation.
3. Historical Conservation Zone (HCZ). Delimited by the Tomé Municipal Regulatory Plan in 2005, this zone aims to safeguard the morphological value of the founding industrial area. The units located within this polygon are subject to specific regulatory restrictions regarding land use and building typologies in order to preserve their heritage integrity.
4. Sectional Zone of the Coastal Border (SZC). Incorporated into the Sectional Plan of the Coastal Border in 2012, this normative figure regulates the development of the seafront, aiming to harmonize ecological, tourist, and urban interests within a particularly sensitive and contested area [54].
5. Historic Monument Zone (HMZ). This corresponds to the protected area designated under Exempt Decree No. 222 of the Ministry of Education, issued in 1999, which declares the Bellavista–Oveja Tomé complex a National Historic Monument. This designation imposes legal restrictions on permitted interventions, uses, and modifications to safeguard its historical, architectural, and cultural value [53].
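The two distance variables are defined as Euclidean distances from each cell to the nearest cell of a reference feature. A minimal sketch of that computation on a small grid is given below; it uses a brute-force search that is only practical for toy grids, and the function name, the toy road layout, and the `cell_size` parameter are assumptions for illustration (a 1 m cell size is used to match the unit of analysis).

```python
import math


def distance_to_nearest(feature, cell_size=1.0):
    """Euclidean distance from every cell to the nearest feature cell."""
    targets = [
        (r, c)
        for r, row in enumerate(feature)
        for c, value in enumerate(row)
        if value
    ]
    if not targets:
        raise ValueError("the feature grid contains no feature cells")
    return [
        [
            cell_size * min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            for c in range(len(row))
        ]
        for r, row in enumerate(feature)
    ]


# Toy grid (hypothetical): 1 marks cells crossed by the historic road network.
road = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]
dist = distance_to_nearest(road)
print(dist[0])
```

The zone variables (HCZ, SZC, HMZ) would enter the same dataset as binary grids of the same shape, with 1 for cells inside the polygon and 0 elsewhere. For full-size rasters, a dedicated distance-transform routine would be preferable to this brute-force loop.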
5. Discussion
The results of this comparative study demonstrate that the analyzed methods (Logistic Regression, RPART, and Random Forest) offer complementary contributions to understanding land-use changes in heritage urban contexts, such as the Bellavista neighborhood of Tomé. While Random Forest stood out for its high predictive accuracy, both Logistic Regression and RPART presented advantages in terms of interpretability, an essential quality for supporting urban planning decisions based on evidence that is understandable to multiple stakeholders [45, 59].
5.1. Performance
In terms of predictive performance, the results indicate that the Random Forest model is the most robust and consistent for identifying transformation patterns in heritage urban contexts [63, 64]. Its ability to capture non-linear interactions and complex spatial structures allowed it to maintain high performance both in cross-validation (accuracy = 0.90; F1 = 0.82; Kappa = 0.75 on average) and in the test set (accuracy = 0.98; F1 = 0.96; Kappa = 0.95 in the New category), consolidating itself as the alternative with the greatest stability and generalization to support urban planning processes in heritage contexts [18, 21].
Logistic Regression presented significant limitations in categories with imbalance or non-linear relationships, as evidenced in Historical Open Space, where it achieved a Kappa of only 0.26 and an F1 of 0.44 in cross-validation. The RPART model showed intermediate performance overall but stood out particularly in this category, achieving an accuracy of 0.92, a Kappa of 0.66, and an adequate balance between sensitivity (0.61) and specificity (0.98), suggesting that its hierarchical structure was particularly effective at capturing perimeter patterns in areas with well-defined edges. Likewise, the results revealed instances of structural overfitting, especially in the category where high collinearity between the dependent variable and heritage regulations generated near-perfect metrics that do not necessarily translate into a high capacity for generalization. Finally, the incorporation of the Youden index (J) allowed for the definition of specific optimal thresholds for each category, avoiding the traditional cutoff point and improving the models’ discriminatory capacity. This refined adjustment strengthened classification coherence and provided a more transparent and defensible methodological criterion for its application in urban and heritage decision-making processes.
5.2. Interpretability
With regard to interpretability, notable differences arise among the approaches. Logistic Regression, although less accurate in predictive terms, facilitates the estimation of the marginal effect of each variable using odds ratios, which are essential for quantifying the risks or benefits associated with zoning decisions [45]. This approach is especially useful in public policymaking scenarios, where regulations need to be supported by clear and replicable statistical evidence [57].
In the field of urban planning and heritage conservation, this model offers two key advantages. First, it provides quantitative evidence of the effectiveness of regulatory instruments, allowing for an assessment of whether protected areas actually reduce the risk of territorial transformation. Second, the use of standardized criteria, such as statistical significance and confidence intervals, facilitates communication of results to decision-makers, which is essential for justifying regulatory adjustments, prioritizing vulnerable areas, and supporting conservation interventions based on verifiable evidence. However, it should be noted that the model exhibits significant residual spatial autocorrelation, reflecting the presence of territorial patterns that are not fully captured and constituting an inherent limitation of explanatory models.
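Residual spatial autocorrelation of the kind mentioned above is commonly summarized with Moran's I. The sketch below computes it on a small grid of residuals, assuming rook contiguity (cells sharing an edge are neighbors with weight 1); the source does not state which spatial weights were used, so that choice, the function name, and the toy grids are assumptions for illustration.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook (edge-sharing) neighbors."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[value - mean for value in row] for row in grid]
    cross, weight_sum = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    variance_sum = sum(d * d for row in dev for d in row)
    return (n / weight_sum) * (cross / variance_sum)


# Toy residual grids (hypothetical values).
clustered = [[1, 1, -1, -1] for _ in range(4)]           # similar values adjoin
checkerboard = [[(-1) ** (r + c) for c in range(4)] for r in range(4)]

i_clustered = morans_i(clustered)
i_checker = morans_i(checkerboard)
print(round(i_clustered, 3), round(i_checker, 3))
```

Positive values indicate that similar residuals cluster in space, meaning the predictors leave some territorial pattern unexplained; values near zero are consistent with spatially independent residuals.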
RPART provides significant value for urban planning by translating complex relationships into explicit, hierarchical, and communicable rules [63, 65]. Each partition of the tree is linked to a regulatory or morphological threshold, enabling the model’s results to be transparently associated with zoning criteria, land-use restrictions, and buffer zone delimitation. This capability to convert spatial patterns into operational statements (for example, “if the plot is not in a high-protection zone and the distance to the centerline exceeds X meters, then the risk of change increases”) facilitates its integration into territorial management instruments and enhances the traceability of decisions. Overall, RPART enables the identification of heritage risk areas based on the interaction between regulatory and territorial factors, providing usable evidence to prioritize interventions and guide conservation policies in heritage urban contexts.
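To make the notion of a tree threshold concrete, the sketch below searches for the single cutoff on one continuous variable that best separates change from no-change cells, using Gini impurity as a common splitting criterion for classification trees. It is a simplified illustration in plain Python, not the procedure of any particular package, and the distances and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single cutoff."""
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    # Candidate cutoffs are midpoints between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        t = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


# Hypothetical distances (m) to a road axis and observed change (1) or not (0).
distances = [5, 10, 15, 40, 60, 80]
changed = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(distances, changed)
print(threshold, impurity)
```

The resulting cutoff reads directly as a rule of the form "if distance exceeds the threshold, the cell is more likely to change", which is the kind of statement that can be compared against zoning limits.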
In the Random Forest model, the hierarchy of variables is derived from identifying which predictors most effectively separate cases of change from those of no change, that is, renewal processes versus conservation processes. In practical terms, the importance assigned by Random Forest (RF) indicates which territorial factors have the greatest discriminatory capacity to differentiate areas where the territory tends to remain stable from those where consistent patterns of urban transformation emerge. This prioritization constitutes a strategic input for planning, as it allows for focusing interventions on areas with greater heritage vulnerability and adjusting urban regulations based on quantitative and reproducible evidence. Thus, the model not only improves the understanding of the drivers of change but also strengthens the transparency of the decision-making process by offering objective criteria to guide conservation and land management policies.
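One generic way to rank predictors by discriminatory capacity is permutation importance: shuffle one variable, re-score the model, and record the drop in accuracy. The sketch below illustrates this with a stand-in classifier in plain Python; it is not necessarily the importance measure used in the study, and the function names, the toy data, and the rule inside `predict` are assumptions for illustration.

```python
import random


def permutation_importance(predict, rows, labels, n_features, repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(x) == y for x, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [x[j] for x in rows]
            rng.shuffle(column)
            shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / repeats)
    return importances


def predict(x):
    # Stand-in for a fitted model: change is predicted outside the zone.
    return int(x[0] == 0)


# Hypothetical data: feature 0 = inside a protection zone (0/1),
# feature 1 = an unrelated variable that the stand-in model ignores.
rows = [[i % 2, (i * 7) % 5] for i in range(20)]
labels = [int(x[0] == 0) for x in rows]
importance = permutation_importance(predict, rows, labels, n_features=2)
print(importance)
```

A variable the model does not rely on shows no accuracy drop when shuffled, while a variable that drives the predictions shows a clear drop, which is what produces the ranking.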
5.3. Performance vs. Interpretability
Overall, as summarized in Table 8, the results indicate that there is no universally optimal model; rather, the selection depends directly on the analytical objective and the type of evidence required by the planning processes. If the central purpose is to maximize predictive robustness and capture complex territorial patterns, the Random Forest model emerges as the most consistent alternative. When the focus is on quantifying marginal effects and rigorously evaluating the effectiveness of regulatory instruments, Logistic Regression offers irreplaceable advantages due to the clarity and traceability of its coefficients and odds ratios. Finally, if the goal is to communicate understandable and directly applicable decision rules in planning instruments, especially in contexts with perimeter structures or defined regulatory boundaries, RPART is the most suitable tool because of its ability to translate complex relationships into explicit thresholds. Consequently, the methodological choice must align with specific management needs: prediction, explanation, or regulatory communication, understood as complementary dimensions that, together, strengthen evidence-based urban planning and heritage conservation.
From an urban planning perspective, these findings suggest that modeling land-use change in heritage areas should combine approaches by integrating the predictive capacity of advanced algorithms with the readability of simpler models. This combination favors more effective urban heritage management by coordinating technical knowledge with regulatory frameworks and social participation [7, 53, 55, 61].
6. Conclusions and Future Work
In this work, we demonstrate the importance of comparatively evaluating different models to address specific issues in heritage planning and conservation. The study focuses on the Bellavista neighborhood of Tomé, a historic district where this type of analysis had not been previously applied. We analyze three main models (Random Forest, Logistic Regression, and RPART) and compare them not only in terms of predictive capacity but also regarding how each translates territorial factors into useful information for urban managers and conservationists. This dual approach, both predictive and interpretive, allows for a deeper understanding of the dynamics of urban transformation and their relationship to the area’s heritage vulnerability.
The aim was to determine how the different explanatory frameworks offered by these models can support the identification of heritage risks, the evaluation of regulatory effectiveness, and the prioritization of interventions in historic urban environments. The main findings regarding the performance and interpretability of the models are as follows: first, the Random Forest model can serve as an early warning system to identify critical areas of urban transformation, establishing a solid foundation for guiding territorial monitoring and prioritizing preventive interventions; second, Logistic Regression is a particularly valuable tool for evaluating the effectiveness of heritage protection zones, as its coefficients and odds ratios allow for rigorous quantification of the influence of each regulatory and morphological factor on changes; third, the RPART model stands out for its ability to translate spatial patterns into explicit regulatory rules, facilitating their direct incorporation into urban planning instruments such as regulatory plans, buffer zones, and differentiated densification criteria. Together, these approaches constitute a complementary and replicable methodological framework that enhances evidence-based urban decision-making and heritage management.
Future research should delve deeper into the multiscale validation of models through their comparative application in different cities. This would allow for the evaluation of their performance under varying regulatory, morphological, and territorial dynamics. It is also pertinent to integrate information sources from participatory processes (such as citizens’ perceptions of heritage value, everyday practices, and informal land use) in order to capture social dimensions that traditional spatial models do not typically represent. Post hoc analyses, such as SHAP values, could likewise be incorporated to break down the contribution of each variable to the model’s result. Incorporating these inputs would enhance the robustness and territorial sensitivity of the proposed approaches, strengthening their capacity to adapt to changing urban contexts and contributing to more comprehensive, inclusive, and evidence-based heritage planning.