Comparative Evaluation of Urban Expansion Mapping Methods in Diriyah Using GHSL, NDBI, and Unsupervised Classification

Alfehaid, Muhannad Mohammed

doi:10.3390/land15030510

by

Muhannad Mohammed Alfehaid

Department of Geography and GIS, College of Social Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 13318, Saudi Arabia

Land 2026, 15(3), 510; https://doi.org/10.3390/land15030510

Submission received: 9 February 2026 / Revised: 14 March 2026 / Accepted: 18 March 2026 / Published: 22 March 2026

Abstract

Accurate urban expansion mapping in dryland environments is essential for sustainable planning, infrastructure management, and heritage-sensitive development, yet it remains methodologically challenging because built-up surfaces often exhibit strong spectral similarity to bright bare soils. This study comparatively evaluates three widely used urban mapping approaches in Diriyah, Saudi Arabia, a rapidly transforming heritage district of high relevance to Saudi Vision 2030: the Global Human Settlement Layer (GHSL), the Normalized Difference Built-up Index (NDBI), and unsupervised k-means classification. Built-up extent was mapped for 2015, 2020, and 2025, and method performance was assessed using 150 stratified reference points interpreted from high-resolution imagery. The results reveal substantial quantitative differences among methods. GHSL produced the most conservative estimates of urban extent (2.80, 4.94, and 5.31 km²), while NDBI and unsupervised classification generated much larger and less realistic built-up areas due to spectral confusion with bright bare soil. Accuracy assessment confirmed the superiority of GHSL, which achieved the highest overall accuracy (0.88) and Kappa coefficient (0.83), compared with NDBI (0.53; 0.41) and unsupervised classification (0.61; 0.50). To support integrative interpretation, the study also developed a Hybrid Built-up Detection Model (HBDM), which combines the three outputs into a continuous urban intensity layer that helps distinguish persistent urban cores from uncertain transition zones. The findings demonstrate that conservative global built-up products provide a more reliable baseline than index-based or unsupervised methods in bright-soil dryland settings. More broadly, the study offers practical methodological guidance for urban monitoring and sustainable land management in desert cities undergoing rapid transformation under large-scale development agendas such as Saudi Vision 2030.

Keywords: urban expansion; Diriyah; GHSL; NDBI; unsupervised classification; dryland mapping

1. Introduction

Urban expansion mapping is a fundamental requirement for sustainable land management, infrastructure planning, environmental monitoring, and heritage conservation. Its importance becomes even greater in rapidly transforming dryland regions, where urban growth often occurs in landscapes dominated by bright bare soils, sparse vegetation, and highly reflective surfaces that complicate remote-sensing interpretation. In such contexts, inaccurate delineation of built-up land may distort estimates of growth, misinform planning priorities, and weaken the spatial evidence needed for managing sensitive heritage and development zones.

These challenges are particularly relevant in Saudi Arabia, where major development initiatives under Saudi Vision 2030 are accelerating urban transformation across historically and environmentally significant areas. Diriyah represents a highly important case in this regard. As the historic core of the first Saudi state and a UNESCO-recognized heritage landscape, Diriyah is simultaneously a site of cultural preservation, tourism investment, and rapid urban redevelopment. Monitoring its urban expansion therefore requires methods that are not only technically robust but also spatially reliable in a dryland setting where built-up surfaces are easily confused with exposed soil and construction-related land cover.

Remote sensing provides powerful tools for monitoring urban change across large areas and multiple time periods, but the performance of commonly used methods varies considerably by environmental context. Spectral indices such as the Normalized Difference Built-up Index (NDBI) are widely used because of their simplicity and operational efficiency, yet they are known to perform inconsistently in arid and semi-arid environments where bare land exhibits reflectance properties similar to urban materials. Unsupervised classification offers greater flexibility by allowing class boundaries to emerge from the data, but it may still confuse fragmented built-up patches with bright non-urban surfaces when spectral separability is weak. In contrast, global built-up products such as the Global Human Settlement Layer (GHSL) provide standardized and temporally consistent datasets derived from broader training and multi-source classification frameworks, potentially offering more stable performance in difficult mapping environments.

Despite extensive work on urban mapping, an important gap remains in the comparative evaluation of these approaches within dryland heritage districts undergoing rapid transformation. Much of the literature focuses on single-method applications, broad metropolitan analyses, or humid and mixed-land-cover settings where spectral confusion is less severe. Fewer studies have systematically compared global built-up products, spectral-index methods, and unsupervised classification in a heritage-sensitive desert environment while also examining how their outputs diverge spatially and temporally. This gap is important because method choice directly affects estimates of urban growth and, by extension, the planning decisions derived from them.

Accordingly, this study comparatively evaluates three widely used urban expansion mapping approaches in Diriyah for 2015, 2020, and 2025: GHSL, NDBI, and unsupervised k-means classification. In addition to comparing their mapped urban extents and accuracy metrics, the study introduces a simple Hybrid Built-up Detection Model (HBDM) as an integrative diagnostic layer that synthesizes the outputs of the three methods into a continuous urban intensity surface. Rather than replacing formal accuracy assessment, the hybrid model is intended to support interpretation by highlighting persistent built-up cores, transitional zones, and areas of methodological disagreement. Through this framework, the study contributes empirical guidance for dryland urban mapping and offers planning-relevant insights for heritage conservation and sustainable urban development in Saudi Arabia and comparable desert cities.

2. Literature Review

2.1. Challenges in Dryland Urban Mapping

Urban expansion mapping is a long-established application of remote sensing, but its reliability remains strongly conditioned by environmental context. In dryland cities, built-up detection is particularly difficult because impervious surfaces often share similar spectral responses with bright bare soils, exposed rock, construction areas, and sparsely vegetated land. As a result, urban boundaries may be exaggerated, transitional zones may be misclassified, and temporal growth trends may be distorted if methods are transferred from humid or mixed-land-cover settings without local calibration. Previous studies from Saudi Arabia and other arid environments have shown that these conditions reduce separability between built-up and non-built classes and increase the risk of both commission error and spatial overestimation [1,2,3,4]. More recent work has reinforced the same concern in dry climates, showing that impervious-surface extraction remains sensitive to the spectral similarity between urban materials and surrounding barren land, especially when using medium-resolution optical imagery alone [5,6].

2.2. Spectral Indices for Urban Detection

To address the challenge of built-up extraction, a wide range of spectral indices has been developed. The Normalized Difference Built-up Index (NDBI) remains one of the most widely used methods because it offers a simple and operational way to emphasize built-up surfaces using near-infrared and shortwave-infrared reflectance [7]. Subsequent work expanded this family of methods through thematic combinations designed to suppress vegetation and water signals and improve urban delineation [8,9]. In dry and semi-arid environments, however, index-based methods remain vulnerable to soil background effects, surface brightness, and threshold instability, which often require local tuning and cross-checking with reference imagery [10]. Recent evaluations in dry climates have confirmed that single-index approaches are rarely sufficient on their own, and that combinations of indices or additional contextual constraints may improve the distinction between built-up land and bare surfaces [5,6]. These findings are important for the present study because they suggest that NDBI is useful as a comparative baseline, but not necessarily as a stand-alone solution in heritage-sensitive desert settings.

2.3. Unsupervised and Machine Learning Approaches

Beyond threshold-based indices, unsupervised and machine learning methods provide greater flexibility because class boundaries are inferred from the data rather than fixed a priori. K-means clustering and related unsupervised approaches are often attractive for exploratory mapping because they can reveal dominant spectral groupings without requiring labelled training samples. Yet in dryland environments their performance may still be constrained by weak spectral contrast and mixed clusters that combine bare soil, construction surfaces, and impervious materials into the same class [11,12,13]. More recent studies indicate that improved performance can be achieved when optical imagery is complemented by radar, topographic, thermal, or textural information, or when deep learning architectures are used for impervious-surface extraction [14,15,16,17]. At the same time, these more advanced approaches often require larger training datasets, more complex feature engineering, or computationally intensive workflows. This makes them valuable reference points for future work, while preserving the relevance of simpler comparative designs when the research objective is to evaluate the behavior of commonly used approaches under a difficult dryland mapping scenario.

2.4. Global Built-Up Products, Hybrid Approaches, and the Research Gap

Global built-up products offer a different pathway by providing standardized, multi-temporal representations of settlement patterns derived from large-scale supervised frameworks. Among these, the Global Human Settlement Layer (GHSL) has become one of the most widely used datasets for tracking built-up dynamics and supporting regional and global urban analysis [18,19]. Evaluations suggest that GHSL performs particularly well in delineating established urban cores and providing temporally consistent baselines, although it may underrepresent fragmented, low-density, or newly emerging development at the urban fringe [20]. Recent advances in high-resolution impervious-surface mapping and hybrid urban extraction also point toward the benefits of integrating multiple data sources rather than relying on a single method. Examples include fused optical-SAR workflows, texture-enhanced impervious-surface mapping, and new 10 m products that aim to improve spatial completeness and thematic accuracy [14,15,16,17]. Despite this progress, a gap remains in the comparative evaluation of GHSL, spectral-index mapping, and unsupervised clustering within dryland heritage districts undergoing rapid transformation. Much of the literature focuses on single-method applications, high-resolution AI workflows, or broader metropolitan settings. Fewer studies have directly examined how these different methodological families diverge in a high-reflectance desert landscape where method choice can materially alter estimates of urban growth and planning interpretation. This gap is especially relevant in Diriyah, where accurate mapping is needed not only for measuring expansion but also for supporting heritage protection, landscape management, and sustainable planning under Saudi Vision 2030.

3. Data and Methodology

3.1. Study Area and Research Design

Diriyah is located northwest of Riyadh in central Saudi Arabia (Figure 1) and represents one of the country’s most historically and culturally significant urban landscapes. It includes At-Turaif, a UNESCO World Heritage Site and the historic core of the first Saudi state, while also forming part of a rapidly transforming district shaped by tourism development, residential expansion, infrastructure investment, and heritage-led redevelopment under Saudi Vision 2030. This dual character makes Diriyah an especially suitable case for evaluating urban expansion mapping methods in a dryland heritage setting, where accurate delineation of built-up growth is important not only for land monitoring, but also for spatial planning and conservation-oriented decision-making.

The study was designed as a comparative, multi-method assessment of urban expansion across three benchmark years: 2015, 2020, and 2025. These years were selected to capture pre-expansion conditions, an intermediate stage of development, and the most recent phase of accelerated transformation in Diriyah. Three widely used yet methodologically distinct approaches were evaluated: (1) the Global Human Settlement Layer (GHSL), which represents a standardized global built-up product; (2) the Normalized Difference Built-up Index (NDBI), which represents a spectral-index-based method; and (3) unsupervised k-means clustering, which represents a simple data-driven classification approach. Taken together, these methods provide a practical framework for examining how method choice influences estimates of built-up extent in a high-reflectance dryland environment.

Figure 2 summarizes the methodological workflow of the study, including the main data inputs, the three mapping streams, the validation procedure, and the derivation of the Hybrid Built-up Detection Model (HBDM).

3.2. Data Sources and Method Selection

The analysis uses multi-temporal satellite imagery and GHSL built-up surfaces to map urban change in Diriyah for 2015, 2020, and 2025. Cloud-free imagery with consistent seasonal timing and comparable spatial resolution was selected to minimize temporal noise and improve inter-year comparability (Table 1). Official GHSL built-up layers were used for 2015 and 2020 and clipped to the Diriyah boundary. Because no official GHSL layer was available for 2025, a GHSL-based extension surface was derived by combining the 2020 GHSL layer with visually interpreted new built-up patches identified from high-resolution imagery acquired between 2018 and 2024. This derived layer is not treated as an official GHSL product; instead, it is used as an indicative extension to approximate recent built-up expansion while preserving the conservative logic of the GHSL baseline.
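The 2025 extension step can be pictured as a simple raster union. The sketch below is illustrative only: it assumes that both the 2020 GHSL layer and the visually interpreted patches have already been rasterized to binary masks on the same grid, which the text does not state, and the function name `extend_built_up` is hypothetical.

```python
import numpy as np

def extend_built_up(base_mask, new_patches):
    """Union of a baseline built-up mask with newly interpreted patches.

    Both inputs are assumed to be binary rasters on the same grid
    (an assumption of this sketch, not a documented GHSL workflow).
    Returns the extended mask and the pixels that were newly added.
    """
    base = np.asarray(base_mask, dtype=bool)
    new = np.asarray(new_patches, dtype=bool)
    if base.shape != new.shape:
        raise ValueError("masks must share the same grid")
    extended = base | new          # built-up in either layer
    added = extended & ~base       # built-up only because of the new patches
    return extended, added
```

Keeping the added pixels as a separate layer makes it easy to report how much of the 2025 footprint comes from interpretation rather than from the official product.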

The three selected methods were chosen because they represent different and commonly used families of urban mapping approaches, each with distinct strengths and limitations in dryland settings. GHSL was included because it provides a globally calibrated and temporally consistent representation of built-up land, making it a useful baseline for comparison. NDBI was selected because it is one of the most widely applied spectral indices for urban detection and offers a transparent, low-complexity way of highlighting built-up surfaces. Unsupervised k-means clustering was included because it allows land-surface patterns to emerge directly from the spectral data without requiring labelled training samples, thereby serving as an exploratory classification baseline. Comparing these three approaches makes it possible to assess not only differences in mapped urban extent, but also the trade-offs between standardized global products, index-based extraction, and unsupervised classification under the same environmental conditions [21,22,23].

3.3. NDBI-Based Built-Up Mapping

For the NDBI-based approach, the Normalized Difference Built-up Index was calculated for each study year using the standard formulation:

NDBI = (SWIR − NIR)/(SWIR + NIR)

where SWIR is the shortwave infrared reflectance and NIR is the near-infrared reflectance.

Initial thresholds were guided by previous applications in dryland and semi-arid environments and then refined iteratively through visual comparison with high-resolution imagery in order to reduce confusion with bright bare soils and active construction areas. To further improve separation from non-urban surfaces, pixels exceeding the final NDBI threshold and falling below a simple NDVI threshold were classified as built-up. This produced binary built-up masks and associated area estimates for each year. The procedure balances methodological transparency with the need for local calibration in spectrally complex dryland landscapes.
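The index-plus-threshold rule described above can be sketched as follows. This is a minimal illustration rather than the study's implementation: the threshold values `ndbi_min` and `ndvi_max` and the 30 m pixel size are placeholder assumptions, since the final thresholds and the sensor resolution are not given in this section.

```python
import numpy as np

def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def built_up_mask(swir, nir, red, ndbi_min=0.0, ndvi_max=0.2):
    """Built-up where NDBI exceeds its threshold and NDVI stays below its own.

    ndbi_min and ndvi_max are hypothetical placeholders; the text
    describes them as locally tuned against high-resolution imagery.
    """
    ndbi = normalized_difference(swir, nir)   # (SWIR - NIR) / (SWIR + NIR)
    ndvi = normalized_difference(nir, red)    # (NIR - Red) / (NIR + Red)
    return (ndbi > ndbi_min) & (ndvi < ndvi_max)

def area_km2(mask, pixel_size_m=30.0):
    """Area of the True pixels; pixel_size_m is an assumed resolution."""
    return float(np.count_nonzero(mask)) * pixel_size_m ** 2 / 1e6
```

Applying `area_km2(built_up_mask(swir, nir, red))` to each year's bands would give per-year area estimates of the kind the section describes.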

3.4. Unsupervised k-Means Classification

Unsupervised k-means clustering was applied to the multispectral imagery for each year and treated here as a simple exploratory machine learning approach. After testing several cluster configurations, k = 5 was selected as a practical compromise between capturing spectral variability and maintaining interpretability. This number allowed the imagery to be partitioned into a manageable set of land-surface groups while preserving sufficient flexibility to distinguish built-up areas from vegetation, bare land, and transitional surfaces.

The cluster most closely corresponding to built-up land was identified through examination of cluster centroids and visual comparison with high-resolution imagery. The same general spectral and contextual logic was applied across all three years to maintain inter-temporal consistency. Although this approach does not eliminate spectral ambiguity, it provides a useful data-driven baseline for assessing how far unsupervised grouping alone can distinguish urban surfaces in a dryland setting characterized by high surface brightness [24,25].
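For readers unfamiliar with the procedure, the clustering step can be sketched with a plain NumPy version of Lloyd's k-means algorithm. This is an assumption-laden illustration: the software used in the study is not named, the function names are hypothetical, and the choice of the built-up cluster remains a manual decision based on centroid inspection, as described above.

```python
import numpy as np

def nearest(pixels, centroids):
    """Index of the closest centroid (squared Euclidean distance) per pixel."""
    d2 = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(pixels, k=5, n_iter=100, seed=0):
    """Basic Lloyd's algorithm; initial centroids are k distinct random pixels."""
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(pixels, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):               # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest(pixels, centroids), centroids

def cluster_image(stack, k=5, seed=0):
    """Cluster a (bands, rows, cols) image stack; returns a label image and centroids."""
    stack = np.asarray(stack, dtype=float)
    bands, rows, cols = stack.shape
    pixels = stack.reshape(bands, -1).T    # one row of band values per pixel
    labels, centroids = kmeans(pixels, k=k, seed=seed)
    return labels.reshape(rows, cols), centroids
```

Once an analyst has inspected the centroids and chosen the built-up cluster, the binary mask is `label_image == chosen_cluster`. Because cluster numbering is arbitrary and can change between runs and years, the chosen index has to be re-identified for each image.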

3.5. Accuracy Assessment and Reference Points

To evaluate classification performance quantitatively, 150 stratified random reference points were generated across the study area and visually interpreted from high-resolution imagery for 2015, 2020, and 2025 as either built-up or non-built. Stratification was used to ensure that the validation sample represented both urbanized and non-urbanized surfaces rather than being dominated by the more extensive non-built background. This is especially important in dryland environments, where class imbalance can artificially inflate apparent accuracy when most validation points fall within the dominant bare-land class.
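A stratified random draw of this kind can be sketched as below. The allocation of the 150 points between strata and the layer used to define the strata are not stated in the text, so the `n_per_stratum` argument and the stratum raster are hypothetical inputs.

```python
import numpy as np

def stratified_points(stratum_map, n_per_stratum, seed=0):
    """Draw random pixel locations without replacement from each stratum.

    stratum_map: 2-D integer raster of stratum codes (e.g. 1 = built-up, 0 = non-built).
    n_per_stratum: dict mapping stratum code -> number of points (assumed allocation).
    Returns a list of (row, col, stratum) tuples.
    """
    stratum_map = np.asarray(stratum_map)
    rng = np.random.default_rng(seed)
    points = []
    for stratum, n in n_per_stratum.items():
        rows, cols = np.nonzero(stratum_map == stratum)
        if len(rows) < n:
            raise ValueError(f"stratum {stratum} has fewer than {n} pixels")
        picks = rng.choice(len(rows), size=n, replace=False)
        points.extend((int(rows[i]), int(cols[i]), int(stratum)) for i in picks)
    return points
```

Fixing the number of points per stratum, rather than sampling the whole scene uniformly, is what prevents the extensive non-built background from dominating the validation set.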

The sample size of 150 points was considered appropriate for this study because the analysis focuses on a relatively compact case-study area, compares broad built-up versus non-built classes rather than a large multi-class scheme, and is intended to provide a robust comparative assessment of method behavior rather than a cadastral inventory. Each method’s output at the reference locations was compared against the visually interpreted labels using confusion matrices.

Four standard accuracy metrics were calculated: overall accuracy (OA), user’s accuracy (UA), producer’s accuracy (PA), and the Kappa coefficient. These were computed as follows:

\[ \mathrm{OA} = \frac{\sum_{i} x_{ii}}{N} \]

\[ \mathrm{UA}_i = \frac{x_{ii}}{x_{i+}} \]

\[ \mathrm{PA}_i = \frac{x_{ii}}{x_{+i}} \]

\[ \mathrm{Kappa} = \frac{N \sum_{i} x_{ii} - \sum_{i} x_{i+}\, x_{+i}}{N^{2} - \sum_{i} x_{i+}\, x_{+i}} \]

where $x_{ii}$ is the number of correctly classified observations for class $i$, $x_{i+}$ is the total number of observations assigned to class $i$, $x_{+i}$ is the total number of reference observations in class $i$, and $N$ is the total number of validation points. Together, these metrics provide a more complete assessment of performance than area comparison alone by capturing both overall agreement and class-specific omission and commission error [26].
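The four metrics can be computed directly from a confusion matrix, as in the following sketch. The matrix orientation (rows as mapped classes, columns as reference classes) follows the definitions of the row and column totals given above; the example counts in the usage note are invented for illustration and are not the study's data.

```python
def accuracy_metrics(matrix):
    """OA, per-class UA and PA, and Kappa from a square confusion matrix.

    matrix[i][j] = number of points mapped as class i whose reference class is j.
    Assumes every class has at least one mapped and one reference point.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(k)]
    row_tot = [sum(matrix[i]) for i in range(k)]                       # mapped totals
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference totals
    chance = sum(r * c for r, c in zip(row_tot, col_tot))
    return {
        "OA": sum(diag) / n,
        "UA": [diag[i] / row_tot[i] for i in range(k)],
        "PA": [diag[i] / col_tot[i] for i in range(k)],
        "Kappa": (n * sum(diag) - chance) / (n * n - chance),
    }
```

For a hypothetical two-class matrix `[[40, 10], [5, 45]]` (class 0 = built-up), the function gives OA = 0.85, UA = 0.80 and PA of about 0.89 for the built-up class, and Kappa = 0.70.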

A map of the reference points used in the accuracy assessment is shown in Figure 3.

3.6. Hybrid Built-Up Detection Model (HBDM)

In addition to the three primary methods, a Hybrid Built-up Detection Model (HBDM) was constructed as an integrative diagnostic layer rather than as a separate classifier. Its purpose is not to replace the formal accuracy assessment or to claim a new ground-truth-validated classification product. Instead, it combines the outputs of GHSL, NDBI, and unsupervised k-means clustering into a continuous urban intensity surface that highlights zones of convergence and disagreement among methods.

To ensure pixelwise comparability, all input layers were aligned to a common spatial grid. The GHSL and NDBI layers were treated as continuous surfaces and normalized to a common range between 0 and 1 using min–max scaling:

X_norm = (X − X_min)/(X_max − X_min)

where normalized values were clipped to the [0, 1] range. The built-up result derived from the unsupervised classification was converted into a binary built-up mask, where a value of 1 indicates built-up and 0 indicates non-built. In the present study, Cluster 0 was identified as the built-up class based on inspection of cluster characteristics and visual comparison with high-resolution imagery.

The HBDM was then computed as a weighted linear combination of the normalized GHSL and NDBI layers and the binary clustering mask:

HBDM = 0.55 · GHSL_norm + 0.25 · NDBI_norm + 0.20 · Cluster_built

The weighting structure was designed to reflect the comparative reliability observed in the study. GHSL was assigned the highest weight because it showed the strongest overall accuracy and the clearest temporal consistency, while NDBI and the clustering layer were assigned lower supporting weights because both were more affected by commission error under dryland conditions. In this sense, the hybrid layer is accuracy-informed rather than arbitrarily weighted.
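The weighted combination can be sketched in a few lines of NumPy. The weights are those reported in the text; the function names and the handling of constant-valued layers are assumptions of this sketch, and all inputs are assumed to be already aligned to the same grid.

```python
import numpy as np

def minmax(x):
    """Min-max scale to [0, 1]; a constant layer maps to all zeros (assumed convention)."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanmin(x), np.nanmax(x)
    if hi == lo:
        return np.zeros_like(x)
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

def hbdm(ghsl, ndbi, cluster_built, weights=(0.55, 0.25, 0.20)):
    """Weighted sum of normalized GHSL, normalized NDBI and a binary cluster mask."""
    w_ghsl, w_ndbi, w_cluster = weights
    return (w_ghsl * minmax(ghsl)
            + w_ndbi * minmax(ndbi)
            + w_cluster * np.asarray(cluster_built, dtype=float))
```

Because the weights sum to 1 and every input lies in [0, 1], the output also lies in [0, 1]. A pixel flagged only by NDBI and clustering can reach at most 0.45, so it cannot outrank a pixel with full GHSL support, which scores at least 0.55.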

The resulting HBDM surface is interpreted as a relative urban-intensity layer for visual comparison and diagnostic interpretation, especially for distinguishing persistent urban cores, transition zones, and areas of greater methodological uncertainty. It is therefore used as a supportive interpretive product rather than as an independent classification to be evaluated in the accuracy assessment.

4. Results

The three evaluated methods produced substantially different estimates of built-up area and contrasting spatial representations of urban growth in Diriyah. Table 2 summarizes the built-up area mapped by GHSL, NDBI, and unsupervised k-means clustering for 2015, 2020, and 2025, while Figures 4–7 present the corresponding spatial outputs. Taken together, the results reveal clear methodological divergence not only in the amount of built-up land identified by each approach, but also in their ability to represent a plausible trajectory of urban growth in a dryland heritage landscape.

4.1. Comparative Built-Up Area Estimates and Temporal Change

Built-up area estimates varied markedly across methods. GHSL mapped 2.80 km² in 2015, 4.94 km² in 2020, and 5.31 km² in 2025, indicating a progressive and spatially plausible increase in urban extent over time. In contrast, NDBI mapped 36.28 km², 35.75 km², and 22.67 km² for the same years, while unsupervised clustering mapped 35.70 km², 32.91 km², and 32.05 km², respectively. These values are far larger than the GHSL estimates and do not reflect a realistic growth trajectory for Diriyah.

The contrast becomes even clearer when temporal change is considered explicitly. As shown in Table 2, GHSL indicates an increase of 2.14 km² between 2015 and 2020 and a further 0.37 km² between 2020 and 2025, corresponding to a total gain of 2.51 km² across the study period. By comparison, NDBI shows a slight decline of 0.53 km² between 2015 and 2020 and a much larger decline of 13.08 km² between 2020 and 2025, yielding a total change of −13.61 km². Unsupervised clustering also produces a negative trajectory, with −2.79 km² from 2015 to 2020 and −0.86 km² from 2020 to 2025, for a total change of −3.65 km².
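The change figures quoted above follow directly from the mapped areas, and the arithmetic can be reproduced with a short script. The area values are those reported in the text; the data structure and function name are illustrative.

```python
# Built-up area in km² per method and year, as reported in the text
AREAS_KM2 = {
    "GHSL": {2015: 2.80, 2020: 4.94, 2025: 5.31},
    "NDBI": {2015: 36.28, 2020: 35.75, 2025: 22.67},
    "k-means": {2015: 35.70, 2020: 32.91, 2025: 32.05},
}

def period_changes(series):
    """Change between consecutive years and over the whole period, rounded to 2 decimals."""
    years = sorted(series)
    steps = {(a, b): round(series[b] - series[a], 2) for a, b in zip(years, years[1:])}
    total = round(series[years[-1]] - series[years[0]], 2)
    return steps, total

for method, series in AREAS_KM2.items():
    steps, total = period_changes(series)
    print(method, steps, "total:", total)
```

Only GHSL yields positive changes in both periods; the other two series produce negative totals.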

Given the documented transformation of Diriyah during this period, these negative trajectories should not be interpreted as evidence of actual urban contraction. Rather, they reflect methodological artefacts arising from over-classification and temporal instability. This comparison demonstrates that method choice affects not only the magnitude of mapped built-up area, but also the apparent direction and pace of change itself.

4.2. GHSL Results

Among the three approaches, GHSL provides the most spatially coherent and temporally credible representation of urban growth in Diriyah. Across all three years, the mapped built-up footprint remains relatively compact and expands gradually around the established urban core. In 2015, built-up pixels are concentrated mainly in the historic district and adjacent settled areas. By 2020, the mapped extent expands around the existing core, consistent with the early phases of recent development. In 2025, the GHSL-based footprint shows additional infill and limited outward growth toward newly developed zones.

Overall, GHSL portrays Diriyah as a relatively compact urban landscape undergoing steady expansion rather than diffuse sprawl. This pattern is consistent with the known development trajectory of the district and aligns more closely with the broader transformation associated with heritage-led redevelopment and Vision 2030.

4.3. NDBI Results

The NDBI-derived outputs differ sharply from the GHSL results and indicate substantial overestimation of built-up extent in all years. In 2015 and 2020, the NDBI maps classify extensive bright surfaces surrounding Diriyah as built-up, including rocky terrain, bare wadi floors, and highly reflective barren land. As a result, the mapped urban extent is inflated to roughly 13 times the GHSL estimate in 2015 (36.28 versus 2.80 km²) and roughly 7 times in 2020 (35.75 versus 4.94 km²).

The limited change between 2015 and 2020 suggests that NDBI had already saturated much of the scene with built-up labels at the beginning of the analysis period. Although the 2025 NDBI result appears somewhat more constrained, with greater concentration near the urban core, it still substantially overestimates the true built-up footprint. These results indicate that NDBI is highly sensitive to bright-soil conditions and therefore struggles to distinguish real urban growth from non-urban high-reflectance surfaces in Diriyah.

In practical terms, NDBI captures built-up features, but under these dryland conditions it also introduces widespread commission error. This makes it unsuitable, on its own, for reliable estimation of urban extent in the study area.

4.4. Unsupervised k-Means Clustering Results

The unsupervised k-means clustering approach also produces built-up area estimates that are much larger than those obtained from GHSL and broadly comparable to those derived from NDBI. In 2015, the selected built-up cluster extends across large parts of the study area, including many bright bare-soil surfaces surrounding Diriyah. In 2020 and 2025, the mapped built-up cluster becomes somewhat more compact, with some peripheral areas reassigned to other spectral classes, but it still covers a much broader area than the actual urban footprint.

The slight decline in mapped built-up area between 2015 and 2025 is counterintuitive and should not be interpreted as true urban shrinkage. Instead, it reflects year-to-year redistribution of spectral classes within a method that remains sensitive to weak separability between impervious surfaces and highly reflective desert soils. Compared with NDBI, clustering produces somewhat more contiguous spatial patterns, but it remains affected by the same underlying problem of spectral confusion.

In practical terms, the clustering approach identifies many actual urban pixels, yet it also misclassifies extensive non-urban surfaces as built-up. This limits its usefulness as a standalone basis for reliable built-up statistics in Diriyah.

4.5. Accuracy Assessment

The quantitative accuracy assessment based on 150 stratified random reference points confirms the superiority of GHSL over the other two methods. GHSL achieved the highest overall accuracy (0.88) and Kappa coefficient (0.83), together with strong user’s accuracy (0.91) and producer’s accuracy (0.86) for the built-up class. In contrast, NDBI produced substantially weaker results, with an overall accuracy of 0.53, Kappa of 0.41, user’s accuracy of 0.49, and producer’s accuracy of 0.71. Unsupervised clustering performed slightly better than NDBI but remained clearly below GHSL, with an overall accuracy of 0.61, Kappa of 0.50, user’s accuracy of 0.57, and producer’s accuracy of 0.79 (Table 3).

These differences are too large to be treated as minor variation among otherwise similar methods. Instead, they indicate clear differences in the ability of the three approaches to represent built-up land under dryland conditions. The contrast between user’s and producer’s accuracy is also revealing. Both NDBI and clustering show higher producer’s accuracy than user’s accuracy, meaning that they capture many true built-up pixels but do so at the cost of substantial commission error. This pattern is consistent with the visual results, where non-urban bright surfaces are frequently misclassified as built-up.
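The link between the two class-level accuracies and the two error types can be made explicit: commission error is the complement of user's accuracy, and omission error is the complement of producer's accuracy. The sketch below applies these standard definitions to the built-up class values reported above; the dictionary layout is illustrative.

```python
# Built-up class user's accuracy (UA) and producer's accuracy (PA) as reported in the text
REPORTED = {
    "GHSL": {"UA": 0.91, "PA": 0.86},
    "NDBI": {"UA": 0.49, "PA": 0.71},
    "k-means": {"UA": 0.57, "PA": 0.79},
}

def error_rates(ua, pa):
    """Commission error = 1 - UA; omission error = 1 - PA."""
    return {"commission": round(1 - ua, 2), "omission": round(1 - pa, 2)}

ERRORS = {method: error_rates(v["UA"], v["PA"]) for method, v in REPORTED.items()}
```

On these figures, about half of the pixels NDBI labels as built-up are not built-up in the reference data (commission error 0.51), compared with 0.43 for clustering and 0.09 for GHSL.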

GHSL, in contrast, shows a more balanced performance across all metrics and substantially fewer false positives. Quantitatively, GHSL exceeds NDBI by 0.35 in overall accuracy and 0.42 in Kappa, and exceeds unsupervised clustering by 0.27 in overall accuracy and 0.33 in Kappa. These margins confirm that GHSL provides the most reliable built-up mapping approach among the methods tested in Diriyah.

4.6. HBDM Results and Interpretive Role

The Hybrid Built-up Detection Model (HBDM) was used as a continuous urban intensity layer to synthesize the three mapping outputs and to visualize zones of convergence and uncertainty rather than to produce a separate binary classification. The HBDM maps highlight a compact but gradually expanding urban core around historic Diriyah. In 2015, high values are concentrated mainly in the historic core and adjacent settled areas. By 2020 and 2025, these zones extend outward toward newly developed corridors and project areas.

Compared with GHSL, HBDM provides a smoother gradation between clearly built-up areas and transitional zones. Compared with NDBI and clustering, it reduces the visual dominance of bright-soil overestimation by anchoring the interpretation more strongly to the more reliable GHSL baseline. In this sense, HBDM is most useful as a diagnostic layer that helps distinguish persistent urban cores from uncertain edges and areas of methodological disagreement.

Its value therefore lies in interpretation rather than validation. HBDM does not replace the formal accuracy assessment, but it adds an integrative perspective that is particularly useful in a dryland environment where the boundary between built-up, transitional, and reflective non-urban surfaces is often difficult to represent with a single method alone.

In the HBDM maps, higher values indicate stronger agreement among methods and more persistent built-up intensity, while intermediate values highlight transition zones and areas of greater methodological uncertainty.

5. Discussion

This study shows that estimates of urban expansion in Diriyah vary substantially according to the mapping approach used, and that these differences are large enough to influence both spatial interpretation and planning relevance. Although GHSL, NDBI, and unsupervised k-means clustering were all applied to identify built-up land, they did not produce minor variations around a shared urban pattern. Instead, they generated markedly different representations of the scale, configuration, and temporal direction of urban growth. This finding confirms that method selection in dryland urban mapping is not a routine technical step, but a substantive analytical decision with direct consequences for how urban change is understood and communicated.

5.1. Why GHSL Provides the Most Reliable Baseline in Diriyah

Among the evaluated methods, GHSL provides the most defensible baseline for monitoring urban expansion in Diriyah. Its outputs are spatially compact, temporally coherent, and strongly supported by the accuracy assessment. In a heritage-sensitive district such as Diriyah, a conservative tendency is methodologically preferable to widespread commission error, because overestimating built-up land can distort the perceived scale, location, and pace of transformation. The GHSL results indicate that Diriyah has undergone clear but spatially concentrated growth, with expansion occurring mainly through infill and contiguous outward development rather than diffuse, landscape-wide sprawl.

This interpretation is important in planning terms. In rapidly changing desert environments, a method that slightly underestimates fringe growth may still be more useful than one that systematically exaggerates urban extent across large non-urban areas. In this case, GHSL offers a more credible balance between sensitivity and reliability, making it particularly suitable as a monitoring baseline for the urban core and its immediate expansion zones.

5.2. Why NDBI and Unsupervised Clustering Overestimate Built-Up Extent

The weaker performance of NDBI and unsupervised clustering is closely related to the environmental characteristics of Diriyah. In high-reflectance dryland settings, bare soils, rocky surfaces, disturbed ground, and construction-related land cover may exhibit spectral responses similar to impervious materials. Under such conditions, methods that rely primarily on spectral contrast are especially vulnerable to commission error. NDBI is particularly sensitive to scene brightness and can classify extensive non-urban surfaces as built-up when threshold separation is weak. Unsupervised clustering reduces some of the dispersion visible in NDBI by producing broader and more contiguous spatial regions, but it remains governed by the same limitation of weak spectral separability.
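The spectral confusion described above can be illustrated with the standard NDBI formula, (SWIR1 − NIR) / (SWIR1 + NIR), which for Landsat 8 OLI corresponds to bands 6 and 5. The sketch below is not the processing chain used in this study: the reflectance values are invented for illustration rather than measured in Diriyah, and the threshold of 0 is a hypothetical choice.

```python
import numpy as np

def ndbi(swir1, nir):
    """Normalized Difference Built-up Index: (SWIR1 - NIR) / (SWIR1 + NIR)."""
    swir1 = np.asarray(swir1, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = swir1 + nir
    # Leave NaN where the denominator is zero instead of dividing by zero.
    out = np.full(denom.shape, np.nan)
    np.divide(swir1 - nir, denom, out=out, where=denom != 0)
    return out

# Hypothetical surface reflectances (illustrative values only).
labels = ["built-up", "bright bare soil", "vegetation"]
nir = np.array([0.30, 0.40, 0.45])    # Landsat 8 OLI band 5
swir1 = np.array([0.36, 0.48, 0.20])  # Landsat 8 OLI band 6

values = ndbi(swir1, nir)
built_mask = values > 0.0  # hypothetical threshold

for name, v, flag in zip(labels, values, built_mask):
    print(f"{name}: NDBI = {v:.3f}, classified built-up = {bool(flag)}")
```

With these illustrative values the built-up and bright-soil pixels receive nearly identical index values, so no single threshold separates them, whereas vegetation is clearly excluded.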

A key implication is that both methods tend to reach a form of early saturation. Once large portions of the scene are already classified as built-up, the ability of the method to capture additional real growth declines sharply. This helps explain why NDBI and clustering produced declining built-up totals in both intervals (Table 2), with NDBI falling from 36.28 km² in 2015 to 22.67 km² in 2025 and clustering from 35.70 km² to 32.05 km², despite the well-documented transformation of Diriyah over the study period. The problem is therefore not simply one of overestimation in individual years, but of reduced temporal sensitivity in environments where built-up and non-built surfaces are spectrally difficult to separate.

5.3. The Role and Limits of HBDM

A key contribution of this study is the introduction of the Hybrid Built-up Detection Model (HBDM) as an interpretive support layer. Its value lies not in replacing accuracy-tested classification, but in integrating the outputs of three methodological families into a continuous urban intensity surface. Used in this way, HBDM helps identify where the methods converge on plausible built-up areas and where they diverge because of uncertainty, mixed land cover, or likely misclassification. This is particularly useful in a dryland heritage setting, where the transition between established urban fabric, active development, exposed soil, and intermediate land surfaces is often difficult to represent using a single binary method.

At the same time, HBDM should not be interpreted as a substitute for formal validation. Because it was not assessed as an independent classifier, its role in this study remains explicitly diagnostic and supportive. This distinction is methodologically important. It preserves the integrity of the accuracy assessment while allowing the hybrid layer to contribute additional spatial insight, particularly in highlighting persistent urban cores, transitional zones, and areas of methodological disagreement.

5.4. Planning, Sustainability, and Heritage Implications

The findings have direct implications for urban monitoring and heritage-sensitive planning in Diriyah and similar desert cities. If built-up land is substantially overestimated, development pressure on heritage assets, landscape corridors, tourism areas, and infrastructure networks may also be overstated. Conversely, if emerging fringe development is detected too coarsely or too late, early signals of spatial pressure may be overlooked. In this context, the results suggest that a conservative but reliable baseline such as GHSL is preferable for core monitoring, especially when complemented by high-resolution visual checks and local contextual interpretation.

More broadly, the study supports a planning approach in which methodological caution is treated as part of sustainable land management. In places undergoing rapid transformation under Saudi Vision 2030, urban monitoring is not merely a technical exercise; it is part of a wider system of heritage protection, development control, and evidence-based spatial decision-making. A workflow that combines a stable global built-up product with locally informed interpretive tools may therefore offer a more useful planning basis than reliance on index-based or unsupervised outputs alone.

5.5. Limitations and Future Research

Several limitations should be acknowledged. First, the analysis evaluates three practical and widely used approaches rather than the full range of available supervised, object-based, or deep learning methods. This was intentional, as the study was designed to compare accessible and operationally distinct approaches under the same dryland conditions. Future research could extend this comparison by incorporating supervised classifiers such as Random Forest or other multi-feature approaches.

Second, the 2025 GHSL-based layer is a derived extension rather than an official GHSL release and should therefore be interpreted as a conservative approximation of recent built-up expansion. Third, HBDM was used as a diagnostic layer and was not validated as a separate classifier. Future research could test alternative weighting strategies, undertake sensitivity analysis, and explore multi-source hybrid models that incorporate thermal, radar, texture, or topographic information. Further work could also investigate whether morphology-sensitive measures, including fragmentation or urban fabric typologies, improve the interpretation of growth patterns in rapidly transforming heritage districts.

6. Conclusions

This study compared three remote-sensing approaches—GHSL, NDBI, and unsupervised k-means clustering—for mapping urban expansion in Diriyah, Saudi Arabia, between 2015 and 2025. The results show that method choice strongly influences both the magnitude and the apparent direction of urban change in dryland environments. GHSL produced the most spatially coherent and temporally plausible estimates, whereas NDBI and unsupervised clustering substantially overestimated built-up extent because of spectral confusion with bright bare soils and related high-reflectance surfaces.

The accuracy assessment confirms that GHSL provides the most reliable baseline among the tested approaches, with an overall accuracy of 0.88 and a Kappa of 0.83, compared with 0.53 and 0.41 for NDBI and 0.61 and 0.50 for unsupervised clustering. Although NDBI and clustering identified many true built-up pixels (producer's accuracy of 0.71 and 0.79), their large commission errors (user's accuracy of 0.49 and 0.57) reduce their suitability for standalone use in a heritage-sensitive desert landscape. HBDM added value as an integrative diagnostic layer by highlighting persistent urban cores, uncertain edges, and areas of methodological disagreement, but it should be interpreted as a support surface rather than as a replacement for formal classification and validation.

Overall, the methodological lessons drawn from Diriyah extend beyond the study area itself. In arid and semi-arid cities undergoing rapid transformation, conservative global built-up products may provide a stronger monitoring baseline than index-based or unsupervised methods when surface reflectance conditions are highly ambiguous. When combined with local visual interpretation and planning context, such approaches can support more credible urban monitoring, more cautious land-management decisions, and more informed heritage-sensitive planning under large-scale development agendas such as Saudi Vision 2030.

Funding

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2602).

Data Availability Statement

Data are available upon request.

Conflicts of Interest

The author declares no conflicts of interest.


Figure 1. Location of the study area. The figure shows Diriyah within Saudi Arabia, together with the study-area boundary used for the urban expansion analysis.

Figure 2. Workflow diagram illustrating the main methodological components of the study, including data inputs, GHSL processing, NDBI extraction, clustering, accuracy assessment, and HBDM synthesis.

Figure 3. Spatial distribution of the 150 stratified reference points used for accuracy assessment in Diriyah. Red points represent built-up reference samples, while blue points represent non-built-up reference samples interpreted from high-resolution imagery.

Figure 4. GHSL-derived built-up surfaces in Diriyah for (a) 2015, (b) 2020, and (c) 2025. The GHSL outputs show a compact and temporally consistent built-up footprint, with gradual expansion around the historic core and adjacent development areas.

Figure 5. NDBI-derived built-up maps in Diriyah for (a) 2015, (b) 2020, and (c) 2025. The maps illustrate the strong sensitivity of NDBI to bright bare surfaces, resulting in widespread overestimation of built-up land beyond the actual urban footprint.

Figure 6. Unsupervised k-means clustering results for built-up areas in Diriyah for (a) 2015, (b) 2020, and (c) 2025. Although spatially more contiguous than NDBI, the selected built-up cluster still includes extensive non-urban bright surfaces, reflecting limited spectral separability in the dryland environment.

Table 1. Summary of datasets used in the study.

Data Source	Resolution	Date (Year)	Usage
GHSL built-up layer (raster)	~30 m	2015, 2020, 2025	Extract built-up area extent and provide a comparative baseline
Landsat 8 OLI imagery	30 m	15 August 2015; 27 July 2020; 15 February 2025	Compute NDBI and perform unsupervised k-means classification
High-resolution imagery/Google Earth	Sub-meter to meter-level	Multi-date visual interpretation	Reference interpretation, threshold refinement, and built-up validation
Diriyah boundary/ancillary spatial layers	Vector	Current study boundary	Clipping, contextual interpretation, and cartographic framing
Stratified random reference points	Point sample	Validation stage	Accuracy assessment of built-up versus non-built classes

Table 2. Built-up area (km²) and net change in Diriyah for 2015, 2020, and 2025 using GHSL, NDBI, and unsupervised k-means clustering.

Year	GHSL Built-Up Area (km²)	NDBI Built-Up Area (km²)	Unsupervised Clustering Built-Up Area (km²)	GHSL Net Change (km²)	NDBI Net Change (km²)	Unsupervised Net Change (km²)
2015	2.80	36.28	35.70	—	—	—
2020	4.94	35.75	32.91	+2.14	−0.53	−2.79
2025	5.31	22.67	32.05	+0.37	−13.08	−0.86

Note: Net change is calculated relative to the previous observation year.
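The net-change columns in Table 2 follow directly from the area estimates. As a check on the arithmetic, the short sketch below recomputes them from the tabulated areas. The variable and function names are chosen for the example only.

```python
# Built-up area (km²) per method, copied from Table 2.
years = [2015, 2020, 2025]
area_km2 = {
    "GHSL": [2.80, 4.94, 5.31],
    "NDBI": [36.28, 35.75, 22.67],
    "Unsupervised": [35.70, 32.91, 32.05],
}

def net_change(series):
    """Difference between each observation and the previous one, rounded to 2 decimals."""
    return [round(later - earlier, 2) for earlier, later in zip(series, series[1:])]

changes = {method: net_change(values) for method, values in area_km2.items()}

for method, deltas in changes.items():
    for year, delta in zip(years[1:], deltas):
        print(f"{method} {year}: {delta:+.2f} km²")
```

Only GHSL shows positive change in both intervals. NDBI and unsupervised clustering decline in both, which is the pattern interpreted as early saturation in Section 5.2.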

Table 3. Accuracy assessment of GHSL, NDBI, and unsupervised k-means clustering based on 150 stratified reference points, including overall accuracy (OA), Kappa, user’s accuracy (UA), and producer’s accuracy (PA) for the built-up class.

Method	Overall Accuracy (OA)	Kappa	User’s Accuracy (Built-Up)	Producer’s Accuracy (Built-Up)
GHSL	0.88	0.83	0.91	0.86
NDBI	0.53	0.41	0.49	0.71
Unsupervised Clustering	0.61	0.50	0.57	0.79
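For readers who wish to reproduce metrics of this kind, the sketch below shows how overall accuracy, Cohen's Kappa, user's accuracy, and producer's accuracy are derived from a two-class confusion matrix. The counts are hypothetical and sum to 150 only to mirror the sample size. They are not the confusion matrices of this study, which are not reproduced here, and the function name `binary_accuracy` is an assumption made for the example.

```python
def binary_accuracy(tp, fp, fn, tn):
    """Accuracy metrics for the built-up class from a 2 x 2 confusion matrix.

    tp: mapped built-up, reference built-up
    fp: mapped built-up, reference non-built (commission error)
    fn: mapped non-built, reference built-up (omission error)
    tn: mapped non-built, reference non-built
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n                 # overall accuracy
    ua = tp / (tp + fp)                # user's accuracy (1 - commission error)
    pa = tp / (tp + fn)                # producer's accuracy (1 - omission error)
    # Chance agreement expected from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (oa - pe) / (1 - pe)
    return {"OA": oa, "Kappa": kappa, "UA": ua, "PA": pa}

# Hypothetical counts for illustration only (150 reference points in total).
metrics = binary_accuracy(tp=60, fp=10, fn=15, tn=65)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A low user's accuracy combined with a higher producer's accuracy, as reported for NDBI and clustering in Table 3, corresponds to a large `fp` count relative to `tp`, that is, to commission error.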
