Comparative Analysis of Non-Negative Matrix Factorization in Fire Susceptibility Mapping: A Case Study of Semi-Mediterranean and Semi-Arid Regions

Rahimi, Iraj; Duarte, Lia; Barkhoda, Wafa; Teodoro, Ana Cláudia

doi:10.3390/land14071334

¹ Department of Geosciences, Environment and Spatial Planning, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal

² Darbandikhan Technical Institute, Sulaimani Polytechnic University, Wrme Street 327/76, Qrga, Sulaymaniyah 70-236, Iraq

³ Institute of Earth Sciences, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal

⁴ Department of Computer Engineering, University of Kurdistan, Sanandaj 66177-15175, Iran

⁵ Faculty of Information Technology, Kermanshah University of Technology, Kermanshah 67156-85420, Iran

* Author to whom correspondence should be addressed.

Land 2025, 14(7), 1334; https://doi.org/10.3390/land14071334

Submission received: 11 April 2025 / Revised: 17 June 2025 / Accepted: 20 June 2025 / Published: 23 June 2025

(This article belongs to the Section Land Use, Impact Assessment and Sustainability)


Abstract

Semi-Mediterranean (SM) and semi-arid (SA) regions, exemplified by the Kurdo-Zagrosian forests in western Iran and northern Iraq, have experienced frequent wildfires in recent years. This study proposes a modified Non-Negative Matrix Factorization (NMF) method for detecting fire-prone areas using satellite-derived data in SM and SA forests. The performance of the proposed method was compared with that of three previously proposed NMF methods and with principal component analysis (PCA), K-means, and IsoData. NMF is a factorization method widely used for dimensionality reduction and feature extraction. It imposes non-negativity constraints on the factor matrices, enhancing interpretability and suitability for analyzing real-world datasets. Sentinel-2 imagery, the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and the Zagros Grass Index (ZGI) from 2020 were employed as inputs, and the results were validated against post-2020 burned areas derived from the Normalized Burn Ratio (NBR) index. The results demonstrate NMF’s effectiveness in identifying fire-prone areas across large geographic extents typical of SM and SA regions. When elevation was included, NMF_L1/2-Sparsity offered the best outcome among the NMF methods used. In contrast, the proposed NMF method provided the best results when only Sentinel-2 bands and ZGI were used.

Keywords: fire susceptibility; NMF; semi-Mediterranean; semi-arid; ZGI; machine learning

1. Introduction

The rising frequency and severity of forest fires have emerged globally as a critical issue, fueled by both natural factors and human activities. Extreme weather events, shifts in land use, and rapid urban development are key contributors to this growing problem, exacerbating the risk of wildfires [1,2,3,4,5]. The consequences of forest fires extend beyond immediate environmental damage; they contribute to global carbon emissions and disrupt ecosystems, leading to soil degradation, erosion, loss of microfauna and flora, and the deterioration of water quality [6].

To improve the focus, this study narrows its scope to the semi-Mediterranean (SM) and semi-arid (SA) forests of the Kurdo-Zagrosian region, especially Marivan and Sarvabad in western Iran, which are experiencing increasing wildfire activity due to a combination of prolonged dry spells, human encroachment, and land-use change [7,8,9].

In response to these challenges, the early identification of fire-prone areas has become increasingly crucial for effective risk mitigation. Forest Fire Susceptibility Mapping (FFSM) is vital in identifying fire-prone areas or areas at high risk of wildfires, enabling better ecological management and biodiversity conservation [10,11,12]. These efforts rely on factors such as vegetation cover, temperature, humidity, rainfall, wind speed, proximity to roads and water bodies, elevation, slope, and land-use patterns [13,14,15,16].

Technological advances, particularly in combining Geographic Information Systems (GISs) with Remote Sensing (RS) data/techniques, have revolutionized the development of FFSMs, enabling more precise identification of fire-prone areas. Additionally, ensemble models, which combine the strengths of multiple algorithms, have shown superior accuracy in predicting fire susceptibility [17,18,19,20,21]. Beyond this, integrating geospatial and RS data has allowed researchers to account for various factors influencing wildfire susceptibility, such as temperature, land use, proximity to roads, slope, and vegetation cover [13,22,23].

On the other hand, several studies have focused on classifying various land features and vegetation types for mapping fire-prone areas using only reflectance data from satellite sources, bypassing topographic features like slope, aspect, and distance from roads [4,24,25,26,27]. These models primarily utilize spectral indices derived from satellite imagery, such as the Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Live Fuel Moisture Content (LFMC), and Zagros Grass Index (ZGI) [4,9,24,25,26,27,28,29]. These indices can capture vegetation health, moisture content, and stress, all critical factors in fire susceptibility [24,25,26,27,28,29].

Unsupervised methods, such as principal component analysis (PCA), K-means clustering, and IsoData, enable the automatic detection of patterns in data. They are widely applied in environmental studies and land cover mapping for dimensionality reduction and clustering tasks [30,31,32,33,34]. PCA transforms correlated input variables into a set of uncorrelated principal components that retain the majority of variance in the data, though this transformation can reduce the interpretability of the original features [35,36]. K-means and IsoData clustering support pattern recognition in large datasets by grouping data into clusters based on feature similarity. However, their effectiveness can be limited by the need for a predefined number of clusters and sensitivity to initialization, which can impact the reproducibility in fire susceptibility mapping applications [37,38].

In contrast, Non-Negative Matrix Factorization (NMF) decomposes data into additive, non-negative components and weights, preserving the physical interpretability of features, a key advantage when analyzing mixed spectral signatures in vegetation or post-fire landscapes [39,40]. Its ability to extract distinct, interpretable elements from mixed pixels has driven successful applications in hyperspectral unmixing, land-use classification, and forest health monitoring [41,42]. Moreover, by constraining components to non-negative values, NMF naturally aligns with the physics of spectral reflectance, making it particularly effective for burn severity mapping and rapid post-fire assessments in data-scarce environments [41,42].

Early work by Gillis et al. [43] demonstrated the power of a rank-two NMF framework for hierarchical clustering of hyperspectral imagery, outperforming K-means and standard NMF in cluster stability and balance. Nguyen Van and Lee [44] further showed that, among 13 feature extraction methods for Landsat-8 burn severity mapping, NMF combined with machine learning can exceed PCA’s accuracy. Harries and O’Kane [45] contrasted NMF with PCA and K-means in climate data, confirming NMF’s superior interpretability despite PCA’s optimal reconstruction and K-means’ clustering ease. For change detection, Zhu et al. [46] introduced an L_2,1-norm-regularized double NMF model that robustly identifies altered regions in multi-temporal hyperspectral images. Guillaume et al. [47] extended NMF to coastal benthic habitat mapping by incorporating water-column adjacency effects, while Esi et al. [48] applied NMF for unsupervised monitoring of marine mucilage outbreaks. Although NMF is commonplace in hyperspectral analysis, it is rarely applied to multispectral data. Yokoya et al. [49] used coupled NMF to combine hyperspectral and Sentinel-2-like multispectral images, improving details without losing quality. Later, Khader et al. [50] developed NMF-DuNet, a deep learning model with NMF steps, which gave better results than normal NMF or deep learning alone. Hou et al. [51] proposed an NMF-based method to reconstruct visible-range reflectance data from multispectral satellite products.

Despite the success of PCA and K-means in mapping fire-prone areas from multispectral images [52,53], NMF is yet to be explored for this specific task. In this study, we compare three established NMF variants (standard, L1-sparse, and L1/2-sparse) and our self-proposed Sparse Endmember-Independent NMF (SEI-NMF) against PCA, K-means, and IsoData. We hypothesize that NMF methods, especially SEI-NMF, will deliver the clearest fire-related features. The “endmember-independence” constraint in SEI-NMF is a separate term (the off-diagonal autocorrelation penalty) that specifically minimizes redundancy between spectral components. Focusing on the Kurdo-Zagrosian region and using NDVI plus the ZGI, we aim to create sharper, fully satellite-based fire-risk maps.

2. Materials and Methods

2.1. Study Area

This study focuses on the SA and SM climate regions, particularly the forests of Marivan and Sarvabad in the Kurdistan Province, western Iran, which have faced numerous wildfires over the past decades (see Figure 1) [9].

Fires typically begin in late May, driven by the accumulation of dry grass species, and persist until the onset of autumn rains. These forests are located between 750 and 1700 m above mean sea level (a.m.s.l.), alongside different shrub species [54]. Grass species, which are highly flammable and widespread, further increase fire vulnerability, especially during the dry summer months, when most of them dry out [9]. The forests lie near the Iraq border, between longitudes 45°57′50″ E and 46°46′41″ E and latitudes 35°1′1″ N and 35°49′51″ N [9]. Marivan sits at an average elevation of 1287 m within the northern Zagros mountain range, featuring mountainous terrain and valleys [9]. The region’s SM climate brings cold winters and hot summers, with an annual precipitation of approximately 991 mm. The vegetation, predominantly Brant’s oak (Quercus brantii), Crataegus pontica, and Pistacia atlantica, forms dense and semi-dense forests, rangelands, and pastures [55]. These areas have experienced numerous fires in recent decades, highlighting the urgent need for studies to address the issue [9].

2.2. Data Sources

The data used in this study are presented in Table 1. Sentinel-2 multispectral imagery, spanning from 2020 to 2023, was obtained from the European Space Agency’s (ESA) Copernicus archive via Google Earth Engine (GEE) [56]. The elevation data were derived from the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) at 30 m spatial resolution and later resampled to 10 m in GEE to match the resolution of the Sentinel-2 data and avoid scaling biases [57]. All datasets were processed in the Universal Transverse Mercator (UTM) coordinate system, Zone 38N (EPSG:32638). Additionally, the ZGI was calculated from the Sentinel-2 imagery [9]. ZGI maps dry grass-rich areas, which contain the most flammable fuels in the Zagros region. It uses NDVI differences between the green-up season (when trees and grasses are green) and mid-summer (when grasses dry out) to highlight areas with high dry grass cover for wildfire risk assessments [9].

2.3. Sparse and Endmember-Independent Non-Negative Matrix Factorization (SEI-NMF)

In addition to applying three established NMF variants (standard NMF, L1-NMF, and L1/2-NMF), this study introduces the SEI-NMF model. The L1-Sparsity constraint enforces a more localized, part-based decomposition suitable for isolating fire-affected spectral signatures [59], while the L1/2 variant promotes greater sparsity and robustness to noise, which is particularly valuable in hyperspectral fire scene analyses [60]. The foundational methods are discussed first to contextualize SEI-NMF’s approach. The Linear Mixture Model (LMM) is a widely used model in RS that decomposes pixel spectra into physical components, or “endmembers”, representing distinct surface types [37]. NMF, closely aligned with LMM, similarly decomposes multispectral data into non-negative components, making it particularly effective for identifying additive spectral features. SEI-NMF extends NMF by adding sparsity and endmember independence, improving its utility in detecting surface materials and mapping fire-prone regions using RS data [37,38].

This study employs the following notation. Lowercase letters stand for scalars, while boldface lowercase letters and uppercase letters stand for vectors and matrices, respectively. For any matrix $M$, its $i$-th column and $j$-th row are denoted by $M_{i}$ and $M_{(j)}$, and $M_{ij}$ represents its $(i, j)$-th element. The trace of $M$ is represented by $\mathrm{Tr}(M)$, and the transpose of $M$ is denoted by $M^{T}$. The Frobenius norm of a matrix $M \in \mathbb{R}^{d \times n}$ is defined in Equation (1); it provides a single scalar value that measures the overall magnitude of the matrix and is commonly used for comparing matrices or for error minimization in optimization problems [38].

\|M\|_{F} = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{d} M_{ji}^{2}} = \sqrt{\mathrm{Tr}(M^{T} M)} = \sqrt{\mathrm{Tr}(M M^{T})}

(1)
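The equalities in Equation (1) can be checked numerically. The following is a minimal NumPy sketch; the matrix values are arbitrary and chosen only for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 matrix used only for illustration.
M = np.array([[1.0, 2.0, 2.0],
              [0.0, 3.0, 4.0]])

# Definition: square root of the sum of squared entries.
frob_from_entries = np.sqrt(np.sum(M ** 2))

# Trace identities from Equation (1).
frob_from_mtm = np.sqrt(np.trace(M.T @ M))
frob_from_mmt = np.sqrt(np.trace(M @ M.T))

# NumPy's built-in Frobenius norm, for comparison.
frob_builtin = np.linalg.norm(M, ord="fro")
```

All four quantities agree up to floating-point error, which is why the trace form can be used interchangeably with the entrywise form in the derivations that follow.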

To clarify the following steps, we will first discuss the concepts of LMM, NMF, and the sparsity regularizer. Subsequently, we will highlight the modifications that lead to the proposed NMF method (SEI-NMF).

2.3.1. Linear Mixture Model (LMM)

Given an observation matrix $X = [x_{1}, x_{2}, \dots, x_{n}] \in \mathbb{R}^{d \times n}$ with $d$ bands and $n$ pixels, where each band $i$ corresponds to a particular wavelength $\lambda_{i}$, the LMM can be expressed as Equation (2).

X = W H + E

(2)

where $W = [w_{1}, w_{2}, \dots, w_{r}] \in \mathbb{R}^{d \times r}$ and $H = [h_{1}, h_{2}, \dots, h_{n}] \in \mathbb{R}^{r \times n}$ represent the endmember matrix and the abundance matrix, respectively, $r$ denotes the number of endmembers, and $E \in \mathbb{R}^{d \times n}$ is the additive noise. The number of endmembers in a multispectral image is often much smaller than the number of pixels or bands, i.e., $r < \min(d, n)$.

Let $i$ stand for the location of a pixel in the image after putting all the pixels in order. Then, according to the LMM, a single pixel $x_{i}$ can be approximately represented as a linear combination of endmember vectors (Equation (3)).

x_{i} \approx \sum_{p=1}^{r} w_{p} h_{pi}

(3)

In Equation (3), the value of $h_{pi}$ measures how much the endmember $w_{p}$ contributes to the pixel $x_{i}$.
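To make Equations (2) and (3) concrete, the sketch below builds a small synthetic mixture in NumPy. The sizes, the random endmembers, and the noise level are illustrative assumptions and do not come from the study data.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 4, 3, 5                        # bands, endmembers, pixels (illustrative sizes)

W = rng.random((d, r))                   # endmember spectra stored as columns w_p
H = rng.random((r, n))                   # abundances h_pi, one column per pixel
E = 0.01 * rng.standard_normal((d, n))   # small additive noise
X = W @ H + E                            # Equation (2)

# Equation (3): pixel i as a weighted sum of the endmember columns.
i = 2
x_i_from_sum = sum(W[:, p] * H[p, i] for p in range(r))
```

The explicit sum reproduces column `i` of `W @ H`, and the gap between it and the observed pixel `X[:, i]` is exactly the noise column `E[:, i]`.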

2.3.2. Non-Negative Matrix Factorization (NMF_Basic)

Since NMF can lead to an interpretable part-based representation of high-dimensional data [59], it has a wide range of applications, such as clustering [60], feature selection [61], recommender systems [62], community detection [63], and matrix completion [64]. Let $X = [x_{1}, x_{2}, \dots, x_{n}] \in \mathbb{R}^{d \times n}$ be the given data matrix, where $d$ is the number of features and $n$ is the number of samples. Each column vector $x_{i}$ denotes a non-negative data sample with $d$ dimensions. NMF aims to find two non-negative matrices $W \in \mathbb{R}^{d \times r}$ and $H \in \mathbb{R}^{r \times n}$ that accurately reconstruct the data matrix as $X \approx W H$, according to Equation (4).

\min_{W, H} \sum_{i=1}^{n} \Xi(x_{i}, W h_{i}), \quad \text{s.t. } W, H \geq 0

(4)

where $\Xi$ is a function measuring the discrepancy between each sample $x_{i}$ and its reconstruction $W h_{i}$, a linear combination of the basis vectors in $W$ with coefficients given by the vector $h_{i}$. The Basic NMF model uses the squared error distance to quantify the difference between $X$ and $W H$. Its objective function is defined in Equation (5).

\min_{W, H} \|X - W H\|_{F}^{2} = \sum_{i=1}^{n} \|x_{i} - W h_{i}\|^{2}, \quad \text{s.t. } W, H \geq 0

(5)

where $\|x_{i} - W h_{i}\|$ represents the reconstruction error of the $i$-th sample and $\|\cdot\|_{F}$ refers to the Frobenius norm. Although Equation (5) is convex in $W$ and $H$ separately, it is not convex in both variables jointly. Therefore, obtaining the globally optimal solution is impractical, and optimization methods can only reach a locally optimal solution. Most NMF algorithms are iterative and exploit the fact that NMF reduces to a convex non-negative least squares (NNLS) problem when either $W$ or $H$ is fixed. Specifically, at each step one of the two factors is held constant while the other is updated to decrease the objective function. The most widely used optimization scheme applies the following multiplicative update rules (Equations (6) and (7)) [65]:

W \leftarrow W \odot \frac{X H^{T}}{W H H^{T}}

(6)

H \leftarrow H \odot \frac{W^{T} X}{W^{T} W H}

(7)

where $(\cdot)^{T}$ denotes the matrix transpose, and $\odot$ and the fraction bar denote elementwise multiplication and division, respectively. Note that the rules in Equations (6) and (7) preserve the non-negativity of $W$ and $H$, provided that the initial $W$ and $H$ are non-negative.
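The multiplicative updates of Equations (6) and (7) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `nmf_basic`, the random initialization, the small constant `eps` that guards against division by zero, and the synthetic test matrix are all assumptions made for the example.

```python
import numpy as np

def nmf_basic(X, r, n_iter=200, eps=1e-10, seed=0):
    """Basic NMF with multiplicative updates (sketch of Equations (6) and (7))."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, r)) + eps    # non-negative start keeps W >= 0 throughout
    H = rng.random((r, n)) + eps    # non-negative start keeps H >= 0 throughout
    errors = []
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)    # Equation (6), H fixed
        H *= (W.T @ X) / (W.T @ W @ H + eps)    # Equation (7), W fixed
        errors.append(np.linalg.norm(X - W @ H, ord="fro") ** 2)
    return W, H, errors

# Synthetic non-negative data of rank 3 (illustrative only).
rng = np.random.default_rng(1)
X_demo = rng.random((6, 3)) @ rng.random((3, 40))
W_demo, H_demo, err_demo = nmf_basic(X_demo, r=3)
```

Because each update multiplies by a ratio of non-negative quantities, the factors never leave the non-negative orthant, and the recorded reconstruction error gives a simple convergence trace.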

2.3.3. Sparsity Regularizer (NMF_L1- and L1/2-Sparsity)

In NMF, the objective function is non-convex, which means it can have many local minima. As a result, solutions can vary, making the outcome less stable and non-unique. To address this, additional constraints are often added to standard NMF. Since multispectral data naturally exhibit sparse abundance patterns, where only a few components contribute significantly to each pixel, a sparsity constraint is applied to the objective function. This approach, introduced in studies like [59,66], helps improve stability by steering the solution toward more realistic, sparse data representations. Equation (8) shows the objective function.

\min_{W, H} \|X - W H\|_{F}^{2} + \gamma g(H), \quad \text{s.t. } W, H \geq 0

(8)

where $\gamma \geq 0$ is the parameter that controls the contribution of the sparsity measure function $g(\cdot)$ of the matrix $H$, which acts as the regularization term. In this study, two kinds of sparsity regularizers are used: $L_{1}$ and $L_{1/2}$. The $L_{1}$-sparse NMF is given by Equation (9).

\min_{W, H} \|X - W H\|_{F}^{2} + \gamma \|H\|_{1}, \quad \text{s.t. } W, H \geq 0

(9)

The $L_{1/2}$-NMF is then written as Equation (10).

\min_{W, H} \|X - W H\|_{F}^{2} + \gamma \|H\|_{1/2}, \quad \text{s.t. } W, H \geq 0

(10)

where $\|H\|_{1/2} = \sum_{i=1}^{r} \sum_{j=1}^{n} (H_{ij})^{1/2}$ and $H_{ij}$ is the abundance fraction of the $i$-th endmember at the $j$-th pixel in the multispectral data.

2.3.4. Proposed Method (SEI-NMF)

To develop the proposed SEI-NMF method, the L1-Sparsity NMF model was extended by incorporating specific constraints. This enhancement allows the model to better capture sparse and independent features. The solution space of the NMF model is vast, and the identification of endmembers plays a critical role in fire susceptibility research. Incorporating the characteristics of endmembers as prior knowledge into the NMF model can significantly enhance its performance, since it enables the identification of more accurate endmembers. Since different components of the multispectral data contribute with specific proportions, it is essential to ensure their independence. To achieve this, the autocorrelation matrix of the endmembers can be used as a constraint: independent components correspond to a diagonal autocorrelation matrix, so its off-diagonal elements should be as close to 0 as possible. Therefore, the NMF model with the endmember independence constraint is given by Equation (11).

\min_{W, H} \|X - W H\|_{F}^{2} + \alpha (\|W^{T} W\|_{1} - \|W\|_{F}^{2}), \quad \text{s.t. } W, H \geq 0

(11)

where $\alpha$ is the parameter that balances the data fidelity and endmember independence terms. The second term is the sum of the off-diagonal elements of the autocorrelation matrix of the endmembers, i.e., the difference between the sum of all its elements (the first sub-term) and the sum of its diagonal elements (the second sub-term). The purpose of the second term in Equation (11) is to make the endmembers as independent of each other as possible; that is, the correlation between different endmembers should be as small as possible.

Equation (12) gives the final loss function proposed for SEI-NMF.

\min_{W, H} \|X - W H\|_{F}^{2} + \gamma \|H\|_{q} + \alpha (\|W^{T} W\|_{1} - \|W\|_{F}^{2}), \quad \text{s.t. } W, H \geq 0

(12)

where $q \in \{1/2, 1\}$ selects the sparse regularization term, $\gamma \geq 0$ controls the contribution of the sparsity regularization on $H$, and $\alpha \geq 0$ balances the data fidelity against the endmember-independence term.

Standard NMF seeks only to minimize the reconstruction error $\|X - W H\|_{F}^{2}$, without any additional structure on $W$ or $H$. L1-sparse and L1/2-sparse NMF extend this by adding a sparsity penalty $\gamma \|H\|_{q}$ ($q \in \{1/2, 1\}$), which forces most abundance coefficients in $H$ toward zero but does not regulate the endmember signatures in $W$.

SEI-NMF builds on L1-sparse NMF by introducing a second constraint on $W$, an endmember-independence term, $\alpha (\|W^{T} W\|_{1} - \|W\|_{F}^{2})$, which penalizes off-diagonal elements of the autocorrelation matrix of endmembers. This ensures that different spectral signatures remain as uncorrelated as possible. Together, the sparsity and independence constraints guide SEI-NMF toward sparser, non-redundant components that better reflect the physical reality of mixed pixels in fire-prone landscapes.
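The SEI-NMF loss of Equation (12) can be evaluated directly, which also makes it easy to see what the independence term measures. The sketch below is illustrative: the helper names `independence_penalty` and `sei_nmf_loss` and the two small endmember matrices are assumptions introduced for this example.

```python
import numpy as np

def independence_penalty(W):
    """Sum of the off-diagonal entries of W^T W (W assumed non-negative)."""
    gram = W.T @ W
    # For non-negative W, ||W^T W||_1 is the sum of all Gram entries and
    # ||W||_F^2 is the trace of the Gram matrix, so the difference is the
    # off-diagonal sum.
    return float(np.sum(gram) - np.trace(gram))

def sei_nmf_loss(X, W, H, gamma, alpha, q=1.0):
    """Value of the objective in Equation (12) for q in {1/2, 1}."""
    fidelity = np.linalg.norm(X - W @ H, ord="fro") ** 2
    sparsity = np.sum(np.power(H, q))   # ||H||_q as defined for non-negative H
    return float(fidelity + gamma * sparsity + alpha * independence_penalty(W))

# Two hypothetical endmember sets: orthogonal columns vs. overlapping columns.
W_orthogonal = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
W_overlapping = np.array([[1.0, 1.0], [0.0, 2.0], [0.0, 0.0]])
```

With orthogonal columns the penalty vanishes, while any overlap between endmember signatures makes it positive, which is the behaviour the independence constraint is meant to discourage.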

2.3.5. Optimization

As in the basic NMF case, the cost function in Equation (12) is not convex with respect to $W$ and $H$ jointly. A standard approach is therefore to optimize the objective function iteratively, minimizing over one variable while keeping the other fixed. Two types of updates are performed. First, $H$ is updated while $W$ is fixed. Let $\Phi \geq 0$ and $\Psi \geq 0$ be the Lagrange multipliers for the constraints $W \geq 0$ and $H \geq 0$, respectively. The Lagrangian $L$ for $q = 1/2$ is given by Equation (13).

L = \|X - W H\|_{F}^{2} + \gamma \|H\|_{1/2} + \alpha (\|W^{T} W\|_{1} - \|W\|_{F}^{2}) + \mathrm{Tr}(\Phi W^{T}) + \mathrm{Tr}(\Psi H^{T}) = \mathrm{Tr}(X^{T} X - 2 X^{T} W H + H^{T} W^{T} W H) + \gamma \|H\|_{1/2} + \alpha (\|W^{T} W\|_{1} - \mathrm{Tr}(W^{T} W)) + \mathrm{Tr}(\Phi W^{T}) + \mathrm{Tr}(\Psi H^{T})

(13)

Taking the partial derivative of $L$ with respect to $H$ leads to Equation (14).

\frac{\partial L}{\partial H} = -2 W^{T} X + 2 W^{T} W H + \frac{1}{2} \gamma H^{-\frac{1}{2}} + \Psi

(14)

According to the KKT conditions, $\Psi \odot H = 0$, so setting the derivative to zero and multiplying elementwise by $H$ gives Equation (15).

(-2 W^{T} X + 2 W^{T} W H + \frac{1}{2} \gamma H^{-\frac{1}{2}}) \odot H = 0

(15)

Therefore, the update rule is obtained as Equation (16).

H \leftarrow H \odot \frac{W^{T} X}{W^{T} W H + \frac{\gamma}{4} H^{-\frac{1}{2}}}

(16)

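In the same way, the update rule for the matrix $H$ under the $L_{1}$ regularizer of Equation (9) is given by Equation (17). Here the gradient of $\gamma \|H\|_{1}$ is the constant $\gamma$, and dividing out the common factor of 2, as in Equation (16), leaves $\gamma / 2$ in the denominator.

H \leftarrow H \odot \frac{W^{T} X}{W^{T} W H + \frac{\gamma}{2}}

(17)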
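The two sparse updates for $H$ differ only in the term added to the denominator, which the following NumPy sketch makes explicit. The function name `update_H`, the guard constant `eps`, and the demo matrices are assumptions for illustration. The $L_{1}$ branch uses $\gamma/2$, which is what differentiating the objective in Equation (9) gives; since $\gamma$ is a tuning parameter, a different constant factor only rescales it.

```python
import numpy as np

def update_H(X, W, H, gamma, q=0.5, eps=1e-10):
    """One multiplicative update of H with W fixed (sketch, not the authors' code)."""
    numer = W.T @ X
    denom = W.T @ W @ H
    if q == 0.5:
        # L1/2 penalty: elementwise H^(-1/2) term, as in Equation (16).
        denom = denom + (gamma / 4.0) * np.power(H + eps, -0.5)
    else:
        # L1 penalty: constant term from the gradient of gamma * ||H||_1.
        denom = denom + gamma / 2.0
    return H * numer / (denom + eps)

# Illustrative data where X = W H exactly, so H is a fixed point when gamma = 0.
rng = np.random.default_rng(0)
W_demo = rng.random((6, 3))
H_demo = rng.random((3, 20))
X_demo = W_demo @ H_demo
H_plain = update_H(X_demo, W_demo, H_demo, gamma=0.0, q=1.0)
H_l1 = update_H(X_demo, W_demo, H_demo, gamma=0.5, q=1.0)
H_lhalf = update_H(X_demo, W_demo, H_demo, gamma=0.5, q=0.5)
```

In both sparse cases the extra denominator term shrinks every abundance relative to the unpenalized update, and the $L_{1/2}$ term shrinks small abundances more strongly because $H^{-1/2}$ grows as entries approach zero.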

Second, $W$ is updated while $H$ is fixed. Taking the partial derivative of $L$ with respect to $W$ leads to Equation (18).

\frac{\partial L}{\partial W} = -2 X H^{T} + 2 W H H^{T} + 2 \alpha (W I_{1} - W) + \Phi

(18)

where $I_{1}$ represents an $r \times r$ matrix in which all elements are equal to 1. Following similar steps, the update rule for $W$ is given by Equation (19).

W \leftarrow W \odot \frac{X H^{T}}{W H H^{T} + \alpha (W I_{1} - W)}

(19)
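Putting the alternating updates together, one possible sketch of the full SEI-NMF iteration is shown below. It is an illustration under stated assumptions rather than the authors' implementation: the function name `sei_nmf`, the default parameter values, the random initialization, the fixed iteration count, and the guard constant `eps` are all hypothetical choices.

```python
import numpy as np

def sei_nmf(X, r, gamma=0.1, alpha=0.1, q=1.0, n_iter=300, eps=1e-10, seed=0):
    """Alternating multiplicative updates for the SEI-NMF objective (sketch)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, r)) + eps
    H = rng.random((r, n)) + eps
    ones = np.ones((r, r))                   # the all-ones matrix I_1
    for _ in range(n_iter):
        # W update with H fixed (Equation (19)); W @ ones - W is non-negative
        # because each entry is a row sum minus one of its own terms.
        W *= (X @ H.T) / (W @ H @ H.T + alpha * (W @ ones - W) + eps)
        # H update with W fixed: L1/2 penalty term or constant L1 term.
        if q == 0.5:
            penalty = (gamma / 4.0) * np.power(H + eps, -0.5)
        else:
            penalty = gamma / 2.0
        H *= (W.T @ X) / (W.T @ W @ H + penalty + eps)
    return W, H

# Synthetic non-negative data (illustrative only).
rng = np.random.default_rng(1)
X_demo = rng.random((6, 3)) @ rng.random((3, 50))
W_l1, H_l1 = sei_nmf(X_demo, r=3, gamma=0.05, alpha=0.05, q=1.0)
W_lh, H_lh = sei_nmf(X_demo, r=3, gamma=0.05, alpha=0.05, q=0.5)
```

In practice the loop would more likely stop on a convergence criterion than on a fixed count, and the values of `gamma` and `alpha` would need tuning for the data at hand.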

Figure 2 displays the methodological framework, briefly describing the steps that define the proposed method. As can be seen, the proposed method results from applying new constraints to L1-Sparsity NMF.

2.4. The Number of Components (Endmembers) and Iteration

In dimensionality reduction, particularly when working with large datasets, it is crucial to identify the number of dimensions that captures the essential features of the data while minimizing complexity. Domain knowledge is often used to estimate the likely number of clusters or components before applying algorithms like K-means [67]. Accordingly, different dimensionalities (3, 4, and 5) were tested, and three dimensions yielded the best results [67]. This suggests that three dimensions are sufficient to represent the data effectively, allowing for a more manageable, interpretable, and less time-consuming biomass analysis without significant information loss.

Additionally, the number of iterations plays a crucial role in the convergence of iterative algorithms such as NMF. Therefore, various iteration counts should be considered to determine the optimum for each tested method. The resulting value indicates the number of iterations within which a model reaches a stable solution, balancing computational efficiency and model performance.

2.5. Labelling and Validation

PCA, K-means, and NMF methods generate components or clusters based on the inherent structure and relationships within the data, without relying on predefined labels or arbitrary thresholds. These unsupervised techniques extract latent patterns that represent variations in land cover, vegetation, topography, and other environmental features. To assign meaningful categories to the results, the outputs were interpreted and labeled into fire susceptibility classes (high, average, and low) based on geographical characteristics and available field-related information. This labeling was conducted after model execution and relied on known spatial features typically associated with fire-prone areas, such as vegetation density, slope, elevation, and proximity to human activity zones or water bodies [68,69,70]. This interpretive labeling process aligns with the exploratory nature of unsupervised learning, where final classes are derived based on domain-specific knowledge rather than fixed training data.

To evaluate the accuracy of the classified fire susceptibility maps, a validation step was performed using burned area data from 2021 to 2023. Due to the unavailability of complete and precise field-based burned area perimeters or official ignition point records, satellite imagery was used as the most viable source for validation. Burned areas were identified using the Normalized Burn Ratio (NBR) derived from Sentinel-2 imagery (Equation (20)). NBR, calculated using Band 8 (Near-Infrared—NIR) and Band 12 (Short-Wave Infrared—SWIR), is sensitive to changes in vegetation and provides reliable spatial indicators of burn severity and extent [71,72].

NBR = \frac{NIR - SWIR}{NIR + SWIR}

(20)
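Equation (20) translates directly into an elementwise array operation. The sketch below assumes the NIR (Band 8) and SWIR (Band 12) reflectances are already available as arrays of equal shape; the function name `nbr` and the sample values are hypothetical, and pixels with a zero denominator are returned as NaN rather than guessed.

```python
import numpy as np

def nbr(nir, swir):
    """Normalized Burn Ratio per Equation (20), NaN where NIR + SWIR == 0."""
    nir = np.asarray(nir, dtype=float)
    swir = np.asarray(swir, dtype=float)
    denom = nir + swir
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - swir, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectances: vegetated pixel, burned-looking pixel, no-data pixel.
nbr_demo = nbr([0.4, 0.1, 0.0], [0.1, 0.3, 0.0])
```

High NIR with low SWIR gives values near +1, while the reverse gives negative values, which is why low NBR is read as an indicator of burned surfaces.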

While NBR is not a replacement for field verification, it is a widely accepted proxy for mapping burn scars, especially where field data are incomplete or unavailable. In this study, NBR enabled consistent and detailed burned area identification, supported by available records from the Kurdistan Province Natural Resources Administration for broader coverage [72].

The model’s performance was evaluated using a spatial pixel-level overlay analysis between the classified fire susceptibility zones based on 2020 data and the post-2020 burned areas identified through the NBR index. Then, the confusion matrix was derived from the resulting data. It displays how a model performs across four categories: True Positives (TP), pixels correctly identified as “High” fire risk and later burned; False Positives (FP), pixels labeled as “High” but not actually burned; False Negatives (FN), burned pixels that were not identified as high risk; and True Negatives (TN), unburned pixels accurately excluded from the “High” category. The average and low classes have both been regarded as “Low” or negative. All values are shown in percentages instead of the number of pixels for easier understanding.
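The pixel-level overlay described above can be sketched with two boolean rasters of the same shape. The function name `overlay_confusion` and the tiny 2 x 3 rasters are illustrative assumptions; `high_risk` marks pixels classified as "High" and `burned` marks pixels identified as burned.

```python
import numpy as np

def overlay_confusion(high_risk, burned):
    """Confusion-matrix cells as percentages of all pixels (sketch)."""
    high_risk = np.asarray(high_risk, dtype=bool)
    burned = np.asarray(burned, dtype=bool)
    total = high_risk.size
    counts = {
        "TP": np.sum(high_risk & burned),     # flagged high and later burned
        "FP": np.sum(high_risk & ~burned),    # flagged high, not burned
        "FN": np.sum(~high_risk & burned),    # burned but not flagged high
        "TN": np.sum(~high_risk & ~burned),   # neither flagged nor burned
    }
    return {name: 100.0 * count / total for name, count in counts.items()}

# Hypothetical 2 x 3 rasters, for illustration only.
high_demo = [[1, 1, 0], [0, 0, 0]]
burned_demo = [[1, 0, 1], [0, 0, 0]]
cells_demo = overlay_confusion(high_demo, burned_demo)
```

The four percentages always sum to 100, so the table can be read as the share of the study area falling into each outcome.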

Finally, the standard metrics were calculated from the confusion matrix: precision (TP/(TP + FP)), recall or sensitivity (TP/(TP + FN)), specificity (TN/(TN + FP)), the F₁-score, balanced accuracy ((recall + specificity)/2), and the chi-square (χ²) statistic, which tests the strength of association between predicted and actual outcomes. The total number of classified pixels within the study area is 32,697,593, of which 1,150,229 pixels represent the burned areas from 2021 to 2023, which equals almost 3.5% of the total area.
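The metrics listed above can be computed from the four confusion-matrix cells as follows. This is a sketch under assumptions: the function name and the example counts are hypothetical, the F₁-score is taken as the usual harmonic mean of precision and recall, and the chi-square statistic is Pearson's form for a 2 x 2 table without continuity correction, since the text does not state which variant was used.

```python
def susceptibility_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts (sketch)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    n = tp + fp + fn + tn
    # Pearson chi-square for a 2 x 2 table, no continuity correction (assumption).
    chi2 = n * (tp * tn - fp * fn) ** 2 / (
        (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "chi2": chi2,
    }

# Hypothetical counts, for illustration only.
metrics_demo = susceptibility_metrics(tp=30, fp=20, fn=10, tn=40)

# Share of burned pixels reported in the text.
burned_share = 1_150_229 / 32_697_593
```

Because burned pixels make up only a few percent of the scene, precision is bounded by how many unburned pixels are flagged as "High", which is why recall and balanced accuracy are the more informative figures here.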

The methods were applied and examined using two scenarios. In the first scenario, Sentinel-2 bands, ZGI, and DEM were used, while in the second scenario, the DEM data was omitted, and only Sentinel-2 bands and ZGI were used.

Figure 3 presents the methodological framework used.

3. Results

The results compare the model’s performance in two scenarios, with and without elevation data. Figure 4 represents the convergence chart of each NMF scenario used. Convergence in NMF methods refers to the process by which the algorithm iteratively improves its solution until it reaches a stable point where further iterations do not significantly change the results. This property is essential for ensuring the reliability and robustness of NMF, especially in unsupervised tasks, like feature extraction and dimensionality reduction. However, convergence can be sensitive to initialization and the chosen number of components, highlighting the need for careful model setup.

In the first scenario, Sentinel-2 data, ZGI, and elevation were used to assess fire susceptibility (Figure 5). In contrast, only Sentinel-2 data and ZGI were applied in the second scenario, making it solely based on multispectral information (Figure 6).

By testing these two scenarios, the study examines whether including elevation improves the model’s performance or disrupts the clustering when combined with Sentinel-2 data. DEM, a static variable representing topographical elevation, is recognized in many fire susceptibility models, especially supervised methods.

Figure 5 displays the results of applying the discussed unsupervised methods, including Basic NMF (Figure 5a), L1-Sparsity (Figure 5b), L1/2-Sparsity (Figure 5c), SEI-NMF (Figure 5d), PCA (Figure 5e), K-means (Figure 5f), and IsoData (Figure 5g) on Sentinel-2 satellite data, ZGI, and elevation data.

In the same way, Figure 6 presents the resulting maps of applying Basic NMF (Figure 6a), L1-Sparsity (Figure 6b), L1/2-Sparsity (Figure 6c), SEI-NMF (Figure 6d), PCA (Figure 6e), K-means (Figure 6f), and IsoData (Figure 6g) on Sentinel-2 satellite data and ZGI. The dark areas represent the high-fire susceptible areas, while the orange and blue areas represent the average (moderate) and low-fire susceptible areas, respectively.

Furthermore, the red polygons represent the post-2020 burned areas, and the yellow polygons represent the water bodies (Zrebar Lake and Garan Lake). Marivan town, as an urban area, is displayed as a black polygon within the maps.

Figure 7a shows, for the first scenario (including elevation), the statistical distribution of each class obtained from a pixel-level overlay of the 2020 susceptibility maps and the areas that burned from 2021 to 2023 (post-2020 fires). Figure 7b shows the same for the second scenario, without the elevation factor. While NMF L1/2-Sparsity offers the best overlay in the first scenario (Figure 7a), the overlay of SEI-NMF increases and becomes the highest when the DEM data are omitted in the second scenario (Figure 7b).

Table 2 displays the confusion matrix derived for the first scenario (Sentinel-2 bands, ZGI, and DEM) and the second scenario (Sentinel-2 bands and ZGI). The values show the percentage of pixel abundance within each confusion matrix’s elements. Table 3 and Table 4 illustrate the statistical metrics calculated from the confusion matrix regarding the first and second scenarios, respectively.
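The metrics discussed below follow directly from the four confusion-matrix entries. The sketch below shows the standard definitions, assuming the high class is positive and the average and low classes are negative; the function name `binary_metrics` and the example percentages are hypothetical and are not taken from Table 2.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix entries (counts or percentages)."""
    recall = tp / (tp + fn)            # share of burned pixels flagged as high
    specificity = tn / (tn + fp)       # share of unburned pixels not flagged as high
    precision = tp / (tp + fp)         # share of flagged pixels that actually burned
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical percentages chosen to mimic a rare positive class (about 3.5% burned).
example = binary_metrics(tp=2.5, fp=40.0, fn=1.0, tn=56.5)
for name, value in example.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because every metric is a ratio, the function gives the same result whether the entries are pixel counts or percentages of the total.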

4. Discussion

In applying NMF methods, PCA, K-means, and IsoData clustering to Sentinel-2 data, we observed variability across different runs, mainly due to these algorithms’ random initialization and non-convex nature. Without a fixed random seed, these methods may converge to different local minima, producing slightly different decompositions or cluster assignments [59,73]. This is particularly evident in NMF, where the initial values significantly influence the results. While setting a fixed random seed can improve consistency, some variability remains inherent due to the complexity and dimensionality of RS data [59,74]. Such considerations are crucial for ensuring reproducibility and robustness in analyses using high-dimensional satellite imagery. In practice, sparse methods (e.g., NMF-L1- or NMF-L1/2-Sparsity), which enforce sparsity constraints, tend to converge better than the Basic NMF method and promote more interpretable and meaningful features for real-world applications, such as identifying fire-prone zones (Figure 4). Among the methods compared, the L1/2-Sparsity variant converges fastest (Figure 4). We attribute this faster convergence to its stronger sparsity constraint, which accelerates the optimization by promoting more efficient feature selection early in the iterations.
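The effect of the random seed can be illustrated with a small pure-Python Basic NMF. This is a sketch under stated assumptions: it uses the classic multiplicative updates of Lee and Seung, which may differ from the solver used in the study, and the helper names (`matmul`, `transpose`, `basic_nmf`, `frobenius_error`) and the toy 6 × 4 matrix are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def basic_nmf(v, rank, seed, n_iter=200, eps=1e-9):
    """Factorize non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)  # the seed fully determines the starting point
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

# Hypothetical toy data: 6 "pixels" x 4 "bands", built as an exact rank-2 mixture.
abundances = [[1.0, 0.0], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.0, 1.0], [0.6, 0.4]]
endmembers = [[0.9, 0.1, 0.4, 0.2], [0.1, 0.8, 0.3, 0.7]]
toy_v = matmul(abundances, endmembers)

for seed in (0, 1):
    w, h = basic_nmf(toy_v, rank=2, seed=seed)
    print(f"seed={seed} residual={frobenius_error(toy_v, w, h):.6f} first W row={w[0]}")
```

Running the same seed twice reproduces the factors exactly, whereas different seeds generally yield different factor matrices even when the residuals are similar, which is the run-to-run variability described above.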

Regarding the first scenario encompassing Sentinel-2, ZGI, and elevation, the results demonstrate that the NMF methods offer superior predictive performance compared to PCA, K-means, and IsoData [59]. In our highly imbalanced setting, where only about 3.5% of the total area (≈1.15 million out of ≈32.7 million pixels) experienced fire, recall and balanced accuracy metrics become especially important. According to Table 3, NMF-L1/2 had the highest recall (72.18%), showing a strong ability to detect most burned pixels when DEM was included. Its precision was modest (5.73%), indicating some over-flagging, but it still achieved a balanced accuracy of 64.45% and a substantial chi-square value (375,753.54; p < 0.0001). SEI-NMF performed comparably, with a slightly higher balanced accuracy (64.61%) and the highest chi-square (446,861.95), balancing moderate sensitivity (58.78%) and higher precision (6.76%).

NMF-L1 also performed well with 60.77% recall, 6.31% precision, and 63.94% balanced accuracy (χ² ≈ 385,646.66). NMF-Basic showed a simpler but stable pattern (55.30% recall, 6.43% precision, 62.97% balanced accuracy, χ² ≈ 354,027.69), useful where interpretability is key.

PCA achieved the highest specificity (72.56%) but lower recall (41.36%), yielding 56.96% balanced accuracy and a solid χ² (~107,401.38). It suits situations where false alarms must be minimized. In contrast, K-means and IsoData had near-chance performance (recalls ~31–34%, balanced accuracy ~52.9%, χ² ≈ 228,209.04 and 212,653.11), showing limited value for fire risk mapping with DEM.

The second scenario also supported the superior predictive performance of the NMF methods over the other methods (Table 4). This may be attributed to NMF’s decomposition of the data into parts-based representations, which helps to isolate areas with characteristics strongly associated with fire susceptibility [38,59]. SEI-NMF had the highest recall (77%), making it the best at identifying fire-prone areas. Its precision was modest (6.56%), meaning that it over-predicts fire-prone zones. Still, it had the top balanced accuracy (68.51%) and a strong chi-square score (627,079), showing reliable performance. This makes it well suited to early-warning systems, where missing a fire is costlier than raising a false alarm.

NMF-L1/2 followed with 75% recall and 67.35% balanced accuracy; its specificity (59.71%) and precision (6.36%) were close to those of SEI-NMF (precision 6.56%), so it remains a reasonable alternative when false positives matter. NMF-L1 achieved 70.5% recall and 66.03% balanced accuracy, highlighting the effectiveness of sparsity constraints.

PCA again showed the best specificity (68.33%) and precision (6.63%), with 65.01% balanced accuracy and the highest F₁-score (11.97%). NMF-Basic was moderate (62.5% recall, 64.06% balanced accuracy), maintaining simplicity and stability. K-means and IsoData again performed poorly (27.38% and 30.59% recall, respectively; balanced accuracy ~51%), with very low χ² (~1845 and ~3494), reflecting their sensitivity to noise and dimensionality [35].

The low precision values across all models (Table 3 and Table 4), even those with high recall, can be interpreted in light of real-world fire dynamics. Highly fire-prone areas do not necessarily burn every year, since ignition depends on various stochastic and external factors, such as lightning strikes, human activity, or weather extremes. Therefore, many pixels predicted as high risk but not burned are not necessarily misclassified; they simply did not encounter an ignition event during the observation period. This distinction is critical for understanding why false positives (FPs) appear inflated, particularly in high-recall models. Thus, balanced accuracy and recall offer more stable measures of a model’s usefulness than precision alone. In scenarios where the cost of missing a fire-prone zone is high (e.g., evacuation planning, hazard mitigation), high-recall models like SEI-NMF and NMF-L1/2 are invaluable. Conversely, when the intervention capacity is limited and false alarms must be minimized, PCA or NMF-Basic may be preferred due to their more selective, higher-specificity predictions.
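Class imbalance alone also caps the attainable precision, which can be checked with Bayes’ rule. The sketch below is an illustration under stated assumptions: the function name `expected_precision` is hypothetical, and the specificity is derived here from the reported recall and balanced accuracy of NMF-L1/2 in the first scenario rather than read from Table 3.

```python
def expected_precision(prevalence, recall, specificity):
    """Precision implied by class prevalence, recall, and specificity (Bayes' rule)."""
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)

prevalence = 1.15 / 32.7      # burned share quoted in the text (about 3.5%)
recall = 0.7218               # NMF-L1/2, first scenario
balanced_accuracy = 0.6445    # NMF-L1/2, first scenario
# Assumption: specificity recovered from balanced accuracy = (recall + specificity) / 2.
specificity = 2 * balanced_accuracy - recall

implied = expected_precision(prevalence, recall, specificity)
print(f"implied precision: {100 * implied:.2f}%")

# Hypothetical what-if: same recall, but a much stricter specificity of 90%.
print(f"precision at 90% specificity: {100 * expected_precision(prevalence, recall, 0.90):.1f}%")
```

Under these assumptions the implied precision comes out close to the reported 5.73%, and even a hypothetical specificity of 90% at the same recall would only reach a precision of roughly one in five, because burned pixels make up such a small share of the scene.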

Overall, the models performed better without DEM. Among the NMF variants, SEI-NMF saw the largest improvement, with recall rising from 58.78% to 77% and balanced accuracy from 64.61% to 68.51%. NMF-L1/2 and NMF-L1 also improved in both recall and balanced accuracy, and NMF-Basic followed the same trend. PCA showed the largest gains of all, in both recall (61.68% vs. 41.36%) and balanced accuracy (65.01% vs. 56.96%), suggesting that elevation data may have added noise. In contrast, K-means and IsoData declined slightly without DEM, particularly in F₁-score and balanced accuracy. In short, removing DEM enhanced most models, especially those using dimensionality reduction.
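The size of these changes can be tabulated directly from the recall and balanced accuracy values quoted in this section. The snippet is a small bookkeeping sketch: the dictionary layout and the function name `dem_removal_gain` are hypothetical, and K-means and IsoData are left out because the text gives only approximate values for them.

```python
# Recall and balanced accuracy (%) as quoted in the Discussion: (with DEM, without DEM).
reported = {
    "SEI-NMF":   {"recall": (58.78, 77.0),  "balanced_accuracy": (64.61, 68.51)},
    "NMF-L1/2":  {"recall": (72.18, 75.0),  "balanced_accuracy": (64.45, 67.35)},
    "NMF-L1":    {"recall": (60.77, 70.5),  "balanced_accuracy": (63.94, 66.03)},
    "NMF-Basic": {"recall": (55.30, 62.5),  "balanced_accuracy": (62.97, 64.06)},
    "PCA":       {"recall": (41.36, 61.68), "balanced_accuracy": (56.96, 65.01)},
}

def dem_removal_gain(values):
    """Percentage-point change per metric when elevation is dropped."""
    return {metric: round(without_dem - with_dem, 2)
            for metric, (with_dem, without_dem) in values.items()}

gains = {model: dem_removal_gain(values) for model, values in reported.items()}
for model, gain in gains.items():
    print(f"{model:10s} recall {gain['recall']:+.2f} pp, "
          f"balanced accuracy {gain['balanced_accuracy']:+.2f} pp")
```

All five methods listed gain in both metrics once elevation is removed, which is the pattern summarized in the paragraph above.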

While Sentinel-2 imagery spans 2020–2023, only 2020 was used to generate fire-prone classification maps. Validation was conducted using burned area data from 2021 to 2023, ensuring temporal separation and minimizing the risk of overfitting. We also ensured that no post-2020 features were included during the preprocessing or feature extraction. The SEI-NMF method, for example, achieved a recall of 77% and a balanced accuracy of 68.5%, with consistent performance across other metrics, such as F₁-score and specificity. These results suggest strong generalization and no overfitting to the 2020 dataset. Additionally, high chi-square values (627,079, p < 0.0001), especially relative to other methods, confirm the statistically significant correspondence between the predicted fire-prone areas and the actual burned regions in later years.
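How a chi-square value of this magnitude arises from a 2 × 2 contingency table can be sketched as follows. This is an approximation under stated assumptions, not the authors’ computation: it uses the Pearson statistic without continuity correction, the function name `chi_square_2x2` is hypothetical, and the pixel counts are rebuilt from the rounded totals, recall, and balanced accuracy quoted in the text for SEI-NMF in the second scenario.

```python
def chi_square_2x2(tp, fn, fp, tn):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = tp + fn + fp + tn
    numerator = n * (tp * tn - fn * fp) ** 2
    denominator = (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    return numerator / denominator

# Assumption: counts reconstructed from rounded figures in the text, not from Table 2.
total_pixels = 32_700_000
burned_pixels = 1_150_000
unburned_pixels = total_pixels - burned_pixels
recall = 0.77                                   # SEI-NMF, second scenario
specificity = 2 * 0.6851 - recall               # from balanced accuracy of 68.51%

tp = round(recall * burned_pixels)
fn = burned_pixels - tp
tn = round(specificity * unburned_pixels)
fp = unburned_pixels - tn

chi2 = chi_square_2x2(tp, fn, fp, tn)
print(f"chi-square (reconstructed counts): {chi2:,.0f}")
```

With tens of millions of pixels, even a moderate association produces a statistic far above the one-degree-of-freedom critical value of about 15.1 for p = 0.0001, so the chi-square values are best read as a relative ranking between methods rather than as a measure of effect size.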

DEM differs fundamentally from Sentinel-2 reflectance data: it is a static terrain variable, whereas reflectance captures dynamic surface characteristics, like vegetation health, moisture, and land use. These differences can influence clustering in unsupervised methods, potentially altering the cluster composition. Comparing both scenarios highlights how terrain and surface data interact, affecting the adaptability of unsupervised clustering in fire susceptibility assessments. Including DEM data can introduce noise that weakens spectral discrimination compared to using only Sentinel-2 data, which has clearer and more consistent spectral signals. DEM emphasizes elevation features, which may not align with the land cover targets (endmembers), making it harder for models to focus on key spectral traits. Using only Sentinel-2 allows the modified NMF to extract cleaner spectral information.

The NMF models identified non-fire-prone areas like water bodies, bare mountains, and urban zones (yellow polygons in Figure 5a–d and Figure 6a–c). These were reliably marked as low-risk, confirming NMF’s usefulness in separating flammable from non-flammable regions. The PCA also performed well, though with slightly lower precision (Figure 5e and Figure 6e). In contrast, K-means and IsoData often misclassified water and urban areas, contradicting ground truth and other methods. Adjusting them to label water bodies as low-risk might break the model’s consistency, as it could distort the classification of true fire-prone areas, affecting resource use and planning.

Urban areas like Marivan town (black polygon) were also better classified without elevation data. Most methods failed to mark it as low-risk when the elevation was included. Without elevation, the NMF variants and PCA identified it as low-risk, and even K-means and IsoData partly succeeded, although NMF-L1/2 still partially misclassified it as medium-risk.

Overall, NMF has shown strong dimensionality reduction and pattern recognition capabilities, but it has not yet been widely applied in wildfire detection or burn severity mapping [41,59]. This is a missed opportunity, as NMF handles complex, varied data well. In other fields, such as hyperspectral image analysis, advanced forms like Semi-Supervised NMF (SSNMF) have even outperformed PCA and Basic NMF [63]. Despite this, NMF’s potential in wildfire analysis remains underexplored [72,75].

In contrast, PCA has been used effectively in wildfire studies. For example, Zheng et al. [76] used PCA with PolSAR and a random forest to map vegetation structures and fire. Other studies also used PCA with Landsat data to identify burned areas [77,78]. Hybrid methods like K-means with neural networks and FFHDM also showed success in fire detection [79,80].

SEI-NMF and other NMF methods have practical limitations. The independence constraint that improves the interpretability also raises the computation time and memory needs—processing full Sentinel-2 stacks without optimized or parallelized code can take days. Because NMF learns from the input imagery, models trained on dry midsummer data may not work well on green spring scenes, so separate or seasonal composites are needed. Using only one year’s data can embed that year’s weather and growth patterns, biasing validation on later years; multi-year composites can mitigate this. Finally, poor preprocessing (e.g., inconsistent cloud masking or corrections) can lead NMF to learn misleading features.

5. Conclusions

This study evaluated unsupervised learning methods for mapping fire-prone areas in semi-Mediterranean and semi-arid forests using satellite-derived data. It proposed a Sparse Endmember-Independent NMF (SEI-NMF) and compared it against standard NMF, L1- and L1/2-Sparsity NMF variants, PCA, K-means, and IsoData.

When elevation was part of the inputs, NMF-L1/2 showed the highest ability to catch fire-prone pixels while keeping false alarms moderate, and SEI-NMF had the strongest statistical match to actual fires. PCA stood out for its precision, which is helpful when false warnings must be kept to a minimum. Removing elevation made SEI-NMF the most sensitive and accurate, closely followed by NMF-L1/2. In both cases, K-means and IsoData lagged behind, confirming that decomposition methods like NMF and PCA handle complex land-cover signals better than simple clustering.

Throughout our work, we saw that all methods displayed some variability from one run to another, an expected result of random starts in unsupervised algorithms. Using sparse variants (NMF-L1 and NMF-L1/2) and fixing a random seed can improve the consistency. We also found that adding DEM sometimes introduced noise that weakened the performance, suggesting that a careful choice of input layers is crucial.

Although precision remained low across the board because many high-risk areas simply did not ignite during our study period, recall and balanced accuracy metrics proved more meaningful in our highly imbalanced setting (~3.5% burned). High-recall models like SEI-NMF and NMF-L1/2 are ideal for early-warning systems where missing a high-risk patch could have serious consequences. Models like PCA or NMF-Basic may be preferable when resources are limited and false alarms must be reduced.

Finally, to strengthen future fire-risk mapping, we recommend combining unsupervised maps with supervised filters or field-verified points to boost precision. Additionally, dynamic ignition factors, such as recent weather, human access routes, or lightning data, can be incorporated to refine predictions.

Our findings demonstrate that advanced NMF methods, especially SEI-NMF, offer a powerful, interpretable, and scalable approach for fire-prone mapping in semi-Mediterranean and semi-arid regions. This framework can be adapted to other fire-prone landscapes (e.g., parts of Spain, Greece, or Portugal), supporting more proactive wildfire management under changing climate conditions.

Author Contributions

Conceptualization, I.R. and W.B.; methodology, I.R. and W.B.; software, I.R. and W.B.; investigation, I.R., W.B. and L.D.; data curation, I.R.; writing—original draft preparation, I.R. and W.B.; writing—review and editing, I.R., L.D. and A.C.T.; funding acquisition, A.C.T. and L.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This work is supported by national funding awarded by FCT—Foundation for Science and Technology, I.P., projects UIDB/04683/2025, Instituto Ciências da Terra.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zema, D.A.; Nunes, J.P.; Lucas-Borja, M.E. Improvement of Seasonal Runoff and Soil Loss Predictions by the MMF (Morgan-Morgan-Finney) Model after Wildfire and Soil Treatment in Mediterranean Forest Ecosystems. Catena 2020, 188, 104415.
2. Bowman, D.M.J.S.; Moreira-Muñoz, A.; Kolden, C.A.; Chavez, R.O.; Munoz, A.A.; Salinas, F.; Gonzalez-Reyes, A.; Rocco, R.; de la Barrera, F.; Williamson, G.J.; et al. Human-environmental drivers and impacts of the globally extreme 2017 Chilean fires. Ambio 2019, 48, 350–362.
3. Geist, H.J.; Lambin, E.F. Proximate Causes and Underlying Driving Forces of Tropical Deforestation. BioScience 2002, 52, 143–150.
4. Suryabhagavan, K.V.; Alemu, M.; Balakrishnan, M. GIS-Based Multi-Criteria Decision Analysis for Forest Fire Susceptibility Mapping: A Case Study in Harenna Forest, Southwestern Ethiopia. Trop. Ecol. 2016, 57, 33–43.
5. Dos Reis, M.; Graça, P.M.L.d.A.; Yanai, A.M.; Ramos, C.J.P.; Fearnside, P.M. Forest Fires and Deforestation in the Central Amazon: Effects of Landscape and Climate on Spatial and Temporal Dynamics. J. Environ. Manag. 2021, 288, 112310.
6. Kala, C.P. Environmental and socioeconomic impacts of forest fires: A call for multilateral cooperation and management interventions. Nat. Hazards Res. 2023, 3, 286–294.
7. Keenan, R.J. Climate Change Impacts and Adaptation in Forest Management: A Review. Ann. For. Sci. 2015, 72, 145–167.
8. Simioni, G.; Marie, G.; Davi, H.; Martin-St Paul, N.; Huc, R. Natural Forest Dynamics Have More Influence than Climate Change on the Net Ecosystem Production of a Mixed Mediterranean Forest. Ecol. Model. 2020, 416, 108921.
9. Rahimi, I.; Duarte, L.; Teodoro, A.C. Zagros Grass Index—A New Vegetation Index to Enhance Fire Fuel Mapping: A Case Study in the Zagros Mountains. Sustainability 2024, 16, 3900.
10. Teodoro, A.C.; Duarte, L. Forest fire risk maps: A GIS open source application—A case study in Norwest of Portugal. Int. J. Geogr. Inf. Sci. 2013, 27, 699–720.
11. Ghorbanzadeh, O.; Valizadeh, K.K.; Blaschke, T.; Aryal, J.; Naboureh, A.; Einali, J.; Bian, J. Spatial Prediction of Wildfire Susceptibility Using Field Survey GPS Data and Machine Learning Approaches. Fire 2019, 2, 43.
12. Gong, J.; Jin, T.; Cao, E.; Wang, S.; Yan, L. Is Ecological Vulnerability Assessment Based on the VSD Model and AHP-Entropy Method Useful for Loessial Forest Landscape Protection and Adaptive Management? A Case Study of Ziwuling Mountain Region, China. Ecol. Indic. 2022, 143, 109379.
13. Lamat, R.; Kumar, M.; Kundu, A.; Lal, D. Forest Fire Risk Mapping Using Analytical Hierarchy Process (AHP) and Earth Observation Datasets: A Case Study in the Mountainous Terrain of Northeast India. SN Appl. Sci. 2021, 3, 425.
14. Arca, D.; Hacısalihoğlu, M.; Kutoğlu, Ş.H. Producing Forest Fire Susceptibility Map via Multi-Criteria Decision Analysis and Frequency Ratio Methods. Nat. Hazards 2020, 104, 73–89.
15. Tiwari, A.; Shoab, M.; Dixit, A. GIS-Based FFS Modeling in Pauri Garhwal, India: A Comparative Assessment of Frequency Ratio, Analytic Hierarchy Process, and Fuzzy Modeling Techniques. Nat. Hazards 2021, 105, 1189–1230.
16. Moayedi, H.; Mehrabi, M.; Bui, D.T.; Pradhan, B.; Foong, L.K. Fuzzy-Metaheuristic Ensembles for Spatial Assessment of Forest Fire Susceptibility. J. Environ. Manag. 2020, 260, 109867.
17. Shi, C.; Zhang, F. A forest fire susceptibility modeling approach based on integration of machine learning algorithms. Forests 2023, 14, 1506.
18. Trucchia, A.; Meschi, G.; Fiorucci, P.; Gollini, A.; Negro, D. Defining wildfire susceptibility maps in Italy for understanding seasonal wildfire regimes at the national level. Fire 2022, 5, 30.
19. Saha, S.; Bera, B.; Shit, P.K.; Bhattacharjee, S.; Sengupta, N. Prediction of forest fire susceptibility applying machine and deep learning algorithms for conservation priorities of forest resources. Remote Sens. Appl. Soc. Environ. 2023, 29, 100917.
20. Piao, Y.; Lee, D.; Park, S.; Kim, H.G.; Jin, Y. Forest fire susceptibility assessment using Google Earth Engine in Gangwon-do, Republic of Korea. Geomat. Nat. Hazards Risk 2022, 13, 432–450.
21. Kalantar, B.; Ueda, N.; Idrees, M.O.; Janizadeh, S.; Ahmadi, K.; Shabani, F. Forest fire susceptibility prediction based on machine learning models with resampling algorithms on remote sensing data. Remote Sens. 2020, 12, 3682.
22. Mishra, M.; Guria, R.; Baraj, B.; Nanda, A.P.; Santos, C.A.G.; Da Silva, R.M.; Laksono, F.A.T. Spatial analysis and machine learning prediction of forest fire susceptibility: A comprehensive approach for effective management and mitigation. Sci. Total Environ. 2024, 926, 171713.
23. Sharma, L.K.; Gupta, R.; Naureen Fatima, N. Assessing the predictive efficacy of six machine learning algorithms for the susceptibility of Indian forests to fire. Int. J. Wildland Fire 2022, 31, 735–758.
24. Maffei, C.; Menenti, M. An application of the perpendicular moisture index for the prediction of fire hazard. EARSel eProceedings 2014, 13, 13–19.
25. Sulova, A.; Arsanjani, J.J. Exploratory Analysis of Driving Force of Wildfires in Australia: An Application of Machine Learning within Google Earth Engine. Remote Sens. 2020, 13, 10.
26. Sivrikaya, F.; Küçük, Ö. Modeling forest fire risk based on GIS-based analytical hierarchy process and statistical analysis in the Mediterranean region. Ecol. Inform. 2021, 68, 101537.
27. Chaleplis, K.; Walters, A.; Fang, B.; Lakshmi, V.; Gemitzi, A. A Soil Moisture and Vegetation-Based Susceptibility Mapping approach to wildfire events in Greece. Remote Sens. 2024, 16, 1816.
28. Chuvieco, E.; Cocero, D.; Riaño, D.; Martin, P.; Martínez-Vega, J.; De La Riva, J.; Pérez, F. Combining NDVI and surface temperature for the estimation of live fuel moisture content in forest fire danger rating. Remote Sens. Environ. 2004, 92, 322–331.
29. Luz, A.E.O.; Negri, R.G.; Massi, K.G.; Colnago, M.; Silva, E.A.; Casaca, W. Mapping fire susceptibility in the Brazilian Amazon forests using multitemporal remote sensing and time-varying unsupervised anomaly detection. Remote Sens. 2022, 14, 2429.
30. Yankovich, K.S.; Yankovich, E.P.; Baranovskiy, N.V. Classification of vegetation to estimate forest fire danger using LANDSAT 8 Images: Case study. Math. Probl. Eng. 2019, 2019, 6296417.
31. Zaidi, A. Predicting wildfires in Algerian forests using machine learning models. Heliyon 2023, 9, e18064.
32. Wang, M.; Gao, G.; Huang, H.; Heidari, A.A.; Zhang, Q.; Chen, H.; Tang, W. A principal component Analysis-Boosted dynamic Gaussian mixture clustering model for ignition factors of Brazil’s rainforests. IEEE Access 2021, 9, 145748–145762.
33. Thein, A.M.; Htwe, A.N. Based on Principal Component Analysis of Land Use Land Cover Change Detection Using Landsat Satellite Images (Case Study Mandalay City). In Proceedings of the 2023 IEEE Conference on Computer Applications (ICCA), Yangon, Myanmar, 27–28 February 2023; pp. 147–152.
34. Sadeghi, V.; Ebadi, H.; Sadeghi, V.; Moghimi, A. Automatic Land Use/Land Cover Change Detection from Multitemporal Remote Sensed Images and Old Maps by Refining of Training Data Based on Chi-Square Test and K-Means Clustering. J. Geomatics Sci. Technol. 2021, 10, 143–161. Available online: http://jgst.issgeac.ir/article-1-935-en.html (accessed on 10 April 2025).
35. Jarocińska, A.; Kopeć, D.; Kycko, M. Comparison of Dimensionality Reduction Methods on Hyperspectral Images for the Identification of Heathlands and Mires. Sci. Rep. 2024, 14, 27662.
36. Ma, Z.; Liu, Z.; Zhao, Y.; Zhang, L.; Liu, D.; Ren, T.; Zhang, X.; Li, S. An Unsupervised Crop Classification Method Based on Principal Components Isometric Binning. ISPRS Int. J. Geo-Inf. 2020, 9, 648.
37. Lv, Z.; Liu, T.; Shi, C.; Benediktsson, J.A.; Du, H. Novel Land Cover Change Detection Method Based on K-Means Clustering and Adaptive Majority Voting Using Bitemporal Remote Sensing Images. IEEE Access 2019, 7, 34425–34437.
38. Lillesand, T.; Kiefer, R. Remote Sensing and Image Interpretation, 4th ed.; Wiley: New York, NY, USA, 2000.
39. Bioucas-Dias, J.M.; Plaza, A.; Dobigeon, N.; Parente, M.; Du, Q.; Gader, P.; Chanussot, J. Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 354–379.
40. Huang, R.; Jiao, H.; Li, X.; Chen, S.; Xia, C. Hyperspectral unmixing using robust deep nonnegative matrix factorization. Remote Sens. 2023, 15, 2900.
41. Settembre, G.; Taggio, N.; Del Buono, N.; Esposito, F.; Di Lauro, P.; Aiello, A. A land cover change framework analyzing wildfire-affected areas in bitemporal PRISMA hyperspectral images. Math. Comput. Simul. 2024, 229, 855–866.
42. Berry, M.W.; Browne, M.; Langville, A.N.; Pauca, V.P.; Plemmons, R.J. Algorithms and Applications for Approximate Nonnegative Matrix Factorization. Comput. Stat. Data Anal. 2007, 52, 155–173.
43. Gillis, N.; Kuang, D.; Park, H. Hierarchical clustering of hyperspectral images using Rank-Two nonnegative matrix factorization. IEEE Trans. Geosci. Remote Sens. 2014, 53, 2066–2078.
44. Van Nguyen, L.; Lee, G. Underutilized Feature Extraction Methods for Burn Severity Mapping: A Comprehensive Evaluation. Remote Sens. 2024, 16, 4339.
45. Harries, D.; O’Kane, T.J. Applications of matrix factorization methods to climate data. Nonlinear Process. Geophys. 2020, 27, 453–471.
46. Zhu, X.; Li, M.; Deng, Y.; Luo, X.; Shen, L.; Long, C. L₂,₁-Norm Regularized Double Non-Negative Matrix Factorization for Hyperspectral Change Detection. Symmetry 2025, 17, 304.
47. Guillaume, M.; Minghelli, A.; Deville, Y.; Chami, M.; Juste, L.; Lenot, X.; Lafrance, B.; Jay, S.; Briottet, X.; Serfaty, V. Mapping Benthic Habitats by Extending Non-Negative Matrix Factorization to Address the Water Column and Seabed Adjacency Effects. Remote Sens. 2020, 12, 2072.
48. Esi, Ç.; Ertürk, A.; Erten, E. Nonnegative matrix factorization-based environmental monitoring of marine mucilage. Int. J. Remote Sens. 2024, 45, 3764–3788.
49. Yokoya, N.; Chan, J.C.-W.; Segl, K. Potential of resolution-enhanced hyperspectral data for mineral mapping using simulated EnMAP and Sentinel-2 images. Remote Sens. 2016, 8, 172.
50. Khader, A.; Yang, J.; Xiao, L. NMF-DUNET: Nonnegative matrix factorization inspired deep unrolling networks for hyperspectral and multispectral image fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5704–5720.
51. Hou, W.; Liu, X.; Wang, J.; Chen, C.; Xu, X. Multispectral Land Surface Reflectance Reconstruction Based on Non-Negative Matrix Factorization: Bridging Spectral Resolution Gaps for GRASP TROPOMI BRDF Product in Visible. Remote Sens. 2025, 17, 1053.
52. Mupfiga, U.N.; Mutanga, O.; Dube, T.; Kowe, P. Spatial Clustering of Vegetation Fire Intensity Using MODIS Satellite Data. Atmosphere 2022, 13, 1972.
53. Elhag, M.; Yilmaz, N.; Bahrawi, J.; Al-Ghamdi, K.; Mansour, K. Evaluation of Optical Remote Sensing Data in Burned Areas Mapping of Thasos Island, Greece. Earth Syst. Environ. 2020, 4, 813–826.
54. Jazirehi, M.H.; Rostaaghi, E.M. Silviculture in Zagros; University of Tehran Press: Tehran, Iran, 2003; 560p. Available online: https://www.scirp.org/reference/referencespapers?referenceid=1852053 (accessed on 20 November 2023).
55. El-Moslimany, A.P. Ecology and Late-Quaternary History of the Kurdo-Zagrosian Oak Forest Near Lake Zeribar, Western Iran. Vegetation 1986, 68, 55–63.
56. Google Earth Engine Team. COPERNICUS/S2: Sentinel-2 MSI: MultiSpectral Instrument, Level-1C; Google Earth Engine: 2025. Available online: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2 (accessed on 20 February 2025).
57. USGS EROS Archive. Digital Elevation—Shuttle Radar Topography Mission (SRTM). Available online: https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-elevation-shuttle-radar-topography-mission-srtm-1 (accessed on 22 February 2025).
58. National Cartographic Center of Iran. Administrative Boundaries Vector Data. Available online: https://www.ncc.gov.ir (accessed on 20 February 2025).
59. Qian, Y.; Jia, S.; Zhou, J.; Robles-Kelly, A. Hyperspectral unmixing via L1/2 sparsity-constrained nonnegative matrix factorization. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4282–4297.
60. Huang, Q.; Yin, X.; Chen, S.; Wang, Y.; Chen, B. Robust Nonnegative Matrix Factorization with Structure Regularization. Neurocomputing 2020, 412, 72–90.
61. Faraji, M.; Seyedi, S.A.; Akhlaghian Tab, F.; Mahmoodi, R. Multi-label feature selection with global and local label correlation. Expert Syst. With Appl. 2024, 246, 123198.
62. Li, H.; Li, K.; An, J.; Zhang, W.; Li, K. An efficient manifold regularized sparse non-negative matrix factorization model for large-scale recommender systems on GPUs. Inf. Sci. 2019, 496, 464–484.
63. Liu, X.; Wang, W.; He, D.; Jiao, P.; Jin, D.; Cannistraci, C.V. Semi-supervised community detection based on non-negative matrix factorization with node popularity. Inf. Sci. 2017, 381, 304–321.
64. Seyedi, S.A.; Tab, F.A.; Lotfi, A.; Salahian, N.; Chavoshinejad, J. Elastic adversarial deep nonnegative matrix factorization for matrix completion. Inf. Sci. 2023, 621, 562–579.
65. Lee, D.D.; Seung, H.S. Algorithms for non-negative matrix factorization. In Proceedings of the 2000 Conference on Advances in Neural Information Processing Systems, Denver, CO, USA, 1 January 2000; MIT Press: Cambridge, MA, USA, 2001; pp. 556–562.
66. Lu, X.; Wu, H.; Yuan, Y.; Yan, P.; Li, X. Manifold regularized sparse NMF for hyperspectral unmixing. IEEE Trans. Geosci. Remote Sens. 2013, 51, 2815–2826.
67. Xu, R.; Wunsch, D. Survey of clustering algorithms. IEEE Trans. Neural Netw. 2005, 16, 645–678.
68. Kang, J.; Wang, Z.; Sui, L.; Yang, X.; Ma, Y.; Wang, J. Consistency Analysis of Remote Sensing Land Cover Products in the Tropical Rainforest Climate Region: A Case Study of Indonesia. Remote Sens. 2020, 12, 1410.
69. Wei, R.; Ye, C.; Sui, T.; Ge, Y.; Li, Y.; Li, J. Combining Spatial Response Features and Machine Learning Classifiers for Landslide Susceptibility Mapping. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102681.
70. Meyer, H.; Pebesma, E. Machine Learning-Based Global Maps of Ecological Variables and the Challenge of Assessing Them. Nat. Commun. 2022, 13, 29838.
71. Giddey, B.L.; Baard, J.A.; Kraaij, T. Verification of the differenced Normalised Burn Ratio (dNBR) as an index of fire severity in Afrotemperate Forest. S. Afr. J. Bot. 2021, 146, 348–353.
72. Sivrikaya, F.; Günlü, A.; Küçük, Ö.; Ürker, O. Forest fire risk mapping with Landsat 8 OLI images: Evaluation of the potential use of vegetation indices. Ecol. Inform. 2024, 79, 102461.
73. Guo, Z.; Min, A.; Yang, B.; Chen, J.; Li, H. A Modified Huber Nonnegative Matrix Factorization Algorithm for Hyperspectral Unmixing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 5559–5571.
74. Bari, M.H.; Ahmed, T.; Afjal, M.I.; Nitu, A.M.; Uddin, M.P.; Marjan, M.A. Segmented Nonnegative Matrix Factorization for Hyperspectral Image Classification. In Proceedings of the International Conference on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 23–25 February 2023; pp. 1–5.
75. Zhao, L.; Zhuang, G.; Xu, X. Facial expression recognition based on PCA and NMF. In Proceedings of the 7th World Congress on Intelligent Control and Automation (WCICA), Chongqing, China, 25–27 June 2008; pp. 6826–6829.
76. Zheng, Z.; Zeng, Y.; Zou, B.; Xie, Q.; Xian, W.; Xu, W.; Liu, Y.; Liu, Z. Assessing the burn severity of wildfires by incorporating vegetation structure information. Geomat. Nat. Hazards Risk 2024, 15, 1.
77. Henry, M.C. Comparison of single- and multi-date Landsat data for mapping wildfire scars in Ocala National Forest, Florida. Photogramm. Eng. Remote Sens. 2008, 74, 881–891.
78. Epting, J.; Verbyla, D.; Sorbel, B. Evaluation of remotely sensed indices for assessing burn severity in interior Alaska using Landsat TM and ETM+. Remote Sens. Environ. 2005, 96, 328–339.
79. Oladimeji, M.O.; Ghavami, M.; Dudley, S. A new approach for event detection using k-means clustering and neural networks. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015.
80. Lakshmanaswamy, P.; Sundaram, A.; Sudanthiran, T. Prioritizing the right to environment: Enhancing forest fire detection and prevention through satellite data and machine learning algorithms for early warning systems. Remote Sens. Earth Syst. Sci. 2024, 7, 472–485.

Figure 1. Location of the study area—Marivan, Sarvabad in western Iran.

Figure 2. SEI-NMF methodological framework. The red-border steps show the proposed modifications.

Figure 3. Methodological framework.

Figure 4. Convergence rate of various NMF methods per iteration. The L1/2-sparse NMF (c) demonstrates the fastest convergence, attributed to its stronger sparsity-inducing regularization.

Figure 5. Maps generated by (a) Basic NMF, (b) L1-Sparsity, (c) L1/2-Sparsity, (d) SEI-NMF, (e) PCA, (f) K-means, and (g) IsoData using Sentinel-2 bands, ZGI, and elevation. The red areas are the burned areas from 2021 to 2023. The yellow polygons represent the area with water bodies, and the black polygon represents Marivan town.

Figure 6. Maps generated by (a) Basic NMF, (b) L1-Sparsity, (c) L1/2-Sparsity, (d) SEI-NMF, (e) PCA, (f) K-means, and (g) IsoData using Sentinel-2 bands and ZGI. The red areas are the burned areas from 2021 to 2023. The yellow polygons represent the area with water bodies, and the black polygon represents Marivan town.

Figure 7. Statistical distribution of each class for 2020 within the post-2020 burned area, regarding (a) Sentinel-2 data, ZGI, and the elevation, and (b) only Sentinel-2 bands and ZGI.

Table 1. Data sources: satellite imagery and field data.

Data Type	Projection System	Spatial Resolution (m)	Time Period	Source
Sentinel-2	UTM	10	2020–2023	[56]
DEM SRTM	UTM	10	2000	[57]
Vector data: study area, water body, town border	UTM	-	-	[58]

Table 2. Confusion matrix of the classification outcomes for the first and second scenarios. The high label is regarded as positive, and the average and low classes are both regarded as negative.

	First Scenario (%)				Second Scenario (%)
Method	TP	FP	FN	TN	TP	FP	FN	TN
NMF-Basic	1.95	28.32	1.57	68.16	2.2	33.19	1.32	63.3
NMF-L1	2.14	31.74	1.38	64.74	2.48	37.09	1.04	59.4
NMF-L1/2	2.54	41.77	0.98	54.71	2.64	38.88	0.88	57.61
SEI-NMF	2.07	28.52	1.45	67.96	2.71	38.58	0.81	57.9
PCA	1.46	26.48	2.06	70.01	2.17	30.56	1.35	65.93
K-Means	1.1	24.6	2.42	71.88	0.96	24.7	2.55	71.79
IsoData	1.19	27.06	2.33	69.42	1.08	27.08	2.44	69.4
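
The labelling rule stated in the Table 2 caption (high = positive; average and low = negative) can be illustrated with a short sketch. This is a minimal Python example under stated assumptions, not the authors' implementation: the function name `confusion_percentages`, the string class labels, and the toy pixel lists are hypothetical and do not come from the article.

```python
def confusion_percentages(classes, burned, high_label="high"):
    """Return TP, FP, FN, TN as percentages of all pixels.

    classes: per-pixel susceptibility labels (assumed "low", "average", "high")
    burned:  per-pixel booleans, True where the pixel burned after 2020
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for label, is_burned in zip(classes, burned):
        # Only the high class counts as a positive prediction;
        # average and low are both treated as negative.
        predicted_positive = label == high_label
        if predicted_positive and is_burned:
            counts["TP"] += 1
        elif predicted_positive:
            counts["FP"] += 1
        elif is_burned:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels supplied")
    return {name: 100.0 * value / total for name, value in counts.items()}


# Toy data (hypothetical): eight pixels, two of which burned.
toy_classes = ["high", "high", "average", "low", "high", "average", "low", "low"]
toy_burned = [True, False, True, False, False, False, False, False]
result = confusion_percentages(toy_classes, toy_burned)
```

Because the four entries are expressed as shares of all pixels, they sum to 100% in each scenario, as the rows of Table 2 do up to rounding.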

Table 3. Performance metrics derived from the confusion matrix for the first scenario, including precision, recall (sensitivity), F1-score, specificity, balanced accuracy, and chi-square (χ²), computed over the period of 2021–2023.

Method	Precision (%)	Recall (%)	F₁-Score (%)	Specificity (%)	Balanced Acc. (%)	χ²	p-Value
NMF-Basic	6.43	55.30	11.52	70.65	62.97	354,027.69	<0.0001
NMF-L1	6.31	60.77	11.44	67.10	63.94	385,646.66	<0.0001
NMF-L1/2	5.73	72.18	10.62	56.71	64.45	375,753.54	<0.0001
SEI-NMF	6.76	58.78	12.13	70.44	64.61	446,861.95	<0.0001
PCA	5.21	41.36	9.26	72.56	56.96	107,401.38	<0.0001
K-Means	4.27	31.29	7.21	74.57	52.93	228,209.04	<0.0001
IsoData	4.22	33.88	7.39	71.94	52.91	212,653.11	<0.0001
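
The relationship between the confusion matrix in Table 2 and the metrics in Table 3 can be sketched with the standard definitions of precision, recall, specificity, F1-score, and balanced accuracy. This is a minimal Python illustration, assuming those standard formulas; the function name `metrics_from_confusion` is hypothetical, and the inputs are the rounded NMF-Basic percentages from the first scenario of Table 2, so small deviations from the published values should be expected. The chi-square statistic is omitted because it requires pixel counts, which the tables do not report.

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Standard binary metrics, returned as percentages.

    Inputs may be counts or percentages of all pixels; only ratios matter.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": 100 * precision,
        "recall": 100 * recall,
        "specificity": 100 * specificity,
        "f1": 100 * f1,
        "balanced_accuracy": 100 * balanced_accuracy,
    }


# NMF-Basic, first scenario, values taken from Table 2 (rounded percentages).
nmf_basic = metrics_from_confusion(tp=1.95, fp=28.32, fn=1.57, tn=68.16)
for name, value in nmf_basic.items():
    print(f"{name}: {value:.2f}")
```

For this row, the recomputed values match the NMF-Basic entries in Table 3 to within about 0.1 percentage points, which is consistent with rounding of the Table 2 inputs.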

Table 4. Performance metrics derived from the confusion matrix for the second scenario, including precision, recall (sensitivity), F1-score, specificity, balanced accuracy, and chi-square (χ²), computed over the period of 2021–2023.

Method	Precision (%)	Recall (%)	F₁-Score (%)	Specificity (%)	Balanced Acc. (%)	χ²	p-Value
NMF-Basic	6.21	62.51	11.3	65.6	64.06	383,639	<0.0001
NMF-L1	6.27	70.5	11.51	61.56	66.03	477,239	<0.0001
NMF-L1/2	6.36	75	11.72	59.71	67.35	550,592	<0.0001
SEI-NMF	6.56	77	12.09	60.01	68.51	627,079	<0.0001
PCA	6.63	61.68	11.97	68.33	65.01	454,071	<0.0001
K-Means	3.75	27.38	6.6	74.4	50.89	1844.85	<0.0001
IsoData	3.82	30.59	6.11	71.94	51.26	3494.24	<0.0001

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

