1. Introduction
The assessment of direct material flood damage at the building level remains a persistent challenge due to nonlinear hazard–exposure interactions, spatial heterogeneity, and the scarcity of object-level loss records in many flood-prone regions. In rural lowland environments, shallow but spatially extensive inundation is common, and exposure configurations, rather than flood depth alone, often play a dominant role in shaping loss outcomes. These conditions hinder the development of reliable and interpretable building-level flood damage models in structurally data-scarce environments.
Classical depth–damage functions provide deterministic mappings between inundation depth and monetary loss and remain widely used in engineering and civil protection applications due to their simplicity and modest data requirements. However, their transferability across regions and building typologies is limited, and their ability to represent uncertainty and interaction effects, particularly in low-relief floodplains, is constrained [1,2]. Probabilistic, statistical, and machine-learning approaches can incorporate additional explanatory variables and quantify uncertainty. Yet, they generally require extensive calibration datasets and may lack transparency for operational decision-making in data-scarce settings [3,4,5].
Fuzzy logic offers an alternative mechanism for representing imprecise, linguistic, and nonlinear relationships through rule-based mappings between hazard, exposure, and damage. Mamdani-type fuzzy inference systems (FISs) have been applied in flood risk and vulnerability assessment because they encode expert knowledge via interpretable IF–THEN rules and are typically constructed without data-driven statistical calibration [6,7]. However, the existing fuzzy damage models generally focus on direct hazard-based inference and do not explicitly exploit relational similarity among exposed buildings. In low-relief and data-scarce floodplains, this limits discriminatory capacity, as buildings experiencing comparable flood depths may differ substantially due to micro-topographic or structural exposure conditions.
Motivated by these limitations, the present study introduces a similarity-based fuzzy modeling framework for supporting building-level flood damage assessment under structurally data-scarce conditions. The proposed approach integrates: (i) a Composite Exposure Index (EI) constructed from geospatial exposure indicators, (ii) a Mamdani-type fuzzy inference system mapping exposure and hazard to damage, and (iii) a similarity-based amplification mechanism that enhances differentiation among high-exposure configurations. Unlike neuro-fuzzy or deep learning approaches, the proposed method does not require empirical calibration and remains interpretable, which makes it suitable for rural low-relief settings with limited or no loss data.
The main contributions of this work are as follows:
A unified fuzzy modeling framework integrating exposure aggregation, Mamdani fuzzy inference, and similarity-based amplification for building-level flood damage assessment in structurally data-scarce environments;
A scalar Composite Exposure Index (EI) that serves both as a fuzzy inference input and as a prototype-based exposure signature for similarity assessment;
A lightweight similarity-based amplification mechanism that enhances differentiation among high-exposure configurations without empirical calibration while preserving interpretability and the analytical properties of boundedness and monotonicity;
A formal analysis of key mathematical properties, demonstrating monotonicity with respect to exposure and flood depth, boundedness of damage outputs, and stability with respect to the amplification parameter.
This paper is organized as follows. The section Background and Related Work reviews related work and positions the proposed approach within existing methodological paradigms.
Section 2 describes the study area, synthetic data configuration, and methodological framework.
Section 3 presents the numerical results and model behavior.
Section 4 discusses the implications, validation under data scarcity, and limitations.
Section 5 concludes the study and outlines avenues for future research.
Background and Related Work
Flood loss modeling frameworks combine hazard, exposure, and vulnerability information, with methodological differentiation arising primarily from the adopted modeling paradigm. Classical depth–damage functions establish deterministic relationships between inundation depth and loss ratios, calibrated from empirical datasets [1,8,9]. Although widely used due to their parsimony, these models exhibit limited transferability and a weak representation of uncertainty outside their calibration domains, particularly in low-relief floodplains [1,2].
Indicator-based approaches incorporate spatial exposure and physical vulnerability descriptors, such as distance to river, terrain elevation, and structural attributes, as commonly reported in applied studies and reviews (e.g., [10,11,12]). These models better represent spatial heterogeneity than scalar depth–damage functions; however, in many implementations, they rely on linear or weighted-sum aggregation and do not explicitly represent nonlinear interactions or relational similarity among exposed elements.
Probabilistic, statistical, and machine-learning models introduce stochastic structure and data-driven inference. Multivariate regression and related statistical frameworks enable formal uncertainty quantification [3], while classical empirical flood loss models and depth–damage approaches remain widely used in practice [1,9]. Multivariate statistical models have been applied to building-level flood damage analysis [3], while interpretable machine-learning approaches have been explored in recent flood-risk and impact-related studies [13]. At a broader spatial scale, fluvial flood risk projections explicitly integrate hazard, exposure, and vulnerability components to assess their combined and individual contributions to future risk dynamics [14]. Deep learning and other machine-learning approaches have also emerged for flood susceptibility, mapping, and loss-related prediction tasks [5,15]. However, these approaches generally require extensive training datasets and empirical calibration, which constrains their practical deployment in data-scarce settings.
Within fuzzy systems, Mamdani-type fuzzy inference systems (FISs) have been adopted to model hazard intensity, exposure, and damage as linguistic variables linked through rule bases [6,7,16]. These models handle imprecision and are typically constructed without data-driven statistical calibration, making them suitable for data-scarce environments. However, they usually rely on direct hazard-based inference and do not explicitly incorporate similarity relations among exposed buildings. Neuro-fuzzy models such as ANFIS [17] combine fuzzy inference with neural training and have been extensively used for flood susceptibility mapping and spatial hazard analysis, often in data-rich applications and in combination with evolutionary or metaheuristic optimization techniques [18,19,20,21]. While effective in data-rich contexts, their calibration requirements constrain applicability when empirical hazard–impact records are unavailable.
A related methodological strand concerns fuzzy similarity and non-metric similarity measures. Such approaches have been shown to be mathematically well-posed and effective in pattern recognition and clustering applications, without strict reliance on classical distance metrics or high-dimensional metric spaces [22].
Comparative assessments indicate that no single modeling family dominates across all application contexts, particularly in data-scarce environments where expert-based and transparent models may outperform data-intensive approaches [23,24,25,26]. In low-relief floodplains characterized by structural data scarcity, limited building-level loss records, and pronounced exposure heterogeneity, transparent and data-efficient models therefore remain particularly relevant.
Within this landscape, the present study:
Employed an interpretable Mamdani-type rule-based system, appropriate for structurally data-scarce contexts where empirical calibration is infeasible;
Aggregated geospatial exposure indicators into a Composite Exposure Index tailored to low-relief floodplain settings;
Introduced a prototype-based scalar similarity mechanism inspired by fuzzy similarity constructs;
Embedded similarity-based amplification directly within fuzzy inference, rather than relying solely on direct hazard–damage mappings.
2. Materials and Methods
This section describes the study area and the structurally data-scarce context, the construction of the synthetic dataset, the selection and normalization of exposure variables, the formulation of the Composite Exposure Index, the fuzzy inference system, the similarity-based amplification mechanism, the mathematical properties of the model, and the computational workflow.
In the proposed modeling framework, distance to river, foundation elevation, and terrain elevation are treated as exposure variables. After normalization to the unit interval $[0, 1]$, they form exposure indicators that are aggregated into a scalar Composite Exposure Index (EI). Flood depth is treated as a separate hazard variable. The fuzzy inference system produces a baseline direct material damage estimate, which is subsequently transformed by the similarity-based amplification mechanism into the final damage estimate. The scalar EI additionally serves as a one-dimensional exposure signature within the similarity-based amplification mechanism.
2.1. Study Area and Data-Scarce Context
The methodological evaluation was conducted for a low-relief rural floodplain in the settlement of Begeč, Serbia. The terrain is characterized by minimal longitudinal slope gradients, shallow groundwater conditions, and a fluvial regime dominated by slow-onset, shallow inundation from the Danube River. These geomorphological characteristics are typical of lowland floodplains, where flood depth varies gradually and exposure-related attributes such as distance to the river and micro-topographic position strongly influence building-level flood impacts.
Historically, the Begeč floodplain has functioned as a designated flood retention zone within the regional flood defense system. During the 1965 Danube flood, the earthen levee was deliberately breached at this location to prevent catastrophic inundation of Novi Sad, the principal industrial, academic, and administrative center of the region. This intervention reflects the strategic prioritization of high-density urban assets and illustrates the functional role of Begeč as a sacrificial retention zone. Although the 1965 event is well documented hydrologically, no systematic building-level damage inventories exist for Begeč, which is consistent with long-standing patterns of structural data scarcity in rural lowland environments in Southeastern Europe.
In the absence of empirical building-level loss data, insurance claims, or survey-based impact records, empirical calibration of probabilistic, machine-learning, or neuro-fuzzy models is infeasible. These constraints motivate the use of synthetic data for controlled methodological assessment under structurally data-scarce conditions.
For hazard characterization, an extreme flood scenario consistent with a 100-year return period was adopted. Available hydrological records for the study area indicate that such events correspond to Danube water levels of approximately 80 m a.s.l. in the examined domain, resulting in shallow-to-deep inundation depths relative to local terrain elevations (76.50–80.00 m a.s.l.) [27]. Although 100-year floods are traditionally classified as low-probability events, such scenarios remain relevant for risk and damage assessment, highlighting the importance of evaluating damage formation under rare but physically plausible inundation conditions.
Under these conditions, the Begeč floodplain provides a physically consistent and hydro-morphologically representative environment for assessing the behavior and interpretability of the proposed modeling framework in the absence of paired hazard–impact data.
2.2. Synthetic Dataset Construction
A synthetic dataset comprising $N$ residential buildings was constructed to support controlled methodological evaluation under structurally data-scarce conditions. Synthetic or scenario-based datasets have been widely used in flood-risk and flood-loss modeling, particularly when empirical hazard–impact records are unavailable or incomplete, as documented in comparative and state-of-the-art reviews. They are used because they enable the controlled variation of exposure characteristics and support reproducible methodological evaluation (e.g., [1,24,25]).
Each synthetic building footprint was assigned to a spatially coherent location within the floodplain domain. Exposure-related variables were generated within physically plausible ranges constrained by local topography and settlement structure, ensuring geographical realism rather than arbitrary statistical sampling. Spatial descriptors, including terrain elevation and river proximity, were derived from publicly available geospatial sources and processed in a GIS environment to ensure consistency with the physical characteristics of the study area.
The synthetic dataset serves solely as a physically coherent environment for evaluating the proposed fuzzy similarity-based modeling framework. All real-world information entered the model exclusively through physical spatial descriptors, while hazard–impact records, empirical loss inventories, and insurance data were not used. This configuration ensures full reproducibility and supports methodological development under structurally data-scarce conditions.
2.3. Exposure Variables and Physical Interpretation
Let $N$ denote the number of individual buildings in the synthetic dataset. Each building $i = 1, \dots, N$ is characterized by a vector of three exposure variables:
$$\mathbf{x}_i = (x_{i1}, x_{i2}, x_{i3}),$$
where $x_{i1}$ denotes the distance to the river, $x_{i2}$ denotes the foundation elevation relative to ground level, and $x_{i3}$ denotes the terrain elevation above sea level.
These variables are widely adopted in indicator-based flood exposure and vulnerability assessment frameworks due to their influence on hydraulic connectivity, inundation pathways, and structural exposure (e.g., [10,11,12]). Specifically, distance to the river serves as a proxy for hydraulic connectivity and inundation propensity, while terrain elevation reflects local inundation and drainage potential. Both are widely used in GIS-supported indicator-based exposure and vulnerability frameworks (e.g., [10,11]). Foundation elevation can be interpreted as a structural resistance characteristic relevant to shallow flooding and surface-water intrusion mechanisms, consistent with building-related damage-influencing factors discussed in flood damage and vulnerability studies [1,8,9]. Collectively, these variables represent hydro-morphological and structural characteristics relevant for building-level flood damage formation in low-relief environments [1,2].
Following normalization to the unit interval in Section 2.4, the normalized quantities $\tilde{x}_{ij}$ are referred to as exposure indicators. These indicators were subsequently aggregated into a scalar Composite Exposure Index (EI) (Section 2.5), which also serves as a one-dimensional exposure signature within the similarity-based amplification mechanism (Section 2.7).
2.4. Parameter Ranges and Exposure Normalization
Synthetic exposure variables were generated within physically plausible ranges representing typical residential and terrain characteristics of the Begeč floodplain:
Distance to the river and foundation elevation are expressed in meters, while terrain elevation is expressed in meters above sea level (m a.s.l.).
The numerical ranges for the synthetic exposure variables were derived from real elevation data, building footprints, and river proximity characteristics of the Begeč floodplain, ensuring physically consistent values. In addition, a GIS-based flood damage assessment study for the same region [27] provided empirical context consistent with settlement patterns and exposure conditions characteristic of low-relief floodplains. The hydro-morphological characteristics of low-relief floodplains discussed in [2] are likewise consistent with these exposure conditions and with the physical setting of the Begeč floodplain.
All exposure variables were normalized to the unit interval $[0, 1]$ to obtain dimensionless exposure indicators using inverse min–max normalization:
$$\tilde{x}_{ij} = \frac{x_j^{\max} - x_{ij}}{x_j^{\max} - x_j^{\min}},$$
where $x_{ij}$ denotes the raw physical value of indicator $j$ for building $i$, and $x_j^{\min}$ and $x_j^{\max}$ represent the minimum and maximum values of that variable across the building set. This transformation yields normalized exposure indicators $\tilde{x}_{ij} \in [0, 1]$, where larger values consistently represent higher exposure.
The use of inverse normalization ensures a uniform exposure orientation across indicators, facilitating aggregation and interpretation within the fuzzy modeling framework. The resulting bounded representation embeds spatial exposure into a numerical domain suitable for fuzzy inference.
The min–max bounds reflect physically meaningful local limits. Adaptation of the model to other lowland floodplains therefore requires only redefining the bounds ($x_j^{\min}$, $x_j^{\max}$) without altering the model structure, consistent with the context-dependent indicator scaling discussed in [2].
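The inverse min–max normalization described above can be illustrated with a minimal Python sketch. The function name `inverse_min_max` and the sample elevations are hypothetical; the sample values are chosen only to lie within the 76.50–80.00 m a.s.l. terrain range quoted for the study area, and the snippet is not taken from the original R implementation.

```python
def inverse_min_max(values):
    """Map raw values to [0, 1] so that the smallest raw value maps to 1.

    Smaller distance or elevation means higher exposure, hence the inversion.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; normalization is undefined")
    return [(hi - v) / (hi - lo) for v in values]


# Hypothetical terrain elevations (m a.s.l.), for illustration only.
terrain = [76.5, 77.0, 78.25, 80.0]
indicators = inverse_min_max(terrain)
print(indicators)  # lowest terrain gets indicator 1.0, highest gets 0.0
```

Applying the same function to each raw variable gives all three indicators the same orientation, which is what allows them to be aggregated afterwards.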
2.5. Composite Exposure Index (EI)
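Normalized exposure variables are aggregated into a scalar Composite Exposure Index (EI) using a convex weighted sum:
$$\mathrm{EI}_i = \sum_{j=1}^{3} w_j \, \tilde{x}_{ij}, \qquad w_j \ge 0, \quad \sum_{j=1}^{3} w_j = 1,$$
where $\tilde{x}_{ij}$ denote the normalized exposure indicators for building $i$ and $w_j$ denote non-negative weighting coefficients.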
In this study, the weight vector $\mathbf{w} = (w_1, w_2, w_3)$ was adopted to reflect a commonly used indicator-based exposure formulation in which hydraulic connectivity, represented by distance to the river, was assigned higher relative importance, while foundation elevation and terrain elevation represented secondary but non-negligible exposure components. Such heuristic weighting schemes are widely applied in indicator-based flood exposure and vulnerability assessments, where normalized variables are aggregated into composite indices, often in the absence of paired exposure–damage data for empirical calibration (e.g., [10,11]).
The specific numerical values of the weights are not intended to represent optimal or calibrated parameters, but rather to encode a physically motivated ordering of exposure importance consistent with low-relief floodplain environments [2]. Since the objective of this work is methodological demonstration under structurally data-scarce conditions rather than predictive optimization, no statistical calibration or learning-based parameter identification was performed. Data-driven or empirically calibrated weighting strategies could be incorporated in future extensions when suitable exposure–damage datasets become available.
The resulting Composite Exposure Index provides a one-dimensional, interpretable exposure signature that summarizes multidimensional physical exposure conditions relevant to flood damage formation. In subsequent sections, this scalar index serves both as an input to the fuzzy inference system and as an exposure signature for similarity-based amplification (Section 2.7).
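A convex weighted sum of this kind can be sketched in a few lines of Python. The weights in `ASSUMED_WEIGHTS` are hypothetical placeholders that only respect the stated ordering (distance to the river weighted highest); they are not the values used in the study, and the function name is illustrative.

```python
def composite_exposure_index(indicators, weights):
    """Convex weighted sum of normalized exposure indicators (all in [0, 1])."""
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 for a convex combination")
    return sum(w * x for w, x in zip(weights, indicators))


# Hypothetical weights: (distance to river, foundation elevation, terrain elevation).
ASSUMED_WEIGHTS = (0.5, 0.25, 0.25)

# Hypothetical normalized indicators for one building.
ei = composite_exposure_index((0.8, 0.4, 0.6), ASSUMED_WEIGHTS)
print(round(ei, 3))
```

Because the weights are non-negative and sum to one, the index stays inside the unit interval whenever the indicators do, which is the boundedness that the later analysis relies on.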
2.6. Fuzzy Inference System
Fuzzy logic provides a formal framework for representing imprecision and linguistic uncertainty in complex systems [28]. The fuzzy inference system maps exposure and hazard conditions to a crisp estimate of direct material damage. Two input linguistic variables were defined: the Composite Exposure Index (EI) and flood depth ($h$), and one output linguistic variable representing direct material damage ($D$). A Mamdani-type structure was adopted due to its interpretability and compatibility with expert-driven rule construction under structurally data-scarce conditions.
2.6.1. Linguistic Variables and Membership Functions
The adopted linguistic partitions follow the conventions commonly used in fuzzy flood-risk and flood-damage inference systems [6,16], where linguistic discretization supports transparent rule construction and domain expert interpretation. With $h$ denoting flood depth and $D$ direct material damage, the input linguistic variables are defined as:
$$\mathrm{EI} \in \{\text{Low}, \text{Medium}, \text{High}\}, \qquad h \in \{\text{Shallow}, \text{Moderate}, \text{Deep}\},$$
while the output linguistic variable adopts the partition:
$$D \in \{\text{Low}, \text{Medium}, \text{High}, \text{Very High}\}.$$
The damage output is numerically bounded as:
$$0 \le D \le D_{\max},$$
where in this study, $D_{\max}$ = 50,000 EUR. This value reflects an upper reconstruction-cost limit for typical single-family residential buildings in the study area and ensures that inferred damage magnitudes remain physically meaningful within the local building context.
Flood depth ($h$) is treated as a crisp hazard variable and computed as:
$$h_i = \max\left(0,\; H_w - x_{i3}\right),$$
where $H_w$ denotes the water level associated with the analyzed hazard scenario and $x_{i3}$ the terrain elevation of building $i$. A 100-year Danube flood event was adopted as the reference scenario, corresponding to water levels of approximately 80 m a.s.l. in the study area. Accordingly, flood depth was constrained to non-negative values and, under the adopted scenario, spanned approximately the interval $[0, 3.5]$ meters, yielding inundation depths corresponding to shallow-to-deep fluvial flooding conditions.
Membership breakpoints for flood depth were defined to reflect physically plausible inundation regimes consistent with this extreme flood scenario, rather than generic depth thresholds. Specifically:
Shallow flooding (<0.6 m) corresponds to limited overbank inundation;
Moderate flooding (approximately 0.5–2.3 m) reflects typical fluvial backwater conditions in low-relief floodplains;
Deep flooding (>2.0 m) captures extreme retention-zone behavior during peak events.
These depth ranges are consistent with empirically documented damage escalation mechanisms for residential buildings, whereby shallow inundation primarily affects flooring and surface materials, moderate flood depths involve walls, utilities, and interior finishes, and deeper inundation may result in extensive interior or structural damage. Such mechanisms are widely reported in empirical vulnerability studies and review-based depth–damage analyses (e.g., [1,9,29]).
The exact numerical parameters for all membership functions (EI, flood depth, and damage) are provided in Appendix A.
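As a minimal sketch of how a crisp flood depth could be computed and fuzzified, the Python snippet below uses trapezoidal membership functions. The breakpoints in `DEPTH_TERMS` are hypothetical placeholders chosen only to be consistent with the depth ranges quoted above (Shallow below 0.6 m, Moderate roughly 0.5–2.3 m, Deep above 2.0 m); the actual parameters are those given in Appendix A. The depth formula assumes that depth is measured relative to terrain elevation, and the names `flood_depth`, `trapmf`, and `fuzzify_depth` are illustrative.

```python
def flood_depth(water_level, terrain_elevation):
    """Non-negative inundation depth in meters (assumed relative to terrain)."""
    return max(0.0, water_level - terrain_elevation)


def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical breakpoints in meters; see Appendix A for the real ones.
DEPTH_TERMS = {
    "Shallow": (0.0, 0.0, 0.3, 0.6),
    "Moderate": (0.5, 1.4, 1.4, 2.3),
    "Deep": (2.0, 2.6, 3.5, 3.5),
}


def fuzzify_depth(depth):
    """Return the membership degree of a crisp depth in each linguistic term."""
    return {term: trapmf(depth, *params) for term, params in DEPTH_TERMS.items()}


# Water level of about 80 m a.s.l. and a hypothetical terrain elevation.
depth = flood_depth(80.0, 77.9)
print(fuzzify_depth(depth))  # partial membership in both Moderate and Deep
```

The overlap between neighboring terms is what produces gradual transitions in the inferred damage instead of step changes at fixed depth thresholds.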
2.6.2. Fuzzy Rule Base
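A Mamdani-type fuzzy rule base comprising nine monotone rules maps combinations of exposure and hazard linguistic states to damage outcomes. Each rule follows the generic structure:
$$\text{IF } \mathrm{EI} \text{ is } A \text{ AND } h \text{ is } B \text{ THEN } D \text{ is } C,$$
where $A$, $B$, and $C$ denote the linguistic terms of the exposure index, flood depth $h$, and damage $D$, respectively, as defined in Section 2.6.1.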
Monotonicity is enforced by design, such that increases in either exposure (EI) or flood depth ($h$) cannot result in a lower inferred damage category. This design principle is consistent with the established empirical understanding of residential flood impacts, where greater structural exposure or more severe inundation conditions are not associated with reduced direct material losses.
The complete fuzzy rule base is provided in Appendix B.
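The monotonicity requirement can be checked mechanically. The Python sketch below encodes a hypothetical 3 × 3 rule table (the actual rules are those in Appendix B) and verifies that moving to a higher exposure term or a deeper flood term never lowers the damage category. The term names follow the linguistic classes mentioned in the text; the specific consequents in `RULES` are assumptions for illustration.

```python
EXPOSURE = ["Low", "Medium", "High"]
DEPTH = ["Shallow", "Moderate", "Deep"]
DAMAGE = ["Low", "Medium", "High", "Very High"]

# Hypothetical rule table: (exposure term, depth term) -> damage term.
RULES = {
    ("Low", "Shallow"): "Low",
    ("Low", "Moderate"): "Low",
    ("Low", "Deep"): "Medium",
    ("Medium", "Shallow"): "Low",
    ("Medium", "Moderate"): "Medium",
    ("Medium", "Deep"): "High",
    ("High", "Shallow"): "Medium",
    ("High", "Moderate"): "High",
    ("High", "Deep"): "Very High",
}


def is_monotone(rules):
    """True if the damage rank never decreases along either input axis."""
    rank = {term: k for k, term in enumerate(DAMAGE)}
    for i, exposure in enumerate(EXPOSURE):
        for j, depth in enumerate(DEPTH):
            here = rank[rules[(exposure, depth)]]
            if i + 1 < len(EXPOSURE) and rank[rules[(EXPOSURE[i + 1], depth)]] < here:
                return False
            if j + 1 < len(DEPTH) and rank[rules[(exposure, DEPTH[j + 1])]] < here:
                return False
    return True


print(is_monotone(RULES))
```

A check like this is cheap to run whenever the rule table is edited, so an expert revision cannot silently break the ordering that the analytical properties depend on.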
2.6.3. Defuzzification
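Crisp damage estimates were obtained using centroid defuzzification:
$$D = \frac{\int_{0}^{D_{\max}} y \, \mu_{\mathrm{agg}}(y) \, dy}{\int_{0}^{D_{\max}} \mu_{\mathrm{agg}}(y) \, dy}, \tag{5}$$
where $\mu_{\mathrm{agg}}$ denotes the aggregated output membership function over the damage domain $[0, D_{\max}]$. Centroid defuzzification was adopted due to its numerical stability and interpretability in continuous output spaces and is widely used in Mamdani-type inference systems in applied fuzzy flood modeling studies [6,16]. In the present framework, centroid aggregation produces a single representative damage estimate by combining the contributions of all activated output membership functions.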
The suitability of this operator follows from the convexity of the output membership functions and the monotone structure of the Mamdani rule base. Under these conditions, centroid defuzzification preserves boundedness, monotonicity, and interpretability of the resulting mapping from linguistic inputs to crisp outputs. These properties are particularly advantageous in structurally data-scarce environments, where the empirical calibration of rule weights or fuzzy set parameters is not feasible.
The applicability of Equation (5) and its closed-form implementation for the adopted output membership functions are formally justified in Appendix C.
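For readers who want to experiment without the closed-form expressions of Appendix C, centroid defuzzification can be approximated numerically. The Python sketch below replaces the integrals by sums over an evenly spaced grid; this is a swapped-in numerical approximation, not the closed-form implementation referred to above. The triangular output set and the clipping level 0.4 are hypothetical.

```python
def centroid(mu, lower, upper, samples=2001):
    """Approximate the centroid of membership function mu on [lower, upper]."""
    step = (upper - lower) / (samples - 1)
    ys = [lower + k * step for k in range(samples)]
    weights = [mu(y) for y in ys]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fired: aggregated membership is zero everywhere")
    return sum(y * w for y, w in zip(ys, weights)) / total


D_MAX = 50_000.0  # upper damage bound in EUR, as stated in the text


def triangle(y):
    """Hypothetical output set centered at 25,000 EUR with half-width 10,000 EUR."""
    return max(0.0, 1.0 - abs(y - 25_000.0) / 10_000.0)


def clipped(y):
    """Mamdani-style clipping of the output set at firing strength 0.4."""
    return min(0.4, triangle(y))


print(centroid(clipped, 0.0, D_MAX))
```

Because the result is a weighted average of points in the damage domain, it cannot leave that domain, which is the boundedness argument in numerical form.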
2.7. Similarity-Based Amplification Mechanism
Empirical and survey-based studies report disproportionately larger variability and higher absolute flood losses among households and residential assets subject to adverse exposure conditions, particularly in data-scarce settings [3,4]. In such regimes, buildings experiencing comparable flood depths may nevertheless exhibit markedly different damage outcomes due to differences in hydraulic connectivity, micro-topographic position, and structural exposure characteristics. The baseline Mamdani-type fuzzy inference system exhibits limited discriminatory resolution in these high-exposure regimes due to the coarse linguistic partitioning inherent to rule-based models.
To enhance differentiation without increasing model dimensionality or introducing empirical calibration requirements, a lightweight similarity-based amplification mechanism was introduced. Exposure-relevant physical attributes were aggregated into a one-dimensional exposure signature, the Composite Exposure Index (EI), which summarizes multidimensional exposure conditions relevant to damage formation. Under this reduced representation, similarity with respect to flood-relevant exposure conditions is defined as:
which corresponds to a prototype-based exposure similarity indicator relative to the most disadvantageous configuration $\mathrm{EI}^{*} = 1$. The prototype $\mathrm{EI}^{*}$ represents the worst-case exposure state within the normalized domain and serves as a natural anchor point in the absence of empirical damage calibration. Here, similarity is not defined in the sense of fuzzy set similarity measures or distance-based metrics, but as a prototype-referenced exposure proximity indicator within a reduced scalar representation.
Defining similarity along the exposure axis is physically motivated by the characteristics of low-relief floodplains, where shallow inundation dynamics, hydraulic connectivity, and micro-topographic sensitivity exert a dominant influence on residential flood damage. Consequently, similarity in exposure conditions provides a meaningful basis for enhancing differentiation among buildings subject to otherwise comparable hazard intensities.
The amplified crisp damage value is computed as:
where $D$ denotes the baseline fuzzy damage estimate. This formulation selectively increases the damage values for high-exposure configurations (EI close to 1) while leaving low-exposure buildings largely unaffected, thereby increasing discriminatory resolution in the portion of the exposure–hazard space where empirical variance in losses is most frequently observed (e.g., [3,4]). Importantly, the relative ordering of damage estimates is preserved, and the qualitative behavior of the baseline fuzzy inference system remains unchanged under similarity-based modulation.
The coefficient α has no direct physical interpretation; its role is purely structural, controlling the strength of similarity-based differentiation without altering the underlying fuzzy inference logic. The amplification interval was selected based on three complementary considerations:
Accordingly, the amplification coefficient is not a calibration or tuning parameter but a bounded modulation factor introduced to improve internal discriminatory resolution under structural data scarcity.
Finally, the similarity-based amplification does not constitute generic proportional scaling, as similarity operates on the composite exposure signature rather than directly on hazard variables or monetary values. Unlike ANFIS and other neuro-fuzzy systems, the mechanism introduces relational exposure structure without parameter training, thereby preserving interpretability, analytical transparency, and compatibility with data-scarce floodplain environments.
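One way to make the mechanism concrete is the Python sketch below. Both functional forms are assumptions made for illustration and are not quoted from the source: similarity is taken as proximity to the worst-case prototype, `1 - |EI - EI*|` with `EI* = 1`, and amplification is taken as multiplicative, `D * (1 + alpha * S)`. The value `ALPHA = 0.2` is likewise hypothetical.

```python
PROTOTYPE_EI = 1.0  # assumed worst-case exposure state in the normalized domain
ALPHA = 0.2         # hypothetical amplification coefficient


def similarity(ei, prototype=PROTOTYPE_EI):
    """Assumed prototype-referenced proximity of an exposure signature, in [0, 1]."""
    if not 0.0 <= ei <= 1.0:
        raise ValueError("EI must lie in the unit interval")
    return 1.0 - abs(ei - prototype)


def amplify(baseline_damage, ei, alpha=ALPHA):
    """Assumed multiplicative amplification of a baseline damage estimate."""
    if alpha < 0.0:
        raise ValueError("alpha must be non-negative")
    return baseline_damage * (1.0 + alpha * similarity(ei))


# Hypothetical (EI, baseline damage in EUR) pairs, ordered by exposure.
buildings = [(0.1, 4_000.0), (0.5, 15_000.0), (0.95, 38_000.0)]
for ei, base in buildings:
    print(ei, base, round(amplify(base, ei), 2))
```

Under these assumptions the amplification factor is non-decreasing in EI, so a baseline that is non-decreasing in EI stays non-decreasing after amplification, and a building with EI = 0 is left exactly at its baseline value.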
2.8. Mathematical Properties
The proposed modeling framework satisfies three fundamental analytical properties that ensure consistent, interpretable, and numerically robust model behavior under structurally data-scarce conditions.
Monotonicity. The damage estimates are monotone non-decreasing with respect to both exposure and flood depth. Increases in the Composite Exposure Index or flood depth cannot result in lower inferred damage values. This property follows from the ordered structure of the fuzzy rule base, the use of centroid defuzzification, and the non-negative similarity-based amplification term.
Boundedness. All model components are defined over bounded domains. The exposure signature is confined to the unit interval, the amplification parameter is bounded, and the damage output is explicitly limited by an upper reconstruction-cost threshold. Consequently, the amplified damage estimates remain finite and physically meaningful under all admissible parameter values.
Stability. The model exhibits stable behavior with respect to variations in the amplification parameter. Small changes in the amplification coefficient induce proportionally small changes in the resulting damage values, without altering their relative ordering. This continuous and controlled response ensures numerical robustness and prevents amplification-induced instability.
Formal mathematical justifications of these properties are provided in Appendix C.
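To indicate how such properties can be argued, consider a multiplicative amplification $D^{\mathrm{amp}} = D\,(1 + \alpha S)$ with similarity $S \in [0, 1]$, baseline damage $0 \le D \le D_{\max}$, and coefficient $0 \le \alpha \le \alpha_{\max}$. This form and the symbols $D^{\mathrm{amp}}$, $S$, and $\alpha_{\max}$ are illustrative assumptions; the formal statements and proofs are those of Appendix C. Under these assumptions, boundedness and stability follow in one line each:

```latex
\begin{align*}
0 \le D^{\mathrm{amp}} &= D\,(1 + \alpha S) \le (1 + \alpha_{\max})\, D_{\max}, \\
\left| D^{\mathrm{amp}}(\alpha_1) - D^{\mathrm{amp}}(\alpha_2) \right|
  &= D\, S\, \left| \alpha_1 - \alpha_2 \right| \le D_{\max} \left| \alpha_1 - \alpha_2 \right| .
\end{align*}
```

The second line says that the amplified damage is Lipschitz continuous in $\alpha$ with constant $D_{\max}$, which is one precise reading of "small changes in the coefficient induce proportionally small changes in damage". Monotonicity follows because the product of two non-negative, non-decreasing functions of exposure is itself non-decreasing.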
2.9. Computational Implementation
The proposed modeling framework was implemented in the R programming environment. Membership functions, fuzzy rule bases, and centroid defuzzification were implemented using standard numerical routines available in base R. No machine-learning optimization, parameter fitting, or empirical calibration was performed at any stage of the implementation. Visualizations of membership functions and representative model outputs were generated using base R graphics utilities.
2.10. Workflow Overview
Figure 1 presents a structural workflow diagram summarizing the modeling pipeline, which consists of the following steps:
Definition and normalization of exposure variables;
Computation of the Composite Exposure Index (EI);
Computation of flood depth ($h$) under the considered hazard scenario;
Fuzzy inference based on the input pair (EI, $h$);
Similarity-based amplification of damage estimates;
Derivation of final damage values.
All steps are implemented without empirical calibration or data-driven parameter estimation, reflecting the methodological suitability of the proposed framework for structurally data-scarce environments.
3. Results
This section presents the numerical results of the proposed similarity-based fuzzy damage model on the synthetic building dataset, covering the distribution of exposure conditions, baseline damage behavior, amplification effects, and sensitivity to the amplification coefficient.
3.1. Distribution of the Composite Exposure Index
The Composite Exposure Index (EI) provides a scalar representation of exposure conditions across the building set. Figure 2 shows the empirical cumulative distribution function (CDF) of EI, indicating that values spanned nearly the entire unit interval without artificial clustering near the bounds. This confirms that the synthetic dataset covers a broad range of low, intermediate, and high exposure configurations, providing a suitable basis for evaluating model behavior across distinct exposure regimes.
Summary statistics of the Composite Exposure Index (EI) are reported in Table 1. The interquartile range (0.28–0.74) confirms substantial exposure heterogeneity within the dataset, which is essential for assessing the discriminatory behavior of both the baseline fuzzy inference system and the proposed similarity-based amplification mechanism.
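An empirical CDF and quartiles of this kind can be computed with the Python standard library alone, as in the sketch below. The EI values in `ei_values` are hypothetical and do not reproduce the dataset or the statistics in Table 1; the helper name `ecdf` is illustrative.

```python
from bisect import bisect_right
from statistics import quantiles


def ecdf(sample):
    """Return F with F(x) = fraction of sample values less than or equal to x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cumulative(x):
        return bisect_right(ordered, x) / n

    return cumulative


# Hypothetical EI values, for illustration only.
ei_values = [0.05, 0.2, 0.3, 0.45, 0.6, 0.75, 0.9, 0.98]

F = ecdf(ei_values)
q1, median, q3 = quantiles(ei_values, n=4, method="inclusive")
print(F(0.45), (q1, q3))
```

Reading the interquartile range directly off `q1` and `q3` gives a quick check of whether a generated dataset spreads across low, intermediate, and high exposure before any fuzzy inference is run.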
3.2. Baseline Fuzzy Damage Estimates
Baseline fuzzy damage estimates were obtained using the Mamdani-type fuzzy inference system described in Section 2. Figure 3 shows the relationship between the Composite Exposure Index (EI) and the corresponding baseline damage values.
The resulting response curve exhibited smooth and nonlinear behavior, with damage increasing gradually at low exposure levels and rising more rapidly at higher exposure values as multiple fuzzy rules become jointly activated. This behavior reflects the overlapping structure of the membership functions and the monotone design of the fuzzy rule base, which together prevent abrupt transitions or discontinuities commonly associated with deterministic depth–damage formulations.
For low values of the Composite Exposure Index, the inferred damage remained primarily within the Low and Medium linguistic classes, whereas higher values increasingly activated the High and Very High damage classes. This progression is consistent with qualitative damage escalation patterns observed under increasing exposure severity in low-relief floodplains.
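A minimal single-input Mamdani system illustrates how overlapping membership functions and a monotone rule base produce a smooth response of this kind. This is a sketch under stated assumptions, not the rule base of Section 2: the triangular membership parameters, the one-to-one rules, and the names `tri`, `EXPOSURE_SETS`, `DAMAGE_SETS`, and `baseline_damage` are all hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a == b or b == c gives a shoulder)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical input sets on the exposure index and output sets on damage, both on [0, 1].
EXPOSURE_SETS = {
    "low": (0.0, 0.0, 0.35),
    "medium": (0.15, 0.40, 0.65),
    "high": (0.45, 0.70, 0.90),
    "very_high": (0.70, 1.0, 1.0),
}
DAMAGE_SETS = {
    "low": (0.0, 0.0, 0.30),
    "medium": (0.20, 0.40, 0.60),
    "high": (0.50, 0.70, 0.90),
    "very_high": (0.80, 1.0, 1.0),
}
# Assumed one-to-one rule base: IF exposure is X THEN damage is X.
RULES = [(name, name) for name in EXPOSURE_SETS]


def baseline_damage(cei, steps=200):
    """Mamdani inference: min implication, max aggregation, centroid defuzzification."""
    firing = {out: tri(cei, *EXPOSURE_SETS[inp]) for inp, out in RULES}
    num = den = 0.0
    for k in range(steps + 1):
        y = k / steps
        mu = max(min(w, tri(y, *DAMAGE_SETS[out])) for out, w in firing.items())
        num += y * mu
        den += mu
    return num / den if den > 0 else 0.0


for cei in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"CEI={cei:.1f} -> baseline damage={baseline_damage(cei):.3f}")
```

Because adjacent input sets overlap, two rules typically fire at once in the transition zones, so the centroid shifts gradually from one damage class to the next instead of jumping at a threshold.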
3.3. Effect of Similarity-Based Damage Amplification
To examine the effect of similarity-based modulation, baseline fuzzy damage estimates were transformed into amplified values using the similarity-based amplification mechanism defined in Section 2.7. Figure 4 compares the baseline and similarity-amplified damage estimates for a representative value of the amplification coefficient.
The results indicate that similarity-based amplification selectively increases damage estimates for buildings associated with high exposure levels (large values of the Composite Exposure Index) while leaving low-exposure configurations largely unchanged. Importantly, the amplification mechanism preserves the relative ordering of damage estimates, consistent with the analytical monotonicity property demonstrated in Appendix C.
To further characterize this behavior, Table 2 reports the aggregated mean values of the baseline and amplified damage across three exposure classes (Low, Medium, High) defined by the Composite Exposure Index. The largest relative increases were observed in the High exposure class, confirming that the amplification mechanism primarily enhances differentiation in regimes where exposure-related heterogeneity is most pronounced.
These results demonstrate that similarity-based amplification increases contrast among high-exposure configurations without altering the underlying fuzzy inference structure. This behavior supports the role of the proposed mechanism as a structural enhancement for resolving exposure-driven differentiation in data-scarce modeling contexts.
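The class-wise comparison can be sketched as follows. The functional forms here are assumptions made purely for illustration and are not the definitions of Section 2.7: a quadratic stand-in for the baseline damage, a Gaussian similarity to a single high-exposure prototype, a multiplicative amplification capped at the upper output bound, and equal-width class thresholds. The coefficient value and all function names are likewise hypothetical.

```python
import math


def baseline(cei):
    """Hypothetical smooth, monotone stand-in for the baseline fuzzy damage."""
    return 0.8 * cei ** 2


def similarity(cei, prototype=1.0, width=0.3):
    """Assumed Gaussian similarity to a high-exposure prototype, in (0, 1]."""
    return math.exp(-((cei - prototype) / width) ** 2)


def amplify(damage, sim, alpha):
    """Assumed multiplicative amplification, capped at the upper output bound."""
    return min(1.0, damage * (1.0 + alpha * sim))


def exposure_class(cei):
    """Illustrative equal-width classes; the thresholds are assumptions."""
    return "Low" if cei < 1 / 3 else "Medium" if cei < 2 / 3 else "High"


alpha = 0.2  # hypothetical coefficient value
ceis = [i / 100 for i in range(101)]
base = [baseline(c) for c in ceis]
amp = [amplify(d, similarity(c), alpha) for c, d in zip(ceis, base)]

relative_increase = {}
for name in ("Low", "Medium", "High"):
    idx = [i for i, c in enumerate(ceis) if exposure_class(c) == name]
    mean_base = sum(base[i] for i in idx) / len(idx)
    mean_amp = sum(amp[i] for i in idx) / len(idx)
    relative_increase[name] = (mean_amp - mean_base) / mean_base
    print(f"{name:<6} base={mean_base:.3f} amplified={mean_amp:.3f} "
          f"increase={100 * relative_increase[name]:.1f}%")

# Ordering check: amplified values are non-decreasing along ascending baseline values.
order_preserved = all(b >= a for a, b in zip(amp, amp[1:]))
print("ordering preserved:", order_preserved)
```

Under these assumptions the ordering is preserved because both the baseline and the similarity are non-decreasing in exposure. A prototype placed elsewhere in the exposure domain would need a separate check.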
3.4. Sensitivity to the Amplification Coefficient
A sensitivity analysis was performed to examine the numerical behavior and stability of the proposed model with respect to the amplification coefficient. Figure 5 illustrates the relationship between this coefficient and the mean similarity-amplified damage.
The results show an approximately linear and smooth dependence within the prescribed interval, indicating that the amplification term introduces a controlled and bounded modulation of damage estimates rather than abrupt or unstable changes. This behavior is consistent with the analytical formulation of the similarity-based amplification mechanism.
Table 3 summarizes the aggregated mean values of amplified damage corresponding to different values of the amplification coefficient. The variation across the examined interval remained moderate and well within the predefined output bounds. No numerical divergence, instability, or reordering of damage estimates was observed. These findings are consistent with the boundedness and stability properties formally demonstrated in Appendix C, and confirm that the amplification coefficient functions as a stable modulation parameter rather than a calibration or fitting variable.
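A sweep of this kind can be sketched in a few lines. As before, the baseline stand-in, the Gaussian similarity, the multiplicative amplification form, and the coefficient interval are hypothetical choices made for illustration; they are not the settings behind Figure 5 or Table 3.

```python
import math

ceis = [i / 100 for i in range(101)]
base = [0.8 * c ** 2 for c in ceis]                        # hypothetical baseline stand-in
sims = [math.exp(-((c - 1.0) / 0.3) ** 2) for c in ceis]   # assumed similarity to prototype 1.0


def mean_amplified(alpha):
    """Mean amplified damage under the assumed multiplicative form, capped at 1."""
    vals = [min(1.0, d * (1.0 + alpha * s)) for d, s in zip(base, sims)]
    return sum(vals) / len(vals)


alphas = [0.00, 0.05, 0.10, 0.15, 0.20]  # hypothetical interval
means = [mean_amplified(a) for a in alphas]
for a, m in zip(alphas, means):
    print(f"alpha={a:.2f}  mean amplified damage={m:.4f}")

# Increments between equally spaced coefficients; equal increments indicate a linear response.
steps = [b - a for a, b in zip(means, means[1:])]
print("increments:", [f"{s:.5f}" for s in steps])
```

In this sketch the cap never binds, so the increments are equal up to floating-point error. If the cap were active for some buildings, the response would flatten slightly and be only approximately linear.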
3.5. Summary of Findings
The numerical results indicate that:
(i) The synthetic exposure dataset exhibits substantial heterogeneity, enabling meaningful methodological evaluation under data-scarce conditions;
(ii) Baseline fuzzy damage estimates display smooth nonlinear behavior consistent with physically interpretable damage progression;
(iii) Similarity-based amplification selectively enhances differentiation among high-exposure configurations while preserving relative ordering and interpretability;
(iv) The model exhibits stable and well-controlled behavior with respect to the amplification coefficient, consistent with the analytical properties derived in Appendix C.
Overall, these findings demonstrate that the proposed modeling framework enhances internal discriminatory resolution in low-relief, structurally data-scarce floodplain environments while preserving transparency, interpretability, and mathematical consistency.
4. Discussion
The numerical findings presented in Section 3 indicate that the proposed similarity-based fuzzy framework exhibits stable, interpretable, and physically plausible behavior across a heterogeneous range of exposure configurations. The Composite Exposure Index spanned nearly the entire unit interval (Figure 2), confirming that the synthetic dataset captured a broad range of exposure conditions rather than artificially clustered or discretized patterns. This heterogeneity provides an appropriate basis for evaluating model behavior across different exposure regimes and supports the use of a low-relief floodplain environment as a representative methodological testbed under structurally data-scarce conditions.
Baseline fuzzy damage estimates displayed smooth and nonlinear escalation with increasing exposure (Figure 3). This behavior arises from the overlapping structure of the membership functions and the monotone design of the fuzzy rule base, and is consistent with the qualitative damage progression patterns reported in empirical and applied vulnerability and damage assessment studies at the building scale (e.g., [1, 8, 29]). In contrast to classical depth–damage functions, which often introduce abrupt transitions at fixed inundation thresholds [1, 9], the fuzzy inference structure preserves continuity and interpretability across the exposure domain. At the same time, the baseline Mamdani system alone exhibited limited differentiation among structurally similar high-exposure buildings, highlighting the need for additional expressive capacity in such regimes.
The similarity-based amplification mechanism directly addresses this limitation. As demonstrated in Figure 4, the mechanism selectively increased the damage estimates for buildings associated with high-exposure configurations, while leaving low-exposure buildings largely unaffected. This targeted behavior enhances contrast precisely in regimes where exposure-driven heterogeneity is most relevant for practical decision-support applications, such as civil protection prioritization, loss differentiation, and recovery planning. Importantly, similarity-based modulation preserves the monotonicity and relative ordering of the baseline fuzzy outputs, in accordance with the analytical properties established in Section 2 and formally justified in Appendix C.
Sensitivity analysis further indicates that the amplification coefficient functions as a controlled modulation parameter rather than a calibration parameter. The approximately linear response observed in Figure 5 and the bounded variability across the prescribed amplification interval are consistent with the analytical boundedness and stability guarantees derived for the model. No numerical divergence, output reordering, or inflationary artifacts were observed, indicating that similarity-based modulation constitutes a coherent structural extension of Mamdani inference rather than an arbitrary numerical adjustment.
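A short worked derivation shows why a response of this kind would be linear, bounded, and order-preserving. It assumes a simple multiplicative amplification form, which is a hypothetical stand-in and not necessarily the definition given in Section 2.7 or the argument of Appendix C. The symbols $D_i \in [0,1]$ (baseline damage of building $i$), $S_i \in [0,1]$ (similarity to a high-exposure prototype), $\alpha \ge 0$ (amplification coefficient), and $\tilde{D}_i$ (amplified damage) are introduced here for illustration only.

```latex
% Assumed amplification form (illustrative, not taken from Section 2.7)
\[
  \tilde{D}_i(\alpha) = D_i \bigl(1 + \alpha S_i\bigr),
  \qquad 0 \le D_i \le 1,\quad 0 \le S_i \le 1,\quad \alpha \ge 0 .
\]
% Boundedness: the amplified value never falls below the baseline
% and exceeds it by at most a factor (1 + alpha)
\[
  D_i \;\le\; \tilde{D}_i(\alpha) \;\le\; (1 + \alpha)\, D_i .
\]
% Ordering: if D_i <= D_j and S_i <= S_j, then
\[
  \tilde{D}_i(\alpha) = D_i (1 + \alpha S_i)
  \;\le\; D_j (1 + \alpha S_j) = \tilde{D}_j(\alpha) .
\]
% Linearity of the mean in alpha over N buildings
\[
  \frac{1}{N} \sum_{i=1}^{N} \tilde{D}_i(\alpha)
  = \bar{D} + \alpha \cdot \frac{1}{N} \sum_{i=1}^{N} D_i S_i ,
  \qquad \bar{D} = \frac{1}{N} \sum_{i=1}^{N} D_i .
\]
```

Under this assumed form the mean is exactly linear in $\alpha$. Capping $\tilde{D}_i$ at the upper output bound would make the dependence only approximately linear, which is one possible reading of the behavior described above.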
4.1. Positioning Within Broader Literature
Flood damage modeling frameworks differ substantially with respect to their underlying methodological paradigms, data requirements, and degree of interpretability. Neuro-fuzzy and hybrid machine-learning approaches, most commonly based on adaptive neuro-fuzzy inference systems (ANFISs) combined with evolutionary or metaheuristic optimization techniques [18, 19, 20, 21], build on the foundational architecture introduced by Jang [17] and enable flexible nonlinear mappings between hazard, exposure, and damage. However, such approaches typically require paired hazard–exposure–damage datasets for calibration and training and often provide limited transparency for operational or policy-oriented decision support.
More recently, deep learning and other machine-learning approaches have been proposed for large-scale flood susceptibility analysis, inundation mapping, and flood-related prediction tasks [5, 15]. While these models can achieve high predictive performance in data-rich environments, they rely on extensive training datasets and frequently operate as black-box predictors. This limits interpretability and constrains their applicability in structurally data-scarce contexts, particularly when empirical building-level loss inventories are unavailable.
A third methodological family comprises classical fuzzy inference frameworks. Mamdani-type fuzzy systems have been applied to flood hazard, vulnerability, and damage assessment [6, 7, 16], offering rule-based mappings that are transparent and expert-interpretable and are typically constructed without data-driven statistical calibration. These characteristics make them particularly attractive for data-scarce environments. However, conventional Mamdani-based damage models generally treat buildings sharing similar linguistic exposure states as indistinguishable and do not explicitly account for relational similarity among exposure configurations.
In parallel, empirical and indicator-based approaches remain widely used in flood risk analysis. Depth–damage models such as FLEMOcs [9] and related formulations [1, 8] estimate monetary losses through empirical calibration, while indicator-based frameworks aggregate spatial exposure descriptors, such as terrain elevation, hydraulic proximity, and structural characteristics, into composite indices [10, 11]. Systematic reviews indicate that these approaches are generally robust and transferable, but often rely on linear aggregation schemes and do not explicitly represent nonlinear modulation or similarity-driven differentiation among exposed assets [2].
Within this methodological landscape, the proposed framework occupies an intermediate position. It preserves the interpretability and absence of data-driven statistical calibration characteristic of Mamdani-type fuzzy inference systems [6, 7, 16], interfaces naturally with GIS-derived exposure indicators commonly used in indicator-based approaches [10, 11], and introduces a prototype-based scalar similarity mechanism conceptually inspired by fuzzy similarity concepts and non-metric similarity measures, rather than metric or clustering-based similarity learning [22]. This combination enables enhanced differentiation among high-exposure configurations without resorting to neural training, empirical calibration, or high-dimensional similarity computation, thereby maintaining transparency and suitability for structurally data-scarce floodplain environments.
4.2. Methodological Implications
The numerical results have several methodological implications for flood damage modeling under structurally data-scarce conditions.
Enhanced discrimination under data scarcity. The similarity-based amplification mechanism increases differentiation among high-exposure configurations—precisely in regimes where the baseline Mamdani inference exhibits limited discriminatory capacity—while preserving interpretability, monotonicity, and boundedness. This demonstrates that additional expressive capacity can be introduced structurally without compromising the analytical properties of the underlying fuzzy inference system.
Transparent extension of Mamdani inference. The similarity operator constitutes an intrinsic structural extension of the Mamdani framework rather than an external post-processing step. It operates directly on an aggregated exposure signature and requires neither additional fuzzy input variables nor neuro-fuzzy optimization, clustering procedures, or empirical calibration. As a result, the proposed extension preserves transparency, conceptual simplicity, and analytical tractability.
Compatibility with GIS-based indicator frameworks. By aggregating multi-dimensional exposure descriptors into a scalar Composite Exposure Index, the proposed framework is directly compatible with geospatial indicator-based flood risk assessment practices. This design facilitates straightforward integration with GIS-derived exposure datasets commonly available in applied flood risk and vulnerability studies.
Relevance for low-relief floodplains. The framework is explicitly tailored to low-relief floodplain environments, where micro-topography, hydraulic connectivity, and spatial exposure heterogeneity exert a dominant influence on damage formation and where empirical building-level loss data are rarely available. Addressing such settings responds directly to methodological gaps identified in recent flood vulnerability and damage assessment reviews (e.g., [2]).
4.3. Validation Under Structural Data Scarcity
Empirical validation through calibration is infeasible for Begeč and similar rural low-relief floodplains due to the absence of paired exposure–damage records. The purpose of the present validation is therefore not to demonstrate predictive accuracy against observed losses, but to assess internal coherence, physical plausibility, and context-consistent behavior of the model under unavoidable data limitations. Consequently, the proposed framework is positioned as a methodological contribution to transparent and physically grounded damage modeling in settings where empirical loss data are structurally unavailable, rather than as a calibrated predictive tool. In such settings, where data scarcity is structural rather than incidental, model evaluation must rely on scientifically accepted alternative validation strategies discussed in the flood-risk and vulnerability modeling literature, particularly in comparative and review studies (e.g., [2, 23, 24, 25, 26]). Accordingly, three complementary forms of validation were employed.
Face validity (behavioral plausibility). The baseline fuzzy damage progression exhibits physically plausible escalation patterns that are consistent with empirical observations of direct residential flood losses. In particular, higher inferred damage for buildings located closer to the river, at lower elevations, or within micro-topographic depressions aligns with documented mechanisms of fluvial damage formation reported in empirical and applied vulnerability and damage assessment studies (e.g., [1, 8, 29]).
Behavioral consistency. The similarity-based amplification mechanism enhances differentiation among high-exposure configurations while preserving ordering, boundedness, and numerical stability. These properties are demonstrated analytically in Appendix C and are further supported by the numerical results in Section 3. The resulting model behavior is consistent with empirical evidence from flood-risk and impact studies indicating that both variance and absolute flood losses increase disproportionately among highly exposed residential structures (e.g., [3, 4, 13]).
Contextual feasibility. The synthetic exposure distributions reflect hydro-morphological characteristics documented for low-relief floodplains in regional and review-based GIS assessments (e.g., [2]), supporting the physical plausibility of the generated exposure configurations and the resulting damage patterns.
Comparative modeling reviews emphasize that no single methodological paradigm dominates across all data regimes [23]. Transparent and data-efficient frameworks such as the proposed model remain appropriate for methodological analysis and decision-support exploration in contexts where calibrated depth–damage models [9], neuro-fuzzy systems [17, 18, 19, 20, 21], or data-intensive deep learning approaches for flood mapping, susceptibility analysis, or prediction tasks [5, 15] cannot be deployed reliably due to data limitations.
Taken together, this triangulated validation strategy, combining face validity, behavioral consistency, and contextual feasibility, provides a coherent and methodologically sound basis for assessing model behavior under conditions of structural data scarcity, without overstating predictive performance or claiming empirical calibration beyond what the available data can support.
4.4. Limitations and Future Work
The proposed framework is subject to several limitations that define its intended scope of application. First, the analysis was conducted using a synthetic dataset, and the numerical results should therefore be interpreted as illustrative of model behavior rather than as operational damage estimates. The framework is designed for methodological exploration and decision-support reasoning under structural data scarcity, rather than for calibrated loss prediction.
Second, the current formulation focuses exclusively on direct material building damage and does not explicitly incorporate building typology, construction materials, socio-economic vulnerability, indirect losses, or post-event recovery dynamics. These components were intentionally excluded to preserve interpretability and analytical transparency, and to avoid introducing additional parameters that would require empirical calibration unavailable in the study context.
Third, the similarity-based amplification mechanism relies on exposure-based prototypes defined within the normalized exposure domain and is not trained against observed damage datasets. While this design choice preserves transparency and mathematical tractability, the framework does not aim to reproduce site-specific empirical loss distributions.
These limitations define the intended scope of the framework rather than diminishing its methodological contribution, which lies in providing an interpretable and analytically sound tool for exposure-driven damage differentiation under structural data scarcity.
Future research may address these limitations by: (i) extending the exposure representation to include typology-aware and material-specific indicators where such data are available; (ii) applying the framework to empirical case studies supported by expert-elicited rule bases or limited calibration datasets; and (iii) exploring hybrid extensions that retain interpretability while incorporating data-driven components for enhanced site-specific performance. In addition, cross-site transferability and multi-hazard extensions represent promising directions for further investigation.
4.5. Concluding Remarks
Overall, the results demonstrate that similarity-based modulation enhances the expressive capacity of classical fuzzy loss models while preserving analytical transparency, numerical stability, and interpretability under structurally data-scarce conditions. By integrating exposure aggregation, Mamdani-type fuzzy inference, and prototype-based similarity modulation within a unified modeling structure, the proposed framework provides a methodologically distinct and practically relevant approach for the assessment of building-level flood damage.
The framework is particularly suited to low-relief floodplain environments, where exposure heterogeneity plays a dominant role in damage formation and where empirical building-level loss data are rarely available. Within such contexts, the proposed approach offers a transparent and data-efficient alternative to calibration-dependent empirical, probabilistic, and machine-learning models while maintaining mathematically well-behaved and interpretable output characteristics.
5. Conclusions
This study introduced a similarity-based fuzzy framework for assessing direct material flood damage under structurally data-scarce conditions. The proposed approach integrates a Composite Exposure Index, a Mamdani-type fuzzy inference system, and a similarity-based amplification mechanism into a unified, transparent, and interpretable modeling framework.
The numerical results demonstrate that the fuzzy inference component captures the nonlinear escalation of building-level damage with smooth transitions between damage states while preserving monotonic behavior with respect to both exposure and flood depth. The similarity-based modulation selectively increases the damage estimates for highly exposed configurations and enhances discriminatory resolution in high-exposure regimes, without inducing numerical instability. By embedding similarity information directly into a scalar exposure representation, the framework provides an interpretable alternative to deterministic depth–damage functions and calibration-dependent probabilistic or machine-learning models, particularly in low-relief floodplains where exposure heterogeneity plays a dominant role in loss formation.
Several limitations should be acknowledged. The numerical results are illustrative rather than predictive due to the absence of paired exposure–damage records, and the present formulation focuses exclusively on direct material losses. In addition, the weighting structure of the Composite Exposure Index was informed by the literature rather than empirically calibrated. As noted in Section 4.4, these limitations define the intended scope of the framework rather than diminishing its methodological contribution.
Future research will focus on extending the similarity construct to multidimensional or spatially explicit forms, incorporating indirect losses and recovery processes, and exploring integration with empirical or proxy-based validation sources when available. Within these directions, the proposed framework offers a transparent and data-efficient tool for flood damage assessment and risk-informed infrastructure planning in decision-support contexts where empirical calibration is infeasible.