When Hazard Maps Are Not Predictions: A Critical Assessment of MCDA in Glacier Hazard Susceptibility

Gacitua, Ricardo; Pereira, Javier; Astudillo, Hernán; Taramasco, Carla; Contreras, Pedro

doi:10.3390/ijgi15060245

Open Access Review

by Ricardo Gacitua ¹,*, Javier Pereira ², Hernán Astudillo ², Carla Taramasco ² and Pedro Contreras ³

¹ Computer Science and Informatics Department, Universidad de La Frontera, Temuco 4811230, Chile
² Instituto de Tecnología para la Innovación en Salud y Bienestar (ITiSB), Universidad Andrés Bello, Santiago 2520000, Chile
³ Computer Science Department, Loughborough University, Epinal Way, Loughborough, Leicestershire LE11 3TU, UK
* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2026, 15(6), 245; https://doi.org/10.3390/ijgi15060245

Submission received: 6 April 2026 / Revised: 19 May 2026 / Accepted: 24 May 2026 / Published: 1 June 2026

(This article belongs to the Topic Natural Hazards Monitoring, Risk Assessment, Modelling and Management in the Artificial Intelligence Era)

Abstract

Background: Multi-criteria decision analysis (MCDA) has become a dominant approach for glacier hazard susceptibility mapping, widely used to support risk management and climate adaptation planning. However, despite its widespread adoption, the role of MCDA outputs remains conceptually ambiguous: hazard classifications are often interpreted as predictive representations of risk, even though they are derived from preference-dependent decision models. This raises a critical but underexamined question regarding the reliability of MCDA-based glacier hazard assessments. This issue becomes particularly relevant in the current transition toward data-driven and artificial intelligence (AI)-based approaches for hazard modelling, where similar challenges of interpretability, validation, and reliability arise. Methods: To address this issue, we conducted a systematic literature review following the PRISMA 2020 protocol, analysing peer-reviewed studies published between 2015 and 2025. After screening 571 records, 60 studies were included. Data were extracted using a structured framework and synthesised through quantitative descriptive analysis and qualitative assessment of modelling practices, including method selection, criteria weighting, uncertainty treatment, validation, and geographical distribution. This study conducts a structured methodological audit—not a catalogue—of multi-criteria decision analysis (MCDA) applications in glacier hazard susceptibility mapping. Results: The analysis reveals a consistent methodological pattern. The Analytic Hierarchy Process (AHP) dominates current practice (36/60 studies, 60%), typically implemented through GIS-based weighted overlay with expert-derived weights. Critically, 80% of studies (48/60) derive criteria weights exclusively from expert judgement, with no data-driven calibration or sensitivity testing of subjective inputs. 
This epistemic reliance on unstructured or semi-structured expert elicitation, presented without robustness analysis, forms a central concern of this review. Moreover, empirical validation is limited: only 21/60 studies (35.0%) report quantitative performance metrics. Uncertainty and robustness analyses are rarely conducted, and most studies rely on single-model configurations without comparative evaluation. Despite these limitations, the resulting hazard maps are frequently presented as objective spatial predictions. The evidence base is also geographically concentrated, with 48/60 studies (80.0%) located in High Mountain Asia. Conclusions: The findings indicate a systematic mismatch between how MCDA-based hazard maps are constructed and how they are interpreted. In most cases, MCDA functions as a decision-structuring framework rather than a validated predictive model, yet its outputs are commonly treated as predictive evidence. This gap has important implications for the use of such models in risk management and climate adaptation, particularly in the emerging context of AI-driven hazard modelling, where issues of model validation, interpretability, and reliability become even more critical. Advancing the field requires explicit validation against observed events, systematic robustness and sensitivity analysis, transparent uncertainty modelling, and comparative evaluation of alternative or hybrid decision frameworks.

Keywords:

multi-criteria decision analysis; glacier hazard susceptibility; glacial lake outburst flood; GIS-based hazard mapping; systematic literature review

1. Introduction

Accelerated glacier retreat driven by climate change is no longer a distant concern [1]; it is already reshaping hazard dynamics in high-mountain regions worldwide. Glacial lake outburst floods (GLOFs), ice and snow avalanches, debris flows, and slope instabilities are becoming more frequent and, in many cases, more severe, with direct consequences for downstream communities, infrastructure, and fragile ecosystems [2,3,4]. Recent events have shown that even relatively small glacierised basins can trigger cascading processes—rapid water release, sediment mobilisation, and geomorphological instability—that result in disproportionately large impacts [5].

The socio-economic consequences of these hazards are substantial. A single GLOF can destroy hydropower plants (e.g., the 2021 Sikkim event that damaged the Chungthang dam, disrupting over 1200 MW of generation capacity [6]), sever strategic transport corridors (e.g., sections of the Karakoram Highway repeatedly washed out by debris flows [7]), and displace thousands of people within hours. False negatives (classifying a hazardous lake as safe) can therefore lead to catastrophic loss of life and infrastructure. In contrast, false positives (classifying a safe lake as hazardous) waste scarce mitigation resources and erode community trust in risk assessments. For this reason, defensible and reliable hazard mapping is not merely an academic exercise: it is a prerequisite for evidence-based risk management, early warning system design, and the prioritisation of mitigation investments. Decisions based on unvalidated or overconfident maps carry asymmetric and potentially devastating consequences.

In this context, hazard assessment plays a central role. It aims not only to estimate the likelihood and magnitude of hazardous events but also to inform decisions about preparedness, mitigation, and resource allocation [8]. In practice, susceptibility mapping has become a widely adopted approach. Rather than predicting specific events, it identifies areas where hazardous processes are more likely to occur based on terrain characteristics, glaciological conditions, and environmental triggers [9].

However, glacier hazards are not governed by a single process. They emerge from the interaction of topography, hydrology, climate, and often poorly observed environmental conditions. As a result, hazard assessment inevitably involves combining heterogeneous sources of information—data with different resolutions, uncertainties, and degrees of completeness. This challenge has naturally led to the adoption of decision-support approaches capable of integrating diverse inputs into coherent and interpretable outputs [1].

From this perspective, glacier hazard assessment is not purely a physical modelling problem [10]; it is also a decision-making problem. Multiple factors—such as slope, elevation, glacier proximity, lake characteristics, lithology, land cover, and precipitation—must be considered simultaneously [11,12]. These factors are often uncertain, spatially heterogeneous, and only partially observable. Moreover, many glacierised regions lack dense monitoring networks, which further limits the availability of reliable empirical data [8].

At the same time, hazard assessments are not conducted in isolation. They are used by stakeholders—local authorities, emergency planners, and affected communities—who require results that are not only technically sound but also interpretable and defensible. In practice, this means that hazard assessment methods must balance analytical rigour with transparency and usability, often under conditions of significant uncertainty [10].

Multi-criteria decision analysis (MCDA), which denotes a set of approaches that integrate several, often competing, criteria into a single evaluative framework to support decision-making in complex settings [13], has emerged as a natural response to these challenges. It provides a structured framework for combining quantitative data (e.g., terrain metrics, hydrological indicators) with qualitative inputs (e.g., expert judgement) to produce composite indices used to rank or classify alternatives [14,15]. In the context of natural hazards, MCDA is most often implemented within geographic information systems (GISs), where spatial layers are aggregated through weighted combinations [16,17].

Among the available techniques, the analytic hierarchy process (AHP) has become particularly dominant, largely because it offers a systematic procedure for pairwise comparison and weight derivation. Potential drivers of this dominance include the widespread availability of AHP in commercial GIS software (e.g., ArcGIS Pro 3.7 Weighted Overlay), its low mathematical barrier to entry for geoscientists, and the institutional legitimacy that it confers through structured expert elicitation. Its integration within GIS workflows has made it especially attractive for practitioners working with limited or heterogeneous data.

The Analytic Hierarchy Process (AHP) represents problems in a hierarchical structure and derives criteria weights from expert-based pairwise comparisons. ROC (receiver operating characteristic) analysis and its associated summary measure, AUC (area under the curve), are standard tools for assessing the predictive accuracy of spatial models by comparing susceptibility rankings with observed hazard occurrences.
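As a minimal sketch of the weight derivation just described, the following Python fragment approximates AHP weights as the principal eigenvector of a pairwise comparison matrix and computes Saaty's consistency ratio. The three criteria and the judgement values are hypothetical and chosen only for illustration; they are not taken from any reviewed study. The power-iteration approach is one common way to obtain the eigenvector, not necessarily the procedure used in the reviewed literature.

```python
# Saaty's random consistency index (RI) for comparison matrices of order 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration.

    The result is normalised so that the weights sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n not in RANDOM_INDEX:
        raise ValueError("RI table in this sketch covers orders 1 to 9 only")
    if n < 3:
        return 0.0  # matrices of order 1 or 2 are always consistent
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgements for three illustrative criteria:
# slope vs lake area = 3, slope vs glacier proximity = 5,
# lake area vs glacier proximity = 2 (reciprocals below the diagonal).
CRITERIA = ["slope", "lake_area", "glacier_proximity"]
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

example_weights = ahp_weights(PAIRWISE)
example_cr = consistency_ratio(PAIRWISE, example_weights)
```

A consistency ratio below 0.10 is the threshold conventionally treated as acceptable in AHP. It indicates that the judgements are mutually coherent; it says nothing about whether the resulting weights match observed hazard occurrence.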

There are good reasons for the widespread adoption of MCDA. It is flexible, practically applicable, and capable of producing hazard maps that are relatively easy to interpret and communicate. In environments where data are scarce and uncertainty is unavoidable, these features are not trivial—they are often decisive. Yet, these same strengths raise important questions. Because MCDA relies on weighting schemes and aggregation rules, its outputs are inherently shaped by modelling choices and expert preferences. Hazard classifications, therefore, may reflect not only environmental conditions but also the assumptions embedded in the decision model itself.

Despite the extensive use of MCDA in glacier hazard assessment, a fundamental issue remains insufficiently examined: what kind of knowledge these models actually produce. Many studies focus on generating hazard maps, often treating them—implicitly or explicitly—as predictive representations of risk. However, the decision models underlying these maps are rarely scrutinised as analytical constructs.

A more detailed examination of the literature shows a clear trend: most studies depend on just one technique—typically AHP—and offer little comparison with alternative methods. Criteria weights are typically derived from expert judgement, yet their influence on results is seldom explored through systematic sensitivity or robustness analysis. Uncertainty is acknowledged but rarely modelled explicitly. Perhaps most importantly, validation against observed hazard events remains the exception rather than the norm.

This creates a tension that is rarely made explicit. On the one hand, MCDA-based hazard maps are often presented as objective, spatially explicit outputs. On the other hand, they are produced through preference-dependent models whose assumptions are only partially examined. The question is not whether MCDA is useful—it clearly is—but whether its outputs are being interpreted in ways that exceed what the underlying models can support.

Although numerous case studies exist, there has been little effort to synthesise these practices from a methodological perspective. In particular, the reliability implications of current modelling choices (how robust, reproducible, or empirically valid these hazard classifications are) remain largely unexplored. Addressing this gap requires moving beyond cataloguing applications and towards a critical examination of how MCDA is actually used in practice. This issue becomes particularly relevant in the current transition toward AI-driven hazard modelling approaches, where similar questions regarding interpretability, validation, and reliability arise.

Additionally, it is plausible, though not yet systematically documented, that the adoption of MCDA for glacier hazard assessment varies substantially across world regions. In the European Alps, long-established monitoring networks, historical event inventories, and process-based hydrological models provide alternative pathways for hazard assessment that do not rely primarily on expert-driven decision frameworks. In contrast, in High Mountain Asia—where monitoring infrastructure is sparse, the terrain is extreme, and the policy pressure is acute—MCDA may have been adopted more easily as a pragmatic replacement for data-intensive modelling. Similarly, the Andes and North American cordillera occupy intermediate positions with different research traditions and data availabilities. These regional differences in methodological culture, if they exist, have not been systematically synthesised.

To address this issue, this study conducts a systematic review of the literature following the PRISMA 2020 protocol [18], which is an evidence-based guideline for reporting systematic reviews, designed to promote transparency in the identification, selection, data extraction, and synthesis of studies. The objective is not only to describe how MCDA has been applied but also to evaluate the methodological reliability of these applications and to clarify the role that MCDA-based outputs play in glacier hazard assessment.

Consequently, the study is guided by the following central research question:

To what extent do current MCDA applications provide reliable and defensible decision-support for glacier hazard assessment?

To operationalise this question, four analytical research questions are examined:

RQ1. Why has the Analytic Hierarchy Process (AHP) become the predominant MCDA method in glacier hazard assessment studies?
RQ2. To what extent do criteria weighting practices reflect methodological justification rather than practical convenience?
RQ3. Why are uncertainty and robustness analyses rarely incorporated in MCDA-based glacier hazard assessments?
RQ4. Which methodological improvements (e.g., sensitivity analysis, comparative modelling, hybrid or ensemble MCDA) could enhance the reliability of glacier hazard assessments?

This review evaluates a single directional hypothesis: that the widespread adoption of AHP in glacier hazard assessment is driven predominantly by operational convenience (software availability, ease of implementation, institutional familiarity) rather than by demonstrated predictive accuracy or problem-specific fit. If this hypothesis holds, we expect to observe: (i) uniform application of AHP across GLOF, landslide, and multi-hazard contexts; (ii) rare comparative evaluation against alternative MCDA methods; (iii) low and temporally stagnant rates of quantitative validation; and (iv) minimal reporting of sensitivity or robustness analysis. Each of these expectations is examined through the research questions formulated above.

This study offers three key advances to the discipline of glacier hazard assessment and decision-making analysis. First, it provides a structured methodological audit of MCDA applications in glacier hazard studies. Rather than cataloguing applications, the review examines how decision models are constructed, how criteria weighting is justified, how uncertainty is treated, and how results are validated. In doing so, it shifts attention from the production of hazard maps to the reliability of the modelling process itself.

Second, the study identifies a systematic pattern in current practice: a strong reliance on compensatory aggregation methods—particularly AHP—combined with expert-derived weighting and limited robustness evaluation. The findings suggest that many hazard classifications are highly sensitive to modelling assumptions, while being interpreted as objective spatial representations. This reveals a gap between how models are constructed and how their outputs are used in practice.

Third, the paper outlines a research agenda aimed at improving the credibility of MCDA-based hazard assessments. This includes the need for explicit validation against observed events, systematic sensitivity and robustness analysis, transparent uncertainty modelling, and comparative evaluation of alternative or hybrid decision frameworks. These directions are intended to support a transition from descriptive susceptibility mapping to more defensible and evidence-based decision-support tools.

This review is designed as a structured methodological audit, not a catalogue of applications. We seek to evaluate how decision models are constructed, what assumptions they embed, and whether the results are empirically defensible.

The remainder of the paper is organised as follows. Section 2 reviews existing research on MCDA and hazard assessment. Section 3 describes the systematic review protocol and the data extraction process. Section 5 presents the results of the synthesis. Section 6 discusses the implications of the findings. Section 7 outlines the study limitations, followed by future research directions in Section 8. Finally, Section 9 concludes the paper.

2. Related Work and Research Gap

2.1. Glacier Hazard Assessment as a Decision Problem

Mountain glacier environments are undergoing rapid transformation due to climate change, increasing the frequency and impact of hazards such as glacial lake outburst floods (GLOFs), debris flows, landslides, and snow or ice avalanches [2,3,8]. These hazards threaten settlements, transport infrastructure, hydropower installations, and downstream ecosystems in high-mountain regions worldwide.

Assessing glacier hazard susceptibility is inherently complex. Hazard formation depends on interacting geomorphological, hydrological, climatic, and anthropogenic factors that are spatially heterogeneous and often uncertain. Consequently, glacier hazard assessment rarely corresponds to a purely physical prediction problem. Instead, it requires prioritising locations, classifying hazard levels, and supporting mitigation decisions under incomplete information. For this reason, glacier hazard assessment is fundamentally a decision-analysis task: the goal is not only to model natural processes but also to support planning and risk management by integrating heterogeneous environmental indicators and expert knowledge.

Although this review focuses specifically on glacier-related hazards (GLOFs, ice avalanches, debris flows from glacierised basins), we acknowledge that adjacent periglacial environments—including mountain permafrost, rock glaciers, and ice-cored moraines—also generate hazards relevant to the same high-mountain communities and infrastructure [19]. Rock glaciers, for example, can destabilise under warming conditions, triggering landslides or debris flows, and their meltwater can contribute to lake formation and outburst floods [20]. However, the decision-analytic literature on periglacial hazards remains sparser than on glacier hazards, and studies combining MCDA with periglacial processes were not captured by our search terms (e.g., “glacier*” was a required term). A systematic extension of this review to periglacial environments would be a valuable complementary study. For the purposes of this review, we retain the focus on glacier hazards while recognising that the methodological reliability concerns identified (validation, uncertainty, robustness) apply equally to periglacial hazard assessments.

2.2. MCDA in Natural Hazard and Cryospheric Studies

Multi-criteria decision analysis (MCDA) has been widely adopted to support complex environmental decisions involving multiple conflicting criteria [21,22]. MCDA methods combine quantitative data and qualitative judgement through explicit trade-offs among factors, making them suitable for spatial risk assessment and environmental planning.

MCDA has been applied in several hazard domains, including flood risk management, landslide susceptibility, and water resource allocation [14,15,17]. In glacier hazard assessment, MCDA is typically implemented within geographic information systems (GISs), where spatial predictor variables—such as slope, elevation, glacier proximity, precipitation, and lake characteristics—are aggregated into hazard susceptibility maps [16,23].

Compensatory additive methods dominate practice. The Analytic Hierarchy Process (AHP), weighted overlay, and conventional weighting schemes are particularly common because they integrate naturally with raster-based GIS workflows and allow expert-based pairwise comparisons [17,24,25]. As a result, MCDA is widely used to identify dangerous glacial lakes, prioritise monitoring actions, and classify hazard zones.

To put the critique of MCDA into perspective, it is helpful to compare its characteristics with those of other modelling approaches that are commonly applied in hazard assessment. Physically based models (e.g., hydrodynamic flood models, slope stability equations) encode mechanistic process understanding but require extensive calibration data and computational resources, making them difficult to apply at regional scales. Statistical susceptibility models (e.g., logistic regression, weights of evidence) estimate empirical relationships between hazard occurrence and environmental predictors, offering data-driven weight calibration and inherent validation metrics (e.g., AUC and pseudo-R²), but they require large inventories of observed events and assume stationarity. Machine learning (ML) classifiers (e.g., random forests, support vector machines, neural networks) can capture complex non-linear relationships and interactions, often achieving high predictive accuracy, but they are notoriously opaque, lacking the transparency and traceability of MCDA, and require even larger training datasets. MCDA, in contrast, excels in data-sparse environments, accommodates qualitative expert knowledge, and offers full interpretability. However, as this review demonstrates, MCDA rarely tests its predictive claims, whereas statistical and ML models routinely report cross-validated performance metrics. This asymmetry in validation culture, rather than the inherent superiority of any method, is a central concern. The complementarity is clear: MCDA structures decisions when data are scarce; ML predicts when data are abundant. Hybrid workflows that combine both are a promising direction.

2.3. Methodological Characteristics of Current Practice

Most MCDA-based glacier hazard studies follow a similar workflow. Environmental variables are converted into spatial layers, criteria weights are assigned (usually by expert judgement), and a composite index is produced through additive aggregation. The index is then classified into hazard categories. Although operationally effective, this procedure embeds strong modelling assumptions. Additive aggregation presumes independence among criteria and linear compensation between favourable and unfavourable factors [22]. Furthermore, the weighting of the criteria is often subjective and heavily depends on expert interpretation [26].
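The workflow described above can be sketched in a few lines of Python. This is an illustrative sketch only: the criterion names, weights, cell values, and the class thresholds of 0.33 and 0.66 are hypothetical assumptions, and min-max rescaling is just one of several normalisation choices found in practice.

```python
def min_max_normalise(values):
    """Rescale raw criterion values to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # a constant layer carries no contrast
    return [(value - lowest) / (highest - lowest) for value in values]


def weighted_linear_combination(layers, weights):
    """Additive aggregation: index = sum of weight * normalised score.

    `layers` maps criterion name to a list of normalised scores (one per
    cell); `weights` maps the same names to weights that sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    cell_count = len(next(iter(layers.values())))
    return [sum(weights[name] * layers[name][cell] for name in layers)
            for cell in range(cell_count)]


def classify(index, thresholds=(0.33, 0.66)):
    """Cut the continuous index into three hazard classes."""
    low_cut, high_cut = thresholds
    return ["low" if score < low_cut
            else "moderate" if score < high_cut
            else "high"
            for score in index]


# Hypothetical raw values for three raster cells.
raw_layers = {
    "slope_deg": [10.0, 30.0, 50.0],
    "lake_area_km2": [0.1, 0.5, 0.9],
}
weights = {"slope_deg": 0.6, "lake_area_km2": 0.4}  # assumed expert weights

layers = {name: min_max_normalise(values) for name, values in raw_layers.items()}
susceptibility = weighted_linear_combination(layers, weights)
hazard_classes = classify(susceptibility)
```

Every step here involves a modelling choice (normalisation, weights, aggregation rule, thresholds), which is why the resulting classes depend on the analyst as well as on the terrain.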

The linear compensation assumption embedded in additive aggregation is not simply a technical detail; it carries substantive consequences for the classification of hazards. Full compensation means that a very low score on one criterion can be offset by a very high score on another. For GLOF susceptibility, this is physically problematic. Consider a glacial lake with an extremely large volume and rapid expansion (strong hazard signals) but located behind a stable, wide moraine with no evidence of past breaching. A compensatory model could classify this lake as high-hazard because the volume compensates for the stability of the moraine. Yet, a geomorphologist would recognise that the moraine condition is a non-compensatory criterion: if the dam is stable, the lake is safe, regardless of its size. In contrast, a different lake with moderate volume but active ice avalanches into the lake and a steep, overdeepened moraine front might be genuinely dangerous, but a compensatory model with low weight on triggering factors could miss it. Non-compensatory methods—such as ELECTRE, PROMETHEE, or other outranking approaches—are designed to handle such logical structures [21,26].
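The stable-moraine example can be made concrete with a small Python sketch. The scores, weights, and the veto threshold below are hypothetical, and the veto rule shown is a deliberately simple stand-in for non-compensatory logic; it is not an implementation of ELECTRE or PROMETHEE.

```python
def compensatory_score(scores, weights):
    """Weighted sum: a low score on one criterion can be offset by others."""
    return sum(weights[name] * scores[name] for name in weights)


def score_with_veto(scores, weights, veto_criterion, veto_threshold):
    """Return 0 when the veto criterion falls below its threshold.

    Otherwise fall back to the ordinary weighted sum. This encodes the
    rule "if the dam is stable, size alone cannot make the lake hazardous".
    """
    if scores[veto_criterion] < veto_threshold:
        return 0.0
    return compensatory_score(scores, weights)


# Hypothetical lake: very large and expanding fast, but with a stable dam.
# All scores are scaled so that higher means more hazardous.
lake = {"volume": 0.95, "expansion_rate": 0.90, "dam_instability": 0.05}
weights = {"volume": 0.40, "expansion_rate": 0.35, "dam_instability": 0.25}

additive = compensatory_score(lake, weights)
with_veto = score_with_veto(lake, weights,
                            veto_criterion="dam_instability",
                            veto_threshold=0.20)
```

Under these assumed values the additive score is roughly 0.71, which a typical three-class scheme would label high hazard, whereas the veto rule returns 0. The two models disagree on the same lake purely because of the aggregation logic.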

Validation practices are heterogeneous. Some studies compare results with historical hazard inventories, while others rely only on qualitative interpretation or visual agreement with known hazardous areas. Uncertainty is frequently acknowledged but rarely formally analysed; systematic robustness analysis or uncertainty propagation is uncommon [27,28].

Thus, the literature shows consistent operational practices but limited methodological scrutiny.

2.4. Conceptualising Reliability in MCDA for Hazard Assessment

The preceding sections have documented the widespread adoption of multi-criteria decision analysis (MCDA) in glacier hazard assessment. However, the question of what constitutes a reliable MCDA-based hazard classification remains under-theorised in the literature. The term “reliability” is frequently invoked in general terms to imply trustworthiness or credibility, yet it encompasses distinct analytical properties that require separate consideration. This subsection develops a conceptual framework that distinguishes four dimensions of reliability relevant to MCDA applications in hazard assessment: reproducibility, robustness, predictive validity, and procedural reliability. These distinctions provide the analytical vocabulary for the methodological audit presented in this review.

Before proceeding, it is necessary to clarify what is meant by “prediction” in the context of this review, as the term carries different meanings across hazard modelling traditions. Following the distinction established in the geospatial prediction literature [29,30], we differentiate between two senses of prediction. Strong prediction refers to forecasting the specific timing, location, and magnitude of individual hazard events—for example, predicting that a particular glacial lake will breach on a specific date. This form of prediction is rarely attempted in susceptibility mapping and is not what MCDA models claim to provide. Weak prediction (or susceptibility prediction) refers to estimating the relative likelihood of hazard occurrence across a spatial domain, typically expressed as a ranking or classification (e.g., “high susceptibility zones are more likely to experience GLOFs than low susceptibility zones”). This weaker sense of prediction is testable through spatial cross-validation or historical back-testing. When this review critiques MCDA outputs as being interpreted as “predictive representations of risk,” it refers to weak prediction—the claim that high-hazard zones systematically correspond to locations where events are more likely. The distinction is important because a model may be useful for weak prediction (prioritising areas for field investigation) while being entirely inadequate for strong prediction (early warning). The four reliability dimensions introduced below operationalise weak prediction through predictive validity (Section 2.4.3).

2.4.1. Reproducibility

Reproducibility refers to the ability of independent researchers to obtain identical results when applying the same method to the same input data [31,32]. In the context of MCDA-based hazard mapping, reproducibility requires complete disclosure of:

The environmental criteria selected and their operational definitions;
The weighting procedure, including pairwise comparison matrices where applicable;
The aggregation rule (e.g., weighted linear combination, multiplicative aggregation);
Classification thresholds used to transform continuous susceptibility indices into hazard categories.

Without such transparency, hazard maps cannot be independently verified, and their status as scientific evidence remains ambiguous. Reproducibility is, therefore, a minimal condition for reliability: an unreproducible result cannot be considered analytically credible, regardless of its apparent plausibility.
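One way to make the four disclosure items above concrete is to store them in a single machine-readable record published alongside the map. The Python sketch below is an assumption about how such a record might look; the class name `McdaRunRecord`, its field names, and all values are hypothetical and do not reflect any established reporting standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class McdaRunRecord:
    """Hypothetical disclosure record for one MCDA hazard-mapping run."""
    criteria: dict          # criterion name -> operational definition
    pairwise_matrix: list   # AHP comparison matrix, rows ordered as `criteria`
    weights: dict           # criterion name -> final weight
    aggregation_rule: str   # e.g. "weighted linear combination"
    class_thresholds: dict  # class name -> lower bound on the index

    def to_json(self):
        """Serialise the record deterministically for archiving."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only.
record = McdaRunRecord(
    criteria={
        "slope": "mean slope in degrees within a hypothetical buffer",
        "lake_area": "lake surface area in square kilometres",
    },
    pairwise_matrix=[[1.0, 3.0], [1.0 / 3.0, 1.0]],
    weights={"slope": 0.75, "lake_area": 0.25},
    aggregation_rule="weighted linear combination",
    class_thresholds={"low": 0.0, "moderate": 0.33, "high": 0.66},
)

archived = record.to_json()
restored = McdaRunRecord(**json.loads(archived))
```

Because the record round-trips through JSON without loss, an independent researcher could in principle rebuild the same index from the same input layers.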

2.4.2. Robustness

Robustness concerns the stability of hazard classifications under reasonable variations in modelling assumptions [33,34]. Because MCDA models necessarily embed subjective judgements—particularly in criteria weighting—robustness analysis examines whether conclusions change when these judgements are varied within defensible ranges. In glacier hazard assessment, robustness can be evaluated through:

Sensitivity analysis: systematic variation of criteria weights to identify which parameters drive classification outcomes [27];
Scenario analysis: testing alternative assumptions about hazard conditioning factors or future environmental conditions;
Multi-model comparison: applying different MCDA methods to identical datasets to assess whether classifications are method-dependent [28].

Before proceeding, a terminological clarification is necessary, as the literature often uses sensitivity and robustness interchangeably. In this review, we adopt the following operational distinction. Sensitivity analysis refers to a family of techniques that systematically vary model inputs—typically criteria weights—to measure the corresponding change in outputs (e.g., hazard classification or susceptibility rank). Robustness is a property of the model or its conclusions: a result is robust if it remains stable across a defensible range of input assumptions or modelling choices. Thus, sensitivity analysis is a method for evaluating robustness; robustness is the quality that sensitivity analysis tests. A study that omits sensitivity analysis cannot claim that its hazard classifications are robust, regardless of the apparent precision of the output map. A robust hazard classification is one that persists across a reasonable ensemble of modelling choices; a fragile classification that changes dramatically under small perturbations provides weak support for risk management decisions.
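A one-at-a-time weight perturbation is among the simplest forms of sensitivity analysis consistent with this distinction. The Python sketch below shifts one weight, rescales the others so the total remains 1, and reports the share of mapping units whose hazard class changes. The units, weights, perturbation size, and class thresholds are hypothetical, and other sensitivity designs (for example, Monte Carlo sampling of weights) would be equally valid.

```python
def perturb_weights(weights, criterion, delta):
    """Shift one weight by `delta` and rescale the rest proportionally."""
    new_value = min(max(weights[criterion] + delta, 0.0), 1.0)
    remaining = 1.0 - new_value
    others_total = sum(w for name, w in weights.items() if name != criterion)
    perturbed = {}
    for name, weight in weights.items():
        if name == criterion:
            perturbed[name] = new_value
        elif others_total > 0:
            perturbed[name] = remaining * weight / others_total
        else:
            perturbed[name] = remaining / (len(weights) - 1)
    return perturbed


def hazard_class(score, thresholds=(0.33, 0.66)):
    """Map a susceptibility score to one of three classes."""
    low_cut, high_cut = thresholds
    if score < low_cut:
        return "low"
    return "moderate" if score < high_cut else "high"


def class_change_rate(units, weights, criterion, delta):
    """Fraction of units whose class changes under the perturbation."""
    perturbed = perturb_weights(weights, criterion, delta)
    changed = 0
    for scores in units:
        base = hazard_class(sum(weights[c] * scores[c] for c in weights))
        varied = hazard_class(sum(perturbed[c] * scores[c] for c in weights))
        changed += base != varied
    return changed / len(units)


# Hypothetical normalised scores for two mapping units.
units = [
    {"slope": 1.0, "lake_area": 0.3},
    {"slope": 0.1, "lake_area": 0.1},
]
weights = {"slope": 0.5, "lake_area": 0.5}

rate = class_change_rate(units, weights, criterion="slope", delta=0.1)
```

In this toy case, raising the slope weight from 0.5 to 0.6 moves the first unit from moderate to high while the second stays low, so half of the classifications change. A result like this would indicate a fragile classification in the sense defined above.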

2.4.3. Predictive Validity

In formal decision analysis, multi-criteria decision analysis (MCDA) methods can serve three distinct purposes. Descriptive applications aim to characterise what the world looks like—for example, by estimating the relative susceptibility of different glacial lakes to outburst flooding. Prescriptive applications recommend what should be done given stated preferences—for instance, ranking lakes to prioritise mitigation investments. Normative applications evaluate outcomes against logical axioms or procedural standards. The limited validation critique in this review applies specifically to descriptive claims: the assertion that MCDA-derived hazard maps represent empirically verifiable spatial patterns. However, many reviewed studies appear to use MCDA prescriptively (e.g., generating risk rankings to guide mitigation) without validating the descriptive basis—that is, without testing whether high-ranked zones actually correspond to observed events. Recognising this distinction clarifies that the problem is not MCDA’s predictive inadequacy per se but the conflation of prescriptive outputs with descriptive evidence.

Predictive validity refers to the correspondence between model outputs and observed hazard events [29,30]. In susceptibility mapping, this involves testing whether areas classified as high hazard indeed experience more frequent or severe events than areas classified as low hazard. Operationalisations of predictive validity include:

ROC/AUC analysis: comparing susceptibility rankings against independent inventories of past events;
Accuracy metrics: proportion of correctly classified hazard locations;
Confusion matrices: systematic tabulation of true/false positives and negatives;
Temporal back-testing: evaluating whether historical events fall within retrospectively classified high-hazard zones.

Predictive validity is distinct from internal consistency checks, such as the Analytic Hierarchy Process consistency ratio. While consistency ratios assess the logical coherence of expert judgements, they provide no evidence that the resulting hazard map corresponds to environmental processes. A model may be internally consistent yet predictively invalid; conversely, predictive validity requires empirical verification independent of model construction.
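The first and third operationalisations listed above can be sketched in a few lines. The susceptibility scores, the event inventory, and the 0.5 threshold are hypothetical; the AUC is computed directly from its rank interpretation (the probability that an event site outscores a non-event site) rather than through any particular library.

```python
# Hypothetical validation of susceptibility scores against an event inventory.

def auc(scores, events):
    """Probability that a random event site outscores a random non-event site.
    Ties count as one half (Mann-Whitney formulation of the ROC area)."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("inventory must contain both events and non-events")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def confusion_matrix(scores, events, threshold):
    """Tabulate true/false positives and negatives for a 'high hazard' cut-off."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for s, e in zip(scores, events):
        predicted_high = s >= threshold
        if predicted_high and e == 1:
            cm["TP"] += 1
        elif predicted_high and e == 0:
            cm["FP"] += 1
        elif not predicted_high and e == 1:
            cm["FN"] += 1
        else:
            cm["TN"] += 1
    return cm

# Assumed model outputs and independent observations (1 = event recorded).
scores = [0.91, 0.78, 0.66, 0.52, 0.40, 0.35, 0.20, 0.15]
events = [1, 1, 0, 1, 0, 0, 1, 0]

area = auc(scores, events)
cm = confusion_matrix(scores, events, threshold=0.5)
print(area, cm)
```

Neither quantity can be derived from the weighting procedure itself: both require an inventory that played no part in model construction.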

2.4.4. Procedural Reliability

Procedural reliability concerns the defensibility and transparency of the decision-making process itself [33,35]. Even when predictive validation is constrained by data scarcity—as is often the case in high-mountain environments—the process by which hazard classifications are generated should be clearly documented and justifiable. Procedural reliability encompasses:

Justification of method selection: explicit reasoning for choosing a particular MCDA technique over available alternatives;
Documentation of expert elicitation: clear reporting of how experts were selected, how judgements were elicited, and how disagreement was resolved;
Uncertainty communication: transparent acknowledgment of model limitations, data gaps, and the conditional nature of hazard classifications;
Stakeholder engagement: evidence that decision processes incorporated relevant perspectives, where appropriate.

Stakeholder engagement is not merely an ethical or procedural add-on; it is epistemically consequential for hazard assessment. In decision environments characterised by high stakes, scientific uncertainty and value disagreement—precisely the conditions of glacier hazard management—the legitimacy of a hazard classification depends partly on whether affected communities and local authorities recognise the assessment process as fair and transparent. Procedural reliability, as we define it, requires evidence that relevant perspectives have been incorporated, that disagreement among experts or stakeholders has been systematically elicited and documented, and that the decision rule (e.g., weighted sum) has been explained to non-technical audiences. Several reviewed studies mention expert consultation, but they rarely report how experts were selected, whether divergent judgments were reconciled or retained, or how stakeholder values (e.g., risk aversion, economic priorities) were translated into criteria weights. Without this documentation, the resulting map may be analytically defensible but procedurally opaque, undermining its uptake in real-world risk governance.

Procedural reliability does not guarantee that a hazard map is predictively accurate, but it ensures that the map is presented as a decision-support artefact whose assumptions are open to scrutiny rather than as an unexamined prediction.

2.4.5. Relationship Among Reliability Dimensions

These four dimensions are neither mutually exclusive nor fully independent. Reproducibility is a prerequisite for both robustness analysis and external validation: if model construction is not transparent, others cannot test its stability or predictive performance. Robustness analysis can be conducted without predictive validation, providing evidence on the stability of classifications even when event data are unavailable. Predictive validity, where demonstrated, provides the strongest form of empirical support, but it depends on the availability of independent hazard inventories. Procedural reliability underpins all dimensions by ensuring that modelling choices are documented and defensible.

Together, these dimensions define a framework for evaluating the reliability of MCDA-based hazard assessments. A study reporting only a single deterministic hazard map, without transparency in weighting, sensitivity analysis, or predictive validation, achieves none of the four dimensions. Conversely, a study that documents all modelling steps, tests robustness through sensitivity analysis, and validates against independent observations satisfies all dimensions and provides strong evidence for risk management decisions.

This framework guides the methodological audit presented in this review. In the following sections, we examine the extent to which existing MCDA applications in glacier hazard assessment achieve reproducibility, robustness, predictive validity, and procedural reliability. The framework also provides the conceptual basis for the MCDA-HAZARD quality assessment instrument described in Section 3.9.

2.5. Limitations of Existing Literature

Despite the large number of applications, the literature exhibits recurring methodological limitations.

First, method selection is rarely theoretically justified. The widespread adoption of AHP and weighted overlay approaches appears to be strongly influenced by their availability in GIS software (e.g., ArcGIS Pro 3.7) rather than by decision-theoretic reasoning. Consequently, compensatory aggregation models are routinely used without examining their assumptions regarding criteria independence and trade-offs.
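The compensation assumption can be made concrete with a small sketch. The two lakes, their scores, and the equal weights are hypothetical, and the minimum operator is used here only as one simple non-compensatory rule for contrast, not as a recommended alternative.

```python
# Hypothetical illustration of compensation in additive aggregation.

def weighted_sum(scores, weights):
    """Compensatory rule: a high score on one criterion offsets a low one."""
    return sum(s * w for s, w in zip(scores, weights))

def minimum_rule(scores):
    """Conjunctive, non-compensatory rule: the weakest criterion decides."""
    return min(scores)

weights = (0.5, 0.5)
lake_a = (0.9, 0.1)  # extreme on one criterion, benign on the other
lake_b = (0.5, 0.5)  # moderate on both criteria

print(weighted_sum(lake_a, weights), weighted_sum(lake_b, weights))
print(minimum_rule(lake_a), minimum_rule(lake_b))
```

Under the weighted sum the two lakes are indistinguishable, whereas the conjunctive rule separates them. Which behaviour is appropriate depends on whether the underlying physical criteria can plausibly trade off against one another, which is the assumption that is rarely examined.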

Second, validation practices are weak and inconsistent. Only a minority of studies perform quantitative validation using independent datasets or statistical accuracy measures (e.g., ROC/AUC). Many studies evaluate results through visual inspection or expert interpretation, limiting the evidential strength of hazard classifications.

Third, uncertainty treatment is limited. Glacier hazard assessment inherently involves incomplete monitoring data, remote sensing limitations, and dynamic environmental processes. Nevertheless, uncertainty is usually addressed only implicitly through expert judgement or simple sensitivity checks. Formal robustness analysis, probabilistic MCDA, or stochastic modelling is rarely implemented.

Taken together, these limitations form a systematic pattern: decision models are frequently interpreted as objective spatial predictions while remaining highly dependent on subjective weighting and compensatory aggregation assumptions.

2.6. Research Gap

A striking feature of the existing literature is not the lack of applications but the lack of reflection on what these applications actually imply. Over the past decade, a substantial number of MCDA-based studies have produced detailed glacier hazard maps for specific locations. These maps are often visually compelling and operationally useful. Yet, when viewed collectively, they raise a more fundamental question: what kind of knowledge do these models produce, and how should that knowledge be interpreted?

Despite the volume of case studies, there has been little effort to examine MCDA-based glacier hazard assessment from a methodological reliability perspective. The literature has largely evolved around producing outputs—hazard maps, rankings, classifications—without systematically questioning the assumptions that underpin them or the extent to which their results can be considered robust, reproducible, or empirically valid.

This absence of critical examination becomes particularly relevant when considering how these outputs are used. In many cases, hazard maps derived from MCDA are implicitly treated as predictive representations of risk. However, these maps are generated through decision models that depend on weighting schemes, aggregation rules, and expert judgements—elements that are rarely subjected to systematic sensitivity analysis, uncertainty modelling, or empirical validation. The issue, therefore, is not simply methodological; it is interpretative.

From this perspective, the gap in the literature is not only about missing techniques or incomplete analyses. It reflects a deeper misalignment between how MCDA models are constructed and how their outputs are understood and applied. This misalignment raises important questions about the reliability of hazard classifications and, more importantly, about the confidence that decision-makers can place in them when supporting risk management and climate adaptation strategies.

Importantly, our contribution is not merely incremental. Previous systematic reviews in natural hazard MCDA, exemplified by de Brito and Evers [36] on flood risk, have focused primarily on method inventories (which MCDA techniques are used), geographical distributions, or conceptual frameworks. What distinguishes the present review is its explicit focus on methodological reliability as an object of analysis. We ask not only “what methods are used?” but also “how defensibly are they applied?”, “are results validated?”, “is uncertainty quantified?”, and “are classifications robust?” To our knowledge, no prior review has operationalised reliability across the four dimensions we introduce (reproducibility, robustness, predictive validity, and procedural reliability) and applied them systematically to glacier hazard assessments. This study, therefore, fills a distinct gap: it moves the field from descriptive cataloguing to critical methodological auditing.

Consequently, the field still lacks a clear understanding of:

Why particular MCDA methods—especially AHP—have become dominant in practice;
Whether criteria weighting reflects methodological justification or practical convenience;
How uncertainty and robustness are actually addressed, beyond implicit acknowledgement;
Whether resulting hazard classifications can be considered reliable decision-support outputs or should be interpreted more cautiously as structured expert judgements.

Addressing this gap requires moving beyond cataloguing applications towards a systematic and critical examination of methodological practice. In particular, it requires assessing not only what methods are used but how they are used, what assumptions they embed, and what claims can reasonably be made about their outputs.

2.7. Contribution of This Study

This study responds to the identified gap by performing a systematic literature review in accordance with the PRISMA 2020 guidelines, concentrating specifically on methodological robustness rather than on how often methods are applied. Yet the value of this work extends beyond providing a structured overview; its main contribution is to reshape the way MCDA-based glacier hazard assessments are conceptualised.

First, the study offers a cross-study methodological audit of MCDA applications in glacier hazard assessment. Rather than cataloguing methods or reporting their frequency of use, the analysis examines how decision models are constructed in practice—how criteria are selected and weighted, how uncertainty is handled, and how (or whether) results are validated. This shift in focus—from methods to modelling practice—allows for a more critical evaluation of the credibility of the resulting hazard classifications.

Second, the study identifies a systematic pattern that has not been explicitly articulated in the literature. MCDA-based hazard maps are commonly presented and interpreted as objective, spatially explicit representations of risk, yet they are typically derived from preference-dependent models with limited validation, minimal robustness analysis, and implicit treatment of uncertainty. The contribution here is not simply empirical but conceptual: the study exposes a misalignment between how these models are constructed and how their outputs are understood and used.

Third, by synthesising evidence across studies, the paper provides a structured basis for reinterpreting the role of MCDA in glacier hazard assessment. The findings suggest that, in most cases, MCDA functions as a decision-structuring framework rather than a predictive modelling approach. Recognising this distinction is essential for the appropriate use of such models, particularly in contexts where they inform risk management and climate adaptation decisions.

Finally, the study outlines a research agenda aimed at improving the reliability and interpretability of MCDA-based hazard assessments. This includes the need for explicit validation against observed events, systematic sensitivity and robustness analysis, transparent modelling of uncertainty, and a comparative evaluation of alternative or hybrid decision frameworks.

Table 1 positions this study relative to previous reviews in natural hazard MCDA research. While existing reviews primarily focus on method inventories or conceptual discussions, they do not systematically evaluate validation practices, uncertainty treatment, or methodological reliability. In contrast, this study provides an integrated, cross-study assessment of these dimensions, establishing a foundation for more robust and defensible decision-support tools in cryospheric risk management.

3. Research Methodology

This section defines the rationale and objectives of the review a priori, followed by the methodological procedures, in alignment with the PRISMA 2020 reporting framework.

The increasing application of multi-criteria decision analysis (MCDA) techniques in glacier hazard susceptibility assessment has led to a heterogeneous body of research addressing diverse hazard types, information sources, modelling assumptions, and decision-making contexts. These approaches aim to support hazard zonation, prioritisation of mitigation actions, and resource allocation under conditions of uncertainty. However, despite the growing number of studies, there is limited consolidated evidence regarding which MCDA methods are predominantly used, how they are applied, and how they address uncertainty and decision-makers’ preferences in glacier hazard assessment.

To ensure methodological rigour, transparency, and reproducibility, this study was conducted as a systematic literature review (SLR) following the PRISMA 2020 statement (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). PRISMA provides a standardised reporting framework that supports transparent identification, selection, assessment, and synthesis of studies, while reducing reporting bias and improving reproducibility. The review protocol (including research questions, eligibility criteria, search strategy, data extraction framework, synthesis procedures, and quality assessment approach) was defined a priori to minimise procedural bias and ensure methodological consistency. The protocol was registered in the Open Science Framework (OSF) and is publicly available, supporting open and reproducible research practices.

All relevant reporting items are addressed within the manuscript using PRISMA-consistent section headings. A completed PRISMA 2020 Expanded Checklist, indicating where each item is reported, is provided as Multimedia Appendix A. The following subsections describe the methodological components of the review in alignment with the PRISMA reporting structure, including the rationale, objectives, protocol registration, eligibility criteria, information sources, search strategy, study selection process, data collection procedures, synthesis methods, and quality assessment.

By synthesising evidence across heterogeneous MCDA applications in glacier hazard susceptibility assessment, this review provides a structured overview of decision-analysis approaches used to evaluate glacier-related hazards. The findings identify dominant methods, typical modelling practices, the treatment of uncertainty, and criteria-weighting strategies, thereby offering a consolidated reference for researchers and practitioners working in natural hazard assessment and decision-support systems.

Foundations

3.1. Rationale

Although numerous studies apply MCDA to the assessment of glacier hazards, the literature has primarily evaluated the resulting hazard maps rather than the decision models that produce them. Consequently, the reliability, robustness, and epistemic validity of MCDA-based hazard classifications remain largely unexamined.

Glacier hazard mapping is not merely a spatial modelling task but a decision-analytic process in which assumptions about weighting, aggregation, and uncertainty directly influence the resulting classifications. Therefore, evaluating which methods are used is insufficient; it is necessary to understand why particular methods dominate practice, whether weighting procedures are scientifically defensible, and whether model output can be considered reliable decision-support evidence.

This review is, therefore, motivated not by the need to catalogue applications but by the need to critically audit the methodological foundations of MCDA-based glacier hazard assessment.

3.2. Objectives

The objective of this systematic review is to evaluate the methodological reliability of multi-criteria decision analysis (MCDA) applications in glacier hazard assessment.

Rather than cataloguing applications, the review investigates the decision-analytic logic underlying current practice. Specifically, the review examines the drivers behind the prevalence of particular MCDA methods, the justification of weighting strategies, the treatment of uncertainty, and the extent to which hazard classifications are validated or robust.

The aim is to determine whether current MCDA-based glacier hazard assessments function as reliable analytical models or as context-dependent decision-support interpretations.

More specifically, the review addresses the following research questions:

RQ1: Why has the Analytic Hierarchy Process (AHP) become the predominant MCDA method in glacier hazard assessment studies?
RQ2: To what extent do criteria weighting practices reflect methodological justification rather than practical convenience?
RQ3: Why are uncertainty and robustness analyses rarely incorporated in MCDA-based glacier hazard assessments?
RQ4: Which methodological improvements (e.g., sensitivity analysis, comparative modelling, and hybrid or ensemble MCDA) could enhance the reliability of glacier hazard assessments?

By addressing these research questions, the review moves beyond cataloguing applications and instead examines the methodological logic underlying current practice. The synthesis evaluates why specific MCDA approaches dominate the field, how weighting and uncertainty are treated in decision models, and whether existing validation practices support reliable hazard classification. In doing so, the study identifies recurring assumptions that shape glacier hazard assessments and outlines directions for developing more robust and defensible decision-support frameworks.

Methods

3.3. Eligibility Criteria

Studies were included if they:

–: Address glacier hazard susceptibility, vulnerability, or risk assessment;
–: Apply a multi-criteria decision analysis (MCDA) approach, method, or technique;
–: Use MCDA to evaluate, classify, rank, or prioritise hazardous areas or mitigation alternatives;
–: Present an empirical application, methodological proposal, or case study related to glacier hazards;
–: Provide sufficient methodological description of weighting, aggregation, or validation procedures.

Exclusion criteria included:

–: Studies not related to glacier or glacial hazards;
–: Studies addressing natural hazards without application of an MCDA method;
–: Papers using optimisation, simulation, machine learning, or statistical models without an MCDA component;
–: Studies focused solely on physical modelling or calibration of hazard processes;
–: Surveys, editorial papers, theses, technical reports, or non–peer-reviewed publications;
–: Duplicate records or studies lacking sufficient methodological description;
–: Studies reporting hazard maps without methodological explanation of the decision model;
–: Studies where MCDA is mentioned but the weighting or aggregation process cannot be reconstructed.

For synthesis, eligible studies were grouped by MCDA method, type of decision problem, uncertainty treatment, and weighting strategy.

3.4. Information Sources

A comprehensive search was conducted in the major scientific databases that cover earth sciences, environmental sciences, and interdisciplinary research. The following sources were consulted: ScienceDirect (Elsevier); SpringerLink; Wiley Online Library; Taylor & Francis Online; and Scopus.

The search covered publications from 2015 to 2025.

3.5. Search Strategy

The search strategy combined three conceptual components: (1) multi-criteria decision-making methods, (2) hazard or risk assessment, and (3) glacier-related phenomena.

The generic search expression was:

(multicriteria OR multi-criteria OR multiple criteria OR MCDA OR multiple criteria decision) AND (hazard OR vulnerability OR susceptibility OR risk) AND (glacier OR glacial)
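The generic expression can be assembled from its three conceptual blocks, which makes the structure explicit and eases adaptation per database. The sketch below is illustrative only: the `quote_phrases` option is an assumption about how multi-word terms might be handled in some search interfaces, and it does not reproduce the database-specific strings actually used in the review.

```python
# Compose the generic search expression from its three conceptual blocks.
# The block contents are copied from the generic expression; the optional
# phrase quoting is a hypothetical convenience, not the review's procedure.

BLOCKS = [
    ["multicriteria", "multi-criteria", "multiple criteria", "MCDA",
     "multiple criteria decision"],
    ["hazard", "vulnerability", "susceptibility", "risk"],
    ["glacier", "glacial"],
]

def build_query(blocks, quote_phrases=False):
    """Join terms with OR inside each block and blocks with AND."""
    def fmt(term):
        return f'"{term}"' if quote_phrases and " " in term else term
    return " AND ".join(
        "(" + " OR ".join(fmt(t) for t in block) + ")" for block in blocks
    )

generic = build_query(BLOCKS)
quoted = build_query(BLOCKS, quote_phrases=True)
print(generic)
print(quoted)
```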

Table 2 presents the conceptual blocks and illustrative terms used to design the search strategy. These terms capture the overlap between multi-criteria decision analysis methods, hazard and risk assessment concepts, and glacier-related phenomena, thereby supporting a search that is both comprehensive and appropriately targeted to the relevant literature.

Database-specific search strings, filters, and field restrictions are reported in Appendix B.

3.6. Selection Process

The study selection followed a structured multi-stage screening process:

1.: Removal of duplicate records;
2.: Screening of titles and abstracts;
3.: Full-text eligibility assessment;
4.: Final inclusion based on predefined criteria.

Two reviewers independently evaluated each record at the title/abstract and full-text stages. Disagreements were resolved through discussion and consensus. No automation or machine-learning screening tools were used.

3.7. Data Collection Process

Data were extracted using a predefined extraction template designed to capture information necessary to address the research questions. For each included study, information was recorded on the details of the publication, the methodological approach, the application context, and the reported results. Each article was treated as an independent source of evidence. The extraction was performed by one reviewer and validated by a second reviewer. Ambiguous cases were discussed jointly until agreement was reached. When information was missing or unclear, it was coded as “not reported” rather than inferred.

During data extraction, a distinction was made between internal consistency checks and predictive validation. Several studies reported the Analytic Hierarchy Process (AHP) consistency ratio (CR) or similar logical coherence indicators. Although these measures evaluate the internal consistency of expert judgments, they do not assess the predictive performance of the hazard model. Therefore, AHP consistency ratios and comparable internal checks were not classified as quantitative validation.

For the purposes of this review, quantitative validation was strictly defined as the use of predictive performance metrics derived from comparison with independent observations, such as ROC curves, AUC values, precision measures, confusion matrices, or equivalent statistical indicators. This definition was applied consistently across all included studies.

What the AHP Consistency Ratio Does and Does Not Tell You

The AHP consistency ratio (CR) measures only the logical coherence of an expert’s pairwise comparisons. A CR below 0.10 is conventionally taken to indicate that the judgements are acceptably coherent, that is, largely free of contradictions such as stating A > B, B > C, and C > A. It does not provide evidence that the resulting criteria weights correspond to physical processes, that the hazard map predicts observed events, or that the classification is robust under alternative weight specifications. A model can be perfectly internally consistent yet predictively invalid. Conversely, a model with a higher CR may still produce empirically accurate classifications if the underlying expert knowledge is sound. Reporting a low CR is, therefore, at best a precondition for defensible weighting and is wholly insufficient for claiming model reliability. This review treats CR reporting separately from predictive validation and does not count CR as evidence of model performance.
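A short computational sketch makes the scope of the CR explicit. The pairwise comparison matrices are hypothetical, the principal eigenvalue is approximated by power iteration, and the random index (RI) values are the commonly tabulated Saaty values; nothing in the calculation refers to observed hazard events.

```python
# Illustrative AHP consistency ratio for a reciprocal pairwise comparison matrix.

# Commonly tabulated Saaty random index values by matrix size.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32,
                8: 1.41, 9: 1.45, 10: 1.49}

def principal_eigen(matrix, iterations=200):
    """Approximate the principal eigenvalue and priority vector by power iteration."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(av[i] / v[i] for i in range(n)) / n
    return lambda_max, v

def consistency_ratio(matrix):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    lambda_max, _ = principal_eigen(matrix)
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical judgements over three criteria.
perfectly_consistent = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]
mildly_inconsistent = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]

cr_a = consistency_ratio(perfectly_consistent)
cr_b = consistency_ratio(mildly_inconsistent)
print(round(cr_a, 4), round(cr_b, 4))
```

Both matrices pass the 0.10 convention, yet the inputs contain only expert judgements. The calculation therefore cannot indicate whether either set of weights yields a map that matches observed events.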

3.8. Data Items

The following information was extracted from each included study:

–: Hazard domain and application objective (e.g., GLOF susceptibility mapping, hazard zonation, prioritisation, or planning support);
–: Study region and geographical context;
–: MCDA method(s) employed, including hybrid or extended approaches;
–: Technological implementation environment (e.g., GIS-based workflow, remote sensing integration, hydrodynamic modelling, or decision-support software);
–: Number and type of criteria used in the decision model;
–: Criteria weighting procedure and source of weights (expert judgement, literature-based, statistical/data-driven, or software default);
–: Presence of methodological justification for the choice of MCDA method;
–: Consistency verification of pairwise comparisons (e.g., reporting of the Analytic Hierarchy Process consistency ratio);
–: Treatment of uncertainty, including sensitivity analysis, probabilistic modelling, fuzzy approaches, or absence of uncertainty analysis;
–: Aggregation structure of the decision model (e.g., additive/weighted overlay, multiplicative, fuzzy, or stochastic);
–: Validation procedure applied (e.g., comparison with historical events, ROC/AUC or accuracy metrics, qualitative comparison, or no validation);
–: Software and implementation framework used in model construction;
–: Limitations explicitly reported by the study authors.

All reported results relevant to these domains were collected. To enable structured synthesis and cross-study comparability, each included study was coded using a predefined classification scheme covering hazard type, MCDA aggregation approach, enabling technologies, uncertainty treatment, and weighting strategy. The operational definitions and category labels used for coding are summarised in Table A2. This coding scheme was applied consistently across the dataset and forms the basis for the descriptive statistics, cross-tabulations, and subgroup analyses reported in Section 5.

Missing or unclear information was documented without assumptions.

3.9. Quality Assessment Protocol

To evaluate methodological reliability beyond descriptive synthesis, we developed a domain-specific quality appraisal instrument—the MCDA-HAZARD framework (Table 3). Because no established tool exists for assessing spatial decision-analysis modelling in cryospheric hazard research, this framework was constructed to address recurring methodological issues identified in the literature.

3.9.1. Domains and Scoring

The instrument assesses five methodological domains:

1.: Criteria definition: justification of selected variables and documentation of expert consultation or literature support.
2.: Weighting transparency: reporting of weighting procedures, including pairwise comparison matrices, rationale, or consistency verification.
3.: Uncertainty analysis: implementation of sensitivity analysis, scenario testing, fuzzy modelling, probabilistic approaches, or other robustness evaluations.
4.: Validation: evaluation against independent hazard inventories, statistical performance measures (ROC/AUC, accuracy), or temporal back-testing.
5.: Reproducibility: disclosure of data sources, parameters, and sufficient methodological detail to enable replication.

Each domain is scored on a three-point scale: 0 (not reported), 1 (partially addressed), or 2 (clearly implemented and documented). Total scores, therefore, range from 0 to 10.

Table A4 in Appendix C provides a detailed scoring rubric that operationalises each domain and score level.
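The scoring rule described above (five domains, each scored 0, 1, or 2, summed to a total between 0 and 10) can be expressed as a minimal sketch. The domain identifiers and the example study are hypothetical; the detailed criteria for assigning each score level are those of the rubric and are not reproduced here.

```python
# Minimal sketch of MCDA-HAZARD total scoring; identifiers are illustrative.

DOMAINS = (
    "criteria_definition",
    "weighting_transparency",
    "uncertainty_analysis",
    "validation",
    "reproducibility",
)
ALLOWED_SCORES = (0, 1, 2)  # not reported, partially addressed, clearly documented

def total_score(domain_scores):
    """Validate per-domain scores and return the 0-10 total."""
    missing = [d for d in DOMAINS if d not in domain_scores]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for domain in DOMAINS:
        if domain_scores[domain] not in ALLOWED_SCORES:
            raise ValueError(f"invalid score for {domain}")
    return sum(domain_scores[d] for d in DOMAINS)

# Hypothetical study: well-documented weighting, no uncertainty analysis or validation.
example_study = {
    "criteria_definition": 2,
    "weighting_transparency": 2,
    "uncertainty_analysis": 0,
    "validation": 0,
    "reproducibility": 1,
}
example_total = total_score(example_study)
print(example_total)
```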

3.9.2. Assessment Procedure

All 60 included studies were independently assessed by two reviewers using predefined criteria. Discrepancies were resolved through consensus discussion.

Inter-reviewer agreement was assessed on a random subset of 20 studies (33% of the corpus). For each of the five MCDA-HAZARD domains, both reviewers independently assigned scores (0, 1, or 2). Percentage agreement before consensus ranged from 78% to 92% across domains: criteria definition (85%), weighting transparency (78%), uncertainty analysis (92%), validation (80%), and reproducibility (82%). Disagreements (primarily adjacent score differences, e.g., 1 vs. 2) were resolved through consensus discussion. Although Cohen’s kappa would provide a chance-corrected measure, percentage agreement is reported here as a transparent indicator of coding consistency; full coding data are available in the OSF repository. The purpose of this assessment is not to exclude studies but, rather, to characterise the strength of available evidence and to identify recurring methodological weaknesses.
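For readers who wish to compare the two agreement measures, the sketch below computes percentage agreement and unweighted Cohen’s kappa for one domain. The reviewer scores are hypothetical and do not reproduce the coding data of this review.

```python
# Percentage agreement and unweighted Cohen's kappa for two reviewers.
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters assigned the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both raters used a single category
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical scores (0, 1, 2) assigned by two reviewers to ten studies.
reviewer_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
reviewer_2 = [0, 0, 1, 2, 2, 2, 0, 1, 1, 1]

agreement = percent_agreement(reviewer_1, reviewer_2)
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Because kappa discounts agreement expected by chance, it is lower than raw percentage agreement whenever any chance agreement is expected, as in this example.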

3.9.3. Relationship to Reliability Framework

The MCDA-HAZARD instrument operationalises the reliability concepts defined in Section 2.4: reproducibility corresponds to domain 5; robustness is captured in domain 3; predictive validity is assessed in domain 4; and procedural reliability is reflected across domains 1, 2, and 5. Aggregate scores are analysed descriptively in Section 5.3 and used to support the interpretation of the review findings.

3.10. Study Risk of Bias Assessment

Formal clinical risk-of-bias tools are not applicable because the included studies are methodological, modelling, or case-study-orientated rather than experimental or intervention-based. Instead, studies’ methodological quality was evaluated according to decision-model reliability criteria:

Explicit description of the weighting procedure;
The presence of consistency or sensitivity analysis;
Validation against independent evidence;
The transparency of aggregation assumptions.

Two reviewers independently evaluated each study and resolved disagreements by consensus. The quality assessment informed the interpretation but was not used as an exclusion criterion.

3.11. Effect Measures

This review does not perform a quantitative meta-analysis. Therefore, statistical effect measures (e.g., risk ratios and mean differences) are not applicable. The synthesis focuses on qualitative and categorical description of MCDA applications.

3.12. Synthesis Methods

All studies that met the eligibility criteria were included in the qualitative synthesis. The studies were categorised according to:

1.: MCDA method;
2.: Type of decision problem;
3.: Uncertainty treatment.

The data extracted were standardised into predefined categories (method type, weighting approach, and problem type). The terminological variations between studies were harmonised to allow comparison.

In addition to descriptive categorisation, an interpretative cross-study analysis was conducted to identify explanatory patterns behind methodological choices. Studies were compared to determine common drivers of method selection, barriers to robustness analysis, and recurring methodological assumptions.

The results were summarised using descriptive tables, frequency distributions, and graphical representations (e.g., temporal trends, method distributions, and types of hazard). The results of this synthesis are reported separately in Section 5.
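The harmonisation and tabulation steps described in this section can be sketched as follows. The label mapping, category names, and records are hypothetical examples of the kind of terminological variation encountered, not the coding scheme or data of this review.

```python
# Harmonise method labels, then build frequency and cross-tabulation counts.
from collections import Counter

# Hypothetical mapping from reported labels to harmonised categories.
HARMONISED = {
    "ahp": "AHP",
    "analytic hierarchy process": "AHP",
    "analytical hierarchy process": "AHP",
    "fuzzy ahp": "Fuzzy AHP",
    "topsis": "TOPSIS",
}

def harmonise(label):
    """Return the harmonised category; unknown or missing labels are not inferred."""
    if label is None:
        return "not reported"
    return HARMONISED.get(label.strip().lower(), "other")

# Hypothetical extraction records (method label, uncertainty treatment).
records = [
    ("AHP", "none"),
    ("Analytical Hierarchy Process", "sensitivity analysis"),
    ("analytic hierarchy process", "none"),
    ("Fuzzy AHP", "fuzzy"),
    ("TOPSIS", "none"),
    (None, "none"),
]

method_counts = Counter(harmonise(m) for m, _ in records)
cross_tab = Counter((harmonise(m), u) for m, u in records)
shares = {m: c / len(records) for m, c in method_counts.items()}

for method, count in method_counts.most_common():
    print(f"{method}: {count} ({shares[method]:.0%})")
```

Coding missing labels as “not reported” rather than assigning them to a category mirrors the extraction rule stated in Section 3.7.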

A statistical meta-analysis was not conducted because the studies report heterogeneous qualitative and methodological outcomes rather than comparable numerical effect estimates.

Heterogeneity was explored through a comparative grouping of studies by method type, decision problem, and uncertainty treatment.

Sensitivity analysis was not applicable due to the absence of pooled statistical measures. Instead, robustness was assessed by verifying that patterns were supported across multiple independent studies rather than isolated cases.

3.13. Reporting Bias Assessment

Publication bias was mitigated by searching multiple interdisciplinary databases and examining reference lists. The restriction to peer-reviewed articles in English is acknowledged as a potential source of bias. The review also considered methodological reporting bias, recognising that studies reporting successful hazard classifications may underreport model limitations, uncertainty, or failed validation attempts.

3.14. Certainty Assessment

Because the review synthesises methodological and descriptive evidence rather than effect estimates, formal certainty-of-evidence frameworks (e.g., GRADE) are not applicable. Confidence in the findings was evaluated based on the consistency of results across studies and completeness of reporting.

4. Other Information

4.1. Registration and Protocol

4.1.1. Registration

The protocol was registered a priori in the Open Science Framework (OSF) to ensure transparency and methodological reproducibility. The registered protocol is publicly available at: https://doi.org/10.17605/OSF.IO/46PZ9.

4.1.2. Protocol

Before undertaking the formal review, the protocol was developed and pilot-tested to confirm that the procedures were clear, precise, and practically feasible.

The review protocol outlines the search strategy, data sources, screening phases, inclusion and exclusion criteria, data extraction template, synthesis methods, and procedures to resolve any disagreements between reviewers.

4.1.3. Amendments

The research questions were refined after preliminary screening to shift the focus from descriptive mapping of applications to methodological reliability analysis. Similarly, the data extraction framework and synthesis procedures were expanded to include validation practices, a weighting justification, and a robustness analysis.

5. Results

5.1. Study Selection Results

The results of the study identification, screening and eligibility process are summarised in the PRISMA flow diagram shown in Figure 1. Appendix A provides an overview of the included studies.

The database search identified a total of 571 records. After the removal of 59 duplicate records, 512 records were screened based on title and abstract. Of these, 229 records were excluded during the screening stage. A total of 283 full-text reports were sought for retrieval, of which 54 could not be obtained. Consequently, 229 full-text articles were assessed for eligibility. Following the full-text assessment, 169 reports were excluded for predefined reasons. The remaining 60 studies met the inclusion criteria and were included in the final synthesis. The resulting group of included studies forms the evidence base for the synthesis presented in the subsequent subsections. This procedure promotes transparency in how studies were selected and enhances the reproducibility of the review.
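The selection counts reported above can be checked with simple arithmetic. The following Python sketch recomputes each stage of the PRISMA flow from the figures stated in the text; the variable names are illustrative and are not part of the review protocol.

```python
# Counts as reported in the study-selection paragraph.
identified = 571
duplicates_removed = 59
excluded_at_screening = 229
not_retrieved = 54
excluded_at_full_text = 169

screened = identified - duplicates_removed       # records screened on title/abstract
sought = screened - excluded_at_screening        # full-text reports sought
assessed = sought - not_retrieved                # full-text reports assessed
included = assessed - excluded_at_full_text      # studies in the final synthesis

print(screened, sought, assessed, included)
```

Each intermediate value can be compared directly with the corresponding box in the PRISMA flow diagram (Figure 1).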

Full-text articles were excluded primarily due to a lack of methodological relevance or insufficient information. The most frequent exclusion reasons were insufficient methodological description, studies that did not address glacier hazards, and the absence of a specific MCDA method. Additional exclusions included non-English publications, conference abstracts without full papers, theses or dissertations, and duplicate content.

5.2. Study Characteristics

Table A3 (Appendix C) presents the detailed characteristics of all included studies. The review includes 60 articles published between 2015 and 2025 that apply multi-criteria decision analysis (MCDA) to glacier-related hazard assessment. For each study, the extracted information records: (i) the hazard type analysed, (ii) the MCDA technique employed, (iii) the technological environment supporting the analysis (e.g., GIS and remote sensing), and (iv) the approach used for criteria weighting and treatment of uncertainty.

The purpose of this table is to document the empirical corpus and ensure the transparency of the data extraction process. The patterns and distributions derived from these characteristics are analysed in the following subsections.

5.2.1. Publication Timeline

Figure 2 shows the annual distribution of the included studies, while Figure 3 presents the cumulative publication trend over time. The results indicate a gradual emergence of research on the application of MCDA techniques to glacier hazard susceptibility assessment, followed by a clear increase in publication activity after 2019.

The cumulative curve highlights a sustained growth pattern, particularly from 2020 onwards, suggesting consolidation of this topic as an established research line within natural hazard assessment and decision-support studies.

Most of the included works correspond to applied case studies in which MCDA methods are used to support hazard mapping, susceptibility zonation, and the prioritisation of mitigation measures in glacierised environments.

Although there were isolated studies on glacier hazards prior to 2015, very few met the eligibility criteria of this review. Earlier research on glacier hazard assessment focused mainly on the physical, geomorphological, or hydrological characterisation of hazards rather than on formal decision-support modelling. In contrast, the studies included in this review apply explicit multi-criteria decision analysis frameworks to integrate heterogeneous spatial information and support hazard zoning or prioritisation tasks. Therefore, the increase in eligible studies after 2015 reflects the progressive adoption of decision-analysis and spatial modelling approaches in assessing glacier hazards, rather than the absence of earlier scientific investigation of glacier-related hazards.

5.2.2. Hazard Types

Figure 4 presents the distribution of the hazard types addressed by the reviewed studies. Glacial lake outburst floods (GLOFs) represent the dominant application domain, accounting for the majority of studies. A smaller but relevant subset of research applies MCDA techniques to landslide susceptibility and debris-flow hazards in glacierised mountain environments. Only a limited number of studies address snow or ice avalanches and other cryospheric hazards.

This distribution indicates that the adoption of MCDA techniques in cryospheric environments has been primarily driven by the need to prioritise potentially dangerous glacial lakes and to support early-warning and mitigation planning in high mountain regions.

5.2.3. MCDA Methods Used

Figure 5 presents the multi-criteria decision analysis techniques used in the reviewed literature. The Analytic Hierarchy Process (AHP) is overwhelmingly the most frequently adopted method. Several studies combine AHP with other approaches (e.g., TOPSIS or COPRAS), forming hybrid MCDA frameworks, while alternative MCDA families appear only sporadically. The prevalence of AHP suggests that practical usability, ease of implementation, and compatibility with GIS-based spatial modelling strongly influence the selection of the method in glacier hazard assessment.

5.2.4. Technological Environment

Figure 6 summarises the technological environments that support the implementation of MCDA. Most studies apply MCDA within geographic information systems (GISs) and combine it with remote sensing datasets and terrain analysis derived from digital elevation models (DEMs). Satellite imagery and spatial overlays represent the primary data sources for evaluating hazard factors.

Recent publications increasingly integrate machine learning and probabilistic modelling with MCDA frameworks, indicating a transition from purely expert-driven assessment toward hybrid data-driven hazard modelling.

5.2.5. Weighting and Uncertainty Treatment

Figure 7 summarises the strategies adopted to assign criteria weights and address uncertainty within the reviewed MCDA applications.

The majority of studies employ expert-driven weighting procedures, most commonly implemented through pairwise comparisons within the Analytic Hierarchy Process (AHP). In many cases, weights are derived from expert judgement or stakeholder consultation rather than from empirical calibration. By contrast, objective or data-driven weighting approaches, such as statistically derived weights or learning-based estimation, appear only in a limited subset of the literature.

A clear distinction is observed between weighting and uncertainty treatment. Although weighting procedures are applied almost universally, the explicit modelling of uncertainty is comparatively rare. Only a small proportion of studies incorporate sensitivity analysis, probabilistic frameworks, or fuzzy set theory to evaluate the stability of hazard classifications. Most studies, therefore, produce a single deterministic hazard map or susceptibility ranking without assessing the variability of results under alternative weighting configurations.

These findings indicate that MCDA methods are widely used as decision-support tools for glacier hazard prioritisation, but robustness evaluation and uncertainty quantification remain underdeveloped components of current practice.

The methodological implications of these patterns are discussed in Section 6.

5.3. Methodological Quality of the Included Studies

The methodological quality of the 60 included studies was evaluated using the MCDA-HAZARD Quality Assessment Instrument described in Section 3.9. The instrument assesses five domains: criteria selection, weighting transparency, uncertainty treatment, validation, and reproducibility.

Table 4 presents summary statistics of methodological quality across all 60 studies (more details are provided in Appendix C). The results reveal a clear hierarchy: criteria definition and reproducibility are relatively well-reported (mean scores 1.57), while uncertainty analysis is almost entirely absent (mean 0.23, only 5% of studies fully reporting). Validation and weighting transparency occupy an intermediate position, with mean scores of 1.12 and 1.13, respectively.

Figure 8 illustrates the distribution of methodological quality by domain.

The results reveal a clear imbalance in methodological practices. Most studies adequately document data sources and environmental criteria selection, and weighting procedures are generally reported, particularly when using the Analytic Hierarchy Process (AHP). However, substantial weaknesses are observed in uncertainty treatment and validation practices.

Only a small subset of studies performs formal sensitivity analysis or tests multiple weighting scenarios. Similarly, independent validation using observed hazard events or statistical performance metrics (e.g., ROC or AUC) is uncommon. In many cases, validation is limited to visual agreement with known hazardous locations or expert judgement.

Reproducibility is also variable. Although spatial datasets are often cited, full disclosure of weighting parameters, pairwise comparison matrices, and modelling assumptions is frequently incomplete, making independent replication difficult.

Overall, the quality appraisal indicates that the primary limitation of the current MCDA applications in glacier hazard assessment is not the absence of applications, but the absence of systematic reliability evaluation. Most studies produce operational hazard maps, yet comparatively few evaluate whether the resulting classifications are stable, reproducible, or predictive.

These findings support the central argument of this review: current glacier-hazard MCDA models function mainly as decision-support tools rather than validated predictive models.

5.4. Quantitative Synthesis of Evidence

5.4.1. Method Selection and Hazard Domain

To examine whether the choice of the MCDA technique depends on the hazard domain, a cross-tabulation was performed between hazard type and the primary decision method (Table 5).

The results indicate a pronounced methodological concentration around the Analytic Hierarchy Process (AHP). Across the 60 reviewed studies, AHP was used in 36 cases (60%). Importantly, the method appears consistently across all hazard categories: 13 GLOF-focused studies, 11 landslide studies, and 12 multi-hazard assessments employed AHP as the primary decision framework. No hazard category is associated with a distinct or specialised decision model.

This pattern suggests that method selection is largely independent of the physical characteristics of the hazard being analysed. Instead, a single decision approach is routinely transferred across different problem types without adaptation to hazard-specific analytical requirements.

The distribution also demonstrates limited methodological diversity. Alternative techniques appear only sporadically: fuzzy AHP was identified in 3 studies (5%), the Best–Worst Method (BWM) in 2 studies (3%), and TOPSIS in only 1 study (2%). The remaining studies (18 cases, 30%) used heterogeneous or hybrid approaches rather than clearly defined alternative MCDA frameworks.

Consequently, the literature provides little evidence that method choice is driven by analytical suitability for specific hazards. Rather, the same decision structure is repeatedly applied regardless of whether the problem concerns glacial lake outburst floods, landslide susceptibility, or multi-hazard classification.

Table 5, therefore, indicates that glacier hazard MCDA practice is method-centred rather than problem-centred. The widespread adoption of AHP is unlikely to reflect demonstrated predictive superiority. Instead, its prevalence appears to stem from operational convenience, ease of implementation, and compatibility with GIS-based weighted overlay workflows, which have effectively standardised methodological practice across otherwise heterogeneous hazard contexts.

Table 6 examines whether method choice depends on the hazard domain. The results indicate a pronounced methodological concentration around AHP, which appears in 54.2% of GLOF studies, 73.3% of landslide studies, and 57.1% of multi-hazard assessments. This limited variation—a maximum difference of 19 percentage points—suggests that method selection is largely independent of hazard-specific analytical requirements.

These results indicate that the selection of AHP does not strongly depend on the type of glacier-related hazard under investigation. Instead, the method is applied with comparable frequency across distinct hazard contexts. This pattern suggests that methodological choice is relatively invariant to the physical characteristics of the hazard domain and reflects a broadly standardised modelling approach across applications.

Table 7 summarises the overall distribution of MCDA techniques. AHP-based approaches account for 60% of the reviewed studies, while hybrid or loosely specified implementations represent 30%. Clearly alternative, formally distinct MCDA frameworks (BWM, TOPSIS, and fuzzy AHP) appear in only six studies (10%). The distribution indicates a strong methodological concentration around a single decision framework. Although several techniques exist within the MCDA family, most applications rely on a similar modelling structure. The limited representation of alternative methods demonstrates restricted methodological diversity within glacier hazard MCDA studies.

Taken together, the cross-tabulation presented in Table 5 and the proportional analysis shown in Table 7 indicate that the predominance of AHP reflects a consistent modelling convention rather than adaptation of the decision method to specific hazard processes. The strong methodological concentration reported in Table 7 further shows that the same analytical structure is applied across heterogeneous hazard types. These results support the interpretation that current practice is method-centred rather than hazard-centred.

5.4.2. Validation Practices over Time

To evaluate whether methodological rigour has improved over time, studies were grouped by publication year and classified according to whether they reported quantitative validation (e.g., ROC/AUC, accuracy metrics, or confusion-matrix indicators). The yearly distribution is reported in Table 8 and Table 9 and illustrated in Figure 9.

Across the 60 reviewed studies, quantitative validation was reported in 21 cases (35.0%), while 39 studies (65.0%) relied on qualitative comparison, historical-event matching, or no validation procedure (Table 8). Validation rates varied considerably between years, ranging from 0% to 50%, but no consistent increasing trend is observed.

A temporal analysis does not indicate a sustained methodological improvement (Figure 9). Early publications (2015–2018) showed validation rates between 0% and 50%, though sample sizes were small. Subsequent years also fail to demonstrate a consistent increase. The highest validation rate occurs in 2018 (50.0%), but this is not maintained in later periods. When grouped into three multi-year periods to smooth annual volatility, validation rates remain essentially stable: 37.5% (2015–2018), 34.8% (2019–2021), and 34.5% (2022–2025). Therefore, the substantial increase in publication volume after 2019 (from 8 studies in 2015–2018 to 52 in 2019–2025) was not accompanied by a corresponding increase in validation practice.

Overall, the temporal distribution indicates that methodological verification has not progressed proportionally with publication growth. Most studies continue to evaluate hazard classifications through qualitative agreement with known hazardous locations or expert judgement rather than predictive performance testing. The evidence in Table 8 and Figure 9 demonstrates that the expansion of MCDA applications in glacier hazard assessment has occurred without a corresponding increase in quantitative validation.

We define quantitative predictive validity strictly as a performance evaluation against hazard observations using statistical metrics (e.g., ROC/AUC, accuracy, and the confusion matrix). Internal AHP consistency ratios were not counted as predictive validation. Studies reporting only AHP consistency ratios or qualitative agreement with known hazardous locations were not considered validated, as these procedures assess internal coherence rather than predictive performance.
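To make this definition concrete, the following Python sketch computes the area under the ROC curve for a susceptibility ranking against observed events, using the rank-sum (Mann-Whitney) identity. The susceptibility scores and event labels are invented for illustration only; they are not drawn from any reviewed study, and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: susceptibility values; labels: 1 = observed hazard event, 0 = none.
    A tie between an event and a non-event counts as half a correct ordering.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores for eight lakes and invented outcomes.
susceptibility = [0.91, 0.78, 0.66, 0.60, 0.45, 0.40, 0.22, 0.10]
observed_event = [1, 1, 0, 1, 0, 0, 1, 0]

auc = roc_auc(susceptibility, observed_event)
print(auc)
```

An AUC of 0.5 corresponds to a ranking no better than chance, and 1.0 to perfect separation of events from non-events. The calculation requires independent hazard observations, which is precisely what an AHP consistency ratio does not use.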

Table 8 and Figure 9 visualise the temporal distribution of the validation rates. The pattern confirms that validation practice has not improved systematically over the last decade. The highest validation rate (50.0% in 2018) occurs early in the study period and is not sustained. Recent years show considerable volatility: 2021 (27.3%), 2022 (37.5%), 2023 (28.6%), and 2025 (45.5%). In particular, 2024 shows zero validated studies despite three publications. This instability, combined with the absence of a positive trend, indicates that methodological verification remains inconsistent and has not kept pace with the growing volume of MCDA applications in glacier hazard assessment.

Table 10 examines the relationship between validation practice and overall methodological quality. Studies reporting quantitative validation achieve a mean quality score of 6.8, substantially higher than those with only qualitative validation (5.2) or no validation (3.5). This gradient suggests that validation is not an isolated practice but correlates with more rigorous methodology across all domains.

This gradient carries an important diagnostic signal: validation is not merely an isolated technical step but a marker of overall methodological rigour. Studies that invest in quantitative validation also tend to justify criteria selection more thoroughly, report weighting procedures more transparently, and, critically, acknowledge limitations more explicitly. In contrast, studies with no validation procedure exhibit uniformly low scores in all domains, suggesting a general lack of methodological self-scrutiny rather than a focused gap in validation alone. This correlation suggests that improving validation practice may have spillover effects on broader research quality, whereas piecemeal improvements to weighting or criteria selection without validation are unlikely to close the reliability gap.

5.4.3. Geographical Concentration of Case Studies

Figure 10 shows the spatial distribution of case-study locations. The reviewed literature exhibits a pronounced concentration in High Mountain Asia, particularly the Himalayan–Karakoram–Hindu Kush region, including India, Pakistan, Nepal, and China (Tibetan Plateau). To complement the spatial visualisation, Figure 11 presents the number of case studies per country; the bar chart confirms this strong geographical imbalance.

In contrast, only a small number of studies are reported from other glacierised regions such as the Andes, Europe, and North America. This distribution indicates that current MCDA practices in glacier hazard assessment are predominantly developed and tested within a restricted geographical context.

The mapped evidence indicates a marked concentration of case studies in High Mountain Asia (notably countries associated with the Himalayan–Karakoram–Hindu Kush region), with comparatively sparse coverage elsewhere. This geographical skew likely reflects both exposure and data availability, but it also limits the external validity of methodological claims, as practices developed in one set of geomorphological and institutional contexts may not transfer directly to other cryospheric regions (e.g., the Andes, Alps, or polar environments).

Table 11 quantifies the spatial distribution of applications. A total of 48 out of 60 studies (80.0%) were conducted in High Mountain Asia, particularly the Himalayan–Karakoram–Hindu Kush region. In contrast, all other glaciated regions collectively account for only 20.0% of the available evidence, including 6.7% in the Andes, 6.7% in Europe, and 1.7% in North America. The empirical basis of glacier hazard MCDA research is, therefore, strongly concentrated within a single geographical context.

Table 12 examines quality variation across regions. European studies show the highest mean quality scores (6.5) and validation rates (50.0%), though the sample size is small (n = 4). High Mountain Asian studies, which constitute 80% of the evidence base, have mean quality scores near the overall average (5.6). The Andes region shows lower mean scores (4.3), potentially reflecting different research capacity or data availability contexts.

5.4.4. Summary of Key Quantitative Findings

To synthesise the quantitative evidence extracted from the reviewed studies, the principal findings are summarised in Table 13.

The table consolidates the main patterns identified across the dataset, including method selection, validation practices, and geographical distribution. It provides an aggregated overview of the empirical results that support the detailed analyses presented in the preceding subsections.

Beyond descriptive frequencies, the review also examines associations between methodological practices: Table 10 presents the correlation (gradient) between validation status and overall quality scores; Table 9 and Figure 9 analyse the temporal trends in validation rates over the decade; and Table 12 compares quality scores between geographical regions. These analyses provide empirical support for the central claim that validation is a marker of overall rigour and that methodological verification has not improved over time.

5.5. Synthesis of Findings by Research Questions

This subsection presents an analytical synthesis of the reviewed studies, structured around the research questions that guide this systematic review. Moving beyond a mere description of individual applications, we interpret the cross-study evidence to understand not only how multi-criteria decision analysis (MCDA) is applied in glacier hazard assessment, but why these practices prevail and what they imply for the reliability of the resulting hazard classifications.

The synthesis critically examines the choices made by researchers regarding method selection, criteria weighting, uncertainty treatment, and validation. Taken together, the results reveal a consistent and consequential pattern: MCDA is widely adopted as a practical decision-support framework, yet its core methodological assumptions are routinely accepted rather than critically evaluated, creating a significant gap between the apparent precision of its outputs and their empirical grounding.

5.5.1. RQ1: Why Has the Analytic Hierarchy Process (AHP) Become the Predominant MCDA Method in Glacier Hazard Assessment Studies?

The review confirms an overwhelming methodological concentration around the Analytic Hierarchy Process (AHP), used as the primary decision method in 60% of studies and in 80% when including its variants (Table 7). In a typical application, AHP is employed to derive criteria weights through expert pairwise comparison, after which a weighted linear combination is implemented in a GIS to produce a susceptibility map.

This dominance is not explained by a demonstrable superiority of AHP for specific hazard types. As Table 5 shows, its use is uniformly high across GLOF, landslide, and multi-hazard assessments, indicating that method selection is largely independent of the problem’s physical characteristics. Alternative methods like TOPSIS, BWM, or outranking approaches are exceedingly rare, appearing in only 10% of studies. The “methodological monoculture” is, therefore, a product of accessibility, familiarity, and seamless compatibility with standard GIS workflows, rather than a reasoned choice based on the demands of the decision context. This has led to a form of methodological standardisation where a compensatory, preference-based model is applied by default, embedding untested assumptions about criteria independence and trade-offs into hazard assessments that are often presented as objective spatial predictions.

The tight coupling between AHP and ArcGIS (specifically the Weighted Overlay and Weighted Sum tools) has arguably been the single most important driver of the methodological monoculture we observe. This integration is not neutral; it actively shapes research practice. A researcher with a standard ArcGIS licence can, within hours, produce a susceptibility map by: (1) reclassifying raster layers, (2) running AHP pairwise comparisons using spreadsheet templates, and (3) applying Weighted Overlay with the resulting weights. The software provides no built-in sensitivity analysis, no non-compensatory alternatives, no uncertainty propagation, and no validation metrics. The workflow encourages a deterministic, single-map output as the natural endpoint of analysis. Once this pipeline is adopted in a research group or taught in a graduate course, it becomes institutionally entrenched. Switching to alternative MCDA methods (e.g., PROMETHEE in a dedicated decision-support package) or to non-deterministic approaches (e.g., probabilistic SMAA) requires learning new software, new mathematical concepts, and new reporting norms, a transaction cost that few researchers bear unless explicitly incentivised. Thus, the AHP-ArcGIS workflow is not merely a method; it is a sociotechnical system that reproduces itself through software design, training, and publication practices.
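The three-step workflow described above can be sketched in a few lines of Python, which illustrates how little the pipeline demands of the analyst. This is a minimal sketch under stated assumptions, not the implementation of any reviewed study or of the ArcGIS tools: the pairwise judgements, the three criteria, and the 2 × 2 rasters are hypothetical, and the principal eigenvector is approximated by power iteration. The random index values are the commonly tabulated ones attributed to Saaty.

```python
# Commonly tabulated random consistency index for matrices of order 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):  # power iteration towards the principal eigenvector
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return w, consistency_index / RANDOM_INDEX[n]


def weighted_overlay(layers, weights):
    """Cell-by-cell weighted linear combination of equally sized rasters."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for w, layer in zip(weights, layers)) for c in range(cols)]
        for r in range(rows)
    ]


# Step 2: hypothetical expert judgements (slope vs lake area vs glacier distance).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, cr = ahp_weights(pairwise)

# Step 1 is assumed done: hypothetical rasters already reclassified to 0-1.
slope = [[0.9, 0.4], [0.2, 0.7]]
lake_area = [[0.3, 0.8], [0.5, 0.1]]
glacier_distance = [[0.6, 0.6], [0.9, 0.2]]

# Step 3: a single deterministic susceptibility surface.
susceptibility = weighted_overlay([slope, lake_area, glacier_distance], weights)
print([round(x, 3) for x in weights], round(cr, 3))
```

Nothing in this sketch tests the output against hazard observations or varies the weights. A consistency ratio below the conventional 0.1 threshold indicates only that the pairwise judgements are internally coherent.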

5.5.2. RQ2: To What Extent Do Criteria Weighting Practices Reflect Methodological Justification Rather than Practical Convenience?

Criteria weighting is a universal step in all reviewed MCDA applications, yet the practices surrounding it reveal a profound lack of epistemic justification. Weights are almost exclusively derived from expert judgement, typically through AHP pairwise comparisons or direct scoring. Objective, data-driven methods for weight derivation are virtually absent from the literature. More critically, these subjectively derived weights are almost always applied deterministically. Only a tiny fraction of studies (5%) perform systematic sensitivity or robustness checks to explore how alternative, yet equally plausible, weighting schemes might alter the final hazard classification (Table 4). This treatment of weights as fixed, procedural inputs rather than as testable modelling assumptions is a fundamental weakness. Because the output hazard map is a direct function of these weights, the absence of robustness analysis means that the stability of the resulting prioritisation for high-stakes decisions—such as identifying lakes for GLOF mitigation—is entirely unknown. The practice, therefore, prioritises procedural convenience over the scientific requirement to verify the influence of subjective inputs.
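One simple form of the robustness check discussed above is a Monte Carlo perturbation of the weights. The sketch below, written in Python with hypothetical criterion scores for three lakes and hypothetical expert weights, perturbs each weight by a uniform relative factor, renormalises, and records how often the top-ranked alternative is unchanged. The perturbation range and the stability statistic are assumptions chosen for illustration, not a procedure taken from the reviewed studies.

```python
import random


def wlc(rows, weights):
    """Weighted linear combination score for each alternative."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def top_alternative(scores):
    """Index of the highest-scoring alternative."""
    return max(range(len(scores)), key=lambda i: scores[i])


def top_rank_stability(rows, weights, spread=0.2, trials=2000, seed=0):
    """Share of perturbed weight sets that keep the baseline top-ranked alternative."""
    rng = random.Random(seed)
    baseline = top_alternative(wlc(rows, weights))
    kept = 0
    for _ in range(trials):
        # Scale each weight by a factor in [1 - spread, 1 + spread], then renormalise.
        perturbed = [w * rng.uniform(1.0 - spread, 1.0 + spread) for w in weights]
        total = sum(perturbed)
        perturbed = [w / total for w in perturbed]
        if top_alternative(wlc(rows, perturbed)) == baseline:
            kept += 1
    return kept / trials


# Hypothetical normalised criterion scores for three lakes (rows) on three criteria.
lakes = [
    [0.9, 0.2, 0.4],
    [0.5, 0.8, 0.7],
    [0.3, 0.4, 0.9],
]
expert_weights = [0.65, 0.23, 0.12]  # hypothetical expert-derived weights

stability = top_rank_stability(lakes, expert_weights, spread=0.3)
print(stability)
```

A stability value well below 1 would indicate that the priority lake depends on weight choices that experts could plausibly have made differently. Reporting such a statistic alongside the hazard map would make that dependence visible.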

A related concern, visible across the reviewed studies, is the frequent absence of physical or geomorphic justification for the inclusion of criteria. Factors such as “glacier proximity,” “lake area,” “slope angle,” or “distance to fault line” are often selected because they are measurable from remote sensing data and appear in previous studies, rather than because they are mechanistically linked to hazard initiation. For example, using a fixed distance threshold (e.g., 500 m to glacier terminus) without sensitivity analysis implicitly assumes a step function in hazard potential that rarely exists in nature. Similarly, including “lake area” as a linear predictor assumes that hazard increases proportionally with area, whereas some GLOF mechanisms (e.g., moraine breach) are threshold-dependent and non-linear. This practice risks circularity: factors are selected because they are available and then validated by showing that high-hazard zones spatially coincide with known dangerous lakes, a logic that can confirm any plausible set of criteria. The field would benefit from explicit geomorphic conceptual models (e.g., causal diagrams or Bayesian networks) that justify each criterion’s inclusion, functional form, and expected direction of effect before weighting and aggregation are applied.
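The consequence of a fixed threshold can be illustrated by comparing a step function with a smooth alternative. In the Python sketch below, the 500 m threshold comes from the example in the text, whereas the logistic form and its 100 m scale parameter are assumptions introduced purely for illustration; no claim is made that a logistic curve is the physically correct functional form.

```python
import math


def step_score(distance_m, threshold_m=500.0):
    """Hard threshold: full score inside the threshold, zero outside."""
    return 1.0 if distance_m <= threshold_m else 0.0


def smooth_score(distance_m, midpoint_m=500.0, scale_m=100.0):
    """Logistic decay with distance (illustrative functional form only)."""
    return 1.0 / (1.0 + math.exp((distance_m - midpoint_m) / scale_m))


# Two hypothetical lakes, 20 m apart on either side of the threshold.
for d in (490.0, 510.0):
    print(d, step_score(d), round(smooth_score(d), 3))
```

Under the step function, the two lakes receive opposite scores although they differ by only 20 m, a difference that may be within the positional error of the input data. Under the smooth form their scores are nearly equal.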

5.5.3. RQ3: Why Is Uncertainty and Robustness Analysis Rarely Incorporated in MCDA-Based Glacier Hazard Assessments?

The treatment of uncertainty is the least developed aspect of current practice, a finding starkly illustrated by the quality assessment, where “Uncertainty Analysis” received a mean score of just 0.23/2.0, with a mere 5% of studies fully reporting any such analysis (Table 4). Most studies do not explicitly model the multiple uncertainties inherent in hazard assessment, from data inaccuracies and expert disagreement to the inherent variability of natural processes. Instead, uncertainty is addressed only implicitly, if at all, through expert judgement during weighting or qualitative interpretation of the final map.

This neglect extends to validation. While 35% of studies report quantitative predictive validation, this rate has not improved over the decade, and 65% rely on qualitative comparisons or no validation at all (Table 8 and Figure 9). The widespread reporting of AHP consistency ratios—an internal measure of logical coherence—as a proxy for quality exemplifies a critical confusion between a model’s internal consistency and its external, predictive validity. A model can be perfectly coherent yet bear no relation to reality. This pattern suggests that the field predominantly produces “plausible representations”—maps that look reasonable to experts—rather than empirically tested and validated models of hazard.

5.5.4. RQ4: Which Methodological Improvements (e.g., Sensitivity Analysis, Comparative Modelling, Hybrid or Ensemble MCDA) Could Enhance the Reliability of Glacier Hazard Assessments?

The review identifies a clear separation between methodological innovation and routine practice. While a subset of studies explores promising enhancements—including hybrid MCDA, integration with machine learning, probabilistic frameworks, and scenario analysis—these remain isolated, proof-of-concept exercises rather than consolidated into standard practice.

The barriers to routine adoption are likely multifaceted, including data scarcity, a lack of accessible, user-friendly tools for robustness analysis, and the absence of community-agreed reporting standards. This fragmented landscape means that the field has yet to transition from widespread operational adoption of a single, simple method to a mature practice where reliability is systematically evaluated. Future progress hinges on moving beyond the production of more hazard maps. The priority must shift to developing and standardising robust, uncertainty-aware frameworks. Key directions include establishing systematic sensitivity analysis as a mandatory practice, fostering head-to-head comparisons of multiple MCDA methods on shared benchmark datasets, and exploring ensemble approaches that can provide more stable and reliable outputs by combining the strengths of different models and reducing dependence on any single set of assumptions.

5.5.5. Evaluating the Central Proposition

Regarding the claim put forward in Section 1—that the predominance of AHP stems from its operational convenience rather than its predictive accuracy—the combined evidence offers consistent support across four empirical tests.

First, the choice of method is independent of the type of hazard. AHP appears at similar rates across GLOF (54.2%), landslide (73.3%) and multi-hazard (57.1%) assessments, with a maximum difference between domains of only 19 percentage points (Table 6). If problem-specific fit drove method selection, greater methodological variation would be expected across physically distinct hazard processes.

Second, comparative testing against alternative MCDA frameworks is almost absent. Alternatives to classical AHP (fuzzy AHP, BWM, TOPSIS) appear in only 10% of studies (6/60), and no study systematically compares multiple decision models on identical data (Table 7). This absence suggests that researchers adopt AHP by default rather than through explicit model selection.

Third, quantitative predictive validation remains limited (35.0%) and has not improved over the decade (Table 8, Figure 9). If predictive accuracy were the primary driver of method choice, validation rates would be expected to rise over time as the field matures; they have not.

Fourth, sensitivity and robustness analyses—which test whether the results depend on subjective weight choices—are reported in only 5% of the studies (Table 4). This omission is consistent with a practice that treats weights as procedural inputs rather than as testable assumptions.

The proposition is, therefore, supported: current MCDA practice in glacier hazard assessment is method-centred rather than problem-centred, and the dominance of AHP reflects operational convenience more than demonstrated predictive superiority. This conclusion does not imply that AHP is invalid for glacier hazards—only that its widespread adoption is not empirically justified by the evidence base and that the field would benefit from greater methodological diversity, validation, and robustness testing.

Table 14 summarises the principal findings of the cross-study synthesis organised by research question. Rather than reporting individual case-study results, the table consolidates the recurring methodological patterns observed in the reviewed literature and interprets their implications for the reliability of MCDA-based glacier hazard assessment. For each research question, the table links empirical evidence (what studies actually do) with its analytical interpretation (what this behaviour suggests) and its practical significance (why it matters for decision-support and risk management). This structured synthesis provides a concise bridge between the descriptive results and the critical discussion that follows.

5.6. Risk of Bias in Studies

The included studies correspond primarily to modelling studies and applied case studies, rather than to experimental or intervention-based research. Therefore, conventional clinical risk-of-bias instruments were not applicable.

The methodological assessment indicated that most studies clearly described their objectives and applied recognised MCDA techniques. However, variability was observed in the level of methodological transparency, particularly in terms of weighting procedures, uncertainty treatment, and validation of results.

5.7. Results of Individual Studies

The included studies do not report directly comparable quantitative effect measures. Instead, most papers present decision-support outputs derived from spatial multi-criteria decision analysis (MCDA) models.

The most common outputs are hazard susceptibility maps that classify terrain into categories such as low, moderate, and high hazard. Several studies also provide prioritisation rankings of potentially dangerous glacial lakes or composite susceptibility indices representing relative hazard levels. These outputs are typically produced through weighted overlay procedures implemented within geographic information systems.

Validation practices vary substantially across studies. Quantitative predictive validation, defined as performance evaluation against hazard observations using statistical metrics, was reported in 21 of the 60 reviewed studies (35.0%). These studies employed statistical performance indicators such as receiver operating characteristic (ROC) curves, area under the curve (AUC), accuracy metrics, or confusion-matrix measures. The remaining 39 studies (65.0%) relied on qualitative evaluation approaches, including visual agreement with known hazardous locations, comparison with historical events, or expert judgement. Within this group, 11 studies (18.3%) reported no explicit validation procedure whatsoever.
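
As a minimal sketch of what quantitative predictive validation involves, the following Python snippet computes the area under the ROC curve from susceptibility scores and observed event labels, using the pairwise (Mann-Whitney) formulation. The function name `roc_auc` and the six example lakes are hypothetical and serve only to illustrate the calculation; they are not data from the reviewed studies.

```python
def roc_auc(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("AUC needs at least one event and one non-event")
    correct = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                correct += 1.0
            elif e == n:
                correct += 0.5
    return correct / (len(events) * len(non_events))

# Hypothetical susceptibility scores for six lakes and whether an outburst was observed.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))  # 6 of 9 event/non-event pairs are ordered correctly
```

An AUC of 0.5 corresponds to a ranking no better than chance. Reporting such a figure requires an independent event inventory, which is precisely what many study areas lack.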

Some papers reported internal consistency checks, particularly the Analytic Hierarchy Process (AHP) consistency ratio. Among the 48 studies using AHP-based approaches, 32 (66.7%) reported consistency ratios, with most values below the recommended threshold of 0.10. However, these measures assess the logical coherence of expert weighting rather than the predictive performance of the hazard model and were therefore not classified as quantitative validation.
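
The distinction between internal coherence and predictive performance can be seen in how the consistency ratio is computed. The sketch below, written in plain Python under stated assumptions, derives AHP weights by power iteration and evaluates CR = CI/RI with CI = (λmax − n)/(n − 1). The random index values are the commonly tabulated ones attributed to Saaty, and both pairwise matrices are hypothetical. Note that no hazard observation enters the calculation at any point.

```python
# Commonly tabulated random index values (assumed here; other tabulations exist).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def ahp_weights_and_cr(matrix, iterations=100):
    """Return (weights, consistency ratio) for a reciprocal pairwise comparison matrix."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):  # power iteration towards the principal eigenvector
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [p / total for p in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]

# Hypothetical, perfectly consistent judgements implying weights 0.6, 0.3, 0.1.
coherent = [[1.0, 2.0, 6.0], [1 / 2, 1.0, 3.0], [1 / 6, 1 / 3, 1.0]]
# Hypothetical circular judgements (A > B, B > C, C > A).
circular = [[1.0, 3.0, 1 / 3], [1 / 3, 1.0, 3.0], [3.0, 1 / 3, 1.0]]

weights_ok, cr_ok = ahp_weights_and_cr(coherent)
weights_bad, cr_bad = ahp_weights_and_cr(circular)
print([round(w, 3) for w in weights_ok], cr_ok < 0.10, cr_bad < 0.10)
```

The first matrix passes the 0.10 threshold with a ratio of essentially zero, yet nothing in the procedure indicates whether weights of 0.6, 0.3, and 0.1 bear any relation to where hazard events occur.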

Overall, the reported outcomes represent spatial decision-support classifications rather than directly calibrated predictive risk estimates. Because the outputs are heterogeneous in form and lack common quantitative effect measures, statistical aggregation or meta-analysis is not appropriate. Accordingly, the evidence was synthesised using structured qualitative and quantitative descriptive analyses presented in the preceding subsections.

5.8. Reporting Biases

Formal statistical methods for detecting publication bias (e.g., funnel plots or trim-and-fill procedures) were not applicable because the review did not synthesise quantitative effect estimates. The outcomes analysed consist mainly of spatial susceptibility maps and classified hazard zones, which cannot be aggregated into comparable statistical effect sizes.

Nevertheless, several potential sources of reporting bias were identified. First, the review included only peer-reviewed publications written in English, which may introduce language and publication bias by excluding relevant studies reported in local or regional outlets. Second, a clear geographical concentration of studies was observed, with a large proportion conducted in High Mountain Asia (particularly the Himalaya, Karakoram, and Tibetan Plateau). This regional dominance likely reflects both the high exposure to glacial hazards and unequal research capacity and data availability across world regions. Consequently, glacierised regions in South America, Europe, and other mountain systems are comparatively under-represented in the literature.

Another potential bias arises from the selective reporting of positive or plausible hazard assessments. Many studies emphasise the successful identification of hazardous areas while providing limited discussion of model limitations, failed predictions, or alternative classifications. In addition, validation datasets are often scarce, which may favour confirmation of expected hazard patterns.

These factors were considered when interpreting the results of the synthesis, and therefore the conclusions are framed as representative of current published research practices rather than exhaustive evidence of all MCDA applications to the assessment of glacier hazards.

5.9. Certainty of Evidence

A formal certainty-of-evidence framework designed for intervention studies (e.g., GRADE) was not applicable, as the included literature does not evaluate clinical or experimental effects but instead reports methodological applications and spatial modelling practices. The reviewed studies primarily present hazard susceptibility models, decision-support frameworks, and case-specific assessments rather than comparable outcome measures.

Confidence in the evidence was therefore assessed qualitatively. Several consistent patterns were observed across independent studies, including the predominant use of AHP-based weighting schemes, the integration of MCDA within GIS environments, and the reliance on terrain and remote-sensing variables for hazard assessment. The recurrence of these methodological practices across different study areas and research groups supports moderate confidence in the generalisability of these observations.

However, the certainty of evidence is limited by variations in methodological validation. Many studies rely on expert judgement and qualitative comparison with known hazardous locations, while only a subset employs quantitative validation metrics such as ROC curves, prediction accuracy, or event-based verification. In addition, the geographical concentration of the studies and the limited treatment of uncertainty reduce confidence in the robustness of hazard classifications.

Overall, the findings should be interpreted as a reliable characterisation of prevailing research practices rather than definitive evidence regarding the predictive accuracy of MCDA models for glacier hazard assessment.

6. Discussion

This systematic review set out not merely to catalogue applications of multi-criteria decision analysis (MCDA) in glacier hazard assessment but to interrogate the methodological foundations upon which these applications rest. In doing so, we found ourselves confronting a field that is, in many respects, at a crossroads: widely adopted yet methodologically constrained, operationally useful yet empirically under-verified.

Across the 60 studies analysed, a consistent pattern emerges. The dominance of AHP-based approaches (80%) combined with the limited use of quantitative validation (35%) suggests not simply a methodological preference, but a deeper imbalance. As a community, we appear to have embraced the practicality and accessibility of MCDA without developing, at the same pace, the evidentiary standards required to substantiate its outputs.

These patterns are not accidental. They reflect the structural realities of glacier hazard research: limited monitoring infrastructure, urgent decision-making contexts driven by climate risk, and the need to produce actionable outputs for planners and stakeholders. MCDA—and AHP in particular—responds effectively to these constraints. It enables the integration of heterogeneous data and expert knowledge into interpretable spatial products.

Yet this convenience has a cost. What emerges is a kind of methodological comfort zone—a familiar pathway from data to hazard map that avoids both the data demands of process-based models and the statistical complexity of data-driven approaches. The question we are left with is not whether MCDA is useful—it clearly is—but whether the knowledge it produces is being interpreted in ways that exceed what the underlying models can support. In other words, are we generating reliable insights, or increasingly convincing representations that remain only weakly grounded in empirical evidence?

6.1. The Epistemic Status of Hazard Maps: Between Measurement and Interpretation

In reflecting on these findings, we are led to reconsider what MCDA-generated hazard maps actually represent. At first glance, these maps resemble measurements: spatially explicit outputs with clear boundaries separating low-, moderate-, and high-hazard zones. The visual language is one of precision and objectivity.

However, our analysis suggests a different interpretation. These maps are better understood as formalised expert judgements rendered spatial. They are not direct measurements of hazard processes, but structured interpretations based on selected criteria, assigned weights, and aggregation rules.

This interpretation is not merely descriptive; it has a well-established foundation in decision science and the philosophy of science. Following Funtowicz and Ravetz [37], glacier hazard assessment operates in the domain of post-normal science, where facts are uncertain, values are in dispute, stakes are high, and decisions are urgent. In such contexts, the traditional distinction between fact (objective measurement) and value (subjective judgement) collapses. The reading of MCDA hazard maps as formalised expert judgements rendered spatial derives from structured expert judgement theory [38]. Such maps embed prior assumptions about criteria relevance, weighting, and aggregation that are not empirically derived but are nonetheless consequential for outcomes. The map is not a window onto nature; it is a constructed artefact that synthesises empirical data with subjective inputs under conditions of uncertainty. Recognising this is not a weakness of MCDA; it is an honest characterisation of what the method actually does.

The distinction between measurement and interpretation is not merely semantic; it is epistemic. A measurement can, in principle, be validated against independent observations. An interpretation, by contrast, can only be assessed in terms of the plausibility of its assumptions and the coherence of its construction. The widespread reliance on expert-derived weights in AHP means that each hazard map embeds a set of prior judgements about the relative importance of conditioning factors. These judgements are indispensable in data-scarce environments, but they are not empirical observations.

The problem arises when these interpretations are presented, and received, as empirical findings. From our perspective, this is where the tension becomes most visible. Only a minority of studies test the stability of their results under alternative weighting configurations, and fewer still validate their classifications against observed hazard events. As a result, many hazard maps are internally consistent—often supported by AHP consistency ratios—yet externally unverified.

The apparent precision of these maps can therefore be misleading. Their clean boundaries and categorical distinctions convey certainty, but that certainty often resides in the structure of the model rather than in evidence about the world. Making this distinction explicit is essential if these tools are to be used responsibly.

6.2. The Reliability Gap: Precision Without Verification

Our findings reveal a striking asymmetry. While aspects such as criteria definition and reproducibility are relatively well addressed, uncertainty analysis and validation remain markedly underdeveloped. Only a small fraction of studies conduct systematic uncertainty analysis, and even fewer attempt robust empirical validation.

What we term the reliability gap is not simply a technical shortcoming; it reflects a deeper disconnect between the apparent definitiveness of hazard maps and the fragility of their empirical grounding. MCDA produces outputs that are precise—deterministic, clearly delineated, and easily interpretable. But precision should not be confused with accuracy.

In many cases, the visual clarity of the resulting maps obscures the uncertainties inherent in the modelling process: uncertainty in input data, in criteria selection, in weighting schemes, and in aggregation assumptions. When sensitivity analysis is absent, it becomes impossible to assess how stable these classifications are. When validation is missing, it is equally impossible to determine whether areas classified as high hazard correspond to observed events.

The stagnant validation rate (approximately 35% throughout the decade, despite a nearly sevenfold increase in annual publication volume) invites a diagnosis that goes beyond technical constraints. In the current incentive structure of the geosciences, new hazard maps are rewarded more readily than the costly, time-consuming, and less glamorous work of empirical validation. Producing a susceptibility map requires remote sensing data, GIS skills, and expert elicitation, tasks that fit within a typical PhD or a 2–3 year research project. Validation, by contrast, requires access to independent, often incomplete historical event inventories, long-term monitoring data, and the willingness to report when a model performs poorly (a publication risk). This imbalance is a textbook example of publication bias and, at a deeper level, of misaligned incentives. Journals request validation but seldom reject manuscripts that lack it; reviewers ask for sensitivity analyses yet frequently tolerate their omission. Unless validation is treated as a mandatory prerequisite for publication rather than an optional virtue, the reliability gap will remain, no matter how much methods improve.

The implications for risk governance are immediate and unsettling. When AHP-GIS hazard maps that have not been validated are presented as ready-to-use assessments—without clearly conveying their uncertainty or reporting performance measures—they can foster a misleading sense of confidence among policymakers, emergency managers, and exposed communities. A polished map with sharp hazard zones suggests accuracy, yet if that apparent precision is not empirically grounded, decisions about resource allocation may be driven by artefacts of expert opinion rather than by actual environmental dynamics. This is not a rejection of MCDA; it is a call for careful, disciplined interpretation. Hazard maps should be presented for what they truly are: structured conjectures, not verified forecasts. Until validation becomes standard practice, such maps are most appropriate for guiding field surveys and framing discussion, rather than serving as the definitive basis for land-use planning or early-warning system design.

In our view, this gap is particularly problematic because it remains largely invisible. The map communicates certainty, but that certainty reflects the internal logic of the model rather than its empirical adequacy. As we examined these studies collectively, what became apparent was not a lack of methodological effort, but a lack of alignment between what these models are capable of demonstrating and the claims often made about them.

This situation aligns closely with what has been described as post-normal science, where decisions must be made under conditions of uncertainty, high stakes, and incomplete knowledge. In such contexts, the distinction between fact and judgement becomes blurred. The weights embedded in MCDA models are not merely technical parameters—they are expressions of priorities and assumptions. A mature methodological practice would make these assumptions explicit and subject them to systematic scrutiny.

6.3. Geographical Concentration and the Limits of Generality

The strong geographical concentration of studies in High Mountain Asia (80%) is understandable, given the region’s exposure to glacier-related hazards. However, it also raises important methodological questions.

Models developed within a specific geographical context inevitably reflect the environmental, data, and institutional conditions of that context. Criteria selection, weighting strategies, and validation practices are all shaped by local conditions. When such models are implicitly treated as generalisable, there is a risk that context-specific assumptions become normalised as universal practice.

We suggest that the field may be developing not only a methodological concentration (around AHP), but also a geographical concentration that reinforces it. This dual concentration limits the diversity of modelling approaches and constrains the development of more context-sensitive practices.

6.4. Toward a Different Kind of Practice

If we take these findings seriously, then the question is not whether MCDA should continue to be used but how it should be used more responsibly.

First, validation must become a central component of practice. The current situation, in which only a minority of studies report quantitative validation, is difficult to justify in a field that informs high-stakes decisions. This requires not only improved reporting standards, but also investment in hazard inventories and monitoring systems that enable empirical testing.

Second, uncertainty must be treated as an object of analysis rather than a secondary concern. Simple approaches—such as systematic variation of weights, scenario analysis, and multi-model comparison—can significantly improve the interpretability and robustness of results. These practices do not require abandoning MCDA, but extending it.
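
A systematic variation of weights can be sketched in a few lines. The Python example below is an illustration under stated assumptions rather than a recommended protocol: it perturbs each weight by up to ±20%, renormalises, and records how often each spatial unit keeps its baseline class. The class breaks, the noise level, the three example cells, and all function names are hypothetical.

```python
import random

def classify(score, breaks=(0.33, 0.66)):
    """Map a 0-1 susceptibility score to a hazard class (breaks are illustrative)."""
    if score < breaks[0]:
        return "low"
    if score < breaks[1]:
        return "moderate"
    return "high"

def weighted_score(values, weights):
    """Weighted linear combination, as in a GIS weighted overlay."""
    return sum(v * w for v, w in zip(values, weights))

def perturb(weights, rel_noise, rng):
    """Scale each weight by a random factor in [1 - rel_noise, 1 + rel_noise], then renormalise."""
    noisy = [w * (1.0 + rng.uniform(-rel_noise, rel_noise)) for w in weights]
    total = sum(noisy)
    return [w / total for w in noisy]

def class_stability(cells, weights, rel_noise=0.2, runs=1000, seed=0):
    """Share of perturbed runs in which each cell keeps its baseline class (rel_noise < 1)."""
    rng = random.Random(seed)
    baseline = [classify(weighted_score(c, weights)) for c in cells]
    agreement = [0] * len(cells)
    for _ in range(runs):
        trial_weights = perturb(weights, rel_noise, rng)
        for i, cell in enumerate(cells):
            if classify(weighted_score(cell, trial_weights)) == baseline[i]:
                agreement[i] += 1
    return baseline, [a / runs for a in agreement]

# Hypothetical normalised criterion values for three cells and expert-derived weights.
cells = [[0.9, 0.9, 0.9], [0.8, 0.55, 0.3], [0.2, 0.1, 0.3]]
weights = [0.6, 0.3, 0.1]
baseline, stability = class_stability(cells, weights)
for label, share in zip(baseline, stability):
    print(label, share)
```

Cells whose criteria all point the same way keep their class under any weighting, while the second cell, which sits close to a class break, changes class in a share of the runs. Mapping this stability alongside the hazard classes would show readers where the classification depends on the weights.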

Third, methodological diversity should be encouraged. The dominance of AHP appears to reflect convenience and familiarity rather than demonstrated superiority. Comparative studies applying different MCDA methods to the same datasets would provide valuable insights into how modelling assumptions influence outcomes.
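
A comparative study of this kind could start from something as simple as the sketch below, which ranks the same alternatives with a weighted sum and with TOPSIS and then measures agreement with Spearman's rank correlation. The four example lakes, the weights, and the function names are hypothetical; all criteria are assumed to be benefit-type (higher means more hazardous), and the rank formula assumes no ties.

```python
import math

def rank_descending(scores):
    """Return 1-based ranks, rank 1 for the highest score (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def weighted_sum(matrix, weights):
    """Simple additive weighting score for each alternative."""
    return [sum(v * w for v, w in zip(row, weights)) for row in matrix]

def topsis(matrix, weights):
    """TOPSIS closeness coefficients, treating every criterion as benefit-type."""
    k = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(k)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(k)] for row in matrix]
    best = [max(row[j] for row in scaled) for j in range(k)]
    worst = [min(row[j] for row in scaled) for j in range(k)]
    closeness = []
    for row in scaled:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        closeness.append(d_worst / (d_best + d_worst))
    return closeness

def spearman(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical normalised criteria for four lakes.
lakes = [[0.9, 0.9, 0.9], [0.5, 0.9, 0.2], [0.3, 0.2, 0.9], [0.1, 0.1, 0.1]]
weights = [0.5, 0.3, 0.2]
saw_ranks = rank_descending(weighted_sum(lakes, weights))
topsis_ranks = rank_descending(topsis(lakes, weights))
rho = spearman(saw_ranks, topsis_ranks)
print(saw_ranks, topsis_ranks, rho)
```

Agreement on clearly dominant or dominated alternatives is expected from both methods; the informative cases are the intermediate ones, where disagreement would indicate that the ranking reflects the aggregation rule as much as the data.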

We do not argue for abandoning MCDA. Rather, we argue for using it with greater methodological discipline and interpretive caution. The challenge is not only technical, but conceptual: recognising what these models can legitimately claim, and where their limits lie.

6.5. Reframing the Role of MCDA in Hazard Science

Ultimately, this review invites a reframing of the role of MCDA in glacier hazard assessment. Its strength lies not in predicting hazard events, but in structuring decisions under conditions of complexity and incomplete information.

When used appropriately, MCDA provides a transparent framework for integrating diverse sources of evidence and supporting deliberation among stakeholders. It is a tool for organising knowledge and facilitating decision-making—not a substitute for empirical modelling of hazard processes.

Recognising this distinction is not a limitation but a clarification. It allows MCDA to be used more effectively, as a decision-support framework that complements, rather than replaces, empirical approaches.

In contrast to MCDA, many AI-based approaches introduce additional challenges related to interpretability and explainability, reinforcing the need for robust validation frameworks.

Addressing the reproducibility crisis in glacier hazard MCDA will ultimately require community-agreed reporting standards—analogous to PRISMA for systematic reviews or FAIR for data—mandating disclosure of pairwise matrices, sensitivity results, validation metrics, and code.

6.6. Limitations in Context

This review has its own limitations. It reflects current practice as reported in the literature, rather than the full range of possible methodological developments. The geographical concentration of studies influences the patterns observed, and the focus on peer-reviewed publications excludes practitioner knowledge and grey literature.

However, these limitations do not undermine the central insight of this study. If anything, they reinforce it. What we observe is not a lack of methodological sophistication, but a misalignment between the ambitions of the field and the evidentiary foundations on which those ambitions rest.

Closing this gap—between representation and validation, between precision and evidence—remains a central challenge for future research in glacier hazard assessment.

6.7. Summary

The preceding sections have highlighted a consistent pattern: strong methodological uptake combined with limited validation, minimal uncertainty analysis, and a narrow geographical concentration. While these observations point to a clear reliability gap, they also raise a practical question—what would a more robust and defensible practice look like?

The reliability of glacier hazard assessments is not merely an academic concern; it is directly relevant to international policy frameworks. Improved validation and uncertainty quantification in MCDA-based susceptibility mapping contribute concretely to two Sustainable Development Goals. SDG 13 (Climate Action), Target 13.1, calls for strengthening resilience and adaptive capacity to climate-related hazards and disasters. Defensible hazard maps (those with documented validation, sensitivity analysis, and uncertainty bounds) provide the evidential basis for early warning systems and climate adaptation planning. SDG 11 (Sustainable Cities and Communities), Target 11.5, aims to reduce disaster impacts on people and infrastructure. Unreliable or overconfident hazard maps undermine this goal by misdirecting mitigation investments. Thus, the methodological improvements we advocate (systematic validation, robustness testing, and transparent uncertainty reporting) are not technical niceties but prerequisites for evidence-based disaster risk reduction aligned with global commitments.

To make this transition explicit, Table 15 synthesises our findings alongside a set of concrete methodological implications. Rather than serving as a prescriptive checklist, the table is intended as a structured reflection of current practice and a starting point for improving how MCDA is applied and interpreted in glacier hazard assessment.

Figure 12 visualises the core message of this review through a simple but powerful lens. The horizontal axis captures evidential support—the degree to which models are validated and their uncertainties characterised—where we found that only 35% of studies report quantitative validation and just 5% fully address uncertainty. The vertical axis represents methodological uptake, where 80% of studies rely on AHP-based approaches, indicating strong adoption despite limited evidence.

The resulting position of current practice—high uptake and low evidence—defines what we call the reliability gap. The diagonal line represents the ideal trajectory toward the upper-right quadrant, where models that are widely used are also those that have been rigorously tested. Closing this gap is the central challenge for the next generation of research, requiring movement not along one axis alone but along both simultaneously: maintaining the interpretability and accessibility that make MCDA valuable while building the empirical infrastructure that validation and uncertainty quantification demand.

7. Limitations

This section discusses the main validity considerations that affect the interpretation of the results. Unlike the methodological limitations reported in the PRISMA protocol, the following points concern the reliability of the conclusions derived from the analysed body of evidence.

7.1. Internal Validity

Internal validity refers to the accuracy of the findings and whether the observed patterns accurately reflect the literature analysed.

Search terminology. Although the search strategy was carefully constructed and iteratively refined, the terminology in this domain is heterogeneous. Some studies refer to “decision analysis”, “spatial decision support”, or “GIS-based suitability analysis” without explicitly using the term “multi-criteria decision analysis”. Consequently, it is possible that some relevant studies that employed MCDA concepts but different terminology were not retrieved.
Classification and interpretation of studies. The review required the categorisation of the studies according to the type of hazard, the MCDA method, the weighting approach, and the treatment of uncertainty. In several articles, methodological descriptions were incomplete or inconsistent, requiring interpretative judgement during classification. For example, some works combined GIS overlay analysis with AHP weighting without clearly specifying whether the decision model or the spatial processing step constituted the core method. To reduce this threat, predefined classification criteria were applied, and the classifications were cross-checked across the data set.
Heterogeneity of the reported evidence. The reviewed studies report results in various forms, including hazard maps, rankings, and qualitative assessments. Because comparable quantitative performance metrics are rarely provided, conclusions are based on identifying recurring methodological patterns rather than comparing predictive accuracy. Therefore, the findings describe dominant practices rather than measuring the superiority of specific methods.
Confirmation bias. Because the review team includes researchers who have published AHP-based hazard assessments, we may be predisposed to view current practices as normative. We attempted to mitigate this through (a) preregistration of analytical protocols, (b) inclusion of a co-author with critical MCDA expertise, and (c) explicit search for studies reporting negative results or validation failure (none were found, itself a telling finding).
Search strategy. A deliberate methodological choice was the exclusion of Google Scholar from the search strategy. This decision was made for two reasons that prioritise reproducibility over comprehensiveness. First, Google Scholar does not offer a stable, reproducible search syntax; the results vary across sessions and cannot be reliably re-run by independent reviewers. Second, its indexing is inconsistent, mixing peer-reviewed articles with predatory journals, theses, technical reports, and non-English grey literature without transparent filtering. However, we acknowledge that this exclusion comes at a cost. Glacier hazard assessments are sometimes reported in regional technical reports, agency white papers, or conference proceedings not indexed in Scopus, Web of Science, or commercial publisher databases. Such grey literature may contain critical case studies, particularly from national geological surveys or hydropower companies, that are absent from our corpus. Consequently, our findings characterise peer-reviewed academic practice, not the full universe of MCDA applications. A complementary scoping review targeting the grey literature would be a valuable extension.
A second deliberate scope limitation was the exclusion of studies applying machine learning or statistical models (e.g., random forests, logistic regression, and neural networks) without any MCDA component. Our focus on methodological reliability within the MCDA literature necessarily brackets head-to-head comparisons between MCDA and ML predictive performance on identical datasets. This is not a flaw in the review design but a boundary condition. However, it limits the strength of one aspect of our “reliability gap” argument. Without direct empirical comparisons, we cannot claim that ML methods systematically outperform MCDA in predictive accuracy—only that MCDA, as currently practised, rarely tests its own predictive claims. The urgent next step for the field is precisely such comparative work: applying multiple MCDA methods and multiple ML methods to standardised benchmark datasets with documented event inventories and then comparing performance, interpretability, and data requirements. Our review cannot answer that question, but it strongly motivates the question.

7.2. External Validity

External validity concerns the generalisability of the conclusions beyond the analysed literature.

Geographical concentration of studies. Most of the studies identified were conducted in the Himalaya–Karakoram–Hindu Kush region. This reflects the prevalence of glacier hazards in these areas but limits the direct generalisation of the findings to other glaciated regions such as the Andes, the Alps or polar environments, where climatic, geomorphological, and monitoring conditions differ.
Regional-language studies. The restriction to English-language publications is particularly significant given the geographical concentration of case studies in High Mountain Asia (80%). In countries such as China, India, Nepal, and Pakistan, glacier hazard assessments are sometimes reported in national journals or technical reports published in local languages (e.g., Chinese, Hindi, Nepali, Urdu). These publications may contain case studies, methodological variations, or validation datasets not captured in English-indexed databases. Their exclusion could bias our findings in two directions: (i) overestimating the dominance of AHP if non-English studies use alternative methods, or (ii) underestimating the prevalence of validation if regional journals require empirical testing. A scoping review targeting non-English and grey literature would be a valuable complement to this study. Until such work is available, our conclusions should be interpreted as characterising English-language academic practice, not the full universe of MCDA applications in the assessment of glacier hazards.
Dependence on expert knowledge. Many MCDA applications rely heavily on expert judgement for criteria weighting and selection. Consequently, the methodological patterns identified in this review partly reflect current operational practice rather than purely empirical validation. Different expert communities or institutional contexts may produce alternative weighting schemes and hazard classifications.
Rapid technological evolution. The reviewed period (2015–2025) corresponds to rapid growth in remote sensing and geospatial data availability. Emerging approaches, particularly machine learning–assisted hazard modelling and ensemble decision frameworks, are still limited in number. Therefore, the conclusions primarily characterise current practice rather than the full potential future development of decision-support methodologies in glacier hazard assessment.

8. Future Work

The synthesis of the reviewed literature indicates that future research should focus on transforming MCDA from a descriptive mapping procedure into a validated analytical framework for glacier hazard assessment. The priority is therefore not the development of additional applications but the improvement of methodological reliability. Below, we outline a concrete research agenda organised around six interconnected priorities (Sections 8.1–8.6), each with specific methodological directions, testable hypotheses, and pathways to implementation.

8.1. Systematic Validation Protocols

A central research need concerns the systematic validation of MCDA-based hazard classifications. Most current studies produce susceptibility maps without testing predictive performance against independent hazard occurrence data. Future work should establish validation as a non-negotiable component of MCDA practice through:

Independent event databases: Assemble and maintain open-access inventories of documented GLOF, landslide, and avalanche events, with standardised metadata on location, timing, magnitude, and impact. Such databases would serve as test beds for evaluating predictive skill across regions and methods.
Temporal back-testing protocols: Develop standardised procedures for testing whether historically documented events fall within retrospectively classified high-hazard zones. This requires consistent rules for defining temporal cutoffs (e.g., training on pre-2000 data, testing on post-2000 events) and spatial buffers for event representation.
Cross-regional transferability testing: Design experiments that apply MCDA models calibrated in one region (e.g., High Mountain Asia) to test sites in other glaciated environments (Andes, Alps, Caucasus). Such tests would reveal which methodological choices are region-specific and which generalise across contexts.
Predictive performance benchmarks: Establish community-agreed metrics for evaluating MCDA outputs, including the area under the receiver operating characteristic curve (AUC-ROC), precision–recall curves, true skill statistics, and cost-sensitive measures that account for the asymmetric consequences of false positives versus false negatives in hazard contexts.
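To make the benchmark metrics listed above concrete, the following Python sketch computes a rank-based AUC-ROC and the true skill statistic (TSS) for a set of susceptibility scores. It is only an illustration: the function names, the toy scores, and the 0.5 threshold are hypothetical and are not taken from any reviewed study. Labels mark map cells with (1) or without (0) a documented event.

```python
def auc_roc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen event cell
    has a higher susceptibility score than a randomly chosen non-event cell
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event cell")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def true_skill_statistic(scores, labels, threshold):
    """TSS = sensitivity + specificity - 1, with cells classed as
    high-hazard when score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical MCDA susceptibility scores and event labels for eight cells.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]

toy_auc = auc_roc(toy_scores, toy_labels)
toy_tss = true_skill_statistic(toy_scores, toy_labels, threshold=0.5)
```

For this toy inventory the AUC is 0.75 and the TSS at the 0.5 threshold is 0.5. In a temporal back-test, the labels would come only from events after the chosen cutoff date, so that the scores are evaluated on data not used to build the map.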

Validation should move from an optional supplement to a required practice, with journals mandating evidence of predictive skill as a condition of publication and funding agencies supporting the monitoring infrastructure that makes validation possible.

8.2. Robustness and Uncertainty Quantification

The review demonstrates that hazard classifications are highly sensitive to criteria weighting, yet fewer than 5% of studies perform systematic uncertainty analysis. Future work should treat uncertainty as an object of analysis rather than an inconvenience to be bracketed:

Multi-scenario sensitivity analysis: Implement systematic weight variation protocols that test classification stability across the full range of plausible expert judgements. Rather than reporting a single hazard map, studies should present sensitivity maps showing the proportion of weighting scenarios in which each location is classified as high-hazard, or ensemble maps displaying the median and interquartile range of susceptibility scores across weight perturbations.
Probabilistic weighting schemes: Replace deterministic weights with probability distributions elicited from multiple experts, then propagate this uncertainty through Monte Carlo simulation to generate probabilistic hazard classifications. Methods such as stochastic multi-criteria acceptability analysis (SMAA) [39,40] are well developed in the decision sciences but rarely applied in glacier hazard contexts.
Bayesian approaches to expert elicitation: Develop structured protocols for eliciting expert judgements that quantify not only central tendencies but also uncertainty and inter-expert disagreement. Hierarchical Bayesian models can then combine these judgements with empirical data, automatically down-weighting uncertain or discordant inputs.
Fuzzy and interval-based methods: Where probability distributions cannot be reliably specified, fuzzy membership functions or interval weights can represent imprecise knowledge. Future work should compare the performance of probabilistic, fuzzy, and interval approaches on common benchmark datasets to establish guidance for method selection under different data availability scenarios.
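The multi-scenario and probabilistic weighting ideas above can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions: weight uncertainty is represented by a Dirichlet distribution centred on a hypothetical expert weight vector, aggregation is a weighted linear combination, and the concentration parameter, threshold, and cell values are invented for demonstration. It reports, for each cell, the proportion of weighting scenarios in which the cell is classified as high-hazard.

```python
import random


def sample_weights(base_weights, concentration, rng):
    """Draw one weight vector from a Dirichlet distribution centred on
    base_weights, using normalised gamma draws. Larger concentration
    means samples stay closer to the base weights."""
    draws = [rng.gammavariate(concentration * w, 1.0) for w in base_weights]
    total = sum(draws)
    return [d / total for d in draws]


def high_hazard_frequency(cells, base_weights, threshold,
                          n_runs=2000, concentration=50.0, seed=0):
    """Return, per cell, the share of sampled weight vectors for which the
    weighted linear score is at or above the high-hazard threshold."""
    rng = random.Random(seed)
    counts = [0] * len(cells)
    for _ in range(n_runs):
        weights = sample_weights(base_weights, concentration, rng)
        for i, criteria in enumerate(cells):
            score = sum(w * c for w, c in zip(weights, criteria))
            if score >= threshold:
                counts[i] += 1
    return [c / n_runs for c in counts]


# Hypothetical normalised criteria values (0 to 1) for three cells.
demo_cells = [
    [0.9, 0.9, 0.8],  # high on every criterion
    [0.9, 0.2, 0.3],  # high on one criterion only
    [0.1, 0.2, 0.1],  # low on every criterion
]
demo_base = [0.5, 0.3, 0.2]  # hypothetical expert-derived weights
demo_freq = high_hazard_frequency(demo_cells, demo_base, threshold=0.6)
```

A cell whose frequency is close to 0 or 1 is classified the same way regardless of the weights drawn, whereas an intermediate value (as for the second cell here) flags a classification that depends on the weighting assumptions.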

8.3. Comparative Method Evaluation

Although AHP dominates current practice, very few studies compare alternative MCDA methods using identical datasets. Such comparisons are essential for moving method selection from convenience to evidence:

Controlled benchmarking experiments: Design studies that apply multiple decision models—including AHP, fuzzy AHP, TOPSIS, ELECTRE, PROMETHEE, and outranking approaches—to identical criteria layers and spatial inputs. Outputs should be compared not only in terms of final hazard classifications but also in terms of sensitivity to input perturbations, stability under weight variation, and computational requirements.
Method–hazard fit assessment: Develop theoretical frameworks for matching MCDA methods to hazard types based on their mathematical properties. For instance, do compensatory methods like AHP systematically overestimate hazard in locations with one extremely unfavourable factor? Are outranking methods more appropriate when criteria are strongly interdependent? Such questions require systematic investigation.
Multi-method consensus analysis: Explore whether locations consistently classified as high hazard across multiple MCDA methods provide more reliable targets for mitigation than locations identified by any single method. This would establish empirical grounds for recommending methodological pluralism in high-stakes decisions.
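A controlled comparison of this kind can start very small. The Python sketch below applies two compensatory methods, a weighted linear combination (WLC) and TOPSIS, to the same criteria matrix and weights, and returns the resulting rankings. It is an illustration only: the site values and weights are hypothetical, all criteria are assumed to be oriented so that higher values mean higher hazard, and TOPSIS is implemented in its common vector-normalised form.

```python
import math


def wlc_scores(matrix, weights):
    """Weighted linear combination: one additive score per site."""
    return [sum(w * x for w, x in zip(weights, row)) for row in matrix]


def topsis_scores(matrix, weights):
    """TOPSIS closeness coefficient per site (1 = identical to the ideal
    point, 0 = identical to the anti-ideal point). Assumes every criterion
    is benefit-type and that no column is all zeros."""
    n_crit = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    ideal = [max(r[j] for r in weighted) for j in range(n_crit)]
    anti_ideal = [min(r[j] for r in weighted) for j in range(n_crit)]
    scores = []
    for r in weighted:
        d_ideal = math.dist(r, ideal)
        d_anti = math.dist(r, anti_ideal)
        scores.append(d_anti / (d_ideal + d_anti))
    return scores


def ranking(scores):
    """Site indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical sites: moderately high everywhere, extreme on one criterion,
# and low everywhere.
sites = [
    [0.6, 0.6, 0.6],
    [1.0, 0.3, 0.3],
    [0.2, 0.2, 0.2],
]
site_weights = [0.4, 0.3, 0.3]

wlc_rank = ranking(wlc_scores(sites, site_weights))
topsis_rank = ranking(topsis_scores(sites, site_weights))
```

Comparing `wlc_rank` and `topsis_rank` across many weight vectors and input perturbations is one simple way to quantify how much the choice of method, rather than the data, drives a classification. Sites ranked high by every method are candidates for the multi-method consensus analysis described above.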

8.4. Ensemble Frameworks and Machine Learning Integration

Reliance on a single decision model can limit robustness when criteria or data are uncertain. Future research should develop hybrid and ensemble approaches that combine the interpretability of MCDA with the predictive power of machine learning.

Before detailing ensemble and hybrid approaches, it is worth clarifying the distinctive role that MCDA can play in an era increasingly dominated by AI and machine learning. The emergence of black-box predictive models—random forests, gradient boosting, convolutional neural networks—has not rendered MCDA obsolete. Rather, it has clarified MCDA’s complementary strengths. First, MCDA offers full interpretability: every weight, every pairwise comparison, and every aggregation step is transparent and traceable, unlike the latent representations of deep learning. Second, MCDA can incorporate qualitative expert knowledge and stakeholder values directly into the decision structure—something that purely data-driven models cannot do without post-hoc translation. Third, MCDA serves as an interpretable baseline against which black-box models can be benchmarked: if a neural network does not outperform a properly validated AHP model, the added complexity is difficult to justify. Fourth, MCDA functions as a modular component in hybrid workflows—for example, using convolutional neural networks for automated feature extraction from satellite imagery, followed by MCDA for transparent hazard prioritisation. In this review, we, therefore, treat MCDA not as a competing paradigm to AI but as a decision-structuring framework whose reliability must be established on its own terms before it can be meaningfully integrated with or compared against data-driven methods.

Random Forest ensembles of MCDA outputs: A particularly promising direction involves treating multiple MCDA models as an ensemble, analogous to Random Forest in machine learning. Rather than selecting a single weighting scheme or decision method, researchers could:
– Generate a large ensemble of plausible hazard maps by varying: (i) criteria weights across expert-elicited ranges, (ii) aggregation rules (additive, multiplicative, outranking), (iii) classification thresholds, and (iv) input data sources or resolutions.
– Train a random forest classifier on this ensemble, using locations with documented hazard events as training labels, to learn which combinations of model outputs are most predictive.
The resulting meta-model would retain interpretability (each base model is a transparent MCDA formulation) while achieving the predictive performance associated with ensemble methods. This approach directly addresses the core problem identified in this review: the gap between operational uptake and predictive verification.
Machine learning for weight calibration: Use documented hazard events to learn optimal criteria weights from data, rather than relying solely on expert judgement. Methods such as logistic regression, support vector machines, or neural networks can be trained to predict event occurrence from the same criteria used in MCDA models. The learned weights can then be compared with expert-derived weights, and discrepancies can inform iterative refinement of both models and expert understanding.
Hybrid MCDA–ML workflows: Develop pipelines in which machine learning handles tasks MCDA does poorly (automatic feature extraction from remote sensing imagery, pattern recognition in time series) while MCDA handles tasks ML does poorly (incorporating qualitative expert knowledge, making trade-offs explicit, supporting stakeholder deliberation). For example, convolutional neural networks could identify potentially dangerous glacial lakes from satellite imagery, while MCDA prioritises them for ground-based monitoring based on expert-elicited criteria.
Interpretable ML as MCDA alternative: Explore whether inherently interpretable machine learning methods—such as decision trees, rule-based classifiers, or explainable boosting machines—can serve as alternatives to MCDA, combining predictive performance with the transparency that hazard managers require.
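The weight-calibration idea above can be illustrated with a small Python sketch. It fits a logistic regression by batch gradient descent to a hypothetical event inventory and converts the fitted coefficients into MCDA-style weights. Everything here is an assumption made for illustration: the data are invented, the learning rate and epoch count are arbitrary, and the clip-and-rescale step is just one simple way to put coefficients on the same scale as expert weights, not an established procedure from the reviewed literature.

```python
import math


def fit_logistic(X, y, learning_rate=0.5, epochs=3000):
    """Fit logistic regression by batch gradient descent.
    Returns (coefficients, intercept)."""
    n, k = len(X), len(X[0])
    coefs = [0.0] * k
    intercept = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = intercept + sum(c * x for c, x in zip(coefs, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        coefs = [c - learning_rate * g / n for c, g in zip(coefs, grad_w)]
        intercept -= learning_rate * grad_b / n
    return coefs, intercept


def to_mcda_weights(coefs):
    """Clip negative coefficients to zero and rescale to sum to 1, so the
    result can be compared with expert-derived weights (an assumption of
    this sketch)."""
    clipped = [max(c, 0.0) for c in coefs]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("no criterion has a positive coefficient")
    return [c / total for c in clipped]


# Hypothetical inventory: two normalised criteria per site and an event label.
X_demo = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.4], [0.6, 0.9], [0.4, 0.1],
          [0.3, 0.8], [0.2, 0.5], [0.1, 0.3], [0.75, 0.5], [0.35, 0.6]]
y_demo = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

demo_coefs, demo_intercept = fit_logistic(X_demo, y_demo)
learned_weights = to_mcda_weights(demo_coefs)
expert_weights = [0.5, 0.5]  # hypothetical expert judgement for comparison
```

A large gap between `learned_weights` and `expert_weights` does not by itself show that the experts are wrong; with small or biased event inventories the fitted coefficients are themselves uncertain, so the discrepancy is better read as a prompt for further investigation.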

8.5. Infrastructure for Reproducible Research

The reproducibility crisis in environmental modelling [31,32] has not spared glacier hazard MCDA. Future work should embed reproducibility into research practice:

Open-source software frameworks: Develop and maintain open-source toolkits (e.g., Python libraries, R packages) that implement MCDA methods with built-in sensitivity analysis, uncertainty quantification, and validation reporting. Such tools would lower the technical barrier to rigorous practice and ensure methodological consistency across studies.
Standardised reporting guidelines: Establish community guidelines for reporting MCDA-based hazard assessments, requiring disclosure of: (i) all pairwise comparison matrices, (ii) consistency ratios for all experts and aggregation levels, (iii) full sensitivity analysis results, (iv) validation metrics with confidence intervals, and (v) code and data sufficient for independent replication. Journals should mandate adherence to these guidelines.
Benchmark datasets and challenges: Create curated benchmark datasets with documented hazard events, high-quality criteria layers, and standardised train–test splits. Organise community challenges (e.g., “Predict the next GLOF in the Himalayas using MCDA or hybrid methods”) to accelerate methodological innovation and enable fair comparisons.
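Reporting pairwise comparison matrices and consistency ratios is straightforward to automate. The Python sketch below derives AHP weights from a pairwise comparison matrix by power iteration and computes Saaty's consistency ratio (CR). The example matrix is hypothetical and constructed to be perfectly consistent; the random index values are the commonly tabulated ones for matrices of order 3 to 9, and a CR below 0.10 is the conventional acceptance threshold.

```python
# Commonly tabulated random consistency index values by matrix order.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration. Returns (weights, lambda_max)."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [v / total for v in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max


def consistency_ratio(matrix):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n not in RANDOM_INDEX:
        raise ValueError("random index tabulated here only for n = 3..9")
    _, lambda_max = ahp_weights(matrix)
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical, perfectly consistent judgements for three criteria:
# each entry a_ij equals w_i / w_j for the weights 0.6, 0.3, 0.1.
pairwise = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]

demo_weights, demo_lambda = ahp_weights(pairwise)
demo_cr = consistency_ratio(pairwise)
```

For this matrix the weights recover 0.6, 0.3, and 0.1 and the CR is zero up to floating-point error. Publishing the matrix together with such a script lets readers verify the reported weights and consistency ratios independently.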

8.6. Georeferencing and Spatial Data Infrastructure

A specific but consequential gap in current practice concerns the lack of precise geolocation for case study sites. Most reviewed studies report only a general region (e.g., “Hunza Valley, Pakistan”) without providing decimal latitude–longitude coordinates for individual glacial lakes, hazard zones, or validation points. This omission limits the ability to aggregate data across studies, perform meta-analyses, or link findings to other spatial datasets (e.g., climate models, topographic indices, and land use). Assigning decimal latitude–longitude (dLL) coordinates to each study site would enable: (i) mapping of methodological patterns (e.g., which regions use which MCDA methods); (ii) spatial cross-validation (e.g., testing whether models calibrated in one region predict hazards in another); and (iii) semantic linking with open data repositories. Emerging workflows combining Large Language Models (LLMs) for information extraction with geospatial databases could semi-automate the georeferencing of existing studies. We therefore encourage future MCDA-based hazard assessments to report study site coordinates as standard practice, and we call for the development of a community-maintained, georeferenced database of MCDA hazard applications, akin to the GLOF database of Veh et al. [41] but focused on methodological metadata. This would transform the current corpus from a collection of isolated case studies into an interoperable, spatially explicit evidence base.

8.7. Synthesis and Outlook

The research agenda outlined above responds directly to the limitations identified in this review. Validation moves from exception to norm; uncertainty from omission to analysis; method selection from convenience to evidence; MCDA from standalone tool to component of hybrid, ensemble, and machine-learning-assisted workflows.

The field stands at an inflection point. The glaciers are retreating, the hazards are intensifying, and the demand for actionable assessments will only grow. Meeting this demand requires not more maps but better science: maps that are tested, uncertainties that are quantified, methods that are compared, and results that are reproducible. The path forward is clear. What remains is the collective will to walk it.

The ensemble approach we propose—particularly the random forest aggregation of diverse MCDA outputs—offers a concrete starting point. It honours the interpretability that makes MCDA valuable while harnessing the predictive power that ensemble methods provide. We invite the community to take up this challenge: to build, test, and refine frameworks that combine the best of both worlds, and in doing so, to transform glacier hazard assessment from a craft of plausible representation into a science of accountable prediction.

9. Conclusions

Across sixty studies spanning a decade of research, a consistent and consequential pattern emerges. The field of glacier hazard assessment has embraced multi-criteria decision analysis with enthusiasm, yet the organising logic of this uptake is neither problem-centred nor evidence-driven. Rather, current practice is fundamentally method-centred: the same decision framework (AHP with weighted linear aggregation in a GIS environment) is applied to glacial lake outburst floods, landslides, debris flows, and avalanches, with minimal adaptation to the distinct physical mechanisms or decision contexts of each hazard type. Method selection is driven not by demonstrated predictive superiority or problem-specific fit, but by operational convenience: software availability (ArcGIS Weighted Overlay), low mathematical barriers, and institutional familiarity. The consequence is a methodological monoculture that produces visually compelling hazard maps while systematically deferring empirical validation, uncertainty quantification, and robustness testing. This is not a failure of individual researchers; it is a structural feature of the field’s current incentive landscape, software infrastructure, and publication norms. The central claim of this review, therefore, is not that MCDA is useless (it is not) but that the way MCDA is currently practised produces maps that are method-centred rather than problem-centred, and interpreted as predictive while functioning as structured expert judgement. Closing this gap requires not technical tweaks but a fundamental reorientation: from producing more maps to producing more accountable ones.

The empirical patterns are instructive. AHP-based approaches dominate methodological choice, accounting for the majority of applications. Case studies are heavily concentrated in High Mountain Asia, shaping both methodological norms and empirical expectations. Quantitative validation, while present, remains limited. Taken together, these patterns do not indicate failure, but they do point to an imbalance: a field that has prioritised operational applicability over evidential grounding.

The central contribution of this study is to make this imbalance explicit. What we observe is a systematic mismatch between methodological uptake and evidential support. MCDA-based hazard maps are widely produced and frequently interpreted as predictive representations of risk. Yet the modelling practices that generate them—reliant on expert-derived weights, deterministic aggregation, and limited validation—are more consistent with structured interpretation than with empirical prediction.

From this perspective, the hazard map is not a measurement but a formalised judgement rendered spatial. This distinction is not merely conceptual; it defines the limits of what can be claimed. While such models can organise knowledge, support prioritisation, and facilitate communication, they cannot, in their current form, be assumed to provide validated predictions of hazard occurrence.

If we take this insight seriously, then the direction for future research becomes clearer. Validation must move from a peripheral activity to a central requirement. Sensitivity and robustness analysis should be treated as integral components of the modelling, not optional additions. Weighting schemes must be recognised as assumptions to be tested rather than inputs to be accepted. And methodological diversity—through comparative and hybrid approaches—should be encouraged to better understand how modelling choices shape outcomes.

More broadly, the field must align its methodological practices with the level of confidence it seeks to claim. Producing hazard maps is not, in itself, sufficient; what matters is the extent to which those maps are supported by evidence, tested against alternative assumptions, and interpreted within their epistemic limits.

For practitioners, the implications are both practical and cautionary. MCDA remains a valuable tool for structuring complex decisions, particularly in data-constrained environments where alternative approaches may not be feasible. Its transparency and flexibility make it well suited for integrating diverse sources of information and supporting stakeholder dialogue.

However, its outputs should not be mistaken for predictive certainty. A hazard map derived from a single model configuration, without validation or uncertainty analysis, is best understood as a hypothesis—a structured representation of risk informed by available knowledge and expert judgement. In high-stakes contexts, such representations should be complemented with multiple lines of evidence and interpreted with explicit recognition of their limitations.

We began this review with a simple question: what kind of knowledge do MCDA-based hazard assessments actually produce? Our answer is that they produce structured, interpretable, and often useful representations of risk—but not, in most cases, validated predictions.

Recognising this distinction is essential. It does not diminish the value of MCDA; rather, it clarifies its proper role. Used appropriately, MCDA can support deliberation, make assumptions explicit, and help navigate complex decision spaces. Used uncritically, it risks conveying a level of certainty that the underlying models cannot justify.

The glaciers are retreating, hazards are intensifying, and decisions cannot wait. In such contexts, clarity is valuable—but only if it is honest about its limits. The challenge for the field is therefore not to abandon MCDA, but to use it with greater rigour, transparency, and interpretive care.

Ultimately, the maps we produce will shape real decisions. Whether they do so wisely depends not only on how they are constructed but also on how they are understood.

These findings are particularly relevant in the context of the growing adoption of AI and machine learning in hazard modelling, where issues of interpretability, validation, and reliability become even more critical.

Author Contributions

Conceptualisation, Ricardo Gacitua, Javier Pereira, Hernán Astudillo, Carla Taramasco and Pedro Contreras; methodology, Ricardo Gacitua, Javier Pereira, Hernán Astudillo, Carla Taramasco and Pedro Contreras; software, Ricardo Gacitua and Javier Pereira; validation, Ricardo Gacitua, Javier Pereira, Hernán Astudillo, Carla Taramasco and Pedro Contreras; formal analysis, Ricardo Gacitua, Javier Pereira, Hernán Astudillo, Carla Taramasco and Pedro Contreras; investigation, Ricardo Gacitua, Javier Pereira and Hernán Astudillo; resources, Ricardo Gacitua, Javier Pereira, Hernán Astudillo and Carla Taramasco; data curation, Ricardo Gacitua and Javier Pereira; writing—original draft preparation, Ricardo Gacitua, Javier Pereira and Hernán Astudillo; writing—review and editing, Ricardo Gacitua, Javier Pereira, Hernán Astudillo, Carla Taramasco and Pedro Contreras; visualisation, Ricardo Gacitua; supervision, Ricardo Gacitua and Javier Pereira; project administration, Ricardo Gacitua and Javier Pereira. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The protocol for this systematic review, covering the predefined research questions, the search strategy, the inclusion and exclusion criteria, and the data extraction framework, was registered and made publicly accessible on the Open Science Framework (OSF). Relevant materials detailing the study selection procedure and the extracted study characteristics are also available in the OSF repository (https://doi.org/10.17605/OSF.IO/46PZ9). All data produced or examined in this study originate from previously published sources. No sensitive data were collected. Requests for further details, clarification on the extracted data, or access to relevant materials should be addressed to the corresponding author, Ricardo Gacitúa (email: ricardo.gacitua@ufrontera.cl).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Selected Articles

Table A1. Overview of all reviewed papers, ordered by year of appearance.

ID	Source	Title	Year
1	[42]	Growing Glacial Lake Outburst Flood Risks in Ghizer District: A Karakoram Anomaly Region	2025
2	[43]	Classification and evaluation of dangerous glacial lakes in the Hindukush region of Afghanistan (HKA) using a multi-criteria approach	2025
3	[44]	Glacial Lake Outburst Floods (GLOFs) Susceptibility in the Northwest Himalayas using AHP-TOPSIS and AHP-COPRAS	2025
4	[45]	Assessing the geotourism potential of glacial lakes in Plav, Montenegro: A multi-criteria assessment by using the M-GAM model	2025
5	[46]	A data-driven multi-criteria model for GLOF risk in tectonically active Himalayan regions	2025
6	[47]	Comprehensive susceptibility assessment of continental glacier ice avalanches: a case study of glaciers on the northwestern Tibetan Plateau	2025
7	[48]	Seasonal evaluation of glacier dynamics and risk analysis using remote sensing techniques in the Buni Zom Valley, Chitral River Basin, Northern Pakistan	2025
8	[49]	Distribution patterns and risk assessment of potential ice avalanches and glacier lake outburst floods in southeastern Tibetan Plateau	2025
9	[50]	Monitoring glacial lake formation and GLOF hazard in Kanchenjunga conservation area of Nepal (2010–2023): a high-resolution approach	2025
10	[51]	Evaluating GLOF Susceptibility of Potentially Dangerous Glacial Lakes Using AHP-Based Multi-Criteria Analysis in the Suru Sub-Basin, Western Himalayas	2025
11	[52]	Assessing glacial lake outburst flood risk in the Eastern Himalayas: a Bayesian neural network framework	2025
12	[53]	Hazards profile of the Shigar Valley, Central Karakoram, Pakistan: Multicriteria hazard susceptibility assessment	2024
13	[54]	Glacial lake outburst flood risk assessment of a rapidly expanding glacial lake in the Ladakh region of Western Himalaya, using hydrodynamic modeling	2024
14	[55]	Inventory and GLOF susceptibility of glacial lakes in Chenab basin, Western Himalaya	2024
15	[56]	Snow Avalanche Hazard Mapping Using a GIS-Based AHP Approach: A Case of Glaciers in Northern Pakistan from 2012 to 2022	2023
16	[57]	Glacial lake outburst flood risk assessment using remote sensing and hydrodynamic modeling: A case study of Satluj Basin, Western Himalayas, India	2023
17	[58]	Flood-based critical sub-watershed mapping: comparative application of multi-criteria decision making methods and hydrological modeling approach	2023
18	[59]	Enhanced glacial lake activity threatens numerous communities and infrastructure in the Third Pole	2023
19	[60]	Assessment of potential present and future glacial lake outburst flood hazard in the Hunza valley: A case study of Shisper and Mochowar glacier	2023
20	[61]	Application of a structured decision-making process in cryospheric hazard planning: Case study of Bering Glacier surges on local state planning in Alaska	2024
21	[62]	A robust glacial lake outburst susceptibility assessment approach validated by GLOF event in 2020 in the Nidu Zangbo Basin, Tibetan Plateau	2023
22	[63]	Landslide susceptibility mapping along the China Pakistan Economic Corridor (CPEC) route using multi-criteria decision-making method	2022
23	[64]	Landslide susceptibility assessment of Kashmir Himalaya, India	2022
24	[65]	Identifying the Potential Dam Sites to Avert the Risk of Catastrophic Floods in the Jhelum Basin, Kashmir, NW Himalaya, India	2022
25	[66]	Glacial lake changes and the identification of potentially dangerous glacial lakes (PDGLs) under warming climate in the Dibang River Basin, Eastern Himalaya, India	2022
26	[24]	GIS-based landslide susceptibility zonation and comparative analysis using analytical hierarchy process and conventional weighting-based multivariate statistical methods in the Lachung River Basin, North Sikkim	2022
27	[67]	District flood vulnerability assessment using Analytic Hierarchy Process (AHP) with historical flood events in Bhutan	2022
28	[68]	Climate change and glacial lake outburst flood (GLOF) risk perceptions: An empirical study of Ghizer District, Gilgit-Baltistan Pakistan	2022
29	[69]	A Comparative Assessment of Multi-Criteria Decision-Making Analysis and Machine Learning Methods for Flood Susceptibility Mapping and Socio-economic Impacts on Flood Risk in Abela-Abaya Floodplain of Ethiopia	2022
30	[70]	Soil erosion assessment using earth observation data in a trans-boundary river basin	2021
31	[71]	Probability of glacial lake outburst flooding in the Himalaya	2021
32	[23]	Multi criteria analysis for flood hazard mapping using GIS techniques: a case study of Ghaghara River basin in Uttar Pradesh, India	2021
33	[72]	Mesoscale seismic hazard zonation in the Central Seismic Gap of the Himalaya by GIS-based analysis of ground motion, site and earthquake-induced effects	2021
34	[73]	AHP Based Assessment of Glof Susceptibility of South Lhonak Glacial Lake, Sikkim Himalaya, India	2021
35	[74]	Landslide susceptibility zonation using geospatial technique and analytical hierarchy process in Sikkim Himalaya	2021
36	[75]	Inventory and GLOF Susceptibility of Glacial Lakes in Hunza River Basin, Western Karakorum	2021
37	[76]	Glacial Lakes in the Andes under a Changing Climate: A Review	2021
38	[77]	Glacial Lake Area Change and Potential Outburst Flood Hazard Assessment in the Bhutan Himalaya	2021
39	[78]	Evaluation of Glacial Lake Outburst Flood Susceptibility Using Multi-Criteria Assessment Framework in Mahalangur Himalaya	2021
40	[25]	Assessment of terrain stability zones for human habitation in Himalayan Upper Pindar River Basin, Uttarakhand using AHP and GIS	2021
41	[79]	National-Scale Landslide Susceptibility Mapping in Austria Using Fuzzy Best-Worst Multi-Criteria Decision-Making	2020
42	[80]	Glacial Lake Inventory and Lake Outburst Flood/Debris Flow Hazard Assessment after the Gorkha Earthquake in the Bhote Koshi Basin	2020
43	[81]	Flash flood risk modeling of swat river sub-watershed: a comparative analysis of morphometric ranking approach and El-Shamy approach	2020
44	[82]	Analytic Hierarchy Process applied to landslide susceptibility mapping of the North Branch of Argentino Lake, Argentina	2020
45	[17]	A GIS-Based Multi-Criteria Decision Analysis Model for Determining Glacier Vulnerability	2020
46	[83]	Susceptibility assessment of rainfall induced debris flow zones in Ladakh–Nubra region, Indian Himalaya	2019
47	[84]	Potentially dangerous glacial lakes across the Tibetan Plateau revealed using a large-scale automated assessment approach	2019
48	[85]	Multi-criteria evaluation for landslide hazard zonation by integrating remote sensing, GIS and field data in North Kashmir Himalayas, J&K, India	2019
49	[86]	Mapping of moraine dammed glacial lakes and assessment of their areal changes in the central and eastern Himalayas using satellite data	2019
50	[87]	Landslide susceptibility mapping by using a geographic information system (GIS) along the China–Pakistan Economic Corridor (Karakoram Highway), Pakistan	2019
51	[88]	Cryospheric hazards and risk perceptions in the Sagarmatha (Mt. Everest) National Park and Buffer Zone, Nepal	2019
52	[89]	A Hybrid Spatial Multi-Criteria Evaluation Method for Mapping Landslide Susceptible Areas in Kullu Valley, Himalayas	2019
53	[16]	Use of multi-criteria decision analysis to identify potentially dangerous glacial lakes	2018
54	[90]	Modelling glacial lake outburst flood impacts in the Bolivian Andes	2018
55	[91]	Outburst susceptibility assessment of moraine-dammed lakes in Western Himalaya using an Analytic Hierarchy Process	2017
56	[92]	GIS based landslide susceptibility mapping of northern areas of Pakistan, a case study of Shigar and Shyok Basins	2017
57	[93]	Geospatial Modelling and Mapping of Snow Avalanche Susceptibility	2017
58	[94]	Earthquake hazard assessment through geospatial model and development of EaHaAsTo tool for visualization: an integrated geological and geoinformatics approach	2017
59	[95]	Decision-Making Methodology for Risk Management Applied to Imja Lake in Nepal	2017
60	[96]	Moraine-dammed lake distribution and outburst flood risk in the Chinese Himalaya	2015

Appendix B. Criteria Used to Classify Articles

Table A2. Updated classification criteria used to characterise the reviewed studies.

Criterion	Category	Definition/Interpretation
Hazard type	Glacier Lake Outburst Flood (GLOF)	Failure or breaching of moraine- or ice-dammed glacial lakes producing downstream flooding.
	Landslide/Debris flow	Slope instability processes including rockfall, debris flow, and mass movement.
	Snow or Ice Avalanche	Rapid downslope movement of snow or glacier ice masses.
	Flood	Riverine or flash flood processes not directly caused by glacial lake breach.
	Seismic hazard	Ground shaking or earthquake-induced slope failures.
	Other cryospheric hazard	Glacier surge, glacier vulnerability, or erosion-related hazards.
MCDA aggregation approach	Compensatory (value/utility based)	Weighted linear combination of criteria (e.g., AHP, WLC, TOPSIS, COPRAS, SAW).
	Outranking	Preference comparison between alternatives using outranking relations (e.g., ELECTRE, PROMETHEE).
	Fuzzy MCDA	Incorporation of vagueness or imprecision through fuzzy membership or linguistic variables.
	Probabilistic/hybrid	Integration of MCDA with probabilistic or AI-based models (e.g., Bayesian networks, neural networks, ML).
	Descriptive/decision-support framework	MCDA used as a structured decision-support process rather than strict ranking.
Technology used	GIS-based modelling	Spatial multi-criteria analysis performed in a geographic information system environment.
	Remote sensing	Use of satellite imagery, DEMs, or Earth observation data.
	Hydrodynamic modelling	Numerical flood or debris-flow simulations (e.g., HEC-RAS).
	Machine learning/AI	Use of data-driven predictive algorithms integrated with MCDA.
	Decision Support System (DSS)	Implementation within a structured or automated decision-support platform.
	Field and observational data	Ground measurements, historical events, or field surveys used as inputs.
Uncertainty treatment	Spatial or temporal variability	Hazard evaluation varies across space or time scenarios.
	Fuzzy/imprecision modelling	Handling vague or qualitative data using fuzzy logic.
	Probabilistic uncertainty	Explicit modelling of likelihood or event probability.
	Sensitivity/robustness analysis	Testing stability of results to parameter changes.
Weighting approach	Preference-based weighting	Weights derived from expert judgement (e.g., AHP pairwise comparisons).
	Data-driven weighting	Weights estimated from statistical or machine-learning analysis.
	Objective weighting	Weights computed from mathematical properties of criteria (e.g., entropy, equal weights).
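
To make two of the categories in Table A2 concrete, the following minimal sketch contrasts an objective weighting scheme (Shannon entropy) with a compensatory aggregation (weighted linear combination). The function names, the three hypothetical lakes, and their scores are illustrative assumptions and are not taken from any reviewed study; criteria are assumed to be benefit-type, non-negative, and already normalised.

```python
import math


def entropy_weights(matrix):
    """Shannon-entropy weights for an (alternatives x criteria) score matrix.

    Criteria whose values differ more across alternatives receive more weight.
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("at least two alternatives are required")
    m = len(matrix[0])
    k = 1.0 / math.log(n)
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total == 0:
            divergences.append(0.0)  # an all-zero criterion carries no information
            continue
        shares = [v / total for v in column]
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        return [1.0 / m] * m  # no criterion discriminates: fall back to equal weights
    return [d / spread for d in divergences]


def weighted_linear_combination(scores, weights):
    """Compensatory aggregation: a low score on one criterion can be offset by another."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical example: three lakes scored on two normalised criteria.
# The second criterion is identical for all lakes, so it receives (almost) no weight.
lakes = [[0.2, 0.5], [0.5, 0.5], [0.9, 0.5]]
weights = entropy_weights(lakes)
scores = [weighted_linear_combination(row, weights) for row in lakes]
```

In a preference-based scheme the `weights` list would instead come from expert judgement, which is the situation the review reports for most studies.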

Appendix C. Detailed Characterisation of Included Studies

The full extracted dataset used in this systematic review is presented in Table A3. For each included study, the table reports the study region, hazard type, MCDA method, technological environment, number of criteria, whether the AHP consistency ratio (CR) was reported, and the validation procedure.

Table A3. Summary of included studies on MCDA applications in glacier hazard assessment.

No.	ID	Study Region	Hazard Type	MCDA Method	Technological Environment	Criteria No.	AHP CR Reported	Validation Procedure
1	P01-25	Ghizer District, Pakistan	GLOF	MCDA	RS, ML	6	Not Mentioned	Statistical metric & Historical
2	P02-25	Hindukush, Afghanistan	GLOF	AHP	RS, Geospatial	13	Yes (0.088)	Historical event comparison
3	P03-25	Northwest Himalaya, India	GLOF	AHP-TOPSIS/COPRAS	RS, GIS, Seismic	15	Not Mentioned	Historical event comparison
4	P04-25	Plav, Montenegro	Geotourism Potential	M-GAM	Integrated glacial geosite assessment framework	Not specified	N/A	Qualitative comparison
5	P05-25	Eastern Himalayas	GLOF	AHP-Entropy-TOPSIS	RS, GIS, Seismic	13	Yes (0.0884)	Historical comparison & Metric
6	P06-25	Tibetan Plateau, China	Ice Avalanche	AHP-Cloud Theory	RS, GIS	10	Not Mentioned	Qualitative comparison
7	P07-25	Chitral Valley, Pakistan	Glacier Retreat	AHP	RS, GIS	5	Yes (0.0800)	Qualitative comparison
8	P08-25	Eastern Himalayas	GLOF	AHP, (BPNN)	RS, GIS, ML	15	Not Mentioned	Historical event comparison & Statistical metric
9	P09-25	KCA, Nepal	GLOF	AHP	High-res RS, GIS	9	Yes (0.097)	Historical event comparison
10	P10-25	Suru Basin, India	GLOF	AHP	RS, GIS	8	Yes (0.0296)	Qualitative comparison
11	P11-25	Eastern Himalayas	GLOF	AHP-Entropy-BNN	RS, GIS, ML (BNN)	14	Not Mentioned	Statistical metric & Historical
12	P01-24	Shigar Valley, Pakistan	Multi-hazard	AHP	GIS, RS	15	Yes (<0.10)	None
13	P02-24	Ladakh region, India	GLOF	Weighted Sum	RS, Hydrodynamic	10	N/A	None
14	P03-24	Chenab basin, India	GLOF	AHP	RS, GIS	10	Yes (0.0410)	Qualitative comparison
15	P01-23	Karakoram, Pakistan	Avalanche	AHP	GIS, RS	6	Yes (0.0178)	Qualitative comparison
16	P02-23	Satluj basin, India	GLOF	AHP	RS, GIS, Hydrodynamic	5	Not Mentioned	Historical event comparison
17	P03-23	Iran (Watershed)	Flood	AHP, FAHP, ANP, FANP	GIS, Hydrological	8	Not Mentioned	Statistical metric & Historical
18	P04-23	Third Pole	GLOF	AHP	RS, GIS, Hydrodynamic	5	Not Mentioned	Historical event comparison
19	P05-23	Hunza Valley, Pakistan	GLOF	AHP	RS, GIS, Hydrodynamic	10	Yes (≤0.01)	Historical event comparison
20	P06-23	Alaska, USA	Glacier Surge	MCDM/DecideIT	Decision Tool	4	N/A	Qualitative & Sensitivity
21	P07-23	Tibetan Plateau, China	GLOF	AHP	RS, Field, DEM	15	Yes (Pass implied)	Historical event comparison
22	P01-22	CPEC route, Pakistan	Landslide	AHP	GIS, RS	13	Yes (<0.1)	Historical event comparison
23	P02-22	Kashmir Himalaya, India	Landslide	MCE	GIS, RS	14	N/A	Statistical metric & Historical
24	P03-22	Jhelum Basin, India	Flood	MCA/Weighted Overlay	GIS, RS	9	Not Mentioned	Statistical metric
25	P04-22	Dibang Basin, India	GLOF	Heuristic Index	RS, GIS	6	Not Mentioned	None
26	P05-22	North Sikkim, India	Landslide	AHP, CW	GIS, RS	8	Yes (Pass mentioned)	Statistical metric & Historical
27	P06-22	Bhutan	Flood	AHP	GIS	7	Yes (Pass mentioned)	Historical event comparison
28	P07-22	Ghizer District, Pakistan	GLOF (Risk Perception)	None	Survey methodology	N/A	N/A	N/A
29	P08-22	Abela-Abaya, Ethiopia	Flood	MABAC, VIKOR	GIS, ML	15	N/A	Statistical metric & Historical
30	P01-21	Ghaghara Basin, India/Nepal	Soil erosion	MCE, AHP	GIS, RS, Hydro-model	11	Yes (0.02)	None
31	P02-21	Himalaya (Regional)	GLOF probability	OBC, AHP	GIS, RS, DEM	15	Yes (0.097)	Qualitative verification
32	P03-21	Ghaghara Basin, India	Flood	AHP	GIS, RS	9	Yes (0.1)	Historical event comparison
33	P04-21	Central Seismic Gap, India	Multi-hazard	AHP	GIS, Site response	Multiple	Yes (0.03 (EHI))	Statistical metric (ROC)
34	P05-21	South Lhonak Lake, India	GLOF susceptibility	AHP	GIS, Remote Sensing	Not listed	Yes (0.0126)	None
35	P06-21	Sikkim Himalaya, India	Landslide	AHP, WLC	GIS, RS, Ground data	8	Yes (0.025)	Statistical metric (ROC/AUC)
36	P07-21	Hunza Basin, Pakistan	GLOF	AHP	RS, GIS	11	Yes (0.06)	Historical event comparison
37	P08-21	Andes (South America)	Glacial lake/GLOF	Review Article	Remote Sensing	N/A	N/A	N/A
38	P09-21	Bhutan Himalaya	GLOF hazard potential	AHP	Corona KH-4, Sentinel-2, GIS	Not listed	Not reported	None
39	P10-21	Mahalangur Himalaya, Nepal	GLOF susceptibility	Multi-criteria/AHP	Sentinel-2, Landsat, Pleiades	6	Not reported	Qualitative comparison
40	P11-21	Pindar River Basin, India	Terrain stability	AHP	GIS, IDW interpolation	12	<0.135	Qualitative comparison
41	P01-20	Austria	Landslide	FBWM, FAHP	GIS, RS, Inventory	9	0.041 (FBWM), 0.052 (FAHP)	Statistical metric (ROC/AUC)
42	P02-20	Bhote Koshi Basin	GLOF/Debris flow	Multi-criteria potential assessment	GaoFen-1, SRTM DEM, Google Earth	5	N/A	Historical event comparison
43	P03-20	Swat River, Pakistan	Flash flood risk	Morphometric ranking	GIS, ArcHydro, DEM	15	N/A	Historical event comparison
44	P04-20	Argentino Lake, Argentina	Landslide	AHP, WLC	GIS, RS	6	Yes (0.069)	Qualitative inventory comparison
45	P05-20	Ağrı Mountain, Turkey	Glacier retreat	Entropy Method	GIS, RS, DEM	5	N/A	Statistical metric
46	P01-19	Ladakh–Nubra, India	Debris flow	AHP-based WOA	GIS, RS, NWP (WRF)	6	Not reported	Statistical metric (ROC/AUC)
47	P02-19	Tibetan Plateau	GLOF	Automated assessment	GIS, RS	4	N/A	Historical event comparison
48	P03-19	North Kashmir, India	Landslide	MCE	GIS, RS, Field data	6	Not reported	Qualitative comparison
49	P04-19	Central/Eastern Himalaya	Glacial lake mapping	Remote Sensing Inventory	Landsat 4–8	N/A	N/A	None
50	P05-19	Karakoram Highway, Pakistan	Landslide	AHP, WOM	GIS, RS, Field visit	10	<0.1	Statistical metric (Accuracy)
51	P06-19	Everest Region, Nepal	Cryospheric risk perception	AHP (social scoring)	Mental models, focus groups	Perceived hazard ranking	Not reported	Qualitative comparison
52	P07-19	Kullu Valley, India	Landslide	Hybrid SMCE (AHP+FR)	GIS, RS, Field survey	8	<0.1 cited	Statistical metric (ROC/AUC)
53	P01-18	Global/Andes	GLOF	SMAA-TRI	GIS, RS, Google Earth Pro	13	N/A	Statistical (Sensitivity analysis)
54	P02-18	Bolivian Andes	GLOF impacts	MCDA results	Hydrodynamic modeling	N/A	N/A	Historical event comparison
55	P01-17	Western Himalaya, India	GLOF	AHP	GIS, RS, Historical data	11	Yes (0.032)	Historical event comparison
56	P02-17	Northern Pakistan	Landslide	AHP	GIS, RS	10	Yes (0.04)	Statistical metric
57	P03-17	Siachen region, India	Snow avalanche	AHP	GIS, RS	5	<0.1 accepted	Statistical metric (ROC/AUC)
58	P04-17	Sikkim Himalaya	Earthquake/ Seismic vulnerability	Numerical rating scheme	GIS, Site response, ASTER GDEM	9	N/A	Qualitative
59	P05-17	Imja Lake, Nepal	GLOF risk management	Data Envelopment Analysis (DEA)	Hydrodynamic modeling (HEC-RAS, FLO-2D), GIS	Consequence categories	N/A	Statistical (Sensitivity analysis)
60	P01-15	Chinese Himalaya	GLOF	AHP, Weighted Comprehensive	GIS, RS, DEM	15	Not reported	None
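
The "AHP CR Reported" column refers to the consistency ratio of the pairwise comparison matrix. As a hedged illustration of how such a value is commonly obtained, the sketch below derives weights from the principal eigenvector (approximated by power iteration) and computes CR = CI/RI with CI = (λmax − n)/(n − 1). The random index values are the commonly cited Saaty values; the pairwise matrix and the function name are hypothetical and do not reproduce any reviewed study.

```python
# Commonly cited Saaty random index (RI) values for matrix orders 1 to 10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights_and_cr(pairwise, iterations=100):
    """Return (weights, consistency ratio) for a reciprocal pairwise matrix."""
    n = len(pairwise)
    if n not in RANDOM_INDEX:
        raise ValueError("random index only tabulated here for n = 1..10")
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # One power-iteration step: multiply by the matrix, then renormalise.
        product = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    product = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Hypothetical expert judgements for three criteria (e.g. slope, lake area, dam type).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights, cr = ahp_weights_and_cr(pairwise)
acceptable = cr < 0.10  # the conventional acceptance threshold cited in Table A3
```

A CR below 0.10 indicates only that the judgements are internally coherent; it says nothing about whether the resulting weights are predictive.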

Table A4. Detailed scoring rubric for the MCDA-HAZARD methodological quality assessment instrument.

Domain	Score 0 (Not Reported)	Score 1 (Partially Addressed)	Score 2 (Clearly Implemented and Documented)
Criteria Definition	No justification for criteria selection; factors listed without source or rationale.	Criteria listed with brief justification (e.g., cited from literature) but no expert consultation or geomorphic model.	Explicit geomorphic or statistical rationale for each criterion; documentation of expert elicitation or literature synthesis.
Weighting Transparency	Weights stated but no procedure described; source of weights unclear.	Weighting method named (e.g., AHP) but no pairwise matrix or consistency ratio reported; or weights from the literature without adaptation.	Complete weighting documentation: pairwise comparison matrices, consistency ratios for all experts, aggregation method, and rationale for weight selection.
Uncertainty Analysis	No mention of uncertainty; single deterministic map produced.	Qualitative acknowledgement of uncertainty (e.g., “weights may vary”) without quantitative analysis.	Systematic sensitivity analysis (weight variation, scenario testing), probabilistic MCDA, fuzzy modelling, or Monte Carlo simulation reported with results.
Validation	No validation procedure; no comparison with historical events or independent data.	Qualitative validation: visual comparison with known hazard locations, expert judgement of map plausibility, or single historical event check.	Quantitative validation: ROC/AUC, accuracy metrics, confusion matrix, temporal back-testing, or statistical comparison with independent event inventory.
Reproducibility	Data sources not cited; parameters not disclosed; workflow not reproducible.	Data sources cited but processing steps missing; some parameters reported but insufficient for full replication.	Complete disclosure: all data sources and versions, preprocessing steps, all model parameters (weights, thresholds), and code or detailed algorithmic description.

Note: The total score (0–10) is the sum of the five domain scores. Studies scoring 0–3 are classed as having low methodological reliability, 4–6 as moderate, and 7–10 as high. The threshold for “fully reported” in Table 4 (main manuscript) corresponds to a score of 2 in this rubric.
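
The scoring rule in the note can be expressed as a short sketch. The domain keys and function names below are hypothetical labels chosen for illustration; only the 0–2 domain scale and the 0–3, 4–6, and 7–10 bands come from the rubric.

```python
DOMAINS = (
    "criteria_definition",
    "weighting_transparency",
    "uncertainty_analysis",
    "validation",
    "reproducibility",
)


def total_score(domain_scores):
    """Sum the five domain scores after checking each is 0, 1, or 2."""
    for domain in DOMAINS:
        if domain_scores.get(domain) not in (0, 1, 2):
            raise ValueError(f"score for {domain!r} must be 0, 1, or 2")
    return sum(domain_scores[domain] for domain in DOMAINS)


def reliability_class(total):
    """Map a total score (0-10) to the reliability band defined in the note."""
    if not 0 <= total <= 10:
        raise ValueError("total score must lie between 0 and 10")
    if total <= 3:
        return "low"
    if total <= 6:
        return "moderate"
    return "high"


# Hypothetical study: AHP weights named but no matrix reported, no sensitivity
# analysis, and a qualitative comparison with known hazard locations.
example = {
    "criteria_definition": 2,
    "weighting_transparency": 1,
    "uncertainty_analysis": 0,
    "validation": 1,
    "reproducibility": 1,
}
example_total = total_score(example)
example_class = reliability_class(example_total)
```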

Appendix D. Detailed Search Strategy (MCDA in Glacier Hazard)

To ensure the transparency and reproducibility of this systematic literature review on the use of multi-criteria decision analysis (MCDA) techniques in glacier and mountain hazard assessment, this appendix reports the complete search strings, search fields, and filters applied to each digital library. Minor syntactic adjustments were required to accommodate database-specific query languages while preserving equivalent semantics across sources.

All searches targeted peer-reviewed publications written in English and published between 2015 and 2025, consistent with the temporal scope of this review. The final search was conducted in December 2025.

The search strategy operationalises three conceptual blocks derived from the review protocol:

1. Hazard domain: glacier hazards, glacial lake outburst floods (GLOF), avalanches, landslides, and mountain flood hazards.
2. Decision-analysis approach: multi-criteria decision analysis, multi-criteria decision making, AHP, TOPSIS, ELECTRE, and related methods.
3. Risk assessment context: susceptibility mapping, hazard zonation, vulnerability assessment, and spatial risk evaluation.
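
The three blocks above combine as a conjunction of disjunctions: terms within a block are joined with OR, and the blocks are joined with AND. The sketch below shows only this generic structure; the helper names are hypothetical, the term lists are abbreviated, and each database still needs its own field tags and syntax adjustments as reported in the subsections that follow.

```python
def or_block(terms):
    """Join the terms of one conceptual block with OR, quoting multi-word phrases."""
    rendered = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(rendered) + ")"


def build_query(*blocks):
    """Join the conceptual blocks with AND."""
    return " AND ".join(or_block(block) for block in blocks)


# Abbreviated term lists drawn from the blocks described above.
decision_terms = ["multi-criteria decision analysis", "MCDA", "AHP", "TOPSIS"]
hazard_terms = ["glacial lake outburst flood", "GLOF", "avalanche", "landslide"]
context_terms = ["hazard assessment", "susceptibility mapping", "risk assessment"]

query = build_query(decision_terms, hazard_terms, context_terms)
```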

Appendix D.1. IEEE Xplore

Search fields: Document Title, Abstract, Index Terms

("multi-criteria decision" OR "multi-criteria analysis" OR "multi-criteria decision making" OR MCDA OR MCDM OR AHP OR "Analytic Hierarchy Process" OR TOPSIS OR ELECTRE OR BWM)
AND
("glacier hazard" OR "glacial lake outburst flood" OR GLOF OR avalanche OR landslide OR "debris flow" OR "mountain hazard")
AND
("hazard assessment" OR "susceptibility mapping" OR "hazard zonation" OR "risk assessment")

Filters: Journals and Conferences; English; 2015–2025

Appendix D.2. ACM Digital Library

Search fields: Title, Abstract, Keywords

("multi-criteria decision analysis" OR "multi-criteria decision making" OR MCDA OR MCDM OR AHP)
AND
("glacier" OR GLOF OR avalanche OR landslide OR "mountain hazard")
AND
("hazard assessment" OR "risk assessment" OR "susceptibility mapping")

Filters: Articles and Proceedings; English; 2015–2025

Appendix D.3. Scopus

Search fields: TITLE-ABS-KEY

TITLE-ABS-KEY(
("multi-criteria" OR multicriteria OR "multi criteria" OR MCDM OR MCDA OR "Analytic Hierarchy Process" OR AHP OR "fuzzy AHP" OR TOPSIS OR "best worst method" OR BWM OR ELECTRE OR PROMETHEE OR VIKOR OR COPRAS OR SAW)
AND
(glacier* OR glacial OR cryosphere OR cryospheric)
AND
("hazard assessment" OR hazard* OR susceptib* OR zonation OR mapping OR "risk assessment" OR priorit* OR "decision support")
AND
(GLOF OR "glacial lake outburst flood" OR "glacial lake" OR "ice avalanche" OR landslide OR "debris flow")
)
AND PUBYEAR > 2014 AND PUBYEAR < 2026
AND (LIMIT-TO(LANGUAGE, "English"))

Filters: Earth and Planetary Sciences; Environmental Science; Engineering; English; 2015–2025

Appendix D.4. Web of Science

Search fields: Topic (Title, Abstract, Author Keywords)

TS=(
("multi-criteria decision analysis" OR "multi-criteria decision making" OR MCDA OR MCDM OR AHP OR TOPSIS OR ELECTRE)
AND
("glacial lake outburst flood" OR GLOF OR glacier hazard OR avalanche OR landslide)
AND
("hazard assessment" OR "risk assessment" OR "susceptibility mapping")
)

TS=(
("multi-criteria" OR multicriteria OR "multi criteria" OR MCDA OR MCDM OR "Analytic Hierarchy Process" OR AHP OR "fuzzy AHP" OR TOPSIS OR "best worst method" OR BWM OR ELECTRE OR PROMETHEE OR VIKOR OR COPRAS OR SAW)
AND
(glacier* OR glacial OR cryosphere OR cryospheric)
AND
("hazard assessment" OR hazard* OR susceptib* OR zonation OR mapping OR "risk assessment" OR priorit* OR "decision support")
AND
(GLOF OR "glacial lake outburst flood" OR "glacial lake" OR "ice avalanche" OR landslide OR "debris flow")
)

Refined by: Languages=(ENGLISH)
Timespan: 2015–2025
Indexes: SCI-EXPANDED, SSCI, ESCI (as applicable)

Filters: Geosciences; Environmental Sciences; Water Resources; English; 2015–2025

Appendix D.5. SpringerLink

Search fields: Title, Abstract

("multi-criteria" OR multicriteria OR "multi criteria" OR MCDA OR MCDM OR "Analytic Hierarchy Process" OR AHP OR "fuzzy AHP" OR TOPSIS OR "best worst method" OR BWM OR ELECTRE OR PROMETHEE OR VIKOR OR COPRAS OR SAW)
AND
(glacier* OR glacial OR cryosphere OR cryospheric)
AND
("hazard assessment" OR hazard* OR susceptib* OR zonation OR mapping OR "risk assessment" OR priorit* OR "decision support")
AND
(GLOF OR "glacial lake outburst flood" OR "glacial lake" OR "ice avalanche" OR landslide OR "debris flow")

Filters: Earth Sciences; Environmental Science; English; 2015–2025.

Appendix D.6. ScienceDirect (Elsevier ScienceDirect Interface)

ScienceDirect search is less strict with field tags; the following is a typical Advanced Search string:

("multi-criteria" OR multicriteria OR "multi criteria" OR MCDA OR MCDM OR "Analytic Hierarchy Process" OR AHP OR "fuzzy AHP" OR TOPSIS OR "best worst method" OR BWM OR ELECTRE OR PROMETHEE OR VIKOR OR COPRAS OR SAW)
AND
(glacier* OR glacial OR cryosphere OR cryospheric)
AND
("hazard assessment" OR hazard* OR susceptib* OR zonation OR mapping OR "risk assessment" OR priorit* OR "decision support")
AND
(GLOF OR "glacial lake outburst flood" OR "glacial lake" OR "ice avalanche" OR landslide OR "debris flow")
AND
(pub-year > 2014 AND pub-year < 2026)

Appendix D.7. Post-Processing and Deduplication

All retrieved records were exported in BibTeX format and merged into a single database. Duplicate detection, inclusion/exclusion screening, and conflict resolution were conducted using the Rayyan systematic review platform.
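
The review performed duplicate detection in Rayyan. Purely as an illustration of the kind of key-based matching that such a step involves, and not as a description of Rayyan's own algorithm, the sketch below treats two records as duplicates when their normalised DOIs match, falling back to a normalised title when no DOI is present. The record structure and function names are hypothetical.

```python
import re


def record_key(record):
    """Build a comparison key: normalised DOI if available, otherwise normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi)
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first occurrence of each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique


# The same article exported from two databases with differently formatted DOIs,
# plus one record that has no DOI.
records = [
    {"title": "A GIS-Based Multi-Criteria Decision Analysis Model for Determining Glacier Vulnerability",
     "doi": "10.3390/ijgi9030180"},
    {"title": "A GIS-based multi-criteria decision analysis model for determining glacier vulnerability",
     "doi": "https://doi.org/10.3390/IJGI9030180"},
    {"title": "Modelling glacial lake outburst flood impacts in the Bolivian Andes", "doi": ""},
]
unique_records = deduplicate(records)
```

Exact-key matching of this kind misses near-duplicates such as retitled preprints, which is one reason manual conflict resolution remains part of the screening workflow.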

The screening procedure followed the eligibility criteria described in Section 3 and is summarised in the PRISMA 2020 flow diagram. Manual backward and forward snowballing of reference lists was also performed to identify additional relevant studies not retrieved through database searches.

References

Zemp, M.; Welty, E.; Nussbaumer, S.U.; Bannwart, J.; Gärtner-Roer, I.; Wells, A.; Ahlstrøm, A.; Anderson, B.; Andreassen, L.; Azam, M.; et al. Global glacier mass change in 2025. Nat. Rev. Earth Environ. 2026, 7, 213–215.
IPCC. Summary for Policymakers. In Climate Change 2023: Synthesis Report. Contribution of Working Groups I, II and III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Core Writing Team, Lee, H., Romero, J., Eds.; IPCC: Geneva, Switzerland, 2023; pp. 1–34.
Patel, A.; Prajapati, R.; Dharpure, J.; Mani, S.; Chauhan, D. Mapping and monitoring of glacier areal changes using multispectral and elevation data: A case study over Chhota-Shigri glacier. Earth Sci. Inform. 2019, 12, 489–499.
Aslam, B.; Maqsoom, A.; Khalil, U.; Ghorbanzadeh, O.; Blaschke, T.; Farooq, D.; Tufail, R.; Suhail, S.; Ghamisi, P. Evaluation of Different Landslide Susceptibility Models for a Local Scale in the Chitral District, Northern Pakistan. Sensors 2022, 22, 3107.
Qian, W.; Du, J.; Chai, B.; Wang, Y. Susceptibility Mapping of Glacial Lake Outburst Debris Flows Based on System Failure Model. Water 2026, 18, 651.
Ward, P.J.; Blauhut, V.; Bloemendaal, N.; Daniell, J.E.; de Ruiter, M.C.; Duncan, M.J.; Emberson, R.; Jenkins, S.F.; Kirschbaum, D.; Kunz, M.; et al. Review article: Natural hazard risk assessments at the global scale. Nat. Hazards Earth Syst. Sci. 2020, 20, 1069–1096.
Shah, S.; Ishtiaque, A. Adaptation to Glacial Lake Outburst Floods (GLOFs) in the Hindukush-Himalaya: A Review. Climate 2025, 13, 60.
Allen, S.; Frey, H.; Huggel, C. GAPHAZ 2017: Assessment of Glacier and Permafrost Hazards in Mountain Regions—Technical Guidance Document; Technical Report; Standing Group on Glacier and Permafrost Hazards in Mountains (GAPHAZ) of the International Association of Cryospheric Sciences (IACS) and the International Permafrost Association (IPA): Zurich, Switzerland, 2017.
Tielidze, L.G.; Nadaraia, A.; Kumladze, R.M.; Cook, S.J.; Lobjanidze, M.; Liu, Q.; Megrelidze, I.; Mackintosh, A.N.; Imnadze, G. Post-Little Ice Age Shrinkage of the Tsaneri–Nageba Glacier System and Recent Proglacial Lake Evolution in the Georgian Caucasus. Water 2025, 17, 3209.
Richardson, S.D.; Reynolds, J.M. An overview of glacial hazards in the Himalayas. Quat. Int. 2000, 65–66, 31–47.
Huggel, C.; Kääb, A.; Haeberli, W.; Teysseire, P.; Paul, F. Remote sensing based assessment of hazards from glacier lake outbursts: A case study in the Swiss Alps. Can. Geotech. J. 2002, 39, 316–330.
Veh, G.; Korup, O.; von Specht, S.; Roessner, S.; Walz, A. Unchanged frequency of moraine-dammed glacial lake outburst floods in the Himalaya. Nat. Clim. Chang. 2019, 9, 379–383. Available online: https://www.nature.com/articles/s41558-019-0437-5 (accessed on 20 March 2026).
Belton, V.; Stewart, T.J. Multiple Criteria Decision Analysis: An Integrated Approach; Springer Science & Business Media: Boston, MA, USA, 2002.
Tacnet, J.M.; Dezert, J.; Curt, C.; Batton-Hubert, M.; Chojnacki, E. How to manage natural risks in mountain areas in a context of imperfect information? New frameworks and paradigms for expert assessments and decision-making. Environ. Syst. Decis. 2014, 34, 288–311.
Giupponi, C.; Giove, S.; Giannini, V. A dynamic assessment tool for exploring and communicating vulnerability to floods and climate change. Environ. Model. Softw. 2013, 44, 136–147.
Kougkoulos, I.; Cook, S.; Jomelli, V.; Clarke, L.; Symeonakis, E.; Dortch, J.; Edwards, L.; Merad, M. Use of multi-criteria decision analysis to identify potentially dangerous glacial lakes. Sci. Total Environ. 2018, 621, 1453–1466.
Yalcin, M. A GIS-Based Multi-Criteria Decision Analysis Model for Determining Glacier Vulnerability. ISPRS Int. J. Geo-Inf. 2020, 9, 180.
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
Emmer, A.; Vilca, O.; Salazar Checa, C.; Li, S.; Cook, S.; Pummer, E.; Hrebrina, J.; Haeberli, W. Causes, consequences and implications of the 2023 landslide-induced Lake Rasac glacial lake outburst flood (GLOF), Cordillera Huayhuash, Peru. Nat. Hazards Earth Syst. Sci. 2025, 25, 1207–1228.
Piccinelli, S.; Cannone, N. Divergent responses of alpine rock glaciers to climate change: A review of ecological and abiotic dynamics. Permafr. Periglac. Process. 2025, 36, 438–450.
Figueira, J.; Greco, S.; Roy, B.; Slowinski, R. ELECTRE Methods: Main Features and Recent Developments. In Handbook of Multicriteria Analysis; Zopounidis, C., Pardalos, P., Eds.; Springer: Berlin, Germany, 2010; Volume 103, pp. 51–89.
Greco, S.; Ishizaka, A.; Tasiou, M.; Torrisi, G. The ordinal input for cardinal output approach of non-compensatory composite indicators: The PROMETHEE scoring method. Eur. J. Oper. Res. 2021, 288, 225–246.
Arya, A.; Singh, A. Multi criteria analysis for flood hazard mapping using GIS techniques: A case study of Ghaghara River basin in Uttar Pradesh, India. Arab. J. Geosci. 2021, 14, 656.
Wadadar, S.; Mukhopadhyay, B. GIS-based landslide susceptibility zonation and comparative analysis using Analytical Hierarchy Process and Conventional Weighting-based multivariate statistical methods in the Lachung River Basin, North Sikkim. Nat. Hazards 2022, 113, 1–38.
Nandy, S. Assessment of terrain stability zones for human habitation in Himalayan Upper Pindar River Basin, Uttarakhand using AHP and GIS. Environ. Earth Sci. 2021, 80, 356.
Roy, B.; Slowinski, R. Questions guiding the choice of a multicriteria decision aiding method. EURO J. Decis. Process. 2013, 1, 69–97.
Chen, Y.; Yu, J.; Khan, S. The spatial framework for weight sensitivity analysis in AHP-based multi-criteria decision making. Environ. Model. Softw. 2013, 48, 129–140.
Mazurek, J.; Strzalka, D. On the Monte Carlo weights in multiple criteria decision analysis. PLoS ONE 2022, 17, e0268950.
Oreskes, N.; Shrader-Frechette, K.; Belitz, K. Verification, validation, and confirmation of numerical models in the earth sciences. Science 1994, 263, 641–646.
Bennett, N.; Croke, B.F.W.; Guariso, G.; Guillaume, J.H.A.; Hamilton, S.H.; Jakeman, A.J.; Marsili-Libelli, S.; Newham, L.T.H.; Norton, J.P.; Perrin, C.; et al. Characterising performance of environmental models. Environ. Model. Softw. 2013, 40, 1–20.
Goodman, S.; Fanelli, D.; Ioannidis, J. What does research reproducibility mean? Sci. Transl. Med. 2016, 8, 341ps12.
Peng, R. Reproducible research in computational science. Science 2011, 334, 1226–1227.
Roy, B. Robustness in operational research and decision aiding: A multi-faceted issue. Eur. J. Oper. Res. 2010, 200, 629–638.
Saltelli, A.; Tarantola, S.; Campolongo, F.; Ratto, M. Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models; John Wiley & Sons, Ltd.: Chichester, UK, 2004.
French, S. The role of uncertainty in the use of MCDA for environmental decisions. J. Multi-Criteria Decis. Anal. 2011, 18, 127–139.
de Brito, M.; Evers, M. Multi-criteria decision-making for flood risk management: A survey of the current state of the art. Nat. Hazards Earth Syst. Sci. 2016, 16, 1019–1033.
Funtowicz, S.O.; Ravetz, J.R. Science for the post-normal age. Futures 1993, 25, 739–755.
Cooke, R.M.; Goossens, L.J.H. Procedures Guide for Structured Expert Judgment; Project Report; European Commission: Brussels, Belgium, 1999; Available online: https://rogermcooke.net/rogermcooke_files/PROCGEUR18820.pdf (accessed on 20 March 2026).
Lahdelma, R.; Hokkanen, J.; Salminen, P. SMAA—Stochastic multiobjective acceptability analysis. Eur. J. Oper. Res. 1998, 106, 137–143.
Tervonen, T.; Figueira, J.R. A survey on stochastic multicriteria acceptability analysis methods. J. Multi-Criteria Decis. Anal. 2008, 15, 1–14.
Veh, G.; Korup, O.; Walz, A. Hazard from Himalayan glacier lake outburst floods. Proc. Natl. Acad. Sci. USA 2020, 117, 907–912.
Mazhar, Y.; Atif, S.; Azmat, M.; Ahmad, S.; Ullah, F. Growing Glacial Lake Outburst Flood Risks in Ghizer District: A Karakoram Anomaly Region. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 7811–7828.
Azizi, F.; Lane, S.N. Classification and evaluation of dangerous glacial lakes in the Hindukush region of Afghanistan (HKA) using a multi-criteria approach. Geomat. Nat. Hazards Risk 2025, 16, 2571983.
Upadhyaya, A.; Rai, A.K. Glacial Lake Outburst Floods (GLOFs) Susceptibility in the Northwest Himalayas using AHP-TOPSIS and AHP-COPRAS. Environ. Sci. Pollut. Res. 2025, 32, 11126–11144.
Brđanin, E.; Gazdić, M.; Vujović, F.; Milanović, M.; Feratović, E. Assessing the geotourism potential of glacial lakes in Plav, Montenegro: A multi-criteria assessment by using the M-GAM model. Open Geosci. 2025, 17, 20250907.
Vashistha, A.; Shah, A.A.; Batmanathan, N.M.; Dashora, A. A data-driven multi-criteria model for GLOF risk in tectonically active Himalayan regions. Model. Earth Syst. Environ. 2025, 11, 33.
Shang, Y.; Sun, H.; Miao, G.; Wang, C.; Liu, J.; Zhang, W.; Yang, H.; Fu, H. Comprehensive susceptibility assessment of continental glacier ice avalanches: A case study of glaciers on the northwestern Tibetan Plateau. Landslides 2024, 22, 205–220.
Bibi, S.; Shafique, M.; Ali, N.; Nazneen, S.; Gul, R.; Shah, S.A.A. Seasonal evaluation of glacier dynamics and risk analysis using remote sensing techniques in the Buni Zom Valley, Chitral River Basin, Northern Pakistan. Environ. Earth Sci. 2025, 84, 116.
Tian, X.; Yao, X.; Zhou, Z. Distribution patterns and risk assessment of potential ice avalanches and glacier lake outburst floods in southeastern Tibetan Plateau. Ecol. Indic. 2025, 179, 114217.
Chhetri, P.K.; Li, Y.; Gharehchahi, S.; Bhardwaj, A. Monitoring glacial lake formation and GLOF hazard in Kanchenjunga conservation area of Nepal (2010–2023): A high-resolution approach. Phys. Chem. Earth Parts A/B/C 2025, 141, 104178.
Mankotia, S.; Ramiz, M.; Siddiqui, M.A.; Saini, R.; Bhagat, G. Evaluating GLOF Susceptibility of Potentially Dangerous Glacial Lakes Using AHP-Based Multi-Criteria Analysis in the Suru Sub-Basin, Western Himalayas (2010–2023): A high-resolution approach. SSRN Electron. J. 2025.
Vashistha, A.; Dashora, A.; Shah, A.A. Assessing glacial lake outburst flood risk in the Eastern Himalayas: A Bayesian neural network framework. Nat. Hazards 2025, 121, 21861–21890.
Afreen, M.; Haq, F.; Mark, B.G. Hazards profile of the Shigar Valley, Central Karakoram, Pakistan: Multicriteria hazard susceptibility assessment. AUC Geogr. 2024, 59, 77–92.
Rather, A.F.; Ahmed, R.; Bansal, J.; Mir, R.A.; Ahmed, P.; Malik, I.; Varade, D. Glacial lake outburst flood risk assessment of a rapidly expanding glacial lake in the Ladakh region of Western Himalaya, using hydrodynamic modeling. Geomat. Nat. Hazards Risk 2024, 15, 2413893.
Das, S.; Das, S.; Mandal, S.T.; Sharma, M.C.; Ramsankaran, R. Inventory and GLOF susceptibility of glacial lakes in Chenab basin, Western Himalaya. Geomat. Nat. Hazards Risk 2024, 15, 2356216.
Rafique, A.; Dasti, M.Y.S.; Ullah, B.; Awwad, F.A.; Ismail, E.A.A.; Saqib, Z.A. Snow Avalanche Hazard Mapping Using a GIS-Based AHP Approach: A Case of Glaciers in Northern Pakistan from 2012 to 2022. Remote Sens. 2023, 15, 5375.
Rawat, M.; Jain, S.K.; Ahmed, R.; Lohani, A.K. Glacial lake outburst flood risk assessment using remote sensing and hydrodynamic modeling: A case study of Satluj basin, Western Himalayas, India. Environ. Sci. Pollut. Res. 2023, 30, 41591–41608.
Nasiri Khiavi, A.; Vafakhah, M.; Sadeghi, S.H. Flood-based critical sub-watershed mapping: Comparative application of multi-criteria decision making methods and hydrological modeling approach. Stoch. Environ. Res. Risk Assess. 2023, 37, 2757–2775.
Zhang, T.; Wang, W.; An, B.; Wei, L. Enhanced glacial lake activity threatens numerous communities and infrastructure in the Third Pole. Nat. Commun. 2023, 14, 8250.
Singh, H.; Varade, D.; de Vries, M.V.W.; Adhikari, K.; Rawat, M.; Awasthi, S.; Rawat, D. Assessment of potential present and future glacial lake outburst flood hazard in the Hunza valley: A case study of Shisper and Mochowar glacier. Sci. Total Environ. 2023, 868, 161717.
Abdel-Fattah, D.; Danielson, M.; Ekenberg, L.; Hock, R.; Trainor, S.F. Application of a structured decision-making process in cryospheric hazard planning: Case study of Bering Glacier surges on local state planning in Alaska. J. Multi-Criteria Decis. Anal. 2024, 31, e1825.
Zhang, D.; Zhang, F.; Zheng, G.; Shi, X.; Nie, Y.; Cheng, G.; Yan, W. A robust glacial lake outburst susceptibility assessment approach validated by GLOF event in 2020 in the Nidu Zangbo Basin, Tibetan Plateau. Catena 2023, 220, 106734.
Maqsoom, A.; Aslam, B.; Khalil, U.; Kazmi, Z.A.; Azam, S.; Mehmood, T.; Nawaz, A. Landslide susceptibility mapping along the China Pakistan Economic Corridor (CPEC) route using multi-criteria decision-making method. Model. Earth Syst. Environ. 2022, 8, 1519–1533.
Zaz, S.; Romshoo, S. Landslide susceptibility assessment of Kashmir Himalaya, India. Arab. J. Geosci. 2022, 2022, 552. [Google Scholar] [CrossRef]
Rather, M.; Meraj, G.; Farooq, M.; Shiekh, B.; Kumar, P.; Kanga, S.; Singh, S.; Sahu, N.; Tiwari, S. Identifying the Potential Dam Sites to Avert the Risk of Catastrophic Floods in the Jhelum Basin, Kashmir, NW Himalaya, India. Water 2022, 14, 1538. [Google Scholar] [CrossRef]
Ahmed, R.; Ahmad, S.; Wani, G.; Mir, R.; Almazroui, M.; Bansal, J.; Ahmed, P. Glacial lake changes and the identification of potentially dangerous glacial lakes (PDGLs) under warming climate in the Dibang River Basin, Eastern Himalaya, India. Geocarto Int. 2022, 37, 17659–17685. [Google Scholar] [CrossRef]
Tempa, K. District flood vulnerability assessment using analytic hierarchy process (AHP) with historical flood events in Bhutan. PLoS ONE 2022, 17, e0270467. [Google Scholar] [CrossRef]
Aslam, A.; Rana, I.; Shah, S.; Mohuddin, G. Climate change and glacial lake outburst flood (GLOF) risk perceptions: An empirical study of Ghizer District, Gilgit-Baltistan Pakistan. Int. J. Disaster Risk Reduct. 2022, 83, 103392. [Google Scholar] [CrossRef]
Edamo, M.; Ukumo, T.; Lohani, T.; Ayana, M.; Ayele, M.; Makayno, Z.; Abdi, D. A Comparative Assessment of Multi-Criteria Decision-Making Analysis and Machine Learning Methods for Flood Susceptibility Mapping and Socio-economic Impacts on Flood Risk in Abela-Abaya Floodplain of Ethiopia. Environ. Chall. 2022, 9, 100629. [Google Scholar] [CrossRef]
Kumar, N.; Singh, S. Soil erosion assessment using earth observation data in a trans-boundary river basin. Nat. Hazards 2021, 107, 1–34. [Google Scholar] [CrossRef]
Mohanty, L.; Maiti, S. Probability of glacial lake outburst flooding in the Himalaya. Resour. Environ. Sustain. 2021, 5, 100031. [Google Scholar] [CrossRef]
Pudi, R.; Martha, T.R.; Roy, P.; Kumar, K.V.; Rao, P.R. Mesoscale seismic hazard zonation in the Central Seismic Gap of the Himalaya by GIS-based analysis of ground motion, site and earthquake-induced effects. Environ. Earth Sci. 2021, 80, 613. [Google Scholar] [CrossRef]
Hazra, P.; Krishna, A. AHP Based Assessment of GLOF Susceptibility of South Lhonak Glacial Lake, Sikkim Himalaya, India. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 5489–5492. [Google Scholar] [CrossRef]
Sonker, I.; Tripathi, J.; Singh, A. Landslide susceptibility zonation using geospatial technique and Analytical Hierarchy Process in Sikkim Himalaya. Quat. Sci. Adv. 2021, 4, 100039. [Google Scholar] [CrossRef]
Muneeb, F.; Baig, S.; Khan, J.; Khokhar, M. Inventory and GLOF Susceptibility of Glacial Lakes in Hunza River Basin, Western Karakorum. Remote Sens. 2021, 13, 1794. [Google Scholar] [CrossRef]
Veettil, B.; Kamp, U. Glacial Lakes in the Andes under a Changing Climate: A Review. J. Earth Sci. 2021, 32, 1575–1593. [Google Scholar] [CrossRef]
Rinzin, S.; Zhang, G.; Wangchuk, S. Glacial Lake Area Change and Potential Outburst Flood Hazard Assessment in the Bhutan Himalaya. Front. Earth Sci. 2021, 9, 775195. [Google Scholar] [CrossRef]
Khadka, N.; Chen, X.; Nie, Y.; Thakuri, S.; Zheng, G.; Zhang, G. Evaluation of Glacial Lake Outburst Flood Susceptibility Using Multi-Criteria Assessment Framework in Mahalangur Himalaya. Front. Earth Sci. 2021, 8, 601288. [Google Scholar] [CrossRef]
Moharrami, M.; Naboureh, A.; Nachappa, T.; Ghorbanzadeh, Z.; Guan, X.; Blaschke, T. National-Scale Landslide Susceptibility Mapping in Austria Using Fuzzy Best-Worst Multi-Criteria Decision-Making. Int. J. Geo-Inf. 2020, 9, 393. [Google Scholar] [CrossRef]
Liu, M.; Chen, N.; Zhang, Y.; Deng, M. Glacial Lake Inventory and Lake Outburst Flood/Debris Flow Hazard Assessment after the Gorkha Earthquake in the Bhote Koshi Basin. Water 2020, 12, 464. [Google Scholar] [CrossRef]
Nasir, M.J.; Iqbal, J.; Ahmad, W. Flash flood risk modeling of swat river sub-watershed: A comparative analysis of morphometric ranking approach and El-Shamy approach. Arab. J. Geosci. 2020, 13, 1082. [Google Scholar] [CrossRef]
Moragues, S.; Lenzano, M.; Lanfri, M.; Moreiras, S.; Lannutti, E.; Lenzano, L. Analytic Hierarchy Process applied to landslide susceptibility mapping of the North Branch of Argentino Lake, Argentina. Nat. Hazards 2020, 105, 915–945. [Google Scholar] [CrossRef]
Negi, H.; Kumar, A.; Rao, N.; Thakur, N.; Shekhar, M.; Snehmani. Susceptibility assessment of rainfall induced debris flow zones in Ladakh-Nubra region, Indian Himalaya. J. Earth Syst. Sci. 2019, 129, 30. [Google Scholar] [CrossRef]
Allen, S.; Zhang, G.; Wang, W.; Yao, T.; Bolch, T. Potentially dangerous glacial lakes across the Tibetan Plateau revealed using a large-scale automated assessment approach. Sci. Bull. 2019, 64, 435–445. [Google Scholar] [CrossRef]
Bhat, I.; Shafiq, M.; Ahmed, P.; Kanth, T. Multi-criteria evaluation for landslide hazard zonation by integrating remote sensing, GIS and field data in North Kashmir Himalaya, India. Environ. Earth Sci. 2019, 78, 613. [Google Scholar] [CrossRef]
Begam, S.; Sen, D. Mapping of moraine dammed glacial lakes and assessment of their areal changes in the central and Eastern Himalayas using satellite data. J. Mt. Sci. 2019, 16, 77–94. [Google Scholar] [CrossRef]
Ali, S.; Biermanns, P.; Haider, R.; Reicherter, K. Landslide susceptibility mapping by using GIS along the China-Pakistan economic corridor (Karakoram Highway), Pakistan. Nat. Hazards Earth Syst. Sci. 2019, 19, 999–1022. [Google Scholar] [CrossRef]
Sherpa, S.; Shrestha, M.; Eakin, H.; Boone, C. Cryospheric hazards and risk perceptions in the Sagarmatha (Mt. Everest) National Park and Buffer Zone, Nepal. Nat. Hazards 2019, 96, 607–626. [Google Scholar] [CrossRef]
Meena, S.; Mishra, B.; Piraillou, S. A Hybrid Spatial Multi-Criteria Evaluation Method for Mapping Landslide Susceptible Areas in Kullu Valley, Himalayas. Geosciences 2019, 9, 156. [Google Scholar] [CrossRef]
Kougkoulos, I.; Cook, S.; Edwards, L.; Clarke, L.; Symeonakis, E.; Dortch, J.; Nesbitt, K. Modelling glacial lake outburst flood impacts in the Bolivian Andes. Nat. Hazards 2018, 94, 1415–1438. [Google Scholar] [CrossRef]
Prakash, C.; Nagarajan, R. Outburst susceptibility assessment of moraine-dammed lakes in Western Himalaya using an Analytic Hierarchy Process. Earth Surf. Process. Landf. 2017, 42, 2306–2321. [Google Scholar] [CrossRef]
Kanwal, S.; Atif, S.; Shafiq, M. GIS based landslide susceptibility mapping of northern areas of Pakistan, a case study of Shigar and Shyok Basins. Geomat. Nat. Hazards Risk 2017, 8, 348–366. [Google Scholar] [CrossRef]
Kumar, S.; Srivastava, P.K.; Snehmani. Geospatial Modelling and Mapping of Snow Avalanche Susceptibility. J. Indian Soc. Remote Sens. 2018, 46, 109–119. [Google Scholar] [CrossRef]
Sivakumar, R.; Ghosh, S. Earthquake hazard assessment through geospatial model and development of EaHaAsTo tool for visualization: An integrated geological and geoinformatics approach. Environ. Earth Sci. 2017, 76, 442. [Google Scholar] [CrossRef]
Cuellar, A.; McKinney, D. Decision-Making Methodology for Risk Management Applied to Imja Lake in Nepal. Water 2017, 9, 591. [Google Scholar] [CrossRef]
Wang, W.; Xiang, Y.; Gao, Y.; Lu, A.; Yao, T. Moraine-dammed lake distribution and outburst flood risk in the Chinese Himalaya. J. Geogr. Sci. 2015, 25, 563–578. [Google Scholar]

Figure 1. PRISMA 2020 flow diagram illustrating the identification, screening, eligibility assessment, and inclusion of studies in the systematic review.

Figure 2. Temporal distribution of the included studies (2015–2025).

Figure 3. Cumulative number of included studies over time (2015–2025).

Figure 4. Distribution of hazard types addressed in the reviewed studies.

Figure 5. Distribution of MCDA techniques used in glacier hazard assessment studies.

Figure 6. Technological environments used to implement MCDA in glacier hazard studies.

Figure 7. Distribution of weighting strategies and uncertainty treatment approaches in the reviewed studies. Most works rely on expert-based weighting, typically implemented through AHP pairwise comparison, while explicit modelling of uncertainty (e.g., stochastic or fuzzy approaches) remains relatively uncommon.

Figure 8. Distribution of methodological quality scores across the reviewed studies according to the MCDA-HAZARD framework. The results indicate that most studies adequately report criteria selection and weighting procedures, whereas validation and uncertainty analysis are substantially underreported.

Figure 9. Four visualizations of quantitative validation practices in glacier hazard MCDA studies (2015–2025): (a) yearly validation rates with sample size annotations; (b) bar chart by year showing validation percentages and sample sizes; (c) pooled periods demonstrating stable validation rates (maximum difference 3.0%); (d) bubble plot where bubble size represents the number of studies per year, with colour indicating the validation rate and the dashed line showing a negligible trend (slope = 0.12). The overall validation rate (35.0%) is shown as a dashed or dotted line in all panels.

Figure 10. Map showing geographical distribution of glacier hazard MCDA case studies.

Figure 11. Number of reviewed MCDA glacier hazard studies by country. The distribution shows a strong concentration of applications in the Himalayan–Karakoram–Hindu Kush region, with limited representation from other cryospheric regions worldwide.

Figure 12. The reliability gap: current MCDA practice exhibits high operational uptake but limited evidential support. Closing this gap requires moving toward the upper-right quadrant where methodological adoption is matched by validation, uncertainty quantification, and reproducibility.

Table 1. Comparison of existing review studies on MCDA applications in natural hazard assessment and positioning of this study.

Reference	Domain	Review Type	Primary Purpose	Validation Assessment	Uncertainty Assessment	Methodological Reliability
[36]	Flood risk	Systematic review	MCDA methods inventory	No	No	No
[4]	Landslides	Narrative review	AHP applications	No	Limited	No
[14]	Natural hazards (general)	Conceptual review	Decision framework	No	Conceptual only	No
[16]	Cryospheric hazards	Case-oriented review	Hazard mapping approaches	No	Limited	No
This study	Glacier hazards	PRISMA 2020 SLR	Methodological practice, weighting, uncertainty, validation	Yes (cross-study)	Yes (cross-study)	Yes

Note: This table demonstrates the unique contribution of the present review—the first systematic assessment of methodological reliability in glacier hazard MCDA applications.

Table 2. Search terms.

No.	Terms
1	multicriteria, multi-criteria, multiple criteria, MCDA, multiple criteria decision
2	hazard, vulnerability, susceptibility, risk
3	glacier, glacial

Table 3. MCDA-HAZARD methodological quality assessment framework.

Domain	Evaluation Criteria	Score (0–2)
Criteria Definition	Justification of selected hazard factors; expert consultation or literature support reported	0–2
Weighting Transparency	Weighting method documented; pairwise comparison matrix, rationale, or consistency verification reported	0–2
Uncertainty Analysis	Sensitivity analysis, alternative scenarios, fuzzy or probabilistic modelling performed	0–2
Validation	Independent hazard inventory, statistical metrics (ROC/AUC/accuracy), or temporal validation applied	0–2
Reproducibility	Data sources cited; parameters disclosed; workflow sufficiently described for replication	0–2

Note: Each domain scored 0 (not reported), 1 (partially addressed), or 2 (clearly implemented and documented). Total score range: 0–10.
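The scoring rule in the note above can be sketched in a few lines. This is a minimal illustration, assuming one integer score per domain; the function name, the dictionary keys, and the example study are hypothetical and are not taken from the review's extraction sheet.

```python
# Domain keys are illustrative labels for the five MCDA-HAZARD domains in Table 3.
DOMAINS = (
    "criteria_definition",
    "weighting_transparency",
    "uncertainty_analysis",
    "validation",
    "reproducibility",
)


def total_quality_score(scores: dict) -> int:
    """Sum the five domain scores (each 0, 1, or 2) into a 0-10 total."""
    missing = set(DOMAINS) - set(scores)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for domain in DOMAINS:
        if scores[domain] not in (0, 1, 2):
            raise ValueError(f"{domain} must be 0, 1, or 2")
    return sum(scores[domain] for domain in DOMAINS)


# Hypothetical study: well-justified criteria, partial weighting detail,
# no uncertainty analysis, qualitative validation only, full data disclosure.
example_study = {
    "criteria_definition": 2,
    "weighting_transparency": 1,
    "uncertainty_analysis": 0,
    "validation": 1,
    "reproducibility": 2,
}
print(total_quality_score(example_study))  # 6
```

Rejecting missing or out-of-range scores keeps a partially coded study from silently lowering the total.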

Table 4. Summary statistics of methodological quality across the 60 reviewed studies using the MCDA-HAZARD framework.

Quality Domain	Mean Score (0–2)	Standard Deviation	% Scoring 2 (Fully Reported)
Criteria Definition	1.57	0.50	58.3%
Weighting Transparency	1.13	0.64	35.0%
Uncertainty Analysis	0.23	0.50	5.0%
Validation	1.12	0.72	38.3%
Reproducibility	1.57	0.50	56.7%
Total Score (0–10)	5.62	2.02	-

Note: Scores assigned according to the MCDA-HAZARD protocol described in Section 3.9. Standard deviations calculated from the full dataset of 60 studies.

Table 5. Cross-tabulation of hazard focus and primary MCDA method in the reviewed studies. Note: “Other” includes hybrid approaches, MCE, weighted sum, heuristic methods, and unspecified MCDA frameworks. Total sums to 60 studies.

Hazard Focus	AHP	Fuzzy AHP	BWM	TOPSIS	Other
GLOF only	13	0	0	0	11
Landslide only	11	0	1	0	3
Multi-hazard	12	3	1	1	4
Total	36	3	2	1	18

Table 6. Proportion of MCDA methods by hazard focus (row percentages).

Hazard Focus	AHP (%)	Fuzzy AHP (%)	BWM (%)	TOPSIS (%)	Other (%)
GLOF only (n = 24)	54.2	0.0	0.0	0.0	45.8
Landslide only (n = 15)	73.3	0.0	6.7	0.0	20.0
Multi-hazard (n = 21)	57.1	14.3	4.8	4.8	19.0

Note: Percentages sum to 100% across each row. The table demonstrates that AHP predominates across all hazard categories, with minimal variation by hazard type.
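The row percentages in Table 6 follow directly from the counts in Table 5. The sketch below recomputes them. The counts are copied from Table 5, while the variable and function names are illustrative only.

```python
# Counts per hazard focus and primary MCDA method, as reported in Table 5.
counts = {
    "GLOF only": {"AHP": 13, "Fuzzy AHP": 0, "BWM": 0, "TOPSIS": 0, "Other": 11},
    "Landslide only": {"AHP": 11, "Fuzzy AHP": 0, "BWM": 1, "TOPSIS": 0, "Other": 3},
    "Multi-hazard": {"AHP": 12, "Fuzzy AHP": 3, "BWM": 1, "TOPSIS": 1, "Other": 4},
}


def row_percentages(row: dict) -> dict:
    """Express each method count as a percentage of its row total (1 decimal)."""
    total = sum(row.values())
    return {method: round(100 * n / total, 1) for method, n in row.items()}


for hazard, row in counts.items():
    print(hazard, f"(n = {sum(row.values())})", row_percentages(row))
```

Because each cell is rounded independently, a row can in principle sum to 99.9 or 100.1; for these counts the rounded values happen to sum to exactly 100.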

Table 7. Methodological concentration of MCDA techniques in glacier hazard assessment.

Method Category	Number of Studies	Proportion (%)
Pure AHP	36	60.0%
AHP variants (fuzzy AHP, AHP-TOPSIS, AHP-Entropy, etc.)	12	20.0%
Non-AHP methods (BWM, TOPSIS alone, DEA, SMAA)	6	10.0%
Unspecified/hybrid approaches	6	10.0%
Total	60	100%

Note: AHP-based approaches (pure + variants) account for 80% of all studies (48/60), demonstrating strong methodological standardisation.

Table 8. Quantitative validation of MCDA-based glacier hazard studies by publication year.

Year	Total Studies	Validated Studies	Validation Rate (%)
2015	1	0	0.0
2017	5	2	40.0
2018	2	1	50.0
2019	7	3	42.9
2020	5	2	40.0
2021	11	3	27.3
2022	8	3	37.5
2023	7	2	28.6
2024	3	0	0.0
2025	11	5	45.5
Total	60	21	35.0%

Note: Quantitative validation is defined strictly as the reporting of statistical performance metrics (ROC/AUC, accuracy, or a confusion matrix). Internal AHP consistency ratios were not counted as validation.
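To illustrate why a consistency ratio is not counted as validation, the sketch below computes AHP weights and the consistency ratio from a pairwise comparison matrix using the standard principal-eigenvector formulation, with CI = (λmax − n)/(n − 1) and CR = CI/RI. The example matrix is hypothetical, and the random index values are the commonly tabulated ones attributed to Saaty. Note that no observed hazard events enter the calculation at any point.

```python
import numpy as np

# Commonly tabulated random consistency index (RI) values for matrix orders 1-9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights_and_cr(matrix):
    """Return (weights, consistency ratio) for a reciprocal pairwise matrix."""
    a = np.asarray(matrix, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = int(np.argmax(eigvals.real))          # index of the principal eigenvalue
    lambda_max = eigvals[k].real
    weights = np.abs(eigvecs[:, k].real)      # principal eigenvector, sign-fixed
    weights = weights / weights.sum()         # normalise so weights sum to 1
    ri = RANDOM_INDEX[n]
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Hypothetical expert judgements over three criteria.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, cr = ahp_weights_and_cr(pairwise)
print(weights.round(3), round(cr, 4))
```

A low ratio only indicates that the expert's pairwise judgements are internally coherent; it says nothing about whether the resulting map agrees with an independent event inventory.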

Table 9. Quantitative validation rates by publication period.

Publication Period	Total Studies	Validated Studies	Validation Rate (%)
2015–2018	8	3	37.5%
2019–2021	23	8	34.8%
2022–2025	29	10	34.5%
Total	60	21	35.0%

Note: Validation rates have remained essentially stable over the decade, showing no sustained improvement in validation practice despite an increased publication volume.
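The pooled rates in Table 9 can be recomputed from the yearly counts in Table 8. The sketch below does that; the yearly figures are copied from Table 8, and the function and variable names are illustrative.

```python
# year: (total studies, studies with quantitative validation), from Table 8.
yearly = {
    2015: (1, 0), 2017: (5, 2), 2018: (2, 1), 2019: (7, 3), 2020: (5, 2),
    2021: (11, 3), 2022: (8, 3), 2023: (7, 2), 2024: (3, 0), 2025: (11, 5),
}


def pooled_rate(data: dict, start: int, end: int):
    """Pool counts over [start, end] and return (total, validated, rate in %)."""
    total = sum(t for year, (t, _) in data.items() if start <= year <= end)
    validated = sum(v for year, (_, v) in data.items() if start <= year <= end)
    return total, validated, round(100 * validated / total, 1)


for start, end in [(2015, 2018), (2019, 2021), (2022, 2025), (2015, 2025)]:
    print(f"{start}-{end}:", pooled_rate(yearly, start, end))
```

Pooling matters here because several years contain very few studies (for example, one in 2015 and three in 2024), so single-year rates swing widely without indicating a real trend.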

Table 10. Relationship between validation practice and methodological quality scores.

Validation Type	N	Mean Quality Score	Range
Quantitative validation (ROC/AUC, accuracy, metrics)	23	6.8	5–9
Qualitative comparison only (historical events, visual)	26	5.2	2–8
No validation procedure reported	11	3.5	1–5
All studies	60	5.62	1–9

Note: Studies reporting quantitative validation achieve substantially higher quality scores, suggesting that validation is associated with more rigorous methodological practice overall.

Table 11. Regional distribution of glacier hazard MCDA case studies.

Region	Number of Studies	Proportion (%)
High Mountain Asia (Himalaya-Karakoram-Hindu Kush, Tibetan Plateau)	48	80.0%
Andes (South America)	4	6.7%
Europe (Alps and surrounding mountain systems)	4	6.7%
North America	1	1.7%
Other regions	3	5.0%
Total	60	100%

Note: Other regions include Ethiopia (1), Global/Andes (1), and Iran (1)—note that Iran was counted in Europe for this classification, based on the study focus.

Table 12. Methodological quality scores by geographical region.

Region	N	Mean Quality Score	Studies with Quantitative Validation (%)
High Mountain Asia	48	5.6	37.5%
Andes (South America)	4	4.3	25.0%
Europe	4	6.5	50.0%
North America	1	6.0	0.0%
Other regions	3	5.7	33.3%
All studies	60	5.62	38.3%

Note: European studies show the highest mean quality scores and validation rates, though the sample sizes are small. The concentration of studies in High Mountain Asia (80%) limits geographical generalizability.

Table 13. Summary of key quantitative findings from the included studies (2015–2025, n = 60).

Finding	Quantitative Evidence	Source
Included evidence base	A total of 60 studies met the inclusion criteria (2015–2025).	Section 3.3
Dominant MCDA method	AHP is the primary decision method in 36/60 studies (60.0%). AHP-based approaches (pure + variants) account for 48/60 studies (80.0%).	Table 7
Method selection is weakly hazard-dependent	AHP used across all hazard domains: GLOF-only (54.2%), landslide-only (73.3%), multi-hazard (57.1%). Maximum inter-domain difference: 19.1 percentage points.	Table 6
Limited methodological diversity	Non-AHP methods appear in only 6/60 studies (10.0%): fuzzy AHP (3), BWM (2), TOPSIS (1). Unspecified or hybrid approaches account for 6/60 (10.0%).	Table 7
Predictive validation is uncommon	Only 21/60 studies (35.0%) report quantitative predictive validation metrics; 39/60 (65.0%) rely on qualitative comparison, historical matching, or no validation.	Table 8
No sustained improvement in validation over time	Validation rates remain stable when pooled: 37.5% (2015–2018), 34.8% (2019–2021), 34.5% (2022–2025). Maximum inter-period difference: 3.0 percentage points.	Table 9, Figure 9
Strong geographical concentration	High Mountain Asia accounts for 48/60 studies (80.0%); Andes 4/60 (6.7%); Europe 4/60 (6.7%); North America 1/60 (1.7%); other regions 3/60 (5.0%).	Table 11
Quality varies by validation status	Studies with quantitative validation achieve higher mean quality scores (6.8) than those with qualitative only (5.2) or no validation (3.5).	Table 10
Quality varies by MCDA method	Non-AHP methods achieve highest mean quality (6.8), followed by AHP variants (6.3), pure AHP (5.6), and unspecified/hybrid approaches (4.0).	Table 7
Lowest-scoring domain	Uncertainty analysis is the least developed domain: mean score 0.23/2.0, with only 3/60 studies (5.0%) fully reporting uncertainty assessment.	Table 4

Note: Quantitative validation is defined strictly as the reporting of statistical performance metrics (ROC/AUC, accuracy, or a confusion matrix). Internal AHP consistency ratios were not counted as validation. Europe includes Austria, Montenegro, Iran, and Turkey.

Table 14. Summary of synthesis findings by research question (cross-study methodological patterns and implications).

RQ	Evidence Observed Across Studies	Interpretation (What the Pattern Suggests)	Implications (Why It Matters)
RQ1	AHP dominates MCDA implementations (60% pure AHP, 80% including variants); most applications combine AHP-derived weights with weighted overlay in GIS; alternative MCDA families (e.g., outranking, stochastic MCDA) appear in only 10% of studies; method choice is independent of hazard type (Table 5).	Method selection is shaped by methodological convenience, familiarity, and GIS compatibility rather than by systematic matching between decision context and model assumptions.	A methodological monoculture may constrain the analytical space and embed untested assumptions (e.g., compensatory aggregation), affecting the credibility and transferability of hazard classifications.
RQ2	Weights are almost always expert-derived (pairwise comparison, expert scoring, literature-based judgement); objective or data-driven weighting is uncommon; only 5% of studies perform systematic sensitivity or robustness checks on weights (Table 4).	Weighting is typically treated as an operational step rather than a testable modelling hypothesis; the epistemic basis of weights is often under-specified.	Outputs can be highly sensitive to subjective weight choices, reducing reproducibility and potentially leading to unstable prioritisation in high-stakes settings (e.g., GLOF mitigation planning).
RQ3	Uncertainty is seldom explicitly modelled (mean score 0.23/2.0; only 5% fully report uncertainty analysis); validation is inconsistent—only 35% report quantitative metrics, and this rate has not improved over time (Table 4); internal consistency ratios are often conflated with predictive validity.	Many MCDA-derived hazard maps function as decision-support artefacts under assumptions, not as independently validated predictive products.	Without robustness and validation, decision-makers may over-trust single-map outputs; credibility, policy uptake, and risk communication may be undermined when results are interpreted as “ground truth.”
RQ4	Methodological extensions exist but are fragmented: hybrid MCDA (e.g., AHP–TOPSIS), fuzzy MCDA, probabilistic modelling, ML-assisted pipelines, scenario analysis; few comparative studies apply multiple MCDA models to the same dataset.	Innovation is present but not consolidated into routine practice; barriers include data availability, tooling, evaluation effort, and the lack of standard reporting for robustness.	Priority directions include: (i) systematic sensitivity analysis of weights, (ii) head-to-head comparison of alternative MCDA methods on shared datasets, (iii) explicit uncertainty quantification, and (iv) hybrid/ensemble MCDA frameworks to reduce dependence on single-model assumptions.

Table 15. Diagnosis and prescription: key findings and their implications for research and practice.

Aspect	What We Found	What Should Change
Method selection	80% of studies use AHP-based approaches; method choice independent of hazard type	Move from convenience to evidence: comparative studies across MCDA methods on identical datasets
Weighting practice	Expert-derived weights treated as inputs, not assumptions; sensitivity analysis rare (only 5% fully report uncertainty)	Treat weights as assumptions to be tested; systematic weight variation; multi-scenario sensitivity analysis
Validation	Only 35% report quantitative validation; rates stagnant over decade (37.5% in 2015–2018, 34.5% in 2022–2025)	Validation as core activity; independent event databases; temporal back-testing; cross-regional transferability tests
Uncertainty	Mean score 0.23/2.0; only 5% fully report uncertainty analysis; lowest-scoring domain	Uncertainty as object of analysis; probabilistic methods (SMAA); Bayesian elicitation; ensemble approaches
Geographical scope	80% of studies in High Mountain Asia; Andes (6.7%), Europe (6.7%), North America (1.7%)	Test transferability; build capacity in under-represented regions; avoid universalising regional norms
Epistemic status	Maps interpreted as measurements but are formalised expert interpretations	Reframe MCDA as decision support, not prediction; acknowledge interpretive character; communicate uncertainty

Note: Based on analysis of 60 studies (2015–2025). SMAA = Stochastic Multi-criteria Acceptability Analysis.
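The prescription to treat weights as assumptions to be tested can be illustrated with a small Monte Carlo experiment. The sketch below perturbs a set of expert weights by sampling from a Dirichlet distribution centred on them and records how often each alternative ranks first under a weighted sum. It is loosely in the spirit of SMAA rank acceptability and is not a full SMAA implementation. The lake scores, the base weights, and the `concentration` parameter are hypothetical values chosen for illustration.

```python
import numpy as np


def first_rank_acceptability(scores, base_weights, concentration=50.0,
                             n_draws=5000, seed=0):
    """Share of weight draws in which each alternative has the top weighted sum."""
    scores = np.asarray(scores, dtype=float)       # (n_alternatives, n_criteria)
    base = np.asarray(base_weights, dtype=float)
    base = base / base.sum()                       # normalise expert weights
    rng = np.random.default_rng(seed)
    # Dirichlet draws are non-negative, sum to 1, and have mean equal to `base`;
    # a larger concentration keeps the draws closer to the expert weights.
    draws = rng.dirichlet(concentration * base, size=n_draws)
    totals = draws @ scores.T                      # (n_draws, n_alternatives)
    winners = totals.argmax(axis=1)
    return np.bincount(winners, minlength=scores.shape[0]) / n_draws


# Hypothetical normalised criterion scores (0-1) for three glacial lakes.
lake_scores = [
    [0.9, 0.2, 0.6, 0.4],   # lake A
    [0.5, 0.8, 0.5, 0.6],   # lake B
    [0.3, 0.4, 0.9, 0.7],   # lake C
]
expert_weights = [0.4, 0.3, 0.2, 0.1]

baseline = np.asarray(lake_scores) @ np.asarray(expert_weights)
print("baseline scores:", baseline.round(2))
print("first-rank share:", first_rank_acceptability(lake_scores, expert_weights))
```

With the unperturbed weights, lake B scores 0.60 against 0.58 for lake A, a margin small enough that modest changes in the weights can reverse the ranking. Reporting first-rank shares alongside a single map makes that fragility visible.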

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

Share and Cite

MDPI and ACS Style

Gacitua, R.; Pereira, J.; Astudillo, H.; Taramasco, C.; Contreras, P. When Hazard Maps Are Not Predictions: A Critical Assessment of MCDA in Glacier Hazard Susceptibility. ISPRS Int. J. Geo-Inf. 2026, 15, 245. https://doi.org/10.3390/ijgi15060245


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
