Binary Logistic Regression Outperforms Decision Tree Modeling for Event-Based Landslide Prediction: Application to Dynamic Hazard and Threshold Mapping in Central Italy
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript presents a significant contribution to landslide disaster risk management. Its primary strength lies in the transition from static susceptibility mapping to event-based hazard assessment, specifically by reconstructing landslide activation dates using remote sensing indices. The development of a spatially distributed threshold map by inverting the logistic regression equation is particularly innovative and offers a practical tool for real-time early warning systems. I have some comments about the manuscript that could be considered to clarify some aspects.
- With respect to the assertion that Binary Logistic Regression (BLR) is superior to the QUEST decision tree algorithm, primarily because QUEST suffered from a severe class imbalance, leading to very low sensitivity, I think that the comparison could be slightly biased. The authors should briefly explain why they did not apply data-balancing techniques to QUEST or acknowledge that with such techniques, the performance gap between the two models might narrow. This would provide a more balanced comparison.
- Please explain in more detail why the model identifies the 0°–16° slope class as having the highest susceptibility, while steeper slopes (16°–25°) are identified as stabilizing factors. Although the authors state that translational landslides and flows are predominant, it would be useful to compare how many landslides occurred on steeper slopes and how many on gentler slopes in real cases in Italy. Please give a more detailed physical explanation of this phenomenon, because according to worldwide statistics, flows and translational landslides occur on steeper slopes. It is important to clarify the relationship between slope and lithological thickness. I agree that conditions in the Italian context are probably different, but this must be explained.
Author Response
- With respect to the assertion that Binary Logistic Regression (BLR) is superior to the QUEST decision tree algorithm, primarily because QUEST suffered from a severe class imbalance, leading to very low sensitivity, I think that the comparison could be slightly biased. The authors should briefly explain why they did not apply data-balancing techniques to QUEST or acknowledge that with such techniques, the performance gap between the two models might narrow. This would provide a more balanced comparison.
Response: We thank the reviewer for this important methodological observation. The primary objective of this study was not a generic benchmark between BLR and QUEST, but rather to identify the most operationally suitable model for the specific characteristics of our dataset, namely a pronounced class imbalance inherent to landslide observations. In this context, BLR was selected precisely because it handles minority-class probability estimation natively through the logistic link function, without requiring preprocessing. Data-balancing techniques such as SMOTE or cost-sensitive learning could indeed improve QUEST's sensitivity; however, they introduce additional methodological choices (oversampling ratio, synthetic sample generation) that reduce reproducibility and can inflate performance metrics on resampled training data. Nonetheless, we acknowledge that with such techniques, the performance gap might narrow. We have added a brief clarification in Section 3.3 to this effect, noting that the comparison reflects the unprocessed data conditions and that data-balancing approaches could be explored in future work.
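For readers unfamiliar with the issue under discussion, the following is a minimal sketch of why a classifier trained on imbalanced data can report high accuracy yet very low sensitivity, and of the inverse-frequency class weights commonly used in cost-sensitive learning. All counts and predictions are hypothetical and are not taken from the manuscript; the function names are illustrative only and do not describe the authors' procedure or any particular QUEST implementation.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * n_samples_in_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


# Hypothetical sample: 50 landslide cells (1) among 1000 cells in total.
y_true = [1] * 50 + [0] * 950
# Hypothetical classifier that detects only 5 of the 50 landslide cells.
y_pred = [1] * 5 + [0] * 45 + [0] * 950

sensitivity, specificity, accuracy = confusion_metrics(y_true, y_pred)
weights = balanced_class_weights(y_true)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} accuracy={accuracy:.3f}")
print(f"class weights: {weights}")
```

In this toy case the accuracy looks high only because stable cells dominate the sample, while the weight assigned to the landslide class shows how strongly a cost-sensitive scheme would have to penalize missed landslides to compensate.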
Changes made: A sentence has been added at the end of Section 3.3 acknowledging that the application of data-balancing techniques to QUEST was not pursued, and that such an approach could potentially reduce the observed performance gap.
- Please explain in more detail why the model identifies the 0°–16° slope class as having the highest susceptibility, while steeper slopes (16°–25°) are identified as stabilizing factors. Although the authors state that translational landslides and flows are predominant, it would be useful to compare how many landslides occurred on steeper slopes and how many on gentler slopes in real cases in Italy. Please give a more detailed physical explanation of this phenomenon, because according to worldwide statistics, flows and translational landslides occur on steeper slopes. It is important to clarify the relationship between slope and lithological thickness. I agree that conditions in the Italian context are probably different, but this must be explained.
Response: This is an excellent point and we agree that a more thorough physical explanation is warranted. The apparent paradox of gentler slopes being more susceptible is well-documented in the Italian Apennine context and is primarily attributable to lithological and geomorphological factors. In the study area, the dominant lithological units are clay-rich formations (marly clays, clays with sandstone intercalations), which favor shallow translational slides and earth flows. These failure mechanisms are classically associated with lower slope angles for two main reasons: (1) clay-rich soils on gentle slopes accumulate and retain greater quantities of water, facilitating high pore-water pressure build-up during intense precipitation; and (2) the thickness of the unstable colluvial and weathering mantle, which is the primary failure material for flows and translational slides, tends to be greater on lower-gradient slopes where deposition prevails over erosion. In contrast, on steeper slopes (16°–25°) in this area, the bedrock is more frequently exposed or covered by thinner soils, reducing the availability of failure material. Furthermore, the IFFI database for this region confirms that translational slides and flows — the dominant mechanisms here — preferentially initiate on slopes between 5° and 20°. We have expanded Section 3.4 to include this physical explanation.
Changes made: Section 3.4 has been expanded with a more detailed physical discussion of the slope susceptibility inversion, including reference to lithological thickness, dominant failure mechanisms, and supporting evidence from the IFFI database for central Italy.
-
Response: We fully agree with this important observation. The three calibration events all occurred in March (late winter), a period characterized by elevated antecedent soil moisture and prolonged stratiform rainfall, which are distinct from summer convective storms or prolonged autumnal precipitation. A sentence has been added to the Conclusions section explicitly stating the season-specific nature of the derived thresholds and the need for extension to other seasonal conditions. This is also consistent with what is already noted in the Limitations section.
Changes made: A sentence has been added at the end of the Conclusions section stating that the derived thresholds are currently season-specific and require further validation before application to other seasonal rainfall regimes.
-
Response: We thank the reviewer for this observation. The figures in the submitted version were compressed during file preparation. We will provide high-resolution versions (minimum 300 dpi, in TIFF or high-quality JPEG format) for all four figures in the final submission. However, if the comment refers to the visible subdivision into pixels, this is due to the spatial resolution of the Landsat imagery.
-
Response: We thank the reviewer for carefully identifying these errors. In Equation (1), "e = esponential of a coefficient" has been corrected to read "e = base of the natural logarithm (Euler's number, ≈ 2.718)". The duplicate sentence in Section 2.4 has been removed. The entire manuscript has been carefully proofread for additional typographical and grammatical errors.
Reviewer 2 Report
Comments and Suggestions for Authors
1. The abstract outlines the core research content but does not clearly articulate the engineering application value and practical significance of the findings. It is recommended to supplement the abstract with a discussion of the research's applicability and reference value in relevant engineering scenarios.
2. There are irregularities in the labeling and formatting of some figures (e.g., Figures 1 and 3). They only include basic legends and titles but lack axis labels, precise scale descriptions, and a defined geographical scope. Additionally, the citation of some formulas is confusing (e.g., Formulas 3 and 4 cite only the author without the year, and some formulas lack citations altogether). Furthermore, certain symbols (e.g., 、) are not defined upon their first appearance, which does not conform to the basic norms for writing and citing formulas in academic papers.
3. This study estimates soil saturation using the pedotransfer functions and simple water balance model from Cosby et al. (1984). However, it fails to elaborate on the underlying assumptions of these models and does not verify or calibrate the calculated results with in-situ measured soil moisture data from the study area. Moreover, the analysis only considers soil moisture from the 15 days prior to a precipitation event, without examining the impact of different time scales (e.g., 7-day, 30-day antecedent conditions) on soil saturation and the subsequent landslide prediction results.
4. The rationale for selecting certain key parameters is not sufficiently detailed. Section 3.5 classifies susceptibility into three categories: low (P < 0.3), medium (0.3 ≤ P < 0.7), and high (P ≥ 0.7). It only states that the 0.7 threshold is determined by Youden's J and the 0.3 threshold is based on the cumulative distribution of non-landslide pixels. It is suggested to provide the theoretical or practical basis for the selection of these parameters to enhance the rigor of the research.
5. The conclusions lack sufficient depth in synthesizing the research findings. The core results are presented merely as data statements, without interpreting the underlying phenomena through the lens of geological mechanisms. Some conclusions are simply a restatement of the experimental data and are not elevated by connecting them to the methodological mechanisms. It is recommended that the conclusions be further refined in the context of the entire research process to better highlight the core discoveries and scientific value of the study.
6. There are a few linguistic and grammatical issues in the thesis. For instance, the summary contains spelling errors, and Section 2.4 has formatting issues with bullet points. Some sentences are overly colloquial, and the logical flow of several long sentences is not coherent. It is recommended to carefully proofread and polish the full text to ensure the standardization and accuracy of academic expression.
Author Response
- The abstract outlines the core research content but does not clearly articulate the engineering application value and practical significance of the findings. It is recommended to supplement the abstract with a discussion of the research's applicability and reference value in relevant engineering scenarios.
Response: We agree that the abstract could be strengthened by more explicitly highlighting the operational relevance of the work. The abstract has been revised to include a statement on the direct applicability of the threshold map as a decision-support tool for civil protection agencies, land-use planning authorities, and early warning system operators. Specifically, we now emphasize that the spatially distributed threshold map quantifies location-specific rainfall intensities required to exceed a critical landslide initiation probability, providing an immediately implementable input for real-time alert systems.
Changes made: The abstract has been revised to more explicitly articulate the engineering application value and practical significance, with particular emphasis on the operational utility of the threshold map for early warning and civil protection.
- There are irregularities in the labeling and formatting of some figures (e.g., Figures 1 and 3). They only include basic legends and titles but lack axis labels, precise scale descriptions, and a defined geographical scope. Additionally, the citation of some formulas is confusing (e.g., Formulas 3 and 4 cite only the author without the year, and some formulas lack citations altogether). Furthermore, certain symbols (e.g., 、) are not defined upon their first appearance, which does not conform to the basic norms for writing and citing formulas in academic papers.
Response: We thank the reviewer for these observations regarding scientific presentation standards. Regarding figures: Figures 1 and 3 will be revised to include proper scale bars, coordinate reference system information, and, where appropriate, axis labels and defined geographical scope (bounding coordinates or administrative boundaries). Regarding formula citations: Formulas 3 and 4 now include the full citation (author and year). All other equations have been checked, and citations have been added where missing. Regarding symbol definitions: All mathematical symbols are now explicitly defined at their first appearance in the text. In particular, ψ (soil water potential), θ (volumetric water content), ψₑ (air entry potential), θₛ (saturated water content), and related symbols are defined in Section 2.6 immediately following their introduction.
Changes made: Figures 1 and 3 will be updated with scale bars and geographic scope information. Formula citations have been completed with author-year format. Symbol definitions have been added at first appearance for all mathematical quantities in Section 2.6.
- This study estimates soil saturation using the pedotransfer functions and simple water balance model from Cosby et al. (1984). However, it fails to elaborate on the underlying assumptions of these models and does not verify or calibrate the calculated results with in-situ measured soil moisture data from the study area. Moreover, the analysis only considers soil moisture from the 15 days prior to a precipitation event, without examining the impact of different time scales (e.g., 7-day, 30-day antecedent conditions) on soil saturation and the subsequent landslide prediction results.
Response: This is an important methodological concern that we address in two parts. First, regarding model assumptions: Section 2.6 has been expanded to explicitly state the key assumptions underlying the Cosby et al. (1984) pedotransfer functions, including: (i) soil texture fractions (sand, silt, clay) are the primary determinants of hydraulic properties; (ii) the soil profile is treated as a single homogeneous layer; and (iii) the water balance model assumes vertical one-dimensional drainage. We acknowledge that these simplifications introduce uncertainty, particularly in heterogeneous soils. Second, regarding validation and time scales: The absence of in-situ soil moisture measurements with sufficient spatial density in the study area prevented direct calibration against field data, a limitation we now explicitly acknowledge in Section 2.6 and the Limitations section. Regarding the 15-day antecedent period: this choice was guided by the findings of Allen et al. (1998) and is consistent with the characteristic soil drainage timescales for the clay-rich lithologies in the area. We acknowledge that alternative antecedent windows (7-day, 30-day) may be more appropriate for different lithological contexts and that sensitivity analyses on this parameter represent a valuable avenue for future work, as now stated in the revised text. In any case, we have highlighted that differences in soil saturation had little influence on the slope stability model.
Changes made: Section 2.6 has been expanded to articulate the assumptions of the pedotransfer functions and water balance model. The limitations arising from the absence of in-situ validation data and the fixed 15-day antecedent window are now explicitly acknowledged, and the limitation relating to soil saturation is highlighted in the Limitations section.
- The rationale for selecting certain key parameters is not sufficiently detailed. Section 3.5 classifies susceptibility into three categories: low (P < 0.3), medium (0.3 ≤ P < 0.7), and high (P ≥ 0.7). It only states that the 0.7 threshold is determined by Youden's J and the 0.3 threshold is based on the cumulative distribution of non-landslide pixels. It is suggested to provide the theoretical or practical basis for the selection of these parameters to enhance the rigor of the research.
Response: We thank the reviewer for this comment. We would like to clarify that the full derivation of both susceptibility classification thresholds was already present in Section 3.5 of the original manuscript. Specifically: the upper threshold of P = 0.70 was determined by maximizing Youden's J statistic (J = Sensitivity + Specificity − 1) on the ROC curve, representing the optimal trade-off between true positive and false positive rates; its stability was confirmed through bootstrap resampling (95% CI ≈ 0.68–0.73). The lower threshold of P = 0.30 was derived from the cumulative distribution of predicted probabilities for non-landslide pixels, below which more than 70% of stable-slope pixels were concentrated. Both thresholds are therefore statistically grounded rather than arbitrarily imposed, and their derivation is consistent with approaches widely adopted in the susceptibility mapping literature (Guzzetti et al., 2006; Reichenbach et al., 2018).
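As an illustration of the upper-threshold selection described above, the following minimal sketch scans candidate cut-offs and returns the one that maximizes J = Sensitivity + Specificity − 1. The labels and predicted probabilities are hypothetical toy values, not the study data, and the function name is an assumption introduced here for illustration; the bootstrap step mentioned in the response is not reproduced.

```python
import numpy as np


def youden_threshold(y_true, p_pred):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    y_true = np.asarray(y_true, dtype=bool)
    p_pred = np.asarray(p_pred, dtype=float)
    best_t, best_j = None, -np.inf
    # Candidate cut-offs: every distinct predicted probability, in ascending order.
    for t in np.unique(p_pred):
        pred = p_pred >= t
        sensitivity = np.mean(pred[y_true])     # TP / (TP + FN)
        specificity = np.mean(~pred[~y_true])   # TN / (TN + FP)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = float(t), float(j)
    return best_t, best_j


# Hypothetical toy data: 1 = landslide pixel, 0 = stable pixel.
y = [0, 0, 0, 0, 1, 0, 1, 1]
p = [0.05, 0.10, 0.20, 0.35, 0.60, 0.65, 0.75, 0.90]

t_opt, j_opt = youden_threshold(y, p)
print(f"optimal cut-off = {t_opt:.2f}, Youden's J = {j_opt:.2f}")
```

Repeating the same search on resampled copies of the data would give a spread of optimal cut-offs, which is the kind of stability check the response describes.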
Changes made: We invite the reviewer to refer to Section 3.5, first two paragraphs, beginning with 'In this study, probability thresholds were not arbitrarily selected but derived from the statistical performance of the logistic regression model.'
- The conclusions lack sufficient depth in synthesizing the research findings. The core results are presented merely as data statements, without interpreting the underlying phenomena through the lens of geological mechanisms. Some conclusions are simply a restatement of the experimental data and are not elevated by connecting them to the methodological mechanisms. It is recommended that the conclusions be further refined in the context of the entire research process to better highlight the core discoveries and scientific value of the study.
Response: We agree that the Conclusions section can be strengthened by more explicitly linking quantitative findings to their geological and methodological significance. The Conclusions have been revised to move beyond data restatement and to: (i) contextualise the dominance of clay-rich lithology within the Apennine geological framework, linking the high odds ratio (OR = 6.03) to the well-known plasticity and low shear strength of the dominant pelitic formations; (ii) interpret the counter-intuitive slope susceptibility pattern in terms of colluvial mantle thickness and hydrological accumulation on gentle gradients; and (iii) frame the event-based calibration approach as a methodological advance over inventory-based static susceptibility mapping. The core scientific value of the spatially distributed threshold map is also articulated more clearly.
Changes made: The Conclusions section has been substantially revised to better synthesize findings within their geological and methodological context, moving beyond data restatement to interpretation.
- There are a few linguistic and grammatical issues in the thesis. For instance, the summary contains spelling errors, and Section 2.4 has formatting issues with bullet points. Some sentences are overly colloquial, and the logical flow of several long sentences is not coherent. It is recommended to carefully proofread and polish the full text to ensure the standardization and accuracy of academic expression.
Response: We thank the reviewer for this thorough assessment. The full manuscript has been carefully proofread. Specific corrections include: (1) spelling errors identified in the abstract have been corrected; (2) bullet point formatting in Section 2.4 has been standardized; (3) overly colloquial phrasing has been revised to conform to formal academic style; and (4) several long, complex sentences have been restructured for improved logical flow and clarity. We acknowledge that English is not the first language of all authors and are committed to ensuring that the final version meets the linguistic standards of the journal.
Changes made: The full manuscript has been proofread and edited for linguistic and grammatical correctness, including the abstract, Section 2.4, and various other passages where clarity or formality was insufficient.
Reviewer 3 Report
Comments and Suggestions for Authors
The study develops a methodology for mapping landslide susceptibility based on events in a sample area of central Italy, integrating a database of landslides with known activation dates with predisposing and triggering parameters. The principal innovation of this research lies in its integrated, multi-output workflow that advances beyond conventional static susceptibility mapping. This research holds certain practical significance for enhancing the landslide risk warning and prevention in Italy. The research is generally scientifically sound and reasonable. Some suggestions for revisions are as follows:
- Introduction: Regarding the current research status of landslides in Italy, such as landslide prediction, the logical hierarchy of the analysis still needs to be further clarified.
- Regarding Section 2.5, “Precipitation data processing”: the rainfall analysis area should encompass the entire range of the landslide samples.
- Section 3.4 states that “the contribution of each predictor must be evaluated within the context of the entire feature set”; the manuscript should therefore provide a detailed explanation of the logical connections among these factors.
- With reference to existing landslide cases in the local area, an accuracy verification of the refined BLR model should be introduced in Section 3.4.
- Although the threshold-value map can assess risks relatively quickly to a certain extent, it should be supplemented with information about the model's reconstruction of historical landslide events to demonstrate the accuracy of its assessment.
- The color distinction and clarity of Figure 8 need to be improved.
Author Response
- Introduction: Regarding the current research status of landslides in Italy, such as landslide prediction, the logical hierarchy of the analysis still needs to be further clarified.
Response: We agree that the progression of the Introduction, from the general Italian landslide context, through existing methodological approaches, to the specific gap addressed by this study, could be made more explicit. The Introduction has been reorganized to present a clearer logical hierarchy: (1) Italy's landslide risk and the specific challenge of the absence of an updated activation-date catalog; (2) a critical review of existing susceptibility mapping approaches (static inventory-based, dynamic hazard modeling, event-based mapping); (3) the unresolved gap in combining event-based calibration with operationally invertible probabilistic models; and (4) the specific objectives of this study as a direct response to this gap.
Changes made: The Introduction has been revised to present a more explicit logical hierarchy, moving from the Italian landslide context through methodological review to the specific research gap and objectives.
- Regarding Section 2.5, “Precipitation data processing”: the rainfall analysis area should encompass the entire range of the landslide samples.
Response: We thank the reviewer for this important spatial consistency observation. As described in Section 2.5, precipitation records were collected from 80 rain gauge stations within the regional SIRMIP network for the event analysis, and from 131 stations over a wider area for the GEV-based extreme value analysis. We confirm that the coverage of rain gauges used for event interpolation encompasses all landslide sample locations, as interpolation was performed using Empirical Bayesian Kriging (EBK), which extrapolates continuous raster surfaces across the entire study area. Section 2.5 has been clarified to explicitly state that the interpolation domain covers the full extent of the landslide inventory, and a note on the spatial adequacy of the gauge network for this purpose has been added.
Changes made: Section 2.5 has been clarified to explicitly confirm that the precipitation interpolation domain encompasses the full spatial extent of the landslide sample area.
- Section 3.4 states that “the contribution of each predictor must be evaluated within the context of the entire feature set”; the manuscript should therefore provide a detailed explanation of the logical connections among these factors.
Response: This is a valid observation regarding multivariate interpretation. Section 3.4 has been expanded to include a discussion of the logical and physical connections among the retained predictors. Specifically: (1) Lithology sets the baseline susceptibility by controlling soil strength, permeability, and shrink-swell behavior; clay-rich units create the predisposing condition. (2) Slope gradient modulates the stress state acting on the soil, but in this clay-dominated landscape, its interaction with lithological thickness (thicker accumulations on gentler slopes) creates a non-linear susceptibility pattern. (3) Maximum daily precipitation acts as the immediate trigger by rapidly increasing pore-water pressure in partially saturated fine-grained soils. (4) Land use and aspect interact with the hydrological response of the slope, with vegetation affecting infiltration rates and slope orientation controlling solar radiation and evapotranspiration. These interconnections are now explicitly described in Section 3.4 to clarify that the model captures a physically coherent multi-variable failure mechanism.
Changes made: Section 3.4 has been expanded to include a discussion of the physical and logical interconnections among all retained predictors within the multivariate BLR framework.
- With reference to existing landslide cases in the local area, an accuracy verification of the refined BLR model should be introduced in Section 3.4.
Response: We thank the reviewer for this suggestion. The validation of the refined BLR model is presented in Section 3.1 (classification tables for estimation and validation samples) and supported by ROC analysis (AUC = 0.913). Section 3.4 has been revised to cross-reference these validation results and to include a comparison with published susceptibility assessments for adjacent areas in the central Apennines. In particular, we now note that the identified controlling factors (clay lithology, gentle slopes, precipitation intensity) are consistent with independent susceptibility studies for the Marche and Umbria regions (e.g., Gentilucci et al., 2021; Trigila et al., 2007), providing additional geomorphological corroboration of the model's outputs.
Changes made: Section 3.4 has been revised to include a cross-reference to the validation metrics reported in Section 3.1 and a comparison with published findings from local landslide studies in central Italy.
- Although the threshold-value map can assess risks relatively quickly to a certain extent, it should be supplemented with information about the model's reconstruction of historical landslide events to demonstrate the accuracy of its assessment.
Response: This is a valuable suggestion for strengthening the operational validation of the threshold map. The threshold map was constructed by inverting the logistic regression equation to identify, for each spatial unit, the daily rainfall intensity required for the predicted probability to exceed P = 0.70. As an implicit validation, we can verify that for the three calibration events (2008, 2010, 2011), the observed maximum daily precipitation in landslide-affected areas consistently exceeded the locally predicted thresholds. We have added a paragraph in Section 3.5 presenting this retrospective verification: for the March 2010 event (the event with the highest rainfall intensities), the exceedance of the predicted threshold was confirmed in approximately 87% of mapped landslide locations, demonstrating the spatial consistency of the threshold map with observed triggering conditions. We note, however, that a fully independent validation would require events not used in model calibration, which represents an important objective for future work.
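The inversion described above can be sketched as follows, assuming a logistic model of the form logit(P) = s + b·R, where s collects the intercept and all non-rainfall terms for a given cell and b is the rainfall coefficient. Solving for R at the critical probability gives R* = (ln(P/(1 − P)) − s) / b. The coefficient values, cell logits, rainfall figures, and function names below are hypothetical and chosen only for illustration; they are not the fitted values from the manuscript.

```python
import math


def rainfall_threshold(static_logit, beta_rain, p_crit=0.70):
    """Daily rainfall at which the modelled probability reaches p_crit.

    static_logit: intercept plus all non-rainfall terms for one cell (hypothetical).
    beta_rain:    rainfall coefficient; the inversion assumes it is positive.
    """
    if beta_rain <= 0:
        raise ValueError("inversion assumes probability increases with rainfall")
    target_logit = math.log(p_crit / (1.0 - p_crit))
    # Clip at zero: a negative value means the cell already exceeds p_crit with no rain.
    return max(0.0, (target_logit - static_logit) / beta_rain)


def landslide_probability(static_logit, beta_rain, rain):
    """Forward logistic model, used here to check the inversion."""
    return 1.0 / (1.0 + math.exp(-(static_logit + beta_rain * rain)))


def exceedance_rate(observed_rain, thresholds):
    """Fraction of locations where observed rainfall met or exceeded the threshold."""
    hits = sum(1 for r, t in zip(observed_rain, thresholds) if r >= t)
    return hits / len(thresholds)


# Hypothetical values for three cells (rainfall in mm/day).
static_logits = [-3.0, -2.0, -4.5]
BETA_RAIN = 0.05
thresholds = [rainfall_threshold(s, BETA_RAIN) for s in static_logits]
observed = [80.0, 60.0, 90.0]
rate = exceedance_rate(observed, thresholds)

print([round(t, 1) for t in thresholds], f"exceedance rate = {rate:.2f}")
```

The exceedance rate computed this way corresponds in spirit to the retrospective check reported for the calibration events; an independent test would apply the same calculation to events not used for fitting.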
Changes made: A paragraph has been added in Section 3.5 presenting a retrospective verification of the threshold map against the three calibration events, showing that observed precipitation exceeded predicted thresholds in the large majority of landslide locations.
- The color distinction and clarity of Figure 8 need to be improved.
Response: We agree that Figure 8 requires improved color mapping to enable clear distinction among threshold classes. The figure will be regenerated using a perceptually uniform, diverging color scheme to ensure that gradations in rainfall threshold values are clearly distinguishable even in grayscale rendering. Class boundaries will be made explicit, and a clearer legend with labeled threshold intervals will be added.
Changes made: Figure 8 will be regenerated with an improved color scheme providing better visual distinction among threshold classes, together with a revised legend. This will be provided in the final submission.
