Article

Interpretable AI for Site-Adaptive Soil Liquefaction Assessment

1
Department of Innovation and Sustainability, De La Salle University, Laguna 4024, Philippines
2
Department of Civil Engineering, De La Salle University, Manila 1004, Philippines
*
Author to whom correspondence should be addressed.
Geosciences 2026, 16(1), 25; https://doi.org/10.3390/geosciences16010025
Submission received: 3 November 2025 / Revised: 15 December 2025 / Accepted: 29 December 2025 / Published: 2 January 2026

Abstract

Soil liquefaction remains a critical geotechnical hazard during earthquakes, posing significant risks to infrastructure and urban resilience. Traditional empirical methods, while practical, often fall short in capturing complex parameter interactions and providing interpretable outputs. This study presents an interpretable machine learning (IML) framework for soil liquefaction assessment using Rough Set Theory (RST) to generate a transparent, rule-based predictive model. Leveraging a standardized SPT-based case history database, the model induces IF–THEN rules that relate seismic and geotechnical parameters to liquefaction occurrence. The resulting 25-rule set demonstrated an accuracy of 86.2% and strong alignment (93.8%) with the widely used stress-based semi-empirical model. Beyond predictive performance, the model introduces scenario maps and parameter interaction diagrams that elucidate key thresholds and interdependencies, enhancing its utility for engineers, planners, and policymakers. Notably, the model reveals that soils with high fines content can still be susceptible to liquefaction under strong shaking, and that epicentral distance plays a more direct role than previously emphasized. By balancing interpretability and predictive strength, this rule-based approach advances site-adaptive, explainable, and technically grounded liquefaction assessment—bridging the gap between traditional methods and intelligent decision support in geotechnical engineering.

1. Introduction

Soil liquefaction continues to be one of the most destructive and unpredictable consequences of seismic events, often leading to severe damage to infrastructure. As urban development expands into geologically complex and seismically active regions, the need for accurate, site-specific, and actionable liquefaction assessment becomes increasingly critical. Over the past several decades, the geotechnical community has relied on empirical and semi-empirical methods [1,2,3,4] to evaluate liquefaction potential using simplified relationships between soil behavior and seismic loading parameters. These methods are widely accepted and supported by extensive case history data, offering practical guidance for design and hazard mitigation.
While these traditional approaches remain the cornerstone of liquefaction evaluation in engineering practice, they are not without limitations. Most empirical and semi-empirical methods rely on simplifying assumptions and predefined correction factors that treat input variables independently. This structure, while effective for general application, may constrain the model’s ability to capture the non-linear and site-specific interactions among soil and seismic parameters. When applied to conditions outside the original calibration datasets or in regions with limited local empirical data, these models may face challenges in generalizability and adaptability. A particular study tried to address the limitations in the existing simplified liquefaction-triggering evaluation procedures using a case study in the Groningen Gas Field [5]. In addition, a group of researchers observed that the current simplified models for predicting liquefaction triggering and manifestation do not account for the mechanisms of liquefaction triggering and surface manifestation in a consistent and sufficient manner [6]. Field observations from recent earthquakes show that actual liquefaction patterns deviate from the simplified approach predictions, suggesting the need for more sophisticated analysis methods [7]. Likewise, simplified liquefaction evaluation procedures fail to account for the system response of liquefiable deposits, which significantly governs whether liquefaction actually manifests at the ground surface [8]. Lastly, the deterministic thresholds used in conventional frameworks do not typically reveal how multiple factors interact in governing liquefaction triggering.
In parallel, machine learning (ML) has gained significant attention in geotechnical engineering, offering powerful tools for predictive modeling of soil behavior and seismic hazards. Algorithms such as artificial neural networks, support vector machines, ensemble methods, and random forests have shown high predictive performance in liquefaction classification tasks [9,10,11,12,13,14,15,16]. Review articles demonstrate that ML-based liquefaction models can often achieve higher prediction accuracy than traditional methods [17,18]. However, many of these models have been largely ignored by practitioners due to their “black box” nature, which makes it difficult to validate predictions against established soil mechanics principles [18,19]. Engineers are decision-makers, and decisions need reasons, not just predictions, so black-box outputs are difficult to trust. This limitation has sparked interest in interpretable machine learning (IML) approaches that can maintain transparency while leveraging the pattern-recognition capabilities of artificial intelligence [20].
In the broader context of disaster risk reduction, geotechnical engineering is one component of a larger multidisciplinary response that involves urban planners, emergency managers, and public decision-makers. When the scientific basis of hazard predictions is not clearly understood by non-technical stakeholders, the likelihood of delayed or ineffective decisions increases. This interpretive gap underscores the importance of developing models that are not only accurate but also transparent, explainable, and communicable across disciplines. Interpretable modeling in geotechnical engineering has thus become a necessity—not merely a preference—for enabling more inclusive and intelligent risk management. Several studies have demonstrated the value of interpretability in machine learning models, including (1) investigating rock mass quality and rock type predictions from drilling data [21], (2) explaining data-driven ML assessments of geotechnical risks in tunnel construction [22], and (3) developing constitutive models suited for sparse geotechnical data [23].
Moreover, a deeper technical limitation of both traditional and black-box models lies in their inability to explicitly model complex parameter interactions. In most empirical frameworks, key factors such as effective stress, magnitude, and soil resistance are treated independently, with adjustments made through correction factors. These approaches, while practical, may obscure non-linear dependencies among variables that are critical in predicting liquefaction triggering. In contrast, rule-based IML frameworks offer a powerful alternative by enabling the discovery and representation of parameter interactions in a form that is both interpretable and consistent with engineering logic. Understanding how combinations of parameters interact to influence liquefaction is essential not only for improving prediction, but also for enabling site-adaptive decision-making grounded in local conditions. Unweaving this complexity and identifying key factors and mechanisms that govern the liquefaction response and associated damage should therefore be the principal target in the engineering assessment of liquefaction [24].
Several rule-based liquefaction models have been developed by other researchers in the past. One study demonstrated that rough set theory could generate simple rules for predicting soil liquefaction using SPT- and CPT-based parameters [25]. Another study proposed a framework for developing a rule-based liquefaction classification model using rough set machine learning [26]. Lastly, a rough set-based lateral spreading assessment shows the different clusters of lateral spread displacements [27]. Although these studies employed rule-based approaches, there remains a need to optimize parameter selection to reduce computational effort without compromising predictive accuracy. Moreover, leveraging rule-based models to uncover underexplored parameter interactions in liquefaction assessment offers added value—positioning the model not only as a predictive tool but also as a powerful means of knowledge discovery.
In response to these challenges, this study proposes an IML approach to soil liquefaction assessment using rough set theory to generate a rule-based predictive model. Rather than relying on opaque algorithms or fixed empirical relationships, the method induces logical IF–THEN rules directly from a curated case history database, using raw borehole and seismic input data. These rules are both interpretable and actionable, enabling engineers to trace how specific combinations of seismic and geotechnical parameters govern liquefaction outcomes. In addition to predictive modeling, the study introduces parameter interaction maps and scenario-based threshold maps to visualize and understand the interdependencies among influencing factors. Benchmarked against the Boulanger and Idriss [2] stress-based model, the proposed rule-based model can also serve as a complementary tool to existing traditional or ML-based liquefaction models. This work contributes to the advancement of intelligent, site-adaptive seismic risk assessment by delivering a model that is technically sound, decision-maker-friendly, and real-world ready, in line with the growing need for interpretable AI in geotechnical engineering.

2. Materials and Methods

This study aims to develop an interpretable rule-based framework for liquefaction assessment and to identify the interactions among key seismic and geotechnical parameters that contribute to the strength loss of liquefiable soils during earthquakes. A main goal in liquefaction assessment, particularly in urban environments, is to better understand the key factors and processes that cause liquefaction and the damage it brings. The general flow of the research is illustrated in Figure 1.
The methodology follows four major stages: (1) Data, (2) Rough Set Machine Learning, (3) Interpretive Analysis, and (4) Evaluation and Application. The first stage involves the collection, processing, and preparation of relevant data, including borehole records, standard penetration test (SPT) results, and earthquake parameters. The second stage applies rough set theory to perform machine learning tasks such as discretization of continuous variables, rule induction, and model optimization—resulting in a rule-based model that identifies patterns associated with liquefaction occurrence. The third stage, Interpretive Analysis, involves examining the induced rules to extract meaningful geotechnical insights. This includes sensitivity analysis, parameter interaction mapping, and scenario-based evaluation. A strong understanding of soil mechanics is essential at this stage to ensure that the model outputs are interpreted within the correct physical and engineering context. The final stage, Evaluation and Application, involves evaluating the rule-based model against a real-world borehole database and benchmarking the results against the well-established Boulanger and Idriss [2] stress-based method. Furthermore, a matrix of recommended applications is proposed to guide stakeholders—such as engineers, planners, and decision-makers—on how to apply the tools in practice.

2.1. Data

This study used a Standard Penetration Test (SPT)-based liquefaction case history database to develop decision rules for assessing liquefaction potential. Field case histories, documenting ground failures from past earthquakes, are essential for building robust rule-based models as they capture real-world variability in soil and seismic conditions. The database includes 251 liquefaction and no-liquefaction cases compiled from Boulanger & Idriss [2] and Cetin et al. [3], covering key earthquake and geotechnical parameters. These standardized datasets support consistent evaluation and enhance model reliability. Table 1 shows sample records from the collected data, while the complete dataset used in this study is included in the Supplementary Materials. The full database is publicly available in the cited sources.
Before analysis, the data were processed to ensure that the datasets were suitable for statistical and machine learning applications. The marginal liquefaction sites in the sources were not included in the simulation. A case is considered “marginal” when the information available indicates that site conditions are close to the threshold that distinguishes between the occurrence and non-occurrence of liquefaction [2]. Following standardization of units and formats, the data were organized into structured spreadsheets for Rough Set Machine Learning (RSML) analysis. Seven conditional attributes (input parameters) are considered in this study: three earthquake parameters, namely moment magnitude (M), maximum horizontal acceleration (amax), and epicentral distance (R), and four geotechnical parameters, namely average depth of the critical layer (Avg Depth), depth of groundwater table (Depth GWT), average corrected standard penetration resistance or SPT N-value (N60), and fines content (FC). These input parameters represent standard seismological and geotechnical descriptors relevant to soil liquefaction assessment. The moment magnitude, M, characterizes the overall energy released by an earthquake and is used to represent the severity and duration of seismic loading. The maximum horizontal (peak ground) acceleration, amax (in units of g), is the largest horizontal ground acceleration experienced at the site and serves as an indicator of shaking intensity. The epicentral distance, R (km), is defined as the horizontal distance between the earthquake epicenter and the site location, reflecting attenuation of seismic waves with distance. It should be noted that the parameter ‘average depth’ refers to the mid-depth of each soil layer, computed as the average of the layer’s upper and lower boundaries and measured from the ground surface.
Correspondingly, the ‘average SPT N-value’ and ‘average fines content’ represent the mean values of these properties within the same layer. The corrected standard penetration test resistance, N60, represents the measured SPT N-value normalized to a reference hammer energy efficiency of 60% and is widely used as an index of soil density and cyclic resistance. The fines content, FC (%), is defined as the percentage by weight of soil particles finer than 0.075 mm and reflects the influence of soil gradation on liquefaction susceptibility. Notably, these parameters are readily available in a typical geotechnical investigation report, reducing the time and steps in computing correction factors used by traditional models. The decision attribute (output parameter) is the occurrence or non-occurrence of soil liquefaction (Liq?). Consistent with simplified evaluation procedures, the framework assesses the potential ‘occurrence of liquefaction’ at specified subsurface layers. It should be noted, however, that the case-history labels used for model development are based on observed surface manifestations of liquefaction rather than direct confirmation of liquefaction at depth.
For the fines content parameter, clay layers were excluded based on their USCS classification (e.g., CL, CH) because these are considered non-liquefiable. However, fines content alone was not used as an exclusion criterion, as some soils with high fines content (e.g., silts or silty sands) may exhibit low plasticity and remain potentially liquefiable. As a result, layers with high FC values (e.g., up to 92%) may appear in the analysis when they are classified as silty or mixed soils rather than clay.
After collecting the source datasets, the first step is to check for the completeness of parameters. Missing data can be checked or retrieved from other references cited in the two primary sources mentioned earlier. The next step is standardizing the parameters in proper format, units, and presentation.
The data distribution of the attributes is also presented in Figure 2. These violin and box plots are used to discretize and create initial bins for the conditional attributes. They also serve as the attribute boundaries of the minimum and maximum values for the classification model developed in this study.

2.2. Rough Set Machine Learning

Rough set theory (RST), introduced by Pawlak [28], provides a mathematical framework for handling uncertainty in data, making it well-suited for geotechnical applications where liquefiable and non-liquefiable soils do not have clearly defined boundaries. The foundation of RST lies in approximation spaces, where datasets are classified into lower and upper approximations, distinguishing between definite and possible liquefaction occurrences. This approach allows for data-driven rule extraction without the need for prior probabilistic assumptions.
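The lower and upper approximations described above can be illustrated with a minimal sketch. The toy table, the attribute labels, and the function name `approximations` are hypothetical and are not taken from the study's database or from RSES; the sketch only shows the set logic, assuming already-discretized attributes.

```python
from collections import defaultdict

def approximations(objects, condition_keys, decision_key, target):
    """Return (lower, upper) approximations of the set of objects whose
    decision equals `target`, as sets of row indices."""
    # Objects with identical values on every condition attribute are
    # indiscernible and fall into the same equivalence class.
    classes = defaultdict(set)
    for idx, obj in enumerate(objects):
        classes[tuple(obj[k] for k in condition_keys)].add(idx)

    target_set = {i for i, o in enumerate(objects) if o[decision_key] == target}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # class lies entirely inside the target
            lower |= members        # -> definitely in the target set
        if members & target_set:    # class overlaps the target
            upper |= members        # -> possibly in the target set
    return lower, upper

# Hypothetical toy decision table (illustration only).
toy = [
    {"M": "high", "N60": "low",  "Liq": "Yes"},
    {"M": "high", "N60": "low",  "Liq": "No"},   # conflicts with row 0
    {"M": "high", "N60": "high", "Liq": "No"},
    {"M": "low",  "N60": "low",  "Liq": "Yes"},
]
lower, upper = approximations(toy, ["M", "N60"], "Liq", "Yes")
print(lower, upper)
```

Rows 0 and 1 share the same conditions but different outcomes, so they appear only in the upper approximation (possible liquefaction), while row 3 belongs to the lower approximation (definite liquefaction).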
Using RST, “IF–THEN” decision rules were generated by analyzing relationships between key earthquake and geotechnical parameters and their corresponding liquefaction outcomes. In this structure, the “IF” part of the rule represents the condition attributes that describe specific soil or seismic characteristics. The “THEN” part corresponds to the decision attribute, which in this study is the occurrence or non-occurrence of soil liquefaction. For example, a rule may state: IF moment magnitude ≥ 7.0 AND N60 ≤ 15 THEN liquefaction = Yes. These rules define the combinations of parameter thresholds that lead to a specific decision class and help uncover complex interactions among variables while maintaining full interpretability.
The reliability of these rules was evaluated using four key metrics: support, strength, coverage factor, and certainty factor. Support is the number of observations that adhere to a specific rule. Strength is the ratio of the support to the total number of observations in the database. The certainty factor is the likelihood that an observation (case history) will be categorized as belonging to a decision class if it exhibits the conditions of a specific rule. The coverage factor indicates the percentage of examples in a decision class that have been categorized because of a specific decision rule.
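As a sketch of how these four metrics relate to one another, the following function computes them for a single rule on a small decision table. The table contents and the function name `rule_metrics` are hypothetical; the definitions follow the usual rough set convention (support counted over rows that satisfy both the IF and THEN parts), which is assumed here to match the metrics reported by RSES.

```python
def rule_metrics(table, conditions, decision):
    """Support, strength, certainty, and coverage of one IF-THEN rule.

    conditions: dict of attribute -> required value (the IF part)
    decision:   (attribute, value) pair (the THEN part)
    """
    d_key, d_val = decision
    matches_if = [row for row in table
                  if all(row[k] == v for k, v in conditions.items())]
    support = sum(1 for row in matches_if if row[d_key] == d_val)
    in_class = sum(1 for row in table if row[d_key] == d_val)
    return {
        "support": support,                      # rows matching IF and THEN
        "strength": support / len(table),        # share of the whole table
        "certainty": support / len(matches_if) if matches_if else 0.0,
        "coverage": support / in_class if in_class else 0.0,
    }

# Hypothetical discretized case histories (illustration only).
table = [
    {"M": "high", "N60": "low",  "Liq": "Yes"},
    {"M": "high", "N60": "low",  "Liq": "Yes"},
    {"M": "high", "N60": "low",  "Liq": "No"},
    {"M": "low",  "N60": "low",  "Liq": "Yes"},
    {"M": "high", "N60": "high", "Liq": "No"},
]
metrics = rule_metrics(table, {"M": "high", "N60": "low"}, ("Liq", "Yes"))
print(metrics)
```

In this toy table the rule is satisfied by two of the three rows that match its conditions, so its certainty is 2/3, and it accounts for two of the three "Yes" cases, so its coverage is also 2/3.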
The rule-based model was developed using the Rough Set Exploration System (RSES 2.2) software [29,30]. Figure 3 presents the general rough set machine learning (RSML) algorithm used to extract the most stable and informative rules for liquefaction assessment.
  • Data Examination: The raw dataset was reviewed for completeness, ensuring each seismic event included all necessary geotechnical parameters. Independent (condition) and dependent (decision) attributes were identified (see Table 2).
  • Initial Discretization of Continuous Variables: As RST requires categorical inputs, continuous parameters were discretized into bins using thresholds based on median values. Extreme values (i.e., outliers) were labeled “Very Low” or “Very High” to address skewed distributions. Median-based discretization has been shown to be an effective binning approach in previous liquefaction studies [26,27]. Geotechnical parameters are often characterized by skewed distributions, for which the median provides a more representative measure of central tendency than the mean, as it is less sensitive to outliers and extreme values [27,31,32]. Consequently, median-based thresholds offer a robust and physically interpretable basis for discretizing continuous variables in liquefaction assessment.
  • Creation of Decision Tables: Discretized data were organized into decision tables, forming the basis for rule induction (Table 3 provides a sample format). The qualitative labels used in Table 3 (e.g., ‘low,’ ‘high,’ ‘very high’) correspond to discretized ranges of each continuous parameter as defined by the rule-based model. For clarity, the numerical thresholds that delineate these categories (e.g., the specific range of a parameter classified as ‘high’) are provided in Figure 4. These thresholds allow consistent interpretation and implementation of the qualitative descriptors.
  • Rule Induction: Decision rules were generated using the exhaustive algorithm in RSES 2.2, which yielded optimal accuracy and coverage.
  • Rule Optimization: Redundant or weak rules were removed through filtering and shortening features of RSES 2.2, retaining only those that captured essential parameter interactions.
  • Rule Set Evaluation: Rule sets were assessed by their accuracy, coverage, and size, balancing generalization and predictive strength.
  • Final Optimization: Discretization thresholds were iteratively refined using statistical feedback (i.e., balancing accuracy and coverage) and geotechnical literature. Figure 4 shows the final discretization adopted in this study. The resulting rule set represents a framework for liquefaction assessment.
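The initial median-based discretization step in the list above can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the outlier rule (Tukey's 1.5 × IQR fences, as drawn in a standard box plot), the sample values, and the name `make_discretizer` are all hypothetical, and the final thresholds adopted in the study are those shown in Figure 4.

```python
from statistics import median, quantiles

def make_discretizer(values):
    """Build a labeling function from a sample of one continuous attribute.

    The median splits "Low" from "High"; values beyond Tukey's 1.5*IQR
    fences (an assumed outlier rule) become "Very Low" or "Very High".
    """
    med = median(values)
    q1, _, q3 = quantiles(values, n=4)   # quartile cut points
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    def label(x):
        if x < low_fence:
            return "Very Low"
        if x > high_fence:
            return "Very High"
        return "Low" if x <= med else "High"

    return label

# Hypothetical attribute values with one extreme observation.
sample = [2, 4, 6, 8, 10, 12, 14, 16, 50]
label = make_discretizer(sample)
labels = [label(x) for x in sample]
print(list(zip(sample, labels)))
```

Using the median rather than the mean keeps the Low/High split stable when a few extreme values are present, which is the motivation given in the text for skewed geotechnical data.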
To validate the rule-based model’s predictive performance, a 90/10 data split was applied following the recommended method [33]. Ninety percent of the data served as the training set, while the remaining 10% was reserved for independent testing. Rules derived from the training set were evaluated against the test set to assess accuracy and coverage. Minimal deviation between the full-data and split-model results indicated strong reliability and generalizability. The final rule sets were then analyzed further, including through sensitivity analysis, to uncover key liquefaction patterns, parameter thresholds, and critical interactions. These findings informed the development of parameter interaction and scenario maps to support geotechnical decision-making in seismic hazard assessments.

2.3. Interpretive Analysis

The interpretive analysis phase serves as the analytical core of this study, aimed at extracting meaningful geotechnical insights from the RSML results. While machine learning models often provide high accuracy, their utility in engineering depends on how well the outputs can be understood, trusted, and translated into real-world decisions. Prediction is half the job—understanding is the other. The interpretive analysis stage addresses that challenge by focusing on sensitivity analysis, scenario mapping, and parameter interaction—tools that enhance both the transparency and engineering relevance of the developed rule-based model.

2.3.1. Sensitivity Analysis Using Ablation

Sensitivity analysis was employed to evaluate the relative importance of input parameters by systematically removing individual condition attributes and observing the impact on model accuracy. Commonly used in interpretable machine learning, this method highlights which features are most influential in driving classification decisions [33]. The model’s performance was evaluated using 10-fold and 100-fold cross-validation in the RSES 2.2 software environment to ensure robustness and minimize potential bias. Accuracy served as the primary metric for comparison.
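The ablation idea can be sketched in a few lines. The example below is a simplified stand-in: it uses a majority-vote lookup classifier and resubstitution accuracy instead of the rule induction and cross-validation performed in RSES 2.2, and the synthetic table, attribute labels, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

def lookup_accuracy(table, attrs, decision_key):
    """Resubstitution accuracy of a majority-vote lookup classifier that
    sees only the attributes in `attrs` (a stand-in for a rule learner)."""
    groups = defaultdict(Counter)
    for row in table:
        groups[tuple(row[a] for a in attrs)][row[decision_key]] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(table)

def ablation(table, attrs, decision_key):
    """Return baseline accuracy and the accuracy drop per removed attribute."""
    baseline = lookup_accuracy(table, attrs, decision_key)
    drops = {}
    for removed in attrs:
        reduced = [a for a in attrs if a != removed]
        drops[removed] = baseline - lookup_accuracy(table, reduced, decision_key)
    return baseline, drops

# Synthetic table: liquefaction occurs only when amax is high AND N60 is low;
# FC is deliberately irrelevant so that its removal costs nothing.
table = [
    {"amax": a, "N60": n, "FC": f,
     "Liq": "Yes" if (a == "high" and n == "low") else "No"}
    for a, n, f in product(["high", "low"], repeat=3)
]
baseline, drops = ablation(table, ["amax", "N60", "FC"], "Liq")
print(baseline, drops)
```

Removing either `amax` or `N60` lowers accuracy on this synthetic table, while removing `FC` does not, which is the pattern an ablation study uses to rank attribute importance.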

2.3.2. Scenario and Parameter Interaction Maps

While sensitivity analysis evaluates the individual contribution of each parameter, scenario and parameter interaction maps emphasize the combined effects of multiple parameters, revealing the non-linear and complex nature of their interactions. Beyond quantitative evaluation, the rule-based outputs were translated into visual tools—scenario maps and parameter interaction maps—to support intuitive and explainable assessment of liquefaction potential. Scenario maps were constructed by clustering rules that shared similar conditional attributes and led to the same decision class (liquefaction or non-liquefaction). This clustering revealed recurrent geotechnical and seismic patterns across different earthquake conditions. Scenario maps also helped identify empirical threshold ranges for key parameters, offering practical reference values for field application and rapid hazard assessment.
Parameter interaction maps were developed using a dual approach. First, the impact of individual parameter removal on model performance was evaluated through sensitivity analysis. Parameters whose removal reduced model accuracy by more than 10% were classified as highly important, those causing a 5–10% reduction as moderately important, and those causing a reduction of less than 5% as of low importance. Second, co-occurrence frequency was assessed by counting how often parameter pairs appeared together in the same decision rule: four or five co-occurrences indicated strong interaction, three indicated moderate interaction, and one or two indicated weak interaction. This combined approach highlights complex parameter interactions that are often oversimplified, or entirely overlooked, in traditional methods. Together, these visual and analytical tools enhance the interpretability of rule-based models and provide engineers, planners, and decision-makers with meaningful insights for managing liquefaction risk, particularly in urban environments.
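The two classification steps above can be sketched as follows. The thresholds come from the text; the rule list, the function names, and the treatment of counts above five as "strong" (the text only specifies four or five) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def importance_class(accuracy_drop_pct):
    """Map an accuracy drop (in percentage points) to an importance class."""
    if accuracy_drop_pct > 10:
        return "high"
    if accuracy_drop_pct >= 5:
        return "moderate"
    return "low"

def interaction_class(count):
    """Map a pair co-occurrence count to an interaction strength."""
    if count >= 4:      # text: four or five; larger counts assumed strong
        return "strong"
    if count == 3:
        return "moderate"
    if count >= 1:
        return "weak"
    return "none"

def pair_counts(rules):
    """Count how often each pair of attributes appears in the same rule."""
    counts = Counter()
    for attrs in rules:
        for pair in combinations(sorted(attrs), 2):
            counts[pair] += 1
    return counts

# Hypothetical rule set: each entry lists the attributes in one rule's IF part.
rules = [
    {"M", "N60"},
    {"M", "N60", "amax"},
    {"N60", "amax"},
    {"M", "N60"},
    {"M", "N60", "R"},
]
counts = pair_counts(rules)
print({pair: interaction_class(n) for pair, n in counts.items()})
```

Pairs are stored as sorted tuples so that ("M", "N60") and ("N60", "M") are counted as the same interaction.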

2.4. Evaluation and Application

Model evaluation was conducted to compare the predictive performance of the RSML rule-based model developed from raw borehole data against the state-of-practice SPT-based liquefaction triggering procedure proposed by Boulanger & Idriss [2]. The latter was designated as the baseline model due to its widespread use in engineering practice. This evaluation follows the functionally grounded model evaluation framework [34], which assesses a model’s effectiveness in fulfilling its intended purpose.
A total of 151 actual borehole reports from Manila and Quezon City, Philippines, sourced from [35], were analyzed for liquefaction potential. Table 4 summarizes the borehole data and parameters used. The table includes the overburden pressure correction factor formulated by Liao and Whitman (1986) [36], which was retained to maintain consistency with the preprocessing applied in the original case-history datasets. Although more recent approaches, such as the Boulanger & Idriss [2] formulation in which the exponent m varies as a function of $(N_1)_{60cs}$, better reflect current practice, adopting this updated expression would not affect the structure or application of the rule-based model. The complete borehole dataset in the model evaluation is included in the Supplementary Materials. Moment magnitude was fixed at 7.5, while the maximum horizontal acceleration and epicentral distance were adopted from median values reported in [35]. These earthquake parameters align with typical values used for liquefaction assessment in Metro Manila [37,38].
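For reference, the overburden correction attributed to Liao and Whitman is commonly written in the form below. This is the standard textbook expression, given here as an assumption about the form used in Table 4; the symbols $P_a$ (atmospheric pressure) and $\sigma'_{v}$ (effective vertical stress at the layer depth, in the same units as $P_a$) are introduced for illustration.

```latex
C_N = \sqrt{\frac{P_a}{\sigma'_{v}}}, \qquad (N_1)_{60} = C_N \, N_{60}
```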
The model evaluation process is illustrated in Figure 5. Each soil layer in the borehole dataset was assessed for liquefaction potential using both the baseline model and the rule-based model. Before analysis, non-liquefiable soils were excluded (i.e., soil layers above the groundwater table and clay or rock layers). The Boulanger & Idriss procedure was applied, where liquefaction potential was determined based on the computed factor of safety (FS) [2]. Meanwhile, the best rule set derived from the rule-based model was applied, with predictions made based on the activated rule with the highest certainty factor. The final predictions of both models were then compared using statistical evaluation metrics.
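The rule-application step described above, in which the activated rule with the highest certainty factor decides the prediction, can be sketched as follows. The rules, certainty values, and the function name `predict` are hypothetical and do not reproduce the study's 25-rule set.

```python
def predict(layer, rules):
    """Return the decision of the matching rule with the highest certainty
    factor, or None when no rule fires (the layer is not covered)."""
    fired = [
        rule for rule in rules
        if all(layer.get(attr) == value for attr, value in rule["if"].items())
    ]
    if not fired:
        return None
    return max(fired, key=lambda rule: rule["certainty"])["then"]

# Hypothetical discretized rules (illustration only).
rules = [
    {"if": {"amax": "high", "N60": "low"}, "then": "Yes", "certainty": 0.90},
    {"if": {"N60": "low", "FC": "high"},   "then": "No",  "certainty": 0.70},
    {"if": {"N60": "high"},                "then": "No",  "certainty": 0.95},
]

layer = {"amax": "high", "N60": "low", "FC": "high"}
print(predict(layer, rules))   # two rules fire; the 0.90 rule decides
print(predict({"amax": "low", "N60": "low", "FC": "low"}, rules))  # no rule fires
```

Layers for which no rule fires are returned as `None`; such unrecognized objects are what reduce the total coverage of a rule set below 100%.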
To compare model performance, the following validation metrics were used: number of rules, total accuracy, total coverage, F1 score, sensitivity, and the confusion matrix. Equations (1)–(5) show how these metrics were established in terms of the numbers of TP (true positive), TN (true negative), FP (false positive), and FN (false negative).
$$\text{Total Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\text{Recall (Sensitivity)} = \frac{TP}{TP + FN} \tag{2}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$
$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
$$\text{Total Coverage} = \frac{\text{Test objects recognized by the classifier}}{TP + TN + FP + FN} \tag{5}$$
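Equations (1)–(5) can be evaluated directly from confusion-matrix counts, as in the sketch below. The counts are hypothetical, and the `recognized` argument (defaulting to all test objects) is an assumption about how the numerator of Equation (5) would be supplied.

```python
def classification_metrics(tp, tn, fp, fn, recognized=None):
    """Evaluate Equations (1)-(5) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if recognized is None:
        recognized = total          # assume every test object was classified
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,                        # Eq. (1)
        "recall": recall,                                     # Eq. (2)
        "precision": precision,                               # Eq. (3)
        "f1": 2 * precision * recall / (precision + recall),  # Eq. (4)
        "coverage": recognized / total,                       # Eq. (5)
    }

# Hypothetical counts (not results from the study).
m = classification_metrics(tp=40, tn=30, fp=6, fn=4)
print(m)
```

In liquefaction screening, recall matters most when a missed liquefiable layer (a false negative) is costlier than a false alarm, which is why sensitivity is reported alongside total accuracy.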
These metrics enabled the evaluation of trade-offs between accuracy, conservatism, and risk mitigation in liquefaction prediction. By comparing the rule-based liquefaction model with the baseline, the study assessed whether a transparent, rule-based approach can offer comparable performance while preserving interpretability—a key advantage for engineering decision-making and hazard communication.
This functionally grounded evaluation ultimately demonstrates both the strengths and limitations of the RSML approach and supports its practical applicability in liquefaction hazard assessment.
For the Philippine borehole datasets, moment magnitude and epicentral distance were treated as fixed values to represent a credible seismic scenario for two Metro Manila cities rather than as site-specific predictors. Consequently, spatial variability in liquefaction predictions is driven primarily by differences in subsurface soil conditions, while the seismic loading environment is held constant. Application of the rule-based model to independent Philippine case histories with varying magnitudes and epicentral distances is an important objective of future work as additional post-earthquake data become available.
Finally, this study proposes several practical applications for the RSML outputs and tools to support intelligent geotechnical engineering. These include site-specific hazard screening, integration into decision support systems, and use in technical communication among engineers, planners, and policymakers. The strengths and limitations of the established tools are further discussed in the following section.

3. Results and Discussion

This section presents the results and corresponding discussions aligned with the four main stages outlined in Section 2: (1) Data, (2) Rough Set Machine Learning, (3) Interpretive Analysis, and (4) Evaluation and Application. The final part highlights the novelty, advantages, and limitations of the developed tools, aiming to ensure transparency, interpretability, and user confidence for future implementation in intelligent geotechnical engineering practice.

3.1. Data

The distribution of data across the seven conditional attributes is illustrated using combined violin and box plots in Figure 2. These visualizations reveal that the dataset exhibits significant skewness, with several parameters displaying notable outliers. In particular, the distribution of moment magnitude shows a bimodal pattern, suggesting the presence of two dominant seismic event clusters within the dataset. This variation in distribution highlights the importance of appropriate discretization to ensure effective rule generation and model interpretation.
Table 5 presents the descriptive statistics of the dataset used in developing the rule-based liquefaction model. The inclusion of epicentral distance (R), which ranges from 2.23 km to 257.05 km, enables the assessment of both near-field and far-field earthquake effects on liquefaction potential. Near-field events, due to their higher energy density and rapid pore pressure generation, are typically more likely to trigger liquefaction [39,40]. The dataset’s mean epicentral distance of 60.01 km indicates that many case histories fall within the moderate-distance range.
The average measured SPT blow count (N60) varies from 1 to 50.8, with a median of 10, reflecting a broad range of soil densities and resistance. Notably, the high positive skewness (1.17) in R highlights a concentration of near-field observations, accompanied by a long tail of far-field events. This distribution is particularly important in understanding how liquefaction potential attenuates with increasing distance from the seismic source.
Furthermore, most parameters in the dataset exhibit positive skewness, suggesting a predominance of relatively low to moderate values, with fewer but influential high-end outliers. This imbalance can influence model sensitivity, as extreme values may disproportionately affect rule generation and classification thresholds. These statistical characteristics provide the basis for developing an effective discretization table, which is critical for interpretable machine learning using RST.

3.2. Rough Set Machine Learning

3.2.1. The Best Rule Set

From the available SPT-based liquefaction case histories, a set of IF-THEN rules was chosen and used to interpret and identify the key parameters that influence the occurrence of soil liquefaction during earthquakes. The selection of the best rule set is the main output of the RSML, and this best rule set is the foundation of the other analyses.
Figure 4 illustrates the final discretization scheme used in the rule-based model. Each conditional attribute was divided into two or three bins, informed by both statistical distribution and geotechnical reasoning. This discretized structure formed the basis of the decision table—an essential input for initiating the rule induction process in the RSES 2.2 software. Using the exhaustive algorithm, a rule search heuristic, the software generated a comprehensive set of decision rules. These were subsequently filtered to eliminate redundancies and weak-performing rules. The aim was to extract the most concise yet effective set of rules, optimized for both classification accuracy and coverage.
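As a minimal sketch of how a discretization scheme of this kind can be applied, the snippet below maps continuous values to labelled bins. The cut points for amax (0.2 g and 0.3 g) and N60 (15 and 30) follow the ranges quoted for the rules discussed in this section. The bin labels, the name `discretize`, and the treatment of values that fall exactly on a cut point are assumptions for illustration and do not reproduce Figure 4.

```python
from bisect import bisect_right

# Cut points follow ranges quoted in this section; labels are assumed.
SCHEME = {
    "amax": ([0.2, 0.3], ["low", "high", "very high"]),
    "N60": ([15, 30], ["low", "moderate to high", "very high"]),
}


def discretize(attribute, value):
    """Return the bin label for a continuous value.

    A value equal to a cut point is assigned to the upper bin (assumption).
    """
    cuts, labels = SCHEME[attribute]
    return labels[bisect_right(cuts, value)]


# Hypothetical soil layer used only to exercise the function.
layer = {"amax": 0.35, "N60": 9}
binned = {name: discretize(name, value) for name, value in layer.items()}
print(binned)
```

The discretized labels, rather than the raw values, are what populate the decision table passed to rule induction.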
Table 6 summarizes the differences in performance between the initial discretization scheme, which was based on median values and box plots, and the final scheme. Only minor differences are observed in the accuracy and coverage metrics; however, the final discretization reduces the number of induced rules from 30 to 25, indicating improved rule compactness and interpretability. In addition, the final discretization exhibits a more balanced trade-off between accuracy and coverage, demonstrating its effectiveness in condensing the rule set while maintaining predictive performance.
Based on the optimization of the rule set, Table 6 also presents the performance of the selected best rule set. This rule set achieved an accuracy of 86.2% and coverage of 86.5%, slightly exceeding the typical 70–85% efficiency reported for state-of-practice liquefaction triggering models [18]. With 25 rules, the best rule set maintains a balanced trade-off between accuracy and coverage, ensuring reliability without excessive complexity. Furthermore, the minimal difference in accuracy and coverage between the “All Data” and “90/10 Split” scenarios suggests that training on the full dataset does not lead to overfitting. This robust and interpretable rule set forms the foundation of the rule-based predictive model developed in this study.
The best rule set consists of 25 rules, with 15 rules predicting liquefaction and 10 rules predicting non-liquefaction. Table 7 highlights a clear stratification of parameters across decision classes, demonstrating that high moment magnitudes and low epicentral distances are strongly associated with liquefaction, while the inverse is true for non-liquefaction cases. Similarly, low SPT N-values appear consistently in liquefaction rules, whereas higher values correspond to non-liquefaction rules. These patterns align with established liquefaction principles [1,4], reinforcing the reliability of the RSML approach in capturing critical liquefaction-triggering mechanisms. The effectiveness of RSML in mining meaningful patterns suggests its robustness as a tool for interpreting liquefaction events. However, due to the inherent imbalance in the compiled database, with fewer documented non-liquefaction cases, the induced rules are more strongly influenced by commonly observed liquefaction conditions. This imbalance may reduce the model’s ability to capture rare or extreme non-liquefaction scenarios, such as far-field strong earthquakes or soils with exceptionally high fines content. As such, the rules derived should be interpreted as reflecting dominant patterns in the available case histories rather than exhaustive triggering limits.
Statistical analysis of the optimal rule set (Figure 6) revealed that Rule 1 emerged as the strongest, with 50 support cases, a certainty factor of 90.9%, and a coverage factor of 37.6%. This indicates that more than one-third of the liquefied cases in the dataset adhered to this rule. Rule 1, which emphasizes very high maximum acceleration (0.3 to 0.84 g) and low N-value (1 to 15), aligns with the fundamental liquefaction principles [41]. It can be expressed in natural language as:
“If the maximum acceleration is very high (amax = 0.3–0.84 g) and the penetration resistance is low (N60 = 1–15), then liquefaction is likely to occur.”
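The rule statistics quoted above can be illustrated with the standard rough-set definitions: the certainty factor is the rule's support divided by the number of cases matching its conditions, and the coverage factor is the support divided by the number of cases in the decision class. In the sketch below, the support of 50 is the reported value, whereas the denominators (55 matching cases and 133 liquefied cases) are back-calculated here from the reported percentages and should be read as assumptions, not as values taken from the dataset.

```python
def rule_metrics(support, n_matching_conditions, n_in_decision_class):
    """Return (certainty, coverage) for a single decision rule."""
    certainty = support / n_matching_conditions
    coverage = support / n_in_decision_class
    return certainty, coverage


# support = 50 is reported; 55 and 133 are back-calculated assumptions.
certainty, coverage = rule_metrics(50, 55, 133)
print(f"certainty = {certainty:.1%}, coverage = {coverage:.1%}")
```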
The number of support cases across the rules ranged from 5 to 50, with stronger rules exhibiting higher predictive reliability. While Rule 1 dominated liquefaction predictions, Rules 19 and 21 were the most influential for non-liquefaction classification. As observed, an imbalance exists in the distribution of rules and metrics within the rule-based liquefaction model. Nonetheless, the model successfully extracted meaningful and interpretable rules, reinforcing its practical applicability.
The certainty factor across all rules ranged from 80% to 100%, with several deterministic rules enhancing overall reliability. Coverage values varied between 3.8% and 37.6%, following a similar trend as rule strength. Liquefaction rules generally exhibited higher coverage than non-liquefaction rules, reflecting the bias in case distribution. Populating the database to balance the distribution is warranted in future studies.

3.2.2. Interpretation of Rules

Interpretable rule-based models enhance seismic hazard assessments by providing transparent decision rules that align with geotechnical principles, making them valuable for decision-support tools used by engineers and city planners. In high-stakes geohazard settings, where lives and infrastructure are at risk, interpretability is not optional. To ensure the reliability of RSML-induced rules, this study applies descriptive accuracy [42] as a qualitative validation method, assessing whether the extracted rules reflect meaningful geotechnical relationships rather than statistical coincidences. By comparing these rules with established soil mechanics and liquefaction principles, the study bridges data-driven modeling with physics-based understanding. While most rules align with conventional knowledge, some reveal unexpected insights, challenging traditional assumptions in liquefaction assessment. These findings are further analyzed using related literature to refine and enhance liquefaction prediction methodologies.
The rule-based liquefaction model offers distinct conditions for liquefaction occurrence and non-occurrence. This greater specificity, likely resulting from the discretization methodology, enhances the model’s ability to capture liquefaction-triggering mechanisms with higher granularity.
Key findings from the rule set reinforce fundamental liquefaction principles. Rule 1, which links very high maximum acceleration (0.3 to 0.84 g) with low penetration resistance (N60: 1–15), is consistent with classical liquefaction theory. However, the presence of partial domain rules (rules generated from the RSML process that lack either a seismic demand parameter or a soil susceptibility parameter) offers new insights into the relative importance and hierarchy of influencing factors. These rules, particularly prevalent in the no-liquefaction class, highlight areas where the model can still make predictions despite incomplete input conditions, suggesting patterns that are indicative of either safety or risk. For instance, Rule 21 identifies a possible seismic “floor,” indicating that, at low moment magnitudes (5.9–7.0) and low accelerations (0.052–0.2 g), liquefaction is unlikely based on the patterns observed in the training dataset. This insight can help simplify hazard screening in low-seismicity regions and improve the efficiency of early-stage risk assessments. It should be noted, however, that recent case histories, such as the Canterbury Earthquake Sequence, demonstrate that liquefaction can still occur under these conditions, particularly in very loose, water-saturated soils [43]. This highlights a limitation of the model in capturing rare but possible events outside the historical dataset and underscores the importance of updating the database with the most recent case histories.
Several rules further refine seismic thresholds. Rule 9 reinforces that high moment magnitude and very high ground acceleration strongly indicate liquefaction, while Rule 19 highlights the critical role of epicentral distance in moderating risk. Notably, Rules 7, 8, and 20 uncover patterns that may complement or expand current understanding of liquefiable soil conditions, especially under varying groundwater levels, fines content, and density. Rule 7 suggests that liquefaction may still occur even with a deep groundwater table (4–7.7 m), challenging the common belief that deeper water tables inhibit liquefaction. For example, during the 1994 Northridge, California, earthquake, a site with a recorded groundwater table depth of 7.2 m manifested soil liquefaction [44]. Rule 8 indicates that soils with very high fines content (35–92%) may still manifest liquefaction-like effects under high acceleration, possibly due to cyclic softening of silts and clays [45,46]. It should be noted that the case-history labels are based solely on surface manifestations and do not differentiate between cyclic liquefaction in sands and cyclic softening in fine-grained soils. Accordingly, the model does not treat cyclic softening as a separate mechanism, and we acknowledge that some locations classified as exhibiting ‘liquefaction occurrence’ may in fact reflect cyclic softening of fine-grained soils. Furthermore, Rule 20 emphasizes the influence of soil density in reducing liquefaction susceptibility, reinforcing the effectiveness of ground improvement strategies.
The inclusion of fine-grained soils in the liquefaction-related rules offers a potential expansion to traditional perspectives, which typically associate liquefaction with clean, loose sands or sands containing a limited percentage of fines. Rule 8 indicates that soils with high fines content can still exhibit susceptibility under high ground acceleration, potentially aligning with cyclic softening behavior observed in past field investigations [47]. This observed behavior underscores the importance of distinguishing between cyclic liquefaction and cyclic softening, particularly in fine-grained soils. It also highlights the need for further research to refine liquefaction assessment frameworks and inform appropriate mitigation strategies [48,49,50].
While the model demonstrates consistency in liquefaction mechanics, the uneven distribution of decision rules suggests a dataset imbalance, particularly in non-liquefaction cases. Expanding the liquefaction database would reinforce threshold values and improve the model’s ability to distinguish between liquefied and non-liquefied conditions, enhancing overall reliability.

3.3. Interpretive Analysis

3.3.1. Sensitivity Analysis

Sensitivity analysis of the rule-based liquefaction model (Table 8) provides further insights into parameter importance in liquefaction prediction. The model demonstrated baseline accuracy of 78.8% (k = 10 folds) and 79.5% (k = 100 folds) across different cross-validation schemes, indicating stability. The removal of the N-value corrected to 60% hammer efficiency caused the most significant accuracy drop, reducing performance to 53.2% (k = 10 folds) and 50% (k = 100 folds). This highlights the critical role of SPT N-value in liquefaction assessment, consistent with decades of empirical observations [1,4]. Accuracy drops from other removed attributes revealed lower individual significance on soil liquefaction potential, but their interactions with other parameters were captured in the parameter interaction map.
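The ablation procedure behind this kind of sensitivity analysis can be sketched as follows: the cross-validated accuracy is computed with all attributes, then recomputed with each attribute removed in turn, and the drop is recorded. The example is a toy under stated assumptions. The classifier is a simple majority-vote lookup table standing in for the RSES rule induction, the data are synthetic (liquefaction occurs only when N60 is low and amax is high), and folds are assigned by row index modulo k. None of these choices are taken from the study.

```python
from collections import Counter, defaultdict


def fit(rows, attrs):
    """Majority decision for each observed combination of attribute bins."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[tuple(row[a] for a in attrs)][row["liq"]] += 1
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    default = Counter(r["liq"] for r in rows).most_common(1)[0][0]
    return table, default


def cv_accuracy(rows, attrs, k):
    """k-fold cross-validated accuracy; fold = row index modulo k."""
    hits = 0
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        test = [r for i, r in enumerate(rows) if i % k == fold]
        table, default = fit(train, attrs)
        for r in test:
            key = tuple(r[a] for a in attrs)
            hits += table.get(key, default) == r["liq"]
    return hits / len(rows)


# Synthetic decision table (assumption): 8 bin combinations repeated 5 times.
rows = [
    {"N60": n, "amax": a, "M": m, "liq": n == "low" and a == "high"}
    for n in ("low", "high")
    for a in ("low", "high")
    for m in ("low", "high")
] * 5

attrs = ["N60", "amax", "M"]
baseline = cv_accuracy(rows, attrs, k=10)
drops = {
    a: baseline - cv_accuracy(rows, [b for b in attrs if b != a], k=10)
    for a in attrs
}
print(baseline, drops)
```

In this toy setting, removing either attribute that defines the outcome lowers the accuracy, while removing the uninformative attribute leaves it unchanged, which is the pattern an ablation table is meant to expose.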

3.3.2. Scenario Map

The rule-based liquefaction model identified four primary liquefaction scenarios: high-magnitude, low-magnitude, dense soil, and loose soil conditions (Figure 7). These scenarios were derived using the best rule set of the rule-based model, clustering parameter thresholds and their associated liquefaction potential. This structured approach provides a practical framework for preliminary site assessments.
In the high-magnitude scenario (M ≥ 7.0, amax ≥ 0.3 g), liquefaction is consistently predicted under near-field conditions (Distance < 50 km), aligning with field observations from the 2023 Kahramanmaraş earthquake sequence [51]. The low-magnitude scenario shows the opposite attributes. This means that when all the criteria in this scenario are satisfied, the likelihood of soil liquefaction is low. This will allow a quick, reliable and interpretable preliminary liquefaction assessment.
The loose high-fines soil scenario highlights how unfavorable combinations of parameters (N60 < 15, shallow groundwater table, and very high fines content) substantially increase liquefaction probability, consistent with recent observations in İskenderun [50]. Several historical earthquakes—including the 1971 San Fernando, 1975 Haicheng, 1977 Argentina, 1978 Miyagiken-Oki, 1979 Imperial Valley, 1981 Westmorland, 1989 Loma Prieta, and 1994 Northridge events—also reported liquefaction in similarly low-resistance, fines-rich deposits. This broader evidence supports the relevance of this scenario and distinguishes it from liquefaction mechanisms associated with loose clean sands.

3.3.3. Parameter Interaction Map

Parameter interaction maps provide structured visual representations of co-occurrences among key parameters within RSML-derived rule sets. These maps were developed using sensitivity analysis and rule interaction assessments, which evaluated both the individual impact of parameters on model accuracy and their co-occurrence within the best rule sets.
By combining these two analyses, the parameter interaction maps highlight critical patterns that influence soil behavior, offering valuable insights for refining liquefaction assessment. Figure 8 presents the established parameter interaction map, where node color indicates parameter importance (from sensitivity analysis) and arrow transparency reflects co-occurrence frequency in the rule set. The map highlights which combinations of parameters frequently appear together in rules predicting liquefaction. Terms such as “strong” or “moderate” co-occurrence describe how often these parameter ranges jointly contribute to a prediction, rather than implying physical or statistical dependency. For clarity and to emphasize significant patterns, arrows representing few or low co-occurrences have been omitted.
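A minimal sketch of the co-occurrence count that underlies such a map is given below. Each rule is reduced to the set of attributes in its IF-part, and every pair of attributes appearing in the same rule is counted once. The four attribute sets follow the conditions described in this section for Rules 1, 9, 18, and 21; the full rules in Table 7 may contain additional conditions, so the counts are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Attribute sets as described in the text (possibly incomplete; assumption).
rules = {
    1: {"amax", "N60"},
    9: {"M", "amax"},
    18: {"R", "N60"},
    21: {"M", "amax"},
}

pair_counts = Counter()
for conditions in rules.values():
    # Sort so that each pair has a single canonical ordering.
    for pair in combinations(sorted(conditions), 2):
        pair_counts[pair] += 1

for pair, count in pair_counts.most_common():
    print(pair, count)
```

Pairs with higher counts would correspond to less transparent arrows in the map, and pairs below a chosen threshold would be omitted.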
The parameter interaction map of the rule-based liquefaction model provides a condensed yet insightful representation of the patterns among liquefaction-triggering parameters. A notable distinction is the explicit role of epicentral distance, which forms two key co-occurrences: (1) with moment magnitude and (2) with SPT N-value. Traditional stress-based models incorporate these influences within a magnitude scaling factor [2]. These models are founded on comparing the earthquake-induced cyclic stress ratio (CSR) with the soil’s cyclic resistance ratio (CRR), as estimated from in situ tests such as SPT, CPT, or shear-wave velocity. If the CSR exceeds the CRR, liquefaction is predicted to occur. The methods, originally developed by Seed and Idriss [1] and later refined by various researchers [2,3,4,32], have been widely used in practice and form the basis of most conventional liquefaction triggering charts. In this study, we refer to these well-established procedures collectively as traditional stress-based models. Near- and far-field conditions have not been explicitly integrated into these conventional approaches. While some energy-based and Arias intensity models consider epicentral distance in liquefaction assessment [52,53,54], its role in stress-based and other methods remains underexplored.
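For readers less familiar with the stress-based comparison described above, the sketch below evaluates the simplified-procedure cyclic stress ratio, CSR = 0.65 (amax/g)(σv/σ′v) rd, and the factor of safety FS = CRR/CSR. The input values are hypothetical, the CRR is supplied directly rather than derived from SPT data, and the magnitude scaling and overburden corrections used in the full procedure are omitted for brevity.

```python
def cyclic_stress_ratio(amax_g, sigma_v, sigma_v_eff, rd):
    """Simplified-procedure CSR = 0.65 * (amax/g) * (sigma_v / sigma'_v) * rd."""
    return 0.65 * amax_g * (sigma_v / sigma_v_eff) * rd


def factor_of_safety(crr, csr):
    """FS below 1 means the demand (CSR) exceeds the resistance (CRR)."""
    return crr / csr


# Hypothetical layer: stresses in kPa, amax as a fraction of g.
csr = cyclic_stress_ratio(amax_g=0.3, sigma_v=100.0, sigma_v_eff=60.0, rd=0.95)
fs = factor_of_safety(crr=0.20, csr=csr)
liquefaction_predicted = fs < 1.0
print(f"CSR = {csr:.3f}, FS = {fs:.2f}, liquefaction predicted: {liquefaction_predicted}")
```

Epicentral distance does not appear as an input here, which illustrates why its influence enters such formulations only indirectly.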
Another key insight is the dominant influence of moment magnitude in parameter interactions. It exhibits the highest number of co-occurrences, underscoring its broad impact on liquefaction triggering. While traditional stress-based models address this through the magnitude scaling factor, the rule-based model suggests that its influence extends beyond conventional representations. Particularly significant is the strong co-occurrence observed between moment magnitude and fines content, a relationship not explicitly accounted for in stress-based liquefaction models. Although magnitude plays an important role in shaping these interactions, the model remains relatively insensitive to its removal in the sensitivity analysis because its predictive information is partially shared with other seismic demand parameters such as maximum acceleration and epicentral distance. These variables collectively compensate for the absence of magnitude, resulting in only a modest reduction in accuracy despite its structural importance. Given its potential significance, further research is recommended to examine the role of fines content in moment magnitude–induced liquefaction susceptibility.
One useful perspective is the role of shaking duration and cumulative cyclic demand in controlling the cyclic strength of fine-grained soils. Large-magnitude earthquakes impose longer-duration cyclic loading, which can enhance excess pore pressure generation and promote cyclic softening in fines-rich, low-plasticity soils, particularly under saturated and low-resistance conditions. Laboratory studies on silts and clays have shown that increasing the number of uniform loading cycles can lead to significant cyclic strength degradation, supporting the physical basis for the observed association between high magnitude and elevated fines content [55,56].
The strong co-occurrence between maximum horizontal acceleration and SPT N-value aligns with the core principles of stress-based liquefaction models. However, the observed pattern between maximum horizontal acceleration and fines content, though not entirely surprising, suggests a potentially underrecognized mechanism—possibly bridging the transition between liquefaction and cyclic softening. This observation presents an opportunity for further research.
Other noteworthy co-occurrences observed in the parameter interaction map include the pattern between moment magnitude and groundwater table (GWT) depth, which may influence the depth and severity of pore pressure buildup during seismic events. Additionally, the arrow linking epicentral distance and SPT N-value highlights how both distance from the seismic source and soil resistance collectively impact liquefaction susceptibility. Although correlated with magnitude and PGA, epicentral distance captures additional distance-dependent effects such as attenuation of shaking energy, duration, and frequency content. Liquefaction is triggered only when the transmitted energy exceeds the soil’s density-controlled resistance; thus, epicentral distance serves as a proxy for whether sufficient shaking reaches the site. This is consistent with observations that liquefaction occurs only within a finite maximum distance for a given magnitude.
Lastly, the link between moment magnitude and maximum horizontal acceleration reflects how larger magnitude earthquakes typically generate stronger ground shaking, reinforcing their combined role in triggering liquefaction.
These patterns and co-occurrences in the ruleset reflect the complex nature of earthquake-induced soil behavior and highlight the value of interpretable models in uncovering patterns that may be missed by conventional approaches. The rule-based interaction analysis not only captures patterns but also points to possible geotechnical mechanisms that are not yet well understood or represented in existing models. This aligns with the perspective of Maurer and Sanger (2024) [18], who emphasize that AI is most effective in cases where relationships between input parameters and outcomes are observable but not yet fully explained by mechanics. The parameter interaction maps in this study support this view, offering a meaningful foundation for refining both practical assessments and future research in liquefaction modeling.

3.4. Evaluation and Application

3.4.1. Rule-Based Predictive Model

Using the rules induced by the RSML, a rule-based classification framework was developed to generate liquefaction predictions at the soil-layer level. While individual rules are not intended to be used as stand-alone predictors, the rule set is applied collectively to produce a single, interpretable assessment. Figure 9 illustrates the workflow for applying the rule-based predictive model.
The procedure consists of the following steps:
  • Data Collection: All parameters required for liquefaction assessment are collected, as described in Section 2.2.
  • Susceptibility Screening: Soil layers that are not susceptible to liquefaction are excluded from further analysis. These include:
    (1) layers located below the groundwater table,
    (2) layers classified as high-plasticity soils, and
    (3) layers with raw SPT N-values greater than 30.
  • Completeness of Parameters: Borehole sites with incomplete parameter sets are excluded from the analysis, and alternative site-specific liquefaction investigation methods are recommended for such cases.
  • Discretization of Parameters: Continuous input variables are discretized using the parameter discretization scheme shown in Figure 4.
  • Activated Rule Determination: The discretized parameters of each soil layer are compared against the rule set generated by the RSML (Table 7). All rules that are satisfied by the input conditions are identified as activated rules.
  • Prediction and Interpretation: The final liquefaction prediction is determined by the activated rule with the highest certainty factor. Any remaining activated rules are used to aid interpretation of the site conditions. If no rule is activated, this indicates that similar site conditions are not represented in the historical dataset, and the rule-based model is considered not applicable for that site.
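The last two steps of the procedure can be sketched as follows. The rule conditions follow the descriptions of Rules 1, 9, and 18 given in this section, and the certainty factor of Rule 1 is the reported 90.9%. The certainty factors assigned to Rules 9 and 18, the bin labels, and the function names are placeholders assumed for illustration; the actual values are those of Table 7.

```python
RULES = [
    {"id": 1, "if": {"amax": "very high", "N60": "low"},
     "then": "liquefaction", "certainty": 0.909},
    {"id": 9, "if": {"M": "high", "amax": "very high"},
     "then": "liquefaction", "certainty": 0.85},      # certainty assumed
    {"id": 18, "if": {"R": "high", "N60": "moderate to high"},
     "then": "no liquefaction", "certainty": 0.95},   # certainty assumed
]


def activated_rules(layer, rules):
    """Rules whose every condition matches the discretized layer."""
    return [
        rule for rule in rules
        if all(layer.get(attr) == label for attr, label in rule["if"].items())
    ]


def predict(layer, rules):
    """Return (governing rule, all activated rules).

    The governing rule is the activated rule with the highest certainty
    factor; (None, []) signals that the model is not applicable.
    """
    active = activated_rules(layer, rules)
    if not active:
        return None, []
    return max(active, key=lambda rule: rule["certainty"]), active


# Hypothetical discretized layer.
layer = {"M": "high", "amax": "very high", "R": "high", "N60": "moderate to high"}
governing, active = predict(layer, RULES)
print(governing["id"], governing["then"], [rule["id"] for rule in active])
```

Returning the full list of activated rules alongside the governing rule keeps the non-governing rules available for interpreting the site conditions.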

3.4.2. Model Evaluation

The functionally grounded model evaluation assessed the predictive classification performance of the RSML-derived rule-based liquefaction model against the stress-based liquefaction triggering procedure proposed by Boulanger & Idriss [2]. The latter served as the baseline model due to its widespread use in engineering practice. Following the functionally grounded model evaluation framework [34], this validation aimed to assess the model’s effectiveness in achieving its intended application.
A comparison of both models, using 151 boreholes with 1099 soil layers from the City of Manila and Quezon City, yielded highly encouraging results, demonstrating strong alignment between the two approaches. Table 9 presents the performance metrics derived from the confusion matrix. The performance metrics primarily reflect consistency between the proposed rule-based model and the established Idriss–Boulanger framework, as both are derived from largely the same global case-history database. Accordingly, this evaluation is intended as a benchmark against current state-of-practice methods rather than as a fully independent external validation.
The high accuracy (93.81%) indicates strong agreement between the rule-based liquefaction model and the simplified procedure. The precision (93.57%) demonstrates that when the rule-based model predicts liquefaction, it aligns with the simplified procedure in most cases. The recall (92.83%) confirms the model’s ability to identify liquefiable cases accurately. The high specificity (94.64%) highlights its strong performance in distinguishing non-liquefiable layers. The balanced F1 score (93.20%) suggests consistent performance across both liquefaction and non-liquefaction predictions, validating the rule-based model as a reliable alternative to conventional assessment methods.
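The relationship between these metrics and the underlying confusion matrix can be sketched as follows, with the stress-based procedure taken as the reference and liquefaction as the positive class. The four counts are not taken from Table 9. They were back-calculated here so that they sum to 1099 layers and reproduce the reported percentages, and they should be treated as an illustrative assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Back-calculated, illustrative counts (assumption; see the note above).
metrics = classification_metrics(tp=466, fp=32, fn=36, tn=565)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```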

3.4.3. Rule-Based Model Application on the 2023 Turkey Earthquake

To further assess the performance of the proposed rule-based model, it was applied to a documented liquefied site in Gölbaşı, Adıyaman, following the 6 February 2023 Kahramanmaraş earthquakes [57]. The site was selected based on the availability of post-earthquake field observations and published geotechnical data. Due to data limitations and to maintain consistency among input parameters, a single representative borehole was analyzed.
The evaluation considered the two mainshock events: the Pazarcık earthquake (Mw 7.8) and the Elbistan earthquake (Mw 7.6). A conventional stress-based liquefaction assessment was performed using the SPT-based deterministic procedure of Boulanger and Idriss [2], and the results were compared with the predictions of the rule-based model developed in this study. The input parameters used for both models are summarized in Table 10.
Figure 10 presents a side-by-side comparison of the stress-based and rule-based model predictions. Both approaches indicate that the site experienced soil liquefaction, which is consistent with post-earthquake field observations reported for the area. However, differences are observed in the predicted depth extent of liquefaction. The stress-based method predicts liquefaction from approximately 2 m to 20 m depth, as indicated by factors of safety less than unity. In contrast, the rule-based model predicts liquefaction primarily within the shallow layers between 2 m and 9 m, with no liquefaction predicted at depths between 9 m and 15 m. It should be noted, however, that the Boulanger and Idriss procedure cautions against interpretation at depths greater than approximately 10 m, where extrapolation of the shear stress reduction factor (rd) introduces increased uncertainty. Accordingly, liquefaction evaluations for depths exceeding about 10 m may benefit from site-response analyses to estimate earthquake-induced cyclic stress ratios, as the uncertainty associated with rd becomes significant at greater depths [2].
Examination of the activated rules indicates that, within the deeper layers, two liquefaction-related rules (Rules 8 and 9) and two non-liquefaction rules (Rules 17 and 18) were triggered. The governing rule was Rule 18, which has the highest certainty factor. This rule states that liquefaction is unlikely to occur when epicentral distance is high (50–257.1 km) and the corrected SPT resistance (N60) is moderate to high (15–30).
The comparison highlights the complementary strengths of the two approaches. While the stress-based method provides a quantitative, depth-dependent factor of safety, the rule-based model offers an interpretable framework that explicitly identifies the controlling parameters governing liquefaction or non-liquefaction at each depth. This transparency facilitates engineering judgment and provides additional insight into parameter interactions that are not explicitly captured in traditional stress-based formulations.

3.4.4. Practical Applications

In addition to its predictive capability, the rule-based liquefaction model developed in this study offers several practical applications through three core outputs: (1) decision rules, (2) scenario maps, and (3) parameter interaction maps. These interpretive tools can be used individually, in combination, or as complementary components depending on the engineering task or stakeholder need, as provided in Table 11.
The decision rules are particularly useful for rapid site screening, as they require minimal computational effort while achieving accuracy comparable to state-of-practice models. Their explainability makes them ideal for integration into decision-support systems or mobile applications. Moreover, they serve as effective instructional tools for demonstrating cause-and-effect relationships in liquefaction behavior, supporting training for engineers and planners.
Scenario maps are best suited for spatial visualization tasks such as hazard mapping and site classification. Their clear representation of conditional thresholds supports zoning applications and post-earthquake damage assessments, enabling quick, intuitive communication of site-specific risks.
Parameter interaction maps provide insight into the strength and nature of interdependence among key variables. These are particularly valuable for technical users, offering support in sensitivity analysis, feature selection, and resilience-oriented planning. They also help engineers make informed design decisions based on the relative importance of influencing parameters.
When used in combination, these tools unlock even more specialized applications. For example, combining decision rules with scenario maps enables site profiling and contextual hazard screening. Merging decision rules with parameter interaction maps allows for multi-parameter design checks and confidence weighting. Scenario and interaction maps together can support regional pattern discovery and visual training in geotechnical behavior.
When all three tools are integrated—or complemented with existing models or decision systems—they can enhance a comprehensive decision support platform, enabling explainable model benchmarking, robust hazard communication, and intelligent geotechnical engineering for a broad range of users, including engineers, planners, policymakers, and educators.

3.5. Discussion

This section critically examines the performance, practicality, and broader implications of the interpretive tools developed in this study. By comparing the rule-based liquefaction model with both state-of-practice and advanced machine learning approaches, the study situates its contributions within the evolving landscape of liquefaction assessment. In doing so, it also addresses key criticisms raised against AI-based models, demonstrating how interpretability and methodological rigor can coexist. Finally, the section outlines current limitations and proposes directions for future refinement, ensuring continued relevance and reliability of these tools in geotechnical engineering applications.

3.5.1. Core Findings of the Study

The developed rule-based liquefaction model demonstrated strong predictive capability, achieving an overall accuracy of 86.2% and a coverage of 86.5%, which slightly exceeds the typical performance range of established state-of-practice (SOP) methods (70–85%). Furthermore, benchmarking against the widely adopted Boulanger and Idriss stress-based procedure showed a high level of agreement (93.81% accuracy), reinforcing the model’s consistency with accepted geotechnical principles.
Beyond performance metrics, the core contribution of this study lies in its interpretive insights into liquefaction behavior. The extracted IF–THEN rules revealed both classical and non-traditional relationships among seismic and soil parameters. The most reliable rule—linking high ground acceleration with low corrected SPT N-values—aligns closely with conventional liquefaction theory, validating the methodological soundness of the approach. At the same time, the model identified important departures from simplified assumptions embedded in traditional frameworks. Notably, soils with high fines content (35–92%) were shown to be susceptible to liquefaction under strong shaking, suggesting cyclic softening mechanisms in fines-rich soils that are not explicitly captured in many stress-based models.
The interaction analysis further revealed that epicentral distance plays a more active role in liquefaction triggering than commonly assumed, exhibiting meaningful co-occurrences with both earthquake magnitude and soil resistance parameters. Additionally, the strong co-occurrence between moment magnitude and fines content, as well as a potential interaction between acceleration and fines content, points to complex liquefaction pathways in silty and clayey soils. These findings underscore the value of interpretable models in uncovering parameter interactions that may otherwise remain obscured in black-box ML approaches.

3.5.2. Methodological Advantages and Comparison with the Existing Models

To evaluate the practical relevance of the proposed model, a comparative analysis was conducted against traditional semi-empirical approaches and ML models reported in the literature. The developed rule-based liquefaction model achieved an overall accuracy of 86.2% and a coverage of 86.5%, which falls slightly above the typical performance range (70–85%) of state-of-practice (SOP) models [18]. Compared to other interpretable ML models for liquefaction assessment [25,26,27], the performance metrics of the proposed model are comparable, but its key advantage lies in its use of simpler input parameters and lower computational demands, making it especially practical for routine engineering applications and settings or communities with limited resources.
In contrast, black-box ML models—such as neural networks and ensemble methods—often report higher predictive accuracies of up to 99% [9,10,12]. However, these models lack transparency, making them less suitable for engineering tasks that require explainability, regulatory accountability, and user trust. This highlights the well-established tradeoff between interpretability and predictive performance [33,34]. While the rule-based approach may sacrifice a small degree of accuracy, it offers significant advantages in terms of transparency, traceability of parameters, and integration into decision-making workflows—key factors in disaster risk reduction and urban resilience planning.
Moreover, this study addresses common criticisms of AI-based liquefaction models, including lack of comparison with SOP models, weak adherence to methodological best practices, and limited real-world applicability [18]. The model was benchmarked against the Boulanger and Idriss SOP framework, achieving strong alignment (93.81% accuracy), and was developed through rigorous preprocessing, ablation-based sensitivity analysis, and validation using real-world borehole data. By employing rough set-based IML, the study produced human-readable IF–THEN rules, supporting tools such as scenario maps and parameter interaction maps, and a set of proposed practical applications. Together, these help ensure that the model is not only explainable but also usable.
Finally, the study highlights opportunities to further improve interpretable models by expanding geotechnical and seismic datasets, especially in underrepresented conditions, and refining rule induction techniques. These enhancements could help close the performance gap between interpretable and black-box models, positioning rule-based approaches as a robust, trustworthy, and scalable solution for intelligent geotechnical engineering.

3.5.3. Limitations and Opportunities for Improvement

While the interpretive tools developed in this study offer significant advancements in liquefaction assessment, each comes with inherent limitations that must be acknowledged to ensure appropriate interpretation and application. The decision rules, although effective at identifying key patterns, are highly dependent on the quality, balance, and representativeness of the dataset. Before applying the rules, each soil layer must first be screened for liquefiable characteristics, namely, that it is saturated, loose, and composed predominantly of sand or low to non-plastic fines. Once deemed potentially liquefiable, the soil layer should be evaluated against all 25 decision rules. The rule with the highest certainty factor among those activated will provide the primary prediction, while the remaining activated rules offer supporting insights for site-specific interpretation.
Importantly, no individual rule should be used as a stand-alone predictive model. Reliable assessment requires a holistic review of all relevant rules and contextual information.
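The screening-then-matching procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Rule` class, the function names, the three demonstration rules, and their certainty values are all hypothetical, and the layer is assumed to have already been discretized into the Low/High/Very High categories used by the model.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: int
    conditions: dict   # attribute -> required category, e.g. {"amax": "High"}
    outcome: str       # "Yes" (liquefaction) or "No"
    certainty: float   # certainty factor of the rule


def activated_rules(layer, rules):
    """Rules whose every condition matches the layer's discretized attributes."""
    return [r for r in rules
            if all(layer.get(a) == c for a, c in r.conditions.items())]


def classify_layer(layer, rules, potentially_liquefiable):
    """Screen first, then let the highest-certainty activated rule decide."""
    if not potentially_liquefiable:
        return {"prediction": None, "primary": None, "supporting": [],
                "note": "screened out before rule evaluation"}
    hits = sorted(activated_rules(layer, rules),
                  key=lambda r: r.certainty, reverse=True)
    if not hits:
        return {"prediction": None, "primary": None, "supporting": [],
                "note": "no rule activated"}
    return {"prediction": hits[0].outcome, "primary": hits[0].rule_id,
            "supporting": [r.rule_id for r in hits[1:]], "note": ""}


# Hypothetical rules and certainty values, for illustration only.
demo_rules = [
    Rule(101, {"amax": "High", "N60": "Low"}, "Yes", 0.95),
    Rule(102, {"N60": "High"}, "No", 0.80),
    Rule(103, {"M": "High", "FC": "High"}, "Yes", 0.70),
]
demo_layer = {"M": "High", "amax": "High", "R": "Low", "N60": "Low", "FC": "High"}
result = classify_layer(demo_layer, demo_rules, potentially_liquefiable=True)
print(result)
```

Returning the supporting rules alongside the primary one keeps the full set of activated rules visible, which is consistent with the recommendation that no single rule be read in isolation.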
For future improvements, the presence of partial-domain rules—those missing either a seismic demand or a soil resistance parameter—highlights the need for additional data to improve rule completeness and robustness.
Scenario maps are limited by the resolution and diversity of the training dataset. While they offer empirical thresholds and reveal contextual patterns, they may fail to capture rare or extreme liquefaction cases, especially in underrepresented soil types or unique seismic settings. Parameter interaction maps, meanwhile, depend on the co-occurrence frequency of parameters within the rule set. Some critical interactions may be underrepresented due to data scarcity, and the current interaction maps do not account for temporal variables such as stress history or seismic duration effects.
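Because the interaction maps are built on how often parameters appear together in the rules, the underlying count is simple to reproduce. The sketch below is one plausible way to tally pairwise co-occurrence, assuming each rule is represented only by the set of attributes in its conditions; the function name and the five example rules are hypothetical and do not correspond to the published rule set.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(rules):
    """Count how many rules contain each unordered pair of attributes."""
    counts = Counter()
    for attrs in rules:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(attrs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical rules, each given as the set of attributes it conditions on.
example_rules = [
    {"amax", "N60"},
    {"M", "FC"},
    {"M", "R", "N60"},
    {"amax", "FC"},
    {"M", "FC", "N60"},
]
counts = cooccurrence_counts(example_rules)
for pair, n in counts.most_common():
    print(pair, n)
```

A pair that never shares a rule simply has a count of zero, which illustrates the limitation noted above: an interaction absent from the map may reflect sparse data rather than a physically absent relationship.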
In terms of variability and uncertainty, soil parameters, including fines content, can vary considerably across a site. For example, the fines content at a given depth can differ significantly at a point just 1 m away horizontally. This inherent variability is reflected in the current rule-based model. Future rule-based models or other interpretable AI algorithms can be developed to better account for the spatial variability and uncertainties of these soil parameters.
These limitations point to clear opportunities for continued development. Expanding the liquefaction case history database, especially with more non-liquefaction cases, can improve model generalizability and accuracy. Refining discretization strategies through adaptive binning or data-driven thresholds could further enhance rule clarity while preserving interpretability. Additionally, integrating underutilized parameters such as Atterberg limits, CPT-based indices, and shear wave velocity may better capture the complexity of liquefaction behavior, ultimately improving the practical utility and scientific rigor of interpretable machine learning in geotechnical engineering.

4. Conclusions

This study developed an interpretable machine learning framework for liquefaction assessment using rough set theory and borehole-based seismic data. The resulting rule-based model achieved an accuracy of 86.2% with 86.5% coverage and showed strong agreement (93.81%) with the established stress-based procedure of Boulanger and Idriss [2]. Twenty-five transparent IF–THEN decision rules were derived, capturing key relationships among earthquake loading and soil resistance parameters while remaining fully explainable.
Beyond predictive performance, the framework provides practical interpretive tools—including decision rules, scenario maps, and parameter interaction maps—that support engineering judgment and transparent decision-making. The analysis confirms the dominant role of corrected SPT N-value while highlighting meaningful interactions involving magnitude, fines content, and epicentral distance that are often simplified in conventional approaches.
Critically, the study addresses key concerns raised in the literature regarding the opacity, impracticality, and methodological gaps of AI-based liquefaction models. By emphasizing explainability, real-world validation, and functionality, the proposed approach serves as a bridge between advanced data-driven methods and traditional geotechnical practice.
Future work will focus on applying the proposed rule-based model to independent liquefaction case histories from the Philippines and from recent earthquake events worldwide. Emphasis will be placed on compiling post-earthquake datasets with well-documented surface manifestations and complete subsurface information to enable rigorous external validation. In addition, future applications will consider a wider range of moment magnitudes and epicentral distances to better evaluate the model’s performance under varying seismic scenarios and to extend its applicability to regional-scale liquefaction assessment.
While limitations such as partial-domain rules, data imbalance, and constrained scenario representation were acknowledged, these also provide clear pathways for future enhancement through database expansion, refined discretization, and inclusion of underutilized parameters like CPT and shear wave velocity. Ultimately, this study offers a reliable, interpretable, and field-ready decision-making framework, marking a significant step toward intelligent and transparent liquefaction assessment in geotechnical engineering.

Supplementary Materials

The following supporting information can be downloaded at: https://doi.org/10.5281/zenodo.16665083. It includes (1) the SPT-based soil liquefaction case histories used in rough set machine learning and (2) the table of model evaluation borehole data and predictions.

Author Contributions

Conceptualization, J.D. and E.T.; methodology, J.D. and E.T.; software, E.T.; validation, J.D. and E.T.; formal analysis, J.D. and E.T.; investigation, J.D. and E.T.; resources, J.D. and E.T.; data curation, J.D. and E.T.; writing—original draft preparation, E.T.; writing—review and editing, J.D.; visualization, E.T.; supervision, J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the government of the Philippines through the Department of Science and Technology’s Engineering Research and Development for Technology (DOST-ERDT) scholarship program.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors acknowledge DOST-ERDT for funding.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Seed, H.B.; Idriss, I.M. Simplified Procedure for Evaluating Soil Liquefaction Potential. J. Soil. Mech. Found. Div. 1971, 97, 1249–1273.
  2. Boulanger, R.W.; Idriss, I.M. CPT and SPT Based Liquefaction Triggering Procedures; Center for Geotechnical Modeling Department of Civil and Environmental Engineering University of California Davis: Davis, CA, USA, 2014.
  3. Cetin, K.O.; Seed, R.B.; Kayen, R.E.; Moss, R.E.S.; Bilge, H.T.; Ilgac, M.; Chowdhury, K. Dataset on SPT-based seismic soil liquefaction. Data Brief. 2018, 20, 544–548.
  4. Youd, T.L.; Idriss, I.M.; Andrus, R.D.; Arango, I.; Castro, G.; Christian, J.T.; Dobry, R.; Finn, W.D.L.; Harder, L.F.; Hynes, M.E.; et al. Liquefaction Resistance of Soils: Summary Report from the 1996 NCEER and 1998 NCEER/NSF Workshops on Evaluation of Liquefaction Resistance of Soils. J. Geotech. Geoenviron. Eng. 2001, 127, 817–833.
  5. Green, R.A.; Bommer, J.J.; Rodriguez-Marek, A.; Maurer, B.W.; Stafford, P.J.; Edwards, B.; Kruiver, P.P.; De Lange, G.; Van Elk, J. Addressing limitations in existing ‘simplified’ liquefaction triggering evaluation procedures: Application to induced seismicity in the Groningen gas field. Bull. Earthq. Eng. 2019, 17, 4539–4557.
  6. Upadhyaya, S.; Green, R.A.; Rodriguez-Marek, A.; Maurer, B.W. True Liquefaction Triggering Curve. J. Geotech. Geoenviron. Eng. 2023, 149, 04023005.
  7. Tsaparli, V.; Kontoe, S.; Taborda, D.M.G.; Potts, D.M. A case study of liquefaction: Demonstrating the application of an advanced model and understanding the pitfalls of the simplified procedure. Géotechnique 2020, 70, 538–558.
  8. Cubrinovski, M.; Rhodes, A.; Ntritsos, N.; Van Ballegooy, S. System response of liquefiable deposits. Soil Dyn. Earthq. Eng. 2019, 124, 212–229.
  9. Kennedy, S.; Alabbood, M.; Jaafar, I.M. Evaluating seismic liquefaction potential using shear wave velocity using machine learning. In Proceedings of the 9th World Congress on Civil, Structural, and Environmental Engineering, London, UK, 14–16 April 2024.
  10. Fadliansyah, F.; Faris, F.; Wilopo, W. Implementation of machine learning classification models considering the optimum data ratio in predicting soil liquefaction susceptibility. IOP Conf. Ser. Earth Environ. Sci. 2024, 1416, 012012.
  11. Samui, P.; Sitharam, T.G. Machine learning modelling for predicting soil liquefaction susceptibility. Nat. Hazards Earth Syst. Sci. 2011, 11, 1–9.
  12. Kurnaz, T.F.; Erden, C.; Kökçam, A.H.; Dağdeviren, U.; Demir, A.S. A hyper parameterized artificial neural network approach for prediction of the factor of safety against liquefaction. Eng. Geol. 2023, 319, 107109.
  13. Chou, J.-S.; Pham, T.-B.-Q. Enhancing soil liquefaction risk assessment with metaheuristics and hybrid learning techniques. Georisk 2024, 19, 115–133.
  14. Demir, A.S.; Kurnaz, T.F.; Kökçam, A.H.; Erden, C.; Dağdeviren, U. A comparative analysis of ensemble learning algorithms with hyperparameter optimization for soil liquefaction prediction. Environ. Earth Sci. 2024, 83, 289.
  15. Demir, S.; Sahin, E.K. An investigation of feature selection methods for soil liquefaction prediction based on tree-based ensemble algorithms using AdaBoost, gradient boosting, and XGBoost. Neural Comput. Appl. 2023, 35, 3173–3190.
  16. Jas, K.; Mangalathu, S.; Dodagoudar, G.R. Evaluation and analysis of liquefaction potential of gravelly soils using explainable probabilistic machine learning model. Comput. Geotech. 2024, 167, 106051.
  17. Kökçam, A.H.; Erden, C.; Demir, A.S.; Kurnaz, T.F. Bibliometric analysis of artificial intelligence techniques for predicting soil liquefaction: Insights and MCDM evaluation. Nat. Hazards 2024, 120, 11153–11181.
  18. Maurer, B.W.; Sanger, M.D. Why “AI” models for predicting soil liquefaction have been ignored, plus some that shouldn’t be. Earthq. Spectra 2023, 39, 1883–1910.
  19. Cheng, K.; Ziotopoulou, K. Machine Learning Applications in Geotechnical Earthquake Engineering: Progress, Gaps, and Opportunities. In Proceedings of the Geo-Congress 2023; American Society of Civil Engineers: Vancouver, BC, Canada, 2023; Volume 2023, pp. 493–505.
  20. Rudin, C.; Chen, C.; Chen, Z.; Huang, H.; Semenova, L.; Zhong, C. Interpretable machine learning: Fundamental principles and 10 grand challenges. Statist. Surv. 2022, 16, 1–85.
  21. Hansen, T.F. Can We Trust the Machine Learning Based Geotechnical Model? In Information Technology in Geo-Engineering; Springer Series in Geomechanics and Geoengineering; Springer Nature Switzerland: Cham, Switzerland, 2025; pp. 332–340. ISBN 978-3-031-76527-8.
  22. Liu, W.; Liu, F.; Fang, W.; Love, P.E.D. Causal discovery and reasoning for geotechnical risk analysis. Reliab. Eng. Syst. Saf. 2024, 241, 109659.
  23. Zhang, P.; Yin, Z.-Y.; Sheil, B. Interpretable data-driven constitutive modelling of soils with sparse data. Comput. Geotech. 2023, 160, 105511.
  24. Cubrinovski, M.; Ntritsos, N.; Dhakal, R.; Rhodes, A. Key aspects in the engineering assessment of soil liquefaction. In Earthquake Geotechnical Engineering for Protection and Development of Environment and Constructions; CRC Press: Boca Raton, FL, USA, 2019; pp. 189–208. ISBN 978-0-367-14328-2.
  25. Arabani, M.; Pirouz, M. Liquefaction Prediction Using Rough Set Theory. Sci. Iran. 2017, 26, 779–788.
  26. Torres, E.; Dungca, J. Prediction of Soil Liquefaction Triggering Using Rule-Based Interpretable Machine Learning. Geosciences 2024, 14, 156.
  27. Torres, E.S.; Dungca, J.R. An Interpretable Machine Learning Approach in Understanding Lateral Spreading Case Histories. Int. J. GEOMATE 2024, 26, 110–117.
  28. Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356.
  29. Bazan, J.G.; Szczuka, M. The Rough Set Exploration System. In Transactions on Rough Sets III; Peters, J.F., Skowron, A., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3400, pp. 37–56. ISBN 978-3-540-25998-5.
  30. Bazan, J.G.; Szczuka, M. RSES 2.2 User’s Guide; Warsaw University: Warszawa, Poland, 2005.
  31. Ntritsos, N.; Cubrinovski, M. Ground-motion effects on liquefaction response. Soil Dyn. Earthq. Eng. 2024, 177, 108392.
  32. Idriss, I.M.; Boulanger, R.W. SPT-Based Liquefaction Triggering Procedures; Center for Geotechnical Modeling Department of Civil and Environmental Engineering University of California Davis: Davis, CA, USA, 2010.
  33. Allen, G.I.; Gan, L.; Zheng, L. Interpretable Machine Learning for Discovery: Statistical Challenges and Opportunities. Annu. Rev. Stat. Its Appl. 2024, 11, 97–121.
  34. Doshi-Velez, F.; Kim, B. Towards A Rigorous Science of Interpretable Machine Learning. arXiv 2017, arXiv:1702.08608.
  35. Galupino, J.; Dungca, J. Estimating Liquefaction Susceptibility Using Machine Learning Algorithms with a Case of Metro Manila, Philippines. Appl. Sci. 2023, 13, 6549.
  36. Liao, S.S.C.; Whitman, R.V. Overburden Correction Factors for SPT in Sand. J. Geotech. Engrg. 1986, 112, 373–377.
  37. Dungca, J.R.; Chua, R.A.D. Development of a Probabilistic Liquefaction Potential Map for Metro Manila. Int. J. GEOMATE 2016, 10, 1804–1809.
  38. Moreno, J.J.H.; Dean, A.; Ahmad, M.; Del Mundo, C.R.; Vea Kathrynn, G.; Sarmiento, R.; Abejuro, L.H.C.; Emmanuel, C.; Endaya, J.; Valdez, R.L.C.; et al. Reliability Analysis of Earthquake-Induced Liquefaction in Manila using Monte Carlo Simulation. In Proceedings of the 2022 IEEE 14th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), Boracay Island, Philippines, 1–4 December 2022; IEEE: Boracay Island, Philippines, 2022; pp. 1–6.
  39. Eslami, A.; Ghorbani, A. Assessment of Near-Field Strong Ground Motion Effects on Offshore Wind Turbines Resting on Liquefiable Soils Using Fully Coupled Nonlinear Dynamic Analysis. J. Geotech. Geoenviron. Eng. 2023, 149, 04023095.
  40. Wang, C.-Y.; Manga, M. Liquefaction. In Water and Earthquakes; Lecture Notes in Earth System Sciences; Springer International Publishing: Cham, Switzerland, 2021; pp. 301–321. ISBN 978-3-030-64307-2.
  41. Kramer, S.L. Geotechnical Earthquake Engineering; Prentice-Hall International Series in Civil Engineering and Engineering Mechanics; Prentice Hall: Upper Saddle River, NJ, USA, 1996; ISBN 978-0-13-374943-4.
  42. Murdoch, W.J.; Singh, C.; Kumbier, K.; Abbasi-Asl, R.; Yu, B. Definitions, methods, and applications in interpretable machine learning. Proc. Natl. Acad. Sci. USA 2019, 116, 22071–22080.
  43. Quigley, M.; Duffy, B. Effects of Earthquakes on Flood Hazards: A Case Study From Christchurch, New Zealand. Geosciences 2020, 10, 114.
  44. Holzer, T.L.; Bennett, M.J.; Ponti, D.J.; Tinsley, J.C., III. Liquefaction and Soil Failure During 1994 Northridge Earthquake. J. Geotech. Geoenviron. Eng. 1999, 125, 438–452.
  45. Boulanger, R.W.; Idriss, I.M. Evaluating the Potential for Liquefaction or Cyclic Failure of Silts and Clays; Center for Geotechnical Modeling Department of Civil & Environmental Engineering University of California Davis: Davis, CA, USA, 2004.
  46. Bray, J.D.; Sancio, R.B.; Durgunoglu, T.; Onalp, A.; Youd, T.L.; Stewart, J.P.; Seed, R.B.; Cetin, O.K.; Bol, E.; Baturay, M.B.; et al. Subsurface Characterization at Ground Failure Sites in Adapazari, Turkey. J. Geotech. Geoenviron. Eng. 2004, 130, 673–685.
  47. Bray, J.D.; Sancio, R.B. Assessment of the Liquefaction Susceptibility of Fine-Grained Soils. J. Geotech. Geoenviron. Eng. 2006, 132, 1165–1177.
  48. Hu, X.; Zhang, Y.; Guo, L.; Wang, J.; Cai, Y.; Fu, H.; Cai, Y. Cyclic behavior of saturated soft clay under stress path with bidirectional shear stresses. Soil Dyn. Earthq. Eng. 2018, 104, 319–328.
  49. Sun, L.; Gu, C.; Wang, P. Effects of cyclic confining pressure on the deformation characteristics of natural soft clay. Soil Dyn. Earthq. Eng. 2015, 78, 99–109.
  50. Wei, Y.; Zhu, Y.; Ni, J. Experimental Study on the Combined Effect of Cyclic and Static Loads on the Mechanical Properties of the Saturated Soft Clay Material. Key Eng. Mater. 2016, 723, 843–848.
  51. Cetin, K.O.; Soylemez, B.; Guzel, H.; Cakir, E. Soil liquefaction sites following the February 6, 2023, Kahramanmaraş-Türkiye earthquake sequence. Bull. Earthq. Eng. 2024, 23, 921–944.
  52. Baziar, M.H.; Rostami, H. Earthquake Demand Energy Attenuation Model for Liquefaction Potential Assessment. Earthq. Spectra 2017, 33, 757–780.
  53. Kayen, R.E.; Mitchell, J.K. Assessment of Liquefaction Potential during Earthquakes by Arias Intensity. J. Geotech. Geoenviron. Eng. 1997, 123, 1162–1174.
  54. Kumar, K.; Samui, P.; Choudhary, S.S. Assessment of maximum liquefaction distance using soft computing approaches. Geomech. Eng. 2024, 37, 395–418.
  55. Boulanger, R.W.; Idriss, I.M. Cyclic Strength Evaluation Criteria for Sand-Like, Clay-Like, and Intermediate Soils. In Proceedings of the Geo-Congress 2024; American Society of Civil Engineers: Vancouver, BC, Canada, 2024; pp. 456–466.
  56. Boulanger, R.W.; Idriss, I.M. Evaluation of Cyclic Softening in Silts and Clays. J. Geotech. Geoenviron. Eng. 2007, 133, 641–652.
  57. Zeybek, A.; Yıldız, Ö.; Sönmezer, Y.B. Post-earthquake assessment of liquefaction in Gölbaşı-Adıyaman following the 6 February 2023 Kahramanmaraş earthquakes: Field observations, SPT-based analysis, and laboratory testing of soil ejecta. Nat. Hazards 2025, 121, 24157–24206.
Figure 1. Four-stage research framework for interpretable machine learning-based liquefaction assessment. The approach consists of (1) data preparation and sourcing; (2) rule-based model generation using rough set machine learning (RSML); (3) interpretive analysis of rules through sensitivity, interaction, and scenario mapping; and (4) model evaluation and site-adaptive application.
Figure 2. Distribution plots of the seven conditional attributes used in the rule-based machine learning model for soil liquefaction assessment. Each subfigure displays a violin and box plot showing the parameter distribution across three categories: No Liquefaction (n = 118), Yes Liquefaction (n = 133), and All Cases (n = 251). The box plots represent medians and interquartile ranges, while the violins illustrate the data density. These visualizations support the discretization of continuous variables for model development and offer insight into how each parameter varies with liquefaction occurrence. The attributes are as follows: (a) Moment magnitude, (b) Maximum horizontal acceleration (g), (c) Epicentral distance (km), (d) Average depth of the critical layer (m), (e) Depth to groundwater table (m), (f) SPT N-value corrected to 60% hammer efficiency (N60), and (g) Fines content (%).
Figure 3. Rough set machine learning algorithm used in this study.
Figure 4. Distribution of case histories across discretized value ranges for seven condition (input parameter) attributes. Each bar is divided into Low, High, and/or Very High categories, with the numbers in parentheses indicating the actual value intervals corresponding to each category.
Figure 5. Flow of model evaluation.
Figure 6. Rule statistics of the best rule set. Each rule is evaluated based on strength, certainty, and coverage (plotted on the left axis), and support (plotted on the right axis). (a) Rules 1–15 correspond to liquefaction outcomes, while (b) Rules 16–25 indicate no-liquefaction outcomes.
Figure 7. Scenario-based classification maps illustrating liquefaction triggering potential under different seismic and soil conditions.
Figure 8. Parameter interaction map for liquefaction triggering derived from the rule-based liquefaction model. Node color indicates parameter importance based on ablation sensitivity analysis, while arrow color reflects the co-occurrence frequency between parameter pairs.
Figure 9. Framework for applying the developed rule-based predictive model to classify the liquefaction potential of a soil layer at a site.
Figure 10. Assessment of liquefaction potential of a borehole from Gölbaşı Adıyaman following the 6 February 2023 Kahramanmaraş, Turkey earthquakes showing (a) N60 values, (b) factor of safety using the SPT-based deterministic method of Boulanger and Idriss [2], and (c) activated rules using the rule-based model. The shaded portion (in pink) shows the liquefied layers.
Table 1. Sample spreadsheet of raw data for the rule-based model.
IDEventMamax (g)R (km)Avg
Depth (m)
Depth GWT (m)N60FCLiq?
11944 Tohnankai8.10.2171.55.22.15.910Yes
21944 Tohnankai8.10.2167.54.32.42.330Yes
31944 Tohnankai8.10.2170.13.72.1127Yes
41948 Fukui70.43.441.280Yes
51948 Fukui70.356.87.53.717.34Yes
61964 Niigata7.60.09158.63.312.65Yes
Table 2. Types of attributes used in the rule-based liquefaction model.
| Attributes | Definition | Condition or Decision | Range of Values | Continuous or Categorical |
|---|---|---|---|---|
| M | Moment Magnitude | Condition | 5.9 to 8.3 | Continuous |
| amax (g) | Maximum Acceleration | Condition | 0.052 to 0.84 | Continuous |
| R (km) | Epicentral Distance | Condition | 2.2 to 257.1 | Continuous |
| Avg depth (m) | Average Depth | Condition | 1.8 to 14.3 | Continuous |
| Depth GWT (m) | Depth of Groundwater Table | Condition | 0 to 7.7 | Continuous |
| N60 | SPT N-value corrected to 60% Hammer Efficiency | Condition | 1.0 to 50.8 | Continuous |
| FC (%) | Fines Content | Condition | 0 to 92 | Continuous |
| Liq? | No or Yes | Decision | N/A | Categorical |
Table 3. Sample decision table for the rule-based liquefaction model.
| M | amax (g) | R (km) | Avg Depth (m) | Depth GWT (m) | N60 | FC (%) | Liq? |
|---|---|---|---|---|---|---|---|
| High | High | High | High | High | Low | High | No |
| High | High | High | Low | High | Low | High | No |
| High | High | High | Low | High | Low | High | No |
| High | Very High | Low | Low | Low | Low | Low | No |
| High | Very High | Low | High | High | High | Low | No |
| High | Low | High | Low | Low | Low | High | No |
| High | Low | High | High | Low | Low | Low | No |
Table 4. Data and parameters used in model evaluation.
| Parameters | Manila City | Quezon City |
|---|---|---|
| Number of borehole data | 88 | 63 |
| Moment magnitude | 7.5 | 7.5 |
| Maximum horizontal acceleration | 0.38 g | 0.39 g |
| Epicentral distance (from the Marikina West Valley Fault) | 10.65 km | 4.34 km |
| Hammer efficiency correction (CE) | 1.25 | 1.25 |
| Borehole diameter correction (CB) | 1.00 | 1.00 |
| Rod length correction (CR), 3–4 m (10–13 ft) | 0.75 | 0.75 |
| Rod length correction (CR), 4–6 m (13–20 ft) | 0.85 | 0.85 |
| Rod length correction (CR), 6–10 m (20–30 ft) | 0.95 | 0.95 |
| Rod length correction (CR), >10 m (>30 ft) | 1.00 | 1.00 |
| Sampling method correction (CS) | 1.20 | 1.20 |
| Correction due to overburden pressure (CN) | Liao and Whitman (m = 0.5) | Liao and Whitman (m = 0.5) |
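
The correction factors in Table 4 are typically combined multiplicatively to obtain N60 from the field blow count. The sketch below assumes the standard form N60 = N × CE × CB × CR × CS and the Liao and Whitman overburden factor CN = (Pa/σ′v)^m with m = 0.5. The function names, the handling of the band edges at 4, 6, and 10 m, and the atmospheric pressure of 101.325 kPa are assumptions made for illustration; the source does not state them.

```python
import math


def rod_length_correction(rod_length_m):
    """C_R from the Table 4 bands; band-edge handling is an assumption."""
    if rod_length_m < 3:
        raise ValueError("rod length below the tabulated 3 m lower bound")
    if rod_length_m < 4:
        return 0.75
    if rod_length_m < 6:
        return 0.85
    if rod_length_m <= 10:
        return 0.95
    return 1.00


def n60(n_field, rod_length_m, ce=1.25, cb=1.00, cs=1.20):
    """Field blow count scaled by the Table 4 correction factors."""
    return n_field * ce * cb * rod_length_correction(rod_length_m) * cs


def overburden_correction(sigma_v_eff_kpa, m=0.5, pa_kpa=101.325):
    """Liao and Whitman form C_N = (Pa / sigma'_v)^m; Pa value is assumed."""
    return (pa_kpa / sigma_v_eff_kpa) ** m


# Example: field N = 10 blows with a 5 m rod length.
print(round(n60(10, 5.0), 2))
print(round(overburden_correction(50.0), 3))
```

The defaults mirror the tabulated CE, CB, and CS values so that only the field blow count and rod length need to be supplied per sample.
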
Table 5. Descriptive statistics for rule-based liquefaction model.
| Statistic | M | amax (g) | R (km) | Avg Depth (m) | Depth GWT (m) | N60 | FC (%) |
|---|---|---|---|---|---|---|---|
| Minimum | 5.90 | 0.05 | 2.23 | 1.80 | 0.00 | 1.00 | 0.00 |
| Maximum | 8.30 | 0.84 | 257.05 | 14.30 | 7.70 | 50.80 | 92.00 |
| Range | 2.40 | 0.79 | 254.82 | 12.50 | 7.70 | 49.80 | 92.00 |
| Median | 6.93 | 0.24 | 36.60 | 4.60 | 1.80 | 10.00 | 8.00 |
| Mean | 7.15 | 0.30 | 60.01 | 5.10 | 1.98 | 12.15 | 17.58 |
| Standard deviation (n-1) | 0.52 | 0.16 | 46.09 | 2.15 | 1.24 | 8.78 | 21.77 |
| Skewness (Pearson) | −0.18 | 1.01 | 1.17 | 1.05 | 1.49 | 1.58 | 1.69 |
Table 6. Performance Metrics of the Initial Discretization and the Selected Best Rule Set.
| Rule Set Characteristics | Initial Discretization | Selected Best Rule Set |
|---|---|---|
| Shortening Ratio | 0.8 | 0.8 |
| Minimum Number of Supports per Rule | 5 | 5 |
| Number of Rules | 30 | 25 |
| Accuracy (All Data) | 86.8% | 86.2% |
| Coverage (All Data) | 87.3% | 86.4% |
| Accuracy (90/10 Split) | 82.6% | 86.4% |
| Coverage (90/10 Split) | 88.5% | 84.6% |
Table 7. The chosen best rule set for raw borehole data rule-based model.
RulesMamax (g)R (km)Avg Depth (m)GWTAvg NmFC (%)Liq
1 Very High Low Yes
2High LowHighYes
3 High LowHighYes
4High High Low Yes
5High LowLow Yes
6 High High Low Yes
7 Very HighLow Yes
8 Very High Very HighYes
9HighVery High Yes
10HighLow LowLowYes
11 LowHigh Low Yes
12 Low LowLowYes
13High Low Low Yes
14 High High LowYes
15High Low HighYes
16 Very High No
17 LowHigh No
18 High High No
19Low High No
20 High High No
21LowLow No
22 Low High No
23 HighLowNo
24 Low High No
25Low LowHigh HighNo
Table 8. Performance metrics for sensitivity analysis.
| Attribute Removed | Total Accuracy, k = 10 Folds (%) | Total Accuracy, k = 100 Folds (%) |
|---|---|---|
| No Attribute Removed | 78.8 | 79.5 |
| Moment Magnitude | 76.0 | 77.5 |
| Maximum Acceleration | 71.2 | 70.5 |
| Epicentral Distance | 75.2 | 75.0 |
| Average Depth | 76.0 | 76.0 |
| Depth of Groundwater Table | 76.0 | 76.0 |
| N value corrected to 60% Hammer Efficiency | 53.2 | 50.0 |
| Fines Content | 78.4 | 78.5 |
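One way to read Table 8 is as a drop in accuracy relative to the run with no attribute removed. The sketch below shows that arithmetic using the tabulated values only. The dictionary layout and the helper name `accuracy_drops` are illustrative and do not come from the article, and the ranking is a simple subtraction rather than the authors' sensitivity procedure.

```python
# Total accuracy (%) from Table 8, stored as (k = 10 folds, k = 100 folds).
BASELINE = "No Attribute Removed"
ACCURACY = {
    BASELINE: (78.8, 79.5),
    "Moment Magnitude": (76.0, 77.5),
    "Maximum Acceleration": (71.2, 70.5),
    "Epicentral Distance": (75.2, 75.0),
    "Average Depth": (76.0, 76.0),
    "Depth of Groundwater Table": (76.0, 76.0),
    "N value corrected to 60% Hammer Efficiency": (53.2, 50.0),
    "Fines Content": (78.4, 78.5),
}


def accuracy_drops(table, column):
    """Rank attributes by accuracy lost (percentage points) when removed.

    column 0 selects the k = 10 results, column 1 the k = 100 results.
    """
    baseline = table[BASELINE][column]
    drops = {
        name: round(baseline - values[column], 1)
        for name, values in table.items()
        if name != BASELINE
    }
    # Largest drop first: the attribute the model depends on most.
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, drop in accuracy_drops(ACCURACY, 0):
        print(f"{name}: {drop} points")
```

Under both fold settings the corrected N value produces by far the largest drop and fines content the smallest, which matches the ordering visible in the table.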
Table 9. Performance metrics for the model evaluation.
| Metrics | Formula | Solution | Value |
|---|---|---|---|
| Accuracy | (TP + TN)/(TP + TN + FP + FN) | (466 + 565)/(1099) | 93.81% |
| Precision | TP/(TP + FP) | 466/(466 + 32) | 93.57% |
| Recall | TP/(TP + FN) | 466/(466 + 36) | 92.83% |
| Specificity | TN/(TN + FP) | 565/(565 + 32) | 94.64% |
| F1 Score | 2 × (Precision × Recall)/(Precision + Recall) | | 93.20% |
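The formulas in Table 9 can be checked directly from the four confusion-matrix counts implied by the Solution column (TP = 466, TN = 565, FP = 32, FN = 36, totalling 1099). The following is a small sketch of that check. The function name and the returned dictionary keys are illustrative, not taken from the article.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the Table 9 metrics as fractions from confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


if __name__ == "__main__":
    # Counts read from the Solution column of Table 9.
    metrics = classification_metrics(tp=466, tn=565, fp=32, fn=36)
    for name, value in metrics.items():
        print(f"{name}: {value * 100:.2f}%")
```

Running the sketch with these counts reproduces the tabulated values, including the F1 score of 93.20% for which the table gives no worked solution.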
Table 10. Data and parameters used in model evaluation.
| Parameters | Boulanger and Idriss Model | Rule-Based Model |
|---|---|---|
| Moment magnitude | 7.8 (Pazarcik) & 7.6 (Elbistan) | 7.8 (Pazarcik) & 7.6 (Elbistan) |
| Maximum horizontal acceleration | 0.45 g | 0.45 g |
| Epicentral distance | - | 85 km (Pazarcik); 55 km (Elbistan) |
| Average depth | 1.50–19.50 m | 1.50–19.50 m |
| Depth of Groundwater Table | 1.9 m | 1.9 m |
| SPT N-value corrected to 60% Hammer Efficiency | 8–23 | 8–23 |
| Fines Content | 35% | 35% |
| Liquefied? (in Actual) | Yes | Yes |
Table 11. Summary of proposed applications for individual and combined rule-based liquefaction model tools.
| Tool/Tool Combination | Proposed Applications |
|---|---|
| Decision Rules | Rapid site screening; Design decision support and app integration; Engineer training tool |
| Scenario Maps | Site classification or zoning; Liquefaction screening tool in building codes; Post-earthquake forensics |
| Interaction Maps | Sensitivity analysis; Feature selection; Transparency enhancement; Resilience planning |
| Decision Rules + Scenario Maps | Site profiling |
| Decision Rules + Parameter Interaction Maps | Multi-parameter design checks; Confidence weighting |
| Scenario + Parameter Interaction Maps | Regional pattern discovery; Visual geotechnical training |
| All Three Tools Combined | Full decision support platform; Explainable benchmarking tool for new ML models |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Torres, E.; Dungca, J. Interpretable AI for Site-Adaptive Soil Liquefaction Assessment. Geosciences 2026, 16, 25. https://doi.org/10.3390/geosciences16010025