Explainable Artificial Intelligence (XAI) Model for Earthquake Spatial Probability Assessment in Arabian Peninsula

Abstract: Among all natural hazards, earthquakes are particularly arduous to predict. Although many studies have been published on earthquake hazard assessment (EHA), very few address the use of artificial intelligence (AI) in spatial probability assessment (SPA). The SPA modeling process is highly complex because it involves factors ranging from seismological to geophysical. Recent studies have shown that including certain integrated factors, such as ground shaking, seismic gap, and tectonic contacts, in the AI model improves accuracy to a great extent. Because of the black-box nature of AI models, this paper explores the use of an explainable artificial intelligence (XAI) model in SPA. This study aims to develop a hybrid Inception v3-ensemble extreme gradient boosting (XGBoost) model combined with Shapley additive explanations (SHAP). The model is intended to interpret and recognize the factors' behavior and their weighted contribution efficiently. The work explains the specific factors responsible for SPA and their importance. The earthquake inventory data were collected from the US Geological Survey (USGS) for the past 22 years, covering magnitudes of 5 Mw and above. Landsat-8 satellite imagery and digital elevation model (DEM) data were also incorporated in the analysis. Results revealed that the SHAP outputs align with the explanations of the hybrid Inception v3-XGBoost model (87.9% accuracy), indicating the necessity of adding new factors such as seismic gaps and tectonic contacts; in the absence of these factors, the prediction model performs poorly. According to the SHAP interpretations, peak ground acceleration (PGA), magnitude variation, seismic gap, and epicenter density are the most critical factors for SPA. The recent Turkey earthquakes (Mw 7.8, 7.5, and 6.7) along the active East Anatolian Fault validate the obtained AI-based earthquake SPA results. The conclusions drawn from the explainable algorithm highlight the importance of relevant, irrelevant, and new futuristic factors in AI-based SPA modeling.


Introduction
Earthquakes are among the most critical and destructive natural hazards: they last only a few seconds, but their impact persists for years or even decades. According to the National Earthquake Information Center (NEIC), around 20,000 earthquakes occur each year worldwide [1]. Since the 1900s, geologists and seismologists have expected about 16 major earthquakes that could affect over a billion people [2]. The Arabian Peninsula is one of the driest inhabited regions and has experienced several earthquakes due to the Zagros-Bitlis fold and fault belt, the Red Sea, the Gulf of Aden, and the Arabian/Persian Gulf [3]. Future high-magnitude events are possible in this region.
Earthquake spatial probability is basically defined as the potential location for an event of a particular magnitude to occur within a specific period. Seismic hazard is the probability of ground shaking due to earthquakes [4]. However, the definition is quite complicated, as the hazard changes with time and space [5]. Two major types of hazard assessment techniques are used globally: probabilistic seismic hazard assessment (PSHA) and deterministic seismic hazard assessment (DSHA) [6,7]. Statistical models are commonly used in PSHA to estimate the probability, whereas DSHA emphasizes ground shaking based on physical models.
Globally, some works have demonstrated that incorporating fault interaction and stress triggering is necessary in a quantitative earthquake probability study. These studies implemented novel techniques to derive earthquake probability from stress changes used together with fault models [8,9]. Shcherbakov et al. [10] conducted a study to compute the probabilities of extreme earthquakes above a certain magnitude. The study shows that the occurrence of unexpected earthquakes could trigger powerful subsequent earthquakes. Their results could be useful in estimating earthquake probabilities at several stages of a sequence of events. Schäfer and Wenzel [11] conducted a global earthquake temporal probability assessment using multi-variate ML. Their results showed the potential for megathrust events in the Manus Adriatic Thrust and Mussau Trenches, where no historical earthquakes have been recorded. Gitis and Derendyaev [12] implemented an ML method of the minimum area of alarm and a method of approximation of interval expert for probability estimation. They successfully forecasted the target magnitudes using an automatic web-based platform. Jena et al. [13] implemented a deep learning technique to estimate earthquake probability and achieved 89.47% accuracy in Palu, Indonesia. In another work, Jena et al. [14] applied a deep learning-based spatial probability assessment (SPA) in NE India and obtained 94% accuracy. Jena et al. [15] proposed an integrated artificial neural network-analytical hierarchy process (ANN-AHP) model for earthquake probability and risk assessment in Aceh province, Indonesia. That study found a high probability in the southwest portion of the city using the ANN technique, with an overall accuracy of 84%. In more recent work, Pourghasemi et al. [16] conducted a study on multi-hazard probability assessment in Iran using ensemble ML techniques. They achieved an overall accuracy of more than 80% for earthquake probability estimation. These results suggest that ML techniques are becoming increasingly popular among geoscientists and hazard modelers because of their incremental learning process and high level of accuracy.
In the Arabian Peninsula, to the best of our knowledge, no earthquake SPA analyses for magnitudes above 5 Mw have been conducted. For instance, two major studies were carried out by Al-Haddad et al. [17,18] to generate seismic design standards for Saudi Arabia based on a probabilistic approach for PGA estimation. Later, they extended the PGA values to the whole Arabian Peninsula. A single ground-motion prediction equation (GMPE) was implemented without considering the tectonic nature of the region. According to their results, the southernmost and western parts of the Zagros Belt and Makran Subduction Zone (MSZ) fall under the high-hazard zone, respectively. The Zagros-Bitlis Belt has undergone a geodetic shortening of about 10 mm/year in the southeast and 5 mm/year in the northwest [19,20]. Further, they estimated the PGA values for return periods of 475 and 2475 years. However, the observed PGA was not well correlated with the damage potential of ground shaking. The third study was performed by Pascucci et al. (2008) on hazard estimation without considering the active fault map of Iran [21] in the seismic source model. This work did not produce iso-acceleration maps for the entire Arabian Peninsula; it estimated hazard values for some cities in the UAE, Saudi Arabia, Qatar, and Bahrain. In another work, Al-Shijbi et al. [22] performed a PSHA for the Arabian Peninsula. Their study observed the highest ground acceleration values along the Zagros, the East Anatolian Fault, and the Gulf of Aqaba-Dead Sea Fault. The East Anatolian Fault separates the Arabian Plate from the Anatolian Plate, resulting in high-magnitude events [23]. The Red Sea shows evidence of continental rifting and falls under a high-hazard zone. Numerous other seismic hazard assessments have been conducted in the Arabian Peninsula [24][25][26][27]. Generally, most parts of the Arabian Peninsula are covered by low-to-moderate hazard zones, as reported in the aforementioned studies. A smaller region falls under high hazard levels because of poor building quality. Geophysical studies indicate that the Gulf of Aden is characterized by an oceanic crust, indicating a high-hazard zone [28].
The main hindrance to applying ML models in earthquake probability studies is achieving high prediction accuracy [29,30]. Moreover, such investigations face challenges such as a lack of transparency and explainability of the results, due to the black-box nature of the applied ML models [31]. The SHAP explainability approach describes the internal functioning of the Inception v3-XGBoost model in order to estimate the factors' interaction, relative importance, stable factors, local contribution, and the distance among explanations, and to determine the contribution of individual factors to a single prediction. Physical models are widely used in the literature for seismological studies; however, several studies have found that AI-based models outperform them [13,14]. Moreover, traditional probabilistic models assume a single attenuation relation for all events and locations, an unrealistic hypothesis that introduces uncertainties into PGA estimation [32]. In the literature, we did not come across any work on the application of AI techniques to SPA in the Arabian Peninsula. The emergence of new explainable algorithms, such as SHAP, helps in understanding the model and the interaction of factors, which has changed the perception of using AI-based models [33]. SHAP was chosen because it works more robustly across ML models than other XAI approaches such as Local Interpretable Model-Agnostic Explanations (LIME) and the generalized additive model (GAM) [34].
The aim and novelty of the work lie in the application of an XAI framework to estimate earthquake spatial probability and to identify the contributing factors, their hidden interactions, and their relative importance. The current study leverages the XAI technique to explain the black-box nature of ML models. Hence, the present study focuses on estimating and understanding the SPA outputs using SHAP plots. No updated probability maps for the Gulf of Aqaba-Dead Sea Fault have appeared since Al-Haddad et al. [20] published a temporal probability map. Moreover, no comprehensive SPA has been conducted for the Gulf of Aden and the Red Sea, which makes this study different from others. In summary, the objectives of the work are to: (i) use a hybrid Inception v3-ensemble XGBoost model to estimate an earthquake spatial probability index; (ii) analyze individual predictions using SHAP outputs to understand the predictors' interaction for earthquake SPA; and (iii) examine the spatial variation of outputs with changes in factors and geographic conditions. This study addresses the following research questions: (1) how does XAI help to apprehend and comprehend the model decisions and the complex intrinsic non-linear relations; and (2) how suitable are the models for earthquake probability mapping?

Study Area
The Arabian Peninsula is bordered by active tectonic margins [35]. The selected buffer area (2000 km radius) is centered on Saudi Arabia, covers unexplored or partially explored regions, and takes into account the major thrust faults and tectonic contacts. Divergent boundaries can be found along the Red Sea and the Gulf of Aden. Major transform plate boundaries are situated along the East Anatolian Fault, the Owen Fracture Zone, and the Gulf of Aqaba-Dead Sea Fault Zone [24]. The active convergent plate boundaries are located in the Makran Subduction Zone (MSZ), where the Arabian Plate subducts beneath the Eurasian Plate, producing the collision zone of the Zagros-Bitlis Fold Thrust Belt [36,37].
Seismic activity in the Arabian Peninsula is mostly confined to plate boundaries. Some studies describe the Arabian Plate as a stable craton [26,38]; however, small-magnitude events have been observed in Palmyra, the Sinjar area, and the Oman mountains. Additionally, small-to-moderate earthquakes are observed within the Peninsula [35]. The MSZ is dominated by shallow-depth earthquakes (30 km). The northward MSZ develops broad north-dipping thrust faults and a deformation zone. Hessami et al. [23] revealed that the MSZ dips northwards at an angle of 6°, with a deeper dip angle of 19°. An ongoing drift makes the Arabian Peninsula spread at a rate of approximately 16 mm/year [39]. The Gulf of Aden is a seafloor-spreading environment, where the Arabian Plate is moving away from the African Plate. Together, these tectonic movements are responsible for many earthquakes along the active plate boundaries around the Arabian Peninsula (Figure 1a,b).

Data
In this study, the major inputs were derived from reliable seismological, geological, geostructural, and ground motion data. The procedure originally developed by Wason et al. [40] was used to convert all forms of magnitude into moment magnitude (Mw). First, earthquake catalogs were collected from several databases, including the National Earthquake Information Center (NEIC), the National Centre for Seismology (NCS), and the United States Geological Survey (USGS). The data collection period was from 2000 until 2022, with a filtering threshold of magnitude 5 Mw and above to avoid incompleteness. The inventory data were used for training and validation of the Inception v3-extreme gradient boosting (Inception v3-XGBoost) model. Second, several factors were generated in the GIS environment. Third, the administrative boundary, digital elevation model (DEM), thrust faults, tectonic contacts, and geology data were acquired from remote sensing images and shapefiles. Fault information was derived from Landsat ETM+ imagery, and geology was derived from Landsat data. The earthquake SPA map for the Arabian Peninsula was generated using ArcGIS 10.8 and Python 3.9. Different algorithms, such as inverse distance weighting (IDW), spline, Euclidean distance, kernel density, and buffer, were used to generate the thematic layers and for training purposes.
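As an illustration of one of the interpolation algorithms named above, the following is a minimal pure-Python sketch of inverse distance weighting. It is not the ArcGIS implementation used in the study; the function name `idw_interpolate`, the power parameter of 2, and the sample coordinates are assumptions chosen for illustration only.

```python
import math

def idw_interpolate(samples, query, power=2.0):
    """Estimate a value at `query` from (x, y, value) samples using IDW.

    `power` controls how quickly a sample's influence decays with distance
    (2.0 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # the query coincides with a sample point
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical epicenters: (x_km, y_km, magnitude)
samples = [(0.0, 0.0, 5.0), (10.0, 0.0, 6.0), (0.0, 10.0, 7.0)]
estimate = idw_interpolate(samples, (5.0, 0.0))
print(round(estimate, 3))
```

Nearby samples dominate the estimate, which is why the interpolated magnitude at the midpoint of the first two samples stays close to their average rather than to the more distant third sample.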
The thematic layers, developed using the natural breaks classification technique, can be explained through objective themes, the concepts of the conditioning factors, prominent patterns, and insight into the prediction logic. Data pre-processing was conducted to remove negative and illogical values. Subsequently, the values of all the thematic maps were extracted to points using the ArcGIS multi-value extraction tools. Each pixel value was associated with the corresponding target value for processing. After the prediction task, post-processing was conducted using a point-to-raster conversion tool in ArcGIS to generate a raster map. The major earthquakes recorded in the Arabian Shield (Mw 6.5), Dead Sea (Mw 7.8), Red Sea (Mw 7.9), Gulf of Aden (Mw 6.7), and Makran subduction zone (Mw 8.4) are considered, and detailed earthquake information for the Arabian Shield is presented in Table 1. The details of the data sources, input factors, methods, and importance are presented in Table 2.

• Magnitude variation: indicates the probability of occurrence of an event of a specific magnitude at a particular location.
• Depth variation: provides the source depth information on the fault zone.
• Epicenter density: provides a view of the clusters of events, which lie in the high-probability zone.
• Seismic gap: stress accumulation occurs in the seismic gap, which could lead to future mega-events.
• Frequency: the higher the earthquake frequency, the lower the magnitude of the events.

Methodology
This study proposed a hybrid Inception v3-XGBoost model (Figure 2). The architecture of the Inception v3-XGBoost model is shown in Figure 3a. The designed model consists of a feature extractor and a classification head [50]. The classification head of the Inception v3 model was replaced with the XGBoost classifier (Figure 3b), so that the features produced by the Inception v3 model serve as inputs to the XGBoost classifier. This hybrid setup also works with other pre-trained CNN and RNN models. In the first stage, the Inception v3 model was fine-tuned on the training dataset. This study observed that the information originating from Inception v3 leads to a better result from the XGBoost classifier. The implemented Inception v3-XGBoost model is used to predict the targets, namely earthquake and non-earthquake points, through Inception v3 and a string indexer, vector assembler, XGBoost estimator, and XGBoost transformer [51]. To bring the predicted value closer to the real value in each round, each tree is constructed based on the output of the previous tree, which enhances the model's prediction performance. Pre-processing of the dataset is necessary to avoid interference with the classification results caused by null values, unbalanced data, different data structures, etc. Thereafter, feature selection can be applied to the data, followed by training, testing, and evaluation.
Then, the SHAP library was implemented using JavaScript functions to estimate the contribution of factors towards hazard assessment and the interaction among factors. We explored how variables affect the model output by using individual and collective summary plots. Thereafter, the SHAP findings obtained with and without important factors were compared, and the influence of the factor "seismic gap" on earthquake hazard was estimated. This research was thus conducted using a combined approach of an ensemble boosting algorithm (XGBoost) and an explainable AI technique named SHAP. The steps are shown in detail in Figure 2.

Inception V3 Model Architecture
Inception-v3 is a deep learning network [50]. Training the model directly is difficult on a low-configured computer when there is no access to a supercomputer. Therefore, Inception-v3 is best used through transfer learning; the main graph of the Inception-v3 model is presented in Figure 3. The TensorFlow library was used to retrain Inception's final layer for new categories. Transfer learning is a knowledge-gaining method that keeps the parameters of the previous layers, removes the end layer, and then retrains the last layer. The number of output nodes in the last layer is equal to the number of dataset categories; for example, if the dataset has 1000 classes, the last layer of the original model has 1000 output nodes. For the final (0, 1) classification, this study applied the XGBoost classifier, whose details are explained using the mathematical expressions shown in Section 4.2. Here, the model loss can be presented as follows:
Training CNNs often results in overfitting. Therefore, this study implemented pre-trained CNNs to avoid this problem. Popular CNN architectures include ResNet and DenseNet. However, the best CNN model that this study employed is the Inception-v3 model, which is initialized by random weights and fine-tuned on the dataset to extract features. Inception v3 is a pre-trained CNN model that provided the best F1 score. It is the third version of the Inception family of CNN models and is characterized by several improvements. It provides an improved factorized convolution, which reduces the number of parameters while maintaining network efficiency. It uses a regularizer for label smoothing. Additionally, an auxiliary classifier was employed to help propagate label information and provide regularization.
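The label-smoothing regularizer mentioned above can be illustrated with a small sketch. The smoothing factor `epsilon = 0.1` and the helper name `smooth_labels` are assumptions for illustration; the study does not report the value it used.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Blend a one-hot target with a uniform distribution over the classes.

    `epsilon` (assumed 0.1 here) is the share of probability mass that is
    spread evenly over all classes.
    """
    n_classes = len(one_hot)
    return [(1.0 - epsilon) * v + epsilon / n_classes for v in one_hot]

# Binary target for an earthquake point: class 1 is the positive class.
smoothed = smooth_labels([0.0, 1.0])
print(smoothed)  # approximately [0.05, 0.95]
```

Because the targets are no longer exactly 0 or 1, the network is discouraged from producing over-confident outputs, which is the regularizing effect referred to in the text.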

XGBoost Model Architecture
XGBoost is an ensemble machine learning model based on gradient boosting; it improves prediction performance by combining a sequence of weak models into a strong learner [52]. Ensemble models provide better results than a single model (Figure 3).
In this section, the steps of the XGBoost algorithm implementation are described [53].
Step 1: Initialization. For a binary classification problem, the actual label $y_i$ is either 1 or 0. Therefore, the commonly used log loss function is considered, which can be presented as:
$$L = -\sum_{i}\left[y_i \ln p_i + (1 - y_i)\ln(1 - p_i)\right]$$
where
$$p_i = \frac{1}{1 + e^{-\hat{y}_i^{(t-1)}}}$$
According to the $p_i$ and $y_i$ values, the first- and second-order gradient values $g_i$ and $h_i$ can be estimated as:
$$g_i = p_i - y_i, \qquad h_i = p_i(1 - p_i)$$
where $\hat{y}_i^{(t-1)}$ is the predicted value of sample $x_i$ from the $(t-1)$th tree and $y_i$ is the actual value of $x_i$. The prediction value is 0 for the 0th tree, which means $\hat{y}_i^{(0)} = 0$.
Step 2: The Gain value of each feature is calculated by traversing all features in order to determine the splitting mode for the current root node. The feature node with the maximum Gain score is selected.
Step 3: In this step, the current binary leaf nodes are established. According to the feature with the maximum Gain, the sample set is divided into two parts to obtain two leaf nodes. The second step is then repeated for the two leaf nodes, taking into account a negative Gain score and the stopping conditions. This step establishes the whole tree [53].
Step 4: The prediction values of all leaf nodes are calculated in this step. The prediction value $\omega_j$ of leaf node $j$ can be calculated as:
$$\omega_j = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$
where $I_j$ is the set of samples falling in leaf $j$ and $\lambda$ is the L2 regularization coefficient, and the second tree prediction results can be written as
$$\hat{y}_i^{(1)} = \hat{y}_i^{(0)} + f_1(x_i)$$
where $f_1(x_i)$ is the leaf value assigned to sample $x_i$. This leads to establishing the second tree.
Step 5: The next step is to repeat steps 1 and 2 to set up more trees until a sufficient number of trees is established. The prediction values of the model can be written as
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
where $\hat{y}_i^{(t)}$ denotes the prediction value of $t$ trees on sample $x_i$. This process establishes the $t$th tree.
Step 6: The classification result of a sample is determined by converting its final predicted value $\hat{y}_i^{(t)}$ into a probability:
$$p_i = \frac{1}{1 + e^{-\hat{y}_i^{(t)}}}$$
When $p_i \geq 0.5$, the sample is assigned to class 1; otherwise, it is assigned to class 0.
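The steps above can be sketched for a single boosting round on a toy dataset. This is a minimal illustration of the standard XGBoost quantities for log loss (gradients, split Gain, leaf weights), not the study's implementation; the regularization values `lam` and `gamma`, the split threshold, and the four sample points are assumptions.

```python
import math

def sigmoid(score):
    """Map a raw boosting score (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-score))

def grad_hess(y_true, raw_score):
    """First- and second-order gradients of the log loss (Step 1)."""
    p = sigmoid(raw_score)
    return p - y_true, p * (1.0 - p)

def leaf_weight(grad_sum, hess_sum, lam=1.0):
    """Leaf prediction value (Step 4); `lam` is an assumed L2 penalty."""
    return -grad_sum / (hess_sum + lam)

def split_gain(gl, hl, gr, hr, lam=1.0, gamma=0.0):
    """Standard XGBoost Gain of a candidate split (Step 2)."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(gl, hl) + score(gr, hr) - score(gl + gr, hl + hr)) - gamma

# Toy data: one feature, labels 1 = earthquake, 0 = non-earthquake.
labels = [1, 1, 0, 0]
feature = [0.9, 0.8, 0.2, 0.1]
raw_scores = [0.0] * len(labels)  # prediction of the 0th tree is 0

stats = [grad_hess(y, s) for y, s in zip(labels, raw_scores)]
threshold = 0.5  # hypothetical candidate split
left = [gh for x, gh in zip(feature, stats) if x <= threshold]
right = [gh for x, gh in zip(feature, stats) if x > threshold]
gl, hl = sum(g for g, _ in left), sum(h for _, h in left)
gr, hr = sum(g for g, _ in right), sum(h for _, h in right)

gain = split_gain(gl, hl, gr, hr)
w_left, w_right = leaf_weight(gl, hl), leaf_weight(gr, hr)

# Steps 5 and 6: add the new tree's output and threshold the probability.
new_scores = [s + (w_left if x <= threshold else w_right)
              for x, s in zip(feature, raw_scores)]
predicted = [1 if sigmoid(s) >= 0.5 else 0 for s in new_scores]
print(gain, w_left, w_right, predicted)
```

In this toy case a single split already separates the two classes, so the left leaf receives a negative weight and the right leaf a positive one.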

Model Implementation
Firstly, the training dataset was used to train Inception v3. The pretrained layer weights were employed for Inception v3 classification. The fully connected layer used a 2-node SoftMax classifier for the binary classification, with randomly initialized parameters. The SoftMax function yields a probability distribution: each output element lies between 0 and 1, and the output elements sum to 1. Each input is assigned to the class with the maximum probability. The dataset exhibits a class imbalance. To solve this problem, the study implemented a weighted cross-entropy loss function, in which a weight is assigned to each class. Larger weights were assigned to the minor classes to reduce the effect of the class imbalance. For each minibatch, the average loss across observations is computed. The next stage applies several augmentation techniques during training as part of data pre-processing. Augmentation helps the model generalize better and improves performance. The validation loss of the Inception v3 model shows the performance of the model.
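The SoftMax output and the weighted cross-entropy loss described above can be sketched as follows. The inverse-frequency weighting scheme in `class_weights`, the function names, and the example logits are assumptions for illustration; the study does not state how its class weights were chosen.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_weights(labels, n_classes=2):
    """Assumed inverse-frequency weights: rarer classes get larger weights.

    Every class must appear at least once in `labels`.
    """
    counts = [labels.count(c) for c in range(n_classes)]
    n = len(labels)
    return [n / (n_classes * c) for c in counts]

def weighted_cross_entropy(batch_logits, labels, weights):
    """Average weighted cross-entropy over one minibatch."""
    losses = [-weights[y] * math.log(softmax(z)[y])
              for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

# Hypothetical imbalanced minibatch: three non-earthquake points, one earthquake.
labels = [0, 0, 0, 1]
batch_logits = [[2.0, -1.0], [1.5, 0.0], [0.5, 0.2], [0.1, 0.3]]
weights = class_weights(labels)
loss = weighted_cross_entropy(batch_logits, labels, weights)
print(weights, loss)
```

With these weights, a misclassified earthquake point contributes three times as much to the loss as a misclassified non-earthquake point, which counteracts the imbalance in the minibatch.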
The XGBoost model was trained using earthquake points with magnitudes of 5 Mw and above, whereas non-earthquake points were selected randomly. The original data were split in a 4:1 ratio with respect to the size and distribution of points. Specifically, the data were split into training (6000 earthquake and 6000 non-earthquake points), validation (2400 non-earthquake points), and testing (1,000,000 points) sets. The modeling was performed on four Nvidia 6 GB graphics cards with 40 GB of GPU VRAM. The model employed the least absolute shrinkage and selection operator (LASSO, L1) and Ridge (L2) regularization to prevent overfitting. The sparsity awareness of the XGBoost model allows it to handle sparse patterns in the data efficiently and to best estimate missing values based on the training loss [54]. The weighted quantile sketch algorithm was employed in the XGBoost model to find split points in weighted datasets. The built-in cross-validation is run at each iteration, which makes it possible to determine the required number of iterations in a single run.
The training process continued, with tuning of the model's parameters and modification of the training samples, until the trained hybrid model achieved a satisfactory result (overall accuracy above 80%). Finally, the ArcGIS multi-values-to-points tool was used to remove irrelevant and illogical values before estimating the final prediction result of the models. The model training and prediction were conducted with the optimal parameters. The general prediction parameters used in the Inception v3-XGBoost model are presented in Table 3. Based on the Inception v3-XGBoost feature selection, the model is effective in distinguishing good and bad samples for earthquake and non-earthquake points.

Model Evaluation
The present work uses 12 factors, including seismological, geological, and geo-structural factors, as predictors (Table 2). The model's predictive capacity was appraised using four statistical metrics: the coefficient of determination (R²), the root mean square error (RMSE), precision, and accuracy [15]. The mathematical expressions of these metrics are shown in Equations (7)-(10).
$$R^2 = 1 - \frac{\sum_{i=1}^{m}(x_i - \hat{x}_i)^2}{\sum_{i=1}^{m}(x_i - \bar{x})^2} \tag{7}$$
$$RMSE = \sqrt{\frac{1}{m}\sum_{i=1}^{m}(x_i - \hat{x}_i)^2} \tag{8}$$
where $\bar{x}$ is the mean of the observed values, $x_i$ and $\hat{x}_i$ are the observed and forecasted values, and $m$ is the number of samples. The precision and accuracy can be represented as:
$$PPV = \frac{TP}{TP + FP} \tag{9}$$
$$ACC = \frac{TP + TN}{P + N} \tag{10}$$
where PPV stands for positive predictive value, ACC is accuracy, N and P denote the numbers of negative and positive points, TP is true positive, FP is false positive, and TN is true negative.
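The four evaluation metrics can be computed with a few lines of code. The sketch below uses the standard definitions of R², RMSE, precision (PPV), and accuracy; the function names and the small example arrays are hypothetical and are not the study's data.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error over m samples."""
    m = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / m)

def precision_accuracy(y_true, y_pred):
    """Return (PPV, ACC) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    ppv = tp / (tp + fp)
    acc = (tp + tn) / len(y_true)  # len(y_true) equals P + N
    return ppv, acc

# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.1, 1.9, 3.2, 3.8]
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

r2 = r_squared(observed, forecast)
err = rmse(observed, forecast)
ppv, acc = precision_accuracy(y_true, y_pred)
print(r2, err, ppv, acc)
```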

SHAP Interpretation
The concept underlying SHAP was first introduced in game theory to estimate an individual player's contribution to a team game [33]. It was created to distribute the total gain according to the contributions of the players, thereby solving the problem of providing a fair reward. More recently, the SHAP algorithm was developed by Lundberg and Lee [55], which has opened a new direction for understanding black-box models and brings more clarity to the outputs of AI-based models. SHAP assigns each feature a value that satisfies the properties of accuracy, consistency, and null effect [33].
The classical SHAP value can be mathematically described as:
$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\left[v(S \cup \{i\}) - v(S)\right]$$
where $\phi_i$ is the contribution of feature $i$, $N$ is the feature set, and $n$ is the number of features in $N$. $S$ is a subset of $N$ to which feature $i$ is added, $v(S)$ is the value (model output) obtained with the features in $S$, and the base value is the predicted outcome without considering the feature values. For every observation, the output is estimated by summing the SHAP values of all features. Therefore, the SHAP model can be explained as:
$$g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i$$
where $z' \in \{0, 1\}^M$, $M$ is the number of features, $\phi_0$ is the base value, and $\phi_i$ can be obtained from the above equation. SHAP provides several explainers for ML and deep learning models, a full treatment of which is beyond the scope of the current work. Molnar (2020) described different explainers and plots. The current study uses the DeepExplainer, which is designed for deep learning models.
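The classical Shapley value formula can be evaluated exactly for a small feature set by enumerating all subsets. The following sketch does this for a hypothetical three-factor value function; it is not the DeepExplainer used in the study (which approximates these values for deep models), and the factor names and contributions in `toy_value` are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every subset S that excludes i."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            for subset in combinations(others, size):
                s = frozenset(subset)
                weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi

def toy_value(subset):
    """Hypothetical model output given the set of known factors."""
    score = 0.0
    if "pga" in subset:
        score += 2.0
    if "magnitude" in subset:
        score += 1.0
    if "pga" in subset and "seismic_gap" in subset:
        score += 1.0  # interaction term shared between the two factors
    return score

factors = ["pga", "magnitude", "seismic_gap"]
phi = shapley_values(factors, toy_value)
print(phi)
```

The interaction term is split equally between the two factors involved, and the contributions sum to the difference between the full model output and the base value, which is the additive property expressed by the explanation model.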

SHAP Explanation and Interpretation
The twelve factors, comprising eight seismological, one geological, two geo-structural, and one ground motion factor, were first identified and extracted from different datasets. Next, the binary Inception v3-XGBoost model was trained using these factors, and the SHAP values were estimated for all features contributing to the model. Finally, a SHAP summary plot was produced that ranks the features based on their impact on the prediction.
Waterfall plots are designed to explain individual predictions (Figure 4a), taking a single row of the data as input (Figure 4b). The bottom of a waterfall plot shows the expected model output. Each row then shows the positive (red) or negative (blue) contribution of one factor, that is, how far it moves the value from the expected model output toward the model output for that prediction. Figure 4 shows the waterfall plots for the first explanation of the Inception v3-XGBoost model. The x-axis is in log-odds units, so negative values denote probabilities of less than 0.5. The gray text next to each factor gives its value for the individual sample. Interestingly, a proximity-to-thrust value of 2,611,110 m dramatically increases the predicted spatial probability of earthquakes. Because a waterfall plot shows only a single row of data, it is hard to judge from it how changing values of proximity to thrust affect the prediction. Scatter plots are therefore needed; they show that low SHAP values are negative predictors of earthquakes, while high values are positive predictors (Figure 5a-d). Explaining this result requires a deep dive into the data and careful training of the model, with bootstrap resamples for uncertainty quantification. The factors identified as important in the waterfall plot were selected for scatter plotting, in which the SHAP values are plotted against the original factor values. For non-earthquake prediction, the SHAP values of proximity to thrust, epicenter density, curvature, and magnitude variation vary from 0 to −1.8, 0 to −1.5, 0 to −1.6, and 0 to −4, respectively. Similarly, for earthquake prediction, they vary from 0 to 1.6, 0 to 2.5, 0 to 1.7, and 0 to 5, respectively.
Figure 6a shows the variation of SHAP values according to each factor's importance in the earthquake SPA. PGA and magnitude variation contribute the most, with SHAP values of 4.3 and 5, respectively. All other factors contribute far less, with SHAP values ranging from 0 to 0.7. Figure 6b is a bar diagram of the mean SHAP values in SPA, in which the highest values are again achieved by PGA and magnitude variation. Lower SHAP values were observed for factors such as earthquake frequency, curvature, and sand-filled geology, which contribute only minimally to the earthquake SPA.
The hybrid model performed well in classifying earthquake and non-earthquake points. However, it failed to detect some of the earthquake and non-earthquake points in the test dataset. A possible reason for the high number of true-negative results is that the dataset is skewed toward non-earthquake points. The accuracy is below 90% because the model predicts more non-earthquake points. The SHAP values for earthquakes and non-earthquakes are plotted individually in red and blue, respectively, in a scatter plot (Figure 7a-f), which overcomes the visualization issue observed in a simple scatter plot. In the SHAP colored scatter plot, blue, purple, and red denote low, average, and high values of the factors in the training dataset, respectively. The y-axis shows the SHAP value of each observation, while the x-axis shows the factor values. However, this plot cannot explain cases in which low SHAP values nonetheless predict earthquake points. A summary plot is therefore necessary to understand this behavior.
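The way a waterfall plot accumulates contributions in log-odds units can be illustrated with a minimal sketch. The expected value and the per-factor SHAP contributions below are hypothetical numbers chosen for illustration and are not taken from Figure 4; the helper names `sigmoid` and `waterfall_steps` are likewise assumptions.

```python
import math


def sigmoid(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def waterfall_steps(expected_value, contributions):
    """Accumulate SHAP contributions starting from the expected model output.

    Returns (steps, final_log_odds); each step is
    (factor, contribution, running_log_odds).
    """
    running = expected_value
    steps = []
    # Smallest absolute contributions first, as they would sit at the bottom.
    for factor, phi in sorted(contributions.items(), key=lambda kv: abs(kv[1])):
        running += phi
        steps.append((factor, phi, running))
    return steps, running


# Hypothetical values for one sample (NOT taken from the paper).
expected_value = -0.4
contributions = {
    "proximity_to_thrust": 1.6,
    "epicenter_density": 0.9,
    "curvature": -0.3,
    "magnitude_variation": 2.1,
}

steps, final_log_odds = waterfall_steps(expected_value, contributions)
for factor, phi, running in steps:
    print(f"{factor:22s} {phi:+.2f} -> {running:+.2f}")
print(f"final log-odds = {final_log_odds:+.2f}, "
      f"probability = {sigmoid(final_log_odds):.3f}")
```

A final log-odds value above zero corresponds to a probability above 0.5, which is why negative totals on the x-axis indicate a non-earthquake prediction.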
The factors' importance is shown in the SHAP summary plot (Figure 8). The seismological factors are the most impactful in distinguishing earthquake points from non-earthquake points. This result is expected because PGA, magnitude, epicenter, and depth variation are the properties that define earthquake probability areas. In contrast, some factors do not have a significant effect on model predictions. Their low contribution to the classification might be explained by zero values in most of the data points: with such low variance, these factors do not provide sufficient information to contribute to the classification, which is evident in the SHAP summary plot (Figure 8). In the SHAP summary plot, the aqua blue, purple, and fuchsia red colors denote low, average, and high values of the factors, respectively, and the x-axis shows the SHAP value of each observation. The interaction between the factors and the target can therefore be explored using this plot. Moreover, most of the earthquake points occurred in the potential areas according to the previously published maps of Al-Haddad et al. [19,20]. A correlation might therefore be observed between the low-importance factors in the study area and the earthquake spatial probability. However, this study found these factors to be irrelevant, so they can be removed from the dataset. Further, owing to their low contribution to SPA estimation, curvature, earthquake frequency, and geology are likely to be redundant. Geology was found to be unimportant because the entire peninsula is characterized by dried sands. For PGA and magnitude variation, the SHAP values are the highest and vary from −4.5 to 4 and −3.5 to 4.5, while for SPA, the values vary from 0 to 4 and 0 to 4.5, respectively. The study will provide new results if the irrelevant and redundant factors are removed. The nine remaining seismological and structural factors can still lead to good accuracy; however, the accuracy will go down. The SHAP values for all factors were calculated and are presented in Figure 8. It can be expected that the new SHAP summary results will portray seismological factors as the most important factors for SPA estimation. If the correlation between attributes is perfectly positive or negative, there is a high chance that the model performance will be affected by multicollinearity. Multicollinearity occurs when one predictor in a multiple regression model can be linearly predicted from the others with a high degree of accuracy, which can lead to skewed or misleading results.
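A simple screen for multicollinearity is to compute pairwise Pearson correlations between factors and flag pairs that are close to ±1. The sketch below assumes a hypothetical set of factor values and an illustrative cut-off of 0.9; neither the data nor the threshold comes from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant factor has no defined linear relationship
    return cov / (sd_x * sd_y)


def correlated_pairs(factors, threshold=0.9):
    """Return (factor_a, factor_b, r) for pairs with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(factors), 2):
        r = pearson(factors[a], factors[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical factor values at five sample points (NOT from the paper).
factors = {
    "pga": [90, 120, 150, 210, 300],
    "magnitude_variation": [5.0, 5.4, 5.6, 6.1, 7.0],
    "curvature": [0.2, -0.1, 0.0, 0.3, -0.2],
}

flagged = correlated_pairs(factors, threshold=0.9)
print(flagged)
```

Pairs returned by such a screen are candidates for closer inspection; whether one of them should be dropped depends on the model and on domain knowledge.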

Spatial and Temporal Probability Assessment
The earthquake spatial probability (Figure 9a) and non-spatial probability (Figure 9b) maps are presented. From a probabilistic point of view, the index varies between 0 (non-earthquakes) and 1 (earthquakes); it was derived using SoftMax and classified using the natural breaks classification technique. The classified values can be compared across locations and later used for hazard assessment based on seismic coding and retrofitting. The probability index was classified into five classes: 0.002-0.01 (very low), 0.011-0.093 (low), 0.094-0.91 (medium), 0.92-0.99 (high), and 0.991-1 (very high).
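As a rough illustration of how a SoftMax output can be mapped to the five probability classes, the sketch below hard-codes the class upper bounds reported above rather than recomputing natural breaks. The two-logit input and the helper names are assumptions made for the example, and values that fall in the small gaps between the reported ranges are assigned to the next higher class.

```python
import math
from bisect import bisect_left


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Upper bounds of the first four classes, as reported in the text.
UPPER_BOUNDS = [0.01, 0.093, 0.91, 0.99]
LABELS = ["very low", "low", "medium", "high", "very high"]


def probability_class(p):
    """Map a probability index in [0, 1] to one of the five classes."""
    return LABELS[bisect_left(UPPER_BOUNDS, p)]


# Hypothetical logits ordered as [non-earthquake, earthquake].
p_earthquake = softmax([-3.0, 3.0])[1]
print(round(p_earthquake, 4), probability_class(p_earthquake))
```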
The index values in the non-probability map were classified into five classes: 0-0.01 (very low), 0.012-0.11 (low), 0.12-0.89 (medium), 0.89-0.98 (high), and 0.99-1 (very high). Most of the high to very high spatial probability zones are characterized by events larger than Mw 5.5. The probability maps without (Figure 10a) and with the three important factors were compared. The map with the three important factors gives a better output than the one without PGA, magnitude variation, and seismic gap. Unrealistic and poor results can be seen in Figure 10b, in which a vast region of the Arabian Peninsula falls under a high probability zone.
For a return period of 475 years, the PGA was estimated for the Arabian Peninsula using the PSHA technique (Figure 11a). The eastern parts of UAE and Iraq, northwestern parts of Syria and Jordan, and southwestern parts of Yemen fall within a medium range of PGA values (91-150 cm/s²). The recent earthquake in Turkey falls within a very high PGA range (216-380 cm/s²), making the location highly hazardous. The plot of frequency vs. time shows that earthquake frequency has increased significantly from 1995 onwards, reaching 600 events in 2023 (Figure 11b).

Validation and Threshold Evaluation
Validation of the obtained earthquake SPA (Figure 10) and PGA (Figure 11) results was conducted using the recent mainshock and aftershock earthquakes in Turkey. The recent Turkey events of Mw 7.8, Mw 7.5, and Mw 6.7 may fall within or very close to a seismic gap in Turkey. The events fall within the spatial probability index range of 0.991-1 based on the Inception v3-XGBoost model with all major factors. Similarly, the events fall within a high PGA zone (216-380 cm/s²), confirming the accuracy of the obtained result. These events, including more than 45 aftershocks, caused huge destruction. Thresholds for all the factors and their weighted scores were derived (Table 4). According to the Inception v3-XGBoost model, the highest weights were achieved by PGA (14%), magnitude variation (13%), seismic gap (12%), and epicenter density (10%). These are the four highly recommended and globally stable factors for SPA in the Arabian Peninsula. The thresholds obtained for these four factors are >100 cm/s², >Mw 5.5, >500 km, and >8 events in a cluster, respectively.
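One way such thresholds and weights could be applied to a single location is sketched below. The threshold and weight values are those reported above, but the rule of summing the weights of exceeded factors, the dictionary keys, and the example site are assumptions for illustration only and do not reproduce Table 4 or the authors' procedure.

```python
# Thresholds and weights for the four top-ranked factors, as reported in the text.
THRESHOLDS = {
    "pga_cm_s2": 100.0,        # PGA > 100 cm/s^2
    "magnitude_mw": 5.5,       # magnitude > Mw 5.5
    "seismic_gap_km": 500.0,   # seismic gap > 500 km
    "events_in_cluster": 8,    # more than 8 events in a cluster
}
WEIGHTS = {
    "pga_cm_s2": 0.14,
    "magnitude_mw": 0.13,
    "seismic_gap_km": 0.12,
    "events_in_cluster": 0.10,
}


def exceeded_factors(site):
    """Return the names of factors whose value exceeds its threshold."""
    return [name for name, limit in THRESHOLDS.items() if site[name] > limit]


def weighted_score(site):
    """Sum the weights of exceeded factors (an assumed combination rule)."""
    return sum(WEIGHTS[name] for name in exceeded_factors(site))


# Hypothetical site values (NOT taken from the paper).
site = {
    "pga_cm_s2": 250.0,
    "magnitude_mw": 7.8,
    "seismic_gap_km": 300.0,
    "events_in_cluster": 12,
}

print(exceeded_factors(site), round(weighted_score(site), 2))
```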

Model Performance Evaluation
A confusion matrix and a classification report were derived to measure the model performance, as presented in Tables 5 and 6, respectively. A total of 12,000 points were considered for training and validation purposes, of which 6000 were earthquake points and 6000 were random non-earthquake points. The receiver operating characteristic (ROC) curve is plotted in Figure 12a, which shows the accuracy (87.9%) of the proposed model. A sensitivity assessment of the training data showed that the accuracy increases with the number of training data points. With a minimum of 10,000 data points the accuracy reaches up to 70%, and it gradually increases until 1,000,000 points (Figure 12b). After the pre-processing stage, 1,000,000 points were derived from the study area for testing purposes. The hybrid model achieved an overall accuracy of 87.91%. The macro-averaged precision was 0.8805 and the weighted-average precision was 0.8806. The earthquake probability classification achieved a precision of 0.8557, while the earthquake non-probability assessment achieved a precision of 0.9053. This shows that the trained model classifies non-earthquake points more efficiently than earthquake points. The predicted probability against log loss (Figure 12c) and classification error (Figure 12d) is also shown.
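The relationship between a binary confusion matrix and the reported metrics (accuracy, per-class precision, macro and weighted averages) can be made explicit with a short sketch. The counts used here are hypothetical and do not reproduce Tables 5 and 6; the function name `binary_report` is an assumption.

```python
def binary_report(tp, fp, fn, tn):
    """Derive accuracy and precision summaries from confusion-matrix counts.

    tp/fn refer to true earthquake points, tn/fp to true non-earthquake points.
    """
    total = tp + fp + fn + tn
    precision_eq = tp / (tp + fp)       # precision for the earthquake class
    precision_non = tn / (tn + fn)      # precision for the non-earthquake class
    support_eq = tp + fn                # number of true earthquake points
    support_non = tn + fp               # number of true non-earthquake points
    return {
        "accuracy": (tp + tn) / total,
        "precision_earthquake": precision_eq,
        "precision_non_earthquake": precision_non,
        "recall_earthquake": tp / support_eq,
        "recall_non_earthquake": tn / support_non,
        # Macro average: unweighted mean of the per-class precisions.
        "macro_precision": (precision_eq + precision_non) / 2,
        # Weighted average: per-class precisions weighted by class support.
        "weighted_precision": (
            precision_eq * support_eq + precision_non * support_non
        ) / total,
    }


# Hypothetical counts (NOT the values of Table 5).
report = binary_report(tp=90, fp=15, fn=10, tn=85)
for name, value in report.items():
    print(f"{name:26s} {value:.4f}")
```

With balanced classes the macro and weighted averages coincide; they diverge only when the two classes have different numbers of true points.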

Discussion
In the current study, SPA was conducted for the Arabian Peninsula using a combined approach of ML and XAI techniques. Because feature learning is still unclear in the literature, a hybrid Inception v3-XGBoost model was developed for clarity. This hybrid model performs both feature learning and prediction better than the standalone models [50]. The model analyzes the features in depth to improve processes, automate tasks, and predict outcomes based on past experience. Several instance-level plots can be very informative for the explanations [56]. To interpret the effect of each feature on the prediction, waterfall, scatter, bar, and summary plots were produced, which reveal the factors' impact on earthquake and non-earthquake classification (Figures 5-9). As shown in Figure 9, the trained XGBoost model could correctly classify the earthquake and non-earthquake areas.
The way in which the importance of factors is summarized can make a big difference in understanding the model. The plot in Figure 8 shows the topmost factors, along with their impact and common interactions. The study uses SHAP to summarize the factors' importance and identify those with the largest impact on the model. It can be noted that an increase in the high-rank factors leads to an increase in SHAP values, whereas an increase in the low-rank factors leads to a decrease in SHAP values. It is worth noting that the results portray a general analysis of the factors across the dataset, and the factors might affect the classification output differently for individual events.
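A common way to summarize global factor importance from SHAP output is to average the absolute SHAP values of each factor over all samples and rank the factors by that mean. The sketch below uses a small hypothetical SHAP matrix; the values and the helper name `mean_abs_shap` are assumptions and do not come from the study.

```python
def mean_abs_shap(shap_rows, factor_names):
    """Rank factors by the mean absolute SHAP value across samples.

    shap_rows: list of per-sample lists, one SHAP value per factor.
    Returns a list of (factor, mean_abs_value) sorted from most to least important.
    """
    n_samples = len(shap_rows)
    means = {
        name: sum(abs(row[i]) for row in shap_rows) / n_samples
        for i, name in enumerate(factor_names)
    }
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values for three samples (NOT from the paper).
factor_names = ["pga", "magnitude_variation", "curvature"]
shap_rows = [
    [4.0, -3.0, 0.1],
    [-2.0, 2.0, -0.2],
    [3.0, 1.0, 0.0],
]

ranking = mean_abs_shap(shap_rows, factor_names)
for name, value in ranking:
    print(f"{name:20s} {value:.2f}")
```

Because the absolute value is taken before averaging, a factor that pushes some samples strongly toward earthquake and others strongly toward non-earthquake still ranks as important.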
When estimating earthquake SPA using ML, SHAP plots can help in understanding the reasons behind specific model outputs. Somala et al. [57] conducted a study on the time period estimation of reinforced concrete (RC) frames. They successfully estimated the SHAP values of the input parameters and their importance. Matin et al. [58] conducted a study on earthquake-induced building-damage mapping using explainable AI. That paper successfully implemented SHAP to interpret the outputs of the proposed ANN model and analyzed the impact of each feature to understand the model's reliability. However, it would be premature to fully trust that SHAP can explain the contribution of the factors, as no literature is currently available on earthquake SPA using SHAP. Nevertheless, as studies on hazard estimation increase, researchers could come to trust SHAP for understanding the importance of the factors in SPA estimation.
This study also conducted an assessment in which the low-impact factors were removed to see the changes in results. Removing the low-impact factors did not produce any remarkable changes in the outputs; therefore, their presence or absence does not matter to the model. Likewise, the spatial variation of factors after removing the low-contribution factors does not show any changes in ranking. This could be due to the reduction of the test dataset. Another assessment was then conducted by removing the high-influence factors, as shown in Figure 10. Looking at the spatial variation, some differences can be observed between the two maps (Figure 10). An important point is that the SHAP algorithm was not developed for time series analysis, and the development of DeepExplainer is still at an incipient stage [59]. As per the current analysis, DeepExplainer suits the proposed model. Individual rows of data and their contributions were compared across explainers, showing a minimum distance of 0.33 and a maximum of 0.48. SHAP values, however, can provide valuable information that is correlated with the results of actual predictions. This could therefore help to explain black-box models in detail at all stages of analysis in the future.
The results of the current study rest on the postulation that future events will fall within the mapped seismic zones. This might not happen, as a few areas in the peninsula have shown seismic quiescence for a long period. These include the Makran Subduction Zone and the Gulf of Aqaba, which may become active in the future. This underlines the need for further study of the active tectonics in the peninsula and surrounding areas. The earthquake inventory indicates strong shaking at Nizwa, Qalhat, Najran, Sohar, Makah, Al-Madinah, Taief, and Tabuk, manifesting significant seismic probability and hazard warnings within the Arabian Plate. However, most of the recorded earthquakes suggest that the Arabian Plate is aseismic [20]. The SPA is tied to specific faults in Iran and Turkey rather than to an average area, indicating accurate and localized spatial probability. Therefore, state-of-the-art AI-based studies on paleo-seismicity and active faulting, in association with field investigations, could confirm or deny the historical reports. This research implemented the seismic gap as a new factor, which emerges as the third most important factor in the SPA analysis. The spatial seismic probability areas of the current study are similar to the map produced by Al-Haddad et al. [20].
Because this study predicted the SPA using binary classification techniques, the probability values vary between 0 (non-earthquake) and 1 (earthquake). The obtained SPA is comparable to some extent with the PGA map derived by Al-Haddad et al. [20], which shows high PGA with a ground motion of 250 cm/s² in the southwestern part of Saudi Arabia. Small differences in the mapped areas can be observed, for several reasons: (1) the current study implemented an updated catalog and recent GMPEs for PGA estimation, (2) the current work considers Iran and southeastern Turkey as seismic sources, (3) a hybrid machine learning model (Inception v3-XGBoost) was adopted, and (4) new input parameters were included and the factors were treated appropriately. Generally, the SPA in the current study is consistent with the earthquake events, showing that Central Saudi Arabia, Egypt, and Sudan fall under low probability levels (index values from 0.002 to 0.093). Medium-to-high probability index values, ranging from 0.094 to 0.99, surround the very high probability areas. A very high probability index (0.991 to 1) is found in the Gulf of Aden, Red Sea, Iran, and Turkey (Figure 13). The current study, therefore, suggests that the hybrid ML-based SPA improves on previous works in the Arabian Peninsula [3,19,20] (Figure 9). This work is limited to SPA estimation without any risk assessment. A large study area needs a huge amount of training data for better accuracy. This can be studied using smart predictors to improve the SPA map. The proposed hybrid Inception v3-XGBoost model achieved good accuracy compared with other state-of-the-art ML models. However, the CNN model achieved a better prediction accuracy of 90%. Detailed information about the accuracy achieved by several ML models is shown in Table 7.

Conclusions
Earthquake spatial probability assessment is the most challenging among all natural hazard assessments owing to the multiple factors involved and the non-linearity of events. This work aimed at the transparency and explainability of AI-based models in the field of earthquake SPA. The advantage of the current study lies in operationalizing AI in a way that builds confidence in black-box models and monitors the models for optimization. Because this is the first-ever study using XAI for SPA, the main findings of the work are as follows:

The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA, and its global acceptability should be further tested with new factors and geotectonic conditions.

The SHAP implementation builds more trust in the use of ML models, thereby helping to understand data-mining models for SPA.

The study found the seismic gap to be an important predictor in SPA, along with eight other factors, confirming its inclusion in the assessment.



The results show that Central Saudi Arabia, Egypt, and Sudan fall under low probability levels (index values from 0.002 to 0.093), which dominate the major parts of the Arabian Peninsula. A very high probability index (0.991 to 1) is found in the Gulf of Aden, Red Sea, Iran, and Turkey.


The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.


This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project. Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.
The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA and its global acceptability should be further tested with new factors and geotectonic conditions.

Conclusions
Earthquakes spatial probability assessment is the most challenging among all natural hazards owing to multiple factors and event non-linearity.The transparency and explainability of the AI-based models in the field of earthquake SPA were aimed in this work.The advantages of the current study deal with operationalizing AI that builds confidence in black-box models and monitors the models to optimize.Because this is the first ever study using XAI for SPA, the main findings of the work are as follows:  The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA and its global acceptability should be further tested with new factors and geotectonic conditions. The SHAP implementation builds more trust toward the implementation of ML models, thereby grasping data-mining models for SPA.


The study found the importance of the seismic gap as a predictor in SPA along with eight other factors confirming its insertion in the assessment.


The results show that Central Saudi Arabia, Egypt, and Sudan come under low probability levels (index values range from 0.002 to 0.093) dominating the major parts of the Arabian Peninsula.Very high probability index (falling under the index values ranging from 0.991 to 1) can be found in the Gulf of Aden, Red Sea, Iran, and Turkey.


The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.


This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project.Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.
The SHAP implementation builds more trust toward the implementation of ML models, thereby grasping data-mining models for SPA.

Conclusions
Earthquakes spatial probability assessment is the most challenging among all natural hazards owing to multiple factors and event non-linearity.The transparency and explainability of the AI-based models in the field of earthquake SPA were aimed in this work.The advantages of the current study deal with operationalizing AI that builds confidence in black-box models and monitors the models to optimize.Because this is the first ever study using XAI for SPA, the main findings of the work are as follows:  The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA and its global acceptability should be further tested with new factors and geotectonic conditions. The SHAP implementation builds more trust toward the implementation of ML models, thereby grasping data-mining models for SPA.


The study found the importance of the seismic gap as a predictor in SPA along with eight other factors confirming its insertion in the assessment.


The results show that Central Saudi Arabia, Egypt, and Sudan come under low probability levels (index values range from 0.002 to 0.093) dominating the major parts of the Arabian Peninsula.Very high probability index (falling under the index values ranging from 0.991 to 1) can be found in the Gulf of Aden, Red Sea, Iran, and Turkey.


The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.


This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project.Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.
The study found the importance of the seismic gap as a predictor in SPA along with eight other factors confirming its insertion in the assessment.

Conclusions
Earthquakes spatial probability assessment is the most challenging among all natural hazards owing to multiple factors and event non-linearity.The transparency and explainability of the AI-based models in the field of earthquake SPA were aimed in this work.The advantages of the current study deal with operationalizing AI that builds confidence in black-box models and monitors the models to optimize.Because this is the first ever study using XAI for SPA, the main findings of the work are as follows:  The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA and its global acceptability should be further tested with new factors and geotectonic conditions. The SHAP implementation builds more trust toward the implementation of ML models, thereby grasping data-mining models for SPA.


The study found the importance of the seismic gap as a predictor in SPA along with eight other factors confirming its insertion in the assessment.


The results show that Central Saudi Arabia, Egypt, and Sudan come under low probability levels (index values range from 0.002 to 0.093) dominating the major parts of the Arabian Peninsula.Very high probability index (falling under the index values ranging from 0.991 to 1) can be found in the Gulf of Aden, Red Sea, Iran, and Turkey.


The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.


This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project.Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.
The results show that Central Saudi Arabia, Egypt, and Sudan come under low probability levels (index values range from 0.002 to 0.093) dominating the major parts of the Arabian Peninsula.Very high probability index (falling under the index values ranging from 0.991 to 1) can be found in the Gulf of Aden, Red Sea, Iran, and Turkey.

Conclusions
Earthquakes spatial probability assessment is the most challenging among all natural hazards owing to multiple factors and event non-linearity.The transparency and explainability of the AI-based models in the field of earthquake SPA were aimed in this work.The advantages of the current study deal with operationalizing AI that builds confidence in black-box models and monitors the models to optimize.Because this is the first ever study using XAI for SPA, the main findings of the work are as follows:  The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA and its global acceptability should be further tested with new factors and geotectonic conditions. The SHAP implementation builds more trust toward the implementation of ML models, thereby grasping data-mining models for SPA. The study found the importance of the seismic gap as a predictor in SPA along with eight other factors confirming its insertion in the assessment.


The results show that Central Saudi Arabia, Egypt, and Sudan come under low probability levels (index values range from 0.002 to 0.093) dominating the major parts of the Arabian Peninsula.Very high probability index (falling under the index values ranging from 0.991 to 1) can be found in the Gulf of Aden, Red Sea, Iran, and Turkey.


The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.


This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project.Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.
The recent earthquake of M w 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.

Conclusions
Spatial probability assessment is more challenging for earthquakes than for any other natural hazard, owing to the multiple factors involved and the non-linearity of the events. This work aimed to improve the transparency and explainability of AI-based models in the field of earthquake SPA. The advantage of the current study lies in operationalizing AI in a way that builds confidence in black-box models and allows the models to be monitored and optimized. This is the first study to use XAI for SPA, and its main findings are as follows:

The hybrid Inception v3-Ensemble XGBoost model was found to be an effective and robust approach for SPA, and its global acceptability should be further tested with new factors and geotectonic conditions.

The SHAP implementation builds more trust in the use of ML models, thereby supporting the adoption of data-mining models for SPA.

The study found the seismic gap to be an important predictor in SPA along with eight other factors, confirming its inclusion in the assessment.

The results show that Central Saudi Arabia, Egypt, and Sudan fall under low probability levels (index values from 0.002 to 0.093), which dominate most of the Arabian Peninsula. Very high probability levels (index values from 0.991 to 1) are found in the Gulf of Aden, Red Sea, Iran, and Turkey.

The recent earthquake of Mw 7.8 and the corresponding aftershocks show the importance of this study and can be used to validate the obtained results.

This may substantially contribute to establishing seismic codes for buildings in the Arab's pioneering project. Further, the results could provide relevant parameters to determine whether retrofitting is necessary to minimize ground-shaking effects in the Arabian Peninsula.

In earthquake SPA work, the inclusion of subduction-related parameters, fault surface area, and fault width is necessary for a better representation of seismic coupling and probability estimation.

Figure 1. (a) Location of the study area (the circle refers to the data collection area). (b) Different types of faults and tectonic boundaries along with the recent Turkey earthquake (Mw 7.8).

characterized by a feature extractor and a classification head [50]. This model replaced the classification head of the Inception v3 model with the XGBoost classifier (Figure 3b). The features extracted by the Inception v3 model were used as input to the XGBoost classifier. This hybrid setup also works with other pre-trained CNN and RNN models. In the first stage, the Inception v3 model was fine-tuned on the training dataset. This study observed that the information originating from Inception v3 leads to better results from the XGBoost classifier.
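
The head-replacement pattern described above can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation. The function `extract_features` is a hypothetical stand-in for the fine-tuned Inception v3 feature extractor, and `fit_boosted_head` is a simplified gradient-boosting routine with decision stumps that stands in for XGBoost (no regularization or second-order terms). The synthetic "rough" and "smooth" patches are assumptions introduced only to make the example runnable.

```python
import math
import random


def extract_features(patch):
    """Hypothetical stand-in for a frozen CNN feature extractor.

    Maps a 2D grid of floats to a fixed-length feature vector.
    """
    flat = [v for row in patch for v in row]
    mean = sum(flat) / len(flat)
    spread = max(flat) - min(flat)
    # Mean absolute difference between horizontal neighbours (a crude texture cue).
    diffs = [abs(row[j + 1] - row[j]) for row in patch for j in range(len(row) - 1)]
    return [mean, spread, sum(diffs) / len(diffs)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares decision stump: returns (feature, threshold, left_value, right_value)."""
    best = None
    for f in range(len(X[0])):
        values = sorted({x[f] for x in X})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for x, r in zip(X, residuals) if x[f] <= t]
            right = [r for x, r in zip(X, residuals) if x[f] > t]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    return best[1:]


def fit_boosted_head(X, y, rounds=30, lr=0.3):
    """Gradient boosting on log loss with stumps (simplified stand-in for XGBoost)."""
    stumps = []
    scores = [0.0] * len(X)
    for _ in range(rounds):
        # Negative gradient of the log loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        f, t, lm, rm = fit_stump(X, residuals)
        stumps.append((f, t, lr * lm, lr * rm))
        scores = [s + (lr * lm if x[f] <= t else lr * rm) for s, x in zip(scores, X)]
    return stumps


def predict_proba(stumps, x):
    """Probability of the positive class for one feature vector."""
    return sigmoid(sum(lv if x[f] <= t else rv for f, t, lv, rv in stumps))


def make_patch(rng, rough):
    """Synthetic 8x8 patch; 'rough' patches have ten times the amplitude."""
    amp = 1.0 if rough else 0.1
    return [[rng.uniform(-amp, amp) for _ in range(8)] for _ in range(8)]


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
patches = [make_patch(rng, bool(lab)) for lab in labels]

features = [extract_features(p) for p in patches]  # stage 1: feature extraction
head = fit_boosted_head(features, labels)           # stage 2: boosted classifier head

hits = sum(int(predict_proba(head, x) >= 0.5) == yi for x, yi in zip(features, labels))
train_accuracy = hits / len(labels)
print("training accuracy:", train_accuracy)
```

The point of the design is that the two stages are decoupled: the extractor is run once to produce feature vectors, and any tabular classifier can then be trained on them in place of the original classification head.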

Figure 3. Implementation of SHAP in model explanation followed by (a) Inception v3 and (b) XGBoost.

Figure 4. (a) Waterfall plot showing the contribution of factors in an individual data sample, and (b) bar plot showing the contribution of the factors to earthquake and non-earthquake classification.

Figure 5. Scatter plots showing SHAP values for important factors based on the waterfall plot: (a) proximity to thrust, (b) epicenter density, (c) curvature, and (d) magnitude variation.

Figure 6. SHAP interpretation for earthquake SPA: (a) bee swarm plot portraying SHAP importance, and (b) bar plot showing importance based on mean SHAP values.

It can be expected that the new SHAP summary results (Figure 8) will portray seismological factors as the most important factors for SPA estimation.

Figure 8. Summary plot showing the factors' interaction and importance based on the Inception v3-XGBoost model.

Figure 9. SPA maps for the Arabian Peninsula based on the Inception v3-XGBoost model with PGA, magnitude variation, and seismic gap. (a) Spatial probability, (b) Spatial non-probability.

Figure 10. SPA maps for the Arabian Peninsula based on the Inception v3-XGBoost model without PGA, magnitude variation, and seismic gap.

Figure 11. (a) PGA map derived based on PSHA, and (b) earthquake frequency vs. year analysis.

Figure 12. (a) ROC curve, (b) sensitivity analysis for training, (c) XGBoost log loss, and (d) classification error for training and testing data.

Figure 13. Recent mainshock and aftershocks in Turkey, which caused huge destruction, fall within the SPA maps based on the Inception v3-XGBoost model.

Table 1. Earthquake characteristics, deaths, and injuries in the Arabian Peninsula.

Table 2. Data used in the current study.

Table 3. Parameters used for the proposed model.

Table 4. Threshold and importance of factors derived based on the Inception v3-XGBoost model.

Table 5. Confusion matrix for Inception v3-XGBoost classification and statistical metrics.

Table 6. Classification report for Inception v3-XGBoost classification with macro and weighted average accuracy.