Article

Seismic Evaluation Based on Poisson Hidden Markov Models—The Case of Central and South America

by Evangelia Georgakopoulou 1,†, Theodoros M. Tsapanos 1,†, Andreas Makrides 2,3,†, Emmanuel Scordilis 1,†, Alex Karagrigoriou 4,*,†, Alexandra Papadopoulou 5,† and Vassilios Karastathis 6,†

1 School of Geology, Department of Geophysics, Geophysical Laboratory, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
2 Department of Statistics and Actuarial-Financial Mathematics, Lab of Statistics and Data Analysis, University of the Aegean, 83200 Samos, Greece
3 Department of Computer Science, University of Nicosia, Nicosia 2417, Cyprus
4 Department of Statistics and Insurance Science, University of Piraeus, 18534 Piraeus, Greece
5 Department of Mathematics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
6 National Observatory of Athens, Institute of Geodynamics, Lofos Nymfon, Thissio, 11851 Athens, Greece
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Stats 2024, 7(3), 777-792; https://doi.org/10.3390/stats7030047
Submission received: 4 June 2024 / Revised: 5 July 2024 / Accepted: 10 July 2024 / Published: 23 July 2024

Abstract

A study of earthquake seismicity is undertaken over the areas of Central and South America, the tectonics of which are of great interest. The whole territory is divided into 10 seismic zones based on some seismotectonic characteristics, as in previously published studies. The earthquakes used in the present study are extracted from the catalogs of the International Seismological Center, cover the period of 1900–2021, and are restricted to shallow depths (≤60 km) and magnitudes $M \geq 4.5$. Fore- and aftershocks are removed according to Reasenberg’s technique. The paper confines itself to the evaluation of earthquake occurrence probabilities in the seismic zones covering parts of Central and South America, and we implement the hidden Markov model (HMM) and apply the EM algorithm.

1. Introduction

Seismic hazard evaluation is a problem of high scientific importance due to the consequences that are often associated with earthquake occurrence. Central and South America are regions with very high seismicity levels. In fact, two of the strongest shocks ever recorded occurred in this region, both in Chile, with magnitudes of $M = 9.5$ (1960) and $M = 8.8$ (2010) on the moment magnitude scale. It should be noted that, based on their seismicity levels, Chile and Peru have been placed in second and fourth place, respectively, among 50 seismogenic countries [1]. The region of Central America is known to be less seismogenic compared to South America. Indeed, large shocks like the one in the year 2012 with a magnitude of $M = 8.2$ rarely occur, but shocks with magnitudes from $M = 7.0$ to $M = 7.9$ are frequently recorded.
The tectonics of the abovementioned areas are characterized by the underthrusting of the Nazca (South America) and Cocos (Central America) Plates. Reverse faults prevail in both areas, and almost all of the seismicity occurs along the west coasts [2]. The high seismicity indicates that almost 90% of the deformation is released by earthquakes [3,4,5]. Convergence velocities of approximately 9.0 cm/year are observed in these regions [6,7].
The high seismicity and related phenomena, such as deformation, indicate that the Nazca Plate is a complicated tectonic structure [8]. The Cocos Plate, which was created by seafloor spreading along the East Pacific Rise, is also a complicated tectonic plate; it was created almost 23 million years ago when the Farallon Plate broke into two pieces [9]. Note that the Cocos Plate in turn broke into two pieces (see [9]), creating the small Rivera Plate, which is bounded by the North American and Caribbean Plates (northeast), the Pacific Plate (west), and the Nazca Plate (south) (see Figure 1).
Several researchers have worked on various aspects of seismic activity. For example, Ruiz and Madariaga [11] focused on historical and recent megathrust earthquakes in Chile to identify seismic gaps, earthquake periodicity, and potential precursors of such events. In contrast, Tsapanos [12] analyzed the return periods of earthquakes in South America based on different seismicity parameters at various focal depths. More recently, several authors [13,14,15] have studied seismic activity in South and Central America.
From a mathematical perspective, physical phenomena such as earthquake activity are often described by deterministic models when the temporal evolution is known, or stochastic models when the time evolution is unknown. Stochastic processes are classic examples of functions of time that can represent the evolution of a system of random values over time. One such process, with numerous applications, is the Markov chain, which transitions from one state to another among a finite number of states. Vere-Jones in [16] presented a model for aftershocks in which successive aftershocks are seen as transitions of an active system from one state to another, with these states linked in a Markov chain. The states include aftershock frequency, energy release, and the frequency-magnitude law. However, these models are not very satisfactory as they are not tied to a specific theory of earthquake mechanisms, limiting their effectiveness.
A Markov chain was recently used by Nava et al. in [17] for seismic risk assessment in Japan by modeling the transition probabilities of different types of seismicity in a geographic area over a time interval. The resulting high transition probabilities provided satisfactory predictions for the Japan area. In South America, Tsapanos in [18] applied the Markov model and successfully predicted the earthquakes of 2001 and 2010 with magnitudes $M_w = 8.4$ and $M_w = 8.8$, respectively ($M_w$: moment magnitude scale). While Markov models are effective, they are limited in terms of the distribution of waiting times.
Alternatives to Markov models include semi-Markov and hidden Markov models (HMM), both of which have various applications, including in seismology. A hidden Markov model consists of hidden states, observed values, and transition probabilities. Researchers have used the hidden Markov methodology to compute earthquake occurrence probabilities [19]. Granat and Donnellan in [20] applied HMM to earthquake data in Southern California to estimate earthquake magnitudes above a certain threshold. Ebel et al. [21] used HMM to predict future earthquakes based on available data. Chambers et al. [22] developed a new method based on HMM for earthquake prediction in Southern California and Western Nevada. Li and Anderson-Spencer in [23] proposed using HMM to estimate the distribution of waiting times for earthquake swarms, focusing on the largest swarms in the Yellowstone region. Semi-Markov models allow for any underlying waiting-time distribution but do not consider hidden states that may affect seismic activity. Such hidden states are the primary focus of this work, with each hidden state representing the mean of a Poisson distribution.
This paper focuses on evaluating earthquake occurrence probabilities in the seismic zones that cover parts of Central and South America. The evaluation is done by implementing the hidden Markov model, specifically the Poisson HMM (PHMM), which is a dual discrete-time stochastic process. The parameters of the underlying Poisson distributions represent the hidden states of the process. The methodology is discussed in Section 2, while Section 3 is dedicated to implementing the PHMM methodology for the regions of Central and South America. Conclusions and a discussion are provided in Section 4.

2. The HMM Methodology

Hidden Markov models (HMM) are statistical models proposed by Baum and Petrie in [24] and have been applied in various scientific fields such as biology, gesture and speech recognition, economics, bioinformatics, and seismology.
A hidden Markov model is a doubly stochastic process. It consists of an underlying Markov process with state space $\lambda = \{\lambda_1, \lambda_2, \dots, \lambda_N\}$, where $N$ is the number of hidden states, and of an observation process, i.e., a sequence of random variables that take values in the space $O = \{O_1, O_2, \dots, O_M\}$, where $M$ is the number of possible observations, which does not necessarily coincide with $N$. Unlike the observation process, the underlying process cannot be observed. More specifically, the observable sequence is produced by, and depends on, the states of the Markov chain.
A Poisson hidden Markov model (PHMM) is a dual discrete-time stochastic process consisting of the finite-space underlying Markov process $X_t$, $t = 1, \dots, T$, and an observable stochastic process $Y_t$, $t = 1, \dots, T$, taking values in the $\lambda$ and $O$ sets, respectively (Figure 2). The observed process is $Y_1, \dots, Y_T$, where each $Y_t$ is associated with the corresponding term of the process $X_1, \dots, X_T$, and each $X_t$ is associated with one of the $N$ states $\lambda_1, \dots, \lambda_N$, $N \leq T$ (see also Figure 3, where the same hidden states can be revisited during the realization of the process $X_t$). A PHMM is characterized by the following five elements:
  1. The number of hidden states ($N$): Each observation $Y_t$ at the time point $t$, $t \in \{1, 2, \dots, T\}$, comes from one of $N$ Poisson distributions, but the state $X_t$ of $Y_t$ is not directly observed, i.e., it is hidden. The unobserved hidden state corresponding to $X_t$ at the time point $t$ of the underlying homogeneous Markov chain is the mean $\lambda_n$, $n = 1, 2, \dots, N$, of the associated Poisson distribution. The $N$ states of a PHMM are denoted by $\lambda = \{\lambda_1, \lambda_2, \dots, \lambda_N\}$, where the state $\lambda_i$ represents the rate of occurrence of events associated with the specific state.
  2. The transition probability matrix ($P$): Each element of the transition probability matrix $P$ expresses the probability of transition in one step from state $\lambda_i$ at time $t$ to state $\lambda_j$ at time $t + 1$:
$$p_{ij} = P(X_{t+1} = \lambda_j \mid X_t = \lambda_i), \quad \lambda_i, \lambda_j \in \lambda, \quad t = 1, 2, \dots, T. \tag{1}$$
The following matrix shows the transition probability matrix for $N$ hidden states:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1N} \\ p_{21} & p_{22} & \cdots & p_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{N1} & p_{N2} & \cdots & p_{NN} \end{pmatrix} \tag{2}$$
  3. Observations $Y_t$, $t = 1, 2, \dots, T$, are non-negative integer values.
  4. When the system is in state $\lambda_i$ at time $t$, the observed value $Y_t$ comes from a Poisson distribution with parameter $\lambda_i$. Thus, the distribution of $Y_t$ conditional on the state $X_t$ is defined as follows:
$$P(Y_t = y \mid X_t = \lambda_i) = \frac{e^{-\lambda_i} \lambda_i^{y}}{y!}, \quad y = 0, 1, 2, \dots \tag{3}$$
The above conditional probabilities for each $y$ and $i$ are often viewed as the elements $B(y, i)$ of a $T \times N$ matrix known as the emission probability matrix, denoted by $B$. Note that in theory there is no upper bound on the value that the random variable can take, but the probabilities eventually become tiny for large values of $Y_t$. For all practical purposes, we could therefore choose a large value for $T$.
  5. The $N$-dimensional row vector $\pi$, namely the initial distribution, which consists of the elements $\pi(i)$ representing the probability that the Markov process starts from state $\lambda_i$, that is
$$\pi(i) = P(X_0 = \lambda_i), \quad \lambda_i \in \lambda. \tag{4}$$
The PHMM with the above characteristics is often represented by $\Omega = \{P, B, \pi\}$. Figure 3 indicates a 3-state PHMM with the relevant parameters and the associated transition probabilities.
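To make the five elements above concrete, the following minimal sketch simulates a short realization of a PHMM. It is only an illustration: the rates, transition matrix, initial distribution, and the function names (`sample_poisson`, `choose`, `simulate_phmm`) are assumptions introduced here and are not taken from the paper.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by inverting the CDF (adequate for moderate lam)."""
    u = rng.random()
    k = 0
    p = math.exp(-lam)          # P(Y = 0)
    cdf = p
    while u > cdf and p > 0.0:  # the p > 0 guard stops the loop if the tail underflows
        k += 1
        p *= lam / k            # recurrence P(Y = k) = P(Y = k - 1) * lam / k
        cdf += p
    return k

def choose(probs, rng):
    """Pick an index according to a discrete probability vector."""
    u = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1

def simulate_phmm(lambdas, P, pi, T, seed=0):
    """Return (hidden state indices, observed counts) for T time points."""
    rng = random.Random(seed)
    states, counts = [], []
    state = choose(pi, rng)                       # initial distribution pi
    for _ in range(T):
        states.append(state)
        counts.append(sample_poisson(lambdas[state], rng))  # Poisson emission
        state = choose(P[state], rng)             # one-step transition, row of P
    return states, counts

# Hypothetical 2-state model (values chosen for illustration only).
lambdas = [2.0, 9.0]
P = [[0.9, 0.1], [0.3, 0.7]]
pi = [0.5, 0.5]
states, counts = simulate_phmm(lambdas, P, pi, T=50)
```

In an application only `counts` would be available; `states` plays the role of the hidden sequence $X_1, \dots, X_T$.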
There have been a few applications of PHMM on earthquake problems. For instance, Can et al. in [25] used PHMM for prediction of earthquake hazard around Bilecik (NW Turkey) and Orfanogiannaki et al. in [26] applied PHMM to identify seismicity levels in the seismogenic area of Killini (Ionian Sea).

2.1. The Three Main Issues of HMM

According to [27], when dealing with the observation sequence $Y = \{Y_1, Y_2, \dots, Y_T\}$ and the model $\Omega$, using HMM in real applications involves solving three fundamental issues. These are:
(1) The Estimation problem, which involves calculating the probability $P(Y \mid \Omega)$, i.e., the probability that the observation sequence was produced by the given model. This helps in selecting the model that best fits the observations when multiple models are considered.
(2) The Learning problem, which focuses on maximizing the probability $P(Y \mid \Omega)$ by determining the optimal parameters of the model.
(3) The Decoding problem, which is about finding the optimal sequence of states for the Markov process, i.e., the sequence of Poisson distributions that generate the given observations.
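The Estimation and Decoding problems can be sketched in a few lines. The code below is an illustrative sketch, not the implementation used in the paper: it uses the standard scaled forward recursion for $\ln P(Y \mid \Omega)$ and the Viterbi recursion for the most likely state sequence. All function names and the toy parameter values are assumptions.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability mass, evaluated in log space to avoid overflow."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def forward_loglik(obs, lambdas, P, pi):
    """Estimation problem: ln P(Y | Omega) via the scaled forward algorithm."""
    n = len(lambdas)
    alpha = [pi[i] * poisson_pmf(obs[0], lambdas[i]) for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * P[i][j] for i in range(n))
                     * poisson_pmf(obs[t], lambdas[j]) for j in range(n)]
        c = sum(alpha)                    # scaling constant at time t
        loglik += math.log(c)
        alpha = [a / c for a in alpha]    # renormalize to avoid underflow
    return loglik

def viterbi(obs, lambdas, P, pi):
    """Decoding problem: most likely hidden state sequence (log domain)."""
    n = len(lambdas)
    lg = lambda x: math.log(x) if x > 0 else float("-inf")
    delta = [lg(pi[i]) + lg(poisson_pmf(obs[0], lambdas[i])) for i in range(n)]
    back = []
    for y in obs[1:]:
        prev, ptr, delta = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + lg(P[i][j]))
            ptr.append(best)
            delta.append(prev[best] + lg(P[best][j]) + lg(poisson_pmf(y, lambdas[j])))
        back.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(back):            # follow the back-pointers
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy example with hypothetical parameters.
toy_obs = [1, 2, 1, 10, 12, 9]
toy_lambdas = [2.0, 10.0]
toy_P = [[0.9, 0.1], [0.1, 0.9]]
toy_pi = [0.5, 0.5]
toy_loglik = forward_loglik(toy_obs, toy_lambdas, toy_P, toy_pi)
toy_path = viterbi(toy_obs, toy_lambdas, toy_P, toy_pi)
```

Comparing `forward_loglik` across candidate models is what the Estimation problem amounts to in practice, while the Learning problem is the subject of the EM procedure discussed in Section 2.2.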

2.2. The Implementation of the PHMM

To begin, the number $N$ of hidden states must be determined for each seismic zone. This number represents the different Poisson distributions generating the observations $Y_1, \dots, Y_T$, which express the annual number of earthquakes in each seismic zone. Due to the dataset’s size, it is more appropriate to work with annual events rather than monthly or daily ones. The researcher’s goal is to identify the optimal number of hidden states and their parameter values $\lambda_1, \dots, \lambda_N$ to accurately describe how the observations were generated. The Baum-Welch algorithm [28], a form of the Expectation Maximization (EM) algorithm (see [29]), is used to determine these parameters iteratively through the E-step and M-step until convergence is achieved. The Baum-Welch algorithm is repeated for various values of $N$ to identify candidate models. Model selection criteria, such as the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), are then used to choose the best-fitting model. The model with the lowest criterion value is selected as the most suitable one. It is noted that the R software is employed for implementing the PHMM.
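The E-step and M-step mentioned above can be sketched as follows. This is a minimal, textbook-style Baum-Welch iteration for Poisson emissions written in Python for illustration; it is not the R code used in the paper, and the function names, starting values, and toy counts are assumptions.

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def em_step(obs, lambdas, P, pi):
    """One E-step and M-step. Returns updated (lambdas, P, pi) and ln L of the inputs."""
    T, n = len(obs), len(lambdas)
    b = [[poisson_pmf(y, lam) for lam in lambdas] for y in obs]
    # E-step, part 1: scaled forward pass.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            a = [pi[i] * b[0][i] for i in range(n)]
        else:
            a = [sum(alpha[t - 1][i] * P[i][j] for i in range(n)) * b[t][j]
                 for j in range(n)]
        c = sum(a)
        scale.append(c)
        alpha.append([x / c for x in a])
    # E-step, part 2: scaled backward pass.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(P[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]
    # Posterior state probabilities gamma_t(i).
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    # M-step: expected transition counts, normalized by row.
    new_P = []
    for i in range(n):
        row = [sum(alpha[t][i] * P[i][j] * b[t + 1][j] * beta[t + 1][j] / scale[t + 1]
                   for t in range(T - 1)) for j in range(n)]
        s = sum(row)
        new_P.append([x / s for x in row])
    # M-step: each Poisson rate is a gamma-weighted mean of the counts.
    new_lambdas = [sum(gamma[t][i] * obs[t] for t in range(T))
                   / sum(gamma[t][i] for t in range(T)) for i in range(n)]
    loglik = sum(math.log(c) for c in scale)
    return new_lambdas, new_P, list(gamma[0]), loglik

def fit_phmm(obs, lambdas, P, pi, max_iter=200, tol=1e-8):
    """Iterate EM until the log-likelihood gain falls below tol."""
    history = []
    for _ in range(max_iter):
        new_lambdas, new_P, new_pi, loglik = em_step(obs, lambdas, P, pi)
        if history and loglik - history[-1] < tol:
            history.append(loglik)
            break
        history.append(loglik)
        lambdas, P, pi = new_lambdas, new_P, new_pi
    return lambdas, P, pi, history

# Hypothetical annual counts and starting values (illustration only).
demo_obs = [1, 2, 3, 1, 2, 11, 9, 12, 10, 8, 2, 1, 3, 2, 10, 11]
fit_lambdas, fit_P, fit_pi, history = fit_phmm(
    demo_obs, [1.0, 5.0], [[0.8, 0.2], [0.2, 0.8]], [0.5, 0.5])
```

The log-likelihood recorded in `history` never decreases from one iteration to the next, which is the defining property of EM; the final value is what enters the AIC and BIC comparison across different $N$.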
After determining the ideal model, the steady state probabilities ($\pi_{ij}$) need to be calculated. These probabilities represent the long-term probabilities of the system being in each state. For this, the following is calculated:
$$\Pi = (\pi_{ij})_{1 \leq i, j \leq N} = \lim_{k \to \infty} P^k, \tag{5}$$
where $P$ is the transition probability matrix given in (2).
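As a worked illustration of this limit (a sketch for the two-state case only, with $a$ and $b$ introduced here as shorthand for $p_{12}$ and $p_{21}$ and assumed to satisfy $0 < a + b < 2$), the $k$-step matrix has the closed form

$$P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}, \qquad P^k = \frac{1}{a+b}\begin{pmatrix} b & a \\ b & a \end{pmatrix} + \frac{(1-a-b)^k}{a+b}\begin{pmatrix} a & -a \\ -b & b \end{pmatrix}.$$

Since $|1-a-b| < 1$, the second term vanishes as $k \to \infty$, so every row of $\Pi$ equals $\left(\frac{b}{a+b}, \frac{a}{a+b}\right)$. The factor $(1-a-b)^k$ also indicates how many steps are needed before the rows of $P^k$ agree to a given number of decimals.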
Let $N_{ij}$ be the number of transitions from hidden state $i$ to hidden state $j$ and $N_i$ the number of exits from hidden state $i$. The estimated transition probabilities are defined by
$$\hat{p}_{ij} = \frac{N_{ij}}{N_i},$$
and are used as initial values for the implementation of the Baum–Welch algorithm under the package “HMMpa” of the R software. The likelihood associated with the classical Poisson model is defined as
$$L = e^{-M\lambda} \prod_{i=1}^{M} \frac{\lambda^{x_i}}{x_i!},$$
while in the more complex cases with $N \geq 2$ hidden states the likelihood is formulated accordingly, taking into account all the parameters $\lambda_i$ involved.
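The two quantities above are simple to compute. The sketch below is illustrative only: it assumes that a rough labelling of the years into states is already available (how such a labelling is obtained is not specified here), that $N_i$ counts all transitions out of state $i$ including self-transitions so that each row sums to one, and the function names and toy data are hypothetical.

```python
import math

def transition_estimates(path, n_states):
    """p_hat[i][j] = N_ij / N_i from a sequence of state labels."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1                  # N_ij
    p_hat = []
    for row in counts:
        exits = sum(row)                   # N_i (assumed to include i -> i)
        p_hat.append([c / exits if exits else 0.0 for c in row])
    return p_hat

def poisson_loglik(xs, lam):
    """ln L for the one-state model, L = exp(-M*lam) * prod(lam**x_i / x_i!)."""
    M = len(xs)
    return (-M * lam + sum(xs) * math.log(lam)
            - sum(math.lgamma(x + 1) for x in xs))

# Hypothetical state labels and counts.
labels = [0, 0, 1, 0, 1, 1, 0]
p_hat = transition_estimates(labels, 2)
xs = [3, 5, 4, 6, 2]
lam_mle = sum(xs) / len(xs)                # the Poisson MLE is the sample mean
ll_at_mle = poisson_loglik(xs, lam_mle)
```

The one-state log-likelihood evaluated at the sample mean is the baseline against which the $N \geq 2$ models are compared.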
The PHMM combined with the EM algorithm and the AIC and BIC criteria will be applied in the following section for the Central and South American regions.

3. Application of PHMM in Central and South America Seismicity

The methodology outlined in the previous sections will be demonstrated for Central and South America.

3.1. Data Set and Processing

The geographic area under investigation is divided into ten seismic zones (Zone 1–Zone 10, see Figure 4) based on previously published papers (e.g., [30,31,32,33]).
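A complete and homogeneous earthquake catalogue covering the region bounded by the coordinates 17° N to 45° S and 95° W to 65° W and extending over a wide time period (1900–2021) has been prepared for the needs of the present work. The main databases used are the bulletins of the ISC, the NEIC and the GCMTC, together with published global earthquake catalogues (see the Data Availability Statement).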
Special handling was necessary to determine the most accurate magnitude for each earthquake in the catalog. The reported magnitudes in the data sources mentioned are provided in different scales, such as $M_s$ for surface wave magnitude, $m_b$ for body wave magnitude, and $M_w$ for moment magnitude. To ensure homogeneity in the catalog, the moment magnitude scale, $M_w$, was chosen as the most reliable. All other magnitudes were converted to $M_w$ using established formulas (e.g., [37,38,39]). The final magnitude assigned to each earthquake was either the original moment magnitude (taken from various sources) or the equivalent moment magnitude calculated as the weighted average of the converted magnitudes, with each weight determined by the inverse standard deviation of the corresponding conversion relation.
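The weighted average described above can be written out as a short sketch. The function name and the numerical values are hypothetical; in particular, the converted magnitudes and standard deviations below do not come from the conversion relations in [37,38,39].

```python
def equivalent_mw(converted, sigmas):
    """Weighted mean of converted magnitudes with weights 1 / sigma."""
    weights = [1.0 / s for s in sigmas]
    return sum(w * m for w, m in zip(weights, converted)) / sum(weights)

# Hypothetical case: M_s and m_b both converted to M_w, with the standard
# deviations of their respective conversion relations.
mw_from_ms, mw_from_mb = 6.1, 6.3
sigma_ms, sigma_mb = 0.2, 0.4
mw_equiv = equivalent_mw([mw_from_ms, mw_from_mb], [sigma_ms, sigma_mb])
```

The conversion with the smaller standard deviation receives the larger weight, so the result lies closer to 6.1 than to 6.3.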
The final version of the catalog includes information on the focal parameters of shallow earthquakes (≤60 km). Magnitudes are expressed on the $M_w$ scale, and the data processed pertain to earthquakes with magnitudes $M_w \geq 4.5$ that occurred within the specified geographic area. It is important to note that the magnitude of completeness ($M_c$), which is the minimum magnitude at which all earthquakes are considered to be reliably recorded, varies across different seismic zones depending on the available data and time. Additionally, the year of completeness is the year from which all earthquakes with magnitudes greater than $M_c$ are included in the catalog.
The method proposed by Reasenberg [40] is a useful tool for declustering data sets. In this method, data considered as foreshocks and/or aftershocks are removed from the original dataset. Table 1 provides the magnitude of completeness ($M_c$), the corresponding year, and the number of earthquakes after declustering for each zone.
In the following sections, representative results based on the proposed methodology are provided for zones 4 and 7 of the database mentioned above. Similar results have been obtained for all other zones. It is important to note that, as will be seen in Section 3.4, the steady state probabilities for zones 4 and 7 are reached relatively fast ($k = 10$ and $k = 8$ steps ahead, respectively) compared to the corresponding probabilities for other zones such as 1, 2, 9, or 10, where 19 or more steps are required.

3.2. The Determination of the N Hidden States

For the determination of the hidden states, various values of $N$ have been considered and the corresponding values of the AIC and BIC selection criteria have been evaluated. For comparative purposes, the PHMM models with $N \geq 2$ are compared with the classical Poisson model, where no hidden states are involved but instead all observations $Y_t$ come from a single Poisson distribution (state 1) with mean $\lambda_1$. Figure 5 and Figure 6 provide the calculated values of AIC (blue curves) and BIC (red curves) for all candidate models under investigation. According to both criteria, in zone 4, the most suitable model, i.e., the one that best describes the underlying process, is the two-state PHMM. In zone 7, the proposed approach recommends two models, the 3-state PHMM according to AIC (blue curve in Figure 6, min = 282.5186) and the 2-state PHMM according to BIC (red curve in Figure 6, min = 297.1541).
Zone 4 includes parts of Colombia and Panama with a recorded (year 1965) maximum earthquake magnitude of 7.4, while zone 7 includes Peru with a recorded (year 1992) maximum earthquake magnitude of 8.4. Similar results to those for zones 4 and 7 have been obtained for the other zones of Central and South America (results not shown). Table 2 and Table 3 provide the values of the AIC and BIC criteria for different numbers of hidden states ($N = 1$ through $N = 6$, with “1” representing the Poisson model) for all seismic zones. The minimum values of the AIC and BIC criteria are highlighted in red.
The minimum values of AIC and BIC in zones 1, 2, 4, 5 and 8 occur for the same candidate model, but the same is not true for zones 3, 6, 7, 9 and 10. It should be noted, though, that in these cases the best two models coincide for both criteria, with the values of the criteria being very close. To select the best model in these zones, one could obtain the estimated frequencies for each of the two selected models and choose the one that better fits the observed frequencies (see Section 3.5). We observe that in all cases the PHMM is superior to the classical model. In some instances the superiority is limited, as in zone 4, where the PHMM with $N = 2$ is slightly better than the Poisson model, but in others it is very pronounced, as in zones 1, 9 and 10, where the classical Poisson model fails to describe the process as opposed to the PHMM with $N \geq 2$. One of the weak points of the methodology is that the selection criteria (AIC and BIC) may not select the same model, as in zone 7, where the criteria choose between $N = 2$ and $N = 3$ hidden states but do not agree. In such cases the researcher may consider the simplest among the chosen models, since the contribution of the extra state is most likely insignificant (see, e.g., the last column and row of the matrix in (8)).
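The disagreement between the two criteria is easy to reproduce in a small sketch. The code below uses the standard definitions AIC $= -2 \ln L + 2p$ and BIC $= -2 \ln L + p \ln T$. The parameter count $p = N^2$ ($N$ rates plus $N(N-1)$ free transition probabilities, initial distribution not counted) is an assumption, since the paper does not state its convention, and the log-likelihood values and series length are made up for illustration.

```python
import math

def n_params(N):
    """Assumed parameter count: N Poisson rates + N*(N-1) free transition probabilities."""
    return N * N

def aic(loglik, p):
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, T):
    return -2.0 * loglik + p * math.log(T)

# Hypothetical maximized log-likelihoods for N = 1..4 and a series of T = 50 years.
T_years = 50
logliks = {1: -200.0, 2: -140.0, 3: -133.0, 4: -130.0}

aic_values = {N: aic(ll, n_params(N)) for N, ll in logliks.items()}
bic_values = {N: bic(ll, n_params(N), T_years) for N, ll in logliks.items()}

best_by_aic = min(aic_values, key=aic_values.get)
best_by_bic = min(bic_values, key=bic_values.get)
```

With these made-up values AIC prefers the 3-state model and BIC the 2-state model, because the BIC penalty per parameter, $\ln T$, exceeds the AIC penalty of 2 whenever $T > e^2 \approx 7.4$.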

3.3. Parameter Estimation and Transition Probabilities

After identifying the proper number N of hidden states based on the available data and methodology, we can proceed with estimating seismicity in the study areas for the next 30 years (2022–2051). This section focuses on obtaining parameter estimates and the transition probability matrix P using the EM algorithm. The subsequent section will address the steady state probability distribution. The results below pertain to zones 4 and 7 but similar results can be obtained for other zones.
For zone 4, the best model selected, a 2-state PHMM, is detailed in Table 4, with the transition probabilities given in (6). Table 4 displays the log-likelihood ($\ln L$) alongside the estimators $\hat{\lambda}_1$ and $\hat{\lambda}_2$, while the matrix in (6) gives the transition probabilities of the 2-state PHMM.
$$P_{\mathrm{zone}\,4} = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = \begin{pmatrix} 0.8601 & 0.1399 \\ 0.2879 & 0.7121 \end{pmatrix}. \tag{6}$$
Table 5 and the matrices $P_{\mathrm{zone}\,7;2}$ and $P_{\mathrm{zone}\,7;3}$ given in (7) and (8) present the results for zone 7 for both the 2-state PHMM (Table 5, left part, and the matrix in (7)) and the 3-state PHMM (Table 5, right part, and the matrix in (8)). Table 5 provides the parameter estimates, the log-likelihoods and the values of AIC and BIC for both the 2- and the 3-state PHMMs. Observe that the values of the criteria are quite close for the two competing models.
$$P_{\mathrm{zone}\,7;2} = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = \begin{pmatrix} 0.6621 & 0.3379 \\ 0.2825 & 0.7175 \end{pmatrix} \tag{7}$$
$$P_{\mathrm{zone}\,7;3} = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix} = \begin{pmatrix} 0.6581 & 0.2721 & 0.0698 \\ 0.2853 & 0.7147 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \tag{8}$$
The $\lambda_i$ values represent the earthquake occurrence rate for each hidden state ($\lambda_1, \dots, \lambda_N$) specific to each zone. Notably, the estimates $\hat{\lambda}_1$ and $\hat{\lambda}_2$ are quite similar across models, while $\hat{\lambda}_3$ is significantly larger and corresponds to a rarely visited state. Since each $\lambda_i$ is the mean of the annual number of earthquakes in state $i$, a value as large as 104 (for $\lambda_3$) means an expected 104 events per year while the process is in state 3, compared to approximately 30 or 60 for the first two hidden states (see Table 5). The rarity of state 3 is supported by the transition probabilities in the matrix given in (8), where transitions from state 1 or 2 to state 3 are minimal (0.0698 or 0.0000). The most common transitions are within state 1 or 2. If the process does reach state 3, it moves to state 2 with probability 1. Recall that $\lambda_i$ ($i = 1, 2, 3$) signifies the hidden state, i.e., the Poisson distribution rate.

3.4. Steady State Distribution

Figure 7, Figure 8 and Figure 9 present the estimates of the steady state transition probabilities for the 2-state PHMM of zone 4 and the 2-state and 3-state PHMMs of zone 7, for the next 30 years. The probabilities have been calculated according to (5).
Figure 7 shows the behavior of the steady state distribution of the PHMM’s hidden states for zone 4 over the next 30 years. As depicted in Figure 5, the model comprises two states, with the first being more probable than the second. The steady state distribution is obtained after $k = 10$ steps ahead (for the year 2031), with probability just under 70% for the first hidden state and just over 30% for the second hidden state (see the steady state probability matrix given in (9)). For $k = 1$ (for 2022), the results in Figure 7 align with the 1st row of the transition probability matrix given in (6) ($p_{11} = 0.8601$, $p_{12} = 0.1399$).
$$\Pi_{\mathrm{zone}\,4} = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} = \begin{pmatrix} 0.6730 & 0.3270 \\ 0.6730 & 0.3270 \end{pmatrix} \tag{9}$$
The steady state distributions for the two models (according to Figure 6 and Table 2 and Table 3) for zone 7 are shown in Figure 8 and Figure 9, using expression (5) and matrices (7) and (8) for $k = 1$ (2022), 2 (2023), …, 30 (2051). The two hidden states of the 2-state model are almost equally likely to occur, with the steady state distribution obtained for $k = 8$ steps ahead, for the year 2029 (see the matrix in (10)). On the other hand, for the 3-state PHMM, $\lambda_1$ and $\lambda_2$ appear as the dominant states, in contrast with $\lambda_3$, which occurs with probability just over 3%, while the steady state distribution is obtained relatively fast, for $k = 5$ (see the matrix in (11)). Observe that for $k = 1$ (for 2022) the results are those appearing in the 1st row of the matrices given in (7) and (8). For the sake of completeness, Table 6 provides the steady state distribution for all ten zones, including the number of steps $k$ needed for the steady state distribution to be achieved and the transition probabilities. Notice that for zones 3, 6, 7, 9 and 10 the table provides the results for both models selected by AIC and BIC (see Table 2 and Table 3). Although figures and tables have not been presented for each zone, all candidate models for each zone have been provided in Table 2 and Table 3, while the steady state probabilities for each zone have been provided in Table 6. Based on the results in Table 2, Table 3 and Table 6, one could easily produce figures analogous to Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9.
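$$\Pi_{\mathrm{zone}\,7;2} = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} = \begin{pmatrix} 0.4554 & 0.5446 \\ 0.4554 & 0.5446 \end{pmatrix} \tag{10}$$
$$\Pi_{\mathrm{zone}\,7;3} = \begin{pmatrix} \pi_{11} & \pi_{12} & \pi_{13} \\ \pi_{21} & \pi_{22} & \pi_{23} \\ \pi_{31} & \pi_{32} & \pi_{33} \end{pmatrix} = \begin{pmatrix} 0.4409 & 0.5284 & 0.0307 \\ 0.4409 & 0.5284 & 0.0307 \\ 0.4409 & 0.5284 & 0.0307 \end{pmatrix} \tag{11}$$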
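The steady state matrices can be checked numerically by raising the reported transition matrices to a power, as in expression (5). The sketch below uses only the matrices printed in (6) and (8); the helper names are assumptions, and the choice of $k = 30$ simply mirrors the 30-year horizon rather than any convergence rule stated in the paper.

```python
def mat_mul(A, B):
    """Plain matrix product for small square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def k_step(P, k):
    """Return P^k for k >= 1 by repeated multiplication."""
    result = P
    for _ in range(k - 1):
        result = mat_mul(result, P)
    return result

# Transition matrices as reported in (6) and (8).
P_zone4 = [[0.8601, 0.1399],
           [0.2879, 0.7121]]
P_zone7_3 = [[0.6581, 0.2721, 0.0698],
             [0.2853, 0.7147, 0.0],
             [0.0, 1.0, 0.0]]

one_step_zone4 = k_step(P_zone4, 1)      # k = 1: first row equals (p11, p12)
steady_zone4 = k_step(P_zone4, 30)       # rows are numerically identical by k = 30
steady_zone7_3 = k_step(P_zone7_3, 30)
```

Up to rounding in the fourth decimal, the rows of `steady_zone4` and `steady_zone7_3` agree with the matrices in (9) and (11), and the first row of $P^k$ for intermediate $k$ gives the year-by-year probabilities when the process starts in state 1.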

3.5. Earthquake Frequency—Accuracy Evaluation

The analysis concludes with an accuracy evaluation of the methodology, detailed in Table 7 and Table 8. Due to the limited sample size, earthquake frequency is categorized into four classes that appear to represent adequately the earthquake frequencies in all seismic zones, with different end points according to the total seismic activity in each zone. Results for zones 4 and 7 (with similar results for all zones) include estimated frequencies for the chosen models (2- or 3-state PHMM) and the classical Poisson model (state 1). As anticipated, the models recommended in previous sections align better with the observed frequencies across classes (Table 7 and Table 8) than the classical Poisson model. Note that the estimated numbers of years (frequencies) are obtained via the expression
$$\hat{o}_j(x) = M \sum_{i=1}^{N} \pi_{ij} P_{\lambda_i}(X = x),$$
where, for each zone and each class $j$, $M$ is the number of events and $P_{\lambda_i}(X = x)$ is the mass function of the Poisson distribution with parameter $\lambda_i$.
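A minimal sketch of this calculation is given below. Several ingredients are assumptions made for illustration: the weights are taken to be the steady state probabilities of the hidden states (here the values in (10)), the frequency of a class is obtained by summing the mass function over the counts in that class, and the rates, class end points, and the value of $M$ are hypothetical (the rates are merely of the order quoted in Section 3.3).

```python
import math

def poisson_pmf(x, lam):
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def expected_frequency(M, weights, lambdas, x_lo, x_hi):
    """M * sum_i w_i * P_{lambda_i}(x_lo <= X <= x_hi) for one frequency class."""
    return M * sum(
        w * sum(poisson_pmf(x, lam) for x in range(x_lo, x_hi + 1))
        for w, lam in zip(weights, lambdas))

# Steady state weights from (10); everything else below is hypothetical.
weights = [0.4554, 0.5446]
rates = [30.0, 60.0]
M = 50
classes = [(0, 29), (30, 49), (50, 69), (70, 300)]

estimated = [expected_frequency(M, weights, rates, lo, hi) for lo, hi in classes]
total = sum(estimated)
```

Because the four classes cover essentially all the probability mass, the estimated frequencies add up to $M$, which provides a quick sanity check before comparing them with the observed frequencies.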

4. Discussion and Conclusions

Stochastic processes like Markov chains have been proven to be useful tools with numerous applications, including seismic hazard evaluation. Such a process transitions from one state/zone to another, among a finite number of states/zones, according to a transition probability matrix. In this work, a Poisson hidden Markov model (PHMM) is presented and implemented for seismic activity in Central and South America. The choice of these regions was not random. Both areas are characterized by intense tectonic activity, dominated by two subducting plates: the Nazca Plate under South America and the Cocos Plate under Central America. These tectonics result in a significant number of large earthquakes. The largest earthquake ever recorded globally occurred in South America in 1960 with a magnitude of $M_w = 9.5$. Additionally, a large earthquake with a magnitude of $M_w = 8.8$ occurred in the same area in 2010.
After identifying the number of hidden states and estimating the parameters of the selected model(s) for each zone separately using the EM algorithm and Information Criteria AIC and BIC, several conclusions were reached.
Seismic activity in most zones follows the PHMM for the geographical area under investigation, with the classical Poisson model chosen only for zone 3 and only by BIC. The steady state probabilities vary for different zones. For zones 4 and 7, which are fully presented in this work, the probabilities stabilize quite early (after 10 and 6 years, respectively). While for most zones, stabilization occurs within 3–10 years, zones 2 and 9 require more than 15 years for the steady-state distribution to be realized. Additionally, although the selection criteria recommend different models for zones 3, 6, 7, and 9, the differences are relatively small.
In most zones, the system remains in the first state for the next 30 years. In zone 7 (similar to zones 6 and 9), the system remains in the second state for the next 30 years (see e.g., Figure 8 and Figure 9 and matrix in (11)).
The methodology presented in this work demonstrates that the models used are reliable for estimating seismic activity in Central and South America. While classical Poisson-type models like the Generalized Poisson or Negative Binomial models often work well, PHMMs provide researchers with a way to capture autocorrelation in the data, resulting in an adequate representation of the phenomenon under investigation. It is worth noting that PHMMs can also be used for other aspects of seismic activity, such as estimating horizontal acceleration of an earthquake, which is crucial for designing large engineering projects. The seismicity of the regions investigated in this work is considered one of the most active in the world, with frequent large and destructive earthquakes. Due to the highly reliable results, the implementation of the proposed methodology is recommended for other geographical regions with varying levels of seismic activity.

Author Contributions

E.G.: Conceptualization and original draft; T.M.T. and V.K.: Software and investigation; E.S.: Data curation; A.P. and A.M.: Methodology; A.K.: Review and Editing supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available at the bulletins of ISC (http://www.isc.ac.uk/), NEIC (https://www.usgs.gov/programs/earthquake-hazards/national-earthquake-information-center-neic), GCMTC (https://www.globalcmt.org) and the published global earthquake catalogues.

Acknowledgments

This work is part of the MSc Thesis [41] (in Greek with an abstract in English) of the first author (E.G.). This Thesis was completed at the Geophysical Laboratory, School of Geology, Aristotle University of Thessaloniki. The first author (E.G.) wishes to express her sincere thanks to all the people who offered help during the preparation of the Thesis, and especially to T.M. Tsapanos, who was the main supervisor and gave E.G. his full support.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIC: Akaike Information Criterion
BIC: Bayesian Information Criterion
EM: Expectation-Maximization (Algorithm)
HMM: Hidden Markov Model
ISC: International Seismological Centre
NEIC: National Earthquake Information Centre of the USGS
GCMTC: Global Centroid Moment Tensor Catalogue
PHMM: Poisson Hidden Markov Model

References

  1. Tsapanos, T.M.; Burton, P.W. Seismic hazard evaluation for specific seismic regions of the world. Tectonophysics 1991, 194, 153–169.
  2. Suàrez, G.; Gagnepain, J.; Cisternas, A.; Hatzfield, D.; Molnar, P.; Ocola, L.; Roecker, S.W.; Viodè, J.P. Tectonic deformation of the Andes and the configuration of the subducted slab in central Peru: Results from a microseismic experiment. Geophys. J. Int. 1990, 103, 1–12.
  3. Kelleher, J.A. Rupture zones of South American earthquakes and some predictions. J. Geophys. Res. 1972, 84, 2087–2103.
  4. Stein, S.; Engeln, J.E.; De Meto, C.; Gordan, R.G.; Woods, D.R.; Lundgren, P.; Argus, D.; Quibble, D.; Stein, C.; Weistein, S.; et al. The Nazca-South America convergence rate and the recurrence of the great Chilean earthquakes. Geophys. Res. Lett. 1986, 13, 713–716.
  5. Tsapanos, T.M.; Christova, C.V. Some preliminary results of the worldwide seismicity estimation: A case study of the seismic hazard evaluation in South America. Ann. Di Geofis. 2000, 43, 11–22.
  6. Dewey, J.S.; Lamb, S.H. Active tectonics of the Andes. Tectonophysics 1992, 205, 79–95.
  7. Quezada, F.J. Seismic observation in Chile. Bull. Intern. Inst. Seismol. Earthq. Engin. 1997, 31, 243–259.
  8. Bilek, S.L. Seismicity along the South America subduction zone: Review of large earthquakes, tsunamis and subduction zone complexity. Tectonophysics 2010, 495, 2–14.
  9. Manea, V.C.; Manea, M.; Ferarri, L. A geodynamical perspective on the subduction of Cocos and Rivera plates beneath Mexico and Central America. Tectonophysics 2013, 609, 56–81.
  10. Alvarado, G.E.; Benito, B.; Staller, A.; Climent, Á.; Camacho, E.; Rojas, W.; Marroquin, G.; Molina, E.; Talavera, J.E.; Torres, Y.; et al. New Seismic Zonation In Central America: A Base For Seismic Hazard. In Proceedings of the 16th World Conference on Earthquake Engineering, Santiago, Chile, 9–13 January 2017; World Conference on Earthquake Engineering, Online Proceedings. Available online: https://www.wcee.nicee.org/wcee/sixteenth_conf_Santiago/ (accessed on 9 July 2024).
  11. Ruiz, S.; Madariaga, R. History and recent large megathrust earthquake in Chile. Tectonophysics 2018, 733, 37–56.
  12. Tsapanos, T.M. The depth distribution of seismic parameters estimated for the South America area. Earth Planet Sci. Lett. 2000, 180, 103–115.
  13. Calderón, A.; Silva, V.; Avilés, M.; Méndez, R.; Castillo, R.; Carlos Gil, J.; Alfredo López, M. Toward a uniform earthquake loss model across Central America. Earthq. Spectra 2022, 38, 178–199.
  14. Costa, C.; Alvarado, A.; Audemard, F.; Audin, L.; Benavente, C.; Bezerra, H.F.; Cembrano, J.; González, G.; López, M.; Minaya, E.; et al. Hazardous faults of South America; compilation and overview. J. S. Am. Earth Sci. 2020, 104, 102837.
  15. Petersen, M.D.; Harmsen, S.; Jaiswal, K.S.; Rukstales, K.S.; Luco, N.; Haller, K.; Mueller, C.; Shumway, A. Seismic hazard, risk, and design for South America. Bull. Seismol. Soc. Am. 2018, 108, 781–800.
  16. Vere-Jones, D. A Markov model for aftershock occurrence. Pageoph 1966, 64, 31–42.
  17. Nava, F.A.; Herrera, C.; Frez, J.; Glowacka, E. Seismic Hazard Evaluation Using Markov Chains: Application to the Japan Area. Pageoph 2005, 162, 1347–1366.
  18. Tsapanos, T.M. The Markov model as a pattern for earthquake recurrence in South America. Bull. Geol. Soc. Greece 2001, 34, 1611–1617.
  19. Altinok, Y.; Kolcak, D. An application of the semi-Markov model for earthquake occurrences in North Anatolia, Turkey. J. Balk. Geophys. Soc. 1999, 2, 90–99.
  20. Granat, R.; Donnellan, A. A hidden Markov model-based tool for geophysical data exploration. Pure Appl. Geophys. 2002, 159, 2271–2283.
  21. Ebel, J.E.; Chambers, D.W.; Kafka, A.L.; Baglivo, J.A. Non-Poissonian Earthquake Clustering and the Hidden Markov Model as Bases for Earthquake Forecasting in California. Seismol. Res. Lett. 2007, 78, 57–65.
  22. Chambers, D.W.; Baglivo, J.A.; Ebel, J.E.; Kafka, L.K. Earthquake Forecasting Using Hidden Markov Models. Pure Appl. Geophys. 2012, 169, 625–639.
  23. Li, Y.; Anderson-Spencer, R. Hidden Markov Modeling of Waiting Times in the 1985 Yellowstone Earthquake Swarm. Pure Appl. Geophys. 2013, 170, 785–795.
  24. Baum, L.E.; Petrie, T. Statistical Inference for Probabilistic Functions of Finite State Markov Chains. Ann. Math. Stat. 1966, 37, 1554–1563.
  25. Can, C.E.; Ergun, G.; Gokceoglu, C. Prediction of Earthquake Hazard by Hidden Markov Model (around Bilecik, NW Turkey). Cent. J. Geosci. 2014, 6, 403–414.
  26. Orfanogiannaki, K.; Karlis, D.; Papadopoulos, G.A. Identifying seismicity levels via Poisson hidden Markov models. Pure Appl. Geophys. 2010, 167, 919–931.
  27. Rabiner, L. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 1989, 77, 257–286.
  28. Baum-Welch Algorithm for Training a Hidden Markov Model–Part 2 of the HMM Series, 2019. Analytics Vidhya. Available online: https://medium.com/analytics-vidhya/baum-welch-algorithm-for-training-a-hidden-markov-model-part-2-of-the-hmm-series-d0e393b4fb86 (accessed on 3 May 2024).
  29. Moon, T.K. The Expectation-Maximization Algorithm. IEEE Signal Process. Mag. 1996, 13, 47–60.
  30. Papadimitriou, E.E. Long-term Earthquake Prediction along the Western Coast of South and Central America Based on a Time Predictable Model. Pageoph 1993, 140, 301–316.
  31. Cernadas, D.; Osella, A.; Sabbione, N. Self-similarity in the Seismicity of the South-American Subduction Zone. Pure Appl. Geophys. 1998, 152, 57–73.
  32. Galanis, O. Probabilistic Estimation of Seismicity of the Regions Mexico, Central and South America Using the Bayes Statistics. Master’s Dissertation, Aristotle University of Thessaloniki Hellas, Thessaloniki, Greece, 2000; 97p. (In Greek).
  33. Galanis, O.C.; Tsapanos, T.M.; Papadopoulos, G.A.; Kiratzi, A.A. An alternative Bayesian statistics for probabilistic earthquake prediction in Mexico, Central and South America. Bull. Geol. Soc. Greece 2002, 34, 1485–1491.
  34. Pacheco, J.F.; Sykes, L.R. Seismic moment catalog of large shallow earthquakes, 1900 to 1989. Bull. Seismol. Soc. Am. 1992, 82, 1306–1349.
  35. Engdahl, E.R.; van der Hilst, R.; Buland, R. Global teleseismic earthquake relocation with improved travel times and procedures for depth determination. Bull. Seismol. Soc. Am. 1998, 88, 722–743.
  36. Engdahl, E.R.; Villasenor, A. Global Seismicity: 1900–1999. In International Handbook of Earthquake and Engineering Seismology, Part A, Chapter 41; Lee, W.H.K., Kanamori, H., Jennings, P.C., Kisslinger, C., Eds.; Academic Press: Cambridge, MA, USA, 2002; pp. 665–690.
  37. Scordilis, E.M. Globally valid relations converting Ms, mb and MJMA to Mw. In Proceedings of the NATO Advanced Research Workshop on Earthquake Monitoring and Seismic Hazard Mitigation in Balkan Countries, Borovetz, Bulgaria, 11–17 September 2005; pp. 158–161.
  38. Scordilis, E.M. Empirical global relations converting Ms and mb to moment magnitude. J. Seismol. 2006, 10, 225–236.
  39. Tsampas, A.D.; Scordilis, E.M.; Papazachos, C.B.; Karakaisis, G.F. Global magnitude scaling relations for intermediate-depth and deep-focus earthquakes. Bull. Seismol. Soc. Am. 2016, 106, 418–434.
  40. Reasenberg, P. Second-order moment of central California seismicity 1969–1982. J. Geophys. Res. 1985, 90, 5479–5495.
  41. Georgakopoulou, E.A. Study of the Seismicity of Central and South America using the Hidden Markov Model. Master’s Thesis, School of Geology, Aristotle University of Thessaloniki, Thessaloniki, Greece, 2023; 109p.
Figure 1. The tectonics of Central America [10].
Figure 2. A typical illustration of a hidden Markov model: $X_t$ = hidden states, $Y_t$ = observations. Vertical arrows represent the associations between $X_t$ and $Y_t$; horizontal arrows represent the transitions between $X_t$.
Figure 3. Mode of operation of a 3-state PHMM [25].
Figure 4. The ten seismic zones into which the study areas of Central and South America are divided.
Figure 5. AIC and BIC values for PHMMs with $N = 1, 2, 3, 4, 5$ hidden states (Zone 4).
Figure 6. AIC and BIC values for PHMMs with $N = 1, 2, 3, 4, 5$ hidden states (Zone 7).
Figure 7. Hidden state estimation of the PHMM for the next 30 years (Zone 4).
Figure 8. Hidden state estimation of the 2-state PHMM for the next 30 years (Zone 7).
Figure 9. Hidden state estimation of the 3-state PHMM for the next 30 years (Zone 7).
Table 1. Magnitude of completeness, year of completeness and number of earthquakes (all zones).

| Zone | Magnitude $M_c$ | Year | Number of Earthquakes |
|---|---|---|---|
| ZONE 1 | 4.6 | 1992 | 3061 |
| ZONE 2 | 4.6 | 1995 | 447 |
| ZONE 3 | 4.6 | 1990 | 515 |
| ZONE 4 | 4.9 | 1965 | 408 |
| ZONE 5 | 4.5 | 1995 | 325 |
| ZONE 6 | 4.6 | 1995 | 694 |
| ZONE 7 | 4.7 | 1992 | 1373 |
| ZONE 8 | 4.6 | 1995 | 1137 |
| ZONE 9 | 4.7 | 1998 | 1429 |
| ZONE 10 | 4.7 | 1998 | 649 |
Table 2. AIC values for PHMMs with different numbers of hidden states, N = 1–6 (N = 1 is the Poisson model).

| Zone | Poisson Model | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 |
|---|---|---|---|---|---|---|
| ZONE 1 | 708.8076 | 342.4268 | 308.4430 | 322.2655 | 357.2092 | - |
| ZONE 2 | 211.8213 | 177.9372 | 195.5404 | 221.6399 | - | - |
| ZONE 3 | 218.2804 | 211.3093 | 230.5936 | 257.2309 | - | - |
| ZONE 4 | 344.4748 | 320.4079 | 334.6641 | 356.3250 | - | - |
| ZONE 5 | 187.6203 | 172.7280 | 188.9182 | 214.3400 | - | - |
| ZONE 6 | 299.8547 | 226.0982 | 225.9051 | 245.4005 | 279.0324 | - |
| ZONE 7 | 394.9631 | 287.3457 | 282.5186 | 295.8455 | 320.5154 | - |
| ZONE 8 | 320.0459 | 251.8138 | 253.8652 | 266.4483 | - | - |
| ZONE 9 | 802.5337 | 377.3655 | 275.3205 | 263.0317 | 287.4585 | 444.2555 |
| ZONE 10 | 1027.298 | 339.6253 | 301.3229 | 309.7854 | 285.1658 | 375.6718 |
Table 3. BIC values for PHMMs with different numbers of hidden states, N = 1–6 (N = 1 is the Poisson model).

| Zone | Poisson Model | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 |
|---|---|---|---|---|---|---|
| ZONE 1 | 710.2088 | 352.2352 | 332.2633 | 365.7026 | 425.8679 | - |
| ZONE 2 | 213.1171 | 187.0081 | 217.5697 | 261.8108 | - | - |
| ZONE 3 | 219.7461 | 221.5694 | 255.5111 | 302.6687 | - | - |
| ZONE 4 | 346.5178 | 334.7092 | 369.3959 | 419.6596 | - | - |
| ZONE 5 | 188.9162 | 181.7988 | 210.9475 | 254.5109 | - | - |
| ZONE 6 | 301.1506 | 235.1690 | 247.9343 | 285.5714 | 342.5284 | - |
| ZONE 7 | 396.3643 | 297.1541 | 306.3389 | 339.2826 | 389.1741 | - |
| ZONE 8 | 321.3417 | 260.8847 | 275.8944 | 306.6193 | - | - |
| ZONE 9 | 803.7118 | 385.6118 | 295.3474 | 299.5514 | 345.1831 | 527.8973 |
| ZONE 10 | 1028.476 | 347.8717 | 321.3499 | 346.3050 | 342.8905 | 459.3137 |
Table 4. Poisson rate parameters (2-state PHMM), Zone 4.

| Parameter | Value |
|---|---|
| $\hat{\lambda}_1$ | 5.1234 |
| $\hat{\lambda}_2$ | 11.6139 |
| $\ln L$ | −153.2039 |
Table 5. Poisson rate parameters for Zone 7 (2- and 3-state PHMM).

| Parameter | 2-State PHMM | 3-State PHMM |
|---|---|---|
| $\hat{\lambda}_1$ | 30.4337 | 30.1330 |
| $\hat{\lambda}_2$ | 60.7303 | 57.0006 |
| $\hat{\lambda}_3$ | - | 104.9935 |
| $\ln L$ | −136.6728 | −124.2593 |
| AIC | 287.3457 | 282.5186 |
| BIC | 297.1541 | 306.3389 |
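The AIC and BIC entries of Table 5 can be checked against the reported log-likelihoods with the standard definitions AIC = −2 ln L + 2p and BIC = −2 ln L + p ln n. The sketch below is only an illustration under stated assumptions: the parameter counts p = 7 and p = 17 are not given in this section and are simply the values that make the standard formulas reproduce the tabulated numbers, and n = 30 is taken from the 30-year total in Table 8.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 ln L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 ln L + p ln n."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

N_YEARS = 30  # assumption: number of annual counts for Zone 7 (Table 8 total)

# name -> (ln L from Table 5, ASSUMED parameter count p, inferred, not stated)
models = {
    "2-state": (-136.6728, 7),
    "3-state": (-124.2593, 17),
}

scores = {
    name: (aic(ll, p), bic(ll, p, N_YEARS))
    for name, (ll, p) in models.items()
}

for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.4f}, BIC = {b:.4f}")
```

Under these assumptions the AIC is lower for the 3-state model while the BIC is lower for the 2-state model, which is consistent with both models being reported for Zone 7.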
Table 6. Steady state distribution: k steps ahead, with transition probabilities in parentheses.

| Zone | k (Probabilities), 2-State | k (Probabilities), 3-State |
|---|---|---|
| ZONE 1 | - | 22 (0.5434 & 0.2479 & 0.2087) |
| ZONE 2 | 19 (0.6461 & 0.3539) | - |
| ZONE 3 | 6 (0.6830 & 0.3170) | - |
| ZONE 4 | 10 (0.6730 & 0.3270) | - |
| ZONE 5 | 3 (0.8821 & 0.1179) | - |
| ZONE 6 | 1 (0.9598 & 0.0402) | 6 (0.2480 & 0.7119 & 0.0401) |
| ZONE 7 | 8 (0.4554 & 0.5446) | 5 (0.4409 & 0.5284 & 0.0308) |
| ZONE 8 | 5 (0.8414 & 0.1586) | - |
| ZONE 9 | - | 29 (0.4047 & 0.5054 & 0.0899) * |
| ZONE 10 | - | 20 (0.5587 & 0.3543 & 0.0870) ** |

\* for 4-state: 30 (0.0736 & 0.7117 & 0.1072 & 0.1075). \*\* for 5-state: 18 (0.5929 & 0.3102 & 0.0315 & 0.0339 & 0.0315).
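One plausible reading of "k steps ahead" in Table 6 is the number of steps after which the rows of the k-step transition matrix agree, so that the chain no longer depends on its starting state. The sketch below illustrates that idea only; the 2-state transition matrix is hypothetical (the fitted matrices are not listed in this section), and the tolerance of 1e-4 is likewise an assumption rather than the authors' convergence rule.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def steps_to_steady_state(p, tol=1e-4, max_steps=1000):
    """Return the smallest k for which all rows of P^k agree within tol,
    together with the first row of P^k (the approximate steady state)."""
    power = [row[:] for row in p]  # power holds P^k, starting at k = 1
    for k in range(1, max_steps + 1):
        # Largest spread between rows, taken column by column.
        spread = max(max(col) - min(col) for col in zip(*power))
        if spread < tol:
            return k, power[0]
        power = mat_mul(power, p)
    raise RuntimeError("no convergence within max_steps")

# HYPOTHETICAL transition matrix, for illustration only (not from the paper).
P = [[0.9, 0.1],
     [0.2, 0.8]]

k, pi = steps_to_steady_state(P)
print(k, [round(x, 4) for x in pi])
```

For a 2-state chain the limit can be checked by hand: with off-diagonal entries a = 0.1 and b = 0.2, the steady state is (b/(a+b), a/(a+b)) = (2/3, 1/3).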
Table 7. Model agreement comparison (Zone 4).

| Annual Earthquake Freq. | Observed # of Years | Estimated Freq. 2-State PHMM | Poisson Model |
|---|---|---|---|
| 0–5 | 23 | 23.2805 | 16.017 |
| 6–10 | 23 | 21.7137 | 34.707 |
| 11–15 | 9 | 9.5976 | 6.1062 |
| ≥16 | 2 | 2.4082 | 0.1698 |
| TOTAL | 57 | 57 | 57 |
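The expected year counts in Table 7 can be approximately reproduced from quantities reported elsewhere in the tables. The sketch below rests on two assumptions that are not stated explicitly here: the PHMM column is a Poisson mixture weighted by the Zone 4 steady-state probabilities of Table 6 (0.6730, 0.3270) with the rates of Table 4, and the single Poisson model uses the mean annual count 408/57 (the Zone 4 earthquake count from Table 1 divided by the 57-year total). The function names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_bin_prob(lam, lo, hi=None):
    """P(lo <= X <= hi); hi=None denotes the open-ended bin X >= lo."""
    if hi is None:
        return 1.0 - sum(poisson_pmf(k, lam) for k in range(lo))
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))

def expected_years(weights, rates, bins, n_years):
    """Expected number of years per bin under a Poisson mixture."""
    return [
        n_years * sum(w * poisson_bin_prob(lam, lo, hi)
                      for w, lam in zip(weights, rates))
        for lo, hi in bins
    ]

BINS = [(0, 5), (6, 10), (11, 15), (16, None)]  # bins of Table 7
N_YEARS = 57                                     # total in Table 7

# ASSUMPTION: mixture weights = steady-state probabilities (Table 6, Zone 4).
phmm = expected_years([0.6730, 0.3270], [5.1234, 11.6139], BINS, N_YEARS)
# ASSUMPTION: Poisson rate = mean annual count, 408 events over 57 years.
poisson = expected_years([1.0], [408 / N_YEARS], BINS, N_YEARS)

for (lo, hi), a, b in zip(BINS, phmm, poisson):
    label = f"{lo}-{hi}" if hi is not None else f">={lo}"
    print(f"{label:>6}: PHMM {a:8.4f}   Poisson {b:8.4f}")
```

The same function can be applied to Table 8 by swapping in the Zone 7 bins, rates and weights, under the same assumptions.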
Table 8. Model agreement comparison (Zone 7).

| Annual Earthquake Freq. | Observed # of Years | Estimated Freq. 2-State PHMM | Estimated Freq. 3-State PHMM | Poisson Model |
|---|---|---|---|---|
| 0–20 | 2 | 0.4082 | 0.4434 | 0.0006 |
| 21–40 | 11 | 12.7736 | 12.5090 | 6.6278 |
| 41–60 | 12 | 8.5962 | 11.1279 | 22.8328 |
| ≥61 | 5 | 8.2220 | 5.9197 | 0.5388 |
| TOTAL | 30 | 30 | 30 | 30 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Georgakopoulou, E.; Tsapanos, T.M.; Makrides, A.; Scordilis, E.; Karagrigoriou, A.; Papadopoulou, A.; Karastathis, V. Seismic Evaluation Based on Poisson Hidden Markov Models—The Case of Central and South America. Stats 2024, 7, 777-792. https://doi.org/10.3390/stats7030047
