1. Introduction
A proper description of earthquake-induced strong ground motion requires the quantification of its amplitude, duration, and frequency content. More specifically, the frequency content of ground motion has proven important for the seismic response of structures and geotechnical systems [1,2,3]. If the frequency content of ground motion, defined as the distribution of the signal’s energy with respect to frequency, is close to the fundamental frequencies of a structural or geotechnical system, then resonance occurs, leading to increased seismic response.
The most complete description of the frequency content of ground motion is achieved through the Fourier amplitude spectrum (FAS), as well as the acceleration response spectrum. Nevertheless, in some cases, it is desirable to use scalar parameters which can effectively represent the frequency content of seismic excitation. A number of scalar frequency-content parameters of ground motion have been proposed in the literature. The most common ones have been investigated in [1,4]. These include the mean period, Tm, the average spectral period, Tavg, the smoothed spectral predominant period, To, and the predominant period, Tp. Among these frequency-content parameters, Tm utilizes the FAS, whereas Tavg, To, and Tp are computed through the 5%-damped acceleration response spectrum of ground motion. Tm is preferred over the rest of the parameters due to its robustness and its ability to characterize the frequency content of acceleration time series directly [4]. Since that study, numerous works have been published examining the effect of Tm on the seismic response of engineered systems [3,5,6,7,8]. For example, Tm was considered an important predictor variable for the estimation of seismically induced slope displacements by Rathje et al. (2014) [9], whereas Jibson and Tanyaş (2020) [3] indicated that Tm is a good predictor variable for earthquake-induced landslide size distribution. Furthermore, Sotiriadis et al. (2019; 2020) [7,8] utilized Tm for characterizing the frequency content of ground motion and correlating it with kinematic soil-structure interaction (SSI) effects on buildings with embedded foundations. Moreover, the study of Song et al. (2014) [10] led to the conclusion that Tm affects the collapse capacity of post-mainshock buildings, while Sotiriadis et al. (2017) [6] used Tm to normalize the acceleration response spectra of ground motion and interpret the varying role of SSI on the seismic response of multi-storey buildings with respect to the fixed-base assumption.
In contrast to the amplitude and duration of strong ground motion, the development of Ground Motion Predictive Equations (GMPEs) for frequency-content parameters has received less attention. GMPEs are empirical models which provide estimates of ground motion parameters; they are calibrated through strong motion datasets and are essential for seismic hazard assessment, seismic design, and planning of emergency response. The usual predictor variables included in a GMPE are the earthquake magnitude (M), the seismic source-to-site distance (R), site-effect proxies, and the faulting mechanism. To the author’s knowledge, only a few GMPEs exist for Tm. Rathje et al. (2004) [4] updated the previously published model of Rathje et al. (1998) [1], using worldwide strong motion data to calibrate their model. Du (2017) [11] presented a robust GMPE for Tm using the expanded worldwide NGA-West2 ground motion database, including more site and faulting-type predictors. On a regional scale, Yaghmaei-Sabegh (2015) [12] and Lashgari and Jafarian (2022) [13] developed GMPEs for Tm based on Iranian data, whereas for Greece only Chousianitis et al. (2018) [14] included Tm in their suite of empirical models. All of the aforementioned efforts involved the calibration of coefficients of predefined functional forms, selected based on data trends or theoretical insights, through linear or nonlinear regression algorithms.
On the other hand, various machine learning (ML) algorithms have been gaining ground in the field of GMPE development over the last decade in an attempt to tackle the limitations of traditional methods, as reported in recently published extensive literature reviews [15,16,17,18]. The Artificial Neural Network (ANN) algorithm was used to develop GMPEs for peak ground acceleration (PGA), peak ground velocity (PGV), and spectral accelerations (Sa) using data recorded from European and Californian shallow crustal seismic events [19,20,21], with computed errors lower than those derived from conventional regression analyses. In addition, Khosravikia et al. (2019) [22] proposed ANN-based GMPEs for the Oklahoma, US, area. GMPEs for various strong motion parameters, including energy characteristics and strong motion duration, were derived using five non-parametric machine learning models [23]. Moreover, for energy-related parameters, e.g., Cumulative Absolute Velocity (CAV), Kuran et al. (2023) [24] developed GMPEs with a machine learning approach based on a Gradient Boosting algorithm. Additionally, Deep Neural Networks (DNNs) using Vs profiles were applied to develop linear and nonlinear site amplification GMPEs [25]. For Mexican subduction earthquakes recorded at rock sites, GMPE models were proposed by Ramos-Cruz and Ruiz-Garcia (2024) [26] utilizing Support Vector Machine (SVM) regression. For ground motion in Greece, Morfidis et al. (2024) [27] implemented the ANN algorithm to obtain GMPEs for PGA and PGV which, with a small number of neurons, performed slightly better than the most recent regression-based model. Regarding the frequency content of ground motion, Sofiane et al. [28] developed a predictive model for Tm using a feed-forward ANN and analyzed the effects of seismological parameters on it, with data collected from the KiK-net database. This is the only existing work found on the implementation of ANNs for the prediction of the frequency content of ground motion.
Careful examination of the existing literature on GMPEs derived through ML algorithms reveals that few attempts have been made to develop such a model for scalar frequency-content parameters of ground motion. The present work aims to fill this gap by investigating the applicability of shallow Artificial Neural Networks to the development of a GMPE for Tm for strong motion in Greece, taking advantage of the most up-to-date, recently published database. In parallel, nonlinear regression-based models are developed using the same dataset in order to highlight the advantages and weaknesses of each approach in the prediction of the mean period of ground motion.
2. Strong Motion Data
The strong ground motion dataset adopted herein is the one published by Margaris et al. (2021) [29]. It is the most comprehensive and up-to-date strong motion dataset for shallow earthquakes in Greece so far and includes 471 seismic events recorded between 1973 and 2015, which produced a total of 2993 recordings from 333 different sites. The dataset includes key source parameters, such as hypocenter locations, moment magnitudes (Mw), fault-plane solutions, finite-fault information, and reported values of the average shear wave velocity of the top 30 m, VS30, for all 333 sites. Within the framework of this paper, data from earthquakes with a magnitude lower than M4.5 were excluded, as this is the smallest magnitude of usual engineering interest, as were data from events with a focal depth (H) larger than 40 km, in order to retain the focus on shallow events. Upon implementation of these criteria, a total of 2551 recordings were used, coming from 341 earthquakes recorded at 323 sites.
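The two selection criteria above can be sketched as a simple filter. This is only an illustration: the record layout and the field names `mw` and `depth_km` are hypothetical and do not reflect the actual structure of the dataset in [29].

```python
# Hypothetical record layout: each record is a dict with a moment magnitude
# ("mw") and a focal depth in km ("depth_km"). Field names are assumptions.
MIN_MAGNITUDE = 4.5   # smallest magnitude of usual engineering interest
MAX_DEPTH_KM = 40.0   # largest focal depth retained for shallow events


def select_records(records):
    """Keep records with Mw >= 4.5 and focal depth H <= 40 km."""
    return [
        r for r in records
        if r["mw"] >= MIN_MAGNITUDE and r["depth_km"] <= MAX_DEPTH_KM
    ]


# Made-up sample values, used only to exercise the filter.
sample = [
    {"id": "a", "mw": 4.2, "depth_km": 10.0},  # dropped: magnitude too small
    {"id": "b", "mw": 5.6, "depth_km": 12.0},  # kept
    {"id": "c", "mw": 6.1, "depth_km": 55.0},  # dropped: too deep
    {"id": "d", "mw": 4.5, "depth_km": 40.0},  # kept: on both limits
]
kept_ids = [r["id"] for r in select_records(sample)]
```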
Figure 1 presents the distribution of the data used herein in terms of earthquake moment magnitude (Mw), Joyner–Boore source-to-site distance (RJB), VS30 at the recording stations, and focal depth (H), along with their marginal distributions. The earthquake magnitude spans from M4.5 to M7.0, with the majority of data belonging to events with magnitudes between M5.0 and M6.0. More specifically, 68% of the data correspond to earthquake magnitudes between M5.0 and M6.0, 14% between M4.5 and M5.0, and 18% between M6.0 and M7.0. The source-to-site distance, RJB, ranges from 1 km to 300 km, with near-source data being limited. More specifically, 62% of the data correspond to an RJB larger than 100 km, whereas the rest of the data are equally distributed between RJB values lower than 50 km (19%) and between 50 and 100 km (19%). Furthermore, VS30 ranges from 91 m/s to 1183 m/s, with the vast majority of the recording sites exhibiting VS30 values between 200 and 600 m/s. For a more detailed description of the dataset, including information on the processing of strong motion recordings, readers may refer to [29]. It should be noted that no weighting or enhancing method was implemented to alleviate the inherent sparsity of the dataset.
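The processed and filtered acceleration time series compiled by [29] were used to calculate the mean period, Tm, of each recording. More specifically, the two horizontal components of each record were used to calculate the rotD50 [30,31] component of the FAS, from which Tm was computed according to the following [1]:

$$T_m = \frac{\sum_i C_i^2 \, (1/f_i)}{\sum_i C_i^2}, \qquad 0.25\ \text{Hz} \le f_i \le 20\ \text{Hz}, \quad \Delta f \le 0.05\ \text{Hz} \tag{1}$$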
In Equation (1), Ci² represents the sum of the squared real and imaginary parts of the Fourier amplitude ordinates, fi represents the discrete fast Fourier transform frequencies ranging from 0.25 to 20 Hz, and Δf denotes the frequency interval.
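In computational terms, the mean period is an average of 1/fi weighted by the squared Fourier amplitudes Ci² over the 0.25–20 Hz band. The minimal sketch below assumes that a one-sided amplitude spectrum (for example the rotD50 FAS) is already available as two equal-length sequences. The function name and arguments are illustrative and are not taken from the original study.

```python
def mean_period(freqs_hz, fourier_amps, f_min=0.25, f_max=20.0):
    """Mean period Tm: average of 1/f weighted by the squared Fourier amplitude.

    freqs_hz     -- discrete FFT frequencies f_i (Hz)
    fourier_amps -- Fourier amplitudes C_i at those frequencies
    Only ordinates with f_min <= f_i <= f_max contribute.
    """
    numerator = 0.0
    denominator = 0.0
    for f, c in zip(freqs_hz, fourier_amps):
        if f_min <= f <= f_max:
            c_squared = c * c
            numerator += c_squared / f   # C_i^2 * (1 / f_i)
            denominator += c_squared     # C_i^2
    if denominator == 0.0:
        raise ValueError("no Fourier energy inside the requested band")
    return numerator / denominator


# Two equal-amplitude ordinates at 1 Hz and 4 Hz; the 0.1 Hz ordinate is
# outside the band and is ignored.
tm_example = mean_period([0.1, 1.0, 4.0], [5.0, 1.0, 1.0])
```

Because the weights are squared amplitudes, a record whose energy is concentrated at low frequencies yields a larger Tm, in line with the distance trends discussed for the dataset.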
Figure 2 shows the FAS of two records of the current dataset, from earthquakes with similar magnitudes and sites with similar VS30 values but at quite different source-to-site distances. The red part of the FAS highlights the frequency range over which Tm is computed.
Figure 3 presents the variation in Tm, as calculated for the whole dataset, with respect to RJB, for three earthquake magnitude bins. It is evident that Tm increases in a quadratic manner as RJB increases, whereas larger earthquake magnitudes lead to larger Tm as well.
The effect of RJB on Tm is strong; thus, it may mask the impact of site effects as represented by VS30. Therefore, to highlight the variation in Tm with VS30, the recorded data were divided into different earthquake magnitude and distance bins, as shown in Figure 4.
Figure 4 shows that for RJB < 50 km, the correlation between Tm and VS30 is more evident than at larger distances. This is supported by the fact that the R² coefficient of simple polynomial fits between Tm and VS30 is significantly larger for small distances than for large ones. Moreover, the data suggest that for Mw > 5.0 and RJB < 50 km, Tm does not vary significantly with increasing VS30 when VS30 exceeds approximately 800 m/s. Furthermore, it is evident that data recorded at distances RJB > 50 km provide larger Tm values than their counterparts for RJB < 50 km, highlighting the strong effect of RJB.
The strong motion data used in this work are provided in the Supplementary Materials (electronic supplement ES1_Strong_Motion_data.xlsx).
4. Comparison Between NLR- and ANN-Based and Existing GMPEs with Recorded Data
The performance indexes and the results of the mixed-effects residual analysis adequately depict the predictive capabilities of the developed GMPEs within the perspective of statistical manipulation of residuals between recorded data and predictions. However, it is also essential to see how these models work in practice, what their prediction trends are, and how they compare to data and to predictions of existing GMPEs.
Two existing GMPEs are used for comparison, namely those of Du (2017) [11] and Chousianitis et al. (2018) [14]. The proper implementation of these GMPEs for comparison with the proposed ones requires some clarifications and assumptions. The model in [11] uses the Euclidean norm (or Square Root of the Sum of Squares, SRSS) of the horizontal components, while that in [14] uses the geometric mean (Geomean) of the horizontal components. It is emphasized that the rotD50 component was utilized for the proposed GMPEs. Additional investigation of the current dataset, which included the computation of the SRSS and the geometric mean component of Tm, revealed no significant difference between rotD50 Tm and SRSS Tm or Geomean Tm. Figures supporting this are provided in Appendix A. Furthermore, the model in [11] uses the closest-to-rupture distance (Rrup) and that in [14] uses the epicentral distance (Repi) as the source-to-site distance metric, whereas the current study uses the Joyner–Boore distance (RJB). To make the comparison possible, scatter plots of RJB against Rrup, as well as RJB against Repi, were produced and linear regression expressions with high R² values were fitted to them. The corresponding graphs and linear regression expressions are included in Appendix A.
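A distance conversion of this kind can be sketched as an ordinary least-squares line together with its R² value. The actual expressions are those of Appendix A; the function below is a generic illustration and the distance values are made up purely to exercise it.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ~ intercept + slope * x, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical paired distances in km (NOT values from the dataset).
r_jb = [5.0, 20.0, 50.0, 100.0, 200.0]
r_rup = [9.0, 23.0, 52.0, 101.0, 201.0]
slope, intercept, r2 = fit_line(r_jb, r_rup)

# The fitted line then maps an RJB value to the metric an existing GMPE expects.
r_rup_at_30_km = intercept + slope * 30.0
```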
4.1. Distance Scaling of Ground Motion
In this section, the developed GMPEs are evaluated in terms of distance scaling, that is, how they predict the variation in Tm with source-to-site distance. The evaluation is performed separately for the NLR- and ANN-based GMPEs so that the relevant observations are clear.
Figure 10 presents the distance scaling of Tm for two magnitude ranges, M5–M6 and M6–M7, as indicated by the recorded data and predicted by the proposed NLR-based GMPEs and the existing GMPEs mentioned above. The mean VS30 value of the dataset for each magnitude range was considered, whereas the faulting mechanism, wherever it was necessary as input, was set to normal, which is the most common faulting type in the current dataset. The trends shown by the data are adequately reproduced by the proposed GMPEs. The Bea21 functional form provides higher estimates than Du17 for M < 6.0. For distances larger than 200 km, the Du17 functional form exhibits a break point which leads to a more accurate capture of Tm trends and to lower values. For M > 6.0 and distances between 100 and 230 km, Du17 provides higher estimates than Bea21. Regarding the existing GMPEs, the predictions of Chousianitis et al. (2018) [14] for the smaller magnitude range are consistent with the proposed ones, though relatively lower. In contrast, the GMPE of Du (2017) [11] provides larger estimates for the small-distance range and similar ones for the mid- and large-distance range, compared to the proposed GMPEs. In the larger magnitude range, the GMPE of Chousianitis et al. (2018) [14] exhibits similar predictions to the proposed GMPEs; however, at distances larger than 200 km it significantly overestimates Tm. On the other hand, the GMPE of Du (2017) [11] shows similar values to the proposed GMPEs’ predictions for small distances (<50 km) and lower estimates for the rest of the distance range considered. Additionally, the estimates of two existing Tm models for ground motion in Iran (Yaghmaei-Sabegh, 2015 [12]; Lashgari and Jafarian, 2022 [13]) are depicted. Differences are observed between the proposed GMPEs for Greece and those for Iran, especially for large earthquake magnitudes, which highlight the regional frequency-content characteristics of ground motion in Greece.
Figure 11 presents the same information as Figure 10, with the NLR-based models and the existing GMPEs replaced by the developed ANN-based GMPEs for Tm. Figure 11a,b include the comparison between recorded data and the ANN-based GMPEs for earthquake magnitudes between M5 and M6. Figure 11b is a zoomed version of Figure 11a, which depicts only data for RJB up to 100 km. The mean trends shown by all of the ANN-based models are satisfactory for distances larger than 6 km, capturing the nonlinear relationship between Tm and RJB. The differences between the various ANNs seem to increase as RJB increases. For distances smaller than or equal to 6 km (as shown more clearly in Figure 11b), the 2-, 3-, and 4-neuron ANNs provide stable estimates. In contrast, unrealistic trends are observed for the rest of the ANN-based models. Both the 5-neuron and 10-neuron ANNs present increased Tm for smaller distances, whereas the 15-neuron ANN presents a sudden drop. Apparently, the lack of data at these small distances affects ANN training unfavorably, which possibly points to an overfitting problem. It could be argued that, for 5 < M < 6, the applicability of these models is limited to RJB larger than 6 km.
Figure 11c,d include the comparison between recorded data and the ANN-based GMPEs for earthquake magnitudes between M6 and M7, with the latter being a zoomed version of the former, depicting only data for RJB up to 100 km. For distances larger than 60 km, all of the ANN-based models provide reasonable trends of increasing Tm with distance and similar estimates. For RJB larger than 200 km, the 15-neuron ANN deviates markedly from the rest of the ANNs, predicting lower Tm than the others. The 2-, 3-, and 4-neuron ANNs provide stable estimates of Tm for the whole distance range considered. On the other hand, the 5-neuron ANN presents a smooth response only down to a minimum RJB of 20 km. For smaller distances, it predicts larger Tm with decreasing distance, which is not theoretically coherent. For distances lower than 60 km, unrealistic trends are observed for the 10- and 15-neuron ANNs. Both of them present increased Tm for smaller distances, whereas at a distance of 6 km the 15-neuron ANN presents a sudden drop and the 10-neuron ANN presents a sudden increase. This abnormal response could be attributed to the lack of data at small distances, which leads to an overfitting problem for these two ANN models.
4.2. Magnitude Scaling of Ground Motion
In this section, the developed GMPEs are evaluated in terms of earthquake magnitude scaling, that is, how they predict the variation in Tm with earthquake magnitude. In contrast to the previous section, the following figures include only the predictions of the proposed and existing GMPEs for a clearer comparison.
Figure 12 presents the variation in Tm with earthquake magnitude for three values of RJB, as predicted by the two proposed NLR-based GMPEs (Bea21, Du17) and two existing GMPEs (Du, 2017 [11]; Chousianitis et al., 2018 [14]). For RJB = 10 km, the mean estimates of the proposed GMPEs are close. The highest estimates along the small- and high-magnitude ranges (M4.5–M5.5 and M6.0–M7.0) come from the Du17 functional form, which provides an almost linear variation in Tm with magnitude. Bea21 provides a nonlinear relationship between Tm and M due to the hinge magnitude, Mh, and the squared term in the functional form which includes Mh. The hinge magnitude, Mh, for Bea21 is at M6.2, whereas Du17 suggests one at M5 and one at M7.3. The latter does not apply here because the maximum magnitude in the current dataset is M7.0. Similar observations are made for RJB = 50 and 150 km, with the differences between the functional forms increasing for longer distances. Regarding the existing GMPEs, Du (2017) [11] exhibits significantly overestimated mean periods for RJB = 10 km, especially at small earthquake magnitudes. Its differences with respect to the proposed NLR-based GMPEs decrease as RJB increases, and eventually it provides lower Tm estimates for long source-to-site distances and earthquake magnitudes greater than M6.0. The GMPE proposed by Chousianitis et al. (2018) [14] presents similar trends to Du17, but with increased values of Tm for the whole magnitude range considered and up to RJB = 50 km. For longer distances, it provides lower estimates than Du17.
Figure 13a presents the magnitude scaling of Tm, as predicted by the 2-, 3-, and 4-neuron ANN-based GMPEs. The mean trends of the models are theoretically consistent and in accordance with the predictions of the NLR-based GMPEs shown in Figure 12. The 2- and 3-neuron ANNs present smooth magnitude scaling curves, which capture the nonlinear relationship between Tm and Mw. In contrast, the 4-neuron ANN produces a jagged curve, which nevertheless captures the associated nonlinearity. The magnitude scaling of the 4-neuron ANN is theoretically consistent for RJB = 10 and 50 km; however, for RJB = 150 km, above M6.0 it presents a descending branch which is counter-intuitive. This abnormal trend is possibly due to the lack of data for large-magnitude events, which leads to signs of overfitting. Overall, the Tm estimates are quite consistent with those of the NLR-based GMPEs.
Figure 13b presents the magnitude scaling of Tm, as predicted by the 5-, 10-, and 15-neuron ANN-based GMPEs. The curves obtained with these models do not demonstrate specific trends and are clearly affected by the attempt of the ANNs to follow the data closely. Hence, overfitting is an issue in these models, especially above M5.8–M6.1, depending on the value of RJB.
4.3. VS30 Scaling of Ground Motion
In this section, the developed GMPEs are evaluated in terms of VS30 scaling, that is, how they predict the variation in Tm with varying VS30.
Figure 14a presents the VS30 scaling of the proposed NLR-based GMPEs, along with the existing GMPE of Du (2017) [11], for earthquake magnitude M5.5. The existing GMPE of [14] has been excluded from this comparison as it includes site effects through dummy variables and not through a continuous VS30 function. The proposed GMPEs present similar VS30 scaling for the two distances considered. They predict decreasing Tm with increasing VS30, which is consistent with the recorded observations (Figure 4). For VS30 equal to or larger than 800 m/s, the decrease in Tm is negligible.
The GMPE of Du (2017) [11] presents similar trends; however, it exhibits higher amplification of Tm at short distances than the proposed GMPEs, whereas the opposite holds for long distances.
Figure 14b is similar to Figure 14a, with the 2-, 3-, and 4-neuron ANNs’ estimates plotted instead of the NLR-based GMPEs. The 2-neuron ANN presents similar trends to the NLR-based GMPEs. The 3- and 4-neuron ANNs exhibit a different VS30 scaling from the NLR-based GMPEs and the 2-neuron ANN, which is, nevertheless, theoretically consistent. Up to VS30 = 350–400 m/s, the decrease in Tm due to increasing VS30 is mild; then an abrupt decrease in Tm occurs up to VS30 = 450–500 m/s; and then an almost constant value of Tm follows up to the maximum VS30 considered. The observations made for Figure 14a and Figure 14b also apply to Figure 14c and Figure 14d, respectively, which refer to earthquake magnitude M6.5.
Figure 15 presents the VS30 scaling of ANN-based models with 5, 10, and 15 neurons in the hidden layer, for earthquake magnitudes M5.5 (Figure 15a) and M6.5 (Figure 15b) and two values of RJB. The 5-neuron ANN presents a consistent VS30 scaling and is similar to the 3- and 4-neuron ANNs, both in values and trends, for both earthquake magnitudes. On the other hand, the 10- and 15-neuron models exhibit theoretically inconsistent trends, which is possibly due to overfitting of the ANNs to the dataset.
5. Testing of Developed GMPEs Through Residual Analysis on Unseen Data
This section aims to further test the developed NLR- and ANN-based GMPEs for Tm against unseen data, that is, ground motion data which have not been used to calibrate the regression coefficients of the former nor to train the latter. For this purpose, strong motion data from three earthquakes which occurred in Greece after the period covered by the dataset are utilized. The first earthquake considered is the M7.0 Samos Island (Aegean Sea) earthquake of 30 October 2020, which affected both Greece and Turkey. The strong motion data (77 records) related to this event were retrieved from Askan et al. (2022) [35], to which the reader may refer for more information. The other two earthquakes come from the Thessaly earthquake sequence which occurred in March 2021 and included two main shocks, namely one M6.3 on March 3 and one M6.0 on March 4. The strong motion data related to these events (42 records each) were retrieved from the technical report of Margaris et al. (2022) [36]. The horizontal components of the ground motion records from these three earthquake events were used to compute the rotD50 Fourier spectrum and then calculate Tm according to Equation (1).
The evaluation procedure followed includes the statistical manipulation of the residuals between the recorded data of the additional earthquake events and the predictions of the developed GMPEs. It is especially interesting to see how the models characterized by overfitting in the previous section, namely the 10- and 15-neuron ANNs, respond to unseen data. Additionally, the unseen data come from earthquake events with magnitudes that lie at the edge or near the edge of the developed GMPEs’ applicability range.
Figure 16 presents the normalized residuals between the observed Tm values and the developed GMPEs’ predictions for the 2020 M7.0 Samos Island earthquake, with respect to RJB. The normalized residuals are defined as the residuals computed through Equation (7), normalized to the standard deviation of each GMPE. Bea21, Du17, and the 2- and 3-neuron ANNs present a minor bias of normalized residuals with distance, whereas the absolute value of the mean trend is close to zero. On the other hand, the 4-, 5-, and 10-neuron ANNs exhibit increased residuals for source-to-site distances smaller than 100 km. The normalized residuals of the 15-neuron ANN do not depict a significant trend with RJB; however, a large overall offset from zero is observed.
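The normalization step can be sketched as follows. Equation (7) is not reproduced in this section, so the sketch assumes, as is common for GMPEs, that the residual is the difference of natural logarithms between observed and predicted Tm; this assumption, the function name, and the argument names are illustrative only.

```python
import math


def normalized_residuals(observed_tm, predicted_tm, sigma_total):
    """Residuals divided by a GMPE's total standard deviation.

    ASSUMPTION: residual = ln(observed) - ln(predicted); the source defines
    the residual through its Equation (7), which may differ in detail.
    """
    return [
        (math.log(obs) - math.log(pred)) / sigma_total
        for obs, pred in zip(observed_tm, predicted_tm)
    ]


# A prediction equal to the observation gives 0; an observation larger than
# the prediction by a factor of e, with sigma = 0.5, gives 2.
example = normalized_residuals([0.4, 0.4 * math.e], [0.4, 0.4], 0.5)
```

Values within roughly ±1 then indicate observations that lie within one standard deviation of the model's median prediction.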
Figure 17 presents the normalized residuals between the observed Tm values and the developed GMPEs’ predictions for the 2021 M6.3 Thessaly earthquake, with respect to RJB. Bea21, Du17, and the 2-, 3-, 4-, and 5-neuron ANNs present some bias of normalized residuals with distance, whereas the absolute value of the mean trend is reasonably close to zero. The observed trend is due to the increased residuals for some near-source recordings. The 10-neuron ANN exhibits an improved response compared to the former GMPEs, with a minor trend and low absolute values of normalized residuals with respect to zero. Finally, the 15-neuron ANN presents no trend of normalized residuals with respect to RJB and a relatively low mean offset from zero, providing good predictions of Tm.
Figure 18 presents the normalized residuals between the observed Tm values and the developed GMPEs’ predictions for the 2021 M6.0 Thessaly earthquake, with respect to RJB. Bea21, Du17, and the 2-, 3-, 4-, and 5-neuron ANNs present some bias of normalized residuals with distance, whereas the absolute value of the mean trend is reasonably close to zero. Nevertheless, the 4- and 5-neuron ANNs seem to respond better than the 2- and 3-neuron ANNs for small distances. The 10-neuron ANN exhibits a significant trend with respect to RJB and a considerable offset from zero at small distances. Finally, the 15-neuron ANN presents a minor trend of normalized residuals with respect to RJB and an adequate response in Tm predictions.
To evaluate the proposed GMPEs for the whole testing dataset used in this section, the Multivariate Loglikelihood (MLLH) approach, proposed by Mak et al. (2017) [37], is implemented herein. The MLLH score accounts for the hierarchical nature of modern GMPEs, as well as the ground motion correlation. The lower the MLLH score, the closer the model’s predictions are to the observations. The presentation of the equations of the MLLH approach falls beyond the scope of this work; the reader may refer to [37] for further details.
Figure 19 presents the MLLH scores for the proposed GMPEs. It is observed that the NLR-based and 2–5-neuron ANN models exhibit similar values. On the other hand, the 10-neuron and, more markedly, the 15-neuron ANNs yield noticeably higher MLLH values, indicating that their predictive capability on the unseen data is inferior to that of the rest of the GMPEs.
6. Discussion
The current work focused on developing new GMPEs for the mean period, Tm, of strong motion in Greece. Tm has been shown to be the most effective scalar parameter for characterizing the frequency content of strong ground motion. The significance of developing updated GMPEs for Tm is highlighted by its use in multiple applications, such as the prediction of seismically induced slope displacements, evaluation of kinematic soil-structure interaction effects, interpretation of seismic damage on structures, etc. The mean period may not be considered a major ground motion parameter in seismic hazard assessment. However, it could complement the common amplitude-based parameters (e.g., PGA, Sa) as an additional ground motion characteristic. A vector of ground motion parameters which includes Tm and any amplitude-based intensity measure should be suitable in some of the applications noted above. The advantages of a vector-valued seismic hazard assessment have been highlighted by several studies (e.g., [38,39]). Tm could also be used as a criterion for the selection of ground motion records in nonlinear response history analyses of structures.
The new GMPEs were developed by calibrating the regression coefficients of predefined functional forms through nonlinear regression (NLR), as well as through training of shallow Artificial Neural Networks (ANNs), utilizing the most up-to-date strong motion database for Greece. The newly developed models are also provided in electronic supplement ES2. To the author’s knowledge, this is the first attempt to use ANNs for the prediction of Tm for strong motion in Greece. A thorough investigation for the selection of ANN hyperparameters was performed and it was decided to train ANNs with up to 15 neurons in the hidden layer. The dataset was randomly split into three parts: 70% of the data was used for ANN training, 15% for validation, and the remaining 15% for testing, in order to avoid overfitting the ANN-based models.
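A random 70/15/15 holdout split of this kind can be sketched as below. The function name, the rounding rule, and the seed are illustrative assumptions; the exact partition used in the study is not reproduced here.

```python
import random


def split_indices(n_records, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition record indices into train / validation / test sets."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    n_train = round(train_frac * n_records)
    n_val = round(val_frac * n_records)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]          # whatever remains (about 15%)
    return train, val, test


# Applied to the 2551 recordings retained in the dataset.
train_idx, val_idx, test_idx = split_indices(2551)
```

The validation subset is typically used to stop training before the network starts fitting noise, while the test subset is kept aside for the final performance indexes.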
The performance indexes of both NLR- and ANN-based GMPEs were adequate. A noticeable difference between the performance indexes of the NLR- and ANN-based GMPEs was observed when using more than two neurons in the hidden layer. Moreover, the mixed-effects residual analysis revealed a low between-event standard deviation and reasonable values of within-event standard deviation for all of the developed models. All of the components of standard deviation of the proposed GMPEs, namely between-event, within-event, and total, were lower than those of the existing GMPEs for Tm.
Section 4 presented how the proposed GMPEs work in terms of distance, magnitude, and VS30 scaling, through comparison with the recorded data as well as with existing GMPEs. All of the proposed GMPEs satisfactorily followed the trends of the recorded data and were consistent with existing GMPEs. The NLR-based and the 2-, 3-, and 4-neuron ANN-based GMPEs produced stable predictions, capturing the associated nonlinearities between the predictor variables and Tm. On the other hand, the 5-, 10-, and 15-neuron ANN-based GMPEs presented some abnormal predictions, which, in some cases, did not agree with basic theory. These abnormalities were possibly due to overfitting of these models and occurred in cases where the lack of data is evident (e.g., large earthquake magnitudes recorded at small distances). An attempt was made to check whether these abnormal predictions are indeed due to overfitting or due to physical heterogeneity, which may be captured by including additional path and rupture parameters. Therefore, alternative versions of the ANN-based GMPEs were explored, which included the closest-to-rupture distance, Rrup; the horizontal distance from the top edge of the rupture, measured perpendicular to the fault strike, Rx; and the horizontal distance off the end of the rupture measured parallel to strike, Ry0. The performance indexes of these alternative ANN-based GMPEs were similar to the original ones and the abnormal predictions did not improve. Therefore, it was concluded that the abnormal predictions of the original 5-, 10-, and 15-neuron ANNs were not due to poor selection of predictor variables. Hence, overfitting was not totally avoided, although holdout validation was implemented during the training of the ANNs. By taking into consideration only the performance indexes and the residual analysis results of the developed models, one could conclude that ANNs with increasing numbers of neurons provide lower error metrics and, hence, better predictive capabilities. However, the scaling comparisons showed that the ANN-based models for Tm should be carefully evaluated in terms of how they behave and how they compare to physically based models. Among the GMPEs developed herein, the NLR-based models, as well as the 2–4-neuron ANN-based models, may be used along the complete range of seismotectonic and site features (M, RJB, VS30) considered within the dataset. Moreover, the 5-neuron ANN is recommended for RJB values larger than or equal to 20 km. On the other hand, it is recommended that the 10- and 15-neuron ANN-based GMPEs be avoided, as their predictions were unstable, especially at large earthquake magnitudes.
Although an overfitted model provides reduced prediction errors within the training dataset, it fails to generalize and to give accurate estimates for unseen data. Therefore, it was decided in Section 5 to compare the predictions of all of the proposed GMPEs with recorded data from recent earthquakes in Greece, which were not included in the training-calibration dataset. The comparison revealed adequate predictive capability of the NLR- and 2-, 3-, and 4-neuron ANN-based GMPEs. Additionally, even the 5-neuron ANN-based model provided estimates which did not deviate significantly from those of the GMPEs mentioned above. On the other hand, the residuals of the 10- and 15-neuron ANN-based GMPEs exhibited either significant trends with source-to-site distance or significant offsets from zero, especially for the M7.0 Samos Island earthquake. Nevertheless, the performance of the 10- and 15-neuron models against the recorded data of the M6.0 and M6.3 earthquakes of Thessaly was adequate. The difference in the evaluation of the latter ANN-based models against the M7.0 data and the M6.3 and M6.0 data may be attributed to the fact that the training dataset contained more data in the M6.0–M6.5 magnitude range than in the M6.5–M7.0 range. It should be noted that the holdout validation method was utilized herein to tackle any overfitting issues for the ANN-based models. However, the alternative method of k-fold cross-validation may provide improved generalization for them and is worth investigating in future works.
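For reference, the k-fold cross-validation scheme suggested above for future work can be sketched as follows. The generator name `k_fold_indices`, the choice of k = 5, and the record count are illustrative assumptions; no such analysis was performed in this study.

```python
import random


def k_fold_indices(n_records, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: every record appears in exactly one test fold,
    and the model would be retrained k times, once per fold.
    """
    if not 2 <= k <= n_records:
        raise ValueError("k must be between 2 and the number of records")

    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example with an assumed record count of 103 and 5 folds
splits = list(k_fold_indices(103, k=5))
```

Unlike a single holdout split, each record is used for testing exactly once, so the error metric averaged over the folds depends less on one particular random partition. For ground-motion data, grouping the folds by earthquake rather than by record may be preferable, so that recordings of the same event do not appear in both the training and test subsets; that variant is not shown here.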
The implementation of ANNs in developing GMPEs for Tm is useful, as it brings enhanced predictive capabilities, reduces prediction uncertainty, and captures nonlinearities between the predictor and the response variables, provided that the training dataset is complete. Otherwise, care should be taken when implementing complex ANN-based models, especially when considering cases which lie on the edge of the models’ applicability. The more traditional regression-based GMPEs are still important, as their functional forms represent relationships inspired by fundamental theory and physics-based processes, as well as by earthquake data observations. A combination of these model development techniques may be optimal, especially in cases or data regions where the amount of data is limited. Future research may attempt to include directivity effects, as well as a residual analysis of station-to-station variability. These issues were not addressed herein because the dataset lacks proxies for directivity effects and because the majority of the stations in the dataset had fewer than five recordings, so the robustness of the station-to-station variability estimates may be questionable, as also discussed in [32].