Next Article in Journal
Secure Image Transmission Using Multilevel Chaotic Encryption and Video Steganography
Previous Article in Journal
Unveiling the Shadows—A Framework for APT’s Defense AI and Game Theory Strategy
Previous Article in Special Issue
DScanNet: Packaging Defect Detection Algorithm Based on Selective State Space Models
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Predicting the Magnitude of Earthquakes Using Grammatical Evolution

by
Constantina Kopitsa
,
Ioannis G. Tsoulos
* and
Vasileios Charilogis
Department of Informatics and Telecommunications, University of Ioannina, 451 10 Ioannina, Greece
*
Author to whom correspondence should be addressed.
Algorithms 2025, 18(7), 405; https://doi.org/10.3390/a18070405
Submission received: 24 May 2025 / Revised: 16 June 2025 / Accepted: 30 June 2025 / Published: 1 July 2025
(This article belongs to the Special Issue Algorithms in Data Classification (3rd Edition))

Abstract

Throughout history, human societies have sought to explain natural phenomena through the lens of mythology. Earthquakes, as sudden and often devastating events, have inspired a range of symbolic and mythological interpretations across different civilizations. It was not until the 18th and 19th centuries that a more positivist and scientific approach began to emerge regarding the explanation of earthquakes, recognizing their origin as stemming from processes occurring beneath the Earth’s surface. A pivotal moment in the emergence of modern seismology was the Lisbon earthquake of 1755, which marked a significant shift towards scientific inquiry. This means that the question of how earthquakes occur has been resolved; thanks to advancements in scientific, geological, and geophysical research, it is now well understood that seismic events result from the collision and movement of lithospheric or tectonic plates. The contemporary challenge that emerges, however, lies in whether such seismic phenomena can be accurately predicted. In this paper, a systematic attempt is made to use techniques based on Grammatical Evolution to determine the magnitude of earthquakes. These techniques use freely available data in which the history of large earthquakes is introduced before the application of the proposed techniques. From the execution of the experiments, it has become clear that the use of these techniques can allow for more effective estimation of the magnitude of earthquakes compared to other machine learning techniques from the relevant literature.

1. Introduction

Seismology, like most scientific disciplines, has matured over the centuries through the rational evolution of human reasoning. For many centuries, the interpretation of earthquakes was rooted in mythology, and it was only during the era of the profound Industrial Revolution that it became widely accepted as a geological phenomenon rather than an act of divine punishment. Despite this, one of the earliest seismic instruments, known as the seismoscope, did not record the timing or duration of ground motion, but merely signaled that seismic activity had taken place. This type of device was first developed by the Chinese scholar Zhang Heng as early as 132 CE [1]. Robert Mallet (1810–1881), often referred to as the “father of seismology” due to his pioneering contributions to the study of earthquakes, was an Irish geophysicist, civil engineer, and scientific researcher [2]. In today’s era, at the threshold between the Fourth and Fifth Industrial Revolutions, the science of seismology continues its quest for the “Holy Grail” of accurately predicting an impending earthquake. According to the United States Geological Survey (USGS), neither the USGS nor any other scientific institution has ever successfully predicted a major earthquake. A valid earthquake prediction must specify three critical elements: (1) the exact date and time, (2) the precise location, and (3) the anticipated magnitude [3].
Nevertheless, earthquake prediction was long considered an unattainable goal prior to the advent of artificial intelligence technologies. In recent years, however, annual conferences of leading scientific bodies such as the Seismological Society of America and the American Geophysical Union have included dedicated sessions on the application of machine learning (ML) in geophysical sciences. These scientific advancements, driven by the integration of AI-based methods, have increasingly captured the attention of seismologists and researchers in the field [4]. The necessity of accurate prediction is also evidenced by the following alarming data from the International Disaster Database EM-DAT https://www.emdat.be/ (accessed on 30 June 2025): between 2010 and 2025, earthquakes have resulted in 337,372 deaths and 698,085 injuries, and have left 1,547,581 individuals homeless. These data are depicted in Table 1. It is important to note that EM-DAT records only those events that meet at least one of the following criteria: ten or more fatalities, one hundred or more individuals affected, the declaration of a state of emergency, or a call for international assistance. These figures underscore the significance of seismic disasters and lead us to emphasize the urgent need for scientific advances in earthquake prediction. In order to contextualize this necessity further, it is worth examining comparable human losses caused by other natural disasters.
These stark figures reveal how vulnerable humanity remains even in our modern era, in the face of natural disaster events that are further intensified by the effects of climate change. Nevertheless, there is room for optimism, as researchers are increasingly employing artificial intelligence and other advanced methodologies in the pursuit of forecasting earthquakes (and other natural disasters) in a timely manner, with the ultimate goal of enabling populations to seek safety and reduce potential losses [5]. Consequently, emphasis has been placed on various scientific approaches that investigate events that may potentially influence or trigger the release of seismic energy. The following are some examples that illustrate these approaches.
In the state of Oklahoma, researchers have developed a sophisticated Bayesian approach to assess the relationship between wastewater injection practices and the occurrence of seismic events [6]. In early 2002, a study was conducted examining the correlation between long-period seismic tremors occurring at a depth of approximately 30 km, and underlying tectonic processes [7]. The connection was established in 2016, highlighting that the careful and precise monitoring of slow-slip events may yield valuable insights into the probability of forthcoming large-magnitude earthquakes [8]. An alternative approach involves the systematic monitoring of potential precursory signals preceding the onset of a major seismic event [9]. Several researchers have suggested alternative mechanisms, including electromagnetic anomalies and ionospheric disturbances, although investigations in this area remain in their preliminary stages [5]. In addition to these approaches, certain institutions have also explored the study of animal behavior as a potential means of anticipating seismic activity [10].
The following section presents researchers’ efforts to forecast imminent seismic events through the application of machine learning techniques. One study proposed an efficient and accurate framework for estimating earthquake magnitudes directly from unprocessed single-station waveform data, yielding minimal average error and a standard deviation close to 0.2, without requiring instrument-response adjustments, as validated using seismic records from Southern California [11]. Another study demonstrated that machine learning-driven approaches enhanced the accuracy of aftershock location forecasts and shed light on key physical parameters that may govern earthquake triggering during the most dynamic phase of the seismic cycle [12]. In another study, researchers trained convolutional neural networks (CNNs) on over 18 million manually picked seismographs from Southern California, and estimated earthquake parameters directly from raw waveform data, without the need for feature extraction. The model demonstrated high precision, achieving a standard deviation of just 0.023 s in arrival times and 95% accuracy in polarity classification [13]. Building on recent advancements, the authors of [14]  employed machine learning techniques on datasets derived from shear laboratory experiments, aiming to uncover previously undetected signals that might precede seismic events. The same researchers subsequently conducted an analysis using a machine learning-based method initially developed in the laboratory. They processed extensive raw seismic data from Vancouver Island to distinguish relevant signals from background seismic noise, an approach that may prove valuable in assessing whether and how a slow slip event could be coupled with or evolve into a major earthquake [15]. Subsequently, a high-resolution earthquake catalog was developed through machine learning techniques, providing new insights into the complexity and duration of earthquake sequences, as well as their interrelation with recent neighboring seismic events [16]. Following this, another study introduced a global deep learning model capable of concurrently detecting earthquakes and identifying seismic phases. The proposed model demonstrated superior performance compared to existing deep learning approaches and conventional detection and phase-picking methods. Notably, it enabled the detection, and localization of approximately twice as many earthquakes, while utilizing less than one third of available seismic stations [17]. A subsequent study focused on California, and demonstrated that nearest-neighbor diagrams offer a straightforward and effective method for distinguishing between various seismic patterns and evaluating the reliability of earthquake catalogs [18]. Another research team concluded that the Weibull model provided a superior fit to seismic data for California, exhibiting well-behaved tail characteristics. Furthermore, they demonstrated its robustness by applying it successfully to independent datasets from Japan, Italy, and New Zealand [19]. A subsequent study aimed to develop a model to predict both the location and magnitude of potential earthquakes occurring in the following week based on seismic data from the current week, focusing on seismogenic regions in southwestern China. The model achieved a testing accuracy of 70%, with corresponding precision, recall, and F1-score values of 63.63%, 93.33%, and 75.66%, respectively [20]. Moreover, an ensuing study focused on predicting earthquake magnitudes in the Hindukush region. Four machine learning techniques—namely, pattern recognition based on neural networks, Recurrent Neural Networks, random forest, and a linear programming boost ensemble classifier—were individually implemented to model the relationships between computed seismic parameters and the occurrence of future earthquakes [21]. Additionally, another study demonstrated that machine learning can effectively predict the timing and size of laboratory earthquakes by reconstructing and making sense of the system’s intricate spatiotemporal loading history [22]. In a related study, a machine learning technique was applied to the regions of Japan, Turkey, Greece, and the Indian subcontinent. The model revealed a relationship between the computed seismic data and the occurrence of future earthquakes [23]. Another paper compared the performance of a machine learning model which uses a limited set of predictor variables—surface roughness, peak frequency (fP), HV, VS30, and depth (Z2.5)—to that of a physics-based model (GRA) relying on detailed 1D velocity profiles. The results indicated that the machine learning approach outperformed the physics-based modeling in terms of predictive accuracy [24]. Another study investigated physical and dynamic variations in seismic data, and introduced a novel machine learning method called Inverse Boosting Pruning Trees (IBPT). This approach was designed to provide short-term forecasts based on satellite data from 1371 earthquakes with a magnitude of six or higher, given their significant environmental impact [25]. In contemporary scientific research, there is a growing surge of interest in the prediction of seismic events. Advancements in data acquisition technologies, communication networks, edge–cloud computing, the Internet of Things (IoT), and big data analytics have created favorable conditions for the development of intelligent earthquake prediction models, enabling early-warning systems in vulnerable regions [26]. In this context, it is noteworthy that in recent years, there has been significant attention paid to the development of various models for the early detection of seismic events, which are now accessible to the general public through mobile devices in the form of applications (apps), TV media, or radio. For instance, in Japan, significant investment has been made in the timely dissemination of information regarding seismic events since 2007, through the implementation of the Earthquake Early Warning (EEW) system [27]. This enables an alert for an impending earthquake to be issued from up to one minute to several seconds before the event occurs [28]. Moreover, the United States has also developed its own earthquake warning system through the U.S. Geological Survey. Since 2016, a system known as Shake Alert has been implemented for the West Coast [29,30]. In Southern Europe, specifically at the University of Naples in Italy, a software model known as PRESTo has been developed, which is capable of detecting earthquakes approximately within the last ten seconds before their occurrence. It is abundantly clear that the science of machine learning has entered the field of seismology with considerable momentum, a fact that is also reflected in the bibliographic references, with two articles authored by specialists in the fields of geological and geophysical sciences highlighting how machine learning contributes to the advancement of earthquake prediction. In this article, a new generation of earthquake catalogs, developed through supervised machine learning, provides a means of capturing seismic activity in unprecedented detail. The use of unsupervised machine learning to more comprehensively analyze seismicity is suggested as the most efficient path toward enhancing earthquake forecasting [31]. The same researchers emphasize that machine learning and data mining techniques can greatly enhance our ability to process seismic data. In their review, they offer a comprehensive overview of machine learning applications in earthquake seismology, discuss recent advancements and existing challenges, and propose directions for future research [32].
In the current work, seismological data collected from the NSF Seismological Facility were obtained, and afterwards, these data were preprocessed, so that for each earthquake with a magnitude greater than 5, the closest earthquakes that had preceded it in various parts of the planet within a critical distance were identified. Then, the number of these preceding seismic vibrations and their average magnitude were added to these recordings. The purpose of the above process is to achieve the most accurate prediction of the magnitude of an earthquake based on its characteristics and the earthquakes that have preceded it within a predetermined distance. An attempt was then made to predict the magnitude of earthquakes using machine learning techniques based on the Grammatical Evolution method [33]. Grammatical Evolution is an evolutionary technique where chromosomes (candidate solutions) are expressed as a series of production rules of a provided BNF grammar [34]. Grammatical Evolution has been used in a series of problems, such as data fitting [35,36], composition of music [37], video games [38,39], energy problems [40], cryptography [41], economics [42], etc. Methods based on Grammatical Evolution were compared against various optimization techniques employed to train artificial neural networks [43,44] for the prediction of the magnitude of earthquakes, and the results are presented and discussed.
The limitations of the proposed work stem primarily from those that we imposed ourselves, as it is focused on the analysis of data from a three-year period. Moreover, in order to avoid overfitting, we deliberately excluded a significant portion of data from each year, and concentrated on seismic events with a magnitude greater than 5.
In this study, is was observed that machine learning and soft computing techniques have a longstanding presence in the field of seismology. For instance, artificial neural networks (ANNs) were introduced to the field of seismology in 1994 [45]. The initial application of Deep Neural Networks (DNNs), featuring two hidden layers, emerged in 2002 [46], while the earliest implementation of Recurrent Neural Networks (RNNs) in the context of seismology appeared in 2007 [47]. However, they have yet to achieve so-called “triple prediction” namely, accurate forecasting of the date and time, location, and precise magnitude of seismic events. For this reason, in 2025, we adopt a groundbreaking approach employing Grammatical Evolution, a novel technique within the domain. Our initial approach involved the application of machine learning techniques. Specifically, we conducted experiments using the following algorithms: LSTM, RBF, MLP (BP, RPROP, BFGS), and SVM. Subsequently, we carried out an exploratory experiment employing Grammatical Evolution. The results proved to be so compelling that we decided to shift our focus and pursue this novel direction in greater depth. Moreover, our approach distinguishes itself from other studies in the field, as most existing research relies on data obtained directly from seismographs and primarily focuses on localized regions. In contrast, we process historical earthquake data on a global scale, allowing for broader generalization and pattern recognition.
Despite these constraints, our study highlights the considerable potential of Grammatical Evolution in this domain, due to its consistently dynamic adaptability, high predictive accuracy, and low error rates, even for earthquakes occurring at distances ranging from 25 to 500 miles. Therefore, Grammatical Evolution proves to be both pioneering and innovative within the field, offering substantial promise for advancing the discipline of seismology. The methods used in this work, which make systematic use of the Grammatical Evolution technique, include the construction of artificial features, the construction of programming rules, and the creation of artificial neural networks. The above techniques can be used to effectively explore the objective problem space, as they have the ability to isolate the most important features of the problem, but also to detect and present hidden correlations between the features of the problem.
The remainder of this paper i structured as follows: Section 2 presents the dataset employed, as well as the proposed machine learning techniques; Section 3 illustrates the experimental results; and finally, Section 4 presents some conclusions.

2. Materials and Methods

2.1. The Dataset Employed

In this paper, open data were used which are available from the NSF Seismological Facility for the Earth Consortium (SAGE), and particularly from the Interactive Earthquake Browser http://ds.iris.edu/ieb/ (accessed on 30 June 2025). The data were obtained from the NSF as it offers greater functionality. Specifically, while the GEOFON program provides similar information, it imposes a limitation on the maximum number of earthquakes retrievable per request (1000 events), which significantly constrains the selectable time range. This is particularly restrictive given that approximately 1000 seismic events can occur within a single day. The NSF SAGE Facility has been certified as a trustworthy data repository by the CoreTrustSeal Standards and Certification Board. The data were collected for the years 2004, 2010, and 2012, with each year comprising over 100,000 recorded earthquake events. For every recorded earthquake event, comprehensive data were systematically gathered, including the date, exact time of occurrence, geographic coordinates, depth, and magnitude. This information enabled temporal analyses and the identification of seismic patterns across different time intervals. Moreover, our dataset employs the Moment Magnitude scale, as it operates effectively across a broader range of earthquake sizes, and is applicable on a global scale. During the initial stages of preprocessing, it became evident that earthquake events should be organized based on lithospheric plate boundaries, rather than national borders, which had initially been our approach. A preprocessing procedure was applied to the data, including the identification of the lithospheric plate associated with each earthquake. Subsequently, each earthquake location was cross-referenced with nearby volcanoes, where applicable, using data from the Smithsonian Institution’s Global Volcanism Program https://volcano.si.edu/ (accessed on 30 June 2025). During the course of our experiments, we decided to proceed with earthquakes with a magnitude of 5 and above, as including lower-magnitude events would result in the model being trained primarily on the prediction of minor seismic occurrences. In our research, we approached earthquake magnitude prediction as a regression problem, employing the technique of Grammatical Evolution. To achieve this objective, we used the following features as input variables: date, time, latitude, longitude, depth, lithospheric plate, type of nearby volcano, magnitude, and magnitude code. In addition, a predefined distance, denoted as D c in the experiments, from the epicenter of the target event (10 miles, 25 miles, 50 miles, 100 miles, and 500 miles) was also taken into account. The numerical output of the model predicted the earthquake’s intensity with high predictive accuracy, with a mean absolute error below 0.5 magnitude units per event.
The following section provides a concise summary, as well as a more detailed analysis, of our data set. For the year 2004, a total of 666 geographic regions were classified and used as an input for the regression model, within which 215,753 seismic events were recorded. Similarly, for 2010, 635 regions were defined and encoded as input features, corresponding to 327,909 earthquakes, while for 2012, 690 regions were established as categorical input variables, with a total of 405,153 recorded seismic events. Regarding tectonic plates, for each seismic event, we classified and used as an input feature the tectonic plates involved in the corresponding region. In total, 81 distinct combinations of lithospheric tectonic plates were identified. In relation to the classification of volcanoes, these were categorized into ten types: stratovolcano, volcanic field, lava dome, caldera, complex, compound, shield, pyroclastic, minor, and submarine. For each region containing a volcano, the corresponding category was marked as an input feature with a value of 1, while a value of 0 was assigned when no volcano of that type was present. Table 2 shows the classification of earthquakes into various classes, depending on their magnitude, for each year studied in this work.
The transformation from raw seismic events to structured input for machine learning was conducted as follows:
  • On the Earthquake Interactive Browser platform, the “maximum earthquakes” parameter was set to 25,000 in order to extract the maximum available number of records.
  • The “time range” was then adjusted to correspond to the specific year or range of years targeted for data collection.
  • Subsequently, the “magnitude range” was specified, the filter was applied, and the dataset was downloaded in Excel format.
  • The final dataset was further processed by creating a separate column for each variable, including the following: Year, Month, Day, Time, Latitude, Longitude, Depth, Magnitude, Magnitude Code, Region, Region Code, Lithospheric/Tectonic Plate, Lithospheric/Tectonic Plate Code, Stratovolcano, Volcanic Field, Lava Dome, Caldera, Complex, Compound, Shield, Pyroclastic, Minor, and Submarine.
In order to enhance the reliability of the dataset employed, extensive data cleaning was performed by removing several thousands of records from each year. For instance, for 2004, a total of 214,170 entries were excluded; for 2010, 325,278 records were removed; and for 2012, 403,035 records were similarly discarded. The extensive data cleaning process played a crucial role in preventing our model from predominantly learning to predict the values of low-magnitude earthquakes, which vastly outnumbered higher-magnitude events by several hundreds of thousands.
In the preprocessing pipeline employed, categorical variables such as geographic regions and lithospheric/tectonic plates were encoded using a unique integer-based labeling scheme. Specifically, each distinct geographic region and tectonic plate was assigned a unique numeric identifier, ranging sequentially from 1 up to the number of unique entries in the respective category. Regarding volcanic types, a binary encoding approach was applied. For each type of volcano (e.g., stratovolcano, caldera, shield, etc.), we created a binary feature indicating the presence or absence of that specific volcano type within a given region. A value of 1 was assigned if the volcano type was present, and 0 otherwise. This strategy enabled the model to capture the influence of specific volcanic characteristics while maintaining compatibility with the GE algorithm.

2.2. Grammatical Evolution Preliminaries

The Grammatical Evolution procedure is considered as a variant of genetic algorithms [48,49], where the chromosomes are series of integer values that represent production rules of the underlying BNF grammar. BNF grammars are denoted as sets G = N , T , S , P , with the following definitions:
  • The set N represents non-terminal symbols.
  • The set T contains the terminal symbols of the language.
  • The symbol S N denotes the start symbol of the grammar.
  • The set P holds the production rules of the grammar.
The production mechanism of Grammatical Evolution is initiated with the symbol S, and using a series of production rules, a valid program is created replacing non-terminal symbols with the right hand of the selected production rule. The rules are selected through the following procedure:
  • Read the next element V from the current chromosome.
  • Select the production rule using the following equation: Rule = V mod N R . The constant N R stands for the total number of production rules for the non-terminal symbol that is currently under processing.
The previously used procedure is graphically illustrated in Figure 1.
As a full working example of this production mechanism, consider the following chromosome:
x = 9 , 8 , 6 , 4 , 16 , 10 , 17 , 23 , 8 , 14
and the grammar in Figure 2. The numbers shown in parentheses are the increasing numbers of the production rules for each non-terminal symbol. Denote with d = 3 the number of features (inputs) for the current dataset. The production of the valid expression f ( x ) = x 2 + cos x 3 is performed through a series of steps, which are depicted in Table 3.

2.3. The Rule Production Method

The rule construction method was initially presented in [50]. This method can produce rules in a human-readable form that can be used in classification and regression problems without any prior knowledge of the the objective problem. The main steps of this method are as follows:
  • Step 1—Initialization step.
    • Set as N c the number of chromosomes, and as N g the maximum number of allowed generations.
    • Set as p s the selection rate of the genetic algorithm, and as p m the corresponding mutation rate.
    • Initialize the chromosomes c i , i = 1 , , N c . Each chromosome is considered a set of randomly selected positive integers.
    • Set  k = 0 , the generation counter.
  • Step 2—Fitness calculation step.
    • For  i = 1 , , N c   perform the following:
      (a)
      Create the program P i for the chromosome c i using the grammar of Figure 3 and the Grammatical Evolution production mechanism.
      (b)
      Set as the fitness value f i for chromosome c i the training error of the produced program, calculated as follows:
      f i = j = 1 M P i x j y j 2
      The set x j , y j , x R N , j = 1 , , M defines the training set of the objective problem, where the value y j is considered as the actual output for the input pattern x j . In the current implementation of Grammatical Evolution, the fitness function is defined as the sum of squared errors between the predicted and actual earthquake magnitudes across the training set. This fitness function is crucial in guiding the evolutionary process by favoring candidate solutions (programs) that minimize prediction error. As earthquake magnitude prediction is a regression task, this formulation ensures that evolved models are optimized for minimizing the deviation from actual magnitudes.
    • End For
  • Step 3—Genetic operations step.
    • Select the best 1 p s × N c chromosomes from the current population. These chromosomes will be transferred intact to the next generation.
    • Create  p S N c chromosomes with the assistance of the one-point crossover shown graphically in Figure 4. For every couple z 1 , z 2 of created offsprings, two chromosomes should be chosen from the current population using tournament selection.
    • Mutation procedure: For every element of each chromosome, a random number r 1 is selected. The corresponding element is altered randomly when r p m .
  • Step 4—Termination check step.
    • Set  k = k + 1 .
    • If  k < N g , go to the fitness calculation step.

2.4. Constructed Neural Networks

Another method used in the conducted experiments was the neural network construction method [51]. This method can be used to discover the optimal architecture of artificial neural networks, as well as to estimate the parameters of the network by using the Grammatical Evolution technique. This method has been applied in a series of problems, such as in chemistry [52], education problems [53], autism screening [54], etc. The BNF grammar incorporated in the neural construction procedure is depicted in Figure 5. This grammar can produce artificial neural networks in the following form:
N x , w = i = 1 H w ( d + 2 ) i ( d + 1 ) σ j = 1 d x j w ( d + 2 ) i ( d + 1 ) + j + w ( d + 2 ) i
The value H denotes the number of processing units (weights) of the neural network. Also, the function σ ( x ) stands for the sigmoid function. Following the previous equation, it is deducted that the total number of parameters for this network can be computed as follows:
n = d + 2 H
As an example, consider the following neural network:
N ( x ) = 1.9 s i g 10.5 x 1 + 3.2 x 3 + 1.4 + 2.1 s i g 2.2 x 2 3.3 x 3 + 3.2
This expression represents a neural network used in a problem with 3 inputs x 1 , x 2 , x 3 . The number of processing nodes is H = 2 . This neural network is outlined graphically in Figure 6.
The steps of the method used to construct artificial neural networks are shown below:
  • Step 1—Initialization step.
    • Define as N c the number of chromosomes in the genetic population and as N g the total number of allowed generations.
    • Set as p s the used selection rate and as p m the used mutation rate.
    • Initialize the chromosomes c i , i = 1 , , N c . Each chromosome is considered as a set of randomly selected positive integers.
    • Set  k = 0 as the generation counter.
  • Step 2—Fitness calculation step.
    • For  i = 1 , , N c , perform the following:
      (a)
      Obtain the chromosome c i .
      (b)
      Create the corresponding neural network N i x , w for this chromosome using the grammar in Figure 5.
      (c)
      Calculate the associated fitness value f i as the training error of network N i x , w , defined as follows:
      f i = j = 1 M N i x j , w y j 2
    • End For
  • Step 3—Application of genetic operations. Apply the same genetic operations as in the algorithm of Section 2.3.
  • Step 4—Termination check step.
    • Set  k = k + 1 .
    • Terminate if k N g .
    • Go to the fitness calculation step.

2.5. The Feature Construction Method

Another technique, based on Grammatical Evolution, that was used in the conducted experiments was the Feature Construction technique, initially described in the work of Gavrilis et al. [55]. This technique produces artificial features from original ones using non-linear mappings produced by the Grammatical Evolution procedure. The BNF grammar used for this technique is outlined in Figure 7.
The produced features are evaluated by Radial Basis Function (RBF) networks [56,57] due to the speed of their associated training procedure. The steps of the Feature Construction technique are as follows:
  • Initialization step.
    (a)
    Set the parameters of the method: N c —the number of chromosomes, N g —the maximum number of allowed generations, p s —the selection rate, and p m —the mutation rate.
    (b)
    Initialize the N c chromosomes as sets of random integers.
    (c)
    Set as N f the number of constructed features.
    (d)
    Set  k = 0 , the generation counter.
  • Fitness calculation step.
    (a)
    For  i = 1 , , N c   perform the following:
    • Produce  N f artificial features y 1 , y 2 , , y N f for the processed chromosome c i using the grammar in Figure 7.
    • Modify the original training set using the constructed features y 1 , y 2 , , y N f .
    • Apply a machine learning model, denoted as M ( x ) , to the modified set and define as the fitness value f i the training error of M ( x ) .
    (b)
    End For
  • Genetic operations step. Apply the same genetic operations as applied in the algorithm in Section 2.3.
  • Termination check step.
    (a)
    Set k = k + 1 .
    (b)
    Terminate when k N g .
    (c)
    Go to the fitness calculation step.

3. Experimental Results

The measurements from the years 2004, 2010, and 2012 were used as test cases in the experimental results. For the Neural Network Construction technique, the freely available software NNC version 1.0 [58] was incorporated. Feature Construction was performed using the QFC software version 1.0 [59], which is also freely available. Furthermore, the WEKA programming tool version 3.6.14 [60] was utilized in the conducted experiments. Validation of the experiments was performed using the ten-fold cross-validation technique, and all experiments were conducted on a machine running Debian Linux with 128 GB of RAM. The average regression error for the test set was measured using the following equation:
E R M x = i = 1 N N x i y i 2 N
Here, the test set T is defined as the set T = x i , y i , i = 1 , , N , and the equation M x denotes the application of the machine learning M ( x ) on the input pattern x . The values for the experimental parameters are outlined in Table 4. The parameters used in the individual genetic algorithms have been successfully used in the past in a multitude of research works, and furthermore constitute a compromise between the speed and performance of the algorithms used in the present work.
Also, the following notation is used in the experimental table:
  • The column YEAR denotes the recording year for the earthquakes.
  • The column D c denotes the critical distance, expressed in miles, used in the preprocessing of the earthquake data.
  • The column LSTM denotes the incorporation of the long short-term memory (LSTM) neural network [61], as implemented in the PyTorch programming library [62].
  • The column SVM represents the usage of the Support Vector Machines [63], as coded in the LibSvm library [64].
  • The column MLP(BP) denotes the incorporation of the Back Propagation algorithm [65] in the training of a neural network with H = 10 processing nodes.
  • The column MLP(RPROP) represents the usage of the RPROP method [66,67] for the training of a neural network with H = 10 processing nodes.
  • The column MLP(BFGS) denotes the usage of the BFGS optimization procedure [68] for the training of an artificial neural network with H = 10 processing nodes.
  • The column RULE denotes the incorporation of the rule construction method, described in Section 2.3.
  • The column NNC represents the usage of the Neural Network Construction method, provided in Section 2.4.
  • The column FC represents the usage of the Feature Construction technique, outlined in Section 2.5.
  • The row AVERAGE represents the average error for all years and critical distances.
An example of constructed features for the year 2010 is presented below:
f 1 ( x ) = 4 99.35 x 2 + cos x 3 877.38 x 6 + sin 43.56 x 4 + x 6 99.35 x 2 f 2 ( x ) = 2 x 13 95.2 x 2 + 948.94 x 1
According to the data from the Table 5 for seismic magnitude prediction (2004) based on neighboring earthquakes, the following key observations are made: The Grammatical Evolution models (RULE, NNC, FC) continuously exhibit the lowest regression error (0.16–0.17) at all distances D c . They are much more stable and accurate than the other models, with an average error of 0.164. This suggests that Grammatical Evolution significantly improves performance. Among the models without Grammatical Evolution, LSTM and MLP-RPROP have the best performance, with an average error of around 0.24, and demonstrate decent stability at different distances. SVM has a slightly higher average error (0.282), but is also relatively stable. MLP-BP has a higher average error (0.418), with some improvement at greater distances (e.g., 0.35 at 100 miles). MLP-BFGS exhibits the worst error (average 0.802) and very high variance, particularly at short distances (0.98 at 25 miles), indicating significant instability. Distance (Dc) seems to minimally affect the error for most models (LSTM, SVM, MLP-RPROP, and the grammatical models), suggesting robustness. The significant exception is MLP-BFGS, which shows marked sensitivity to distance. In summary, the Grammatical Evolution models (RULE, NNC, FC) are clearly the most accurate and reliable for this prediction problem. Among the rest of the models, LSTM and MLP-RPROP are the best choices due to their balanced performance, while MLP-BFGS appears to be unsuitable due to high and unpredictable error.
For the year 2010, the Grammatical Evolution models (RULE, NNC, FC) maintain the lowest regression error (0.17–0.19) at all distances, with NNC achieving the best average (0.176). They are more stable than most non-evolutionary models, confirming that Grammatical Evolution provides reliable predictions. Among the other models, LSTM shows the best performance (average error 0.24), with remarkable stability at all distances. MLP-RPROP has a slightly higher average error (0.256), but with greater variance, particularly at long distances (0.32 at 500 miles). SVM remains stable (average 0.298), with no significant fluctuations. MLP-BP improves at medium distances (0.31 at 100 miles), but has a higher average error (0.366). MLP-BFGS improves compared to 2004 (average error 0.548), but still exhibits significant variance and high error at short distances (0.74 at 10 miles). Distance (Dc) minimally affects most models, with the exception of MLP-BFGS and, to a lesser extent, MLP-RPROP, which show sensitivity. Overall, the Grammatical Evolution models (especially NNC) and LSTM stand out as the most reliable approaches for this prediction problem.
Similarly, for 2012, the Grammatical Evolution models (RULE, NNC, FC) continue to maintain the lowest and most stable regression error at all distances. NNC and FC achieve the best average error (0.17 and 0.166 respectively), with FC demonstrating remarkable stability (0.16) at most distances. RULE has a slightly higher average (0.172). This confirms the superiority of Grammatical Evolution for this problem. Among the non-evolutionary models, LSTM stands out with the best average error (0.224) and very good stability at different distances. MLP-RPROP has a similar average (0.238), but greater variance, particularly at long distances (0.29 at 500 miles). SVM improves significantly compared to previous years (average error 0.268) and is stable. MLP-BP has the highest error (0.35) among the non-evolutionary models, with increasing error at medium distances (0.38 at 100 miles). MLP-BFGS continues to have the highest average error (0.566) and explosive variance, ranging from extremely low error (0.21 at 50 miles) to very high (0.85 at 500 miles), making it unreliable. Distance (Dc) minimally affects the grammatical models, LSTM, and SVM. MLP-BP shows sensitivity at medium distances, while MLP-RPROP and particularly MLP-BFGS show strong sensitivity at specific distances. Overall, for 2012, the grammatical models (especially FC and NNC) and LSTM stand out as the most accurate and stable approaches. MLP-BFGS remains unsuitable due to unpredictable performance, while MLP-BP has the highest error among the traditional models. There is a general improvement in SVM and in the stability of LSTM compared to previous years.
In Figure 8, the grammatical models (RULE, NNC, FC) exhibit the lowest and most stable error (0.16–0.17) across all distances, clearly outperforming the others. LSTM and MLP-RPROP are the best non-evolutionary models (average error ~0.24), while MLP-BFGS shows significant instability (average 0.802), with particularly high errors at short distances.
In Figure 9, the grammatical models maintain their superiority with an average error of 0.176–0.188, with NNC standing out. LSTM continues to be the top non-evolutionary model (average 0.24), with remarkable stability, while MLP-BFGS improves compared to 2004 (average 0.548), but retains significant variation, especially at short distances.
In Figure 10, the grammatical models are shown to achieve their best performance, with FC (average 0.166) and NNC (0.17) demonstrating exceptional stability. LSTM remains the most reliable non-evolutionary model (average 0.224), while MLP-BFGS shows extreme variation (from 0.21 to 0.85), despite its reduced average error (0.566).
According to the results of the Friedman test shown in Figure 11, where the overall difference between the models is extremely statistically significant (p = 4.51 × 10−18 < 0.0001, critical value = 4.2863,critical difference = 3.834), the following key stratifications are observed: The models with Grammatical Evolution (RULE, NNC, FC) show homogeneous performance (p = ns among them), but are significantly better (p = ****) than MLP(BP) and MLP(BFGS). They also significantly outperform SVM (p = **** for NNC/FC, p = *** for RULE). LSTM (without Grammar-Based Evolution) does not differ significantly either from the models with Grammar-Based Evolution (p = ns) or from SVM, MLP(BP), and MLP(RPROP) (p = ns), with the only significant difference being against MLP(BFGS) (p = **). MLP(RPROP) (without Grammar-Based Evolution) does not differ from RULE (p = ns), but shows significantly higher error than NNC/FC (p = *) and MLP(BFGS) (p = *). MLP(BFGS) (without Grammar-Based Evolution) is the least reliable model, with significantly worse performance (p = ***) compared to all models with Grammar-Based Evolution, and significantly worse performance than LSTM (p = **) and MLP(RPROP) (p = *).
In Figure 12 and Figure 13, the prediction error analysis results for different machine learning models are presented. We observe significant differences between the two model categories. The analysis reveals that models without Grammatical Evolution exhibit greater variability in prediction errors. Notably, one of these models shows particularly high errors at certain distances, indicating a lack of stability. In contrast, the most effective model in this category maintains relatively low errors across all distances. The methods with Grammatical Evolution demonstrate remarkable stability and lower errors. The three models in this category show similar and consistent performance, with minimal variations across different distances and years. The effect of distance is clearly visible in models without Grammatical Evolution, where errors vary significantly. Conversely, in models with Grammatical Evolution, distance does not appear to significantly affect prediction accuracy. Temporally, while models without Grammatical Evolution show some improvement over time, models with Grammatical Evolution maintain stable performance throughout the study period. Overall, the results support the superiority of methods with Grammatical Evolution, which demonstrate greater reliability and stability under various conditions. This ability to effectively handle data variations makes them an ideal choice for applications requiring accurate and stable predictions.
To verify the stability of the application of Grammatical Evolution to the present seismic data, an additional experiment was performed wherein random noise in the range of 0.1–5% was applied to the problem features one by one. The neural construction method, guided by the Grammatical Evolution, was incorporated for the data, with the noise and the average regression error, as measured in the test, shown in Table 6. The noise was applied to the results for the year 2010, where the critical distance D c was set to 10 miles.
As one can see from the above experimental results, only in a few cases of noise and only for a limited number of features did a partial deviation in the error appear compared to the case of no noise in the data.

4. Conclusions

This article presents an innovative approach to predicting earthquake magnitudes using the Grammatical Evolution method, which offers significant advantages over traditional machine learning techniques. Experimental results for the years 2004, 2010, and 2012 demonstrate that models employing Grammatical Evolution (RULE, NNC, FC) consistently achieve lower and more stable regression errors compared to models without Grammatical Evolution (MLP(BP), MLP(RPROP), MLP(BFGS)). The superiority of these models is evident both in their reduction of the mean error and in their resilience to changes in distance, indicating better adaptability to varying spatial conditions. The study’s conclusions emphasize the capability of Grammatical Evolution to effectively identify and exploit structures and relationships in seismic data. Despite the observed improvement in non-Grammatical Evolution models over time, their performance remained inferior to that of models incorporating Grammatical Evolution. This demonstrates that using Grammatical Evolution provides more reliable predictions, a critical factor for applications such as earthquake early-warning systems.
As future directions, this study suggests exploring the exact mechanisms through which Grammatical Evolution enhances performance, focusing on analyzing the structure of the generated rules. Additionally, the scalability of the method to larger and more complex datasets could be investigated, as well as its comparison with other advanced machine learning techniques. Another avenue could involve optimizing the method’s hyperparameters and incorporating additional geophysical parameters for even more accurate predictions. Finally, the real-time application of the model and its integration into early-warning frameworks could be significant steps toward the practical utilization of this study’s findings.
Furthermore, as previously mentioned in the introduction, several developments of various models have focused on short-term early-warning systems, typically offering a lead time of approximately one minute, based on real-time detection of tectonic or lithospheric plate movement as it occurs. In contrast, our study not only targeted seismic events of magnitude 5.0 and above, but also took a step further by demonstrating the potential to predict such events before any plate movement is detected. Consequently, our model can potentially be integrated into a broader early-warning framework capable of delivering mid-to-long-term predictions of earthquake occurrences, ranging from hours or days to even months in advance, thereby offering a significantly enhanced preparedness window and risk mitigation efforts.

Author Contributions

C.K., V.C. and I.G.T. conceived of the idea and the methodology, and C.K. and V.C. implemented the corresponding software. C.K. conducted the experiments, employing objective functions as test cases, and provided the comparative experiments. V.C. performed the necessary statistical tests. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been financed by the European Union: Next Generation EU through the Program Greece 2.0 National Recovery and Resilience Plan, under the call RESEARCH-CREATE-INNOVATE, project name “iCREW: Intelligent small craft simulator for advanced crew training using Virtual Reality techniques” (project code: TAEDK-06195).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Seismograph. Britannica, Earth Sciences. Available online: https://www.britannica.com/science/seismograph (accessed on 20 May 2025).
  2. Robert Mallet. Britannica, Civil Engineering. Available online: https://www.britannica.com/biography/Robert-Mallet (accessed on 20 May 2025).
  3. Can You Predict Earthquakes? United States Geological Survey. Available online: https://www.usgs.gov/faqs/can-you-predict-earthquakes (accessed on 20 May 2025).
  4. Bergen, K.J.; Chen, T.; Li, Z. Preface to the focus section on machine learning in seismology. Seismol. Res. Lett. 2019, 90, 477–480. [Google Scholar] [CrossRef]
  5. Hutchison, A. How machine learning might unlock earthquake prediction. MIT Technology Review. 29 December 2023. Available online: https://www.technologyreview.com/2023/12/29/1084699/machine-learning-earthquake-prediction-ai-artificial-intelligence/ (accessed on 20 May 2025).
  6. Hincks, T.; Aspinall, W.; Cooke, R.; Gernon, T. Oklahoma’s induced seismicity strongly linked to wastewater injection depth. Science 2018, 359, 1251–1255. [Google Scholar] [CrossRef]
  7. Obara, K. Nonvolcanic deep tremor associated with subduction in southwest Japan. Science 2002, 296, 1679–1681. [Google Scholar] [CrossRef] [PubMed]
  8. Obara, K.; Kato, A. Connecting slow earthquakes to huge earthquakes. Science 2016, 353, 253–257. [Google Scholar] [CrossRef] [PubMed]
  9. Bletery, Q.; Nocquet, J.M. The precursory phase of large earthquakes. Science 2023, 381, 297–301. [Google Scholar] [CrossRef]
  10. Avolio, C.; Muller, U.; Wikelski, M. The secret knowledge of animals. Max Planck Institute of Animal Behavior. 12 April 2024. Available online: https://www.ab.mpg.de/578863/news_publication_21821500_transferred?c=413930 (accessed on 20 May 2025).
  11. Mousavi, S.M.; Beroza, G.C. A machine-learning approach for earthquake magnitude estimation. Geophys. Res. Lett. 2020, 47, e2019GL085976. [Google Scholar] [CrossRef]
  12. DeVries, P.M.; Viégas, F.; Wattenberg, M.; Meade, B.J. Deep learning of aftershock patterns following large earthquakes. Nature 2018, 560, 632–634. [Google Scholar] [CrossRef]
  13. Ross, Z.E.; Meier, M.A.; Hauksson, E. P wave arrival picking and first-motion polarity determination with deep learning. J. Geophys. Res. Solid Earth 2018, 123, 5120–5129. [Google Scholar] [CrossRef]
  14. Rouet-Leduc, B.; Hulbert, C.; Lubbers, N.; Barros, K.; Humphreys, C.J.; Johnson, P.A. Machine learning predicts laboratory earthquakes. Geophys. Res. Lett. 2017, 44, 9276–9282. [Google Scholar] [CrossRef]
  15. Rouet-Leduc, B.; Hulbert, C.; Johnson, P.A. Continuous chatter of the Cascadia subduction zone revealed by machine learning. Nat. Geosci. 2019, 12, 75–79. [Google Scholar] [CrossRef]
  16. Tan, Y.J.; Waldhauser, F.; Ellsworth, W.L.; Zhang, M.; Zhu, W.; Michele, M.; Chiaraluce, L.; Beroza, G.C.; Segou, M. Machine-learning-based high-resolution earthquake catalog reveals how complex fault structures were activated during the 2016–2017 central Italy sequence. Seism. Rec. 2021, 1, 11–19. [Google Scholar] [CrossRef]
  17. Mousavi, S.M.; Ellsworth, W.L.; Zhu, W.; Chuang, L.Y.; Beroza, G.C. Earthquake transformer—An attentive deep-learning model for simultaneous earthquake detection and phase picking. Nat. Commun. 2020, 11, 3952. [Google Scholar] [CrossRef]
  18. Hsu, Y.F.; Zaliapin, I.; Ben-Zion, Y. Informative modes of seismicity in nearest-neighbor earthquake proximities. J. Geophys. Res. Solid Earth 2024, 129, e2023JB027826. [Google Scholar] [CrossRef]
  19. Bayliss, K.; Naylor, M.; Main, I.G. Probabilistic identification of earthquake clusters using rescaled nearest neighbour distance networks. Geophys. J. Int. 2019, 217, 487–503. [Google Scholar] [CrossRef]
  20. Saad, O.M.; Chen, Y.; Savvaidis, A.; Fomel, S.; Jiang, X.; Huang, D.; Oboue, Y.A.S.I.; Yong, S.; Wang, X.A.; Zhang, X.; et al. Earthquake forecasting using big data and artificial intelligence: A 30-week real-time case study in China. Bull. Seismol. Soc. Am. 2023, 113, 2461–2478. [Google Scholar] [CrossRef]
  21. Asim, K.M.; Martínez-Álvarez, F.; Basit, A.; Iqbal, T. Earthquake magnitude prediction in Hindukush region using machine learning techniques. Nat. Hazards 2017, 85, 471–486. [Google Scholar] [CrossRef]
  22. Corbi, F.; Sandri, L.; Bedford, J.; Funiciello, F.; Brizzi, S.; Rosenau, M.; Lallemand, S. Machine learning can predict the timing and size of analog earthquakes. Geophys. Res. Lett. 2019, 46, 1303–1311. [Google Scholar] [CrossRef]
  23. Hoque, A.; Raj, J.; Saha, A.; Bhattacharya, P. Earthquake magnitude prediction using machine learning technique. In Trends in Computational Intelligence, Security and Internet of Things: Third International Conference, ICCISIoT 2020, Tripura, India, 29–30 December 2020, Proceedings 3; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 37–53. [Google Scholar]
  24. Zhu, C.; Cotton, F.; Kawase, H.; Nakano, K. How well can we predict earthquake site response so far? Machine learning vs physics-based modeling. Earthq. Spectra 2023, 39, 478–504. [Google Scholar] [CrossRef]
  25. Xiong, P.; Tong, L.; Zhang, K.; Shen, X.; Battiston, R.; Ouzounov, D.; Iuppa, R.; Crookes, D.; Long, C.; Zhou, H. Towards advancing the earthquake forecasting by machine learning of satellite data. Sci. Total Environ. 2021, 771, 145256. [Google Scholar] [CrossRef]
  26. Bhatia, M.; Ahanger, T.A.; Manocha, A. Artificial intelligence based real-time earthquake prediction. Eng. Appl. Artif. Intell. 2023, 120, 105856. [Google Scholar] [CrossRef]
  27. Earthquake Early Warning System. Japan Meteorological Agency. Available online: https://www.jma.go.jp/jma/en/Activities/eew.html (accessed on 20 May 2025).
  28. Earthquake Early Warning System (Japan). Wikipedia. Available online: https://en.wikipedia.org/wiki/Earthquake_Early_Warning_(Japan) (accessed on 20 May 2025).
  29. ShakeAlert. US Geological Survey, Earthquake Hazards Program. Available online: https://earthquake.usgs.gov/data/shakealert/ (accessed on 20 May 2025).
  30. ShakeAlert. Earthquake Early Warning (EEW) System. Available online: https://www.shakealert.org/ (accessed on 20 May 2025).
  31. Beroza, G.C.; Segou, M.; Mousavi, M.S. Machine learning and earthquake forecasting—Next steps. Nat. Commun. 2021, 12, 4761. [Google Scholar] [CrossRef]
  32. Mousavi, S.M.; Beroza, G.C. Machine learning in earthquake seismology. Annu. Rev. Earth Planet. Sci. 2023, 51, 105–129. [Google Scholar] [CrossRef]
  33. O’Neill, M.; Ryan, C. Grammatical evolution. IEEE Trans. Evol. Comput. 2001, 5, 349–358. [Google Scholar] [CrossRef]
  34. Backus, J.W. The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference. In Proceedings of the International Conference on Information Processing, UNESCO, Paris, France, 15–20 June 1959; pp. 125–132. [Google Scholar]
  35. Ryan, C.; Collins, J.; O’Neill, M. Grammatical evolution: Evolving programs for an arbitrary language. In Genetic Programming; EuroGP 1998; Lecture Notes in Computer Science; Banzhaf, W., Poli, R., Schoenauer, M., Fogarty, T.C., Eds.; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1391. [Google Scholar]
  36. O’Neill, M.; Ryan, M.C. Evolving Multi-line Compilable C Programs. In Genetic Programming; EuroGP 1999; Lecture Notes in Computer Science; Poli, R., Nordin, P., Langdon, W.B., Fogarty, T.C., Eds.; Springer: Berlin/Heidelberg, Germany, 1999; Volume 1598. [Google Scholar]
  37. Puente, A.O.; Alfonso, R.S.; Moreno, M.A. Automatic composition of music by means of grammatical evolution. In Proceedings of the 2002 Conference on APL: Array Processing Languages: Lore, Problems, and Applications, APL ’02, Madrid, Spain, 22–25 July 2002; pp. 148–155. [Google Scholar]
  38. Galván-López, E.; Swafford, J.M.; O’Neill, M.; Brabazon, A. Evolving a Ms. PacMan Controller Using Grammatical Evolution. In Applications of Evolutionary Computation. EvoApplications 2010; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6024. [Google Scholar]
  39. Shaker, N.; Nicolau, M.; Yannakakis, G.N.; Togelius, J.; O’Neill, M. Evolving levels for Super Mario Bros using grammatical evolution. In Proceedings of the 2012 IEEE Conference on Computational Intelligence and Games (CIG), Granada, Spain, 11–14 September 2012; pp. 304–331. [Google Scholar]
  40. Martínez-Rodríguez, D.; Colmenar, J.M.; Hidalgo, J.I.; Villanueva Micó, R.J.; Salcedo-Sanz, S. Particle swarm grammatical evolution for energy demand estimation. Energy Sci. Eng. 2020, 8, 1068–1079. [Google Scholar] [CrossRef]
  41. Ryan, C.; Kshirsagar, M.; Vaidya, G.; Cunningham, A.; Sivaraman, R. Design of a cryptographically secure pseudo random number generator with grammatical evolution. Sci. Rep. 2022, 12, 8602. [Google Scholar] [CrossRef] [PubMed]
  42. Martín, C.; Quintana, D.; Isasi, P. Grammatical Evolution-based ensembles for algorithmic trading. Appl. Soft Comput. 2019, 84, 105713. [Google Scholar] [CrossRef]
  43. Bishop, C. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
  44. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  45. Lakkos, S.; Hadjiprocopis, A.; Compley, R.; Smith, P. A neural network scheme for earthquake prediction based on the Seismic Electric Signals. In Proceedings of the IEEE Conference on Neural Networks and Signal Processing, Ermioni, Greece, 6–8 September 1994; pp. 681–689. [Google Scholar]
  46. Negarestani, A.; Setayeshi, S.; Ghannadi-Maragheh, M.; Akashe, B. Layered neural networks based analysis of radon concentration and environmental parameters in earthquake prediction. J. Environ. Radioact. 2002, 62, 225–233. [Google Scholar] [CrossRef] [PubMed]
  47. Panakkat, A.; Adeli, H. Neural network models for earthquake magnitude prediction using multiple seismicity indicators. Int. J. Neural Syst. 2007, 17, 13–33. [Google Scholar] [CrossRef]
  48. Holland, J.H. Genetic algorithms. Sci. Am. 1992, 267, 66–73. [Google Scholar] [CrossRef]
  49. Sastry, K.; Goldberg, D.; Kendall, G. Genetic algorithms. In Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques; Springer: Berlin/Heidelberg, Germany, 2005; pp. 97–125. [Google Scholar]
  50. Tsoulos, I.G. Learning Functions and Classes Using Rules. AI 2022, 3, 751–763. [Google Scholar] [CrossRef]
  51. Tsoulos, I.G.; Gavrilis, D.; Glavas, E. Neural network construction and training using grammatical evolution. Neurocomputing 2008, 72, 269–277. [Google Scholar] [CrossRef]
  52. Papamokos, G.V.; Tsoulos, I.G.; Demetropoulos, I.N.; Glavas, E. Location of amide I mode of vibration in computed data utilizing constructed neural networks. Expert Syst. Appl. 2009, 36, 12210–12213. [Google Scholar] [CrossRef]
  53. Christou, V.; Tsoulos, I.G.; Loupas, V.; Tzallas, A.T.; Gogos, C.; Karvelis, P.S.; Antoniadis, N.; Glavas, E.; Giannakeas, N. Performance and early drop prediction for higher education students using machine learning. Expert Syst. Appl. 2023, 225, 120079. [Google Scholar] [CrossRef]
  54. Toki, E.I.; Pange, J.; Tatsis, G.; Plachouras, K.; Tsoulos, I.G. Utilizing Constructed Neural Networks for Autism Screening. Appl. Sci. 2024, 14, 3053. [Google Scholar] [CrossRef]
  55. Gavrilis, D.; Tsoulos, I.G.; Dermatas, E. Selecting and constructing features using grammatical evolution. Pattern Recognit. Lett. 2008, 29, 1358–1365. [Google Scholar] [CrossRef]
  56. Park, J.; Sandberg, I.W. Universal Approximation Using Radial-Basis-Function Networks. Neural Comput. 1991, 3, 246–257. [Google Scholar] [CrossRef]
  57. Yu, H.; Xie, T.; Paszczynski, S.; Wilamowski, B.M. Advantages of Radial Basis Function Networks for Dynamic System Design. IEEE Trans. Ind. Electron. 2011, 58, 5438–5450. [Google Scholar] [CrossRef]
  58. Tsoulos, I.G.; Tzallas, A.; Tsalikakis, D. NNC: A tool based on Grammatical Evolution for data classification and differential equation solving. SoftwareX 2019, 10, 100297. [Google Scholar] [CrossRef]
  59. Tsoulos, I.G. QFC: A Parallel Software Tool for Feature Construction, Based on Grammatical Evolution. Algorithms 2022, 15, 295. [Google Scholar] [CrossRef]
  60. Hall, M.; Frank, F.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I.H. The WEKA data mining software: An update. ACM Sigkdd Explor. Newsl. 2009, 11, 10–18. [Google Scholar] [CrossRef]
  61. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef] [PubMed]
  62. Ketkar, N.; Moolayil, J.; Ketkar, N.; Moolayil, J. Introduction to pytorch. In Deep Learning with Python: Learn Best Practices of Deep Learning Models with PyTorch; Apress: New York, NY, USA, 2021; pp. 27–91. [Google Scholar]
  63. Xue, H.; Yang, Q.; Chen, S. SVM: Support vector machines. In The Top Ten Algorithms in Data Mining; Chapman and Hall/CRC: Boca Raton, FL, USA, 2009; pp. 51–74. [Google Scholar]
  64. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 27. [Google Scholar] [CrossRef]
  65. Vora, K.; Yagnik, S. A survey on backpropagation algorithms for feedforward neural networks. Int. J. Eng. Dev. Res. 2014, 1, 193–197. [Google Scholar]
  66. Pajchrowski, T.; Zawirski, K.; Nowopolski, K. Neural speed controller trained online by means of modified RPROP algorithm. IEEE Trans. Ind. Inform. 2014, 11, 560–568. [Google Scholar] [CrossRef]
  67. Hermanto, R.P.S.; Nugroho, A. Waiting-time estimation in bank customer queues using RPROP neural networks. Procedia Comput. Sci. 2018, 135, 35–42. [Google Scholar] [CrossRef]
  68. Powell, M.J.D. A Tolerant Algorithm for Linearly Constrained Optimization Calculations. Math. Program. 1989, 45, 547–566. [Google Scholar] [CrossRef]
Figure 1. The production mechanism of the Grammatical Evolution procedure.
Figure 1. The production mechanism of the Grammatical Evolution procedure.
Algorithms 18 00405 g001
Figure 2. A full example of a BNF grammar that produces a simple expression.
Figure 2. A full example of a BNF grammar that produces a simple expression.
Algorithms 18 00405 g002
Figure 3. The BNF grammar used in the method that produces rules for classification and data fitting problems.
Figure 3. The BNF grammar used in the method that produces rules for classification and data fitting problems.
Algorithms 18 00405 g003
Figure 4. An example of the application of the one point crossover method in the Grammatical Evolution. The green color is used for the parts of the first chromosome and the blue color is used for the parts of the second chromosome.
Figure 4. An example of the application of the one point crossover method in the Grammatical Evolution. The green color is used for the parts of the first chromosome and the blue color is used for the parts of the second chromosome.
Algorithms 18 00405 g004
Figure 5. The proposed grammar for the construction of artificial neural networks through Grammatical Evolution.
Figure 5. The proposed grammar for the construction of artificial neural networks through Grammatical Evolution.
Algorithms 18 00405 g005
Figure 6. An example of a produced neural network.
Figure 6. An example of a produced neural network.
Algorithms 18 00405 g006
Figure 7. The grammar used in the Feature Construction method.
Figure 7. The grammar used in the Feature Construction method.
Algorithms 18 00405 g007
Figure 8. Error per point for each method used on the 2004 dataset, and different values of the critical parameter D c .
Figure 8. Error per point for each method used on the 2004 dataset, and different values of the critical parameter D c .
Algorithms 18 00405 g008
Figure 9. Error per point for each used method used on the 2010 dataset, and a variety of values of the critical parameter D c .
Figure 9. Error per point for each used method used on the 2010 dataset, and a variety of values of the critical parameter D c .
Algorithms 18 00405 g009
Figure 10. Error per point for the machine learning methods used on the 2012 dataset, and a series of values for the critical parameter D c .
Figure 10. Error per point for the machine learning methods used on the 2012 dataset, and a series of values for the critical parameter D c .
Algorithms 18 00405 g010
Figure 11. Method comparison using Friedman test.
Figure 11. Method comparison using Friedman test.
Algorithms 18 00405 g011
Figure 12. Comparison of machine learning models for every year included in the experiments.
Figure 12. Comparison of machine learning models for every year included in the experiments.
Algorithms 18 00405 g012
Figure 13. Regression error by distance and model.
Figure 13. Regression error by distance and model.
Algorithms 18 00405 g013
Table 1. Natural disasters from EM-DAT.
Table 1. Natural disasters from EM-DAT.
2010–2025DeathsInjured PeopleHomeless People
Floods82,644111,9547,552,142
Storms51,091125,2663,671,100
Wildfires162213,81094,925
Earthquakes337,372698,0851,547,581
Table 2. Classification of earthquakes.
Table 2. Classification of earthquakes.
200420102012
Earthquakes126,972294,432370,582
0–4.9 mag126,003292,387369,101
5–5.9 mag89319091374
6–6.9 mag6811792
7–7.9 mag71913
8–8.9 mag102
9–10 mag100
Table 3. The series of steps performed for the production of a valid expression.
Table 3. The series of steps performed for the production of a valid expression.
ExpressionChromosomeOperation
<expr>9,8,6,4,16,10,17,23,8,14 9 mod 3 = 0
(<expr><op><expr>)8,6,4,16,10,17,23,8,14 8 mod 3 = 2
(<terminal><op><expr>)6,4,16,10,17,23,8,14 6 mod 2 = 0
(<xlist><op><expr>)4,16,10,17,23,8,14 4 mod 3 = 1
(x2<op><expr>)16,10,17,23,8,14 16 mod 4 = 0
(x2+<expr>)10,17,23,8,14 10 mod 3 = 1
(x2+<func>(<expr>))17,23,8,14 17 mod 4 = 1
(x2+cos(<expr>))23,8,14 23 mod 2 = 2
(x2+cos(<terminal>))8,14 8 mod 2 = 0
(x2+cos(<xlist>))14 14 mod 3 = 2
(x2+cos(x3))
Table 4. The values used for the experimental parameters.
Table 4. The values used for the experimental parameters.
ParameterMeaningValue
N c Chromosomes500
N g Maximum number of generations200
p s Selection rate0.10
p m Mutation rate0.05
N f Number of created features2
HWeights10
Table 5. Experimental results using a series of machine learning methods.
Table 5. Experimental results using a series of machine learning methods.
YEAR D c LSTMSVMMLP (BP)MLP (RPROP)MLP (BFGS)RULENNCFC
2004100.250.290.440.240.820.160.160.17
2004250.230.290.420.240.980.170.170.17
2004500.240.290.430.240.870.170.170.16
20041000.230.280.350.220.690.160.160.16
20045000.240.260.450.270.650.160.160.16
2010100.240.300.360.240.740.190.170.19
2010250.240.300.370.210.490.190.180.17
2010500.230.300.390.240.600.180.170.18
20101000.240.300.310.270.400.190.180.19
20105000.250.290.400.320.510.190.180.18
2012100.210.280.330.220.450.180.170.19
2012250.240.280.330.240.750.170.170.16
2012500.220.270.360.230.210.170.170.16
20121000.230.260.380.210.570.170.170.16
20125000.220.250.350.290.850.170.170.16
AVERAGE 0.2340.2830.3780.2450.6390.1750.1700.171
Table 6. Experimental results for the year 2010 and D c = 10 using the neural construction technique. Random noise ranging from 0.001 to 0.05 was applied to the data. Bold notation is used to indicate the higher test error as measured on the corresponding test set.
Table 6. Experimental results for the year 2010 and D c = 10 using the neural construction technique. Random noise ranging from 0.001 to 0.05 was applied to the data. Bold notation is used to indicate the higher test error as measured on the corresponding test set.
Noise Percent
Feature0.1%0.05%1%2%5%
10.1760.2280.1750.1750.175
20.177 0.1750.1750.1720.175
30.1750.176 0.1750.1750.175
40.175 0.1750.1750.1750.175
50.175 0.1750.1750.1750.175
60.1750.176 0.1750.1750.175
70.1750.1750.1750.193 0.175
80.175 0.1750.1750.1750.175
90.175 0.1750.1750.1750.175
100.175 0.1750.1750.1750.175
110.1750.1750.1750.1750.176
120.1760.1750.1750.1750.176
130.1750.1750.1760.176 0.175
140.1760.1750.1760.176 0.175
150.1750.1750.176 0.1750.175
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kopitsa, C.; Tsoulos, I.G.; Charilogis, V. Predicting the Magnitude of Earthquakes Using Grammatical Evolution. Algorithms 2025, 18, 405. https://doi.org/10.3390/a18070405

AMA Style

Kopitsa C, Tsoulos IG, Charilogis V. Predicting the Magnitude of Earthquakes Using Grammatical Evolution. Algorithms. 2025; 18(7):405. https://doi.org/10.3390/a18070405

Chicago/Turabian Style

Kopitsa, Constantina, Ioannis G. Tsoulos, and Vasileios Charilogis. 2025. "Predicting the Magnitude of Earthquakes Using Grammatical Evolution" Algorithms 18, no. 7: 405. https://doi.org/10.3390/a18070405

APA Style

Kopitsa, C., Tsoulos, I. G., & Charilogis, V. (2025). Predicting the Magnitude of Earthquakes Using Grammatical Evolution. Algorithms, 18(7), 405. https://doi.org/10.3390/a18070405

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop