1. Introduction
The influence of solar activity on various processes on Earth has long been the subject of close study, which gave rise to the term “space weather”. The review [1] considers various aspects of the influence of solar activity on the Earth’s climate and anthropogenic processes. The ionosphere, as an important component of space weather, was studied in [2,3,4]. Results on the development of statistical methods for predicting strong solar flares, including methods based on machine learning, are presented in [5,6,7,8]. An important issue in the study of space weather is its impact on catastrophic events in the life of society, such as earthquakes. In particular, no methods have yet been developed that unambiguously answer the question of whether strong solar flares and other electromagnetic events in the ionosphere have a trigger effect on the occurrence of sufficiently strong earthquakes. A lot of research is devoted to this issue [9]. The papers [10,11] present the results of an analysis of correlations between the 11- and 12-year cycles of solar activity and time intervals of increasing intensity of seismic events over long periods of time. A comparison of time intervals of high seismic activity with the phases of solar cycles since 1900 is carried out in [12].
The delay of strong earthquakes relative to the time intervals of geomagnetic storm maxima was considered in [13,14]. In [15], a similar delay of seismic events was studied for sunspot numbers. The hypothesis that time anomalies of atmospheric electric fields precede strong earthquakes, including deep-focus ones, as a result of processes in the source of an impending seismic event was studied in [16]. A similar question about atmospheric and ionospheric electromagnetic signals recorded by spacecraft and preceding moderate seismic events was considered in [17]. In [18,19], the difference between the global seismic process and a Poisson process after excluding aftershocks is explained by the piezoelectric effect in rocks caused by the impact of the proton flux during solar activity, which has a periodic time structure in addition to the 11–12-year solar cycle. In [20,21,22], the hypothesis is investigated that disturbances of ionospheric electromagnetic fields from solar flares generate telluric currents in the Earth’s crust and, as a consequence, have a trigger effect on the foci of future seismic events. A statistical analysis of the impact of the 50 largest solar flares in the time interval 1997–2024 on global seismic activity was performed in [23]; it revealed an increase in seismic activity within 10 days after a flare compared to the 10 days before it. The paper [24] provides an overview of the work carried out in Russia in 1995–2020 on the influence of artificial and natural electromagnetic impacts on seismicity and discusses possible ways of using electromagnetic effects on seismicity to reduce seismic hazard. In [25], a neural network was used to classify seismic events with a magnitude of at least 6 according to whether they occur as a result of the impact of a proton flux.
The complex dynamics of both the Sun and solar-terrestrial relations requires a set of modern data processing methods based on nonlinear models for the analysis of time series describing interacting systems [26]. In [27,28], the internal dynamics of solar cycles were studied using empirical orthogonal oscillation modes, estimates of their maximum Lyapunov exponents, and entropy flows between the values of various parameters of processes inside the Sun. In [29,30], various estimates of the Hurst exponent and entropy measures were applied to data from the Swarm satellite network of the European Space Agency to describe the most intense magnetic storms and to quantitatively study the complexity of processes in the upper ionosphere. In [31], the structure of currents induced by geomagnetic storms, which lead to accidents in electrical networks, was studied by applying information theory and various entropy measures to their time series.
In this paper, a new method is proposed for obtaining a quantitative estimate of the influence of various statistics of the proton flux density time series measured by the Solar and Heliospheric Observatory (SOHO) [32] on the sequence of earthquakes with a magnitude of at least 6.5. The method is based on estimates of “advance measures” derived from a parametric model of the intensities of interacting point processes, on the calculation of wavelet-based measures of spectral slope and entropy, and on an estimate of the support width of the multifractal singularity spectrum of the proton flux density time series.
4. Periodic Components of the Earthquake Sequence
Of interest is the question of whether the 27-day periodicity of the proton flux shown in Figure 2 is reflected in the periodicity of the intensity of seismic events. To answer it, this periodicity must be estimated, taking into account that the sequence of earthquakes is not a time series with a constant time step, to which classical spectral estimation methods [35] could be applied. Below, the method proposed in [36] is used to estimate the periodic components of the intensity of the sequence of events. In [37], this method was used to calculate the periodic component of the stepwise variations in the time series of displacements of the Earth’s surface measured by GPS.
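Let $t_i$, $i = 1, \ldots, N$, be the times of the sequence of events observed on the interval $[0, T]$. Consider the following intensity model containing a periodic component:
$$\lambda(t) = \mu \left(1 + a \cos(\omega t + \varphi)\right), \tag{1}$$
where the frequency $\omega$, the amplitude $a$, $0 \le a \le 1$, the phase angle $\varphi \in [0, 2\pi]$, and the multiplier $\mu \ge 0$ (describing the Poisson part of the intensity) are parameters of the model. Thus, the Poisson part of the intensity is modulated by a harmonic oscillation. Let us fix some value of the frequency $\omega$. The logarithmic likelihood function [38] in this case for a series of observed events is equal to
$$\ln L(\mu, a, \varphi \mid \omega) = \sum_{i=1}^{N} \ln \lambda(t_i) - \int_0^T \lambda(s)\, ds = N \ln \mu + \sum_{i=1}^{N} \ln\left(1 + a \cos(\omega t_i + \varphi)\right) - \mu T - \frac{\mu a}{\omega}\left[\sin(\omega T + \varphi) - \sin \varphi\right]. \tag{2}$$
Taking the maximum of expression (2) with respect to the parameter $\mu$, it is easy to find that
$$\hat{\mu}(a, \varphi \mid \omega) = \frac{N}{T + a\left[\sin(\omega T + \varphi) - \sin \varphi\right]/\omega}. \tag{3}$$
It should be noted that the expression $\hat{\mu}_0 = N/T$, obtained from (3) for $a = 0$, is an estimate of the intensity of the process under the condition that it is a homogeneous Poisson (purely random) process. Thus, the increment of the logarithmic likelihood function due to the consideration of a richer intensity model with a harmonic component with a given frequency $\omega$, compared with a purely random flow of events, is equal to
$$\Delta \ln L(a, \varphi \mid \omega) = \sum_{i=1}^{N} \ln\left(1 + a \cos(\omega t_i + \varphi)\right) + N \ln\left(\frac{\omega T}{\omega T + a\left[\sin(\omega T + \varphi) - \sin \varphi\right]}\right). \tag{4}$$
The statistic used below is the maximum of this increment over the amplitude and the phase:
$$R(\omega) = \max_{a,\, \varphi} \Delta \ln L(a, \varphi \mid \omega), \quad 0 \le a \le 1, \quad \varphi \in [0, 2\pi]. \tag{5}$$
An important issue when applying this method to real data is determining the statistical significance of the obtained peak values of statistic (5). Let us consider two hypotheses for the same data set $x_1, \ldots, x_n$ consisting of independent observations:
- (1) $x_i$ are distributed with density $p_0(x \mid \theta_0)$ (hypothesis $H_0$);
- (2) $x_i$ are distributed with density $p_1(x \mid \theta_1)$ (hypothesis $H_1$).
Here, $\theta_0$ and $\theta_1$ are vectors of unknown parameters with dimensions $m_0$ and $m_1$, and the hypothesis $H_1$ is more “rich”: $m_1 > m_0$, and the vector of parameters $\theta_1$ completely includes the components of the vector $\theta_0$. Let us consider the difference between the logarithms of the likelihood for these two hypotheses, provided that the vectors of parameters are replaced by their maximum likelihood estimates:
$$\Delta = \ln L_1(\hat{\theta}_1) - \ln L_0(\hat{\theta}_0). \tag{6}$$
It is evident that $\Delta \ge 0$. According to Wilks’ theorem [39], if the hypothesis $H_0$ is true, the doubled quantity (6) has the asymptotic distribution
$$2\Delta \sim \chi^2(m_1 - m_0). \tag{7}$$
In our case, $m_1 - m_0 = 2$ (the amplitude and the phase are the additional parameters), and therefore the doubled value of statistic (5) has the asymptotic distribution $\chi^2(2)$, whose density is equal to $\frac{1}{2} e^{-x/2}$, and the value (5) itself is distributed asymptotically as
$$\Pr\{R(\omega) < x\} = 1 - e^{-x}, \quad x \ge 0, \tag{8}$$
provided that the analyzed sequence of time moments is distributed according to the Poisson law with constant intensity. Expression (8) allows us to set thresholds for statistic (5): only when they are exceeded can it be asserted, with a given probability, that the sequence of time moments differs from a Poisson sequence.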
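The likelihood-increment statistic and its asymptotic significance described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the harmonic intensity model μ(1 + a·cos(ωt + φ)), replaces the maximisation over amplitude and phase with a coarse grid search, and all function names (`loglik_increment`, `periodic_statistic`, `prob_non_poisson`, `simulate_periodic_events`) are hypothetical. The observation interval is assumed to be at least one trial period long.

```python
import math
import random


def loglik_increment(times, T, omega, a, phi):
    """Increment of the log-likelihood of the periodic model over a
    homogeneous Poisson model, with the multiplier mu already maximised out."""
    n = len(times)
    gain = sum(math.log(1.0 + a * math.cos(omega * t + phi)) for t in times)
    drift = a * (math.sin(omega * T + phi) - math.sin(phi))
    return gain + n * math.log(omega * T / (omega * T + drift))


def periodic_statistic(times, T, period, n_amp=20, n_phase=24):
    """Grid-search maximum of the increment over amplitude and phase.
    The amplitude grid stops below 1 so that log(1 + a*cos) stays finite."""
    omega = 2.0 * math.pi / period
    best = 0.0  # a = 0 reproduces the Poisson model, i.e. zero increment
    for i in range(1, n_amp):
        a = i / n_amp
        for k in range(n_phase):
            phi = 2.0 * math.pi * k / n_phase
            best = max(best, loglik_increment(times, T, omega, a, phi))
    return best


def prob_non_poisson(r):
    """Asymptotic probability 1 - exp(-r) that a peak of height r is not
    produced by a homogeneous Poisson sequence."""
    return 1.0 - math.exp(-r)


def simulate_periodic_events(T, mu, a, period, seed=0):
    """Hypothetical test data: thinning of a homogeneous Poisson flow."""
    rng = random.Random(seed)
    omega = 2.0 * math.pi / period
    t, events = 0.0, []
    while True:
        t += rng.expovariate(mu * (1.0 + a))
        if t >= T:
            return events
        if rng.random() * (1.0 + a) < 1.0 + a * math.cos(omega * t):
            events.append(t)


# 200 trial periods from 10 to 100 days, uniform on a logarithmic scale
periods = [10.0 ** (1.0 + j / 199.0) for j in range(200)]
```

Calling `periodic_statistic(events, 730.0, p)` for every `p` in `periods` gives one column of a time–frequency map for a single 730-day window; a finer grid or a numerical optimiser would be needed for accurate peak heights.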
For the time sequence of seismic events with a magnitude of at least 6.5 (Figure 2b), we calculated the increments of the logarithmic likelihood function (5) in a sliding time window of 730 days (2 years) with a shift of 30 days for 200 frequency values $\omega$ corresponding to periods varying from 10 to 100 days with a uniform step on a logarithmic scale. The resulting time–frequency dependence, similar to the usual spectral time–frequency diagram in Figure 3a, is shown in Figure 4a. Figure 4b shows the average of the increments of the logarithmic likelihood function (5) over all time windows and highlights five “spectral” peaks exceeding the level of 2, with periods of 14.84, 23.89, 26.64, 50.5, and 74.9 days. For them, the peak values of the average increments were 2.08, 2.14, 2.15, 2.09, and 2.17, respectively. Using the asymptotic Formula (8), we obtained the following probabilities of difference between the periodic components of the seismic regime with these periods and a purely random Poisson process: 0.875, 0.882, 0.883, 0.863, and 0.886.
Thus, it can be stated with a probability of not less than 0.883 that the sequence of times of earthquakes with magnitudes not lower than 6.5 contains a periodic component with a period of 26.64 days, close to the period of the Sun’s rotation. This fact confirms the hypothesis about the influence of solar activity on the Earth’s seismicity.
5. Method for Assessing the Measure of Mutual Advance of Two Streams of Events
In what follows, we are interested in whether the times of the largest local maxima of the average values of the proton flux density (Figure 2a) are in advance of the times of earthquakes (Figure 2b). Answering this question also requires an assessment of the “reverse” advance and the calculation of their difference. If the average value of this difference is positive, then there is a trigger effect of the proton flux on seismicity. In addition, the value of the average difference of the advance measures gives a measure of the trigger effect.
To clarify this issue, we applied the influence matrix method, which was proposed in [40] to assess the degree of influence of earthquake sequences on each other in several seismically active regions. In its original implementation, this method is multidimensional. Below, however, it is simplified and modified for the practically important situation of two time sequences. This modification was previously used in [41,42,43,44,45] to analyze the relationships between seismic event times and the times of local extrema of various microseismic background statistics, magnetic field fluctuations, ground tremor, and meteorological time series properties.
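Let $t_j^{(\alpha)}$, $j = 1, \ldots, N_\alpha$, $\alpha = 1, 2$, represent the moments of time of two sequences of events. In our case, these are
- (1) a sequence of time moments corresponding to the largest local maxima of the average values of the proton flux;
- (2) a sequence of times of seismic events with a magnitude of at least 6.5.
Let us represent their intensities as follows:
$$\lambda^{(\alpha)}(t) = b_0^{(\alpha)} + \sum_{\beta=1}^{2} b_\beta^{(\alpha)} g^{(\beta)}(t), \tag{9}$$
where $b_0^{(\alpha)} \ge 0$ and $b_\beta^{(\alpha)} \ge 0$ are parameters, and $g^{(\beta)}(t)$ is the function of influence of the time moments $t_j^{(\beta)}$ of the sequence with number $\beta$:
$$g^{(\beta)}(t) = \sum_{t_j^{(\beta)} < t} \exp\left(-\frac{t - t_j^{(\beta)}}{\tau}\right). \tag{10}$$
According to Formula (10), the weight of the event with number $j$ becomes non-zero for times $t > t_j^{(\beta)}$ and decays with characteristic time $\tau$. The parameter $b_\beta^{(\alpha)}$, $\beta \ne \alpha$, determines the degree of influence of the flow $\beta$ on the flow $\alpha$. The parameter $b_\alpha^{(\alpha)}$ determines the degree of influence of the flow $\alpha$ on itself (self-excitation), and the parameter $b_0^{(\alpha)}$ reflects a purely random (Poisson) component of the intensity. Let us fix the parameter $\tau$ and consider the problem of determining the parameters $b_0^{(\alpha)}$, $b_\beta^{(\alpha)}$.
The log-likelihood function for a non-stationary Poisson process over the time interval $[0, T]$ is equal to [38]:
$$\ln L_\alpha = \sum_{j=1}^{N_\alpha} \ln \lambda^{(\alpha)}(t_j^{(\alpha)}) - \int_0^T \lambda^{(\alpha)}(s)\, ds, \quad \alpha = 1, 2. \tag{11}$$
It is necessary to find the maximum of functions (11) with respect to the parameters $b_0^{(\alpha)}$, $b_\beta^{(\alpha)}$. Taking into account Formula (11), we can write the derivatives of the logarithmic likelihood function with respect to the parameters:
$$\frac{\partial \ln L_\alpha}{\partial b_0^{(\alpha)}} = \sum_{j=1}^{N_\alpha} \frac{1}{\lambda^{(\alpha)}(t_j^{(\alpha)})} - T, \quad \frac{\partial \ln L_\alpha}{\partial b_\beta^{(\alpha)}} = \sum_{j=1}^{N_\alpha} \frac{g^{(\beta)}(t_j^{(\alpha)})}{\lambda^{(\alpha)}(t_j^{(\alpha)})} - \int_0^T g^{(\beta)}(s)\, ds. \tag{12}$$
From this and from Formula (9), it follows that
$$b_0^{(\alpha)} \frac{\partial \ln L_\alpha}{\partial b_0^{(\alpha)}} + \sum_{\beta=1}^{2} b_\beta^{(\alpha)} \frac{\partial \ln L_\alpha}{\partial b_\beta^{(\alpha)}} = \sum_{j=1}^{N_\alpha} \frac{\lambda^{(\alpha)}(t_j^{(\alpha)})}{\lambda^{(\alpha)}(t_j^{(\alpha)})} - \int_0^T \lambda^{(\alpha)}(s)\, ds = N_\alpha - \int_0^T \lambda^{(\alpha)}(s)\, ds. \tag{13}$$
Since the parameters $b_0^{(\alpha)}$, $b_\beta^{(\alpha)}$ must be non-negative, each term in the leftmost part of this formula is equal to zero at the point of maximum of function (11): either due to the necessary conditions of the extremum (if the parameter is positive) or, if the maximum is reached at the boundary, because the parameter itself is equal to zero. Consequently, at the point of maximum of the likelihood function, the following equality is satisfied:
$$\int_0^T \lambda^{(\alpha)}(s)\, ds = N_\alpha. \tag{14}$$
Let us substitute the expression for $\lambda^{(\alpha)}(t)$ from (9) into (14) and divide by $T$. Then, we get another form of Formula (14):
$$b_0^{(\alpha)} + \sum_{\beta=1}^{2} b_\beta^{(\alpha)} \bar{g}^{(\beta)} = \mu_0^{(\alpha)} \equiv \frac{N_\alpha}{T}, \tag{15}$$
where
$$\bar{g}^{(\beta)} = \frac{1}{T} \int_0^T g^{(\beta)}(s)\, ds \tag{16}$$
is the mean value of the influence function. Substituting $b_0^{(\alpha)}$ from (15) into (11), we obtain the following maximum problem:
$$\Phi^{(\alpha)}\left(b_1^{(\alpha)}, b_2^{(\alpha)}\right) = \sum_{j=1}^{N_\alpha} \ln\left(\mu_0^{(\alpha)} + \sum_{\beta=1}^{2} b_\beta^{(\alpha)} \Delta g^{(\beta)}(t_j^{(\alpha)})\right) \to \max. \tag{17}$$
Here, $\Delta g^{(\beta)}(t) = g^{(\beta)}(t) - \bar{g}^{(\beta)}$, under restrictions:
$$b_1^{(\alpha)} \ge 0, \quad b_2^{(\alpha)} \ge 0, \quad \sum_{\beta=1}^{2} b_\beta^{(\alpha)} \bar{g}^{(\beta)} \le \mu_0^{(\alpha)}. \tag{18}$$
Function (17) is concave, with a negative definite Hessian [40], and, therefore, problem (17)–(18) has a unique solution. Having solved problem (17)–(18) numerically for a given $\tau$, we can introduce the elements of the influence matrix $\kappa_\beta^{(\alpha)}$, $\alpha = 1, 2$, $\beta = 0, 1, 2$, according to the formulas:
$$\kappa_0^{(\alpha)} = \frac{b_0^{(\alpha)}}{\mu_0^{(\alpha)}}, \quad \kappa_\beta^{(\alpha)} = \frac{b_\beta^{(\alpha)} \bar{g}^{(\beta)}}{\mu_0^{(\alpha)}}, \quad \beta = 1, 2. \tag{19}$$
The quantity $\kappa_0^{(\alpha)}$ is the share of the average intensity $\mu_0^{(\alpha)}$ of the process with number $\alpha$ that is purely stochastic, the part $\kappa_\alpha^{(\alpha)}$ is caused by the influence of self-excitation $\alpha \to \alpha$, and $\kappa_\beta^{(\alpha)}$, $\beta \ne \alpha$, is determined by the external influence $\beta \to \alpha$. From Formula (15) follows the normalization condition:
$$\kappa_0^{(\alpha)} + \sum_{\beta=1}^{2} \kappa_\beta^{(\alpha)} = 1, \quad \alpha = 1, 2. \tag{20}$$
As a result, we can determine the influence matrix:
$$\begin{pmatrix} \kappa_0^{(1)} & \kappa_1^{(1)} & \kappa_2^{(1)} \\ \kappa_0^{(2)} & \kappa_1^{(2)} & \kappa_2^{(2)} \end{pmatrix}. \tag{21}$$
The first column of matrix (21) is composed of the Poisson shares of the mean intensities. The diagonal elements of the right submatrix of size 2 × 2 are the self-excited shares of the mean intensity, while the off-diagonal elements correspond to mutual excitation. The sum of the components in each row of the influence matrix (21) is equal to 1. The influence matrices are estimated in a sliding time window of a given length and offset, for a given value of the attenuation parameter $\tau$.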
When analyzing variations of the components of influence matrices in sliding time windows corresponding to the mutual influence of the analyzed time sequences, the main attention is paid to their local maxima with their subsequent averaging. Let be the number of windows lengths within limits from up to . Thus, the sequence of windows lengths is , where . Each time window of the length is shifted along time axis with mutual shift . Let be the sequence of time moments corresponding to right ends of time windows of the length . The number of time moments is defined by mutual shift of time windows of the length . Let and be elements and of the matrix (21), corresponding to mutual influences and of analyzed time moments for current position of time window of the length . Let , be local maxima of , i.e., .
Let us take some “small” time interval of the length and for the sequence of time moments , of such time fragments, where we will calculate the mean values and of for which their time marks belong to these fragments. Averaging is performed over all time window lengths . These mean values in dependence on the right end of intervals gives the measures of the averaged effects of the advance of second sequence of time moments with respect to the first one and vice versa. Our main purpose is calculating the difference . In this formula, the first sequence is the sequence of time moments of earthquakes with a magnitude not less than 6.5, whereas the second sequence is time moments of the largest local maxima of the mean proton flux time series. Thus, if average is positive, it means that there is a trigger effect.
The full set of parameters of the method is the following: , , , , , . In our calculations, we used year (approximately 18 days), year, year, , day, year. The calculation results are most sensitive to the choice of parameters , , . The values used were chosen as a result of trial calculations and selection of the best options.
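The estimation of the shares of the influence matrix for one pair of event sequences in one time window can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the intensity is taken as a constant Poisson term plus exponentially decaying contributions of past events, and the constrained likelihood maximisation is replaced by an EM-type multiplicative update, which keeps the parameters non-negative and makes the shares sum to one. The names `influence_function` and `influence_shares` are hypothetical.

```python
import math


def influence_function(events, t, tau):
    """Sum of exp(-(t - t_j)/tau) over the events t_j that precede t."""
    return sum(math.exp(-(t - tj) / tau) for tj in events if tj < t)


def influence_shares(target, sources, T, tau, n_iter=100):
    """Shares [kappa_0, kappa_1, ..., kappa_m] of the mean intensity of `target`.

    kappa_0 is the Poisson share; kappa_k is the share attributed to
    sources[k - 1]. `sources` is a list of non-empty event-time lists on
    [0, T]; include `target` itself to model self-excitation.
    """
    n = len(target)
    # basis functions evaluated at the target events (index 0: constant term)
    basis = [[1.0] + [influence_function(src, t, tau) for src in sources]
             for t in target]
    # integrals of the basis functions over [0, T]
    integrals = [T] + [sum(tau * (1.0 - math.exp(-(T - tj) / tau)) for tj in src)
                       for src in sources]
    b = [n / (len(integrals) * w) for w in integrals]  # positive starting point
    for _ in range(n_iter):
        acc = [0.0] * len(b)
        for row in basis:
            lam = sum(bk * gk for bk, gk in zip(b, row))  # intensity at the event
            for k, gk in enumerate(row):
                acc[k] += b[k] * gk / lam  # share of this event credited to term k
        b = [acc[k] / integrals[k] for k in range(len(b))]
    return [b[k] * integrals[k] / n for k in range(len(b))]
```

For two sequences, `influence_shares(seq1, [seq1, seq2], T, tau)` and `influence_shares(seq2, [seq1, seq2], T, tau)` give the two rows of the influence matrix; the difference between the two off-diagonal shares is then the quantity of interest for the advance measures.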
7. Proton Flux Density Time Series Statistics
The average proton flux density values used above are the simplest statistics. It is natural to try other statistics of the proton flux time series and to estimate the relationship between the times of their “most expressive” extreme values (i.e., the largest local maxima or the smallest local minima) and the times of earthquakes using the above model of influence matrices. In addition to the simple average values, we used five different proton flux time series statistics described below. These statistics were estimated in the same time windows of 1440 5-min samples (5 days), taken with an offset of 288 samples (1 day), that were used before when calculating the average values.
(1) The kurtosis $\kappa$ of a time series $x(t)$ is calculated in each time window using the following formula [45]:
$$\kappa = \frac{\langle (\Delta x)^4 \rangle}{\langle (\Delta x)^2 \rangle^2}, \quad \Delta x = x - \langle x \rangle. \tag{22}$$
Here, the angle brackets denote the operation of calculating the mean value. The value $\kappa$ can be considered as a measure of the difference from the Gaussian distribution, for which $\kappa = 3$. Below, we use the logarithm of the kurtosis coefficient.
(2) The minimum wavelet-based normalized entropy $En$ of a time series $x(t)$ is calculated based on the decomposition of the time series within a window into orthogonal wavelets:
$$En = -\frac{1}{\ln N} \sum_{k=1}^{N} p_k \ln p_k, \quad p_k = \frac{c_k^2}{\sum_{j=1}^{N} c_j^2}. \tag{23}$$
In Formula (23), $c_k$, $k = 1, \ldots, N$, are the wavelet coefficients of the signal $x(t)$, and $N$ is the total number of wavelet coefficients. Seventeen orthogonal Daubechies wavelets were used: 10 ordinary bases with minimum support and with the number of vanishing moments from 1 to 10, and 7 so-called Daubechies symlets [46] with the number of vanishing moments from 4 to 10. For each of the bases, the entropy (23) of the distribution of the squares of the wavelet coefficients was calculated, and then, by enumeration, the optimal basis was found that realized the minimum value in each time window. By construction, $0 \le En \le 1$. The details of calculating the entropy (23) in a sliding time window are described in [47].
(3) Wavelet-based spectral slope $\beta$. After determining the optimal orthogonal wavelet from the minimum entropy condition, it is possible to calculate the average values $S_k$ of the squares of the wavelet coefficients at each detail level. The value $S_k$ is the part of the oscillation energy corresponding to the detail level with the number $k$, which corresponds to the frequency band with the boundary frequencies $1/(2^{k+1} \Delta s)$ and $1/(2^{k} \Delta s)$, where $\Delta s$ is the length of the sampling time interval (in our case $\Delta s$ = 5 min) [46]. Let us consider the values of the periods corresponding to the centers of these frequency bands:
$$T_k = \frac{2}{1/(2^{k+1} \Delta s) + 1/(2^{k} \Delta s)} = \frac{2^{k+2} \Delta s}{3}. \tag{24}$$
The quantities $S_k$ are similar to the Fourier power spectra. These quantities are convenient to use when calculating the slope of the graph of the logarithm of the power spectrum as a function of the logarithm of the period. The spectral slope in each time window is found by the least squares method:
$$\sum_{k} \left(\ln S_k - \beta \ln T_k - c\right)^2 \to \min_{\beta,\, c}. \tag{25}$$
(4) The Donoho–Johnstone wavelet-based index (DJ-index) $\gamma$ is defined as the ratio of the number of wavelet coefficients that are “large” in absolute value to their total number. By definition, $0 \le \gamma \le 1$. The threshold separating the “large” wavelet coefficients is $\sigma \sqrt{2 \ln N}$, where $N$ is the total number of wavelet coefficients. This threshold separates the informative wavelet coefficients from the other coefficients, which are considered noisy [46,48]. The value $\sigma$ is an estimate of the standard deviation of the noise under the assumption that the noise is most concentrated at the first detail level of the orthogonal wavelet decomposition. To estimate $\sigma$, the median estimate of the standard deviation of a normal random variable is used:
$$\sigma = \frac{\mathrm{med}\left\{ \left| c_k^{(1)} \right| \right\}}{0.6745}, \tag{26}$$
where $c_k^{(1)}$ are the wavelet coefficients at the first detail level.
(5) The multifractal singularity spectrum support width $\Delta\alpha$ is an important characteristic of the signal and is considered a measure of the diversity (complexity) of its stochastic behavior. It is defined as $\Delta\alpha = \alpha_{\max} - \alpha_{\min}$, where $\alpha_{\min}$ and $\alpha_{\max}$ are estimates of the minimum and maximum values of the Hölder–Lipschitz exponent [49] $h(t)$, which governs the behavior of the signal in the vicinity of the time moment $t$: $|x(t + \delta) - x(t)| \sim |\delta|^{h(t)}$ as $\delta \to 0$. For a mono-fractal signal, the Hölder–Lipschitz exponent is the same for all time moments $t$. Otherwise, the signal is multi-fractal, and the concept of the spectrum of singularities $F(\alpha)$ is introduced, equal to the fractal dimension of the set of time moments with the same value of the Hölder–Lipschitz exponent, equal to $\alpha$ [50]. To estimate $\Delta\alpha$ in each time window, we used the method of fluctuation analysis after removing scale-dependent trends [51]. The implementation of the method used is described in detail in [47]. To remove local polynomial trends for the proton flux density time series, we used zero-order polynomials, i.e., we analyzed fluctuations after removing local means.
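Four of the five statistics listed above are simple enough to sketch in a few lines of Python. The sketch below is illustrative only and rests on several assumptions: it uses a single orthonormal Haar basis (the Daubechies wavelet with one vanishing moment) instead of the search over 17 bases, it requires the window length to be a power of two, and the function names are hypothetical. The multifractal support width is not covered.

```python
import math
import statistics


def kurtosis(x):
    """Ratio of the fourth central moment to the squared variance."""
    m = sum(x) / len(x)
    d2 = sum((v - m) ** 2 for v in x) / len(x)
    d4 = sum((v - m) ** 4 for v in x) / len(x)
    return d4 / (d2 * d2)


def haar_details(x):
    """Orthonormal Haar detail coefficients, level 1 (finest) first.
    Assumes len(x) is a power of two."""
    levels, approx = [], list(x)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        levels.append([(p - q) / math.sqrt(2.0) for p, q in pairs])
        approx = [(p + q) / math.sqrt(2.0) for p, q in pairs]
    return levels


def normalized_entropy(coeffs):
    """Entropy of the distribution of squared coefficients, divided by ln N."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    probs = [e / total for e in energy]
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(coeffs))


def dj_index(levels):
    """Share of coefficients above the threshold sigma * sqrt(2 ln N), with
    sigma estimated from the median of |coefficients| at the first level."""
    coeffs = [c for level in levels for c in level]
    sigma = statistics.median(abs(c) for c in levels[0]) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(len(coeffs)))
    return sum(1 for c in coeffs if abs(c) > threshold) / len(coeffs)


def spectral_slope(periods, powers):
    """Least-squares slope of ln(power) against ln(period)."""
    xs = [math.log(t) for t in periods]
    ys = [math.log(s) for s in powers]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)
```

For a window `w`, `levels = haar_details(w)` feeds both `dj_index(levels)` and, after flattening, `normalized_entropy`; the mean squared coefficient of each level, paired with the centre period of its band, feeds `spectral_slope`.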
For a sequence of time intervals of 5 days, taken with a shift of 1 day, we calculated the values of all five statistics of the proton flux density time series. The results of these calculations are presented in the graphs in Figure 6.
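The windowing scheme used for these calculations (windows of 1440 five-minute samples shifted by 288 samples) can be written as a small generic helper. The function name `sliding_statistic` and its interface are assumptions made for illustration; any of the statistics can be passed in as `statistic`.

```python
def sliding_statistic(series, statistic, window=1440, step=288):
    """Apply `statistic` to successive windows of `series`.

    Returns (index of the right end of the window, value) pairs, so that each
    value is time-stamped by the right end of its window.
    """
    result = []
    for start in range(0, len(series) - window + 1, step):
        chunk = series[start:start + window]
        result.append((start + window - 1, statistic(chunk)))
    return result
```

With the default arguments, one value is produced per day of data once the first 5 days are available.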
From the graphs of the power spectra of the time series of changes in statistics, it is evident that for all of them, with the exception of , there is a periodicity of 89 days, which is especially pronounced for and . It should be noted that the power spectrum of the change in the average values of the proton flux density does not contain a spectral component with a period of 89 days.
Let us consider in more detail the time–frequency structure of the variations in the singularity spectrum support width
(
Figure 6(a5)), for which the 89-day periodicity is most clearly visible. Let us denote
the dependence of the singularity spectrum carrier width on time (the position of the right end of the 5-day time window with a 1-day offset)
and calculate the Morlet wavelet transform [
46]:
The values
can be interpreted as the energy of signal
oscillations in the vicinity of a time point
with a period
.
Figure 7a shows the Morlet time-frequency diagram of values
for 200 values of periods
varying within the range from 10 to 500 days with a uniform step on a logarithmic scale. For the frequency band with periods from 63 to 158 days (logarithms of periods from 1.8 to 2.2), in which the most intense periodic variations of
with a central period of 89 days are concentrated, we calculated the maximum values
. These maximum values are shown in
Figure 7b by a black line. The red line in
Figure 7b shows the cyclic trend for the maximum values of the logarithms of the Morlet wavelet coefficients in the frequency band highlighted above. The period of this oscillation was determined numerically from the condition of minimum variance of deviations for trial cyclic trends with periods in the range from 1500 to 5500 days. As a result of such calculations, it turned out that the optimal period is equal to 4429 days or approximately 12.13 years, that is, very close to the period of solar cycles.
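The search for the period of the cyclic trend can be illustrated by the following Python sketch. It assumes that a trial cyclic trend is a constant plus a single harmonic fitted by least squares, which is one plausible reading of the procedure rather than a confirmed detail, and it selects the trial period with the smallest residual variance. The names `solve_linear`, `harmonic_residual_variance`, and `best_cycle_period` are hypothetical.

```python
import math


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def harmonic_residual_variance(t, y, period):
    """Variance of y after removing the least-squares fit
    c0 + c1*cos(2*pi*t/period) + c2*sin(2*pi*t/period)."""
    w = 2.0 * math.pi / period
    cols = [[1.0] * len(t),
            [math.cos(w * ti) for ti in t],
            [math.sin(w * ti) for ti in t]]
    A = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    coef = solve_linear(A, rhs)
    resid = [yi - sum(c * col[i] for c, col in zip(coef, cols))
             for i, yi in enumerate(y)]
    return sum(r * r for r in resid) / len(resid)


def best_cycle_period(t, y, trial_periods):
    """Trial period whose harmonic trend leaves the smallest residual variance."""
    return min(trial_periods, key=lambda p: harmonic_residual_variance(t, y, p))
```

With daily time stamps `t` and the band maxima `y`, `best_cycle_period(t, y, range(1500, 5501))` scans trial periods from 1500 to 5500 days in 1-day steps.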
8. Measures of Mutual Advance of Local Extrema of Proton Flux Density and Seismic Event Sequence Statistics
The further plan of using five statistics of the proton flux density time series consists of assessing the measures of advancement of the time moments of their most expressive local extrema (the largest local maxima and the smallest local minima) relative to the time moments of earthquakes with a magnitude of at least 6.5. In this case, the number of points of the most expressive local extrema will be chosen equal to the number of seismic events, i.e., 1136.
To eliminate the influence of low-frequency components of the change in the values of statistics on the determination of the moments of time of local extremes, the time series, the graphs of which are presented in
Figure 6(a1–a5), were subjected to the operation of removing low frequencies using Gaussian kernel smoothing. Let
be a time series with discrete time
. Gaussian kernel averaging of a time series
with radius (scale parameter)
at the moment of time
, is calculated using the following formula [
52]:
Calculation of the kernel averaging by Formula (28) for long time series can be effectively implemented using the fast Fourier transform. Then, the average values of the time series for the averaging radius
equal to 2 days were subtracted from the time series of changes in statistics and the most expressive points of local extrema were found for the residuals. These operations are illustrated by the graphs in
Figure 8.
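The removal of low frequencies and the selection of the most expressive local maxima can be sketched in Python as follows. The kernel weights exp(-(s/H)^2), the truncation at five radii, the renormalisation at the edges, and the function names are assumptions made for this illustration; the convolution is computed directly rather than with the fast Fourier transform, which is adequate only for short series.

```python
import math


def gaussian_smooth(x, radius):
    """Gaussian kernel average with weights exp(-(s / radius)**2),
    truncated at 5 radii and renormalised near the ends of the series."""
    half = int(math.ceil(5.0 * radius))
    weights = [math.exp(-((s / radius) ** 2)) for s in range(-half, half + 1)]
    out = []
    for t in range(len(x)):
        num = den = 0.0
        for s in range(-half, half + 1):
            if 0 <= t + s < len(x):
                num += weights[s + half] * x[t + s]
                den += weights[s + half]
        out.append(num / den)
    return out


def remove_low_frequencies(x, radius):
    """Residual of the series after subtracting its Gaussian kernel average."""
    return [v - m for v, m in zip(x, gaussian_smooth(x, radius))]


def largest_local_maxima(y, count):
    """Indices (in time order) of the `count` largest local maxima of y."""
    peaks = [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1]]
    peaks.sort(key=lambda i: y[i], reverse=True)
    return sorted(peaks[:count])
```

The smallest local minima are obtained by applying `largest_local_maxima` to the negated residual, for example `largest_local_maxima([-v for v in residual], 1136)`.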
When estimating the advance measures from the points of local extrema of the proton flux statistics, we tested both the largest local maxima and the smallest local minima obtained after excluding low frequencies using Gaussian smoothing (28). In each case, the difference between the “direct” and “reverse” lead measures was calculated. Then, for each statistic, the variant (largest local maxima or smallest local minima) with the larger average value of this difference was selected. As a result of this enumeration of variants, it turned out that the smallest local minima were preferable for four statistics (the logarithm of the kurtosis, the wavelet-based spectral slope, the DJ-index, and the singularity spectrum support width), whereas the largest local maxima were preferable for the entropy. The results of estimating the differences in the lead measures are presented in Figure 9.
It is interesting to note that when analyzing the prognostic properties of low-frequency seismic noise measured by a global network of 229 broadband seismic stations located around the world, it turned out that it is the points of the smallest local minima of statistics
,
, and the points of the largest local maxima of entropy
that have the maximum prognostic effects relative to the times of the strongest earthquakes with magnitudes of at least 7 [
41].
Another characteristic of the difference of average lead measures is the part of interval lengths with positive values of the difference. This characteristic is equal to —0.78; —0.75; —0.71; —0.83; —0.76; —0.95, that is, the use of averaging provides a frequent positive value of the lead measure, although it loses out in comparison with the average value of using minima .
9. Discussion
The conclusions of the article were obtained by applying a sequence of methods. The first stage consisted of a simple check for the presence, in the seismic event sequence, of the 27-day period that dominates the proton flux density time series and is related to the rotation of the Sun. At this stage, the existence of this periodicity in the seismic event stream was confirmed with a probability of 88%. This is an indirect confirmation of the influence of the proton flux on seismicity. At the second stage of the analysis, the hypothesis about the influence of the proton flux on earthquakes was tested by a more “direct” approach based on the influence matrix method. In this method, two time sequences are processed, and the influence of events in each sequence on events in the other is estimated directly. The method allows calculating the contribution of a purely random (Poisson) component, the contribution of self-excitation, and the contributions of mutual excitation. In this analysis, one of the event sequences is always the sequence of times of earthquakes with a magnitude of at least 6.5, while the second sequence varies. The simplest option is to select the time points of the largest local maxima of the average value of the proton flux density. For this option, the proportion of the average intensity of earthquakes for which the maximum values of the proton flux density are a trigger is 17% (Figure 5). However, this result can be significantly improved if, instead of a simple average, we take more sophisticated statistics of the proton flux density time series. The results for five variants of statistics are shown in Figure 9. The best result (28%) was obtained when using the time points of the smallest local minima of the multifractal singularity spectrum support width (Figure 9d). If we average over the set of all statistics used, we obtain 23% (Figure 9f). Although the averaged value is lower than the best one, the proportion of time during which the trigger effect occurs is 0.95 for the average, whereas for the “record” statistic it is 0.83. In this sense, 23% is a more stable estimate.
An 89-day periodicity in the variations of proton flux statistics has been revealed. One hypothesis is that this periodicity may be related to the modulation of the proton flux density by the motion of Mercury, the planet closest to the Sun, whose orbital period is about 88 days. The presence of a 12-year periodicity in the variation of the maximum values of the logarithms of the moduli of the Morlet wavelet coefficients for the singularity spectrum support width confirms the connection of the 89-day periodicity with solar dynamics. Another hypothesis for the origin of this periodicity is the coincidence of 89 days with half the oscillation period of the SOHO satellite, which measures the proton flux density, in the vicinity of the Lagrange libration point L1 [32]. However, the mechanism of such modulation, which is strongest precisely for the Donoho–Johnstone index and the singularity spectrum support width, remains unclear.