2D Hybrid method:Case of VLF signal amplitude variations in the time vicinity of an earthquake

Extraction of information in the form of oscillations from noisy data of natural phenomena such as sounds, earthquakes, ionospheric and brain activity, and various emissions from cosmic objects is extremely difficult. As a method for finding periodicity in such challenging data sets, the 2D Hybrid approach, which employs wavelets, is presented. Our technique produces a wavelet transform correlation intensity contour map for two (or one) time series on a period plane defined by two independent period axes. Notably, by spreading peaks across the second dimension, our method improves apparent resolution of detected oscillations in the period plane and identifies the direction of signal changes using correlation coefficients. We demonstrate the performance of the 2D Hybrid technique on a very low frequency (VLF) signal emitted in Italy and recorded in Serbia in time vicinity of the occurrence of an earthquake on November 3, 2010, near Kraljevo, Serbia. We identified a distinct signal in the range 120-130 s that appears only in association with the considered earthquake. Other wavelets, such as Superlets, which may detect fast transient oscillations, will be employed in the future analysis.


Introduction
Monitoring of different parts of the Earth and space collects data whose analyses can provide numerous important pieces of information for both scientific research and practical applications. One of the most important applications of various forms of monitoring is in the field of natural disaster prediction. However, in many cases, the possibilities and reliability of the corresponding predictions are still in the research phase. One of these examples is the application of data obtained in the monitoring of the lower ionosphere with very low frequency (VLF) signals to the prediction of earthquakes. These signals are emitted by worldwide located transmitters and propagate in the so-called "Earth-ionosphere waveguide", while numerous receivers record the signal amplitude and phase at their locations. Variations in the characteristics of the recorded signal enable the indirect detection of numerous phenomena, among which are those related to natural disasters. Among others, a large number of studies based on the analysis of VLF signals investigate the possibility of the existence of earthquake precursors in the form of ionospheric perturbations. Those precursors are primarily associated with changes in the VLF signal amplitude and/or phase [1][2][3][4][5] and the amplitude minimum time shift during solar terminator periods (the so-called "terminator time") [6][7][8][9][10]. In addition, the most recent research suggests that there are reductions in the noise of the VLF signal amplitude and phase a few tens of minutes before the observed type of disaster [11][12][13]. Although various data processing procedures have been applied in previous studies, there is still no way to reliably predict an earthquake. For this reason, the application of new models in VLF signal processing is essential.
Signals are similar to quantum systems due to wave-particle duality. The scientist who first noticed this and prove uncertainty principle for signals was Gabor [14], who did so by applying to arbitrary signals the same mathematical apparatus that was employed in the Heisenberg-Weyl [15,16] derivation of the uncertainty principle in quantum mechanics.
As it is well known, according to Heisenberg's uncertainty principle [15] the product of the standard deviations of position (σ x ) and momentum (σ p ) cannot be less than a non-zero constant σ x σ p ≥ h 4π , involving Planck constant h. Similarly, the basic Gabor Uncertainty Principle [see 14] states that the product of the uncertainties in frequency (σ f ) and time (σ t ) must exceed a fixed constant σ f σ t ≥ 1 4π . As a direct consequence of this, it is impossible to know the exact time and frequency of a signal simultaneously; hence, it is impossible to describe a signal as a point on the time-frequency (TF) plane. TF analysis of time series is traditionally carried out by employing the Fourier spectra on successive sliding time windows [e.g., see 17, and references therein]. Wide windows offer high frequency resolution but low temporal resolution, and vice versa; this is where the Heisenberg-Gabor uncertainty principle starts to have an influence. The characterization of a time series in the frequency domain by means of the spectral density function S( f ), which establishes the distribution of the time series variance at specific frequencies f , is one of the most used diagnostic tools for the identification of quasiperiodic fluctuations in a time series across disciplines. The simplest estimator of the S( f ) is the periodogram defined as the product of the time series sampling rate divided by the number of points and the square modulus of the discrete Fourier transform. The major issues of the periodogram are well investigated [see 18, and references therein]: (i) Because of the finite frequency resolution, power leakage into adjacent bins occurs (ii) a bias in the estimate that was not known a priori, and which was dependent on the time series itself and (iii)the associated variance, that is equal to the estimate itself.
In order to get around above mentioned problems, multiscale approaches, which are also known as multiresolution methods, have been developed. An example of one of these methods is the continuous-wavelet transform [CWT, 19, see]. The CWT provides good relative temporal location of the signal, since it may either tighten or inflate a mother wavelet depending on the frequency of the signal [20].
The CWT is capable of pinpointing the location of the oscillation in time, but as the frequency increases, it sacrifices its frequency resolution [21]. However, in many instances, it is not possible to find difference between frequency components that are immediately adjacent to one another. Because of this, analyses are frequently carried out making use of a dyadic representation. One example of this is the dyadic discrete wavelet transform (DWT), in which the T/2j wavelet amplitude coefficients arising from the j-th stage of the DWT, and T = 2 n [22] are employed. On the other hand, this representation does not do a very good job of resolving the high frequencies.
Our hybrid method, which is based on two-dimensional (2D) correlation analysis, was created in order to solve the problems that have previously been encountered [see 23,24]. It has been validated by the utilization of optical and photometric data obtained from several studies of active galactic nuclei, which are fueled by supermassive black holes [see e.g., 23,25]. The fundamental feature of these data is that they have irregular sampling, with huge gaps, and signal with low amplitude, if it exists at all, is buried in the red noise. The 2D Hybrid method is able to utilize various wavelets (e.g., CWT, DWT, Weighted wavelet transform, high resolution Superlets) in order to localize oscillations in period-period plane of time series in question. The method produces a contour map of correlation intensity on a period-period plane defined by two independent period axes corresponding to the two time series (or one). The map is symmetric and able to be integrated along any of the axes, resulting in a depiction of the level of correlation among oscillations that is similar to a periodogram. Therefore our approach could be interpreted as the two-dimensional distributions of correlation of the variance of time series in the time domain, which could also be projected into the one-dimensional domain. Our goal is to further illustrate the performance of the 2D hybrid technique and its application on time series of highly sampled very low frequency (VLF) radio signal amplitude data. In this study, we present computations of 2D correlation maps of the VLF signal amplitude oscillations before, during and after the earthquake nearby city Kraljevo in Serbia on November 3, 2010. As perturbation of VLF signal amplitude associated with an occurrence of earthquakes, this application presents an opportunity for the acquisition of novel insights.

Materials
This study is based on the processing of data recorded in the low ionosphere monitoring by the 20.27 kHz VLF radio signal emitted by the VLF transmitter [which name is ICV, see e.g., 26] Figure 1, we show the path of the observed VLF signal and the location of the Kraljevo earthquake epicentre [27] for which the data are given in http://www.emsc-csem. org/Earthquake/. In this study, we consider data series showing VLF signal amplitude values recorded with a sampling of 0.1 s in five time intervals starting 2 h and 1 h before the Kraljevo earthquake, at the moment of its occurrence, and 1 h and 2 h after that time. In order to compare the obtained results with those relevant for periods without seismic activity, we additionally analyse the time intervals in the same season period but about one year earlier (1-2 November, 2009). These periods are chosen to exclude the effects of daily and seasonal changes that are visible in the VLF signal amplitudes and that may affect the observed comparison. In addition, it was taken into account to eliminate the potential influences of other natural phenomena with origin in the atmosphere and from space as well as non-natural causes of variations in the emission and reception of the considered signal (these influences are described in detail in [11] and [13]). For this reason, the reference intervals are chosen in periods when no significant disturbances in meteorological, geomagnetic and space weather conditions were recorded, and when approximately the same values of the amplitude and its noise were recorded as in the quiet period before the Kraljevo earthquake.

Methods
Here, we are presenting a detailed view of 2D Hybrid method, where the main steps of this process are depicted, while we provide excerpts of the algorithm in pseudocode for revealing additional important details (see Algorithm A1, in Appendix ). To start, we review some of the fundamental ideas and information associated with wavelet analysis (see the Section 3.1). The overall concept of the 2DHybrid method is presented in Section 3.2, along with a detailed description of the period's uncertainty (Section 3.2.1) and significance (Section 3.2.2) estimate.
In conclusion, the Section 3.2.3 presents the wavelet that was implemented in our 2DHybrid method.

Summary of wavelet concept
A wavelet function (abbreviated wavelet) is a function belonging to the space of all squareintegrable functions ψ ∈ L 2 (R), averaged R ψ = 0, and normalized ψ = 1 [see more details in 19,28]. When we compare the wavelet approach to the Fourier method, we find that the Fourier analysis disassembles a signal into a set of sinusoids defined with distinct frequencies, but the wavelet method unwinds a signal into the shifted or scaled shapes that originate from a mother wavelet. Wavelet maps a signal into a time-scale plane which is the same as the time-frequency plane in the short-time Fourier transform, so that each scale represents a certain frequency range of the time-frequency plane [29]. Given the signal f (t), its CWT at time u and scale s is defined as: The result of the CWT is a matrix (scalogram) filled with wavelet coefficients located by scale and position: The aforementioned equation could be interpreted as the amount of energy, denoted by CW f present at scale s, provided that the condition S(s) ≥ 0 holds. When seen in this context, the same equation enables us to determine the scales that contribute the most to the overall energy of the signal. In most cases, we are only interested in searching for oscillations inside a predetermined time span such as [t min , t max ], and because of this, we can define the windowed scalogram that corresponds to this interval as follows: obtain discrete data. In terms of sampling, any discrete signal can be studied in a discrete domain by utilizing discrete wavelets, or in a continuous manner by employing neural and Gaussian process models of discrete series with gaps.
If we are given two time series, f and f , we can examine their respective scalograms, S and S , to determine whether or not they exhibit patterns that are comparable to one another. For instance, we are able to perform an absolute comparison of scalograms using the formula S − S [30].

General description of 2DHybrid method
Now, we are ready to present our 2D Hybrid approach, which compares the scalograms of two different (or one) series by using correlation. Given a scalogram S with dimensions M × N and another scalogram S with dimensions P × Q, the two-dimensional cross-correlation ( ) of these scalograms is the matrix , which has following elements: where S stands for complex conjugate of S and −(P As the cross correlation of scalograms is defined on the field of complex numbers, both the real and imaginary components of the complex cross correlation function are referred to as the synchronous and asynchronous 2D correlation spectra, respectively [see 31]. Since we are only interested in physical phenomena whose correlation can be tracked in the field of real numbers, we only supply the mathematical formulation of synchronous 2D correlation spectra [32] where Cov stands for covariance and σ and σ are standard deviations of scalogramsS andS of two time series, respectively. Notably, in the discrete formulation of cross correlation, a synchronous 2D correlation map is simply defined as the inner product of theS andS [32]: The resemblance between oscillations in two different (or one) time series is measured via a 2D correlation map (which we will also refer to as a heatmap). A high positive correlation value shows that periodic signals vary in a coordinated manner, implying that the signals have a common or related origin [23].
The two-dimensional correlation map is presented as a contour map of correlation strength on a period plane that is defined by two axes that are independent of one another. When done in this fashion, plotting a synchronous spectrum results in a map that is symmetric in relation to the primary diagonal line of the map. The correlation of two (or one) time series at the same periods refers to the intensity of the correlation that may be found at the main diagonal of the map. As a result, peaks that are located on the main diagonal line are referred to as auto-peaks. The intensities of auto-peaks are representative of the total extent of the signals' dynamic fluctuations [23]. It is important to note that two-dimensional correlation map can be summed with the absolute value of C(k, l) along any of the dimension. This is because negative correlation can also appear for some signals which should not be canceled in the summation. This integration provides an interpretation similar to a periodogram, with the horizontal axis counting periods and the vertical axis standing for the degree of correlation peaks.

Concept of estimating uncertainity of detected periods
In order to find the uncertainty of detected periods, we first calculate the full width half maximum of the peak in a periodogram-like image, and then use the mquantile module in Python to estimate points that fall between the 25th and 75th quantiles. The upper and lower error estimates are represented by these points. The reason behind using quantiles is that peaks in periodogram -like structure do not conform to the theoretical normal distribution, actually they are skewed. In this particular situation, quantiles provide more suitable information than standard deviation. The sample quantile is based on order statistics and calculated regardless of underlying distribution. The p-th quantile of a set of values represents a summarizing quantity having less than or equal to p, where, 0 ≤ p ≤ 1. Similarly to median which is the value bellow which 50% of all values in the sample lie, we might define that the first quartile (25th quantile) as the value below which 25% of all values in the sample lie and the third quartile (75th quantile) as the value below which 75% lie.
Our period error method is inspired on 'post mortem analysis' by [33], which requires the so called Mean Noise Power Level (MNPL) in the vicinity of detected period. The 1-sigma confidence interval on period then is equal to the width of the line at the period -MNPL level.

Concept of estimating significance of detected periods
The significance of detected period σ P , we estimated following the approach outlined in [34]. After shuffling the dates of each and every observation and its magnitude, the period was recalculated across this newly updated data set, and the power of the highest peak in this uncorrelated data set was compared to that of the initial simulated data. After performing this procedure a total of one hundred times (presumably due to large computing time needed for highly sampled VLF signals), the significance level was finally calculated as: where x represents the number of times that the peak power of the period in the original data was greater than that of the uncorrelated ensemble and N is the total number of shuffles (100). This formula therefore has a maximum of 1, corresponding to a 100 percent recovery rate. When multiple periods are found in an original curve, the significance of each peak is measured by comparing the power of peaks in shuffled curves found at the place of detected periods to the power of peaks in the original curve.

Wavelets used in 2DHybrid method
The effective implementation of the 2D Hybrid approach is going to be demonstrated in the next Section by making use of Weighted Wavelet Z-transform (WWZ) wavelets [35]. This wavelet approach can be utilized on data that has been sampled both regularly and irregularly. The WWZ wavelets are defined on a basis that consists of functions: cos[ω(t − τ)], sin[ω(t − τ)], I(t) = 1. In addition, the projection of data via WWZ makes use of weights that take the form exp(−cω 2 (t − τ) 2 ), where c is a parameter that can be adjusted according to the data set.
The tuning constant c, whose value determines the window's width, can have a variety of choices. For instance, the value c = 0.0125 was initially suggested in [35], with the intention of improving the time resolution on shorter parts of the data. However, the value of c might go as high as 0.005, and it is used to in order to improve frequency resolution.
In the former scenario, this translates to the wavelet decaying by e −1 in ∼1.4 cycles, while in the later scenario, this translates to the wavelet deteriorating in ∼2.4 cycles. In our study we used c ∼ 0.003. This value can be compared to e.g. c = 0.001 used in [36] and [37] for longer data sets than the one used here.
Because each of the analyzed time series has 36 000 points, the application of our 2D Hybird method takes 45 minutes to complete on the google.colab computing platform, which has a CPU: 1× single core hyper threaded Xeon Processors 2.2 GHz (1 core, 2 threads) and RAM: ∼ 13 GB. It takes about 70 hours on the same platform to calculate the significance of a detected period using 100 artificial time series.

Results and Discussion
As the amplitude fluctuation of the VLF signal during the Kraljevo earthquake is known to be difficult to analyze using Fourier periodicity, we decided to demonstrate our method using this data. We computed 2D hybrid maps and their integrated versions for each segment of the time series for the date of the earthquake occurrence (see Figure 2) and for the same date but one year earlier as a control case (see Figure 3). For each time series segment we kept the same parameters of WWZ wavelets: c = 0.003, range of frequencies [1/150,1/30 ] s −1 , and number of points in frequency grid (200). Table 1 provides a summary of the detected periods for the date of earthquake occurrence, whereas Table 2 provides the values for those detected periods for the control date one year earlier. The heatmaps show the prominent oscillations in time series segments which are plotted on the top of heatmap. The frequencies are displayed in s −1 along both the x and y axes of the plot. The degree of correlation between the oscillations in the time series is represented by the color of the heatmap cell for each pair of values (x, y). According to the colorbar scale that can be seen to the right of each plot, the hues of the heatmap are related with the correlation coefficients. It is important to note that the topology of heatmaps varies across different time series segments.   At 2 hours before the earthquake (shown in Figure 2 top row, left plot) we observe several signals of ≥ 80 s with correlation coefficients larger than 0.5 (shown in the bottom marginal plot of heatmap integrated projection), but there is only one peak above 100 s (located at 147 s, see also Table 1). All of the other oscillations have been muted by the passage of time, so that only the oscillation of 120 s that occurred 1 h before the earthquake is clearly visible in the top right panel of Figure 2.
During the earthquake, at 0 h from the event (middle plot), our approach records an oscillation at 131 seconds, with an amplitude that is approximately 20 percent lower than the peak that was recorded at 120 seconds during the hour-long interval preceding the earthquake (top right plot). In addition, there is one more signal at 35 s (middle plot).
After an hour has passed since the occurrence (bottom left plot), the oscillation at 121 s and 70 s once again appears. Finally, two hours after the earthquake (bottom right plot), the signals lasting longer than one hundred seconds vanished, and the graph once again began to show oscillations in the 80-90 s range, just as it had done in the period lasting two hours before the earthquake (top left plot).
One hour after the event, once again appears oscillation at 121 s, and 70 s. Finally, two hours after the earthquake the signals above 100 s disappeared, and once again comes to display oscillations around ∼ 80 − 90 s as in the period of two hours before earthquake. Interestingly, only time series segments corresponding to -2 h (the top left panel) and +2 h (bottom right panel) have a topology that is comparable to one another.
As it can be seen in Table 1, there are some ionospheric periodical oscillations of the order of one to two minutes. It is a question of this oscillation and its physical origine. One of the possibilities is that the electron density in the ionosphere is following periodical changes in the electromagnetic field which are registered close to the epicenters of earthquakes (see e.g. [38][39][40][41]), or that be generated by acoustic waves (see e.g. simulations in [38]). However, the true nature and physical background of the short-period oscillations given in Table 1 should be investigated in more detail, which is out of the scope of this paper.
In the scenario when there is no earthquake (the same date but one year earlier), the topology of the 2D hybrid maps Fig 3 looks very different from the topology of the maps corresponding to the record for earthquake date (Figure 2). The primary diagonal is where the majority of the correlation clusters are arranged. The VLF signal amplitude variation can be seen as a switching between a dominated correlation cluster at -2 h (top row-left), +1 h (bottom row, left) and a more granular structure (at +2 h (bottom row-right) and -1 h (top row-right). This variation on topology can not be seen during earthquake occurence (see Figure 2). On top of this, we observe a dominating core cluster is present at 0 h (middle panel). On the other hand, the values of detected periods are less than 111 seconds (at +2 h, +1 h, 0 h, and -1 h, see Table 2). It is interesting to note that a period of 140 s is captured in the -2 h time series segment, which is comparable to a period of 147 s that was captured in the -2 h segment when the earthquake occurred (Table 1).
A comparison of the obtained results with those given in [11] (based on the application of the Fast Fourier transform (FFT) to the data in the relevant time intervals of 1 h) shows that the agreement is better before the earthquake. For the first observed interval starting 2 h before the earthquake, the agreement is good for the obtained values below 1.5 min, while, in both studies, these values decrease to the similar values for the interval of the next hour. After an earthquake, the FFT method gives lower values of the period of excited waves than the method presented in this paper. Wave excitations with wave-periods of about 2 min obtained in the first 4 observed intervals are also visible in the study presented in [11] for intervals starting about 2 h before the earthquake and in the first hour after it. The post-earthquake wave-periods obtained in this study are also in agreement with those shown in [42] which indicate values from less than 10 s to a few hundred seconds.
As previously, mentioned notable characteristics of our two-dimensional hybrid method include the simplification of complex spectra of detected oscillations that are composed of many overlapping peaks in Fourier periodograms, the enhancement of apparent spectral resolution through the spreading of peaks over the second dimension, and the establishment of the direction of changes in signal through correlation coefficients. The normal VLF signal amplitude seen in control case one year before the earthquake shows more coherent topology of maps and oscillations below 111 s, whereas during the event of earthquake perturbations occur so that maps have more features off diagonal. This is in stark contrast to the observable oscillations of 120 and 130 seconds during earthquake. Moreover, as oscillations of 140-147 s are documented in earthquake and quiet day time series data occurring 2 hours before the nominal earthquake occurrence, we predict this oscillation as usual behavior.
Finally, we will take a moment reflecting about the ways in which our study could be expanded. For the purpose of this investigation, we made use of the WWZ wavelet, which is defined in terms of trigonometric functions. However, in order to test for presence of non-sinusoidal oscillations, it is essential to use wavelets on a base that does not involve trigonometry. In light of this, we believe that Superlets will prove to be the most suitable option for the continuation of our research. Superlet is a spectral estimator enabling time-frequency super-resolution which uses sets of wavelets with increasingly constrained bandwidth. These are combined geometrically in order to maintain the good temporal resolution of single wavelets and gain frequency resolution in upper bands. The normalization of wavelets in the set facilitates exploration of data with scale-free, fractal nature, containing oscillation packets that are self-similar across frequencies. Importantly, they can reveal fast transient oscillation events in single trials that may be hidden in the averaged time-frequency spectrum by other methods.

Conclusion
In this work, we explored the use of our 2D Hybrid technique for detecting oscillations in VLF signal amplitude time series in the time vicinity of the Kraljevo earthquake in Serbia in 2010 year. Furthermore, we demonstrated how the approach captures the difference between time series in the time vicinity of an earthquake and a control day one year earlier, which can be utilized to establish topology difference between certain ionosphere occurrences.
The cross correlation of scalograms of time series, which can be further integrated, is a crucial principle in our approach. This has two significant advantages. First, the method simplifies complex spectra with numerous overlapped peaks in periodograms, i.e. increasing apparent spectral resolution by spreading peaks over the second dimension. For example, we were able to detect various oscillation patterns at various time series segments.
The second advantge is usage of correlation coefficients to determine the direction of signal changes. This enabled us to distinguish between cohesive topology of 2D maps during a calm day and more dispersed correlation clusters during Kraljevo earthquake. We discovered that oscillations in the 120-130 s range appear 1 hour before the earthquake, continue to exist during the earthquake, and disappear 1 hour after the earthquake. Fluctuations of the order 140-147 s occur 2 hours before the nominal start of the earthquake during the calm day and can be interpreted as normal oscillations in the VLF signal amplitude. Finally, we briefly described how we could extend our research by using Superlets, which can reveal fast transient oscillation events that other types of wavelets may mask in averaged time-frequency scalograms.
Author Contributions: L.Č.P. conceptualized study, A.B.K. designed methodology, performed calculations, ploted figures and wrote the whole manuscript; A.N. collected data and prepared parts related to VLF data and ionospheric observations; A.B.K., A.N., L.Č.P. and M.R revised the manuscript. All authors have read and agreed to the published version of the manuscript.