Identification of Areas of Anomalous Tremor of the Earth’s Surface on the Japanese Islands According to GPS Data

: Statistical properties of Earth surface tremors measured by means of GPS were investi ‐ gated. This article considers measurements of the Earth’s surface displacements in three orthogonal directions relayed by a network of GPS sensors with about 1200 points distributed across Japan in 2009–2021. Next, the following characteristics of the tremors were considered: the entropy of the distribution of squared orthogonal wavelet coefficients, the entropy of the distribution of power spectrum values, and the spectral index. The anomalous regions of maxima of probability densities of the distribution of extreme values of the tremor statistics were determined: entropy minima and spectral index maxima. The average density maps of the distribution of extreme value tremor sta ‐ tistics were found to be highly correlated with one another. This made it possible to consider a weighted average density map and identify five anomalous regions in the center and south of Ja ‐ pan. A trajectory of visiting anomalous regions by a sequence of points realizing local maxima of the average probability density was obtained, for which seasonal periodicity was set. Estimates of changes in the average and maximum values of the correlation coefficients of tremor properties in an auxiliary network of 16 reference points in a semi ‐ annual time window were obtained.


Introduction
Analysis of the spectral structure and models of the noise component of a GPS time series is traditionally of interest to researchers. In [1], spectral indexes were analyzed by fitting power laws to spectra and estimating their parameters using the maximum likelihood method. The effect of the flicker noise intensity of a GPS time series on velocity determination errors, depending on the measurement latitude, was studied in [2,3]. In [4][5][6], the maximum likelihood method was applied to estimate the amplitudes of GPS time series noise component models for various regions of the world. The problem of estimating the error in determining the velocity based on the spectral index was considered in [7].
The use of parametric models of a GPS time series to consider the effect of gaps in data for tectonically active regions is described in [8]. In [9,10] the high-frequency noise structure of a GPS time series for a number of areas in New Zealand and the United States was analyzed. The influence of the seasonal component of the hydrological load on the Earth's surface on the determination of the velocities of tectonic plates was studied in [11,12]. The extraction of common spatial characteristics and seasonal components of a GPS time series using the principal component method, empirical orthogonal functions, and singular spectrum analysis was considered in [13,14]. In [15], the high-frequency noise of a GPS time series was analyzed together with the records of the seismic network of accelerometers.
The relationship between the appearance of non-stationary effects in the GPS time series and the processes of slow motions at the boundaries between crustal blocks was analyzed in [16]. Estimates of the non-stationary effects of GPS time series and station positioning stability using discrete wavelet transforms were considered in [17,18]. In [19,20], the coherence of the tremors of the Earth's surface measured by GPS was studied. An assessment of the fractal characteristics of a high-frequency GPS time series to identify potentially dangerous seismically active regions in Japan and California was used in [21][22][23].
This article analyzes the structure of high-frequency noise in the time series of the network of GPS stations in Japan (about 1200 points) over 13 years of observations, 2009-2013, with a time step of 5 min. Four characteristics of tremors are considered: the entropy of the distribution of squared orthogonal wavelet coefficients, the entropy of the distribution of power spectrum values, and two estimates of the spectral index, which is understood as the regression coefficient between the logarithm of the power spectrum and the logarithm of the period. The first estimate of the spectral index is traditional, by calculating the power spectrum. Another estimate of the slope is based on orthogonal wavelet decompositions, where the averages of the squares of the wavelet coefficients are considered as the contribution of the frequency band of the detail level of the decomposition to the total signal energy. For successive time fragments 1 day long (288 readings with a time step of 5 min), for measurements at each station, the values of the four statistics described above are calculated after detrending by a 4th order polynomial. The removal of the trend is aimed at getting rid of the influence of external deterministic factors, such as tidal and thermal deformations, and switching to the analysis of the random high-frequency component of the Earth's surface displacements, which we will call tremors. The anomalous regions of maxima of probability densities of the distribution of extreme values of the studied tremor statistics are determined: entropy minima and spectral index maxima. This definition of tremor anomalous behavior is based on the fact that white noise is characterized by maximum entropy values and a zero spectral index. Consequently, the maximum deviation from uninformative "white-noise" behavior should be characterized by entropy minima and spectral slope maxima.

Materials and Methods
Ground displacement data measured by GPS are taken from the website [24]. These data represent displacements of the Earth's surface in three orthogonal directions, which we will further denote as E, N, and Up for displacements in the west-east, north-south and vertical directions, with a time step of 5 min [25]. Figure 1a shows the positions of GPS points for measuring displacements of the Earth's surface, as well as 16 numbered reference points. These points will be further needed to assess the spatial correlations between the properties of the Earth's surface tremors. Reference point positions were noted as the centers of 16 clusters of GPS stations using the K-means method [26]. The numbering was performed by decreasing the latitude of the reference points. Figure 1b presents the density of GPS stations, which were estimated by Gaussian kernel method [26] with bandwidth 0.5° (≈56 km), within a regular 30 × 30 network of nodes which covered the region. Figure 1c shows a graph of the daily number of operational GPS stations for 13 years of observations, 2009-2021.
By operational stations on the current day, we mean those stations for which the total number of missing data does not exceed 10% of the duration of the daily window of 288 counts with a time step of 5 min, that is, no more than 28 values. Missing data are filled in with "plausible" artificial data, the values of which are determined by the behavior of the signals in adjacent intervals of the same length as the length of the gap. This data replenishment operation is described in detail in [19].  of the orthogonal wavelet decomposition, the lower index j numbers the sequence of time interval centers in the vicinity of which the signal convolution with finite elements of the basis is calculated. Below, 17 orthogonal Daubechies wavelets were used: 10 ordinary bases with minimal support with the number of vanishing moments from 1 to 10 and 7 so-called Daubechies symlets [27], with the number of vanishing moments from 4 to 10. For each of the bases, the normalized entropy of the distribution of squares was calculated coefficients and found a basis that provides the minimum entropy:

Minimum Normalized Wavelet-Based Entropy
Here, m is the number of detail levels taken into consideration, and . The condition 2 n L  is necessary for applying the fast wavelet transform. If the length L is not equal to a power of two, then the signal ( ) x t is padded with zeros to the minimum length N , which is greater than or equal to L : After determining the optimal orthogonal wavelet, we can calculate the average values of the squares of the wavelet coefficients at each detail level: The average value (2) of the squares of the wavelet coefficients is the part of the oscillation energy corresponding to the detail level k . In other words, this value can be considered as an estimate of the signal ( ) x t power spectrum in the frequency band corresponding to the detail level k [27]: where s  is the length of the discretization time interval (in our case s  = 5 min).
Consider the values of the periods corresponding to the centers of the frequency bands (3): Then the quantities are similar to the usual Fourier power spectra. The difference lies in the fact that the values (4) are much smoother, which is convenient when calculating the spectral index-the slope of the graph of the logarithm of the power spectrum depending on the logarithm of the period. To calculate the wavelet-based spectral index W  , consider the following model: where k  is a sequence of independent random variables with zero mean. The parameter W  in Formula (5) can be called the wavelet-based spectral index, the value of which can be found by the least squares method:

Spectral Normalized Entropy
can be considered as the probabilities of energy distribution in a sequence of non-overlapping frequency intervals of length . Here, N is the minimum length, which is greater than or equal to the sample length L : 2 n N L   , the index j in Formula (6) varies up to / 2 N due to the symmetry of the power spectrum of real signals with respect to the Nyquist frequency Similarly to the values (1) and (5), we introduce the spectral normalized entropy S En and spectral index S  according to formulae: where period 2 / j j T    , the value of S  is defined from the least squares approach:  is a sequence of independent random variables with zero mean.
The values of W En , S En , W  , and S  were estimated in successive time windows consisting of 288 adjacent 5-min readings, which is one day. Before their calculation, the operation of detrending by a polynomial of the fourth order was performed. Thus, for each station, time series were obtained with a time step of 1 day. The maximum entropy method [28] with autoregression order 28 (10% of the sample length in the time window) was used to calculate the power spectrum ( ) xx j S  estimate. The proximity of the normalized spectral entropy S En to the maximum value of 1 means that the energy distribution over frequencies is close to uniform, i.e., the power spectrum is close to the white noise spectrum. Since displacements of the Earth's surface in three orthogonal directions are measured at each point of the network, then, when necessary, we will distinguish the values of the analyzed tremor statistics for different directions of surface displacements by the superscript, for example, for entropy (1) En , where the superscripts correspond to measurements in the E, N, and Up directions. Similar notation will be used for the quantities, For an auxiliary network of 16 reference points ( Figure 1a) 12 time series were determined with a time step of 1 day, the values of which were calculated as medians of the values from the ten nearest operational GPS measurement points. Figure 2 shows graphs of such time series for central reference point number 8. These graphs show a strong seasonal periodicity of tremor properties.

Probability Densities of Extreme Values
Further, estimates of 12 parameters of the Earth's surface tremors will be considered in the sequence of time windows 182 days long (half a year), taken with a shift of 3 days. In this case, in each window, we will consider only operable GPS stations, by which we mean those stations for which the total number of missing data in the current window does not exceed 0.1 of the window length.
Consider a regular grid of 30 × 30 nodes covering an area with a latitude from 28° N to 46° N and longitude from 128° E to 146° E. Let U be any value [ , ] t t t  forms a random set. Let us estimate their two-dimensional probability distribution function for each node of the regular grid. For this, the Parzen-Rosenblatt estimate with the Gaussian kernel function [26] will be applied: Here, h is the kernel averaging radius and 0 1 , t t are integer indices that enumerate the "elementary" maps of each time window. Thus, 1 0 ( 1 ) t t   is the number of time windows in the considered time interval. The width of the smoothing band was used 0.5 h   which is approximately equal to 56 km. It should be noticed that the maximum distance from the nearest 10th stations to the reference points equals 47.5 km, the minimum equals 25.4 km, and the median maximum distance equals 37 km. Figures 3 and 4 show the average distribution maps of the extreme values of the four tremor statistics under consideration for all three displacement components. Even a purely visual perception of these maps suggests that they have a strong similarity to each other. This visual impression is confirmed by the estimate of the matrix of pairwise correlation coefficients presented in Table 1. The minimum value of the correlation coefficient is 0.7044 for the pair, while for the pair the correlation coefficient reaches 0.9903.    The high correlation of probability density maps of the distribution of extreme values of tremor statistics suggests the use of the principal component method [29] to extract the most common distribution over the space of the most common characteristics. To do this, we find the eigenvector of the correlation matrix (Table 1), corresponding to the maximum eigenvalue of this matrix. The squares of the values of the components of this vector (by definition, their sum is equal to 1) will be taken as the weights of digital maps. The result of the weighted average of the probability densities of the distribution of extreme values in Figures 3 and 4 is shown in Figure 5. Figure 5a contains white contour lines which extract regions where the weighted probability density of minimum entropy and maximum spectral index exceeds the level of 98% quantile. These regions are considered as regions of anomalous GPS-measured Earth tremors.  From this plot, it can be seen that the GPS noise statistics corresponding to measurements in the N direction contribute the most to the averaging weights. The smallest contribution corresponds to the vertical Up measurements. However, it should be noted that the differences in the weights are insignificant.

Trajectory of Extreme Mean Probability Density Maxima
According to the location of local maxima of the weighted average probability density in Figure 5a, 5 areas of anomalous tremors can be distinguished on the Japanese islands. Since the probability densities were estimated in a sliding time window 182 days long, it is of interest to obtain the trajectory of geographic coordinates corresponding to the position of the local maxima of the averaged probability density in each of the successive time windows.
To obtain such a trajectory for each time window 182 days long, we calculate the weighted average of all 12 probability densities of the distribution of extreme values of the used tremor statistics. Thus, in each time window, we calculate a 12 × 12 correlation matrix similar to the matrix of averaged correlations in Table 1, determine the eigenvector corresponding to the maximum eigenvalue, and use the squares of its components as weights to calculate the average weighted probability density in each window. Next, we find the coordinates of the point that realizes the maximum average density. This sequence of operations performed in each time window makes it possible to obtain the trajectory of geographical coordinates of points that realize the maximum values of the average probability density. Figure 6(a1,a2) shows the graphs of changes in the geographical coordinates of the trajectory points of the maxima of the average weighted probability densities of the extreme values of the tremor statistics. Recall that, according to the construction, these points are nodes of a regular grid of size 30 × 30 covering the region under study. Figure  6b shows the points at which the maxima of the probability densities are realized. These points are divided into five clusters, and cluster number four consists of one point, which is repeatedly visited by the trajectory (plot in Figure 6(a3)).

Spatial Correlations of Tremor Properties
In this section of the article, spatial correlations between tremor properties in the network of 16 reference points presented in Figure 1  ational stations. Examples of graphs of such time series for reference point #8 are shown in Figure 2.
Consider the correlation coefficients between the same tremor properties at different reference points. Denote by ( ) ( ) ij P   the estimates of the correlation coefficients between the values of the property P at the reference points with numbers i and j for a sequence of time windows 182 days long with time marks  of the right ends of the windows. As P can be any of the 12 tremor properties: For each property, we determine the average value of the correlation coefficients for all pairs of reference points: In Formula ( Figure 8.
The presence of values in 16 reference points distributed throughout Japan allows the construction of maps of spatial values. The construction of such maps provides additional information on the features of the distribution of the correlation of tremor properties. For this, as before when constructing maps in Figures 3 and 4, consider a regular grid of 30 knots in latitude and 30 knots in longitude covering Japan. Then the values at the nodes of the regular grid are calculated by the nearest reference point neighbor method [26]. These values, calculated for each position of the time window at all nodes, make it possible to obtain "elementary" maps of the distribution over the space of average values of the correlation of the tremor property. The result of averaging all "elementary" maps for time windows lying between some 2 dates gives the corresponding averaged map. Figure 9 shows averaged correlation distribution maps for time windows before and after 2016. The distribution of values is given only for the territory of the Japanese Islands covered by the GPS network of stations and some of their surroundings.
It follows from the maps in Figure 9 that the region of the highest spatial correlation of tremor properties is practically unchanged and occupies the southern part of Honshu Island, although the degree of correlation drops noticeably after 2016.

Discussion
A new method for estimating the intensity (activity measures) of the Earth's surface tremor based on measurements by a network of GPS sensors is proposed. The method is based on estimating the trajectory of points corresponding to the maxima of the average weighted probability densities of the distribution of extreme values of tremor statistics in a sliding time window. As tremor statistics, we chose the minimum values of the entropy of the distribution of squared orthogonal wavelet coefficients and the entropy of the power spectrum, as well as the maximum values of the spectral index calculated for the power spectra and the average values of the squared wavelet coefficients. A method has also been developed for estimating the variability of the spatial correlation of tremor properties based on estimates of the correlation values of statistics in an auxiliary network of reference points in a sliding time window.
Methods which used values of spectral index for GPS time series investigations were elaborated in [1][2][3][4][5][6] for identifying models of GPS noise. Similar methods were applied in [21][22][23] for detecting regions which are suspected for increased seismic danger although these methods are based on detrended fluctuations analysis (DFA). However, these methods do not investigate evolution of points providing maximum to probability density of extreme values of spectral index and entropy of GPS noise within moving time window what is the key tool in the current article.
The developed methods were applied to the data of the GPS network in Japan in 2009-2021. Five areas of tremor activity were identified in the center and south of Japan, The trajectory of the points of maximum average probability densities of the distribution of extreme values of tremor statistics and their belonging to the selected 5 activity areas in Figure 6 demonstrates a clear division into two large time fragments with significantly different behaviors; before and after 2016. The interval after 2016 is characterized by stable seasonal periodicity of switching the trajectory of the maxima of the average probability density between the first and fourth clusters. For the first time interval before 2016, we cannot notice such regularity. Thus, it could be called "chaotic," although this name simply underlines the existence of the change point in the behavior of maxima trajectory. During the two years 2014-2016, tremor activity is concentrated in the south of Japan, in the fifth cluster ( Figure 6(a3)). It is possible that this 2-year concentration of the tremor activation maximum in the fifth cluster reflects the preparation of the Kumamoto earthquake. We also note that the north of Japan in 2009-2021 does not apply to abnormal areas of GPS tremor.
It follows from the graphs in Figure 7 that all average tremor properties are subject to a strong annual periodicity. Note that the maxima of the average correlation *   , the average number * n  of pairs of reference points with strong correlations, and the aver- were almost equal to zero. The first half of 2010 displays the maximum value of annual bursts of tremor activity. It is natural to assume that these bursts reflect the process of preparing for the Tohoku mega-earthquake on 11 March 2011.

Funding:
The work was carried out within the framework of the state assignment of the Institute of Physics of the Earth of the Russian Academy of Sciences.

Data Availability Statement:
Publicly available datasets were analyzed in this study. This data can be found here: Nevada Geodetic Laboratory, http://geodesy.unr.edu/NGLStationPages/RapidStationList (accessed on 18 July 2022).

Conflicts of Interest:
The authors declare no conflict of interest.