2D Anisotropic Wavelet Entropy with an Application to Earthquakes in Chile

Abstract: We propose a wavelet-based approach to measure the Shannon entropy in the context of spatial point patterns. The method uses the fully anisotropic Morlet wavelet to estimate the energy distribution at different directions and scales. The spatial heterogeneity and complexity of spatial point patterns are then analyzed using the multiscale anisotropic wavelet entropy. The efficacy of the approach is shown through a simulation study. Finally, an application to the catalog of earthquake events in Chile is considered.


Introduction
The concept of entropy was first introduced by [1] in thermodynamics as a measure of the amount of energy in a system, and [2] was the first to give it a probabilistic interpretation. Shannon [3] applied the concept of entropy to information theory. According to information theory, entropy is a measure of the uncertainty and unpredictability associated with a random variable, and the Shannon entropy quantifies the expected value of the information generated by a random variable.
The definition of entropy has been widely used in many applications, such as neural systems [4], image segmentation through thresholding [5,6] and climatology and hydrology [7][8][9][10]. Nicholson et al. [11] introduced a spatial entropy to quantify the simplification of earthquake distributions due to relocation procedures. A temporal definition of entropy was used by [12][13][14][15][16] to study the seismicity in different parts of the world.
For a process characterized by a certain number N of states or classes of events, the Shannon entropy [3] is defined as:
$$S = -\sum_{i=1}^{N} p_i \log p_i,$$
where $p_i$ is the probability of occurrence of the events in the ith class. The choice of the base of the logarithm is arbitrary: for practical convenience, we use base two throughout this paper. For $p_i = 0$, we set $p_i \log p_i = 0$. The Shannon entropy is maximal when all of the outcomes are equally likely, in which case $S = \log(N)$. The Shannon entropy S can also be defined in a continuous framework as:
$$S = -\int_{0}^{\infty} P(x) \log(P(x))\,dx,$$
where $P(x)$ is the continuous probability density function of the variable x.
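As a small numerical illustration of the discrete definition, the following sketch evaluates the Shannon entropy in base two for two toy distributions. The function name `shannon_entropy` and the example probabilities are ours and do not come from the paper.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy S = -sum_i p_i log(p_i), with 0 * log(0) taken as 0."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    entropy = 0.0
    for p in probs:
        q = p / total          # renormalize so that the weights sum to one
        if q > 0:              # empty classes contribute nothing
            entropy -= q * math.log(q, base)
    return entropy

uniform = [0.25] * 4             # four equally likely classes
peaked = [1.0, 0.0, 0.0, 0.0]    # all of the mass in a single class

print(shannon_entropy(uniform))  # should be close to log2(4) = 2
print(shannon_entropy(peaked))   # should be zero
```

The uniform case attains the maximum log(N) mentioned above, while the degenerate case has no uncertainty at all.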
A generalization of the Shannon entropy was defined by [17] as:
$$S_{\alpha} = \frac{1}{1-\alpha} \log \sum_{i=1}^{N} p_i^{\alpha},$$
where $\alpha$ is a parameter (the limit $\alpha \to 1$ gives the Shannon entropy). Papoulis [18] extended the definition of entropy to a point set as:
$$S = -\int_{V} P(\mathbf{x}) \log P(\mathbf{x})\,d\mathbf{x},$$
where P is the spatial probability density, and $\mathbf{x}$ is a point in the spatial domain V. One method for estimating the density P is to divide the spatial domain into boxes and then count the number of points in each box (the box-counting method). However, the optimal size of the boxes is an open problem: boxes that are too large give a low resolution in measuring the variability of the density, and boxes that are too small cause a large increase in the density estimation error. To avoid this problem, different approaches have been proposed [11,19]. However, a probability density is not the only type of distribution that can give information, and the definition of entropy can be extended to other types of distributions, such as the energy distribution based on wavelet coefficients [20]. A definition of wavelet entropy based on the energy distribution of the wavelet coefficients has been proposed by [21][22][23][24][25][26]. All of these authors propose or use a definition of the wavelet entropy in time, but an extension to the two-dimensional (2D) case has not been proposed so far.
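The sensitivity of the box-counting method to the box size can be seen in a minimal sketch. It assumes points on the unit square and estimates the box probabilities by relative frequencies; the function name `box_counting_entropy`, the random seed and the two toy patterns are illustrative choices, not part of the original study.

```python
import math
import random

def box_counting_entropy(points, n_boxes):
    """Entropy (base 2) of the box-occupancy frequencies on the unit square."""
    counts = {}
    for x, y in points:
        # Clamp so that points exactly on the upper border fall in the last box.
        i = min(int(x * n_boxes), n_boxes - 1)
        j = min(int(y * n_boxes), n_boxes - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    n = len(points)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

rng = random.Random(0)
uniform_pts = [(rng.random(), rng.random()) for _ in range(1000)]
line_pts = [(t, t) for t in (rng.random() for _ in range(1000))]  # points on y = x

for n_boxes in (2, 8, 32):
    print(n_boxes,
          box_counting_entropy(uniform_pts, n_boxes),
          box_counting_entropy(line_pts, n_boxes))
```

The estimate changes with `n_boxes` for both patterns, which is the resolution trade-off described above; the pattern concentrated on a line occupies far fewer boxes and therefore yields a lower value at a given resolution.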
The goal of this paper is to extend the wavelet entropy to the 2D case and to introduce new definitions of wavelet entropy using the Morlet directional wavelet in order to detect the spatial heterogeneity and complexity of spatial point patterns through scales and directions. The efficacy of the method is shown through a simulation study, and the directional wavelet entropy is then applied to the earthquake catalog of Chile for describing the spatial complexity of seismic events.
Wavelet analysis has succeeded in a variety of applications and holds promise in the area of spatial pattern analysis (e.g., [27][28][29]). Its main advantages are its ability to preserve and display locational information while allowing for pattern decomposition, and the fact that it does not require stationarity of the data. Despite these advantages, wavelet analysis has been used in only a few works for the detection of patterns (e.g., [30][31][32][33][34]).
Spatial point process models are useful tools to model irregularly-scattered point patterns that are frequently encountered in many studies of natural phenomena. A spatial point pattern is a set of points $\{x_i \in A : i = 1, \ldots, n\}$ in some planar region A. Very often, A is a sampling window within a much larger region, and it is reasonable to regard the point pattern as a partial realization of a stochastic planar point process, the events consisting of all points of the process that lie within A. Let N be this stochastic planar point process defined on $\mathbb{R}^2$, but observed on a finite observation window W. For an arbitrary set $A \subset \mathbb{R}^2$, let $|A|$ and $N(A)$ denote the area of A and the number of events from N that are in A, respectively.
The study of spatial point patterns has a long history in ecology and forestry [35][36][37][38]. Spatial point patterns have also found application in fields as diverse as cosmology [39], geography [40], seismology [41] and epidemiology [42]. Recent textbooks related to the topic of analysis and modeling of point processes include [43][44][45][46][47]. A point process is stationary and isotropic if its statistical properties do not change under translation and rotation, respectively. Informally, stationarity implies that one can estimate properties of the process from a single realization on A, by exploiting the fact that these properties are the same in different, but geometrically similar, subregions of A; isotropy means that there are no directional effects.
The assumption of isotropy is often made in practice because it simplifies interpretation, eases the analysis and increases the power of statistical analyses. However, isotropy is often hard to find in real applications, and many point processes are indeed anisotropic. Anisotropy takes many forms. Orientation analysis is the quantification of the degree of anisotropy in the case of non-isotropic point patterns and the detection of inner orientations in the case of isotropy [48][49][50][51]. Anisotropy can be present when the spatial point patterns contain points placed roughly on line segments [52]. A typical example of oriented point patterns is given by the distribution of earthquake epicenters after a "mainshock" event. "Aftershock" events are normally clustered along the linear directions or segments given by the active faults [53].
The arguments above and the literature on the analysis and detection of spatial anisotropies motivate a line of research on the detection of linearities in spatial point patterns and, more generally, on testing for spatial anisotropy. Here, we understand spatial anisotropy as the presence of main directions in the point pattern [54]. Ohser et al. [48], Brillinger [55] and Mateu et al. [34] proposed methods to assess isotropy (and consequently to detect anisotropy) for spatial point processes. The study of the spatial entropy of seismic events allows one to characterize the complexity of the earthquakes and, consequently, to assess their predictability.
The article is organized as follows. Section 2 provides some basic concepts of wavelets, and Section 3 gives a definition of wavelet entropy. Directional wavelets and the anisotropic wavelet entropy are introduced in Section 4. A simulation study is reported in Section 5. Finally, an application of the anisotropic wavelet entropy to the earthquake catalog of Chile is described in Section 6. The paper ends with some conclusions in Section 7.

Basics on Wavelets
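One-dimensional wavelets are functions with zero mean and moderate decay, such that they are non-zero only over a small region. They can be defined as translations and re-scalings of a single square-integrable function $\psi(x) \in L^2(\mathbb{R})$, called the wavelet function or the mother wavelet, as:
$$\psi_{a,b}(x) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{x-b}{a}\right), \tag{2}$$
where $a \in \mathbb{R}\setminus\{0\}$ and $b \in \mathbb{R}$ are the scale and shift parameters, respectively. Normalization by $1/\sqrt{|a|}$ ensures that the energy of the corresponding wavelet is independent of a and b, i.e.,
$$\int_{\mathbb{R}} |\psi_{a,b}(x)|^2\,dx = \int_{\mathbb{R}} |\psi(x)|^2\,dx.$$
For any function $f(x) \in L^2(\mathbb{R})$, the continuous wavelet transform is given by:
$$W_f(a,b) = \int_{\mathbb{R}} f(x)\,\overline{\psi_{a,b}(x)}\,dx, \tag{3}$$
where the overline denotes the complex conjugate. The two-dimensional extension of Equation (3) is straightforward: by denoting with $\mathbf{x} = (x, y)$ and $\mathbf{b} = (b_1, b_2)$ a spatial location and the translations, respectively, Equation (3) for $f(\mathbf{x}) \in L^2(\mathbb{R}^2)$ can be written as:
$$W_f(a,\mathbf{b}) = \int_{\mathbb{R}^2} f(\mathbf{x})\,\overline{\psi_{a,\mathbf{b}}(\mathbf{x})}\,d\mathbf{x},$$
where:
$$\psi_{a,\mathbf{b}}(\mathbf{x}) = \frac{1}{a}\,\psi\!\left(\frac{\mathbf{x}-\mathbf{b}}{a}\right), \quad a > 0,$$
is a 2D isotropic wavelet. An example of an isotropic wavelet is given by the Mexican hat, whose mother wavelet is defined, up to a normalizing constant, as $\psi(\mathbf{x}) = (2 - |\mathbf{x}|^2)\,e^{-|\mathbf{x}|^2/2}$. Isotropic wavelets are normally used when no oriented feature is present or relevant in the signal or image.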
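The 2D continuous wavelet transform with an isotropic wavelet can be approximated on a pixel grid by a correlation in the Fourier domain. The sketch below is one possible implementation under stated assumptions: the image is treated as periodic, the scale is measured in pixels, the Mexican hat is taken as $(2 - |\mathbf{x}|^2)e^{-|\mathbf{x}|^2/2}$ without a normalizing constant, and the function names are ours.

```python
import numpy as np

def mexican_hat_2d(x, y):
    """Isotropic Mexican hat mother wavelet, up to a normalizing constant."""
    r2 = x**2 + y**2
    return (2.0 - r2) * np.exp(-0.5 * r2)

def cwt2_isotropic(image, scale):
    """Wavelet coefficients W_f(a, b) at every pixel b for one scale a (in pixels)."""
    ny, nx = image.shape
    # Signed pixel offsets with the origin at index 0 (FFT wrap-around layout).
    dx = np.fft.fftfreq(nx) * nx
    dy = np.fft.fftfreq(ny) * ny
    X, Y = np.meshgrid(dx, dy)
    psi = mexican_hat_2d(X / scale, Y / scale) / scale   # (1/a) psi(x/a)
    # Correlation of the image with the wavelet, evaluated in the Fourier domain.
    spectrum = np.fft.fft2(image) * np.conj(np.fft.fft2(psi))
    return np.fft.ifft2(spectrum).real

image = np.zeros((64, 64))
image[10, 20] = 1.0                      # a single unit mass at row 10, column 20
W = cwt2_isotropic(image, scale=3.0)
peak = np.unravel_index(np.argmax(W), W.shape)
print(peak, W[10, 20])
```

For a single unit mass, the coefficient map is a copy of the scaled wavelet centered on the mass, so the largest coefficient sits at the location of the point. Non-periodic data would need padding to limit border effects.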

Wavelet Entropy
The continuous wavelet entropy based on the energy distribution is defined by [22] as:
$$WE(t) = -\int_{0}^{\infty} p_W(a,t) \log(p_W(a,t))\,da,$$
where:
$$p_W(a,t) = \frac{|W_f(a,t)|^2}{\int_{0}^{\infty} |W_f(a,t)|^2\,da}$$
is the wavelet energy probability distribution for each scale a at time t (b = t in the wavelet definition of Equation (2)). Similarly to the classical definition, the wavelet entropy is maximum when the signal is a "white noise" process, and it is minimum when it is an ordered mono-frequency signal. In the latter case, approximately 100% of the energy concentrates around one unique level (or scale), and the wavelet entropy will be close to zero. Conversely, when considering a "white noise", all levels will carry a certain amount of energy, and the wavelet entropy will be maximal. Through this multiscale approach, one can detect the relevant scale levels that represent the highest complexity of the system.
Discretization choices of the parameters a and b can be used for defining a multiresolution wavelet entropy by allowing for a scale representation of the disorder of a system [56]. For example, the discretization $a = 2^{-j}$ and $b = k2^{-j}$ produces an orthogonal basis given by $\{\psi_{j,k}(x) = 2^{j/2}\psi(2^{j}x - k),\ j, k \in \mathbb{Z}\}$ [57], and any function $f \in L^2(\mathbb{R})$ can be represented as $f(x) = \sum_{j,k} d_{j,k}\,\psi_{j,k}(x)$, where $d_{j,k}$ are the multiresolution coefficients at time k and scale j [58]. More general discretizations are given by $a = a_0^{-j}$, $b = k b_0 a_0^{-j}$, $j, k \in \mathbb{Z}$, $a_0 > 1$, $b_0 > 0$. In the discrete case, the energy of the signal can be approximated by $E_W(j,k) = |d_{j,k}|^2$. Summing this energy over the discrete times k leads to an approximation of the energy content at scale j,
$$E_W(j) = \sum_{k} |d_{j,k}|^2.$$
Following the Shannon entropy framework, the probability distribution of energy across the scales is given by:
$$p_W(j) = \frac{E_W(j)}{\sum_{j} E_W(j)},$$
with $\sum_{j} p_W(j) = 1$ (because of the orthogonal representation). Finally, the multiresolution wavelet entropy (MWE) at the scale j is defined by analogy with the continuous wavelet entropy:
$$MWE(j) = -p_W(j) \log(p_W(j))$$
(see [20,23,24]).
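The discrete construction can be tried on a toy signal. The sketch below uses an orthonormal Haar decomposition written directly in NumPy, which is our choice for keeping the example self-contained; the paper does not state which orthogonal wavelet family was used. It computes the scale energies $E_W(j) = \sum_k |d_{j,k}|^2$, the distribution $p_W(j)$ and the terms $-p_W(j)\log_2 p_W(j)$.

```python
import numpy as np

def haar_detail_coefficients(signal):
    """Detail coefficients d_{j,k} of an orthonormal Haar decomposition, finest level first."""
    approx = np.asarray(signal, dtype=float)
    n = approx.size
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two (at least 2)")
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # details at this level
        approx = (even + odd) / np.sqrt(2.0)          # input for the next, coarser level
    return details

def multiresolution_wavelet_entropy(signal):
    """Return p_W(j) and the terms -p_W(j) log2 p_W(j), one entry per level."""
    details = haar_detail_coefficients(signal)
    energy = np.array([np.sum(d**2) for d in details])   # E_W(j)
    if energy.sum() == 0:
        raise ValueError("signal has no detail energy (it is constant)")
    p = energy / energy.sum()
    mwe = np.zeros_like(p)
    nz = p > 0                                            # convention 0 * log(0) = 0
    mwe[nz] = -p[nz] * np.log2(p[nz])
    return p, mwe

rng = np.random.default_rng(0)
n = 1024
alternating = np.tile([1.0, -1.0], n // 2)   # energy only at the finest level
noise = rng.standard_normal(n)               # energy present at every level

_, mwe_alt = multiresolution_wavelet_entropy(alternating)
_, mwe_noise = multiresolution_wavelet_entropy(noise)
print(mwe_alt.sum(), mwe_noise.sum())
```

The ordered signal puts all of its energy in one level, so its summed entropy is zero, while the noise spreads energy over all levels and gives a clearly positive value. With an orthonormal dyadic transform, the noise energy per level scales with the number of coefficients at that level, so the distribution across levels is spread out but not uniform.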
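The definition of wavelet entropy can be easily extended to 2D isotropic processes as follows:
$$WE(\mathbf{b}) = -\int_{0}^{\infty} p_W(a,\mathbf{b}) \log(p_W(a,\mathbf{b}))\,da,$$
where:
$$p_W(a,\mathbf{b}) = \frac{|W_f(a,\mathbf{b})|^2}{\int_{0}^{\infty} |W_f(a,\mathbf{b})|^2\,da}$$
corresponds to the wavelet energy probability distribution for each scale a at point $\mathbf{b}$.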
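On a discrete set of scales, the entropy at each location can be computed from a stack of coefficient maps. The following sketch assumes the coefficients are stored in an array of shape (number of scales, rows, columns) and replaces the integral over scales by a plain sum; the function name `local_wavelet_entropy` and the toy array are illustrative only.

```python
import numpy as np

def local_wavelet_entropy(coeffs):
    """Entropy over scales at each location b; coeffs has shape (n_scales, ny, nx)."""
    energy = np.abs(np.asarray(coeffs)).astype(float) ** 2
    total = energy.sum(axis=0, keepdims=True)
    # p_W(a, b): share of the energy at location b carried by scale a.
    p = np.divide(energy, total, out=np.zeros_like(energy), where=total > 0)
    # Convention 0 * log(0) = 0.
    logp = np.log2(p, out=np.zeros_like(p), where=p > 0)
    return -(p * logp).sum(axis=0)

coeffs = np.zeros((4, 2, 2))
coeffs[0, 0, 0] = 1.0     # location (0, 0): energy at a single scale
coeffs[:, 1, 1] = 1.0     # location (1, 1): equal energy at all four scales
entropy_map = local_wavelet_entropy(coeffs)
print(entropy_map)
```

Locations whose energy sits at one scale get zero entropy, and locations with energy evenly spread over the four scales reach log2(4) = 2. Locations with no energy at all are assigned zero here, which is a convention of this sketch.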
Denoting by $E_W(a,\mathbf{b}) = |W_f(a,\mathbf{b})|^2$ the energy of the wavelet coefficients at point $\mathbf{b}$ and for scale a, the energy content at scale a is given by:
$$E_W(a) = \int_{\mathbb{R}^2} E_W(a,\mathbf{b})\,d\mathbf{b}.$$
In order to follow the Shannon entropy framework, a probability density function must be defined as the ratio between the energy of each level and the total energy:
$$p_W(a) = \frac{E_W(a)}{\int_{0}^{\infty} E_W(a)\,da}.$$
This corresponds exactly to the probability density distribution of energy across the scales, for which the following relation holds:
$$\int_{0}^{\infty} p_W(a)\,da = 1.$$
Finally, the multiscale wavelet entropy (MWE) at the scale a is defined as:
$$MWE(a) = -p_W(a) \log(p_W(a)). \tag{8}$$
Integrating Equation (8) over all scales a, we obtain a global measure of wavelet entropy (GWE),
$$GWE = -\int_{0}^{\infty} p_W(a) \log(p_W(a))\,da.$$
In the discrete case, the 2D wavelet coefficients are denoted by $d_{j,\mathbf{k}}$, where $\mathbf{k}$ is a bidimensional vector, $\mathbf{k} \in \mathbb{Z}^2$. Using the multiresolution analysis, the energy $E_W(j,\mathbf{k}) = |d_{j,\mathbf{k}}|^2$ can be used to define the 2D discrete multiscale wavelet entropy (DMWE),
$$DMWE(j) = -p_W(j) \log(p_W(j)),$$
where:
$$p_W(j) = \frac{\sum_{\mathbf{k}} |d_{j,\mathbf{k}}|^2}{\sum_{j}\sum_{\mathbf{k}} |d_{j,\mathbf{k}}|^2}.$$
The discrete global measure of wavelet entropy (DGWE) is then obtained by summing the DMWE over all scales, $DGWE = \sum_{j} -p_W(j) \log(p_W(j))$.

Directional Wavelet Entropy
When the aim of the study is to detect oriented features of a signal or an image, a directional wavelet has to be used. A directional wavelet is not rotation invariant, and its transform gives information about the best angular selectivity.
For $\mathbf{x} \in \mathbb{R}^2$ and any function $f(\mathbf{x}) \in L^2(\mathbb{R}^2)$, the continuous directional wavelet transform for a scale a and an orientation $\theta$ is given by:
$$W_f(a,\mathbf{b},\theta) = \int_{\mathbb{R}^2} f(\mathbf{x})\,\overline{\psi_{a,\mathbf{b}}(\mathbf{x},\theta)}\,d\mathbf{x}.$$
In the literature, a variety of directional wavelets $\psi_{a,\mathbf{b}}(\mathbf{x},\theta)$ have been proposed. In particular, [59] introduced a flexible function called the fully-anisotropic directional Morlet wavelet, given by:
$$\psi(\mathbf{x},\theta) = e^{i\,\mathbf{k}_0 \cdot C\mathbf{x}}\; e^{-\frac{1}{2}\,(C\mathbf{x}) \cdot (A^{T}A\,C\mathbf{x})}, \tag{13}$$
where $\mathbf{k}_0 = (0, k_0)$ is a wave vector with $k_0 \geq 5.5$, $A = \mathrm{diag}(D, 1)$ denotes a diagonal matrix and D is the anisotropy ratio defined as the ratio of the length of the elliptical envelope in the y-direction to the length of the elliptical envelope in the x-direction. The matrix C is a linear transformation given by:
$$C = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$
This transformation rotates the entire wavelet through an angle $\theta$ defined as positive in the counterclockwise direction. Two examples of this fully-anisotropic wavelet for directions $\theta = 30°$ and $\theta = 90°$ are shown in Figure 1. In order to identify the behavior of the process in different directions, [60] introduced two new functions, $\eta(a,\theta)$ and $\zeta(a,\theta)$, given by:
$$\eta(a,\theta) = \int_{\mathbb{R}^2} |W_f(a,\mathbf{b},\theta)|^2\,d\mathbf{b} \tag{14}$$
and:
$$\zeta(a,\theta) = \frac{\eta(a,\theta)}{\int \eta(a,\theta)\,d\theta}.$$
The component $|W_f(a,\mathbf{b},\theta)|^2$ in Equation (14) gives the distribution of the energy of a function at location $\mathbf{b}$, scale a and direction $\theta$. Hence, $\eta(a,\theta)$ characterizes the distribution of the energy at different scales and directions, whereas $\zeta(a,\theta)$ provides the relative distribution of the energy in different directions at a particular scale. Mateu et al. [34] used fully-anisotropic wavelets for detecting linear patterns in spatial point processes.
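To make the roles of $\theta$, D and $k_0$ concrete, the following sketch evaluates a fully-anisotropic Morlet wavelet on a grid. It assumes the form $\psi(\mathbf{x},\theta) = e^{i\mathbf{k}_0\cdot C\mathbf{x}}\,e^{-\frac{1}{2}(C\mathbf{x})\cdot(A^{T}AC\mathbf{x})}$ with $\mathbf{k}_0 = (0, k_0)$, $A = \mathrm{diag}(D, 1)$ and C chosen so that a positive $\theta$ turns the wavelet counterclockwise. The function name, the default D = 2 and the grid are our choices, and the sign convention of C should be checked against [59] before any serious use.

```python
import numpy as np

def anisotropic_morlet(x, y, theta_deg, D=2.0, k0=5.5):
    """Fully-anisotropic Morlet wavelet at the points (x, y) for orientation theta."""
    theta = np.deg2rad(theta_deg)
    # (u, v) = C x: coordinates of each point in the frame rotated by theta.
    u = np.cos(theta) * x + np.sin(theta) * y
    v = -np.sin(theta) * x + np.cos(theta) * y
    # k0 = (0, k0): the oscillation runs along the second rotated axis.
    carrier = np.exp(1j * k0 * v)
    # A = diag(D, 1): (C x) . (A^T A C x) = D^2 u^2 + v^2, an elliptical envelope.
    envelope = np.exp(-0.5 * (D**2 * u**2 + v**2))
    return carrier * envelope

grid = np.linspace(-4.0, 4.0, 81)
X, Y = np.meshgrid(grid, grid)
psi_30 = anisotropic_morlet(X, Y, theta_deg=30.0)
psi_90 = anisotropic_morlet(X, Y, theta_deg=90.0)
print(psi_30.shape, np.abs(psi_90).max())
```

With this envelope, the width along the first rotated axis is 1/D times the width along the second one, so D larger than one gives a wavelet elongated in the direction of the oscillation. Scaling and translation would be applied by evaluating the function at $(\mathbf{x}-\mathbf{b})/a$ and dividing by a.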
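Using the anisotropic wavelet of Equation (13), and denoting by:
$$E_W(a,\mathbf{b},\theta) = |W_f(a,\mathbf{b},\theta)|^2$$
the energy of the wavelet coefficients at point $\mathbf{b}$, scale a and with angle $\theta$, one can define a probability density function for each direction $\theta$ and scale a as follows:
$$p_W(a,\theta) = \frac{E_W(a,\theta)}{\int\!\!\int E_W(a,\theta)\,d\theta\,da}, \tag{16}$$
where:
$$E_W(a,\theta) = \int_{\mathbb{R}^2} E_W(a,\mathbf{b},\theta)\,d\mathbf{b}$$
and $\theta \in (0°, 180°)$. In this case, the multiscale directional wavelet entropy (MDWE) can be defined as:
$$MDWE(a,\theta) = -p_W(a,\theta) \log(p_W(a,\theta)).$$
Thus, by integrating the energy $E_W(a,\theta)$ over $\theta$ in Equation (16), that is,
$$E_W(a) = \int_{0°}^{180°} E_W(a,\theta)\,d\theta, \tag{18}$$
we obtain the isotropic multiscale wavelet entropy. The spatial heterogeneity of the wavelet entropy can be assessed by decomposing the domain of the 2D process into several boxes and calculating the entropy in each box.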
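Once the energies $E_W(a,\theta)$ are available on a grid of scales and angles, the directional entropy and its isotropic counterpart follow from a few array operations. The sketch below assumes the energies are stored as a matrix with one row per scale and one column per angle, and it replaces the integrals by sums; the function names and the toy energy matrix are illustrative only.

```python
import numpy as np

def _neg_p_log_p(p):
    """Elementwise -p log2(p), using the convention 0 * log(0) = 0."""
    out = np.zeros_like(p)
    nz = p > 0
    out[nz] = -p[nz] * np.log2(p[nz])
    return out

def directional_wavelet_entropy(energy):
    """energy[i, j] = E_W(a_i, theta_j); returns (MDWE per cell, isotropic MWE per scale)."""
    energy = np.asarray(energy, dtype=float)
    p_dir = energy / energy.sum()                 # p_W(a, theta)
    mdwe = _neg_p_log_p(p_dir)
    energy_scale = energy.sum(axis=1)             # sum over theta gives E_W(a)
    p_scale = energy_scale / energy_scale.sum()   # p_W(a)
    mwe = _neg_p_log_p(p_scale)
    return mdwe, mwe

energy = np.ones((4, 6))      # 4 scales, 6 angles, flat background energy
energy[2, 1] = 20.0           # one dominant scale and direction
mdwe, mwe = directional_wavelet_entropy(energy)
print(np.unravel_index(np.argmax(mdwe), mdwe.shape))
print(mwe.round(3))
```

The function $-p\log p$ is not monotone in p (it peaks at p = 1/e), so a cell that holds nearly all of the energy would contribute little entropy. This is worth keeping in mind when reading entropy maps over scales and directions.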

Simulation Study
For illustrative purposes, we considered three different scenarios of point patterns. In the first scenario, we considered a set of 961 spatial points distributed on a regular 31 × 31 grid (Figure 2a). In the second scenario, we simulated N = 1000 points from a uniform distribution (Figure 2b), and in the third scenario, we simulated $N_1 = 500$ points from a uniform distribution and $N_2 = 500$ points along a linear pattern with a slope of 45 degrees (Figure 2c). Further examples could be considered with a slope ranging between zero and 180 degrees as in [34].
Figure 3 represents the directional multiscale wavelet entropy for the three datasets in Figure 2. In the first case (Figure 3a), the entropy is approximately equal to zero for each direction and for each scale. The higher values in the first level of resolution corresponding to the angles 0, 90 and 135 degrees are due to border effects. In the second case (Figure 3b), the entropy seems higher in the first levels of resolution, but it is approximately equal for each direction. In the third case (Figure 3c), the entropy is higher in the area where the two random processes are superimposed, that is, along the linear direction of 45 degrees.
If we evaluate the entropy for each scale of resolution (Figure 4) using Equation (18), we observe that the entropy for the regular grid of points shows a fast decrease in the first levels of resolution, while in the case of a random pattern, the entropy measure has a slower decay. A notably different behavior is shown when the process is generated from two different random processes, as in Figure 2c. In this case, the entropy reaches its maximum value at a certain level of resolution. Figure 5 represents the spatial entropy for all directions and all scales for the three scenarios: low entropy for a regular grid (Figure 5a), high entropy for the random process (Figure 5b) and directional entropy for the superposition of both patterns (Figure 5c). Table 1 shows the global measure of wavelet entropy for the three scenarios. As expected, the first process given by points on a regular grid has the lowest entropy, while the third process resulting from the overlapping of two uniform distributions shows the highest entropy.
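The three scenarios can be generated along the following lines. This is a sketch under assumptions: the unit square as the observation window, a fixed random seed, and linear-pattern points placed exactly on the line y = x without jitter. The paper does not give these details, and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scenario 1: 961 points on a regular 31 x 31 grid over the unit square.
ticks = np.linspace(0.0, 1.0, 31)
gx, gy = np.meshgrid(ticks, ticks)
grid_pts = np.column_stack([gx.ravel(), gy.ravel()])

# Scenario 2: 1000 points drawn uniformly on the unit square.
uniform_pts = rng.uniform(0.0, 1.0, size=(1000, 2))

# Scenario 3: 500 uniform points plus 500 points on a line with a slope of 45 degrees.
background = rng.uniform(0.0, 1.0, size=(500, 2))
t = rng.uniform(0.0, 1.0, size=500)      # uniform positions along the line
line_pts = np.column_stack([t, t])       # y = x
mixed_pts = np.vstack([background, line_pts])

print(grid_pts.shape, uniform_pts.shape, mixed_pts.shape)
```

Other orientations can be obtained by rotating `line_pts` about the center of the window before stacking.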

Application to Earthquake Data
We consider the Chilean earthquake catalog from January 2007 to August 2014, including 13,883 events with a magnitude larger than 3.0 and a depth less than 60 km. In order to show the spatial entropy, we focus on three main seismic areas represented in Figure 6. The first area (Figure 7a) is delimited by a rectangle with longitudes (−72, −79) and latitudes (−22.62, −18.62) and includes an earthquake event with magnitude 8.2 on 1 April 2014. The second zone (Figure 7b), including the town of Valparaiso, is delimited by coordinates with latitudes (−33.79, −29.79) and longitudes (−72, −69). Finally, the third area (Figure 7c), delimited by longitudes (−74.5, −71.5) and latitudes (−38.13, −34.13), includes a big earthquake (with magnitude 8.8) on 27 February 2010 near the town of Concepción. Although all areas extend along the Nazca plate, there are many other smaller faults along which earthquakes tend to cluster.
The directional wavelet entropy can identify linear spatial directions with different entropy. Figure 8 shows the wavelet entropy for each scale a and direction θ assessed on each dataset represented in Figure 7a-c, respectively. All plots show the maximum entropy along different dominant linear directions: 110 degrees for the first plot, 80 degrees for the second plot and 45 degrees for the third one. The entropy analysis through scales (Figures 9 and 10) shows the presence of different degrees of complexity at different scales. While Figures 9a and 9c indicate the maximum entropy in correspondence to a certain scale, in Figure 9b, the maximum entropy extends over a range of scales. The different behavior of Figure 9b is probably due to the source of its seismic events: unlike the events of Figures 7a and 7c, which are clustered along the main fault that caused their big earthquakes, Figure 7b shows different anisotropic clusters, probably due to the ruptures of independent faults. If we consider the global (isotropic) spatial entropy as described by Equation (11), it is not possible to distinguish the main directions where the entropy tends to be the same. However, we can observe that the higher values of entropy are not concentrated close to the epicenters of the big events, where most likely the predictability is larger. This is confirmed by the analysis in Figure 11, which shows the directional multiscale entropy before (a) and after (b) the Iquique earthquake with a magnitude of 8.2 that occurred in the north of Chile on 1 April 2014. Although the entropy seems to be concentrated along a particular direction before the big earthquake, the uncertainty seems higher than in the period after the M8.2 earthquake. Similar results are shown in Table 2, which reports the global entropy for each area.

Conclusions
In this work, we have extended the concept of wavelet entropy to the two-dimensional case in order to detect the different complexity of a spatial point pattern. A new definition of wavelet entropy has been proposed for anisotropic processes using the directional Morlet wavelets. A simulation study has been used to show that the wavelet entropy is minimum when the process is a regular process, and it is maximum when the process is given by an overlapping of random point processes. Finally, an application to the earthquake catalog of Chile has been considered to detect the different spatial complexity. In particular, the results have shown that entropy is lower in spatial zones where a big earthquake happened than in areas characterized by different seismic activity, such as the area around the town of Valparaiso. Furthermore, the wavelet entropy is higher before a big earthquake than after it. This means that aftershock events, which normally cluster along an active fault, have a lower degree of uncertainty than foreshock events, which are often characterized by a random distribution. The proposed methodology provides an important contribution to the study of earthquakes in Chile, given that it allows assessing the predictability of earthquakes in a given area.

Figure 2. (a) Spatial points on a regular grid of size 31 × 31; (b) simulated spatial points (n = 1000) from a uniform distribution U[0, 1]; (c) simulated spatial points from a uniform distribution U[0, 1] with n = 500, overlapped with a simulated linear pattern generated by a uniform distribution U[0, 1] with n = 500 along the line with a slope of 45 degrees.

Figure 3. Directional multiscale wavelet entropy for (a) the regular grid; (b) the set of random points; (c) the set of random points with a linear pattern.

Figure 4. Multiscale wavelet entropy for the regular grid (blue line), the set of random points (dashed line) and the set of random points with an overlapped linear pattern (red points).

Figure 5. Spatial wavelet entropy for (a) the regular grid; (b) the set of random points; (c) the set of random points with an overlapped linear pattern.

Table 1. Global entropy for the three processes: (a) spatial points on a regular grid of size 31 × 31; (b) simulated spatial points (n = 1000) from a uniform distribution U[0, 1]; (c) simulated spatial points from a uniform distribution U[0, 1] with n = 500 together with an overlapped linear pattern generated by a uniform distribution U[0, 1] with n = 500 along the line y = x.

Figure 6. Earthquake catalog of Chile from January 2007 to August 2014 with a depth less than 60 km and a minimum magnitude of 3.0; red rectangles indicate three important seismic areas.

Figure 8. Directional multiscale wavelet entropy for the epicenters for each area of Figure 7.

Figure 11. Wavelet entropy for each scale a and for each angle θ for the earthquake events before (a) and after (b) the Iquique earthquake occurring on 1 April 2014.

Table 2. Global entropy for the earthquake catalog of Chile: "Window 1", "Window 2" and "Window 3" represent the wavelet entropy for the events in the rectangular windows represented in Figures 7a, 7b and 7c, respectively. The last two rows show the entropy values for the period before the Iquique earthquake and after the Iquique earthquake.