The Empirical Watershed Wavelet

The empirical wavelet transform is an adaptive multi-resolution analysis tool based on the idea of building filters on a data-driven partition of the Fourier domain. However, existing 2D extensions are constrained by the shape of the detected partitioning. In this paper, we provide theoretical results that permits us to build 2D empirical wavelet filters based on an arbitrary partitioning of the frequency domain. We also propose an algorithm to detect such partitioning from an image spectrum by combining a scale-space representation to estimate the position of dominant harmonic modes and a watershed transform to find the boundaries of the different supports making the expected partition. This whole process allows us to define the empirical watershed wavelet transform. We illustrate the effectiveness and the advantages of such adaptive transform, first visually on toy images, and next on both unsupervised texture segmentation and image deconvolution applications.


Introduction
Within the field of image processing and computer vision, multi-resolution analysis is a vastly applicable tool. For example, the use of multi-resolution analysis and compressive sensing theory has led to state-of-the-art denoising and deconvolution techniques. In addition, the use of multi-resolution analysis has shown to be quite effective in texture analysis.
Wavelets have been the standard tool for multi-resolution analysis, and their effectiveness is matched by their popularity. However, wavelets are prescriptive in their construction in the sense that the wavelet filters are built independently of the data. Adaptive (i.e., data-driven) methods have shown many promising improvements in applications across all fields of science. Hereafter, we review the most popular adaptive methods developed in the last two decades.
An early adaptive method is the empirical mode decomposition (EMD) [1,2]. The EMD is an iterative algorithmic method that attempts to decompose a signal into its amplitude-modulated frequency-modulated components (also called harmonic modes). Despite its success in analyzing non-stationary signals, the EMD lacks a solid mathematical foundation due to its algorithmic nature, making it difficult to predict its behavior. Furthermore, the bidimensional empirical mode decomposition is computationally inefficient, especially with the presence of noise [3]. An alternative, called variational mode decomposition (VMD), was proposed by Dragomiretskiy et al. [4,5]. It aims at decomposing the spectrum of an image into a mixture of Gaussians with specific constraints, using optimization techniques, while this method has seen success in many applications [6][7][8][9][10].
The assumption that a spectrum can be meaningfully represented as a mixture of Gaussians can lead to a loss of information. However, another approach is using synchrosqueezed wave packets, proposed by Daubechies et al. in [11], extended into 2D by Yang et al. in [12]. Synchrosqueezed wave packets provide more accurate time/space-frequency resolution representations than classic wavelets by using a reassignment procedure. Recently, an arbitrary shape filter bank (ASFB) construction was proposed by Mahde et al. in [13]. Given a boundary line, the ASFB uses optimization techniques to define a low-pass and high-pass filter. However, since the ASFB defines only two filters, it is insufficient for many applications where more modes are present in the image. One more approach, proposed by Gilles et al. in [14,15], is the empirical wavelet transform (EWT). The empirical wavelet transform aims to build wavelet filter banks whose supports in the frequency domain are detected from the information contained in the spectrum of the signal/image. This process is equivalent to building wavelet filters based on an adaptive partitioning of the Fourier domain. The EWT has multiple bidimensional extensions based on Tensor, Littlewood-Paley, and Curvelet wavelets. The EWT has shown promising results in many applications [16][17][18][19][20]. However, in 2D, the detected partitions are made of subsets of constrained shapes limiting the adaptability capability of the EWT.
In this paper, we propose to remove such limitations by defining a new 2D version of the empirical wavelet transform based on fully arbitrary partitions of the frequency domain. This paper is organized as follows: In Section 2, we review the empirical wavelet transform and its existing implementations. We illustrate the limitations in the existing 2D approaches. In Section 3, we investigate the theoretical aspects of building 2D empirical wavelet filters based on arbitrary supports in the Fourier domain. We also propose an algorithm, based on scale-space representations and the watershed transform, to automatically detect a partition corresponding to dominant harmonic modes. Given such a partition, we define the empirical watershed wavelet transform (EWWT). In Section 4, we illustrate the performance of the EWWT as a multi-resolution analysis method, first on a synthetic image with chosen harmonic modes. Next, we illustrate that the EWWT allows improvements on the results obtained in unsupervised texture segmentation and in nonblind deconvolution. Finally, we give our concluding remarks in Section 5.

The Empirical Wavelet Transform
The empirical wavelet transform (EWT) is a method first proposed in 1D by Gilles in [14] and extended to 2D in [15]. The EWT is based on the idea of constructing a family of wavelets based on a set of detected (i.e., data-driven) supports in the Fourier domain corresponding to different harmonic modes. Hereafter, we first recall the 1D empirical wavelet transform, then the 2D extensions, as well as the support detection technique based on scale-space representations.

1D Empirical Wavelet Transforms
The one-dimensional, empirical wavelet transform constructs a set of wavelets based on an arbitrary partitioning of the 1D Fourier domain, see Figure 1. We consider a normalized Fourier axis, which is 2π periodic, and we denote the Fourier transform of a function f ∈ L 1 ∩ L 2 as f . While we consider real signals here, which have symmetric spectra, the generalization to complex signals is fairly straightforward and can be found in [21]. Figure 1. Construction of 1D Empirical Wavelet: given a set of boundaries {ω i }, we define ψ n as a band-pass filter between ω n and ω n+1 with transition regions of width 2τ n and 2τ n+1 .
With the set of empirical scaling function and empirical wavelets, {φ 1 (t), {ψ n (t)} N n=1 }, we can then define the empirical wavelet transform, denoted W f (n, t). The empirical scaling coefficients are given by the inner product of the signal and φ 1 , and the detail coefficients W t (n, t) are defined similarly, where * and (.) ∨ denote respectively the convolution operator and the inverse Fourier transform. Due to the tight frame condition, the original signal f can be reconstructed from W f (n, t) via

2D Empirical Wavelet Transforms
In 2D, the set of detected supports is characterized using regions in the Fourier plane, rather than intervals on the real line. Adding this extra dimension brings an extra degree of liberty in the detection of partitions: geometry. In 1D, intervals correspond to basic geometric objects; however, in 2D, the detected regions can have very different shapes. The 2D extensions proposed in [15] revisit three main families of classic wavelets based on Fourier supports of specific shapes. The first one, the tensor empirical wavelet transform, is a separable approach processing rows and columns independently, based on rectangular supports. The second one aims at building Littlewood-Paley wavelet filters, whose purpose is to separate scales, defined by concentric rings supports. The third one is a curvelet type transform and captures both scale and orientation information by defining supports that are polar wedges shaped supports. The adaptability of these different transforms comes from the fact that the positions and sizes of these different geometric structures are driven by the spectrum of the analyzed image. It is worth noticing that three different strategies were proposed to build the empirical curvelets. The first detects scales and angular sectors independently. The second detects scales first and then detects the angular sectors within each scale ring. The third and final option detects the angular sectors first and then detects scales within each. An example partition of each of the 2D empirical wavelet transforms (only the curvelet first option is depicted here) is illustrated in Figure 2. If each of these transforms allow for adaptive multi-resolution analysis with specific properties, they are still constrained by the restrictions on the shapes of their detected supports, limiting their degree of adaptability. The purpose of this paper is to remove such limitations by considering completely arbitrary partitions, but before we describe this new approach, the next section will review the scale-space approach to detect such data-driven partitions.

Scale-Space Representations
As described in the previous sections, the ability for the EWT to be adaptive lies in the fact that the method is capable of detecting the "optimal" supports (in the Fourier domain) for each wavelet filter. This leads to the natural question: how can we detect such a partition of the Fourier domain? Since the goal for each filter is to extract meaningful harmonic modes, let us first define a harmonic mode. A harmonic mode will correspond to an amplitude modulated/frequency modulated (AM/FM) signal, denoted f n (t) = A n (t) cos(ν n (t)), where A n (ω) and ν n (ω) are fast decaying (i.e., of almost bounded support) functions. It is also assumed that two consecutive f n (ω) are separated enough to be independently observed. From a geometric point of view, this means that f (ω) is the superposition of distinct lobes. Detecting the expected partition then corresponds to find the boundaries between successive modes. The original approaches proposed by Gilles et al. in [14,15] either consider the midpoint or the lowest minimum between the highest maxima. The major drawback of such approaches is that they require the knowledge of the expected number of modes, which is, in the general case, not necessarily known.
An automatic detection algorithm (including the number of modes) based on scale-space representations has been proposed in [22] and is recalled hereafter. The scale-space representation [23] of a function g provides the content of g at multiple scales by using a blurring operator to continuously remove components of small scales. In the 1D discrete case, it is given [23] by where s is a scale parameter and T(k; s) = e −αs I k (αs) is the 1D discrete Gaussian kernel. Note that I k is the Bessel function of first order and α ∈ R. A straightforward 2D extension, as will be necessary in Section 3.2.1, is simply obtained by using separable convolutions: Since we consider the discrete case, we will sample s with a fixed step-size, denoted s 0 , which will be discussed in the experimental section. One of the main axioms fulfilled by such scale-space representations is the guarantee that the number of extrema can only decrease when s increases. This reduction of extrema permits us to define how meaningful each original minimum is by simply measuring its lifespan (i.e., the value of s before this minimum disappears). Given the lifespans of all initial minima, each of them are then classified as meaningful/persistent or not.
As such, to detect the expected partition of the Fourier domain, we compute the scale-space representation of the signal magnitude spectrum, i.e., we set g = | f |, to extract the meaningful minima, which will correspond to the set of expected boundaries {ω n }. In the 2D extensions described in Section 2.2, detecting 2D partitions resumes in performing multiple 1D detections (by averaging the 2D Fourier spectrum or pseudo-polar Fourier spectrum with respect each dimension) because of their specific structures. We invite the reader to check [14,15] and especially [22] for more details and numerical implementations.

A New 2D Empirical Wavelet Transform
In the following sections, we present a general approach to remove the geometric constraints on the detected partitions in the previously described 2D empirical wavelets. First, we give a mathematical construction of wavelet filters is based on arbitrarily shaped supports and shows that these filters permit us to define a general empirical wavelet transform. We prove the existence of dual filters, allowing the definition of the inverse transform. Second, we present a new algorithm to detect 2D partitions of arbitrary geometry, inspired by the 1D scale-space algorithm discussed in Section 2.3.

2D Empirical Wavelets of Arbitrary Supports
Let us assume we are given a discrete image is the image domain (pixels coordinates will be denoted (i, j) ∈ Λ) and an arbitrary partition, , frequency coordinates will be denoted (k, l) ∈ Ω), whose detection will be discussed in Section 3.2. We assume the partition {Ω n } N n=1 has the following properties: N n=1 Ω n = Ω and Ω n ∩ Ω m = ∅ if n = m, whose boundaries delineate the expected supports. In order to define a transition region similar to the one shown in Figure 1 for the 1D EWT, we must define the distance from any point in the image to these boundaries. As such, for each region Ω n , we define a distance transform of the region: where ∂Ω n is the boundary of the region Ω n and d(., ., ., .) is the quasi-Euclidean distance, defined by This is done such that, for each region, we define any point in the image spectrum by its distance to the region's boundary. From there, we can define a 2D empirical wavelet as where τ is the desired transition area width and β(x) is an arbitrary C k function such that A usual choice for β is Unlike in the 1D case or the existing 2D cases, there are no theoretical results for the appropriate choice of τ, and therefore it must be chosen experimentally. Nevertheless, we can show that the set {ϕ n } forms a frame.
Proof. Assuming, for now that (to be shown later) ∃A, then, using Parseval's theorem, we get which implies that {ϕ n,i,j } forms a frame of bounds A and B.

It remains to show that such bounds
We can observe that three situations, as shown in Figure 3, can occur: -(k, l) lies outside of a transition area -(k, l) lies in a transition area between only two regions -(k, l) lies in a transition area between three or more regions It is easy to see that, if (k, l) is not within a transition area, ϕ n,k,l is nonzero for only one index n, and, by construction, we have ∑ N n=1 | ϕ n,k,l | 2 = 1. It remains to show the bounds if (k, l) lies within a transition area.
The lower bound is decided when the transition area is defined between two regions. In this situation, we have Since τ+D 1 (k,l) 2τ and since cos(x) = sin(x + π 2 ), this is equivalent to Therefore, our lower bound is A = 1.
In the situation where the transition area is between r regions, we have we therefore get an upper bound B = r − 1 which completes the proof.
With our set of empirical wavelets in hand, the corresponding transform is then defined by Since The existence of such dual frame allows us to reconstruct the original signal f from its empirical wavelet transform:

On the Detection of Partitions of Arbitrary Geometry
In this section, we address the question of how to detect partitions of the Fourier domain corresponding to meaningful harmonic modes of arbitrary geometry. The proposed method uses a 2D scale-space representation to find the "center" of harmonic modes, followed by a watershed transform to find the arbitrary boundaries delimiting the different regions Ω n . These two steps are described in detail in the next two sections. From now on, we assume our image f is discrete of size N × N .

Scale-Space Localization of Harmonic Modes
In Section 2.3, we showed that the expected 1D boundaries corresponded to meaningful local minima which can be found using scale-space representations. Unfortunately, 2D boundaries separating the different regions correspond to arbitrary curves and it becomes very hard to characterize what a meaningful curve is. Since our goal is to separate harmonic modes, assuming that these modes are well enough separated lobes in the spectrum, we propose first to detect the potential candidates by selecting meaningful local maxima. To do so, we follow a similar approach as in Section 2.3 but looking for persistent maxima instead of minima. To resume the process, we build a 2D scale-space representation of the image magnitude spectrum, i.e., g = | f |, detect all local maxima at each scale s, and measure their lifespans (i.e., the largest s before they disappear). From there, we create a histogram of the persistence of the maxima and use Otsu's method to define a threshold to classify each maxima as persistent or not. The locations of all persistent maxima, denoted {µ n }, are then extracted; these locations represent "centers" of the expected harmonic modes. Because we are working in a discrete space, the scale step-size s 0 is a parameter and we set the maximum scale proportional to s 0 and the image size, s max = 4 s 0 N K , where K is the size of the kernel used to create the scale-space representation. Figure 4a illustrates the tracking of maxima through the scale-space, while Figure 4b shows the remaining persistent maxima after thresholding.

Watershed Transforms
Given a set of mode centers, {µ n }, we wish to find a set of boundaries which will define the mode supports. Taking again the point of view that modes correspond to lobes, then a natural way to find such boundaries is to select the bottom of the valleys between the modes. From a mathematical perspective, such a process corresponds to finding the path of lowest separation (this idea is a direct extension of the principle used in [15], for finding 1D boundaries, where the lowest minimum between meaningful peaks was chosen).
The watershed transform, first proposed by Beucher et al. in [24], is an image segmentation technique that defines a contour based on the path of highest separation between minima. Based on the geographic definition of watersheds and catchment basins, the watershed transform treats an image as a topographic landscape with pixel intensity representing the height at that pixel. Then, the transforms separate the image into its catchment basins, with a watershed contour separating them. This process can be described as follows.
Given an image g and a set of its minima, {x i }, we uniformly "flood" the topographic landscape produced from g, with the water collecting at the minima. When one body of water flows into another, we construct a barrier. Once complete, we define the contour Γ along the barrier.
Later, Meyer et al. in [25,26] proposed a method in which one morphologically reconstructs the image in such a way to impose minima at select markers {M n }. This is done by first forcing each marker to be a minimum, by setting g(M n ) = −∞ for all n, and then by filling the catchment basins of each minimum x i / ∈ {M n } in such a way that the region contains no extrema. This approach reconstructs the image so that the only minima are at the markers {M n } and thus reformulates the watershed transform such that the generated contour Γ lays along the path of highest separation between selected markers, rather than all minima.
To find the set of boundaries that will define the expected supports, we first invert the magnitude spectrum, f − = −| f |, so that the watershed transform will define the path of lowest separation, rather than highest separation. Then, we impose minima to be at the location of persistent maxima, {µ n }. Finally, we apply the watershed transform, which defines a boundary Γ on the magnitude spectrum | f | that is along the paths of lowest separation between the locations of persistent maxima {µ n }. The boundary Γ defines a partition with regions {Ω n }.
Since the watershed transform defines some pixels as part of the boundary, we must assign these pixels to a region. This assignment is not critical, since they belong to the path of lowest separation, but it should be symmetric if f is a real valued image. Furthermore, if f is real valued, we must pair regions symmetrically with respect to the origin. This can be achieved using Algorithm 1.

The Empirical Watershed Wavelet Transform
We are now equipped with all tools to define a new 2D arbitrary empirical wavelet transform called the empirical watershed wavelet transform (EWWT). The process is described by Algorithm 2, and the output of the intermediate steps is illustrated in Figure 5.

Experiments
In this section, we first qualitatively explore the performance of the EWWT by decomposing a toy image made of known harmonic modes. Next, we propose to quantitatively test the use of the EWWT on two practical applications: unsupervised texture segmentation and non-blind deconvolution. As stated in Sections 3.1 and 3.2.1, the transition width τ and scale-space step-size s 0 must be provided. Unless otherwise specified, we choose τ = 0.1 and s 0 = 1 for all experiments.

Harmonic Mode Decomposition
First, we show how the EWWT decomposes a toy image, which is made of four well separated harmonic modes and two slightly blurred piecewise constant components. This toy image is illustrated in Figure 6a along with its magnitude spectrum in Figure 6b. Figure 6c provides the detected partition and Figure 7 shows the empirical watershed wavelet coefficients obtained by performing the transform. As expected, the EWWT is capable of isolating the five harmonic modes present in the original image, which are depicted in Figure 7a-e. As for the other coefficients shown in Figure 7f-i, we see very little information (as per their respective L 2 norms), which mostly consists of high frequency information near the piecewise regions. Looking at the Fourier partition, we see that these extra coefficients correspond to regions near the edge of the spectrum where the Fourier coefficients are very small.  Figure 6a with the partitioning shown in Figure 6c. These images have been normalized for visibility, but their L 2 energy is given to appreciate how much the corresponding information is significant.

Performance on Unsupervised Texture Segmentation
In this section, we explore the use of the EWWT to extract texture features for the task of unsupervised texture segmentation. Wavelets have been widely used in the literature to characterize textures since they can extract multi-scale as well as directional information. Recently, in [16], Huang et al. showed the effectiveness of the empirical wavelet transform on this problem, particularly that the Curvelet-1 EWT option outperforms all traditional wavelets. As such, we compare the EWWT against the Curvelet-1 option on the Outex dataset [27].
To perform the unsupervised texture segmentation of an image, we go through the following steps, which are outlined in detail in [16]. First, we perform a cartoon-texture decomposition of the image and keep only the textural part. This removes some variability in lighting and other non-textural information. Next, we perform the empirical wavelet feature extraction, which in our case, is either the EWWT or the EWT-Curvelet 1 option. Afterwards, we perform an energy aggregation method by calculating the L 2 energy on a window centered at each pixel. The ideal window size is mostly dependent on the dataset, and [16] shows that a 19 × 19 window is best for the Outex dataset. The final segmentation is obtained by performing a pixel-wise k-means (the expected number of textures k is provided to the algorithm) clustering with cityblock distance.
To quantitatively evaluate the quality of the segmentation, we perform the segmentation on 100 images and compute the score, R, which averages six different region-based segmentation benchmarks (the closer to 100% the better the segmentation). These metrics are the normalized variation of information [28], the swapped directional hamming distance [29], the van Dongen distance [30], the swapped segmentation covering [31,32], the bipartite graph matching [33,34], and the bidirectional consistency error [35]. Each of these metrics are implemented by the eval_segm function available in the Supervised Evaluation of Image Segmentation Methods (SEISM) toolbox (https://github.com/jponttuset/seism).

Influence of the EWWT Parameters
First, we investigate the choice of the scale-space step-size s 0 . When using smaller values for s 0 , we automatically reduces the maximum scale, s max . In these cases, Otsu's method detects an earlier threshold, providing a larger number of modes. Table 1 shows the results for different values of s 0 . We observe that a lower s 0 outperforms a larger s 0 , meaning that detecting a large number of modes is advantageous. This suggests that the empirical watershed wavelet transform does a successful job of separating relevant modes well that characterize the textural information. Ultimately, a value of s 0 = 0.1 is the most successful, and we will use this choice going forward. The next parameter influence we consider is the choice of the transition width τ. Table 2 shows that a transition width of τ = 0.1 performs best, which is likely a balance between proper separation of modes and the forming of Gibbs phenomena. We also look into the effect of excluding maxima detected at the Fourier domain edge in the mode detection step. If a mode is detected at the edge of the spectrum, it represents a mode with very high frequencies. Unless equally high frequency textures are present in the dataset, it is likely these represent noise rather than meaningful textural information. Therefore, we propose to remove the locations µ n corresponding to maxima within a certain band around the edge of the Fourier domain. Setting s 0 = 0.1 and τ = 0.1, Table 3 investigates the influence of the width of that exclusion band. We can observe that removing the maxima within a two-pixel distance from the edge of the Fourier domain gives the best accuracy.

Final Comparison
We now compare the EWWT against the Curvelet-1 EWT (extensive comparisons with other types of wavelets are provided in [16]) on the Outex dataset. We use the best parameters for the EWWT for these tests, i.e., we set s 0 = 0.1, τ = 0.1, and we exclude maxima detected within two pixels of the domain edge. The results are shown in Table 4, and we see a significant improvement (almost 2% with similar standard deviation) when using the empirical watershed wavelet transform. Examples of obtained segmentations can be seen in Figure 8. Likely, this improvement is due to extra adaptability of the EWWT in both mode detection and boundary detection.

Performance of EWWT for Non-Blind Deconvolution
Another standard application of traditional wavelets is in deconvolution as a sparsity-promoting operator [36]. Often, an ideal image u is affected by a kernel A and noise µ, which is observed as the image f accordingly to f = A * u + µ, where * denotes convolution. As such, deconvolution is an inverse problem for image restoration where the goal is, from the observed image f , to recover the unknown pristine image u. Here, we consider the case of non-blind deconvolution where the convolution kernel A is assumed to be known. We begin by discussing how deconvolution inverse problems are solved using traditional wavelets.

Deconvolution with Traditional Wavelets
When attempting to recover a pristine image u via deconvolution, we are essentially solving min u ||Au − f || 2 . However, this equation is ill-posed, as it lacks a regularization term.
Cai et al. [36] claim that real images usually have sparse representations under wavelet transforms. Therefore, by including a regularization term of ||Du|| 1 , where D is a traditional wavelet transform, we ensure that the resulting u will have the characteristics of a real image. The goal is to ultimately turn the image deconvolution problem into a saddle point problem so that we may employ the Primal-Dual method [37]. As such, we also employ a quadratic penalty method, leaving us with where λ is a regularization parameter. For the reader's convenience, we recall that the Primal-Dual algorithm [37] is a technique to solve saddle point problems of the form min where K : X → Y is a linear operator. The minimizer u is then obtained by Algorithm 3 (where prox σF (x) = argmin y ||x−y|| 2 2 2 + σF(y)).
Applied to the deconvolution problem given by (16), we have K = D, F(.) = ||.|| 1 and G(.) = λ 2 ||A * . − f || 2 2 . It is well known that the dual of ||.|| 1 is ||.|| ∞ and, furthermore, prox σ||.|| ∞ (x) corresponds to a projection of x on to the unit L 2 ball [38]. Thus, denoting (i, j) a pixel location, and On the other hand, prox νG (z) is the minimization of two squared L 2 norms. Its solution has a closed-form formulation obtained by (we will denote Au = A * u to emphasize the operator point of view and where A * denotes the adjoint of A) This can be easily solved in the Fourier domain by Furthermore, since we are working in the finite dimensional image space, D * = D T . We then make the assumption that D T ≈ D −1 , and we numerically implement this using the inverse wavelet transform. Finally, the wavelet based deconvolution algorithm is given by Algorithm 4: Algorithm 4 Wavelet based deconvolution.

Extension to Empirical Wavelets
Here, we naturally aim at promoting sparsity in the empirical wavelets domain, i.e., solve the deconvolution problem (16) by replacing the traditional wavelet operator, D with an empirical wavelet operator, denoted D , as was done by Alvarado in [39]. Two strategies are possible: (1) we build the set of empirical wavelets based on the input image f at the beginning, and this set is kept fixed while solving (16); (2) the set of wavelet filters is initialized based on f at the beginning of the algorithm but then it is updated at each iteration. In order to distinguish both options, we will denote D 0 the empirical wavelet transform that keeps its set of filters fixed in option (1). It is worth noticing that, in option (2), the operator D is nonlinear. Thus, we lose the theoretical guarantee of convergence of the primal-dual algorithm; however, we never observed any convergence issues in our experiments. The algorithms corresponding to each options are given by Algorithms 5 and 6, respectively.
We want to emphasize that updating the set of empirical wavelet filters { ϕ k n } at each iteration k automatically updates the operator D k .

Algorithm 5 Empirical wavelet based deconvolution with a kept fixed set of filters.
Input: Choose ν, σ > 0; θ ∈ {0, 1}; u 0 =ũ 0 = f ; Build the set of empirical wavelet filters, { ϕ n } based on the magnitude spectrum of f . This defines the operator D 0 Initialize y 0 = D 0 f , k = 0 while Convergence criteria not met do

Algorithm 6
Empirical wavelet based deconvolution with sets of wavelet filters that are updated at each iteration.
Initialize the set of empirical wavelet filters, { ϕ k n } based on the magnitude spectrum of f . k = 0 while Convergence criteria not met do Update the set of empirical filter { ϕ k n } based on the magnitude spectrum of u k+1 k = k + 1 end while Output: Numerical solution u.

Results
In this section, we compare the deconvolution results obtained by using framelets [40], empirical curvelet option 1 and the empirical watershed wavelets. We use the Outex dataset once again, and simulate a Gaussian blur with variance σ 2 B = 1, 2, 3. To assess the effectiveness of the different restorations, we compute the structural similarity index measure (SSIM) [41] between the pristine image f and the deblurred image u. We remind the reader that the closer the SSIM is to 1, the better the deconvolution. The deconvolution algorithms are run on 50 images, and we present the average SSIM over these 50 images.
We begin by investigating the influence of the parameters of the EWWT in a similar way as we did in Section 4.2.1. We run these tests in the case of the fixed filter bank approach described in Section 4.3.2. We first consider the influence of the scale-space step-size s 0 while keeping τ = 0.1. The results are presented in Table 5.
We observe that a scale-space step-size value of s 0 = 0.05 or s 0 = 0.1 is optimal. While we see small differences between average SSIMs, we notice that smaller step-sizes give higher average results, confirming our observation from Section 4.2.1 that a larger number of modes is advantageous. It is worth noticing that the value s 0 = 0.1 also led to the best texture segmentation, thus we favor the value s 0 = 0.1 over the value s 0 = 0.05 in the following experiments.
Next, we compare the influence of the transition width τ. From Table 6, we see very small differences between the structural similarity index measures, but overall a small transition width of τ = 0.05 works best for less blur, while a larger value of τ = 0.5 works best for larger blurs. Going forward, we use τ = 0.05 for σ 2 B = 1, 2 and τ = 0.5 for σ 2 B = 3. Table 5. Influence of the scale-space step-size s 0 for deconvolution for different level of blur (σ 2 B ). The best SSIM values are given in bold.  We now compare the effectiveness of empirical watershed wavelet deconvolution against framelets and empirical curvelet option 1 on the Outex dataset. Framelets have been chosen as the classic wavelets since it has been shown to outperform other traditional wavelet decompositions [36,42,43].
For both empirical wavelet approaches, we look into both the fixed filter bank approach and the adaptive filter bank method described in Section 4.3.2. The primal-dual algorithm has its own parameters set to λ = 5000, σ = 0.1, ν = 0.1, and θ = 1 [39]. The results (again averaged on 50 images) are shown for all σ 2 B values in Table 7, while visual deblurring examples are shown in Figure 9 and 10. We see from these results that, among all tested methods, the empirical watershed wavelet performs the best. Furthermore, between the fixed filter bank and the adaptive filter bank approaches, the adaptive method slightly outperforms the fixed method. We also note that, in some cases, the empirical curvelets led to artifacting, which can be seen in Figure 9, and this phenomenon was not observed when using framelets and empirical watershed wavelets. In Table 8, we compare the average number of iterations needed to reach the desired convergence accuracy (set to 2 × 10 −4 ) between the framelet and the empirical watershed wavelet methods. We see that the empirical watershed wavelets converge in comparable or less iterations compared to other methods. This seems to be another advantage of the adaptability, i.e., adaptive representations need less iterations to converge to optimal representations of the images.

Conclusions
In conclusion, we proposed a new construction of the 2D empirical wavelet transform based on an arbitrary partitioning of the Fourier spectrum. This construction includes a trigonometric transition region similar to the 1D EWT, and we showed that the set of empirical wavelets forms a frame. We also proposed an adaptive way to obtain such a partitioning, which is broken up into two steps. The first step involves a mode detection step and uses scale-space representations to define persistent maxima. The second step is a boundary detection step that uses the watershed transform. This defines a boundary that isolates the detected maxima by a path of lowest separation. Using these in conjunction with the 2D EWT of arbitrary shape, we defined the empirical watershed wavelet transform (EWWT).
We then showed that the EWWT performs well when applied to multiple applications. Both as a feature extractor for unsupervised texture segmentation and as a sparsity promoter in deconvolution, we saw improved results over previous wavelet and empirical wavelet transforms. On top of this, we explored the robustness of the empirical watershed wavelet transform parameters on each of these applications and found trends in how these parameters act.
Future work is possible in multiple directions. Firstly, in the construction of the 2D EWT, one could consider an arbitrary shape construction that forms a tight frame, rather than a frame. Future work can also be considered in the extension from 2D to 3D or N-D versions of the EWT of arbitrary shape.
In terms of the mode detection we proposed, many other options exist, such as threholding maxima, using maxima filters, entropy-based methods like in wavelet packets, or PCA based methods for dimension reduction. There is plenty of work to be done in finding which method works best for various applications. The same can be said for the boundary detection, especially if certain degrees of boundary smoothness are desired. While we saw that a certain value of the scale-space step-size worked best for both applications explored, more work is warranted to see if this extends to further applications, as well as further datasets.