1. Introduction
Texture is an important component of natural images and provides abundant cues for visual recognition and understanding. Image texture is generally defined as a function of the spatial variation in pixel gray values, and it is useful in a variety of applications, such as medical image analysis, document processing, and remotely sensed image analysis [1]. Recently, dynamic texture analysis has attracted much attention. Dynamic textures are video sequences of complex dynamical objects such as smoke, fire, sea waves, foliage waving in the wind, moving escalators, and swinging flags, which exhibit certain stationary properties in time [2]. They provide important visual cues for various video processing problems. Therefore, texture analysis remains an important and active research field [3,4,5,6,7,8].
Image (video) segmentation attempts to partition an image (video) into regions, each of which has a reasonably homogeneous visual appearance or corresponds to an object or a part of an object [9]. Multiscale Bayesian approaches to texture segmentation have proven efficient at integrating both features and contextual information into the estimation of class labels. In [10,11], a tree-structured hidden Markov model, the hidden Markov tree (HMT) model, was proposed in the wavelet domain to characterize signals and images statistically by capturing the interscale dependencies of wavelet coefficients. The fundamental difference from other multiscale Markovian models is that the multiscale dependencies tie together the hidden states assigned to the coefficients rather than the coefficient values. In [12], Durand presented an alternative upward-downward algorithm for smoothed probabilities, a true smoothing algorithm that is immune to underflow problems in the HMT model and whose complexity remains unchanged. The segmentation method derived in [13], HMTseg, has proved very useful; it combines parametric wavelet-domain statistical modeling, direct likelihood calculation, and multiscale Bayesian decision fusion. In [14], Ye proposed a novel texture descriptor for vision-based smoke detection using a Surfacelet transform and a 3D HMT model, showing that the HMT model can achieve high detection accuracy and has a wide range of applications, such as object recognition. In [15], Wang took advantage of the pyramidal dual-tree directional filter bank (PDTDFB) transform and proposed a new color image segmentation algorithm based on a PDTDFB-domain HMT model.
To match the compression property of the wavelet transform, namely that most coefficients have small values and only a few have large values, the marginal distribution of each coefficient node in the traditional HMT model is modeled as a Gaussian mixture model (GMM), that is, a mixture of Gaussian conditional distributions. In many applications, for example texture analysis and image restoration [16], the GMM has proved effective. However, for the histograms of some image textures and dynamic textures, such as “Floor” shown in Figure 1, the Laplace mixture model (LMM) fits the histogram better than the Gaussian mixture model, because the histogram is peakier in the center and has heavier tails. That is to say, the LMM describes the marginal distribution of the wavelet coefficients better. The results on texture classification [17] and image denoising [18,19] also demonstrate the potential of Laplace distributions and their mixtures as image priors. Therefore, we introduce Laplace mixture distribution-based HMT models to model the wavelet coefficient distributions of textures.
The rest of this paper is organized as follows. In Section 2, we briefly review the basic concepts of the Laplacian distribution, HMT models, and the multiscale image segmentation method based on HMT models. In Section 3, we propose the LMM-based HMT model and its parameter estimation. For more accurate texture characterization, we also use the Laplace mixture distribution to describe the pixel-level texture. In Section 4, we introduce LMM-HMT-based image texture segmentation and dynamic texture segmentation. Experimental results are shown and analyzed in Section 5, and conclusions are given in Section 6.
3. LMM-HMT Based Description of Texture
3.1. LMM Based HMT Model
For any given set of wavelet coefficients, the two-state, zero-mean Gaussian mixture model may not fit their marginal distribution with the desired fidelity. We may improve accuracy by increasing the number of hidden states and allowing nonzero means, but this also increases the computational complexity and reduces robustness [10]. To match the non-Gaussian nature of the wavelet coefficients of a texture more precisely and effectively, we use a mixture of Laplacian distributions instead of a Gaussian mixture density. With the fewest state variables, the LMM is peakier in the center and has heavier tails, so it fits the pdf of the wavelet coefficients better than the GMM, as shown in Figure 1. As in the GMM-based HMT model, each coefficient is assigned a hidden state variable: state 0 corresponds to the Laplace function with zero mean and a small scale parameter, while state 1 corresponds to the Laplace function with zero mean and a large scale parameter. The Laplace probability density function of each coefficient
in state
can be expressed as:
Therefore, an LMM-HMT model is specified in terms of the following parameters:
The probability of each state at the root node in the coarsest scale;
The state transition probability, that is, the conditional probability that a coefficient node is in a given state given the state of its parent node;
The scale parameter of the Laplace density of each coefficient, given its hidden state.
We can also group these parameters for a sub-tree into a model parameter set (P is the number of wavelet coefficients of the sub-tree), and, as in the GMM-HMT model, the LMM-HMT model of a texture that belongs to class c is parameterized by the parameter set of that class.
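As a minimal sketch of the two-state, zero-mean Laplace mixture described above, the following Python snippet evaluates the mixture density and compares a Laplace density with a Gaussian of equal variance. The function names and parameter names (`p_small`, `b_small`, `b_large`) are illustrative assumptions rather than the paper's notation, and the interscale dependencies between hidden states are ignored here.

```python
import math


def laplace_pdf(w: float, b: float) -> float:
    """Zero-mean Laplace density with scale parameter b > 0."""
    return math.exp(-abs(w) / b) / (2.0 * b)


def gaussian_pdf(w: float, sigma: float) -> float:
    """Zero-mean Gaussian density with standard deviation sigma > 0."""
    return math.exp(-0.5 * (w / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def lmm_pdf(w: float, p_small: float, b_small: float, b_large: float) -> float:
    """Two-state mixture: state 0 uses the small scale, state 1 the large scale.

    p_small is the (assumed) probability of state 0.
    """
    return p_small * laplace_pdf(w, b_small) + (1.0 - p_small) * laplace_pdf(w, b_large)


if __name__ == "__main__":
    b = 1.0
    # A Laplace variable with scale b has variance 2 * b**2, so this Gaussian
    # has the same variance as the Laplace density it is compared with.
    sigma = math.sqrt(2.0) * b
    for w in (0.0, 5.0):
        print(w, laplace_pdf(w, b), gaussian_pdf(w, sigma))
```

At equal variance, the Laplace density is larger than the Gaussian both at the origin and far in the tails, which is the "peakier center, heavier tails" behavior the text refers to.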
3.2. Parameter Estimation
In [10], Crouse introduced an Upward-Downward method based on the EM algorithm for estimating the parameters of the HMT model. This algorithm uses ML estimation of the Gaussian mixture means and variances for the leaves of the tree instead of the ML estimation of probability mass function (pmf) values used in the classical EM algorithm. Following this parameter estimation framework, the parameters of the LMM-HMT model with two hidden states and the zero-mean Laplace mixture distribution are estimated by using the EM algorithm.
For a single wavelet tree of a sub-band, we can make use of the Upward-Downward method [10] and the probability density function in Equation (7) to obtain the desired conditional probabilities:
with the current parameters, where
is the set of wavelet coefficients obtained by removing the subtree
from the entire wavelet coefficient tree
.
includes all the wavelet coefficients in the wavelet tree.
is used to denote the probability.
At the E step of the EM algorithm, we apply the Upward-Downward method to each wavelet coefficient tree, and get the probabilities
and
with the current parameters
. Then, in the M step, the parameters are re-estimated with
K wavelet trees as follows:
As stated in [13], we also tie parameters within each scale when estimating the parameters of each HMT; that is to say, the same state transition probabilities and mixture scale parameters are used for all wavelet coefficients at the same scale. The complete parameter set of the LMM-HMT model is estimated by applying the tree-structured EM algorithm to, for example, the LH, HL, and HH directional sub-bands of an image texture.
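To make the M-step more concrete, the sketch below runs EM on a two-state, zero-mean Laplace mixture for coefficients treated as independent samples. This is a deliberate simplification and not the full LMM-HMT training procedure: the Upward-Downward E-step over the tree is replaced by per-coefficient posteriors. The scale update used here, the posterior-weighted mean of the absolute coefficient values, is the weighted maximum-likelihood estimate of a zero-mean Laplace scale. All function names and the initialization are illustrative assumptions.

```python
import math
import random


def laplace_pdf(w, b):
    """Zero-mean Laplace density with scale parameter b > 0."""
    return math.exp(-abs(w) / b) / (2.0 * b)


def log_likelihood(ws, p, b):
    """Log-likelihood of independent samples under the two-state mixture."""
    total = 0.0
    for w in ws:
        mix = sum(p[m] * laplace_pdf(w, b[m]) for m in (0, 1))
        total += math.log(max(mix, 1e-300))  # floor guards against log(0)
    return total


def em_laplace_mixture(ws, n_iter=50):
    """Estimate state probabilities p and scales b (state 0 = small scale)."""
    mean_abs = max(sum(abs(w) for w in ws) / len(ws), 1e-12)
    p = [0.5, 0.5]
    b = [0.5 * mean_abs, 2.0 * mean_abs]  # assumed initialization
    for _ in range(n_iter):
        # E-step: posterior state probabilities for each coefficient.
        resp = []
        for w in ws:
            joint = [p[m] * laplace_pdf(w, b[m]) for m in (0, 1)]
            norm = max(sum(joint), 1e-300)
            resp.append([j / norm for j in joint])
        # M-step: re-estimate state probabilities and Laplace scales.
        for m in (0, 1):
            weight = max(sum(r[m] for r in resp), 1e-300)
            p[m] = weight / len(ws)
            b[m] = max(sum(r[m] * abs(w) for r, w in zip(resp, ws)) / weight, 1e-12)
    return p, b


def sample_mixture(n, p_small, b_small, b_large, seed=0):
    """Draw synthetic coefficients from a two-state Laplace mixture."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        scale = b_small if rng.random() < p_small else b_large
        samples.append(rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / scale))
    return samples


if __name__ == "__main__":
    data = sample_mixture(3000, 0.7, 1.0, 10.0)
    print(em_laplace_mixture(data))
```

In the tied setting described above, the same sums would additionally run over all coefficients at a given scale and over all training trees.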
3.3. Pixel-Level Texture Description
After the raw segmentation with the wavelet-domain LMM-HMT models and the multiscale fusion, pixel-level segmentation is needed to obtain the final result. In this paper, the pixel values are considered as realizations of a random variable that obeys a Laplace mixture distribution instead of a Gaussian mixture. In order to describe the region property of the texture, the multivariate Laplace mixture distribution is used.
The Laplace distribution belongs to the exponential power distribution family. The density function of the multivariate exponential power (MEP) distribution is defined as [24]:
where
(a symmetric positive definite matrix) and:
is the Gamma function.
denotes the dimension. The shape of an MEP distribution is strongly influenced by the parameter
, when
it is actually a Gaussian distribution; when
it becomes the multivariate Laplace distribution:
which is a generalized version of Equation (1). Therefore, the multivariate Laplace mixture distribution is:
with the parameter set
. Here,
is the parameter of the
m-th Laplace density function of the mixture distribution.
is the weight value, and
.
x is a vector. For each pixel
i of a texture, a vector can be constructed by stacking this pixel and its neighbors, as shown in Figure 4. The resulting vectors of the texture are modeled with the multivariate Laplace mixture distribution to describe the texture for pixel-level segmentation. If only the pixel i is considered, this mixture distribution becomes a scalar distribution. In general, a larger neighborhood includes more pixels and therefore gives a better description of the texture, whereas a smaller neighborhood helps to locate the boundaries between different textures accurately. Therefore, in this paper, we use the second-order neighborhood system shown in Figure 4b.
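The construction of the per-pixel vectors can be sketched as follows. The sketch assumes that the second-order neighborhood system means the eight nearest neighbors in a 3 × 3 window (the usual convention in Markov random field modeling), and it replicates border pixels at the image edges; both choices, as well as the function name and the ordering of the vector entries, are assumptions, since Figure 4 is not reproduced here.

```python
def neighborhood_vectors(image, radius=1):
    """Build one vector per pixel from the pixel and its neighbors.

    image is a list of rows of gray values. With radius=1 each vector has
    9 entries: the center pixel first, then its 8 neighbors in row-major
    order. Out-of-range neighbors are replaced by the nearest border pixel.
    """
    rows, cols = len(image), len(image[0])
    vectors = []
    for r in range(rows):
        row_vectors = []
        for c in range(cols):
            vec = [image[r][c]]
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc == 0:
                        continue  # the center pixel is already stored
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    vec.append(image[rr][cc])
            row_vectors.append(vec)
        vectors.append(row_vectors)
    return vectors


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(neighborhood_vectors(img)[1][1])
```

Setting `radius=0` reduces each vector to the single pixel value, which corresponds to the scalar case mentioned above.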
Given a data set
, the log-likelihood for the mixture distribution is given by:
where
follows the multivariate Laplace distribution, and
is the vector formed from the texture. Sascha [24] estimated the parameters of the mixture of MEP distributions by using the EM algorithm. Because the multivariate Laplace distribution is a special case (
in Equation (16)) of the MEP distribution, we can easily obtain the estimated parameters as follows:
with:
4. Texture Segmentation
After training the wavelet-domain LMM-HMT model and the pixel-level Laplace mixture distribution model for each homogeneous (image or dynamic) texture, we can use them to segment a heterogeneous texture. As mentioned in Section 2.4, the segmentation method includes three steps: raw maximum likelihood segmentation, context-based multiscale fusion, and pixel-level segmentation, as shown in Figure 5. Our method differs from HMTseg [13] only in the marginal distribution of the HMT model and the pixel-level description. The implementation is summarized as follows.
- (1) Model training. For each texture class, we train the wavelet-domain LMM-HMT model on homogeneous texture samples by using the EM algorithm described in Section 3.2, and obtain the model parameters of the c-th texture class. Meanwhile, the pixel-level multivariate Laplace mixture model parameters are obtained with Equations (19)–(24).
- (2) Raw maximum likelihood segmentation. For a heterogeneous texture to be segmented, the likelihood of each subtree at each scale can be computed by using the HMT likelihood computation method and Equation (7). The raw segmentation at the coarsest scale is accomplished by using Equation (5) with the trained LMM-HMT model parameters.
- (3) Context-based multiscale fusion. At scale j, the context vectors are constructed from the segmentation labels at scale j + 1. The segmentation result at scale j is obtained by using the EM algorithm and maximizing the contextual posterior distribution, as in [13].
- (4) Pixel-level segmentation. Compute the likelihood of each pixel with the trained pixel-level multivariate Laplace mixture models. Perform the context-based fusion scheme from scale j = 1 to the pixel level, as in step (3). The output is the final segmentation result.
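The raw maximum likelihood labeling in step (2) can be illustrated with a toy example. As a stand-in for the HMT subtree likelihood, the sketch below scores each block of coefficients with an independent two-state, zero-mean Laplace mixture per class and assigns the class with the largest log-likelihood. The independence assumption, the function names, and the parameter tuples `(p_small, b_small, b_large)` are all hypothetical simplifications; the method in the text uses the tree-structured likelihood instead.

```python
import math


def laplace_pdf(w, b):
    """Zero-mean Laplace density with scale parameter b > 0."""
    return math.exp(-abs(w) / b) / (2.0 * b)


def block_log_likelihood(coeffs, params):
    """Log-likelihood of a block under one class model.

    params = (p_small, b_small, b_large); coefficients are treated as
    independent, which ignores the parent-child state dependencies.
    """
    p_small, b_small, b_large = params
    total = 0.0
    for w in coeffs:
        mix = p_small * laplace_pdf(w, b_small) + (1.0 - p_small) * laplace_pdf(w, b_large)
        total += math.log(max(mix, 1e-300))  # floor guards against log(0)
    return total


def raw_ml_labels(blocks, class_params):
    """Assign each block the index of the class with maximum likelihood."""
    labels = []
    for coeffs in blocks:
        scores = [block_log_likelihood(coeffs, params) for params in class_params]
        labels.append(scores.index(max(scores)))
    return labels


if __name__ == "__main__":
    classes = [(0.8, 0.5, 2.0), (0.8, 5.0, 20.0)]  # made-up class models
    blocks = [[0.1, -0.3, 0.2, 1.0], [8.0, -15.0, 4.0, 30.0]]
    print(raw_ml_labels(blocks, classes))
```

The labels produced this way are only the starting point; steps (3) and (4) then refine them with contextual information from coarser scales.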
4.1. Image Texture Segmentation
The HMT model-based segmentation method HMTseg and its improved versions have been applied to document image segmentation, aerial imagery segmentation, and texture segmentation. Taking the texture property into account, we use the wavelet-domain LMM-HMT model and the Laplace mixture distribution model to describe the texture. Therefore, at the training stage, their parameters should be estimated with Equations (11)–(13) and Equations (19)–(24), respectively. Then the method shown in Figure 5 is used to segment the heterogeneous image texture.
4.2. Dynamic Texture Segmentation
As mentioned in the introduction, dynamic textures are video sequences of complex dynamical objects. Dynamic texture segmentation addresses the problem of decomposing an image sequence into a collection of homogeneous texture regions. The proposed dynamic texture segmentation method is based on the spatial-temporal wavelet-domain LMM-HMT model and the pixel-level multivariate Laplace mixture distribution model.
For the 2-D wavelet transform, each low-frequency sub-band is decomposed into four sub-bands, one approximation sub-band and three detail sub-bands, which leads naturally to a quad-tree structure on the wavelet coefficients. Therefore, the 2-D HMT model is formed as shown in Figure 3. For the spatial-temporal wavelet transform, however, each transform level results in one approximation sub-band and seven detail sub-bands, as shown in Figure 6, from which an octree structure is formed. That is to say, each parent node is connected to its eight child wavelet coefficients. Therefore, there are seven parallel octree structures, and the complete LMM-HMT model consists of seven LMM-HMT models, each of which corresponds to one sub-band. The LMM-HMT model for the dynamic texture uses a Markov chain to model the interscale dependencies and the Laplace mixture distribution to characterize the wavelet coefficients. Then the LMM-HMT model is parameterized by
, where each parameter subset of
is for a sub-band (for example,
denotes the parameter set of sub-band LLH). For tractability, we assume that the seven sub-bands of the spatial-temporal wavelet domain are statistically independent. Then, for a dynamic texture belonging to class
c, we can estimate the parameters for each sub-band with the EM algorithm (Equations (11)–(13)), and obtain
of the complete LMM-HMT model.
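The sub-band structure described above can be checked with a small sketch of one level of a 3-D Haar transform, which yields one approximation sub-band and seven detail sub-bands at half resolution. The Haar basis, the `volume[t][y][x]` layout, and the naming of sub-bands by one letter per axis in (t, y, x) order are assumptions made for illustration; the axis order behind names such as LLH in the text is not specified here.

```python
import itertools
import math


def haar3d_level(volume):
    """One level of an orthonormal 3-D Haar transform on volume[t][y][x].

    Returns a dict mapping sub-band names ("LLL", "LLH", ..., "HHH") to
    half-resolution volumes. All three dimensions are assumed to be even.
    """
    nt = len(volume) // 2
    ny = len(volume[0]) // 2
    nx = len(volume[0][0]) // 2
    names = ["".join(k) for k in itertools.product("LH", repeat=3)]
    bands = {
        name: [[[0.0] * nx for _ in range(ny)] for _ in range(nt)]
        for name in names
    }
    norm = 1.0 / math.sqrt(8.0)
    offsets = list(itertools.product((0, 1), repeat=3))
    for t, y, x in itertools.product(range(nt), range(ny), range(nx)):
        for name in names:
            acc = 0.0
            for dt, dy, dx in offsets:
                # High-pass along an axis flips the sign of the second sample.
                sign = 1.0
                for letter, d in zip(name, (dt, dy, dx)):
                    if letter == "H" and d == 1:
                        sign = -sign
                acc += sign * volume[2 * t + dt][2 * y + dy][2 * x + dx]
            bands[name][t][y][x] = acc * norm
    return bands


if __name__ == "__main__":
    ones = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
    for name, band in haar3d_level(ones).items():
        print(name, band[0][0][0])
```

Each coefficient at this level is computed from a 2 × 2 × 2 block of voxels, so a coefficient at the next coarser level covers eight coefficients at the finer level, which is the octree relation used by the model. Under the independence assumption, each of the seven detail sub-bands would be trained separately.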
The general segmentation method based on the LMM-HMT model is shown in Figure 5. It is straightforward to extend the LMM-HMT model-based image texture segmentation method to dynamic texture segmentation based on the spatial-temporal wavelet-domain LMM-HMT model. The main point to emphasize is the context labeling tree used at the context-based interscale fusion stage. In this paper, the context labeling tree is constructed as shown in Figure 7. The state of a wavelet coefficient at scale j − 1 depends on its parent, the parent's eight neighbors, and the states of the coefficients in the adjacent frames at scale j. Thus, the context vector is formed by combining the parent and the majority vote of all other 26 neighbors. Then, this context vector is used at the multiscale fusion stage.
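A minimal sketch of this context vector is given below. It assumes that the 26 neighbors are the cells surrounding the parent in a 3 × 3 × 3 window at the coarser scale, that a node's parent is found by halving its coordinates, and that neighbors falling outside the volume are simply skipped. Ties in the vote are broken by whichever label was counted first. These conventions and the function name are assumptions, since Figure 7 is not reproduced here.

```python
from collections import Counter
import itertools


def context_vector(coarse_labels, t, y, x):
    """Return (parent_label, majority_label) for the fine-scale node (t, y, x).

    coarse_labels[t][y][x] holds the class labels at the coarser scale.
    """
    pt, py, px = t // 2, y // 2, x // 2
    nt = len(coarse_labels)
    ny = len(coarse_labels[0])
    nx = len(coarse_labels[0][0])
    parent = coarse_labels[pt][py][px]
    votes = Counter()
    for dt, dy, dx in itertools.product((-1, 0, 1), repeat=3):
        if (dt, dy, dx) == (0, 0, 0):
            continue  # the parent itself does not vote
        qt, qy, qx = pt + dt, py + dy, px + dx
        if 0 <= qt < nt and 0 <= qy < ny and 0 <= qx < nx:
            votes[coarse_labels[qt][qy][qx]] += 1
    # Fall back to the parent label if there are no neighbors at all.
    majority = votes.most_common(1)[0][0] if votes else parent
    return parent, majority


if __name__ == "__main__":
    labels = [[[1] * 3 for _ in range(3)] for _ in range(3)]
    labels[1][1][1] = 0
    print(context_vector(labels, 2, 2, 2))
```

Separating the parent label from the neighborhood vote lets the fusion stage distinguish nodes whose parent agrees with its surroundings from nodes near a texture boundary.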
6. Conclusions
In this paper, we introduce a wavelet-domain HMT model fitted with a Laplacian mixture distribution instead of Gaussian conditional distributions, from which a texture segmentation method is derived. Specifically, we use an LMM-HMT model to describe the wavelet coefficients and their interscale dependence, and the multivariate Laplacian mixture distribution to characterize the pixel values for pixel-level segmentation. For dynamic texture segmentation, we also use the spatial-temporal parent neighbors to construct the context labeling tree, so as to improve the segmentation during the multiscale fusion stage. Segmentation experiments are conducted on image textures and dynamic textures, from which the performance of the proposed method is verified. Image texture analysis and applications based on the wavelet-domain hidden Markov tree model have been intensively investigated in past years. This paper also tries to solve the dynamic texture segmentation problem by using the wavelet-domain hidden Markov model. However, we process the dynamic texture (a spatio-temporal image sequence) as 3D volume data, which cannot effectively capture the dynamic texture property along the temporal domain. Therefore, a different spatio-temporal wavelet decomposition, for example with different decomposition levels for the spatial and temporal domains, together with the corresponding hidden Markov model, is one direction for our future work. In addition, for dynamic textures that change slowly along the temporal domain, the Haar wavelet transform may produce some special sub-bands in which almost all coefficients approach zero. It is difficult to model these sub-bands. Wavelet-domain representations of dynamic textures based on other wavelet bases should be studied for this case.