
Reproduction is permitted for noncommercial purposes.

Empirical mode decomposition (EMD) is good at analyzing nonstationary and nonlinear signals while support vector machines (SVMs) are widely used for classification. In this paper, a combination of EMD and SVM is proposed as an improved method for fusing multifocus images. Experimental results show that the proposed method is superior to the fusion methods based on à-trous wavelet transform (AWT) and EMD in terms of quantitative analyses by Root Mean Squared Error (

Due to the limited depth-of-focus of optical lenses, cameras cannot be focused simultaneously on all objects at different distances from them to gain a clear image [

Up to now, various methods at pixel, feature or decision levels have been presented for image fusion [

Another family of methods has been explored based on undecimated ′à-trous′ wavelet transform (AWT) [

Empirical mode decomposition (EMD) is a more recent signal processing method for analyzing nonlinear and nonstationary data, which was developed by Huang

The SVM is a supervised classification method that outperforms many conventional approaches in many applications [

Here, the processing of two images A and B is considered, though the algorithm can be extended to handle more than two. Each multifocus image is first decomposed by EMD into one residue and a series of intrinsic mode functions (IMFs). Then an SVM is trained to determine which IMF plane is clearer at each location and at each level. Finally, the focused image is recovered by carrying out the inverse EMD (IEMD).

The EMD can represent both the details and the smooth part of an image, and this framework is well suited to fusing images by managing different IMFs [

Treating the original image _{0}.

Connecting all the local maxima and minima along rows using constructed smooth cubic splines to get upper envelope _{r} and lower envelope _{r}. Similarly, upper envelope _{c} and lower envelope _{c} along columns are also obtained. The mean plane

Then, the difference between _{0} and

This is one iteration of the sifting process. Because the value of _{1} becomes an IMF. The residue is obtained by:

Treating the residue as the new input dataset. A series of {_{i}_{1≤}_{i}_{≤}_{J}_{J}
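The sifting steps above can be illustrated with a minimal one-dimensional sketch. It is not the implementation used in this work: it operates on a single row rather than on rows and columns of an image, it substitutes piecewise-linear envelopes for the cubic splines, and it uses a fixed number of sifting iterations (the hypothetical parameter `sift_iters`) instead of a stopping criterion. All function names are illustrative.

```python
import math

def local_extrema(x):
    """Return index lists of interior local maxima and minima of a 1-D sequence."""
    maxima, minima = [], []
    for k in range(1, len(x) - 1):
        if x[k] > x[k - 1] and x[k] >= x[k + 1]:
            maxima.append(k)
        elif x[k] < x[k - 1] and x[k] <= x[k + 1]:
            minima.append(k)
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots (assumption: the
    paper uses cubic splines; both endpoints are added as extra knots)."""
    knots = sorted(set([0] + knots + [len(x) - 1]))
    env = []
    for a, b in zip(knots[:-1], knots[1:]):
        for k in range(a, b):
            t = (k - a) / (b - a)
            env.append((1 - t) * x[a] + t * x[b])
    env.append(x[-1])
    return env

def sift_once(x):
    """One sifting iteration: subtract the mean of the two envelopes."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]

def emd(signal, levels=2, sift_iters=5):
    """Decompose a signal into `levels` IMF-like components plus a residue."""
    residue = list(signal)
    imfs = []
    for _ in range(levels):
        h = list(residue)
        for _ in range(sift_iters):
            h = sift_once(h)
        imfs.append(h)
        # The residue becomes the input dataset for the next level.
        residue = [r - c for r, c in zip(residue, h)]
    return imfs, residue

def iemd(imfs, residue):
    """Inverse EMD: sum the residue and all IMFs."""
    out = list(residue)
    for imf in imfs:
        out = [o + c for o, c in zip(out, imf)]
    return out

signal = [math.sin(0.3 * k) + 0.5 * math.sin(2.1 * k) for k in range(64)]
imfs, residue = emd(signal)
reconstructed = iemd(imfs, residue)
```

Because each residue is defined as the previous input minus the extracted component, summing the IMFs and the final residue recovers the input up to floating-point rounding, whatever envelope interpolant is chosen.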

The EMD-based multifocus image fusion method fuses the residues and the IMFs according to their activity levels to produce a composite decomposition of the fused image. However, this simple fusion rule may fail to produce an optimal EMD representation of the fused image when adjacent EMD coefficients must be considered jointly to make the fusion judgment, in which case a decision fusion rule is needed. With the SVM, one expects considerable room for improvement over activity-level-based fusion schemes.

The SVMs are a set of related supervised learning methods used for classification and regression. Interested readers may consult [_{j}_{j}_{j}_{j}

_{i}_{i}_{i}_{i}_{r}_{s}

Based on the outputs of the SVM corresponding to the inputs, the activity level based fusion rule can be upgraded to the decision fusion rule in such a way that the trained SVM can be used to pick out the focused EMD coefficients for preserving the salient information at each pixel location at each level.

The proposed method (

Extract generalized spatial frequency (_{I}

Collect training patterns as follows:

Train an SVM using the training patterns obtained in 2). The kernel function used has the following form:

Decompose A and B with EMD along rows and columns to

Derive the

Perform the fusion based on the outputs of the SVM. If the SVM output is positive, coefficients for the corresponding position of the fused image will come from

Finally, the fused image is recovered by implementing IEMD according to
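The selection step of the procedure above can be sketched as follows. This is only an illustration of a sign-based decision rule: the SVM outputs are passed in as a precomputed grid (`svm_output`, a hypothetical name), and the mapping "positive output selects image A, otherwise image B" is an assumption made for the example.

```python
def fuse_coefficients(coeff_a, coeff_b, svm_output):
    """Build one fused coefficient plane from two source planes.

    Assumption for this sketch: a positive SVM output selects the
    coefficient of image A; a zero or negative output selects image B.
    All three arguments are equally sized lists of rows.
    """
    fused = []
    for row_a, row_b, row_d in zip(coeff_a, coeff_b, svm_output):
        fused.append([a if d > 0 else b for a, b, d in zip(row_a, row_b, row_d)])
    return fused

# Toy IMF planes for images A and B and a toy grid of SVM decision values.
imf_a = [[1.0, 2.0], [3.0, 4.0]]
imf_b = [[-1.0, -2.0], [-3.0, -4.0]]
decision = [[0.7, -0.2], [-1.5, 0.1]]

fused_imf = fuse_coefficients(imf_a, imf_b, decision)
```

The same function would be applied to every IMF plane and to the residue before the fused planes are summed by the IEMD.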

In this section, multifocus image fusion based on the AWT, the EMD, and the proposed method is tested on two sets of images: green pepper (512×512) and leopard (480×360). Each reference image [

When performing the AWT based fusion algorithm, because multiresolution analysis based on à trous filter can preserve translation invariance, short decomposition/reconstruction filters are needed to avoid ringing artifacts [^{-1/2} (1/16, 1/4, 3/8, 1/4, 1/16), together with a decomposition level of three, coefficient based activity. For the EMD, cubic spline function, along with two levels of decomposition and coefficient based max scheme is used. For performing the proposed method termed EVM (Empirical support Vector Machine), the SVM20 with the radial basis function is used, and this software was downloaded from

Two evaluation criteria are used. They are the Root Mean Squared Error (

Here, _{F,I}_{F}_{I}
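A minimal sketch of the two criteria is given below, assuming the common textbook definitions: RMSE as the square root of the mean squared pixel difference between the reference and fused images, and MI computed from the joint and marginal gray-level histograms of two images. The base-2 logarithm and the function names are assumptions of this sketch.

```python
import math
from collections import Counter

def rmse(reference, fused):
    """Root mean squared error between two equally sized images (lists of rows)."""
    sq = [(r - f) ** 2
          for row_r, row_f in zip(reference, fused)
          for r, f in zip(row_r, row_f)]
    return math.sqrt(sum(sq) / len(sq))

def mutual_information(img_x, img_y):
    """Mutual information in bits, estimated from gray-level histograms."""
    pairs = [(x, y)
             for row_x, row_y in zip(img_x, img_y)
             for x, y in zip(row_x, row_y)]
    n = len(pairs)
    joint = Counter(pairs)               # joint histogram
    hist_x = Counter(x for x, _ in pairs)  # marginal histogram of img_x
    hist_y = Counter(y for _, y in pairs)  # marginal histogram of img_y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((hist_x[x] / n) * (hist_y[y] / n)))
    return mi

reference = [[0, 1], [2, 3]]
fused = [[0, 1], [2, 5]]
error = rmse(reference, fused)
information = mutual_information(reference, reference)
```

A lower RMSE indicates a fused image closer to the reference, while a higher MI indicates that more information is shared between the two images being compared.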

As can be found from

The key reason for the superiority of the EVM over the AWT and EMD is the usage of generalized spatial frequency in representing image clarity, which produces good input features for the SVM in deciding which input image has the better focus at a specific pixel position.
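To illustrate why a spatial-frequency feature reflects clarity, the sketch below computes the classical spatial frequency of a patch, that is, the root of the mean squared horizontal and vertical first differences. It is not the generalized spatial frequency used in this work, whose exact definition is not reproduced here, and the normalization by the total pixel count is one of several conventions.

```python
import math

def spatial_frequency(patch):
    """Classical spatial frequency of a patch given as a list of rows."""
    rows, cols = len(patch), len(patch[0])
    # Sum of squared differences between horizontally adjacent pixels.
    row_term = sum((patch[i][j] - patch[i][j - 1]) ** 2
                   for i in range(rows) for j in range(1, cols))
    # Sum of squared differences between vertically adjacent pixels.
    col_term = sum((patch[i][j] - patch[i - 1][j]) ** 2
                   for i in range(1, rows) for j in range(cols))
    return math.sqrt((row_term + col_term) / (rows * cols))

# A high-contrast patch and a low-contrast version of the same pattern.
sharp = [[0, 10, 0], [10, 0, 10], [0, 10, 0]]
blurred = [[4, 5, 4], [5, 4, 5], [4, 5, 4]]

sf_sharp = spatial_frequency(sharp)
sf_blurred = spatial_frequency(blurred)
```

The in-focus patch yields the larger value, so the difference between the two values at the same location is a plausible input feature for a classifier that decides which source is better focused.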

The SVM requires the presetting of a regularization parameter [

In this paper, we study the combination of EMD and SVM for fusing images of the same scene taken with different focuses, in order to obtain an image with every object in focus. The EMD is used for the multiresolution decomposition, while the SVM is employed to find the multifocus image with the better focus at a given pixel position. Based on the outputs of the SVM, the fusion scheme based on the activity level of the EMD coefficients can be upgraded to a decision fusion rule, which selects the source multifocus image that has the best focus at each pixel location. Experiments corroborate that the proposed method outperforms the traditional AWT and EMD based fusion schemes in fusing multifocus images in terms of RMSE and MI. By working on the EVM fused image rather than on the original defocused images, vision-related processing tasks can be expected to yield more accurate results. Compared with the separate AWT and EMD based methods, the EVM based method is more computationally intensive, which matters for real-time image fusion. Overall, however, the evaluation shows that it is a promising method.

In the remote sensing community, one of the most challenging tasks is the fusion of images with different imaging geometry and spatial resolution, for example, synthetic aperture radar images and Landsat Thematic Mapper (TM) images. In the future, we intend to extend the proposed fuser to merge multisensor images. Another challenging task is the fusion of images with markedly different pixel sizes and spectral properties, such as Moderate Resolution Imaging Spectroradiometer (MODIS) images and TM images [

This work was supported jointly by the Program of “One Hundred Talented People” of the Chinese Academy of Sciences (CAS), the State Key Development Program for Basic Research of China with grant number 2007CB

(a) the original image; (b) IMF1; (c) IMF2; (d) the residue.

Schematic flowchart of the proposed algorithm.

Reference images and source images of green pepper and leopard. (a) Focus on the front green pepper; (b) focus on the rear green pepper; (c) reference green pepper image; (d) fused image using AWT; (e) fused image using EMD; (f) fused image using EVM (

(a) The effect of the

Performance of the three fusion methods on processing

Criterion | AWT | EMD | EVM
RMSE | 5.2075 | 3.0118 | 2.6166
MI | 2.5338 | 3.8520 | 3.9093

Performance of the three fusion methods on processing

Criterion | AWT | EMD | EVM
RMSE | 3.8077 | 3.2249 | 2.7220
MI | 1.7062 | 3.2331 | 3.4211