A Geometric Dictionary Learning Based Approach for Fluorescence Spectroscopy Image Fusion

Abstract: In recent years, sparse representation approaches have been integrated into multi-focus image fusion methods, and the fused images produced by sparse-representation-based methods show strong performance. Constructing an informative dictionary is a key step for sparsity-based image fusion. To ensure a sufficient number of useful bases for sparse representation, image patches from all source images are classified into different groups based on geometric similarities. The key information of each image-patch group is extracted by principal component analysis (PCA) to build the dictionary. Using the constructed dictionary, image patches are converted to sparse coefficients by the simultaneous orthogonal matching pursuit (SOMP) algorithm to represent the source multi-focus images. Finally, the sparse coefficients are fused by the Max-L1 fusion rule and inverted to the fused image. Because of the limitations of the microscope, a fluorescence image cannot be fully focused. The proposed multi-focus image fusion solution is applied to fluorescence imaging to generate all-in-focus images. Comparison experiments confirm the feasibility and effectiveness of the proposed solution.


Introduction
With the development of cloud computing, cloud environments provide increasingly strong computation capacity for processing various images [1][2][3]. Due to the limited depth of focus of optical lenses in imaging sensors, a single exposure cannot guarantee that the whole scene is in focus [4]. In other words, it is difficult to obtain an image in which all relevant objects are in focus. In a microscope, only objects at a certain distance from the lens are captured sharply; other objects are out of focus and blurred. A fuzzy multi-sensor data fusion Kalman filter model was proposed by Rodger to reduce failure risk in an integrated vehicle health maintenance system (IVHMS) [5]. In fluorescence spectroscopy, one scene contains at least several objects. To ensure the accuracy and efficiency of fluorescence image processing, multiple images of the same scene are usually captured so that every object is in focus in at least one of them. However, viewing and analyzing a series of images separately is costly. Multi-focus image fusion is an effective technique to resolve this problem by combining complementary information from multiple images of the same scene into a single sharp and accurate image [6,7]. The integrated image can precisely describe the corresponding scene, which is beneficial for further analysis and understanding.
Multi-focus image fusion is one of the most widely recognized image fusion technologies, and a large number of methods have been proposed. According to the fusion domain, these methods can be divided into two main categories: spatial-domain-based methods and transform-domain-based methods [8]. Spatial-domain-based methods directly choose clear pixels, blocks, or regions from source images to compose a fused image [9,10]. Some simple methods, such as averaging or max-pixel schemes, operate on single pixels to generate a fused image. However, these methods may reduce the contrast and edge intensity of the fused result. To improve the quality of the fused image, more advanced schemes, such as block-based and region-based algorithms, were proposed. Li et al. divided images into blocks, chose the focused block by comparing spatial frequencies (SF), and then produced the fused result by consistency verification [11]. Although block-based methods improve the contrast and sharpness of the integrated image, they may cause block effects in it [12].
Unlike spatial-domain fusion methods, transform-domain methods first transform the source images into transform bases and a few corresponding coefficients [13]. The coefficients are then fused and inverted to an integrated image. Multi-scale transform (MST) and wavelet-based algorithms are the most conventional transform approaches applied to transform-domain-based image fusion [14,15]. The wavelet transform [16,17], shearlet [18,19], curvelet [20], dual-tree complex wavelet transform [21,22], and nonsubsampled contourlet transform (NSCT) [23] are commonly used in MST-based methods. MST decomposition methods have attracted great attention in image processing and have achieved strong performance in image fusion. However, it is difficult for MST-based methods to select an optimal transform basis without prior knowledge [24,25]. As each MST method has its own limitations, it is difficult for one MST method to fit all kinds of images [14].
In recent years, sparse-representation-based methods, as a subset of MST fusion methods, have been applied to image fusion. Unlike other MST methods, sparse-representation-based methods usually use trained bases, which adapt to the input images without prior knowledge [26,27]. Owing to this adaptive learning feature, sparse representation is widely applied to image de-noising [28,29], image deblurring [30,31], image inpainting [32], image fusion [33], and super-resolution [34].
Yang and Li [35] took the first step in applying sparse representation theory to image fusion. They proposed a multi-focus image fusion method with a fixed dictionary. Li and Zhang [36] applied morphologically filtered sparse features to a matrix decomposition method to improve the accuracy of sparse representation in image fusion. Wang and Liu [37] proposed an approximate K-SVD based sparse representation method for image fusion and exposure fusion to reduce computation costs. Kim and Han [38] proposed a joint clustering dictionary learning method for image fusion. They first used steering kernel regression to strengthen the local geometric features of source images. Then K-means clustering was applied to cluster image pixels based on the local image features. Their method groups the pixels of the input source images into a few classes. The principal components of the local features are extracted from each group to construct a compact and informative dictionary. This method reduces the sparse coding time by using the constructed compact dictionary. However, the performance of Kim and Han's method depends on the preset cluster number, which is difficult to determine before clustering. The number of local geometric-feature types is usually used as the preset cluster number, but the actual clustering results do not always follow the geometric features. So the clustering results cannot meet expectations every time. To address this weakness of existing clustering methods, a geometric-information-based image block classification method is proposed in this paper.
Geometric information, including the edges, contours, and textures of an image, is one of the most important types of image information and can remarkably influence the quality of image perception [39]. This information can be used in patch classification as a supervised dictionary prior to improve the performance of the trained dictionary [40,41]. In this paper, a geometric-classification-based dictionary learning method is proposed for sparse-representation-based image fusion. Instead of grouping the pixels of images, the proposed geometric classification method groups image blocks directly by the geometric similarity of each image block. Since sparse-representation-based fusion uses image blocks for sparse coding and coefficient fusion, extracting the underlying geometric information from image-block groups is an efficient way to construct a dictionary. Moreover, the geometric classification can group image blocks based on edge and sharp-line information for dictionary learning, which can improve the accuracy of sparse representation. The proposed dictionary learning approach consists of three steps:

• In the first step, the input source images are split into small blocks. According to the similarity of geometric patterns, these blocks are classified into smooth patches, stochastic patches, and dominant orientation patches.
• In the second step, principal component analysis (PCA) is performed on each type of patch to extract crucial bases. The extracted PCA bases are used to construct the sub-dictionary of each image patch group.
• In the last step, the obtained sub-dictionaries are merged into a complete dictionary for the sparse-representation-based fusion approach, and the integrated sparse coefficients are inverted to the fused image.
This paper has two main contributions.
1. A geometric image patch classification method is proposed for dictionary learning. The proposed geometric classification method can accurately split source image patches into different image patch groups for dictionary learning. Dictionary bases extracted from each image patch group perform well when they are used to describe the geometric features of source images.
2. A PCA-based geometric dictionary construction method is proposed. The trained dictionary with PCA bases is informative and compact for sparse representation. Its informativeness ensures that different geometric features of source images can be accurately described. Its compactness speeds up the sparse coding process.
The remaining sections of this paper are structured as follows: Section 2 presents the geometric dictionary learning method and the multi-focus image fusion framework; Section 3 compares and analyzes the experimental results; and Section 4 concludes this paper.

Dictionary Learning Analysis
In the dictionary construction process for sparse representation, the key step is to construct an over-complete dictionary that covers the important information of the input image. Since the geometric information of an image plays an important role in describing it, an informative dictionary should take the geometric information into consideration. Smooth and non-smooth information, as two important types of geometric information, can be used to classify source images.
A multiple geometry information classification approach was proposed for single image super-resolution reconstruction (SISR) [39]. According to the geometric features, a large number of high-resolution (HR) image patches were randomly extracted and clustered to generate corresponding geometric dictionaries for sparse representations of local patches in low-resolution (LR) images. The geometric features were classified into three types: smooth patch, dominant orientation patch, and stochastic patch. According to the estimated dominant orientation, the dominant orientation patches could be further divided into different directional patches. Rather than adaptively selecting one dictionary, the recovered patches were weighted in the learned geometric dictionaries to characterize the local image patterns, followed by patch aggregation to estimate the whole image. To reduce the redundancy in image recovery, self-similarity constraints were added to the HR image patch aggregation. Both LR and HR residual images were estimated from the recovered image and used to compensate for the subtle details of the reconstructed image.
In SISR, the source image patches are classified as smooth, dominant orientation, and stochastic patches. Dominant orientation and stochastic patches are non-smooth patches. Since the detection of orientation in finite-dimensional Euclidean spaces corresponds to fitting an axis or a plane by Fourier transformation of an n-dimensional structure, Bigün verified that dominant orientation and stochastic patches can be separated by orientation estimation in the spatial domain based on the error of the fit [42].
In dictionary learning, the redundancy of the trained dictionary is usually not considered. A redundant dictionary not only has high computational complexity, but also cannot achieve the best image representation. Different methods have been proposed to reduce redundancy in the learning process and construct a compact dictionary. Elad [43] estimated the maximum a posteriori probability (MAP) by a compact dictionary. Dong [44] developed an image interpolation method based on nonlocal autoregressive modeling (NARM) and embedded it in the sparse representation model (SRM); it enhances the effectiveness of SRM in image interpolation by reducing the coherence between the sampling matrix and the sparse representation dictionary and by minimizing the nonlocal redundancy. To improve the performance of sparse-representation-based image restoration, the non-locally centralized sparse representation (NCSR) model proposed by Dong [45] suppressed the sparse coding noise in image restoration.
The sparse-representation-based approach represents an image patch as a linear combination of a few atoms from an over-complete dictionary. Promising results have been shown in various image restoration applications [38,45]. Based on the classification of image patches, this paper proposes a sparse-representation-based approach that uses the PCA algorithm to construct a more informative and compact dictionary [38]. The proposed solution cannot be extended to other sparse-representation applications. Most sparse representation methods are based on a large number of sample images, and the learned dictionaries are expected to apply to source images of different scenes. The proposed solution uses the sparse features of the source images themselves for sparse representation and fusion. Therefore, the learned dictionary does not need to generalize to different source images; it is specialized for the source images being fused. It applies PCA to the classified patches to reduce the redundancy in the dictionary and enhance the performance.
There are many popular dictionary learning methods, such as KSVD. However, this paper cannot use other dictionary learning methods to substitute for PCA. It compares PCA with KSVD to explain the benefits of using PCA in dictionary learning. For sparse-representation-based image reconstruction and fusion techniques, it is difficult to select a good over-complete dictionary. To obtain an adaptive dictionary, Aharon [46] developed KSVD, which learns an over-complete dictionary from a set of training images or updates a dictionary adaptively by applying SVD operations. The image patches used by KSVD in dictionary learning were extracted either from natural images (globally trained dictionary) or from the input images (adaptively trained dictionary). Although KSVD shows better performance in dictionary learning than other methods, iterating over a large number of training images is costly. This high computational complexity constrains the size of the dictionary that KSVD can learn in practice. Kim [38] was the first to apply a clustering-based dictionary learning solution to image fusion. It clustered patches from different source images together based on local structural similarities. Only a few principal components of each cluster were analyzed and used to construct the corresponding sub-dictionary. All learned sub-dictionaries were combined to form a compact and informative dictionary that can describe the local structures of all source images effectively. The redundancy of the dictionary was eliminated (i.e., the dictionary size was reduced) to lower the computational load of this PCA-based dictionary learning solution (JPDL). In various comparison experiments, JPDL performed better than traditional multi-scale transform-based and sparse-representation-based methods, including KSVD. Compared with KSVD, JPDL as a PCA-based method not only obtained a more compact dictionary, but also incurred lower computational costs in the dictionary learning process.

Geometry Dictionary Construction
Following the geometric patterns used in SISR [39], this paper classifies image patches into three types: smooth patches, stochastic patches, and dominant orientation patches. The smooth patches, stochastic patches, and dominant orientation patches mainly describe structure information, texture information, and edge information, respectively. Based on this a priori knowledge, image patches can be classified into three groups for sub-dictionary learning. Additionally, to obtain a more compact sub-dictionary for each image patch group, this paper uses PCA to extract the important information from each group. The PCA bases of each group are used as the bases of the corresponding sub-dictionary. Combining the PCA-based sub-dictionaries makes the obtained dictionary more compact and informative [38].
This paper learns three different sub-dictionaries rather than a single dictionary. The three sub-dictionaries contain more detail and structure information and less redundant information. That is, although the obtained sub-dictionaries are compact, they still contain more useful information. The three types of geometric patches extract different image information from the source images:

• Smooth patches describe the structure information of source images, such as the background.
• Dominant orientation patches describe the edge information to provide the direction information in source images.
• Stochastic patches show the texture information to make up the missing detailed information that is not represented in dominant orientation patches.
The three learned sub-dictionaries specialize in smooth, dominant orientation, and stochastic patches, respectively. The structure, edge, and texture information can be accurately and completely represented in the corresponding sub-dictionary. Compared with one learned dictionary, the three learned sub-dictionaries not only supply different information about the source images, but also compensate for each other's deficiencies to enhance the quality of the merged dictionary. So each sub-dictionary is important for composing a compact and informative dictionary in the proposed approach.
The proposed geometric approach, shown in Figure 1, has two main steps. In the first step, the input source images $I_1$ to $I_k$ are split into small image blocks $p_{in}$, $i \in (1, 2, \ldots, k)$, $n \in (1, 2, \ldots, w)$, where $i$ is the source image number and $n$ is the patch number. The total number of blocks in each input image is $w$. Then, according to the similarity of geometric patterns, the obtained image blocks are classified into three groups: the smooth patch group, the stochastic patch group, and the dominant orientation patch group. In the second step, PCA is performed on each group to extract the corresponding PCA bases as a sub-dictionary. The obtained sub-dictionaries are then combined to construct a complete dictionary for image sparse representation.

Figure 1. Geometric Dictionary based Image Fusion Framework.

Geometric-Structure-Based Patches Classification
Image blocks of a source image can be classified by various image features that describe underlying relationships. According to their geometric descriptions, the blocks of multi-focus source images can be divided into smooth, stochastic, and dominant orientation patches. The edges of focused areas are usually sharp and contain dominant orientation patches. The out-of-focus areas are smooth and contain a large number of smooth blocks. Besides that, there are also many stochastic blocks in source images. Grouping image blocks into different classes for dictionary learning is an efficient way to improve the accuracy of image description.
To obtain the different geometric sub-classes, this paper first uses a geometric-structure-based approach to partition the image into several sub-classes. Then, based on the classified image blocks, it obtains the corresponding sub-dictionaries.
Before the geometric analysis, the input image is divided into small image blocks of size $\sqrt{w} \times \sqrt{w}$, $P_I = (p_1, p_2, \ldots, p_n)$. Each image patch $p_i$, $i \in (1, 2, \ldots, n)$, is converted into a $1 \times w$ image vector $v_i$, $i \in (1, 2, \ldots, n)$. After obtaining the vectors, the variance $C_i$ of the pixels in each image vector is calculated. A threshold $\delta$ is then chosen to evaluate whether an image block is smooth. If $C_i < \delta$, image block $p_i$ is smooth; otherwise, image block $p_i$ is not smooth [39]. Based on the threshold $\delta$, the classified smooth and non-smooth patches are shown in Figure 2a,b respectively.
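The variance test above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `extract_patches` and `split_smooth`, the synthetic test image, and the threshold value `delta=1e-3` are all assumptions chosen for the example.

```python
import numpy as np

def extract_patches(image, size=8, step=8):
    """Split a 2-D image into size x size blocks, each flattened to a row vector v_i."""
    rows, cols = image.shape
    patches = [
        image[r:r + size, c:c + size].reshape(-1)
        for r in range(0, rows - size + 1, step)
        for c in range(0, cols - size + 1, step)
    ]
    return np.asarray(patches, dtype=float)

def split_smooth(patches, delta):
    """Boolean mask: True where the pixel variance C_i of a patch is below delta."""
    variances = patches.var(axis=1)
    return variances < delta

# Synthetic example (assumed data): flat left half, noisy right half.
rng = np.random.default_rng(0)
image = np.full((16, 16), 0.5)
image[:, 8:] += rng.normal(0.0, 0.2, size=(16, 8))

patches = extract_patches(image, size=8, step=8)   # 4 patches of w = 64 pixels
smooth_mask = split_smooth(patches, delta=1e-3)    # illustrative threshold
```

Patches flagged `True` would go to the smooth group; the remaining patches are passed on to the stochastic versus dominant orientation test.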
As the classified patches in Figure 2a,b show, the smooth patches share similar structure information of the input images. In contrast, non-smooth patches differ from one another and cannot be directly placed in one class, so they are classified further. According to geometric patterns, non-smooth patches can be divided into stochastic and dominant orientation patches. The separation of stochastic and dominant orientation image patches consists of two steps. In the first step, the gradient of each pixel is calculated. In every image vector $v_i$, $i \in (1, 2, \ldots, n)$, the gradient of each pixel $k_{ij}$, $j \in (1, 2, \ldots, w)$, is composed of its $x$ and $y$ coordinate gradients $g_{ij}(x)$ and $g_{ij}(y)$, so that the gradient value of pixel $k_{ij}$ is $g_{ij} = (g_{ij}(x), g_{ij}(y))$, with $g_{ij}(x) = \partial k_{ij}(x, y)/\partial x$ and $g_{ij}(y) = \partial k_{ij}(x, y)/\partial y$. For each image vector $v_i$, the gradient is $G_i = (g_{i1}, g_{i2}, \ldots, g_{iw})^T$, where $G_i \in \mathbb{R}^{w \times 2}$. In the second step, the gradient of each image patch is decomposed by Equation (1):

$$G_i = U_i S_i V_i^T, \tag{1}$$

where $U_i S_i V_i^T$ is the singular value decomposition of $G_i$. $S_i$ is a diagonal $2 \times 2$ matrix representing the energy in the dominant directions [47]. When $S_i$ is obtained, the dominant measure $R$ can be calculated as shown in Equation (2):

$$R = \frac{S_{1,1} - S_{2,2}}{S_{1,1} + S_{2,2}}, \tag{2}$$

where $S_{1,1}$ and $S_{2,2}$ are the diagonal entries of $S_i$. The smaller $R$ is, the more stochastic the corresponding image patch is [48]. A threshold $R^*$ is therefore needed to distinguish stochastic from dominant orientation patches. To find the threshold $R^*$, the probability density function (PDF) of $R$ is calculated. According to [42], the PDF of $R$ can be calculated by Equation (3):

$$P(R) = 4(w - 1)\, R\, \frac{(1 - R^2)^{w-2}}{(1 + R^2)^{w}}. \tag{3}$$
The PDF of dominant measure R of patches with different sizes is shown in Figure 3.
A PDF significance test is implemented to distinguish stochastic and dominant orientation patches by the threshold $R^*$ [42]. If $R$ is less than $R^*$, the image patch is considered a stochastic patch. The stochastic and dominant orientation patches separated by the proposed method are shown in Figure 4. Figure 4a shows the stochastic image patches, which contain some texture and detailed information. Dominant orientation image patches, which mainly contain edge information, are shown in Figure 4b.
As shown in Figure 3, $P(R)$ converges to zero as $R$ increases. The threshold $R^*$ is chosen as the value of $R$ at which $P(R)$ reaches zero for the first time.
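The two-step separation of non-smooth patches can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' code: it assumes the dominant measure takes the usual form $R = (s_1 - s_2)/(s_1 + s_2)$ on the singular values of the $w \times 2$ gradient matrix, assumes the density $P(R) = 4(w-1)R(1-R^2)^{w-2}/(1+R^2)^{w}$ for a patch of $w$ pixels, and replaces "reaches zero" by a numerical tolerance `tol`. The function names and the tolerance are hypothetical.

```python
import numpy as np

def dominant_measure(patch):
    """R = (s1 - s2) / (s1 + s2) from the singular values of the w x 2 gradient matrix."""
    gy, gx = np.gradient(patch.astype(float))      # derivatives along rows (y) and columns (x)
    G = np.column_stack([gx.ravel(), gy.ravel()])  # one row (g_x, g_y) per pixel
    s = np.linalg.svd(G, compute_uv=False)         # singular values in descending order
    total = s[0] + s[1]
    return 0.0 if total == 0 else float((s[0] - s[1]) / total)

def pdf_R(R, w):
    """Assumed density of R for a pure-noise patch with w pixels."""
    return 4.0 * (w - 1) * R * (1.0 - R**2) ** (w - 2) / (1.0 + R**2) ** w

def threshold_R(w, tol=1e-6, grid=10001):
    """First R beyond the peak of the density where P(R) drops below tol (assumed rule)."""
    R = np.linspace(0.0, 1.0, grid)
    p = pdf_R(R, w)
    peak = int(np.argmax(p))
    below = np.nonzero(p[peak:] < tol)[0]
    return float(R[peak + below[0]])

# Example: a horizontal intensity ramp has one dominant gradient direction.
edge_patch = np.tile(np.arange(8.0), (8, 1))
R_edge = dominant_measure(edge_patch)
R_star = threshold_R(w=64)
is_dominant_orientation = R_edge >= R_star
```

A patch with `R < R_star` would be assigned to the stochastic group, and the others to the dominant orientation group. The tolerance controls how far into the tail of the density the threshold sits, so it should be treated as a tunable choice.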

PCA-Based Dictionary Construction
When the geometric classification is finished, image patches with similar geometric structure have been classified into a few groups. In this work, the compact and informative dictionary is trained by combining the principal components of each geometric group. Since patches in the same geometric group can be well approximated by a small number of PCA bases, the top $p$ most informative principal components are chosen to form a sub-dictionary [49], as in Equation (4):

$$B_x = [d_1, d_2, \ldots, d_p], \quad \text{s.t.} \quad p = \arg\max_{p} \Big\{ \sum_{j=p+1}^{q} L_j > \delta \Big\}, \tag{4}$$

where $B_x$ denotes the sub-dictionary of the $x$th cluster, and $q$ is the total number of atoms in each cluster. Each $B_x$ consists of $p$ eigenvector atoms. $L_j$ is the eigenvalue of the corresponding $j$th eigenvector $d_j$. The eigenvalues are sorted in descending order (i.e., $L_1 > L_2 > \ldots > L_q > 0$). $\delta$ is a parameter that controls the amount of approximation with rank $p$. Usually, $\delta$ is set proportional to the dimension of the input image to avoid over-fitting [49]. After the sub-dictionaries are obtained, a dictionary $D$ is constructed by combining the sub-dictionaries as in Equation (5):

$$D = [B_1, B_2, \ldots, B_n], \tag{5}$$

where $n$ is the total number of clusters.
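The PCA-based construction can be sketched as below. The rank rule is an assumption: the sketch keeps the largest $p$ for which the sum of the discarded eigenvalues $L_{p+1} + \ldots + L_q$ still exceeds $\delta$, with at least one atom per group. The names `select_rank`, `pca_subdictionary`, and `build_dictionary`, the random patch groups, and `delta=0.5` are hypothetical and only serve the illustration.

```python
import numpy as np

def select_rank(eigvals, delta):
    """Largest p whose discarded eigenvalue sum (indices > p) still exceeds delta; at least 1."""
    eigvals = np.asarray(eigvals, dtype=float)
    tail = eigvals.sum() - np.cumsum(eigvals)   # tail[p-1] = L_{p+1} + ... + L_q
    return max(int(np.count_nonzero(tail > delta)), 1)

def pca_subdictionary(group, delta):
    """PCA bases of one patch group (rows are patch vectors), truncated to rank p."""
    X = group - group.mean(axis=0)
    cov = X.T @ X / max(len(X) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # re-sort to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    p = select_rank(eigvals, delta)
    return eigvecs[:, order[:p]]

def build_dictionary(groups, delta):
    """Concatenate the sub-dictionaries B_1 ... B_n column-wise into D."""
    return np.hstack([pca_subdictionary(g, delta) for g in groups])

# Example with three synthetic patch groups of 4 x 4 patches (w = 16).
rng = np.random.default_rng(1)
groups = [rng.normal(size=(50, 16)) for _ in range(3)]
D = build_dictionary(groups, delta=0.5)
```

Because each sub-dictionary comes from an eigendecomposition, its atoms are orthonormal within a group; atoms from different groups are generally not orthogonal to each other, which is what makes the merged dictionary redundant enough for sparse coding.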

Fusion Scheme
The fusion scheme of the proposed method, shown in Figure 5, consists of two steps. In the first step, each input image $I_i$ is split into $n$ image patches of size $\sqrt{w} \times \sqrt{w}$. These image patches are reshaped into $w \times 1$ vectors $p_1^i, p_2^i, \ldots, p_n^i$. The vectors are sparse coded with the trained dictionary into sparse coefficients $z_1^i, z_2^i, \ldots, z_n^i$. In the second step, the sparse coefficients are fused by the 'Max-L1' fusion rule [50][51][52]. The fused coefficients are then inverted to the fused image by the trained dictionary.
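The two steps can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: each patch is coded independently with plain orthogonal matching pursuit (not the simultaneous variant, SOMP), the sparsity level `n_nonzero` is a hypothetical parameter, and Max-L1 is read as "for each patch position, keep the coefficient vector with the larger L1 norm". The toy dictionary and coefficient values are illustrative only.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: y ~ D @ z with at most n_nonzero active atoms."""
    residual = y.astype(float).copy()
    support = []
    z = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < 1e-10:
            break
        k = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with the residual
        if k in support:
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)  # refit on the support
        residual = y - D[:, support] @ coef
    if support:
        z[support] = coef
    return z

def fuse_max_l1(Z_a, Z_b):
    """Per patch (column), keep the coefficient vector with the larger L1 norm."""
    pick_a = np.abs(Z_a).sum(axis=0) >= np.abs(Z_b).sum(axis=0)
    return np.where(pick_a, Z_a, Z_b)

# Toy example: identity dictionary, two patch positions per source image.
D = np.eye(4)
Z_a = np.array([[1.0, 0.0], [0.0, 0.5], [0.0, 0.0], [0.0, 0.0]])
Z_b = np.array([[0.2, 2.0], [0.2, 0.0], [0.0, 0.0], [0.0, 0.0]])
Z_f = fuse_max_l1(Z_a, Z_b)
fused_patches = D @ Z_f          # inversion back to patch vectors
```

The L1 norm of a patch's coefficient vector acts as an activity measure: a sharply focused patch typically needs larger coefficients, so its code wins the comparison.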

Experiments and Analyses
In the comparison experiments, three pairs of color fluorescence images are used to test the proposed multi-focus image fusion approach. All three multi-focus image pairs are resized to 256 × 256 for comparison purposes. To show the efficiency of the proposed method, the state-of-the-art dictionary-learning-based sparse-representation fusion schemes KSVD [50] and JPDL [38], proposed by Li in 2012 and Kim in 2016 respectively, are used in the comparison experiments. The experiments are evaluated by both subjective and objective assessments. Five popular image fusion quality metrics are used in this paper for the quantitative evaluation. The patch size of all sparse-representation-based methods, including the proposed method, is set to 8 × 8. To avoid blocking artifacts, all experiments use a sliding window scheme [38,50]. The overlap of the sliding window is set to 4 pixels in both the vertical and horizontal directions in all experiments. All experiments were implemented using mixed programming in Matlab (version 2014a; MathWorks: Natick, MA, 2014) and Visual Studio (version Community 2013; Microsoft: Redmond, WA, 2013) on a laptop with an Intel(R) Core(TM) i7-4720HQ CPU @ 2.60 GHz (Intel: Santa Clara, CA, 2015) and 12.00 GB RAM.
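The sliding window setting can be made concrete with a short sketch. An 8 × 8 window with a 4-pixel overlap corresponds to a step of 4 pixels, which yields 63 × 63 = 3969 patches for a 256 × 256 image. How overlapping patches are merged is not stated in the text; the sketch assumes simple averaging, and the function names are hypothetical.

```python
import numpy as np

def sliding_patches(image, size=8, step=4):
    """Collect overlapping size x size patches as columns, with their top-left corners."""
    rows, cols = image.shape
    corners = [(r, c)
               for r in range(0, rows - size + 1, step)
               for c in range(0, cols - size + 1, step)]
    P = np.stack([image[r:r + size, c:c + size].ravel() for r, c in corners], axis=1)
    return P, corners

def reassemble(P, corners, shape, size=8):
    """Average overlapping patches back into an image of the given shape (assumed rule)."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for col, (r, c) in zip(P.T, corners):
        acc[r:r + size, c:c + size] += col.reshape(size, size)
        weight[r:r + size, c:c + size] += 1.0
    return acc / np.maximum(weight, 1.0)

rng = np.random.default_rng(2)
image = rng.random((256, 256))
P, corners = sliding_patches(image, size=8, step=4)
restored = reassemble(P, corners, image.shape, size=8)
```

Averaging the overlapping estimates is what suppresses visible seams between neighbouring patches after fusion.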

Mutual Information
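MI for images can be formalized as Equation (6):

$$MI(A, F) = \sum_{i=1}^{L} \sum_{j=1}^{L} h_{A,F}(i, j) \log_2 \frac{h_{A,F}(i, j)}{h_A(i)\, h_F(j)}, \tag{6}$$

where $L$ is the number of gray levels, $h_{A,F}(i, j)$ is the joint gray-level histogram of images $A$ and $F$, and $h_A(i)$ and $h_F(j)$ are the marginal histograms of images $A$ and $F$. The MI of the fused image can be calculated by Equation (7):

$$MI_F^{AB} = MI(A, F) + MI(B, F), \tag{7}$$

where $MI(A, F)$ represents the MI value of input image $A$ and fused image $F$, and $MI(B, F)$ represents the MI value of input image $B$ and fused image $F$.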
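As an illustration of the MI metric, the sketch below estimates $MI(A, F)$ from the normalised joint gray-level histogram and sums the two terms for the fused image. It assumes 8-bit gray-level inputs, a base-2 logarithm, and the hypothetical function names `mutual_information` and `fusion_mi`.

```python
import numpy as np

def mutual_information(a, f, levels=256):
    """MI(A, F) in bits from the normalised joint gray-level histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), f.ravel(), bins=levels,
                                 range=[[0, levels], [0, levels]])
    p_af = joint / joint.sum()                 # joint distribution h_{A,F}
    p_a = p_af.sum(axis=1, keepdims=True)      # marginal h_A
    p_f = p_af.sum(axis=0, keepdims=True)      # marginal h_F
    nz = p_af > 0                              # skip empty bins (0 * log 0 = 0)
    return float(np.sum(p_af[nz] * np.log2(p_af[nz] / (p_a @ p_f)[nz])))

def fusion_mi(a, b, f, levels=256):
    """MI of the fused image: MI(A, F) + MI(B, F)."""
    return mutual_information(a, f, levels) + mutual_information(b, f, levels)

# Sanity example: an image with 256 distinct gray levels carries 8 bits about itself.
a = np.arange(256).reshape(16, 16)
mi_self = mutual_information(a, a)
mi_const = mutual_information(a, np.zeros_like(a))
```

A higher value means the fused image shares more gray-level information with the two sources; a constant fused image shares none.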
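Q AB/F
The $Q^{AB/F}$ metric is a gradient-based quality index that measures how well the edge information of the source images is transferred to the fused image. It is calculated by:

$$Q^{AB/F} = \frac{\sum_{i,j} \big( Q^{AF}(i, j)\, w^A(i, j) + Q^{BF}(i, j)\, w^B(i, j) \big)}{\sum_{i,j} \big( w^A(i, j) + w^B(i, j) \big)}, \tag{8}$$

where $Q^{AF} = Q_g^{AF} Q_0^{AF}$; $Q_g^{AF}$ and $Q_0^{AF}$ are the edge strength and orientation preservation values at location $(i, j)$. $Q^{BF}$ can be computed similarly to $Q^{AF}$. $w^A(i, j)$ and $w^B(i, j)$ are the weights of $Q^{AF}$ and $Q^{BF}$ respectively.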

Visual Information Fidelity
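VIF is a full-reference image quality metric. It quantifies the mutual information between the reference and test images based on Natural Scene Statistics (NSS) theory and a Human Visual System (HVS) model. It can be expressed as the ratio between the distorted test image information and the reference image information, as shown in Equation (9):

$$VIF = \frac{\sum_{i \in \text{subbands}} I(C^{N,i}; F^{N,i} \mid s^{N,i})}{\sum_{i \in \text{subbands}} I(C^{N,i}; E^{N,i} \mid s^{N,i})}, \tag{9}$$

where $I(C^{N,i}; E^{N,i} \mid s^{N,i})$ and $I(C^{N,i}; F^{N,i} \mid s^{N,i})$ represent the mutual information extracted from a particular subband $i$ in the reference and the test images respectively, and $E^{N}$ and $F^{N}$ are the visual signals at the output of the HVS model from the reference and the test images respectively. To evaluate the VIF of a fused image, the average of the VIF values between each input image and the integrated image is used [56]. The evaluation function of VIF for image fusion is shown in Equation (10):

$$VIF(A, B, F) = \frac{1}{2} \big( VIF(A, F) + VIF(B, F) \big), \tag{10}$$

where $VIF(A, F)$ is the VIF value between input image $A$ and fused image $F$, and $VIF(B, F)$ is the VIF value between input image $B$ and fused image $F$.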

Q Y
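Yang et al. proposed a structural-similarity-based method for fusion assessment [57]. Yang's metric is shown in Equation (11):

$$Q_Y = \begin{cases} \lambda(\omega)\, SSIM(A, F \mid \omega) + (1 - \lambda(\omega))\, SSIM(B, F \mid \omega), & SSIM(A, B \mid \omega) \ge 0.75, \\ \max\{ SSIM(A, F \mid \omega),\, SSIM(B, F \mid \omega) \}, & SSIM(A, B \mid \omega) < 0.75, \end{cases} \tag{11}$$

where $\lambda(\omega)$ is the local weight and $SSIM(A, B)$ is the structural similarity index measure for images $A$ and $B$. The details of $\lambda(\omega)$ and $SSIM(A, B)$ can be found in [57,58].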

Q CB
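The Chen-Blum metric is a human-perception-inspired fusion metric. It consists of five steps. In the first step, the image $I(i, j)$ is filtered in the frequency domain: $I(i, j)$ is transformed to the frequency domain to obtain $I(m, n)$, which is filtered by the contrast sensitivity function (CSF) [59] filter $S(r)$, where $r = \sqrt{m^2 + n^2}$. In this image fusion metric, $S(r)$ is in polar form. The filtered image is obtained by $\tilde{I}(m, n) = I(m, n)\, S(r)$.
In the second step, the local contrast is computed. For the $Q_{CB}$ metric, Peli's contrast is used, which can be defined as:

$$C(i, j) = \frac{\phi_k(i, j) * I(i, j)}{\phi_{k+1}(i, j) * I(i, j)} - 1. \tag{12}$$

A common choice for $\phi_k(i, j)$ is the Gaussian

$$\phi_k(i, j) = \frac{1}{\sqrt{2\pi}\, \sigma_k} \exp\Big( -\frac{i^2 + j^2}{2 \sigma_k^2} \Big), \tag{13}$$

with a standard deviation $\sigma_k = 2$.
In the third step, the masked contrast map for input image $I_A(i, j)$ is calculated as:

$$C'_A = \frac{t\, (C_A)^p}{h\, (C_A)^q + Z}. \tag{14}$$

Here, $t$, $h$, $p$, $q$ and $Z$ are real scalar parameters that determine the shape of the nonlinearity of the masking function [59,60].
In the fourth step, the saliency map of $I_A(i, j)$ can be calculated by Equation (15):

$$\lambda_A(i, j) = \frac{C'^{\,2}_A(i, j)}{C'^{\,2}_A(i, j) + C'^{\,2}_B(i, j)}. \tag{15}$$

The information preservation value is computed as Equation (16):

$$Q_{AF}(i, j) = \begin{cases} \dfrac{C'_A(i, j)}{C'_F(i, j)}, & \text{if } C'_A(i, j) < C'_F(i, j), \\ \dfrac{C'_F(i, j)}{C'_A(i, j)}, & \text{otherwise}. \end{cases} \tag{16}$$

In the fifth step, the global quality map can be calculated:

$$Q_{GQM}(i, j) = \lambda_A(i, j)\, Q_{AF}(i, j) + \lambda_B(i, j)\, Q_{BF}(i, j). \tag{17}$$

Then the value of $Q_{CB}$ is obtained by averaging the global quality map:

$$Q_{CB} = \overline{Q_{GQM}(i, j)}. \tag{18}$$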
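Steps three to five can be sketched as follows, starting from local contrast maps that are assumed to be already computed (the CSF filtering and Peli's contrast of steps one and two are omitted). The forms used here are assumptions based on the usual formulation of the Chen-Blum metric: masked contrast $C' = t\,C^p / (h\,C^q + Z)$, saliency as the normalised squared masked contrast, and information preservation as the ratio of the smaller to the larger masked contrast. The parameter values in the example are illustrative only, and the function names are hypothetical.

```python
import numpy as np

def masked_contrast(C, t, h, p, q, Z):
    """Masked contrast map C' = t * C^p / (h * C^q + Z), on the contrast magnitude."""
    C = np.abs(C)
    return t * C**p / (h * C**q + Z)

def chen_blum_from_contrast(C_a, C_b, C_f, t, h, p, q, Z, eps=1e-12):
    """Q_CB from the local contrast maps of sources A, B and fused image F."""
    Ca, Cb, Cf = (masked_contrast(C, t, h, p, q, Z) for C in (C_a, C_b, C_f))
    denom = Ca**2 + Cb**2 + eps
    lam_a = Ca**2 / denom                       # saliency map of A
    lam_b = Cb**2 / denom                       # saliency map of B
    # Information preservation: ratio of the smaller to the larger masked contrast.
    q_af = np.minimum(Ca, Cf) / np.maximum(np.maximum(Ca, Cf), eps)
    q_bf = np.minimum(Cb, Cf) / np.maximum(np.maximum(Cb, Cf), eps)
    quality_map = lam_a * q_af + lam_b * q_bf   # global quality map
    return float(quality_map.mean())

# Example: F preserves all contrast of A, and B carries no contrast at all.
rng = np.random.default_rng(3)
C_a = rng.random((8, 8)) + 0.1
C_b = np.zeros((8, 8))
params = dict(t=1.0, h=1.0, p=3.0, q=2.0, Z=1e-4)   # illustrative values
q_cb = chen_blum_from_contrast(C_a, C_b, C_a.copy(), **params)
```

In this example the result is close to 1, since the fused image keeps the contrast of the only salient source. The small constant `eps` only guards against division by zero where all contrast maps vanish.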

Image Quality Comparison
To show the efficiency of the proposed method, the quality of the fused images is compared in terms of visual effect, the accuracy of focused region detection, and objective evaluations.

Comparison Experiment 1
The source fluorescence images are obtained from a public website [61]. Figure 6a,b are the source multi-focus fluorescence images. To show the details of the fused image, two image blocks, marked by a red and a blue square respectively, are highlighted and magnified. The image block in the red square is out of focus in Figure 6a, and the image block in the blue square is out of focus in Figure 6b. The corresponding image blocks in the blue and red squares are fully focused in Figure 6a,b respectively. Figure 6c-e show the fused images of KSVD, JPDL, and the proposed method, respectively. The differences among the fused images of the three methods are difficult to distinguish by eye. To evaluate the fusion performance objectively, $Q^{AB/F}$, MI, VIF, $Q_Y$, and $Q_{CB}$ are used as image fusion quality measures. The fusion results of the multi-focus fluorescence images using the three methods are shown in Table 1. The best results of each evaluation metric are highlighted in bold in Table 1. According to Table 1, the proposed method has the best performance in all five evaluation metrics. In particular, for the objective evaluation metric $Q^{AB/F}$, the proposed method obtains a higher result than the other two image fusion methods. Since $Q^{AB/F}$ is a gradient-based quality metric that measures how well the edge information of the source images is transferred to the fused image, this means the proposed method produces a fused image that better preserves edge information.

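Comparison Experiments 2 and 3
Similarly, the source fluorescence images shown in Figures 7 and 8a,b are obtained from public websites [62,63] respectively. In each set of source images, the two images (a) and (b) focus on different items. The source images are fused by KSVD, JPDL, and the proposed method to get a fully focused image, and the corresponding fusion results are shown in Figures 7 and 8c-e respectively. The objective metrics of multi-focus comparison experiments 2 and 3 are shown in Tables 2 and 3 respectively to evaluate the quality of the fused images. According to the metric results, the proposed method has the best performance in all five objective evaluations in comparison experiments 2 and 3. So the proposed method has the best overall performance among all compared methods.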

Processing Time Comparison
Table 4 compares the processing times of the three comparison experiments. The proposed solution has lower computation costs than KSVD and JPDL in the image fusion process. Unlike KSVD, the dictionary construction of the proposed solution does not rely on iteration to extract the underlying information of images; such iteration is inefficient in dictionary construction. Although JPDL and the proposed method both cluster image pixels or patches based on geometric similarity, the proposed method does not use the iterative Steering Kernel Regression (SKR), which is time consuming.

Conclusions
This paper proposes a novel sparse-representation-based image fusion framework that integrates geometric dictionary construction. A geometric image patch classification approach is presented to group image patches from different source images based on the similarity of their geometric structure. The proposed method extracts a compact and informative sub-dictionary from each image patch cluster by PCA, and these sub-dictionaries are combined into a dictionary for sparse representation. Image patches are then sparsely coded into coefficients by the trained dictionary. To obtain better edge and corner details in the fusion results, the proposed solution also chooses the image block size adaptively and selects optimal coefficients during image processing. The sparsely coded coefficients are fused by the Max-L1 rule and inverted to the fused image. The proposed method is compared with existing mainstream sparse-representation-based methods in various experiments. The experimental results show that the proposed method has good fusion performance in different image scenarios.

Figure 2. Smooth Image Patches and Non-smooth Image Patches: (a) shows smooth image patches; (b) shows non-smooth image patches.

Figure 3. PDFs of Dominant Measure R: (a) shows the PDF of R for 6 × 6 patches; (b) shows the PDF of R for 7 × 7 patches; (c) shows the PDF of R for 8 × 8 patches; (d) shows the PDF of R for 9 × 9 patches.

Figure 4. Stochastic Image Patches and Dominant Orientation Image Patches: (a) shows stochastic image patches; (b) shows dominant orientation image patches.

Figure 6. Fusion Results of Multi-focus Fluorescence Image -1: (a,b) are source images; (c-e) are the fused images of KSVD, JPDL, and the proposed method respectively.

Figure 7. Fusion Results of Multi-focus Fluorescence Image -2: (a,b) are source images; (c-e) are the fused images of KSVD, JPDL, and the proposed method respectively.

Figure 8. Fusion Results of Multi-focus Fluorescence Image -3: (a,b) are source images; (c-e) are the fused images of KSVD, JPDL, and the proposed method respectively.

Table 1. Fusion Performance Comparison -1 of Multi-focus Fluorescence Image Pairs.
* The highest result in each column is marked in bold-face.

Table 2. Fusion Performance Comparison -2 of Multi-focus Fluorescence Image Pairs.

.7692 * 3.8982 0.7488 0.8058 0.8206
* The highest result in each column is marked in bold-face.

Table 3. Fusion Performance Comparison -3 of Multi-focus Fluorescence Image Pairs.

Table 4. Processing Time Comparison.
* The highest result in each column is marked in bold-face.