A Fusion Method for Atomic Force Acoustic Microscopy Cell Imaging Based on Local Variance in Non-Subsampled Shearlet Transform Domain

Featured Application: Our method can be applied to AFAM imaging, which helps to analyze cell structure.

Abstract: Atomic force acoustic microscopy (AFAM) is a measurement method that uses a probe and acoustic waves to image the surface and internal structures of different materials. For cellular material, the morphology and phase images of AFAM reflect the outer surface and internal structures of the cell, respectively. This paper proposes an AFAM cell image fusion method in the Non-Subsampled Shearlet Transform (NSST) domain based on local variance. First, NSST is used to decompose the source images into low-frequency and high-frequency sub-bands. Then, the low-frequency sub-bands are fused with weights derived from the local variance, while contrast limited adaptive histogram equalization is used to improve the source image contrast so that details are better expressed in the fused image. The high-frequency sub-bands are fused using the maximum rule. Since the AFAM image background contains a lot of noise, an improved segmentation algorithm based on the Otsu algorithm is proposed to segment the cell region, and image quality metrics computed on the segmented region make the evaluation more accurate. Experiments on different groups of AFAM cell images demonstrated that, compared with traditional methods, the proposed method can clearly show the internal structures and the contours of the cells.


Introduction
Atomic force acoustic microscopy (AFAM) [1] is an imaging technology combining acoustic detection and atomic force microscopy (AFM), which can nondestructively image the internal structures, as well as the surface topography, of samples at high resolution. In AFAM, the transducer in the pedestal emits acoustic waves into the sample as the probe scans over it. The probe response signal is collected by the light-spot detector mounted on the cantilever of the probe and analyzed during the scan. The morphology and phase images are obtained simultaneously after a 2D scan of the sample. When AFAM is applied to image cells, the morphology images only show the cytoplasmic regions, while the phase images show the intracellular structures but do not show the cell boundaries clearly [1]. To see the cells in detail, image fusion is needed to combine both images, so that the fused image contains both the morphology contour and the internal structures of the cell. To our knowledge, there is no research on AFAM image fusion yet.
Image fusion is a process combining two or more source images of the same scene into a single fused image that is more suitable for human visual perception or computer vision processing. The fusion needs to extract the complementary information in the source images and remove the redundant information, so the selection of source image information is critical. One important approach is to separate the information in the source images and then fuse the different types of information separately. For instance, fusion methods based on multi-scale transformation decompose the source images at different scales to obtain information such as edges, details, and base contours, and apply suitable fusion rules to each kind of information to obtain a better fused image.
Multi-scale transform (MST) is an important tool in image fusion, and many MSTs have been proposed, such as the wavelet transform [2], contourlet transform [3], curvelet transform [4], shearlet transform [5], and the non-subsampled contourlet transform (NSCT) [6,7]. The non-subsampled shearlet transform (NSST) [8,9] is an improved multi-scale geometric analysis tool based on the shearlet transform, which has the advantages of simple calculation and a weaker Gibbs effect, and is therefore widely used in multi-scale image fusion [10][11][12]. NSST can separate image information, such as edges and base layers, so that different fusion rules can be applied to different kinds of information. Guo et al. [11] presented a multi-focus image fusion method based on NSST and the human vision system. Vishwakarma et al. proposed a variable-coefficient Meyer window for constructing a shearing filter, which is used for image fusion [12]. However, these methods only work well for certain types of image fusion, such as optical images with inconsistent focus in the same scene.
Contrast limited adaptive histogram equalization (CLAHE) can effectively enhance the local contrast of images, and the variance reflects the local information richness of an image, which is an effective indicator for information selection in image fusion [13,14]. Because the local variance of a CLAHE-enhanced image reflects local image information efficiently, it is used in our image fusion to preserve more information from low-contrast images.
In this paper, a cell image fusion method with AFAM based on the local variance in the NSST domain is proposed. The proposed method decomposes the image into low-frequency sub-band and a series of high-frequency sub-bands using NSST and fuses the low-frequency and high-frequency sub-bands using a weight map based on CLAHE-enhanced variance and maximum rule, respectively. The results of the experiments showed that the proposed method can effectively protect both edge and geometric structures by combining the morphology and phase images.

Non-Subsampled Shearlet Transform (NSST)
NSST is an improved method based on the shearlet transform combined with a non-subsampled Laplacian pyramid (NSLP). Wang et al. proposed a method to implement NSST directly in the time domain [8], which divides the implementation of NSST into two steps: multi-scale decomposition and directional filtering. In NSST, NSLP filters replace the Laplacian pyramid filters used in the shearlet transform to scale the image. Using a k-level non-subsampled pyramid, the source image is decomposed into (k + 1) sub-bands, all of the same size as the source image: a low-frequency sub-band f_a^k and k high-frequency sub-bands {f_d^i | i = 1, 2, ..., k}, which are represented by:

f_a^k = H_0(z^{2^{k-1} I}) f_a^{k-1},  (1)

f_d^k = H_1(z^{2^{k-1} I}) f_a^{k-1},  (2)

where f_a^k denotes the low-frequency sub-band at resolution level k (with f_a^0 the source image) and f_d^k denotes the corresponding high-frequency sub-band. H_0(z) and H_1(z) represent the low-pass and high-pass filters, respectively, I is the identity matrix, and H_{0,1}(z^{2^{k-1} I}) is the kth-level filter obtained by up-sampling the filter.
Each high-frequency sub-band is then convolved with a directional filter, whose construction allows large flexibility in the choice; for instance, it can be computed simply on a pseudo-polar grid with a Meyer window function and mapped back to the Cartesian grid. The NSST coefficients can be calculated by the following equation:

f_{d,l}^k = f_d^k ∗ w_l,  (3)

where w_l denotes a directional filter in the time domain and ∗ denotes convolution.
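As an illustration of the multi-scale step, the sketch below implements a non-subsampled (à trous) Laplacian pyramid with NumPy/SciPy. The binomial kernel standing in for H_0(z) and the omission of the directional shearing filters are simplifying assumptions; a real NSST implementation would additionally filter each high-frequency sub-band along several directions.

```python
import numpy as np
from scipy import ndimage

def nslp_decompose(img, levels=3):
    """Simplified non-subsampled Laplacian pyramid (NSLP) sketch.

    At each level the low-pass filter is up-sampled (zeros inserted
    between taps) instead of down-sampling the image, so every
    sub-band keeps the size of the source image.
    """
    # small binomial low-pass kernel (assumption: stands in for H0(z))
    h0 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    low = img.astype(float)
    highs = []
    for k in range(levels):
        # insert zeros between taps: filtering with H0(z^(2^k))
        step = 2 ** k
        kernel = np.zeros((len(h0) - 1) * step + 1)
        kernel[::step] = h0
        smoothed = ndimage.convolve1d(low, kernel, axis=0, mode="reflect")
        smoothed = ndimage.convolve1d(smoothed, kernel, axis=1, mode="reflect")
        highs.append(low - smoothed)  # high-frequency sub-band
        low = smoothed                # low-frequency sub-band
    return low, highs

img = np.random.rand(64, 64)
low, highs = nslp_decompose(img, levels=3)
# no down-sampling occurs, so summation reconstructs the source exactly
recon = low + sum(highs)
```

Because nothing is down-sampled, all sub-bands are the same size as the source and the decomposition is perfectly invertible by summation.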

Contrast Limited Adaptive Histogram Equalization (CLAHE)
Contrast limited adaptive histogram equalization is a classic method for image contrast enhancement [15]. To perform CLAHE, the original image is split into equally sized rectangular partitions, and a gray-level transformation is computed and applied on each partition.
Contrast-limited gray-level transformation is the main procedure in CLAHE. It is similar to traditional histogram equalization (HE), but it limits the contrast by a clip point that cuts off the peak values of the histogram. For a given image I with L discrete gray levels, denoted g_0, g_1, ..., g_{L−1}, the probability density function (PDF) p(i) in each partition is defined as:

p(i) = n_i / N,  i = 0, 1, ..., L − 1,  (4)

where n_i denotes the number of pixels with gray level g_i and N is the total number of pixels in the partition. The part of p(i) above the clip point is cropped, and the cropped mass is distributed evenly over all bins to obtain a new p′(i), as shown in Figure 1a:

p′(i) = min(p(i), p_clip) + (1/L) Σ_{j=0}^{L−1} max(p(j) − p_clip, 0),  (5)

where p_clip is the clip point. The gray-level transformation function T(g_k) is shown in Equation (6):

T(g_k) = (L − 1) c(k).  (6)
It should be noted that the transformation functions on different partitions will be different.
where c(i) is the cumulative distribution function, i.e., the cumulative sum of p′(i). To remove possible block artifacts, bilinear interpolation between the partitions is used to smooth the final pixel values: the transformation is applied exactly at the center pixel of each partition, while the other pixel values are interpolated from the transformation functions of the surrounding partitions:

I_CLAHE(x, y) = Σ_{k=1}^{4} ω_k T_{B_k}(I(x, y)),  ω_k = (1 − |x − x_{B_k}| / L)(1 − |y − y_{B_k}| / L),  (7)

where I(x, y) denotes the value of the pixel at (x, y) and I_CLAHE is the transformed image. Pixel (x, y) is surrounded by the centers of four partitions B_k (k = 1, 2, 3, 4), T_{B_k}(g) is the transformation function of block B_k, (x_{B_k}, y_{B_k}) is the center of block B_k, and L is the side length of the partitions (see Figure 1b).
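The clip-and-redistribute step can be sketched for a single partition as follows. The bin count, the single-pass even redistribution, and the omission of the bilinear blending between partitions are simplifications for illustration, not the full CLAHE procedure.

```python
import numpy as np

def clipped_hist_transform(tile, clip, n_levels=256):
    """Contrast-limited gray-level transform for one CLAHE partition.

    Histogram mass above the clip point is cropped and the excess is
    spread evenly over all bins before building the cumulative
    transform T(g_k). A full CLAHE would then bilinearly blend the
    transforms of the four surrounding partitions for each pixel.
    """
    hist, _ = np.histogram(tile, bins=n_levels, range=(0, n_levels))
    p = hist / tile.size                         # PDF: p(i) = n_i / N
    excess = np.clip(p - clip, 0.0, None).sum()  # cropped mass
    p = np.minimum(p, clip) + excess / n_levels  # redistribute evenly
    c = np.cumsum(p)                             # CDF c(i)
    T = np.floor((n_levels - 1) * c).astype(np.uint8)
    return T

tile = (np.arange(64).reshape(8, 8) % 256).astype(np.uint8)
T = clipped_hist_transform(tile, clip=0.01)
enhanced = T[tile]  # apply the transform per pixel via a lookup table
```

Note that a single even redistribution can push some bins slightly above the clip point again; production implementations typically iterate this step.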

The Framework of the Proposed Fusion Method
The framework of our proposed fusion method is depicted in Figure 2. It is composed of the following steps.

Decomposition
The morphology and phase images are first decomposed with NSST into a low-frequency sub-band and multiple high-frequency sub-bands, respectively. Then, they are fused using different fusion rules separately.

Low-Frequency Sub-Band Fusion
The low-frequency sub-band fusion adopts a weighted fusion rule based on a weight map W. To enhance local information, CLAHE is first applied to the source image, and the local variance of the enhanced image is adopted to calculate the weight map.
μ(x, y) = (1/N) Σ_{(n,m)∈D_{x,y}} I_CLAHE(n, m),  (8)

S(x, y) = (1/N) Σ_{(n,m)∈D_{x,y}} [I_CLAHE(n, m) − μ(x, y)]^2,  (9)

where I_CLAHE denotes the image processed with CLAHE, (n, m) denotes the position of a pixel, D_{x,y} denotes the region centered on (x, y) with the size s × s, and N denotes the number of pixels in D_{x,y}. S(x, y) denotes the local variance. The weight map W is calculated by Equation (10) from the local variances S_morp and S_phase of the two source images and the weight coefficient α:

W(x, y) = S_morp(x, y)^α / (S_morp(x, y)^α + S_phase(x, y)^α),  (10)

and the fused low-frequency sub-band is obtained pixel by pixel by Equation (11):

L_fusion(x, y) = W(x, y) L_morp(x, y) + (1 − W(x, y)) L_phase(x, y).  (11)

With the increase of α, the fusion effect gradually improves and finally levels off; in the experiments, α = 2 proved to be a suitable value for the fusion.
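A minimal sketch of this low-frequency rule, assuming a power-weight form W = S_morp^α / (S_morp^α + S_phase^α) for the weight map; `local_variance` computes S(x, y) with a box filter, and `eps` is an added guard against division by zero in flat regions (an implementation choice, not from the paper).

```python
import numpy as np
from scipy import ndimage

def local_variance(img, s=5):
    """Local variance S(x, y) over an s-by-s window: E[x^2] - E[x]^2."""
    mean = ndimage.uniform_filter(img, size=s)
    mean_sq = ndimage.uniform_filter(img * img, size=s)
    return np.clip(mean_sq - mean * mean, 0.0, None)

def fuse_low_frequency(L_morp, L_phase, S_morp, S_phase, alpha=2.0):
    """Per-pixel weighted blend of the two low-frequency sub-bands,
    with weights from the (CLAHE-enhanced) local variances."""
    eps = 1e-12  # guard against 0/0 where both variances vanish
    W = S_morp ** alpha / (S_morp ** alpha + S_phase ** alpha + eps)
    return W * L_morp + (1.0 - W) * L_phase

rng = np.random.default_rng(1)
L_morp, L_phase = rng.random((2, 32, 32))
S_m = local_variance(L_morp)
S_p = local_variance(L_phase)
fused = fuse_low_frequency(L_morp, L_phase, S_m, S_p, alpha=2.0)
```

Since W lies in [0, 1], every fused pixel is a convex combination of the two source pixels.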

High-Frequency Sub-Bands Fusion
The energy of the high-frequency sub-bands is low, and the key information in the high-frequency sub-bands of the two source images does not overlap; the coefficients with larger absolute values dominate the features. Therefore, the max rule is selected to fuse the high-frequency sub-bands:

H_fusion^{k,l}(x, y) = H_morp^{k,l}(x, y) if |H_morp^{k,l}(x, y)| ≥ |H_phase^{k,l}(x, y)|, otherwise H_phase^{k,l}(x, y),  (12)

where H^{k,l} denotes the high-frequency sub-band at scale k and direction l.
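In code, the max-absolute rule is a one-line selection per sub-band:

```python
import numpy as np

def fuse_high_frequency(H_morp, H_phase):
    """Max-absolute rule: keep, per pixel, the high-frequency
    coefficient with the larger absolute value."""
    return np.where(np.abs(H_morp) >= np.abs(H_phase), H_morp, H_phase)

a = np.array([[0.5, -2.0], [0.1, 0.0]])
b = np.array([[-1.0, 1.0], [0.05, -0.3]])
fused = fuse_high_frequency(a, b)  # → [[-1.0, -2.0], [0.1, -0.3]]
```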

Image Quality Metrics
In cellular imaging, the cellular information is the most important part of the fused image; therefore, background information should be ignored when evaluating the quality of the fused image. To remove the background, a method based on the Otsu algorithm is used. Otsu [16] is an image segmentation method based on the principle of inter-class variance maximization:

T_otsu = argmax_{0 ≤ T ≤ L−1} σ(T),  (13)

where T is the gray level and σ(T) denotes the inter-class variance at threshold T; Equation (13) selects the threshold T_otsu that maximizes the inter-class variance. Since there is usually more than one cell in the image, a single Otsu segmentation cannot accurately obtain all cell boundaries. To solve this problem, an improved Otsu algorithm is proposed: to get the ROIs of the different cell parts, the morphology image is segmented with the Otsu algorithm repeatedly, discarding the parts below the threshold, until a relatively consistent region is obtained; this region is then dilated to include the cell boundary. The detailed steps of the proposed method are as follows:
• Step 1: Decompose the morphology and phase images with NSST into a low-frequency sub-band and a series of high-frequency sub-bands each.
• Step 2: Use CLAHE on the source images, and use Equations (8) and (9) to calculate S_morp and S_phase. Get the weight map W using Equation (10), and fuse the low-frequency sub-bands using Equation (11) to obtain L_fusion. The weight coefficient in our experiment is set as α = 2.0, and the size of the region used to calculate the variance is 5 × 5.

• Step 3: Equation (12) is utilized to fuse the high-frequency sub-bands.

• Step 4: Perform the inverse NSST of the low-frequency and the high-frequency sub-bands to obtain the fused image.

• Step 5: Segment the image ROI and evaluate the results.
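The ROI segmentation of Step 5 can be sketched as below. The Otsu threshold is computed from the gray-level histogram; the iteration count and dilation radius are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy import ndimage

def otsu_threshold(img, n_bins=256):
    """Otsu threshold maximizing the inter-class variance."""
    hist, edges = np.histogram(img, bins=n_bins)
    p = hist / hist.sum()
    w0 = np.cumsum(p)                      # class-0 probability
    mu = np.cumsum(p * np.arange(n_bins))  # class-0 cumulative mean
    mu_t = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma = (mu_t * w0 - mu) ** 2 / (w0 * (1.0 - w0))
    sigma = np.nan_to_num(sigma)           # ignore degenerate splits
    k = int(np.argmax(sigma))
    return edges[k + 1]

def segment_cell_roi(morp, n_iter=2, dilate_iter=3):
    """Sketch of the improved-Otsu ROI step: threshold the morphology
    image repeatedly, discard pixels below the threshold each round,
    then dilate the mask to take in the cell boundary."""
    mask = np.ones_like(morp, dtype=bool)
    for _ in range(n_iter):
        t = otsu_threshold(morp[mask])
        mask &= morp > t
    return ndimage.binary_dilation(mask, iterations=dilate_iter)

# toy example: a bright square cell on a dark background
demo = np.full((20, 20), 0.1)
demo[5:15, 5:15] = 0.9
roi = segment_cell_roi(demo, n_iter=1, dilate_iter=1)
```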

Results and Discussion
The schematic diagram of the AFAM and the image acquisition are clearly presented in Ref. [1], where AFAM was used to successfully and accurately image the morphology and internal structures of cells in a facile and non-invasive manner. Sixteen groups of morphology and phase images from two scans in Ref. [1], provided by the Medical Ultrasonic Laboratory of Huazhong University of Science and Technology, Wuhan, China, were used for testing. From Figure 4b,d, it is obvious that the backgrounds of the phase images are complicated; to show the cells more clearly, we segmented the cell regions from their corresponding morphology images and ignored the background. Our fusion method is compared with five other fusion methods: Laplacian pyramid (LP) [17], curvelet transform (CVT) [18], NSST-VGG [19], gradient transfer fusion (GFT) [20], and FusionGAN [21]. For the sake of fairness, the number of decomposition levels in LP, CVT, and NSST-VGG is set to 3, the same as in the proposed method, and the remaining parameters were set to the values that produced the best experimental results for each method.

Quality Evaluation
Four quantitative evaluation metrics, MI [22], Q^AB/F [23], Q_LSSIM [24], and VIFF [25], are used to evaluate the performance of the different fusion methods. The cell ROI segmentation method based on the improved Otsu algorithm described above is applied to make the results more convincing.

MI
Mutual information (MI) [22] measures the correlation between two variables and can be used to evaluate image fusion performance: the higher the MI score, the richer the information obtained from the source images. The MI of the fused image is defined as follows:

MI = 2 [I(A, F) / (H(A) + H(F)) + I(B, F) / (H(B) + H(F))],  (14)

where F is the fused image, A and B are the source images, I(X, F) denotes the mutual information between X and F, and H(A), H(B), and H(F) are the entropies of images A, B, and F, respectively.
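A sketch of the metric, assuming the normalized form above; mutual information is estimated from joint gray-level histograms, and the bin count is an arbitrary choice.

```python
import numpy as np

def _entropy(p):
    """Shannon entropy of a probability vector (zero bins ignored)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def image_entropy(img, bins=64):
    hist, _ = np.histogram(img, bins=bins)
    return _entropy(hist / hist.sum())

def mutual_information(x, f, bins=64):
    """I(X; F) = H(X) + H(F) - H(X, F) from a joint histogram."""
    joint, _, _ = np.histogram2d(x.ravel(), f.ravel(), bins=bins)
    p = joint / joint.sum()
    return _entropy(p.sum(axis=1)) + _entropy(p.sum(axis=0)) - _entropy(p.ravel())

def fusion_mi(a, b, f, bins=64):
    """Normalized fusion MI: each source's mutual information with the
    fused image, scaled by the marginal entropies."""
    return 2.0 * (mutual_information(a, f, bins) / (image_entropy(a, bins) + image_entropy(f, bins))
                  + mutual_information(b, f, bins) / (image_entropy(b, bins) + image_entropy(f, bins)))

rng = np.random.default_rng(0)
a = rng.random((32, 32))
b = rng.random((32, 32))
score = fusion_mi(a, b, 0.5 * (a + b))
```

When the fused image equals both sources, each normalized term is 1/2 and the metric reaches its maximum of 2.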

Q^AB/F
Q^AB/F reflects the quality of the edge information transferred from the input images to the fused image and can be used to compare the performance of different image fusion algorithms [23]. It is defined as follows:

Q^AB/F = [Σ_{i=1}^{N} Σ_{j=1}^{M} (Q^AF(i, j) W^A(i, j) + Q^BF(i, j) W^B(i, j))] / [Σ_{i=1}^{N} Σ_{j=1}^{M} (W^A(i, j) + W^B(i, j))],  (15)

where Q^AF(i, j) = Q_g^AF(i, j) Q_o^AF(i, j), with Q_g^AF(i, j) and Q_o^AF(i, j) the edge strength and orientation preservation values at location (i, j); W^A and W^B are weight maps equal to the edge strengths of the source images; and N × M is the size of the source images. The higher the value of Q^AB/F, the less edge information of the fused image is lost.
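A simplified sketch of Q^AB/F using Sobel gradients. The sigmoid constants are the commonly used Xydeas-Petrovic values and are assumptions here, as is the choice of the raw edge strength for the weights W^A and W^B.

```python
import numpy as np
from scipy import ndimage

def _edges(img):
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    g = np.hypot(gx, gy)              # edge strength
    a = np.arctan(gy / (gx + 1e-12))  # orientation in (-pi/2, pi/2)
    return g, a

def _preservation(gA, aA, gF, aF):
    # relative edge strength and orientation, mapped through sigmoid
    # models (standard constants, assumed here)
    G = np.where(gA > gF, gF / (gA + 1e-12), gA / (gF + 1e-12))
    A = 1.0 - np.abs(aA - aF) / (np.pi / 2)
    Qg = 0.9994 / (1.0 + np.exp(-15.0 * (G - 0.5)))
    Qo = 0.9879 / (1.0 + np.exp(-22.0 * (A - 0.8)))
    return Qg * Qo

def q_abf(A, B, F):
    """Edge-preservation values weighted by the source edge strengths."""
    gA, aA = _edges(A)
    gB, aB = _edges(B)
    gF, aF = _edges(F)
    QAF = _preservation(gA, aA, gF, aF)
    QBF = _preservation(gB, aB, gF, aF)
    num = (QAF * gA + QBF * gB).sum()
    den = (gA + gB).sum()
    return num / (den + 1e-12)

rng = np.random.default_rng(2)
A = rng.random((32, 32))
B = rng.random((32, 32))
q_same = q_abf(A, B, A)               # F preserves all of A's edges
q_flat = q_abf(A, B, np.zeros_like(A))  # F destroys all edges
```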

Q_LSSIM
Q_LSSIM is a quality metric for the fused image based on the structural similarity with the source images; the closer the Q_LSSIM value is to 1, the better the fusion performance. Q_LSSIM is calculated from the SSIM between the fused image and the two source images over local windows w, and is defined as follows:

Q_LSSIM = (1/N) Σ_w [γ(w) lSSIM(A, F|w) + (1 − γ(w)) lSSIM(B, F|w)],  (16)

where the lSSIM values are obtained by using the method in [24], N is the number of local windows of lSSIM(A, B, F), and γ(w) is the local weight of window w.

VIFF
VIFF is a quality metric for fused images based on visual information fidelity [25]. The fused image and the two source images are decomposed into different scales using a Laplacian pyramid, and at each scale the images are divided into blocks. The visual information of each block is then evaluated with and without distortion, the VIFF of each scale is computed, and the overall metric is calculated from these per-scale results.

Experimental Results
Three pairs of morphology and phase images of Staphylococcus aureus, called "Data-1", "Data-2", and "Data-3", respectively, are selected as representative results for subjective evaluation. In the results of LP, CVT, and NSST-VGG, although the internal structure of the cell is very clear, the cell boundary is very blurred and is mixed with the noise of the surrounding phase image. All the results of FusionGAN have clear boundary information, but in "Data-1" and "Data-3" they lack internal structure information, as shown in Figures 5g and 6g, and in Figure 7g the result shows only the boundaries of the internal structure. This shows that the fusion effect of FusionGAN is not stable. In "Data-3", the contrast of the GFT result is low, although GFT performs better on "Data-1" and "Data-2"; this means that GFT has a similar stability problem to FusionGAN. The results of the proposed method clearly show the internal structures and contours of the cells, with high contrast between the intracellular structures and the cytoplasmic regions, although in terms of visual appearance its performance does not match the best results of GFT and FusionGAN. Table 1 shows the quantitative comparison of the four metrics for the six methods, and Table 2 shows the p-values of the t-tests between the proposed method and each comparison method for the different metrics. The proposed method is the best in almost all metrics, and the t-tests show a significant difference from the other methods. Although GFT has the highest score on Q_LSSIM, the p-values in Table 2 show that there is no significant difference from the proposed method. The quantitative analysis results are consistent with the subjective visual comparison. LP, CVT, and NSST-VGG are all multi-scale fusion methods; their performance is relatively close, and there is a large gap between them and the proposed method.
The GFT method performs better when the morphology image and the phase image have a large gray-level difference, as in "Data-1", but when the two are close, the fused image becomes blurred. Q^AB/F indicates the fusion effect on edge information, and the GFT results are the worst on this metric. FusionGAN requires a large amount of data for training, which AFAM cell images cannot provide. The network comes from an infrared image fusion project [21] with similar targets, so it cannot achieve good results for AFAM cell image fusion, and its results are unstable. The FusionGAN model is also very difficult to train and does not converge easily; it is difficult to achieve a good fusion effect even with a large number of images.

Conclusions
This paper proposed a novel fusion method for AFAM cell images to solve the problem that the phase image cannot show the surface structure information. The proposed method utilizes NSST to decompose the source images into low-frequency and high-frequency sub-bands. A weight map calculated from the local variance of the CLAHE-enhanced image is used for the low-frequency sub-band fusion, while the max rule is used for the high-frequency fusion. Experiments were performed on 16 groups of morphology and phase images and compared with five methods (LP, CVT, NSST-VGG, GFT, and FusionGAN), with only the cell area segmented for the evaluation. Both the subjective evaluations and the objective quality metrics, including MI, Q^AB/F, Q_LSSIM, and VIFF, showed that our method produces a clearer intracellular structure and cellular contour than the other methods, which is beneficial to the analysis of cell structures.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-3417/10/21/7424/s1. Previously reported AFAM scan data were used to support this study and are available at doi:10.3390/cells8040314. These prior studies (and the dataset) are cited at the relevant places within the text as reference [1].