All-in-Focused Image Combination in the Frequency Domain Using Light Field Images

Abstract: All-in-focused image combination is a fusion technique used to acquire related data from a set of images focused at different depth levels, which means that one can distinguish objects in the foreground and background regions. When reconstructing an all-in-focused image, we need to identify the in-focused regions of multiple input images captured with different focal lengths. This paper presents a new method to find and fuse the in-focused regions of the different focal stack images. After applying the two-dimensional discrete cosine transform (DCT) to transform the focal stack images into the frequency domain, we utilize the sum of the updated modified Laplacian (SUML), an enhancement of the SUML, and the harmonic mean (HM) for calculating the in-focused regions of the stack images. After fusing all the in-focused information, we transform the result back by the inverse DCT, so that the out-focused parts are removed. Finally, we combine all the in-focused image regions and reconstruct the all-in-focused image.


Introduction
Light field cameras, also called plenoptic cameras, have been popularly used in digital refocusing and three-dimensional reconstruction. They are fabricated with internal micro-lens arrays to capture light field information in such a way that one can refocus the image after acquisition. This is a unique capability of the light field camera [1]. Due to the finite depth of field (DOF) in normal digital cameras, an image displays sharp information for objects inside the DOF, while objects outside the DOF appear blurred. Since a light field image generates a set of images focused at different depth levels after capture, one can distinguish objects in the foreground and background regions. Moreover, it can generate a set of multi-view images without the need for calibration images.
All-in-focused image combination is a method for merging the in-focused information of stack images that are captured at different focal planes from the same position. The algorithm fuses an image sequence to reconstruct the all-in-focused image, so that all relevant object regions appear sharp in the final reconstruction. For detecting the in-focused regions of the images and combining them into the all-in-focused image, various focus measurement methods using whole images have been studied. Up to now, various focus measurement and image combination methods have been proposed for different applications. Aydin and Akgul have proposed a focus measure operator that applies flexibly shaped and weighted support windows [2]. Their algorithm can retrieve depth discontinuities, and the all-focused image is used to determine the support window. Zhang et al. have presented a focus detection method that partitions source images into edges, textures, and smooth regions [3].

A multi-focus image fusion method for visual sensor networks in the discrete cosine transform (DCT) domain has also been suggested [4]. This method utilizes variance values to measure and fuse multi-focus images using DCT-based algorithms. Lee and Zhou have introduced a DOF extension using the fusion of two images [5]. Their algorithm applies DCT-STD and DWT-STD for focus detection. In addition, Chen et al. have demonstrated a multi-spectral imaging method that can also reproduce color images [6][7][8].
While most previous methods for image combination employed a few inputs of different focal images, we attempt a new method for image composition using many input images. In this paper, we describe a new all-in-focused image combination method that integrates the sum of updated modified Laplacian (SUML) and the harmonic mean (HM) in the discrete cosine transform (DCT) domain.
The sum of modified Laplacian (SML) performs better than other focus criteria [9], and the HM is more robust than the arithmetic mean, because they both support small pieces of information and increase their influence on the overall estimation. Moreover, the proposed method takes advantage of image representation in the frequency domain. Since it is difficult to classify in-focused and out-focused regions in the spatial domain when edges of the out-focused parts are sharp, we transform the images into the frequency domain to analyze the image information. The main contributions of this paper are: (1) a method for extending the DOF in an imaging system that creates an image from a set of different focal images captured in one shot, and (2) an effective method for all-in-focused image combination performed in the frequency domain, which avoids the computationally complex artifact reduction process required in the spatial domain.

All-in-Focused Image Combination
In this paper, we propose an image combination method that detects in-focused regions in the light field images and merges them into the all-in-focused image. Figure 1 represents the procedure of our proposed method. After dividing the input stack images into blocks of 8 × 8 pixels and calculating the DCT coefficients of each block, we calculate SUML and HM as the in-focus measures and perform the image combination procedure. Based on the final in-focused maps, we reconstruct the all-in-focused image by applying the inverse DCT and mitigating blocking artifacts.

Light Field Image Splitter
In this paper, we utilize a Lytro camera [10] to acquire light field images. In general, each light field image is decomposed into different focus-level images using the light field image splitter [11]. The splitter provides a set of different focal images that display the same position, as shown in Figure 2. We denote {I_t, t = 1, …, N} for the focal stack of input images.


Discrete Cosine Transform (DCT)
Each stack image (I_t) is transformed into the frequency domain by the DCT. The source image is partitioned into blocks of 8 × 8 pixels, and the DCT coefficients of each block are computed by

D(u, v) = (1/4) C(u) C(v) Σ_{x=0}^{7} Σ_{y=0}^{7} I(x, y) cos[(2x + 1)uπ/16] cos[(2y + 1)vπ/16], C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise, (1)

where D(u, v) represents the DCT coefficient at the position (u, v) in the DCT domain. The DCT coefficients consist of the DC coefficient D(0,0) and the AC coefficients. The AC coefficients are used for the focus value calculation.
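As an illustration of the block transform above, the following sketch builds the orthonormal 8 × 8 DCT matrix and applies it to one block; the helper names and the example block are ours, not the paper's:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II transform matrix (rows are basis vectors)."""
    C = np.zeros((n, n))
    for u in range(n):
        for x in range(n):
            C[u, x] = np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)   # C(0) scaling (1/sqrt 2 relative to AC rows)
    C[1:, :] *= np.sqrt(2.0 / n)
    return C

def block_dct(block):
    """2-D DCT of one 8x8 block: D = C @ block @ C.T."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T

# A flat (constant) block puts all its energy in the DC coefficient D(0,0),
# which is why only the AC coefficients carry focus information.
flat = np.full((8, 8), 10.0)
D = block_dct(flat)
```

For a textured block, the AC coefficients such as D(4,4) become non-zero, and their magnitude grows with the sharpness of the block content.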

Sum of Updated Modified Laplacian (SUML)
In the proposed method, we improve the original SML and use it as a part of the focus measurement, since the SML gives better efficiency than other focus measurement criteria [9]. When we consider the DCT coefficients, the higher energy of the AC coefficients implies meaningful information in the in-focused region. Because the AC coefficients D(4,5), D(5,4), and D(4,4) are more important than the other coefficients [12], we choose the AC coefficient D(4,4) for the focus value calculation. The original modified Laplacian (ML) only considers variations in the x and y directions [13]. Moreover, D(4,4) is small for both in-focused and out-focused parts in homogeneous regions. Thus, we modify the original ML using D(4,4) and propose the updated modified Laplacian (UML) for the block B(x,y), which includes the diagonal directions and combines all the information of the neighborhood, including sharp in-focused parts around the block. UML is defined in (2), where (x,y) represents the block position and D(u,v) is the AC coefficient at the position (u,v) of the block B(x,y). In (2), 'step' is a fixed value, set to 1 in this paper. The focus measure at block B(x,y) is computed as the SUML value in the window around B(x,y). SUML is expressed in (3), where δ(i, j) represents the UML value that satisfies the threshold T_SUML condition, and the window around B(x,y) has size N × N.
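As a rough illustration of the idea, the sketch below assumes that UML measures modified-Laplacian-style variation of the per-block D(4,4) map in the x, y, and both diagonal directions, and that SUML sums thresholded UML values over a window; the precise expressions are those of Eqs. (2)-(3), and all function names and forms here are our assumptions:

```python
import numpy as np

def uml_map(d44, step=1):
    """Hypothetical UML sketch: modified-Laplacian-style variation of the
    per-block D(4,4) map in the x, y, and both diagonal directions.
    (Illustrative only; the paper's exact expression is Eq. (2).)"""
    p = np.pad(d44, step, mode='edge')
    c = p[step:-step, step:-step]
    return (np.abs(2 * c - p[step:-step, :-2 * step] - p[step:-step, 2 * step:])
            + np.abs(2 * c - p[:-2 * step, step:-step] - p[2 * step:, step:-step])
            + np.abs(2 * c - p[:-2 * step, :-2 * step] - p[2 * step:, 2 * step:])
            + np.abs(2 * c - p[:-2 * step, 2 * step:] - p[2 * step:, :-2 * step]))

def suml_map(uml, t_suml=10.0, win=3):
    """SUML sketch: sum UML values over a win x win window, keeping only
    values that pass the threshold t_suml (assumed thresholding form)."""
    masked = np.where(uml >= t_suml, uml, 0.0)
    r = win // 2
    p = np.pad(masked, r, mode='constant')
    out = np.zeros_like(masked)
    for dy in range(win):
        for dx in range(win):
            out += p[dy:dy + masked.shape[0], dx:dx + masked.shape[1]]
    return out

# A lone sharp block (large D(4,4)) in a flat neighborhood responds strongly:
d44 = np.zeros((5, 5))
d44[2, 2] = 10.0
u = uml_map(d44)
s = suml_map(u)
```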


Enhanced SUML (eSUML)
In a homogeneous region, the focus measure can be affected by pixel noise [14]. To decrease this effect, the SUML values at block B(x,y) are aggregated into the eSUML value in the window around SUML(x,y). eSUML is calculated in (4), where N × N determines the size of the window. With eSUML, both the focus measure values and the focus border are more distinct than with SUML.
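The aggregation step can be sketched as a simple windowed sum of the SUML map; the function names are ours and the window handling (zero padding) is an assumption:

```python
import numpy as np

def window_sum(a, win=3):
    """Sum the values in a win x win neighborhood (zero padding at borders)."""
    r = win // 2
    p = np.pad(a, r, mode='constant')
    out = np.zeros_like(a, dtype=float)
    for dy in range(win):
        for dx in range(win):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def esuml_map(suml, win=3):
    """eSUML sketch: windowed sum of SUML values. Aggregating over a
    neighborhood damps isolated noise responses in homogeneous regions."""
    return window_sum(suml, win)

# Interior blocks accumulate contributions from a full window:
e = esuml_map(np.ones((4, 4)), win=3)
```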

Harmonic Mean (HM)
HM measures the information of the eSUML results and is used to confirm a reliable focus measure. The HM value at block B(x,y) is calculated from the eSUML values in the N × N window around B(x,y). HM is defined in (5), where M determines the size of the window, (x,y) represents the block position, and µ_m is the average value of the eSUML results at block B(x,y). High HM values indicate in-focused regions, while out-focused regions have low HM values.
HM has two advantages. First, the arithmetic mean estimate can be distorted significantly by the large variances of the out-focused regions, while the harmonic mean is robust to them. Second, the harmonic mean considers reciprocals; hence, it supports the small variances and increases their influence in the overall estimation. Although most variances of the out-focused regions may have small values, one large variance value can make the arithmetic mean in those regions larger than that in the in-focused regions, causing the out-focused regions to be falsely considered in-focused.
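The robustness argument above can be demonstrated numerically; the toy values below are ours, chosen only to show how a single spike inflates the arithmetic mean but barely moves the harmonic mean:

```python
import numpy as np

def harmonic_mean(v):
    """Harmonic mean of k positive values: k / sum(1/v_i)."""
    v = np.asarray(v, dtype=float)
    return len(v) / np.sum(1.0 / v)

# Out-focused region: mostly small focus responses plus one large spike.
out_focused = np.array([1.0, 1.0, 1.0, 1.0, 100.0])
# In-focused region: consistently moderate focus responses.
in_focused = np.array([10.0, 12.0, 11.0, 9.0, 10.0])

am_out = out_focused.mean()          # inflated by the single spike (20.8)
hm_out = harmonic_mean(out_focused)  # stays close to the small values (~1.25)
```

The arithmetic mean would falsely rank the out-focused region above the in-focused one, while the harmonic mean ranks them correctly.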

Image Combination
The all-in-focused image is fused by selecting the DCT coefficients that give the highest HM value for each block B(x,y). The focal stack of input images {I_t, t = 1, …, N} divided into blocks at position (x,y), the DCT coefficients {DCT(x,y)_t, t = 1, …, N}, and the HM values {H(x,y)_t, t = 1, …, N} are the input data for the combination process. The block map of the fused image {MAP(x,y)} and the DCT coefficients of the fused image {FDCT(x,y)} are selected by taking, at each block position, the index and the coefficients of the stack image with the maximum HM value.
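The per-block selection can be sketched as an argmax over the stack axis; the array shapes and names are our assumptions:

```python
import numpy as np

def fuse_blocks(hm_stack, dct_stack):
    """Select, per block position, the DCT coefficients of the stack image
    with the highest HM value.
    hm_stack:  (N, H, W)        HM value per image and block position
    dct_stack: (N, H, W, 8, 8)  DCT coefficients per image and block
    Returns the block map MAP (winning image index) and the fused FDCT."""
    map_xy = np.argmax(hm_stack, axis=0)          # MAP(x, y)
    yy, xx = np.indices(map_xy.shape)
    fdct = dct_stack[map_xy, yy, xx]              # FDCT(x, y)
    return map_xy, fdct

# Toy stack: 2 images, a 2x2 grid of 8x8 coefficient blocks.
hm = np.array([[[1.0, 5.0], [2.0, 1.0]],
               [[3.0, 1.0], [1.0, 4.0]]])
dct = np.zeros((2, 2, 2, 8, 8))
dct[1] += 1.0   # image 1 blocks are all-ones, to make the selection visible
m, f = fuse_blocks(hm, dct)
```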

Consistency Verification (CV)
In order to improve the combination effect, we apply a CV process [4] to the block map of the fused image MAP(x,y). We improve the MAP(x,y) accuracy by applying a majority filter in the window around MAP(x,y), as shown in Figure 5. The CV is therefore applied as post-processing, after the image combination process, to improve the quality of the output image and to reduce errors due to unsuitable block selection. This process performs well in terms of both quality and computational complexity. Then, the DCT coefficients of the fused image FDCT(x,y) are updated according to the improved MAP(x,y) values, as shown in Figure 6.
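A minimal majority filter over the block map might look as follows; the window size and border handling (edge replication) are our assumptions:

```python
import numpy as np

def consistency_verification(map_xy, win=3):
    """Majority filter on the block map: each block takes the most frequent
    image index in its win x win neighborhood (edge-replicated borders)."""
    r = win // 2
    p = np.pad(map_xy, r, mode='edge')
    out = np.empty_like(map_xy)
    h, w = map_xy.shape
    for y in range(h):
        for x in range(w):
            window = p[y:y + win, x:x + win].ravel()
            vals, counts = np.unique(window, return_counts=True)
            out[y, x] = vals[np.argmax(counts)]
    return out

# An isolated mis-selected block is overruled by its neighbors:
m = np.zeros((5, 5), dtype=int)
m[2, 2] = 1
cv = consistency_verification(m)
```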

Finally, the all-in-focused image is reconstructed by applying the inverse DCT to the updated FDCT(x, y). The inverse DCT coefficients of each block are computed by

I(x, y) = (1/4) Σ_{u=0}^{7} Σ_{v=0}^{7} C(u) C(v) D(u, v) cos[(2x + 1)uπ/16] cos[(2y + 1)vπ/16],

where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
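The inverse transform is the transpose of the orthonormal DCT matrix, so a forward-then-inverse round trip recovers each block exactly; this sketch (our helper names) illustrates that:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix; its transpose is the inverse transform."""
    C = np.cos(np.outer(np.arange(n), (2 * np.arange(n) + 1)) * np.pi / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)
    C[1:, :] *= np.sqrt(2.0 / n)
    return C

def block_idct(D):
    """Inverse 2-D DCT of one 8x8 coefficient block: B = C.T @ D @ C."""
    C = dct_matrix(D.shape[0])
    return C.T @ D @ C

# Round trip: forward transform then inverse recovers the original block.
rng = np.random.default_rng(0)
block = rng.uniform(0, 255, size=(8, 8))
C = dct_matrix()
D = C @ block @ C.T
recovered = block_idct(D)
```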

Blocking Artifacts Reduction
We apply edge-preserving smoothing, namely the fast guided filter [15], to the reconstructed image to reduce blocking artifacts. This smoothing method can also sharpen blurred edges and keeps the reconstruction process efficient.
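To illustrate the edge-preserving behavior, the sketch below implements the basic guided filter of He et al., not the fast variant cited above; the box-filter helper and parameter values are our assumptions:

```python
import numpy as np

def box(a, r):
    """Mean filter with a (2r+1)^2 window via shifted sums over a padded array."""
    win = 2 * r + 1
    p = np.pad(a, r, mode='edge')
    out = np.zeros_like(a, dtype=float)
    for dy in range(win):
        for dx in range(win):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (win * win)

def guided_filter(I, p, r=2, eps=1e-3):
    """Basic single-channel guided filter: edge-preserving smoothing of p
    guided by I (a sketch of the idea; the paper uses a fast variant [15])."""
    mean_I, mean_p = box(I, r), box(p, r)
    corr_Ip = box(I * p, r)
    var_I = box(I * I, r) - mean_I ** 2
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)   # per-pixel linear model p ~ a*I + b
    b = mean_p - a * mean_I
    return box(a, r) * I + box(b, r)

# Self-guided filtering (I == p) smooths flat regions yet keeps strong edges.
flat = np.full((10, 10), 5.0)
smoothed = guided_filter(flat, flat)
```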

Experimental Results
Experiments are conducted on five image sets, 'Bag', 'Cup', 'Bike', 'Mouse', and 'Flower', captured by a Lytro camera [10]. The experimental results are evaluated in terms of the focus measurement and the fusion process for the all-in-focused image. We conduct the experiments on 360 × 360 pixel test images. The experimental parameters are set to a DCT window size of 8, a SUML window size of 3, an HM window size of 3, a CV window size of 3, and a T_SUML of 10. The effectiveness of the SUML focus measurement is evaluated against the SML on the images of the 'Bag' dataset, as illustrated in Figure 7. It can be observed that the focus measurement is more distinct for the SUML in Figure 7c compared with the SML in Figure 7b.

On the Images of the 'Cup' Dataset
The effectiveness of the SUML focus measurement is evaluated against the SML on the images of the 'Cup' dataset, as illustrated in Figure 8. It can be observed that the focus measurement is more distinct for the SUML in Figure 8c compared with the SML in Figure 8b.


On the Images of the 'Bike' Dataset
The effectiveness of the SUML focus measurement is evaluated against the SML on the images of the 'Bike' dataset, as illustrated in Figure 9. It can be observed that the focus measurement is more distinct for the SUML in Figure 9c compared with the SML in Figure 9b.

On the Images of the 'Mouse' Dataset
The effectiveness of the SUML focus measurement is evaluated against the SML on the images of the 'Mouse' dataset, as illustrated in Figure 10. It can be observed that the focus measurement is more distinct for the SUML in Figure 10c compared with the SML in Figure 10b.


On the Images of the 'Flower' Dataset
The effectiveness of the SUML focus measurement is evaluated against the SML on the images of the 'Flower' dataset, as illustrated in Figure 11. It can be observed that the focus measurement is more distinct for the SUML in Figure 11c compared with the SML in Figure 11b.



All-in-Focused Image Combination
In this section, the experimental results for the all-in-focused images are presented and evaluated by comparing them with other prominent techniques: the light field software [10], SML [9], DCT-STD [5], DCT-VAR-CV [4], SML-WHV [16], Agarwala's method [17], DCT-Sharp-CV [18], DCT-CORR-CV [19], and DCT-SVD-CV [20]. The all-in-focused images of the different algorithms are shown in Figures 12-16. From the expanded images in Figure 17, we can easily observe that the results of the light field software, the DCT-STD method, and the DCT-VAR-CV method have lower contrast than those of the SML method, Agarwala's method, the DCT-Sharp-CV method, the DCT-CORR-CV method, the DCT-SVD-CV method, and the proposed method. However, it is hard to distinguish the results of the SML method, Agarwala's method, the DCT-Sharp-CV method, the DCT-CORR-CV method, the DCT-SVD-CV method, and the proposed method by subjective evaluation. There appear to be few differences among the fused images, but objective performance evaluation can capture their differences precisely. Hence, this paper applies non-reference fusion metrics, namely the feature mutual information (FMI) metric [21] and Petrovic's metric (Q AB/F) [22]. These metrics are calculated without reference images. The FMI metric measures the amount of information that the fused image contains from the source images, while Q AB/F measures the relative amount of edge information transferred from the sources into the fused image. Higher FMI or Q AB/F values indicate better fusion performance. The comparison results are summarized in Tables 1-5. The proposed method provides outstanding results compared with the other methods.

Performance Summary
The performance summary of the different methods on the five image datasets using Q AB/F and FMI is listed in Tables 6 and 7, in which the top values are shown in bold. According to the fusion metrics Q AB/F and FMI, the performance of the proposed method was better than that of the other nine compared methods.

Conclusions
In this paper, we proposed an all-in-focused image combination method that integrates the SUML, eSUML, and HM in the DCT domain. The main contributions of this work are a robust all-in-focused image combination performed in the frequency domain and an extension of the depth of field in an imaging system. The performance of the proposed method was evaluated in terms of both subjective and objective evaluation on five image datasets. For the subjective tests, a visual perception experiment was performed. For the objective tests, the Q AB/F and FMI metrics were measured. The experimental results show that the proposed method obtains an all-in-focused image of higher quality, in both the focus measurement and the all-in-focused image combination. Consequently, the objective evaluation showed that the proposed method achieved the top values of the Q AB/F and FMI criteria compared with the conventional methods.
Author Contributions: All authors discussed the contents of the manuscript. W.C. contributed to the research idea and the framework of this study. He performed the experimental work and wrote the manuscript. M.G.J. provided suggestions on the algorithm and revised the entire manuscript.