This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Most 3D cameras with depth sensing capabilities employ active depth estimation techniques, such as stereo, triangulation or time-of-flight. However, these methods are expensive. The cost can be reduced by using passive optical methods, which are inexpensive and efficient. In this paper, we propose using one such passive optical method, shape from focus (SFF), for 3D cameras. In the proposed scheme, an adaptive window is first computed through an iterative process using a criterion. The window is then divided into four regions, and the best focused region among the four is selected based on the variation in the data. The effectiveness of the proposed scheme is validated using image sequences of synthetic and real objects. A comparative analysis based on the statistical metrics of correlation, mean square error (MSE), universal image quality index (UIQI) and structural similarity (SSIM) demonstrates the effectiveness of the proposed scheme.

Depth information about an object is useful in many computer vision applications. Therefore, 3D cameras with depth sensing capabilities are becoming more popular and have a wide range of applications in consumer electronics. Web conferencing, 3D gaming, object tracking, face detection and tracking, automotive safety, mobile phones, robotics and medical devices are potential areas that use depth cameras, albeit at high cost. These cameras compute depth using various techniques, such as time of flight, stereo or triangulation and monocular [

Shape from focus (SFF) is one of the optical methods used to recover the shape of an object from a stack of monochrome images [

In this paper, we introduce the optimal computing area for robust focus measurement in SFF. Although a fixed small window provides a good depth map, notable inaccuracies remain in the recovered 3D shapes. In the proposed scheme, an adaptive window is first computed through an iterative process using a criterion. Then, the window is divided into four regions, each of which contains the central pixel. In the next step, the best focused region is selected based on the variation in the data. The effectiveness of the proposed scheme is validated using image sequences of synthetic and real objects. Comparative analysis based on the statistical metrics of correlation, mean square error (MSE), universal image quality index (UIQI) [

In SFF, the objective is to find the depth of every object point by measuring the distance from the camera lens at which that point is best focused. Once these distances are found for all points of the object, the 3D shape can easily be recovered.

In the literature, many SFF techniques have been proposed. Usually, an SFF method consists of two major parts. First, a focus measure is applied to measure the focus quality of each pixel in the image sequence, and an initial depth is computed by maximizing the focus measure along the optical direction. Second, an approximation technique is applied to refine the initial depth. A focus measure is the criterion used to detect the true focus point from a finite number of images: it quantifies the degree of blurring of an image, reaching its maximum when the image is best focused and decreasing as blurring increases. Many focus measures have been proposed in the spatial and frequency domains. One well-known category of spatial-domain focus measures is based on image derivatives. These focus measures rely on the idea that larger differences in the intensity values of neighboring pixels correspond to sharper edges. Broadly, they can be divided into two sub-categories: first and second derivative-based methods. A method based on gradient energy is investigated by Tenenbaum [
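As a minimal sketch of the two steps described above (a derivative-based focus measure followed by maximization along the optical direction), the following pure-Python example evaluates a second derivative-based measure (sum of modified Laplacian) and a first derivative-based measure (Sobel gradient energy) at one pixel of a small image stack. The function names, the 3 × 3 default window and the list-of-lists image representation are assumptions made for illustration; they are not the implementation used in this paper.

```python
def modified_laplacian(img, r, c):
    """Second-derivative response at interior pixel (r, c) of a 2D list."""
    horizontal = abs(2 * img[r][c] - img[r][c - 1] - img[r][c + 1])
    vertical = abs(2 * img[r][c] - img[r - 1][c] - img[r + 1][c])
    return horizontal + vertical


def gradient_energy(img, r, c):
    """First-derivative response: squared Sobel gradient magnitude at (r, c)."""
    gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
    gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
    return gx * gx + gy * gy


def windowed_focus(img, r, c, pixel_measure, half=1):
    """Sum a per-pixel measure over the (2*half+1)^2 window centred at (r, c)."""
    return sum(pixel_measure(img, i, j)
               for i in range(r - half, r + half + 1)
               for j in range(c - half, c + half + 1))


def depth_index(stack, r, c, pixel_measure, half=1):
    """Index of the frame that maximizes the focus measure at pixel (r, c)."""
    scores = [windowed_focus(img, r, c, pixel_measure, half) for img in stack]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy 5 x 5 frames (assumed data): flat, blurred edge, sharp edge, blurred edge.
flat = [[128] * 5 for _ in range(5)]
ramp = [[0, 64, 128, 192, 255] for _ in range(5)]
step = [[0, 0, 255, 255, 255] for _ in range(5)]
stack = [flat, ramp, step, ramp]

print(depth_index(stack, 2, 2, modified_laplacian))  # 2, the sharp-edge frame
print(depth_index(stack, 2, 2, gradient_energy))     # 2
```

Passing the per-pixel measure as an argument keeps the maximization step independent of the chosen focus measure, which mirrors the two-part structure described in the text.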

Some focus measures have also been proposed in the transform domain. Kristan

Once an initial depth estimate is obtained by applying a focus measure, a refinement procedure follows to improve the result. In the literature, various approximation- and machine learning-based refinement techniques have been proposed [

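An image sequence, indexed by the lens position $z$, is taken as input. For each pixel, the window is divided into four regions, indexed by $i$, and the variance ($\sigma_i^{2}$) is computed for each region.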

The optimal computing area is selected depending on the variance within the four regions. We choose the area having the maximum variance among all four regions. Thus, the area within the window is selected as:
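In notation assumed here for illustration (the symbols $R_i$, $\mu_i$, $\sigma_i^{2}$ and $R^{*}$ are not taken from the original equation), with $I(x, y)$ denoting the intensity of one frame and $R_1, \dots, R_4$ the four regions of the window, the selection of the maximum-variance region can be written as:

```latex
\mu_i = \frac{1}{|R_i|} \sum_{(x,y) \in R_i} I(x,y),
\qquad
\sigma_i^{2} = \frac{1}{|R_i|} \sum_{(x,y) \in R_i} \bigl( I(x,y) - \mu_i \bigr)^{2},
\qquad
R^{*} = R_{i^{*}}, \quad i^{*} = \arg\max_{i \in \{1,2,3,4\}} \sigma_i^{2}.
```

The normalization by $|R_i|$ (population variance) is an assumption; because all four regions have the same size, an unbiased estimate would select the same region.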

A high variance indicates high contrast, i.e., high-frequency content. Because the value of the focus measure increases with contrast, the choice of region affects which image in the sequence is identified as the sharpest. By applying the focus measure to each pixel of the image sequence, an initial focus volume is obtained.

It is notable that noise in an image is usually associated with high-frequency components. Because the focus measure assesses focus quality from high-frequency components (it acts as a high-pass filter), noise-related intensities may also contribute to it. To reduce this effect, we propose dividing the input patch into sub-windows and computing the focus measure from the part that maximizes it. The depth map is then obtained by maximizing the focus measure along the optical direction.
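The region selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: a fixed window half-size is used (the iterative adaptive-window criterion is not reproduced here), the four regions are taken to be overlapping quadrants that share the central pixel, and the population variance is used as the selection statistic. The function names are hypothetical.

```python
from statistics import pvariance


def quadrant_regions(img, r, c, half):
    """Split the (2*half+1)^2 window centred at (r, c) into four overlapping
    quadrants; every quadrant contains the central pixel."""
    row_spans = [(r - half, r), (r, r + half)]   # top, bottom
    col_spans = [(c - half, c), (c, c + half)]   # left, right
    return [[img[i][j]
             for i in range(r0, r1 + 1)
             for j in range(c0, c1 + 1)]
            for r0, r1 in row_spans
            for c0, c1 in col_spans]


def select_computing_area(img, r, c, half=2):
    """Return (index, pixels, variance) of the quadrant with maximum variance.
    Index order: 0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right."""
    regions = quadrant_regions(img, r, c, half)
    variances = [pvariance(region) for region in regions]
    best = max(range(len(regions)), key=variances.__getitem__)
    return best, regions[best], variances[best]


# Toy frame (assumed data): only the bottom-right corner is textured.
frame = [[10] * 5 for _ in range(5)]
frame[3][3] = 200
frame[4][4] = 200

best, pixels, variance = select_computing_area(frame, 2, 2, half=2)
print(best)  # 3 (bottom-right quadrant)
```

A focus measure would then be evaluated on `pixels` only, for every frame of the sequence, before maximizing along the optical direction.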

The best focused values also provide a better-quality image of the object [

The complete procedure of the proposed method is illustrated in


The images of a simulated cone object were generated using camera simulation software. The simulated cone was selected for the experiments because the results are easy to verify for an object with a known depth map. Images of the simulated cone at different lens positions are shown in

In order to investigate the performance of different focus measures and SFF techniques in real scenarios, several experiments have been conducted using image sequences of real objects. A sequence of 97 images of a real cone object, each of 200 × 200 pixels, has been used in many experiments. The real cone object is made of hardboard with black and white stripes drawn on its surface to enrich the texture. The length of the cone is about 97 inches, while the base diameter is about 15 inches. Details of these test images can be found in [

For performance assessment and evaluation, we used two statistical metrics: mean square error (MSE) and correlation (C2). A lower MSE indicates that the method is more accurate and precise. The correlation measures the similarity between the true and estimated depth maps: the higher the correlation, the closer the estimated depth map is to the true one, which means that the depth is well estimated. More recently, new comparison metrics were developed by Wang and Bovik. The Universal Image Quality Index (UIQI) [
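The metrics named above can be illustrated with a short sketch. It assumes depth maps stored as lists of rows, computes the correlation as the Pearson coefficient, and evaluates the UIQI once over the whole map; in practice, the UIQI is usually computed over sliding windows and averaged, so this global form is a simplification. The function names are hypothetical.

```python
from math import sqrt


def _flatten(depth_map):
    return [value for row in depth_map for value in row]


def _moments(true_map, est_map):
    """Means, population variances and covariance of two equally sized maps."""
    t, e = _flatten(true_map), _flatten(est_map)
    n = len(t)
    mean_t, mean_e = sum(t) / n, sum(e) / n
    var_t = sum((a - mean_t) ** 2 for a in t) / n
    var_e = sum((b - mean_e) ** 2 for b in e) / n
    cov = sum((a - mean_t) * (b - mean_e) for a, b in zip(t, e)) / n
    return mean_t, mean_e, var_t, var_e, cov


def mse(true_map, est_map):
    """Mean square error; lower is better, 0 for identical maps."""
    t, e = _flatten(true_map), _flatten(est_map)
    return sum((a - b) ** 2 for a, b in zip(t, e)) / len(t)


def correlation(true_map, est_map):
    """Pearson correlation; 1 indicates a perfect linear relationship."""
    _, _, var_t, var_e, cov = _moments(true_map, est_map)
    return cov / sqrt(var_t * var_e)


def uiqi(true_map, est_map):
    """Global universal image quality index; 1 only for identical maps."""
    mean_t, mean_e, var_t, var_e, cov = _moments(true_map, est_map)
    return (4 * cov * mean_t * mean_e) / (
        (var_t + var_e) * (mean_t ** 2 + mean_e ** 2))


# Toy maps (assumed data): the estimate is offset by one depth step everywhere.
true_depth = [[1.0, 2.0], [3.0, 4.0]]
estimate = [[2.0, 3.0], [4.0, 5.0]]

print(mse(true_depth, estimate))          # 1.0
print(correlation(true_depth, estimate))  # 1.0
print(uiqi(true_depth, estimate))
```

The example shows why several metrics are reported together: a constant offset leaves the correlation at 1 but is penalized by both the MSE and the UIQI.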

Unlike for simulated objects, it is difficult to obtain true depth information for real objects. Although the statistical metrics above cannot be applied to real objects, other metrics can be used, such as surface smoothness [

In order to investigate the improved performance of the proposed method, the results are compared with the traditional methods, such as SML, GLV and TEN. In our experiments, we set

We consider various noise types, namely Gaussian, salt and pepper, and speckle noise.

Further, we have conducted simulations by using an image sequence corrupted with speckle noise with different noise variances.
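The three noise models can be sketched as below. The sketch assumes intensities normalized to [0, 1], models speckle as multiplicative zero-mean Gaussian noise (other zero-mean distributions are also used in practice) and uses a seeded generator for repeatability. The function names and the toy frame are hypothetical.

```python
import random


def _clip(value):
    """Keep an intensity inside the valid range [0, 1]."""
    return min(1.0, max(0.0, value))


def add_gaussian(img, variance, rng):
    """Additive zero-mean Gaussian noise with the given variance."""
    sigma = variance ** 0.5
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def add_salt_and_pepper(img, density, rng):
    """Each pixel is replaced, with probability `density`, by 0 or 1."""
    def corrupt(p):
        if rng.random() < density:
            return 1.0 if rng.random() < 0.5 else 0.0
        return p
    return [[corrupt(p) for p in row] for row in img]


def add_speckle(img, variance, rng):
    """Multiplicative noise: p + p * n, with n zero-mean of given variance."""
    sigma = variance ** 0.5
    return [[_clip(p + p * rng.gauss(0.0, sigma)) for p in row] for row in img]


rng = random.Random(0)                 # fixed seed, assumed for repeatability
frame = [[0.5] * 4 for _ in range(4)]  # toy mid-gray frame
noisy_frames = {
    "gaussian": add_gaussian(frame, 0.01, rng),
    "salt_and_pepper": add_salt_and_pepper(frame, 0.01, rng),
    "speckle": add_speckle(frame, 0.01, rng),
}
```

Applying the same function to every frame of a sequence yields a corrupted sequence for a given variance or density, such as the value of 0.01 used in the experiments.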

In addition, the overall rank of each method can be seen in

The surface of the object is a key point for comparison. With the proposed method, the smooth surface of the planar object is recovered. The reconstructed 3D shape of the real cone is shown in

In addition, the overall rank of each method can be seen in

In the literature [

In this paper, we introduced the optimal computing area (OCA) for focus measurement; the region with the highest mean absolute deviation is selected for computing the focus measure. The proposed algorithm has been examined using image sequences of a synthetic object and various real objects: a micro sheet, a real cone and a groove. We performed experiments with image sequences corrupted by Gaussian, salt and pepper, and speckle noise. From the experimental results, we can summarize the main properties of the proposed focus measure.

Robustness: The proposed method is robust against various types of noise, even at a high noise variance (0.01) or noise density (0.01).

Accuracy: On the quantitative measures, the proposed method provided better results (94.47% similar to the true depth) than conventional methods (92.28%–93.83% similar to the true depth).

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2013R1A1A2008180) and the Ministry of Science, ICT and Future Planning (2011-0015740).

The authors declare no conflict of interest.

Basic schematic of shape from focus (SFF).

Conventional computing area.

Proposed computing area.

Curves for pixels at (40, 180) of a real cone image sequence: (

The diagram of the proposed method.

Sample images from the sequence of a simulated cone: noise-free (first row), in the presence of noise (second row), Gaussian noise with zero mean and a variance of 0.01 (first column), salt and pepper noise with a density of 0.01 (second column) and speckle noise with a variance of 0.01 (last column).

Sample images from the sequence of real objects: real cone (first row), micro sheet (second row), groove (bottom).

Focus curves by applying (

Depth maps for simulated cone: (

Depth maps: Gaussian noise with zero mean and a variance of 0.01 (first row), salt and pepper noise with a density of 0.01 (second row), speckle noise with a variance of 0.01 (bottom), ML (first column), GLV (second column), TEN (third column) and the proposed method (last column).

Comparison of SFF methods for various speckle noise variances.

Depth maps: real cone (first row), micro sheet (second row), groove (bottom), ML (first column), GLV (second column), TEN (third column) and the proposed method (last column).

Fused images using: ML (first column), GLV (second column), TEN (third column) and OCA (last column).

The best value of various metrics.

Metric | Description | Best value
MSE | Mean Square Error | Minimum
Corr | Correlation | 1
UIQI | Universal Image Quality Index | 1
SSIM | Structural Similarity Index | 1
SS | Surface Smoothness | Minimum

Comparison for SFF methods with various metrics.

Metric | ML | GLV | TEN | OCA (proposed)
MSE | 58.6907 | 52.3340 | 57.0221 | 51.5491
Corr | 0.9253 | 0.9383 | 0.9228 | 0.9447
UIQI | 0.0795 | 0.1176 | 0.0924 | 0.1989
SSIM | 0.8460 | 0.9021 | 0.8549 | 0.9533

Comparison of SFF methods with Gaussian noise (zero mean and 0.01 variance) and salt and pepper noise (noise density 0.01).

Metric | Noise | ML | GLV | TEN | OCA (proposed)
MSE | Gaussian | 695.4148 | 165.7734 | 217.7481 | 50.4545
MSE | Salt and pepper | 830.5300 | 498.7118 | 324.2682 | 71.4485
MSE | No noise | 58.6907 | 52.3340 | 57.0221 | 51.5491
Corr | Gaussian | 0.0641 | 0.5857 | 0.4926 | 0.9362
Corr | Salt and pepper | 0.0866 | 0.2895 | 0.4476 | 0.9093
Corr | No noise | 0.9253 | 0.9383 | 0.9228 | 0.9447
UIQI | Gaussian | 0.0002 | 0.0109 | 0.0047 | 0.1712
UIQI | Salt and pepper | 0.0002 | 0.0025 | 0.0039 | 0.0871
UIQI | No noise | 0.0795 | 0.1176 | 0.0924 | 0.1989
SSIM | Gaussian | 0.0825 | 0.3910 | 0.2672 | 0.9393
SSIM | Salt and pepper | 0.0864 | 0.1859 | 0.2392 | 0.8407
SSIM | No noise | 0.8460 | 0.9021 | 0.8549 | 0.9533

Overall rank table of each focus measure in the presence of various noise.

(1) | OCA | OCA | OCA
(2) | GLV | TEN | GLV
(3) | TEN | GLV | TEN
(4) | ML | ML | ML

Surface smoothness comparison of various focus measures.

Object | ML | GLV | TEN | OCA (proposed)
Simulated cone | 193.4657 | 68.3371 | 170.6478 | 43.3136
Real cone | 766.7228 | 227.8418 | 1,013.3 | 144.5903
Micro sheet | 2,295.8 | 530.7807 | 942.2703 | 305.0307
Groove | 4,517.9 | 1,267.3 | 1,790.3 | 252.9055

Overall rank table of each focus measure with various objects.

Rank | Simulated cone | Real cone | Micro sheet | Groove
(1) | OCA | OCA | OCA | OCA
(2) | GLV | GLV | GLV | GLV
(3) | TEN | ML | TEN | TEN
(4) | ML | TEN | ML | ML