

In this paper, we address the accuracy of 3D face modeling techniques that use corresponding features in multiple views, which are quite sensitive to feature extraction errors. To solve this problem, we adopt a statistical model-based 3D face modeling approach in a mirror system consisting of two mirrors and a camera. The overall procedure of our 3D facial modeling method has two primary steps: 3D facial shape estimation using multiple 3D face deformable models and texture mapping using seamless cloning, a type of gradient-domain blending. To evaluate our method's performance, we generate 3D faces of 30 individuals and then carry out two tests: an accuracy test and a robustness test. Our method not only produces highly accurate 3D face shapes when compared with the ground truth, but is also robust to feature extraction errors. Moreover, the 3D face rendering results intuitively show that our method is more robust to feature extraction errors than other 3D face modeling methods. An additional contribution of our method is that a wide range of face textures can be acquired by the mirror system. Using this texture map, we generate realistic 3D faces for individuals at the end of the paper.

Three-dimensional (3D) face modeling is a challenging topic in computer graphics and computer vision. Unlike 2D face models, 3D face models can realistically express face deformation and pose variation with depth information. With these advantages, 3D face models have been applied to various applications, including movies, 3D animation and telecommunications [

Three-dimensional modeling systems can be categorized into active and passive vision systems [

Nowadays, the passive vision-based 3D modeling system is preferred for human faces because the glare from light-emitting devices can be unpleasant for users. A passive vision-based system requires no light-emitting devices and estimates 3D information from 2D images. In passive vision-based 3D face modeling, 3D information can be calculated by analyzing camera geometry from corresponding features in multiple views [

Among the 3D face modeling methods using the passive vision system, the most commonly used one is the corresponding feature-based 3D face modeling method. This method is less computationally expensive because it uses only a few feature points to generate a 3D facial shape. Additionally, this method can generate highly accurate 3D facial shapes by using real 3D information calculated from the camera geometry. However, the accuracy of the 3D facial shapes declines rapidly if the extracted locations of the corresponding points are not exact. This problem must be solved before the method can be applied to an automatic 3D modeling system, because even excellent feature extraction techniques such as the active appearance model (AAM) [

In this paper, we aim to develop a realistic 3D face modeling method that is robust to feature extraction errors and generates accurate 3D face modeling results. To achieve this, we propose a novel 3D face modeling method which has two primary steps: 3D facial shape estimation and texture mapping with a texture blending method.

In the 3D facial shape estimation procedure, we take a statistical model-based 3D face modeling approach as the fundamental concept. Among the statistical model-based methods, we specifically use the deformable face model, which utilizes the location information of facial features in the input image. This method is robust to feature extraction errors because it uses pre-trained 3D face data to estimate 3D facial shapes from input face images, but it is somewhat less accurate than corresponding feature-based 3D face modeling methods. To improve the accuracy of the 3D facial shapes, we propose a 3D face shape estimation method using multiple 3D face deformable models.

In the texture mapping procedure, we apply a cylindrical mapping and a stitching technique to generate a texture map. When stitching each face part, we apply a modified gradient-domain blending technique [

This paper is organized as follows: in Section 2, we introduce previous 3D face modeling techniques and the 3D face deformable model that is the basis of the proposed method. In Section 3, we address our 3D facial shape estimation method with a mirror system. In Section 4, we describe our texture mapping and texture map generation method using a modified gradient-domain blending technique. Then, we discuss the 3D face modeling results and evaluate our method's performance against other 3D face modeling methods and the ground truth in Section 5. Finally, we conclude our paper and address future work in Section 6.

In this section, we address previous works and the fundamental concept of the proposed 3D face modeling method. In Sections 2.1 and 2.2, we introduce previous 3D face modeling methods using passive vision systems. We categorize them into two groups: corresponding feature-based 3D face modeling and statistical model-based 3D face modeling. Then, we study the strengths and weaknesses of these methods. In Section 2.3, we describe in detail the 3D face modeling method using a deformable model, a type of statistical model-based 3D face modeling, because it is the fundamental concept of the proposed method described in Section 3.1.

The simplest and fastest way to generate a 3D face model using the corresponding features is to use orthogonal views [

Some researchers construct 3D faces from several facial images. Fua

As another example of a corresponding feature-based method, Lin

In 3D face modeling using statistical model, the 3D morphable face model suggested by Blanz and Vetter [

To improve the speed, researchers have proposed 3D face modeling methods using a single-view image. Kuo

The 3D face deformable model is a type of parametric model that can deform shapes and textures by changing related parameters. The morphable face model [

Generally, the computational costs of morphable face models are very high because they use entire face data (vertices and texture) and require many parameters to fit on input face images. On the other hand, a 3D face deformable model is less computationally expensive because it uses only geometric information (

The deformation of the 3D FSM can be carried out by global and local deformations. In a global deformation, the position and shape of the 3D FSM are determined by a 3D affine transformation, which can be expressed with a 3 × 1 translation vector (t_x, t_y, t_z) together with rotation and scale terms.
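Such a global deformation can be sketched as a similarity transform applied to the landmark coordinates; the function and variable names below are our own illustration, not the paper's notation:

```python
import numpy as np

def global_deform(points, scale, rotation, translation):
    """Similarity transform x' = s * R @ x + t applied to (N, 3) landmarks."""
    return scale * points @ rotation.T + translation

# 90-degree rotation about the z axis, doubled scale, shift along x.
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
pts = np.array([[1.0, 0.0, 0.0]])
out = global_deform(pts, 2.0, Rz, np.array([5.0, 0.0, 0.0]))
# out -> [[5.0, 2.0, 0.0]]
```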

Then, feature vectors of the face data sets are calculated by principal component analysis (PCA). In PCA, the feature vectors of the shape data sets are the eigenvectors (

Then, the local deformation can be parameterized with model parameter (
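The PCA-based shape model and its parameterized local deformation (mean shape plus a weighted sum of principal modes) can be sketched as follows; the toy training set and variable names are ours, not the paper's data:

```python
import numpy as np

# Toy training set: each row is one flattened landmark configuration.
shapes = np.array([[0.0, 0.0, 1.0],
                   [2.0, 0.0, 1.0],
                   [4.0, 0.0, 1.0]])
mean_shape = shapes.mean(axis=0)

# PCA via SVD of the centered data: rows of Vt are the principal modes
# (eigenvectors of the sample covariance matrix).
_, _, Vt = np.linalg.svd(shapes - mean_shape, full_matrices=False)

# Local deformation: reconstruct a shape from model parameters alpha.
alpha = np.array([2.0, 0.0, 0.0])
deformed = mean_shape + alpha @ Vt
```

Here all variation lies along the first coordinate, so the first principal mode is (up to sign) the unit vector along that axis, and the deformed shape moves two units along it.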

In

In this section, we present an improved version of our 3D face shape estimation method using a mirror system, which was introduced in our previous work [

Our proposed face modeling system consists of two mirrors placed on either side of the face and a camera in front of the face. Frontal and lateral face images are captured simultaneously, and the pre-defined feature points are extracted from the captured image as described in

After feature extraction, the 3D FSM fitting procedure is carried out to calculate 3D coordinates from the extracted 2D feature points. During the fitting procedure, the 3D FSM parameters are adjusted to match the landmarks of the 3D FSM with the extracted feature points. This can be treated as a least-squares optimization problem, in which the sum of the distances between the projected landmarks and the objective feature points serves as the cost function to be minimized. This cost function can be represented as:
E = Σ_i ‖ x_i^Obj − x_i^ProjFSM ‖²
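A minimal sketch of such a reprojection cost is shown below; the pinhole intrinsics (f, cx, cy) are illustrative assumptions, not the paper's calibration values:

```python
import numpy as np

def project(points3d, f=800.0, cx=320.0, cy=240.0):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    x, y, z = points3d[:, 0], points3d[:, 1], points3d[:, 2]
    return np.stack([f * x / z + cx, f * y / z + cy], axis=1)

def fitting_cost(landmarks3d, features2d):
    """Sum of squared 2D distances between projected landmarks and features."""
    diff = project(landmarks3d) - features2d
    return float(np.sum(diff ** 2))

pts3d = np.array([[0.0, 0.0, 2.0], [0.1, 0.0, 2.0]])
feats = project(pts3d)              # perfect correspondence
cost = fitting_cost(pts3d, feats)   # -> 0.0
```

Fitting adjusts the 3D FSM parameters so that this cost, summed over all views, is minimized.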

The extracted feature points are categorized into three groups, X_FObj, X_LObj, and X_RObj, corresponding to the frontal, left-mirror, and right-mirror views, respectively.

The total cost function is then the sum of these three terms:

In practice, 30 left-side face features are extracted, while features such as the right ear and right eye, which are not visible in the left mirror view, are excluded.

Meanwhile, X_FProjFSM, X_LProjFSM, and X_RProjFSM denote the 3D FSM landmarks projected into the frontal, left-mirror, and right-mirror views, respectively.

In mirror geometry, the mirror image of an object can be explained by the projection of a virtual 3D object reflected by the mirror plane onto the image plane, as described in

An ideal, perfectly flat mirror plane can be represented by:

Once the plane equation is calculated, a virtual 3D face can be generated using a Householder reflection. Given the 3D FSM landmarks X_real, the reflected landmarks are X_virtual = (I_{3×3} − 2 n_u n_u^T) X_real + 2 d_plane n_u, where n_u is the unit normal of the mirror plane, I_{3×3} is the 3 × 3 identity matrix, and d_plane is the offset of the mirror plane.
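The Householder reflection across a mirror plane can be sketched as follows (the plane parameterization n·x = d and the function name are our own conventions):

```python
import numpy as np

def reflect_across_plane(points, n, d):
    """Reflect (N, 3) points across the plane n . x = d (n is a unit normal).

    Householder form: x' = (I - 2 n n^T) x + 2 d n
    """
    n = n / np.linalg.norm(n)
    H = np.eye(3) - 2.0 * np.outer(n, n)
    return points @ H.T + 2.0 * d * n

# Mirror plane x = 1 (normal along x, offset d = 1).
pts = np.array([[0.0, 2.0, 3.0]])
virtual = reflect_across_plane(pts, np.array([1.0, 0.0, 0.0]), 1.0)
# The x coordinate reflects from 0 to 2; y and z are unchanged.
```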

After generating the virtual 3D FSMs, we can calculate X_LProjFSM and X_RProjFSM by projecting them onto the image plane.

Here, f_x and f_y are the focal lengths and c_x and c_y are the principal point coordinates of the camera.

After the total cost function (E_total) is constructed, it is minimized with respect to the 3D FSM parameters.

To calculate the minimizer (

However, for the virtual 3D FSM, the partial derivatives are changed due to the Householder transformation terms. After applying _{i}_{i}_{i}

After calculating the Jacobian matrices of the three 3D FSMs, the entire Jacobian matrix J_All is formed by stacking the per-view Jacobians J_F, J_LS, and J_RS.
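Stacking the per-view Jacobians and residuals into one least-squares system can be sketched as follows; the matrix sizes and random values are illustrative, not the paper's actual dimensions:

```python
import numpy as np

# Toy per-view Jacobians of the residuals with respect to the shared FSM
# parameters (sizes are illustrative: 4 residuals per view, 3 parameters).
rng = np.random.default_rng(0)
J_F, J_LS, J_RS = (rng.normal(size=(4, 3)) for _ in range(3))
r_F, r_LS, r_RS = (rng.normal(size=4) for _ in range(3))

# Stack the frontal and two mirror views into one system.
J_All = np.vstack([J_F, J_LS, J_RS])
r_All = np.concatenate([r_F, r_LS, r_RS])

# One Gauss-Newton update for the shared parameters.
delta = np.linalg.lstsq(J_All, -r_All, rcond=None)[0]
```

Because all three views constrain the same parameters, the stacked system makes the update consistent with the frontal and both mirror observations simultaneously.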

After 3D face shape estimation, the 3D positions of the other vertices can be determined by a generic 3D face model. The generic face model has been used in various applications because it has a uniform point distribution and can provide a detailed face shape with a small number of points [. The deformation function is expressed as a weighted sum over basis functions with weights w_i.

Here, w_i are the weights of the deformation function, determined from the estimated landmark correspondences.

Once all of the parameters of the deformation function are determined, the vertices of the generic model can be deformed by multiplying
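A common way to realize such a landmark-driven deformation is radial basis function (RBF) interpolation; the following is a minimal sketch under that assumption (the linear kernel, regularization, and names are our own choices, not necessarily the paper's):

```python
import numpy as np

def rbf_fit(ctrl_src, ctrl_dst, eps=1e-8):
    """Solve for RBF weights so the field maps control sources to targets.

    Uses the linear kernel phi(r) = r; eps regularizes the diagonal.
    """
    r = np.linalg.norm(ctrl_src[:, None, :] - ctrl_src[None, :, :], axis=2)
    Phi = r + eps * np.eye(len(ctrl_src))
    return np.linalg.solve(Phi, ctrl_dst - ctrl_src)

def rbf_deform(vertices, ctrl_src, weights):
    """Displace arbitrary vertices by the interpolated RBF field."""
    r = np.linalg.norm(vertices[:, None, :] - ctrl_src[None, :, :], axis=2)
    return vertices + r @ weights

src = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])   # estimated landmarks
dst = src + np.array([[0.0, 0.5, 0.0], [0.0, 0.5, 0.0]])
W = rbf_fit(src, dst)
moved = rbf_deform(src, src, W)    # control points land on their targets
```

Once the weights are fitted from the estimated landmarks, every vertex of the generic model can be passed through the same field to obtain the deformed face mesh.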

In this section, we introduce cylindrical mapping and describe the stitching procedure based on seamless cloning. A seam appears at the boundaries of the face parts because of photometric inconsistency after stitching. To solve this problem, a seamless cloning method [

To map textures onto the 3D face model, a texture map is created by extracting the texture directly from the captured face image. For the sake of simplicity, cylindrical mapping is applied. In common cylindrical mapping methods, a virtual cylinder is placed around the 3D face model, and mesh vertices intersected by the ray passing through the center of the cylinder are projected onto the image plane. Then, the colors of the corresponding pixels in the image are extracted and mapped to the texture map. However, this is time consuming because the positions of the vertices on the face mesh must be calculated. Thus, in our texture mapping procedure, the vertices of a triangle mesh are projected onto the image plane, and the textures in the projected mesh are then warped onto the texture map, as shown in
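The cylindrical coordinate assignment itself can be sketched as follows; the axis convention (y as the up axis) and the normalization to [0, 1] are illustrative assumptions:

```python
import numpy as np

def cylindrical_uv(vertices):
    """Map (N, 3) face-mesh vertices to texture coordinates on a virtual
    cylinder around the y (up) axis: u from the azimuth angle, v from height.
    """
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    u = (np.arctan2(x, z) + np.pi) / (2.0 * np.pi)   # angle -> [0, 1]
    v = (y - y.min()) / (y.max() - y.min() + 1e-12)  # height -> [0, 1]
    return np.stack([u, v], axis=1)

verts = np.array([[0.0, 0.0, 1.0],    # directly in front (angle 0)
                  [1.0, 1.0, 0.0]])   # 90 degrees to the side, at the top
uv = cylindrical_uv(verts)
```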

As addressed in Section 4.1, a texture map of the entire face can be created by stitching the individual face texture parts. However, a seam appears at the boundaries of the face parts because of photometric inconsistency, as described in

For initialization of the

The

After using morphological operations to fill in holes, a final face texture map can be created, as shown in
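The gradient-domain idea behind seamless cloning can be sketched in one dimension: interior values are solved so their gradients match the source's while the boundary keeps the target's values. The toy Gauss-Seidel solver below is our own illustration, not the paper's implementation:

```python
import numpy as np

def seamless_clone_1d(target, source, lo, hi, iters=2000):
    """Gradient-domain blend of source[lo:hi] into target (1D toy version).

    Interior values are solved so their second differences match the
    source's, while the boundary keeps the target's values (Gauss-Seidel).
    """
    out = target.astype(float).copy()
    src = source.astype(float)
    for _ in range(iters):
        for i in range(lo + 1, hi - 1):
            lap = src[i - 1] - 2.0 * src[i] + src[i + 1]
            out[i] = 0.5 * (out[i - 1] + out[i + 1] - lap)
    return out

target = np.arange(9.0)           # smooth ramp
source = np.full(9, 10.0)         # constant patch: zero interior gradients
blended = seamless_clone_1d(target, source, 2, 7)
# The blend keeps the target's ramp with no jump at the patch borders.
```

Even though the source and target intensities differ by a large offset, only the source's gradients enter the solve, so no seam appears at the region boundary.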

Generally, multi-resolution splining [

Before constructing the proposed face modeling system, we completed statistical analyses with respect to the 3D scan landmarks in order to calculate the feature vector elements of the local deformation parameters in the FSM, as described in Section 2.1. For the statistical analysis, we used principal component analysis (PCA). We recorded 3D face views of 100 individuals with a commercial 3D scanner.

To define the ground truth, we captured 3D faces of 30 individuals with a 3D laser scanner at the same time that we captured the images with our proposed system. We attached color markers to each user's face to identify the feature points, as shown in

For a relative comparison, we used Lin's method [

We implemented Lin's method [

Then, we compared the accuracy of their method with that of our method, as shown in

To test the robustness of our method with respect to feature extraction errors, we artificially generated erroneous feature points with normally distributed random distances and directions. First, we calculated a two-dimensional matrix containing normally distributed random numbers using the Box-Muller method. Then, we generated the noisy feature points by adding each column vector of the matrix to the 2D coordinates of the feature points in the input face image. The feature points on the face used for measurement were annotated by color markers. We assumed that the 2D positions of the marked feature points were the reference positions.
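The noise generation step can be sketched as follows; the seed and the specific feature coordinates are illustrative, not the paper's test data:

```python
import numpy as np

def box_muller(n, sigma=1.0, seed=42):
    """Generate n pairs of N(0, sigma^2) samples via the Box-Muller method."""
    rng = np.random.default_rng(seed)
    u1 = rng.uniform(1e-12, 1.0, n)    # avoid log(0)
    u2 = rng.uniform(0.0, 1.0, n)
    r = sigma * np.sqrt(-2.0 * np.log(u1))
    return np.stack([r * np.cos(2.0 * np.pi * u2),
                     r * np.sin(2.0 * np.pi * u2)], axis=1)

features = np.array([[100.0, 120.0], [150.0, 118.0]])   # 2D feature points
noisy_features = features + box_muller(len(features), sigma=2.0)
```

Each feature point is thus displaced by a Gaussian offset whose standard deviation controls the error strength used in the robustness test.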

We first tested the results of our proposed method and Lin's method according to error strength, which can be adjusted by changing the standard deviation of the random numbers.

We carried out the 3D face shape reconstruction by applying the proposed method and Lin's method with noisy feature points. The standard deviation of the error was varied from 0 to 5 in intervals of 0.02. Then, we calculated the average sum of the Euclidean distances (average absolute error) between each 3D reconstructed point and the ground truth. As shown in
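The error metric can be sketched as a mean of per-point Euclidean distances (a minimal illustration, not the paper's evaluation code):

```python
import numpy as np

def average_absolute_error(recon, truth):
    """Mean Euclidean distance between matched (N, 3) 3D point sets."""
    return np.linalg.norm(recon - truth, axis=1).mean()

truth = np.zeros((4, 3))
recon = np.array([[3.0, 4.0, 0.0]] * 4)   # each point is 5 units away
err = average_absolute_error(recon, truth)
# err -> 5.0
```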

Next, we fixed the standard deviation and measured the average absolute error as the number of noisy feature points was increased from 0 to 100 in intervals of 1. As shown in

We generated textured 3D face models of the users using the proposed face modeling method. After applying the generic model fitting described in Section 3.2, we applied the texture mapping method described in Section 4. Eyeballs are not included in the generic model, so we inserted artificial eyeballs with the 3ds Max program. After producing the eyeballs, we aligned the center of each eyeball with the center of the eye region.

In this paper, we propose a realistic 3D face modeling method that is robust to feature extraction errors and can generate accurate 3D face models. In the facial shape estimation procedure, we propose a 3D face shape estimation method using multiple 3D face deformable models in a mirror system. The proposed method shows high robustness to feature extraction errors and highly accurate 3D face modeling results, as described in Sections 5.2 and 5.3. In the texture mapping procedure, we apply cylindrical mapping and a stitching technique to generate a texture map. We apply the seamless cloning method, a type of gradient-domain blending, to remove the seams caused by photometric inconsistency and can thus acquire a natural texture map.

To evaluate our method's performance, we carry out accuracy and robustness tests with respect to 30 individuals' 3D facial shape estimation results. Our method not only produces highly accurate 3D face shapes when compared with the ground truth, but is also robust to feature extraction errors. Moreover, the 3D face rendering results intuitively show that our method is more robust to feature extraction errors than other 3D face shape estimation methods. An additional contribution of our method is that a wide range of face textures can be acquired by the mirror system. Lastly, we generate textured 3D faces using our proposed method. The results show that our method can generate very realistic 3D faces, as shown in

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0011472). This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011-0016302).

3D FSM deformation using model parameters of the first and second principal modes. Deformation results applying a first principal mode parameter (

A mirror system for capturing facial images and a conceptual diagram of the system.

Feature extraction in simultaneously captured images and cost function generation.

3D FSM and two virtual 3D FSMs generated by a Householder reflection.

The 3D face shape estimation results. (

(

Cylindrical texture mapping for facial texture extraction.

Texture map refinement using seamless cloning. (

3D face modeling result after texture mapping.

Image stitching results using multi-resolution splining and our method. (

The user's face captured with our mirror system. Red color markers are attached to the user's face for the performance test.

Robustness test results. (

(

The results of the textured 3D face model for individuals.

Mean, standard deviation and median of the absolute distance errors of the proposed method and Lin's method compared to the actual faces.

| | Mean | Standard deviation | Median |
|---|---|---|---|
| Proposed method | 3.12 | 1.14 | 2.59 |
| Lin's method | 3.58 | 0.59 | 3.49 |

The maximum error distances according to the standard deviation.

| Standard deviation | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Maximum error distance | 3.811 | 6.692 | 9.760 | 12.667 | 16.845 |