Comparative Analysis of Warp Function for Digital Image Correlation-Based Accurate Single-Shot 3D Shape Measurement

Digital image correlation (DIC)-based stereo 3D shape measurement is a kind of single-shot method, which can achieve high precision and is robust to vibration as well as environment noise. The efficiency of DIC has been greatly improved with the proposal of inverse compositional Gauss-Newton (IC-GN) operators for both first-order and second-order warp functions. Without the algorithm itself, both the registration accuracy and efficiency of DIC-based stereo matching for shapes with different complexities are closely related to the selection of warp function, subset size, and convergence criteria. Understanding the similarity and difference of the impacts of prescribed subset size and convergence criteria on first-order and second-order warp functions, and how to choose a proper warp function and set optimal subset size as well as convergence criteria for different shapes are fundamental problems in realizing efficient and accurate 3D shape measurement. In this work, we present a comparative analysis of first-order and second-order warp functions for DIC-based 3D shape measurement using IC-GN algorithm. The effects of subset size and convergence criteria of first-order and second-order warp functions on the accuracy and efficiency of DIC are comparatively examined with both simulation tests and real experiments. Reference standards for the selection of warp function for different kinds of 3D shape measurement and the setting of proper convergence criteria are recommended. The effects of subset size on the measuring precision using different warp functions are also concluded.


Introduction
Optical 3D shape measurement has become one of the research hotspots in the field of measurement due to the advantages of high precision, non-contact, and high speed, etc. Laser scanning [1][2][3], structured light [4,5], and digital image correlation (DIC) [6][7][8] are commonly used for accurate 3D shape measurement. According to previous researches [6,9], all of the three methods can achieve the same level of precision. The principle of laser scanning can be briefly summarized as: a laser line stripe plane is projected onto a measuring surface, then a laser stripe is formed and modulated by the depth of the surface. By calibrating the line stripe plane previously and recording the laser stripe by a well-calibrated camera, the 3D information along the stripe line on the surface can be characterized. For structured light measurement, coded fringe patterns are projected onto a measuring surface, the captured images are processed by relative decoding method, whereby an exact phase is computed the left and right images. Speckle projection-based DIC has been proven to have good performances in single-shot 3D measurement and the principle is introduced in this section [25,26].

Warp Function of DIC
The captured images of a same local region from two different angles of view have obvious difference due to rotation and deformation. By setting a reference subset in the reference image, the position and shape of the relative target subset in the target image can be described by a warp function with proper parameters.
The first-order and second-order warp functions can be represented as: ∆p 1 = ∆u, ∆u x , ∆u y , ∆v, ∆v x , ∆v y T (3) ∆p 2 = ∆u, ∆u x , ∆u y , ∆u xx , ∆u xy , ∆u yy , ∆v, ∆v x , ∆v y , ∆v xx , ∆v xy , ∆v yy T (6) where (x, y) denotes the local coordinate of the pixel in reference subset, (x , y ) is the mapped coordinate of (x, y). W 1 (x, y; p 1 ) and W 2 (x, y; p 2 ) are the first-order and second-order warp functions with parameter vector p 1 and p 2 , respectively. ∆p 1 and ∆p 2 denote the incremental parameter vectors. u and v denote the displacement components of center pixel of the reference subset in x direction and y direction, respectively. The other parameters are the first-order gradient components (i.e., u x , u y , v x , v y ) and second-order gradient components (i.e., u xx , u xy , u yy , v xx , v xy , v yy ).

Principle of DIC-Based Stereo Matching Using IC-GN Algorithm
Figure 1 [7] shows the schematic principle of DIC-based stereo matching using IC-GN algorithm. The first-order and second-order warp functions are adopted in Figure 1a,b, respectively. Subscript 1 and 2 are used hereinafter to distinguish the first-order and second-order IC-GN algorithms: IC-GN 1 and IC-GN 2 . f and g denote the gray level intensities of reference subset and target subset, respectively. A whole DIC process using IC-GN algorithm can be concluded as three steps. Firstly, compute the optimal parameter incremental vector ∆p according to current p, which need to be estimated before the first iteration. The most commonly used ZNSSD criterion is employed in this step [20].
where x and y in Equation (8) are usually sub-pixel values, g(x , y ) is calculated by B-spline interpolation [18]. f and g are the mean values of gray level intensities of the reference subset and target subset. f is constant during the iterations, while g need to be calculated in each iteration. Equation (7) can be simplified by first-order Taylor expansion with respect to ∆p: where ∇f = (∂ f /∂x, ∂ f /∂y) is the gray level intensity gradient in x and y directions of the reference subset. ∂W ∂p is the Jacobian of the warp function. For first-order and second-order warp functions, the Jacobians can be expressed respectively as: From Equation (9) ∆p can be solved by least-squares method: where H is the Hessian matrix in the IC-GN algorithm, which is constant during the iterations because ∇ f and ∂W ∂p are independent of the target subset. Secondly, exert ∆p on the reference subset to get the incremental warp W(x, y; ∆p). Subsequently, compose current warp W(x, y; p) with the inverse incremental warp W −1 (x, y; ∆p) to obtain an updated warp: W(x, y; p) = W(x, y; p) · W −1 (x, y; ∆p) Thirdly, repeat the above two steps with the updated p obtained by Equation (14) until preset convergence conditions have been met. In Equation (14), the warp function must be invertible. The first-order warp function can be inverted directly, while the second-order warp function need to be expanded to make it invertible [20].
There are usually two steps to get dense disparity map in DIC-based stereo matching, namely seed point generation and seed point propagation. Scale-invariant feature transform (SIFT) [27] is a classical feature detection method, features extracted by which is invariant to affine transformation, rotation, and scale. In this paper, SIFT-based feature detection, feature matching [28], and affine transformation are adopted to estimate initial values for p to generate seed points. The detailed procedure can be found in our previous work [7]. The initial values for p 1 can be estimated directly. For IC-GN 2 , the initial values for the second-order components (i.e., u xx , u xy , u yy , v xx , v xy , v yy ) of p 2 are set to zeros. To improve the calculation efficiency, a fast recursive scheme [29] and reliability-guided seed point propagation [14] are utilized. Based on the disparity map, 3D reconstruction can be finished via triangulation.

Experiments and Discussions
To conduct the comparative analysis quantitatively, two groups of experiments are investigated. In the first group, numerical simulations with two speckle images generated by computer are conducted to compare performances of first-order and second-order warp functions. In the second group, a set of experiments with different real objects are performed to evaluate the applicability and efficiency of first-order and second-order warp functions for the measurement of surfaces with different complexities. All the experiments are executed on a normal Intel(R) Core(TM) i7-4710MQ CPU 2.50 GHz laptop by C++ language with the additional library of Open Source Computer Vision (OpenCV).
In the following experiments, the modulus of the incremental displacement components ∆u and ∆v, ∆P main = √ ∆u 2 + ∆v 2 , is used to examine the convergence. Also the optimized ZNSSD correlation coefficient is converted to zero-mean normalized cross-correlation (ZNCC) coefficient, which is equivalent to ZNSSD but more straightforward [30]. The judging conditions for the success of IC-GN are that ∆P main is less than the preset convergence threshold and the optimized ZNCC coefficient is larger than 0.8, as well as the number of iterations is less than 30.

Comparative Analysis by Numerical Simulations
In the following tests, a simulated image pair is equalized as a rectified stereo image pair: the displacements between the two images only occur along the along the x-axis. Therefore, the measurement of the displacements between the reference image and target image is equivalent to the process of stereo matching (getting dense disparity map) in DIC-based 3D shape measurement. As shown in Figure 2, the reference image and target images are generated by the well-known simulation algorithm proposed by Zhou [31] and widely used in previous researches [32][33][34]: where I is the generated intensity of the simulated speckle image. S is the total number of speckles, r is speckle size. (x k , y k ) is a randomly generated speckle position. I 0 is the peak intensity of each speckle, which is usually set to be 255. In Figure 2a, there are totally 150, 000 randomly generated speckles in the reference image with a resolution of 1280 × 960 pixels, and the speckle radius is 1.2 pixels. Figure 2b is the corresponding target image generated by exerting specific displacements on the reference image: Two different forms of displacements along the x-axis are exerted on the reference image according to the displacement function U(x, y). The displacements for the left part and right part are generated by an analogous sinusoidal-Gaussian function and an analogous Gaussian function, respectively. In Figure 2a, two preset regions of interest (ROI) are marked by yellow rectangles in the left part (ROI 1 ) and right part (ROI 2 ). (u 1 , v 1 ) and (u 2 , v 2 ) are the coordinates of the center pixels of ROI 1 and ROI 2 , which are set to be (320, 480) and (960, 480), respectively. σ denotes the Gaussian Root-Mean-Square (RMS) width, where σ 1 and σ 2 are set to be 50 and 200, respectively. t is the period of sinusoidal function, which is set to be 1. The displacement fields of ROI 1 and ROI 2 are shown in Figure 2c,d, it is obvious that the displacement field of ROI 1 is much more complex than that of ROI 2 . The displacements of all the pixels in ROI 1 and ROI 2 are measured by IC-GN 1 and IC-GN 2 . The measured data are analyzed statistically: where e U is the mean bias error, s U is the standard deviation, and RMSE U is the root-mean-square error (RMSE). U mei and U thi denote the measured and theoretical displacements along the x-axis of the sampling pixel with index i. N is the number of sampling pixels. It is necessary to state here that the influence of subset size and convergence criterion on the accuracy of IC-GN 1 and IC-GN 2 in different displacement fields are compared.

Comparative Analysis with Different Subset Sizes
Three groups of data (namely, measured data of ROI 1 , ROI 2 , both ROI 1 and ROI 2 ) are analyzed with subset size changed from 15 × 15 to 35 × 35 pixels, where the three groups of data are denoted as ROI 1 , ROI 2 , and ROI 1&2 hereinafter. Figure 3 shows the s U and RMSE U as a function of subset size, where the convergence threshold for ∆P main is set to be 0.001. The corresponding data are listed in Table 1. To compare the characteristics of the errors measured by IC-GN 1 and IC-GN 2 of the two displacement fields, the error distribution maps with a specific subset size as 27 × 27 pixels are shown in Figure 4. It can be easily seen in Figure 3 that for IC-GN 1 , s U and RMSE U of ROI 1 both increase as the subset size becomes larger. However, s U and RMSE U of ROI 2 decrease as the subset size becomes larger. For IC-GN 2 , s U and RMSE U of ROI 1 get the minimums with the subset size of 27 × 27 pixels. s U and RMSE U of ROI 2 decrease as the subset size becomes larger. For both IC-GN 1 and IC-GN 2 , s U and RMSE U of ROI 2 are always smaller than that of ROI 1 , which indicates that the precision of IC-GN can be reduced by complex displacement field. It should be noted that IC-GN 2 is more accurate than IC-GN 1 for ROI 1 . However, IC-GN 1 is more accurate for ROI 2 with all tested subset sizes. The errors of ROI 1&2 are the tradeoff of errors of ROI 1 and ROI 2 .   The error distribution maps of ROI 1 and ROI 2 measured by IC-GN 1 and IC-GN 2 are shown in Figure 4a-d, respectively. By horizontal comparison, it is obvious that the errors of ROI 1 measured by IC-GN 1 and IC-GN 2 are both mainly concentrate on the peak areas of the shape of displacement field, while the error distribution of ROI 2 likes a random distribution. By vertical comparison, we can see that the concentrated errors in the peak areas measured by IC-GN 1 can be suppressed by IC-GN 2 , while the errors of ROI 2 measured by IC-GN 2 are about double of that measured by IC-GN 1 . Therefore, it can be concluded that IC-GN 2 is more accurate for complex displacement (disparity) field measurement, while IC-GN 1 is more accurate for general uniform displacement (disparity) field measurement.

Comparative Analysis with Different Convergence Criteria
In Figure 3b, the curves of RMSE U of ROI 1&2 measured by IC-GN 1 and IC-GN 2 have an intersection around the side length of subset of 17 pixels. Therefore, the subset size is set to be 17 × 17 pixels to compare the performances of IC-GN 1 and IC-GN 2 under different convergence criteria. As shown in Figure 5, the convergence threshold for ∆P main is set to be 0.1, 0.01, 0.001, and 0.0001, respectively. To compare the convergence efficiency of IC-GN 1 and IC-GN 2 under different convergence thresholds, the average numbers of iterations (denoted as n itor ) of ROI 1 , ROI 2 , and ROI 1&2 are listed in Table 2.  It can be concluded from Figure 5 that the same characteristic of IC-GN 1 and IC-GN 2 for the three groups is that the errors under the convergence thresholds of 0.01, 0.001, and 0.0001 are almost the same from each other. The difference is that the errors of IC-GN 1 under the convergence threshold of 0.1 are significantly larger than that under the other thresholds, while the errors of IC-GN 2 under the convergence threshold of 0.1 are slightly larger or smaller than under the other thresholds. Furthermore, s U and RMSE U of ROI 1&2 measured by IC-GN 2 under the convergence threshold of 0.1 are smaller than that measured by IC-GN 1 under any one of the tested convergence thresholds. Also, it is evident in Table 2 that the preset convergence threshold directly affects the convergence efficiency. For all ROI 1 , ROI 2 , and ROI 1&2 , the average numbers of iterations of IC-GN 2 under the convergence threshold of 0.1 are about the same as those of IC-GN 1 under the convergence threshold of 0.01. If only ROI 1&2 is considered, IC-GN 2 under the convergence threshold of 0.1 is more accurate than IC-GN 1 under the convergence threshold of 0.01. Considering both the efficiency and accuracy, conclusions can be drawn that the convergence threshold of 0.01 is the best choice for IC-GN 1 , while 0.1 is more suitable for IC-GN 2 .

Comparative Anslysis by Real Tests
As Shown in Figure 6a [7], a single-shot stereo system is used to perform real experiments, which is composed of two CCD cameras with a resolution of 1280 × 960 pixels (Basler acA1300-30 gm. Manufactured by Basler AG, Ahrensburg, Germany. Supplied by Shanghai Vision-Light Tech Co., Ltd. Pudong New Area, Shanghai, China), two camera lenses (Computar 8 mm 1:1.4 2/3. Manufactured by Computar ® , Tokyo, Japan. Supplied by Shanghai Vision-Light Tech Co., Ltd. Pudong New Area, Shanghai, China), and a projector with a resolution of 1140 × 912 pixels (TI DLP LightCrafter4500. Manufactured by TEXAS INSTRUMENTS, Dallas, Texas, America. Supplied by Texas Instruments Semico . . . es (Shanghai) Co. Ltd. Pudong New Area, Shanghai, China). Figure 6b shows the projected speckle pattern with the same resolution as the projector: there are totally 120,000 speckles with a fixed radius of 1.2 pixels. Three objects are employed to compare the real measurement performances of IC-GN 1 and IC-GN 2 . The measured surfaces are shown in Figure 7a-c, namely, plane surface, cylinder surface, and back surface of a plaster head (named as head hereinafter for short). The plane surface and cylinder surface are used as standard surfaces, which are measured by a Coordinate Measuring Machine (CMM (2 + (L/350) µm. Manufactured by Thome Präzision GmbH, Messel, Germany. Supplied by THOME China, Minhang District, Shanghai, China)). The 3D coordinates of the measured points are fitted into plane and cylinder surface by least square method, respectively, and the fitted results are listed in Table 3. In the calculations of real tests, a ROI is set in the left image of each rectified stereo image pair. The shape of the ROI in the head is much more complex compared to that of the plane or cylinder. The disparity maps and 3D shapes of the three ROIs are shown in Figure 8 to enable a visual comparison. In the following comparative analysis, both IC-GN 1 and IC-GN 2 are used for all the ROIs except for that in Figure 8, which refers to different warp function for different ROI: the ROIs in the plane and cylinder are measured by IC-GN 1 under a convergence threshold of 0.01, and the ROI in the head is measured by IC-GN 2 under a convergence threshold of 0.1. In addition, the subset size is set to be 27 × 27 pixels in Figure 8.   To verify the conclusions drawn by simulation tests. The pixels in each ROI are matched by IC-GN 1 and IC-GN 2 with the convergence threshold of 0.001, and the subset size ranges from 15 × 15 to 35 × 35 pixels. For the plane surface and cylinder surface, the standard deviation (denoted as s) of plane or cylinder surface fitting in each measurement is plotted in Figure 9a. It can be seen that IC-GN 1 is always more accurate than IC-GN 2 with all tested subset sizes for both surfaces. To compare the measuring abilities of IC-GN 1 and IC-GN 2 for different surfaces, the statistics of matching rates with different subset size of each ROI are listed in Table 4. The matching rate is denoted as r m , which is the ratio of number of matched pixels to the number of total pixels (denoted as N pix ) in the ROI. The matching rates of IC-GN 1 and IC-GN 2 are all equal or very close to 100% for the plane and cylinder surfaces. However, the matching rates of IC-GN 1 for the ROI of head are all below 70%, while the matching rates of IC-GN 2 are all very close to 100%. Therefore, the measurement ability of IC-GN 1 for complex shape measurement is limited, which is almost unrelated to the change of subset size. The accuracy of IC-GN 1 and IC-GN 2 are also compared under different convergence thresholds with a specific subset size of 27 × 27 pixels. The standard deviations of plane fitting and cylinder surface fitting versus convergence threshold are plotted in Figure 9b. It can be seen that for IC-GN 1 , only the differences of standard deviations under convergence thresholds of 0.1 and 0.01 are relevant.
For IC-GN 2 , the standard deviations are almost the same under different convergence thresholds. As shown in Figure 10, the 3D data of the ROI of head measured by IC-GN 2 under two different convergence thresholds are compared. That means for every pixel in the ROI that has been matched, the spatial distance of the corresponding two 3D points reconstructed under the two convergence thresholds are calculated. The distance distribution maps by comparison of convergence threshold of 0.1 to 0.01 and 0.001 are shown in Figure 10a,b, respectively. Furthermore, comparison of shapes measured by IC-GN 2 and structured light of the head are shown in Figure 11.  It needs to declare that the distance values for unmatched pixels are set to be zeros in Figure 10. There is no significant difference between Figure 10a,b; the corresponding standard deviations are 4.318 µm and 4.496 µm. To further verify the measurement effectiveness of IC-GN 2 under the threshold of 0.1, the head is measured at the same position by both IC-GN 2 and three-frequency three-step structured light using the same system. The same ROI is set in the rectified left images of DIC measurement and structured light measurement, and the shapes of the ROI measured by IC-GN 2 and structured light are shown in Figure 11a,b, respectively. For every pixel in the ROI, the spatial distance of the two 3D coordinates measured by IC-GN 2 and structured light is calculated. The standard deviation of all the calculated distance values is 0.023 mm, which is in the same level of precision of the above plane fitting and cylinder surface fitting. Therefore, conclusions can be drawn from the above comparisons that the convergence threshold of 0.01 is suitable for IC-GN 1 , while 0.1 is recommended for IC-GN 2 . The conclusions are consistent with that drawn in the simulation tests.

Conclusions
In this paper, a comparative analysis of first-order and second-order warp functions for DIC-based stereo 3D shape measurement is presented. Both simulation tests and real experiments with different objects are performed to compare the impacts of subset size and convergence criteria on the measuring ability, efficiency, and precision by IC-GN using first-order and second-order warp functions. Conclusions are summarized as follows: (1) The first-order warp function is more suitable for surfaces with a shape of flat or small curvature, such as plane, cylinder, and flat Gaussian surface, etc. Under the same convergence criteria, IC-GN 1 is always more efficient and accurate than IC-GN 2 with all tested subset sizes. (2) The second-order warp function is more suitable for surfaces with a complex shape or large curvature, such as the tested back surface of head and analogous sinusoidal-Gaussian surface, etc. IC-GN 1 is not capable or accurate enough for such kind of 3D shape measurement; the matching rate of tested ROI of head is under 70% with any of the tested subset size. (3) The convergence thresholds for IC-GN 1 and IC-GN 2 are recommended to be that the variation of the modulus of incremental displacement vector is less than 0.01 pixel, and 0.1 pixel, respectively. Both the recommended convergence thresholds can achieve considerable measurement precision compared to smaller thresholds according to the simulation tests and real experiments.