A Systematic Stereo Camera Calibration Strategy: Leveraging Latin Hypercube Sampling and 2k Full-Factorial Design of Experiment Methods

This research aimed to optimize the camera calibration process by identifying the optimal distance and angle for capturing checkered board images, with a specific focus on understanding the factors that influence the reprojection error (ϵRP). The objective was to improve calibration efficiency by exploring the impacts of distance and orientation factors and the feasibility of independently manipulating these factors. The study employed Zhang's camera calibration method, along with the 2^k full-factorial analysis method and the Latin Hypercube Sampling (LHS) method, to identify the optimal calibration parameters. Three calibration methods were devised: calibration with the distance factors (D, H, V) (method A), with the orientation factors (R, P, Y) (method B), and with the two most influential factors drawn from both sets (method C). The calibration study was carried out with three different stereo cameras. The results indicate that D is the most influential factor, while H and V are nearly equally influential for method A; P and R are the two most influential orientation factors for method B. Compared to Zhang's method alone, on average, methods A, B, and C reduce ϵRP by 25%, 24%, and 34%, respectively. However, method C requires about 10% more calibration images than methods A and B combined. For applications where a lower ϵRP is required, method C is recommended. This study provides valuable insights into the factors affecting ϵRP in calibration processes. The proposed methods can be used to improve the calibration accuracy of stereo cameras for applications in object detection and ranging. The findings expand our understanding of camera calibration, particularly the influence of distance and orientation factors, making significant contributions to camera calibration procedures.


Introduction
Binocular or stereo vision, used in diverse areas like autonomous driving [1,2], robot navigation [3,4], 3D reconstruction [5][6][7], and industrial inspection [8,9], employs two identical cameras to capture two images of the same target from different vantage points. The disparity between these images enables 3D reconstruction and measurement, the accuracy of which is heavily dependent on the precision of the camera calibration and stereo matching methods, thus making these processes key research subjects in stereo vision. During the camera calibration process, it is crucial to estimate both the intrinsic parameters, such as the focal length and optical centers, and the extrinsic parameters, like the camera's position and orientation in space. These directly influence the calculation of 3D spatial information [10]. Camera calibration significantly impacts the accuracy of the reconstruction and measurement, thereby playing a crucial role in binocular vision.
Camera calibration techniques mainly fall into two categories: self-calibration and photogrammetric calibration [11]. Self-calibration methods, though not needing a calibration object, rely on sequences of images from an uncalibrated scene, often entail complex computations [12][13][14], and are low in accuracy [5,15]. Conversely, photogrammetric methods, including popular methods such as Zhang's [16], Tsai's [17], Bouguet's [18], and Heikkila and Silven's [19], utilize geometric information from a calibration object, such as a checkered board with known geometry, to determine camera parameters. The latter methods minimize the reprojection error (ϵRP) through feature point detection and optimization, providing more accurate information about camera characteristics.
Zhang's camera calibration method [16] has gained significant attention due to its simplicity, robustness, and high degree of accuracy. One of the key advantages of Zhang's method is that it only requires a simple calibration object, such as a checkered board, which is easy to fabricate and measure. This stands in contrast to other photogrammetric methods that may require specialized calibration objects or more complicated patterns. The method is based on the fundamental matrix theory, which determines the intrinsic (i.e., focal length, lens distortion coefficients, and the position of the image center) and extrinsic (i.e., camera position and orientation) parameters of a camera by measuring the geometric relationship between the camera and the calibration board. The method requires at least three pairs of images of a checkered board with known dimensions taken from different positions and orientations within the camera's field of view (FOV). Then, a corner detection algorithm is employed to detect the corners in each image, i.e., the intersections between the black and white squares, and the image coordinates of the detected corners are matched with their known 3D coordinates on the checkered board. Subsequently, nonlinear least-squares minimization via the Levenberg-Marquardt algorithm is used to determine the camera's intrinsic and extrinsic parameters that yield the minimum geometric error between the projected 3D calibration points and their corresponding 2D image points. The lens distortion coefficients, such as radial and tangential distortions, are also estimated and corrected.
Owing to the simplicity, flexibility, and robustness of Zhang's method, many researchers have developed new camera calibration algorithms that incorporate Zhang's method to improve calibration accuracy. Rathnayaka et al., 2017, proposed two new calibration methods that use a specially designed, colored checkered board pattern and mathematical pose estimation approaches to calibrate a stereo camera setup with heterogeneous lenses, i.e., a wide-angle fish-eye lens and a narrow-angle lens [11]. One method calculates left and right transformation matrices for stereo calibration, while the other utilizes a planar homography relationship and Zhang's calibration method to identify common object points in stereo images. Evaluations based on ϵRP and image rectification results showed that both methods successfully overcome the challenges, with the planar homography approach yielding slightly more accurate results. Wang et al., 2018, proposed a non-iterative self-calibration algorithm to compute the extrinsic parameters of a rotating (yaw and pitch only) and non-zooming camera [13]. The algorithm utilizes a single feature point to improve the calibration speed, and Zhang's method was employed in advance to calculate the intrinsic parameters of the camera. Although their results are slightly less accurate than those calibrated by Zhang's method alone, their work demonstrates the algorithm's potential for real-time applications where extremely high calibration accuracy is not required. Hu et al., 2020, introduced a calibration approach for binocular cameras that begins with Zhang's technique to obtain an initial guess of the rotation and translation matrix [5]. This guess is then refined using singular value decomposition and solved via the Frobenius norm, and is further refined through maximum likelihood estimation, generating a new calculation for the relative position matrix of the binocular cameras. The Levenberg-Marquardt algorithm is employed for additional refinement.
Their results show significant improvement over the traditional Bouguet's and Hartley's algorithms. Liu et al., 2022, proposed a multi-camera stereo calibration method focusing on the high-precision extraction of circular calibration pattern features during the calibration process [20]. The Franklin matrix, used for sub-pixel edge detection, and the image moment method were used to obtain the characteristic circle center, while the calibration was realized using Zhang's method. Their work shows that calibration effects can be improved by focusing on the high-precision extraction of calibration features.
Calibration targets also play a significant role in camera calibration. Although using a checkered board pattern as the calibration target has advantages such as simplicity, stability, and generally higher accuracy than self-calibration, it has the downside of being susceptible to illumination and noise in corner feature extraction and matching. Circular patterns have been proposed to address this problem [21]. Recently, Wei et al., 2023, presented a camera calibration method using a circular calibration target [22]. Their method operates in two stages. The first stage involves extracting the centers of the ellipses and formulating the real concentric circle center projection equation by leveraging the cross-ratio invariance of collinear points. The second stage involves solving a system of linear equations to find the other center projection coordinates, taking advantage of the fact that the infinite lines passing through the centers are parallel. Unlike other circular calibration algorithms that directly fit ellipses to estimate the centers of circular feature points, their method not only corrects errors that arise from using the ellipse's center but also significantly improves the accuracy and efficiency of the calibration process. The proposed method reduces the average ϵRP to 0.08 pixels and achieves real-world measurement accuracy of 0.03 mm, making it valuable for high-precision 3D measurement tasks. Recently, Zhang et al., 2023, proposed a flexible calibration method for large-range binocular vision systems in complex construction environments [23]. The calibration method consists of two stages. In stage 1, lenses are focused closer to the cameras, and non-overlapping fields of view (FOVs) are calibrated using virtual projection point pairs through Zhang's method and nonlinear optimization. Fourteen sets of clear checkered board images are acquired for this purpose.
Then, the lenses are focused on the measurement position in stage 2, where three sets of blurred three-phase shifted circular grating (PCG) images are taken. The final extrinsic parameters for the binocular camera are computed using state transformation matrices for calibration. The method has demonstrated robustness under varying target angles and calibration distances, with a mean ϵRP of only 0.0675 pixels and a relative measurement error of less than 1.75% in their experiment. While Zhang's method ensures accurate calibration outcomes when viewing a planar target from various angles, the selection of these differing orientations relies largely on empirical knowledge, which could potentially lead to variations in the calibration results [24]. Furthermore, it is often necessary to manually adjust the calibration object's position, a process that introduces uncertainties due to instability and shaking, resulting in inconsistent positioning of the calibration plate across images captured by different cameras, and thus introducing inherent errors that directly affect the calibration process's accuracy [15]. Moreover, the calibration procedure requires a significant amount of time and effort [13].
The present study seeks to address several key issues related to camera calibration in the process of capturing checkered board images. In particular, it aims to optimize this process by identifying the optimal distance and angle for calibration, consequently enhancing the calibration results. In contrast to the existing literature, this study places emphasis on gaining a comprehensive understanding of the factors that affect ϵRP in the calibration process, specifically the roles of various displacement and orientation factors, and how they might be manipulated independently for better calibration efficiency. This leads to four principal research questions, which revolve around understanding the impacts of the distance factors (D, H, V) and orientation factors (R, P, Y) on ϵRP, determining the effects of combined distance and orientation factors on ϵRP, and exploring the feasibility and efficiency of conducting the calibration process separately for different factor groups. To address these, the research objectives include identifying the most influential distance and orientation factors, investigating the combinations of factors that minimize ϵRP, and evaluating the efficiency of separately conducting the calibration process for different factor groups. The overarching goal is to develop more efficient and effective calibration procedures for various applications.

Experiment Setup
Three different cameras are used in this study. The details of the cameras are presented in Table 1. However, only the calibration process of the ZED2i binocular camera [25] is presented in this study.
For camera calibration, a checkered board consisting of 2 cm alternating black and white squares arranged in a grid pattern of nine columns and seven rows is employed, as shown in Figure 1b. The checkered board is attached to the camera calibration rig as shown in Figure 1c. The Stereo Camera Calibration Toolbox for MATLAB [18] is used for calibration, where the inner corner points are detected using the Harris corner algorithm [26]. The checkered board size is fed to the algorithm and the actual number of detected corner points is 48. The high contrast between black and white aids in accurately detecting and localizing the inner corners of the squares, which are crucial for camera calibration algorithms.
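The board geometry above fixes the calibration object points: a 9 × 7 grid of squares has 8 × 6 = 48 inner corners, matching the corner count reported above. As an illustrative sketch (Python/NumPy, not the MATLAB toolbox actually used in this study), the world coordinates of those corners can be generated as follows:

```python
import numpy as np

# A 9 x 7 grid of 2 cm squares has (9-1) x (7-1) = 48 inner corners.
COLS, ROWS, SQUARE_M = 9, 7, 0.02

def board_object_points(cols=COLS, rows=ROWS, square=SQUARE_M):
    """World coordinates (X_w, Y_w, Z_w) of the inner corners; Z_w = 0 on the planar board."""
    xs, ys = np.meshgrid(np.arange(cols - 1), np.arange(rows - 1))
    pts = np.zeros(((cols - 1) * (rows - 1), 3))
    pts[:, 0] = xs.ravel() * square
    pts[:, 1] = ys.ravel() * square
    return pts  # Z_w stays 0 for the planar target

pts = board_object_points()
print(pts.shape)  # (48, 3)
```

These are the known 3D coordinates against which the detected 2D corners are matched during calibration.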

Since binocular camera calibration necessitates capturing images from various directions and angles, a camera calibration rig was designed to securely hold the checkered board, ensuring precise distances and orientations for calibration purposes. The components of the calibration rig and its final assembly are depicted in Figure 2. The rig comprises four main components: an LD2R dual panoramic gimbal for yaw (Y) and pitch (P) adjustments; a two-way macro frame for left-right (H) and depth, i.e., front-back (D), displacements; a lift table for vertical (V) movements; and an RSP60R axis rotation stage for rotation (R). These adjustments ensure that the checkered board is distributed across different quadrants and acquired from various positions and angles.

Zhang's Camera Calibration
Zhang's calibration method [16] was used in this study to perform the calibration task due to its simplicity and robustness. Mathematically, let P = (X_w, Y_w, Z_w)^T denote the coordinates of a corner point on the checkered board in the world coordinate system; its homogeneous coordinate is P̃ = (X_w, Y_w, Z_w, 1)^T. In the pixel coordinate system, this point is denoted by m = [u, v]^T, while its homogeneous coordinate is m̃ = [u, v, 1]^T. The relationship between the world coordinate and the image coordinate is described by the camera model, as follows:

λm̃ = A[R | t]P̃, (1)

where λ is the proportionality coefficient and [R | t] and A are the extrinsic and intrinsic parameter matrices of the camera, respectively. By assuming that Z_w = 0 for the planar board, the equation can be further simplified:

sm̃ = H[X_w, Y_w, 1]^T, (2)

where H is a 3 × 3 homography matrix, defined as follows:

H = A[r_1 r_2 t], (3)

and s is a proportional factor. The homography matrix describes the transformation between two images of the same planar scene. Before calculating the homography matrix for camera calibration using a checkered board, it is necessary to configure the parameters for the world coordinates of the checkered board. This includes details such as the board's overall size, the size of each square, and the number of detected corner points on the checkered board. Writing H = [h_1 h_2 h_3], the fundamental constraints for the camera's intrinsic parameters are as follows:

h_1^T A^{-T} A^{-1} h_2 = 0,
h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2. (4)

By using the properties of the rotation matrix R, the unit orthogonal vectors r_1 and r_2 can be obtained. The camera intrinsic parameters can be determined by solving A^{-T} A^{-1} using Equation (4).
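The camera model and planar homography described above can be sketched numerically. In the following Python/NumPy snippet, the intrinsic matrix A and the extrinsic columns r1, r2, t are hypothetical values chosen for illustration (not calibration results from this study); the snippet forms H = A[r1 r2 t] and projects a planar board point to pixel coordinates:

```python
import numpy as np

# Hypothetical intrinsics A and extrinsic columns r1, r2, t (illustrative only).
A = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
r1 = np.array([1.0, 0.0, 0.0])
r2 = np.array([0.0, 1.0, 0.0])
t  = np.array([0.0, 0.0, 1.0])

# H = A [r1 r2 t] maps planar board points (Z_w = 0) to homogeneous pixel coordinates.
H = A @ np.column_stack([r1, r2, t])

def project(H, Xw, Yw):
    """Apply s * m_tilde = H [X_w, Y_w, 1]^T, then dehomogenize to pixels (u, v)."""
    s_m = H @ np.array([Xw, Yw, 1.0])
    return s_m[:2] / s_m[2]

print(project(H, 0.02, 0.02))  # corner one 2 cm square in from the board origin
```

With these toy extrinsics the board is fronto-parallel at unit depth, so the board origin lands at the principal point (320, 240).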
ϵRP is the error between the reprojected point, obtained by projecting the calibrated image point onto the camera coordinate system, and its actual coordinate. This metric is used to assess the accuracy of the camera calibration results in this study. As shown in Figure 3, after calibrating the camera using its intrinsic and extrinsic parameters, the point P is reprojected onto the image plane, resulting in a new point P′ with pixel coordinates (u′, v′). The calibration ϵRP is calculated as the Euclidean distance between the reprojected pixel coordinates (u′, v′) and the actual calibrated pixel coordinates (u, v). The formula for ϵRP is given by Equation (5):

ϵRP = √((u − u′)^2 + (v − v′)^2). (5)
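The Euclidean distance defined above translates directly into code. A minimal Python/NumPy sketch (the corner coordinates below are hypothetical, for illustration only):

```python
import numpy as np

def reprojection_error(uv, uv_reproj):
    """Per-corner Euclidean distance between detected (u, v) and reprojected (u', v')
    pixel coordinates; the mean over all corners is the usual summary statistic."""
    uv, uv_reproj = np.asarray(uv, float), np.asarray(uv_reproj, float)
    return np.linalg.norm(uv - uv_reproj, axis=-1)

# Hypothetical detected vs. reprojected corners (pixels).
detected = np.array([[100.0, 200.0], [150.0, 210.0]])
reproj   = np.array([[100.3, 199.6], [150.0, 210.5]])
eps_rp = reprojection_error(detected, reproj)
print(eps_rp.mean())  # 0.5
```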

Design of Experiment (DoE)
A 2^k full-factorial design of experiment method with replication [27] is employed to analyze the influence of each factor and their interactions on ϵRP in the camera calibration. A 2^k full-factorial DoE is a type of experimental design used to study the effect of k independent factors at two levels each (i.e., the low and high bounds) on a response variable. This allows all possible combinations of the factors at all levels to be investigated, providing a comprehensive view of the entire experimental space. Figure 4 depicts the k = 3 factor design space, where each corner point of the cube represents a combination of the factors X_1, X_2, and X_3. With replication, i.e., by repeating the entire experiment with more than one operating condition, the errors or uncertainties introduced by the other variables can be estimated and the significance and interactions of the independent factors can be revealed.
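A 2^k factorial point set is straightforward to enumerate: each factor takes its low or high bound, and every combination is visited. A minimal Python sketch, using the ±3 cm displacement bounds and factor names from this study:

```python
from itertools import product

def full_factorial_2k(bounds):
    """All 2^k corner points for factors given as {name: (low, high)}."""
    names = list(bounds)
    levels = [bounds[n] for n in names]
    return [dict(zip(names, combo)) for combo in product(*levels)]

# Displacement factors at +/-3 cm, matching the bounds used in this study.
points = full_factorial_2k({"D": (-3, 3), "H": (-3, 3), "V": (-3, 3)})
print(len(points))  # 2^3 = 8
```

For method C, passing four factors instead of three would yield the 2^4 = 16 combinations used later in the paper.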

The factors involved in this study can be divided into two categories: orientation factors (R, P, and Y) and displacement factors (D, H, and V). Given that Zhang's method requires a minimum of three sets of images for calibration, specific variations must be incorporated into each experimental set. Consequently, the calibration experiment must be designed to exclude some factors to accommodate these variations. In this study, the experiments are configured as follows: Method A employs displacement factors for calibration while introducing variations in orientation. Method B utilizes orientation factors for calibration and introduces variations in displacement. Method C incorporates the two most significant factors from both the orientation and displacement categories while introducing variations to the remaining factors.

The bounds for the displacement and orientation factors are set at ±3 cm and ±10°, respectively. These displacement boundaries are designed to ensure that the checkered board stays within the field of view (FOV) of the camera, considering the minimum distance between the board and the camera. Upward, leftward, and forward displacements are categorized as positive, while downward, rightward, and backward displacements are considered negative. As for the orientation limits, the specific values are chosen based on experimental data to guarantee that all corners of the images captured by the camera can be detected using Zhang's method. Positive orientations are defined as up pitch, right yaw, and left roll, whereas negative rotations are classified as down pitch, left yaw, and right roll. The convention for sign determination is presented in Figure 5.
The experiment process starts by generating the 2^k full-factorial design points using the factors of interest. For methods A and B, each involves a factor group of three factors with two levels, resulting in a total of 2^3 = 8 combinations for each experiment. For method C, the two most important factors from each factor group are selected to construct the design points, resulting in 2^4 = 16 combinations.

Then, Latin Hypercube Sampling (LHS) is used to select 20 points from the remaining factors to introduce variations for Zhang's method. LHS ensures that the variations introduced by the remaining factors are well distributed within the multidimensional sampling space, allowing them to be treated as random variables in the analysis. The calibration process is presented in Figure 6.
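A basic stratified-permutation LHS can be sketched as follows (Python/NumPy; an illustrative implementation, not the sampler actually used in this study). Each factor's range is cut into n equal strata, exactly one point is drawn per stratum, and the strata are shuffled independently per factor:

```python
import numpy as np

def latin_hypercube(n, bounds, seed=0):
    """Draw n LHS points for factors given as {name: (low, high)}.

    Each factor's range is split into n equal strata; one uniform draw is taken
    inside each stratum, and the strata are permuted independently per factor,
    so the samples are well spread over the multidimensional space."""
    rng = np.random.default_rng(seed)
    k = len(bounds)
    # strata indices, shuffled per factor, plus a uniform offset inside each stratum
    strata = rng.permuted(np.tile(np.arange(n), (k, 1)), axis=1).T
    unit = (strata + rng.random((n, k))) / n
    lows = np.array([lo for lo, _ in bounds.values()])
    highs = np.array([hi for _, hi in bounds.values()])
    return lows + unit * (highs - lows)

# 20 variation points for the orientation factors at +/-10 degrees (method A's random group).
samples = latin_hypercube(20, {"R": (-10, 10), "P": (-10, 10), "Y": (-10, 10)})
print(samples.shape)  # (20, 3)
```

By construction, each of the 20 strata of every factor contains exactly one sample, which is the property that distinguishes LHS from plain uniform random sampling.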
The interaction-effects response model, Y, is built to describe the relationship between the factors and the response variable, i.e., ϵRP:

Y = β_0 + β_1 X_1 + β_2 X_2 + · · · + β_k X_k + Σ_{i<j} β_ij X_i X_j + ε. (6)

In Equation (6), β_0 is the intercept term, which is the baseline level of the response Y when all factors X_1, X_2, · · ·, X_k are zero. The terms β_1, β_2, · · ·, β_k are the model coefficients for the main effects of the factors X_1, X_2, · · ·, X_k, respectively. Each β_i represents the change in Y for a unit change in X_i when all other factors are held constant. A high absolute value of β_i indicates that its corresponding factor X_i has a strong influence on Y, and vice versa. The term Σ_{i<j} β_ij X_i X_j denotes the interaction effects between the factors X_i and X_j. Finally, ε is the random error term, describing the uncertainty or unexplained variability in the model.
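A model of the form in Equation (6) can be fitted by ordinary least squares. The sketch below (Python/NumPy) builds the design matrix with an intercept, main effects, and two-factor interactions, then recovers the coefficients from synthetic data; the β values and noise level are made up for illustration, not taken from this study:

```python
import numpy as np
from itertools import combinations

def design_matrix(X):
    """Columns: intercept, main effects X_i, and two-factor interactions X_i * X_j."""
    n, k = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(80, 3))                    # coded factor levels
beta_true = np.array([2.0, 0.9, -0.4, 0.2, 0.5, 0.0, 0.0])   # hypothetical effects
y = design_matrix(X) @ beta_true + rng.normal(0, 0.05, 80)   # epsilon: random error

beta_hat, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
print(np.round(beta_hat, 2))
```

With coded ±1 levels the columns are near-orthogonal, so the estimated main-effect and interaction coefficients land close to the true values, and their magnitudes rank factor influence as described above.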
The Bayesian Information Criterion (BIC) is used to quantify the trade-off between the goodness of fit of the response model and its complexity [28]:

BIC = k ln(n) − 2 ln(L̂), (7)

where L̂ is the maximized value of the likelihood function of the response model, n is the sample size, and k is the number of parameters estimated by the model. The term k ln(n) is the penalty for model complexity, discouraging overfitting by keeping k low. The term −2 ln(L̂) measures the goodness of fit of the model to the data, where smaller values indicate a better fit. In model selection, the model with the lowest BIC is considered the best. Analysis of variance (ANOVA) is used to determine which factors are statistically significant in the response. The R², adjusted R², and predicted R² values, along with the number of significant factors (N_f), are used to judge whether additional replication is needed to yield a stable response. In this study, MINITAB [29] is used to generate the factorial points and perform the DoE analysis. The experimental procedure is summarized in Figure 7.
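For least-squares fits with Gaussian errors, −2 ln(L̂) reduces to n ln(RSS/n) up to an additive constant that cancels when comparing models fitted to the same data. A minimal Python sketch of this comparison (the residuals below are hypothetical):

```python
import numpy as np

def bic_gaussian(residuals, k_params):
    """BIC = k ln(n) - 2 ln(L_hat) for a least-squares fit with Gaussian errors,
    where -2 ln(L_hat) reduces to n ln(RSS / n) up to a constant shared by all
    models on the same data."""
    n = len(residuals)
    rss = float(np.sum(np.square(residuals)))
    return k_params * np.log(n) + n * np.log(rss / n)

# Hypothetical comparison: a smaller model beats a bigger one when the extra
# parameters do not improve the fit enough to offset the k ln(n) penalty.
r_small = np.array([0.2, -0.1, 0.15, -0.25, 0.1, -0.05])
r_big   = r_small * 0.98  # only a marginal improvement in fit
print(bic_gaussian(r_small, 3) < bic_gaussian(r_big, 5))  # True
```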

Method A: Calibration with Displacement Factors
Method A is designed to calibrate with the displacement factors, i.e., D, H, and V, on ϵRP. LHS was employed to randomly sample 20 points within the range of ±10° for the 3 orientation factors, R, P, and Y. The experiment is conducted by adding one DoE replicate at a time until the R², adjusted R², and predicted R² values and the number of significant factors (N_f) of the response model in the successive replicate are consistent. As shown in Figure 8, only three replicates are needed to reach convergence. An additional replicate was used to verify the results. The total number of photos taken to yield convergence is 3 replicates × 20 LHS samples × 8 factorial points = 480. The resultant hierarchical ϵRP response model in uncoded units is presented in Equation (8).
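The replicate-by-replicate convergence check relies on the fit metrics stabilizing. A minimal Python/NumPy sketch of two of those metrics, R² and adjusted R² (predicted R² via the PRESS statistic is omitted; the response values below are hypothetical):

```python
import numpy as np

def r2_metrics(y, y_hat, n_params):
    """R^2 and adjusted R^2 for a fitted response model; adjusted R^2 penalizes
    extra model terms, so it stabilizes once added replicates stop changing the fit."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    return r2, r2_adj

# Hypothetical responses vs. model predictions.
y     = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])
print(r2_metrics(y, y_hat, n_params=2))
```

In the procedure above, the experiment stops adding replicates once these values, together with N_f, agree between successive replicates.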
The model is reduced to a linear regression. Within the limits of the factors, the minimum ϵRP can be achieved by minimizing the values of the factors. Figure 9a displays the normal plot of the standardized effects of the response model, which confirms that only the three main factors are significant for ϵRP. Among these factors, D exhibits the most influence on ϵRP for method A, followed by V and H. Additionally, the normal plot reveals that all three factors have positive effects on ϵRP, indicating that an increase in a factor's level will result in an increase in ϵRP.


[Flowchart of the proposed calibration procedure: split the parameters H, D, V and P, Y, R into design and random parameters; generate the 2^k full-factorial points for the design parameters; sample 20 points for the random parameters using LHS; capture the calibration images according to the factorial and LHS points; calibrate the camera using Zhang's method; generate the response surface model for εRP as a function of the design parameters; perform ANOVA to determine the significance of the design parameters.]
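The factorial-point step of the flowchart can be sketched as follows; the ±3 cm levels for D, H, and V are taken from the ranges quoted for the displacement factors and are illustrative here.

```python
from itertools import product

# 2^k full-factorial corner points for the design parameters, here the
# displacement factors D, H, V at their low/high levels (±3 cm).
levels = {"D": (-3.0, 3.0), "H": (-3.0, 3.0), "V": (-3.0, 3.0)}
factorial_points = [dict(zip(levels, combo)) for combo in product(*levels.values())]
print(len(factorial_points))  # 2^3 = 8 corner points
```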
The model is reduced to a linear regression. Within the limits of the factors, the minimum εRP can be achieved by minimizing the values of the factors. Figure 9a displays the normal plot of the standardized effects of the response model, which confirms that only the three main factors are significant for εRP. Among these factors, D exhibits the most influence on εRP for method A, followed by V and H. Additionally, the normal plot reveals that all three factors have positive effects on εRP, indicating that an increase in a factor's level will result in an increase in εRP.


Method B: Calibration with Orientation Factors
Method B calibrates with the three orientation factors, i.e., R, P, and Y, on εRP. LHS was employed to randomly sample 20 points within the range of ±3 cm for the 3 displacement factors, D, H, and V. The calibration is conducted by adding one DoE replicate at a time until the R², adjusted R², predicted R² values, and the number of significant factors (Nf) of the response model in the successive replicate are consistent. As shown in Figure 10, six replicates are needed to reach convergence. The total number of calibration images acquired is 960. The resultant hierarchical εRP response model in uncoded units is presented in Equation (9).
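The replicate-by-replicate stopping rule can be expressed as a small helper. The comparison tolerance is an assumption here, since the text only requires the successive values to be "consistent".

```python
# Stop adding DoE replicates once R^2, adjusted R^2, predicted R^2 and the
# number of significant factors (Nf) agree between successive replicates.
def converged(prev, curr, tol=0.01):
    """prev/curr: dicts with keys 'r2', 'r2_adj', 'r2_pred', 'n_sig'."""
    same_nf = prev["n_sig"] == curr["n_sig"]
    r2_close = all(abs(prev[k] - curr[k]) <= tol for k in ("r2", "r2_adj", "r2_pred"))
    return same_nf and r2_close

print(converged({"r2": 0.96, "r2_adj": 0.95, "r2_pred": 0.94, "n_sig": 4},
                {"r2": 0.90, "r2_adj": 0.89, "r2_pred": 0.88, "n_sig": 3}))  # False
```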
Within the limits of the factors, the minimum εRP can be achieved by minimizing R and maximizing P and Y. Figure 11a displays the normal plot of the standardized effects of the response model, which confirms that the three main orientation factors and an additional R·P interaction factor are significant for εRP. Among these factors, P exhibits the most influence on εRP for method B, while the R·P interaction factor ranks second, followed by R and Y. Additionally, factors P and Y are inversely correlated with εRP, suggesting that an increase in the factor's level will result in a decrease in εRP. In contrast, an increase in the level of factor R and of the R·P interaction will lead to an increase in εRP.
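The standardized effects behind such a normal plot come from simple contrasts between the high- and low-level corner means. A minimal sketch with illustrative responses (not the paper's data):

```python
import numpy as np
from itertools import product

# Coded ±1 corner points for the three orientation factors R, P, Y.
corners = np.array(list(product([-1, 1], repeat=3)), dtype=float)
# Illustrative εRP responses: linear in the coded factors plus an intercept.
eps = 0.09 + corners @ np.array([0.010, -0.015, -0.005])
# Main effect of a factor = mean response at high level - mean at low level.
effects = {name: float(eps[corners[:, i] == 1].mean() - eps[corners[:, i] == -1].mean())
           for i, name in enumerate(("R", "P", "Y"))}
print({k: round(v, 6) for k, v in effects.items()})
```

With these placeholder coefficients, R has a positive effect while P and Y are negative, mirroring the signs reported for method B.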
The residual plot in Figure 11b confirms that the residuals of the model follow a normal distribution with zero mean, indicating that the model effectively accounts for the systematic variation in the data, leaving only random variation unexplained. The factorial point where R = −10° and P = Y = 10° exhibits the lowest εRP, with a value of 0.0825 pixels. This result aligns with the predictions made by the linear regression model, where εRP = 0.0835 pixels is the minimum for this combination.
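Locating the best factorial point from a fitted first-order model amounts to evaluating it at the 2^k corners. The coefficients below are illustrative placeholders, not Equation (9), but their signs follow the correlations reported above (R positive; P and Y negative; R·P positive):

```python
from itertools import product

# Illustrative first-order εRP model with an R*P interaction term
# (placeholder coefficients; signs follow the reported correlations).
def predict(R, P, Y, b0=0.15, bR=0.002, bP=-0.003, bY=-0.001, bRP=0.0005):
    return b0 + bR * R + bP * P + bY * Y + bRP * R * P

corners = list(product((-10, 10), repeat=3))   # (R, P, Y) corner points
best = min(corners, key=lambda c: predict(*c))
print(best)  # (-10, 10, 10): minimize R, maximize P and Y
```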

Figure 11. Calibration of the ZED2i camera with method B. (a) Normal plot of the standardized effects; (b) εRP residual plot.

Method C: Calibration with Displacement and Orientation Factors
Unlike methods A and B, which calibrate each factor group on εRP separately, method C calibrates the combined displacement and orientation factors concurrently. The two most prominent factors from each factor group are selected. LHS was employed to randomly sample 20 points from the remaining factors within their respective ranges. The experiment is conducted by adding one DoE replicate at a time until the R², adjusted R², predicted R² values, and Nf of the response model are consistent in the successive replicate. As shown in Figure 12, 5 replicates are needed to reach convergence, resulting in a total of 1600 calibration images acquired. The non-hierarchical εRP response model in coded units is presented in Equation (10). Figure 13a presents the normal plot of the standardized effects of εRP. Two main orientation factors, R and P, and a displacement factor, D, are significant for εRP. Factors P and D are positively correlated to εRP, while R is negatively correlated to εRP. Additionally, eleven interaction factors appear to be significantly influential on εRP for method C.
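The image counts quoted for the three methods follow directly from replicates × LHS samples × 2^k factorial points, which a few lines verify:

```python
# Number of calibration images: replicates x LHS samples x 2^k corners.
def n_images(replicates, lhs_samples=20, k=3):
    return replicates * lhs_samples * 2 ** k

print(n_images(3))        # method A: 3 x 20 x 8  = 480
print(n_images(6))        # method B: 6 x 20 x 8  = 960
print(n_images(5, k=4))   # method C: 5 x 20 x 16 = 1600 (4 design factors)
```

The 1600 images of method C are roughly 10% more than the 480 + 960 = 1440 images that methods A and B need together.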
The residual plot in Figure 13b confirms that the residuals of the model follow a normal distribution with zero mean, indicating that the model effectively accounts for the systematic variation in the data, leaving only random variation unexplained. The minimum εRP obtained from Equation (10) is 0.0622 pixels, and the corresponding R, P, D, and V values are −10°, +10°, −3 cm, and −3 cm, respectively. This result aligns with the measurement, where εRP = 0.0638 pixels is the minimum for this combination. Interestingly, the R, P, D, and V values that yield the minimum εRP are also consistent with the values for methods A and B.

Comparisons of Methods A, B, and C
Two additional cameras, D435i and HBVCAM-W202011HD V33, are used to assess the calibration performance of methods A, B, and C. For method A, which uses only displacement factors for camera calibration, it was found that factors D, H, and V are linearly correlated to εRP for all three cameras involved in this study. Factor D is the most influential factor on εRP for all cameras, followed by V and H for the ZED2i camera, and H and V for both the D435i and HBVCAM cameras. However, a closer examination of Figure 9a reveals that the H and V factors are nearly equally important. For method B, which calibrates using the orientation factors, it was observed that factors R and P are more influential than Y on εRP for all cameras. Therefore, factors D and H or V from the displacement group and factors R and P from the orientation group are recommended for calibration with method C.
Table 2 compares the calibration results for the ZED2i camera using methods A, B, and C. It is evident that combining the two most significant factors, which yield the minimum εRP for methods A and B, respectively, will also result in the minimum εRP for method C. The results also show that a total of 1600 images are needed for calibration with method C, while methods A and B require only 1440 images in total, about 10% less than method C.
Table 2. Comparison of ZED2i camera calibration using methods A, B, and C.
The focal lengths in the x and y directions (fx, fy), the principal point (cx, cy), and the radial distortion coefficients (k1, k2, k3) of the ZED2i camera are presented in Table 3. The tangential distortion coefficients (p1, p2) are set to zero in this study. Zhang's method is used to benchmark the performance of the calibration results. To improve the reliability of Zhang's calibration, the calibration is repeated five times, each with 20 sets of images of the randomly positioned calibration target.
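For reference, the benchmark metric εRP is the RMS distance in pixels between detected and reprojected points. A minimal NumPy sketch for an ideal pinhole camera (lens distortion omitted; the intrinsic values are arbitrary stand-ins, not the ZED2i's):

```python
import numpy as np

def reprojection_error(obj_pts, img_pts, K, R, t):
    """RMS distance (pixels) between detected and reprojected points."""
    cam_pts = obj_pts @ R.T + t        # world -> camera coordinates
    proj = cam_pts @ K.T               # apply intrinsic matrix
    proj = proj[:, :2] / proj[:, 2:3]  # perspective divide
    return float(np.sqrt(np.mean(np.sum((proj - img_pts) ** 2, axis=1))))

K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
obj = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0], [0.0, 1.0, 5.0]])
img = obj @ K.T
img = img[:, :2] / img[:, 2:3]         # perfect projection (R = I, t = 0)
err = reprojection_error(obj, img, K, np.eye(3), np.zeros(3))
print(err)  # 0.0 for a perfect fit
```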
The average εRP values obtained are 0.0901, 0.0832, and 0.0858 pixels for cameras ZED2i, D435i, and HBVCAM-W202011HD V33, respectively. The results are presented in Figure 14.
Figure 14 shows that methods A, B, and C are all able to reduce εRP. For the ZED2i, D435i, and HBVCAM cameras, method A reduces their respective εRP by 20%, 38%, and 16% compared to Zhang's method. For method B, the εRP reductions for the three cameras are 8%, 31%, and 33%. The average εRP reductions by methods A and B are 25% and 24%, respectively, suggesting that both methods are equally effective. On the other hand, method C produces a more consistent calibration result.
The εRP reductions for the three cameras are 31%, 36%, and 34%, respectively. Figure 15 shows the modeling performance for the three cameras using methods A, B, and C. The results indicate that the proposed 2^k full-factorial DoE is effective in modeling εRP, as most of the models achieve R², adjusted R², and predicted R² values of more than 95%, except for method B for the ZED2i camera, where these values are less than 75%. A closer examination of the εRP residuals for methods A, B, and C (see Figures 9b, 11b and 13b) reveals that the data for method B (Figure 11b) contain high variability. More replications can be carried out to improve its calibration accuracy in this case. However, method B is still able to capture the underlying significant factors and their interactions for the ZED2i camera.
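The R² and adjusted R² figures quoted above follow the standard OLS definitions; a small sketch with illustrative data (predicted R², which is PRESS-based, is omitted for brevity):

```python
import numpy as np

def r2_scores(y, y_hat, n_params):
    """Plain and adjusted R^2 for a model with n_params predictors."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = float(np.sum((y - y_hat) ** 2))     # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    n = len(y)
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    return r2, r2_adj

r2, r2_adj = r2_scores([0.08, 0.09, 0.10, 0.11],
                       [0.081, 0.089, 0.101, 0.109], n_params=1)
print(r2 > 0.95 and r2_adj > 0.9)  # True for this near-perfect fit
```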
Interestingly, the factorial point with D = H = V = −3 cm exhibited the lowest εRP for all cameras calibrated with method A. Similarly, the factorial point with R = −10° and P = Y = 10° resulted in the minimum εRP for all cameras calibrated with method B. For method C, the minimum εRP, obtained at the factorial point with R = −10°, P = 10°, and D = V = −3 cm, is consistent with the results from methods A and B. Since the baseline of each camera was used to determine the default depth distance for calibration to ensure the checkered board is within its FOV, it has no impact on εRP.


Conclusions
This paper adopts Zhang's camera calibration method, combined with the 2^k full-factorial DoE and LHS methods, to minimize εRP in camera calibration. In general, the results of the experiments conducted for methods A, B, and C provide valuable insights into the factors affecting εRP in the calibration process.
For method A, which used only displacement factors for camera calibration, it was found that factors D, H, and V are linearly correlated to εRP for all three cameras involved in this study. Factor D was found to be the most influential factor on εRP for all cameras, followed by V and H for the ZED2i camera, and H and V for the D435i and HBVCAM cameras. However, a closer examination of Figure 9a reveals that the H and V factors are nearly equally important. On average, method A can reduce εRP by 25% compared to Zhang's method.
On the other hand, for method B, which calibrates using the orientation factors, it was observed that factors R and P are more influential than Y on εRP for all cameras. The factorial point with R = −10° and P = Y = 10° resulted in the minimum εRP. The findings were consistent with the predictions made by the response models, highlighting the importance of controlling the orientation factors to minimize εRP. On average, method B can reduce εRP by 24% compared to Zhang's method.
For method C, where the two most influential factors from the displacement and orientation groups were selected for calibration, the minimum εRP was obtained at the factorial point with R = −10°, P = 10°, and D = V = −3 cm; this is consistent with the results from methods A and B. Although the method requires 10% more images compared to methods A and B combined, method C can produce a consistent minimum εRP with a reduction of more than 30% compared to Zhang's method.
Comparing the results of methods A, B, and C, it is evident that combining the two most significant factors from methods A and B also results in the minimum εRP for method C. Specifically, the combination of orientation factors R and P and displacement factors D and V or H achieves the minimum εRP value for method C. Furthermore, it was found that the calibration process could be conducted separately and more efficiently by using the two factor groups independently to achieve more than 20% improvement in εRP compared to Zhang's method alone. This approach requires fewer calibration images, about 70% and 40% fewer for methods A and B, respectively, compared to method C. For applications where a lower value of εRP is required, method C is recommended.
Systematic camera calibration plays an important role in 3D reconstruction as it greatly influences the accuracy and reliability of the resulting 3D models. Accurate calibration ensures the precise determination of both the intrinsic and extrinsic parameters of a camera. This precision enables the correct mapping of 2D image data to real-world 3D coordinates. Optimizing camera calibration enhances the geometric fidelity of 3D reconstruction, paving the way for a wide range of applications in fields like computer vision, robotics, and augmented reality. Thus, it serves as the cornerstone underpinning the entire process of translating 2D images into accurate 3D reconstructions.
Overall, these findings contribute to a better understanding of the factors affecting εRP in calibration processes, enabling more efficient and effective calibration procedures in various applications. The proposed methods can be used to improve the calibration accuracy of stereo cameras for applications in object detection and ranging. However, the proposed systematic calibration experiment involves a substantial amount of repetitive testing and the capture of numerous checkered-board photos, which consumes significant manpower and time. Additionally, camera distortions have a pronounced impact on camera calibration, as they alter the geometric properties of camera imaging and thus the precise mapping between pixel coordinates and real-world coordinates. These two limiting factors will be explored in depth in future research.