Article

Automatic Camera Calibration Using Active Displays of a Virtual Pattern

1 College of Electrical and Information Engineering, Hunan University, Changsha 410082, China
2 National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, Changsha 410082, China
3 College of Information Engineering, Xiangtan University, Yuhu District, Xiangtan 411105, China
* Authors to whom correspondence should be addressed.
Sensors 2017, 17(4), 685; https://doi.org/10.3390/s17040685
Submission received: 12 January 2017 / Revised: 16 March 2017 / Accepted: 18 March 2017 / Published: 27 March 2017
(This article belongs to the Section Physical Sensors)

Abstract

Camera calibration plays a critical role in 3D computer vision tasks. The most commonly used calibration method employs a planar checkerboard and can be performed almost fully automatically. However, it requires the user to move either the camera or the checkerboard during the capture step. This manual operation is time consuming and makes the calibration results unstable. To solve these problems caused by manual operation, this paper presents a fully automatic camera calibration method that uses a virtual pattern instead of a physical one. The virtual pattern is actively transformed and displayed on a screen so that the control points of the pattern can be uniformly observed in the camera view. The proposed method estimates the camera parameters from point correspondences between 2D image points and the virtual pattern. The camera and the screen are fixed during the whole process; therefore, the proposed method does not require any manual operations. The performance of the proposed method is evaluated through experiments on both synthetic and real data. Experimental results show that the proposed method achieves stable results and that its accuracy is comparable to that of the standard method by Zhang.

1. Introduction

Camera calibration is the first step in 3D computer vision, which recovers metric information from 2D images. There are two types of approaches to calibration: photogrammetric calibration uses both 2D information and knowledge of the scene, such as the coordinates of 3D points, the shape of reference objects, the direction of 3D lines, etc.; self-calibration requires no such knowledge, only 2D information. Generally speaking, the former approaches give more stable and accurate calibration results than the latter because using this knowledge reduces the number of parameters. The method proposed in this paper belongs to the photogrammetric approaches.
The standard photogrammetric calibration is Zhang’s method [1], which uses a planar pattern called a chessboard or checkerboard, even though many methods have been proposed that use perpendicular planes [2,3], circles [4,5], spheres [6,7], and vanishing points [8,9]. The merits of Zhang’s method are its ease of use and its extensibility: the only requirements are a camera and a sheet of paper on which the pattern is printed. Pattern images are captured by manually moving either the camera or the plane. Then, the camera parameters are estimated by decomposing the homography between the 3D points on the plane and their 2D projections on the image. The basic idea of Zhang’s method applies not only to single-camera calibration, but also to multiple-camera calibration [10], projector–camera calibration [11], and depth sensor–camera calibration [12].
Most parts of Zhang’s conventional method, such as checkerboard detection, can be processed automatically by software [13,14]. However, a manual part remains at the capture step. This part is time consuming and makes the calibration result unstable. For stable calibration, many images under varied motions, generally ≥20 images, are required so that all detected points are distributed uniformly. Figure 1a shows an example in which all points from four images are scattered over the camera view. Otherwise, in a situation like Figure 1b, the conventional method does not give an accurate result in any trial.
To obtain well-distributed points, robust methods have been proposed for detecting partially occluded patterns [15,16,17]. With those methods, even if a part of the pattern is outside the camera view, the visible points, including those near the image boundary, help improve calibration accuracy. However, the manual part still exists.
This paper proposes a fully automatic calibration method to resolve the two problems caused by manual operation: the time consumption and the point distribution. Instead of a physical pattern, the proposed method uses a virtual pattern that is transformed in the virtual world coordinates and projected onto a fixed screen. The pattern on the screen is captured by a fixed camera; then, the proposed method performs calibration using point correspondences between the virtual 3D points and their 2D projections. The virtual pattern can be actively displayed on the screen so that all points are uniformly distributed, and the camera and the screen are fixed during the whole process. Therefore, the proposed method is stable and fully automatic.
This paper is organized as follows. Section 2 reviews Zhang’s conventional method starting from its basic equations; although the derivation is widely known, it is closely related to the proposed method. Section 3 describes the proposed method. In Section 4, experimental results on synthetic and real images are provided and discussed. Finally, Section 5 gives the conclusions.

2. Conventional Method

Zhang’s conventional calibration method estimates the intrinsic and the extrinsic parameters of a camera from images of a physical planar pattern. Figure 2a shows an overview where the camera is moved by hand to take the pattern images.
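For reference, this entire conventional pipeline (corner detection, closed-form initialization, and nonlinear refinement) is implemented in OpenCV. The following is a minimal Python sketch of it; the board dimensions and square size match the physical pattern used in Section 4.3, while the image folder and file pattern are illustrative assumptions.

```python
# Minimal sketch of the conventional pipeline (Zhang's method) using OpenCV.
# Board size and square size follow Section 4.3; image paths are assumed.
import glob
import cv2
import numpy as np

board_size = (16, 10)          # inner corners per row/column
square_size = 15.0             # mm, side of one square

# 3D points of the planar pattern in its own frame (z = 0)
obj_pts = np.zeros((board_size[0] * board_size[1], 3), np.float32)
obj_pts[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):   # assumed folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board_size)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(obj_pts)
        img_points.append(corners)

# Closed-form initialization + nonlinear refinement, all inside calibrateCamera
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMSE:", rms, "\nK:\n", K)
```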

2.1. Basic Equations

Assume that $n$ 3D points lie on the plane $z = 0$ and that the plane is captured by a pinhole camera $m$ times. In the $j$-th shot ($j \le m$), the relation between a 3D point $\mathbf{X}_i = [x_i, y_i, 0]^T$ ($i \le n$) and its 2D projection $\mathbf{m}_{ij} = [u_{ij}, v_{ij}]^T$ can be expressed by
$$\begin{bmatrix} \mathbf{m}_{ij} \\ 1 \end{bmatrix} \propto K \begin{bmatrix} R_j & \mathbf{t}_j \end{bmatrix} \begin{bmatrix} \mathbf{X}_i \\ 1 \end{bmatrix} \tag{1}$$
where $\propto$ denotes equality up to scale, $R_j$ is the $j$-th $3 \times 3$ rotation matrix, $\mathbf{t}_j$ is the $j$-th $3 \times 1$ translation vector, and $K$ is a $3 \times 3$ upper triangular matrix given by
$$K = \begin{bmatrix} f_x & s & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2}$$
with $[u_0, v_0]$ the principal point, $s$ the skewness, and $[f_x, f_y]$ the focal lengths along the $x$ and $y$ axes.
The third column of $R_j$ can be eliminated because $z = 0$. From Equation (1) we then have
$$\begin{bmatrix} \mathbf{m}_{ij} \\ 1 \end{bmatrix} \propto K \begin{bmatrix} \mathbf{r}_{j1} & \mathbf{r}_{j2} & \mathbf{t}_j \end{bmatrix} \begin{bmatrix} \mathbf{x}_i \\ 1 \end{bmatrix} \tag{3}$$
where $\mathbf{x}_i = [x_i, y_i]^T$ and $\mathbf{r}_{jk}$ denotes the $k$-th column of $R_j$. Furthermore, we can simplify this projection by using a $3 \times 3$ matrix
$$H_j \propto K \begin{bmatrix} \mathbf{r}_{j1} & \mathbf{r}_{j2} & \mathbf{t}_j \end{bmatrix}. \tag{4}$$
$H_j$, called a homography matrix, can be computed from at least four point correspondences $\mathbf{m}_{ij}$ and $\mathbf{X}_i$ [1]. Multiplying Equation (4) by $K^{-1}$ from the left and using the orthogonality of $R_j$, we obtain two constraints on $K$:
$$\mathbf{h}_{j1}^T B \mathbf{h}_{j2} = 0, \tag{5}$$
$$\mathbf{h}_{j1}^T B \mathbf{h}_{j1} - \mathbf{h}_{j2}^T B \mathbf{h}_{j2} = 0 \tag{6}$$
where $B \equiv K^{-T} K^{-1}$ and $\mathbf{h}_{jk}$ denotes the $k$-th column of $H_j$. $B$ is a $3 \times 3$ symmetric matrix with six components; however, its degrees of freedom are five due to the scale ambiguity.

2.2. Estimating Parameters

Equations (5) and (6) are linear in $B$. Therefore, we can obtain $B$ by solving
$$V \, \mathrm{vec}(B) = \mathbf{0}, \tag{7}$$
where $V$ is a $2m \times 6$ matrix and $\mathrm{vec}(\cdot)$ is a vectorization operator; note that $\mathrm{vec}(B)$ has dimension six. In the general case, where all the intrinsic parameters are unknown, $m \ge 3$ observations are required to obtain a unique solution for $\mathrm{vec}(B)$. After obtaining $B$, $K$ is extracted by decomposing $B$. More details on estimating the intrinsic parameters are described in [1] and [18].
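As a concrete illustration, the two constraints per view can be stacked into the matrix $V$ of Equation (7) and the null vector recovered by SVD. The sketch below assumes the common ordering $\mathrm{vec}(B) = [B_{11}, B_{12}, B_{22}, B_{13}, B_{23}, B_{33}]$; the function names are illustrative.

```python
import numpy as np

def v_ij(H, i, j):
    """Row vector v such that h_i^T B h_j = v . vec(B), with
    vec(B) = [B11, B12, B22, B13, B23, B33] (B symmetric)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([
        hi[0] * hj[0],
        hi[0] * hj[1] + hi[1] * hj[0],
        hi[1] * hj[1],
        hi[2] * hj[0] + hi[0] * hj[2],
        hi[2] * hj[1] + hi[1] * hj[2],
        hi[2] * hj[2]])

def solve_B(homographies):
    """Stack the two constraints (5)-(6) per view and solve V vec(B) = 0."""
    V = []
    for H in homographies:
        V.append(v_ij(H, 0, 1))                      # Equation (5)
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))      # Equation (6)
    # vec(B) is the right-singular vector of V with the smallest singular value
    _, _, Vt = np.linalg.svd(np.asarray(V))
    b = Vt[-1]
    return np.array([[b[0], b[1], b[3]],
                     [b[1], b[2], b[4]],
                     [b[3], b[4], b[5]]])
```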
Once $K$ is known, $R_j$ and $\mathbf{t}_j$ can be recovered as
$$R_j = \begin{bmatrix} \lambda K^{-1}\mathbf{h}_{j1} & \lambda K^{-1}\mathbf{h}_{j2} & \mathbf{r}_{j1} \times \mathbf{r}_{j2} \end{bmatrix}, \tag{8}$$
$$\mathbf{t}_j = \lambda K^{-1}\mathbf{h}_{j3} \tag{9}$$
with scale factor $\lambda = 1/\|K^{-1}\mathbf{h}_{j1}\| = 1/\|K^{-1}\mathbf{h}_{j2}\|$. Because of noisy data, $R_j = [\mathbf{r}_{j1}, \mathbf{r}_{j2}, \mathbf{r}_{j3}]$ derived from the above equations does not, in general, satisfy the properties of a rotation matrix. The best rotation matrix approximating a general $3 \times 3$ matrix can be estimated through singular value decomposition [18].
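A minimal sketch of this recovery step is given below. Extracting $K$ from $B$ via a Cholesky factorization is one possible route (an alternative to the closed-form expressions of [1,18]), and the final SVD gives the nearest rotation matrix; the function names and the sign-fixing heuristic are assumptions of this sketch.

```python
import numpy as np

def intrinsics_from_B(B):
    """Recover K from B ~ K^{-T} K^{-1} via Cholesky; the homogeneous
    solution is flipped in sign if needed so that B is positive definite."""
    if B[0, 0] < 0:
        B = -B
    L = np.linalg.cholesky(B)        # B = L L^T  =>  K^{-1} = L^T
    K = np.linalg.inv(L.T)
    return K / K[2, 2]               # normalize so that K[2, 2] = 1

def extrinsics_from_H(K, H):
    """Equations (8)-(9): recover R, t from one homography, then project
    the raw rotation onto the nearest rotation matrix with an SVD."""
    Kinv = np.linalg.inv(K)
    lam = 1.0 / np.linalg.norm(Kinv @ H[:, 0])
    r1 = lam * Kinv @ H[:, 0]
    r2 = lam * Kinv @ H[:, 1]
    r3 = np.cross(r1, r2)
    t = lam * Kinv @ H[:, 2]
    U, _, Vt = np.linalg.svd(np.column_stack([r1, r2, r3]))
    return U @ Vt, t
```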

2.3. Nonlinear Refinement

The parameters estimated above are not accurate because they are derived by linear methods that minimize an algebraic error and ignore lens distortion. To refine the linear estimate, a nonlinear optimization is carried out by minimizing the re-projection error:
$$\min_{K, R_j, \mathbf{t}_j} \sum_{j}^{m} \sum_{i}^{n} \left\| \mathbf{m}_{ij} - p(\mathbf{X}_i, K, R_j, \mathbf{t}_j, \mathbf{d}) \right\|^2 \quad \text{s.t.} \;\; R_j^T R_j = I, \; j \le m \tag{10}$$
where $I$ is the $3 \times 3$ identity matrix and $p$ is a projective function with lens distortion parameters $\mathbf{d}$.
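A sketch of this refinement using SciPy's Levenberg–Marquardt solver is shown below. Rotations are parameterized as Rodrigues vectors so the constraint $R_j^T R_j = I$ holds by construction, skew is kept at zero, and cv2.projectPoints stands in for the projective function $p(\cdot)$; the parameter packing and function names are illustrative.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def refine(K, dist, rvecs, tvecs, obj_points, img_points):
    """Minimize the re-projection error of Equation (10) with
    Levenberg-Marquardt over [fx, fy, u0, v0, k1, k2, rvecs, tvecs]."""
    m = len(obj_points)

    def residuals(x):
        fx, fy, u0, v0, k1, k2 = x[:6]
        Kx = np.array([[fx, 0, u0], [0, fy, v0], [0, 0, 1]])
        d = np.array([k1, k2, 0.0, 0.0])
        res = []
        for j in range(m):
            r = x[6 + 3 * j: 9 + 3 * j]                      # Rodrigues vector
            t = x[6 + 3 * m + 3 * j: 9 + 3 * m + 3 * j]      # translation
            proj, _ = cv2.projectPoints(obj_points[j], r, t, Kx, d)
            res.append((proj.reshape(-1, 2)
                        - img_points[j].reshape(-1, 2)).ravel())
        return np.hstack(res)

    x0 = np.hstack([[K[0, 0], K[1, 1], K[0, 2], K[1, 2]],
                    np.ravel(dist)[:2],
                    np.ravel(rvecs), np.ravel(tvecs)])
    return least_squares(residuals, x0, method="lm").x
```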

3. Proposed Method

As shown in Figure 2b, the proposed method uses a virtual calibration pattern instead of a physical one. The virtual pattern is transformed by pre-generated parameters and projected onto a screen; the pattern on the screen is then captured by a fixed camera. For stable calibration, the virtual pattern is actively displayed on the screen, and the pre-generated parameters ensure that all 2D projections of the corner points are uniformly distributed in the image. The proposed method estimates the intrinsic and extrinsic parameters from correspondences between the virtual world points and their 2D projections.
In contrast to the conventional method, the proposed method does not require moving either the camera or the pattern. Since the camera and the screen are fixed during the whole process, the proposed method can be implemented as a fully automatic calibration software.

3.1. Basic Equations

Let $P = K \begin{bmatrix} R & \mathbf{t} \end{bmatrix}$ be the projection from the screen to the camera and $P_j^s = K^s \begin{bmatrix} R_j^s & \mathbf{t}_j^s \end{bmatrix}$ be the projection from the virtual pattern to the screen, where $K^s$, $R_j^s$, and $\mathbf{t}_j^s$ are the screen's intrinsic and $j$-th extrinsic parameters, respectively.
Then, the projection between a virtual world space 3D point $\mathbf{X}_i$ and a 2D image point $\mathbf{m}_{ij}$ can be expressed by
$$\begin{bmatrix} \mathbf{m}_{ij} \\ 1 \end{bmatrix} \propto \begin{bmatrix} I & \mathbf{0} \end{bmatrix} \begin{bmatrix} P \\ \mathbf{0}^T \;\, 1 \end{bmatrix} \begin{bmatrix} P_j^s \\ \mathbf{0}^T \;\, 1 \end{bmatrix} \begin{bmatrix} \mathbf{X}_i \\ 1 \end{bmatrix} \tag{11}$$
where $\mathbf{0}$ is a $3 \times 1$ zero vector.
Let us consider the two projections separately. The first projection, by $P_j^s$, can be rewritten as
$$\begin{bmatrix} P_j^s \\ \mathbf{0}^T \;\, 1 \end{bmatrix} \begin{bmatrix} \mathbf{X}_i \\ 1 \end{bmatrix} = \begin{bmatrix} K^s & \mathbf{0} \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} R_j^s & \mathbf{t}_j^s \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X}_i \\ 1 \end{bmatrix} \tag{12}$$
$$= \begin{bmatrix} K^s & \mathbf{0} \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{r}_{j1}^s & \mathbf{r}_{j2}^s & \mathbf{t}_j^s \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x}_i \\ 1 \end{bmatrix} \tag{13}$$
$$= \begin{bmatrix} H_j^s \\ \mathbf{0}^T \;\, 1 \end{bmatrix} \begin{bmatrix} \mathbf{x}_i \\ 1 \end{bmatrix} \tag{14}$$
where $\mathbf{r}_{jk}^s$ denotes the $k$-th column of $R_j^s$ and $H_j^s = K^s \begin{bmatrix} \mathbf{r}_{j1}^s & \mathbf{r}_{j2}^s & \mathbf{t}_j^s \end{bmatrix}$. $K^s$ contains the screen's intrinsic parameters, which are preset in the calibration, and $R_j^s$ and $\mathbf{t}_j^s$ are the extrinsic parameters of the screen at the $j$-th capture. Since the virtual pattern is transformed by pre-generated parameters, $R_j^s$ and $\mathbf{t}_j^s$ are known. The second projection, by $P$, can be rewritten as
$$\begin{bmatrix} I & \mathbf{0} \end{bmatrix} \begin{bmatrix} P \\ \mathbf{0}^T \;\, 1 \end{bmatrix} = P \tag{15}$$
$$= K \begin{bmatrix} R & \mathbf{t} \end{bmatrix}. \tag{16}$$
Letting $\mathbf{h}_{jk}^s$ be the $k$-th column of $H_j^s$, from Equations (14) and (16) we can rewrite Equation (11) using a $3 \times 3$ homography:
$$\begin{bmatrix} \mathbf{m}_{ij} \\ 1 \end{bmatrix} \propto H_j \begin{bmatrix} \mathbf{x}_i \\ 1 \end{bmatrix} \tag{17}$$
where
$$H_j \propto K \begin{bmatrix} R\mathbf{h}_{j1}^s & R\mathbf{h}_{j2}^s & R\mathbf{h}_{j3}^s + \mathbf{t} \end{bmatrix}. \tag{18}$$
Similarly to the conventional method, given the virtual world space 3D points and their 2D image projections, the homography $H_j$ can be calculated using the same technique introduced in Zhang’s paper [1]. However, we cannot extract constraints from Equation (18) in the same way as Equations (5) and (6), since the form of $H_j$ is not identical. The proposed method instead uses ratio constraints on vector dot products in place of the orthogonality constraints.
Multiplying Equation (18) by $K^{-1}$ from the left, we obtain three relations from the first and second columns:
$$\|K^{-1}\mathbf{h}_{j1}\|^2 \propto \|\mathbf{h}_{j1}^s\|^2 \tag{19}$$
$$\|K^{-1}\mathbf{h}_{j2}\|^2 \propto \|\mathbf{h}_{j2}^s\|^2 \tag{20}$$
$$(K^{-1}\mathbf{h}_{j1})^T (K^{-1}\mathbf{h}_{j2}) \propto \mathbf{h}_{j1}^{sT} \mathbf{h}_{j2}^s \tag{21}$$
where $\mathbf{h}_{jk}$ denotes the $k$-th column of $H_j$. Taking the ratio of any two of the above relations yields one constraint. For example, from Equations (19) and (20) we have
$$\|\mathbf{h}_{j2}^s\|^2 \, \|K^{-1}\mathbf{h}_{j1}\|^2 - \|\mathbf{h}_{j1}^s\|^2 \, \|K^{-1}\mathbf{h}_{j2}\|^2 = 0. \tag{22}$$
There are three possible combinations, but only two of them are linearly independent. Thus, we obtain two constraints by taking any two of them, e.g.,
$$\|\mathbf{h}_{j2}^s\|^2 \, \mathbf{h}_{j1}^T B \mathbf{h}_{j1} - \|\mathbf{h}_{j1}^s\|^2 \, \mathbf{h}_{j2}^T B \mathbf{h}_{j2} = 0, \tag{23}$$
$$\mathbf{h}_{j1}^{sT} \mathbf{h}_{j2}^s \, \mathbf{h}_{j1}^T B \mathbf{h}_{j1} - \|\mathbf{h}_{j1}^s\|^2 \, \mathbf{h}_{j2}^T B \mathbf{h}_{j1} = 0 \tag{24}$$
with $B \equiv K^{-T} K^{-1}$. Note that $\mathbf{h}_{jk}$ and $\mathbf{h}_{jk}^s$ are known; only $B$ is unknown.
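These two constraints per view are again linear in $\mathrm{vec}(B)$, so they can be stacked and solved exactly as in Section 2.2. The sketch below assumes the same $\mathrm{vec}(B)$ ordering as in the earlier sketch; the function names are illustrative.

```python
import numpy as np

def _v(H, a, b):
    """Row such that h_a^T B h_b = _v . vec(B), with vec(B) ordered as
    [B11, B12, B22, B13, B23, B33] (same convention as the Section 2 sketch)."""
    x, y = H[:, a], H[:, b]
    return np.array([x[0]*y[0], x[0]*y[1] + x[1]*y[0], x[1]*y[1],
                     x[2]*y[0] + x[0]*y[2], x[2]*y[1] + x[1]*y[2], x[2]*y[2]])

def solve_B_proposed(H_list, Hs_list):
    """Stack constraints (23) and (24) for every pair (H_j, H_j^s) and solve
    V vec(B) = 0 by SVD, exactly as in the conventional case."""
    V = []
    for H, Hs in zip(H_list, Hs_list):
        n1 = Hs[:, 0] @ Hs[:, 0]          # ||h_j1^s||^2
        n2 = Hs[:, 1] @ Hs[:, 1]          # ||h_j2^s||^2
        c12 = Hs[:, 0] @ Hs[:, 1]         # h_j1^s . h_j2^s
        V.append(n2 * _v(H, 0, 0) - n1 * _v(H, 1, 1))    # Equation (23)
        V.append(c12 * _v(H, 0, 0) - n1 * _v(H, 1, 0))   # Equation (24)
    b = np.linalg.svd(np.asarray(V))[2][-1]
    return np.array([[b[0], b[1], b[3]],
                     [b[1], b[2], b[4]],
                     [b[3], b[4], b[5]]])
```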

3.2. Estimating Parameters

As shown in Equations (23) and (24), we obtain two constraints from each $H_j$. Therefore, we can solve for $B$ and extract $K$ in the same manner as in the conventional method. On the other hand, a new approach is required for estimating the extrinsic parameters.
As soon as $K$ is computed, a linear method can be employed to solve for the extrinsic parameters. Stacking $K^{-1} H_j$ and $H_j^s$ horizontally for $j \le m$, we have
$$\underbrace{\begin{bmatrix} K^{-1} H_1 & \cdots & K^{-1} H_m \end{bmatrix}}_{C} = \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \underbrace{\begin{bmatrix} \mu_1 H_1^s & \cdots & \mu_m H_m^s \\ \mathbf{0}^T \;\, 1 & \cdots & \mathbf{0}^T \;\, 1 \end{bmatrix}}_{D} \tag{25}$$
where $\mu_j = \|K^{-1}\mathbf{h}_{j1}\| / \|\mathbf{h}_{j1}^s\|$ is a scaling factor.
Then, Equation (25) can be solved linearly by
$$\begin{bmatrix} R & \mathbf{t} \end{bmatrix} = C D^T \left( D D^T \right)^{-1}. \tag{26}$$
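A sketch of this linear solution is given below; the final SVD projection onto a valid rotation matrix is an extra step added here for noisy data, following the remark in Section 2.2, and is not part of Equation (26) itself. Function names are illustrative.

```python
import numpy as np

def extrinsics_proposed(K, H_list, Hs_list):
    """Build C and D of Equation (25) and solve [R t] = C D^T (D D^T)^{-1}
    (Equation (26)); the rotation is then projected onto SO(3) with an SVD."""
    Kinv = np.linalg.inv(K)
    C_blocks, D_blocks = [], []
    for H, Hs in zip(H_list, Hs_list):
        mu = np.linalg.norm(Kinv @ H[:, 0]) / np.linalg.norm(Hs[:, 0])
        C_blocks.append(Kinv @ H)
        D_blocks.append(np.vstack([mu * Hs, [0.0, 0.0, 1.0]]))
    C = np.hstack(C_blocks)                   # 3 x 3m
    D = np.hstack(D_blocks)                   # 4 x 3m
    Rt = C @ D.T @ np.linalg.inv(D @ D.T)     # 3 x 4
    U, _, Vt = np.linalg.svd(Rt[:, :3])       # nearest rotation matrix
    return U @ Vt, Rt[:, 3]
```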

3.3. Nonlinear Refinement

Nonlinear refinement must be applied to the linear estimate for higher accuracy. The nonlinear optimization for the proposed method can be written as
$$\min_{K, R, \mathbf{t}} \sum_{j}^{m} \sum_{i}^{n} \left\| \mathbf{m}_{ij} - p(\mathbf{X}_i, K^s, R_j^s, \mathbf{t}_j^s, K, R, \mathbf{t}, \mathbf{d}) \right\|^2 \quad \text{s.t.} \;\; R^T R = I, \tag{27}$$
where $p(\mathbf{X}_i, K^s, R_j^s, \mathbf{t}_j^s, K, R, \mathbf{t}, \mathbf{d})$ is the projection of point $\mathbf{X}_i$ onto the image, $\mathbf{d} = [k_1, k_2]$ denotes the lens distortion coefficients, and all the screen parameters $K^s$, $R_j^s$, and $\mathbf{t}_j^s$ are known. In our implementation, this optimization is also solved using the Levenberg–Marquardt algorithm [19,20].
Distortion coefficients are estimated based on Zhang's method [18] and included while minimizing Equation (27). For simplicity, only the first two radial distortion coefficients $k_1$ and $k_2$ are considered, since the distortion function is dominated mainly by the radial components, especially the first term [2]. The relationship between the distortion-free point $(x, y)$ and the distorted point $(x_d, y_d)$ is given by
$$x_d = x \left( 1 + k_1 r^2 + k_2 r^4 \right), \tag{28}$$
$$y_d = y \left( 1 + k_1 r^2 + k_2 r^4 \right) \tag{29}$$
where $r^2 = x^2 + y^2$. Readers can refer to [3] for more details on the lens distortion model and how to compensate for lens distortion.
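For illustration, the radial model of Equations (28) and (29) can be applied as in the short sketch below; whether $(x, y)$ are normalized or pixel coordinates follows the convention of [3,18], so the helper simply operates on whatever distortion-free coordinates it is given. The function name is illustrative.

```python
import numpy as np

def distort(xy, k1, k2):
    """Apply the radial model of Equations (28)-(29) to an array of
    distortion-free coordinates with shape (..., 2)."""
    x, y = xy[..., 0], xy[..., 1]
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return np.stack([x * factor, y * factor], axis=-1)
```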

3.4. Summary

The procedure of the proposed method is very similar to the conventional one and includes the following steps:
  • Place the camera in front of the screen and adjust its position and orientation;
  • Fix the camera once the whole camera view is covered by the screen and the view contains as large a portion of the screen as possible;
  • Take a few images of the screen while the virtual checkerboard is being transformed and displayed;
  • Detect the corner points in the images;
  • Estimate the focal lengths $f_x$ and $f_y$, the principal point $[u_0, v_0]$, the skewness $s$, the rotation matrix $R$, and the translation vector $\mathbf{t}$ using the closed-form solution stated in Section 3.2;
  • Refine intrinsic and extrinsic parameters, including lens distortion coefficients, by nonlinear optimization as described in Section 3.3.

4. Experiments and Discussion

To demonstrate the validity and robustness of the proposed method, experiments on both synthetic data and real data have been conducted.

4.1. Experiment Setup

Before starting the calibration, the camera to be calibrated needs to be set up so that the whole camera view is covered by a screen. To start with, the screen is placed within the working distance of the camera and the camera looks straight at the screen. Ideally, with a screen of appropriate size and the optical axis of the camera intersecting the screen perpendicularly at its center, the aforementioned condition should be satisfied. In practice, this setup may not suffice for a real camera, since its principal point is usually not at the center of the image and a real camera has lens distortion. Therefore, we still need to manually adjust the orientation and position of the camera, and then fix the camera once its entire image is covered by the screen.
Then, a set of orientation and position parameters is generated; these are used to transform the virtual pattern in the experiments. The orientation of the pattern is generated as follows: the pattern is initially parallel to the screen; a rotation axis is randomly chosen from a uniform distribution on the sphere; the pattern is then rotated around that axis by an angle θ between 40° and 50°. This range of θ is chosen because it achieves the best performance according to the experimental results in [18]. The position of the pattern is expressed by the 3D coordinates of its center point T = [x, y, z] in the screen's coordinate system. To generate appropriate positions for the pattern, the following scheme is adopted. The pattern and the screen are initially on the same plane, with the center of the pattern coinciding with the center of the screen. The pattern is then moved along the positive Z axis; when the projection of the pattern on the screen is about 1/4 of the screen size, the value of z is fixed. The values of x and y are determined by randomly choosing points on the plane Z = z within the screen's field of view. Given a sufficient number (≥20) of patterns, the 2D projections of the corner points scatter over the whole image and a uniform distribution is achieved.
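A sketch of this pose-generation scheme is shown below; the function name, the rng argument, and the x/y ranges (the screen's field of view at depth z, which the caller must supply) are assumptions of this sketch.

```python
import numpy as np

def random_pattern_pose(z, x_range, y_range, rng=np.random.default_rng()):
    """One virtual-pattern pose: a rotation about a random axis (uniform on
    the unit sphere) by 40-50 degrees, starting from the screen-parallel
    orientation, and a translation on the plane Z = z."""
    # Uniform random axis on the unit sphere
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    theta = np.deg2rad(rng.uniform(40.0, 50.0))
    # Rodrigues' formula: R = I + sin(t) [a]_x + (1 - cos(t)) [a]_x^2
    ax = np.array([[0, -axis[2], axis[1]],
                   [axis[2], 0, -axis[0]],
                   [-axis[1], axis[0], 0]])
    R = np.eye(3) + np.sin(theta) * ax + (1 - np.cos(theta)) * (ax @ ax)
    t = np.array([rng.uniform(*x_range), rng.uniform(*y_range), z])
    return R, t
```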

4.2. Experiment on Synthetic Images

In the computer simulation, a simulated camera is created with the following intrinsic parameters: $f_x = 1417$, $f_y = 1420$, $u_0 = 942$, $v_0 = 547$, $s = 0$, $k_1 = -0.0806$, $k_2 = -0.0393$. The screen, which has a 1920 × 1080 resolution, is described by an ideal pinhole model with a focal length of 2500 pixels and a principal point located at the center of the screen. The virtual checkerboard contains 16 × 10 = 160 corner points, and each square has a side of 100 units. To investigate the performance of the proposed method with respect to the noise level and the number of images of the calibration pattern, the following two experiments are conducted. Corner detection in the experiments uses the method developed by Vladimir Vezhnevets, which is integrated in OpenCV [21].
Performance regarding the noise level. To start with, virtual patterns with 20 different orientations and positions are synthesized. Noisy images are then created by adding Gaussian noise with mean $\mu = 0$ and standard deviation $\sigma$ to the projected image points. The noise level varies from $\sigma = 0.1$ to $\sigma = 1.5$. For each noise level, our method is tested in 100 independent trials and assessed by comparing the results with the ground truth. Figure 3a,b show the relative error for the focal length and the absolute error for the principal point, respectively. As can be seen in Figure 3, the average errors increase as the noise level rises, and the relationship between them is almost linear. At a noise level of $\sigma = 0.5$, which is larger than the noise typically encountered in practical calibration [18], the relative errors in the focal lengths $f_x$ and $f_y$ are less than 0.1%, and the absolute errors in the principal point coordinates $u_0$ and $v_0$ are around 1 pixel.
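A brief sketch of the noise injection and of the error metrics plotted in Figure 3 is given below; the function names are illustrative.

```python
import numpy as np

def add_projection_noise(ideal_points, sigma, rng=np.random.default_rng()):
    """Perturb ideal 2D projections with zero-mean Gaussian noise of
    standard deviation sigma (in pixels), as done for each noise level."""
    return [p + rng.normal(0.0, sigma, p.shape) for p in ideal_points]

def focal_relative_error(f_est, f_true):
    """Relative focal-length error, in percent (Figure 3a)."""
    return abs(f_est - f_true) / f_true * 100.0

def principal_point_absolute_error(c_est, c_true):
    """Absolute principal-point error, in pixels (Figure 3b)."""
    return abs(c_est - c_true)
```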
Performance regarding the number of images. This experiment explores how the number of images of the calibration pattern affects the performance of our method. Starting from two, we increase the number of images by one at a time until it reaches twenty. For each number, Gaussian noise ($\mu = 0$, $\sigma = 0.5$) is first added to the images, and calibration is then conducted 100 times with independent images. The errors are calculated from the calibration results and the ground truth data as in the previous experiment. The mean values of the errors are shown in Figure 4. The errors decrease and tend to stabilize as the number of images increases; note that they decrease significantly when the number increases from 2 to 3.

4.3. Experiments on Real Images

To test our method on real images, we use a 24-inch LCD monitor to display the virtual pattern. The parameters of the screen and the virtual pattern are the same as in the computer simulation. The camera to be calibrated is the color camera of a Microsoft Kinect for Windows V2 sensor. As shown in Figure 5, the camera is fixed on a tripod approximately 40 cm away from the screen, looking straight at the screen, so that the whole camera view is covered by the screen. Ten independent trials are performed with images of 1920 × 1080 resolution. In each trial, the virtual pattern is transformed using parameters randomly chosen from the synthetic data and shown on the monitor; meanwhile, the screen is captured by the real camera, and 20 different images are used in each calibration. Figure 6a shows a sample image captured in this experiment. The images are collected automatically by a computer program, and the screen and the camera are fixed during the whole process. We use the same corner detection method as in the synthetic experiments.
For comparison, we also calibrated the real camera using a physical checkerboard. The pattern is printed by a high-quality printer and attached to a glass board of guaranteed flatness. It contains the same number of squares as the virtual pattern, and each square is 15 mm × 15 mm. The camera is fixed on a tripod, and images are collected while the checkerboard is moved manually. A sample image used in this experiment is shown in Figure 6b. Ten independent trials are performed, with 20 images each.
Detailed calibration results are reported in Table 1 and Table 2. In the first 10 lines of each table, each line shows the result obtained in an independent trial: the six camera parameters and the root mean square error (RMSE). Here, the RMSE is defined as the root mean square distance between every detected corner point and the corresponding point re-projected using the estimated parameters. The mean and standard deviation of the estimated parameters are listed in the last two lines. As can be seen in Table 1, the results obtained using the proposed method are very consistent with each other and the standard deviations of all parameters are small, which suggests that our method is robust. In contrast, the results of the conventional method are not as stable. Since we do not have ground truth data for the real-world experiment, the estimated camera parameters are evaluated based on the re-projection error. With the proposed method and the conventional one, the mean RMSE values are 0.1855 and 0.2337 pixels, respectively, and the lowest RMSE, 0.1460, is achieved by the proposed method. We choose the best calibration results obtained by our method and by the conventional method, and plot the localization errors of the control points in Figure 7. The results indicate that the proposed method outperforms the conventional one in terms of stability and accuracy in real-world experiments.

4.4. Discussion

The above experiments show not only the practicality but also the advantages of the proposed method. In the conventional calibration method, a key step is to capture images while manually moving a physical calibration pattern; this step usually takes several minutes. In contrast, our method takes much less time to prepare the calibration pattern and collect high-quality data, and the whole procedure is completed fully automatically within one minute.
The use of a virtual pattern affects the calibration result in the following respects. First, the virtual pattern is transformed by a computer program so that all the control points are uniformly distributed in the image; well-distributed points usually lead to more stable and accurate calibration results. Second, since the screen is fixed during the calibration, motion blur is eliminated and the control points can be localized more precisely. In contrast, in a blurry image taken by a moving camera, such as Figure 8, the observed feature locations may deviate from the actual ones. Even though the checkerboard pattern can still be detected by some algorithms (e.g., OpenCV's checkerboard detection algorithm [21]), the uncertainty in the localization of the control points yields inaccurate correspondences, which degrade the calibration performance.
However, the proposed method also has some limitations. An essential requirement is that the entire camera view be covered by a screen, and in some cases this is difficult to satisfy. For a camera with a large working distance or a wide field of view, a large screen, e.g., a flat-screen TV, is necessary to cover the entire image. Since screen size cannot be increased without limit, our method may not be applicable if the camera has a very large working distance or a very wide field of view. The proposed method also does not work well in certain applications, such as high-precision visual measurement, where the camera to be calibrated has a very short working distance or a very high resolution. In this case, the resolution of the camera is usually higher than that of the screen; the image of the screen is therefore discretized, and corner point detection and localization can be problematic. Although the effect of discretization can be reduced by using a high-resolution screen, it still affects the accuracy of calibration unless it is completely eliminated.

5. Conclusions

The conventional calibration technique using a 2D planar object is widely used due to its ease of use. Although many efforts have been made to make the whole calibration procedure as automatic as possible, a manual part remains at the capture step, which takes a lot of time and makes the result unstable. In this paper, we proposed a fully automatic camera calibration method to resolve the issues brought about by manual operations. Different from the conventional method, we use a virtual pattern that is transformed in the virtual world coordinates and projected onto a fixed screen. The pattern shown on the screen is then captured by a fixed camera. Calibration is performed using point correspondences between the virtual 3D points and their 2D projections, and the solution for estimating the camera parameters is very similar to that of the conventional method.
Owing to the use of the virtual pattern, there is no need to manually adjust the position and orientation of a checkerboard during calibration. Moreover, the virtual pattern can be actively displayed on the screen so that all corner points are uniformly distributed. Once the camera and the screen are set up, they remain fixed during the whole calibration process. Thus, the proposed method can be fully automatic, and the problems caused by manual operation are resolved without loss of usability. Experimental results show that our method is more robust and accurate than the conventional method.

Acknowledgments

This work has been supported by the National Natural Science Foundation of China (Grant Nos. 61573134, 61573135), the National Key Technology Support Program (Grant No. 2015BAF11B01), the National Key Scientific Instrument and Equipment Development Project of China (Grant No. 2013YQ140517), the Key Research and Development Project of the Science and Technology Plan of Hunan Province (Grant No. 2015GK3008), and the Key Project of the Science and Technology Plan of Guangdong Province (Grant No. 2013B011301014).

Author Contributions

The paper was a collaborative effort between the authors. Lei Tan, Yaonan Wang and Hongshan Yu proposed the idea of the paper. Lei Tan and Jiang Zhu implemented the algorithm, designed and performed the experiments. Lei Tan and Hongshan Yu analyzed the experimental results and prepared the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [Google Scholar] [CrossRef]
  2. Tsai, R.Y. A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses. IEEE J. Robot. Autom. 1987, 3, 323–344. [Google Scholar] [CrossRef]
  3. Heikkila, J.; Silven, O. A four-step camera calibration procedure with implicit image correction. In Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 17–19 June 1997; pp. 1106–1112. [Google Scholar]
  4. Chen, Q.; Wu, H.; Wada, T. Camera calibration with two arbitrary coplanar circles. In Computer Vision-ECCV 2004; Springer: New York, NY, USA, 2004; pp. 521–532. [Google Scholar]
  5. Bergamasco, F.; Cosmo, L.; Albarelli, A.; Torsello, A. Camera calibration from coplanar circles. In Proceedings of the 22nd International Conference on Pattern Recognition (ICPR), Stockholm, Sweden, 24–28 August 2014; pp. 2137–2142. [Google Scholar]
  6. Agrawal, M.; Davis, L.S. Camera calibration using spheres: A semi-definite programming approach. In Proceedings of the 2003 Ninth IEEE International Conference on Computer Vision, Nice, France, 13–16 October 2003; pp. 1–8. [Google Scholar]
  7. Wong, K.-Y.K.; Zhang, G.; Chen, Z. A Stratified Approach for Camera Calibration Using Spheres. IEEE Trans. Image Process. 2011, 20, 305–316. [Google Scholar]
  8. Caprile, B.; Torre, V. Using vanishing points for camera calibration. Int. J. Comput. Vis. 1990, 4, 127–139. [Google Scholar] [CrossRef]
  9. Radu, O.; Joaquim, S.; Mihaela, G.; Bogdan, O. Camera calibration using two or three vanishing points. In Proceedings of the 2012 Federated Conference on Computer Science and Information Systems (FedCSIS), Wroclaw, Poland, 9–12 September 2012; pp. 123–130. [Google Scholar]
  10. Li, B.; Heng, L.; Koser, K.; Pollefeys, M. A multiple-camera system calibration toolbox using a feature descriptor-based calibration pattern. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan, 3–7 November 2013; pp. 1301–1307. [Google Scholar]
  11. Moreno, D.; Taubin, G. Simple, accurate, and robust projector-camera calibration. In Proceedings of the 2012 2nd International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, Zurich, Switzerland, 13–15 October 2012; pp. 464–471. [Google Scholar]
  12. Raposo, C.; Barreto, J.P.; Nunes, U. Fast and accurate calibration of a kinect sensor. In Proceedings of the 2013 International Conference on 3DTV-Conference, Seattle, WA, USA, 29 June–1 July 2013; pp. 342–349. [Google Scholar]
  13. Rufli, M.; Scaramuzza, D.; Siegwart, R. Automatic detection of checkerboards on blurred and distorted images. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008; pp. 3121–3126. [Google Scholar]
  14. Donné, S.; De Vylder, J.; Goossens, B.; Philips, W. MATE: Machine Learning for Adaptive Calibration Template Detection. Sensors 2016, 16, 1858. [Google Scholar] [CrossRef] [PubMed]
  15. Pilet, J.; Geiger, A.; Lagger, P.; Lepetit, V.; Fua, P. An all-in-one solution to geometric and photometric calibration. In Proceedings of the 2006 Fifth IEEE/ACM International Symposium on Mixed and Augmented Reality, Santa Barbara, CA, USA, 22–25 October 2006; pp. 69–78. [Google Scholar]
  16. Atcheson, B.; Heide, F.; Heidrich, W. CALTag: High Precision Fiducial Markers for Camera Calibration. Vis. Model. Vis. 2010, 10, 41–48. [Google Scholar]
  17. Oyamada, Y. Single Camera Calibration using partially visible calibration objects based on Random Dots Marker Tracking Algorithm. In Proceedings of the IEEE ISMAR 2012 Workshop on Tracking Methods and Applications (TMA), Atlanta, GA, USA, 5–8 November 2012. [Google Scholar]
  18. Zhang, Z. A Flexible New Technique for Camera Calibration; Technical Report MSR-TR-98-71; Microsoft Research: Redmond, WA, USA, 1998. [Google Scholar]
  19. Marquardt, D.W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 1963, 11, 431–441. [Google Scholar] [CrossRef]
  20. Levenberg, K. A method for the solution of certain problems in least squares. Q. Appl. Math. 1944, 2, 164–168. [Google Scholar] [CrossRef]
  21. Vezhnevets, V. OpenCV Calibration Object Detection, Part of the Free Open-Source OpenCV Image Processing Library. Available online: http://graphicon.ru/oldgr/en/research/calibration/opencv.html (accessed on 20 December 2016).
Figure 1. Distribution of Detected Points. (a) Detected points are distributed uniformly across the image; (b) Detected points are mainly located at the center part of the image.
Figure 2. Overview of the conventional method and the proposed method. (a) The conventional method; (b) The proposed method.
Figure 3. Errors regarding the noise level of the image points. (a) Relative error for focal length; (b) Absolute error for principal point.
Figure 4. Errors regarding the number of images of the calibration pattern. (a) Relative error for focal length; (b) Absolute error for principal point.
Figure 5. Setup of the real experiment.
Figure 6. Two calibration images captured in real experiment. (a) Image of a virtual checkerboard shown on screen; (b) Image of a physical checkerboard.
Figure 7. Scatter plots for the RMSE between the detected corner points and the re-projected ones with the estimated calibration parameters. (a) Localization errors by the proposed method; (b) Localization errors by the conventional method.
Figure 8. A blurry image captured in real experiment.
Table 1. Calibration result for real images using the proposed method.

            f_x        f_y        u_0       v_0       k_1     k_2      RMSE
Trial 1     1050.2120  1045.9939  957.1198  519.4579  0.0448  −0.0468  0.1502
Trial 2     1052.1709  1047.9542  957.1122  519.7247  0.0456  −0.0494  0.2021
Trial 3     1048.5039  1044.3648  956.7213  519.2291  0.0442  −0.0462  0.2061
Trial 4     1051.0054  1046.8187  956.8339  519.2194  0.0455  −0.0486  0.1756
Trial 5     1050.8918  1046.7329  956.8582  519.4178  0.0460  −0.0498  0.1944
Trial 6     1051.2977  1047.1457  956.8358  519.4241  0.0454  −0.0481  0.1460
Trial 7     1050.4691  1046.3180  956.5077  519.6354  0.0446  −0.0467  0.1699
Trial 8     1052.8643  1048.7560  956.4323  519.5267  0.0452  −0.0473  0.2077
Trial 9     1051.0076  1046.8497  956.9606  519.6325  0.0461  −0.0489  0.1952
Trial 10    1049.4690  1045.3789  956.4628  519.2602  0.0460  −0.0494  0.2076
Mean        1050.7892  1046.6313  956.7845  519.4528  0.0453  −0.0481  0.1855
Deviation   1.2463     1.2397     0.2515    0.1791    0.0006  0.0013   0.0236
Table 2. Calibration result for real images using the conventional method.

            f_x        f_y        u_0       v_0       k_1     k_2      RMSE
Trial 1     1048.0347  1044.0247  956.8945  519.3556  0.0438  −0.0464  0.2595
Trial 2     1047.9891  1043.7756  956.6410  519.7846  0.0458  −0.0485  0.2153
Trial 3     1051.6414  1047.3967  957.3807  519.6939  0.0458  −0.0486  0.2029
Trial 4     1052.1863  1048.0365  957.2387  519.3653  0.0454  −0.0470  0.2948
Trial 5     1050.3806  1046.1871  956.9527  519.0593  0.0446  −0.0452  0.2469
Trial 6     1049.6929  1045.5486  956.8276  519.5737  0.0451  −0.0475  0.2210
Trial 7     1048.9989  1044.8639  956.8082  519.3750  0.0449  −0.0465  0.1747
Trial 8     1050.1785  1046.0461  956.7260  519.5743  0.0439  −0.0457  0.2672
Trial 9     1050.3922  1046.2240  956.5963  519.5850  0.0437  −0.0445  0.1787
Trial 10    1051.6436  1047.4263  956.8238  519.8481  0.0450  −0.0459  0.2757
Mean        1050.1138  1045.9530  956.8889  519.5215  0.0448  −0.0466  0.2337
Deviation   1.4674     1.4353     0.2487    0.2362    0.0008  0.0014   0.0414
