Article

Robust and Accurate Vision-Based Pose Estimation Algorithm Based on Four Coplanar Feature Points

Tianjin Key Laboratory of High Speed Cutting & Precision Machining, Tianjin University of Technology and Education, Tianjin 300222, China
* Author to whom correspondence should be addressed.
Sensors 2016, 16(12), 2173; https://doi.org/10.3390/s16122173
Submission received: 26 October 2016 / Revised: 11 December 2016 / Accepted: 14 December 2016 / Published: 17 December 2016
(This article belongs to the Special Issue Video Analysis and Tracking Using State-of-the-Art Sensors)

Abstract

Vision-based pose estimation is an important application of machine vision. Currently, analytical and iterative methods are used to solve the object pose. Analytical solutions generally take less computation time but are extremely susceptible to noise. Iterative solutions minimize the distance error between feature points based on 2D image pixel coordinates, but the non-linear optimization needs a good initial estimate of the true solution; otherwise it is more time consuming than an analytical solution. Moreover, the image processing error grows rapidly as the measurement range increases, which leads to pose estimation errors. All of these factors reduce accuracy. To solve this problem, a novel pose estimation method based on four coplanar points is proposed. First, the coordinates of the feature points are determined according to the linear constraints formed by the four points; the initial coordinates acquired through this linear method are then optimized through an iterative method. Finally, the coordinate system of object motion is established and a method is introduced to solve the object pose. Since the growing image processing error causes larger pose estimation errors as the measurement range increases, this coordinate system is used to reduce the pose estimation errors. The proposed method is compared with two existing methods through experiments. Experimental results demonstrate that the proposed method works efficiently and stably.

1. Introduction

With the development of modern industrial technology, quickly and accurately determining the position and orientation between objects is becoming more and more important. This process is called pose estimation. Vision-based pose measurement, also known as the perspective-n-point (PnP) problem, determines the position and orientation between a camera and a target with n feature points, given the points' world coordinates and their 2D image pixel coordinates. It has the advantages of being non-contact and highly efficient, and it is widely applied in robotics [1], autonomous aerial refueling [2], electro-optic aiming systems [3,4], virtual reality [5], etc.
The existing approaches for solving object poses can be divided into two categories: analytical algorithms and iterative algorithms. Analytical algorithms apply linear methods to obtain algebraic solutions: Hu et al. [6] used four points to achieve a linear solution. Lepetit et al. [7] proposed a non-iterative solution named the EPnP algorithm. Tang et al. [8] presented a linear algorithm based on five points. Ansar et al. [9] presented a general framework to directly recover the rotation and translation. Duan et al. [10] introduced a new affine invariant of the trapezium to realize pose estimation and plane measurement.
As for iterative algorithms, pose estimation is formulated as a nonlinear problem with constraints and then solved using nonlinear optimization, most typically the Levenberg-Marquardt method. DeMenthon et al. [11,12,13] proposed the POSIT algorithm, which obtains the initial value of the solution by approximating the perspective projection model with a scaled orthographic model. Zhang et al. [14,15] proposed a two-stage iterative algorithm, divided into a depth recovery stage and an absolute orientation stage. Wang et al. [16] obtained the object pose non-linearly on the basis of five points using a least squares approach. Liu et al. [17] obtained the object pose from the geometrical constraints formed by four non-coplanar points. Zhang et al. [18] proposed an efficient solution for vision-based pose determination of a parallel manipulator. Fan et al. [19] proposed a robust and highly accurate pose estimation method to solve the PnP problem in real time.
The analytical solutions generally take less computation time. However, they are sensitive to observation noise and usually achieve lower accuracy. As for iterative solutions, pose estimation is translated into a nonlinear least squares problem through distance constraints, and the distance error between feature points is then minimized based on 2D image pixel coordinates. However, iterative solutions also have drawbacks: (1) the non-linear optimization needs a good initial estimate of the true solution, and (2) they are more time consuming than analytical solutions.
During the pose estimation process, image processing errors emerge mainly because the perspective projection of the feature marker's center does not coincide with the center of the marker's perspective projection image. Current papers focus on reducing image processing errors through better extraction of the image center [20,21,22]. However, those methods are not powerful enough to eliminate this inconsistency, especially when the measurement range increases.
Based on the discussion above, a robust and accurate pose estimation method based on four coplanar points is proposed in this paper. In the first step, the coordinates of the points in the camera coordinate system are solved analytically using the linear constraints formed by the points. The results obtained in this step are then set as the initial values of an iterative solving process to ensure the accuracy and convergence rate of the non-linear algorithm; the Levenberg-Marquardt optimization method is used to refine the initial values. In the second step, the coordinate system of object motion is established and the object pose is solved. Because the growing image processing error causes greater pose estimation errors as the measurement range increases, this coordinate system is used to reduce the pose estimation errors.
The rest of the paper is organized as follows: Section 2 and Section 3 propose a robust and accurate pose estimation method. Section 4 provides some experiments to examine the method. Section 5 gives the conclusion.

2. The Solving of Feature Point Coordinates in the Camera Coordinate System

The target pattern with four coplanar points is designed for pose estimation as shown in Figure 1. P0, P1, P2 and P3 form a trapezium. P0P1 is parallel to P2P3.
P0 is set as the origin of the coordinate system. The line connecting P0 and P1 is the Y axis, the cross product P0P1 × P1P3 gives the Z axis, and the X axis is Z × Y. In this way, the world coordinate system (target coordinate system) is constructed.
To obtain the target position, the coordinates of the points in the camera coordinate system need to be solved first. o_c x_c y_c z_c is the camera coordinate system and o_w x_w y_w z_w is the world coordinate system. The coordinates of each point in the camera coordinate system are P_ci = (x_ci, y_ci, z_ci)^T, and the coordinates in the world coordinate system are P_wi = (x_wi, y_wi, z_wi)^T; P_wi is known before solving the pose. The corresponding ideal image coordinates are I_ui = (x_ui, y_ui, 1)^T (i = 0, 1, 2, 3). The relationship between P_ci and I_ui can be described as
$$z_{ci} I_{ui} = \begin{bmatrix} s_x f / d_x & 0 & u_0 \\ 0 & f / d_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} P_{ci} = K P_{ci}, \qquad P_{ci} = \lambda_i K^{-1} I_{ui} \quad (\lambda_i = z_{ci}) \tag{1}$$
where (u0, v0) is the principal point of the image in pixels, s_x is the uncertainty image scale factor, d_x and d_y are the center-to-center distances between pixels in the row and column directions respectively, f is the focal length, and λ_i is the projection depth of P_i.
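For illustration, Equation (1) amounts to back-projecting a pixel through the inverse of the intrinsic matrix K; a minimal sketch follows (the intrinsic values below are placeholders, not the calibration results of Table 1):

```python
import numpy as np

# Hypothetical intrinsic parameters (placeholders, not the calibrated values of Table 1).
f, dx, dy, sx = 25.0, 0.006, 0.006, 1.0   # focal length [mm], pixel pitches [mm], scale factor
u0, v0 = 1004.0, 1022.0                   # principal point [pixels]

K = np.array([[sx * f / dx, 0.0,    u0],
              [0.0,         f / dy, v0],
              [0.0,         0.0,    1.0]])

def back_project(pixel_uv):
    """Return K^{-1} I_u for a pixel (x_u, y_u); the camera-frame point is then
    P_c = lambda * K^{-1} I_u, with lambda the projection depth (Equation (1))."""
    I_u = np.array([pixel_uv[0], pixel_uv[1], 1.0])
    return np.linalg.solve(K, I_u)

print(back_project((1200.0, 950.0)))      # direction of the ray through an example pixel
```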
The coordinates of the points in the camera coordinate system can be obtained by solving for λ_i. Since P0P1 is parallel to P2P3, two linear constraints are introduced, from which λ_i is solved (the solving process is given in Appendix A).
$$\lambda_0 = \sigma_0 \lambda_1, \qquad \lambda_2 = -\sigma_2 \lambda_1 / \varepsilon, \qquad \lambda_3 = \sigma_3 \lambda_1 / \varepsilon,$$
$$\varepsilon = \big\| P_{w1} - P_{w0} \big\| \,/\, \big\| P_{w3} - P_{w2} \big\|,$$
$$\big\| P_{w3} - P_{w2} \big\| = \big\| P_{c3} - P_{c2} \big\| = \lambda_1 \big\| \sigma_3 K^{-1} I_{u3} / \varepsilon + \sigma_2 K^{-1} I_{u2} / \varepsilon \big\|,$$
$$\lambda_1 = \big\| P_{w1} - P_{w0} \big\| \,\Big/\, \big\| \sigma_3 K^{-1} I_{u3} + \sigma_2 K^{-1} I_{u2} \big\| \tag{2}$$
The coordinates of the points in the camera coordinate system are obtained from λ_i, so the vector from the optical center O_c to each point can be calculated through Equation (3).
$$\overrightarrow{O_c P_{ci}} = \sigma_i K^{-1} I_{ui}\, \big\| P_{w1} - P_{w0} \big\| \Big/ \big\| \sigma_3 K^{-1} I_{u3} + \sigma_2 K^{-1} I_{u2} \big\| \quad (i = 0, 1)$$
$$\overrightarrow{O_c P_{ci}} = \sigma_i K^{-1} I_{ui}\, \big\| P_{w3} - P_{w2} \big\| \Big/ \big\| \sigma_3 K^{-1} I_{u3} + \sigma_2 K^{-1} I_{u2} \big\| \quad (i = 2, 3) \tag{3}$$
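A minimal sketch of this analytical stage (Equations (2) and (3)) is given below. It assumes that P0P1 and P2P3 point in the same direction; the function and variable names are illustrative only:

```python
import numpy as np

def analytic_camera_points(K, P_w, I_u):
    """Analytical stage: recover the projection depths lambda_i and the camera-frame
    points P_ci = lambda_i * K^{-1} * I_ui from the four coplanar points.
    P_w: 4x3 world coordinates; I_u: 4x3 homogeneous pixel coordinates (rows: P0..P3)."""
    rays = np.linalg.solve(K, I_u.T).T                       # rows are K^{-1} I_ui
    # Appendix A: [sigma_0, sigma_2, sigma_3]^T = [I_u0, I_u2, I_u3]^{-1} I_u1
    s0, s2, s3 = np.linalg.solve(I_u[[0, 2, 3]].T, I_u[1])
    eps = np.linalg.norm(P_w[1] - P_w[0]) / np.linalg.norm(P_w[3] - P_w[2])
    lam1 = np.linalg.norm(P_w[1] - P_w[0]) / np.linalg.norm(s3 * rays[3] + s2 * rays[2])
    lam = np.array([s0 * lam1, lam1, -s2 * lam1 / eps, s3 * lam1 / eps])
    return lam, lam[:, None] * rays                          # depths and camera-frame points
```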
The length between the feature points could be calculated through Equation (4).
$$D_{i,j}^2 = \big\| \overrightarrow{O_c P_{ci}} \big\|^2 + \big\| \overrightarrow{O_c P_{cj}} \big\|^2 - 2\, \overrightarrow{O_c P_{ci}} \cdot \overrightarrow{O_c P_{cj}} \quad (i, j = 0, 1, 2, 3) \tag{4}$$
In order to maintain the planarity of the four points, the coplanar constraint (CO) expressed as Equation (5) should also be considered.
$$CO = \left( \overrightarrow{P_{c0} P_{c1}} \times \overrightarrow{P_{c2} P_{c3}} \right) \cdot \left( \overrightarrow{P_{c0} P_{c2}} \times \overrightarrow{P_{c1} P_{c3}} \right) \tag{5}$$
The Levenberg-Marquardt optimization method is then used to solve for λ_i through Equation (6). The values of λ_i obtained from Equation (2) are used as the initial values to ensure the accuracy and convergence speed of the algorithm.
$$\begin{cases} D_{i,j}^2 - \big\| P_{wi} - P_{wj} \big\|^2 = 0 \\ CO = 0 \end{cases} \tag{6}$$
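As a sketch of this refinement step, the depths can be passed to SciPy's Levenberg-Marquardt solver with the distance residuals of Equation (4) and the coplanarity residual of Equation (5); the names below are illustrative:

```python
import numpy as np
from scipy.optimize import least_squares

def refine_camera_points(K, P_w, I_u, lam_init):
    """Iterative stage (Equation (6)): refine the depths lambda_i with Levenberg-Marquardt,
    starting from the analytical estimate lam_init."""
    rays = np.linalg.solve(K, I_u.T).T
    pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    d_true = np.array([np.linalg.norm(P_w[i] - P_w[j]) for i, j in pairs])

    def residuals(lam):
        P_c = lam[:, None] * rays
        dist = np.array([np.linalg.norm(P_c[i] - P_c[j]) for i, j in pairs]) - d_true
        co = np.dot(np.cross(P_c[1] - P_c[0], P_c[3] - P_c[2]),   # CO of Equation (5)
                    np.cross(P_c[2] - P_c[0], P_c[3] - P_c[1]))
        return np.append(dist, co)

    lam = least_squares(residuals, lam_init, method='lm').x
    return lam[:, None] * rays                                    # refined camera-frame points
```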

3. The Solving of Object Pose

The feature point coordinates in the camera coordinate system, P_c0, P_c1, P_c2, and P_c3, are calculated from λ_i. The next step is to solve the object pose. As shown in Figure 2, o_m x_m y_m z_m is the rotation and translation coordinate system of the target. The three rotational degrees of freedom are yaw (y_m axis, angle α), pitch (x_m axis, angle β), and roll (z_m axis, angle γ).
o_m x_m y_m z_m is established according to the following steps: (1) The target rotates around the y_m axis. Images of the target are captured at different rotation angles, and N1 groups of spatial coordinates P_c0 are obtained. With the least squares method, the N1 groups of P_c0 are used to fit a plane Ax + By + Cz + D = 0. The direction vector of the y_m axis is
$$v = [\,r_4\ r_5\ r_6\,]^T = \left[ \frac{A}{\sqrt{A^2+B^2+C^2}}\ \ \frac{B}{\sqrt{A^2+B^2+C^2}}\ \ \frac{C}{\sqrt{A^2+B^2+C^2}} \right]^T \tag{7}$$
(2) The target rotates around the x_m axis. Images of the target are captured at different rotation angles, and N2 groups of spatial coordinates P_c0 are obtained. With the least squares method, the N2 groups of P_c0 are used to fit a plane Ex + Fy + Gz + H = 0. The direction vector of the x_m axis is
$$u = [\,r_1\ r_2\ r_3\,]^T = \left[ \frac{E}{\sqrt{E^2+F^2+G^2}}\ \ \frac{F}{\sqrt{E^2+F^2+G^2}}\ \ \frac{G}{\sqrt{E^2+F^2+G^2}} \right]^T \tag{8}$$
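One standard least-squares realization of the plane fits in steps (1) and (2) is a total least squares fit via the singular value decomposition; a sketch follows (the choice of fitting method here is an assumption, since the paper only states that least squares is used):

```python
import numpy as np

def fit_axis_direction(points):
    """Fit a plane A x + B y + C z + D = 0 to the recorded positions of P_c0 and return
    its unit normal, used as the axis direction in Equations (7) and (8)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered)     # the normal is the right singular vector
    return vt[-1]                          # associated with the smallest singular value
```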
(3) The point sets P_c0 obtained in steps (1) and (2) share one rotation center. A sphere fit is adopted to describe this center: according to the following sphere-fitting equation, the sphere center is calculated, and it is taken as the rotation center (the origin of o_m x_m y_m z_m).
$$\sum_{k=1}^{N_1} \left[ (P_{c0x} - x_{om})^2 + (P_{c0y} - y_{om})^2 + (P_{c0z} - z_{om})^2 - r^2 \right] = 0$$
$$\sum_{k=1}^{N_2} \left[ (P_{c0x} - x_{om})^2 + (P_{c0y} - y_{om})^2 + (P_{c0z} - z_{om})^2 - r^2 \right] = 0 \tag{9}$$
(xom, yom, zom) is the sphere center and r is the sphere radius.
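The sphere fit of Equation (9) can be written as a linear least squares problem in the center coordinates; a sketch is given below (the two point sets from steps (1) and (2) can simply be stacked, since they share the same center):

```python
import numpy as np

def fit_sphere_center(points):
    """Fit a sphere to the recorded positions of P_c0 (Equation (9)).
    Returns the center (x_om, y_om, z_om), the origin of o_m x_m y_m z_m, and the radius r."""
    pts = np.asarray(points, dtype=float)
    A = np.hstack([2.0 * pts, np.ones((len(pts), 1))])   # linearized sphere equation
    b = np.sum(pts ** 2, axis=1)                         # x^2 + y^2 + z^2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, c = sol[:3], sol[3]
    return center, np.sqrt(c + center @ center)
```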
(4) The direction vector of the z_m axis is w = u × v = [r_7 r_8 r_9]^T. The vector of the y_m axis is then adjusted to v = w × u = [r_4 r_5 r_6]^T so that the three axes are mutually orthogonal.
Once o_m x_m y_m z_m is established, the rotation and translation from o_c x_c y_c z_c to o_m x_m y_m z_m are obtained, and the coordinates of the feature points in o_m x_m y_m z_m follow as
$$R_{cm} = [\,u\ v\ w\,] = \begin{bmatrix} r_1 & r_4 & r_7 \\ r_2 & r_5 & r_8 \\ r_3 & r_6 & r_9 \end{bmatrix}, \qquad T_{cm} = [\,x_{om}\ y_{om}\ z_{om}\,]^T, \qquad P_{mi} = R_{cm} P_{ci} + T_{cm} \tag{10}$$
The coordinates of the feature points in o_m x_m y_m z_m are denoted P_mi.
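A sketch of this change of frame follows. It assumes the standard convention P_m = U^T (P_c − o_m), where the columns of U = [u v w] are the motion-frame axes expressed in camera coordinates and o_m is the fitted sphere center; this is how the roles of R_cm and T_cm in Equation (10) are interpreted here.

```python
import numpy as np

def camera_to_motion_frame(P_c, u, v, origin):
    """Map camera-frame points into the motion frame o_m x_m y_m z_m.
    u, v: fitted x_m and y_m axis directions; origin: fitted sphere center."""
    u = np.asarray(u, dtype=float) / np.linalg.norm(u)
    w = np.cross(u, v); w /= np.linalg.norm(w)            # z_m axis, step (4): w = u x v
    v = np.cross(w, u)                                    # re-orthogonalized y_m axis: v = w x u
    U = np.column_stack([u, v, w])
    return (np.asarray(P_c, dtype=float) - origin) @ U    # rows: points in the motion frame
```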
The positioning accuracy of the feature points during image processing has a direct impact on the pose estimation accuracy. Circular markers are convenient to identify automatically, and many algorithms are available for locating their centers, so in this paper the center point of a circular marker is selected as the feature point. Imaging is by nature a perspective projection: objects appear larger when near and smaller when far. As a result, the perspective projection of the circular marker's center and the center of the corresponding perspective projection image (usually an ellipse) do not coincide, especially when the measurement range is greater [23]. This inconsistency causes image processing errors, which in turn may result in pose estimation errors (the ideal image coordinates I_ui = (x_ui, y_ui, 1)^T in Equation (1) are obtained through these image processing steps). In order to improve measurement accuracy, the following method is introduced.
The point sets P_c0 obtained in Section 2 represent N known typical positions of the target in the moving space, and the pose of the target at each typical position is known. According to Equation (10), N values of P_m0 are obtained. The coordinates of P_0 in o_m x_m y_m z_m at typical position j are denoted Q_mj0, and P_mki denotes the coordinates of the feature points in o_m x_m y_m z_m at location k. The nearest Q_mj0 to P_mk0 is Q_mJ0, as shown in Figure 3. P_mki is calculated as
$$\overrightarrow{O_m P_{mki}} = \overrightarrow{O_m Q_{mJi}} + \overrightarrow{Q_{mJi} P_{mki}} \quad (i = 0, 1, 2, 3) \tag{11}$$
Each pose vector in the moving space corresponds to three angles. They are α (yaw angle), β (pitch angle), and γ (roll angle). They could be calculated as shown in Equation (12).
$$\alpha_{O_m P_{mki}} = \alpha_{O_m Q_{mJi}} + \alpha_{Q_{mJi} P_{mki}}, \qquad \beta_{O_m P_{mki}} = \beta_{O_m Q_{mJi}} + \beta_{Q_{mJi} P_{mki}}, \qquad \gamma_{O_m P_{mki}} = \gamma_{O_m Q_{mJi}} + \gamma_{Q_{mJi} P_{mki}} \tag{12}$$
α_{OmQmJi}, β_{OmQmJi}, and γ_{OmQmJi} are known quantities, and α_{QmJiPmki}, β_{QmJiPmki}, and γ_{QmJiPmki} can be calculated according to Equation (13). In this way, the image processing error caused by the inconsistency mentioned above is significantly reduced, especially when the measurement range is greater, and the pose measurement accuracy is improved. The target pose is then represented by α_{OmPmki}, β_{OmPmki}, and γ_{OmPmki}.
$$R_Q = [\,g_1\ g_2\ g_3\,], \quad g_2 = \overrightarrow{Q_{mJ0} Q_{mJ1}} \big/ \big|\overrightarrow{Q_{mJ0} Q_{mJ1}}\big|, \quad l_1 = \overrightarrow{Q_{mJ1} Q_{mJ3}} \big/ \big|\overrightarrow{Q_{mJ1} Q_{mJ3}}\big|, \quad g_3 = g_2 \times l_1, \quad g_1 = g_3 \times g_2$$
$$R_P = [\,g_4\ g_5\ g_6\,], \quad g_5 = \overrightarrow{P_{mk0} P_{mk1}} \big/ \big|\overrightarrow{P_{mk0} P_{mk1}}\big|, \quad l_2 = \overrightarrow{P_{mk1} P_{mk3}} \big/ \big|\overrightarrow{P_{mk1} P_{mk3}}\big|, \quad g_6 = g_5 \times l_2, \quad g_4 = g_6 \times g_5$$
$$R_{QP} = R_P^{-1} R_Q \tag{13}$$
R_QP is the rotation matrix composed of the yaw, pitch, and roll rotations about the y_m, x_m, and z_m axes, and the angles α_{QmJiPmki}, β_{QmJiPmki}, and γ_{QmJiPmki} are extracted from its entries.
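A sketch of Equation (13) is given below: the frames R_Q and R_P are built from the feature points at the reference position and the current position, and their relative rotation R_QP is formed (g_3 and g_6 are re-normalized here so that the matrices are orthonormal; the yaw, pitch, and roll angles then follow from a Euler-angle decomposition of R_QP):

```python
import numpy as np

def frame_from_points(p0, p1, p3):
    """Orthonormal frame of Equation (13): g2 along p0->p1, g3 = g2 x l1 with l1 along
    p1->p3 (re-normalized), and g1 = g3 x g2."""
    g2 = (p1 - p0) / np.linalg.norm(p1 - p0)
    l1 = (p3 - p1) / np.linalg.norm(p3 - p1)
    g3 = np.cross(g2, l1); g3 /= np.linalg.norm(g3)
    g1 = np.cross(g3, g2)
    return np.column_stack([g1, g2, g3])

def relative_rotation(Q_mJ, P_mk):
    """R_QP = R_P^{-1} R_Q between the nearest reference position Q_mJ and the current
    position P_mk (each given as a 4x3 array of points in o_m x_m y_m z_m)."""
    R_Q = frame_from_points(Q_mJ[0], Q_mJ[1], Q_mJ[3])
    R_P = frame_from_points(P_mk[0], P_mk[1], P_mk[3])
    return np.linalg.inv(R_P) @ R_Q       # decompose into yaw/pitch/roll as needed
```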

4. Experiment Results

The experimental system, which consists of a target, a three-axis rotation stage, and two CCD cameras, is shown in Figure 4. The rotation ranges of the stage about the yaw, pitch, and roll axes are ±160°, ±90°, and ±60°, respectively.
The camera used in this paper is a Teli CSB4000F-20 (Teli, Tokyo, Japan) with a resolution of 2008 (H) × 2044 (V) and a pixel size of 0.006 × 0.006 mm². The lens is a 25 mm Pentax (Tokyo, Japan) lens. The positioning accuracy of the rotation stage is better than 20″. The calibration results for the camera intrinsic parameters are shown in Table 1 [24]; k1 and k2 are the radial distortion coefficients, and p1 and p2 are the tangential distortion coefficients.
The images are captured with a single CCD camera at different locations. The feature point coordinates in the camera coordinate system are obtained at the current location, and the rotation and translation coordinate system of the target, o_m x_m y_m z_m, is established. Then α_{OmPmki}, β_{OmPmki}, and γ_{OmPmki}, which represent the target pose, are solved.
The target pose measurement experiment includes three parts: computer simulation experiments, real image experiments for accuracy, and real image comparative experiments with other methods.

4.1. Computer Simulation Experiments

To validate the accuracy and noise immunity of the proposed algorithm, it is compared with the Oberkampf POSIT algorithm [11,12,13] and the Duan algorithm [10] in computer simulation experiments. The pinhole imaging model of a camera is simulated: the points are transformed by perspective projection to obtain simulated image coordinates, and Gaussian noise from 0 to 4 pixels is added to the image coordinates of the points. The relative errors of the poses estimated by the proposed algorithm, the Duan algorithm, and the Oberkampf POSIT algorithm are shown in Figure 5.
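The simulated data can be generated along the following lines (a sketch; the camera pose, intrinsics, and noise level are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def project_with_noise(K, R, t, P_w, sigma_px):
    """Simulate the pinhole camera: project world points and perturb the pixel
    coordinates with zero-mean Gaussian noise of sigma_px pixels."""
    P_c = P_w @ R.T + t                          # world -> camera frame
    uv = P_c @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                  # perspective division
    return uv + rng.normal(0.0, sigma_px, uv.shape)
```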
It can be seen that the errors are reduced after optimization and that they increase with the noise level. The proposed method produces better results than the other two methods, especially when the noise level is greater than one pixel. In addition, the errors remain at almost the same level under low noise. This can be explained as follows: the Duan algorithm has no iterative solving process, which leads to a relatively high error, while the Oberkampf POSIT method accounts for neither the coplanarity constraint nor the initial value of the iteration process. As the noise level increases, the dominant error factor changes from the initial value of the iteration process to coplanarity, resulting in a higher error than the proposed algorithm.

4.2. The Measurement Experiments for Accuracy

The measurement range of the yaw, pitch, and roll angles is set to −35°~35°. The measurement errors are shown in Figure 6; the measurement error is the absolute difference between the estimated angle and the true angle.

4.3. The Comparative Experiments for Accuracy

The algorithm proposed in this paper, the Duan algorithm, and the Oberkampf POSIT algorithm are used to calculate target poses. The root mean square (RMS) errors are displayed in Figure 7; each point in the figure is the RMS error over multiple measurements. Compared with the Duan algorithm, the RMS error of our method is smaller, especially when the rotation angle is greater. The experimental results demonstrate that the analytical pose measurement process is sensitive to observation noise; this error is reduced by the iterative process in our method, and a good initial value is provided at the beginning of the iteration to ensure pose estimation accuracy. Compared with the Oberkampf POSIT algorithm, the experimental results demonstrate that the image processing error mentioned in Section 3 exists in the pose measurement process. Through the method in Section 3, this error is effectively eliminated, especially when the measurement range is greater.
Comparing the results of our method with those of the Oberkampf POSIT and Duan algorithms, the measurement accuracy of our method is clearly higher throughout the whole moving space.
For real applications, we first capture images of the camera calibration board, as shown in Figure 8. The four corner circular markers (marked in red) of the calibration board form a trapezium. The center points of the four circular markers are extracted from the images, and the lengths of the four sides of the trapezium are then calculated with the algorithm proposed in this paper, the Duan algorithm, and the Oberkampf POSIT algorithm. The measurement errors are shown in Table 2, and Figure 9 shows the image sets of the calibration board. D01 is the distance between P0 and P1, D02 between P0 and P2, D23 between P2 and P3, and D13 between P1 and P3. The measurement error is the absolute difference between the estimated length and the true length. It can be seen from Table 2 that the results of our algorithm are closer to the real values; the measurement error of our method is less than 0.2 mm. The algorithm proposed in this paper is therefore more effective.
We then test the proposed algorithm by measuring the yaw, pitch, and roll of a rotation stage. Figure 10 shows the image sets of the rotation stage with feature points. Using the target on the rotation stage shown in Figure 10, its pose is calculated with the algorithm proposed in this paper, the Duan algorithm, and the Oberkampf POSIT algorithm. The measurement errors are shown in Table 3; the measurement error is the absolute difference between the estimated angle and the true angle. It can be seen from Table 3 that the results of our algorithm are closer to the real values, with a measurement error of less than 0.2°. Using the same information extracted from the images, the proposed algorithm identifies the pose of the rotation stage more accurately.
Finally, we estimate the head pose. A pose measurement system based on inertial technology is mounted on a helmet to measure the head pose, as shown in Figure 11; its results are taken as the real values. The corresponding images of the head are captured and shown in Figure 12. The vertices of the trapezium are the outer corners of the two eyes and the mouth, and the image points of the four vertices are located manually in each image. The head pose is calculated using the algorithm proposed in this paper, the Duan algorithm, and the Oberkampf POSIT algorithm. The first image (Frame 0) in Figure 12 is set as the initial (zero) position, and the angles between the current positions (Frames 1-4) and the initial position are calculated. The measurement errors are shown in Table 4; the measurement error is the absolute difference between the estimated angle and the real angle.
It can be seen from Table 4 that the results of our algorithm are closer to the real values. The rough manual localization of the control points (the trapezium vertices) in the images increases the measurement errors; nevertheless, our method reduces these errors compared with the other two methods. The errors in yaw, pitch, and roll are not independent, since the three angles together constitute the rotation matrix and therefore influence each other. When the head rotates about all axes, this coupling further increases the measurement errors.
The distance between the object and the camera is limited by the field of view. The corresponding experimental results (RMS errors) are shown in Table 5, where the pose of the target is calculated with the camera at different locations. In Positions 1 and 2, the camera is near the target and the target is out of the field of view of the camera; in Positions 5 and 6, the camera is far from the target and the target is out of the field of view; in Positions 3 and 4, the target is within the field of view. The results show that when the object is outside the field of view the measurement accuracy is lower, while when the object is within the field of view the accuracy is higher and is little influenced by the distance between the object and the camera. Therefore, to obtain higher accuracy, the object should be kept within the field of view of the camera.

5. Conclusions

A robust and accurate vision-based pose estimation algorithm based on four coplanar feature points is presented in this paper. By combining the advantages of analytical and iterative methods, the iterative process is given a preferable initial value obtained independently by an analytical method; at the same time, the anti-noise ability is strengthened and the result is more stable. The pose estimation errors depend on the feature extraction accuracy, and when the measurement range is greater, the image processing errors become greater. In the proposed algorithm, the coordinate system of object motion is established to solve the object pose; in this way, the image processing error that may cause pose estimation errors is reduced.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Nos. 51605332 and 11302149), the Innovation Team Training Plan of Tianjin Universities and Colleges (Grant No. TD12-5043), and the Tianjin Application Foundation and Advanced Technology Research Program (Youth Fund Project, No. 15JCQNJC02400).

Author Contributions

Zimiao Zhang conceived and designed the experiments. Shihai Zhang and Qiu Li analyzed the data.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Derivation of the Solution for λ_i

ε is the distance ratio of P1P0 and P3P2.
$$\varepsilon = \big\| P_{w1} - P_{w0} \big\| / \big\| P_{w3} - P_{w2} \big\|, \qquad P_{w1} = \varepsilon (P_{w3} - P_{w2}) + P_{w0} = [\,P_{w0},\ P_{w2},\ P_{w3}\,]\,[\,1,\ -\varepsilon,\ \varepsilon\,]^T \tag{A1}$$
The transformation from the world coordinate system to the camera coordinate system is given by the rotation R and translation T, so P_ci can be expressed as P_ci = R P_wi + T, and P_c1 can then be represented with P_c0, P_c2, and P_c3. The first linear constraint is shown in Equation (A2).
$$P_{c1} = R P_{w1} + T = R\big( \varepsilon (P_{w3} - P_{w2}) + P_{w0} \big) + T = \varepsilon \big( (R P_{w3} + T) - (R P_{w2} + T) \big) + R P_{w0} + T = \varepsilon (P_{c3} - P_{c2}) + P_{c0}$$
$$P_{c1} = \varepsilon (P_{c3} - P_{c2}) + P_{c0} = [\,P_{c0},\ P_{c2},\ P_{c3}\,]\,[\,1,\ -\varepsilon,\ \varepsilon\,]^T \tag{A2}$$
Given P_c = [P_c0, P_c2, P_c3], according to Equations (1) and (A2), Equation (A3) is deduced.
$$P_c = [\,P_{c0},\ P_{c2},\ P_{c3}\,] = K^{-1} [\,I_{u0},\ I_{u2},\ I_{u3}\,]\, \mathrm{diag}(\lambda_0,\ \lambda_2,\ \lambda_3) \tag{A3}$$
Taking the inverse of both sides of Equation (A3) and applying it to P_c1, and letting [σ_0, σ_2, σ_3]^T = [I_u0, I_u2, I_u3]^{-1} I_u1, Equation (A4) is finally obtained, which represents the second linear constraint.
$$P_c^{-1} P_{c1} = \mathrm{diag}(\lambda_1/\lambda_0,\ \lambda_1/\lambda_2,\ \lambda_1/\lambda_3)\, [\,\sigma_0,\ \sigma_2,\ \sigma_3\,]^T \tag{A4}$$
P_c^{-1} P_c1 can also be expressed as Equation (A5), according to Equation (A2).
$$P_c^{-1} P_{c1} = [\,1,\ -\varepsilon,\ \varepsilon\,]^T \tag{A5}$$
Equating (A4) and (A5), λ_i is solved as shown in Equation (2) in the main text.
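A small numeric sanity check of this derivation on synthetic data (all values are made up for illustration, and P0P1 and P2P3 are chosen to point in the same direction):

```python
import numpy as np

K = np.array([[4000.0, 0.0, 1000.0], [0.0, 4000.0, 1000.0], [0.0, 0.0, 1.0]])
P_w = np.array([[0.0, 0.0, 0.0],     # P0
                [0.0, 60.0, 0.0],    # P1   (P0P1 parallel to P2P3, same direction)
                [40.0, 10.0, 0.0],   # P2
                [40.0, 40.0, 0.0]])  # P3
a = 0.3
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(a), -np.sin(a)],
              [0.0, np.sin(a),  np.cos(a)]])
t = np.array([20.0, -30.0, 800.0])
P_c = P_w @ R.T + t                                        # camera-frame points
I_u = (P_c @ K.T) / P_c[:, 2:3]                            # homogeneous pixel coordinates

eps = np.linalg.norm(P_w[1] - P_w[0]) / np.linalg.norm(P_w[3] - P_w[2])
# (A2)/(A5): P_c^{-1} P_c1 = [1, -eps, eps]^T
print(np.allclose(np.linalg.solve(P_c[[0, 2, 3]].T, P_c[1]), [1.0, -eps, eps]))
# (A4) equals (A5): diag(lam1/lam0, lam1/lam2, lam1/lam3) [sigma_0, sigma_2, sigma_3]^T
sigma = np.linalg.solve(I_u[[0, 2, 3]].T, I_u[1])
lam = P_c[:, 2]
print(np.allclose(lam[1] / lam[[0, 2, 3]] * sigma, [1.0, -eps, eps]))
```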

References

  1. Kim, S.J.; Kim, B.K. Dynamic ultrasonic hybrid localization system for indoor mobile robots. IEEE Trans. Ind. Electron. 2013, 60, 4562–4573. [Google Scholar] [CrossRef]
  2. Mao, W.; Eke, F. A survey of the dynamics and control of aircraft during aerial refueling. Nonlinear Dyn. Syst. Theory 2008, 8, 375–388. [Google Scholar]
  3. Valenti, R.; Sebe, N.; Gevers, T. Combining head pose and eye location information for gaze estimation. IEEE Trans. Image Process. 2012, 21, 802–815. [Google Scholar] [CrossRef] [PubMed]
  4. Murphy-Chutorian, E.; Trivedi, M.M. Head pose estimation in computer vision: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 607–626. [Google Scholar] [CrossRef] [PubMed]
  5. Schall, G.; Wagner, D.; Reitmayr, G.; Taichmann, E.; Wieser, M.; Schmalstieg, D.; Hofmann-Wellenhof, B. Global pose estimation using multi-sensor fusion for outdoor augmented reality. In Proceedings of the 8th International Symposium on Mixed and Augmented Reality (ISMAR 2009), Santa Barbara, CA, USA, 19–22 October 2009; pp. 153–162.
  6. Hu, Z.Y.; Wu, F.C. A note on the number solution of the non-coplanar P4P problem. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 550–555. [Google Scholar] [CrossRef]
  7. Lepetit, V.; Moreno-Noguer, F.; Fua, P. EPnP: An accurate O(n) solution to the PnP problem. Int. J. Comput. Vis. 2009, 81, 155–166. [Google Scholar] [CrossRef] [Green Version]
  8. Tang, J.; Chen, W.; Wang, J. A novel linear algorithm for P5P problem. Appl. Math. Comput. 2008, 205, 628–634. [Google Scholar] [CrossRef]
  9. Ansar, A.; Daniilidis, K. Linear pose estimation from points or lines. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 578–589. [Google Scholar] [CrossRef]
  10. Duan, F.Q.; Wu, F.C.; Hu, Z.Y. Pose determination and plane measurement using a trapezium. Pattern Recognit. Lett. 2008, 29, 223–231. [Google Scholar] [CrossRef]
  11. DeMenthon, D.F.; Davis, L.S. Model-based object pose in 25 lines of code. Int. J. Comput. Vis. 1995, 15, 123–141. [Google Scholar] [CrossRef]
  12. Gramegna, T.; Venturino, L.; Cicirelli, G.; Attolico, G.; Distante, A. Optimization of the POSIT algorithm for indoor autonomous navigation. Robot. Auton. Syst. 2004, 48, 145–162. [Google Scholar] [CrossRef]
  13. Oberkampf, D.; DeMenthon, D.F.; Davis, L.S. Iterative pose estimation using coplanar points. In Proceedings of the 1993 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’93), New York, NY, USA, 15–17 June 1993; pp. 626–627.
  14. Zhang, S.; Liu, F.; Cao, B.; He, L. Monocular vision-based two-stage iterative algorithm for relative position and attitude estimation of docking spacecraft. Chin. J. Aeronaut. 2010, 23, 204–210. [Google Scholar]
  15. Zhang, S.; Cao, X.; Zhang, F.; He, L. Monocular vision-based iterative pose estimation algorithm from corresponding feature points. Sci. Chin. Inf. Sci. 2010, 53, 1682–1696. [Google Scholar] [CrossRef]
  16. Wang, P.; Xiao, X.; Zhang, Z.; Sun, C. Study on the position and orientation measurement method with monocular vision system. Chin. Opt. Lett. 2010, 8, 55–58. [Google Scholar] [CrossRef]
  17. Liu, M.L.; Wong, K.H. Pose estimation using four corresponding points. Pattern Recognit. Lett. 1999, 20, 69–74. [Google Scholar] [CrossRef]
  18. Zhang, S.; Ding, Y.; Hao, K.; Zhang, D. An efficient two-step solution for vision-based pose determination of a parallel manipulator. Robot. Comput. Integr. Manuf. 2012, 28, 182–189. [Google Scholar] [CrossRef]
  19. Fan, B.; Du, Y.; Cong, Y. Robust and accurate online pose estimation algorithm via efficient three-dimensional collinearity model. IET Comput. Vis. 2013, 7, 382–393. [Google Scholar] [CrossRef]
  20. Pan, H.; Huang, J.; Qin, S. High accurate estimation of relative pose of cooperative space targets based on measurement of monocular vision imaging. Optik 2014, 125, 3127–3133. [Google Scholar] [CrossRef]
  21. Hmam, H.; Kim, J. Optimal non-iterative pose estimation via convex relaxation. Image Vis. Comput. 2010, 28, 1515–1523. [Google Scholar] [CrossRef]
  22. Ons, B.; Verstraelen, L.; Wagemans, J. A computational model of visual anisotropy. PLoS ONE 2011, 6, e21091. [Google Scholar] [CrossRef] [PubMed]
  23. Heikkila, J.; Silvén, O. A four-step camera calibration procedure with implicit image correction. In Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Juan, PR, USA, 17–19 June 1997; pp. 1106–1112.
  24. Zhang, Z. Flexible camera calibration by viewing a plane from unknown orientations. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; Volume 1, pp. 666–673.
Figure 1. The measurement target with four points.
Figure 2. The rotation and translation coordinate system.
Figure 3. The solving of the object pose.
Figure 4. The measurement system.
Figure 5. The simulation experiment results.
Figure 6. The measurement error of yaw, pitch, and roll angle.
Figure 7. The RMS error of rotation angles: (a) Yaw angle; (b) Pitch angle; (c) Roll angle.
Figure 8. The image of P0, P1, P2, and P3 on the calibration board.
Figure 9. The image sets of the calibration board.
Figure 10. The image of the rotation stage with feature points.
Figure 11. The motion capture system based on inertial technology.
Figure 12. The image sets of head pose.
Table 1. Calibration results.

Parameters   fx        fy        cx        cy        k1      k2     p1        p2
Camera 1     4341.501  4341.918  1034.667  1033.926  −0.361  0.140  −0.00024  0.00008
Camera 2     4373.530  4373.659  1000.997  1020.977  −0.373  0.342  0.00033   −0.00040
Table 2. The measurement errors of the calibration board.

            Frame 1   Frame 2   Frame 3   Frame 4   Frame 5   Frame 6
D01 (mm)
  Duan      −0.455    0.456     −0.352    0.568     −0.672    −0.433
  Oberk     0.338     −0.209    0.189     −0.209    0.335     −0.326
  Our       −0.047    0.148     −0.078    −0.041    0.039     −0.086
D02 (mm)
  Duan      0.532     −0.622    −0.535    0.659     −0.553    0.436
  Oberk     0.248     −0.295    0.306     −0.331    0.359     −0.389
  Our       0.040     −0.138    0.025     0.114     0.052     0.088
D23 (mm)
  Duan      0.555     −0.636    −0.418    −0.659    −0.559    0.736
  Oberk     −0.323    0.203     −0.353    0.268     −0.177    0.384
  Our       0.063     0.055     0.156     −0.125    −0.029    −0.139
D13 (mm)
  Duan      −0.708    −0.654    −0.555    0.715     −0.453    −0.359
  Oberk     0.448     −0.407    −0.348    0.207     −0.248    0.307
  Our       −0.027    −0.072    0.076     −0.165    −0.089    −0.134
Table 3. The measurement errors of the rotation stage with feature points.

Position        5°       10°      15°      20°      25°
Yaw (°)
  Duan          −0.132   0.252    0.245    0.485    −0.532
  Oberk         −0.056   −0.154   −0.147   0.284    −0.334
  Our           0.032    −0.057   0.045    −0.085   −0.132
Pitch (°)
  Duan          −0.127   −0.196   −0.201   0.386    −0.477
  Oberk         0.046    0.177    −0.186   0.295    0.402
  Our           0.027    0.043    −0.019   −0.102   −0.137
Roll (°)
  Duan          0.129    −0.267   0.245    −0.399   −0.489
  Oberk         0.036    −0.200   −0.196   0.247    −0.374
  Our           −0.026   −0.052   0.049    0.093    0.140
Table 4. The measurement errors of head pose.

            Frame 1   Frame 2   Frame 3   Frame 4
Duan (°)
  Yaw       −0.845    −0.831    −0.776    0.766
  Pitch     0.476     −0.552    0.544     0.422
  Roll      −0.577    0.430     0.402     −0.427
Oberk (°)
  Yaw       −0.560    −0.620    −0.688    −0.644
  Pitch     0.410     −0.374    −0.397    0.294
  Roll      −0.355    −0.288    0.362     −0.403
Our (°)
  Yaw       −0.345    0.344     0.286     −0.374
  Pitch     −0.177    −0.186    −0.211    0.204
  Roll      −0.222    −0.199    0.175     0.232
Table 5. The RMS error of cameras in different locations.

            Position 1  Position 2  Position 3  Position 4  Position 5  Position 6
Yaw (°)     0.411       0.384       0.192       0.189       0.477       0.512
Pitch (°)   0.400       0.372       0.186       0.177       0.444       0.504
Roll (°)    0.394       0.346       0.174       0.185       0.500       0.523
