A Novel Method for Intrinsic and Extrinsic Parameters Estimation by Solving Perspective-Three-Point Problem with Known Camera Position

Abstract: The aim of the perspective-three-point (P3P) problem is to estimate the extrinsic parameters of a camera, that is, its orientation and position, from three 2D–3D point correspondences. All existing P3P solvers exhibit a multi-solution phenomenon, returning up to four solutions, and require a fully calibrated camera. In contrast, in this paper we propose a novel method for intrinsic and extrinsic parameter estimation based on three 2D–3D point correspondences with a known camera position. Our core contribution is to build a new, virtual camera system whose frame and image plane are defined by the original 3D points, to build a new, intermediate world frame from the original image plane and the original 2D image points, and thereby convert our problem into a P3P problem. Intrinsic and extrinsic parameter estimation then reduces to solving frame transformations and the P3P problem. Lastly, we resolve the multi-solution problem using the image resolution. Experimental results on synthetic data and real images show the accuracy, numerical stability and uniqueness of the solution for intrinsic and extrinsic parameter estimation.


Introduction
The perspective-n-point (PnP) problem [1][2][3][4][5][6][7] originates from intrinsic and extrinsic parameter estimation, two of the key steps in computer vision [8,9] and photogrammetry [10]. Its purpose is to retrieve the intrinsic or extrinsic parameters of a camera from n known 2D-3D point correspondences in a single image.
There are many PnP solvers for n = 3, known as P3P solvers [11][12][13][14][15][16]; the problem was first investigated in 1841 by Grunert [17]. These P3P solvers can estimate six extrinsic parameters (three position parameters and three orientation parameters) with a fully calibrated camera. P3P uses the minimal subset of 2D-3D point correspondences that yields a finite number of solutions, and resolves the multi-solution problem using a fourth 3D point. Hence, the existing P3P solvers all need a fully calibrated camera and return up to four possible solutions [18]. The strict calibration requirement and the multi-solution problem prevent their application in scenarios where some intrinsic parameters of a camera might change online or be unknown, such as when using a zoom lens. To address this, many methods were developed to estimate the extrinsic parameters and some intrinsic parameters of a partially calibrated camera [19,20]. Some methods focus on the partially calibrated case with unknown focal length, namely the PnPf problem [2,21,22,23,24]. All PnPf solvers need four or more points [25]; for the minimal P4Pf, some solvers work well in the planar case [26], while others work well in the nonplanar case [25]. P4Pf solvers can recover only one intrinsic parameter, hence some methods handle unknown focal length and radial distortion [2,27,28], unknown focal length and aspect ratio [29], and unknown focal length and principal point [25]. When n ≥ 6, the PnP problem can be solved linearly [8,25,30], and all the intrinsic and extrinsic parameters of a camera can be obtained [31].
We can see that using more 3D control points allows more parameters to be recovered. However, user convenience dictates the use of as few reference 3D points as possible. In some cases, such as missile range testing and aerial photogrammetry, the world frame is in general a space rectangular coordinate frame, so the 3D points must be measured in this frame by professional surveyors, which takes considerable time. In addition, the positions of the 3D points are hard to keep unchanged under the influence of wind, and the points are easily corroded by sun and rain in the open air. For these reasons, accurate 3D control points are troublesome and expensive to acquire and maintain in such cases.
Moreover, for most digital cameras the skew is zero and the aspect ratio of the pixels is very close to one [25,27], hence we take these parameters as prior knowledge, which reduces the number of parameters to be estimated and means fewer 3D points are needed. In some cases where the camera is fixed, the camera position can also be obtained as prior knowledge. In a missile testing range, for example, attitude measurement based on fixed cameras with zoom lenses is an important test; these cameras are fixed and their positions can be measured as known parameters. In addition, with the growing prominence of public security concerns, visual monitoring cameras (VMCs) are widely used. In general, the position of a VMC is fixed while the lens orientation can change. Hence, in this paper we focus on the case with unknown focal length and principal point and a known camera position. In this case, we want to obtain the intrinsic and extrinsic parameters with a minimal (n = 3) subset of 2D-3D point correspondences.
In this paper, we assume that the unknown intrinsic parameters are the focal length, principal point and we will propose a novel method for intrinsic and extrinsic parameter estimation based on three 2D-3D point correspondences with known camera position, which is different from the existing P3P solvers.
The rest of this paper is organized as follows. Section 2 introduces the pinhole camera model, the P3P problem, the problem statement and the proposed method. Section 3 analyses our proposed method on synthetic data and real images, including numerical stability, noise sensitivity and computational time. Finally, the discussion is given in Section 4 and conclusions are drawn in Section 5.

Pinhole Camera Model
The camera model used in this paper is the standard pinhole camera model [8], where the image projection x of a 3D point X can be written as

λx = MX.

Here M is the 3 × 4 camera projection matrix and λ is a scale factor. The camera projection matrix M can be written as

M = K[R | t].

In this equation, [R | t], namely the extrinsic parameter matrix, contains the information about the camera pose, where R is a 3 × 3 rotation matrix and t is a 3 × 1 translation vector. K is the 3 × 3 intrinsic parameter matrix of the camera and can be written as

K = [ f   s   u0 ]
    [ 0   γf  v0 ]
    [ 0   0   1  ]

where f, γ, s and (u0, v0) represent the focal length, aspect ratio, skew and principal point, respectively. For all existing P3P solvers the camera is fully calibrated, which means K is known. However, the camera is uncalibrated in this paper. For most digital cameras, the skew is zero and the aspect ratio of the pixels is very close to one. Accordingly, in this paper we assume γ = 1, s = 0, and it will be shown that in practice these assumptions yield good results even though they are not strictly true. The intrinsic parameter matrix K now contains three unknown parameters f, (u0, v0), and therefore in this paper we assume

K = [ f   0   u0 ]
    [ 0   f   v0 ]
    [ 0   0   1  ]

The unknown parameters of the intrinsic parameter matrix K are thus the focal length and principal point, and we will propose a novel method for intrinsic and extrinsic parameter estimation based on three 2D-3D point correspondences with a known camera position, which is different from the existing P3P solvers. As with many P3P solvers, distortions are not considered in this paper.
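The projection model above can be sketched in a few lines of code; the camera center, focal length and principal point below are illustrative values, not taken from the paper's experiments:

```python
import numpy as np

def project(X, f, u0, v0, R, t):
    """Project 3D world point X with a pinhole camera having square pixels
    (aspect ratio 1) and zero skew; returns pixel coordinates (u, v)."""
    K = np.array([[f, 0.0, u0],
                  [0.0, f, v0],
                  [0.0, 0.0, 1.0]])
    Xc = R @ X + t            # world frame -> camera frame
    h = K @ Xc                # homogeneous image point, lambda * (u, v, 1)
    return h[:2] / h[2]       # divide out the scale lambda

# Illustrative values: camera center at (0, 0, 50) looking along +Z, so t = -R @ C.
R = np.eye(3)
t = np.array([0.0, 0.0, -50.0])
u, v = project(np.array([0.0, 0.0, 200.0]), f=1000.0, u0=640.0, v0=400.0, R=R, t=t)
print(u, v)  # a point on the optical axis projects to the principal point
```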

P3P Problem
We choose to follow the same outline as the existing P3P solvers, but add three extra unknown components to the intrinsic parameter matrix K as described above. As shown in Figure 1, the P3P problem is to estimate the rotation R and translation t between the world frame O w X w Y w Z w and the camera frame O c X c Y c Z c , by using three 2D-3D point correspondences.

Figure 1. The P3P problem.
In Figure 1, let O C be the center of perspective, X i (i = 1, 2, 3) be the 3D control points and x i (i = 1, 2, 3) be their image projections. The 3D points are known in the world frame; let |X 1 X 2 | = a, |X 2 X 3 | = b, |X 3 X 1 | = c. In the existing P3P solvers, the camera is fully calibrated, so the angles ∠X 2 O C X 3 , ∠X 3 O C X 1 , ∠X 1 O C X 2 can be computed as α, β, γ from the camera intrinsic parameters (focal length, principal point) and the image projections x i . We let |O c X 1 | = d 1 , |O c X 2 | = d 2 , |O c X 3 | = d 3 and, from the triangles O c X 1 X 2 , O c X 2 X 3 , O c X 3 X 1 and the law of cosines, we obtain the P3P equation system:

d1^2 + d2^2 − 2 d1 d2 cos γ = a^2,
d2^2 + d3^2 − 2 d2 d3 cos α = b^2,
d3^2 + d1^2 − 2 d3 d1 cos β = c^2.

The P3P problem is to find a set of solutions for d 1 , d 2 , d 3 ; this problem is known to yield up to four possible solutions, which can then be disambiguated using a fourth point.
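The construction of this equation system can be checked numerically: for a calibrated camera, back-project the image points into unit rays with K⁻¹, read off the angle cosines, and verify the law of cosines in each triangle. The intrinsics and 3D points below are hypothetical:

```python
import numpy as np

def unit_ray(K_inv, x):
    """Back-project pixel x into a unit viewing ray in the camera frame."""
    r = K_inv @ np.array([x[0], x[1], 1.0])
    return r / np.linalg.norm(r)

# Hypothetical calibrated setup: camera at the origin of the camera frame.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 400.0],
              [0.0, 0.0, 1.0]])
X = [np.array([1.0, 0.5, 10.0]),
     np.array([-1.0, 0.2, 12.0]),
     np.array([0.3, -0.8, 11.0])]          # 3D points in the camera frame
x = [(K @ Xi)[:2] / Xi[2] for Xi in X]     # their image projections

K_inv = np.linalg.inv(K)
v = [unit_ray(K_inv, xi) for xi in x]
cos_gamma = v[0] @ v[1]                    # angle X1-Oc-X2, opposite side a
cos_alpha = v[1] @ v[2]                    # angle X2-Oc-X3, opposite side b
cos_beta = v[2] @ v[0]                     # angle X3-Oc-X1, opposite side c

d = [np.linalg.norm(Xi) for Xi in X]       # distances |Oc Xi|
a = np.linalg.norm(X[0] - X[1])
b = np.linalg.norm(X[1] - X[2])
c = np.linalg.norm(X[2] - X[0])

# Each equation of the P3P system is the law of cosines in one triangle.
assert np.isclose(d[0]**2 + d[1]**2 - 2*d[0]*d[1]*cos_gamma, a**2)
assert np.isclose(d[1]**2 + d[2]**2 - 2*d[1]*d[2]*cos_alpha, b**2)
assert np.isclose(d[2]**2 + d[0]**2 - 2*d[2]*d[0]*cos_beta, c**2)
```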

Problem Statement
The strict requirement of full camera calibration in the P3P solvers prevents their application in cases where some intrinsic parameters might change online or be unknown. In this paper, these intrinsic parameters are the focal length and principal point, and the problem is to solve intrinsic and extrinsic parameter estimation by solving the P3P problem with a known camera position.
When the focal length and principal point are unknown, α, β, γ cannot be computed in the P3P problem. Some algorithms estimate the extrinsic parameters and focal length from four 3D points [21], or the extrinsic parameters, focal length and principal point from five 3D points [25]. However, user convenience dictates the use of as few 3D points as possible: accurate 3D points are troublesome and expensive to acquire and maintain. In cases where the camera is fixed, the camera position can be obtained as prior knowledge, and our problem then becomes intrinsic and extrinsic parameter estimation with fewer 3D points and a known camera position. The problem is illustrated in Figure 2: let O C be the center of perspective, which is known, X i (i = 1, 2, 3) be the 3D reference points and x i (i = 1, 2, 3) be their image projections; the task is to estimate the intrinsic (focal length, principal point) and extrinsic parameters from this information.
Figure 2. Our problem.
In Figure 2, α, β, γ are computed from the three 3D points X i (i = 1, 2, 3) and the camera position O C , not from the camera intrinsic parameters (focal length, principal point) and the image projections x i as in the traditional P3P problem shown in Figure 1. Our problem can be summarized geometrically as follows: the image points are x i (i = 1, 2, 3) in the original image frame; let |x 1 x 2 | = a, |x 2 x 3 | = b, |x 3 x 1 | = c and |O C x 1 | = d 1 , |O C x 2 | = d 2 , |O C x 3 | = d 3 . We obtain the equations:

d1^2 + d2^2 − 2 d1 d2 cos γ = a^2,
d2^2 + d3^2 − 2 d2 d3 cos α = b^2,
d3^2 + d1^2 − 2 d3 d1 cos β = c^2.

The task is now to find a set of solutions for d 1 , d 2 , d 3 , which is known to yield up to four possible solutions. This is similar to the P3P problem, but the variables and invariants of the equations have completely different meanings: a, b, c are the distances between every pair of image points, whereas in the P3P problem they are the distances between every pair of 3D points; d i is the distance between the camera position O C and the image point x i (u i , v i ), rather than between O C and the 3D point X i ; and the way the values of α, β, γ are obtained is different, as described above.
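The key observation here, that α, β, γ can be computed without any intrinsics when the camera position is known, can be sketched as follows; the point coordinates are hypothetical:

```python
import numpy as np

def angles_from_position(O_c, X1, X2, X3):
    """cos(alpha), cos(beta), cos(gamma) from the three 3D points and the
    known camera position alone -- no camera intrinsics are needed."""
    v1 = (X1 - O_c) / np.linalg.norm(X1 - O_c)
    v2 = (X2 - O_c) / np.linalg.norm(X2 - O_c)
    v3 = (X3 - O_c) / np.linalg.norm(X3 - O_c)
    return v2 @ v3, v3 @ v1, v1 @ v2       # cos(alpha), cos(beta), cos(gamma)

# Hypothetical points in the world frame and a known camera position.
O_c = np.array([0.0, 0.0, 50.0])
X1 = np.array([5.0, 1.0, 200.0])
X2 = np.array([-4.0, -1.0, 195.0])
X3 = np.array([2.0, 2.0, 210.0])
cos_a, cos_b, cos_g = angles_from_position(O_c, X1, X2, X3)
```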
In the following, our main work is to convert our problem to a P3P problem and obtain the values of intermediate variables, which are then used for intrinsic and extrinsic parameter estimation, and lastly to find the unique solution using the image resolution. We present our method as follows.

Proposed Method
In this section we convert our problem to a P3P problem as shown in Figure 3 and then obtain the values of intermediate variables by P3P solvers [13,16].
From Figure 3, we systematically formulate this problem in the following basic procedures: (1) Definition of three new 3D points X τ i (i = 1, 2, 3) and a new world frame τ from the original image points x i (i = 1, 2, 3).

(2) We take the original 3D points X i (i = 1, 2, 3) as the new image points x η i (i = 1, 2, 3). This step also involves the definition of a new principal point x η 0 and a new camera frame η from the original 3D points X i (i = 1, 2, 3); the original 3D points in the original world frame S W can then be transformed into η. (3) We define a new camera system whose camera frame is η and whose image plane S contains the points X i . The new principal point x η 0 is then the projection X 0 of the origin O η onto the plane S. (4) In the new camera system the pixel size is 1, and the new image points x η i (i = 1, 2, 3) can be obtained. Since the origin of the new camera frame η is the camera position, the camera position in the new camera frame is (0, 0, 0) T . From the transformation between frames η and τ, we obtain (0, 0, 0) T = R j · O η + T j , and the camera position of the new camera system in frame τ follows, as shown in Figure 4. In Figure 4, X τ 0 is the projection of the point O η onto the plane S2 that contains the points X τ i (i = 1, 2, 3), and X τ 0 = (x O η , y O η , 0) as described above.
Here, the values of the principal point and focal length are in pixels. However, there are up to four candidate values of the principal point and focal length because of the four possible solutions of the P3P solvers.
(6) Determination of a unique solution. We assume the image resolution of the original camera system is m × n and is known; the ideal principal point x 0_ideal can then be taken as the image center (m/2, n/2). Given the ideal principal point, the unique solution for the principal point x 0 is selected as follows: x 0 has up to four candidate values, and the candidate closest to the ideal principal point is taken as the unique principal point. When ||x 0j − x 0_ideal || is minimal, we let the unique solution be x 0 = x 0j and the focal length f = f j . The intrinsic parameter estimation is now finished, and the transformation between τ and η is determined. (7) Extrinsic parameter estimation. From the definition of the original camera frame S C and the new world frame τ as shown in Figure 4, τ can be transformed into S C . Our method involves four frames S w , η, τ, S c , and the transformations between (S w , η), (η, τ) and (τ, S c ) have been obtained, as shown in Figure 5. From these transformations, the original world frame S W can finally be transformed into the original camera frame S C , and the extrinsic parameter estimation is finished.
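Step (6) can be sketched as follows. The candidate list below is hypothetical, and the ideal principal point is taken as the image center, as assumed in the text:

```python
import numpy as np

def pick_unique_solution(candidates, m, n):
    """From up to four (principal point, focal length) candidates, keep the
    one whose principal point is closest to the ideal principal point,
    taken here as the image center (m/2, n/2)."""
    x0_ideal = np.array([m / 2.0, n / 2.0])
    return min(candidates,
               key=lambda c: np.linalg.norm(np.asarray(c[0]) - x0_ideal))

# Hypothetical candidates: (principal point, focal length in pixels).
cands = [((1100.0, 30.0), 812.0),
         ((655.0, 389.0), 998.0),
         ((90.0, 700.0), 1270.0),
         ((300.0, 120.0), 1530.0)]
x0, f = pick_unique_solution(cands, m=1280, n=800)
print(x0, f)  # -> (655.0, 389.0) 998.0
```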

Experiments and Results
The method proposed in Section 2 is thoroughly tested on synthetic data and compared to Kneip's method [13] and GP4Pf [21]. Kneip's method returns up to four possible solutions, and the disambiguation of the four possible solutions is done using a fourth point.
Lastly, we test the proposed method with real images to verify its feasibility for intrinsic and extrinsic parameter estimation in practical applications.


Synthetic Data
We synthesize a virtual perspective camera with zero distortion, zero skew and unit aspect ratio. The principal point lies at the image center. The image resolution is 1280 × 800 pixels and the focal length is 50 mm. The position of the camera is fixed at O c = (0, 0, 50), and the orientation is kept at R = I (the 3 × 3 identity matrix), which means the camera frame S C and the world frame S W are parallel.
The synthetic data consist of three thousand 3D points randomly generated in the world frame, uniformly distributed in the box [−20, 20] × [−2, 2] × [190, 210]. For each test, three synthetic 3D points are randomly selected from the synthetic data and projected into image points using the virtual perspective camera. The synthetic data thus contain three thousand 2D-3D point correspondences.
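The synthetic-data setup can be sketched as follows. Since the paper specifies the focal length in millimetres but not the pixel pitch, the pixel focal length f_pix below is an assumed value:

```python
import numpy as np

rng = np.random.default_rng(0)

# Virtual camera: 1280 x 800 image, principal point at the image center,
# camera center at (0, 0, 50), orientation R = I. The paper gives the focal
# length as 50 mm but not the pixel pitch, so f_pix is an assumed value.
f_pix, u0, v0 = 2000.0, 640.0, 400.0
K = np.array([[f_pix, 0.0, u0], [0.0, f_pix, v0], [0.0, 0.0, 1.0]])
C = np.array([0.0, 0.0, 50.0])

# Three thousand 3D points uniform in [-20,20] x [-2,2] x [190,210].
lo = np.array([-20.0, -2.0, 190.0])
hi = np.array([20.0, 2.0, 210.0])
Xw = lo + (hi - lo) * rng.random((3000, 3))

# Project every point: with R = I and t = -C, camera coordinates are Xw - C.
Xc = Xw - C
h = (K @ Xc.T).T
uv = h[:, :2] / h[:, 2:3]   # uv[i] <-> Xw[i]: 3000 2D-3D correspondences
```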

Numerical Stability
To analyse the numerical stability of our proposed method, we perform 50,000 trials. In each trial, we randomly select three 2D-3D point correspondences from the synthetic data, without any noise added to the 2D image points, for our proposed method and Kneip's method, and four 2D-3D point correspondences for GP4Pf. We measure the log10 value of the relative error between the ground truth and the estimated focal length for our proposed method and GP4Pf; the results are shown in Figure 6a. We also measure the log10 value of the translation error for our proposed method, Kneip's method and GP4Pf, as shown in Figure 6b.
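The error metric used in Figure 6 can be sketched as follows (a minimal reading of the reported quantity):

```python
import math

def log10_relative_error(estimate, truth):
    """log10 of the relative error between an estimate and the ground truth,
    the quantity whose distribution is plotted in Figure 6."""
    return math.log10(abs(estimate - truth) / abs(truth))

# A focal-length estimate off by 0.01% gives a log10 relative error of -4.
err = log10_relative_error(2000.2, 2000.0)
```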

From Figure 6a, the distribution of the log10 value of the relative error between the estimated focal length and the ground truth can be observed. Our proposed method is clearly superior in numerical stability to GP4Pf, the state-of-the-art 4-point solver.
From Figure 6b, the distribution of the log10 value of the translation error can be observed. Our proposed method is clearly superior in numerical stability to GP4Pf; compared to Kneip's method, it behaves very similarly, but Kneip's method has a multi-solution phenomenon with up to four solutions and needs a fully calibrated camera.


Noise Sensitivity
In this section, we add zero-mean Gaussian noise to the image points, varying the noise standard deviation δ from 0 to 2 pixels. At each noise level, we run 50,000 independent trials and report the median focal length, translation, rotation and reprojection errors in Figure 7.
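The noise model can be sketched as follows; the image points here are randomly generated stand-ins for the projected synthetic points:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_pixel_noise(uv, delta):
    """Add zero-mean Gaussian noise with standard deviation delta (pixels)
    to an (N, 2) array of image points."""
    return uv + rng.normal(0.0, delta, size=uv.shape)

# Stand-in image points in a 1280 x 800 image.
uv = rng.random((50000, 2)) * np.array([1280.0, 800.0])
noisy = add_pixel_noise(uv, delta=1.0)
residual = noisy - uv
```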
Based on Figure 7, in terms of the relative focal length error, our proposed method is better than GP4Pf. In terms of the relative translation and reprojection errors, our proposed method and Kneip's method perform similarly. In terms of the rotation error, our proposed method is slightly worse than Kneip's method, but Kneip's method has a multi-solution phenomenon with up to four solutions and needs a fully calibrated camera.

Computational Time
In this section, we perform 50,000 independent trials to analyze the computational time of our proposed method, GP4Pf and Kneip's method, executed on a 3.3 GHz 4-core laptop. In each trial, we randomly select three 2D-3D point correspondences from the synthetic data, without any noise added to the 2D image points, for our proposed method and Kneip's method, and four 2D-3D point correspondences for GP4Pf. We report the average computational time in milliseconds (ms) in Table 1.

From Table 1, we observe that our proposed method is faster than GP4Pf and slightly slower than Kneip's method, but Kneip's method has a multi-solution phenomenon with up to four solutions and needs a fully calibrated camera.

Real Images
We have also tested our proposed method using real images captured by two cameras with 1280 × 800 resolution. These cameras are assumed to have zero distortion, zero skew and unit aspect ratio. We place some control points in the fields of view of the two cameras, as shown in Figure 8.

Because the ground truth of the intrinsic and extrinsic camera parameters in real scenarios is unknown, we cannot directly compare the measuring precision of the intrinsic and extrinsic parameter estimation from real images. Therefore, we compare the precision of our proposed method, Kneip's method and GP4Pf indirectly: some control points are placed in the fields of view of the two cameras, and the positions of these points and cameras are measured as ground truth by a total station (NTS-330R, measuring precision better than 0.5 cm). We use 3 or 4 control points for the intrinsic and extrinsic parameter estimation, and the positions of the other points are measured by binocular vision. The average relative position error between the measured value and the ground truth is then used to verify the accuracy of the intrinsic and extrinsic parameter estimation. The relative position error is 0.43% with our proposed method, 0.47% with Kneip's method and 1.37% with GP4Pf. This shows our proposed method works well in real scenarios.
In addition, because the intrinsic and extrinsic parameter estimation only needs feature point extraction from real images, synthetic data and real images of simple scenes can be used to perform a thorough comparison with other PnP methods [13,21,23,25].


Discussion
The PnP problem originates from intrinsic and extrinsic parameter estimation and the PnP solvers are widely used in SLAM, computer vision and photogrammetry. In this paper, we propose a novel method for intrinsic and extrinsic parameter estimation. This method just needs three 3D points and the position of the camera, which is different from the existing PnP solvers. The differences and advantages of our proposed method are as follows.

Difference and Advantage
P3P uses the minimal subset of PnP correspondences, and in some cases it is difficult to acquire 3D points; this is why our work addresses the P3P problem. However, P3P solvers have required a fully calibrated camera in previous studies, which led us to the idea of extrinsic parameter estimation with an uncalibrated camera and three 3D points. From Figure 1, although α, β, γ cannot be computed with an uncalibrated camera in the original camera frame, they can be computed in the original world frame when the camera position is known. This finding lets us convert our problem to the known P3P problem, as shown in Figure 3. The idea is to build a new camera system whose camera frame is defined with the original world frame and whose world frame is defined with the original camera frame. This is a P3P problem between the new camera frame and the new world frame. With the definition of a new camera system, three new 2D-3D point correspondences are obtained, and the extrinsic parameters of the new camera are obtained by the P3P solvers. There is an interesting point in this solution: the camera position in the new world frame contains the intrinsic parameter information of the original camera system, as shown in Figure 4. This is the key to intrinsic parameter estimation in our proposed method.
Additionally, the way the multi-solution problem is resolved is different. The existing P3P solvers use a fourth point to disambiguate the multi-solution phenomenon, but our proposed method uses no extra point. The multi-solution phenomenon of the P3P solvers in the new camera system leads to multiple solutions for the intrinsic parameters of the original camera system, so we can obtain up to four principal points. Because the image resolution is known, the ideal principal point can be obtained with Equation (14), and this ideal principal point, rather than a fourth 3D point, is used to disambiguate the multi-solution phenomenon in our proposed method. In this paper, we assume the candidate closest to the ideal principal point is the unique principal point, and this assumption yields good results on synthetic data and real images, as shown in Section 3.
In the new camera system, the new 2D image points are obtained from the original 3D points, which have high precision because they are measured by a total station or other high-precision measurement equipment. Therefore, the error of the new 2D points can be an order of magnitude lower than the error of the original 2D points, and we believe this may be one reason why our proposed method has improved numerical stability and lower noise sensitivity. Compared to the P3P method, our proposed method only adds the establishment of the new frames and coordinate transformations, which can be made very fast; this is why it is only slightly slower than Kneip's method. Our proposed method uses just three 3D points, which is one reason it is faster than GP4Pf, which uses four 3D points.
In brief, the proposed method has the following advantages: (1) it uses just three 3D points and the camera position to estimate the intrinsic and extrinsic parameters; (2) it has no multi-solution phenomenon; and (3) it has improved numerical stability and lower noise sensitivity.

Future Work
Our proposed method needs to know the camera position, and the experiments use a manual zoom lens, which means we cannot change the focal length constantly. Therefore, in the future, we will focus on real-time intrinsic and extrinsic parameter estimation for an auto-zoom camera. Additionally, our proposed method in Section 2 is suitable for a camera with a fixed position. In the future, we will install a positioning device (e.g., RTK) on the camera so that we can use a camera whose position changes constantly. This work will broaden the applicability of our proposed method.
Another work that will be done in the future is to explain theoretically and experimentally why our proposed method has improved numerical stability and lower noise sensitivity.

Conclusions
We have proposed a novel method for intrinsic and extrinsic parameter estimation by solving a perspective-three-point problem with a known camera position. By building a new camera system and converting our problem to the P3P problem, a unique solution can be obtained. Our proposed method uses just three 3D points, and has improved numerical stability and lower noise sensitivity. Experimental results show our proposed method works well on synthetic data and in real scenarios. It is particularly suitable for estimating the focal length, principal point and pose of a zooming, uncalibrated camera with a fixed position.