Global Positioning from a Single Image of a Rectangle in Conical Perspective

This article presents a method to obtain the global position of the focus of a camera from an image that includes a rectangle of known position and dimensions in a fixed reference frame. The technique uses basic principles of descriptive geometry introduced in engineering courses. First, we show how to obtain the dihedral projections of a rectangle after three rotations and one translation. Second, we obtain the image of the rotated rectangle in conical perspective, taking the elevation plane as the drawing plane and a specific point in space as the viewpoint, all represented in the dihedral system. Third, we address the inverse perspective transformation and present a method to obtain the spatial coordinates of a rectangle from its image. Finally, we check the method experimentally by taking an image of the rectangle with a camera for which the coordinates in the drawing plane (center of the image) are the only available position information; from this image, the position and orientation of the camera in 3D are obtained.


Introduction
Pose determination consists of estimating the position and orientation of a calibrated camera using a set of correspondences between 3D control points and 2D image points [1]. Determination of surface orientation has important applications such as robotics, object recognition, 3D measurement, or tracking of moving objects. Magee [2] was the first to present a procedure for determining the unique position of a robot in three-dimensional space. That method has been continuously improved in different areas such as large non-cooperative satellites [3] or Unmanned Aerial Vehicle (UAV) control [4,5]. Different methods for monocular pose estimation have been studied in the past [6][7][8][9]. More recently, marker-based positioning systems such as ArUco, Chilitags, AprilTags, or ARToolKit, among others, have been introduced to estimate quantitative changes in distances and orientations in many technological applications, such as autonomous robots [10][11][12], unmanned vehicles [13][14][15][16], or virtual assistants [17][18][19][20].
The calibration and orientation of a camera from its images has been obtained through different approaches in the past with good precision through techniques such as using a single image with four coplanar control lines [21], three coplanar circles [22], using parallelogrammatic grid points [23], or even using only three points in the world coordinate system when a multiple camera system is used [24]. Becker [25] introduced a new technique using an iterative method which solves the parameters that minimize vanishing point dispersion to solve for radial and decentering lens distortion directly from the results of vanishing point estimation, precluding the need for special calibration templates. Single image based reconstruction has been deeply studied by many authors such as Delage [26], Wilczkowiak et al. [27], Sturm and Maybank [28] or Micusik et al. [29], assuming perpendicularity and parallelism to recover the lack of information. Other authors such as Penna [30] showed that there is sufficient information in the two-dimensional perspective projection of an arbitrary quadrilateral of known shape and size in three-space to determine the exact three-dimensional coordinates of its vertices, generalizing known results for rectangles. Duan [1] used the projection of a trapezium for pose estimation and plane measurement in a very simple way. An iterative algorithm was used by Hong & Yang [31] to establish the relationship between parameters and the world coordinates of a given 3D calibration point. Nevertheless, additional studies via rectangular structures as in Haralick [8] or Wefelscheid [32] use similar concepts with a different approach. In contrast, our research used the information provided by the dihedral projections of a rectangle to determine the image of the rectangle rotated in a conical perspective.
Computer vision has been used in areas, such as unmanned vehicles, to estimate relative 3D position and altitude using algorithms based on four feature points, such as square and parallel relations, to avoid complicated calculations [33]. An algorithm for pose estimation based on volume measurement of tetrahedra composed of target points and the lens center of the vision system was proposed by Abidi [6]. 3D model reconstruction from a single image calibrating a camera and recovering the geometry and the photometry of objects was part of Guillou's [34] research and a novel method to find the initial solutions for iterative camera pose estimation using coplanar points was provided by Zhou [35]. A general photogrammetric method for determining object position and orientation was presented by Yuan [36]. Recently Wang et al. [37] studied active relocalization of a 3D camera pose from a single reference image; a recent and challenging problem in computer vision and robotics. Pose estimation of smooth metal parts is an important task in intelligent manufacturing. Ulrich [38], Sakcak [39], Han [40] and He [41] proposed a solution using a monocular camera and corresponding practical algorithms.
The adjustment of tools in machining centers is usually the slowest and most critical operation in the positioning of machined parts. A tool that combines machine displacements with edge-detected images could be adjusted at micrometric scales without the need for lasers or probes. Another possible application is vision-based metrology: when the features to be measured lie in the same plane as a rectangle of known dimensions, the dihedral projection of those features can be obtained and non-contact metrological checks can be performed immediately (in real time), compensating for many of the existing errors. This is an essential aspect of achieving the efficiency and flexibility required by controls in Industry 4.0 production systems.
In this work, a new method to obtain the camera coordinates of a rectangle from its image is proposed. The method is based on the principles of descriptive geometry as developed by Monge [42], which are studied in basic engineering courses. To explain the method, the construction of a rectangle in conical perspective is first reviewed, and the inverse path is then proposed. Finally, an experiment has been designed to check the precision of the method.

Dihedral Projection of a Rectangle. Rotations and Translations
In this case, the input data of the problem is the dihedral projection of a rectangle in which the length L of one side is known, so the rectangle is represented by its coordinates x* and z*. The rectangle is rotated by three angles φ, ξ, and θ: the individual transformation matrices are combined into a global rotation matrix, and the rectangle is then translated to the point X_0. The coordinates of the vertices are thus obtained and presented in a table of dihedral information. The Top View of the dihedral system is the xy plane, and the elevation (Front View) is the xz plane. The projections of the rectangle on both planes constitute its dihedral representation [42,43].
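The construction of the dihedral projections can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not give the rotation axes or their order, so the sketch assumes that φ, ξ, and θ are rotations about the x, y, and z axes applied in that order, and that the unrotated rectangle lies in the xz plane centred at the origin with sides L and H. The function names and the side H are hypothetical.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def apply(R, p):
    return tuple(sum(R[r][k] * p[k] for k in range(3)) for r in range(3))

def dihedral_projections(L, H, phi, xi, theta, X0):
    """Rotate and translate a rectangle, then return its 3D vertices,
    its Top View (x, y) and its Front View (x, z).

    ASSUMPTION: axis order x, then y, then z; the source does not state it.
    """
    R = matmul(rot_z(theta), matmul(rot_y(xi), rot_x(phi)))  # global rotation
    corners = [(-L / 2, 0.0, -H / 2), (L / 2, 0.0, -H / 2),
               (L / 2, 0.0, H / 2), (-L / 2, 0.0, H / 2)]
    pts = [tuple(c + t for c, t in zip(apply(R, p), X0)) for p in corners]
    top = [(x, y) for x, y, z in pts]    # projection on the xy plane
    front = [(x, z) for x, y, z in pts]  # projection on the xz plane
    return pts, top, front
```

Because only rotations and a translation are applied, the side lengths and the right angles of the rectangle are preserved in space, although not in either projection.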

Conical Projection
The point of view has coordinates (V_x, V_y, 0) and is represented in the same dihedral system as the rectangle, where the Front View coincides with the image plane and V_y is the focal distance. From it, the coordinates (x*, z*) of the rectangle vertices in conical perspective are obtained. In the Top View, a line is drawn through (V_x, V_y, 0) and the Top projection of the point P, (P_x, P_y, 0); its intersection with the image plane gives the coordinate x*_p. The coordinate z*_p is then calculated by drawing, in the Front View, the line that passes through the Front projections of the point of view and of the point P, and intersecting it with the vertical line that starts at x*_p. In this way, the conical projection of the point in the image plane, with coordinates (x*, z*), is obtained. When this operation is performed for the four vertices of the rectangle (Figure 1), the rectangle in conical perspective is obtained.
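The two graphic intersections can be written as a single ratio. The sketch below assumes that the image plane is y = 0 and that the viewpoint is given by its full coordinates (V_x, V_y, V_z), with V_z = 0 reproducing the case described here; the function name is hypothetical. The Top View intersection and the Front View intersection share the same parameter t, so the Front View step does not divide by P_x - V_x.

```python
import math

def conical_projection(V, P):
    """Project the 3D point P from the viewpoint V onto the plane y = 0.

    Returns the image coordinates (x*, z*).
    """
    vx, vy, vz = V
    px, py, pz = P
    if math.isclose(vy, py):
        raise ValueError("P lies in the plane through V parallel to the image")
    # Top View: the line through (vx, vy) and (px, py) meets y = 0 at x*.
    t = vy / (vy - py)
    x_star = vx + t * (px - vx)
    # Front View: the line through (vx, vz) and (px, pz), evaluated at x = x*.
    z_star = vz + t * (pz - vz)
    return x_star, z_star
```

A point that already lies on the image plane (P_y = 0) gives t = 1 and projects onto itself, which is a quick sanity check of the sign convention.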

Obtaining the Possible Front View and Top View of Dihedral Projection of the Rectangle
Using the coordinates in the conical perspective and knowing the projection of the point of view in the Front and Top planes, the coordinates of the edges of the rectangle are calculated (Figure 2). To do this, we use part of the geometric method described by Wefelscheid et al. [32], obtaining auxiliary points that help us calculate the dihedral projection from the conical projection. These auxiliary points are:

4. Midpoints of the edges, P*_12, P*_23, P*_34, P*_14, obtained as the intersections of the lines drawn from the vanishing point to M* with the respective edges.
The auxiliary points are represented in Figure 3. These operations can be done graphically by drawing on paper, so computationally the problem reduces to intersections between lines, each defined by two points, as listed in Table 1.

Table 1. Obtaining auxiliary points as intersections of lines that go through two points. (Columns: Support Point, Line 1, Line 2.)

After obtaining these points, a proposal of the Front View of the rectangle is based on the following graphic properties:
1. The points of the Front View projection lie on the lines that start from the center point V, whose coordinates are (V_x, 0, V_z), and go through the image points P*_1, P*_2, P*_3, P*_4, M*, P*_12, P*_23, P*_34, P*_14.
2. In the dihedral projection, the midpoints lie at the geometric center of the corresponding side, dividing it in two. For example, P_12 is the midpoint of the segment that joins P_1 and P_2.
3. Opposite sides are parallel in the dihedral projection.
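Since every auxiliary point is the intersection of two lines given by two points each, one small routine is enough to compute them. The sketch below assumes the usual projective constructions: vanishing points as intersections of opposite edges, and M* as the intersection of the diagonals. The function names and dictionary keys are hypothetical, and both vanishing points are assumed to be finite.

```python
def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*z + c = 0 through p and q."""
    (x1, z1), (x2, z2) = p, q
    return (z1 - z2, x2 - x1, x1 * z2 - x2 * z1)

def intersect(l1, l2, eps=1e-12):
    """Intersection of two lines, or None if they are (nearly) parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None
    return ((b1 * c2 - b2 * c1) / w, (a2 * c1 - a1 * c2) / w)

def auxiliary_points(p1, p2, p3, p4):
    """Auxiliary points of the image quadrilateral P1* P2* P3* P4*."""
    e12, e23 = line_through(p1, p2), line_through(p2, p3)
    e34, e41 = line_through(p3, p4), line_through(p4, p1)
    v1 = intersect(e12, e34)  # vanishing point of edges 12 and 34
    v2 = intersect(e23, e41)  # vanishing point of edges 23 and 14
    m = intersect(line_through(p1, p3), line_through(p2, p4))  # centre M*
    to_v1 = line_through(v1, m)  # image of the line through M parallel to 12
    to_v2 = line_through(v2, m)  # image of the line through M parallel to 23
    return {
        "V1": v1, "V2": v2, "M": m,
        "P12": intersect(to_v2, e12), "P34": intersect(to_v2, e34),
        "P23": intersect(to_v1, e23), "P14": intersect(to_v1, e41),
    }
```

When two image edges are almost parallel, `intersect` returns `None` or a very distant point, which is the numerical counterpart of a vanishing point that is far away.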
By taking advantage of these properties and a trigonometric relation, a first proposal of the rectangle in Front View can be obtained by the following procedure:
1. In the triangle P*_1 P*_2 V*, which is divided by the segment V* P*_12, a line through P_12 is sought whose intersections with the lines V* P*_1 and V* P*_2 are equidistant from P_12, so that a possible point P_12 in Front View can be obtained.
2. An arbitrary distance d is taken to obtain P_12.
3. The normal vector of the line P_1 P_2 in the dihedral projection is found by rotating the vector V* P_12 by an angle ω, obtained from the trigonometric relation (1). The deduction of this expression is detailed in Appendix A, where α is the angle between the vectors V* P*_2 and V* P*_12, and β is the angle between the vectors V* P*_1 and V* P*_12, as represented in Figure 4 and expressed by Equations (2) and (3).
4. Points P_1 and P_2 are obtained by intersecting the lines V* P*_1 and V* P*_2 with the line that passes through P_12 and whose normal is the vector V* P_12 rotated by the angle ω. This gives the hypothetical side P_1 P_2 in the Front View of the dihedral projection.
5. With the presumed points P_1 and P_2 of the Front View in the dihedral projection, their Top View projections can be calculated from the Top projection of the central point V, which lies at a distance equal to the focal distance from the drawing plane, as shown in Figure 5. As an example, the y coordinate of the point P_1 is obtained by intersecting the line that joins V in the Top View and the point x*_P1 with the line of constant coordinate x_P1, as shown in Figure 6 and in Equation (4). The y coordinate of the point P_2 is obtained in the same way.
6. With the Top and Front projections of the points P_1 and P_2, that is, with their complete coordinates, the length of the segment P_1 P_2 in pixel units is calculated. The real distance d is the one for which the length of P_1 P_2 matches the known length of the edge of the rectangle. Being a proportional geometric problem, the solution is found in a single step by applying Thales' theorem.
7. A similar procedure is applied to the triangle P*_2 P*_3 V* divided by V* P_23. Consequently, the two orientations of the edges of the rectangle in the dihedral system are calculated.

Obtaining the rest of the points is direct, since the orientations in Top View are known: the points P_3 and P_4 are found by the same method. Once the three-dimensional coordinates of the rectangle are found, it is possible to perform any operation related to the positioning and orientation of the camera, or to the calculation of distances and angles. The full method is represented in Figure 7.
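Steps 1 to 6 can be sketched for one edge as follows. This is not the authors' formulation: instead of the rotation angle ω of relation (1), the sketch imposes the midpoint condition directly as a 2 × 2 linear system, which describes the same chord through P_12. It assumes that the drawing plane is y = 0 and that V = (V_x, V_y, V_z) is the viewpoint; the function name and the parameters s1, s2, s12 along the rays are hypothetical.

```python
import math

def reconstruct_edge(V, p1_img, p2_img, p12_img, edge_length, d=1.0):
    """Recover the 3D endpoints P1, P2 of a rectangle edge from its image.

    p1_img, p2_img, p12_img: image points (x*, z*) of the two vertices
    and of the edge midpoint; edge_length: known real length of the edge.
    """
    vx, vy, vz = V
    # Front View directions of the rays from V through the image points.
    a1 = (p1_img[0] - vx, p1_img[1] - vz)
    a2 = (p2_img[0] - vx, p2_img[1] - vz)
    a12 = (p12_img[0] - vx, p12_img[1] - vz)

    # Step 2: place P12 at an arbitrary distance d along its ray.
    s12 = d / math.hypot(a12[0], a12[1])

    # Steps 3-4 (linear solve instead of the angle omega): find s1, s2
    # with s1*a1 + s2*a2 = 2*s12*a12, so that P12 is the midpoint of P1P2.
    bx, bz = 2.0 * s12 * a12[0], 2.0 * s12 * a12[1]
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        raise ValueError("the rays through P1* and P2* are collinear")
    s1 = (bx * a2[1] - bz * a2[0]) / det
    s2 = (a1[0] * bz - a1[1] * bx) / det

    # Step 5: Top View. A point at parameter s on its ray has y = Vy*(1 - s).
    def lift(a, s):
        return (vx + s * a[0], vy * (1.0 - s), vz + s * a[1])

    q1, q2 = lift(a1, s1), lift(a2, s2)

    # Step 6: Thales' theorem. Scale about V so that |P1P2| = edge_length.
    k = edge_length / math.dist(q1, q2)

    def rescale(q):
        return tuple(v + k * (c - v) for v, c in zip(V, q))

    return rescale(q1), rescale(q2)
```

The result does not depend on the arbitrary distance d, because the final scaling about V absorbs it. Applying the same function to the edge P_2 P_3 with its own known length gives the second orientation of step 7.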

Comments on the Described Method and Comparison with Previous Ones
The new method has several advantages over the methods of Haralick [8] and Wefelscheid [32], which are the most widely used:
• It works with points and lines, as descriptive geometry does, which makes the calculations much more intuitive and based on simple sequences.
• It makes little use of trigonometric functions. The only trigonometric relation used is the tangent of the angle between two vectors, which induces very few floating-point errors. In addition, a rotation is applied to the vectors to redraw the rectangle edges in the dihedral projection.
• It is a direct method without iterations or matrix inversions.
• As it is sequential, checks can be performed at each step and the location of an error can be easily determined. Once the calculations have been verified, the equations can be condensed into compact closed-form expressions that save calculation time. The algebraic operations needed to obtain the points barely exceed one hundred, which amounts to less than thousandths of a second of computer time.

With the results, several verifications can be performed, since the method provides data that are already known, such as:
• The length of the second edge of the rectangle, since it has not been used for the calculation of the inverse perspective.
• The spatial lines that join the point V with the vanishing points V_1 and V_2 in the drawing plane are parallel to the sides of the rectangle, so they are perpendicular to each other. Consequently, their scalar product must be zero, which means that the starting data (the focal length) can actually be determined from the vanishing points [34].
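The second check can be written out explicitly. With the drawing plane at y = 0 and V = (V_x, V_y, V_z), the vectors from V to the vanishing points (x_1, 0, z_1) and (x_2, 0, z_2) are orthogonal when (x_1 - V_x)(x_2 - V_x) + (z_1 - V_z)(z_2 - V_z) + V_y² = 0, so the focal distance V_y follows from the two vanishing points and the central point. A minimal sketch, with a hypothetical function name and these coordinates as assumptions:

```python
import math

def focal_distance_from_vanishing_points(v1, v2, centre):
    """Focal distance from two vanishing points of perpendicular directions.

    v1, v2: vanishing points (x, z) in the drawing plane;
    centre: Front View projection (Vx, Vz) of the point of view.
    """
    s = ((v1[0] - centre[0]) * (v2[0] - centre[0])
         + (v1[1] - centre[1]) * (v2[1] - centre[1]))
    if s >= 0.0:
        # Orthogonality requires s = -Vy**2 < 0.
        raise ValueError("vanishing points inconsistent with a right angle")
    return math.sqrt(-s)
```

Comparing this value with the calibrated focal distance gives an independent consistency check. It becomes unreliable when one vanishing point is very far away, which is the same situation in which the x coordinate is poorly determined.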

Positioning of the Camera in Coordinate System Defined in the Rectangle
Image-based object tracking in space, navigation, and the calculation of distances and angles in images can all be carried out easily from the coordinates of known rectangles that serve as a reference (Figure 8). Therefore, such rectangles can be used in tracker systems for positioning parts in specific coordinate systems. In global coordinates, expression (5) applies. The position in global coordinates is given by (6), and the check in local coordinates is obtained according to (7). In global coordinates, the orientation of the camera follows the vector j_v (8). In rectangle coordinates, (9) is obtained, which is the projection of j on each of the three axes; it coincides with the y components of the three vectors i, j, and k expressed in global coordinates.
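The change to the rectangle coordinate system can be sketched as follows. The frame definition is an assumption, since the expressions (5) to (9) are not reproduced here: the origin is taken at the centre of the rectangle, i along P_1 P_2, k along P_1 P_4, and j = k × i so that the frame is right-handed. The camera is assumed to look along the global y axis, as in the sentence on the y components above. All function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def camera_in_rectangle_frame(P1, P2, P3, P4, V):
    """Camera position and viewing direction in rectangle coordinates.

    P1..P4: reconstructed vertices in global coordinates; V: viewpoint.
    ASSUMPTION: origin at the rectangle centre, i along P1P2, k along P1P4.
    """
    origin = tuple(sum(c) / 4.0 for c in zip(P1, P2, P3, P4))
    i = unit(sub(P2, P1))
    k = unit(sub(P4, P1))
    j = cross(k, i)  # right-handed frame: i x j = k
    rel = sub(V, origin)
    position = (dot(rel, i), dot(rel, j), dot(rel, k))
    # Projection of the global y axis (0, 1, 0) on i, j and k.
    direction = (i[1], j[1], k[1])
    return position, direction
```

The norm of `position` is the distance from the camera to the centre of the rectangle, which is the kind of quantity compared against a reference measurement in an experimental check.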

Experimental Tests
For the test of the method, all the images were taken with a CASIO EXILIM EX-ZR200 digital camera with a resolution of 4608 × 3456 pixels (16 megapixels) and a sensor size of 6.16 × 4.62 mm (1/2.3"). The camera was calibrated by the standard computer vision method using images of a chessboard. After calibration, the focal distance is 4.6 mm and the central point is not in the middle of the image but at pixel coordinates (2186, 1991). To measure the reference position, a Coordinate Measuring Machine (CMM), model Pioneer DEA 03.10.06 with measuring strokes of 600 × 1000 × 600 mm, was used, as seen in Figure 9. The Maximum Permissible Error of the DEA in the measurements is 2.8 + 4.0 L/1000 µm. The software used for the measurements was PC-DMIS.
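Because the segment lengths in the reconstruction are computed in pixel units, the focal distance must be expressed in pixels as well. The source does not state how this conversion was done; the sketch below assumes square pixels and that the quoted sensor size and resolution describe the full image. The function name is hypothetical.

```python
def focal_length_in_pixels(focal_mm, sensor_mm, resolution_px):
    """Convert a focal distance from millimetres to pixels along one axis."""
    return focal_mm * resolution_px / sensor_mm

# Camera data quoted in the text.
f_x = focal_length_in_pixels(4.6, 6.16, 4608)  # horizontal axis
f_z = focal_length_in_pixels(4.6, 4.62, 3456)  # vertical axis
```

Both axes give about 3441 pixels, which is consistent with square pixels. Under these assumptions, this value would play the role of V_y, and the calibrated central point (2186, 1991) would give the Front View projection of V.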
The procedure was as follows:
1. Place a DIN A4 sheet of paper on the granite table.
2. Position the camera on a tripod.
3. Take a photo of the DIN A4 sheet remotely so as not to influence the captured image.
4. Probe six points on the camera housing according to the 3-2-1 method [44].
5. Calculate the position of the camera focus with respect to the center of the A4 sheet from the probed points.
6. Compare this position with that obtained by image analysis.
7. Repeat steps 3-6 several times while varying the position and angles of the camera.

The parameters used for the test are summarized in Table 2. Table 3 shows the coordinates and distances of the paper center obtained by the CMM and by the image analysis, together with the differences between both; subindex e refers to experimental data measured with the CMM and subindex t refers to theoretical values calculated by the image analysis algorithm.

Result comments:
• The differences in measured distances are less than 2%, i.e., less than 2 cm at a distance of 1 m.
• The errors in the x coordinate are due to the parallelism of the lines with the image plane, but they hardly affect the distance calculation since the contribution of the x coordinate to the global calculation is small.
As can be seen in Table 3 and Figure 10, the difference between the distance from the camera to the center of the sheet measured by image analysis and by the CMM is less than 2%. This indicates that the distances obtained are very close to the actual ones. Nevertheless, analyzing each coordinate separately, a very high error in the x coordinate is observed at some points. These errors occur when the camera is facing the sheet, with the x-axis parallel to the sensor plane. In consequence, the vanishing point in this direction is far away, and the intersection between the lines is less precise. As in these cases the camera is located at a small value of the x coordinate, the influence of this value on the global error is reduced. This indicates a limitation: the method works best when the vanishing points remain close. Figure 11 shows the images obtained with the camera and used to verify the method presented in this article.

Conclusions
A new method has been proposed for rectangle reconstruction using elements of descriptive geometry, as used by Monge in 1847 [42], which are widely known among engineers since they are taught in the early stages of engineering studies. The method is mainly based on intersections between lines, whose calculation is fast and numerically stable and, therefore, minimizes errors and computation. The proposed process uses very few trigonometric functions of small angles, which are the main source of errors in other methods, so very few floating-point errors are introduced. Additionally, the trigonometric functions are mainly used for the rotation of vectors to align the edges in dihedral projections, which also reduces the errors.
In addition, a procedure was carried out to test the calculations experimentally. The proposed technique was tested on a CMM by locating the camera through probing with the 3-2-1 method, and the position given by the CMM was compared with the one calculated from the image taken by the camera. The proposed method gives maximum errors of 2% in the measured distances. The large errors detected in individual coordinates are due to the parallelism of two sides with the image plane since, in this case, the vanishing point is distant in space and its determination by the intersection of two almost parallel lines has more variability.

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

UAV: Unmanned Aerial Vehicle
CMM: Coordinate Measurement Machine
DIN: Deutsches Institut für Normung (German Institute for Standardization)
Consequently, the angle of rotation ω can be calculated using the Equation (A10):