Absolute IOP/EOP Estimation Models without Initial Information of Various Smart City Sensors

In smart cities, large numbers of optical cameras are deployed and used; closed-circuit television (CCTV) cameras, unmanned aerial vehicles (UAVs), and smartphones are some examples of such equipment. However, additional information about these devices, such as their 3D position, orientation, and principal distance, is usually not provided. To solve this problem, a structured mobile mapping system point cloud was used in this study to investigate methods of estimating the principal distance, position, and orientation of optical sensors without initial given values. The principal distance was calculated using two direct linear transformation (DLT) models and a perspective projection model. Methods for estimating position and orientation were discussed, and their stability was tested using real-world sensors. The camera position and orientation were best estimated when the perspective projection model was used, whereas the original DLT model produced a significant error in the orientation estimation; the correlation between the DLT model parameters is thought to have influenced this result. With the perspective projection model, the position and orientation errors were 0.80 m and 2.55°, respectively. However, when a fixed-wing UAV was used, proper estimates could not be produced owing to ground control point placement problems.


Introduction
Cities have recently been transformed into smart cities to increase their survivability and improve the quality of life of their residents. The goals of smart cities are achieved through the use of various sensors to collect and analyze data [1]. Cameras are examples of optical sensors used in smart cities. Optical sensors are used for real-time applications such as traffic control, social safety, and disaster response [2,3]. For example, closed-circuit television (CCTV) cameras are already installed in many cities and play an important role. In Korea, the number of installed CCTV cameras is gradually increasing for purposes such as crime prevention and disaster monitoring.
However, the precision of the three-dimensional position of optical sensors has received little attention. In the case of CCTV, position information is only roughly provided: typically, only latitude and longitude can be checked, and orientation information is unavailable [4]. Furthermore, the cameras used in smart cities are not standardized, so detailed specifications for a given camera are not readily available. Determining the specifications and locations of these numerous sensors takes time, incurs administrative costs, and is impossible without the cooperation of various organizations. In addition, the public officials in charge are often unaware of the significance of camera information. In summary, the interior orientation parameters (IOPs) and exterior orientation parameters (EOPs), which are the most important information about a sensor, are difficult to obtain. Furthermore, some studies performed position estimation of various sensors using deep learning [28-31], but from a surveying standpoint it is difficult to accept that the results are close to the true values.
Although many studies have been conducted in this area, previous work has focused on theory rather than on the application of real data, and, to the best of our knowledge, no research has analyzed each algorithm using the same data. This study performed absolute position and orientation estimation as well as camera calibration for cameras with no initial information. We proposed and compared standardized algorithms that can be applied to a variety of camera sensors, such as smartphones, drones, and CCTV, without initial information. This study focuses on estimating IOP (principal distance)/EOP information. The main objectives were:
1. Investigation and comparative analysis of IOP/EOP estimation models;
2. Stability and accuracy analysis of IOP/EOP estimation models;
3. Analysis of estimation results using practical optical sensor data.

DLT Model
A DLT model connects points in 3D object space and the 2D image plane using parameters. Because of their simplicity and low computational cost, DLT models are widely used in close-range photogrammetry, computer vision, and robotics. The mathematical model relating object space and image space using homogeneous coordinates is given as

$$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \sim \begin{bmatrix} L_1 & L_2 & L_3 & L_4 \\ L_5 & L_6 & L_7 & L_8 \\ L_9 & L_{10} & L_{11} & L_{12} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad (1)$$

where x and y are image coordinates; X, Y, and Z are the object space coordinates; and $L_n$ is the nth DLT parameter. This model can be written as follows:

$$x = \frac{L_1 X + L_2 Y + L_3 Z + L_4}{L_9 X + L_{10} Y + L_{11} Z + L_{12}} + e_x, \qquad y = \frac{L_5 X + L_6 Y + L_7 Z + L_8}{L_9 X + L_{10} Y + L_{11} Z + L_{12}} + e_y. \qquad (2)$$

At least six well-distributed GCPs are required to calculate the DLT parameters in Equation (2). The LESS can be used to determine the best DLT parameters. Lens distortion terms can also be applied to the DLT model by using Equation (3) [32]:

$$x = \frac{L_1 X + L_2 Y + L_3 Z + L_4}{L_9 X + L_{10} Y + L_{11} Z + L_{12}} + dist_x + e_x, \qquad y = \frac{L_5 X + L_6 Y + L_7 Z + L_8}{L_9 X + L_{10} Y + L_{11} Z + L_{12}} + dist_y + e_y. \qquad (3)$$
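The estimation of the DLT parameters in Equation (2) can be sketched in a few lines of NumPy. The paper's experiments were run in Matlab; the Python sketch below is illustrative only, using the SVD null-space solution of the stacked linear system (an alternative to the LESS normal equations), and the function name `estimate_dlt` is hypothetical.

```python
import numpy as np

def estimate_dlt(gcp_xyz, gcp_uv):
    """Estimate the 12 DLT parameters L1..L12 from n >= 6 GCPs.

    gcp_xyz: (n, 3) object-space coordinates; gcp_uv: (n, 2) image coordinates.
    Each GCP contributes two homogeneous linear equations from Equation (2).
    Returns the 3x4 DLT matrix, defined only up to a global scale, taken as
    the SVD null-space (smallest-singular-value) solution.
    """
    rows = []
    for (X, Y, Z), (x, y) in zip(gcp_xyz, gcp_uv):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)  # unit-norm solution vector
```

With error-free GCPs the recovered matrix matches the true one up to a global scale factor, which is exactly the scale ambiguity of the DLT parameterization.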

Perspective Projection Model
Equation (4) is a perspective projection camera model expressed with homogeneous vectors:

$$\mathbf{x}_I \sim \mathbf{P}\,\mathbf{X}_W, \qquad (4)$$

where $\mathbf{P}$ is the $3 \times 4$ homogeneous camera projection matrix. The principal point offset, pixel ratio, and skew can be applied using the calibration matrix in Equation (5):

$$\mathbf{K} = \begin{bmatrix} \alpha & s & p_x \\ 0 & \beta & p_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad (5)$$

where $[p_x\ p_y]^T$ represents the coordinates of the principal point; α and β are the pixel ratios; and s is the skew parameter [20]. Figure 1 shows the process of converting between the two coordinate systems (world coordinate system and camera coordinate system) using rotation and translation. The geometric camera model with camera rotation and translation is applied as follows [20]:

$$\mathbf{x}_I \sim \mathbf{K}\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_O\,]\,\mathbf{X}_W. \qquad (6)$$


DLT Model
The camera position and parameters can be calculated using the DLT model [19]. Equations (7) and (8) show the DLT and perspective projection models, respectively:

$$\mathbf{x}_I \sim \mathbf{L}\,\mathbf{X}_W, \qquad (7)$$
$$\mathbf{x}_I \sim \mathbf{K}\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_O\,]\,\mathbf{X}_W, \qquad (8)$$

where $\mathbf{x}_I$ is the homogeneous image coordinate vector, $\mathbf{K}$ is the calibration matrix, $\mathbf{R}$ is the rotation matrix, $\mathbf{X}_O$ is the camera position vector, $\mathbf{X}_W$ is the homogeneous object point vector, and $\mathbf{I}$ is the identity matrix. From Equations (7) and (8), Equation (9) can be derived:

$$\mathbf{L} = \mathbf{K}\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_O\,] = [\,\mathbf{K}\mathbf{R} \mid -\mathbf{K}\mathbf{R}\,\mathbf{X}_O\,]. \qquad (9)$$

Equation (9) can be rewritten as follows:

$$\mathbf{L} = [\,\mathbf{M} \mid \mathbf{m}_4\,], \quad \mathbf{M} = \mathbf{K}\mathbf{R}, \quad \mathbf{m}_4 = -\mathbf{M}\,\mathbf{X}_O. \qquad (10)$$

On the basis of Equations (9) and (10), the camera position and rotation matrix can be computed as follows:

$$\mathbf{X}_O = -\mathbf{M}^{-1}\mathbf{m}_4, \qquad (11)$$
$$\mathbf{R} = \mathbf{K}^{-1}\mathbf{M}. \qquad (12)$$

Sensors 2023, 23, 742

The $\mathbf{X}_O$ matrix gives the camera position, and the $\mathbf{R}$ matrix gives the camera orientation. The camera calibration matrix $\mathbf{K}$ can be calculated from Equation (10) and a Choleski factorization, since

$$\mathbf{M}\mathbf{M}^T = \mathbf{K}\mathbf{R}\,\mathbf{R}^T\mathbf{K}^T = \mathbf{K}\mathbf{K}^T. \qquad (13)$$
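The decomposition in Equations (11)-(13) translates directly into code. The sketch below (illustrative Python, not the paper's Matlab implementation; the function name is hypothetical) recovers $\mathbf{X}_O$ by Equation (11) and obtains the upper-triangular $\mathbf{K}$ and the rotation $\mathbf{R}$ from the left 3×3 block $\mathbf{M} = \mathbf{K}\mathbf{R}$ via an RQ factorization built from NumPy's QR, which is equivalent to the Cholesky route since $\mathbf{M}\mathbf{M}^T = \mathbf{K}\mathbf{K}^T$.

```python
import numpy as np

def decompose_dlt(P):
    """Split a 3x4 DLT/projection matrix into K, R, and camera position X_O.

    P = [M | m4] with M = K R, so X_O = -inv(M) @ m4 (Equation (11)).
    K (upper triangular) and R (orthonormal) come from an RQ factorization
    of M, implemented via QR of the row/column-reversed transpose.
    Assumes P is scaled so that a proper rotation results.
    """
    M, m4 = P[:, :3], P[:, 3]
    X_O = -np.linalg.solve(M, m4)            # Equation (11)
    E = np.fliplr(np.eye(3))                 # reversal (exchange) matrix
    Q_, R_ = np.linalg.qr((E @ M).T)
    K = E @ R_.T @ E                         # upper triangular
    R = E @ Q_.T                             # orthonormal
    S = np.diag(np.sign(np.diag(K)))         # force positive diagonal of K
    K, R = K @ S, S @ R                      # K @ R is unchanged
    return K / K[2, 2], R, X_O
```

Because the factorization with a positive-diagonal calibration matrix is unique, an exactly constructed projection matrix is recovered without ambiguity.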

Perspective Projection Model
Equation (14) is the geometric model of the perspective projection:

$$\mathbf{x}_{I,i} \sim \mathbf{K}\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_O\,]\,\mathbf{X}_{W,i}, \qquad (14)$$

where $\mathbf{x}_{I,i}$ denotes the ith image point coordinates and $\mathbf{X}_{W,i}$ the ith world point coordinates. Let us assume a typical camera (skew parameter ≈ 0 and pixel ratio ≈ 1). The camera calibration matrix $\mathbf{K}$ then becomes $\mathbf{K} = \mathrm{diag}(1, 1, w)$ for $w = 1/f$. With these assumptions, Equation (14) can be written as follows:

$$\mathbf{x}_{I,i} \sim \mathrm{diag}(1, 1, w)\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_O\,]\,\mathbf{X}_{W,i} = \mathbf{P}\,\mathbf{X}_{W,i}. \qquad (15)$$

Undistorted and distorted image coordinates are related by Equation (16), according to Fitzgibbon's radial distortion model [34]:

$$\mathbf{p}_u \sim \begin{bmatrix} x_d & y_d & 1 + k\,r_d^2 \end{bmatrix}^T, \qquad (16)$$

where k is the radial distortion parameter, $\mathbf{p}_u = [x_u\ y_u\ 1]^T$ is an undistorted image point, $\mathbf{p}_d = [x_d\ y_d\ 1]^T$ is a distorted image point, and $r_d^2 = x_d^2 + y_d^2$ is the squared radius of $\mathbf{p}_d$ about the distortion center. The image point can therefore be written as follows:

$$\mathbf{x}_i = \begin{bmatrix} x_{d,i} & y_{d,i} & 1 + k\,r_{d,i}^2 \end{bmatrix}^T \sim \mathbf{P}\,\mathbf{X}_{W,i}. \qquad (17)$$

In this step, we use the properties of the skew-symmetric matrix. The skew-symmetric matrix $[\mathbf{a}]_\times$ of a vector $\mathbf{a} = [a_1\ a_2\ a_3]^T$ is defined as:

$$[\mathbf{a}]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}. \qquad (18)$$

By the property of the skew-symmetric matrix, $[\mathbf{x}_i]_\times \mathbf{x}_i = \mathbf{0}$, which gives

$$[\mathbf{x}_i]_\times \mathbf{P}\,\mathbf{X}_{W,i} = \mathbf{0}. \qquad (19)$$

The third row of Equation (19) can be rewritten as follows (note that the distortion term cancels in this row):

$$-y_{d,i}\,(p_{11}X_i + p_{12}Y_i + p_{13}Z_i + p_{14}) + x_{d,i}\,(p_{21}X_i + p_{22}Y_i + p_{23}Z_i + p_{24}) = 0. \qquad (20)$$

Sensors 2023, 23, 742

The eight parameters $p_{11}, p_{12}, \ldots, p_{24}$ are unknown, with seven degrees of freedom since the solution is defined only up to scale. If seven GCPs are obtained, the seven equations can be expressed in matrix form as

$$\mathbf{M}_1 \mathbf{v} = \mathbf{0}. \qquad (21)$$

By decomposing matrix $\mathbf{M}_1$ through SVD, the last column of matrix $\mathbf{V}$ can be selected as a solution. A constant value λ is needed to obtain the actual solution because the norm of the chosen solution is fixed at 1. Equation (22) shows the solution in terms of the vector $\mathbf{v}_1$ taken from the last column of matrix $\mathbf{V}$:

$$\mathbf{v} = \lambda\,\mathbf{v}_1. \qquad (22)$$
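The skew-symmetric operator of Equation (18) is small enough to verify numerically. A minimal sketch (the helper name `skew` is illustrative); note that $[\mathbf{a}]_\times \mathbf{b}$ equals the cross product $\mathbf{a} \times \mathbf{b}$, which is why $[\mathbf{x}_i]_\times \mathbf{x}_i = \mathbf{0}$ in Equation (19):

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x of Equation (18).

    Satisfies skew(a) @ b == np.cross(a, b), hence skew(a) @ a == 0.
    """
    a1, a2, a3 = a
    return np.array([[0.0, -a3, a2],
                     [a3, 0.0, -a1],
                     [-a2, a1, 0.0]])
```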
Let the left $3 \times 3$ submatrix of matrix $\mathbf{P}$ be $\mathbf{P}'$. Matrix $\mathbf{P}$ can be written as Equation (26):

$$\mathbf{P} = [\,\mathbf{P}' \mid \mathbf{p}_4\,], \quad \mathbf{P}' = \mathrm{diag}(1, 1, w)\,\mathbf{R}, \qquad (26)$$

where $\mathbf{R}$ is the rotation matrix of the camera. The three rows of matrix $\mathbf{P}'$ are mutually perpendicular because $\mathbf{K}$ is a diagonal matrix. In addition, the norms of the first and second row vectors of $\mathbf{P}'$ are the same; thus, Equations (27)-(30) are established:

$$p_{11}p_{31} + p_{12}p_{32} + p_{13}p_{33} = 0, \qquad (27)$$
$$p_{21}p_{31} + p_{22}p_{32} + p_{23}p_{33} = 0, \qquad (28)$$
$$p_{11}p_{21} + p_{12}p_{22} + p_{13}p_{23} = 0, \qquad (29)$$
$$p_{11}^2 + p_{12}^2 + p_{13}^2 = p_{21}^2 + p_{22}^2 + p_{23}^2. \qquad (30)$$

Let $p_{31} = \delta$. Subsequently, $p_{32}$ and $p_{33}$ can be parameterized in terms of δ by using Equations (27) and (28). The results are given as follows:

$$p_{32} = \delta\,\frac{p_{13}p_{21} - p_{11}p_{23}}{p_{12}p_{23} - p_{13}p_{22}}, \qquad p_{33} = \delta\,\frac{p_{11}p_{22} - p_{12}p_{21}}{p_{12}p_{23} - p_{13}p_{22}}.$$

The remaining unknown parameters are $p_{31}, p_{34}, k_1, k_2, k_3$ (extending Equation (16) to three radial distortion terms). These five unknown parameters can be obtained using the second row of Equation (19), which is linear in all five once the known first-row terms are moved to the observation side. Here, $\mathbf{M}_2$ and $\mathbf{v}_2$ have dimensions of $7 \times 5$ and $7 \times 1$, respectively, because seven GCPs are used. As a result, the unknowns can be calculated using the LESS as Equation (37):

$$\hat{\mathbf{x}} = (\mathbf{M}_2^T \mathbf{M}_2)^{-1}\mathbf{M}_2^T \mathbf{v}_2. \qquad (37)$$

The values of each element of matrix $\mathbf{P}$ and the camera distortion parameters can be obtained using the equations described above. The principal distance is the final unknown parameter. The relationship between the first and last rows of matrix $\mathbf{P}'$ can be used to calculate it: based on Equation (14), the first row of $\mathbf{P}'$ has unit norm while the third row has norm w, which yields Equations (38) and (39):

$$w^2 = \frac{p_{31}^2 + p_{32}^2 + p_{33}^2}{p_{11}^2 + p_{12}^2 + p_{13}^2}. \qquad (39)$$

Finally, we can obtain the focal length from Equation (39):

$$f = \frac{1}{w}.$$
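The final step, Equation (39) followed by $f = 1/w$, depends only on the row norms of the left 3×3 block of $\mathbf{P}$ and is therefore invariant to the global scale of $\mathbf{P}$. A minimal sketch (hypothetical function name):

```python
import numpy as np

def focal_from_P(P):
    """Principal distance from Equation (39): w^2 = ||p3'||^2 / ||p1'||^2, f = 1/w.

    p1' and p3' are the first and third rows of the left 3x3 block of P.
    The ratio of row norms is invariant to any global scaling of P.
    """
    M = P[:, :3]
    w = np.sqrt((M[2] @ M[2]) / (M[0] @ M[0]))
    return 1.0 / w
```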

Equipment and Dataset
CCTV, unmanned aerial vehicles (UAVs), and smartphones, all of which can be used in smart cities, were used as the target sensor platforms. Images were captured in a variety of environments with each sensor, and the estimation results were compared. EOPs were estimated and compared with a total station surveying result (position parameters X, Y, Z) and the SPR result (orientation parameters ω, φ, κ). The process using the DLT models and the perspective projection model is illustrated in Figure 2. Figure 3 shows the camera platforms used in the experiments. Each sensor platform was calibrated using a checkerboard and a camera geometric model.

Figure 4 shows images from each sensor platform. CCTV images were obtained from locations under conditions similar to those found in a smart city. Two drones were used to capture images: one in an oblique direction (rotary-wing UAV) and one in a nadir direction (fixed-wing UAV). The image acquisition conditions were investigated by comparing the results obtained from the two images. Smartphone images were acquired without any specific photographic conditions. The GCP positions in the images are denoted by yellow X marks.

Each image dataset has unique position and orientation properties. Although the heights of the platforms are clearly different, Figure 4a,b shows that they have a similar orientation, looking down diagonally. Figure 4b,c shows cameras mounted on UAVs: while the Z values of the positions are similar, the orientations are noticeably different. Figure 4b looks in a diagonal direction, which allows a wide variety of GCPs, whereas in Figure 4c the GCPs are distributed on an almost constant plane; the Z diversity of the GCPs is particularly low in the park that is the study's target area. Finally, the smartphone image was captured by a user holding the phone and looking to the side, which can differ significantly from the image orientations of Figure 4a-c.

MMS + UAV hybrid point cloud data were used in this study to acquire the 3D locations of the GCPs and checkpoints (CKPs). The smart city point cloud was used because GCPs could be easily obtained from it without direct surveys. The georeferenced point cloud generated in Mohammad's study [35] was used (as shown in Figure 5).

Simulation Experiments
Before conducting experiments using real sensors, simulation experiments were performed to compare the performance of each algorithm using a 10 × 10 × 10 virtual grid. Thirteen virtual grid points were chosen as GCPs at random from the pool of 1000. The camera parameters, position, and rotation were calculated using 100,000 GCP combinations from a possible set of C(1000, 13) ≈ 1.4849 × 10^29. The camera parameters were set close to actual camera parameters, and the camera IOPs/EOPs and the coordinates of the virtual points were set based on the actual TM coordinate system. The set camera parameters and IOP/EOP values are listed in Table 2. The estimated values were directly compared with the true values, and the reprojection error was calculated using 987 virtual points. Figure 6a depicts the virtual grid and the camera position, Figure 6b depicts the virtual grid, and Figure 6c depicts the virtual image generated from the virtual grid. The simulation environment was Windows 11 with Matlab R2022b.

Table 2. Camera parameters, IOPs, and EOPs.

Camera Parameters | Values
Principal distance | 3500 pixels
Principal point x_p | 50 pixels
Principal point y_p | 20 pixels
Radial distortion parameter |
Rotation angle |
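The simulation setup above can be sketched as follows. The grid spacing, camera orientation, and camera position in this sketch are assumptions for illustration (only the principal distance and principal point come from Table 2), and the code draws a single random set of 13 GCPs rather than running the full 100,000-combination Monte Carlo experiment:

```python
import numpy as np

rng = np.random.default_rng(42)

# 10 x 10 x 10 virtual grid (1000 object points); 10 m spacing is an assumption
g = np.arange(10) * 10.0
grid = np.array([[x, y, z] for x in g for y in g for z in g])

# Assumed camera: principal distance 3500 px, principal point (50, 20) (Table 2);
# identity orientation and a position behind the grid are illustrative choices
f = 3500.0
K = np.array([[f, 0.0, 50.0], [0.0, f, 20.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
X_O = np.array([45.0, 45.0, -300.0])
P = K @ np.hstack([R, (-R @ X_O)[:, None]])

def project(P, pts):
    """Project 3D points with a 3x4 matrix and dehomogenize to pixels."""
    ph = (P @ np.c_[pts, np.ones(len(pts))].T).T
    return ph[:, :2] / ph[:, 2:3]

# one random draw of 13 GCPs out of 1000, as in the Monte Carlo experiment
gcp_idx = rng.choice(len(grid), size=13, replace=False)
gcp_xyz, gcp_uv = grid[gcp_idx], project(P, grid[gcp_idx])
```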

Figure 7c-e shows the camera orientation estimation results. The perspective projection, NDLT, and ODLT models showed good performance, in that order, and all three algorithms showed an error of less than 1 degree. Figure 7f-h shows the camera position estimation results. The NDLT and perspective projection models showed good performance, and the ODLT model also showed satisfactory performance. The maximum error when using the NDLT and perspective projection models did not exceed 1 m, but when the ODLT model was used, the maximum error was relatively large.
Figure 7. Accuracy assessment results using a 10 × 10 × 10 virtual grid.

Figure 8 shows box plots of the mean reprojection error of each algorithm. When comparing the mean reprojection error, the ODLT model showed outstanding performance: its maximum error did not exceed 0.5 pixels. The NDLT and perspective projection models also showed good performance, but their maximum errors were 1.71 pixels and 3.72 pixels, respectively.

It is interesting to note that the X, Y, and Z distributions of the GCPs also affect the quality of the estimation results. Aside from the distribution of GCPs on the image plane, an even distribution of GCPs in 3D object space is critical [36,37]. GCPs were randomly selected on one plane as shown in Figure 9a and on multiple planes as shown in Figure 9b, and the results were compared.

When GCPs were selected on only one plane, IOP/EOP estimation was not performed properly. Figure 10 is a visual representation of the results, showing the magnitude and direction of the reprojection error; a visually unacceptable error occurred. None of the three algorithms produced meaningful estimation results. In addition to the reprojection error, the estimation results for the camera IOPs and EOPs were also unacceptable. As with many camera models, it is clear that the 3D distribution of GCPs is critical.

When GCPs were selected on multiple planes, the camera parameter estimation and reprojection results improved remarkably; Table 3 shows the camera EOPs and the mean reprojection errors. The X, Y, and Z errors of all three models were less than 50 cm. In particular, the positional error of the perspective projection model was less than half that of the other models, and its orientation error and mean reprojection error were also the smallest.
The degree of 3D distribution must be judged relative to the distance between the camera and the object. Compare close-range photogrammetry with an object-sensor distance of about 20 m with aerial photogrammetry at a flight altitude of 200 m or more: a GCP distribution with the same depth range can be treated as near-planar in aerial photogrammetry [38]. As a result, GCPs must be chosen carefully for each sensor platform.
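A simple way to screen a candidate GCP set for the near-planar failure case described above is to compare its extents along the principal axes of the point distribution. This check is not from the paper; it is an illustrative sketch with a hypothetical function name:

```python
import numpy as np

def depth_extent_ratio(gcp_xyz):
    """Smallest-to-largest extent ratio of a GCP set along its principal axes.

    Computed from the singular values of the centered coordinates. A value
    near zero means the points lie close to a single plane, which is the
    degenerate configuration for IOP/EOP estimation.
    """
    centered = gcp_xyz - gcp_xyz.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s[-1] / s[0]
```

A threshold on this ratio could be tuned per platform, since (as noted above) the same depth range that is adequate at close range becomes effectively planar at aerial altitudes.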

Practical Experiments
This section describes experiments in which the ODLT, NDLT, and perspective projection models were used to estimate the actual sensor position/orientation and principal distance. Sensor calibration was performed before the experiments to determine the IOP values of each sensor. However, IOPs can change for a variety of reasons. For example, the principal point varies with lens group perturbation and may change with aperture and focus [39-41]. The value of the radial distortion parameter changes with the principal distance, making generalized modeling difficult [40], and the estimated radial distortion parameter can also vary with the distance to the control points [42,43]. The camera was set to manual mode to control various factors; however, the micromechanism that operates the lens group could not be controlled. Therefore, in this study, direct comparative analysis was applied only to the estimated camera EOPs. The estimated focal length is shown to examine the trend of the estimation results, but the principal point location and camera distortion parameters are not.
To examine the accuracy of the estimated sensor position, a virtual reference station (VRS) GPS survey was performed. Further, as reference values for the EOPs, the SPR result based on sensor measurement and camera calibration can be used. The accuracy of the orientation estimation can be indirectly checked using the mean reprojection error (MRE) and the comparison with the SPR result. In this paper, both the orientation parameters estimated by SPR and the reprojection results are presented. The locations of the GCPs are marked in Figure 4. The pixel and ground coordinates of the GCPs were substituted into $\mathbf{x}_I$ and $\mathbf{X}_W$ to estimate $L_1$ to $L_{12}$ in Equation (7); from these, the $\mathbf{X}_O$ and $\mathbf{K}$ matrices were computed to estimate the camera position, orientation, and principal distance. In addition, the pixel coordinates of the GCPs were applied to $x_i$ and $y_i$, and their ground coordinates to $X_i$, $Y_i$, and $Z_i$, in Equation (20) to estimate $p_{11}$ to $p_{34}$ for the camera position, orientation, and principal distance.
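The MRE used throughout the evaluation can be computed as below. This is a minimal sketch (hypothetical function name) assuming the estimated model is available as a 3×4 projection matrix and the checkpoints as coordinate arrays:

```python
import numpy as np

def mean_reprojection_error(P, ckp_xyz, ckp_uv):
    """Mean reprojection error (MRE) over checkpoints, in pixels.

    Projects each 3D checkpoint with the estimated 3x4 matrix P and averages
    the Euclidean distance to the measured image coordinates.
    """
    ph = (P @ np.c_[ckp_xyz, np.ones(len(ckp_xyz))].T).T
    uv = ph[:, :2] / ph[:, 2:3]
    return float(np.mean(np.linalg.norm(uv - ckp_uv, axis=1)))
```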

CCTV
The CCTV image was used to estimate the camera principal distance and EOPs. The reprojection error was calculated using the estimated IOPs/EOPs and 10 CKPs. Table 4 shows the estimated principal distance, whereas Table 5 shows the EOPs of the camera based on the CCTV image. The estimated camera position error for each method is shown in Figure 11a, and the rotation angle error is shown in Figure 11b. The MRE for each model is depicted in Figure 11c. In the case of the position estimation error, the perspective projection model produced the most accurate results: the position errors for the ODLT and NDLT models were 1.9856 m and 1.3951 m, respectively, whereas the perspective projection model had an error of 0.6336 m, allowing a more accurate position estimation. In the case of the rotation angle estimation, the perspective projection model again produced the best results, whereas the ODLT model produced a large rotation angle error. However, the reprojection results were consistent across all three models.

UAV
Table 6 shows the calibrated and estimated principal distances of the UAV camera sensors, and Table 7 shows the estimated EOP errors and the reprojection errors.

Table 7. Estimated EOP and reprojection errors of UAVs.

Interestingly, in the experiment with the image taken in the nadir direction (fixed-wing UAV), an unacceptably large error occurred in the IOP and EOP estimation. The estimated principal distance differed greatly from the calibration result, and large rotation and position errors occurred in the EOPs as well. In contrast, the experiment using the image taken in the oblique direction (rotary-wing UAV) showed acceptable results. This is related to the distribution of the GCPs described in Section 3.1. The fixed-wing image was taken at a high altitude (>200 m), but the height distribution of the GCPs was within 4.09 m. Because all GCPs and CKPs were on nearly the same plane, the reprojection errors were not large, but proper IOP/EOP estimation was not performed.

The results obtained using the rotary-wing UAV image are shown in Figure 12. Figure 12a,b depicts the camera position and camera rotation angle errors, respectively, and Figure 12c displays the MRE. In terms of the camera position error, the perspective-projection-model-based algorithm performed best. The DLT-based algorithms also produced acceptable estimation results, with errors of less than 1.6960 m and 2.1053 m, respectively, while the position estimation accuracy of the perspective projection model was within 0.7966 m. The ODLT model had a maximum rotation angle estimation error of 7.15°, but the other two algorithms were generally capable of accurate rotation angle estimation. All three algorithms had an MRE of fewer than 5 pixels.

Smartphone

The calibrated and estimated principal distances of the smartphone camera are shown in Table 8, and Table 9 displays the estimated EOP errors and the reprojection errors. Figure 13a compares the camera position error estimated by each algorithm, and Figure 13b shows the estimated orientation angle error. The MRE is depicted in Figure 13c. All three models produced accurate camera position estimation results, and the same pattern was observed in the orientation estimation. However, the ODLT position and orientation estimates were noticeably less accurate than those of the other two models.

Overall, the perspective projection model showed good results. This is because, when the DLT models were used, the correlation between parameters affected the estimation quality. The results of the three algorithms had lower reliability than the widely used camera calibration or SPR results. However, the difference from the camera calibration and SPR results was small enough for the estimates to serve as initial parameter values. Therefore, it is possible to estimate the sensor IOPs/EOPs precisely by fusing these methods with camera calibration and SPR.

Simulation Experiments
When estimating the IOPs (the principal distance and the principal point) in the two DLT models, the A and C components were used for the x component of the IOPs, and the B and C components for the y component; when estimating the EOPs, all of the A, B, and C components were used. Figure 14 shows the relationships between the ODLT parameters calculated with total least squares. In the case of ODLT, the DLT parameters were correlated with each other overall, while in the case of NDLT the correlation was high between the parameters of the A, B, and C block components. A strong relationship between parameters can reduce estimation precision and increase error [44,45]. Because of the high correlation between DLT parameters, errors in some parameters may be absorbed by other parameters, and these errors propagate into the accuracy of the IOP/EOP estimation [46]. In this regard, when IOPs/EOPs are estimated using ODLT and NDLT, the correlation between parameters influences the result, potentially lowering the estimation accuracy. In particular, the estimation accuracy of ODLT, which shows correlation across all parameters, is expected to be lower than that of NDLT.
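To make the correlation analysis concrete, the sketch below builds the 11-parameter ODLT system from 3D-2D correspondences, solves it by least squares, and derives a parameter correlation matrix from the inverse normal matrix (AᵀA)⁻¹. This is an illustrative reconstruction with synthetic data and a hypothetical camera, not the exact total-least-squares computation behind Figure 14:

```python
import numpy as np

def dlt_system(obj_pts, img_pts):
    """Build the ODLT system A·L = b (11 parameters, L12 fixed to 1)
    from 3D object points (n, 3) and image points (n, 2)."""
    rows_A, rows_b = [], []
    for (X, Y, Z), (u, v) in zip(obj_pts, img_pts):
        rows_A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        rows_A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        rows_b += [u, v]
    return np.array(rows_A), np.array(rows_b)

def parameter_correlation(A):
    """Correlation matrix of the estimated parameters, taken from the
    inverse normal matrix (AᵀA)⁻¹ up to a common variance factor."""
    cov = np.linalg.inv(A.T @ A)
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

# Synthetic, well-distributed (non-coplanar) control points.
rng = np.random.default_rng(0)
obj_pts = rng.uniform(-10.0, 10.0, (20, 3))

# Hypothetical camera: f = 1000 px, principal point (640, 480),
# identity rotation, 50 m in front of the point cloud.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 480.0], [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), [[0.0], [0.0], [50.0]]])
homog = np.hstack([obj_pts, np.ones((len(obj_pts), 1))])
proj = homog @ P.T
img_pts = proj[:, :2] / proj[:, 2:]

A, b = dlt_system(obj_pts, img_pts)
L, *_ = np.linalg.lstsq(A, b, rcond=None)  # estimated DLT parameters
corr = parameter_correlation(A)            # 11×11 correlation matrix
```

As the GCP set approaches a plane, AᵀA becomes ill-conditioned and the off-diagonal entries of `corr` approach ±1, mirroring the degradation discussed above.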

Practical Experiments
In the experiments using CCTV, a UAV, and a smartphone, the camera EOP estimation results were, in descending order of quality, those of the point-based perspective projection model, the NDLT model, and the ODLT model. For camera position estimation, ODLT and NDLT showed similar results, but the position estimation error was slightly larger when ODLT was used. For rotation angle estimation, NDLT and the perspective projection model showed significantly better results than ODLT, while for the reprojection error the three models showed similar results. This is judged to be because the high correlation between the DLT model parameters affects the estimation result, as analyzed in Section 3.1.
In addition, since the DLT parameters serve only to connect 3D points and image points, the reprojection error is smaller than that of the perspective projection model, but the quality of the camera position and orientation estimates is judged to be inferior.
In general, it was possible to estimate IOPs/EOPs using the three models, but good results were not obtained with the nadir images acquired from the fixed-wing UAV. The position estimation errors were over 100 m, and the rotation angle estimation did not produce a reasonable result. The camera IOP estimates were also less reliable. This is because the GCPs lie almost on the same plane owing to the high-altitude imaging. The reprojection results seem reasonable, but only because the CKPs are distributed on the same plane, as presented in Section 3.1. Neither DLT nor the perspective projection model can be used in this environment; it is more appropriate to use the classic SPR.
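A quick diagnostic for this degenerate configuration is to measure how coplanar the GCP set is before attempting estimation, e.g., by comparing the singular values of the centered GCP coordinates. The sketch below uses hypothetical numbers (a 300 m ground extent with roughly the 4.09 m height spread reported above, versus a set spread in all three axes):

```python
import numpy as np

def planarity_ratio(gcps):
    """Ratio of the smallest to the largest singular value of the
    centered GCP coordinates; values near 0 mean a nearly coplanar set."""
    centered = gcps - gcps.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s[-1] / s[0]

rng = np.random.default_rng(1)
# Nadir fixed-wing case: wide ground extent, heights within ~4 m.
flat = rng.uniform(0.0, 300.0, (30, 3))
flat[:, 2] = rng.uniform(0.0, 4.09, 30)
# Well-distributed case: comparable spread in all three axes.
spread = rng.uniform(0.0, 300.0, (30, 3))

print(planarity_ratio(flat))    # close to 0: degenerate configuration
print(planarity_ratio(spread))  # well away from 0
```

A threshold on this ratio could be used to fall back to the classic SPR automatically when the GCP geometry is too flat for DLT or the perspective projection model.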

Contribution and Limitations
In this paper, a study was carried out to estimate the IOPs/EOPs of an optical sensor in the absence of initial values. Sensor positioning was performed using three different algorithms, and the results were confirmed to differ. Experiments were carried out at both the simulation and real-data levels. CCTV, a UAV, and smartphones were used, and it was found that applying the three algorithms was difficult if the diversity of the GCPs was not secured.
The limitations of this study are as follows. The first limitation is that the study depends on the quality of the point cloud from which the GCPs are acquired. Many MMS devices currently acquire city point clouds, but the quality of these point clouds varies. When uncalibrated MMS equipment is used, the positional accuracy of the point cloud is greatly reduced, which directly affects the optical sensor's orientation/position estimation result. The second limitation is the simplification of the camera distortion parameters: the tangential distortion parameters were ignored in this study, and the radial distortion parameters were assumed to be small. Cases with large lens distortion, such as fisheye lenses, were not covered. As a result, future research must investigate how the positional accuracy of the point cloud propagates into the estimation results, and a position and orientation estimation process must be developed for lenses with large distortion parameters. However, in situations where sufficient GCPs can be secured, for example, in an indoor space where a point cloud is acquired with terrestrial LiDAR, effective results can be produced for estimating the position and orientation of the sensor. This research team is conducting additional research on point cloud registration using these characteristics and expects to obtain interesting results.
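To illustrate the simplification described above: neglecting tangential terms and assuming small radial distortion effectively reduces the Brown distortion model to a single radial coefficient. A minimal sketch of that one-coefficient forward model follows (the symbols f, cx, cy, and k1 and the example values are illustrative, not values from this study):

```python
def apply_radial_distortion(u, v, f, cx, cy, k1):
    """Apply one-coefficient radial (Brown) distortion to an ideal pixel
    coordinate; tangential terms are omitted, matching the
    simplification made in this study."""
    xn, yn = (u - cx) / f, (v - cy) / f  # normalized image coordinates
    r2 = xn * xn + yn * yn               # squared radial distance
    scale = 1.0 + k1 * r2                # radial scaling factor
    return cx + f * xn * scale, cy + f * yn * scale

# A point 500 px off-center with f = 1000 px and k1 = -0.1 (barrel
# distortion) is pulled 12.5 px toward the principal point.
print(apply_radial_distortion(1140.0, 480.0, 1000.0, 640.0, 480.0, -0.1))
# → (1127.5, 480.0)
```

For a fisheye lens, k1 is no longer small, higher-order radial terms and an iterative inverse become necessary, and the estimation process would have to model them explicitly, which is the extension left to future work.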

Conclusions
In this study, the IOPs/EOPs of various smart city sensors were estimated using ODLT, NDLT, and the perspective projection models. MMS + UAV hybrid point cloud data were used to collect GCPs and CKPs. We tested two different images for each platform. In this study, camera IOPs were not used as true values because calibration results could vary depending on the experimental conditions and fine optical adjustment of the instrument was not possible. Instead, the obtained calibration and estimation results were presented in tables to confirm the trend of IOPs.
In general, the estimated camera EOP results rank, in descending order of quality: the perspective projection model, the NDLT model, and the ODLT model. For camera position estimation, ODLT and NDLT produce similar results, but ODLT produces slightly larger position estimation errors. The maximum error in estimating the sensor's position with the perspective projection model was 0.7966 m, and the average error was 0.6331 m; the ODLT and NDLT models had average errors of 1.6992 m and 1.2047 m, respectively. For rotation angle estimation, NDLT and the perspective projection model significantly outperform ODLT: the average orientation angle errors of the perspective projection model and the NDLT model were 0.88° and 0.76°, respectively, versus 3.07° for ODLT. The three models produce similar reprojection error results, with average reprojection errors of 3.67, 2.14, and 2.93 pixels, respectively.
Herein, three models were used to estimate IOPs/EOPs; however, the results obtained from the UAV-acquired nadir images were poor. The position estimation errors exceeded 100 m, the rotation angle estimation result was not reasonable, and the camera IOP estimates were less reliable. Because of the high-altitude imaging, the GCPs can be regarded as lying almost on the same plane; the reprojection results appear reasonable only because the CKPs are also distributed on that plane. Table 10 summarizes the errors for each sensor platform. Through this study, it is possible to quickly estimate the camera information, position, and orientation of the various optical sensors distributed in a smart city. Because the method uses the geometric characteristics of a frame camera, it can be applied not only to optical sensors but also to infrared cameras. In addition, the absolute or relative coordinates of various sensor platforms can be calculated. In particular, the results of this study can be applied to indoor and underground spaces where positioning systems such as global navigation satellite systems cannot be used. This research team plans to apply the findings of this study to the coarse registration of indoor point clouds in a future study.