1. Introduction
The camera is recognised as the primary tool in photogrammetry for measurement and information generation. Depending on its geometric structure and sensor type, this optical instrument can provide 2D information, semantic information, metric products, and precise practical data. Recently, the use of spherical cameras has gained attention in various projects [1]. Their extensive photographic coverage and their ability to capture the data needed for augmented reality, digital twins [2], and 3D reconstruction have made them increasingly viable [3,4]. These cameras are generally available in professional and low-cost models. The low-cost models, with their simpler configurations, offer easier operation and faster capture, and are often preferred for their accessibility and affordability [5,6]. However, several factors prevent the use of this type of camera in most photogrammetric studies. One is the high distortion of the images they produce, which requires a calibration tailored to the geometry of fisheye images. This problem has been studied extensively in photogrammetry in order to determine the optical image parameters [4,7], which mainly include the focal length (f), the pixel position of the image's geometric centre (cx, cy), the radial distortion parameters (K1, K2, K3, K4), the tangential distortion parameters (p1, p2), and the non-orthogonality of the physical axes of the camera lens (b1, b2) [8]. Camera calibration is essential for accurate image processing to determine positions from photographs, create 3D reconstructions, and extract any metric information from image-based products [9]. To accurately determine the relationship between image space and object space, a projection model is required [8].
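As a point of reference, one widely used Brown-type formulation relating an ideal (normalised) image point to its observed pixel position is sketched below. This is an illustration only; individual software packages adopt slightly different conventions, particularly for the affinity terms b1 and b2, and fisheye variants first map the incidence angle before applying the polynomial terms.

```latex
% Sketch of a Brown-type distortion model (conventions differ between packages).
% (x, y): normalised image coordinates, r^2 = x^2 + y^2.
\begin{align*}
x' &= x\bigl(1 + K_1 r^2 + K_2 r^4 + K_3 r^6 + K_4 r^8\bigr)
      + p_1\bigl(r^2 + 2x^2\bigr) + 2 p_2 x y,\\
y' &= y\bigl(1 + K_1 r^2 + K_2 r^4 + K_3 r^6 + K_4 r^8\bigr)
      + p_2\bigl(r^2 + 2y^2\bigr) + 2 p_1 x y,\\
u  &= c_x + f x' + b_1 x' + b_2 y', \qquad
v  = c_y + f y'.
\end{align*}
```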
Recent advancements in calibration methods focus on achieving high accuracy in 3D reconstruction tasks by addressing the complexities of spherical images [3]. Moreover, the development of self-calibration methods for fisheye lenses has provided insights into optimising calibration processes under diverse conditions [8]. These methods allow for adaptive approaches that cater to specific use cases, such as the digitisation of cultural heritage structures and urban environments, where cost-effective solutions are essential [5,6]. The accuracy of the calibration methods and the design of the camera network during calibration are critical in evaluating the calibration itself [10]. Even when the latest orientation method with least squares adjustment is used for 3D reconstruction, which offers the highest accuracy and reliability among the available mathematical approaches, an inaccurate camera calibration and a weak calibration network geometry lead to poor accuracy. Since the least squares method involves nonlinear equations, sufficiently good initial values must be provided in the subsequent orientation steps of 3D point cloud generation [11]. Therefore, to improve the assessment of the quality and accuracy of the calibration of spherical cameras with fisheye lenses, studying the impact of the calibration parameters on the production of 3D products, such as 3D point clouds, provides a clear indicator of the importance of this aspect for the quality of the processing outputs. This quality analysis of the camera calibration is performed according to known criteria, such as the stability and accuracy of the calibration parameters, the re-projection error, the number of tie points obtained at a constant re-projection error, the registration accuracy of ground control points, and the correctness of the estimated camera poses. For the sake of affordability and easy configuration for non-professional users, professional spherical cameras such as the Teledyne FLIR Ladybug, Professional 360 Panono, Insta360 Pro2, and Weiss AG Civetta [3,12] are excluded from the cameras considered.
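Throughout the evaluation that follows, the re-projection error is used as a primary accuracy criterion. As a standard definition (the exact weighting applied by each software package may differ), the RMS re-projection error over N image observations is:

```latex
e_{\mathrm{reproj}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl\lVert \mathbf{x}_i - \hat{\mathbf{x}}_i \bigr\rVert^{2}},
```

where x_i is a measured image observation of a tie or target point and x̂_i is its position re-projected through the estimated interior and exterior orientation parameters.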
The calibration of fisheye lenses presents challenges due to their inherent distortions, which require specialised models and techniques for effective compensation. Studies have demonstrated the potential of low-cost fisheye cameras for close-range photogrammetry, particularly for indoor dimensional measurements, emphasising the balance between affordability and accuracy [1]. Similarly, fisheye cameras have been used effectively to create digital twins of historic structures and to enable the detailed monitoring and documentation of deterioration [2]. The approaches to fisheye camera calibration can be divided into two main groups. The first uses a relative orientation constraint to optimise the camera calibration [13] and is known as self-calibration, while the second uses a planar target or pattern to compute stable calibration parameters and is known as pre-calibration. In recent years, by combining and improving these two calibration solutions, further fisheye calibration strategies, including two-step calibration [14] and 3D calibration, have emerged as alternatives for improving the geometric structure of the resulting 3D point cloud and reducing the distortion of the images. The two-step calibration solution is defined by removing the influence of distortion from the images [15], while the 3D calibration concept follows the control point constraint and is influenced by the bundle approach. In essence, both alternatives build on the basic self-calibration and pre-calibration calculation methods [13,15].
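To make the pre-calibration idea concrete, the following minimal sketch shows a chessboard-based fisheye calibration with OpenCV's fisheye (Kannala-Brandt) module in Python. Note that this model differs from the Brown model used by the commercial software discussed later, and the folder name, board geometry, and square size are hypothetical placeholders.

```python
import glob
import cv2
import numpy as np

# Hypothetical settings: inner-corner grid of the printed chessboard and its square size.
BOARD = (9, 6)          # inner corners (cols, rows)
SQUARE = 0.025          # square edge length in metres

# One planar object-point template, reused for every image (z = 0 plane).
objp = np.zeros((1, BOARD[0] * BOARD[1], 3), np.float32)
objp[0, :, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
image_size = None

for path in glob.glob("calib_images/*.jpg"):          # hypothetical folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        continue
    image_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if not found:
        continue
    corners = cv2.cornerSubPix(
        gray, corners, (5, 5), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-6))
    obj_points.append(objp)
    img_points.append(corners.reshape(1, -1, 2))

K = np.zeros((3, 3))
D = np.zeros((4, 1))    # k1..k4 of the equidistant (fisheye) model
flags = cv2.fisheye.CALIB_RECOMPUTE_EXTRINSIC | cv2.fisheye.CALIB_FIX_SKEW

# Returns the RMS re-projection error together with K, D and per-image poses.
rms, K, D, rvecs, tvecs = cv2.fisheye.calibrate(
    obj_points, img_points, image_size, K, D, flags=flags,
    criteria=(cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6))
print(f"RMS re-projection error: {rms:.3f} px")
```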
The confined space and limited lighting of the underpass examined in this study presented significant challenges for optical imagery, particularly for close-range photogrammetric network design and feature capture. These difficulties were most pronounced in the middle section, where minimal lighting and the presence of stairs further restricted the available space. The use of a spherical camera introduced fisheye images as raw data, increasing distortion and reducing geometric consistency, which in turn affected the accuracy and complexity of relative orientation and camera calibration. Achieving high accuracy with fisheye lenses required an optimised camera network design, as their wide field of view complicated chessboard-based calibration and the selection of camera station configurations for both calibration and 3D reconstruction. To address these challenges, we tested two different spherical cameras, explored multiple calibration approaches using two separate software solutions, and carefully designed the photogrammetric workflow. By evaluating both relative and absolute solutions on a controlled dataset and reviewing previous studies, we identified the most effective camera calibration approach, which holds potential for further refinement in future research.
This research applies five primary fisheye camera calibration strategies (pre-calibration, self-calibration, two-step calibration (solutions 1 and 2), and 3D calibration) to examine the efficiency and accuracy of the results. The objective is to identify the most suitable calibration option. This study also involves the analysis and assessment of the software and of any factors that affect the final calibration results. Additionally, the calibration parameters are used in 3D reconstruction to demonstrate in practice the effect of each calibration strategy on the geometric consistency and accuracy of the 3D point clouds.
This paper is organized as follows. The introduction provides a comprehensive overview of the fundamental concepts related to fisheye camera calibration, along with its background and relevant fields, and serves as a foundation for the subsequent sections.
Section 2 (Workflow of Research) defines the workflow and framework of this study and introduces the case study and its key features, establishing the structural approach adopted in this research.
Section 3 (Experiments) focuses on the experimental process, including data acquisition and data organization, ensuring a clear understanding of the research dataset before the methodological aspects are addressed.
Section 4 (Methods) details the procedures, techniques, and solutions employed in this study, presenting a step-by-step explanation of the approach used for fisheye camera calibration and its application. The remaining sections provide a thorough evaluation of the calibration results, analysed from both relative and absolute perspectives, assess the effectiveness of the fisheye camera calibration process in 3D reconstruction, and conclude the study.
5. Results
This research aimed to analyse the impact of different camera calibration solutions on the resulting 3D point cloud. To assess the results, we divide the evaluation into absolute and relative stages. First, the accuracy and correctness of the calibration parameters in each pre-calibration test are evaluated using the mean and standard deviation as the key statistical measures. Based on these results, the best calibration configuration among the network design tests is identified; at the same time, we compare the performance of the software packages. After selecting the best pre-calibration test, we proceed to the relative evaluation. In this phase, we compare the results of four calibration methods based on criteria related to the accuracy and consistency of the calibration tests, including noise, geometric consistency, completeness with respect to the re-projection error, the number of tie points, and control point accuracy, all derived by applying the calibration parameters in the 3D reconstruction of the underpass. Additionally, we assess and compare the geometric features of the 3D point clouds produced by each calibration method and their alignment accuracy with respect to the LS point cloud.
5.1. Absolute Evaluation
In the absolute evaluation of the calibration parameters, the standard deviations and mean differences of the calibration parameters were compared and analysed against the reference method. The reference method was chosen based on the results of a previous study [17] and of other studies [37] utilizing self-calibration. Among the ten tests performed using the pre-calibration method (Figure 3, Section 4.1 and Section 4.2.1), test number 10 was identified as the best in terms of the accuracy and reliability of the estimated parameters (based on Figure 10c and Figure 11). In terms of network design, this test involved varying the imaging depth and increasing the number of images, compared to the initial base test with nine images. Therefore, the result of test number 10 was used as the main pre-calibration test in the following report (Figure 10, Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15). The RMS error of the estimated calibration parameters for the front and rear lenses was 2.18 and 3.379 pixels for the X2 camera and 39.79 and 40.80 pixels for the X3 camera, respectively.
Since the front lens of both cameras performs slightly better in terms of the re-projection error and the accuracy of the calibration parameters (Figure 10, Figure 11 and Figure 12), the performance comparison between MATLAB and Metashape in terms of pre-calibration accuracy, consistency, and correctness is illustrated using only the front lens of the Insta360 One X2 camera. Both programs use the Brown mathematical model to handle the geometry and distortion of fisheye images [28,38]. The same pre-calibration tests and network design were repeated in MATLAB, and the results showed weak consistency and accuracy of the calibration parameters in MATLAB (Figure 10), especially in the estimation of the pixel position of the geometric centre of the image and the focal length. In addition, considerably more camera locations were positioned incorrectly after calibration.
Owing to the large errors in the MATLAB camera calibration results, the remaining pre-calibration tests were repeated with the other lens and camera in Agisoft Metashape. As shown in Figure 10, Metashape performs dramatically better in each of the three respects of consistency, accuracy, and reliability. The radial and tangential distortion coefficients show strong consistency across the test results, in comparison with the main parameters (focal length and geometric centre of the image). Considering the impact of the calibration parameters of each test on the 3D reconstruction (Figure 11), the best result was obtained with the Insta360 One X2, although the Insta360 One X3 was very close. With re-projection errors of 0.223 and 0.308 and 21,385 and 18,267 tie points for the X2 and X3, respectively, the accuracy is better than that of the corresponding rear lenses.
The number of correctly positioned cameras is reported in Table 4. The missed alignments of each test are accumulated over the previous tests and correspond to inefficient camera positions during image capture. For the pre-calibration test (test number 10) only, the results show five camera misalignments and seven incorrect positionings for the rear lens and four and five, respectively, for the front lens. The other calibration solutions performed perfectly with both spherical cameras and their corresponding lenses (Table 4).
5.2. Relative Evaluation
Based on the best possible network design (test number 10), there is a remarkable gap between the pre-calibration and the reference calibration solution (self-calibration) in terms of the accuracy and completeness of the point cloud (Figure 12, Figure 13 and Figure 14), whereas other strategies, such as 3D calibration and two-step calibration, achieved comparable accuracy and were in some cases even better than the reference solution. In addition to the accuracy and completeness of the generated point cloud, the images used in the second solution of the two-step calibration method were exported as undistorted images using OpenCV libraries. The second solution of the two-step calibration shows the best performance in terms of accuracy and completeness and also surpassed the self-calibration stage. The mean variances of the estimated parameters in the first and second solutions of the two-step calibration were 3.28 and 4.80, respectively, indicating that the parameters are stable in these methods as well. Compared to the self-calibration solution, the average improvement in the re-projection error is 0.61 across all lenses. The second solution of the two-step calibration also achieved better results in terms of tie points and control point errors, with mean improvements of 541 points and 0.17 m, respectively. On the other hand, the 3D calibration behaved most similarly to the reference, which may be a consequence of these two calibrations sharing similar calculations and procedures. The first solution of the two-step calibration also showed remarkable accuracy, close to and occasionally exceeding that of the 3D calibration (Figure 12, Figure 13 and Figure 14).
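For illustration, the undistortion step of the second two-step solution can be reproduced with OpenCV roughly as follows; the interior parameters K and D below are placeholders rather than the values estimated in this study, and OpenCV's fisheye model differs slightly from the Brown formulation used in Metashape.

```python
import cv2
import numpy as np

def undistort_fisheye(image_path, K, D, balance=0.0):
    """Remap one fisheye image to a pinhole-like frame (OpenCV fisheye model)."""
    img = cv2.imread(image_path)
    h, w = img.shape[:2]
    # New camera matrix controlling how much of the original field of view is kept.
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
        K, D, (w, h), np.eye(3), balance=balance)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), new_K, (w, h), cv2.CV_16SC2)
    return cv2.remap(img, map1, map2,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)

# Placeholder interior parameters (not the values estimated in this study).
K = np.array([[1100.0, 0.0, 960.0],
              [0.0, 1100.0, 960.0],
              [0.0, 0.0, 1.0]])
D = np.array([[-0.05], [0.01], [0.0], [0.0]])   # k1..k4
undistorted = undistort_fisheye("front_lens_0001.jpg", K, D)
cv2.imwrite("front_lens_0001_undist.jpg", undistorted)
```

The exported frames can then be processed as ordinary frame images in the subsequent 3D reconstruction, as described above.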
Test number 10, among those performed in the pre-calibration phase, combined both criteria of varying the imaging depth and increasing the number of images taken. It can therefore be concluded that capturing as many angles as possible, ensuring complete visibility of the checkerboard pattern and its points, leads to higher accuracy and reliability in the parameter estimation and thus improves the quality of the 3D reconstruction. The re-projection error of test number 10 shows improvements of 0.07 and 0.14 for the front and rear lens, respectively, compared to the initial imaging configuration with only nine images in a single run (test number 1, Figure 3). Regarding the variance of the pre-calibration parameters, the largest observed variance pertained to the estimation of the focal length and the x-component of the geometric centre displacement, with relative stability in the estimation of the other calibration parameters across the conducted tests (Figure 10b,c). The parameters also show acceptable stability across the ten tests conducted after each use of the camera for imaging and processing, although instability would normally be expected for a non-metric camera (Figure 10b). The results of Agisoft Metashape appear far more accurate than those of the MATLAB Camera Calibrator app, as can be seen from the large errors in the estimation of the focal length and projection centre (Figure 10). Moreover, the variance discrepancy in the MATLAB outputs was significantly higher, indicating lower overall stability of the parameters in this software. This behaviour is further confirmed by comparing the RMS and re-projection errors of these parameters, suggesting that the parameters estimated with Metashape exhibit greater accuracy and reliability. One reason for this better performance is that certain images are excluded from the calculations owing to unsuitable image geometry and inadequate key-point density, images that would otherwise introduce errors into the estimated calibration parameters. Thus, despite being unable to estimate the camera positions and points from all captured images (Table 4), Metashape provides higher-quality outputs in the pre-calibration method. According to the results of the pre-calibration tests and previous work, this improvement in the estimation of the calibration parameters is related to the optimal network design. Nevertheless, the self-calibration method outperformed the pre-calibration method, while the 3D calibration and two-step calibration methods changed this picture. As shown in Figure 13, only a slight difference in the re-projection error was observed between the 3D calibration method and the self-calibration method, and the same holds for the individual parameters of both methods. Overall, the 3D calibration can be considered a viable alternative to self-calibration, providing comparable accuracy and precision. Because targets must be installed at the initial stage of the 3D reconstruction, however, the 3D calibration test was more difficult and time-consuming than the others. In contrast, the self-calibration method can estimate the calibration parameters without targets, relying only on the extracted feature points and the matching process during 3D reconstruction. For a more comprehensive analysis of the results, two sample locations of the final 3D point clouds from each calibration test were compared in terms of surface density, roughness, verticality, and planarity.
In the geometric feature step of this analysis, the two-step calibration solutions again dominate the other approaches. The pre-calibration solution produced almost the worst result for every geometric feature. The surface density of the second solution of the two-step calibration is higher than that of any other solution, despite using the same 3D reconstruction procedure as the other calibration results. The outcome of the two-step calibration is remarkable in this test, although the self-calibration remains almost equal and, particularly with the X2 camera, even performed better than the two-step calibration. Based on the standard deviation results, the two-step calibration provides the most reliable geometric features among the calibration strategies (Table 5). To observe in detail the deviation of the two best calibration solutions (two-step calibration, solutions 1 and 2) from the LS reference, profiles of the front and rear sides of the tunnel were extracted from all three point clouds. Solution 1 contains more noise and produces a doubled layer, especially at the corners, and in the rear-lens model of both cameras only part of the walls is reconstructed, so the lack of completeness is obvious. Solution 2, on the other hand, reconstructs the whole wall. Considering that the control points were measured only on the ground, the deviation of the upper part of the spherical point cloud with respect to the LS point cloud can be explained and distinguished from its geometric defects. The accuracy of the geometric structure of the 3D point clouds is also reported in Table 5; this was assessed using a 2D profile of a specific section of the underpass to clarify the overlap and the accuracy of the straight walls and flat ceiling (Figure 15).
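For reference, the geometric features compared in Table 5 can be approximated from a local eigen-decomposition of each point's neighbourhood, in the spirit of the CloudCompare definitions; the sketch below uses a hypothetical search radius and should be read as an illustration of the metrics rather than the exact implementation used here.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_features(points, radius=0.05):
    """Approximate surface density, roughness, verticality and planarity per point."""
    tree = cKDTree(points)
    feats = np.full((len(points), 4), np.nan)
    for i, p in enumerate(points):
        idx = tree.query_ball_point(p, radius)
        if len(idx) < 4:
            continue
        nbrs = points[idx]
        # Eigen-decomposition of the local covariance (eigenvalues in ascending order).
        w, v = np.linalg.eigh(np.cov(nbrs.T))
        l3, l2, l1 = w                        # l1 >= l2 >= l3
        normal = v[:, 0]                      # direction of least variance
        density = len(idx) / (np.pi * radius**2)                 # neighbours per unit area
        roughness = abs(np.dot(p - nbrs.mean(axis=0), normal))   # distance to local plane
        verticality = 1.0 - abs(normal[2])    # 1 for vertical surfaces, 0 for horizontal
        planarity = (l2 - l3) / l1 if l1 > 0 else 0.0
        feats[i] = (density, roughness, verticality, planarity)
    return feats
```

The mean and standard deviation of such per-point values then provide per-cloud comparison figures of the kind summarised in Table 5.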
Our analysis workflow ends with the evaluation of the correctness of each resulting point cloud. Based on the C2C computation between the 3D point cloud of each calibration test and the LS point cloud, the best result again confirms the advantage of the second solution of the two-step calibration over the others (bold values in Table 6 and Table 7). The front lenses show a lower mean and standard deviation (std) of the distances compared to the rear lenses.
In the two-step calibration method, both the first and second solutions demonstrate improved performance in producing tie points. The increase in the number of tie points and the slight reduction in the re-projection error reflect a decrease in point cloud noise and an increase in point density in the main sections of the captured images. Moreover, this method ultimately allows undistorted images to be used in 3D reconstruction without initial calibration parameters, which facilitates their processing in many accessible software packages and toolboxes that handle only frame images. Based on the accuracy and reliability of the parameters across all conducted tests and prior work, the front lens estimates the calibration parameters and supports 3D reconstruction more accurately than the rear lens. Therefore, priority should be given to using the front lens to cover critical areas of the structure or object. In summary, the second solution of the two-step calibration method, which produces undistorted images based on the calibration parameters from the pre-calibration method (with an optimal network design) and updates the image parameters during the 3D reconstruction process, enables more accurate point cloud generation from this camera and its fisheye images.
7. Conclusions and Future Works
In the first step of this research, the network design yielding the best possible pre-calibration result was identified through various comprehensive tests. This involved the acquisition of 26 images at three different distances while respecting the main rules of close-range photogrammetry; furthermore, the software performance was analysed, which led to the combination of Metashape for the relative orientation and calibration tests with the OpenCV toolbox for undistorting the fisheye images. Among the known strategies for fisheye camera calibration, the two-step calibration was confirmed as a reliable and accurate solution based on metric criteria such as the re-projection error, geometric features, ground control point accuracy, and correctness (via the C2C distance analysis). Moreover, with the second solution, which includes an undistortion step, the calibration is applied to the raw dataset of the project in addition to the 3D reconstruction. Despite these advantages, the completeness of the final result still needs to be improved in comparison to the self-calibration solution and the reference 3D model.
Based on recent studies utilizing neural networks and deep learning models for calibrating fisheye images [39,40,41], pursuing these methods, which may achieve better accuracy than the self-calibration method and its related strategies, could represent a significant step towards enhancing the accuracy and efficiency of optical cameras and their metric products. Additionally, given the raw imaging system of these cameras, the common equirectangular (panorama) format could also be calibrated accurately and used in 3D reconstruction through a projection system that converts the raw fisheye images into panorama images and provides the corresponding 3D reconstruction.
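As an indication of what such a projection system involves, each pixel (u, v) of an equirectangular panorama of width W and height H can be associated with a unit viewing direction, for example (one common y-up convention; axis definitions vary between implementations):

```latex
% Equirectangular pixel (u, v) to unit viewing ray (one common convention).
\begin{align*}
\lambda &= 2\pi\left(\frac{u}{W} - \frac{1}{2}\right), \qquad
\varphi  = \pi\left(\frac{1}{2} - \frac{v}{H}\right),\\
\mathbf{d} &= \bigl(\cos\varphi \sin\lambda,\; \sin\varphi,\; \cos\varphi \cos\lambda\bigr),
\end{align*}
```

after which d is projected through each calibrated fisheye lens model to sample the corresponding raw image.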