A comparison of three geometric self-calibration methods for range cameras. Remote Sens

Significant instrumental systematic errors are known to exist in data captured with range cameras using lock-in pixel technology. Because they are independent of the imaged object scene structure, these errors can be rigorously estimated in a self-calibrating bundle adjustment procedure. This paper presents a review and a quantitative comparison of three methods for range camera self-calibration in order to determine which, if any, is superior. Two different SwissRanger range cameras have been calibrated using each method. Though differences of up to 2 mm (in object space) in both the observation precision and accuracy measures exist between the methods, they are of little practical consequence when compared to the magnitude of these measures (12 mm to 18 mm). One of the methods was found to underestimate the principal distance but overestimate the rangefinder offset in comparison to the other two methods, whose estimates agreed more closely. Strong correlations among the rangefinder offset, periodic error terms and the camera position co-ordinates are identified and their cause explained in terms of network geometry and observation range.


Introduction
Data captured with range cameras using lock-in pixel technology to measure the phase difference of a backscattered RF-modulated optical signal are contaminated by many random and systematic errors that can be divided into four categories. The first comprises shot noise and dark noise sources whose influence on captured range camera data can be reduced by temporal and spatial filtering strategies [1]. A detailed discussion of these errors can be found in [2]. The second category includes systematic artifacts that depend on the nature of the operating environment such as the ambient imaging conditions and the object scene structure. This includes biases due to external temperature, multi-path reflections, internal scattering and mixed pixels [3]. The third group comprises scene-independent systematic artifacts due to the camera operating conditions such as the warm-up time [4] and the integration time [3].
The best practice for minimizing the impact of the errors belonging to the second and third groups is to control the imaging conditions by, for example, allowing a sufficient camera warm-up time, holding the integration time constant during data capture and controlling the room temperature. While such measures may be possible in a laboratory environment, they may not be for real scenes, particularly in harsh environments. Other artifacts such as mixed pixels can be identified and removed after image acquisition [1]. The complex phenomena of multi-path and internal scattering are more difficult to control due to their dependence on object scene structure. The modeling and correction of scattering has been the focus of much work recently [5,6].
The final category comprises the scene-independent instrumental systematic effects that are due to individual component and assembly errors. This group includes lens distortions (radial and decentering) and ranging errors: the rangefinder offset, the scale error, periodic errors and clock-skew or latency errors, all of whose physical cause can be easily explained. It may also include experimentally-identified effects such as amplitude-dependent range errors [7]. These errors are deterministic and independent of the scene so they can be readily modeled and determined in a pre-calibration procedure. The estimation of this group of parameters by different calibration methods is the subject of this paper.
A multi-station self-calibrating bundle adjustment is ideal for camera calibration for a number of reasons summarized in [8] that can be easily extended to range cameras: 1. No special facilities or equipment are required except for the camera itself and some form of primitive target features such as structured point targets (the focus here) or a planar surface; 2. The targets' object-space co-ordinates need not be known absolutely; only initial approximate values are required since they can be estimated in the adjustment; 3. A two-dimensional (2D) target field can be used provided that certain first-order design features are incorporated in the calibration network; 4. All observation types including image point measurements, ranges and independent object space information (e.g., spatial distances between targets) can be incorporated into the adjustment; 5. All required systematic error models can be incorporated, usually as additive perturbation terms, into the collinearity and range observation equations; 6. The stochastic model for each individual observation or observation group can be included in the solution; and 7. It yields optimal estimates of all model variables since it is based on the weighted least-squares criterion.
Several researchers have investigated various forms of the self-calibration approach for range cameras that can be classified as either a two-step process or a one-step, integrated approach. The purposes of this paper are to review and compare three different approaches to range camera self-calibration and to answer the question of whether one method is superior to the others. This commences with a review of the relevant geometric functional models for the observations augmented with scene-independent instrumental systematic error terms. A detailed review of the calibration approaches in question and a discussion of pertinent network design issues then follow. A performance comparison has been conducted using datasets captured with two different range cameras (an SR3000 and an SR4000) calibrated by the three different methods. The reported bases for comparison are the estimated observation precision, co-ordinate determination accuracy, error parameter precision and correlation between model variables.

Range Camera Geometric Functional Models
The pinhole camera model is adopted as the basis for range camera geometric modeling. The first two functional models stem from the collinearity condition, in which an object point (X, Y, Z)_i, its homologous image point (x, y)_ijk and the perspective centre of the image (X_c, Y_c, Z_c)_j captured with sensor k lie on a straight line. In the resulting collinearity equations, (x_p, y_p, c)_k are the interior orientation parameters of sensor k, namely the principal point offset and principal distance, (ω, φ, κ)_j are the camera orientation angles, and R_1, R_2 and R_3 are rotation matrices.
The second geometric condition is that the length of this line is equal to the observed range ρ_ijk. Deviations from these idealized conditions are modeled as additive random (ε_x, ε_y, ε_ρ)_ijk and systematic (Δx, Δy, Δρ)_ijk error terms.
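For reference, the two geometric conditions described above can be written out in a standard photogrammetric form. This is a sketch consistent with the symbol definitions in the text, not a transcription of the original equations: the sign convention, the rotation order and the auxiliary camera-frame co-ordinates (X', Y', Z') are assumptions.

```latex
\begin{align*}
x_{ijk} + \varepsilon_{x_{ijk}} &= x_{p_k} - c_k \frac{X'_{ij}}{Z'_{ij}} + \Delta x_{ijk} \\
y_{ijk} + \varepsilon_{y_{ijk}} &= y_{p_k} - c_k \frac{Y'_{ij}}{Z'_{ij}} + \Delta y_{ijk} \\
\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix}_{ij} &=
  R_3(\kappa_j)\, R_2(\phi_j)\, R_1(\omega_j)
  \begin{pmatrix} X_i - X_{c_j} \\ Y_i - Y_{c_j} \\ Z_i - Z_{c_j} \end{pmatrix} \\
\rho_{ijk} + \varepsilon_{\rho_{ijk}} &=
  \sqrt{(X_i - X_{c_j})^2 + (Y_i - Y_{c_j})^2 + (Z_i - Z_{c_j})^2} + \Delta\rho_{ijk}
\end{align*}
```

The first two lines express collinearity for the image point observations; the last line is the range condition, with the systematic terms entering additively as stated in the text.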
The systematic error models reported here are confined to those found to be statistically significant in the range cameras under investigation, namely radial lens distortion (k_1, k_2, k_3)_k and decentering lens distortion (p_1, p_2)_k for the image point observations and the range-error terms of Equation (10) for the range observations. The series in Equation (10) comprises periodic errors at wavelengths that have been identified in the SR3000 data, namely the unit length, U (half the modulation wavelength), and fractions thereof, U/2 and U/4. A mathematical explanation for the existence of the U/4-wavelength terms is given by Pattinson [9], who shows that they are caused by odd-harmonic multiples of the fundamental frequency contaminating the modulating envelope, which results in a slightly square waveform. The physical cause of this is attributed to the non-ideal response of the illuminating LEDs in [10,11]. Pattinson also provides a mathematical derivation for the existence of the U/2-wavelength terms but admits that their physical cause is not clear. The longer-wavelength terms could be due to internal signal interference [12] that can occur at the unit length and at fractions thereof.
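A plausible form of the systematic error models named above is sketched below. The lens terms follow the widely used radial and decentering distortion polynomials. The range-error series is indexed so that d_2 is the longest-wavelength sine term and the total count is nine terms, as the text states elsewhere. The reduced image co-ordinates, the linear form of the clock-skew terms and the coefficient names e_1 and e_2 are assumptions made for illustration only.

```latex
\begin{align*}
\bar{x} &= x - x_p, \qquad \bar{y} = y - y_p, \qquad r^2 = \bar{x}^2 + \bar{y}^2 \\
\Delta x &= \bar{x}\left(k_1 r^2 + k_2 r^4 + k_3 r^6\right)
            + p_1\left(r^2 + 2\bar{x}^2\right) + 2 p_2 \bar{x}\bar{y} \\
\Delta y &= \bar{y}\left(k_1 r^2 + k_2 r^4 + k_3 r^6\right)
            + p_2\left(r^2 + 2\bar{y}^2\right) + 2 p_1 \bar{x}\bar{y} \\
\Delta\rho &= d_0 + \sum_{m=1}^{3}\left[ d_{2m}\sin\!\left(\frac{2^m \pi}{U}\rho\right)
            + d_{2m+1}\cos\!\left(\frac{2^m \pi}{U}\rho\right) \right]
            + e_1 \bar{x} + e_2 \bar{y}
\end{align*}
```

With this indexing, m = 1, 2 and 3 give the wavelengths U, U/2 and U/4, and the range scale term is omitted.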

Range Camera Self-Calibration Approaches
Three self-calibration approaches are described. Two of these are two-step methods in which separate calibrations are performed for the camera-lens parameters (principal point offset, principal distance and lens distortions) and for the range-error parameters (rangefinder offset, periodic errors and clock skews). In the first method, the two-step independent (TSI) method, the camera-lens and range-error calibrations are performed as separate processes using separate facilities as depicted in Figure 1. First, an established procedure [8] is used for the camera-lens calibration from x and y observations of targets in a network of convergent images. Then, a planar target is imaged at normal incidence (the normal images) to determine the range-error parameters. The authors of [3] perform their range-error calibration using a small, planar target moved along an interferometric calibration track that allows very accurate camera-target positioning. An extended, featureless planar surface is used in [4] and the camera-plane orientation is established with two parallel tape measures. The orientation can also be performed by space resection of the camera from independently-surveyed targets on the plane, which is the procedure adopted for the testing described herein. Regardless of the orientation method used, reference ranges between each camera's perspective centre and the target surface are computed using the point observations in the already-oriented normal images and the estimated camera-lens parameters. These are compared with the observed ranges to derive the range differences, Δρ, from which the range-error parameters are estimated by least squares.
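The final step of the two-step procedure, estimating range-error parameters from range differences by least squares, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an unweighted fit, a model with only the rangefinder offset and sine/cosine pairs at wavelengths U, U/2 and U/4, and no clock-skew terms. The function names are hypothetical.

```python
import numpy as np


def design_matrix(rho, unit_length, n_wavelengths=3):
    """Columns: constant (d0), then sin/cos pairs at wavelengths U, U/2, U/4."""
    rho = np.asarray(rho, dtype=float)
    cols = [np.ones_like(rho)]
    for m in range(1, n_wavelengths + 1):
        arg = 2.0**m * np.pi * rho / unit_length
        cols.append(np.sin(arg))
        cols.append(np.cos(arg))
    return np.column_stack(cols)


def fit_range_errors(rho_obs, rho_ref, unit_length):
    """Estimate [d0, d2, d3, ..., d7] from observed minus reference ranges."""
    rho_obs = np.asarray(rho_obs, dtype=float)
    rho_ref = np.asarray(rho_ref, dtype=float)
    delta = rho_obs - rho_ref                     # range differences
    A = design_matrix(rho_obs, unit_length)
    params, *_ = np.linalg.lstsq(A, delta, rcond=None)
    residuals = delta - A @ params
    rmse = float(np.sqrt(np.mean(residuals**2)))  # range observation precision
    return params, rmse
```

Dropping a column from the design matrix (for example the one for d_2) and comparing the resulting RMSE is one way to judge whether a given term is needed.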

In the second method, the two-step dependent (TSD) procedure, a common facility is used for both calibration processes (see Figure 2). The camera-lens calibration is first performed with an established procedure using the x and y observations of targets on a planar surface observed in a network of both convergent and orthogonal images. The camera-plane orientation is thus determined since the position and orientation of each image are estimated in the calibration. The reference ranges can then be computed from the orthogonal camera stations to points on the plane [7,13] or to the target centers [14] and used for the range-error calibration as described previously. Though this is a more integrated approach than the TSI procedure, it is fundamentally a two-step method.

The third approach, described in [15] and depicted in Figure 3, is called the one-step integrated (OSI) procedure, in which both sets of calibration parameters (camera-lens and range-error) are estimated in a single step: a self-calibrating bundle adjustment with ranges. A 2D field of targets is imaged from both convergent and orthogonal camera locations. To prevent scattering errors from biasing the solution, the range observations from the convergent stations are excluded from the bundle adjustment. In this approach the camera orientation is performed concurrently and there is no explicit computation of reference ranges.
The two-step calibration described in [1] exploits a high-resolution digital camera rigidly mounted with the range camera in a rig assembly to improve the quality of the camera-lens parameters. The estimation of the range errors is done in parallel with the camera self-calibration by comparing reference and observed ranges to a planar checkerboard pattern. The authors use B-splines to model the periodic errors, called wiggling error, rather than trigonometric functions. This method is not investigated here since only methods that do not rely on an ancillary device have been implemented.

Network Design Measures
It is important to discuss some pertinent aspects of self-calibration network design that can have an impact on the parameter estimates. In terms of zero-order design, a minimally-constrained datum definition is critical to prevent (potential) biases in the targets' object-space co-ordinates from propagating into the calibration parameters. The inner constraints approach has been adopted for this purpose in this study. A minimally-constrained solution can be used for the traditional camera calibration of both two-step approaches, but the datum definition for normal-image orientation by resection in the TSI method is over-constrained. It is relevant to note that omission of the range scale error from Equation (10) will not introduce any biases in a minimally-constrained bundle adjustment.
Several first-order design measures of network configuration are required to reduce parameter correlations and improve the calibration parameter precision. First, it is well known in photogrammetry that observations of multiple targets in multiple images are needed for self-calibration. The use of a 2D target field is permissible, as mentioned, but convergent geometry is needed to reduce the correlation between the principal distance and the camera position [8], since their relationship is constant when a planar target field is imaged at normal incidence. The inclusion of range observations very effectively mitigates this source of correlation in the orthogonal images provided that convergent images are present in the OSI network [15], but gives rise to another dependency described below. Inclusion of images with orthogonal exposures (i.e., roll-angle diversity) is also needed for the de-correlation of the principal point and the orientation angles [16].
The estimation of the range errors requires several images captured at multiple stand-off distances over the full ambiguity interval, with a spacing small enough to prevent aliasing of the periodic errors. A planar target field should be imaged at normal incidence to minimize the effects of scattering [7], as previously mentioned. However, this gives rise to high correlation between the rangefinder offset (d_0) and the camera position in the depth dimension (Y_c), whose differential relationship follows from the range observation equation. Under the described imaging conditions this relationship can also be written independently of the object space parameters (Equation (12)); in that form it is unity at the principal point and decays monotonically outward, reaching minimum values at the corners of the image format. This correlation is realized explicitly in the integrated approach but not in the two-step methods [17], where the camera position and orientation and the rangefinder offset are estimated separately.
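The dependency between d_0 and Y_c described above can be made explicit with a short derivation. It is a reconstruction from the range condition and the stated properties (unity at the principal point, decaying outward), not a transcription of the original equations. It assumes that Y is the depth axis, that the image plane is parallel to the target plane, and that r is the radial distance of the image point from the principal point; only the magnitude is given because the sign depends on axis conventions.

```latex
\begin{align*}
\rho &= \sqrt{(X - X_c)^2 + (Y - Y_c)^2 + (Z - Z_c)^2}, \qquad
\frac{\partial \rho}{\partial d_0} = 1 \\
\frac{\partial \rho}{\partial Y_c} &= -\frac{Y - Y_c}{\rho} \\
\left|\frac{\partial \rho}{\partial Y_c}\right| &= \frac{|Y - Y_c|}{\rho}
  = \frac{c}{\sqrt{r^2 + c^2}} \quad \text{(normal incidence, by similar triangles)}
\end{align*}
```

Because the last expression stays close to 1 across a narrow field of view, a change in Y_c is nearly indistinguishable from a change in d_0, which is the source of the correlation.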

Experiment Description
An experiment has been conducted to compare the performance of the three range camera self-calibration methods: the one-step integrated (OSI), the two-step independent (TSI) and the two-step dependent (TSD). Two cameras were tested: a SwissRanger SR3000 and a SwissRanger SR4000. Both feature a sensor area of 176 × 144 pixels and a 40 µm pixel pitch. The respective nominal principal distances are 8 mm and 10 mm and the nominal unit lengths are 7.5 m and 5 m.
A multi-resolution, planar target field, pictured in Figure 4, was established for the testing. It comprised 106 circular white targets having black backgrounds and variable diameters (45 mm, 150 mm and 280 mm) mounted on a wall spanning an area of 4.3 m × 2.9 m. These dimensions were chosen to fill the SR3000 camera's field of view at a standoff distance of 7.0 m. The root mean square flatness of the plane as defined by all of the target centers was determined by independent total station survey (see Section 6.2) to be 4.6 mm. Each network comprised a set of convergent images and a set of orthogonal images captured every 0.5 m starting from about 1.0 m up to about (U - 0.5 m). The integration time for each camera was chosen so as to strike a balance between detector saturation at close range and low signal-to-noise ratio at long range (i.e., near the ambiguity interval) and was held constant during data capture. Each camera was warmed up for at least one hour prior to data capture.
The SR3000 calibration parameter set comprised the first term of radial lens distortion and all nine range-error terms in Equation (10). The SR4000 model included two radial lens parameters, both decentering distortion terms and the rangefinder offset. The calibration model terms used are summarized in Table 1. Each camera's parameter set includes only significant terms that were identified by the hybrid statistical testing procedure described in [15]. Reference ranges were computed at the target centers for the two-step calibration approaches. Although [10] shows that range precision is a function of the range, a range-independent variance was used here since no practical differences were found between this and range-dependent models [15]. The comparison is conducted in terms of four measures of self-calibration quality: observation precision, co-ordinate accuracy, parameter precision and parameter correlation.
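The stand-off spacing described above can be checked against the aliasing requirement with a short calculation. The sketch below assumes the nominal values from the text (0.5 m spacing starting at 1.0 m, unit lengths of 7.5 m and 5 m) and applies the usual two-samples-per-wavelength criterion to the shortest modeled wavelength, U/4. The function names are hypothetical, and the text gives the start and end distances only approximately.

```python
def standoff_distances(unit_length, start=1.0, spacing=0.5):
    """Nominal orthogonal-image stand-off distances from start to U - spacing (m)."""
    stop = unit_length - spacing
    n = int(round((stop - start) / spacing)) + 1
    return [start + i * spacing for i in range(n)]


def max_spacing_without_aliasing(unit_length, shortest_fraction=4):
    """Largest spacing giving two samples per shortest wavelength U / shortest_fraction (m)."""
    shortest_wavelength = unit_length / shortest_fraction
    return shortest_wavelength / 2.0


for name, unit_length in (("SR3000", 7.5), ("SR4000", 5.0)):
    stations = standoff_distances(unit_length)
    limit = max_spacing_without_aliasing(unit_length)
    print(name, len(stations), "stations, spacing limit", limit, "m")
```

Under these assumptions the 0.5 m spacing is below the limit for both unit lengths. For the SR4000 the check is only indicative, since its calibration model contained no periodic terms.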

Observation Precision
Observation precision is measured with the root mean square error (RMSE) calculated from the self-calibration residuals and is summarized in Tables 2 and 3 along with the degrees of freedom for each adjustment. For the SR3000, the TSI procedure gives a slightly better, but not significantly different at the 95% confidence level, image point observation precision. This may be simply due to a combination of the observation set homogeneity and the lower redundancy (789) of the TSI camera-lens calibration adjustment. Differences also exist between the range precision estimates, with the OSI method being slightly better than the two-step approaches. The SR4000 results as a function of calibration method are consistent with those of the SR3000 and there are no statistically significant differences between precision measures. It is easy to confirm numerically that the contribution of target field un-flatness to the range uncertainty is only 0.6 mm for the SR3000 and 0.5 mm for the SR4000, since fewer targets were used for its calibration, so this is not a significant contributing factor. From a practical standpoint no method is superior in terms of observation precision since the largest difference in range RMSE is less than 2 mm and the image point co-ordinate RMSE differences are 0.3 µm or less. The former represents only a 10% difference in range precision and the latter is only 0.3 mm when projected into object space at the 7.5 m ambiguity interval of the SR3000.
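The projection of the image point RMSE difference into object space quoted above can be reproduced with simple scaling. The sketch assumes the pinhole scale factor (range divided by principal distance) and the nominal SR3000 principal distance of 8 mm; the function name is hypothetical.

```python
def image_to_object_space_mm(image_rmse_um, range_m, principal_distance_mm):
    """Scale an image-space error to object space by range / principal distance."""
    image_rmse_mm = image_rmse_um / 1000.0
    range_mm = range_m * 1000.0
    return image_rmse_mm * range_mm / principal_distance_mm


# 0.3 um image-space difference at the 7.5 m ambiguity interval, c = 8 mm (nominal)
print(round(image_to_object_space_mm(0.3, 7.5, 8.0), 2))  # prints 0.28
```

The result, about 0.28 mm, agrees with the rounded figure of 0.3 mm given in the text.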

Accuracy
Accuracy is measured with the RMSE computed from the differences between corrected range camera co-ordinates and independently-surveyed (by a Leica TCR 803 total station) co-ordinates and is summarized in Table 4. Each range camera dataset was rigidly transformed to the co-ordinate system of the surveyed co-ordinates. Overall the RMSE measures differ by 2 mm or less. For the SR3000 dataset the OSI results are slightly superior in the depth dimension, which corresponds closely to the range dimension. The differences between the two-step methods for the SR3000 are not statistically significant, but their differences with the OSI method are. For the SR4000 the accuracy of the TSI method is superior by about 1 mm in the "planimetric" dimensions (lateral and height) but there are no differences in the depth dimension. However, no statistically significant differences exist at 95% confidence. The reason for the close agreement of the RMSE measures from the three calibration methods, despite the large differences in the d_0 estimates analyzed in the next section, is the compensation of any un-modeled rangefinder offset error by the translation parameters of the rigid body transformation (cf. Equation (12)). Again from a practical perspective, no method can be identified as being significantly better than the others.
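The accuracy assessment described above, a rigid body transformation followed by per-axis RMSE, can be sketched as follows. The text does not say how the transformation parameters were estimated, so the closed-form SVD (Kabsch) solution used here is an assumption, as are the function names.

```python
import numpy as np


def fit_rigid_transform(source, target):
    """Least-squares rotation R and translation t such that target ~ R @ source + t."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    H = (source - src_c).T @ (target - tgt_c)     # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t


def rmse_per_axis(source, target):
    """Per-axis RMSE of co-ordinate differences after the rigid transformation."""
    R, t = fit_rigid_transform(source, target)
    diff = (source @ R.T + t) - target
    return np.sqrt(np.mean(diff**2, axis=0))
```

A constant shift of all range camera co-ordinates along the depth axis leaves these RMSE values unchanged because the translation absorbs it. For a narrow field of view, where range and depth nearly coincide, this mirrors the compensation of an un-modeled rangefinder offset noted in the text.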

Parameter Correlation and Precision
Despite the slight apparent superiority of the OSI method, one of its drawbacks is the high correlation between the rangefinder offset d_0 and the perspective centre position Y_c. Figure 5 shows the strong functional dependence between these two variables in terms of the partial derivative given by Equation (12). Although the point distribution in the image plane is quite favorable for both datasets, the partial derivative drops only to about 0.9 near the image corners due to the cameras' narrow field of view.
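The corner value of about 0.9 can be checked from the sensor geometry. The sketch below assumes that the partial derivative has magnitude c / sqrt(r^2 + c^2), which matches the properties the text ascribes to Equation (12), that the principal point lies at the centre of the 176 × 144 pixel format with 40 µm pitch, and that the nominal principal distances of 8 mm and 10 mm apply. The function names are hypothetical.

```python
import math


def range_depth_partial(r_mm, c_mm):
    """Magnitude of the range/depth partial derivative at radial distance r (assumed form)."""
    return c_mm / math.hypot(r_mm, c_mm)


def corner_radius_mm(n_cols=176, n_rows=144, pitch_um=40.0):
    """Radial distance from the format centre to an image corner (mm)."""
    half_width = 0.5 * n_cols * pitch_um / 1000.0
    half_height = 0.5 * n_rows * pitch_um / 1000.0
    return math.hypot(half_width, half_height)


r_corner = corner_radius_mm()
for name, c_mm in (("SR3000", 8.0), ("SR4000", 10.0)):
    print(name, round(range_depth_partial(r_corner, c_mm), 2))
```

With these nominal values the derivative at the corners is roughly 0.87 for the SR3000 and 0.91 for the SR4000, so it stays close to unity over the whole format.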
The rangefinder offset-perspective centre correlation is exacerbated by the longest-wavelength sine term (d_2) of the periodic error model. The large-magnitude correlations (Corr) among these three variables (0.97 between d_0 and d_2 and 0.99 between Y_c and d_0), which do not explicitly exist in the two-step calibration results, can be seen in Table 5. Removal of the d_2 term from the OSI calibration model has a very positive impact on the rangefinder offset d_0 in terms of its precision (σ_d0), which drops from 12.0 mm to 2.8 mm, and its correlation with the camera position (Corr Y_c-d_0), which drops to 0.87. It has a profound impact (29 mm change) on the estimated value of d_0 but does not affect the observation precision as measured by RMSE_ρ. The d_0 estimates of the other methods are less affected by the removal of d_2, but the observation precision of the TSI method is affected considerably as RMSE_ρ increases from 17.9 mm to 25.4 mm. It is also worth noting that the principal distance (c) estimates and their precision from the OSI and TSD methods match very closely, whereas the TSI method seems to underestimate c whilst overestimating d_0. The principal distance precision (σ_c) of the TSI method is lower due to the absence of the orthogonal images in the camera-lens calibration. The aforementioned de-correlation of the principal distance and the camera position in the OSI method and lack thereof in the two-step approaches can be seen in the (Corr Y_c-c) coefficients of Tables 5 and 6.
Selected results from the SR4000 calibrations are presented in Table 6. For this camera the OSI and TSD parameter estimates match very closely. Again the TSI method underestimates the principal distance and overestimates the rangefinder offset d_0. As in the SR3000 case, the two-step methods give a more precise rangefinder offset d_0 since the uncertainty in the perspective centre position is not explicitly modeled. The principal distance precision from the TSI calibration is again comparatively lower for both cameras due to the lack of the orthogonal images.
The role of the d_2 term can be seen in the TSI results plotted in Figure 6. Clearly the periodic error model without the d_2 term is inadequate as it does not accurately fit the range difference observations: it overshoots the data between 1 m and 2 m and undershoots between 5 m and 7 m. The required d_2 sine trend, offset by the constant d_0 for clarity and truncated by the observation limits (6.0 m or 80% of the 7.5 m unit length), exhibits not just sinusoidal behavior but also a linear trend. It has been proven in [18] that an un-modeled constant range bias propagates into the residuals as a linear function of range in a one-dimensional ranging sensor self-calibration network. This linear dependence helps to explain the source of the high d_0-d_2 correlation in the OSI method. The fact that the d_2 term is not needed for the OSI and TSD calibrations suggests that these methods yield better perspective centre-d_0 parameter-set estimates than the TSI method, even though their numerical realizations differ. Furthermore, it also suggests that the need to estimate d_2 may be an artifact of the calibration method caused by the data truncation, at least for the SR3000 camera investigated here. The collection of range observations over the full unit length, U, would alleviate this problem, but in practice this may be difficult due to saturation errors that occur at close range and low signal-to-noise ratio at long range. The integration times chosen for the data capture in this experiment were set so as to achieve a reasonable trade-off between these two factors and, as a result, the collection of data at close range (less than 1.0 m) had to be sacrificed.

Conclusions
A test comparing the performance of three different range camera self-calibration methods has been conducted with two different cameras. The OSI and TSD methods gave very similar principal distance estimates for both datasets and similar rangefinder offset estimates for one of the datasets. The TSI approach appears to underestimate the principal distance and overestimate the rangefinder offset relative to the other two methods. These parameters were tightly coupled to the perspective centre estimation method, but the different numerical values realized by different calibration methods had little impact on the metrics used to quantify calibration method efficacy. Though the OSI method was found to be slightly superior, the observation precision and accuracy differences are not of practical consequence. The differences in image point co-ordinate precision were 0.3 µm or less and were 2 mm or less in range, while the largest difference in co-ordinate RMSE from the accuracy assessment was 2 mm.
The modeling of periodic range errors was observed to depend on the calibration method used. The longest-wavelength sine term was found to be critical for the TSI method in terms of observation precision. This term was not critical for the other two methods, which is a useful outcome since it was found to weaken the OSI solution due to the truncated observation range over the camera's ambiguity interval.

Figure 1. Two-step independent range camera calibration. (a) Camera-lens parameter calibration from x and y observations in the convergent images (red). (b) Range-error calibration from range ρ observations in the normal images (blue).

Figure 2. Two-step dependent range camera calibration. (a) Camera-lens parameter calibration from x and y observations in both the convergent and the normal images (red). (b) Range-error calibration from range ρ observations in the normal images (blue).

Figure 3. One-step integrated range camera calibration. Simultaneous calibration from x and y observations in the convergent images (red) and x, y and ρ observations in the normal images (green).

Figure 5. The partial derivative of Equation (12) as a function of radial distance, r, and histograms of image point distribution in the normal images for each dataset. (a) SR3000; (b) SR4000.

Figure 6. Two-step independent calibration range differences and error model trends.

Table 1. Calibration parameters for the two cameras tested.

Table 4. Accuracy assessment statistical summary.

Table 5. Selected SR3000 adjustment results with and without the d_2 term in the model.