and Comparison

In this paper experimental comparisons between two Time-of-Flight (ToF) cameras are reported in order to test their performance and to give some procedures for testing data delivered by this kind of technology. In particular, the SR-4000 camera by Mesa Imaging AG and the CamCube3.0 by PMD Technologies have been evaluated since they have good performance and are well known to researchers dealing with ToF cameras. After a brief overview of commercial ToF cameras available on the market and the main specifications of the tested devices, two topics are presented in this paper. First, the influence of camera warm-up on distance measurement is analyzed: a warm-up of 40 minutes is suggested to obtain measurement stability, especially in the case of the CamCube3.0 camera, which exhibits distance measurement variations of several centimeters. Secondly, the variation of distance measurement precision with integration time is presented: distance measurement precisions of some millimeters are obtained in both cases. Finally, a comparison between the two cameras based on the experiments and some information about future work on the evaluation of sunlight influence on distance measurements are reported.


Introduction
In the last few years, a new generation of active sensors has been developed, which allows the acquisition of 3D point clouds without any scanning mechanism and from just one point of view at video frame rates. The working principle is the measurement of the ToF of an emitted signal by the device towards the object to be observed, with the advantage of simultaneously measuring the distance information for each pixel of the camera sensor. Many terms have been used in the literature to indicate these devices, such as: Time-of-Flight (ToF) cameras, Range IMaging (RIM) cameras, 3D range imagers, range cameras or a combination of the mentioned terms. In the following, the term ToF cameras will be employed, because it relates to the working principle of this recent technology.
There are two main approaches currently employed in ToF camera technology: one measures distance by means of direct measurement of the runtime of a travelled light pulse, using for instance arrays of Single-Photon Avalanche Diodes (SPADs) [1,2] or an optical shutter technology [3]; the other method uses amplitude modulated light and obtains distance information by measuring the phase shift between a reference signal and the reflected signal [4]. Such technology is possible because of the miniaturization of semiconductor technology and the evolution of CCD/CMOS processes that can be implemented independently for each pixel. The result is the ability to acquire distance measurements for each pixel at high speed and with accuracies up to 1 cm. While ToF cameras based on the phase shift measurement usually have a working range limited to 10-30 m, cameras based on the direct ToF measurement can measure distances up to 1,500 m (Table 1). ToF cameras are usually characterized by low resolution (no more than a few thousands of pixels), small dimensions, costs that are an order of magnitude lower than LiDAR instruments and a lower power consumption with respect to classical laser scanners. In contrast to stereo imaging, the depth accuracy is practically independent of textural appearance, but limited to about 1 cm in the best case (commercial phase shift ToF cameras).
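As a minimal sketch of the phase shift principle described above, the snippet below uses the standard relation d = c·φ / (4π·f_mod) between the measured phase shift φ and the distance d. The relation itself is not written out in this paper, and the function names are hypothetical; the 30 MHz value is the modulation frequency quoted for the SR-4000 in Section 2.1.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def unambiguous_range(f_mod_hz):
    """Largest distance measurable before the phase wraps around, in metres."""
    return C / (2.0 * f_mod_hz)


def distance_from_phase(phase_rad, f_mod_hz):
    """Distance for a phase shift in [0, 2*pi) between reference and reflected signal."""
    return (phase_rad / (2.0 * math.pi)) * unambiguous_range(f_mod_hz)


f_mod = 30e6  # modulation frequency in Hz
print(round(unambiguous_range(f_mod), 3))                 # 4.997
print(round(distance_from_phase(math.pi / 2, f_mod), 3))  # 1.249
```

A lower modulation frequency extends the unambiguous range, while the same phase resolution then corresponds to a coarser distance resolution.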
In this paper, comparisons between two recent ToF cameras are presented in order to test their performance and to give some procedures for testing data delivered by this technology. In particular, the SR-4000 camera by Mesa Imaging AG and the CamCube3.0 by PMD Technologies have been tested. Both sensors have good performance and are well known to researchers dealing with ToF cameras. In Section 2, an overview of the main specifications of both cameras is first given. Then, in Section 3 the influence of camera warm-up on distance measurement stability is analyzed. In Section 4 the distance measurement precision with varying integration time is evaluated for both cameras. Finally, some conclusions and recommendations for future work are presented.

The first prototypes of ToF cameras for civil applications were developed in the late 90s [4]. After many improvements to both sensor resolution and accuracy there are now many commercial ToF cameras available. The main differences between models are related to ranging principle, sensor resolution and measurement accuracy. Table 1 summarizes some technical specifications (when available) of commercial ToF cameras in order to give a general overview of the available products. The column "Measurement accuracy/repeatability" in Table 1 contains heterogeneous information since the camera manufacturers adopt different terms and conditions for this information. It is worth noting that the flexibility (and low cost) of the DS311 sensor from SoftKinetic [19] will probably influence the whole market of ToF sensors in the near future.

SR-4000 and CamCube3.0 Cameras
As mentioned before, the SR-4000 and the CamCube3.0 cameras have been tested in this work. In Section 2.1, their main specifications are reported, while Section 2.2 describes the output data available with each camera.

Main Technical Specifications
The SR-4000 and the CamCube3.0 cameras are both based on the phase shift measurement principle [4]. The CamCube3.0 has a higher sensor resolution than the SR-4000 (200 × 200 pixels vs. 144 × 176 pixels), but it is about three times larger and heavier (Figure 1). The SR-4000 distance measurement accuracy is given by the manufacturer as ±0.01 m (30 MHz modulation frequency). This value has been confirmed by experimental tests, such as the ones reported in [17,18]. The distance measurement accuracy of the CamCube3.0 camera is not known, but preliminary tests performed by our group and reported in [16] on the previous model (CamCube2.0) showed a distance measurement accuracy of 3-4 cm. The declared distance measurement repeatability is similar for the two devices. The declared maximum frame rate is 54 fps (frames per second) for the SR-4000 camera and 40 fps for the CamCube3.0 at full resolution (200 × 200 pixels).
The "crop utility" delivered by PMD allows cropping of pixel columns and rows, therefore it is possible to get a frame rate up to 60 fps considering the same number of pixels of the SR-4000 camera.
Finally, the SR-4000 camera has a passive cooling system, while the CamCube3.0 is equipped with two fans running continuously.

Output Data
Both cameras deliver a range image and an amplitude image at video frame rates: the range image (or depth image) contains the radial measured distance between the considered pixel and its projection on the observed object, while the amplitude image contains the strength of the reflected signal by the object for each pixel. In the case of the CamCube3.0 an intensity image is also delivered, which represents the mean of the total light incident on the sensor (reflected modulated signal and background light of the observed scene). In both cameras, a confidence map (SR-4000) or a flag matrix (CamCube3.0) is also delivered, which contains information about the quality of the acquired data (i.e., saturated pixels, low signal amplitudes, etc.). Moreover, a 3D point cloud (with X, Y and Z coordinates referred to the local coordinate system of the camera) is also delivered, which is equivalent to a 3D scan from classical LiDAR instruments with the advantage of real time acquisition.
In order to give an idea of the data acquired with the two tested cameras, some visualizations of data acquired with the SR-4000 and the CamCube3.0 cameras are given in Figures 2 and 3, respectively. It should be noted that the SR-4000 software only returns calibrated (for lens model) data, while the CamCube3.0 software allows the user to access both raw and calibrated data.

Warm-Up Period Evaluation
Since semiconductor materials are highly responsive to temperature changes, temperature variations within a ToF camera can affect its distance measurements. This problem could result from two different effects: self-induced heating caused by thermal losses of the camera electronics and ambient temperature changes. While ambient temperature changes cannot be predicted and need to be measured at runtime, camera heating is predictable and can therefore be characterized. In particular, for a constant ambient temperature, the inner temperature should increase (or decrease, if cooling is available) in the first minutes after the device start up and then should eventually stabilize.
Previous work, such as [6,20-22], demonstrated that a warm-up time of several minutes is necessary for the tested camera models. In [6] a distance variation of several centimeters is observed for the SR-2 camera in the first 20 min of camera operation, and centimeter-level distance variations are reported for ambient temperature variations of tens of degrees centigrade. In [20] the temporal distance variations of the SR-3000 camera are analyzed, but only in the first ten minutes of camera operation; the authors recommend a minimum warm-up time of 6 min. The SR-3000 camera is tested in [21] too, with similar results. In [22] the PMD3k-S is tested: 20-25 min are required for measurement stabilization, but only the distance measurements of the central pixel are considered, over one hour of camera operation.
In order to determine the camera warm-up period necessary to achieve distance measurement stability of the tested ToF cameras, the procedure described in the following was carried out. The room temperature was maintained constant (20 °C) for all the tests and the distance measurements were analyzed for two hours of camera operation in each test. This procedure was already proposed in [23], but here the analytical calculations are explained in more detail and the results for both cameras are reported. Room lights were switched off during the tests in order to avoid influence on the camera measurements. Variations of external temperature were not analyzed in this work since no climate chamber was available.

Test of the SR-4000 Camera
The SR-4000 camera was set up on a photographic tripod, with the front of the camera parallel to a white wall. After turning on the camera, five consecutive frames were acquired every five minutes for two hours of camera operation. The test was carried out at several distances (and integration times) between the front of the camera and the wall, in order to get more reliable results.
Data were acquired using the "auto acquisition time" suggested by the SR_3D_View software delivered with the camera. This software allows one to automatically adjust the acquisition time depending upon the maximum amplitudes present in the current image. This setting was used in order to avoid pixel saturation and to achieve a good balance between noise and frame rate.
In all cases, the f = 5 frames (range images) acquired at each time $t_i$ were averaged pixel by pixel in order to reduce the measurement noise; therefore the following term was estimated for each considered pixel:

$$\bar{d}_{r,c}(t_i) = \frac{1}{5} \sum_{f=1}^{5} d_{r,c}(f, t_i)$$

The term $d_{r,c}(f, t_i)$ represents the distance measured by the pixel in row r and column c for the f-th acquired frame at time $t_i$. Since the camera was fixed in each test, the variations during the operation time of the mean ($m_{t_i}$) and standard deviation ($\sigma_{t_i}$) of the averaged range images were calculated. Since the tests were performed at different distances (and integration times), the relative variations of the mean and standard deviation with respect to their initial values ($m_{t_0}$ and $\sigma_{t_0}$) were considered for each test in order to compare them:

$$m_{t_i} = \frac{1}{n} \sum_{r=r_{min}}^{r_{max}} \sum_{c=c_{min}}^{c_{max}} \bar{d}_{r,c}(t_i), \qquad \sigma_{t_i} = \sqrt{\frac{1}{n-1} \sum_{r=r_{min}}^{r_{max}} \sum_{c=c_{min}}^{c_{max}} \left( \bar{d}_{r,c}(t_i) - m_{t_i} \right)^2}$$

$$\Delta m_{t_i} = m_{t_i} - m_{t_0}, \qquad \Delta \sigma_{t_i} = \sigma_{t_i} - \sigma_{t_0}$$

where $r_{min}$, $r_{max}$ and $c_{min}$, $c_{max}$ delimit the row range and the column range of the sensor pixels considered in the analysis and n is the number of considered pixels. Figure 4 is a schematic representation of the data processing workflow, where the blue area represents the group of pixels considered for the analysis (this area is defined by $r_{min}$, $r_{max}$ and $c_{min}$, $c_{max}$). The variations of $m_{t_i}$ and $\sigma_{t_i}$ during two hours of camera acquisition are shown in Figures 5 and 6, respectively. In all cases a central sub-image of 84 × 96 pixels was considered, while in two cases (when the wall filled the entire range image) the entire image of 176 × 144 pixels was also considered.
As can be observed from Figures 5 and 6, both the mean value and the standard deviation of the distance measurements vary during operation: a maximum variation of about −6 mm was detected for the mean value, while a maximum variation of about 3 mm was measured for the standard deviation. Since the calculated variations are nearly constant after 40 min of camera operation, a warm up period of 40 min is sufficient to achieve a good measurement stability for the SR-4000 camera. For this reason, all the following tests were performed after this warm-up period.
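The processing chain used in this test (frame averaging, statistics over a pixel window, variation with respect to the first acquisition) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, range images are plain nested lists, and the standard deviation is normalized by n − 1, a choice that this text does not confirm.

```python
import math


def average_frames(frames):
    """Pixel-wise mean of a list of range images (each image is a list of rows)."""
    n_frames = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(fr[r][c] for fr in frames) / n_frames for c in range(cols)]
            for r in range(rows)]


def region_stats(image, r_min, r_max, c_min, c_max):
    """Mean and standard deviation over the inclusive pixel window.

    Assumption: n - 1 normalization, so the window needs at least two pixels.
    """
    values = [image[r][c]
              for r in range(r_min, r_max + 1)
              for c in range(c_min, c_max + 1)]
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)


def warmup_variations(acquisitions, window):
    """Variations of mean and std of each acquisition relative to the first one (t_0).

    acquisitions: one list of frames per acquisition time t_i.
    window: (r_min, r_max, c_min, c_max).
    """
    stats = [region_stats(average_frames(frames), *window) for frames in acquisitions]
    m0, s0 = stats[0]
    return [(m - m0, s - s0) for m, s in stats]
```

Calling `warmup_variations` with one group of five frames per acquisition time returns one (mean variation, standard deviation variation) pair per time, which is the kind of quantity plotted against operation time in Figures 5 and 6.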

Test of the CamCube3.0 Camera
The procedure for testing the CamCube3.0 camera is identical to the one adopted for the SR-4000 camera. As in the previous case, after turning on the camera, five consecutive frames were acquired every five minutes for two hours of camera operation. The test was carried out at several distances (and integration times) between the front of the camera and the wall, in order to get more reliable results.  Since in this case no estimation of an "auto acquisition time" was available, the integration time was adjusted manually to limit pixel saturation and distance measurement noise.
The variations of $m_{t_i}$ and $\sigma_{t_i}$ during two hours of camera acquisition are reported in Figures 7 and 8, respectively. In all cases a central sub-image of 106 × 150 pixels was considered, while in two cases (when the wall filled the entire range image), the entire image of 200 × 200 pixels was also considered.
As can be observed from Figures 7 and 8, both the mean value and the standard deviation of the distance measurements vary during operation: a maximum variation of about 120 mm was detected for the mean value, while a maximum variation of about 4 mm was measured for the standard deviation. As in the previous case, since the estimated variations are nearly constant after 40 min of camera operation, a warm up period of 40 min is sufficient to achieve a good measurement stability of the CamCube3.0 camera. The camera warm up period is highly recommended in this case, in order to avoid distance errors of several centimeters. Therefore, all the following tests were performed after this warm-up period.
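The warm-up period is read here from the plots as the time after which the variations are nearly constant. One possible way to automate that reading is sketched below; the criterion (all later samples within a tolerance of the final value), the function name and the tolerance are assumptions, not the procedure used in this paper, and the numbers in the example are invented for illustration only.

```python
def stabilization_time(times_min, variations_mm, tolerance_mm):
    """Earliest time from which every sample stays within tolerance of the last one."""
    final = variations_mm[-1]
    stable_from = None
    for t, v in zip(times_min, variations_mm):
        if abs(v - final) <= tolerance_mm:
            if stable_from is None:
                stable_from = t  # start of a candidate stable stretch
        else:
            stable_from = None   # stretch broken, start over
    return stable_from


# Invented values, not measurements from the tested cameras.
times = [0, 5, 10, 15, 20, 25]
variations = [0.0, 5.0, 8.0, 9.5, 10.0, 10.0]
print(stabilization_time(times, variations, tolerance_mm=1.0))  # 15
```

The result depends strongly on the chosen tolerance and on the length of the record, so it should be checked against the plotted curves.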

Integration Time and Distance Measurement Precision
In the following, an estimation of the distance measurement precision of both the SR-4000 and the CamCube3.0 cameras is performed for varying image integration times.

Test on the SR-4000 Camera
In order to estimate the precision (standard deviation) of the distance measurements performed by the sensor pixels (n pixels), the following test was performed. The SR-4000 camera was positioned on a photographic tripod, parallel to a white wall. Then, 100 frames were acquired for each of the integration times reported in Table 2, where "auto" means the auto acquisition time suggested by the SR_3D_View software.
For each pixel i (each pixel is now identified with a single index, instead of row r and column c, to improve clarity of presentation), the mean value ($d_{i,m}$) and the standard deviation ($\sigma_i$) of the acquired distance measurements (number of frames f = 100) were estimated:

$$d_{i,m} = \frac{1}{f} \sum_{t=1}^{f} d_{i,t}, \qquad \sigma_i = \sqrt{\frac{1}{f-1} \sum_{t=1}^{f} \left( d_{i,t} - d_{i,m} \right)^2}$$

where $d_{i,t}$ is the distance measured by the i-th pixel at the t-th frame. In Figure 9 a histogram of the 100 distance measurements performed by the central pixel with an integration time of 11 ms for an approximate distance of 1.30 m between camera and wall is reported. The term "approximate distance" is used since the distance between the camera and its orthogonal projection on the wall was measured with a metal tape and the exact shape of the wall was unknown. However, this does not affect the results of the test, as only relative variations of the distance measurements are considered in the following. Suitable accuracy tests have already been performed during other experiments [23] for the SR-4000 camera and will be performed for the CamCube3.0 too in the future.

Figure 9. Histogram of the 100 distance measurements performed by the central pixel of the SR-4000 camera with an integration time of 11 ms (approximate distance camera-wall: 1.30 m).

Figure 9 shows that the maximum of the distance measurement distribution is very close to the approximate distance between camera and wall.
In order to compare data acquired with different integration times, the following were estimated: the mean value of the estimated standard deviations ($m_\sigma$) for all the pixels, which represents the mean precision of the sensor; the mean value of the range image (averaged over 100 frames) ($m_{Dm}$) and its standard deviation ($std_{Dm}$); the mean value of the amplitude image (averaged over 100 frames) ($m_{Am}$) and the mean value of the confidence map (averaged over 100 frames) ($m_{Cm}$):

$$m_\sigma = \frac{1}{n} \sum_{i=1}^{n} \sigma_i, \qquad m_{Dm} = \frac{1}{n} \sum_{i=1}^{n} d_{i,m}, \qquad std_{Dm} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( d_{i,m} - m_{Dm} \right)^2}$$

$$m_{Am} = \frac{1}{n f} \sum_{i=1}^{n} \sum_{t=1}^{f} A_{i,t}, \qquad m_{Cm} = \frac{1}{n f} \sum_{i=1}^{n} \sum_{t=1}^{f} C_{i,t}$$

where n is the number of pixels, f = 100 is the number of frames, and $A_{i,t}$ and $C_{i,t}$ are the amplitude and the confidence values for the i-th pixel at the t-th frame, respectively. This procedure was repeated three times, positioning the camera at different distances from the wall. The results are reported in Table 2. As can be seen from Table 2, for each camera position the data acquired with the auto acquisition time show: the lowest mean value of the pixel standard deviations ($m_\sigma$), and therefore the most precise distance measurements; a null or negligible number of saturated pixels, which is a fundamental condition in order to avoid gross errors in the acquired data; a less noisy distribution of the distance measurements over the acquired area of the wall, which is represented by small values of the $std_{Dm}$ term; the maximum value of the $m_{Cm}$ term, which represents the mean quality of the measurements performed by the pixels. Since the real distance between the camera and the wall was measured with a metal tape (without considering the real shape of the wall), no evaluation of absolute measurement accuracy can be done in this case; nevertheless, the variations of the mean value of the measured distances ($m_{Dm}$) across different integration times are very small, limited to some millimeters when only a few saturated pixels appear. For these reasons, the auto acquisition time will be adopted during data acquisition with the SR-4000 camera instead of adjusting it manually.
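The per-pixel and per-sensor quantities used in this test can be computed along the following lines. This is a minimal sketch with hypothetical names: each frame is taken as a flat list of n pixel values, and the n − 1 normalization of the standard deviations is an assumption rather than something stated in the text.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Standard deviation with n - 1 normalization (assumed, needs >= 2 values)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def precision_summary(distances, amplitudes, confidences):
    """Each argument is a list of frames; each frame is a flat list of n pixel values."""
    n = len(distances[0])
    per_pixel = [[frame[i] for frame in distances] for i in range(n)]
    d_m = [mean(p) for p in per_pixel]          # d_{i,m}: per-pixel mean distance
    sigma = [sample_std(p) for p in per_pixel]  # sigma_i: per-pixel precision
    return {
        "m_sigma": mean(sigma),    # mean precision of the sensor
        "m_Dm": mean(d_m),         # mean of the averaged range image
        "std_Dm": sample_std(d_m), # spread of the averaged range image
        "m_Am": mean([a for frame in amplitudes for a in frame]),
        "m_Cm": mean([c for frame in confidences for c in frame]),
    }
```

Running this once per integration time gives one row of a table like Table 2, so the settings can be compared on the same footing.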
In Figure 10 a 3D representation of the $\sigma_i$ term for each pixel over the whole sensor and the amplitude image (averaged over 100 frames) are reported. Figure 10(a) shows that the measurement precision is better for the central pixels than for the pixels at the corners of the sensor. In the case shown (1.30 m, i.t. = 11 ms), values of the pixel precision up to 0.013 m are observed at the image corners. This is directly related to the amplitude of the reflected signal: since the amplitude is lower at the corners of the image (yellow and green areas in Figure 10(b)), distance measurements with higher standard deviation and therefore lower precision are present. In the figure, a few saturated pixels are present in the central part of the image, which give a few higher values in the 3D representation (Figure 10(a)). This test shows the important relation between the strength of the reflected signal and the distance measurement precision. For this reason, it is important to properly adjust the integration time in order to have the highest amplitude values without reaching pixel saturation. The results show that the auto acquisition time suggested by the SR_3D_View software is consistent with this requirement.
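The selection rule stated above (highest amplitude without pixel saturation) can be written as a simple filter over candidate settings. The sketch below is not the algorithm used by the SR_3D_View software, which is not described here; the function name, the tuple layout and the example values are hypothetical.

```python
def pick_integration_time(candidates, max_saturated=0):
    """Return the integration time with the highest mean amplitude among the
    candidates whose saturated-pixel count does not exceed max_saturated.

    candidates: list of (integration_time_ms, mean_amplitude, n_saturated).
    Returns None when no candidate satisfies the saturation limit.
    """
    usable = [c for c in candidates if c[2] <= max_saturated]
    if not usable:
        return None
    return max(usable, key=lambda c: c[1])[0]


# Invented values for illustration only.
trials = [(2.0, 800.0, 0), (6.0, 2400.0, 0), (12.0, 4100.0, 35)]
print(pick_integration_time(trials))  # 6.0
```

Allowing a small non-zero `max_saturated` trades a few discarded pixels for a higher amplitude over the rest of the sensor.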

Test of the CamCube3.0 Camera
The same test described in the previous section was performed using the CamCube3.0 camera. Since the software delivered with this camera does not automatically adjust the integration time, this parameter was adjusted manually. Several integration times were adopted, spanning a small part of the range of possible integration times (from 20 to 50,000 μs for this camera), in order to have low noise in the distance measurements and a small number of saturated pixels. While the SR-4000 camera delivers a confidence map, the CamCube3.0 delivers a flag matrix with the acquired data. The flag matrix indicates for each pixel whether the camera detected problems with the measurement process. The meaning of the flags is reported in Table 3 [24].

Table 3. Possible values in the flag matrix delivered by the CamCube3.0 camera.

Flag                                            Value
SBI (Suppression of Background Illumination)    4
Low signal                                      8
Inconsistent                                    10

A value of zero means that no problems occurred during the measurement. Therefore, the expected quality of the acquired data can be obtained by computing the mean value of the flag matrix for the considered frame: the higher the mean value, the more problems occurred during the acquisition phase. For this reason, the mean value of the flag matrix (averaged over 100 frames) ($m_{Fm}$) was estimated as:

$$m_{Fm} = \frac{1}{n f} \sum_{i=1}^{n} \sum_{t=1}^{f} F_{i,t}$$

where $F_{i,t}$ is the flag value for the i-th pixel at the t-th frame. Since the integration time (i.t.) was adjusted manually, several integration times were employed for each of the three cases (Table 4). As can be seen from Table 4, the $m_\sigma$ term decreases when the i.t. increases, as expected. The number of pixels having a flag different from zero varies with increasing integration time in a non-linear way. The mean precision ($m_\sigma$) is about 0.002 m better for the SR-4000 camera than for the CamCube3.0 camera in the three tests (the same procedure was adopted for both cameras). The variations of the mean value of the measured distances ($m_{Dm}$) with integration time are larger for the CamCube3.0 than for the SR-4000 camera, even though smaller i.t. intervals are considered for the CamCube3.0 camera: in this case, variations up to 0.040-0.050 m are observed, even with a small number of saturated pixels. A similar behavior of non-negligible distance variations with changing integration time was also detected for previous PMD camera models. For example, in [25] distance variations of several centimeters were observed for the PMD19k camera.
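The flag-based quality indicator can be sketched as follows, together with the count of pixels having a non-zero flag. The function name and the flat-list frame layout are hypothetical, the averaging over all pixels and frames is an assumption about how the mean is formed, and the flag values are treated as plain numbers without any bit-level decoding.

```python
def flag_summary(flag_frames):
    """Mean flag value over all pixels and frames, and the number of pixels
    that carry a non-zero flag in at least one frame.

    flag_frames: list of frames, each a flat list of per-pixel flag values.
    """
    n_frames = len(flag_frames)
    n_pixels = len(flag_frames[0])
    total = sum(sum(frame) for frame in flag_frames)
    m_fm = total / (n_frames * n_pixels)
    flagged = sum(1 for i in range(n_pixels)
                  if any(frame[i] != 0 for frame in flag_frames))
    return m_fm, flagged


# Two frames of four pixels each, using flag values listed in Table 3.
frames = [[0, 8, 0, 0], [0, 8, 4, 0]]
print(flag_summary(frames))  # (2.5, 2)
```

Because different flags have different numeric values, the mean mixes the kind of problem with its frequency, so the count of flagged pixels is a useful companion figure.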
In Figure 11, a 3D representation of the $\sigma_i$ term of each pixel for the whole sensor and the amplitude image (averaged over 100 frames) are reported. Figure 11 shows that the measurement precision is better for the central pixels than for the pixels at the corners of the sensor, since the amplitude of the reflected signal is higher in the center. In the displayed case (i.t. = 0.7 ms, 1.30 m of distance), values of the pixel precision up to 0.010 m are observed at the image corners. As mentioned before, this variation is directly related to the amplitude of the reflected signal: since the amplitude is lower at the corners of the image (blue areas in Figure 11(b)), distance measurements with higher standard deviation and therefore lower precision are present. Comparing Figure 10(a) with Figure 11(a), one can see that the sensor precision is more homogeneous for adjacent pixels in the SR-4000. Again, this is a direct consequence of the amplitude distribution over the sensor: for the CamCube3.0 camera the central amplitudes are about four times those at the corners, while for the SR-4000 camera the central amplitudes are about twice those at the corners.
This test shows the relation between integration time, distance measurement precision and distance measurement values for the CamCube3.0 camera. It is necessary to properly adjust the integration time, taking into account the distance variations which exist even with small changes of the i.t. parameter. Future work will be performed to take into account this effect in a proper distance calibration model.

Conclusions and Future Work
In this paper experimental comparisons between the SR-4000 and CamCube3.0 cameras have been reported in order to evaluate their performance and to give some procedures for testing data from ToF cameras.
After a brief overview of commercial ToF cameras available on the market and the main specifications of the tested devices, two topics have been presented. First, the influence of camera warm-up on distance measurements was analyzed: a warm-up of 40 min is suggested to obtain distance measurement stability, especially in the case of the CamCube3.0 camera, for which distance measurement variations up to 0.12 m have been found during warm-up. Secondly, the variation of the distance measurement precision of the cameras with integration time was examined. Distance measurement precisions of 3-4 mm have been obtained in both cases, with the measurement precision improving with increasing integration time (and consequently with the amplitude of the reflected signal), as expected. Nevertheless, when the integration time of the CamCube3.0 camera is changed, distance variations up to 0.040-0.050 m are observed, while for the SR-4000 camera variations are very small, limited to 0.004-0.005 m when only a few saturated pixels appear. This test shows the important relation between integration time, distance measurement precision and distance measurement values for the CamCube3.0 camera. It is necessary to properly adjust the integration time, taking into account the distance variations which exist even with small changes of the integration time.
During the tests, a qualitative evaluation of the influence of sunlight on distance measurements has been performed too, in order to test the sensitivity of the sensors to sunlight. Since almost all ToF cameras based on the phase shift measurement use an infrared signal to measure distances, one main aspect to be considered is the influence of sunlight on the acquired data. Many recent ToF cameras support the Suppression of Background Illumination (SBI) modality or an equivalent IR-suppression scheme, allowing the use of the devices in outdoor applications as well. Nevertheless, data acquisition with ToF cameras using near-infrared wavelengths in direct sunlight can still be a hard task. Previous works, e.g., [22,26,27], have already reported problems of noisy data in outdoor acquisitions. Some first tests performed by our research group show that the CamCube3.0 camera is more robust to sunlight than the SR-4000 camera thanks to its SBI system, as expected from the information reported in the manufacturer data sheets of the two devices. In fact, the SR-4000 has been designed for indoor use and must not be used in direct sunlight [28], while the CamCube3.0 camera is equipped with the PhotonICs ® PMD 41k-S2 sensor [24], which includes the Suppression of Background Illumination (SBI) and is therefore suitable for both indoor and outdoor environments. Nevertheless, specific tests will be performed in the future in order to verify whether the acquired data are degraded by sunlight even with SBI. Figure 13 summarizes the results of the tests. These were confirmed by the camera manufacturer agents during the International Workshop on Range-imaging Sensors and Applications 2011 [29].
The red question mark reported for the distance measurement accuracy of the CamCube3.0 in Figure 13 is due to the fact that the distance measurement accuracy of the CamCube3.0 camera is not exactly known, but some preliminary tests performed by our group and already published [16] on the previous camera model (CamCube2.0) showed a distance measurement accuracy of some centimeters. Future work will estimate the actual distance measurement accuracy of the CamCube3.0 camera.