
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

In recent years, 3D-vision systems based on the time-of-flight (ToF) principle have gained importance as a means of obtaining 3D information from the workspace. In this paper, an analysis of the use of 3D ToF cameras to guide a robot arm is performed. To do so, an adaptive method for simultaneous visual servo control and camera calibration is presented. Using this method, a robot arm is guided by range information obtained from a ToF camera. Furthermore, the self-calibration method obtains the adequate integration time for the range camera in order to precisely determine the depth information.

Nowadays, visual servoing is a well-known approach to guiding a robot using visual information. The two main types of visual servoing techniques are position-based and image-based [

A typical approach to determining the depth of a target is the use of multiple cameras. The configuration most commonly applied with more than one camera is stereo vision (SV). In this case, in order to calculate the depth of a feature point by triangulation, the correspondence of that point in both cameras must be ensured.

In this paper, the use of 3D time-of-flight (ToF) cameras is proposed to obtain the 3D information required by visual servoing approaches. These cameras provide range images which give depth measurements of the visual features. In recent years, 3D-vision systems based on the ToF principle have gained importance compared to SV. With a ToF camera, the illumination and observation directions can be collinear; therefore, this technique does not produce incomplete range data due to shadow effects. Furthermore, SV systems have difficulties estimating the 3D information of homogeneous planes such as walls or roadways: the corresponding physical point of the observed 3D space cannot be found in both camera views, and hence its 3D coordinates cannot be calculated by applying the triangulation principle. Another standard technique for obtaining 3D information is the use of laser scanners. The advantages of ToF cameras over laser scanners are their high frame rates and the compactness of the sensor. These aspects have motivated the use of a ToF camera to obtain the 3D information required to guide the robot.

Some previous works have guided a robot by visual servoing using ToF cameras. Among these works, a visual servoing system using PSD (Position Sensitive Device) triangulation for PCB manufacturing is presented in [

When a ToF camera is used, some aspects must be taken into consideration, such as large fluctuations in precision caused by external interfering factors (e.g., sunlight) and scene configurations (

This paper is organized as follows: Section 2 presents a visual servoing approach for guiding a robot using an eye-in-hand ToF camera. Section 3 describes the operating principle of ToF cameras and the PMD camera employed. Section 4 presents an offline camera calibration approach for computing the required integration time from an amplitude analysis. Section 5 describes an algorithm for updating the integration time during the visual servoing task. Section 6 presents experimental results which confirm the validity of the visual servoing system and the calibration method. The final section presents the main conclusions.

A visual servoing task can be described by an image error function, e_t, which must be regulated to 0:

e_t = s − s*

where s = (f_1, f_2, …, f_M) is an M × 1 vector containing the M visual features observed at the current state (f_i = (f_ix, f_iy)), while s* contains the desired values of these features.

L_s represents the interaction matrix, which relates the variations in the image with the variations in the camera pose [

By imposing an exponential decrease of e_t (ė_t = −λ_1·e_t), it is possible to obtain the following control action for a classical image-based visual servoing:

v_c = −λ_1·L_s^+·e_t

where λ_1 > 0 is the control gain, v_c is the eye-in-hand camera velocity obtained from the control law in order to continuously reduce the error e_t, and L_s^+ is the pseudoinverse of the interaction matrix L_s [
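As a concrete illustration, the control action above can be sketched in a few lines (a minimal sketch assuming numpy; the function name and gain value are illustrative, not taken from the paper):

```python
import numpy as np

def ibvs_control(L_s, s, s_star, lam=0.5):
    """Classical IBVS action: v_c = -lam * pinv(L_s) @ (s - s_star).

    L_s    : (2M x 6) interaction matrix
    s      : (2M,) current visual features
    s_star : (2M,) desired visual features
    lam    : control gain (lambda_1 > 0)
    """
    e_t = s - s_star                          # image error to be regulated to 0
    return -lam * np.linalg.pinv(L_s) @ e_t   # camera velocity screw
```

The pseudoinverse handles the usual case where more than three features make L_s non-square.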

First, the interaction matrix is calculated for the case in which only one image feature, f = (f_x, f_y), is extracted. The transformation between the range image and the sensor array depends on s_x and s_y, the pixel sizes in the x and y directions, and on the coordinates (u_0, v_0) of the optical center on the sensor array.

To obtain the interaction matrix, the intrinsic parameters ξ = (f_{u}, f_{v}, u_{0}, v_{0}) are considered, where f_{u} = f·s_{x} and f_{v} = f·s_{y}. Therefore, considering these intrinsic parameters,

From (5) the coordinates of the image feature can be obtained as:

The time derivative of the previous equation is:

Considering the camera velocity, composed of the translational components v_x^C, v_y^C, v_z^C and the rotational components ω_x^C, ω_y^C, ω_z^C, the following expression can be obtained from the previous equation:

Developing the previous equation, an expression which relates the time derivative of the image features with the camera translational and rotational velocity can be obtained:

The matrix obtained in (9) is the interaction matrix for a single feature, L_si. Therefore, when M features are observed, ṡ = L_s·v_c with L_s = [L_s1 L_s2 … L_sM]^T, where L_si is the interaction matrix determined in (9) for only one feature.
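A sketch of this stacking, using the standard point-feature interaction matrix expressed in normalized image coordinates (x, y) with depth Z taken from the range image (the paper's pixel-level form in (9) additionally includes the intrinsic scaling, so this is an illustrative variant, not the authors' exact matrix):

```python
import numpy as np

def point_interaction(x, y, Z):
    """2x6 interaction matrix of one point (x, y) at depth Z
    (standard form in normalized image coordinates)."""
    return np.array([
        [-1.0 / Z, 0.0,       x / Z, x * y,       -(1.0 + x * x),  y],
        [0.0,      -1.0 / Z,  y / Z, 1.0 + y * y, -x * y,         -x],
    ])

def stacked_interaction(features, depths):
    """Stack the per-feature blocks into the complete (2M x 6) matrix L_s."""
    return np.vstack([point_interaction(x, y, Z)
                      for (x, y), Z in zip(features, depths)])
```

The range image supplies the depths directly, which is precisely the advantage the ToF camera brings over estimating Z.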

Several previous works have studied the stability of image-based visual servoing. In applications with commercial robots, the complete dynamic robot model is not provided. In these cases, the system stability is deduced from kinematic properties [

In this section, an analysis of the behaviour of ToF cameras is provided. This analysis helps to define the methods used to improve the depth measurements employed by the visual servoing system. A PMD19K camera has been used in this analysis. The PMD19K contains a Photonic Mixer Device (PMD) array with a size of 160 × 120 pixels; the sensor is based on CMOS technology and the time-of-flight (ToF) principle.

There are other similar cameras based on the same principle and on CMOS technology, such as the CamCube 2 or 3 of PMD-Technologies and the SR2, SR3000 or SR4000 of CSEM-Technologies. The specifications and a comparison of the behaviour of these cameras are available in [. From the four phase measurements r_0(0°), r_1(90°), r_2(180°), r_3(270°), the camera computes the phase delay, ϕ, the amplitude, a, and the distance between the sensor and the target, z, as follows:
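The equations referred to here are the standard four-phase demodulation relations for continuous-wave ToF sensors; they are reconstructed below in their usual form (c denotes the speed of light and f_mod the modulation frequency, both assumed rather than taken from the text):

```latex
\phi = \arctan\!\left(\frac{r_3 - r_1}{r_0 - r_2}\right), \qquad
a = \frac{\sqrt{(r_3 - r_1)^2 + (r_0 - r_2)^2}}{2}, \qquad
z = \frac{c\,\phi}{4\pi f_{\mathrm{mod}}}
```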

This type of camera has some disadvantages [

In a visual servoing system with eye-in-hand configuration (

In previous works, some experiments were performed in order to observe the evolution of the distance measured by the camera as the integration time changed. In those experiments, from 750 images (with an integration time offset of 100 ms between consecutive images), a relationship between the mean distance value,

As regards the amplitude measurements, the curve which shows the evolution of the mean amplitude can be computed from a set of images acquired at a nominal fixed distance (the same as the mean distance computed in the previous test). From this curve, the limits of the integration time interval [τ_min, τ_max] which are needed in order to guarantee the precise computation of the distance measurements are obtained. The minimum integration time, τ_min, is computed as the minimum integration time needed to compute the image depth at the desired camera location. It is determined as the time value where a least-squares line fitting the mean amplitude curve crosses the zero axis. The maximum integration time, τ_max, is computed as the maximum integration time needed to compute the image depth at the initial camera location. These limits are obtained by the following procedure:

Place the robot in the initial pose and capture an image, I_τ

Compute mean amplitude: a_{m}

Estimate the frequency histogram for a_m and fit it by means of the K-S and A-D tests in order to classify the scene, according to a look-up table, as a near or far target

τ_{min} is computed from the zero crossing determined by the fitting of the curve which represents the image at the maximum distance (min{

τ_{max} is computed as the suitable integration time for obtaining a desired mean amplitude, a_d, such that:
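The two limits can be sketched as follows, assuming a least-squares line fit to the sampled mean-amplitude curve (the linear model for τ_max is an assumption made for illustration; the paper only states that τ_max should yield the desired mean amplitude a_d):

```python
import numpy as np

def tau_min_from_amplitudes(taus, mean_amps):
    """Minimum integration time: zero crossing of the least-squares line
    fitted to the sampled mean-amplitude curve."""
    m, b = np.polyfit(taus, mean_amps, 1)  # a_m ~= m * tau + b
    return -b / m

def tau_max_for_amplitude(taus, mean_amps, a_d):
    """Integration time at which the fitted line reaches the desired
    mean amplitude a_d."""
    m, b = np.polyfit(taus, mean_amps, 1)
    return (a_d - b) / m
```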

The amplitude analysis of

Once the integration time values for the final and initial camera positions have been computed, some intermediate integration times, τ_k,

Fix the integration time as τ_0 = τ_max for image I_0

Compute the deviation error e_{a} = a_{d} – (a_{m})_{0} where a_{d} = max{a_{m}} according to a desired minimum distance.

Update the integration time following the control law τ_k = τ_{k–1}·(1 + K·e_a), where K is a proportional constant adjusted depending on the robot velocity.
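One step of this update law can be written directly from the statement above (the default gain K is illustrative; the paper adjusts K according to the robot velocity):

```python
def update_integration_time(tau_prev, a_m, a_d, K=0.05):
    """Integration-time control law: tau_k = tau_{k-1} * (1 + K * e_a),
    with amplitude deviation e_a = a_d - a_m."""
    e_a = a_d - a_m
    return tau_prev * (1.0 + K * e_a)
```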

In this way, some intermediate integration time values, τ_k ∈ [τ_min, τ_max], have been estimated for different distances between the final and the initial positions, so that the depth can be properly computed along the trajectory. τ_min and τ_max were obtained as 10 ms and 46.4 ms (the upper quartile of the maximum value shown in the amplitude analysis), and the intermediate values τ_k were all computed according to the previous calibration method.

From the previous analysis, a method to automatically update the integration time during visual servoing tasks is presented in this section.

Considering the extrinsic parameters (the pose of the object frame with respect to the camera frame), an object point can be expressed in the camera coordinate frame as:

Considering a pin-hole camera projection model, the point

Finally, the coordinates of (17), specified in metric units (e.g., mm), are scaled and transformed into coordinates in pixels relative to the image reference frame, as:

where ξ = (f_u, f_v, u_0, v_0) are the camera intrinsic parameters.

The intrinsic parameters describe properties of the camera used, such as the position of the optical center (u_0, v_0) and the pixel size and focal length combined in (f_u, f_v). They are computed by a calibration process based on [
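The projection from camera coordinates to pixel coordinates described by the two previous equations can be sketched as follows (function name illustrative, assuming numpy):

```python
import numpy as np

def project_to_pixels(p_cam, f_u, f_v, u0, v0):
    """Project a 3D point p_cam = (X, Y, Z), expressed in the camera frame,
    onto the image plane (pin-hole model) and scale to pixel coordinates
    using the intrinsic parameters (f_u, f_v, u0, v0)."""
    X, Y, Z = p_cam
    x, y = X / Z, Y / Z                            # perspective projection
    return np.array([f_u * x + u0, f_v * y + v0])  # (u, v) in pixels
```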

During a visual servoing task, the camera extrinsic parameters are not known, and an estimation of the real camera pose is considered. In order to determine this pose, the error between the observed data, s_o, and the position of the same features computed by back-projection employing the current extrinsic parameters must be progressively minimized.

The time derivative of

To make this error decrease exponentially, the following control law is applied, where λ_2 is a positive control gain.

Consequently, two estimations are obtained for the depth of a given image feature: one depth (Z_1) from the previously estimated extrinsic parameters and another depth (Z_2) from the range image.

When the system is correctly calibrated, Z_1 and Z_2 are equal. Therefore, a new control law is applied in order to update the integration time, τ, by minimizing the error between Z_1 and Z_2:

where λ_3 > 0.

The algorithm for updating the camera integration time is summarized in the following lines. First, perform the offline camera calibration to determine the initial integration time and the limits of the integration time interval.

At each iteration of the visual servoing task:

Apply the control action to the robot:

Estimate the extrinsic parameters using virtual visual servoing.

Determine the depth, Z_1, from the previous extrinsic parameters and Z_2 from the range image (10).

Update the integration time by applying the control law described above.
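The per-iteration steps above can be condensed into a single sketch (the proportional τ correction is a stand-in for the paper's update law, whose exact equation is not reproduced in the text; all names and gains are illustrative):

```python
import numpy as np

def servo_iteration(L_s, s, s_star, tau, Z1, Z2, lam1=0.5, lam3=0.01):
    """One iteration of the combined visual servoing / calibration loop.

    Z1 : depth estimated from the current extrinsic parameters
    Z2 : depth read from the range image
    Returns the camera velocity command and the updated integration time."""
    v_c = -lam1 * np.linalg.pinv(L_s) @ (s - s_star)  # IBVS control action
    tau_new = tau - lam3 * (Z1 - Z2)                  # drive the depth mismatch to 0
    return v_c, tau_new
```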

In order to describe more clearly the interactions among all the subsystems that compose the proposed visual servoing system, a block diagram is represented in

The target used for the experiments can be seen in

The real distance between camera and target (background and objects) for this first experiment was
The image features are f_1, f_2, f_3 and f_4 for the initial robot pose and f_1*, f_2*, f_3* and f_4* for the final pose.

In

Furthermore,

Applying the algorithm described in Section 5 from the initial and desired image features location, the image trajectory presented in

In order to perform the correct tracking, the integration time is updated at each iteration of the visual servoing task using the algorithm described in Section 5.

The image ranges shown in

In this case, a trajectory with a displacement only in depth is described. The initial and final positions of the features in the image are (68,51), (86,51), (68,70), (86,70) and (56,43), (93,43), (56,80), (93,80), respectively. The initial distance between the eye-in-hand camera and the object is 1,160 mm and the final distance is 560 mm. By using the proposed control law, the robot is able to perform the displacement in depth precisely. The minimum, τ_min, and maximum, τ_max, values of the integration time are 10 ms and 57.4 ms, respectively. Therefore, when the theoretical value for the integration time is greater than τ_max, this parameter is saturated to 57.4 ms (see

As described in [, L_s* is the value of L_s for the desired position, s*.

In this experiment there are important variations in the distance between the camera and the object from which the features are extracted. The initial and final depths are 1,160 mm and 680 mm, respectively, and during the task the depth reaches 1,760 mm. Thus, with a fixed integration time, important errors appear and the task cannot be performed. Therefore, the integration time has to be updated with the approach described in this paper, producing the evolution represented in the corresponding figure. The minimum, τ_min, and maximum, τ_max, values are obtained in the same way as in the previous experiment, according to

This paper presents a new image-based visual servoing system which integrates range information in the interaction matrix. Another property of the proposed system is the possibility of performing the camera calibration during the task. To do this, the visual servoing system uses the range images not only to determine the depths of the object features but also to adjust the camera integration time during the task.

When a ToF camera is employed to guide a robot, the distance between the camera and the objects of the workspace changes. Therefore, the camera integration time must be updated in order to correctly observe the objects of the workspace. As demonstrated in the experiments, the integration time must be updated depending on the distance between the camera and the objects. The proposed approach guarantees that the information obtained from the ToF camera is accurate because an adequate integration time is employed at each moment. This permits a better estimation of the object depths. Therefore, the behaviour of the visual servoing system is enhanced with respect to previous approaches where this parameter is not accurately estimated. Currently, we are working on determining an accurate dynamic model of the robot to improve the visual servoing control law in order to assure the given specifications during the task.

The authors want to express their gratitude to the Spanish Ministry of Science and Innovation for their financial support through the project DPI2008-02647 and to the Research and Innovation Vicepresident Office of the University of Alicante for their financial support through the emergent projects.

Evolution of the mean distance of the range image for two different scenes: (a) An object and the camera moved between 0.5 m and 1 m. (b) Four objects and the camera moved between 0.3 m and 0.8 m.

Evolution of mean amplitude, a_{m}, for the tests of

Polynomial interpolation applied to compute

Block diagram of the visual servoing system.

(a) Initial position of the image features and the eye-in-hand camera. (b) Final position of the image features and the eye-in-hand camera. (Trajectory 1).

Range Image computed for the integration time of 53 ms.

(a) Evolution of the measured amplitude when the integration time is not updated. (b) Evolution of the depth parameter when the integration time is not updated.

Trajectory during the visual servoing task. (a) Trajectory of the image features. (b) Trajectory of the eye-in-hand camera. Experiment 1.

Velocities during the visual servoing task. Experiment 1.

Integration time values at each iteration of the visual servoing task. Trajectory 1.

Range Image computed for the integration time updated at each iteration.

Trajectory during the visual servoing task. (a) Trajectory of the image features. (b) Trajectory of the eye-in-hand camera. Experiment 2.

Integration time values at each iteration of the visual servoing task. Experiment 2.

(a) Initial position of the image features and the eye-in-hand camera. (b) Final position of the image features and the eye-in-hand camera. Experiment 3.

Image trajectory when

Trajectory during the visual servoing task. (a) Trajectory of the image features. (b) Trajectory of the eye-in-hand camera. Experiment 3.

Integration time values at each iteration of the visual servoing task. Experiment 3.