Camera-Based Method for Identification of the Layout of a Robotic Workcell

Featured Application: A fast and low-cost process for automated identification of the positions of workcell components, including robots. Suitable for rapid deployment of robotic applications without the need for prior simulations or CAD modeling.
Abstract: In this paper, a new method for the calibration of robotic cell components is presented and demonstrated by identification of an industrial robotic manipulator's base and end-effector frames in a workplace. It is based on a mathematical approach using a Jacobian matrix. In addition, the presented method makes identification of other kinematic parameters of a robot possible. The Universal Robot UR3 was chosen to prove the working principle in both simulations and an experiment, with a simple, repeatable, low-cost solution for such a task: image analysis to detect tag markers. The results showing the accuracy of the system are included and discussed.


Introduction
For robotic arms there has always been a trade-off between repeatability and absolute accuracy of a robot's positioning in 3D space, as examined by Abderrahim [1] and Young [2]. Many manufacturers of industrial robots state only the repeatability in their datasheets, which is considerably better than the absolute positioning accuracy. The general problem of robot accuracy is described with experiments by Salmani [3].
The absolute positioning accuracy of a robot describes how accurately the robot can move to a position with respect to a frame. To achieve better results, parameter identification and robot calibration are performed. Identification is the process in which a real robot's kinematic (and possibly dynamic) characteristics are compared with its mathematical model. It includes determination of the error values that are afterwards applied in the control system, which improves the robot's total pose accuracy in software without the need to adjust the hardware of the robot. A commonly recommended method for robot calibration is the use of a laser tracker. The methodology identifies the error parameters of a robot's kinematic structure, as described by Nubiola [4]. The precision may be increased further, as Wu showed in [5] by filtering measurement errors and finding optimal measurement configurations. In [6] Nguyen added a neural network to compensate for non-geometric errors after the parameter identification was performed.
Unfortunately, these solutions are very expensive because of the price of a laser tracker. One may rent a laser tracker if needed, but the process is also time-consuming: precise experiments, measurements and evaluations have to be repeated after every error made during the process that might lead to incorrect final results. Therefore, the wide deployment of laser trackers is impractical for many manufacturers.
There are other methods of robot calibration that avoid the use of a laser tracker. In [7], Joubair proposed a method using the planes of a very precisely made granite cube, but acquiring such a cube is not easy in general. Filion [8] and Moller [9] used additional equipment, in their case a portable photogrammetry system. In [10] a new methodology is introduced by Lembono, who suggested using three flat planes in a robot's workplace with a 2D laser range finder that intersects the planes, but the simulation was not verified by an experiment. A very different approach was taken by Marie [11], who presented an elasto-geometrical calibration method based on finite element theory and fuzzy logic modeling.
On the other hand, the very precise results that the methodologies above aim to achieve are not always necessary, and some deviations in a robot's kinematic structure that arise during its manufacturing are acceptable for the users of the robot. The problem they may face is the determination of the workplace coordinate system (base frame) in which the robot is deployed and, possibly, the offset of the tool's center point when a tool is attached to the robot's mounting flange and needs to be positioned absolutely in a world frame.
For such applications the typical way to calibrate multiple robots is to use point markers attached to every robot, as described by Gan [12]. However, one important condition is that the robots need to be close together so they can approach each other with the point markers and perform the calibration. Additionally, there are a few optical methods using a camera to improve a robot's accuracy. Arai [13] used two cameras placed in specific positions to track an LED tag that was mounted on a robot; the method we propose allows the camera to be placed anywhere, in any orientation that provides good visibility. Motta [14] and Watanabe [15] attached a camera to a robot and performed the identification process, but this cannot be used for other robots or to track positions of other components at the workplace at the same time. Van Albada describes in [16] the process of identification for a single robot. Santolaria presented in [17] the use of on-board measurement sensors mounted on a robot.
To avoid these restrictions, we propose a solution based on the OpenCV libraries [18] for Aruco tag detection by a camera, which adds to the calibration process benefits of simplicity, repeatability and low price. The outcomes may be used in offline robot programming, in reconfigurable robotic workplaces and for tracking of components, with as many tag markers and robots as needed, if the visibility for a camera or multiple cameras is provided.
Methods for 2D camera calibration have already been presented, and they can be divided into two main approaches: eye-on-hand calibration, wherein the camera is mounted on the robot and the calibration plate is static, and the eye-on-base method, with the calibration marker mounted on the robot and static cameras around it [19]. There are also Robot Operating System (ROS) packages [20,21] providing tools for 2D or 3D camera calibration using these two methods. ROS is an advanced universal platform that some researchers may find difficult to use. Our approach combines both eye-on-hand and eye-on-base calibration processes, avoids using ROS and can be applied not only to localize the base of a robot but also to localize other devices or objects in the workplace that are either static or of known kinematic structure (multiple robots) in relation to a chosen world frame.

Materials and Methods
When an image with a tag is obtained, the OpenCV library's algorithm assigns a coordinate frame to the tag and can calculate the transformation from the camera to the tag. If tags are placed on all important components of a cell, such as manipulated objects or pallets, the transformations between them may be calculated as well. If an industrial robot is deployed in a workplace, we can attach an end-effector with Aruco tags to it, run a trajectory while measuring the transformations and, using mathematical identification methods, calculate the precise position of its base, no matter where it is.
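OpenCV pose estimation routines such as solvePnP return a pose as a rotation vector (axis-angle) and a translation vector. The following is a minimal sketch, using only NumPy, of how such a pair could be turned into the 4×4 homogeneous camera-to-tag transform used in the rest of this section. The function name `pose_to_matrix` and the numeric values are illustrative assumptions, not part of the original scripts.

```python
import numpy as np

def pose_to_matrix(rvec, tvec):
    """Build a 4x4 homogeneous transform from an axis-angle vector and a translation."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)          # rotation angle in radians
    T = np.eye(4)
    if theta > 1e-12:
        k = rvec / theta                  # unit rotation axis
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        # Rodrigues' rotation formula
        T[:3, :3] = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T[:3, 3] = np.asarray(tvec, dtype=float).reshape(3)
    return T

# Hypothetical detection: tag rotated 90 degrees about the camera z axis,
# located 0.5 m in front of the camera.
T_cam_tag = pose_to_matrix([0.0, 0.0, np.pi / 2], [0.1, 0.2, 0.5])
print(np.round(T_cam_tag, 3))
```

Keeping every pose as a homogeneous matrix makes it possible to chain and invert transforms with plain matrix products.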

Geometric Model of a Robot
For such an identification, a geometric model of the robot that is as precise as possible needs to be determined. The Universal Robot UR3 was chosen to demonstrate the proposed solution. Its geometric model used for all calculations is based on the modified Denavit-Hartenberg (MDH) notation described by Craig [22]. Our geometric model consists of 9 coordinate systems. The "b" frame is the reference coordinate system (world frame); later in our measurements it is represented by a tag marker placed on a rod. The "0" frame represents the base frame of the robot. The frames "1" to "6" represent the joints; the position of the 6th frame corresponds to the mounting flange. The "e" frame stands for the tool offset, in this case a measuring point that was observed by the sensor. The scheme of the model is illustrated in Figure 1. The MDH parameters are listed in Table 1, where $o_i$ stands for the offset of the $i$th joint in this position.
Table 1. MDH parameters of the UR3 robot.
For a vector $q = [q_1, q_2, q_3, q_4, q_5, q_6]^T$ representing the joint variables, the homogeneous transformation matrix $T_{be}(q)$ gives the position and orientation $P$ of the UR3's end-effector tool frame "e" with respect to the base frame "b" of the workplace.
According to [22], the matrix $A_{i-1,i}$ in MDH notation is obtained by multiplying the rotation matrix $R_x$ about the x axis, the translation matrix $T_x$ along the x axis, the rotation matrix $R_z$ about the z axis and the translation matrix $T_z$ along the z axis.
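The product described above can be written out in closed form. The sketch below implements the link transform in Craig's MDH convention and chains it over the joints; the function names, the argument order and the handling of the joint offsets $o_i$ (added to the joint variable) are assumptions made for illustration.

```python
import numpy as np

def mdh_transform(alpha_prev, a_prev, theta, d):
    """MDH link transform A_{i-1,i} = Rx(alpha_{i-1}) Tx(a_{i-1}) Rz(theta_i) Tz(d_i)."""
    ca, sa = np.cos(alpha_prev), np.sin(alpha_prev)
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [ct,      -st,      0.0,  a_prev],
        [st * ca,  ct * ca, -sa,  -sa * d],
        [st * sa,  ct * sa,  ca,   ca * d],
        [0.0,      0.0,     0.0,   1.0],
    ])

def chain(mdh_rows, q):
    """Multiply the link transforms; each row is (alpha_{i-1}, a_{i-1}, d_i, o_i)."""
    T = np.eye(4)
    for (alpha_prev, a_prev, d, offset), q_i in zip(mdh_rows, q):
        T = T @ mdh_transform(alpha_prev, a_prev, q_i + offset, d)
    return T
```

The rows passed to `chain` would come from Table 1; they are deliberately not hard-coded here.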
The geometric model of the UR3 is mathematically expressed by the transformation matrix $T_{be}(q)$ given in Equation (2). Matrix $A_{b0}$ is the displacement between the reference "b" frame and the "0" frame; the orientation difference is represented by the rotation matrix $R_0$. Matrix $A_{6e}$ is the displacement between the mounting flange and the "e" frame of the end-effector. The objective of this study is to determine the 12 parameters of the $A_{b0}$ matrix, that is, to find the base frame "0" of a robot in a workplace. To achieve this, the displacement of the end-effector ($x_e$, $y_e$ and $z_e$) must also be identified during the calculations; however, the rotational part of $A_{6e}$ can be freely chosen. The reason is that the transform is static (there is no joint variable between the 6th frame of the robot and the end-effector "e" frame). For simplicity, we choose $R_e$ as the identity matrix.
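The equation bodies are not reproduced in this text. A form consistent with the description above would be the following, where the symbols $p_0$ and $p_e$ for the two translation vectors are introduced here only for illustration:

$$T_{be}(q) = A_{b0}\,A_{01}(q_1)\,A_{12}(q_2)\,A_{23}(q_3)\,A_{34}(q_4)\,A_{45}(q_5)\,A_{56}(q_6)\,A_{6e}$$

$$A_{b0} = \begin{bmatrix} R_0 & p_0 \\ 0 & 1 \end{bmatrix}, \quad R_0 = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}, \quad p_0 = \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix}, \quad A_{6e} = \begin{bmatrix} R_e & p_e \\ 0 & 1 \end{bmatrix}, \quad R_e = I, \quad p_e = \begin{bmatrix} x_e \\ y_e \\ z_e \end{bmatrix}$$

Under this form, the three translation components and nine rotation elements of $A_{b0}$ give the 12 base-frame parameters, and $p_e$ contributes the remaining three.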
In general, geometric models are idealized and very difficult to match with real conditions because of manufacturing inaccuracy and environmental influences. Error values can nevertheless be estimated and included in the mathematical models. Finding the relations between the theoretical and the real model is the crucial task of robot calibration. To find such a relation, the geometric parameters of a device have to be identified. Robot identification is a process wherein error parameter values are determined using the results of a test measurement. In the following simulation the UR3 robot performed a trajectory as described in Section 3. The obtained end-effector positions $P_c$ were compared with the robot's position $P(q)$ based on $q_i$ for each joint.
The parameters $x_0$, $y_0$, $z_0$, $r_{11}$, $r_{12}$, $r_{13}$, $r_{21}$, $r_{22}$, $r_{23}$, $r_{31}$, $r_{32}$, $r_{33}$, $x_e$, $y_e$ and $z_e$ were chosen to be identified. The end-effector frame is identified because the Aruco tags may be placed on a low-cost 3D-printed end-effector, and the transformation taken from the designed CAD model might differ from the real part. On the other hand, we wanted to make the model as simple as possible, so we avoided identifying the MDH parameters between particular joints and links of the robot, which would amount to a full calibration of the robot. We assume that this simple identification process may compensate for the small errors between the links.
Given the transformation matrix $T_{be}(q)$ defined above, the position vector $P$ of the end-effector with respect to the reference frame is the 4th column of $T_{be}$. In the ideal case, these coordinates are equal to the coordinates $P_c$ obtained from the position sensor, as Equation (6) shows for a specific $q$; here $T_{be}$ denotes the 1st to 3rd elements of the 4th column of the transformation matrix.
$$P_c = T_{be}(q) = P(q) \qquad (6)$$

Identification with the Jacobian Method
The most common method for parameter identification is the application of a Jacobian, which is also described, for example, in [6,16]. This iterative method uses the Jacobian matrix that is obtained as the partial derivative of the position vector (1st to 3rd elements of the 4th column) of $T_{be}$ with respect to the parameters in $X$, the parameters that are going to be identified. Symbolically, the Jacobian is expressed as the 3×15 matrix $J_i$ in Equation (7), where $p_i$ stands for a parameter of $X$.
For every measured point $i$, the matrix $J_i$ is determined. Stacking the matrices of all measured points gives a 3n×15 matrix $J$, where $n$ is the number of measured points.
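Since the equation bodies are not reproduced in this text, the following is a sketch of what the parameter vector and the Jacobian look like under the definitions above. The ordering of the parameters inside $X$ and the index $k$ (used here instead of $i$ to avoid a clash with the measurement index) are assumptions:

$$X = [x_0,\ y_0,\ z_0,\ r_{11},\ r_{12},\ \dots,\ r_{33},\ x_e,\ y_e,\ z_e]^T = [p_1,\ p_2,\ \dots,\ p_{15}]^T$$

$$J_i = \frac{\partial P(q^{(i)}, X)}{\partial X} = \begin{bmatrix} \dfrac{\partial P_x}{\partial p_1} & \cdots & \dfrac{\partial P_x}{\partial p_{15}} \\ \dfrac{\partial P_y}{\partial p_1} & \cdots & \dfrac{\partial P_y}{\partial p_{15}} \\ \dfrac{\partial P_z}{\partial p_1} & \cdots & \dfrac{\partial P_z}{\partial p_{15}} \end{bmatrix} \in \mathbb{R}^{3 \times 15}, \qquad J = \begin{bmatrix} J_1 \\ J_2 \\ \vdots \\ J_n \end{bmatrix} \in \mathbb{R}^{3n \times 15}$$

Here $q^{(i)}$ denotes the joint vector recorded at the $i$th measured point.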
As the first step of every iteration, a position vector $Y_m$ is calculated using the values $X_j$, where $j$ is the iteration step. For the first iteration, guessed values $X_0$ are used. The vector $q$ contains the joint variables measured by the robot's joint position sensors.
The next step is to calculate the difference between the position measured by the camera, $Y_c$, and the previously calculated position $Y_m$, which determines $\Delta Y$. $Y_c$ is an n×3 matrix, where $n$ stands for the number of measured points and the three columns are the measured coordinates x, y, z.
The key equation of this method is Equation (11). The Jacobian matrix changes whenever the position changes, so it is evaluated for the current values; $\Delta x$ can therefore be interpreted as the change of the parameters in $X$ that corresponds to the position difference $\Delta Y$.
By using the matrix operations in Equations (12) and (13), the values of $\Delta x$ are determined.
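The bodies of these equations are not reproduced in this text. A standard least-squares derivation that is consistent with the description (a linear relation, followed by two matrix operations) would read as follows, assuming $\Delta Y$ is stacked into a vector of length $3n$ so that it matches the 3n×15 matrix $J$:

$$\Delta Y = Y_c - Y_m$$

$$\Delta Y = J\,\Delta x$$

$$J^T \Delta Y = J^T J\,\Delta x$$

$$\Delta x = (J^T J)^{-1} J^T \Delta Y$$

The last line requires $J^T J$ to be invertible, which in turn requires measured configurations that are varied enough to excite all 15 parameters.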
At the end of each iteration, the computed values $\Delta x$ are added to $X_j$ to obtain $X_{j+1}$. A convergence check follows to decide whether another iteration step is needed.
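The iteration described in this section can be sketched in Python with NumPy. This is an illustrative sketch on synthetic, noise-free data, not the authors' Matlab implementation. The MDH rows in `MDH` are hypothetical values rather than those of Table 1, joint offsets are omitted, the Jacobian is derived analytically from the assumed model $P = p_0 + R_0 (p_6 + R_6 p_e)$, and `np.linalg.lstsq` is used to compute the least-squares step instead of forming $(J^T J)^{-1}$ explicitly. As in the text, the nine rotation elements are treated as independent parameters.

```python
import numpy as np

# Hypothetical MDH rows (alpha_{i-1}, a_{i-1}, d_i) in rad and m.
# Illustrative only: these are NOT the UR3 values of Table 1.
MDH = [(0.0, 0.0, 0.15), (np.pi / 2, 0.0, 0.0), (0.0, 0.25, 0.0),
       (0.0, 0.20, 0.10), (np.pi / 2, 0.0, 0.08), (-np.pi / 2, 0.0, 0.08)]

def mdh(alpha, a, theta, d):
    """MDH link transform Rx(alpha) Tx(a) Rz(theta) Tz(d)."""
    ca, sa, ct, st = np.cos(alpha), np.sin(alpha), np.cos(theta), np.sin(theta)
    return np.array([[ct, -st, 0.0, a],
                     [st * ca, ct * ca, -sa, -sa * d],
                     [st * sa, ct * sa, ca, ca * d],
                     [0.0, 0.0, 0.0, 1.0]])

def flange_pose(q):
    """Pose of frame 6 (mounting flange) expressed in the robot base frame 0."""
    T = np.eye(4)
    for (alpha, a, d), theta in zip(MDH, q):
        T = T @ mdh(alpha, a, theta, d)
    return T

def predict(X, T06):
    """End-effector position in frame b for X = [p0 (3), R0 row-major (9), pe (3)]."""
    p0, R0, pe = X[:3], X[3:12].reshape(3, 3), X[12:]
    return p0 + R0 @ (T06[:3, 3] + T06[:3, :3] @ pe)

def jacobian(X, T06):
    """3x15 matrix of partial derivatives of predict() with respect to X."""
    R0, pe = X[3:12].reshape(3, 3), X[12:]
    v = T06[:3, 3] + T06[:3, :3] @ pe           # end-effector position in frame 0
    return np.hstack([np.eye(3),                 # d P / d p0
                      np.kron(np.eye(3), v[None, :]),  # d P / d R0 (row-major)
                      R0 @ T06[:3, :3]])         # d P / d pe

def identify(qs, Yc, X0, tol=1e-10, max_iter=50):
    """Iterate X_{j+1} = X_j + dx until the update dx becomes negligible."""
    poses = [flange_pose(q) for q in qs]
    X = np.array(X0, dtype=float)
    for _ in range(max_iter):
        Ym = np.concatenate([predict(X, T) for T in poses])   # length 3n
        J = np.vstack([jacobian(X, T) for T in poses])        # 3n x 15
        dY = np.asarray(Yc, dtype=float).ravel() - Ym
        dx, *_ = np.linalg.lstsq(J, dY, rcond=None)           # least-squares step
        X += dx
        if np.linalg.norm(dx) < tol:                          # convergence check
            break
    return X

# Synthetic test: made-up ground truth and 60 random joint configurations.
rng = np.random.default_rng(0)
qs = rng.uniform(-np.pi, np.pi, size=(60, 6))
c, s = np.cos(0.5), np.sin(0.5)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
X_true = np.concatenate([[0.4, -0.2, 0.1], R_true.ravel(), [0.02, -0.01, 0.10]])
Yc = np.array([predict(X_true, flange_pose(q)) for q in qs])  # n x 3 "camera" data
X0 = np.concatenate([np.zeros(3), np.eye(3).ravel(), np.zeros(3)])  # initial guess
X_hat = identify(qs, Yc, X0)
print("largest parameter error:", np.abs(X_hat - X_true).max())
```

With real camera data the residual does not vanish, so the stopping threshold `tol` would have to be chosen with the measurement noise in mind.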

Simulations and Experiment
Two types of simulations and one experiment were performed to verify the proposed method of parameter identification. Simulation A was calculated only with absolute positions of the end-effector coordinates determined by CoppeliaSim software with the built-in UR3 model, as shown in Figure 2.
The robot moved along the same path, consisting of 250 points, in both simulations and the experiment. The robot stopped at each pose and the measurements were taken. The path was chosen to obtain joint coordinates that were as varied as possible; on the other hand, because the experiment was performed with cameras, we needed to guarantee the visibility of the Aruco tags, which were used for simulation B and the experiment.

Simulation A
The robot was moved along a defined path with fixed points to stop at. Once it stopped, the joint coordinates and the position vector of the end-effector relative to the world base frame were acquired. With these two sets of input values, the identification was made using the methods described in the previous section. Results may be seen in Section 3.5.

Simulation B
For simulation B and the experiment, cameras and the OpenCV libraries [18] were used for image processing to detect the Aruco tags. Based on the previous research by Oscadal [23], we used a 3D gridboard with tags, which improves the reliability and accuracy of detection in comparison with basic 2D tags. The gridboard represents a coordinate frame; in our case these were the base frame "b" and the end-effector frame "e", as shown in Figures 3 and 4. The OpenCV library algorithm can calculate the transformation from a camera to a tag. In real-time measurements, we used Equation (15), thanks to which the position of any camera was not important. Matrix $T_{cb}$ is the transformation from the camera to the base; matrix $T_{ce}$ is the transformation from the camera to the end-effector. The "c" frame is the camera frame. The position of a camera may nevertheless be saved for later operations.
In simulation B (Figure 3) we deployed the image analysis in CoppeliaSim, with a single camera with a resolution of 1280 × 720 px. The virtual camera was self-calibrated and the detection parameters of the OpenCV library for finding the Aruco tags were set similarly to [23]. The dimensions of the tags were 70 × 70 mm with a 6 × 6 bit matrix.
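The body of Equation (15) is not reproduced in this text; from the definitions of $T_{cb}$ and $T_{ce}$ it presumably has the form $T_{be} = T_{cb}^{-1}\,T_{ce}$. The sketch below, with hypothetical poses and helper names, shows why the camera placement drops out: two differently placed cameras observing the same two tags yield the same base-to-end-effector transform.

```python
import numpy as np

def invert_transform(T):
    """Invert a rigid homogeneous transform without a general matrix inverse."""
    R, p = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ p
    return Ti

def base_to_effector(T_cb, T_ce):
    """Assumed form of Equation (15): T_be = inv(T_cb) @ T_ce."""
    return invert_transform(T_cb) @ T_ce

def pose_rot_z(angle, position):
    """Helper for the demo: rotation about z plus a translation."""
    c, s = np.cos(angle), np.sin(angle)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = position
    return T

# Hypothetical ground-truth poses of both tags and two cameras in a room frame "w".
T_wb = pose_rot_z(0.3, [1.0, 0.0, 0.0])
T_we = pose_rot_z(-0.7, [1.2, 0.4, 0.5])
T_wc1 = pose_rot_z(1.0, [0.0, -2.0, 1.0])
T_wc2 = pose_rot_z(-2.0, [3.0, 1.0, 1.5])

# What each camera would report for the two tags.
T_be_cam1 = base_to_effector(invert_transform(T_wc1) @ T_wb, invert_transform(T_wc1) @ T_we)
T_be_cam2 = base_to_effector(invert_transform(T_wc2) @ T_wb, invert_transform(T_wc2) @ T_we)
print(np.allclose(T_be_cam1, T_be_cam2))
```

In practice the two results differ slightly because each camera adds its own detection error, which is the effect discussed in the Results.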

Experiment A
Three independent cameras were placed around the UR3 robot to observe its trajectory and to calculate transformations in a laboratory during a real experiment. Intel RealSense D435i cameras with 1280 × 720 px resolution were used, and even though they are depth cameras, only the simple 2D RGB pictures were analyzed. The specifications of the cameras are shown in Table 2. The cameras were self-calibrated following the methodology used in [23].
As stated above, the robot went through 250 positions on the path. To reduce the effect of detection inaccuracy in the real environment, 10 camera frames were taken for every position, which gave us 2500 measured points per camera. In total for the three cameras, 7500 pictures were analyzed during the calibration process.
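One possible way to organize such data for the identification is to repeat each joint vector once per captured frame and to concatenate the measurements of all cameras. The sketch below shows only this bookkeeping; the array layout and the function name are assumptions, and the zero-filled arrays stand in for real measurements.

```python
import numpy as np

def stack_measurements(qs, frames_per_camera):
    """Pair every measured position with the joint vector of its robot pose.

    qs: array (n_poses, 6) of joint variables.
    frames_per_camera: list of arrays (n_poses, n_frames, 3), one per camera.
    Returns (Q, Y) with one row per analyzed picture.
    """
    qs = np.asarray(qs, dtype=float)
    q_rows, y_rows = [], []
    for frames in frames_per_camera:
        frames = np.asarray(frames, dtype=float)
        n_poses, n_frames, _ = frames.shape
        q_rows.append(np.repeat(qs, n_frames, axis=0))        # q repeated per frame
        y_rows.append(frames.reshape(n_poses * n_frames, 3))  # pose-major order
    return np.vstack(q_rows), np.vstack(y_rows)

# Sizes from experiment A: 250 poses, 10 frames per pose, 3 cameras.
qs = np.zeros((250, 6))
cameras = [np.zeros((250, 10, 3)) for _ in range(3)]
Q, Y = stack_measurements(qs, cameras)
print(Q.shape, Y.shape)  # (7500, 6) (7500, 3)
```

Passing a single-element list gives the per-camera data set of 2500 points.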

Experiment B
To observe the impact of the placement of the world base frame "b", another experiment was performed as shown in Figure 5. Note the difference in position and rotation of the base frame.
Only one camera (of the same type, resolution and calibration) was used in this case, with the robot following a trajectory similar to that in experiment A. The robot went through 200 poses; at each pose, five images were taken and analyzed, so 1000 measurements were made in total.

Results
The calibration results of the simulations are presented in Table 3. The data calculated from simulation A (without tag detection, only end-effector position tracking) show high-precision identification with a very small difference from the expected results (the difference ∆ is shown in brackets). In reality, such an accuracy could be achieved using, for example, a laser tracker. For simulation B, in which the tags were detected using a simulated camera, the error was higher (maximum 2.59 mm for $x_0$), which gave us an idea of how accurate the system might be when the noise from the environment is reduced but the camera parameters are kept.
Table 4 shows the results of experiment A. The best values were obtained when the results of all three cameras were combined and analyzed together. The error for the base frame ($x_0$, $y_0$ and $z_0$) was at most 7.61 mm, in the $z_0$ direction. We performed other measurements following the same strategy; they all provided similar results, which made it clear that the position of a camera has an influence on the detection accuracy. This is supported by the results of Krajnik's research [25].
Table 5 shows the results of experiment B. They demonstrate that the base frame can be placed freely with respect to the robot.

Conclusions
The process of identification of the robot's base frame in a workplace using Aruco markers and OpenCV was presented and verified in this study. This approach may be used for several robots and other components of the workplace at the same time, which brings the main advantage of fast evaluation and later recalibration. The typical scenario is placing all components in their positions, placing markers on them and on other points of interest, running a robot's path while measuring the end-effector position with a camera, and evaluating the results to obtain the coordinates of the robot's base and of the points of interest (manipulated object, pallet, etc.). As the end-effector we used a 3D-printed gridboard that might be replaced by a cube with tags carried by a gripper.
The results provided in Section 3.5 show a gap between the accuracy of the simulated workplace and that of the experiments. As mentioned above, simulation A demonstrates the accuracy this method can reach when all conditions are close to ideal. There are therefore several methods and topics for further study that would help to minimize the errors. The DH parameters of the UR3 were based on its datasheet, but when the robot is manufactured it is calibrated and the modified DH parameters are saved in the control unit. These parameters may be retrieved and applied in the identification calculations. In general, the UR3 is not as precise as other industrial robots; depending on the demands, a different manipulator should provide more accurate results.
In addition, the end-effector was 3D printed and assembled from three parts; more accuracy may be achieved using better manufacturing methods. In some cases, if the end-effector was manufactured precisely beforehand, identification of the base frame alone might be enough (instead of identification of the base frame and the end-effector offset, as presented here).
However, the biggest issue seems to be the detection of the tags, as the results differed for every camera in the experiment, as shown in Table 4. The positioning of a camera (its distance from a tag) seems to have a big influence on the outcome. This topic was researched in Krajnik's work [25]. Additionally, altering OpenCV's detection algorithms and filtering leads to better detection; more on this topic is discussed in [23]. A user may want to consider a self-calibrated camera with a resolution higher than the 1280 × 720 px that we used. It is important to note that the accuracy presented for simulation B and the real experiments was achieved with cameras that we calibrated ourselves. Better calibrated hardware can be expected to give more accurate results.
Another question to focus on is which trajectory and how many measured points are necessary to provide satisfying results; we tested the system with only one path of 250 points.
Once this camera-based method is well optimized for a task, depending on the available equipment, and the accuracy is acceptable, it will serve as a sufficient, easy-to-deploy and low-cost solution for integrators and researchers. They will be able to quickly place components and robots, tag them and obtain their position coordinates based on a prepared universal measurement. In addition, even the current system may serve in the manufacturing process as a continuous safety check that all required components, including robots, are in the place where they should be, if the detection accuracy is acceptable.
In addition, this method will be used for identification of the reference coordinate systems and kinematic parameters of experimental custom manipulators, the design of which is a point of interest for the Research Centre of Advanced Mechatronic Systems project.
To make this calibration method easier to follow, the Matlab scripts with the calculations and the raw input data obtained from the simulations and experiments may be found on the GitHub page of the Department of Robotics, VSB-Technical University of Ostrava [26]. The calibration methodology and the Supplementary Materials provided may serve engineers who have no previous experience with the process; they can use cameras or other sensors, such as laser trackers.