1. Introduction
In the era of globalization, manufacturing industries deal with competitive and uncertain markets, where the dynamics of innovation and shortened product life cycles pressure the industry to become more productive and flexible. Welding processes, for instance, are among the most common tasks in manufacturing industries, and robots equipped with intelligent programming tools represent one of the best alternatives for achieving these goals [1].
Nowadays, there are two main categories of robot programming methods in industrial applications, namely online and offline programming [2]. However, the time spent programming a new path for a job becomes the main challenge of using welding robots in high-volume manufacturing, especially when changes and uncertainties in the geometric shape of products occur. This is why robotic systems based on intelligence and robotic perception constitute one of the four pillars of research and implementation according to the objectives of Industry 4.0 [3].
A computer vision system is required to capture the surfaces or features of the workpiece and thus enable fast offline programming [2]. However, the main obstacles toward an intelligent welding robot are trajectory planning, seam tracking, and the control of welding systems against errors caused by the light and environmental disturbances to which every vision system is exposed [4].
For example, regarding simple systems that use only a single camera as a sensor, Kiddee et al. [5] develop a technique to find a T-welding seam based on image processing, smoothing the image and extracting its edges with the Canny algorithm to find the start and end points. In the same way, Ye et al. [6] acquire the edges of a series of images to determine the location of the weld seam using a set of known characteristics. Yang et al. [7] present a weld detection system for arc welding robots based on 3D reconstruction technology, in which the shape-from-shading (SFS) algorithm is used to reconstruct the 3D shape of the welding seam.
Laser vision systems are among the most widely used sensors in welding robotics due to the precision and fast data processing these devices provide. In particular, laser sensors are mostly applied in weld tracking research, where developments range from simple systems, such as that of Fernandez et al. [8], who implement a low-cost laser vision system based on a webcam mounted on the robot arm and oriented toward a laser stripe projected at a 45° angle, up to systems already proven in an industrial context, such as the study by Liu et al. [9], in which an autonomous method is proposed to find the initial weld position of a fillet weld seam formed by two steel plates. This method employs an automatic dynamic-programming-based algorithm to extract the inflection points of the laser stripe and is robust to the natural-light disturbances that may be present during the processing of laser vision images.
Disturbances of laser systems on metallic surfaces are a common problem in weld bead localization. Li et al. [10] suggest reducing the influence of noise on the extraction of the centerline through a double-threshold recursive least squares method. Later, an automatic welding seam recognition and tracking method was developed that utilizes structured light vision and a Kalman filter to search for the profile of the welding seam in a small area, aiming to avoid such disturbances [10]. Another approach in structured light systems incorporates an optical filter and LED lighting to reduce the effect of the noise produced by the arc torch, where a fuzzy PID controller can be used to track the weld seam in the horizontal and vertical directions simultaneously [11].
Recent systems tend to be more robust, or more complex in terms of the number of tools involved in image acquisition and data filtering. For example, Zeng et al. [12] propose a weld position recognition method based on the fusion of directional light and structured light information during multi-layer/multi-pass welding. On the other hand, Guo et al. [13] present a multifunctional monocular visual sensor based on combined laser structured lights, which provides functions such as the detection of the welding groove cross-sectional parameters, joint tracking, the detection of the welding torch height, the measurement of the weld bead appearance, and the real-time monitoring of the welding process. Other approaches for real-time processing are described by Kos et al. [14], who compute the position of the laser beam and the seam in 3D during welding with a camera and an illumination laser in order to equalize the brightness of the keyhole and the surrounding area. Zhang et al. [15] acquire 3D information by multiple-segment laser scanning; the weld features are extracted by a cubic smoothing spline to detect the characteristic parameters of a weld lap joint with a deviation lower than 0.4 mm.
Another research topic in robotic vision concerns systems that acquire images from two optical devices. In this sense, Chen et al. [16] propose a Canny detector in which the two parallel edges captured in a butt V-joint are used to fit the start welding position. In a similar way, Dinham et al. [17] use a Hough transform to detect the outside boundary of the weldment so that the background can be removed. In weld tracking systems, Ma et al. [18] use two conventional charge-coupled device (CCD) cameras to capture clear images from two directions: one is used to measure the root gap, and the other to measure the geometric parameters of the weld pool.
Nowadays, owing to the precision of current sensors and the need for a complete understanding of the environment, 3D reconstruction techniques have been explored. In reconstruction with laser systems, Xiao et al. [19] propose a method that reconstructs the welding seam from 3D point cloud data guided by a neural network, obtaining the equations and initial points of the weld seam. Their guidance tests show that the extraction error is less than 0.6 mm, meeting actual production demands.
In stereo vision, Yang et al. [20] propose a 3D path teaching method to improve the efficiency of teach-and-playback programming based on a stereo-structured light vision system, using a seam extraction algorithm that achieves fast and accurate extraction to modify the model of the weld seam; their system realizes fast and accurate 3D path teaching of a welding robot. Experimental results show a measurement resolution of less than 0.7 mm, suitable for V-type butt joints before welding [21]. Regarding point clouds acquired with RGB-D sensors, Maiolino et al. [22] use an ASUS Xtion sensor to register and integrate the point cloud with the CAD model in an offline programming system for a sealant dispensing robot. On the other hand, Zhou et al. [23] use a camera of the Intel RealSense family to detect and generate the trajectory with an algorithm based on the gradient of the edge intensity in the point cloud. However, the main limitation of the proposals found in the literature is that they each seek a solution for one particular type of weld seam. Global path extraction systems are still under development; in particular, the integration of color information and the segmentation of these data have not yet been investigated in welding robotics as a global acquisition system.
In this work, a color point cloud segmentation method was implemented to extract 3D paths for robot trajectory generation. The developed system is based on a RealSense D435 sensor, a low-cost device that combines stereo vision with an RGB sensor, from which a 3D point cloud incorporating the color of the work object is reconstructed. With this color information, a series of filters is applied in the HSV color space to segment the region of interest where the weld bead is expected to be applied. Once this zone is captured, a cubic spline interpolation is executed to calculate a smooth path through the welding points that a robotic manipulator would require.
The rest of this paper is organized as follows: Section 2 describes the theory related to the vision system and the algorithms used to perform our 3D reconstruction and seam extraction. Section 3 introduces the configuration of our experimental platform and vision sensor, and the results are presented in Section 4. Finally, Section 5 provides concluding remarks.
3. Experimental Setup
The integrated vision system incorporates an RGB-D camera (Intel RealSense D435), an active stereo depth camera that computes stereo depth data in real time. It also has an optional infrared (IR) projector that helps improve depth accuracy. The sensor is physically supported on a test arm that allows image acquisition from a top view of the work object at a distance ranging from 30 to 70 cm above the welding work zone, as shown in Figure 3.
The proposed robotic system consists of an RGB-D camera that captures the surface point cloud of the workpiece, the welding seam detection algorithm that locates the color seam region in the input point cloud, and the trajectory generation method that processes the point set and outputs a 3D welding trajectory.
The image acquisition and trajectory planning algorithms were implemented on a personal computer running the Windows 10 operating system with an Intel i7 CPU @ 2.40 GHz and the USB 3.0 ports required for communication with the RealSense D435 camera.
3.1. Test Sample
A test object was designed so that the geometric characteristics of the part could be mathematically parametrized. It consists of two parts, designed both as a semi-complex curved surface and to simulate a V-type welded joint, one of the most investigated in the literature, with a depth of 5 mm and an angular opening of 90°. The assembly of these two pieces results in a test piece of 20 × 10 cm that is 4.8 cm tall at its highest point.
The CAD models in Figure 4 show the design of the test piece, which was fabricated in aluminum 6061-T6, considering that aluminum is a highly malleable and reflective material and could thus serve as a reference for measuring light disturbances in the vision system. It is important to note that the sample part was machined with tungsten carbide milling tools whose toolpaths were programmed in the WorkNC CAM software; the machining parameters are listed in Table 1. The machining was performed on a HAAS VF3 CNC machine to match the part to the CAD model, since machines such as these report positioning errors below 0.05 mm.
3.2. Trajectory Extraction Based on Stereo Vision System Embedding Color Data
Figure 5 shows the steps necessary for the definition of parameters and the processing of the images that lead to the extraction of the points corresponding to the weld bead. The objective of each block is described below.
Set up the data acquisition parameters: Image acquisition and processing were performed with the Intel SDK [32], an open-source software package that supports different programming languages, such as Python, through the pyrealsense2 library, its official Python wrapper. Since the implemented vision system has different sensors, both the color and depth sensors were set to a resolution of 640 × 480 pixels and a frame rate of 30 fps, with a depth accuracy between 0.1 and 1 mm.
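For illustration, a minimal sketch of this configuration step using the pyrealsense2 wrapper might look as follows; only the resolution and frame rate come from the setup described above, while the variable names are ours:

```python
import pyrealsense2 as rs

# Configure both the depth and color streams to 640 x 480 at 30 fps,
# matching the acquisition parameters described above.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start streaming and query the depth scale (meters per raw depth unit),
# needed later to express clipping distances in meters.
profile = pipeline.start(config)
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
```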
Acquire and align depth and color frame information: It is necessary to align the depth and color frames to make a 3D reconstruction faithful to the captured scene. This was achieved through the pyrealsense2 library [32], which provides an algorithm that aligns the depth image to another image, in this case, the color image.
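Continuing the sketch above, the alignment can be performed with the align processing block of pyrealsense2, which reprojects every depth pixel onto the color image plane:

```python
import numpy as np

# Align incoming depth frames to the color stream.
align = rs.align(rs.stream.color)

frames = pipeline.wait_for_frames()
aligned = align.process(frames)
depth_frame = aligned.get_depth_frame()
color_frame = aligned.get_color_frame()

# Convert both frames to NumPy arrays for further processing.
depth_image = np.asanyarray(depth_frame.get_data())  # uint16, raw depth units
color_image = np.asanyarray(color_frame.get_data())  # uint8, BGR
```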
Segment and remove the background data: Often, only a region of interest (ROI) needs to be processed; in this case, the ROI is defined by the distance at which the test object is located relative to the camera. Therefore, instead of using all the information in the scene, we first applied a filter using one of the device's own acquisition tools [32], defining a depth clipping distance beyond which all information outside our ROI was segmented and removed.
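As an illustration of such a filter, the clipping can be expressed directly on the aligned arrays; the 0.7 m threshold below is a placeholder chosen to cover the 30-70 cm working range:

```python
# Placeholder clipping distance in meters, covering the 30-70 cm work zone.
CLIPPING_DISTANCE_M = 0.7
clip_units = CLIPPING_DISTANCE_M / depth_scale  # convert to raw depth units

# Keep pixels with valid depth inside the ROI; blank out everything else.
roi_mask = (depth_image > 0) & (depth_image < clip_units)
segmented_color = np.where(roi_mask[:, :, None], color_image, 0).astype(np.uint8)
```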
Point-cloud calculation from depth and color-aligned frames: The pyrealsense2 library [32] was used to calculate the point cloud, since it holds the intrinsic values of the stereo vision system and can perform the computations required for point cloud acquisition, in addition to registering the color of the aligned frame.
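A sketch of this step with the pointcloud object of pyrealsense2, which applies the stereo intrinsics internally and maps the aligned color frame onto the vertices:

```python
# Compute the point cloud and attach the aligned color frame as texture.
pc = rs.pointcloud()
pc.map_to(color_frame)
points = pc.calculate(depth_frame)

# One vertex per depth pixel: an (N, 3) float32 array of XYZ in meters.
xyz = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)

# (N, 2) UV coordinates into the color image, from which per-point RGB
# values can be sampled to assemble the XYZRGB records used later.
uv = np.asanyarray(points.get_texture_coordinates()).view(np.float32).reshape(-1, 2)
```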
Color segmentation: This block represents the core of the proposed methodology, segmenting the welding area from the rest of the surface. The image is preprocessed considering the brightness of the scene in order to binarize the color image and find a suitable threshold; for this purpose, a single frame of the point cloud is vectorized into an XYZRGB format using the NumPy and OpenCV libraries. To improve the selection of the points of interest, a change to the hue-saturation-value (HSV) color space is used: the threshold is applied to the hue channel to find the color region, and to the saturation channel as a parameter for brightness.
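A minimal sketch of the HSV thresholding with OpenCV; the bounds below are placeholder values that would have to be tuned to the actual marker color and scene brightness:

```python
import cv2

# Convert the ROI-filtered color image from BGR to HSV.
hsv = cv2.cvtColor(segmented_color, cv2.COLOR_BGR2HSV)

# Placeholder bounds: the hue range selects the color region, while the
# saturation lower bound rejects washed-out, highly reflective pixels.
lower = np.array([100, 80, 40])
upper = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# Since the point cloud has one vertex per pixel, the binary mask selects
# the 3D points belonging to the expected weld bead region.
seam_points = xyz[mask.reshape(-1) > 0]
```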
Trajectory planning: To calculate the trajectory from the color-marked data segmented in the previous module, and following the methodology of Zhang et al. [15], a cubic B-spline interpolation algorithm was implemented to approximate the nonlinear dataset. The function was divided by knot points, and between the knots a 5th-order polynomial curve was fitted to the subset of data points to satisfy the smoothness requirement of the target weld seam points. The trajectory was planned to be smooth enough to be applied directly to the robot through a transformation matrix referenced to the welding direction.
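As a sketch under the assumption that the segmented points can be ordered along the welding direction, the smoothing step can be reproduced with SciPy's parametric spline routines; the smoothing factor s is a placeholder:

```python
from scipy.interpolate import splev, splprep

# Order the segmented seam points along the main welding axis (assumed x).
order = np.argsort(seam_points[:, 0])
x, y, z = seam_points[order].T

# Fit a smoothing cubic B-spline (k=3) through the 3D points; s trades
# closeness to the data against smoothness (placeholder value).
tck, _ = splprep([x, y, z], k=3, s=1e-4)

# Sample a dense, smooth 3D trajectory for the robot controller.
u_fine = np.linspace(0.0, 1.0, 200)
trajectory = np.stack(splev(u_fine, tck), axis=1)  # (200, 3) XYZ path
```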
3.3. 3D Reconstruction with RealSense D435 Sensor
Before an in-depth analysis of the trajectory extraction results of the proposed algorithm, a study of the proposed vision system is necessary to evaluate the performance of the RealSense camera. We executed the methodology described by Carfagni et al. [33], which evaluates the reconstruction capability of the D415 and SR300 sensors, seeking to measure the error with which the sensor can reconstruct a surface. To this end, the RealSense D435 camera was placed 30 cm above a flat surface on which the test piece rested. With this configuration, the 3D reconstruction of the surface was carried out through the first three blocks of the algorithm presented in the previous section, finally obtaining the point cloud of the test piece.
The reference point cloud of the test piece was generated from the CAD model by exporting the pieces to the Polygon File Format (.ply), as shown in Figure 6. Once we had the target surface and the one calculated by the camera, we ran a colored ICP registration algorithm [30] with which we could estimate the Euclidean point distance between the target and the 3D reconstructed surface.
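For reference, a sketch of this registration and error estimation using the Open3D library as one possible implementation of colored ICP; whether this matches the exact implementation of [30] is an assumption, and file names and thresholds are placeholders:

```python
import numpy as np
import open3d as o3d

# Placeholder files: the CAD-derived target and the camera reconstruction.
target = o3d.io.read_point_cloud("cad_model.ply")
source = o3d.io.read_point_cloud("reconstruction.ply")

# Colored ICP requires normals; estimate them on both clouds.
for pcd in (source, target):
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30))

# Refine the alignment using both geometric and color information.
result = o3d.pipelines.registration.registration_colored_icp(
    source, target, 0.01, np.identity(4),
    o3d.pipelines.registration.TransformationEstimationForColoredICP())
source.transform(result.transformation)

# Per-point Euclidean distance from the reconstruction to the target.
distances = np.asarray(source.compute_point_cloud_distance(target))
print("mean reconstruction error [m]:", distances.mean())
```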