1. Introduction
In many industrial fields, including aerospace, shipbuilding, nuclear power, and petrochemicals, weld grinding has become a critical process for ensuring surface quality and facilitating subsequent operations such as coating and anticorrosion treatment [1,2]. Taking the prefabrication of pipelines in the petrochemical industry as an example, the welding workload is substantial, and all-position welding is frequently required. Weld reinforcement often forms at overhead positions; if it is not removed by grinding in a timely manner, it severely degrades coating adhesion and anticorrosion performance. Prefabricated pipelines are typically produced in small batches and are highly customized, and the external geometry of the weld seam is complex: the weld surface presents irregular waves and uneven protrusion heights and widths, and the weld trajectory is usually a three-dimensional path with changing direction and curvature. Existing dedicated pipe-grinding devices often perform straight-line motions along a single weld and require manual installation, and their applicability is limited by the pipe diameter range. Although six-degree-of-freedom industrial robots offer flexible motion, they lack weld recognition and trajectory planning capabilities; consequently, weld grinding for prefabricated pipelines still relies mainly on manual operation.
With the rapid development of 3D vision technology, high-density surface point clouds can now be obtained relatively easily [3], which provides the basic conditions for the automatic grinding of complex welds. Research on weld recognition based on reverse engineering or geometric modeling has become a focus in robotic automation [4,5]. For point cloud feature extraction, representative local feature descriptors include point signatures [6], local feature statistics histograms (LFSH) [7], spin images [8,9], and 3D shape context [10]. When the weld morphology is relatively regular, these traditional methods can achieve high accuracy. However, point cloud data collected in industrial settings contain a large amount of noise, and the workpieces are customized and produced in small batches. Once the working conditions change, the processing flow has to be redesigned and many parameters (thresholds, filtering scales, and fitting templates) have to be retuned manually, which limits engineering adaptability.
Deep learning-based point cloud segmentation methods can learn multi-scale geometric features from unordered point sets and reduce the dependence on hand-crafted feature design. Representative architectures include PointNet/PointNet++, PointCNN, DGCNN, Point Transformer, KPConv, and SparseConvNet. PointNet and PointNet++ provide a simple and effective framework for learning from raw point sets, though PointNet++ may still lose fine geometric details during hierarchical down-sampling [11,12]. PointCNN extracts local features through X-transformation-based point convolution; however, its architecture is relatively complex and may increase the computational cost in practical deployments [13]. DGCNN improves local structure modeling through dynamic graph construction and EdgeConv, but graph updating increases the computational burden [14]. Point Transformer enhances context aggregation through self-attention at the cost of higher memory and computational requirements [15]. KPConv performs convolution directly on irregular point sets and is effective for local geometry learning, but it is sensitive to neighborhood and kernel design [16]. SparseConvNet is efficient on sparse voxelized data, yet voxelization may sacrifice local surface fidelity [17]. Therefore, considering segmentation accuracy, model complexity, and engineering practicality, PointNet++ was selected as the backbone of the proposed weld recognition network, and further structural optimization was carried out on this basis.
An effective approach to planning grinding trajectories is to directly extract key points that accurately represent the weld trend within the ROI and to fit these key points with spline curves to generate the machining path [18]. Feng et al. obtained key weld points by calculating the difference between the data before and after moving average filtering; the robot tool pose was estimated with a principal component analysis (PCA) algorithm, and B-spline curves were subsequently fitted to generate the grinding path [19,20]. Some researchers have obtained key trajectory points via octree slicing or the intersection of planes and point clouds [21]. Wu et al. [22] simultaneously generated the working position and contact direction based on B-spline curves, whereas Tong et al. [23] employed differences in the workpiece curvature to extract path points and improve the surface quality [24]. Beyond geometric accuracy and kinematic feasibility, robot trajectory planning should also account for engineering objectives such as execution time, energy consumption, and operational costs [25]. This perspective is consistent with the concept of Industry 5.0, which emphasizes sustainable manufacturing; in this context, optimizing robot motion trajectories and operating parameters can contribute to improved energy efficiency, reduced resource consumption, and lower production costs while maintaining process quality [26]. It is therefore important to develop an efficient trajectory planning method for weld grinding that can accommodate the geometric complexity of weld surfaces while ensuring a smooth, continuous transition between the weld and the adjacent base metal after grinding.
To achieve automatic recognition of spatial welds and automatic planning of grinding trajectories, this paper proposes a 3D vision-based weld recognition network and a trajectory planning module. The remainder of this paper is organized as follows:
Section 2 provides an overview of the proposed framework.
Section 3 describes the weld dataset construction process and the architecture of WSR-Net.
Section 4 presents the reference surface-based reverse layer-wise trajectory planning method.
Section 5 reports the results of the experimental validation.
3. Weld Recognition
In order to eliminate the dependence of the traditional weld recognition method on manual parameter adjustment, a weld dataset and a weld recognition network were constructed to achieve automatic identification of the weld area.
3.1. Construction of the Weld Dataset
The welds considered in this study include planar welds and curved surface welds, and workpieces of different sizes were collected accordingly. A total of 22 weld workpieces were selected, including 15 flat workpieces and 7 prefabricated pipe fittings (6 and 8 inches in diameter), as shown in Figure 2. The grinding robot carried a 3D camera to collect point cloud data of the weld area from multiple angles. The acquired point clouds contained noise caused by strong surface reflection and background interference under industrial conditions. To improve the quality of the point clouds, the background structure was first removed by plane fitting, and the outliers were then removed by statistical filtering.
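The statistical filtering step can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name, the neighbor count `k`, and the `std_ratio` threshold are assumptions chosen for illustration, and the brute-force distance matrix is only practical for small clouds. The plane-fitting background removal is not shown.

```python
import numpy as np

def statistical_outlier_removal(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large.

    points: (N, 3) array. Returns the filtered points and the boolean keep mask.
    k and std_ratio are illustrative values, not taken from the paper.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)              # (N, N) pairwise distances
    knn = np.sort(dist, axis=1)[:, 1:k + 1]          # skip the zero self-distance
    mean_knn = knn.mean(axis=1)
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    keep = mean_knn <= threshold
    return points[keep], keep

# Toy cloud: a flat 10 x 10 grid plus one far-away spurious point.
grid = np.array([[x, y, 0.0] for x in range(10) for y in range(10)])
cloud = np.vstack([grid, [[50.0, 50.0, 50.0]]])
filtered, keep = statistical_outlier_removal(cloud)
```

On this toy cloud only the isolated point is rejected, because its mean neighbor distance is far above the global mean plus two standard deviations.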
To reduce the risk of overfitting and avoid data leakage, the dataset was split at the workpiece level: all point clouds acquired from different viewpoints of the same weld workpiece were assigned to exactly one subset (training, validation, or testing), and the 22 workpieces were divided among the three subsets in a ratio of approximately 8:1:1. Since point clouds captured from different viewing angles of the same workpiece differ in visible regions, point density distribution, and local geometric completeness, 20 viewpoints were collected for each of the 22 workpieces to increase data diversity, resulting in 440 raw point clouds. Data augmentation, including random rotation, jittering, scaling, and point sampling, was applied only to the training set, and the final dataset contained 2200 point cloud samples. This dataset was intended to validate the feasibility of the proposed method under representative planar and curved surface weld conditions rather than to claim broad generalization across all industrial weld types.
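A workpiece-level split of this kind can be sketched in a few lines of standard-library Python. The function name, the random seed, and the rounding rule (which yields 18/2/2 workpieces here) are assumptions for illustration; the paper does not state the exact counts per subset.

```python
import random

def split_by_workpiece(workpiece_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign whole workpieces, not individual scans, to train/val/test."""
    ids = sorted(set(workpiece_ids))
    random.Random(seed).shuffle(ids)
    n_val = max(1, round(len(ids) * ratios[1]))
    n_test = max(1, round(len(ids) * ratios[2]))
    n_train = len(ids) - n_val - n_test
    return {
        "train": set(ids[:n_train]),
        "val": set(ids[n_train:n_train + n_val]),
        "test": set(ids[n_train + n_val:]),
    }

# 22 workpieces x 20 viewpoints = 440 raw scans, as described in the text.
scans = [(workpiece, view) for workpiece in range(22) for view in range(20)]
split = split_by_workpiece([w for w, _ in scans])
subsets = {name: [s for s in scans if s[0] in ids] for name, ids in split.items()}
```

Because membership is decided per workpiece, every viewpoint of a given weld lands in the same subset, which is what prevents leakage between training and evaluation data.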
3.2. Weld Recognition Network
A weld recognition network, WSR-Net, was designed based on the PointNet++ framework, as shown in Figure 3.
Preprocessed point clouds were first fed into a shared MLP that embedded the input coordinates into an initial feature space of size [N, 32]. WSR-Net then adopted a PointNet++-style encoder–decoder architecture. The encoder comprised four set abstraction (SA) layers [27], in which features were transformed by local MLPs and aggregated by max pooling within each local neighborhood, producing hierarchical multi-scale features of size [N/4, 64], [N/16, 128], [N/64, 256], and [N/256, 512]. The decoder consisted of four feature propagation (FP) layers that progressively up-sampled the features by interpolation to sizes of [N/64, 256], [N/16, 128], [N/4, 128], and [N, 128]. At each stage, the up-sampled features were concatenated with the corresponding encoder features and refined by MLPs, with the channel dimensions reduced to 256, 128, 128, and 128, yielding the final full-resolution feature map of size [N, 128].
A cross-attention module was integrated into the U-shaped PointNet++ backbone to improve the aggregation of point cloud information and the segmentation accuracy, as depicted in Figure 3. Given the input point cloud features $F_{in} \in \mathbb{R}^{N \times C}$, where $N$ is the number of sampled points and $C$ is the feature dimension of each point, three shared-MLP branches were first used to generate the query, key, and value embeddings. In the implementation, these branches were realized by three 1 × 1 convolutions. The query and key branches projected the input feature from 128 channels to 32 channels and shared the same projection weights, while the value branch preserved the original 128-channel representation. The affinity matrix was computed from the query and key features and normalized by Softmax, followed by an additional normalization step to stabilize the attention distribution. The resulting attention map was then used to aggregate contextual information from the value branch. The aggregated feature was further refined by a CBL block and fused with the original input through a residual correction pathway, producing the output feature set $F_{out}$. Here, Softmax denotes the normalized exponential function, the affinity and aggregation steps are matrix operations, and CBL denotes a Conv–BatchNorm–LeakyReLU block. This attention mechanism enhanced the aggregation of geometric features in the weld regions, thus improving the weld segmentation accuracy. Ultimately, the network learned a mapping from point cloud features to weld labels.
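The data flow of the attention module described above can be sketched in NumPy. This is one plausible reading of the description, not the authors' code: the second normalization is assumed to be an L1 normalization over columns, the CBL block is replaced by a linear map followed by LeakyReLU (BatchNorm omitted), the residual is assumed to be a plain addition, and all names and weight scales are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_block(f_in, w_qk, w_v, w_out):
    """f_in: (N, C) point features. w_qk: (C, C // 4), shared by query and key.
    w_v and w_out: (C, C). Returns the (N, C) output and the (N, N) attention map."""
    q = f_in @ w_qk                                          # query embedding
    k = f_in @ w_qk                                          # key embedding, same projection weights
    v = f_in @ w_v                                           # value embedding keeps all C channels
    energy = q @ k.T                                         # (N, N) affinity matrix
    energy = energy - energy.max(axis=1, keepdims=True)      # numerical stability
    attn = np.exp(energy)
    attn = attn / attn.sum(axis=1, keepdims=True)            # row-wise Softmax
    attn = attn / (1e-9 + attn.sum(axis=0, keepdims=True))   # assumed extra L1 normalization
    context = attn.T @ v                                     # aggregate values per output point
    refined = leaky_relu(context @ w_out)                    # stand-in for the CBL block
    return f_in + refined, attn                              # assumed additive residual

rng = np.random.default_rng(0)
N, C = 16, 128
f_in = rng.standard_normal((N, C))
w_qk = 0.05 * rng.standard_normal((C, C // 4))
w_v = 0.05 * rng.standard_normal((C, C))
w_out = 0.05 * rng.standard_normal((C, C))
f_out, attn = attention_block(f_in, w_qk, w_v, w_out)
```

With the 128-channel input used in the text, the shared query/key projection has 32 output channels, and the output keeps the input shape so the block can be dropped into the backbone without changing downstream layers.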
During model training, the cross-entropy loss function [24] was employed to quantify the discrepancy between the predictions and the ground-truth labels:
$$L_{CE} = -\sum_{c=1}^{M} y_c \log(p_c)$$
where $M$ denotes the number of classes; $y_c$ represents the ground-truth label; and $p_c$ is the predicted probability.
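As a small worked example of the cross-entropy loss, the sketch below evaluates it for one-hot labels, in which case the sum over classes reduces to the negative log-probability of the true class. Averaging over points is an assumption about how the per-point losses are combined; the function names are illustrative.

```python
import math

def cross_entropy(probs, label):
    """Cross-entropy for one point: probs are class probabilities, label is the true class index."""
    return -math.log(probs[label])

def mean_cross_entropy(batch_probs, labels):
    """Average the per-point losses over a batch of points."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

# Two points, two classes (0 = base metal, 1 = weld in this toy example).
loss = mean_cross_entropy([[0.9, 0.1], [0.2, 0.8]], [0, 1])
```

A confident correct prediction contributes a loss close to zero, while a 50/50 prediction contributes log 2, about 0.693.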
The trained WSR-Net was deployed for automatic weld ROI recognition. An example of the segmentation results is shown in Figure 4.
4. Weld Grinding Trajectory Planning Module
The recognized point cloud consisted of two components, the weld and the base metal, and could not be used directly for robot grinding. Weld grinding can be regarded as a process of moving the grinding tool along the trajectory points that represent the weld morphology until the desired target surface has been reached; therefore, it was necessary to determine the grinding path points and the reference surface.
4.1. Generation of Weld Feature Points
To achieve a uniform and robust extraction of the weld ridge points under different weld morphologies, an ISS-based key point screening method with orthogonal slicing constraints was proposed. The standard ISS algorithm was first employed to identify geometrically salient points. Subsequently, the weld ridge points were explicitly defined as local maxima on cross-sectional planes orthogonal to the principal weld direction, thereby refining the selection stage and improving the accuracy and robustness of ridge point extraction.
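(1) ISS-based salient point screening
For each point $p_i$ in the weld ROI, a local neighborhood with radius $r$ was established, and the covariance matrix $C(p_i)$ [28] of the neighborhood point set $N(p_i)$ was computed:
$$C(p_i) = \frac{1}{|N(p_i)|} \sum_{p_j \in N(p_i)} (p_j - \bar{p})(p_j - \bar{p})^{T}$$
where $\bar{p}$ is the centroid of $N(p_i)$. Letting $\lambda_1 \ge \lambda_2 \ge \lambda_3$ denote the eigenvalues of $C(p_i)$, the point $p_i$ is retained if the ratios of consecutive eigenvalues satisfy the following condition:
$$\frac{\lambda_2}{\lambda_1} < \gamma_{21}, \qquad \frac{\lambda_3}{\lambda_2} < \gamma_{32}$$
where $\gamma_{21}$ and $\gamma_{32}$ are eigenvalue ratio thresholds that suppress non-salient or nearly isotropic points [29]. The resulting set of geometrically salient points is denoted as $P_{ISS}$.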
(2) Weld-oriented orthogonal slicing
To incorporate the geometric priors of the weld morphology, an orthonormal basis $\{e_1, e_2, e_3\}$ was obtained via principal component analysis (PCA), where $e_1$ represents the global axial direction of the weld, and $e_2$ and $e_3$ span the cross-sectional plane.
For any point $p$ in the set $P_{ISS}$, the axial scalar is defined as $s(p) = e_1^{T} p$. Over the interval $[s_{min}, s_{max}]$, $K$ cross-sections $s_k$ ($k = 1, \dots, K$) were selected to construct thin slabs of thickness $\delta$:
$$S_k = \{\, p \in P_{ISS} : |s(p) - s_k| \le \delta / 2 \,\}$$
All points in $S_k$ were projected onto the transverse plane, and the ridge point in the $k$-th section was defined as the point with the maximum height along the $e_3$ direction:
$$p_k^{*} = \arg\max_{p \in S_k} \; e_3^{T} p$$
Collecting the ridge points from all cross-sections yields the ridge point set $P_r$, which was finally sorted by axial coordinate to obtain ridge points uniformly distributed along the weld surface, as shown in Figure 5.
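The two steps of this subsection, eigenvalue-ratio screening followed by slab-wise ridge selection, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the covariance is computed with `np.cov` about the neighborhood centroid, the non-maximum suppression of the standard ISS detector is omitted, the least-variance PCA axis is taken as the height direction, and the `up` hint used to fix its sign is hypothetical.

```python
import numpy as np

def iss_salient_mask(points, radius, gamma21, gamma32, min_neighbors=5):
    """Flag points whose neighborhood eigenvalue ratios pass both thresholds."""
    mask = np.zeros(len(points), dtype=bool)
    for i, p in enumerate(points):
        nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
        if len(nbrs) < min_neighbors:
            continue
        lam = np.sort(np.linalg.eigvalsh(np.cov(nbrs.T)))[::-1]   # lam1 >= lam2 >= lam3
        if lam[1] <= 0.0:
            continue
        mask[i] = lam[1] / lam[0] < gamma21 and lam[2] / lam[1] < gamma32
    return mask

def ridge_points(points, n_slices, thickness, up=(0.0, 0.0, 1.0)):
    """Pick the highest point in each thin slab orthogonal to the main PCA axis."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    e1, e3 = vt[0], vt[2]                  # axial direction and assumed height direction
    if e3 @ np.asarray(up) < 0:            # PCA axes have arbitrary sign; orient e3 upward
        e3 = -e3
    s, h = centered @ e1, centered @ e3
    ridge = []
    for s_k in np.linspace(s.min(), s.max(), n_slices):
        idx = np.flatnonzero(np.abs(s - s_k) <= thickness / 2)
        if idx.size:
            ridge.append((s_k, points[idx[np.argmax(h[idx])]]))
    ridge.sort(key=lambda item: item[0])   # order along the weld axis
    return np.array([p for _, p in ridge])

# Anisotropic 3 x 3 x 3 block: every neighborhood has eigenvalue ratios of about 0.25.
gx, gy, gz = np.meshgrid([-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5], [-0.25, 0.0, 0.25])
blob = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])

# Synthetic bead: a parabolic cross-section (0.5 high, 2 wide) extruded along x.
xs, ys = np.meshgrid(np.arange(0.0, 10.5, 0.5), np.arange(-1.0, 1.25, 0.25))
bead = np.column_stack([xs.ravel(), ys.ravel(), 0.5 * (1.0 - ys.ravel() ** 2)])
ridge = ridge_points(bead, n_slices=21, thickness=0.2)
```

In a full pipeline the slicing step would be applied to the points that pass the screening, for example `ridge_points(roi[iss_salient_mask(roi, r, g21, g32)], ...)`, where `roi`, `r`, `g21`, and `g32` are placeholders for the weld ROI and the chosen parameters.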
4.2. Generation of the Grinding Reference Surface
As the base metal surfaces on both sides of the weld may exhibit misalignment and warping due to welding assembly errors and thermal distortion, the desired grinding target cannot be represented by a simple plane; therefore, a reference surface can be fitted from the contour and adjacent base metal regions. In this study, a reference surface was fitted from the edge points of the weld contour as the ultimate goal of the weld grinding.
Edge points were first extracted using an angle criterion. For each point $p_i$, the maximum normal angle $\theta_i$ in its local neighborhood was computed; if $\theta_i > \theta_{th}$ (threshold), $p_i$ was classified as a contour point. The set of contour points $P_e$ is shown in Figure 6.
The grinding reference surface was then fitted using moving least squares (MLS) over the contour and the adjacent base metal region. Let $x$ denote the query point at which the reference surface is evaluated, $\{x_i\}_{i=1}^{n}$ denote the neighboring point cloud samples around $x$, where $x_i$ is the $i$-th neighboring point, and $z(x)$ denote the local height value of $x$ in the local fitting frame centered at $x$. The reference surface can then be expressed as:
$$z(x) = \sum_{j=1}^{m} a_j(x)\, p_j(x) = p^{T}(x)\, a(x) \tag{7}$$
where $m$ denotes the number of basis function terms; $a_j$ denotes the corresponding coefficients; and $p_j$ represents the basis functions. Cubic basis functions were selected for $p$, which, in the local coordinates $(u, v)$ of the fitting frame, can be expressed as follows:
$$p(u, v) = [\,1,\; u,\; v,\; u^2,\; uv,\; v^2,\; u^3,\; u^2 v,\; u v^2,\; v^3\,]^{T} \tag{8}$$
The coefficient vector $a(x)$ minimizes the weighted least-squares objective:
$$J(a) = \sum_{i=1}^{n} w(\lVert x - x_i \rVert)\, \left[\, p^{T}(x_i)\, a(x) - z_i \,\right]^2 \tag{9}$$
where $n$ is the number of neighboring point cloud samples around $x$; $z_i$ is the local height value of the neighboring point $x_i$; and $w(\cdot)$ is a smooth distance-based weight function. Inside the support domain, the weight is positive, whereas at and outside the support boundary, the weight is zero.
By substituting the optimal coefficient vector $a^{*}(x)$ into Equation (7), the fitted MLS reference surface can be written in implicit form as:
$$F(x) = z - p^{T}(x)\, a^{*}(x) = 0 \tag{10}$$
The fitted surface (Figure 7) serves as the global reference for the subsequent generation of the grinding trajectory.
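A local MLS evaluation of this kind can be sketched with NumPy. The sketch assumes a bivariate cubic basis (10 terms) in local coordinates centered at the query point and a Wendland-type compactly supported weight; the paper does not name its weight function, so that choice, like the function names and the support radius, is an assumption.

```python
import numpy as np

def cubic_basis(u, v):
    """Bivariate cubic monomials; returns an (n, 10) design matrix."""
    return np.stack([np.ones_like(u), u, v, u * u, u * v, v * v,
                     u ** 3, u * u * v, u * v * v, v ** 3], axis=-1)

def weight(d, support):
    """Smooth weight: positive inside the support, zero at and beyond its boundary."""
    q = np.clip(d / support, 0.0, 1.0)
    return (1.0 - q) ** 4 * (4.0 * q + 1.0)

def mls_height(query_uv, nbr_uv, nbr_z, support):
    """Weighted least-squares cubic fit around query_uv; returns (height, coefficients)."""
    du = nbr_uv[:, 0] - query_uv[0]
    dv = nbr_uv[:, 1] - query_uv[1]
    sw = np.sqrt(weight(np.hypot(du, dv), support))
    P = cubic_basis(du, dv)
    coeffs = np.linalg.lstsq(sw[:, None] * P, sw * nbr_z, rcond=None)[0]
    return coeffs[0], coeffs        # basis is centered at the query, so height = constant term

def true_height(u, v):
    """Toy base-metal surface used only to check the fit."""
    return 0.1 + 0.2 * u - 0.3 * v + 0.05 * u * u + 0.01 * u ** 3

g = np.linspace(-1.5, 1.5, 7)
uu, vv = np.meshgrid(g, g)
nbr_uv = np.column_stack([uu.ravel(), vv.ravel()])
nbr_z = true_height(nbr_uv[:, 0], nbr_uv[:, 1])
query = np.array([0.2, -0.1])
z_fit, _ = mls_height(query, nbr_uv, nbr_z, support=2.5)
```

Because the toy surface is itself a cubic polynomial, the fit reproduces it to numerical precision; on measured data the weight function controls how strongly distant base metal points influence the local height.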
4.3. Generation of the Grinding Trajectories
Due to the variation in the weld height, a reverse layer-wise trajectory generation strategy based on ridge point projection was proposed to ensure controllability of the grinding process. The desired grinding target points were first obtained by projecting ridge points onto the fitted reference surface. Reverse layer-wise offsetting was then performed from the target layer to the initial weld surface to generate the executable grinding trajectories.
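For each ridge point, neighboring points in the non-weld region were searched within a fixed radius, and the local normal vector was estimated from the eigen-decomposition of their covariance matrix $C$:
$$C\, v_k = \lambda_k v_k \;\; (k = 1, 2, 3), \qquad n = \frac{v_3}{\lVert v_3 \rVert}$$
where $v_3$ is the eigenvector corresponding to the minimum eigenvalue $\lambda_3$, and the normalized $v_3$ is the local normal vector $n$ [30].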
The projection point on the reference surface was searched along the normal direction of each ridge point using the Newton iterative method. With $F$ denoting the implicit function of the fitted reference surface, the iteration formula is given by:
$$x_{k+1} = x_k - \frac{F(x_k)}{\nabla F(x_k) \cdot n}\, n$$
where $x_k$ is the current iteration point; $n$ is the normal vector; $\nabla F(x_k)$ is the gradient of $F$ at $x_k$; and $\nabla F(x_k) \cdot n$ is the directional derivative along the normal vector. The iteration terminates once the residual $|F(x_k)| < \varepsilon$. The final projection points $P_t$ of the ridge point set $P_r$ on the reference surface are shown in Figure 8.
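The Newton search along a fixed direction can be sketched as below. The implicit function used here, a shallow parabolic sheet, is a stand-in for the fitted MLS surface, and the tolerance, iteration cap, and function names are assumptions for illustration.

```python
import numpy as np

def project_along_normal(x0, n, F, grad_F, tol=1e-8, max_iter=50):
    """Move x0 along the unit direction n until the implicit residual F(x) vanishes."""
    x = np.asarray(x0, dtype=float)
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    for _ in range(max_iter):
        residual = F(x)
        if abs(residual) < tol:
            break
        slope = grad_F(x) @ n                # directional derivative along n
        if abs(slope) < 1e-12:
            raise ValueError("search direction is tangent to the surface")
        x = x - (residual / slope) * n       # Newton step restricted to the line x0 + t * n
    return x

# Stand-in reference surface z = 0.1 * x^2, written implicitly as F = 0.
F = lambda x: x[2] - 0.1 * x[0] ** 2
grad_F = lambda x: np.array([-0.2 * x[0], 0.0, 1.0])

x0 = np.array([0.0, 0.0, 1.0])               # a ridge point above the surface
n = np.array([1.0, 0.0, 1.0])                # oblique search direction
x_proj = project_along_normal(x0, n, F, grad_F)
```

Restricting every step to the normal direction keeps the projected point on the line through the ridge point, so the result is the intersection of that line with the reference surface rather than the closest point on it.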
Assuming that the material removal rate of the grinding tool under the given operating conditions is $Q$ and the robot feed velocity is $v_f$, the $i$-th grinding layer must satisfy the constant material removal constraint:
$$A_i\, v_f = Q$$
where $A_i$ is the grinding cross-sectional area of the $i$-th layer. This area was approximated from $w_i$ and $h_i$, where $w_i$ denotes the maximum weld width on the desired grinding surface at the $i$-th layer, and $h_i$ represents the grinding thickness at the location of maximum width for that layer. By sequentially solving for the inter-layer grinding thickness that satisfies the constant material removal constraint, stable control of material removal can be achieved throughout the multi-layer grinding process. The multi-layer grinding trajectories generated using the above offset-trajectory construction and feed-rate planning method are illustrated in Figure 9 and Figure 10.
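The sequential solution for layer thicknesses can be illustrated with a short sketch. It assumes a rectangular cross-section, so that the area is simply width times thickness; this is a simplification chosen for the example and not necessarily the approximation used in the paper. The removal rate, feed, and width values are hypothetical.

```python
def plan_layers(reinforcement, width_at, mrr, feed):
    """Return (z_low, z_high) layer bounds in execution order, outermost layer first.

    Heights are measured from the reference surface (z = 0). Layers are stacked
    upward from the target surface, mirroring the reverse layer-wise idea.
    """
    bounds = []
    z = 0.0
    while z < reinforcement - 1e-9:
        thickness = mrr / (feed * width_at(z))   # from area * feed = mrr with area = width * thickness
        z_high = min(z + thickness, reinforcement)
        bounds.append((z, z_high))
        z = z_high
    return bounds[::-1]

# Hypothetical values: 4.5 mm reinforcement, constant 10 mm width,
# 30 mm^3/s removal rate, 3 mm/s feed.
layers = plan_layers(reinforcement=4.5, width_at=lambda z: 10.0, mrr=30.0, feed=3.0)
```

Building the stack from the target surface upward means the final pass always removes a full-thickness layer ending exactly on the reference surface, while any partial layer falls on the first, outermost pass.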
5. Experiments and Discussion
The experimental platform, as shown in Figure 11, consisted of a SIASUN 210 six-degree-of-freedom industrial robot (SIASUN Robot & Automation Co., Ltd., Shenyang, China), a KW-DCW 3D camera (Suzhou 3DSWAY Intelligent Technology Co., Ltd., Suzhou, China), and an electric grinding tool with a detachable measurement module (Jiangsu Dongcheng Power Tools Co., Ltd., Qidong, China). The robot had a repeatability of 0.2 mm, and the camera provided a measurement accuracy of approximately 0.2 mm. The grinding tool was an angle grinder (rated power 1500 W, maximum speed 6000 r/min) equipped with a brazed diamond grinding disk with a diameter of 125 mm, a thickness of 6 mm, and a grit size of #36. The computer was equipped with an Intel Core i5-13600 processor and an NVIDIA GeForce RTX 3090 graphics processing unit (GPU) and was responsible for point cloud processing, deep learning inference, and the generation of robot motion control instructions.
5.1. Weld Region Recognition Experiments
WSR-Net was implemented in PyTorch 1.13.1 and trained for 200 epochs. Key hyperparameters such as the learning rate and batch size are listed in Table 1. The evolution of the loss function and accuracy during training is shown in Figure 12. WSR-Net achieved a segmentation accuracy of 98.87% and a mean intersection over union (mIoU) of 90.71% on the test set. As can be observed from the accuracy and mIoU curves, the model tended to stabilize after approximately 100 epochs, and the performance gained from further training was limited. To ensure sufficient convergence and to obtain the best performance under the adopted training setting, the total number of training epochs was set to 200.
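For reference, the two reported metrics can be computed from per-point labels as in the following sketch. It uses the common definitions of point-wise accuracy and class-averaged intersection over union; whether the paper averages IoU per sample or over the whole test set is not stated, so the pooling here is an assumption.

```python
def accuracy_and_miou(pred, true, num_classes=2):
    """Point-wise accuracy and mean IoU over the classes that occur."""
    correct = sum(p == t for p, t in zip(pred, true))
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, true))
        union = sum(p == c or t == c for p, t in zip(pred, true))
        if union:
            ious.append(inter / union)
    return correct / len(true), sum(ious) / len(ious)

# Four points, labels 0 = base metal and 1 = weld; one base metal point is mislabeled.
acc, miou = accuracy_and_miou([1, 1, 0, 0], [1, 0, 0, 0])
```

In this toy case the accuracy is 0.75 while the mIoU is 7/12, which shows why mIoU is the stricter measure when the weld class covers few points.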
To evaluate the segmentation performance of the proposed method, PointNet++, DGCNN, and Point Transformer were trained and tested on the same dataset under the same evaluation protocol, and the results are summarized in Table 2. PointNet++ was used as the primary baseline, while DGCNN and Point Transformer were introduced as representative graph-based and attention-based point cloud segmentation methods, respectively. By contrast, methods such as KPConv and SparseConvNet involve substantially different input representations or model-specific preprocessing and were therefore not included in the present benchmark.
As shown in Table 2, WSR-Net achieved an accuracy of 98.87% and the highest curved surface weld mIoU of 91.19%. Compared with PointNet++, WSR-Net improved the accuracy by 2.56 percentage points and the average mIoU by 1.02 percentage points, supporting the effectiveness of the introduced cross-attention module and the overall network optimization. Compared with DGCNN, WSR-Net achieved better results on all evaluation metrics. Compared with Point Transformer, WSR-Net obtained slightly higher accuracy and curved surface weld mIoU, while showing comparable performance in planar weld mIoU and average mIoU. These results indicate that the proposed method achieved competitive and robust segmentation performance under both planar and curved surface weld conditions.
To examine the effect of data augmentation, additional experiments were conducted under the same training settings with and without augmentation. As reported in Table 2, data augmentation increased the accuracy from 96.57% to 98.87% and the average mIoU from 88.05% to 90.71%. The planar weld mIoU and curved surface weld mIoU also increased from 87.76% to 90.23% and from 88.33% to 91.19%, respectively. These results indicate that the adopted augmentation strategy improved sample diversity and enhanced the robustness and generalization capability of the proposed network.
Point clouds of the test weld workpieces were collected by the camera, preprocessed, and fed into the trained weld recognition network to obtain the segmentation result of the weld area. Examples of the weld recognition and segmentation for planar and curved surface welds are shown in Figure 13. WSR-Net accurately separated the weld regions from the surrounding base metal areas. Under the default training setting used in this study, WSR-Net contained 1.771 million trainable parameters, and the average inference time was 7.69 s per sample of 60,000 points. These results indicate that the proposed network maintained a relatively low computational cost while achieving high segmentation accuracy.
5.2. Grinding Trajectory Generation
The effectiveness of the proposed trajectory planning module was evaluated from two perspectives: the accuracy of the ridge point generation and the height difference of the post-grinding surface. Owing to the relatively lower complexity of planar welds, the experimental evaluation of ridge point generation focused primarily on curved surface welds.
In the ridge point generation experiment, the improved ISS algorithm proposed in this study was used to extract the ridge points. The eigenvalue ratio thresholds were set to $\gamma_{21}$ = 0.12 and $\gamma_{32}$ = 0.98, and the neighborhood radius was set to $r$ = 5.3 mm. To measure the accuracy of the generated ridge points, a dial indicator was installed at the end of the robot and moved along the circumferential direction of the weld. When the dial indicator reading reached its maximum at a location, the corresponding coordinate was recorded as the actual ridge point at that position, as illustrated in Figure 14.
The position deviations between the generated ridge points and the measured ridge points are shown in Figure 15.
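Letting $p_i$ denote the actual ridge point coordinate at the $i$-th measurement point, $\hat{p}_i$ the generated ridge point coordinate, and $n$ the number of trajectory points, the mean error (ME) and root mean square error (RMSE) [31] are defined as:
$$ME = \frac{1}{n} \sum_{i=1}^{n} \lVert p_i - \hat{p}_i \rVert$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert p_i - \hat{p}_i \rVert^{2}}$$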
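The two error metrics can be computed from paired ridge points as in the following standard-library sketch, which takes the per-point error to be the Euclidean distance between the measured and generated coordinates. The function name and the sample coordinates are illustrative.

```python
import math

def ridge_errors(measured, generated):
    """Mean error and RMSE of the point-to-point Euclidean distances."""
    distances = [math.dist(a, b) for a, b in zip(measured, generated)]
    mean_error = sum(distances) / len(distances)
    rmse = math.sqrt(sum(d * d for d in distances) / len(distances))
    return mean_error, rmse

# Two measurement positions: one 5 mm off, one exact.
me, rmse = ridge_errors([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
                        [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)])
```

RMSE weights large deviations more heavily than the mean error, so it is never smaller than the mean error for the same set of points.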
Table 3 compares the performance of the proposed method with existing methods in terms of the mean error, root mean square error, and Euclidean distance error. The proposed ridge generation method achieved an RMSE of 0.5 mm and compared favorably with similar methods in trajectory accuracy, supporting its effectiveness and robustness in complex curved surface weld scenarios.
5.3. Weld Grinding Experiments Based on Trajectory Planning
In this subsection, weld grinding experiments were conducted. To evaluate the continuity of the weld surface after grinding, the height difference between the ground weld region and the adjacent base metal was measured with a dial indicator.
(1) Grinding object and parameters
The grinding experiments were conducted using a rough grinding process on both planar and curved surface workpieces. The planar workpiece, made of Q235 steel, had a weld width of 13.47 mm and a reinforcement of 5.13 mm, while the curved surface workpiece, made of X65 steel, had a weld width of approximately 12.27 mm and a reinforcement of 4.47 mm. The feed speed was set to 3 mm/s for the Q235 workpiece and 2 mm/s for the X65 workpiece. The workpieces before and after grinding are shown in Figure 16. The grinding layers planned with the reverse layer-wise trajectory generation strategy are summarized in Table 4.
(2) Grinding accuracy evaluation
The grinding results are shown in Figure 16. A dial indicator was used to take measurements at 50 positions on each side of the weld–base metal transition region after grinding. The mean and root mean square values of the height difference between the weld region and the base metal on both sides were computed. The distribution of the height differences is shown in Figure 17.
Figure 17 summarizes the height differences between the weld surfaces after grinding and the base metal surfaces of the two workpieces. For the two welds, the maximum height differences between the ground surfaces and the base metal were 0.36 mm and 0.40 mm, with corresponding root mean square errors of 0.295 mm and 0.316 mm. The experimental results showed that the proposed robotic automatic grinding method could stably remove the weld reinforcement from both the planar and curved surface workpieces.
The final residual height of the weld reinforcement after grinding was smaller than the RMSE of ridge point detection (0.5 mm). This indicated that the reverse layer-wise grinding trajectory planning strategy based on the reference surface could reduce the influence of errors in the generated ridge points. The height difference between the base metal and the ground surface was small, which showed that the grinding reference surface could effectively adapt to the height differences in the base metal on both sides of the weld.
6. Conclusions
A robotic automatic grinding method was proposed to robustly recognize the weld regions and automatically plan grinding trajectories. Some conclusions have been drawn as follows.
(1) A weld recognition and segmentation network, WSR-Net, was constructed to recognize and segment the ROIs of both planar and curved surface welds automatically. Experimental results showed that WSR-Net improved the segmentation accuracy by 2.56 percentage points compared with PointNet++.
(2) An ISS key point selection algorithm with orthogonal slicing constraints was proposed for path generation. Experimental results showed that the resulting ridge generation method achieved an RMSE of 0.5 mm.
(3) A reverse layer-wise grinding trajectory planning method based on a reference surface was proposed to address inconsistent weld reinforcement heights and height differences in the base metal surface. Experimental results showed that the root mean square height error after grinding did not exceed 0.316 mm, meeting the accuracy requirements for grinding of planar welds and prefabricated pipe welds.
The final post-grinding height error was lower than the RMSE of ridge point detection. This indicated that the reference surface-based reverse layer-wise trajectory planning strategy could mitigate the influence of the local ridge point extraction errors on the final grinding result. In addition, the small height difference between the ground weld region and the adjacent base metal suggested that the fitted reference surface could effectively accommodate base metal height inconsistencies on both sides of the weld.