Data-Driven Offline Compensation of Robotic Welding Trajectories Using 3D Optical Metrology in Industrial Manufacturing

Filip, Alexandru Costinel; Cojocaru, Dorian; Vladu, Ionel Cristian

doi:10.3390/app16052510


by Alexandru Costinel Filip ¹, Dorian Cojocaru ¹,* and Ionel Cristian Vladu ²

¹ Faculty of Automatics, Computers, and Electronics, Department of Mechatronics and Robotics, University of Craiova, 200585 Craiova, Romania

² Faculty of Electrical Engineering, Department of Electromechanics, Environment and Applied Informatics, University of Craiova, 200585 Craiova, Romania

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(5), 2510; https://doi.org/10.3390/app16052510

Submission received: 29 January 2026 / Revised: 26 February 2026 / Accepted: 27 February 2026 / Published: 5 March 2026

(This article belongs to the Topic Advanced Artificial Intelligence Solutions for Modern Engineering Applications)


Featured Application

The proposed method enables the adaptive correction of robotic welding trajectories based on the 3D scanning of real components, reducing defects and rework in industrial processes characterized by high geometric variability.

Abstract

The geometric variability of industrial components represents a persistent challenge in robotic arc welding, particularly in high-volume manufacturing environments where parts are positioned in fixtures based on nominal CAD assumptions. Even moderate deviations in dimensions or seating conditions can lead to weld defects, rework, and reduced process capability when conventional offline programming is employed. This paper presents an applied industrial workflow for adaptive robotic welding trajectory correction that integrates full-field 3D optical metrology with a data-driven deep reinforcement learning (DRL) model. Prior to welding, each component is scanned using a structured-light 3D system, and critical geometric deviations are extracted relative to the nominal CAD model. These deviations define a compact state representation that is mapped, via a trained DRL agent, to corrective translational and rotational adjustments of the welding trajectory. Importantly, all trajectory corrections are computed offline, ensuring compatibility with standard industrial robot controllers and avoiding real-time computational overheads. The proposed approach is validated using real production data from an industrial batch of 5000 components characterized by significant dimensional variability and limited process capability. Experimental results demonstrate a reduction in welding defects exceeding 90%, elimination of rework associated with improper part positioning, and an improvement of the overall process performance to a sigma level of 5.219. The results show that combining 3D optical metrology with learning-based trajectory adaptation enables robust compensation of part-level geometric deviations without mechanical fixture modifications. The proposed method provides a practical and scalable solution for improving welding quality in manufacturing environments affected by upstream variability and imperfect part positioning.

Keywords:

robotic welding; trajectory correction; 3D scanning; optical metrology; deep reinforcement learning; adaptive control; industrial manufacturing

1. Introduction

Robotic welding represents a core technology in modern automotive manufacturing, where the integrity of welded joints directly influences product reliability, structural safety, and production costs [1,2,3]. In industrial practice, welding trajectories are commonly programmed offline based on nominal CAD models, assuming ideal part positioning within dedicated fixturing systems. However, under real production conditions, geometric variability of components frequently leads to deviations from nominal positioning, resulting in weld misalignment, defects, and reduced process capability [4,5].

In cylindrical assemblies such as catalytic converters, dimensional changes introduced during upstream manufacturing stages may produce deviations in wall radius, overall length, or seating contact areas with conical end caps. These geometric inconsistencies can prevent proper seating in the fixture, making nominal CAD-based trajectories incompatible with the actual geometry of the part. Similar challenges related to dimensional variability and fixture sensitivity have been reported in industrial welding systems [6,7]. As demonstrated in Section 3, the analyzed production process exhibits very low capability for wall radius (Cpk = 0.02) and marginal capability for overall length (Cpk = 0.56), confirming that geometric variability directly contributes to weld nonconformities.

To mitigate such issues, various adaptive strategies have been proposed. Seam-tracking systems based on laser or vision sensing enable real-time detection of joint position during welding [8,9]. Although effective for local correction, these systems can be sensitive to arc glare, smoke, and surface reflections, and may present limitations in circular or axisymmetric geometries. Other approaches employ machine learning or reinforcement learning techniques to optimize process parameters such as current, voltage, or travel speed [10,11]. However, in many reported implementations, the geometric trajectory itself remains unchanged.

Recent advances in deep reinforcement learning (DRL) have demonstrated the capability of data-driven agents to learn complex, nonlinear control policies in robotic systems [12,13]. DRL has been applied in manufacturing contexts for adaptive control, robotic path planning, and safe interaction in uncertain environments [14,15]. Nevertheless, most existing solutions either rely primarily on simulation-based training or focus on parameter adaptation rather than direct geometric trajectory correction derived from full-field three-dimensional measurements.

In this context, the present study proposes an integrated industrial workflow that combines structured-light 3D optical metrology with a DRL-based trajectory adaptation mechanism. Measured geometric deviations are encoded into a compact state representation, which is mapped by a trained DRL agent to translational and rotational corrections of the welding trajectory. Importantly, all corrections are computed offline, ensuring compatibility with standard industrial robot controllers and avoiding real-time computational overhead.

This approach enables the robot to adapt its trajectory to the actual geometry of each individual part, reducing sensitivity to upstream manufacturing variability and fixture wear. Instead of relying on manual adjustments or deterministic rule-based offsets, the DRL agent learns correction policies directly from production data, capturing nonlinear interactions between geometric deviations and welding outcomes.

The objective of this study is the development and experimental validation of a complete adaptive welding workflow integrating 3D scanning and deep reinforcement learning for automatic robotic trajectory correction. The method is validated using real industrial data from a production batch of 5000 components and evaluated through statistical process capability and sigma-level analysis.

The remainder of the paper is organized as follows. Section 2 describes the 3D scanning system, the formulation of the reinforcement learning problem, and the integration of the proposed method into the industrial robot. Section 3 presents the experimental results obtained using real industrial data. Section 4 discusses industrial relevance, comparative advantages, and identified limitations. Section 5 summarizes the main conclusions and outlines future research directions.

The main contributions of this work are:

(i) An integrated industrial workflow combining full-field 3D optical metrology with deep reinforcement learning for robotic welding trajectory correction.

(ii) A learning-based trajectory adaptation strategy that directly compensates for geometric deviations rather than adjusting only the process parameters.

(iii) Large-scale industrial validation on a production batch of 5000 components, including statistical process capability and sigma-level analysis.

(iv) An offline correction framework compatible with standard industrial robot controllers, avoiding real-time computational overhead.

Positioning and novelty: (i) conventional seam-tracking approaches perform local online corrections under harsh arc conditions; (ii) reinforcement learning studies primarily optimize welding parameters (current/voltage/speed) while keeping the nominal path unchanged; and (iii) deterministic rule-based geometric approaches apply fixed offset compensation, which becomes brittle under coupled deviations. In contrast, the present work introduces an offline, part-specific trajectory adaptation stage driven by full-field 3D metrology and a DRL policy learned from real production data. The proposed framework targets global geometric/positioning deviations originating upstream (dimensional variability, seating errors, fixture wear), produces a finalized robot program compatible with standard industrial controllers, and demonstrates large-scale process improvement under industrial monitoring.

2. Materials and Methods

2.1. Three-Dimensional Scanning System and Data Acquisition

For dimensional characterization of the components, an optical 3D scanning system based on structured light (GOM ATOS Compact Scan) was employed. The system generates a high-resolution point cloud by projecting a precise pattern onto the surface of the component, enabling accurate reconstruction of the real geometry. The setup includes two high-resolution cameras, a blue-light projector, an ambient light sensor, and a rotary table used for the controlled positioning and scanning of the part.

The data acquisition process consists of the following steps:

The component is positioned on the rotary table.
A sequence of scans is performed from multiple viewing angles.
Filtering and cleaning algorithms are applied to the acquired point cloud.
The point cloud is converted into a triangulated mesh.
The mesh is aligned with the nominal CAD model using a three-stage procedure:
- Pre-alignment;
- Reference Point System (RPS);
- Local Best Fit.

The resulting aligned data constitute the primary input for the machine learning system.

The measurement uncertainty of the ATOS system was below 10 µm, ensuring reliable extraction of dimensional deviations relevant for welding correction (Figure 1).

2.2. Extraction of Geometric Deviations

After aligning the point cloud with the nominal model, point-to-point deviations are computed:

Δ_{i} = ‖p_{i}^{real} - p_{i}^{nominal}‖

(1)

where ‖·‖ denotes the Euclidean norm, $p_{i}^{real}$ is the scanned (measured) point, and $p_{i}^{nominal}$ is the corresponding point in the CAD model, taken as the closest point on the CAD surface.
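As a minimal sketch of Equation (1), the snippet below computes the deviation of each scanned point as its Euclidean distance to the nearest nominal point. It assumes the CAD surface is approximated by a list of sampled points and uses a brute-force search; the function name `point_deviations` and all coordinates are illustrative and are not taken from the metrology software used in the study.

```python
import math

def point_deviations(scanned, nominal_samples):
    """Return one deviation per scanned point: distance to the nearest nominal sample."""
    return [min(math.dist(p, q) for q in nominal_samples) for p in scanned]

# Hypothetical nominal samples and scanned points (mm).
nominal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
scanned = [(0.0, 0.1, 0.0), (1.0, 0.0, 0.2), (2.1, 0.0, 0.0)]

deviations = point_deviations(scanned, nominal)
```

For dense industrial point clouds a spatial index (for example a k-d tree) would normally replace the brute-force search; the quadratic loop is kept here only for readability.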

For critical features (radius R, length L, contact area C), the deviations are aggregated:

Δ R = R_{real} - R_{nominal}

Δ L = L_{real} - L_{nominal}

Δ C = C_{real} - C_{nominal}

(2)

Aggregated deviations were computed using statistical descriptors (e.g., mean and maximum absolute deviation) for each critical feature. Figure 2 highlights deviations in the critical welding areas; these deviations represent the observable state for the DRL model.
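The aggregation step can be illustrated with a short sketch. It assumes that local signed deviations for one critical feature are already available as a list of numbers; the helper names and all numeric values (including the nominal length) are hypothetical and serve only to show the mean and maximum-absolute descriptors mentioned above.

```python
from statistics import mean

def feature_deviation(measured, nominal):
    """Signed deviation of one critical feature, in the sense of Equation (2)."""
    return measured - nominal

def aggregate(signed_deviations):
    """Summarize local deviations of one feature by mean and maximum absolute value."""
    return {
        "mean": mean(signed_deviations),
        "max_abs": max(abs(d) for d in signed_deviations),
    }

# Hypothetical local radius deviations (mm) sampled around one weld zone.
radius_devs = [0.12, -0.05, 0.30, 0.08]
summary = aggregate(radius_devs)

# Hypothetical measured and nominal lengths (mm).
delta_L = feature_deviation(187.6, 190.0)
```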

2.3. Dataset Generation for Learning

For training the DRL agent, each scan must be associated with the corresponding welding outcome.

Thus, for each part, the following tuple is constructed:

(s_{k}, a_{k}, r_{k})

(3)

where:

$s_{k}$ = state, the deviation vector:

$s_{k} = [Δ R_{k}, Δ L_{k}, Δ C_{k}, Δ θ_{k}, Δ x_{k}, Δ y_{k}, Δ z_{k}]$

(4)

$a_{k}$ = action, the correction applied to the trajectory:

$a_{k} = [Δ x_{torch}, Δ y_{torch}, Δ z_{torch}, Δ θ_{torch}]$

(5)

$r_{k}$ = reward, determined based on the welding quality:

$r_{k} = \begin{cases} +1 & \text{if the weld is compliant} \\ -1 & \text{if the weld is non-compliant} \end{cases}$

(6)

In the full version, the reward can be continuous:

r_{k} = - (α \cdot E_{geometric} + β \cdot E_{visual} + γ \cdot E_{NDT})

(7)

where $E_{geometric}$, $E_{visual}$, and $E_{NDT}$ represent normalized error metrics derived from geometric deviation, visual inspection, and non-destructive testing, respectively, and α, β, γ are weighting coefficients.
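The two reward variants of Equations (6) and (7) can be sketched as follows. The weighting coefficients are not reported in the text, so the default values of `alpha`, `beta`, and `gamma` below are placeholders chosen only for illustration; the error terms are assumed to be normalized to [0, 1].

```python
def binary_reward(compliant: bool) -> float:
    """Equation (6): +1 for a compliant weld, -1 otherwise."""
    return 1.0 if compliant else -1.0

def continuous_reward(e_geometric, e_visual, e_ndt,
                      alpha=0.5, beta=0.3, gamma=0.2):
    """Equation (7): negative weighted sum of normalized error metrics.

    The default weights are hypothetical; the source does not state them.
    """
    return -(alpha * e_geometric + beta * e_visual + gamma * e_ndt)

r_ok = binary_reward(True)
r_bad = binary_reward(False)
r_cont = continuous_reward(0.2, 0.1, 0.0)
```

With this sign convention a defect-free weld with zero error yields the maximum continuous reward of 0, and larger errors push the reward toward more negative values.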

To ensure the stability and generalization of the learning process, 3D point clouds are not used directly as raw input to the DRL agent. Geometric information is reduced to a compact set of industrially relevant features, consisting of aggregated dimensional deviations and deviation maps aligned to the CAD model. This representation enables a direct correlation between geometric variations of the parts and the trajectory correction actions.

2.4. Formulation of the Reinforcement Learning Problem

The process is formalized as a Markov Decision Process (MDP), where the agent learns a continuous trajectory-compensation policy from recorded industrial transitions. The state $s_{k}$ consists of compact geometric deviation descriptors extracted prior to welding (Section 2.3), while the action $a_{k}$ represents continuous translational and rotational corrections applied to the nominal robot trajectory (Equation (9)). After executing welding with the corrected program, the resulting weld quality provides the reward $r_{k}$.

Since the correction actions are continuous, the policy is learned using a Soft Actor–Critic (SAC) formulation. SAC is an off-policy Actor–Critic method that optimizes a stochastic policy $π_{θ} (a ∣ s)$ while encouraging exploration through entropy regularization. The objective is to maximize the expected cumulative reward and the policy entropy:

π^{*} = \arg\max_{π} E [\sum_{t} γ^{t} (r_{t} + α H (π (\cdot ∣ s_{t})))]

(8)

where γ is the discount factor and α is the entropy temperature (these symbols are distinct from the weighting coefficients of Equation (7)). In practice, two Critic networks $Q_{ϕ_{1}} (s, a)$ and $Q_{ϕ_{2}} (s, a)$ are used to improve stability and reduce overestimation (positive) bias.

2.5. Deep Reinforcement Learning Agent Architecture

The agent follows an Actor–Critic architecture consistent with the SAC algorithm. Geometric deviations extracted from 3D metrology are provided to two parallel encoders: (i) a CNN encoder that processes a fixed-resolution deviation map aligned to the CAD model, and (ii) an MLP that encodes low-dimensional numerical deviation descriptors (e.g., $Δ R$, $Δ L$, $Δ C$, $Δ x$, $Δ y$, $Δ z$, $Δ θ$). The encoded representations are fused into a shared latent vector.

The fused representation is then used by:

An Actor network that outputs the parameters of a squashed Gaussian policy (mean and standard deviation) for continuous corrections, i.e., $a \sim π_{θ} (a ∣ s)$.
Two Critic networks $Q_{ϕ_{1}} (s, a)$ and $Q_{ϕ_{2}} (s, a)$ (twin Q) that estimate the expected return for state–action pairs.
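To make the squashed Gaussian policy concrete, the sketch below samples a bounded correction from per-dimension means and log standard deviations, as an Actor head would output them. The bounds in `BOUNDS`, the units, and the example Actor outputs are assumptions made for illustration; the source does not report the actual safety limits.

```python
import math
import random

# Hypothetical bounds for [dx, dy, dz (mm), dtheta (deg)]; not taken from the source.
BOUNDS = (2.0, 2.0, 1.0, 3.0)

def sample_action(means, log_stds, bounds=BOUNDS, rng=random):
    """Draw a Gaussian sample per dimension, squash with tanh, scale to the bounds."""
    action = []
    for mu, log_std, bound in zip(means, log_stds, bounds):
        u = rng.gauss(mu, math.exp(log_std))  # unbounded Gaussian sample
        action.append(bound * math.tanh(u))   # squashed into [-bound, bound]
    return action

rng = random.Random(0)  # fixed seed so the sketch is repeatable
action = sample_action([0.5, -0.2, 0.0, 1.5], [-1.0, -1.0, -2.0, -0.5], rng=rng)
```

The tanh squashing guarantees that every sampled correction stays inside the stated bounds regardless of the Gaussian sample, which is what makes the output "bounded" at the policy level.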

This design preserves industrial interpretability at the input/output level (explicit deviation descriptors and bounded correction actions) while enabling robust learning of nonlinear compensation policies from production data (Figure 3).

The agent integrates geometric deviation information extracted from 3D scans through a convolutional neural network (CNN) and numerical deviation features through a multilayer perceptron (MLP). The fused representation forms a shared latent vector that feeds both the stochastic policy network (Actor) and two Critic networks (Twin Q-functions).

The Actor outputs continuous corrective actions applied to the robot welding trajectory, while the Critics estimate the expected return of state–action pairs to guide policy optimization.

The CNN processes fixed-resolution deviation maps aligned with the CAD model, whereas the MLP encodes low-dimensional numerical deviation descriptors.

2.6. Exploration and Update Policy

In SAC, exploration is achieved through the stochastic policy itself, without ε-greedy action selection. At each step, the Actor samples a continuous correction action $a_{t} \sim π_{θ} (\cdot ∣ s_{t})$ within predefined safety bounds. Transitions $(s_{t}, a_{t}, r_{t}, s_{t + 1})$ are stored in an experience replay buffer and reused for off-policy learning.

Training updates are performed by minimizing (i) the Critic losses for the twin Q-functions and (ii) the Actor loss derived from maximizing expected Q-value while maintaining policy entropy. Target networks are updated using soft updates to ensure stability. The entropy temperature can be automatically tuned to maintain a desired exploration level.
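The two numerical ingredients of these updates, the twin-Q regression target and the soft target-network update, can be sketched in scalar form as follows. This follows the standard SAC formulation rather than the authors' code; the fixed temperature `alpha=0.2` is a placeholder (the text states that the temperature is tuned automatically), while `gamma=0.99` and `tau=0.005` are the values reported in Section 2.6.2.

```python
def td_target(reward, done, q1_next, q2_next, log_prob_next,
              gamma=0.99, alpha=0.2):
    """Critic regression target using the minimum of the two target Q-values."""
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    return reward + (0.0 if done else gamma * soft_value)

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: move each target parameter slightly toward the online one."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

y_terminal = td_target(1.0, True, 2.0, 3.0, -1.0)
y_bootstrap = td_target(1.0, False, 2.0, 3.0, -1.0)
new_target = soft_update([0.0, 1.0], [1.0, 1.0])
```

Taking the minimum of the two Critic estimates is what counteracts overestimation, and the small `tau` keeps the target networks changing slowly between updates.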

Importantly, all learning is performed offline using recorded industrial transitions, ensuring that no unsafe online exploration is executed on the production cell.

2.6.1. Industrial Training Configuration and Deployment Context

The DRL agent was implemented and trained within the industrial automation environment of the Ford Craiova welding cell. Training was performed offline using recorded industrial transition data collected during controlled production runs. The learning configuration followed a standard Soft Actor–Critic (SAC) framework with experience replay and entropy-regularized stochastic exploration. Hyperparameters were selected and validated through iterative tuning under production engineering supervision to ensure stable convergence and compatibility with industrial safety constraints. Training relied on recorded production datasets rather than online exploration, thereby avoiding any risk to active manufacturing operations. The resulting policy was validated under supervised industrial conditions prior to full deployment in the production environment.

2.6.2. Training Configuration and Reproducibility Protocol

The SAC agent was trained offline using recorded industrial transition data collected from the welding production line. The training configuration followed a standard Soft Actor–Critic implementation with twin Q-functions and entropy regularization.

Training was conducted for 120,000 episodes, with a maximum of 64 steps per episode. A replay buffer of 1,000,000 transitions was used to enable stable off-policy learning. A warm-up phase of 10,000 steps was applied before policy updates.

Mini-batches of size 256 were sampled from the replay buffer. The discount factor was set to γ = 0.99, and soft target updates were performed using a coefficient τ = 0.005. Both Actor and Critic networks were optimized using the Adam optimizer with a learning rate of 3 × 10⁻⁴ (Table 1).

The neural architecture consisted of three fully connected hidden layers with 256 neurons per layer, using ReLU activations. The entropy temperature parameter was automatically tuned during training.
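For reference, the reported settings can be collected in a single configuration object. The class and field names below are hypothetical; only the numeric values are taken from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SACConfig:
    """Training settings as reported in the text; field names are illustrative."""
    episodes: int = 120_000
    max_steps_per_episode: int = 64
    replay_buffer_size: int = 1_000_000
    warmup_steps: int = 10_000
    batch_size: int = 256
    gamma: float = 0.99             # discount factor
    tau: float = 0.005              # soft target update coefficient
    learning_rate: float = 3e-4     # Adam, Actor and Critics
    hidden_layers: tuple = (256, 256, 256)
    activation: str = "relu"
    auto_entropy_tuning: bool = True

config = SACConfig()
```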

Training was executed on an NVIDIA RTX A5000/RTX 4090 GPU platform, requiring approximately 18 h for full convergence. Inference in the production environment is performed on an industrial PC CPU with latency below 40 ms, ensuring no impact on welding cycle time.

To improve robustness and mitigate initialization bias, training was repeated with five independent random seeds, and convergence was assessed based on stabilization of the residual RMS welding error.

The dataset was partitioned using a hold-out strategy (70/15/15), separating training, validation, and test sets to prevent data leakage between model development and final industrial evaluation.
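A hold-out partition of this kind can be sketched as below. The function name, the fixed seed, and the dataset size of 1000 records are assumptions for illustration; the text does not specify how the shuffling was performed or how many records were split.

```python
import random

def holdout_split(items, seed=0, fractions=(0.70, 0.15, 0.15)):
    """Shuffle once with a fixed seed, then cut into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical record identifiers.
train, val, test = holdout_split(range(1000))
```

Because each record lands in exactly one subset, no part used for model development reappears in the final evaluation set.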

2.7. Generation of the Corrected Trajectory

The action proposed by the agent is applied to the nominal trajectory:

T_{corrected} = T_{nominal} + a

(9)

where the action a contains positional and orientation corrections.

The correction is applied point-wise along the nominal trajectory.

The nominal Cartesian trajectory is defined as a spatial curve parameterized by arc length:

C (s) = [x (s), y (s), z (s), θ (s)], s \in [0, L]

(10)

where s represents the arc length and L is the total path length.

To obtain a time-parameterized trajectory suitable for robot execution, a time-scaling function s(t) is introduced, mapping time to arc length. The executed trajectory is therefore:

T (t) = C (s (t))

(11)

In practice, the continuous trajectory is discretized at controller sampling intervals Δt, generating a sequence of Cartesian positions:

\{T (t_{k})\}, \quad k = 0, \dots, N

(12)

The correction $Δ T (a_{t})$ produced by the DRL agent is applied point-wise to the discretized trajectory prior to export. Final smoothing is performed using cubic interpolation to ensure compatibility with the continuous path (CP) execution mode of the industrial robot (Figure 4).
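The discretization, point-wise correction, and smoothing steps can be sketched end to end. Several assumptions are made here: the seam is modeled as a circle of hypothetical radius 50 mm, the time scaling s(t) is constant-speed (equal arc-length steps), the correction is a single additive offset as in Equation (9), and Catmull-Rom splines stand in for the unspecified "cubic interpolation".

```python
import math

def nominal_pose(s, radius=50.0):
    """Hypothetical circular seam C(s) = (x, y, z, theta), s = arc length in mm."""
    phi = s / radius
    return (radius * math.cos(phi), radius * math.sin(phi), 0.0, phi)

def discretize(path_length, n_segments):
    """Sample the path at N + 1 equally spaced arc-length values."""
    return [nominal_pose(path_length * k / n_segments) for k in range(n_segments + 1)]

def apply_correction(poses, action):
    """Point-wise additive correction, T_corrected = T_nominal + a."""
    return [tuple(c + d for c, d in zip(pose, action)) for pose in poses]

def catmull_rom(p0, p1, p2, p3, u):
    """Cubic interpolation between p1 and p2 for u in [0, 1]."""
    return tuple(
        0.5 * (2 * b + (c - a) * u + (2 * a - 5 * b + 4 * c - d) * u ** 2
               + (3 * b - a - 3 * c + d) * u ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def smooth(poses, samples_per_segment=4):
    """Densify the pose list with a cubic curve that passes through every pose."""
    padded = [poses[0]] + list(poses) + [poses[-1]]
    dense = []
    for i in range(1, len(padded) - 2):
        for j in range(samples_per_segment):
            dense.append(catmull_rom(*padded[i - 1:i + 3], j / samples_per_segment))
    dense.append(poses[-1])
    return dense

nominal = discretize(path_length=2 * math.pi * 50.0, n_segments=36)
corrected = apply_correction(nominal, (0.4, 0.0, 1.0, 0.0))  # hypothetical action
dense = smooth(corrected)
```

An interpolating spline is used so that the smoothed path still passes exactly through the corrected points; an approximating spline would trade that property for extra smoothness.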

2.8. Integration into the FANUC Industrial Robot

All stages of data acquisition, geometric processing, and DRL agent inference are performed offline, prior to welding execution. Trajectory corrections are fully computed before loading the program into the industrial robot controller, so that the actual welding process does not involve additional computations or real-time decision making.

The corrected trajectory is exported in a controller-compatible format (LS, TP, or Cartesian Path) through the following steps:

Calculation of offsets in the robot coordinate system;
Generation of new points;
Application of smoothing (cubic interpolation);
Loading into the controller.

2.9. Complete Method Pseudocode

Algorithm 1 summarizes the complete offline trajectory compensation pipeline adopted in this study, from 3D scanning and CAD alignment to seam feature extraction, policy inference, and the generation of bounded translational/rotational offsets applied to the robot program. The method follows an actor–critic deep reinforcement learning formulation based on the Soft Actor–Critic (SAC) framework, which is suitable for continuous action spaces and stable learning. For clarity and reproducibility, the pseudocode reports the key processing stages, the safety envelope enforcement, and the logging/feedback loop used during offline training.

Algorithm 1. SAC-based Offline Trajectory Compensation

Input: 3D geometric deviation descriptors $s_{t}$, trained Actor parameters $θ$, nominal trajectory $T_{nominal}$

Output: Corrected trajectory $T_{corrected}$

Step 1: State Construction
Acquire the 3D scan of the component and extract geometric deviation descriptors. Construct the state vector $s_{t}$.

Step 2: Policy Inference
Sample correction action:

a_{t} \sim π_{θ} (\cdot ∣ s_{t})

(13)

where $π_{θ}$ is a stochastic Gaussian policy with bounded outputs.

Step 3: Safety Constraint Enforcement
Clip action within predefined industrial safety bounds.

Step 4: Trajectory Update
Apply correction to nominal trajectory:

T_{corrected} = T_{nominal} \oplus Δ T (a_{t})

(14)

Step 5: Welding Execution and Reward Computation (Training Phase Only)
Execute welding, measure weld quality, and compute reward $r_{t}$.
Store transition $(s_{t}, a_{t}, r_{t}, s_{t + 1})$ in the replay buffer $D$.

Step 6: Offline SAC Update (Training Phase Only)
Update Critic networks.
Update Actor using entropy-regularized objective.
Soft-update target networks.
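The inference-time part of Algorithm 1 (Steps 2 to 4) can be sketched as a single function. The trained Actor is replaced here by `toy_policy`, a hand-written stand-in with made-up gains; the bounds, the state values, and the two-pose trajectory are likewise hypothetical, and the ⊕ operator is simplified to component-wise addition.

```python
def clip_action(action, bounds):
    """Step 3: clamp every component to its symmetric safety bound."""
    return [max(-b, min(b, a)) for a, b in zip(action, bounds)]

def compensate(state, policy, nominal_trajectory, bounds):
    """Steps 2-4: infer a correction, clip it, and apply it to every pose."""
    raw_action = policy(state)
    action = clip_action(raw_action, bounds)
    corrected = [tuple(c + d for c, d in zip(pose, action))
                 for pose in nominal_trajectory]
    return action, corrected

def toy_policy(state):
    """Stand-in for the trained Actor; the gains are invented for illustration."""
    d_radius, d_length = state[0], state[1]
    return [d_radius, 0.0, -0.5 * d_length, 0.0]

# State layout follows Equation (4): [dR, dL, dC, dtheta, dx, dy, dz].
state = [0.4, -2.4, 0.0, 0.0, 0.0, 0.0, 0.0]
bounds = (2.0, 2.0, 1.0, 3.0)
nominal = [(50.0, 0.0, 0.0, 0.0), (0.0, 50.0, 0.0, 90.0)]

action, corrected = compensate(state, toy_policy, nominal, bounds)
```

In this example the raw z-correction of 1.2 exceeds its bound and is clipped to 1.0 before being applied, which is the behavior Step 3 is meant to enforce.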

2.10. Elements Intended for the Supplementary Materials

For transparency and replicability, the following materials are provided in the Supplementary Materials:

Raw point clouds (PLY/STL);
Complete deviation maps;
SPC tables (Excel);
Original scan figures;
Before/after correction plots;
Extended pseudocode (full version);
Detailed CNN + SAC architecture (Actor–Twin Critic);
Examples of nominal and corrected trajectories.

3. Results

This section presents the statistical analysis of dimensional variations of the components, the evaluation of the effect of deviations on welding quality, and the validation of the performance of the proposed method. The results are obtained by combining three-dimensional measurements, statistical process control (SPC) techniques, and deep reinforcement learning (DRL) agent training, using real datasets from industrial production.

3.1. Analysis of Geometric Variations of the Catalytic Converter

For the critical features—wall radius (R), total length (L), and contact area (C)—100 parts were analyzed under normal production conditions. The data originate from 3D optical measurements, processed and aligned to the nominal model (Figure 5).

3.1.1. Wall Radius (R)

SPC analysis indicates a wide distribution of deviations, with values consistently exceeding the specified limits:

Cpk = 0.02;
Ppk = 0.02;
Non-normal distribution (p < 0.001).

These results indicate a completely incapable process, in which wall radius variations cannot be compensated for by mechanical positioning and lead to major errors in the welded joint.

3.1.2. Total Length (L)

Length analysis highlights a marginal process:

Cpk = 0.56, insufficient for stable production;
Ppk = 0.49;
Non-normal distribution (p < 0.001, reported as 0).

In particular, parts with L < 188 mm require additional operations (rework) due to the way the end caps seat and overlap on the converter body (Figure 6).

Observation: a reduction in length leads to positioning deviations that cause non-uniform opening of the welding gap.

3.1.3. Contact Area (C)

According to go/no-go inspection:

Defect rate ≈ 0.01%;
Process considered capable.

This feature has a limited influence on trajectory variations, with major deviations concentrated in R and L.

3.1.4. Industrial Pre- and Post-Implementation Quality Assessment

To rigorously evaluate the impact of the proposed DRL-based trajectory compensation framework, a direct comparison between baseline production and post-implementation performance was conducted using industrial quality monitoring data.

The baseline production batch consisted of 100 units, for which 5 defects were recorded. Each unit was evaluated with respect to one critical weld seam (one defect opportunity per unit). This corresponds to:

Defects (d) = 5;
Units (n) = 100;
Opportunities per unit (o) = 1;
Total opportunities = 100;
Defect rate = 5%;
Process sigma level = 3.145.

Following the deployment of the proposed trajectory correction framework, a monitored production batch of 5000 units was analyzed. In this case, two critical weld seams per unit were evaluated, resulting in two defect opportunities per unit. The recorded data were as follows:

Defects (d) = 1;
Units (n) = 5000;
Opportunities per unit (o) = 2;
Total opportunities = 10,000;
Defect rate (per unit) = 0.02%;
Process sigma level = 5.219.

The sigma level was computed using standard Six Sigma methodology based on defects per opportunity (DPO) and corresponding DPMO conversion. Since sigma accounts for the number of defect opportunities per unit, the comparison remains valid despite the difference in weld seam count between the two production stages.

It should be noted that the baseline batch (n = 100 units) represents the standard monitored production window prior to deployment of the compensation framework, while the post-implementation evaluation (n = 5000 units) corresponds to extended industrial validation under stable operating conditions.

Although the sample sizes differ, the statistical comparison is performed at the defect-per-opportunity (DPO) level, which normalizes for both unit count and defect opportunities per unit. Furthermore, the statistical validation analysis presented in Section 3.8 confirms that the observed improvement is not attributable to sampling size differences but reflects a genuine shift in process performance.

Table 2 summarizes the key performance indicators before and after implementation.

Figure 7 illustrates the resulting improvement in process capability.

The increase from 3.145σ to 5.219σ corresponds to a substantial reduction in defects per opportunity and reflects a significant enhancement in welding process robustness. From an industrial perspective, this improvement translates into reduced rework, lower scrap rates, and increased production stability.

3.2. Effect of Deviations on the Weld Bead

The measured deviations lead, in production, to the following issues:

Misalignment between the end caps and the body;
Excessively open or closed welding gaps;
Changes in the actual torch angle;
Axial offsets of the welding points;
Local material deformations.

These effects were repeatedly observed in non-compliant parts, confirming the direct relationship between dimensional variations and welding defects (Figure 8).

3.3. Performance of the DRL Agent in Trajectory Correction

The DRL agent was trained on a dataset composed of:

Geometric deviations extracted from 3D scanning;
Correction actions manually applied in production;
Welding outcomes (OK/defective);
Corrected and nominal trajectories.

Convergence of the Learning Policy

During the training episodes, the DRL agent gradually improves its performance:

Steady increase in average reward;
Reduction in action variability;
Initial policy stabilization was observed after approximately 1200–1500 replay-based training episodes, while full convergence and robustness validation were achieved over the complete 120,000-episode offline training schedule described in Section 2.6.2.

3.4. Evaluation of the Corrections Proposed by the Agent

Based on the measured deviations, the DRL agent generates corrections in the space of:

Δx, Δy, Δz (translations);
Δθ (orientations).

Consistency of the Adjustments

For parts exhibiting large radius deviations (Figure 9), the agent proposes:

An increase in radial offset;
A modification of the torch angle to compensate for gap opening;
An adjustment of the travel speed (in certain cases).

3.5. Industrial Results After Implementation of the Corrections

The method was validated on an extended batch of 5000 parts.

Evaluated parameters:

Defect rate;
Rework rate;
Process stability;
Sigma level.

3.5.1. Defect Reduction

The implementation of adaptive trajectory correction led to:

A reduction in welding defects by more than 90%;
Elimination of rework associated with short parts (<188 mm);
Automatic correction of improper positioning.

For comparison, the reference (baseline) process relied exclusively on welding trajectories programmed offline based on the nominal CAD model, without adaptive trajectory corrections. Under these conditions, geometric variations of the components frequently led to misalignment of the joining areas, generation of welding defects, and the need for rework operations. The performance reported in this section corresponds to the implementation of the proposed method and is evaluated relative to this reference process.

3.5.2. Process Capability Calculation

The process capability was evaluated using the Six Sigma methodology based on defects per opportunity (DPO). The Defects Per Million Opportunities (DPMO) were calculated as:

D P M O = \frac{D}{N \times O} \times 10^{6}

where D represents the number of detected defects, N the number of produced units, and O the number of critical weld opportunities per unit.

The corresponding sigma level was determined using the standard Six Sigma conversion, including the conventional 1.5σ shift adopted in industrial practice. This formulation enables a normalized comparison between production stages with different numbers of defect opportunities per unit.
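The calculation can be reproduced with a few lines of standard-library Python. The sketch assumes the usual conversion, in which the sigma level is the standard normal quantile of the yield plus the 1.5σ shift; the function names are illustrative, and the defect counts are those reported in Section 3.1.4.

```python
from statistics import NormalDist

def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def sigma_level(dpmo_value, shift=1.5):
    """Standard normal quantile of the yield plus the conventional 1.5-sigma shift."""
    yield_fraction = 1.0 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift

baseline_dpmo = dpmo(5, 100, 1)    # 5 defects, 100 units, 1 seam per unit
post_dpmo = dpmo(1, 5000, 2)       # 1 defect, 5000 units, 2 seams per unit

baseline_sigma = sigma_level(baseline_dpmo)
post_sigma = sigma_level(post_dpmo)
```

Under these assumptions the baseline evaluates to about 50,000 DPMO and a sigma level of about 3.145, and the post-implementation batch to about 100 DPMO and a sigma level of about 5.219, matching the reported figures.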

3.6. Qualitative Performance Analysis

The implementation of the AI-based method integrated with 3D scanning brought clear benefits:

The robot becomes independent of perfect part positioning.
It compensates for dimensional deviations directly in the trajectory.
It reduces process sensitivity to manufacturing variations.
It improves process stability without mechanical modifications or fixture changes.
It enables generalization to new parts with similar geometries.

3.7. Operational Overhead and Industrial Feasibility

The practical deployment of the proposed trajectory compensation framework was evaluated in terms of additional operational overhead introduced into the production workflow.

The structured-light 3D scanning process required approximately 5–10 s per component, depending on positioning and surface conditions. Deviation extraction and data preprocessing were executed automatically within the metrology software environment and required only a few additional seconds.

The trajectory correction computation was performed offline within the industrial software ecosystem and did not introduce real-time computational load during welding execution. The corrected trajectory was generated prior to welding and transferred to the robot controller as a standard program.

The average main welding cycle time for the analyzed component is approximately 45 s per unit, depending on seam configuration and robot speed settings. The additional preprocessing stage (3D scanning and deviation extraction), requiring approximately 8–15 s in total, represents less than one third of the overall production cycle and is executed prior to arc ignition. Since the trajectory correction is fully computed offline, the arc-on time and takt time of the welding cell remain unchanged.

Because no adaptive control or real-time sensing is required during arc operation, the method improves process capability without altering the welding cycle itself.

From an industrial perspective, the additional preprocessing time of approximately 8–15 s per part is small relative to the benefits obtained through defect reduction, elimination of rework, and increased production stability.
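
The overhead figures quoted in this section can be put side by side with a short calculation. This sketch assumes that scanning and welding run sequentially for each part. The text does not state whether the two stages overlap in the cell, and the helper name is illustrative.

```python
def overhead_share(extra_s: float, weld_s: float = 45.0) -> tuple[float, float]:
    """Return (extra / weld, extra / (weld + extra)) for one part."""
    return extra_s / weld_s, extra_s / (weld_s + extra_s)


for extra_s in (8.0, 15.0):  # preprocessing range quoted in the text
    vs_weld, vs_total = overhead_share(extra_s)
    print(f"{extra_s:>4.0f} s: {vs_weld:.1%} of the weld cycle, "
          f"{vs_total:.1%} of weld + preprocessing")
```

At the upper end of the range, 15 s equals one third of the 45 s welding cycle and one quarter of the combined weld-plus-preprocessing time, so the "less than one third" statement holds when the share is taken over the combined cycle.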

3.8. Statistical Validation of Process Improvement

To ensure that the observed improvements were not attributable to random variation, a formal statistical comparison was performed between the baseline production stage (CAD-based trajectory) and the SAC-compensated production stage.

For geometric residual error (RMS), normality was first assessed using the Shapiro–Wilk test. Since no significant deviation from normality was detected (p > 0.05), a two-sample independent t-test (Welch’s correction for unequal variances) was applied.

The null hypothesis H₀ assumed equal mean residual RMS errors between baseline and SAC-based compensation. The alternative hypothesis H₁ assumed a reduction in RMS error under SAC compensation.

The obtained test statistic indicated a statistically significant reduction in residual RMS error:

p = 0.0031 (two-tailed), α = 0.05.

The 95% confidence interval for the mean reduction in RMS error was:

CI₉₅ = [0.38 mm, 0.51 mm].

The computed effect size (Cohen’s d = 0.84) corresponds to a large practical effect according to conventional interpretation thresholds.

For defect occurrence rates, a comparison of proportions was conducted using a two-proportion z-test based on defects per opportunity (DPO). The reduction from 5 defects/100 opportunities to 1 defect/10,000 opportunities was statistically significant (p < 0.001), confirming that the observed increase in sigma level (from 3.145σ to 5.219σ) is not attributable to sampling variability.

These results confirm that the observed industrial performance gains are statistically robust and reflect a true improvement in process capability.
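
The comparison of defect proportions described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it uses the pooled-variance form of the two-proportion z-test, since the text does not say which variant was applied, and the function name is hypothetical. With only one observed defect in the compensated stage, the normal approximation is coarse, and an exact test (for example Fisher's) may be a safer check.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(d1: int, n1: int, d2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = d1 / n1, d2 / n2
    pooled = (d1 + d2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Defects per opportunity as reported: 5 in 100 (baseline), 1 in 10,000 (SAC).
z, p = two_proportion_z(d1=5, n1=100, d2=1, n2=10_000)
print(f"z = {z:.2f}, p < 0.001: {p < 0.001}")
```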

4. Discussion

The integration of three-dimensional measurement with a deep reinforcement learning (DRL) agent for adaptive adjustment of the welding trajectory represents a significant advancement over classical offline programming methods, as well as over the most recent commercial seam-tracking systems. This section analyzes the relevance of the method, its comparison with other techniques in the field, the industrial impact, the identified limitations, and directions for further extension.

4.1. Industrial Relevance of the Proposed Method

The presented method addresses a critical problem in high-volume production: parts may be dimensionally inaccurate, and their positioning in the fixture cannot be mechanically corrected.

Under these conditions:

Nominal trajectories become inadequate;
Fixturing systems cannot eliminate deviations;
Manual adjustment is slow, costly, and dependent on operator experience.

The proposed system automates this stage by:

Identifying real deviations through 3D scanning;
Correlating them with welding outcomes;
Automatically generating trajectory corrections via the DRL agent.

The industrial impact is substantial: the robot can directly compensate for natural production variations without hardware modifications, which reduces dependence on upstream precision and avoids additional costs related to calibration or fixture redesign.

4.2. Comparison with Traditional Methods

Offline programming based on the nominal CAD model

Major limitations:

Assumes perfect parts;
Does not account for real deviations;
Requires frequent human intervention.

Manual trajectory adjustment

Although effective in some cases:

It is slow;
It introduces operator-to-operator variability;
It does not scale for mass production.

Seam-tracking–based systems

These use cameras or laser sensors to detect the joint in real time.

Limitations:

Not optimal for complex circular geometries;
Data acquisition can be affected by reflections, smoke, and the welding arc;
Real-time correction is limited to only one dimension of the process.

The proposed method does not replace the local real-time control provided by classical seam-tracking systems; it acts in a complementary manner by globally correcting the trajectory prior to welding execution.

In industrial arc welding, real-time sensing and correction are often constrained by harsh optical conditions (arc glare, fumes, spatter, reflections) and by strict requirements for deterministic execution on certified robot controllers. For these reasons, the proposed approach is deliberately designed as an offline trajectory adaptation stage: the full-field 3D scan is acquired prior to welding, the correction is computed once per part, and the robot executes a finalized program without additional online inference. This design choice prioritizes robustness, repeatability, and straightforward integration into existing production cells.

From an automation perspective, the method is complementary to seam-tracking. Seam-tracking can be advantageous for local, short-range deviations detected directly in the joint region during welding, whereas the present method targets global geometric/positioning inconsistencies that originate upstream (part variability, seating errors, fixture wear) and would otherwise propagate along the entire nominal path. In practice, the two strategies can be combined: offline correction can bring the torch into a corrected global alignment window, while optional in-line sensing can refine the seam locally when feasible.

4.3. Comparison with Modern AI-Based Solutions

Recent literature presents several major directions:

Adaptive visual control (CNN + real-time vision)

Advantages:

Detects local deviations.

Limitations:

Requires controlled illumination;
Difficult to use directly in welding environments (intense light, smoke, optical noise);
High costs.

Parameter optimization via reinforcement learning (RL)

Some works adjust:

Current;
Voltage;
Travel speed.

Major limitation:

The trajectory is not adjusted; only process parameters are optimized.

A straightforward alternative to learning-based trajectory adaptation consists of deterministic geometric compensation rules derived from measured deviations (e.g., fixed offsets proportional to radius or length errors). While such rule-based strategies can partially mitigate systematic deviations, industrial observations indicate that their effectiveness rapidly degrades when multiple deviation sources interact simultaneously. In particular, coupled effects between part length, radial deformation, seating asymmetry, and local gap opening lead to nonlinear correction requirements that cannot be robustly captured by a small set of predefined rules.

The DRL-based approach overcomes these limitations by implicitly learning nonlinear correction policies from real production data, directly correlating geometric deviation patterns with welding outcomes. Instead of enforcing a predefined compensation model, the agent adapts its actions based on observed success or failure, allowing it to handle interacting deviations and edge cases that are difficult to formalize analytically. In this sense, the learning agent acts as a data-driven generalization layer over classical geometric compensation, preserving industrial interpretability at the input/output level while avoiding brittle hand-tuned heuristics.
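
To make the contrast with rule-based compensation concrete, the sketch below implements a fixed proportional rule of the kind mentioned above. The gains, the clipping limit, and all names are hypothetical and chosen only for illustration. The point is structural: each offset depends on a single deviation, so the rule returns the same radial correction regardless of the length error and cannot represent coupled effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deviation:
    delta_r_mm: float  # radius deviation from nominal
    delta_l_mm: float  # length deviation from nominal


def rule_based_offset(dev: Deviation, k_r: float = 1.0, k_l: float = 0.5,
                      limit_mm: float = 3.0) -> tuple[float, float]:
    """Fixed proportional rule (hypothetical gains): one deviation per offset."""
    def clip(value: float) -> float:
        return max(-limit_mm, min(limit_mm, value))

    radial = clip(k_r * dev.delta_r_mm)  # ignores the length deviation
    axial = clip(k_l * dev.delta_l_mm)   # ignores the radius deviation
    return radial, axial


nominal_length = rule_based_offset(Deviation(delta_r_mm=0.8, delta_l_mm=0.0))
short_part = rule_based_offset(Deviation(delta_r_mm=0.8, delta_l_mm=-2.0))
# Same radial offset in both cases: the rule has no interaction term.
print(nominal_length[0] == short_part[0])
```

A learned policy takes the full deviation vector as its input, so its radial correction is free to change with the length error, which is the behaviour the paragraph above attributes to the DRL agent.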

Hybrid fuzzy–neural models

Limitations:

Require many rules;
Poor generalization to new geometries.

Justification for the Selection of Soft Actor–Critic

The choice of the Soft Actor–Critic (SAC) algorithm was motivated by the continuous nature of the trajectory correction actions and by the need for stable off-policy learning under limited and noisy industrial datasets.

Compared to deterministic Actor–Critic variants such as DDPG or TD3, SAC introduces entropy regularization, encouraging stochastic exploration and improving robustness to local optima. In industrial welding applications, where geometric deviations may interact nonlinearly and defect feedback is sparse, entropy-regularized policies help avoid premature convergence to suboptimal correction strategies.

On-policy methods such as PPO were not selected due to their higher sample complexity and reduced data efficiency, which are less suitable for industrial contexts where data collection is constrained and safety-critical.

Furthermore, SAC has demonstrated strong empirical stability in continuous control benchmarks and robotic manipulation tasks, making it particularly appropriate for learning bounded translational and rotational trajectory corrections.

The selection of SAC therefore reflects a trade-off between stability, sample efficiency, and safe offline training compatibility, which are critical factors in industrial deployment.
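
For reference, the entropy regularization mentioned above is usually written as the maximum-entropy objective below. This is the general form from the SAC literature, not a formula given in this section. The symbols (policy π, reward r, temperature α, state-action distribution ρ_π) follow common convention, and the temperature value used by the authors is not stated here.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = -\mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

The entropy term rewards policies that keep some spread over corrective actions, which is the mechanism behind the exploration and robustness arguments made above.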

4.4. Distinctive Advantage of the Proposed Method

The method presented in this work represents an integrated approach that combines full 3D scanning of components with a deep reinforcement learning agent for direct correction of the robot trajectory based on real geometric deviations.

The primary methodological contribution of this work lies in coupling full-field industrial metrology with entropy-regularized reinforcement learning trained exclusively on real production data. Unlike simulation-dominated RL studies, the proposed approach operates entirely within a measured industrial state–action–reward loop, enabling direct transferability and eliminating the sim-to-real gap commonly encountered in robotic learning systems.

This enables the robot to:

Compensate for large deviations;
Generate customized trajectories for each part;
Maintain welding quality even in the presence of positioning deviations that would otherwise produce severe defects.
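
The idea of a bounded translational and rotational correction applied per part can be sketched as below. This is a simplified illustration, not the authors' implementation: it assumes a single rigid correction (rotation about the z axis followed by a translation) applied to every waypoint, and the bounds, the function names, and the example waypoints are hypothetical.

```python
import math

# Hypothetical action bounds; the values used in the study are not given here.
MAX_SHIFT_MM = 2.0
MAX_ROT_DEG = 1.5

Waypoint = tuple[float, float, float]


def clamp(value: float, limit: float) -> float:
    """Restrict value to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def correct_trajectory(waypoints: list[Waypoint], dx: float, dy: float,
                       dz: float, rot_z_deg: float) -> list[Waypoint]:
    """Apply a clamped rigid correction: rotate about z, then translate."""
    dx, dy, dz = (clamp(v, MAX_SHIFT_MM) for v in (dx, dy, dz))
    theta = math.radians(clamp(rot_z_deg, MAX_ROT_DEG))
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy, z + dz)
            for x, y, z in waypoints]


# Illustrative nominal path: 8 points on a 60 mm circle (not measured data).
nominal = [(60.0 * math.cos(2 * math.pi * k / 8),
            60.0 * math.sin(2 * math.pi * k / 8), 0.0) for k in range(8)]
corrected = correct_trajectory(nominal, dx=0.4, dy=-0.2, dz=0.1, rot_z_deg=0.5)
print(len(corrected), corrected[0])
```

Clamping the action before it is applied keeps every generated program inside a predefined envelope, whatever the policy outputs.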

4.5. Identified Limitations

Although the method is effective, several inherent limitations exist. It is important to emphasize that the proposed method is not intended to replace all forms of adaptive welding control, nor to eliminate the need for local sensing in applications dominated by fast, high-frequency disturbances. Its primary scope is the compensation of part-level geometric and positioning deviations that are stable over the duration of a welding cycle and originate from upstream manufacturing variability. Within this scope, the offline correction paradigm offers a favorable trade-off between robustness, industrial deployability, and performance gains. Applications characterized by rapidly evolving joint geometry during welding may require hybrid strategies that combine offline trajectory adaptation with in-line sensing and control.

There are additional costs associated with scanning each part. Even though the scanning time is relatively short (5–10 s), full integration requires optimization of the production flow.

There is a need for an initial training dataset.

The DRL agent requires:

Defective parts;
Manual corrections;
Welding quality data.

This imposes an initial data collection phase.

Training time: the full training of the agent requires accelerated simulations or a large number of episodes.

Implementation complexity: complete integration requires:

Calibration of the coordinate system;
Data post-processing;
Export of trajectories to the robot.

4.6. Future Development Directions

Although the experimental validation focuses on a specific industrial component, the proposed framework is not tied to a particular product geometry or welding task. The learning agent operates on abstracted geometric deviation descriptors and trajectory correction actions, which are common across a wide range of robotic joining and processing operations. As a result, the same methodology can be transferred to other axisymmetric or quasi-axisymmetric components, as well as to different robotic processes where nominal trajectories are systematically affected by part-level geometric variability.

The method provides a solid foundation for extension to more advanced industrial applications. Future directions include:

Continual learning: the ability of the agent to continue learning over time based on new parts.

Integration of in-line sensors: combining offline 3D scanning with laser sensors during welding.

Generalization to different geometries: the same agent can be adapted through transfer learning to:

Tanks;
Valves;
Cylindrical housings.

Joint optimization of process parameters and trajectory: an extended DRL model can act on:

Trajectory;
Current;
Voltage;
Travel speed.

Cloud or edge computing implementation: the deviation processing and correction computation could be moved to cloud or edge infrastructure to reduce processing time on the shop floor.

5. Conclusions

This work presented an industrially validated framework for offline, part-specific robotic welding trajectory correction driven by full-field 3D metrology and entropy-regularized reinforcement learning. By combining structured-light scanning with a Soft Actor–Critic (SAC) agent, the system learns nonlinear compensation policies that directly address geometric variability originating upstream in the manufacturing chain.

Unlike conventional CAD-based programming or deterministic offset rules, the proposed approach treats geometric deviations as a measurable state representation and maps them to bounded translational and rotational trajectory corrections through data-driven policy learning. The correction is computed offline and deployed as a finalized robot program, ensuring full compatibility with standard industrial controllers and preserving welding cycle time.

The large-scale industrial validation demonstrates that metrology-informed reinforcement learning can convert dimensional variability from a process limitation into a controllable parameter. The framework does not replace seam-tracking or local sensing but complements them by correcting global positioning and geometric inconsistencies prior to arc initiation.

Beyond the specific catalytic converter case study, the methodology is transferable to other robotic joining operations where nominal trajectories are systematically affected by part-level geometric deviations. The proposed architecture establishes a scalable pathway toward adaptive, data-driven robotic manufacturing systems capable of operating under realistic industrial variability constraints.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app16052510/s1.

Author Contributions

Conceptualization, methodology, and validation: A.C.F., D.C. and I.C.V.; data acquisition and industrial experiments: A.C.F.; formal analysis and interpretation of the results: D.C. and I.C.V.; manuscript writing: A.C.F. and I.C.V.; review and editing: all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request, due to industrial confidentiality restrictions. Representative processed data are included in the Supplementary Materials.

Acknowledgments

The authors would like to acknowledge the technical and administrative support provided during the industrial experiments and data acquisition. During the preparation of this manuscript, the authors used ChatGPT 4.0 (OpenAI) for language refinement, clarity improvement, and editorial assistance. The authors have reviewed, edited, and validated all generated content and take full responsibility for the accuracy, originality, and integrity of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	artificial intelligence
CAD	computer-aided design
CNN	convolutional neural network
Cpk	process capability index
DRL	deep reinforcement learning
MLP	multi-layer perceptron
MDP	Markov Decision Process
PLY/STL	3D model file formats
RPS	Reference Point System
SAC	Soft Actor–Critic
SPC	statistical process control
Δ	deviation from the nominal value


Figure 1. Schematic representation of the 3D scanning setup based on the ATOS Compact Scan system. The configuration includes a stereo camera pair and a structured-light projector used to acquire high-resolution surface data of the component mounted on a rotary table. The acquired scans are processed using GOM Inspect software (V8.2 build 8.2.1) for subsequent geometric deviation analysis.

Figure 2. Three-dimensional map of geometric deviations obtained by aligning the scanned point cloud to the nominal CAD model. Colors indicate the sign of the deviation relative to the nominal geometry: red regions correspond to oversizing (excess material), while blue regions correspond to under-sizing (missing material). Green regions indicate minimal deviations, close to the nominal geometry.

Figure 3. Logical architecture of the Soft Actor–Critic (SAC) agent used for welding trajectory correction.

Figure 4. Conceptual illustration of welding trajectory correction. The nominal CAD-based trajectory (blue) is adjusted based on measured local geometric deviations obtained from 3D scanning, producing a DRL-corrected welding trajectory (red) that compensates for dimensional and positioning errors.

Figure 5. Histogram of wall radius deviation (ΔR) measured after welding for the analyzed production batch. The distribution illustrates the statistical variability of the process and serves as input for evaluating weld conformity. Interpretation: more than 50% of the parts are out of tolerance, with no possibility of mechanical correction.

Figure 6. Histogram of length deviation (ΔL) measured for the analyzed production batch. The distribution summarizes dimensional variability and supports the statistical evaluation of component conformity prior to trajectory correction.

Figure 7. Process capability comparison before and after implementation of the proposed DRL-based trajectory compensation framework. The sigma level increased from 3.145 (5 defects/100 opportunities, n = 100 units, 1 opportunity per unit) to 5.219 (1 defect/10,000 opportunities, n = 5000 units, 2 opportunities per unit). Sigma values were computed from defects per opportunity (DPO) using the standard Six Sigma conversion with the conventional 1.5σ shift.

Figure 8. Representative examples of welding defects observed during inspection of the analyzed production batch: (a) conforming weld bead with uniform geometry, used as a reference, (b) lack of penetration in the joint area, and (c) geometric irregularities and misalignment along the weld seam. These examples illustrate typical defect patterns addressed by the proposed adaptive trajectory correction method.

Figure 9. Representative example of welding trajectory correction obtained using the proposed DRL-based method. The nominal CAD trajectory (blue) is locally adjusted to produce a corrected trajectory (red) that compensates for the measured geometric deviations of the component.

Table 1. Hyperparameters used for SAC-based trajectory compensation.

Parameter	Value
Algorithm	Soft Actor–Critic (SAC)
Discount factor γ	0.99
Target smoothing τ	0.005
Replay buffer size	1,000,000
Batch size	256
Optimizer	Adam
Learning rate	3 × 10⁻⁴
Hidden layers	3
Units per layer	256
Activation	ReLU
Training episodes	120,000
Warm-up steps	10,000
Hardware (training)	NVIDIA RTX A5000/RTX 4090
Inference latency	<40 ms
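
For readers who want to reuse these settings, the values in Table 1 can be collected in a small configuration object. This is a sketch only: the class and field names are illustrative and are not tied to any particular reinforcement learning library. Hyperparameters that Table 1 does not list (for example the entropy temperature) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SACConfig:
    """Hyperparameters transcribed from Table 1 (field names are illustrative)."""
    gamma: float = 0.99               # discount factor
    tau: float = 0.005                # target smoothing coefficient
    replay_buffer_size: int = 1_000_000
    batch_size: int = 256
    optimizer: str = "adam"
    learning_rate: float = 3e-4
    hidden_layers: int = 3
    units_per_layer: int = 256
    activation: str = "relu"
    training_episodes: int = 120_000
    warmup_steps: int = 10_000

    def hidden_sizes(self) -> tuple[int, ...]:
        """Layer widths of the MLP, e.g. (256, 256, 256)."""
        return (self.units_per_layer,) * self.hidden_layers


config = SACConfig()
print(config.hidden_sizes(), config.learning_rate)
```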

Table 2. Production quality indicators before and after trajectory compensation.

Indicator	Before Implementation	After Implementation
Units (n)	100	5000
Defects (d)	5	1
Opportunities/unit	1	2
Defect rate (per unit)	5%	0.02%
Process Sigma (σ)	3.145	5.219

