Review on the Application of the Attention Mechanism in Sensing Information Processing for Dynamic Welding Processes

Abstract: Arc welding is the most common method in traditional welding and accounts for the majority of total welding production. Traditional manual welding and manual teaching welding suffer from high labor costs and limited efficiency in mass production. With the advancement in technology, intelligent welding is expected to become a solution to this problem. To achieve an intelligent welding process, modern sensing technology can be employed to simulate the welder's sensory perception and cognitive abilities. Recent studies have advanced the application of sensing technologies, leading to progress in the intelligent welding process. The review is divided into two parts. First, the theory and applications of various sensing technologies (visual, sound, arc, spectral signal, etc.) are summarized. Then, combined with the generalization of neural networks and attention mechanisms, the development trends in welding sensing information processing and modeling technology are discussed. Based on the existing research results, the feasibility, advantages, and development direction of attention mechanisms in the welding field are analyzed. Finally, a brief conclusion and remarks are presented.


Introduction
Arc welding is the most common method in traditional welding. It accounts for the majority of total welding production and is widely applied in automobile production, shipbuilding, and other fields [1]. Although most precision welding still requires manual labor, traditional manual welding and manual teaching welding suffer from high labor costs and limited efficiency in mass production. Intelligent welding technology has been proposed to solve this problem and has become an important research direction in the development of welding technology. Figure 1 shows the stages of development that arc welding has undergone. In the initial stage, welding was carried out manually, so the welding quality relied heavily on the theoretical expertise and practical experience of the technicians. Then, the advancement of robot technology led to the gradual substitution of manual welding with industrial welding robots, representing the second stage, automated welding. Industrial robot welding enables efficient welding in challenging areas that are inaccessible to human welders [2]. However, the majority of welding robots continue to operate using manual teaching methods. In the actual welding process, equipment errors, preprocessing errors, and environmental factors must be considered, and weld identification and defect detection still rely on manual examination and training [3].
To address these problems, the application of different sensing technologies has made intelligent robotic welding possible [4]. Sensing technologies have been shown to be remarkably effective in recognizing welding environments, monitoring the plasma welding process, and detecting weld defects [5]. The value of sensing technology for intelligent welding depends on its ability to simulate the welder's sensory and cognitive capabilities at an advanced level. Welders acquire information from the arc and the environment by paying close attention to the workpiece and the weld pool throughout the welding process. They then combine this information with existing experience and theory to identify defects in the weld seam. Finally, they evaluate the welding quality. The investigation of welding sensing technologies can be summarized into three aspects: (1) employing different sensors to acquire a substantial quantity of reliable welding process data; (2) employing efficient signal processing algorithms to obtain suitable signal features; (3) establishing models of the welding tasks and finding a dependable correlation between signal features and sensing targets.
Sensing technology is the basis of intelligent arc welding. Currently, many sensing technologies are employed for various welding targets. Figure 2 summarizes the monitoring targets of regularly employed sensing technologies, such as visual, arc, acoustic, spectral, and other nondestructive testing (NDT) sensing signals [6]. Welding dynamics and welding parameters are strongly connected, and welding is a nonlinear, uncertain, and time-varying process. As a result, it is challenging to extract and model the characteristic factors throughout the welding process. For instance, in real-time weld pool monitoring, despite extensive efforts over many years, extracting meaningful features from the weld pool and accurately modeling its dynamics remains a challenging problem [7]. The study of intelligent robotic arc welding based on sensing technology is a multidisciplinary research area. It covers fields such as sensor technologies, signal processing, model training, material science, and automation. Sensor technology still encounters significant challenges in intelligent robotic welding. In recent years, with the goals of an intelligent welding system (IWS) and adaptive intelligent welding manufacturing, some reviews have introduced sensing technologies and their applications in robotic arc welding [8,9]. Lu [10] focused on weld seam recognition and positioning, and introduced the development of some advanced sensors. Zhang [11] provided a comprehensive overview of the most recent sensing technologies used in the quality monitoring of arc welding. Previous reviews mainly summarized the characteristics of different sensing technologies and the differences in their application fields, and rarely classified and summarized them from the perspective of signal processing and feature extraction.
With the rapid development in information processing technology, some new algorithms and concepts have been introduced into the field of welding, of which machine learning is the most representative. Recently, neural network structures and deep learning methods have made great progress in welding defect detection and process monitoring. However, there are many variants of neural network structures, and they apply to different data structures, so they need to be selected and adjusted according to the objective. Therefore, a review of the network structures used in the welding field will help in understanding the various sensing information processing methods and their differences, which is conducive to the development of multi-information fusion technology and the realization of the intelligent welding process. In addition, the attention mechanism, which has become popular in recent years, has gradually been introduced into the field of welding sensing information processing. A better understanding of the different forms of attention mechanisms, combined with appropriate neural network structures, is important for the development of processing technology. Therefore, from the perspective of signal processing methods and neural network structures, this paper compares and summarizes the sensing signal processing methods and attention-neural network structures for several typical welding monitoring targets. This paper mainly consists of two parts. Section 2 introduces some typical sensing technologies, summarizes the current applications of each sensing technology, and discusses in detail weld path identification, weld tracking, molten pool monitoring, welding quality diagnosis, etc. The second part (Section 3) introduces deep neural networks and attention mechanisms for the processing of sensing information, including the neural network structures and deep learning methods used in feature recognition and model training for intelligent welding, together with the applications and advantages of the attention mechanism in sensing information processing. Finally, the paper is summarized and an outlook on future research directions in intelligent robotic arc welding is given.

Vision Sensing
Vision sensing technology offers simple equipment setup, mature supporting algorithms, and good stability. Therefore, visual sensors are widely used in fields such as weld recognition, weld tracking, molten pool monitoring, and weld quality monitoring [12,13]. As shown in Figure 3, vision sensing systems can be divided into active and passive systems according to the properties of the light source. Active vision sensing is usually used to realize weld recognition and dynamic tracking during robot welding. It uses a line laser light source to illuminate the weld, obtains the coordinates of the weld center point through an image processing algorithm, and then uses the calibration relationship of the sensor to guide or correct the position of the welding gun [14]. Passive vision sensing does not use additional light sources. Its advantage is the large amount of image information it provides. Therefore, it is widely used in fields such as dynamic monitoring of molten pool morphology, penetration prediction, and welding seam identification and guidance [15]. In current research, dimensionality reduction and feature extraction from visual information are mainly based on image processing and image recognition techniques, including region of interest (ROI) extraction, noise reduction, laser stripe extraction, and melt pool contour extraction.
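As an illustration of the laser stripe extraction step used in active vision sensing, the following is a minimal sketch of a gray-level centroid method, which estimates the sub-pixel row of the stripe in each image column. It is not taken from any of the cited works; the function name `stripe_centers`, the list-of-rows image representation, and the intensity threshold are assumptions made for illustration only.

```python
def stripe_centers(image, threshold=50):
    """Estimate the laser stripe row in every column by gray-level centroid.

    image: list of rows, each row a list of gray values (hypothetical format).
    threshold: pixels darker than this are ignored as background (assumed value).
    Returns one sub-pixel row coordinate per column, or None if no stripe pixel
    was found in that column.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    centers = []
    for c in range(n_cols):
        weight_sum = 0.0   # sum of gray values above the threshold
        moment = 0.0       # sum of row index times gray value
        for r in range(n_rows):
            value = image[r][c]
            if value >= threshold:
                weight_sum += value
                moment += r * value
        centers.append(moment / weight_sum if weight_sum > 0 else None)
    return centers
```

In a real system, the resulting stripe profile would still have to be mapped to world coordinates through the sensor calibration before it can guide the welding gun.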

Acoustic Sensing
Experienced welders can judge the welding condition from changes in sound during the welding process. Several studies have found that acoustic data from the arc are essential in determining the quality of welding [16]. The frequency range of 20 Hz to 20 kHz, which falls inside the audible band, is frequently employed for monitoring robotic arc welding. Figure 4 shows the mechanism of sound sensing technology. The source of sound in the arc welding process is the change of energy in the arc column, which is heard as arc burning, droplet transfer, and spatter. Arc sound signals can reveal even small changes in the metal transfer condition and variations in welding conditions. Acoustic sensing has therefore developed rapidly as an essential tool for monitoring welding processes in real time. Research has found that sound can be used to accurately monitor and predict the weld pool [17], arc length [18], and welding defects [19]. In current research, dimensionality reduction and feature extraction from acoustic information are mainly based on the Fourier transform, combined with time-frequency analysis of the welding arc sound signal, to extract features of interest that are highly correlated with the target.
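The Fourier-based feature extraction described above can be sketched as follows: the power spectrum of one frame of arc sound is computed with a direct discrete Fourier transform, and the energy inside a chosen frequency band is used as a feature. This is only an illustrative sketch; the function names, the frame length, and the band limits are assumptions, and a practical implementation would use a fast Fourier transform with windowing over many overlapping frames.

```python
import cmath
import math


def power_spectrum(frame):
    """Return |X[k]|^2 for k = 0..N/2 using a direct (slow) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_energy(frame, fs, f_lo, f_hi):
    """Sum the spectral power of all bins whose frequency lies in [f_lo, f_hi].

    fs is the sampling rate in Hz; bin k corresponds to k * fs / N Hz.
    """
    n = len(frame)
    spectrum = power_spectrum(frame)
    return sum(p for k, p in enumerate(spectrum) if f_lo <= k * fs / n <= f_hi)
```

Dividing the energy of each band by the total energy gives a normalized feature vector that can be tracked over time or passed to a classifier.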

Arc Sensing
The welding current and arc voltage signals are important parameters in the arc welding process. The intensity of the welding current signal reflects the heat input during the welding process, which helps researchers monitor the penetration status of the molten pool. The arc voltage signal reflects the arc length and stability during the welding process and is connected with some specific welding defects, such as arc breakage and welding leakage [20]. Researchers often use the arc voltage to control welding stability, which includes two aspects: arc length monitoring and arc correction. By modeling the actual welding arc length from the voltage signals, the current welding seam position and arc length can be detected automatically and controlled in real time [21]. For welding current signals, researchers usually extract the amplitude level or peak intensity to monitor the heat input level of the arc, thereby monitoring the penetration state of the welding process and welding defects.
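The amplitude-level and peak features mentioned above, together with a simple heat input estimate, can be sketched as follows. The heat input per unit length is computed with the widely used relation Q = η·U·I/v. The function name `arc_features`, the default arc efficiency of 0.8, and the units are assumptions for illustration, not values from the cited studies.

```python
import math


def arc_features(current_a, voltage_v, travel_speed_mm_s, efficiency=0.8):
    """Compute simple electrical features from synchronized samples.

    current_a, voltage_v: equally long lists of current (A) and voltage (V).
    travel_speed_mm_s: welding travel speed in mm/s.
    efficiency: assumed arc efficiency (dimensionless, process dependent).
    """
    n = len(current_a)
    mean_current = sum(current_a) / n
    rms_current = math.sqrt(sum(i * i for i in current_a) / n)
    peak_current = max(current_a)
    # Mean electrical power from instantaneous products U * I.
    mean_power_w = sum(u * i for u, i in zip(voltage_v, current_a)) / n
    # Heat input per unit weld length in J/mm.
    heat_input_j_mm = efficiency * mean_power_w / travel_speed_mm_s
    return {
        "mean_current": mean_current,
        "rms_current": rms_current,
        "peak_current": peak_current,
        "heat_input_j_mm": heat_input_j_mm,
    }
```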

Spectral Sensing
The arc spectrum is noncontact sensing information obtained from arc emission, and it shows good sensitivity to dynamic welding processes and weld defects [22,23]. The arc spectrum is rich in welding information. Compared with other sensing technologies, its advantage lies in its ability to monitor changes in the elements in the arc atmosphere. Figure 5 shows the arc spectrum mechanism. The line spectra of metallic and nonmetallic elements, including aluminum, hydrogen, and argon, originate from the welding base metal, welding wire, and shielding gas. Variations in the concentration of these elements can indicate the stability of the welding process and the diffusion of elements, both of which are connected to the formation of porosity defects. Recently, spectral signals have been extensively employed for monitoring welding defects such as porosity [24] and burn-through [25]. Feature extraction from spectral signals must first combine the welding process and the defect formation mechanism to determine the elements with high correlation, and then combine the spectral emission mechanism and theoretical calculations to reduce the information dimension. The noise and continuous background spectrum interference in the original arc spectrum are also important research topics in this field.

Other Sensing Technologies
In addition to the abovementioned technologies, there are other sensing methods, such as infrared, ultrasonic, and radiographic sensing. Infrared sensors and infrared cameras can record temperature data. One major benefit of these sensors is their resistance to interference from welding noise, making them suitable for monitoring molten pools [26] and detecting defects such as undercuts, humps, and lack of fusion [27]. Ultrasonic sensing [28] and radiographic sensing [29] are typical nondestructive testing methods and are currently widely used in weld quality assessment after the welding process.

Multisensor Fusion
Each type of sensing information introduced in the previous sections has its own advantages and disadvantages in different fields. Table 1 summarizes the signal characteristics and main application fields of each sensor. According to the real-time capability of the signal, the technologies can be divided into dynamic process monitoring and post-weld detection. Dynamic process monitoring technologies are the focus of research on intelligent welding, while post-weld inspection technology is the basis for scientific research and experimental proof.
It can be seen that it is difficult to cover all types of welding defects with only a single source of sensing information. Therefore, in order to maximize the advantages of different sensors and achieve better monitoring, multi-information fusion technology is generally used to coordinate multiple sensing signals to monitor welding quality and achieve intelligent welding. The performance of multisensor fusion is influenced by various factors, including the dimensionality of the features and the fusion technique, yet it can accomplish functions that are not possible with typical single-source sensing. In recent years, the advancement of neural network technology and the introduction of the attention mechanism have also made the fusion of different sensing information more flexible.
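To give a concrete picture of how an attention mechanism can make feature-level fusion flexible, the following minimal sketch weights the feature vectors of several sensors with softmax-normalized relevance scores and sums them. It is a generic illustration rather than the method of any cited work. The names `softmax` and `fuse` are hypothetical, the feature vectors are assumed to have already been projected to a common dimension, and in practice the scores would be produced by a trained network instead of being supplied by hand.

```python
import math


def softmax(scores):
    """Convert arbitrary real scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, scores):
    """Attention-weighted sum of per-sensor feature vectors.

    features: one feature vector per sensor, all of the same length (assumed).
    scores: one relevance score per sensor (assumed to come from a model).
    """
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
```

With equal scores the fusion reduces to a plain average. A larger score for one sensor shifts the fused vector toward that sensor's features, which is how the relative trust in each signal can change with the welding condition.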

Applications of Welding Sensing Information
In recent years, research on various sensing technologies has greatly promoted the development and progress of intelligent welding technology. The current research areas mainly include weld path recognition, weld seam tracking, weld pool monitoring, and weld quality diagnosis. This section introduces and summarizes the applications in four parts.

Weld Path Recognition
Weld path recognition is an essential initial step of robot welding. Initial point positioning and spatial weld identification are its two basic research directions. Vision sensing is the main sensing technology used in this application field.
Start welding point positioning can be divided into passive visual detection and active visual detection according to the light source. For an active vision system, start welding point positioning is usually based on key point detection algorithms, geometric correlation, or visual patterns. Various techniques are employed for detecting edges and extracting the foreground in weld extraction, such as the Canny operator, the Sobel operator, and threshold segmentation. Chen et al. [30] proposed a two-step technique for locating the start welding position that employs the intersection approach and local corner detection. The effectiveness of this method in accurately determining the 3D position of both straight and curved welds has been demonstrated. Wei et al. [31] developed a visual system consisting of a "single camera and double positions". The system employed image segmentation and edge extraction techniques to extract initial welding locations. This was achieved by applying Harris corner detection and the grayscale scanning method (GSCM). The extraction process ensures an error of less than 1 mm. Chen et al. [32] employed template matching and polynomial fitting techniques to locate the start welding position with subpixel accuracy and high processing speed, allowing start welding position guidance to be realized in only 2.04 s. The active vision system's excellent adaptability in difficult situations enhances its stability in industrial manufacturing. Fan et al. [33] developed an innovative visual sensor with an additional LED light for microgap welds. The locating error was confirmed to be below 0.4 mm using random sample consensus fitting and feature point extraction methods. In these studies, researchers innovated in both equipment and algorithms, effectively improving the positioning accuracy of the initial welding point.
The purpose of spatial weld extraction is to locate the weld in space at high speed while quickly extracting the entire weld seam. The crucial technology for this is stereovision. Peng et al. [34] developed a highly effective method for generating robotic welding paths. A new descriptor for surface deformation, called the local and global groove feature histogram, was introduced in order to precisely locate the weld seam. This method was shown to attain a high level of precision in locating welds. Kim et al. [35] introduced a depth camera-based multiweld extraction technique. It achieves efficient spatial extraction of welds and resolves the issue of weld occlusion in industrial robotic welding. This is accomplished by using 3D point cloud registration and extracting the weld profile from many different angles. Hairol et al. [36] proposed a method to automatically detect and identify weld paths with different path shapes. The Sobel operator was used to extract the weld edges, and welding seams could be extracted more completely after binarization, image difference, and morphological processing.
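The combination of Sobel edge extraction and binarization mentioned above can be sketched in a few lines. This is a generic illustration of the two operations, not a reproduction of any cited pipeline. The function name `sobel_edges`, the list-of-rows image format, and the threshold value are assumptions, and border pixels are simply left as non-edges.

```python
import math


def sobel_edges(image, threshold):
    """Return a binary edge map (1 = edge) from the Sobel gradient magnitude.

    image: list of rows of gray values (hypothetical format).
    threshold: gradient magnitude at or above which a pixel is marked as edge.
    """
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical gradient: lower row minus upper row.
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            if math.hypot(gx, gy) >= threshold:
                edges[r][c] = 1
    return edges
```

On real weld images the binary map is usually cleaned further, for example with morphological processing, before the seam line is fitted.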
In summary, active and passive vision each have advantages and disadvantages under different weld path recognition conditions. In practical applications, the choice must weigh many aspects, such as welding conditions, welding accuracy, ease of equipment installation, and cost.

Weld Seam Tracking
Weld seam tracking is the basis for realizing the robotic intelligent welding process. Its main purpose is to monitor the relative position of the welding gun and the weld seam in real time to ensure the accuracy of the welding trajectory. According to the technology used, research on weld seam tracking can be divided into two parts: welding deviation monitoring based on sensing signals and weld seam recognition based on the active vision method.
The aim of welding deviation monitoring is to determine welding errors by analyzing the sensing information obtained throughout welding. In recent years, some studies have shown that arc sensing [37] and acoustic sensing [38] signals are correlated with the arc length during the welding process. In terms of arc sensing, Liu et al. [39] introduced a harmonic extraction technique, which is used in the frequency domain to retain the crucial arc signal information after filtering. Daehyun et al. [40] introduced a new monitoring technique accompanied by an automated weaving width control method. To minimize the impact of interference, voltage measurements are taken at the edges of the weaving along with the mean voltage throughout the weaving procedure. Jeong et al. [41] designed a high-speed rotating arc sensor that can accurately position the welding torch. It calculates the three-dimensional displacement of the torch by analyzing the waveform of the average current at four specific locations throughout one rotation cycle. It extends arc sensing-based seam tracking to applications including lap welding, fillet welding, and sheet welding. For acoustic sensing, Lan et al. [42] proved that the distribution of acoustic signal frequencies is correlated with the distance between the welding torch and the side wall. This finding supports the possibility of using acoustic signals for monitoring welding deviations. Lv et al. [43] designed a dual-microphone array that can monitor the welding path with high precision. The structure and the prediction result are shown in Figure 6. The prediction model based on the dual-microphone array showed improved monitoring performance compared to the single-microphone approach.
To monitor the welding path, the active vision technique for weld seam tracking can directly identify the center of the weld seam. The laser vision system and the laser triangulation method are its main tools. Processing time, accuracy, and anti-interference capability are the key requirements in this field. Xiao et al. [44] introduced the slope method and adapted the Snake model for efficient laser stripe extraction to recognize the position of welds. As shown in Figure 7, special cases do not affect the result of the extraction algorithm. He et al. [45] introduced a Bayesian automatic decision-making technique for identifying feature areas. This method relies on the weld bead outline acquired through the use of the scale-invariant feature transform descriptor. Zou et al. [46] combined kernel correlation filtering with histogram-of-oriented-gradient features for stable tracking under significant noise interference, which showed clear efficiency gains over traditional techniques.
In summary, weld seam tracking technology has made great progress in algorithms and sensing methods. Active visual sensing is commonly applied in weld seam tracking because of its good anti-interference ability and positioning accuracy. However, with deeper research on sensing technology, a variety of sensing methods have been applied to weld seam tracking, of which sound sensing is a typical example. The expansion of sensing methods can make the equipment more flexible, which is of great significance to the development of intelligent welding.

Weld Pool Monitoring
A stable weld pool is essential in the arc welding process. An unstable pool can lead to significant welding quality issues, such as lack of penetration and lack of fusion. Therefore, monitoring the weld pool in real time is important for intelligent arc welding. According to the type and dimension of the molten pool information, molten pool monitoring during the welding process can be divided into three categories: thermal imaging monitoring, molten pool morphology monitoring, and three-dimensional reconstruction of the molten pool.
The weld pool geometry and temperature can be monitored with thermal images. Their advantage is that the temperature field distribution in the molten pool can be obtained directly and monitored accurately during the welding process. Subashini et al. [47] proposed a new algorithm based on k-means to detect the geometric properties of the weld pool from infrared images. An adaptive neuro-fuzzy inference system (ANFIS) model then took the geometric features, peak temperature, and Gaussian thermal area as input to calculate the penetration status and bead width. Vasudevan et al. [48] used infrared images to construct a real-time weld pool monitoring system. The effectiveness of the infrared imaging system for accurate monitoring of welding penetration was demonstrated by experiments. Chandrasekhar et al. [49] used cellular automata for feature extraction from infrared images of the weld pool. Their method successfully predicted the weld bead width by combining the length and width of the weld pool with data from arc sensing.
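The idea of using k-means to separate the weld pool from its surroundings in an infrared image can be illustrated with a minimal two-cluster sketch on pixel intensities, followed by a bounding-box measurement of the pool size. This is a simplified illustration and not the algorithm of [47]. The function names, the choice of two clusters initialized at the minimum and maximum intensity, and the pixel-based size measures are assumptions.

```python
def segment_hot_region(image, max_iter=50):
    """Two-cluster k-means on pixel intensities; returns a mask (1 = hot)."""
    pixels = [v for row in image for v in row]
    cold, hot = float(min(pixels)), float(max(pixels))
    if cold == hot:  # uniform image: nothing to segment
        return [[0] * len(row) for row in image]
    for _ in range(max_iter):
        hot_vals = [v for v in pixels if abs(v - hot) < abs(v - cold)]
        cold_vals = [v for v in pixels if abs(v - hot) >= abs(v - cold)]
        new_hot = sum(hot_vals) / len(hot_vals)
        new_cold = sum(cold_vals) / len(cold_vals)
        if new_hot == hot and new_cold == cold:
            break  # centroids stopped moving
        hot, cold = new_hot, new_cold
    return [[1 if abs(v - hot) < abs(v - cold) else 0 for v in row]
            for row in image]


def pool_extent(mask):
    """Return (rows spanned, columns spanned) by the hot region, in pixels."""
    hot_rows = [r for r, row in enumerate(mask) if any(row)]
    hot_cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not hot_rows:
        return (0, 0)
    return (hot_rows[-1] - hot_rows[0] + 1, hot_cols[-1] - hot_cols[0] + 1)
```

Geometric measures of this kind, together with temperature statistics, are the sort of inputs that a downstream model can use for penetration or bead width estimation.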
The monitoring of the molten pool morphology mainly relies on passive visual sensing. The detected information includes the molten pool morphology, backside molten width, and molten pool stability, which are used to monitor the penetration status of the weld and some common defects (welding leaks, wrong edges, lack of fusion, etc.). Wu et al. [50] proposed a passive visual sensing system to monitor the penetration status during variable polarity plasma welding. As shown in Figure 8, Fan et al. [51] constructed a multioptical path vision-sensing monitoring system to achieve synchronous monitoring of the front and back molten pool shapes during the aluminum alloy welding process. More detailed pool information can be obtained by 3D reconstruction. The 3D reconstruction of the weld pool surface is the most common application, and it relies on stereovision systems and the related disparity calculation. An innovative mathematical model for describing the weld pool surface was proposed by Zhang et al. [52]. In a testing experiment, the 3D geometry of the weld pool was accurately reconstructed using a new image processing algorithm with an error of only 0.11 mm. Xiong et al. [53] developed a monocular camera-based biprism stereovision system that effectively decreased hardware cost and sensor size. By using the triangle measuring method, the tail of the weld pool was selected as the ROI for calculating the width of the molten pool. The experiments showed an average error of less than 2.18%, and the new method can lighten the computational load. By employing a 3D vision system with laser dot matrix reflection equipment, Liu et al. [54] reconstructed the surface of the weld pool in real time. The geometric properties of the weld pool surface were combined with the ANFIS model for high-precision penetration prediction.
In summary, infrared sensing and passive vision sensing are the primary techniques for pool monitoring during welding. The geometric properties and temperature distribution of the weld pool can be obtained efficiently by infrared sensors, which is of great significance for molten pool monitoring. Passive vision sensing can provide reliable information on the geometry of the molten pool and the welding arc. Efficient and accurate extraction of weld pool information is the basis for realizing intelligent welding.

Real-Time Weld Quality Monitoring
Real-time weld quality monitoring is the real-time identification and prediction of welding defects based on sensing information. This technology is an important part of ensuring the quality of welds produced by intelligent robotic welding. In recent years, researchers have carried out many studies in this field and have made great progress in the accuracy and efficiency of welding defect identification. Early research mainly simulated welder behavior and conducted feature extraction and analysis based on the correlation between vision, sound, and arc signals and various defects. However, this work mainly targeted external defects, such as incomplete penetration, burn-through, undercut, and wrong edge. With the advancement in sensing technology and the rise in weld quality requirements, some sensing information that exceeds the human sensory level was introduced into the weld sensing system. As the scope of defect detection became more comprehensive, the real-time monitoring of internal defects such as pores and slag inclusions became possible. In addition, the development of machine learning and neural network technology has made feature extraction from sensing information more flexible, and it is no longer limited to manually defined features. The accuracy and real-time performance of defect prediction have been further improved, making weld quality diagnosis more reliable. The most common sensing technologies in this field include visual, sound, arc, and spectral sensing.
As mentioned above, visual sensing technology can collect rich weld pool information in real time. Through the geometric features and stability of the molten pool, the penetration status can be monitored and some surface defects of the weld can be predicted. Chen et al. [55] used a multiangle visual system to extract the visual features that contribute to penetration prediction; the prediction result is shown in Figure 9. Feng et al. [56] proposed a deep welding framework for penetration classification based on the combination of active vision and passive vision. Li et al. [57] proposed a real-time sensing system based on photoelectric conversion for a pulsed gas tungsten arc welding process. By using an illumination laser and a photoelectric conversion chamber, the weld pool oscillation frequency can be measured in real time with a high sampling rate, high processing speed, and a simple device, which meets the requirements of real-time measurement in the actual welding process. Shi et al. [58] used a low-power five-line laser pattern and a high-speed camera to observe and analyze the change in the captured laser images during the base welding current period. They developed a novel image processing algorithm for extracting the pool oscillation frequency, which demonstrated good robustness and effectiveness and could be used to monitor and control weld joint penetration in real time.
Through feature extraction and data analysis of the welding sound information in the time-frequency domain, rich welding process information can be obtained for online diagnosis of various welding defects. Chen et al. [59] explained the mechanism of the arc sound signal by introducing an arc sound excitation model. A DLSTM network employing a nonlinear arc sound model was put forward to achieve accurate penetration prediction. Zhao et al. [60] studied the features of arc sound, and put forward and validated an adaptive extraction model. Through Grad CAM and guided Grad CAM, the regions of interest for arc sound under different penetration states were visualized (Figure 10). Zhang et al. [61] presented a classification model called SVM-GSCV, which combines the support vector machine with grid search optimization and cross-validation, to identify underpenetration, normal penetration, and burn-through from the arc sound signal. The verification results for various welding currents and plate shapes show that the model is effective and robust, with test accuracy ranging from 81.52% to 98.46% using wavelet-based filtering PCA and SVM-GSCV.
The spectral signal can be analyzed through spectral lines to obtain the changes in welding heat input and arc element content. Combined with the formation mechanism of various internal defects, the time-domain characteristics of each element's spectral line in the arc emission spectrum are an important basis for detecting difficult-to-predict internal defects such as pores and slag inclusions. As shown in Figure 11, Zhang et al. [66] used eight sensitive spectral lines to calculate the variance, RMS, and kurtosis features. The wavelet packet transform was employed to reduce the negative effects of the current pulse. The experimental results showed that oxidation and crater defects could be effectively monitored. A porosity detection technique that relies on the HI spectrum from the first principal component was also suggested; the extracted features can classify different levels of porosity through analysis of the statistics and the feature curve of the absolute value of the coefficient [67]. Huang et al. [68] investigated the correlation between arc spectrum and porosity based on a principal component analysis. To obtain the mapping vectors that describe the intrinsic structure of the spectra and porosity flaws, the intensity ratio was calculated and verified. Xu et al. [69] proposed a real-time prediction model for aluminum alloy porosity in gas tungsten arc welding (GTAW), based on an improved gradient boosting decision tree. The results of the testing experiment are shown in Figure 12; the prediction model shows high robustness and accuracy for the internal porosity of the weld seam.
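As a rough illustration of the statistical features mentioned above for spectral lines (variance, RMS, and kurtosis), the following minimal Python sketch computes them for a single intensity series. The function name `spectral_line_features` and the sample window are hypothetical, and the population (not sample) definitions of the moments are an assumption; the cited work may use different conventions.

```python
import math


def spectral_line_features(intensity):
    """Return variance, RMS, and kurtosis of one spectral-line intensity series."""
    n = len(intensity)
    mean = sum(intensity) / n
    # Population variance (second central moment); an assumed convention.
    variance = sum((x - mean) ** 2 for x in intensity) / n
    # Root mean square of the raw signal.
    rms = math.sqrt(sum(x * x for x in intensity) / n)
    # Kurtosis as the fourth standardized moment (3.0 for a Gaussian signal).
    fourth = sum((x - mean) ** 4 for x in intensity) / n
    kurtosis = fourth / variance ** 2 if variance > 0 else float("nan")
    return {"variance": variance, "rms": rms, "kurtosis": kurtosis}


# Hypothetical intensity samples of one spectral line inside a time window.
window = [1.0, 2.0, 3.0, 4.0, 5.0]
features = spectral_line_features(window)
```

In practice such features would be computed per spectral line over a sliding window and then passed to a classifier or a threshold rule.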
In addition, the development of numerical simulation technology is also an important part of the field of welding quality monitoring. Combined with welding process information, numerical simulation can accurately predict the dynamic welding process, including molten pool oscillation and penetration state, and can thus provide a theoretical basis for the application of sensing information in the field of intelligent welding. Ebrahimi et al. [70] proposed a simulation-based approach to study and characterize molten metal melt-pool oscillatory behavior during arc welding. It can predict complex molten metal flow in melt pools and the associated melt-pool surface oscillations during both steady-current and pulsed-current gas tungsten arc welding (GTAW). The results confirm that the frequency of oscillations for a fully penetrated melt pool is lower than that for a partially penetrated melt pool, with an abrupt change from partial to full penetration. For fully penetrated melt pools, the study in [71] employed high-fidelity numerical simulations to describe the effects of welding position, sulfur concentration (60-300 ppm), and travel speed (1.25-5 mm s⁻¹) on molten-metal flow dynamics. The robustness and accuracy of the proposed approach were demonstrated by comparison with the available analytical and experimental datasets. Welding sensing technology based on numerical simulation is convincing in theory. In research concerning welding quality monitoring, it is necessary to flexibly apply various modeling methods to make full use of their respective advantages and improve accuracy and robustness.

Deep Neural Network and Attention Mechanism for Welding Dynamical Process
Sensing technology is the basis for realizing the intelligent welding process. These technologies provide effective information for the initial positioning, path planning, weld tracking, and defect detection in the robot welding process. However, there are differences between sensing information, and the original signals of various sensors are characterized by noise interference and high information redundancy. Therefore, the pre-processing and feature extraction of sensing signals, together with model establishment for different targets, are key issues that need to be researched and solved. Traditional feature extraction and modeling methods are mostly based on theory and numerical calculations to establish mathematical models of welding information and targets. However, the welding process is highly nonlinear and time-varying, and its parameter distribution contains a large number of uncertain factors. Obtaining accurate mathematical models from high-dimensional sensing information with traditional methods is often extremely difficult or infeasible. The rapid development of neural network technology has provided new ideas for the processing and modeling of sensing information. The feature extraction process has been greatly optimized, and model performance has reached a new height. In addition, new concepts about attention mechanisms have also been introduced into the field of welding sensing, further improving the accuracy and interpretability of neural network models. This section summarizes the research on welding sensing technology in two aspects: neural network technology and attention mechanism.

Neural Network and Deep Learning
Deep neural network technology is a branch of machine learning technology. It imitates the structure and function of the human brain and uses interconnected multilayer neural networks to learn and analyze data. Deep learning (DL) is an advancement of traditional shallow neural networks, marked by a significant increase in the number of hidden layers. Hidden layers transform the input data into higher-dimensional spaces, allowing for analysis of the input data from multiple viewpoints. Increasing the number of hidden layers enhances the probability of uncovering underlying patterns in the data. In recent years, DL algorithms have become increasingly widely used in computer vision, speech recognition, natural language processing, and feedback control, and many network structures and algorithms have been created, including feed-forward neural networks (FNN), convolutional neural networks (CNN), recurrent neural networks (RNN), restricted Boltzmann machines (RBM), etc. [72]. In the field of welding sensing, deep learning algorithms are mainly used to extract characteristic information from sensor data and to relate this characteristic information to the target information of interest through learning models.

Network Structure of Deep Learning
Among the deep learning network structures, CNN and RNN are the two most typical and widely used. Most network structures in the field of welding sensing are derived from these two. Figure 13 shows the typical neural network structure and classification. In 1980, Fukushima introduced the initial convolutional neural network (CNN), which drew inspiration from the architecture of the animal visual cortex for the purpose of visual perception [73]. LeCun et al. [74] developed the first modern CNN, known as LeNet-5, in 1998, and effectively employed it for the recognition of handwritten digits. Afterward, many new networks were derived to address the shortcomings of traditional neural network structures, such as AlexNet, VGGNet, GoogLeNet, GAN, and ResNet. Three types of layers form the structure of CNNs: convolutional layers, pooling layers, and fully connected layers. The convolution operations are used to extract feature maps from input images or grid-like data. This is accomplished by convolving the feature maps from the preceding layer with small kernels that shift over the data. A kernel is a collection of learnable weights that are not fixed in advance but are updated repeatedly during the learning process. Initially, only the kernel size needs to be defined. The kernel size should be selected according to the size of the item (or region) of interest relative to the size of the image. With the advancement of deep learning, small kernel sizes have become preferable for effectively capturing local information. The processing of vision-sensing information in the welding process is a typical machine vision task. Therefore, research on feature extraction and target detection is mostly based on the CNN structure. In addition, sound signals and arc spectrum signals can also be converted into data structures suitable for a CNN through operations such as pre-processing and time-frequency domain transformation.
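The convolution and pooling operations described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the helper names `conv2d_valid` and `max_pool2d`, the 5x5 toy image, and the hand-written edge kernel are all hypothetical, and in a real CNN the kernel weights would be learned rather than fixed.

```python
def conv2d_valid(image, kernel):
    """Slide a small kernel over a 2-D grid (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            # Weighted sum of the local patch under the kernel.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(feature_map[r + i][c + j]
                           for i in range(size) for j in range(size)))
        out.append(row)
    return out


# Hypothetical 5x5 "image" containing a vertical brightness edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written vertical-edge kernel (assumption; normally learned in training).
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)  # responds where the edge lies
pooled = max_pool2d(feature_map)           # keeps the strongest local response
```

The feature map is largest where the kernel overlaps the edge, which is the sense in which a small kernel captures local information.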
RNNs were introduced in 1986 [75] as a means to handle and analyze sequential input. Subsequently, several variations have been created, including intricate structures such as bidirectional associative memory (BAM), echo state networks (ESNs), long short-term memory (LSTM), and others. Currently, recurrent neural networks (RNNs) are considered the most advanced models for audio and text processing [76]. The most notable advantage of an RNN, when compared to other neural networks, is its explicit linkage of the hidden states, as indicated by the formula:

$$H_t = a\left(W_I \cdot I_t + W_H \cdot H_{t-1} + b\right)$$
The hidden state is denoted by H, the input by I, the weight matrices by W, the bias by b, and the nonlinear activation function by a. The state H_t is influenced by both the current input I_t and the prior inputs [I_1, I_2, …, I_{t-1}] through the previous state H_{t-1}, in the manner of a Markov chain. This technique significantly decreases computational requirements, as all past information is abstracted into a single state and propagated forward while retaining the crucial information. The sound signal and arc signal during the welding process are typical sequential signals, so the RNN structure and its improved variants are widely used in the processing and modeling of these two kinds of sensing information.
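The recurrence described above, in which each hidden state depends on the current input and the previous hidden state, can be sketched as follows. This is a minimal sketch, assuming tanh as the activation function a and separate input and recurrent weight matrices (named `w_in` and `w_hid` here); all weights and the toy input sequence are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_hid, bias, x_t, h_prev):
    """One recurrence step: H_t = tanh(W_I I_t + W_H H_{t-1} + b)."""
    pre = [a + b + c for a, b, c in
           zip(matvec(w_in, x_t), matvec(w_hid, h_prev), bias)]
    return [math.tanh(p) for p in pre]


def rnn_forward(w_in, w_hid, bias, inputs):
    """Run the recurrence over a sequence, starting from a zero hidden state."""
    h = [0.0] * len(bias)
    states = []
    for x_t in inputs:
        h = rnn_step(w_in, w_hid, bias, x_t, h)
        states.append(h)
    return states


# Hypothetical weights: 1 input feature, 2 hidden units.
w_in = [[0.5], [-0.5]]
w_hid = [[0.1, 0.0], [0.0, 0.1]]
bias = [0.0, 0.0]

# An impulse followed by silence: later states are nonzero only because
# the first input is carried forward through the hidden state.
sequence = [[1.0], [0.0], [0.0]]
states = rnn_forward(w_in, w_hid, bias, sequence)
```

The decaying but nonzero states after the impulse illustrate how past inputs reach the current state only through the previous hidden state.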

Deep Learning in Welding
The proposal and development of deep learning have greatly promoted the progress of intelligent welding technology based on multi-information sensing. Traditional feature extraction and modeling methods have certain limitations when faced with complex sensing information and highly nonlinear welding processes. Neural network technology addresses this problem well in an end-to-end manner. In recent years, scholars have used deep learning methods to conduct research on welding sensing technology. According to the different characteristics of welding sensing information, they have selected and improved corresponding network structures and have achieved good results in various sensing fields.
In terms of CNN, Chen et al. [77] proposed a multiscale activation module that employs convolution kernels of varying structure to enhance the contextual information of the spectrum. This approach demonstrated remarkable predictive accuracy for determining the composition of wires and the rate of gas flow. Zhang et al. [78] proposed a deep learning algorithm based on CNN for the detection of welding defects in three different conditions during high-power disk laser welding. Jiao et al. [79] designed an end-to-end CNN for weld penetration identification and developed a transfer learning approach based on ResNet. The experiments show that the training time decreased while the prediction accuracy improved to 96.35%. In terms of RNN, Wang et al. [80] proposed a long short-term memory (LSTM) recurrent neural network to monitor an ultrasonic welding (USW) process. The LSTM network demonstrated exceptional accuracy in predicting USW quality, and the convergence time of classification was shown to be related to specific errors in robot motion; this information can be utilized as input for adaptive learning. Zou et al. [81] introduced the RNN to acquire the temporal context information of convolutional features, enabling precise detection of the weld seam in the presence of persistent super-noise. Wang et al. [82] introduced a model to evaluate the quality of resistance spot welding. They employed a modified RNN to estimate the area of the heat-affected zone, and a new self-organizing map type classifier to identify the expulsion condition and its occurrence time. In these studies, the end-to-end advantages of neural network technology have been used to achieve good results in the field of welding process monitoring. The CNN has advantages in extracting target features and is mainly used in molten-pool image processing, while the RNN is able to analyze time series signals and has made good progress in fields such as arc sound signal processing.
RNN and CNN have their own advantages under different application conditions. With the continuous development of sensing technology and the continuous rise in welding quality requirements, the application of deep neural network technology is no longer limited to a single network structure but is increasingly inclined toward the fusion of multiple network structures. For a given goal, the task is divided into blocks, and the most suitable network structure is used for each block to maximize the advantages of each network model. Liu et al. [83] combined the advantages of CNN and LSTM for high-quality online monitoring of automatic welding. The CNN-LSTM algorithm establishes a shallow CNN to extract the primary features in the molten pool image; the feature tensor extracted by the CNN is transformed into a feature matrix, whose rows are fed into the LSTM network for feature fusion. The accuracy of defect detection in a CO2 welding molten pool reached 94%. Chen et al. [84] proposed a hybrid network model with high accuracy and robustness for a multisensor system. As shown in Figure 14, the visual features of weld pool images and the time-frequency domain features of arc voltage, welding current, arc power, and arc sound are extracted by a CNN and other mathematical methods. The LSTM network is used to fuse the extracted 19-dimensional features and learn time series information from the fused features. Yang et al. [85] introduced an innovative method for locating welding defects using an encoder-decoder network architecture and an attention-guided segmentation network. In order to minimize the loss of important information in the deep encoder module caused by multiple convolution and pooling operations, they incorporated an enhanced attention block and a bidirectional convolutional long short-term memory (BiConvLSTM) block into the skip connections between the encoder path and the decoder path. This integration allows the model to capture global, long-range contexts and highlight regions of interest. As a result, it is able to accurately identify welding defect areas and improve the segmentation ability for microdefects. The above research shows that model fusion allows different neural network structures to work together and maximize their respective advantages, effectively improving accuracy and versatility.

Attention Mechanism
Attention is a complicated mental process that is essential for humans to survive [86]. An essential characteristic of perception is that people typically do not process all available information simultaneously and in full. Instead, people tend to focus on a particular piece of information while disregarding other observable information. This allows humans to efficiently select valuable information from a large amount of information using limited brain power. The development of the attention mechanism significantly enhances the effectiveness and precision of perceptual information processing [87]. Sensing information processing, as a form of computer perception, can likewise optimize its calculation process by imitating the human attention mechanism. Recently, the attention mechanism has been employed as a method for allocating resources to address the issue of information overload. When processing power is restricted, it can prioritize and handle crucial information using the available resources [88]. As research has progressed, the attention mechanism has become a popular component in neural networks and has been utilized in a wide range of tasks, including image caption generation, machine translation, action recognition, and graph analysis. The attention mechanism not only enhances performance, but can also serve as a tool to explain otherwise opaque neural network behavior [89]. Therefore, the attention mechanism can be applied to welding sensor information processing to imitate experienced welders' screening of and attention to welding process information, improve the processing speed of sensor information and the model accuracy, and combine the physical mechanism of the welding process with neural network models to provide scientific explanations.

Theory of the Attention Mechanism
The implementation procedure for the attention mechanism involves two parts: firstly, calculating the attention distribution for the input data and secondly, computing the context vector based on the attention distribution.
Figure 15 demonstrates that during computation of the attention distribution, the neural network initially encodes the features of the source data as K, which is referred to as the key. K can be represented in different forms depending on the individual tasks and neural architectures. Furthermore, it is typically necessary to incorporate a task-specific representation vector q, commonly referred to as the query. The representation of q may vary depending on the task, either as a vector or as a matrix. The neural network calculates the correlation between K and q using the score function f, resulting in the energy score e. This score indicates the significance of queries in determining the next output in relation to the keys. The score function f is an essential component in the attention model, as it determines the matching or combination of keys and queries. Additive attention (such as the alignment model in RNN search) [90] and the computationally less expensive multiplicative (dot-product) attention [91] are the two most widely utilized attention methods. In actual applications, it is necessary to select an appropriate score function based on the type of the input signal, the network, and the processing task.
After obtaining the energy score e, the energy scores are mapped to attention weights α through the attention distribution function g. To manage different tasks and data structures, many kinds of distribution functions have been proposed, such as softmax [90], sparsemax [92], and logistic sigmoid [93]. The context vector c is then computed by a function that returns a single vector given the set of values and their corresponding weights. In this way, the attention mechanism is applied to the original vector.
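The two-step procedure described above (score the keys against the query, then form the context vector) can be sketched in plain Python. This is a minimal sketch under stated assumptions: dot-product scoring is used for f, softmax for g, and the context vector is taken as the weighted sum of a separate list of values; the helper names and the toy keys, values, and query are hypothetical.

```python
import math


def softmax(scores):
    """Distribution function g: map energy scores to weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(query, key):
    """Score function f: multiplicative (dot-product) attention."""
    return sum(q * k for q, k in zip(query, key))


def attend(query, keys, values):
    """Return the attention weights and the context vector c."""
    energies = [dot_score(query, k) for k in keys]      # e = f(q, K)
    weights = softmax(energies)                          # alpha = g(e)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(dim)]                      # c = sum_i alpha_i v_i
    return weights, context


# Hypothetical keys and values for three input items.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
query = [2.0, 0.0]  # matches the first and third keys equally well

weights, context = attend(query, keys, values)
```

Swapping `dot_score` for an additive score or `softmax` for another distribution function changes only one helper, which mirrors the modular description in the text.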

Classification of Attention Mechanisms
The principle of attention models is the same, but in recent years, researchers have made some modifications and improvements to the attention mechanisms in order to better adapt them to specific tasks. As shown in Figure 16, this paper groups the attention mechanism into three major categories based on the softness and the form of the input feature. The soft attention mechanism uses a weighted average of all keys to construct the context vector. It keeps the attention module differentiable with respect to the inputs, enabling the entire system to be trained using traditional back-propagation methods. In the hard attention mechanism, the context vector is calculated from stochastically sampled keys. The important difference between the two forms of attention is whether the attention weights are differentiable. In contrast to the soft attention model, the hard attention model is more computationally efficient, as it avoids the need to calculate attention weights for all items at each time step. Nevertheless, if a hard choice is made at every position of the input feature, the module becomes nondifferentiable and challenging to optimize. Consequently, the entire system must be trained by maximizing the approximate variational lower bound or, equivalently, by employing the REINFORCE algorithm [94].
The attention processes can be classified as item-wise and location-wise, depending on the input feature. For item-wise attention, the inputs need to consist of explicit items, or an extra pre-processing step must be performed to construct a sequence of items from the source data; the single feature map of SENet [95] is a typical example. Each item is encoded as a separate code in the attention model, and upon decoding, each code is given a different weight. In contrast, location-wise attention is aimed at tasks for which it is difficult to obtain distinct input items, and such an attention mechanism is generally used in visual tasks. Another difference lies in the calculation principle of the soft/hard attention mechanism. The item-wise soft attention calculates a weight for each item and then forms a linear combination of the items. The location-wise soft attention accepts an entire feature map as input and generates a transformed version through the attention module. Instead of a linear combination of all items, the item-wise hard attention stochastically picks one or several items based on their probabilities. The location-wise hard attention stochastically picks a subregion as input, and the location of the subregion to be picked is calculated by the attention module.
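To make the contrast between item-wise soft and hard attention concrete, the sketch below forms a soft context vector as a weighted combination of all items and a hard context vector by sampling a single item according to its attention probability. The function names, the fixed attention weights, and the seeded random generator are illustrative assumptions, not part of any cited method.

```python
import random


def soft_context(weights, values):
    """Soft attention: differentiable weighted combination of all items."""
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def hard_context(weights, values, rng):
    """Hard attention: stochastically pick one item by its probability."""
    index = rng.choices(range(len(values)), weights=weights, k=1)[0]
    return index, values[index]


# Hypothetical attention weights over three items and their value vectors.
weights = [0.7, 0.2, 0.1]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

rng = random.Random(0)  # seeded only to make the sketch repeatable
soft = soft_context(weights, values)
picked, hard = hard_context(weights, values, rng)
```

The soft result blends every item, whereas the hard result is exactly one of the input items; the sampling step is what makes hard attention nondifferentiable.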

Attention Mechanism in Welding Sensing
The essence of welding sensing technology is the processing and calculation of visual, sound, spectrum, and other sensing signals. At present, neural network technology is an effective means to realize the calculation process and is widely used in various welding sensing tasks. The attention mechanism has good compatibility with neural network technology. When applied to the field of welding, it can imitate experienced welders in filtering sensing information based on importance. In addition, further research and discussion on the welding mechanism can be carried out based on the attention weight information obtained from training.
Most of the welding sensing information needs to be combined with the physical mechanism of the welding process to perform corresponding pre-processing operations, and the training of related models mostly uses neural network learning methods; therefore, the soft attention mechanism is suitable for most welding sensing tasks. Among its variants, the channel attention mechanism and the spatial attention mechanism are the two most widely used forms. The channel attention mechanism mainly focuses on sequence information. Attention weights are attached to different feature channels. It generally requires pre-feature extraction and convolution operations. It has the widest application range and can be applied to various sensing modalities such as arc, sound, spectrum, and image. Zhao et al. [96] proposed attention-based LSTMs to give more attention to the region of interest in the time-frequency spectrum of the acoustic signal. The dataset and network structure are shown in Figure 17. The modified network outperformed the traditional methods, with an average accuracy of 95.32%. Furthermore, the attention vectors were visualized to determine the mechanism of auditory attention under different penetration states. The spatial attention mechanism aims to improve the feature expression of key areas. In essence, it transforms the original information into another space through the spatial transformation module and retains the key information. It generates a mask for each position and weights the output. It is currently mainly used in the field of image processing. Wang et al. [97] designed a boundary distance field (BDF) regression module to predict the distance fields of the inner and outer boundaries, which can reflect the continuous boundary information of the welding zone. The spatial attention weight is calculated according to the regressed boundary distance fields, and the calculated weight is fused into the feature map to enhance the significance of the welding zone. Zhou et al. [98] constructed a deep learning model named real spatial-temporal attention denoising network (RSTADN), which consists of a denoising module, spatial-temporal attention modules, and multiple residual modules. The experimental results indicate that the accuracy of RSTADN in the task of detecting the quality of welded nuggets reaches as high as 94.35%. The channel attention mechanism and the spatial attention mechanism have their own advantages in different aspects. Therefore, in recent years, researchers have combined the two attention mechanisms for deep learning of welding sensing information, achieving good results. Xiao et al. [99] proposed a multiscale convolution assemble block for spot welding, in which a fused attention block is designed to calibrate the spatial and channel information of welding spot feature maps. The results of classification experiments demonstrated that the proposed strategies are efficient, with an accuracy of 95.2%. Liu et al. [100] proposed an improved three-dimensional convolutional neural network with a separable structure and multidimensional attention for welding status recognition. Comparison experiments verified that the proposed method is more accurate and noise-resistant than the conventional model. The research described in [101] also used spatial and channel attention in the online monitoring of laser welding quality; the proposed method has an accuracy of 97.45% for the recognition of nine types of laser welding molten pool/keyhole defects.
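The difference between channel attention (one weight per feature channel) and spatial attention (one mask value per position) can be illustrated with the following sketch. It is a deliberately parameter-free simplification and an assumption of this example: practical modules such as SENet learn the gating through small trainable layers, whereas here the gate is simply a sigmoid of the average activation. The function names and the toy feature maps are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature_maps):
    """Weight each channel by a gate derived from its global average."""
    gates = []
    for channel in feature_maps:
        flat = [v for row in channel for v in row]
        gates.append(sigmoid(sum(flat) / len(flat)))  # squeeze, then gate
    scaled = [[[g * v for v in row] for row in channel]
              for g, channel in zip(gates, feature_maps)]
    return gates, scaled


def spatial_attention(feature_maps):
    """Weight each position by a mask derived from the mean over channels."""
    n_ch = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    mask = [[sigmoid(sum(feature_maps[c][r][col] for c in range(n_ch)) / n_ch)
             for col in range(width)] for r in range(height)]
    weighted = [[[mask[r][col] * channel[r][col] for col in range(width)]
                 for r in range(height)] for channel in feature_maps]
    return mask, weighted


# Hypothetical feature maps: 2 channels of size 2x2 with a strong
# response in the top-left position.
feature_maps = [
    [[4.0, 0.0], [0.0, 0.0]],
    [[2.0, 0.0], [0.0, 0.0]],
]

gates, scaled = channel_attention(feature_maps)
mask, weighted = spatial_attention(feature_maps)
```

Here the first channel receives the larger gate, and the mask emphasizes the top-left position; applying both in sequence corresponds to the fused channel and spatial attention discussed above.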
At present, the applications of the attention mechanism in the field of welding sensing are developing rapidly, but the existing technology is mainly aimed at sound and image information; the application of the attention mechanism to spectrum and arc signals still needs to be developed. With the expansion of application scope and research depth, the attention mechanism is expected to become crucial for further progress in welding sensing technology.

Conclusions and Remarks-Current Hot Issues and Further Research
With the increasing demand for flexibility, efficiency, and personalization in manufacturing, there is no doubt that intelligent manufacturing will be the future manufacturing style. In recent years, sensing technology in the field of welding has made great progress in terms of sensitivity and amount of information. This has been accompanied by an increase in the difficulty of processing various sensing information. It is difficult for traditional information processing technology to achieve high-speed processing of diverse big data, and how to analyze and appropriately leverage this big data has become a major technical difficulty. Neural network technology has shown obvious advantages in various sensing fields with its end-to-end training model. Combined with the attention mechanism, it has brought algorithm intelligence and accuracy to a higher level, promoting development in the intelligent manufacturing process. However, the current technology is still not sufficient to achieve truly intelligent welding. The following summarizes the shortcomings and development directions in the field of multi-information sensing in the welding process from three aspects: sensing technology, deep learning, and attention mechanism.

Sensing Technology
Sensing technology is the hardware foundation for intelligent welding. Sensor research needs to focus on three main directions: sensing accuracy, real-time performance, and equipment simplicity. Sensing accuracy includes signal resolution and positioning accuracy, which directly determine the quality of the signal. At present, the development level of various sensors is uneven. The development level of industrial cameras is relatively high, and they can achieve high-quality collection and positioning of laser stripes and molten pools during the welding process. However, some sensing signals, such as spectral sensing and infrared sensing, are subject to technical limitations and are sensitive to interference. Real-time sensing capability is the prerequisite for realizing welding control, which places high requirements on equipment response speed and signal transmission speed. Many high-quality nondestructive detection methods are post-weld testing methods whose signals are not available in real time, so they cannot be used for real-time detection of welding defects, which limits the diversity of sensing signals. The simplicity of the equipment guarantees the versatility of intelligent welding technology. A highly portable sensing method can adapt to different environments and manufacturing settings. At present, some sensor equipment, such as X-ray detection, is relatively complex, difficult to operate, and has strict requirements on environmental conditions.
In the field of sensing technology, research needs to focus on improving sensing accuracy and real-time capabilities, and on overcoming technical barriers so that more detection methods can be applied to real-time sensing of the welding process. Under the premise of ensuring the sensing quality, the portability of each sensing device needs to be enhanced to achieve light weight and high automation.

Deep Learning
This review summarizes and introduces the application of deep learning methods in the welding process. The end-to-end idea of a neural network has played a great role in promoting the research of welding models. However, over-reliance on neural network technology will gradually separate welding sensing from the physical mechanism of the welding process. While neural networks have the advantage of being applicable to any data type, their black-box nature gives them the disadvantage of poor interpretability. At present, most research is based on proven network structures and focuses on processing welding data rather than on the innovation of network structures. These network structures were not originally proposed for the welding background and lack a theoretical basis in welding. Without innovative improvements, the applicability of the model may be limited.
In view of these issues, future research on neural network technology should be combined with the physical mechanism of the welding process. According to the characteristics of different welding defects and different sensing signals, researchers should develop network structures based on the physical mechanism of the welding process, and flexibly design more targeted network structures that take into account accuracy, response speed, and robustness, thereby achieving more accurate real-time monitoring.

Attention Mechanism
In recent years, the attention mechanism has become a research hotspot in the field of welding sensing. On the one hand, the introduction of attention weights improves the accuracy of welding models. On the other hand, the study of weight distribution can make the model interpretable. However, the signal types in existing research are still relatively limited, and there are research gaps for some signals, including spectrum and arc signals. In addition, some studies have not deeply explored the location where the attention mechanism is added. Although adding attention within the convolutional layer and the final fully connected layer of the neural network can improve the model, such work lacks a scientific explanation for the attention weights based on welding theory.
In future research, researchers should explore the feasibility of the attention mechanism for various types of welding sensing signals, and flexibly use the concept of the attention mechanism instead of being limited to the proven technologies for image and sound signals. The practicality also needs to be improved. In terms of basic structure, the form and location of the added attention mechanism should be combined with welding theory and sensing signal characteristics to exploit the advantages of the attention mechanism in interpretability and to address the black-box nature of neural network models.

Industry 4.0
The goal of Industry 4.0 is to realize a new intelligent industrial world with awareness, which requires the support of powerful sensing control technology and Internet of Things (IoT) technology. At present, in the field of welding, the fusion of various sensing technologies is not perfect, and welding IoT technology still needs to be developed. In future research, it is necessary to explore the complementarity between various sensing technologies, extend the target of sensing control to the entire welding process through the cooperation of various sensors and IoT technology, and ultimately achieve the intelligent welding required by Industry 4.0.

Figure 1 .
Figure 1.Three development stages in arc welding.

Figure 2 .
Figure 2.An overview of the monitoring targets and sensing technologies.

Figure 3 .
Figure 3. Classification of visual sensing technology.

Figure 4 .
Figure 4.The mechanism of sound sensing technology.

Figure 5 .
Figure 5.The mechanism of arc spectrum sensing.

Figure 6 .
Figure 6.The structure of dual-microphones system and the welding deviation prediction results of two microphone [43].

Figure 7. The result of the extraction algorithm under significant noise interference: (a) in low brightness; (c) ruined by arc; (b,d) extraction results (the yellow line) from (a,c), respectively [44].

Figure 8. The three light routes in a simultaneous visual sensing system of the weld pool in a frame (arrows of different colors represent different light paths) [51].

Figure 9. The penetration prediction result of classification and regression [55].

Figure 10. Region-of-interest visualization for the time-frequency spectrum using Grad-CAM and Guided Grad [60].

Arc signals have complicated time-varying properties and are sensitive to arc disturbances during welding, so they are widely used in real-time weld quality diagnosis. Xiao et al. [62] used arc voltage variation to study the relationship between the natural oscillation frequency and the geometry of the weld pool for both partial and full penetration. Yoo et al. [63] compared the sensitivities of arc voltage and arc light emission, and determined the characteristics of pool oscillation signals in full penetration and transition penetration weld pools, where transition penetration is defined as the intermediate state between the partial and full penetration conditions. These two studies provided the foundation for penetration control based on natural oscillation frequency. To decrease the influence of noise, Huang et al. [64] developed a modified ensemble empirical mode decomposition to break down arc-sensing signals into intrinsic mode functions. In terms of time-frequency clustering and resolution, the authors recommended the Hilbert-Huang transform (HHT) approach. A 98.75% accuracy rate was achieved in classifying various weld quality classes using HHT and energy entropy. He et al. [65] developed a new model for the prediction of undercut, hump, and normal weld beads. The model combined the local mean decomposition method with an SVM classifier and reached an accuracy of 94.4%.

The spectral signal can be analyzed through spectral lines to obtain the changes in welding heat input and arc element content. Combined with the formation mechanism of various internal defects, the time domain characteristics of each element spectral line in the arc emission spectrum are an important basis for solving difficult-to-predict internal defects such as pores and slag inclusions. As shown in Figure 11, Zhang et al. [66] used eight sensitive spectral lines to calculate the variance, RMS, and kurtosis features. In order to reduce the negative effects of the current pulse, the wavelet packet transform was employed. The results of experiments showed that oxidation and crater defects could be effectively monitored. A technique for detecting porosity was suggested, which relies on the HI spectrum from the first principal component. The extracted features can classify different levels of porosity by the analysis of the statistics and the feature curve of the

Figure 11. The study of inner porosity detection for Al-Mg alloy in arc welding through online optical spectroscopy [67].

Figure 12. Prediction results for the porosity defect position using the improved Focal-XGBoost model (two experiments). The yellow line is the threshold for judging porosity defects [69].

Figure 13. Neural network structure and classification.

Figure 14. CNN-LSTM model: the processing flow and structure of the network [84].

Figure 15. The architecture of the unified attention model.

Figure 17. The welding dataset and the network structure: (a) welding seam of different penetration states; (b) partial process of attention calculation; (c) structure of the attention-based LSTM model [96].

Table 1. Comparison of different sensing technologies.