Fault Diagnosis of PMSMs Based on Image Features of Multi-Sensor Fusion

Permanent magnet synchronous motors (PMSMs) are extensively utilized in production and manufacturing due to their wide speed range, high output torque, fast speed response, small size and light weight. PMSMs are susceptible to inter-turn short circuit faults, demagnetization faults, bearing faults, and other faults arising from irregular vibrations and frequent start–brake cycles. While fault diagnosis for PMSMs offers an effective means of enhancing operational efficiency, multi-sensor information fusion is often overlooked. In industrial production processes, the collected data inevitably suffer from noise contamination, which can adversely impact diagnostic outcomes. To enhance the robustness of diagnostic methods in noisy environments and mitigate the risk of overfitting, a PMSM fault diagnosis method based on image features of multi-sensor fusion is proposed. Firstly, the vibration acceleration signals of the PMSM at different positions are acquired. Then, the newly designed multi-signal Gramian Angular Difference Field (MGADF) method combines sensor signals from three different installation locations into a single image. Next, multiple texture features are fused to extract the features of the image. Various machine learning models are compared in fault feature learning and classification, and the results show that the proposed diagnostic method has good diagnostic accuracy and robustness, with an average diagnostic accuracy of 99.54% and a standard deviation of accuracy of 0.19. It performs well even in noisy environments. The method is non-invasive and can be extended to the condition monitoring and diagnosis of industrial motors.


Introduction
PMSMs are crucial electromechanical energy converters that play a significant role in diverse industrial applications due to their performance advantages, including light weight, reliable operation, low noise and high efficiency [1][2][3][4]. Typical applications in automated production lines include packaging machines, drilling machines, cutting machines, and injection moulding machines. In these applications, PMSMs drive loads with high inertia and start frequently [5]. Due to manufacturing defects and the effects of wear, deformation and corrosion that occur during operation, the performance of the PMSM gradually declines as its components deteriorate, which can create safety hazards and, in severe cases, cause downtime accidents [6,7], resulting in significant economic losses [8][9][10][11]. Therefore, accurate motor fault diagnosis algorithms are crucial.
In the feature extraction stage of traditional motor fault diagnosis research, signal processing methods in the time domain [12][13][14], frequency domain [5][15][16][17][18], and time-frequency domain [19][20][21][22][23] are commonly employed to analyze the measured signals and extract fault features associated with different states. However, these methods often suffer from low fault diagnosis accuracy and a limited range of applications, and the related research is restricted to extracting the detailed features of the signals in a single dimension only. By contrast, the motor operating state signals can be converted into a two-dimensional or higher-dimensional space to comprehensively display the implied multidimensional information through multi-dimensional data fusion and high-dimensional visual knowledge methods [24]. The grayscale map coding method, while capable of partially reflecting the characteristics of vibration signals, loses temporal information during the coding process, so crucial fault characteristics are absent. The Gramian angular field (GAF) can convert a sequence signal into a 2D image, which overcomes the information loss of grayscale map coding and provides a complete mapping of the signal through different features, such as colors, dots, and lines at the corresponding positions [25,26]. GAF image coding, the Extreme Learning Machine (ELM), and the Convolutional Neural Network (CNN) were combined to further improve the accuracy and speed of fault classification [27]. To address the complexity of the conventional neural network structure in bearing fault diagnosis, a lightweight neural network fault diagnosis method was proposed that constructs GAF feature maps and applies efficient channel attention optimization, achieving higher diagnostic accuracy with fewer parameter computations [28]. Although the above work has demonstrated diagnostic capabilities, these signal-to-image techniques are all used with a single sensor, ignoring the fusion of information from multiple sensors. It is important to recognize that single signals are more susceptible to environmental interference than multi-sensor signals.
Sensors 2023, 23
In the pursuit of more accurate and stable diagnostic performance, researchers have sought to enhance their methods by combining signals from multiple sensors. Ribeiro et al. [29] employed accelerometers in two different directions to detect and diagnose six distinct types of motor faults. Their results indicate that the proposed architecture offers good accuracy in multi-sensor fault detection based on vibration time series. Gu et al. [30] proposed a correlation adaptive weighting method to integrate the collected multi-source homogeneous sensor information into multi-source heterogeneous sensor information through data layer fusion. A 1D-CNN was used for feature extraction, feature layer fusion, and fault classification, and a high fault diagnosis accuracy was achieved. Peng et al. [31] devised a motor fault diagnosis method based on a deep residual neural network (DRNN) and data fusion. Initially, they extracted time and frequency domain features from the original signal using a short-time Fourier transform (STFT) layer. Subsequently, they employed a deep residual network for feature fusion, enabling fault diagnosis via a classifier. Their method excelled in feature learning, model training, noise immunity, fault tolerance, and fault diagnosis. Yin et al. [32] proposed a fault diagnosis method combining ResNet and multi-sensor data fusion, using the Fast Fourier Transform (FFT) to convert sensor data from the time domain to the frequency domain and training the ResNet model for fault diagnosis and classification. However, certain challenges persist in these methods: (1) some approaches process raw signals in the frequency and time-frequency domains, which increases fault diagnosis time and demands significant computational resources; (2) these methods transform the signal from each sensor into an image individually, leading to an increased workload during later image feature extraction and classification.
Image feature extraction methods primarily rely on hand-crafted techniques that extract low-level and mid-level features by considering the image locally or globally, according to its texture, shape, spatial structure and other information. The acquired features have the advantage of strong interpretability; representative methods include Tamura texture features and the local binary pattern (LBP), which are often applied to image scene classification. Texture is a common but difficult-to-describe feature in images, which can be regarded as an attribute describing the spatial distribution of the pixels in the image. It often appears locally irregular but macroscopically regular. Traditional texture feature extraction works at a single scale. Because the information obtained from the acquired images is limited, it is necessary to describe how texture primitives are combined and arranged across multiple scales. This approach enables a more comprehensive capture of the image's structural features and its detailed information, highlighting the unique characteristics of the image across varying scales [33].
Machine learning has become a popular technique and has been widely used in the field of motor fault detection [34]. The extracted fault features are employed in the pattern recognition stage to train machine learning models, including support vector machines, artificial neural networks, and extreme learning machines [35]. The Random Forest (RF) algorithm has a strong advantage due to its stability and resistance to overfitting. RF handles high-dimensional data well, adapts readily to the dataset, and trains quickly. However, RF algorithms have mainly been applied to fault detection in induction motors and are rarely applied to PMSM fault classification.
To enhance the diagnostic method's robustness in noisy environments and mitigate the risk of overfitting, we propose a PMSM fault diagnosis method based on the fusion of image features from multiple sensors. Firstly, the vibration acceleration signals of the PMSM at different positions under different speed and load conditions are acquired. Then, the newly designed multi-signal Gramian Angular Difference Field (MGADF) method fuses the sensor signals from three different mounting positions into a single image, with each signal assigned to one RGB channel. Next, Tamura texture features, HOG texture features and LBP features are fused to extract the features of the image. The effectiveness of the method is verified by experimental analysis. Multiple machine learning models are compared on fault feature learning and classification, and the results show that the diagnostic method has good performance and robustness, even in noisy environments. The rest of the paper is organized as follows. The theoretical background is described in Section 2. The presented method is described in Section 3. The framework of the fault diagnosis method is explained in Section 4. Section 5 explains the experiments and verifies the superiority of the proposed method by comparing it with other algorithms. Finally, the main conclusions are given in Section 6.

Gramian Angular Field
GAF is a method of encoding one-dimensional sequences that preserves their time-dependent features to the greatest extent. It is a density distribution map of Gramian angular field values generated by trigonometric operations after encoding in polar coordinates. GAF overcomes the limitation of one-dimensional signals, which often represent incomplete information and fail to capture detailed features effectively. The method not only maintains time dependence but also encodes temporal correlation.

Texture Feature Extraction Method
Texture is a common and difficult-to-describe feature in an image. It can be thought of as an attribute reflecting the spatial distribution of the image pixels and frequently appears as local irregularity combined with macroscopic regularity [20]. The texture of an image reflects the structural characteristics of the object in the image, such as scale, anisotropy and rhythm. Traditional texture feature extraction works at a single scale. Given the finite amount of image data available, it becomes essential to characterize the arrangement of texture primitives within an image at multiple scales. This involves combining techniques and their variations at different scales to capture more effectively both the overall structural characteristics and the finer details of the image. This approach also highlights the unique properties of images at various scales. Multi-scale texture feature extraction is one of the crucial techniques for image classification and recognition due to the distinct sensory characteristics of images.

RF
RF works by generating several classifiers that each learn and predict independently and then combining their results into a final prediction, which is better than the prediction of a single classifier or model. RF consists of multiple mutually independent decision trees and is an ensemble learning method based on these decision trees. The Random Forest classification result is obtained by voting over the classification results of all the decision trees, which gives a high accuracy rate [23].

Given a time series $X = \{x_1, x_2, \ldots, x_n\}$, the values are first rescaled by Equation (1):

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \quad (1)$$

where $\tilde{x}_i$ indicates the value of the time series after scaling to $[-1, 1]$. The values are encoded as the angular cosine, and the timestamps are encoded as the radius $r$. The time series is then transformed to polar coordinates by using Equation (2):

$$\phi_i = \arccos(\tilde{x}_i), \quad -1 \le \tilde{x}_i \le 1; \qquad r_i = \frac{t_i}{N} \quad (2)$$

where $t_i$ is the timestamp, and $N$ is a constant factor that regularizes the span of the polar coordinate system [26].
The above transformations can be used to convert the original time series into a feature map symmetric along the diagonal, which can also be used to reconstruct the time series, since the feature image contains time-related information. GAF can generate two images with different equations. Equation (3) defines the Gramian Angular Summation Field (GASF), while Equation (4) defines the Gramian Angular Difference Field (GADF). The key distinction between them is the trigonometric conversion: GASF is based on the cosine function, whereas GADF is based on the sine function.

$$GASF = [\cos(\phi_i + \phi_j)] = \tilde{X}^T \tilde{X} - \sqrt{I - \tilde{X}^2}^{\,T} \sqrt{I - \tilde{X}^2} \quad (3)$$

$$GADF = [\sin(\phi_i - \phi_j)] = \sqrt{I - \tilde{X}^2}^{\,T} \tilde{X} - \tilde{X}^T \sqrt{I - \tilde{X}^2} \quad (4)$$

where $I$ is the unit row vector $(1, 1, \ldots, 1)$ and $\tilde{X}^T$ is the transpose of $\tilde{X}$. Note from the above equations that GAF is a newly constructed operation whose form corresponds to a penalized form of the regular inner product [35].
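The rescaling, polar encoding and the two field definitions described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `rescale` and `gramian_angular_fields` are hypothetical, and the clipping step is an added guard against floating-point rounding.

```python
import numpy as np

def rescale(x):
    """Rescale a 1-D series to [-1, 1] (min-max form used for GAF)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("constant series cannot be rescaled")
    return ((x - hi) + (x - lo)) / (hi - lo)

def gramian_angular_fields(x):
    """Return (GASF, GADF) matrices for a 1-D series."""
    # Clip so that rounding errors never push values outside arccos's domain.
    x_tilde = np.clip(rescale(x), -1.0, 1.0)
    phi = np.arccos(x_tilde)                      # angular encoding
    gasf = np.cos(phi[:, None] + phi[None, :])    # cos(phi_i + phi_j)
    gadf = np.sin(phi[:, None] - phi[None, :])    # sin(phi_i - phi_j)
    return gasf, gadf

# Example: a short synthetic vibration-like signal (illustrative data only).
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
gasf, gadf = gramian_angular_fields(signal)
```

The element-wise trigonometric form is used here because it is easy to check against the matrix form: with `s = sqrt(1 - x_tilde**2)`, the GADF equals `outer(s, x_tilde) - outer(x_tilde, s)`.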
GADF can only transform a one-dimensional time series signal, and a single one-dimensional signal does not characterize fault features comprehensively enough. Therefore, the Gramian matrices of the signals acquired by different sensors are computed separately and injected into the red, green and blue channels to synthesise the final MGADF image. An example of the MGADF image transformation is presented in Figure 1. Because the three sensor signals are combined, more fault features can be represented in the MGADF image in terms of shape, texture and color.
The innovation of the proposed method lies in the utilization of multiple sensor time-domain signals to construct the Gramian matrices, which correspond to the three RGB channels. The resulting image characterises the fault-specific features better than the single-signal-source transformation method. The method further improves the differentiation between the different types of images, which creates the conditions for improving the accuracy of classification and recognition.
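The channel-wise construction described above can be sketched as follows. This is only an illustration of the idea of placing one GADF per sensor in each RGB channel; the function names, the requirement of equal-length signals, and the linear mapping from [-1, 1] to 8-bit pixel values are assumptions and are not taken from the paper.

```python
import numpy as np

def gadf(x):
    """Gramian Angular Difference Field of a 1-D series."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    x_tilde = np.clip(((x - hi) + (x - lo)) / (hi - lo), -1.0, 1.0)
    phi = np.arccos(x_tilde)
    return np.sin(phi[:, None] - phi[None, :])

def mgadf_image(sig_r, sig_g, sig_b):
    """Stack the GADFs of three sensor signals into one RGB image.

    Assumption: the three signals have the same length, so the three
    Gramian matrices share the same shape.
    """
    signals = [np.asarray(s, dtype=float) for s in (sig_r, sig_g, sig_b)]
    if len({s.shape for s in signals}) != 1:
        raise ValueError("all three signals must have the same length")
    stacked = np.stack([gadf(s) for s in signals], axis=-1)  # (n, n, 3)
    # Assumed pixel mapping: [-1, 1] -> [0, 255].
    return np.round((stacked + 1.0) * 127.5).astype(np.uint8)

# Example with three synthetic signals standing in for three accelerometers.
t = np.linspace(0.0, 1.0, 64)
image = mgadf_image(np.sin(2 * np.pi * 5 * t),
                    np.cos(2 * np.pi * 3 * t),
                    t ** 2)
```

Keeping each sensor in its own channel means the per-sensor matrices stay separable after fusion, which is convenient when inspecting which mounting position contributes a given pattern.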

Multi-Texture Fusion for Feature Extraction
Multi-scale texture feature extraction has developed into one of the crucial techniques for image classification and recognition, because it captures the comprehensive structural features of images and their detailed information more accurately and demonstrates the distinctive characteristics of images at different scales [33]. Tamura texture features include commonly used texture features such as roughness, contrast, directionality, line similarity, and so on. The LBP algorithm is a well-established image feature extraction technique known for its computational simplicity and efficiency. It describes the local texture structure of an image, reflecting the relationships between each pixel and its neighboring pixels. Notably, LBP offers significant advantages in terms of grayscale and rotational invariance. The histogram of oriented gradient (HOG) texture feature calculates the gradient feature vectors of the GAF map locally in different directions. In this paper, a feature extraction method is proposed that fuses the Tamura-HOG-LBP texture features of GAF images, which expresses the GAF spectrogram features more comprehensively from global and local texture features.
The fusion of Tamura-HOG-LBP features improves the multi-texture feature fusion extraction process of GAF images, which is shown in Figure 2. Firstly, a PMSM fault simulation experimental platform is constructed and pre-configured with different fault types on the PMSM. Normal motors and three common typical faulty motors, including ITSF, LDF and EF motors, are designed and fabricated to carry out experiments under different operating conditions. The acceleration signals of the operation process are collected. The improved GAF image transformation is performed on the acquired signal fragments, which can comprehensively reflect the detailed features of the motors. Then, the acquired images are resized to a uniform size and converted to grayscale, and the Tamura, HOG and LBP features of the images are extracted separately to obtain the vector space of all the features. Dimensionality reduction is then applied to improve the recognition speed. Finally, pattern recognition is performed with the classifier to complete the fault diagnosis.

Tamura Texture Feature Theory
The Tamura texture feature contains six attributes: coarseness, contrast, orientation, linearity, regularity and roughness. Coarseness indicates the granularity of the image texture pattern: the larger the granularity, the coarser the texture image, and vice versa. In the coarseness calculation, $A_k(x, y)$ is the average intensity within the active window; $g(i, j)$ represents the gray value at pixel $(i, j)$; $E_{k,u}(x, y)$ is the average gray variance of the pixels in the horizontal direction; $E_{k,\nu}(x, y)$ is the average gray variance of the pixels in the vertical direction; $m$ represents the length of the image; and $n$ represents the width of the image. Contrast refers to the degree of polarization between the light and dark parts of the histogram and the dynamic range of the gray level. It indicates the image's clarity and the depth of the texture grooves. The deeper the grooves, the higher the contrast and the clearer the image; the shallower the grooves, the lower the contrast and the blurrier the image. The contrast is calculated from the standard deviation $\sigma$ and the kurtosis $\alpha_4 = \mu_4 / \sigma^4$, where $\mu_4$ is the fourth-order central moment and $\sigma^2$ is the variance. The orientation degree $T_{dir}$ characterizes how the pixels in the image are aligned along a certain direction. It is calculated from the gradient vector, where $\Delta H$ and $\Delta V$ denote the horizontal and vertical gradient changes; $d(x, y)$ is the direction angle; and $\mu(x, y)$ is the mean direction angle within the neighborhood. Linearity refers to the degree of deviation of the pixel spacing distances in the calculation of the local covariance matrix. In the linearity formula, $P_a$ is the distance point of the $m \times m$ local direction covariance matrix.
Regularity is a measure of how regular an image's texture is. The more regular the image, the closer the regularity value is to 1; otherwise, it is closer to 0. In the regularity formula, $r$ is the normalization factor, and $\sigma_{coa}$, $\sigma_{con}$, $\sigma_{dir}$, $\sigma_{lin}$ are the standard deviations of each texture feature parameter. Roughness is used in psychology to simulate the roughness felt by a hand touching the surface of an object. Images of different shapes and sizes feel different on contact: the denser and more irregular the image, the greater the roughness, and vice versa. Roughness is calculated from the coarseness and the contrast.
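For reference, the commonly cited Tamura definitions that match the symbols used above are collected below. They are given as a hedged reminder of the standard formulation rather than as the paper's own equations, which may differ in notation or normalization. The names $F_{crs}$, $F_{con}$, $F_{reg}$, $F_{rgh}$ and $S_{best}$ are assumptions introduced here for illustration; the orientation and linearity formulas are omitted because the text above does not determine them uniquely.

```latex
\begin{align}
% Coarseness: window averages, directional differences, best scale per pixel
A_k(x,y) &= \frac{1}{2^{2k}} \sum_{i=x-2^{k-1}}^{x+2^{k-1}-1} \;
            \sum_{j=y-2^{k-1}}^{y+2^{k-1}-1} g(i,j) \\
E_{k,u}(x,y) &= \left| A_k(x+2^{k-1},y) - A_k(x-2^{k-1},y) \right| \\
E_{k,\nu}(x,y) &= \left| A_k(x,y+2^{k-1}) - A_k(x,y-2^{k-1}) \right| \\
S_{best}(x,y) &= 2^{k^*}, \quad
  k^* = \arg\max_k \max\{E_{k,u}(x,y),\, E_{k,\nu}(x,y)\} \\
F_{crs} &= \frac{1}{m\,n} \sum_{i=1}^{m} \sum_{j=1}^{n} S_{best}(i,j) \\
% Contrast: standard deviation scaled by the fourth root of the kurtosis
F_{con} &= \frac{\sigma}{\alpha_4^{1/4}}, \quad \alpha_4 = \frac{\mu_4}{\sigma^4} \\
% Regularity: one minus the normalized spread of the other features
F_{reg} &= 1 - r\,(\sigma_{coa} + \sigma_{con} + \sigma_{dir} + \sigma_{lin}) \\
% Roughness: sum of coarseness and contrast
F_{rgh} &= F_{crs} + F_{con}
\end{align}
```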

HOG Texture Feature Theory
The idea of the HOG algorithm is to represent the profile of an image target through the distribution of edge directions [11]. The specific strategy involves dividing the image to be recognized into several fixed-size regions. In each region, gradient features are accumulated by computing the gradients of the image pixels. This process yields a histogram of gradient orientations with a specified number of dimensions, as shown in Figure 3, and is completed by the following steps. The gradients are first computed by Equation (13), where $G_x(x, y)$, $G_y(x, y)$ and $H(x, y)$ denote the gradient in the x-axis direction, the gradient in the y-axis direction and the pixel value of the pixel point in the two-dimensional rectangular coordinate system. The gradient magnitude and gradient direction at this pixel point are then calculated by Equation (14). The n-dimensional gradient histogram of each cell is accumulated by dividing the 180° range of gradient directions equally into n direction intervals called bins. Multiple cells are combined into blocks for contrast normalization. HOG features are collected for all overlapping blocks in the detection window.
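The per-cell part of these steps can be sketched with NumPy as below. This is a simplified illustration, assuming central differences for the gradients, unsigned directions in [0°, 180°), and hard assignment of each pixel to a single bin; the function name `cell_histogram` and the default of 9 bins are hypothetical choices, and block normalization is left out.

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of gradient directions for one cell."""
    cell = np.asarray(cell, dtype=float)
    gx = np.zeros_like(cell)
    gy = np.zeros_like(cell)
    # Central differences; border pixels keep a zero gradient.
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned direction folded into [0, 180) degrees.
    direction = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    bins = np.minimum((direction / bin_width).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=n_bins)

# Example: an 8 x 8 cell whose intensity increases from left to right.
ramp = np.tile(np.arange(8.0), (8, 1))
hist = cell_histogram(ramp)
```

For the left-to-right ramp, all gradient energy falls into the first bin (direction 0°); transposing the cell moves it to the bin that contains 90°.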

LBP Feature Theory
The LBP texture analysis operator is a gray-scale invariant texture analysis method. To begin, a 3 × 3 window is created that includes the value at the center of the window along with its eight neighboring values. Starting from the upper left corner, each neighboring value is compared with the central value in clockwise order. When the neighboring value is greater than or equal to the central value, it is recorded as 1; otherwise, it is recorded as 0. In this way, an 8-bit binary number is generated, which is converted to a decimal value through Equation (16):

$$LBP(x_c, y_c) = \sum_{p=0}^{7} 2^p \, s(i_p - i_c), \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \quad (16)$$

where $(x_c, y_c)$ is the center pixel of the 3 × 3 neighborhood; $i_c$ is the gray value of the center point; and $i_p$ is the gray value of the neighborhood pixel point. The decimal value is the LBP value of this pixel point. Finally, by counting the number of occurrences of different LBP values, the image can be characterized.
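The comparison and histogram steps above can be sketched in plain Python. This is a minimal illustration; the function names are hypothetical, and the bit weighting (the first neighbor, at the upper left, is the least significant bit) is an assumption, since the text fixes the traversal order but not which end of the binary number comes first.

```python
def lbp_value(window):
    """LBP code of the center pixel of a 3 x 3 window (nested lists)."""
    center = window[1][1]
    # Neighbors visited clockwise, starting from the upper-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(order):
        if window[r][c] >= center:      # s(i_p - i_c) = 1
            code += 1 << p              # weight 2**p (assumed bit order)
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_value(window)] += 1
    return hist

# Example window: the center value is 6.
example = [[6, 5, 2],
           [7, 6, 1],
           [9, 8, 7]]
code = lbp_value(example)
```

Under the assumed bit order, the neighbors at positions 0, 4, 5, 6 and 7 are at least as large as the center, so `code` is 1 + 16 + 32 + 64 + 128 = 241.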

PCA Algorithm Dimension Reduction Processing
In the process of image recognition, if the original high-dimensional feature space is used for model training, the computational complexity increases greatly and the statistical properties of the samples cannot be estimated reliably. Therefore, it is necessary to reduce the dimensionality of the original features. In this paper, we use the Principal Component Analysis (PCA) method for feature extraction, which reduces the number of dimensions and thereby improves the speed of image recognition.
Ideally, the feature space of the sample $x$ has no redundant information. The PCA algorithm can be expressed as Equation (17), in which $x$ is expanded over a set of bases in the feature space. The estimate of $x$ from the first $k$ terms and the resulting mean square error follow from this expansion. Applying the Lagrange multiplier method gives the expression for the extreme value of the mean square error, which involves the covariance matrix of $x$ and its eigenvectors $m_i$. Equation (21) gives the mean square error when $x$ is represented in terms of $k$ eigenvectors. From Equation (21), it can be concluded that the smaller the value of $a_i$, the less information is lost from the corresponding feature vector.
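The relationship between the retained eigenvectors and the reconstruction error can be illustrated with a small NumPy sketch. It assumes the usual covariance-eigendecomposition form of PCA; the function name `pca_reduce`, its return values and the synthetic data are illustrative and do not come from the paper.

```python
import numpy as np

def pca_reduce(features, k):
    """Project feature vectors (rows) onto the top-k principal components.

    Returns the scores, the basis (columns are eigenvectors), the mean,
    and the sum of the discarded eigenvalues.
    """
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    cov = np.cov(centered, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    basis = eigvecs[:, :k]
    scores = centered @ basis
    return scores, basis, mean, eigvals[k:].sum()

# Example: 200 synthetic 5-dimensional feature vectors reduced to 2 dimensions.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
scores, basis, mean, discarded = pca_reduce(data, k=2)
reconstruction = scores @ basis.T + mean
```

With the sample covariance used here, the total squared reconstruction error divided by the number of samples minus one equals the sum of the discarded eigenvalues, which is why dropping the directions with the smallest eigenvalues loses the least information.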

Random Forest Classification Algorithm with Dung Beetle Optimization
In fault classification and recognition, the classifier's role is to determine the type of fault to which a test sample belongs, based on well-labeled training data with different fault types. Linear regression and SVM are commonly used as binary classifiers and are not suitable for classifying a wide range of faults. Neural network classifiers can slow down fault identification because of their slow convergence rate. Therefore, random forest is chosen as the method for classifying and identifying motor ITSC faults.
The basic structure of a random forest is a decision tree, and the main mathematical description of a decision tree is as follows: let the sample set S have m categories c i : (i = 1, 2, . .., m); s i is the number of samples belonging to c i , then the sample expectation entropy is where s and s i denote the total number of samples and the number of samples belonging to the category c i , respectively.For a single feature A of a sample, its expected entropy is: where k denotes the total number of sample features and s ij denotes the i-th dimensional feature of the sample belonging to the category where The entropy gain of feature A can be obtained: The entropy gain rate Gain (A) is calculated as where |s| .The random forest consists of several decision tree structures, and the main process of its classification is shown in Figure 4.  Let a random forest consist of ℎ ()ℎ () … ℎ () decision trees, for any two features  and  of the sample, with edge functions: where I(.) denotes the transformation function,  and j are the positive and negative categories determined by the random forest, respectively,  denotes the mean value, and the value of (, ) is proportional to the feature extraction effect.RF has a great advantage in dealing with high-dimensional data and having high adaptability to the dataset.Secondly, the advantage of RF is that the training speed is quick.The importance of variables is sorted according to certain rules.The implementation is relatively simple.The idea of the RF algorithm is that a new training set is generated by randomly selecting N sample subsets from the original sample training set with put-back repetitions, and then the RF consisting of N decision trees is generated.The new classification result is obtained by judging the selection result of each class by the decision tree.The subset of elements used to decide the ideal node splits approves weaker elements' representation in RF [36].The low correlation between trees in RF is executed by 
randomization of bootstrap sampling.RF performs well in applications for rotating machinery fault diagnosis.
Dung beetle optimizer (DBO) is a population intelligent optimisation algorithm [37].This approach is inspired by the biological behavior of dung beetles and exhibits strong optimization-seeking capabilities along with fast convergence speed.The DBO algorithm is proposed based on the rolling, dancing, foraging, stealing and breeding behaviours of dung beetles.Optimization aims to solve the number of decision trees b and the minimum leaf point tree m under the condition of satisfying the optimal solution.
When using DBO to optimise RF parameters, it is necessary to choose a suitable objective function to evaluate the advantages and disadvantages of each set of parameters.In this paper, when using RF to train each decision tree-based classifier, about 1/3 of the texture fusion feature parameters are not extracted for training, which is called Out Of Bag data.Out Of Bag data can replace the test set to estimate the generalization error of the RF model, so this paper chooses the error score rate of Out Of Bag data as the objective function of DBO for searching the optimal parameter adapted to the fault diagnosis model of the electric motor, which is calculated by the formula: where,  ,  are the number of correctly classified and incorrectly classified samples in the out-of-bag data; M1 represents the feature attributes of the original samples; and N1 is the upper limit of the size of the decision tree-based classifier.
With the RF and its parameter optimization process in place, the original sample set of GAF image texture feature parameters for the different states of the PMSM is divided proportionally into training and test sets, and PMSM faults are diagnosed with the constructed model, as shown in Figure 5.

Fault Diagnosis Framework
The general framework of the proposed method is shown in Figure 6 and described in detail as follows.
Step 1: Acquire time series data of PMSMs for different fault types. Measure the axial, radial and seat vibration signals of the motor in different states under different speed and load conditions.
Step 2: Compute the Gramian matrix for each of the three signals and assign each matrix to one of the RGB channels to form the MGADF image.
Step 3: Extract the image features using the fused Tamura-HOG-LBP texture features.
Step 4: Reduce the dimensionality of the extracted feature space with PCA.
Step 5: Use the DBO algorithm to find the optimal number of decision trees b and minimum number of leaf points m, and feed them into the RF classifier.
Step 6: Train the RF classifier and use it to classify and identify the input features.
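Step 2 of the framework can be sketched as follows. The sketch uses the standard Gramian Angular Difference Field definition (rescale to [-1, 1], take the arccosine, then form $\sin(\phi_i - \phi_j)$); the channel order (axial to R, radial to G, seat to B), the 8-bit scaling and the function names are assumptions made for illustration.

```python
import math


def gadf(signal):
    """Gramian Angular Difference Field of a 1-D signal as a nested list."""
    lo, hi = min(signal), max(signal)
    if hi > lo:
        # Rescale to [-1, 1] so that the arccosine is defined
        scaled = [((x - hi) + (x - lo)) / (hi - lo) for x in signal]
    else:
        scaled = [0.0 for _ in signal]  # constant signal
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]


def to_byte(value):
    """Map a GADF value in [-1, 1] to an 8-bit intensity."""
    return int(round((value + 1.0) / 2.0 * 255))


def mgadf_image(axial, radial, seat):
    """Stack three GADF matrices as the R, G and B channels of one image."""
    if not (len(axial) == len(radial) == len(seat)):
        raise ValueError("all three signals must have the same length")
    channels = [gadf(s) for s in (axial, radial, seat)]
    n = len(axial)
    return [[tuple(to_byte(ch[i][j]) for ch in channels) for j in range(n)]
            for i in range(n)]


# Tiny hypothetical signals; real inputs would be vibration segments
image = mgadf_image([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
print(image[0][1])
```

Each pixel then carries one value per sensor position, so a single image encodes the three vibration signals jointly.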

The length of the data segment to be coded is usually $2^n$, e.g., 64, 128, 256, and 512. As shown in Figure 7, the average accuracy of coding with MGADF is highest when the data length reaches 256. Accuracy declines for longer segments, because each pixel in the feature map created by the encoding has then been compressed and no longer accurately represents the features of the original data. The data length in the following text is therefore 256.

Experimental Design
The performance of PMSMs gradually degrades as their components deteriorate, which can lead to safety hazards and, in severe cases, to downtime accidents that cause substantial economic losses. Common PMSM faults include mechanical failures, winding short circuits and demagnetization faults. Among these, inter-turn short circuit faults, local demagnetization faults and eccentricity faults have similar fault characteristics, and there has been limited research on differentiating between them. This paper therefore focuses on these three fault types.
To verify the feasibility of the proposed method in practical applications, a PMSM fault simulation platform was constructed, as shown in Figure 8. In the experiment, the proposed algorithm is tested on a PMSM with pre-configured faults. The main components of the experimental platform include the PMSM under test, a load motor, an encoder, a touch screen, a Digital Signal Processor (DSP), a personal computer (PC), and a direct-current (DC) power supply. The parameters of the PMSM under test are shown in Table 1. The radial vibration data were measured in four states (the healthy state and three fault states whose vibration signals have similar time-domain characteristics): (1) Healthy Condition (HC); (2) Inter-turn Short circuit Fault (ITSF); (3) Local Demagnetization Fault (LDF); (4) Eccentricity Fault (EF). The details of the fault presets are shown in Figure 9.
For the ITSF, the PMSM is rewound and connectors are led from 1-30% of the total number of coils in the u, v and w phase windings to an external junction box. By connecting the terminals of the junction box, inter-turn short circuit faults with different numbers of short-circuited turns can be simulated on the PMSM; the experiment simulates the 20% short-circuited state of the u-phase winding. For the LDF, one permanent magnet is magnetized to only 70% of the nominal magnetic density, which simulates 30% local demagnetization. For the EF, the eccentric devices at the two ends of the PMSM are rotated to adjust the static eccentricity. The relationship between the rotation angle and the static eccentricity is $a = 0.8\sin(|\theta/2|)$, where $a$ is the static eccentricity and $\theta$ is the angle of the eccentric device, which is set to 20° during the experiment. The sampling frequency is 10 kHz, and each sample contains 2k points. The acceleration sensor parameters are shown in Table 2.
The number of samples for each working condition is described in Table 3. The speeds are 1000, 1500 and 2000 r/min, and the loads are no load, half load and rated load, giving a total of nine working conditions from which the dataset is constructed. There are 400 samples for each working condition. Each sample comprises the signals from the vibration sensors at three different positions, and the samples in the dataset are randomly selected. Each data subset is divided into training, validation and testing parts of 60%, 20% and 20%, respectively.
The MGADF images of the steady-state dataset under different load conditions at a speed of 1000 r/min are shown in Figure 11. There are evident differences between the various fault types, but the images of HC and LDF show a certain degree of similarity. For the same motor state, the MGADF images change little as the load goes from no load to rated load.
The MGADF images under different speed conditions at rated load are shown in Figure 12. The images exhibit intra-class inconsistency, varying significantly as the speed changes. Meanwhile, the HC and LDF images remain somewhat similar, which increases the inter-class ambiguity. Together, these effects make accurate fault diagnosis more difficult.
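Two numerical details of the experimental setup can be checked with a short sketch: the static eccentricity given by $a = 0.8\sin(|\theta/2|)$ at $\theta = 20°$, and the sizes of a random 60/20/20 split of the 400 samples of one working condition. The function names and the fixed random seed are assumptions for illustration, and the unit of $a$ is left as in the text.

```python
import math
import random


def static_eccentricity(theta_deg):
    """Static eccentricity a = 0.8 * sin(|theta / 2|), theta in degrees."""
    return 0.8 * math.sin(abs(math.radians(theta_deg) / 2))


def split_indices(n_samples, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Randomly split sample indices into training, validation and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


print(round(static_eccentricity(20.0), 4))
train, val, test = split_indices(400)
print(len(train), len(val), len(test))
```

With 400 samples per working condition, the split yields 240 training, 80 validation and 80 test samples.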

Fusion Texture Feature Extraction
Based on the dataset, feature vectors are extracted by fusing the Tamura, HOG and LBP texture features. The extracted features are visualized with t-SNE, which maps n-dimensional data to two or three dimensions while largely preserving the relative similarity of the original data points. Because t-SNE is a nonlinear rather than a linear mapping, it can capture the intricate manifold structure of high-dimensional data. Figure 13 shows that the features extracted by HOG are distinguishable to some extent, whereas the features extracted by Tamura and LBP are less distinguishable. Feature fusion combines the strengths of all three texture feature extraction methods and is the most effective at separating the features associated with the different PMSM faults.

Comparison with Other Classification Methods
To verify the superiority of the proposed classifier, Support Vector Machines (SVM), BP Neural Networks (BPNN), Radial Basis Function Neural Networks (RBFNN), CNN, RF and DBO-RF are compared. The input data for the proposed method and for all other methods are the fused texture features extracted from MGADF images, and the comparative experiments focus on accuracy, precision, recall, and F1-score. In the tables of diagnostic performance, bold text highlights the best diagnostic results.
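The four evaluation metrics named above can be computed from predicted and true labels as in the following sketch. Macro-averaging over the classes is an assumption, since the averaging scheme is not stated here, and the function name and toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1-score)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1_scores = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1_scores) / n


# Toy example (hypothetical labels): one HC sample is mistaken for LDF
y_true = ["HC", "HC", "LDF", "LDF"]
y_pred = ["HC", "LDF", "LDF", "LDF"]
print(macro_metrics(y_true, y_pred))
```

The toy case mirrors the HC/LDF confusion discussed for the MGADF images: a single misclassified HC sample lowers the recall of HC and the precision of LDF.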
Table 4 presents the classification results of the different methods. The proposed method attains the highest accuracy, precision, recall and F1-score, with mean values of 99.54%, 99.56%, 99.54% and 99.58%, respectively. The diagnostic accuracy of DBO-RF is improved compared with RF, and in terms of overall performance DBO-RF outperforms all other classifiers, combining the highest classification performance with high algorithmic stability. The proposed method also maintains a large advantage over the BPNN and RBFNN methods. In summary, DBO-RF has a clear advantage among intelligent diagnosis methods based on MGADF image coding.
This section also compares various signal-to-image methods, including GADF, GASF, Markov Transition Fields (MTF), and Recurrence Plots (RP). In Figure 14, the images are formed from an axial vibration signal at rated load and a speed of 1000 r/min. These images are created through point projections, which means that their colors and shapes lack interpretability. High-frequency noise can be observed in these images.
The diagnostic results are presented in Table 5, with 'a', 'b' and 'c' denoting the axial, radial, and seat positions of the motor, respectively. MGADF achieves the best values on all four metrics: 99.54%, 99.56%, 99.54% and 99.58%. Among the other four signal-to-image methods, GADF-a has the highest average accuracy of 98.86%, which is 0.68 percentage points lower than the proposed method. The lowest average accuracy is 91.66% for RP-b, which is 7.88 percentage points lower than MGADF.
The average generation time for each image is listed in Table 6. The most efficient method is GADF, which takes 0.1152 s, followed by GASF and MGADF, which take 0.1153 s and 0.1221 s, respectively. All other signal-to-image transformation methods take longer. The proposed method is therefore competitive in both reliability and efficiency, and it achieves reliable motor diagnosis under varying speed and load conditions.

Effect of Signal-to-Noise Ratio
To check that the high accuracy of the algorithm is not a result of overfitting, its noise immunity is also investigated [38]. To simulate data collected from the same type of motor in different environments, Gaussian white noise with different signal-to-noise ratios (SNRs) is added to the original test dataset [39]. The SNR is set to 5, 10, or 20 dB and is defined as

$$\mathrm{SNR(dB)} = 10 \log_{10}\left(\frac{P_{signal}}{P_{noise}}\right) \qquad (29)$$

where $P_{signal}$ denotes the power of the original signal and $P_{noise}$ denotes the power of the noise signal.
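Adding Gaussian white noise at a prescribed SNR follows directly from Equation (29): the noise power is $P_{noise} = P_{signal} / 10^{\mathrm{SNR}/10}$. The sketch below applies this to a synthetic sine wave; the 50 Hz test tone, the seed and the function names are assumptions for illustration, while the 10 kHz sampling rate follows the experimental setup.

```python
import math
import random


def signal_power(x):
    """Mean-square power of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_white_noise(signal, snr_db, rng):
    """Return the signal plus zero-mean Gaussian noise at the given SNR (dB)."""
    p_noise = signal_power(signal) / (10 ** (snr_db / 10))
    sigma = math.sqrt(p_noise)  # standard deviation of the noise
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Recompute the SNR from the clean signal and its noisy version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(signal_power(clean) / signal_power(noise))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fs = 10_000  # sampling frequency in Hz
clean = [math.sin(2 * math.pi * 50 * t / fs) for t in range(20_000)]
for snr in (5, 10, 20):
    noisy = add_white_noise(clean, snr, rng)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

The measured SNR only approximates the target because the noise is a finite random sample; the deviation shrinks as the signal gets longer.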
Gaussian white noise with a signal-to-noise ratio of 5 dB is added to an original signal to demonstrate the impact of noise on the vibration signal. The comparison is depicted in Figure 15, which reveals differences in the amplitude and fluctuation trends of the signals.
Gaussian white noise at signal-to-noise ratios of 5 dB, 10 dB, and 20 dB was added to the original test dataset to evaluate the overall test accuracy, and all methods were assessed at these three noise levels. As illustrated in Figure 16, the performance of all algorithms deteriorates as the noise level increases. Across the different noise levels, the proposed algorithm outperforms all other techniques. These results demonstrate that the algorithm presented in this paper exhibits high noise immunity and achieves a consistently high overall test accuracy.

Conclusions
To tackle the challenges associated with the reliability and stability of fault diagnosis methods for PMSMs in industrial manufacturing, a fault diagnosis method based on image features of multi-sensor fusion is proposed. For different types of motor faults, vibration acceleration signals of the PMSM under varying speed and load conditions were collected by sensors placed at different positions. The Gramian matrix of each signal is computed separately, and the three matrices are assigned to the red, green and blue channels to synthesise the final MGADF image. Image feature vectors are extracted through the fusion of Tamura-HOG-LBP texture features, and several machine learning methods are compared for the tasks of fault feature learning and classification. The results show that the proposed diagnostic method has the best diagnostic accuracy and robustness, with an average diagnostic accuracy of 99.54%. By combining multiple sensor signals into one input image, the technique makes fuller use of the data collected in industrial settings and improves robustness against varying environmental conditions. It performs well even in noisy environments. The method is non-intrusive and can be extended to condition monitoring and diagnosis of industrial motors, offering prospects for practical industrial applications in motor fault diagnosis.
In future work, improvements will be made in the following two areas for better industrial applications. (1) PMSMs can encounter load or speed variations during operation, whereas this paper primarily examines scenarios with constant load and speed, so the accuracy of the proposed method may be affected when it is applied to situations with load or speed fluctuations. Future research will address fault diagnosis for PMSMs under transient load or speed changes. (2) In this study, only vibration signals are fused for fault diagnosis. In the future, the fusion of current, vibration, and temperature signals will be explored to achieve enhanced fault diagnosis capabilities.

Figure 2. The multi-texture feature fusion extraction process.

Figure 3. HOG feature extraction process. The image is divided into two layers, and the first layer is composed of interconnected cell units. Several cells form a block, and blocks can overlap.

Figure 6. Fault diagnosis framework.

Figure 7. Accuracy of the different data lengths.

Motor states: (1) Healthy Condition (HC); (2) Inter-turn Short circuit Fault (ITSF); (3) Local Demagnetization Fault (LDF); (4) Eccentricity Fault (EF). The details of the fault preset are shown in Figure 9.
Sensors 2023, 23, 8592

Figure 8. Overview of the test platform.

Figure 10. The vibration signals, color features and MGADF images.

Figure 11. The MGADF images of HC, ITSF, LDF, and EF motors under different load conditions at the speed of 1000 r/min.

Figure 12. The MGADF images of HC, ITSF, LDF, and EF motors under different speed conditions at rated load.

Figure 13. Feature visualisation for different feature extraction methods.

Figure 15. Comparison of the original signal and the signal after adding Gaussian white noise. (a) Original signal. (b) Signal with Gaussian white noise.

Figure 16. Comparison of the overall test accuracy with different levels of noise.

Table 1. Rated parameters of tested PMSM.

Table 2. Parameters of vibration sensor.

Table 3. Sample description of the dataset. Columns: Status; Size of the Dataset (Training Set/Validation Set/Testing Set).

Table 4. Comparison with other classification methods.

Table 5. Diagnostic results of different signal-to-image methods.

Table 6. The average generation time for each image.