Effectiveness Analysis of PMSM Motor Rolling Bearing Fault Detectors Based on Vibration Analysis and Shallow Neural Networks

Abstract: Permanent magnet synchronous motors (PMSMs) are becoming more popular, both in industrial applications and in electric and hybrid vehicle drives. Unfortunately, like other drives, they are not failure-free. As in drive systems with induction motors, the rolling bearings often fail. This paper focuses on the possibility of detecting this type of mechanical damage by analysing mechanical vibrations supported by shallow neural networks (NNs). For the extraction of diagnostic symptoms, the Fast Fourier Transform (FFT) was used, as well as the Hilbert transform (HT) to obtain the envelope signal, which was then subjected to FFT analysis. Three types of neural networks were tested to automate the detection process: multilayer perceptron (MLP), neural network with radial basis function (RBF), and Kohonen map (self-organizing map, SOM). The input signals of these networks were the amplitudes of harmonic components characteristic of damage to bearing elements, obtained as a result of FFT or HT analysis of the vibration acceleration signal. The effectiveness of the analysed NN structures was compared with respect to the influence of the network architecture and various parameters of the learning process on the detection effectiveness.


Introduction
Today, permanent magnet synchronous motors (PMSMs) are gaining popularity in industrial applications. They are also widely used in the drive systems of electric and hybrid vehicles. This is due to their numerous advantages, such as high efficiency, the possibility of high torque overload, and a high power-to-weight ratio. They work well in applications that require both high precision and high dynamics [1,2]. Reliability is an equally important requirement for a modern drive system. Very good operating parameters do not make the PMSM failure-free. The factors causing damage may include excessive current, high operating temperature, inadequate lubrication, or corrosion [2]. The introduction of diagnostics of the drive system enables the detection of damage to motor components already in the initial stage of its development. Usually, the failure of one of the components leads to subsequent failures. Monitoring makes it possible to plan the replacement of the damaged part and prevent failure of subsequent components. Appropriate diagnostics improves safety and extends the service life of the machine [3]. The rolling bearings discussed in this article are the most frequently damaged components of medium-power AC (alternating current) motors. They account for about 40% of all failures in electrical machines, while stator, rotor, and other failures account for 36%, 9%, and 14%, respectively [4]. Rolling bearings are responsible for maintaining a constant air gap between the rotor and the stator. Therefore, their failure gradually leads to the failure of the entire machine [3,4].
Western Reserve University to train a Self-Organizing Map (SOM) to classify damages to rolling bearings. Differences of 70 features, such as statistical features, frequency domain features, autoregressive model coefficients, wavelet packet decomposition analysis, and empirical mode decomposition entropy energy were used as SOM inputs. Two techniques were used to select the optimal features (extended relief and min redundancy max relevance (mRMR) algorithms), which made it possible to obtain as much as 100% effectiveness for 33 features. However, a practical realisation of such extended fault detectors or classifiers is rather difficult, especially when simple processors (e.g., of the ARM type) are to be used and such solutions are to be applied in industry.
Although, as can be seen from the above literature overview, there are examples of the use of selected classic NN structures for the diagnostics of rolling bearings in variable speed PMSM drives, there is a lack of comparison of the effectiveness of various neural structures in the detection and/or classification of bearing failures carried out for a wide range of operating conditions of such a drive.
Hence, the aim of this study is a comparative analysis of the effectiveness of three selected types of NNs for the detection of damage to rolling bearings in a PMSM drive. Two methods of vibration signal processing were used to obtain diagnostic symptoms: the fast Fourier transform (FFT), and the Hilbert transform (HT) used to obtain the envelope signal, which was then subjected to FFT analysis. The preliminary analysis made it possible to distinguish several diagnostic symptoms characteristic of the failure of rolling bearings. The article presents and compares the possibility of using MLPs, RBF networks, and Kohonen maps for the detection and classification of bearing damage. Two variants of the NN detector operation were analysed: the first one determined whether there was damage or not (fault detector), the second one identified the type of damaged structural element of the bearing (fault classifier). In addition, the impact of the NN topology, activation function, learning method and learning rate on the detection/classification efficiency was analysed. It was assumed that the developed neural detectors and classifiers should achieve accuracy at the level of 95-100%, and not less than 90%. The article presents selected results of the operation of the developed and tested neural detectors of rolling bearing damage, based on vibration acceleration signals acquired in the laboratory set-up with a PMSM drive.
The first part of the article contains a review of the literature on the diagnosis of rolling bearings, with particular emphasis on PMSM drives, and the motivation of the research presented. In the following two sections, the characteristics of the damage to rolling bearings and the methods used for the initial signal analysis are discussed. In the fourth section, the applied NN structures are briefly characterized. In the fifth section, the experimental test results of the shallow NN-based rolling bearing fault detectors and classifiers are presented. These tests were carried out with the use of FFT and HT as preprocessing methods for selecting the most significant symptoms in the vibration acceleration signals, for the analysed NN structures with different architectures and parameters. The article ends with a short conclusion and summary of the results obtained.

Brief Characteristic of Bearing Failure Symptoms
Rolling bearings are elements of a drive enabling rotation of the rotor with minimal mechanical losses. The bearings allow minimizing the resistance to movement of the shaft and keep it in the right position. Therefore, bearings are the elements most intensively subject to the wear process [3,4]. In general, rolling bearings consist of a cage in which the rolling elements are placed and properly separated. The cage is located between the inner ring, which is placed on the machine shaft, and the outer ring [21]. The rolling bearing structure is shown in Figure 1. Due to their nature, damage to rolling bearings can be divided into two types [24,25]:
• Point damages (cavities, splinters of small fragments);
• Diffuse damages (surface deformation, corrosion damage, unevenness).
Both types of damage lead to different symptoms. Point damages cause vibrations in the low frequency range. Diffuse damages are detected by observing the vibration spectrum over a wide range. Due to their nature, they are difficult to describe in terms of specific frequencies. The occurrence of damage and its further deepening causes a change in the geometry of the rolling elements and an increase in the vibration and noise level of the motor during operation. As the damage progresses, it may lead to eccentricity of the rotor in relation to the stator and to damage to the mechanical connection with the working machine. Like any element of the electric drive, the bearings have a long service life and their natural damage results from the nature and time of operation. Premature failures can occur for many reasons. Most often, they result from improper assembly; they are also often caused by improper selection of the bearing for the drive requirements and by its improper operation [26-29]. Depending on which element of the bearing has a point damage, different damage symptoms appear in the spectrum of the mechanical vibration signal [13,21,24,25].
The symptoms defining the bearing condition are obtained by determining the amplitudes of the characteristic frequencies occurring in the vibration spectrum. A significant increase in these amplitudes indicates a point damage to a given element. To calculate the characteristic frequencies, it is necessary to find the rotational frequency $f_r$ of the motor shaft:
$$f_r = \frac{n}{60}$$
where: $n$ - motor velocity (rpm). Then the frequencies of the characteristic harmonics for failures of individual bearing elements can be determined on the basis of the formulas below [21,22,24]:
$$f_{bc} = \frac{1}{2} f_r \left( 1 - \frac{D_b}{D_c} \cos\beta \right)$$
$$f_{or} = \frac{N_b}{2} f_r \left( 1 - \frac{D_b}{D_c} \cos\beta \right)$$
$$f_{ir} = \frac{N_b}{2} f_r \left( 1 + \frac{D_b}{D_c} \cos\beta \right)$$
$$f_{re} = \frac{D_c}{D_b} f_r \left( 1 - \frac{D_b^2}{D_c^2} \cos^2\beta \right)$$
where: $N_b$ - number of rolling elements (balls), $D_b$ - rolling element diameter, $D_c$ - bearing pitch diameter, $\beta$ - bearing working angle (0° for rolling bearing), $f_{bc}$, $f_{or}$, $f_{ir}$, $f_{re}$ - frequencies specific for a given failure: bearing cage, outer race, inner race, and rolling element. It is clearly seen that the frequencies of the characteristic harmonics depend only on the shaft frequency (which, in the PMSM, is independent of the load torque, contrary to the IM) and on the rolling bearing geometry.
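As an illustration, the characteristic frequencies can be computed with a few lines of Python. This is a minimal sketch assuming the standard kinematic formulas for the cage, outer race, inner race, and rolling element; the function name and the bearing geometry in the example (8 balls, 10 mm ball diameter, 40 mm pitch diameter) are hypothetical and do not describe the tested bearing. The rolling element formula is taken here without the factor 1/2 that some references include (ball spin frequency), which is also an assumption.

```python
import math


def bearing_fault_frequencies(n_rpm, n_balls, d_ball, d_pitch, beta_deg=0.0):
    """Return the rotational and characteristic fault frequencies in Hz."""
    f_r = n_rpm / 60.0  # shaft rotational frequency
    ratio = (d_ball / d_pitch) * math.cos(math.radians(beta_deg))
    return {
        "f_r": f_r,
        "f_bc": 0.5 * f_r * (1.0 - ratio),            # bearing cage
        "f_or": 0.5 * n_balls * f_r * (1.0 - ratio),  # outer race
        "f_ir": 0.5 * n_balls * f_r * (1.0 + ratio),  # inner race
        # Rolling element; assumed form without the 1/2 factor.
        "f_re": (d_pitch / d_ball) * f_r * (1.0 - ratio ** 2),
    }


# Hypothetical geometry, for illustration only.
freqs = bearing_fault_frequencies(n_rpm=1500, n_balls=8, d_ball=10.0, d_pitch=40.0)
for name, value in freqs.items():
    print(f"{name} = {value:.3f} Hz")
```

A quick consistency check for such a routine is that the outer and inner race frequencies always sum to $N_b f_r$, regardless of the geometry.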

Methods Used for Failure Symptoms Analysis in the Vibration Signal
In this work, two methods of vibration signal analysis were used to prepare the input signals of the developed neural detectors: the classical Fast Fourier Transform (FFT) and FFT of the vibration acceleration envelope signal calculated using the Hilbert transform (ENV).

Fast Fourier Transform
The classic FFT is the transformation most often used in machine diagnostics. It converts a signal from the time domain to the frequency domain and, in continuous form, can be written as follows [30]:
$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2 \pi f t}\, dt$$
The Fourier transform assumes that the analysed signals are periodic and stationary, and decomposes the analysed signal into a specified number of sinusoidal components with specified frequencies. In most cases, real signals do not meet the periodicity and stationarity conditions. In addition, the sampling frequency must satisfy Shannon's sampling theorem, and the signal should be sampled synchronously with its fundamental frequency, so that the length of the analysed signal fragment is an integer multiple of its fundamental harmonic period. For signals that meet these criteria, the obtained spectrum is correct; otherwise, the spectrum is blurred (spectral leakage). This blurring makes the amplitude determination inaccurate and can have a significant influence on the interpretation of the results. The transition from the time domain to the frequency domain means that the moment at which a given component appears does not affect its place in the spectrum. Because this time information is lost, the signal subjected to such analysis should be stationary.
The Discrete Fourier Transform (DFT) is used for signal analysis [30]:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi k n / N}, \quad k = 0, 1, \ldots, N-1$$
where: $x(n)$ - real discrete signal of finite length $N$, generated during continuous signal sampling.
The basic form used to present results after the Discrete Fourier Transform (DFT) is the frequency spectrum with harmonic magnitudes calculated according to:
$$|X(k)| = \sqrt{\mathrm{Re}^2[X(k)] + \mathrm{Im}^2[X(k)]}$$
For the purpose of scaling the spectrum in frequency units, each $|X(k)|$ sample is attributed the frequency $f(k)$ calculated according to the following expression:
$$f(k) = \frac{k f_p}{N}$$
where: $f_p$ is the signal sampling frequency and $N$ the number of all signal samples.
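The magnitude calculation and the frequency scaling can be sketched in Python as follows. This is a minimal illustration that evaluates the DFT sum directly with the standard library (an FFT routine such as `numpy.fft.rfft` would normally be used for long records); the function name `dft_magnitudes`, the 50 Hz test tone, and the sampling settings are assumptions chosen for the example, not values from the measurements.

```python
import cmath
import math


def dft_magnitudes(x, f_p):
    """Return (frequencies, magnitudes) of the one-sided DFT of x.

    x   -- list of real samples
    f_p -- sampling frequency in Hz
    """
    n_samples = len(x)
    freqs, mags = [], []
    for k in range(n_samples // 2 + 1):
        # Direct evaluation of X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N)
        x_k = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n in range(n_samples))
        freqs.append(k * f_p / n_samples)  # f(k) = k * f_p / N
        mags.append(abs(x_k))              # |X(k)|
    return freqs, mags


# Test tone: 50 Hz sine, 200 samples at 400 Hz (exactly 25 full periods).
f_p = 400.0
N = 200
x = [math.sin(2.0 * math.pi * 50.0 * n / f_p) for n in range(N)]
freqs, mags = dft_magnitudes(x, f_p)
k_peak = max(range(len(mags)), key=mags.__getitem__)
peak_freq = freqs[k_peak]
print(peak_freq, mags[k_peak])
```

Because the record contains an integer number of periods, the whole energy of the tone falls into a single bin; shortening the record to a non-integer number of periods spreads it over neighbouring bins, which is the blurring effect discussed above.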

Hilbert Transform
The continuous Hilbert transform of any time course $x(t)$ has the following form [31]:
$$\hat{x}(t) = H[x(t)] = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau$$
The analytical signal $a(t)$ created from the real signal $x(t)$ and its Hilbert transform has the following form [31]:
$$a(t) = x(t) + j \hat{x}(t) = a_{ENV}(t)\, e^{j \phi(t)}$$
where: $a_{ENV}(t)$ - instantaneous amplitude of the envelope of the original signal $x(t)$, $\phi(t)$ - instantaneous phase, calculated as follows [31,32]:
$$a_{ENV}(t) = \sqrt{x^2(t) + \hat{x}^2(t)}, \qquad \phi(t) = \arctan \frac{\hat{x}(t)}{x(t)}$$
Analysis of the $a_{ENV}(t)$ signal makes it possible to find the frequencies characteristic for failures in the medium-frequency range of the spectrum. Finding the characteristic frequencies in the low-frequency range requires high frequency resolution, which in turn requires steady-state operation of the machine; this is difficult to maintain in real conditions. A characteristic feature of envelope analysis based on the Hilbert transform is its insensitivity to vibrations originating from sources other than bearing damage. The natural frequency of the diagnosed motor is also eliminated from the resulting signal, which is a significant advantage of this transformation. HT improves the symptom extraction process, and thus the effectiveness of the diagnostic system, even in the case of significant disturbances in the measurement signal [31].
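The envelope extraction can be sketched in Python as follows. This is a minimal sketch, not the processing chain used in the measurements: the analytic signal is built in the frequency domain (negative-frequency bins set to zero, positive ones doubled), which is the usual discrete counterpart of the Hilbert transform, and a direct DFT from the standard library replaces an FFT routine. The function names and the amplitude-modulated test signal (64 Hz carrier, 8 Hz modulation) are assumptions made for the example.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct (O(N^2)) discrete Fourier transform or its inverse."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
               for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def envelope(x):
    """Envelope |x + j*H{x}| of a real signal of even length."""
    n = len(x)
    spectrum = dft(x)
    gain = [0.0] * n          # negative frequencies are removed
    gain[0] = 1.0             # DC bin kept as is
    gain[n // 2] = 1.0        # Nyquist bin kept as is
    for k in range(1, n // 2):
        gain[k] = 2.0         # positive frequencies doubled
    analytic = dft([s * g for s, g in zip(spectrum, gain)], inverse=True)
    return [abs(a) for a in analytic]


# Hypothetical test signal: 64 Hz carrier modulated at 8 Hz.
f_p, N = 256.0, 256
t = [n / f_p for n in range(N)]
x = [(1.0 + 0.5 * math.cos(2.0 * math.pi * 8.0 * ti))
     * math.cos(2.0 * math.pi * 64.0 * ti) for ti in t]

env = envelope(x)
mean_env = sum(env) / N
env_spectrum = [abs(v) for v in dft([e - mean_env for e in env])]
k_peak = max(range(1, N // 2), key=env_spectrum.__getitem__)
envelope_peak_hz = k_peak * f_p / N
print(envelope_peak_hz)
```

In the envelope spectrum the carrier no longer appears; only the modulating component remains, which is why fault-related modulation is easier to read from the envelope than from the raw spectrum.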

Neural Network Structures Used
In this study, bearing failure detectors using various structures of shallow neural networks were analysed and their effectiveness in the detection and classification of the analysed failures was compared. The following NNs were taken into account: multilayer perceptrons (MLPs), networks with radial basis functions (RBFs), and self-organizing Kohonen maps (self-organizing maps, SOMs).

Multilayer Perceptrons
Feedforward MLPs are the most popular NNs in various industrial applications, and consist of the input layer, one or more hidden layers, and the output layer. Each neuron in each layer is connected to every neuron of the next layer; there are no connections between neurons of the same layer, and the data are processed in parallel. This type of NN performs a global approximation [33,34]. This means that many neurons simultaneously take part in mapping each element of the input vector. The structure of the MLP network with one hidden layer is shown in Figure 2.
In general, the MLP network can be described as follows:
$$y_k = f_2\left( \sum_i w_{ki}\, f_1\left( \sum_j w_{ij} x_j + w_{i0} \right) + w_{k0} \right)$$
where: $y_k$ - $k$-th output of the network, $x_j$ - $j$-th input of the network, $w_{ij}$, $w_{ki}$ - weights of the first and second layers, respectively, $w_{i0}$, $w_{k0}$ - biases in the first and second layers, respectively, $f_1$, $f_2$ - activation functions of the respective layers.
Figure 2. Example of structure of the feedforward multilayer perceptron (MLP) network.
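The forward pass of an MLP with one hidden layer can be sketched in Python as follows. This is only an illustration under stated assumptions: a hyperbolic tangent activation in the hidden layer and a linear output layer are chosen here, and the function name, the network size, and all weight values are hypothetical.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP.

    w_hidden[i][j] -- weight from input j to hidden neuron i
    w_out[k][i]    -- weight from hidden neuron i to output k
    """
    # Hidden layer: weighted sum of the inputs plus bias, then tanh.
    hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum of the hidden outputs plus bias (linear).
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


# Hypothetical 2-1-1 network.
y = mlp_forward(x=[1.0, 0.0],
                w_hidden=[[1.0, 0.0]], b_hidden=[0.0],
                w_out=[[2.0]], b_out=[0.5])
print(y)
```

Every hidden neuron contributes to every output, which is the global approximation property mentioned above.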
The MLP network training process consists of modifying the weights so as to minimize the objective function, usually the mean square error of the output value in relation to the expected value. In this work, the Levenberg-Marquardt (L-M) algorithm [35,36] was used to train the MLP networks. It combines the features of the Gauss-Newton algorithm and the steepest descent algorithm: it behaves like gradient descent far from the minimum of the objective function and approaches the Gauss-Newton method close to the minimum. The general form of the L-M learning algorithm is described by Equation (15) [35,37,38]:
$$w_{n+1} = w_n - \eta \left( J_n^T J_n + \mu I \right)^{-1} J_n^T e_n \tag{15}$$
where: $w_n$ - weights of the network, $\mu$ - regularization factor determining the algorithm operation, $J_n$ - Jacobian matrix, $e_n$ - learning error, $\eta$ - learning rate, $I$ - unit matrix.
The difference from the Gauss-Newton formula is the addition of the unit matrix multiplied by the regularization factor µ, which is always positive and is modified in the learning process. By changing this coefficient, the algorithm moves continuously between the steepest descent method and the Gauss-Newton method. For values of µ close to zero, the weights are calculated in a manner similar to the Gauss-Newton method. Increasing the value of µ increases the importance of the direction of improvement determined on the basis of the error function gradient. Apart from determining the share of the individual strategies in the obtained result, µ also determines the step size towards improvement [35,37,38].
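The role of µ can be illustrated with a small Python sketch of the L-M iteration. It is not the training code used for the networks: a two-parameter linear model $y = ax + b$ stands in for the network, the learning rate is fixed at 1, and the rule of dividing or multiplying µ by 10 after a successful or failed step is a common convention assumed here. The function name and the data are hypothetical.

```python
def lm_fit_line(xs, ys, mu=1.0, iterations=50):
    """Fit y = a*x + b with a basic Levenberg-Marquardt loop."""
    a, b = 0.0, 0.0

    def sse(a_, b_):
        # Sum of squared errors for the candidate parameters.
        return sum((a_ * x + b_ - y) ** 2 for x, y in zip(xs, ys))

    # The Jacobian rows are [x_i, 1], so J^T J is constant for this model.
    jtj00 = sum(x * x for x in xs)
    jtj01 = sum(xs)
    jtj11 = float(len(xs))

    for _ in range(iterations):
        e = [a * x + b - y for x, y in zip(xs, ys)]   # errors e_n
        g0 = sum(x * ei for x, ei in zip(xs, e))      # first entry of J^T e
        g1 = sum(e)                                   # second entry of J^T e
        # Solve (J^T J + mu*I) * delta = J^T e for the 2x2 case.
        h00, h11 = jtj00 + mu, jtj11 + mu
        det = h00 * h11 - jtj01 * jtj01
        da = (h11 * g0 - jtj01 * g1) / det
        db = (h00 * g1 - jtj01 * g0) / det
        if sse(a - da, b - db) < sse(a, b):
            # Improvement: accept the step and move towards Gauss-Newton.
            a, b, mu = a - da, b - db, mu / 10.0
        else:
            # No improvement: reject the step and move towards steepest descent.
            mu *= 10.0
    return a, b


a_fit, b_fit = lm_fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a_fit, b_fit)
```

With a large µ the matrix is dominated by µI and the step is a short move along the negative gradient; with µ near zero the step solves the linearized least-squares problem directly.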
The Levenberg-Marquardt algorithm is very sensitive to the initial weight values, which are selected randomly. An improvement to the above method is the addition of Bayesian regularization, which improves the generalization properties of the network. Regularization changes the objective function: the aim is not only to minimize the mean square error, but also to achieve it with the lowest possible weights. The objective function takes the following form [39,40]:
$$F = \beta E + \alpha E_w$$
where: $E$ - sum of mean square errors, $E_w$ - sum of squared weights, $\beta$ - learning factor, $\alpha$ - decay rate. The $\alpha$ factor enforces low weight values, which greatly reduces the tendency of the network to over-fit. This modification also provides greater immunity to noise and incorrect input data, but is more time-consuming [39,40].
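A short Python sketch shows how the weight penalty enters the objective. It assumes the common form in which the error term is weighted by β and the sum of squared weights by α; the function name and the numbers are hypothetical, and the Bayesian update of α and β during training is not shown.

```python
def regularized_objective(errors, weights, alpha, beta):
    """Objective with weight decay: beta * E + alpha * E_w."""
    e_sum = sum(e * e for e in errors)    # E: sum of squared errors
    w_sum = sum(w * w for w in weights)   # E_w: sum of squared weights
    return beta * e_sum + alpha * w_sum


# Two hypothetical weight sets giving the same output errors.
errors = [1.0, -2.0]
small = regularized_objective(errors, [0.5, -0.5], alpha=0.1, beta=1.0)
large = regularized_objective(errors, [3.0, -4.0], alpha=0.1, beta=1.0)
print(small, large)
```

For equal errors, the solution with the smaller weights gets the lower objective value, so the optimizer is steered towards it.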

Radial Basis Networks
Other structures used in this research are radial networks (radial basis function networks, RBFs), i.e., feedforward NNs with one hidden layer in which the activation functions are radial. Each element of the input vector is mapped by a particular radial neuron of the hidden layer. The output layer neurons are linear and are responsible for summing the outputs of the individual neurons of the hidden layer. The structure of the RBF network is shown in Figure 3.
Contrary to MLP networks, whose neurons perform the task of global stochastic approximation of functions of many variables, in RBF networks the mapping of the full set of input data is the sum of local mappings [34]. Output values of the RBF network are obtained as a sum of products of the RBF neuron outputs and weight factors:
$$y_k = \sum_j w_{jk}\, \varphi_j(X) + w_{k0}$$
with the outputs of the RBF neurons obtained as follows:
$$\varphi_j(X) = \exp\left( -\frac{\upsilon_j^2(X)}{2 \sigma^2} \right)$$
where: $y_k$ - $k$-th output of the network, $\varphi_j(X)$ - output of the $j$-th RBF neuron, $w_{jk}$ - weight factor between the $j$-th output of the hidden (RBF) layer and the $k$-th neuron of the NN output, $w_{k0}$ - bias of the $k$-th output neuron, $\upsilon_j(X)$ - the Euclidean distance between the input vector values and the centres of the RBF functions, $\sigma$ - spread factor of the activation function. The Euclidean distances between the input vector values and the centres of the RBF functions are calculated as:
$$\upsilon_j(X) = \lVert X - C_j \rVert$$
where: $X = [x_1, x_2, x_3, \ldots, x_N]^T$ is the input vector, $C_j$ - vector related to the centre of each RBF neuron. The design process for RBF networks is much simpler than for MLP. The learning procedure consists of three stages that include [41]:
• Choice of the centres, $C_j$, of the hidden radial basis neurons;
• Choice of the spread parameter, $\sigma$ - width of the radial function for each hidden neuron;
• Determination of the weight factors, $w_{jk}$, between hidden (radial) and output layer.
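The output calculation of an RBF network can be sketched in Python as follows. This is a minimal sketch assuming a Gaussian radial function of the form $\exp(-\upsilon^2 / (2\sigma^2))$ with a common spread for all hidden neurons; the function name, the centres, and the weights are hypothetical values chosen for the example.

```python
import math


def rbf_forward(x, centres, sigma, weights, biases):
    """Outputs of an RBF network with Gaussian hidden neurons.

    centres[j]    -- centre vector C_j of hidden neuron j
    weights[k][j] -- weight between hidden neuron j and output k
    biases[k]     -- bias of output neuron k
    """
    # Hidden layer: Gaussian of the squared Euclidean distance to each centre.
    phi = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                    / (2.0 * sigma ** 2))
           for c in centres]
    # Output layer: linear combination of the hidden outputs plus bias.
    return [sum(w * p for w, p in zip(row, phi)) + b
            for row, b in zip(weights, biases)]


# Hypothetical network with two centres and one output.
centres = [[0.0, 0.0], [10.0, 10.0]]
y_near_first = rbf_forward([0.0, 0.0], centres, sigma=1.0,
                           weights=[[3.0, 5.0]], biases=[0.5])
print(y_near_first)
```

An input lying on the first centre activates only the first hidden neuron, so the second weight has practically no effect on the output; this is the local mapping that distinguishes RBF networks from MLPs.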
The first step may be carried out randomly with an even distribution. For more complicated problems, data clustering algorithms are used. The most popular is the k-means algorithm [41], which requires the number of clusters to be specified in advance. In general, the k-means algorithm aims to place the centres of the RBF neurons in the areas with the most significant data [42,43]. The weights of the output layer are determined by a supervised learning method, e.g., according to the delta rule. The learning time is much shorter than for MLP networks and the problem of local minima is avoided.

Kohonen Maps
The last NN structures used in this work to detect damage to rolling bearings are self-organizing maps (SOMs). Network learning is based on the built-in mechanism of competition and neighbourhood assessment. The SOM is formed by the input and output (competition) layers. The construction is less complicated than that of the MLP and RBF networks due to the lack of a hidden layer (Figure 4). As for the similarities, these are also feedforward networks, and each input layer element is connected to each output layer element. The output signal $y_k$ of the $k$-th neuron is described by the vector relationship:
$$y_k = W_k^T X = \lVert W_k \rVert\, \lVert X \rVert \cos\phi_k$$
Since the input $X$ and the weight $W_k$ vectors are normalized in the network, the output value $y_k$ (neuron activation) is determined by the angular difference $\phi_k$ between the vectors $X$ and $W_k$.
Self-organizing networks make it possible to find unknown connections between input data, using two learning methods: Winner Takes All (WTA) and Winner Takes Most (WTM). In the WTA method, only the winning neuron is adapted. Adaptation consists in bringing the weights of the winning neuron closer to the input pattern. The WTM algorithm assumes that neurons from an appropriately defined neighbourhood (e.g., a Gaussian neighbourhood function) are also adapted. Usually, Euclidean distance is taken as the neighbourhood measure [44-46].
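The difference between the WTA and WTM rules can be sketched in Python as a single adaptation step. The sketch makes several assumptions that are not taken from the experiments: the map is one-dimensional, the winner is the neuron with the smallest Euclidean distance to the input, the neighbourhood is Gaussian in the map index, and the function name, learning rate, and weights are hypothetical.

```python
import math


def som_update(weights, x, learning_rate=0.5, radius=0.0):
    """One adaptation step of a 1-D SOM; returns the winner index.

    radius == 0 -- Winner Takes All: only the winner is adapted.
    radius > 0  -- Winner Takes Most: Gaussian neighbourhood of the winner.
    """
    dists = [math.dist(w, x) for w in weights]
    winner = dists.index(min(dists))
    for k, w in enumerate(weights):
        if radius > 0.0:
            g = math.exp(-((k - winner) ** 2) / (2.0 * radius ** 2))
        else:
            g = 1.0 if k == winner else 0.0
        # Move the weight vector towards the input pattern.
        weights[k] = [wi + learning_rate * g * (xi - wi)
                      for wi, xi in zip(w, x)]
    return winner


x = [1.0, 0.8]

wta_weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
wta_winner = som_update(wta_weights, x, learning_rate=0.5, radius=0.0)

wtm_weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
wtm_winner = som_update(wtm_weights, x, learning_rate=0.5, radius=1.0)

print(wta_winner, wta_weights)
print(wtm_winner, wtm_weights)
```

In the WTA step only the middle neuron moves, whereas in the WTM step its two neighbours are also pulled towards the input, with a smaller gain.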

Description of the Laboratory Set-Up
In order to obtain training and testing data for the developed NN detectors, the vibration acceleration signal of the tested PMSM (Lenze MCS14H15), with a rated power of 2.5 kW ($U_N$ = 325 V, $I_N$ = 6.6 A, $T_N$ = 21 Nm, $f_s$ = 100 Hz, $n_N$ = 1500 rpm, $p_p$ = 2), was measured. It should be highlighted that the presented research used original measurement data, obtained in a laboratory set-up developed by the authors, for a PMSM supplied from an industrial frequency converter operating in vector control mode, as encountered in real industrial conditions. The authors chose not to use generally accessible databases, the analysis of which has been presented in numerous articles. The tested motor was powered by a 4 kW Lenze E84VTCE40245X0 converter. This device makes it possible to control the motor in one of several vector or scalar modes. The load torque for the tested PMSM was generated by a second Lenze motor powered by the E84AVTCE40245X0 converter with a rated power of 4 kW. The tested motor with the loading motor and the supplying inverters are shown in Figure 5. The measurements were carried out with the use of the NI PXIe-4492 measurement card and the NI PXIe-1082 computer. Vibration measurements were carried out for bearings with modelled damage to the outer and inner raceways and rolling elements, and for a bearing without a defect, using the DeltaTron 4506 accelerometer by Brüel and Kjaer. The tested bearings were damaged by a spark jump to the selected element (an electric discharge between the electrode and the damaged element), similarly to [1,8,13,31]. In the literature, bearing failures are often presented with a cut raceway or a hole made in the damaged raceway [7,13,17,18,23,25].

In this article, the vibration acceleration measured in three axes was used as a diagnostic signal. Based on the analysis of the measurement results, it was found that the strongest reaction to bearing damage was visible in the signal on the X-axis, so only this signal was used for further analysis. The measurements were carried out in three series with variable supply voltage frequency and variable load torque. Two series were used to train the NNs; the third one was used to test them.
The use of the Fourier transform allowed determining the amplitudes of the characteristic failure frequencies using an application developed in the LabVIEW environment. Amplitudes of 32 harmonics with frequencies associated with damage to the inner race, outer race, and rolling element were analysed. The harmonic amplitudes at cage failure frequencies were neglected because they also increase in the presence of other types of failure. Taking the cage damage into account could disturb the proper operation of the detector. In order to take into account the variability of the load torque, the rotational frequency f_r was selected as an auxiliary symptom. Sample spectra for a healthy bearing and a bearing with a modelled damage are shown in Figure 6.
Frequencies characteristic for the inner race, f_ir, outer race, f_or, and rolling element, f_re, taking into account coefficients obtained from the bearing producer, were calculated as follows: [equations not recoverable from the source; surviving coefficients: 3.05, 3.99], with k = 1, 2, 3, 4 and l = 0, 1. The rotating frequency, f_r, was calculated as: [equation not recoverable from the source].
Amplitudes of 32 harmonics with frequencies associated with damage to the inner race, outer race, and rolling element were analysed from the point of view of their sensitivity to particular bearing damage. Detailed analysis allowed selecting the symptoms that responded most strongly to the selected type of damaged structural element of the bearing. It was checked how the load torque and the frequency of the supply voltage affected the selected symptoms. If this selection step were omitted, the tested structures would be very complex and would require a large number of neurons in the hidden layers. It is also possible that such complex neural networks could lose their generalization properties, i.e., be over-fitted. On this basis, the first, 7-element input vector of the NNs was selected, consisting of the amplitudes of characteristic harmonics with failure frequencies for which a significant increase in value was observed when the failure occurred. Moreover, the selected harmonics were characterized by a certain regularity with the increase of the supply frequency, f_s, or the change of the load torque value. The frequency values of these harmonics for a few examples of motor operating conditions are presented in Table 1.
The second input vector was developed based on the HT analysis of the vibration acceleration signal followed by the FFT of the obtained envelope (ENV), calculated according to Equation (12). After applying the Hilbert transform to the signal, there was a significant increase in the amplitude values of the harmonics with frequencies characteristic for individual failures. The response to failure for several of the selected harmonics, obtained for different supply frequencies and loads, is shown in Figure 7.

Figure 7. Examples of harmonic amplitudes with characteristic frequencies obtained using the envelope (ENV) of the vibration acceleration signal, for different load torques and supply frequencies of the PMSM drive.
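The envelope analysis described above (Hilbert transform of the vibration signal followed by a Fourier analysis of the envelope) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the LabVIEW application used in the study: the synthetic amplitude-modulated signal, the sampling rate `fs_hz`, the 64 Hz carrier, the 8 Hz modulation, and all function names are hypothetical, and a naive DFT replaces an optimized FFT to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft()."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def envelope(x):
    """Envelope |x + j*H{x}| computed from the frequency-domain analytic signal."""
    n = len(x)
    X = dft(x)
    h = [0.0] * n          # keep DC, double positive frequencies, zero negatives
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        for k in range(1, n // 2):
            h[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            h[k] = 2.0
    analytic = idft([X[k] * h[k] for k in range(n)])
    return [abs(v) for v in analytic]


def amplitude_spectrum(x):
    """Single-sided amplitude spectrum of x with its mean removed."""
    n = len(x)
    mean = sum(x) / n
    X = dft([v - mean for v in x])
    return [2.0 * abs(X[k]) / n for k in range(n // 2)]


# Hypothetical test signal: 64 Hz carrier, amplitude-modulated at 8 Hz.
fs_hz, n = 256, 256
signal = [(1.0 + 0.5 * math.cos(2 * math.pi * 8 * t / fs_hz))
          * math.cos(2 * math.pi * 64 * t / fs_hz) for t in range(n)]

env_spectrum = amplitude_spectrum(envelope(signal))
peak_bin = max(range(len(env_spectrum)), key=env_spectrum.__getitem__)
peak_hz = peak_bin * fs_hz / n   # frequency of the dominant envelope component
```

In this toy case the raw spectrum contains only the carrier and its sidebands, whereas the envelope spectrum peaks directly at the modulation frequency, which is the property that makes the ENV-based symptoms easier to read.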

Analysis of the Operation of Neural Detectors of Damages to Rolling Bearings
Two types of neural detectors were developed:
• Fault detector (type I), determining the condition of the bearing, i.e., whether there is damage (1) or not (0) (healthy-faulty);
• Fault classifier (type II), determining the type of damage, i.e., which bearing element was damaged.
The data vectors from the FFT analysis or the ENV analysis of the vibration signal presented before were divided into training and testing sets, giving the data structure presented in Table 2.

Table 2. The structure of the training and testing data with the division into the fast Fourier transform (FFT) and envelope analysis (ENV) of the vibration signal.
Number of samples | FFT analysis | ENV analysis
Training set | 240 | 240
Testing set | 120 | 120

The effectiveness of detectors using the following types of neural networks was compared: MLP, RBF, and SOM. The search for the minimum structure of the selected NN types, depending on the number of degrees of freedom and the training algorithm, was carried out with the assumption that the neural detectors/classifiers would achieve an accuracy of 95-100%, and not less than 90%.


Fault Detector Based on MLP Network
Neural detectors based on MLP were tested first. The effectiveness of the detectors was checked with respect to the number of hidden layers (only for the fault detector, because for the fault classifier a structure with one hidden layer was insufficient), the number of neurons in the hidden layers, the type of activation function, and the learning method. The research was conducted in the MATLAB environment, using the training and testing data sets given in Table 2. A schematic diagram of an example structure with two hidden layers is shown in Figure 8, with the input vector containing samples from either the FFT analysis or the ENV analysis.
Because the weights are initialized randomly, the effectiveness of the perceptrons was studied in 15 training and testing series. Table 3 shows the average percentage effectiveness of the fault detectors (type I) using the damage symptoms from the FFT analysis or from the ENV analysis, as mean values obtained after 15 series of NN testing. First, the tests were performed for two activation functions, tansig and logsig, for various MLP structures using the Levenberg-Marquardt training method (trainlm) with the learning rate η = 0.5. Compared with tansig, the logsig activation function improved not only the effectiveness of the detector but also the stability of the learning process, both for the ENV and for the FFT inputs. For all tested cases, the network achieved 100% efficiency. Next, the influence of the Levenberg-Marquardt method with Bayesian regularization (trainbr), which introduces weight limitation, was checked. These tests were performed using the tansig activation function.
In the case of the 0-1 detector, once a proper structure is obtained, there are no significant differences between using ENV and the FFT alone. The differences are noticeable only during the learning process. The use of the Hilbert transform eliminates significant deviations from the expected values and gives more repeatable network responses across individual learning and testing cycles, which significantly simplifies this process.
Then, the fault classifier (detector type II) was tested, whose task was to determine the type of bearing damage. Efficiency tests were performed analogously, with FFT or ENV inputs, in 15 series of learning and testing. The results for variable parameters of the MLPs and learning methods are presented in Table 4. The effectiveness of the MLP-based fault classifier was checked for the tansig and logsig activation functions using the trainlm or trainbr learning methods and the learning rate η = 0.5. Based on Table 4, it can be concluded that the use of the Hilbert transform improves the detection efficiency, with the structure reduced by two neurons in the hidden layers. The differences between the networks based on FFT and ENV are noticeable in the results presented in Table 4, but they are even clearer in the exact output values of the analysed detectors, shown in Figure 9. The output of the ENV-based detector fluctuates around the expected value much less than the output of the detector based on the FFT analysis. This greatly increases the stability and average accuracy of the detection. In the case of the fault classifier, the study was carried out only for structures with two hidden layers, because the structure with one hidden layer was insufficient for the investigated problem and the network responses differed significantly from the expected values.
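The averaging over 15 training and testing series mentioned above can be illustrated with a short sketch for the type I (0-1) detector. The 0.5 decision threshold, the function names, and the toy network outputs are assumptions made for illustration only; they are not values or procedures taken from the study.

```python
def effectiveness(outputs, targets, threshold=0.5):
    """Percentage of samples whose thresholded output matches the 0/1 target."""
    correct = sum((y >= threshold) == bool(t) for y, t in zip(outputs, targets))
    return 100.0 * correct / len(targets)


def mean_effectiveness(series_outputs, targets, threshold=0.5):
    """Average effectiveness over several independently trained networks."""
    scores = [effectiveness(o, targets, threshold) for o in series_outputs]
    return sum(scores) / len(scores)


# Toy example: 4 test samples (0 = healthy, 1 = faulty), 3 hypothetical series.
targets = [0, 0, 1, 1]
series_outputs = [
    [0.1, 0.2, 0.9, 0.8],   # all answers correct
    [0.1, 0.6, 0.9, 0.7],   # one healthy sample flagged as faulty
    [0.0, 0.3, 0.4, 0.9],   # one faulty sample missed
]
avg = mean_effectiveness(series_outputs, targets)
```

Outputs that stay close to 0 or 1, as reported for the ENV-based inputs, are less likely to cross the threshold by accident, which is one way to read the link between output fluctuation and average accuracy.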

Fault Detector Based on RBF Network
RBF networks, unlike MLP, are characterized by repeatable results of the learning process. For the same network parameters, the same output results are always obtained, which significantly facilitates network training in the MATLAB environment, performed with the data sets presented in Table 2. The general structure of the studied networks, with inputs created with the FFT or ENV analyses, respectively, is shown in Figure 10.
First, tests were carried out for the fault detector and the fault classifier for different values of the spread parameter σ of the radial function and for a different number of radial neurons in the hidden layer, which was selected during the learning process. The results for both RBF-based detectors, with input vectors created using the FFT or ENV analyses, are collected in Table 5. The use of ENV for the RBF network training data resulted in correct operation of the network for a wider range of spread parameter values, σ, which made the selection of this parameter much easier during the learning of the detectors. Based on Table 5, it can be seen that the effectiveness of the NN using the data from the FFT analysis increases with the value of σ and begins to decrease from the value of 0.3. In the case of the fault detector based on ENV data, increasing the spread parameter up to several times the value presented for the FFT-based detector did not decrease its effectiveness.
In the case of the fault classifier, the RBF network also worked correctly for a wider range of the spread parameter after applying ENV, but not as wide as in the case of the type I detector. In the exact output waveforms, presented in Figure 11, it can be observed that the RBF network response fluctuates strongly around the expected value, even when an ENV-based input vector is used. Nevertheless, with a proper selection of parameters, 100% efficiency is possible.
The effectiveness of the RBF-based fault detectors was also tested in terms of the effect of limiting the number of neurons in the hidden layer. The analysis showed that the lack of constraints does not always ensure better network efficiency. In the case of fault classifiers, too many neurons reduce the effectiveness: the network overfits the input data and the structure becomes oversized (Figure 12). In the case of the network using only FFT data, the effectiveness drops to approximately 70%. The optimal value for all studied cases is about 100 neurons in the hidden layer.
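The role of the spread parameter σ discussed in this section can be illustrated with a minimal sketch of a radial-basis layer. The Gaussian form exp(-d²/(2σ²)), the two hand-picked centres, the weights, and the sample are assumptions for illustration; the scaling of the spread in the MATLAB functions used in the study may differ, and no training is shown.

```python
import math


def gaussian_rbf(x, centre, sigma):
    """Gaussian radial basis activation of one hidden neuron."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rbf_output(x, centres, weights, bias, sigma):
    """Linear output layer on top of the radial hidden layer."""
    return bias + sum(w * gaussian_rbf(x, c, sigma)
                      for w, c in zip(weights, centres))


# Hypothetical 2-D example: one centre per class (healthy, faulty).
centres = [(0.0, 0.0), (1.0, 1.0)]
weights = [0.0, 1.0]        # output close to 1 near the "faulty" centre
sample = (0.9, 1.1)         # test sample lying near the "faulty" centre

narrow = rbf_output(sample, centres, weights, 0.0, sigma=0.05)
wide = rbf_output(sample, centres, weights, 0.0, sigma=0.5)
```

With the narrow spread the sample barely activates the nearby neuron, so the network does not generalize beyond its training points; the wider spread gives an output close to 1. A spread that is too large would instead blur the class regions, which is consistent with effectiveness first rising and then falling as σ grows.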

Fault Detector Based on SOM Network
Due to the sensitivity of Kohonen maps to the training data, only the network whose inputs were the harmonics obtained from the ENV analysis was tested, for the same number of training and testing samples as before. The general schematic diagram of the structure is shown in Figure 13.

Due to the sensitivity of Kohonen maps to the training data, only the network, whose inputs were harmonics obtained as a result of ENV analysis, was tested, for the same number of training and testing samples as before. The general schematic diagram of the structure is shown in Figure 13.  Tests were performed for different map sizes, neighbourhood radius, and distance functions. The research was carried out for a hexagonal grid, for neighbourhood radius equal to: 2 and 6, respectively, and for the distance function determined on the basis of the Euclidean distance (dist) and Link distance function (linkdist). The influence of the distance function and neighbourhood radius is presented in Figure 14, while results for variable size of SOM is shown in Figure 15. Testing Kohonen maps consisted of generating a map from the training data and then superimposing the results for the testing data on it. Overlapping of the respective areas indicated correct fault detection. Tests were performed for different map sizes, neighbourhood radius, and distance functions. The research was carried out for a hexagonal grid, for neighbourhood radius equal to: 2 and 6, respectively, and for the distance function determined on the basis of the Euclidean distance (dist) and Link distance function (linkdist). The influence of the distance function and neighbourhood radius is presented in Figure 14, while results for variable size of SOM is shown in Figure 15. Testing Kohonen maps consisted of generating a map from the training data and then superimposing the results for the testing data on it.
Overlapping of the respective areas indicated correct fault detection. The neighbourhood radius equal to 2 and the Euclidean distance function dist allowed for a better separation of the areas on the map (Figure 14d) compared to the radius value equal to 6 and the function linkdist ( Figure 14a). As shown in Figure 15c,d, error-free detection was obtained only for maps with increased sizes (16 × 16, 20 × 20). Based on the results presented in Figures 14 and 15, the average effectiveness of SOM-based detectors was calculated, and the results are presented in Table 6 for different SOM structures and in Table 7 for hexagonal 10 × 10 map with different training parameters. It can be noticed that, even for a relatively small structure, the average effectiveness of the tested SOMbased detector is very high, over 98%. Tests were performed for different map sizes, neighbourhood radius, and distance functions. The research was carried out for a hexagonal grid, for neighbourhood radius equal to: 2 and 6, respectively, and for the distance function determined on the basis of the Euclidean distance (dist) and Link distance function (linkdist). The influence of the distance function and neighbourhood radius is presented in Figure 14, while results for variable size of SOM is shown in Figure 15. Testing Kohonen maps consisted of generating a map from the training data and then superimposing the results for the testing data on it.
Overlapping of the respective areas indicated correct fault detection. The neighbourhood radius equal to 2 and the Euclidean distance function dist allowed for a better separation of the areas on the map (Figure 14d) compared to the radius value equal to 6 and the function linkdist ( Figure 14a). As shown in Figure 15c,d, error-free detection was obtained only for maps with increased sizes (16 × 16, 20 × 20). Based on the results presented in Figures 14 and 15, the average effectiveness of SOM-based detectors was calculated, and the results are presented in Table 6 for different SOM structures and in Table 7 for hexagonal 10 × 10 map with different training parameters. It can be noticed that, even for a relatively small structure, the average effectiveness of the tested SOMbased detector is very high, over 98%. The neighbourhood radius equal to 2 and the Euclidean distance function dist allowed for a better separation of the areas on the map (Figure 14d) compared to the radius value equal to 6 and the function linkdist ( Figure 14a). As shown in Figure 15c,d, error-free detection was obtained only for maps with increased sizes (16 × 16, 20 × 20). Based on the results presented in Figures 14 and 15, the average effectiveness of SOM-based detectors was calculated, and the results are presented in Table 6 for different SOM structures and in Table 7 for hexagonal 10 × 10 map with different training parameters. It can be noticed that, even for a relatively small structure, the average effectiveness of the tested SOM-based detector is very high, over 98%. SOM nets were also tested for the activity of individual areas on samples for healthy and damaged bearings ( Figure 16). These areas are consistent with the maps shown in Figures 14 and 15. Responses to samples from a damaged bearing cover a much larger area on the map, which is also visible in Figures 14 and 15. Additionally, this area shows a much greater activity of neurons. 
SOM nets were also tested for the activity of individual areas on samples for healthy and damaged bearings ( Figure 16). These areas are consistent with the maps shown in Figures 14 and 15. Responses to samples from a damaged bearing cover a much larger area on the map, which is also visible in Figures 14 and 15. Additionally, this area shows a much greater activity of neurons. When the type of damage is classified, tangles appear on the map. It can be seen that the greatest relationship is between the failure of the rolling element and the inner race. These effects are shown in following figures. In the detectors based on the MLP and RBF networks, errors also appeared when distinguishing these specific failures.
As shown in Figure 17, irrespective of the value of the neighbourhood radius and the selected distance function, there is one, constantly repeating error on the maps. The mistake appeared when testing the information obtained for a bearing with a damaged outer race. In the case of the fault classifier, even increasing the number of neurons did not allow achieving 100% effectiveness ( Figure 18). It was observed that, for each examined parameter, the error in the activity of the neuron of the healthy area was repeated due to damage to the outer raceway. Moreover, for fault detector (type I) the error appeared in a similar area. For many of the cases studied, this was the only wrong answer of the network. When the type of damage is classified, tangles appear on the map. It can be seen that the greatest relationship is between the failure of the rolling element and the inner race. These effects are shown in following figures. In the detectors based on the MLP and RBF networks, errors also appeared when distinguishing these specific failures.
As shown in Figure 17, irrespective of the value of the neighbourhood radius and the selected distance function, there is one, constantly repeating error on the maps. The mistake appeared when testing the information obtained for a bearing with a damaged outer race. In the case of the fault classifier, even increasing the number of neurons did not allow achieving 100% effectiveness ( Figure 18). It was observed that, for each examined parameter, the error in the activity of the neuron of the healthy area was repeated due to damage to the outer raceway. Moreover, for fault detector (type I) the error appeared in a similar area. For many of the cases studied, this was the only wrong answer of the network.
Based on the results presented in Figures 17 and 18, the average effectiveness of SOMbased classifiers was calculated and the results are presented in Table 8 for different SOM structures and in Table 9 for hexagonal 10 × 10 map with different training parameters. It can be noticed, that even for relatively small structure the average effectiveness of the tested SOM-based classifiers is also high, over 92%. Based on the results presented in Figures 17 and 18, the average effectiveness of SOM-based classifiers was calculated and the results are presented in Table 8 for different SOM structures and in Table 9 for hexagonal 10 × 10 map with different training parameters. It can be noticed, that even for relatively small structure the average effectiveness of the tested SOM-based classifiers is also high, over 92%.  Based on the results presented in Figures 17 and 18, the average effectiveness of SOM-based classifiers was calculated and the results are presented in Table 8 for different SOM structures and in Table 9 for hexagonal 10 × 10 map with different training parameters. It can be noticed, that even for relatively small structure the average effectiveness of the tested SOM-based classifiers is also high, over 92%.   The activity of individual areas of the Kohonen map in response to single fault samples was also examined (Figure 19). Neurons respond least intensively to damage to the rolling element. Therefore, in each case, tangles appeared for this failure. In the case of other types of networks, errors also occurred most frequently for this fault type. It can also be seen that the areas of activity related to the failure of the rolling element and the outer raceway overlap to some extent, hence, the errors that occur when distinguishing them.
The activity of individual areas of the Kohonen map in response to single fault samples was also examined (Figure 19). Neurons respond least intensively to damage to the rolling element. Therefore, in each case, errors appeared for this fault. In the other types of networks, errors also occurred most frequently for this fault type. It can also be seen that the areas of activity related to the failure of the rolling element and of the outer raceway overlap to some extent, which explains the errors that occur when distinguishing between them.
Figure 19. Active areas of 10 × 10 SOM map in the case of healthy bearing (a), damage to outer raceway (b), inner raceway (c), rolling element (d) for the neighbourhood radius equal to 2 and the linkdist distance function.
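The way such activity maps are typically obtained can be illustrated with a minimal sketch. It assumes that each test sample activates the neuron whose weight vector is closest to it in the Euclidean sense (the best-matching unit) and that the map shows the number of hits per neuron. The function names, the 2 × 2 map, and all numerical values are hypothetical and serve only as an illustration; they are not taken from the experiments described here.

```python
import math

def best_matching_unit(codebook, sample):
    """Return the index of the neuron whose weight vector is closest to sample."""
    best_index, best_dist = 0, math.inf
    for index, weights in enumerate(codebook):
        dist = sum((w - x) ** 2 for w, x in zip(weights, sample))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def hit_map(codebook, samples, rows, cols):
    """Count how many samples select each neuron of a rows x cols map."""
    hits = [[0] * cols for _ in range(rows)]
    for sample in samples:
        index = best_matching_unit(codebook, sample)
        # Neurons are assumed to be stored row by row in the codebook.
        hits[index // cols][index % cols] += 1
    return hits

# Hypothetical 2 x 2 map with two-element input vectors (illustrative values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
samples = [[0.1, 0.1], [0.9, 0.2], [0.8, 0.1], [0.9, 0.9]]
hits = hit_map(codebook, samples, rows=2, cols=2)
```

Overlapping active areas, such as those reported for the rolling element and the outer raceway, would show up in this representation as samples of two different classes incrementing the same neurons.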
In the case of Kohonen maps, a graphical network response to both detector types is shown. The presented maps should be subject to additional analysis. The use of an additional classifier (e.g., cascaded MLP as in Reference [47]) analysing the obtained maps would automate the process of determining the type of the damage.

Summary and Conclusions
The application of classical NNs in the diagnostics of rolling bearings of a permanent magnet synchronous motor makes it possible to automate this process and to achieve very high effectiveness, provided that the network structure and training method are correctly selected. This tool requires neither a mathematical model nor the involvement of a human expert in the diagnostic process, which are its significant advantages.
The use of the MLP network to develop the fault detector (type I) allowed for very high effectiveness. For several structures, after 15 series of training and testing, it was possible to achieve 100% correct answers. In the case of the fault classifier (type II), achieving 100% effectiveness required a more complex structure; one hidden layer was insufficient for the problem studied. The best results for both types of detectors were provided by the Levenberg-Marquardt learning method with Bayesian regularization and the log-sigmoid activation function. Among all of the examined structures, the classical MLP network provided the best effectiveness and the smallest deviations from the expected value, especially for training data obtained from the analysis of the vibration acceleration signal using the Hilbert transform.

Networks with radial basis functions are characterized by repeatability of responses for structures with the same parameters and the same input vector for each subsequent learning process. This is due to the fact that the initial weights are selected based on the input vector, and not randomly as in MLP networks. For this reason, the selection of RBF network parameters is less time-consuming.
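The repeatability described above can be illustrated with a minimal sketch. It assumes an exact-interpolation design, in which every training vector becomes the centre of one Gaussian neuron and the output weights are obtained by solving a linear system; this is only one possible RBF design and is not claimed to be the configuration used in the experiments. The function names, the spread value, and the data are hypothetical.

```python
import math

def gaussian(x, center, spread):
    """Gaussian radial basis function of the distance between x and center."""
    sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq / (2.0 * spread ** 2))

def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - tail) / aug[r][r]
    return w

def fit_rbf(inputs, targets, spread):
    """Centres are copied from the inputs, so no random initialization occurs."""
    centers = [x[:] for x in inputs]
    design = [[gaussian(x, c, spread) for c in centers] for x in inputs]
    return centers, solve(design, targets)

def predict(x, centers, weights, spread):
    """Weighted sum of the Gaussian neuron outputs."""
    return sum(w * gaussian(x, c, spread) for w, c in zip(weights, centers))

# Hypothetical training set (illustrative values only).
inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [0.0, 1.0, 1.0, 0.0]
first_run = fit_rbf(inputs, targets, spread=0.5)
second_run = fit_rbf(inputs, targets, spread=0.5)
```

Because nothing in `fit_rbf` depends on a random number generator, `first_run` and `second_run` are identical, whereas an MLP trained twice from random initial weights would in general end with different parameters.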
The use of Kohonen maps does not fully automate the diagnostic process. In this case, it is necessary to use the evaluation of a human operator or add an algorithm that will be responsible for additional analysis of the result obtained by SOM. Nevertheless, Kohonen maps seem to be the most interesting for further research of all of the structures tested. The k-means algorithm is most often used to automate the result classification process [48,49]. Kohonen network outputs may also be inputs for other networks, which will ensure full automation of the process, as proposed in Reference [47]. For such purposes, studies can be carried out with both MLP and RBF networks. An important advantage of using Kohonen maps is the lack of difficulties in selecting their parameters.
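As an illustration of how the k-means step mentioned above could group the neurons of a trained map into regions, a minimal sketch is given below. It assumes that the SOM weight vectors (the codebook) are clustered directly and that the initial centroids are supplied by the user instead of being drawn at random; the function names and all values are hypothetical and are not taken from References [48,49].

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centroids, iterations=20):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    centroids = [c[:] for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids

# Hypothetical codebook of a four-neuron map (illustrative values only).
codebook = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
labels, centroids = k_means(codebook, [codebook[0], codebook[2]])
```

Each cluster would still have to be associated with a bearing condition, for example by checking which labelled training samples fall into it; that step is not shown here.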
Industry requires solutions that ensure an efficient and automated diagnostic process. The proposed NNs can be easily implemented on a low-budget integrated hardware platform based on, e.g., Arm Cortex-M or similar processors; alternatively, the diagnostics can rely on the measurement infrastructure already existing in industry, extended with additional neural detectors. An example concept of a cheap diagnostic system is presented in [14]. Based on the presented work, it can be concluded that classical NNs are still able to meet these requirements, and that they are much less time-consuming to train and simpler to implement in practice than the more popular deep-learning networks.
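To give an impression of why a shallow network is undemanding to run, the sketch below shows the forward pass of a small MLP with the log-sigmoid activation function, together with a count of its parameters. It is a reference model in Python only, not embedded code; the 3-2-1 structure, the weights, and the function names are hypothetical and do not correspond to the networks trained in this work.

```python
import math

def logsig(value):
    """Log-sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-value))

def forward(layers, inputs):
    """Propagate inputs through a list of (weights, biases) layers."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            logsig(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

def parameter_count(layers):
    """Number of weights and biases that have to be stored."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)

# Hypothetical 3-2-1 network (illustrative weights only).
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 0.5, -0.5]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward(layers, [1.0, 1.0, 1.0])
stored_values = parameter_count(layers)
```

The whole inference consists of a few multiply-accumulate operations and one exponential per neuron, and the memory needed grows only with the number of weights and biases.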