Comparative Performance Analysis of RNN Techniques for Predicting Concatenated Normal and Abnormal Vibrations

Abstract: We analyze how accurately recurrent neural network (RNN) techniques predict the transition from normal to abnormal vibration states, which simulates the condition of a motor before a drone crash, by proposing a concatenated vibration prediction model (CVPM). Using the proposed CVPM, we compare the prediction performance of six RNN techniques: long short-term memory (LSTM), attention-LSTM (Attn.-LSTM), bidirectional-LSTM (Bi-LSTM), gated recurrent unit (GRU), attention-GRU (Attn.-GRU), and bidirectional-GRU (Bi-GRU). To assess their accuracy in predicting concatenated vibrations, both normal and abnormal vibration data are collected from the motors connected to the drone's propellers. A concatenated vibration dataset is then generated by combining 50% of normal vibration data with 50% of abnormal vibration data, and this dataset is used to compare vibration prediction performance and simulation runtime across the six RNN techniques. According to the simulation results, Attn.-LSTM and Attn.-GRU, which use the attention mechanism to focus on information highly relevant to the prediction target through unidirectional learning, demonstrate the most promising predictive performance among the six RNN techniques. This implies that the attention mechanism concentrates the model on relevant information, resulting in superior predictive accuracy compared to the other RNN techniques.

However, drone crashes can lead to significant incidents, such as fires caused by battery explosions or even accidents resulting in human casualties. According to a drone safety survey released by the Korea Consumer Agency, safety incidents involving drones are on the rise as purchases of drones for leisure increase, with drone crashes accounting for approximately 20% of the reported incidents [21]. Therefore, research has been conducted to detect anomalies by training on and predicting time series vibration data. As shown in Table 1, research has been conducted in various fields using RNN techniques to train on and predict vibrations. Furthermore, a previous study [36] forecasted both normal and abnormal drone vibrations individually using six RNN techniques: long short-term memory (LSTM), attention-LSTM (Attn.-LSTM), bidirectional-LSTM (Bi-LSTM), gated recurrent unit (GRU), attention-GRU (Attn.-GRU), and bidirectional-GRU (Bi-GRU). However, the previous studies had limitations, as they predicted normal and abnormal vibrations separately without considering the transition from normal to abnormal vibrations that occurs in real-world scenarios when the motor of a drone is damaged.
Therefore, in this study, an RNN-based concatenated vibration prediction model (CVPM) is proposed to comparatively analyze the performance of predicting concatenated vibrations transitioning from normal to abnormal states. Six RNN models (LSTM, Bi-LSTM, Attn.-LSTM, GRU, Attn.-GRU, and Bi-GRU) are applied to assess the comparative prediction accuracy and simulation runtime for the concatenated vibrations using identical simulation parameters, including the optimizer and the number of hidden units. Subsequently, the vibration prediction accuracy of the six RNN techniques is analyzed in relation to simulation runtime using the proposed CVPM. The major contributions of this study are summarized as follows:
1. Since previous studies investigated normal and abnormal vibrations separately, the CVPM is proposed to analyze the predictive performance of the six RNN techniques for concatenated vibrations. These concatenated vibrations represent the transition from normal to abnormal states, reflecting sudden damage to the motor.
2. This study presents a comparative analysis of the prediction efficiency of the six RNN techniques, examining not only the prediction accuracy but also the simulation runtime. Examining the simulation runtime of each of the six RNN techniques used in the CVPM, including Attn.-LSTM, Bi-LSTM, Attn.-GRU, and Bi-GRU, is one of the major contributions of our study.
3. Using the proposed CVPM, we analyzed the vibration prediction performance by progressively increasing the training segment of the concatenated vibrations from 40% to 90% in increments of 10%. These simulations allowed us to identify the training data segments where the prediction accuracy values of the six RNN techniques converged.
The remainder of this paper is organized as follows: Section 2 reviews previous research related to the prediction of time-series vibrations using various RNN techniques. Section 3 describes the proposed CVPM, along with the synthesized concatenated vibration dataset used to analyze the predictive performance of the six RNN techniques. Section 4 describes the techniques employed to predict concatenated vibrations, including LSTM, GRU, the attention mechanism, and bidirectional techniques. Section 5 presents a comprehensive analysis, comparing vibration prediction accuracy, simulation runtime, and the relationship between simulation runtime and predictive accuracy for the six RNN techniques. Finally, Section 6 summarizes the conclusions and describes future research.

Related Work
In recent years, machine learning algorithms have become popular due to the rapid development of software and hardware. As a result, research on predicting vibrations using machine learning and deep learning has been actively conducted. This section introduces studies related to fault detection and vibration prediction, highlighting their importance in Industry 4.0.
Wang et al. [31] introduced a fault diagnosis method for miniature vibration motors. This method combines wavelet packet decomposition with an improved three-layer LSTM network to enhance accuracy. ElSaid et al. [32] conducted a study on predicting excess vibration events in aircraft engines, a crucial aspect of the aviation industry. They utilized LSTM for the predictions, which yielded promising results with low error rates. Wang et al. [30] introduced a model for predicting and analyzing vibration severity in steam turbine rotor systems. This model combines sequence prediction with GRU and GRU-Seq2Seq to address the vanishing gradient problem, outperforming back-propagation (BP) and LSTM-Seq2Seq models. Hong [35] applied LSTM and GRU models to forecast vibrations based on time series motor data, comparing their accuracy and simulation runtime efficiency. This research indicates that GRU forecasts vibrations faster than LSTM.
Additionally, research has been conducted on predicting vibrations using LSTM, support vector machine (SVM), artificial neural network (ANN), and RNN techniques. Xiao et al. [28] introduced a fault diagnosis method for three-phase asynchronous motors using RNNs. Experimental tests on six motors under different fault conditions demonstrate that this approach outperforms other methods, including logistic regression (LR), SVM, multi-layer perceptron (MLP), and basic RNN, in terms of fault diagnosis accuracy. Zhu et al. [33] applied the variational mode decomposition (VMD) technique to predict vibrations using SVM, ANN, and GRU techniques and conducted a comparative analysis of their performance. The results show that, among these methods leveraging VMD, VMD-GRU demonstrated the most accurate predictive performance. Furthermore, Zhang et al. [26] introduced an LSTM-based fault detection and identification (FDI) method for quadcopter blades based on airframe vibration signals. This method outperforms the BP neural network-based FDI model, especially when dealing with larger volumes of vibration data.
Furthermore, studies have been conducted to improve prediction performance by combining RNN and convolutional neural network (CNN) technologies. Li et al. [25] proposed a method that combines a CNN trained with acoustic emission signals and a GRU network trained with vibration signals, resulting in gear pitting fault diagnosis with over 98% accuracy. The combination of CNN and GRU effectively leverages the advantages of both networks, demonstrating superior performance compared to using CNN or GRU individually.
Research has also been conducted using bidirectional RNN technology, which processes input data in both the forward and backward directions and considers information from both directions, to predict vibrations. Liang et al. [22] proposed a method to predict spindle rotation error that involves three key steps: data preprocessing, training a Bi-LSTM classification network, and predicting spindle rotation error. Their results demonstrate the effectiveness of this predictive approach. Adlen et al. [24] conducted research using LSTM, Bi-LSTM, and GRU to predict the condition of wind turbine operations based on vibration time series data. In this study, Bayesian optimization was used to fine-tune the training parameters. The results demonstrate that these models achieved more accurate predictions of wind turbine conditions compared to models trained with conventional parameters.
Additionally, studies predicting vibrations have been conducted using autoencoder and RNN techniques. Han et al. [23] used LSTM techniques to predict the remaining useful life (RUL) of bearings through an approach that involves two models: a degradation state model and an RUL prediction model. This approach utilizes a stacked autoencoder (SAE) to extract health indicators (HIs) from selected features in the degradation state model and employs an LSTM for RUL prediction with standard deviation input and HI training labels. Huang et al. [29] introduced a two-stage machine learning architecture for accurate motor fault prediction based solely on motor vibration time-domain signals, avoiding complex preprocessing. In the first stage, they use an RNN-based variational autoencoder (VAE) to reduce dimension and improve prediction accuracy. The second stage employs principal component analysis (PCA) and linear discriminant analysis (LDA) for further dimension reduction, enabling clear visualization and detection of different fault modes. This approach simplifies fault detection, reduces computational costs, and enhances classification accuracy.
Studies have also predicted vibrations using RNN and attention techniques together. Yang et al. [34] introduced Informer into time series forecasting of motor bearing vibration. Through random search for model parameter optimization, Informer minimizes error accumulation, improving forecasting accuracy. Lee and Hong [36] found that RNN models with attention mechanisms are the most suitable for predicting time series normal and abnormal vibration data, and noted that future research would focus on predicting, in near real time, vibrations in which normal and abnormal vibrations coexist.
To sum up, most existing studies utilize RNN techniques to predict time series normal and abnormal vibrations separately. Therefore, the previous studies do not consider the transition of the vibration state from normal to abnormal that occurs when an actual motor is damaged. Hence, research on predicting vibrations that reflect the transition from normal to abnormal vibration states is required.

Proposed Concatenated Vibration Prediction Model
In this section, the proposed CVPM, designed for forecasting concatenated vibrations using six distinct RNN techniques, is described. The goal of the proposed CVPM is to comparatively analyze the prediction accuracy and simulation runtime of the six RNN techniques.
Figure 1 shows the flowchart of the proposed CVPM. According to Figure 1, the procedure starts by collecting both normal and abnormal vibration data from the motors connected to the drone's propellers. Then, it generates concatenated vibration data by merging 50% of the normal vibration data with 50% of the abnormal vibration data. This concatenated vibration data represents the transition from normal to abnormal vibration. Next, the concatenated vibration data is divided into training and testing datasets: the training data consists of the first 40%, 50%, 60%, 70%, 80%, or 90% of the concatenated vibration data, while the remaining 60%, 50%, 40%, 30%, 20%, or 10%, respectively, forms the testing dataset.
Subsequently, the proposed CVPM is employed to train on the 40%, 50%, 60%, 70%, 80%, and 90% segments of the concatenated vibration data using six different RNN techniques and to predict the remaining 60%, 50%, 40%, 30%, 20%, and 10% segments, respectively. Finally, to assess predictive accuracy, the coefficient of determination, $R^2$, is computed between the predictions of each of the six RNN techniques and the corresponding testing dataset, and the resulting values are compared. This assessment is conducted alongside an examination of the simulation runtimes for the six RNN techniques. By comparing $R^2$ values and simulation runtimes, the performance of each RNN technique in predicting concatenated vibrations is comprehensively analyzed. Subsequently, the most suitable RNN technique for concatenated vibration prediction is determined through this comparative analysis.
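The segmentation and scoring steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this study: the helper names `split_series` and `r2_score` and the toy series are hypothetical, and the RNN training step itself is omitted.

```python
def split_series(series, train_percent):
    """Split a series chronologically (hypothetical helper).

    The first train_percent % of the samples form the training segment and
    the remaining samples form the testing segment.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    cut = len(series) * train_percent // 100  # integer arithmetic, no rounding issues
    return series[:cut], series[cut:]


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all actual values are identical.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy stand-in for the concatenated vibration series (assumed values).
series = [0.1 * k for k in range(10)]
for percent in (40, 50, 60, 70, 80, 90):
    train, test = split_series(series, percent)
    print(percent, len(train), len(test))
```

An $R^2$ value of 1 indicates that the predictions reproduce the testing segment exactly, while a value of 0 corresponds to always predicting the mean of the testing segment.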

Generation of Concatenated Vibration Data
This section describes the methodology for collecting both normal and abnormal vibration data from the motors connected to an actual drone's propellers. Subsequently, the process of generating concatenated vibration data is explained, which combines 50% of the normal vibration data with 50% of the abnormal vibration data. The resulting concatenated vibration dataset models the transition from normal to abnormal vibrations.
Figure 2 shows the configuration for collecting time series vibration data using an accelerometer attached to a motor. Vibration data within the 1 kHz frequency band is acquired over a duration of 100 milliseconds through an acceleration sensor affixed to the drone's motor. Figure 3a,b show the normal motor and the abnormal motor with damaged rotors, respectively. The motor in Figure 3 has a rated speed of 180 Kv, a maximum output power of 1474.6 W, and a maximum torque of 1.992 N·m. To collect vibration data from the normal and damaged motors, the rotation rate of the motors is set at 1200 revolutions per minute (RPM). Furthermore, the abnormal vibration data collected from the damaged motor were obtained after the vibrations had progressed to the advanced stage. The collected normal and abnormal vibration data undergo min-max normalization, as represented by Equation (1), resulting in values between 0 and 1. Denoting the minimum and maximum of the collected values by $x_{min}$ and $x_{max}$, the normalization is
$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{1}$$
where $x$ and $x_{norm}$ represent an individual vibration value from the collected vibration data and the normalized vibration value, respectively. Figure 4 illustrates the waveforms of the normalized normal and abnormal vibration data.
According to Figure 4, the vibration waveform of the normal motor shows consistent wavelengths and amplitudes, while the waveform of the abnormal vibration data appears irregular, with varying wavelengths and amplitudes compared to the normal vibration waveform. The abnormal vibration also contains more residual vibrations. Furthermore, since an abrupt transition from normal to abnormal vibrations due to sudden damage to the motor is assumed, the vibration used for prediction is generated by concatenating normal and abnormal vibrations.
Figure 5 illustrates the waveform of the concatenated vibrations, created by concatenating 50% of normal vibration data with 50% of abnormal vibration data. By using the 50:50 ratio of normal and abnormal vibrations, as illustrated in Figure 5, the proposed CVPM can simulate predictions covering various vibration scenarios. These include cases in which the learning segment contains normal vibrations only (the 40% and 50% segments) and cases in which it contains a combination of normal and abnormal vibrations (the 60% to 90% segments). The proposed CVPM covers these scenarios by employing segments comprising 40%, 50%, 60%, 70%, 80%, and 90% of the vibration data for the learning segment.
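The normalization and concatenation steps can be illustrated with a short sketch. It is not the code used in this study and rests on assumptions that the text does not settle: each recording is normalized separately, the first half of each recording is used, and the function names and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values into [0, 1] with min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def concatenate_vibrations(normal, abnormal):
    """Build a normal-to-abnormal series (assumed procedure).

    Assumption: each recording is normalized on its own, and the first half
    of the normal recording is followed by the first half of the abnormal one.
    """
    normal_part = min_max_normalize(normal)[: len(normal) // 2]
    abnormal_part = min_max_normalize(abnormal)[: len(abnormal) // 2]
    return normal_part + abnormal_part


# Made-up sample values standing in for accelerometer readings.
normal = [0.0, 1.0, 2.0, 4.0]
abnormal = [10.0, 30.0, 20.0, 50.0]
print(concatenate_vibrations(normal, abnormal))
```

With equal-length recordings, the boundary between normal and abnormal vibrations falls exactly at the midpoint of the concatenated series, so a 40% or 50% learning segment never includes abnormal samples.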

RNN
This section describes the usage of LSTM and GRU to predict the vibration, the attention mechanism applied to Attn.-LSTM and Attn.-GRU, and the bidirectional method applied to Bi-LSTM and Bi-GRU.

LSTM
RNN is a neural network model with a recurrent structure, capable of time series prediction because the output of the previous time step influences the output of the current time step. Nevertheless, conventional RNNs encounter the challenge of long-term dependency, resulting in information loss as the input sequence length grows. For example, when information from earlier time steps holds significant importance, the issue of long-term dependencies can have severe repercussions.

LSTM is a specialized type of RNN architecture designed to address the vanishing gradient problem and to capture long-term dependencies in sequential data [37]. LSTM networks are equipped with memory cells that can store information for extended periods, enabling them to retain and utilize relevant information from previous time steps in a sequence. This design allows LSTM to alleviate the short-term memory limitations that traditional RNNs face. By allowing the network to learn when to store, read, or erase information, LSTM can capture intricate dependencies and patterns in various types of sequential data, such as time series, natural language, and speech.
In the architecture of an LSTM cell, there are three essential gates: the forget gate, the input gate, and the output gate. These gates manage the flow of information through the cell, which is essential for capturing long-term dependencies in sequential data. Figure 6 shows the architecture of LSTM. In Figure 6, $C$, $\tilde{C}$, $h$, $t$, and $x$ refer to the cell state, the candidate values to be added to the cell state, the output value, the time step, and the input value, respectively. Moreover, $f$, $i$, and $o$ represent the forget gate, the input gate, and the output gate, respectively. Equations (2)-(7) represent the expressions for the variables that constitute the LSTM, where $\sigma$ denotes the sigmoid function and $\odot$ denotes element-wise multiplication.
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{2}$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{3}$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{4}$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \tag{5}$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{6}$$
$$h_t = o_t \odot \tanh(C_t) \tag{7}$$
Equations (2), (3), and (4) represent the formulations of the forget gate, the input gate, and the output gate, respectively. Equations (5)-(7) describe the LSTM cell's process of determining the cell state and hidden state. In Equations (2)-(5), the symbols $W$ and $b$ represent the trainable weight matrix and the bias vector, respectively. Equation (2) represents the output of the forget gate, which determines the information to be forgotten in the cell state. This gate decides what information should be retained or discarded within the cell state. The forget gate receives two inputs: the hidden state from the previous time step, $h_{t-1}$, and the input for the current time step, $x_t$.
Equation (3) represents the output of the current time step's input gate, which determines the information to be newly added to the cell state at the current time step. The value of the input gate at the current time step is calculated through a neural network operation that takes the previous time step's hidden state, $h_{t-1}$, and the current time step's input, $x_t$, as inputs.
Equation (4) represents the output gate's output, which influences the calculation of the current time step's hidden state, $h_t$. The current time step's output gate value is determined by a neural network operation that takes the previous time step's hidden state, $h_{t-1}$, and the current time step's input, $x_t$, as inputs. Equation (5), $\tilde{C}_t$, represents the candidate cell state at the current time step, $t$. This candidate cell state is calculated based on the input value at the current time step, $x_t$, and the previous time step's hidden state, $h_{t-1}$. $W_C$ denotes the trainable weight matrix, which determines how to combine the previous time step's hidden state, $h_{t-1}$, and the current time step's input, $x_t$. $[h_{t-1}, x_t]$ represents the vector obtained by concatenating the previous time step's hidden state and the current time step's input. LSTM employs this mechanism to selectively add and update information, enabling it to capture long-term dependencies in the sequential data.
Equation (6), $C_t$, represents the cell state at the current time step, which stores information learned by the network and retains valuable information for the current time step. $f_t$ denotes the forget gate at the current time step, $t$. The forget gate determines how much of the previous cell state, $C_{t-1}$, to retain or to forget. $i_t$ represents the input gate at the current time step, $t$. The input gate determines how much of the new information, $\tilde{C}_t$, to add to the current cell state. $\tilde{C}_t$ represents the candidate cell state at the current time step, calculated based on the current input and the previous hidden state, and represents new information that can be added to the current cell state. Therefore, the current cell state, $C_t$, is updated by combining the part of the previous cell state that is retained, given by the element-wise product $f_t \odot C_{t-1}$, with the new information to be included, given by the element-wise product $i_t \odot \tilde{C}_t$. Equation (7) represents the process of calculating the hidden state, $h_t$, at the current time step, $t$, in LSTM. The cell state, $C_t$, is compressed using the hyperbolic tangent function, and the output gate, $o_t$, determines which part of this compressed information is used as the hidden state, $h_t$. Through this process, LSTM updates the hidden state at the current time step, combining the previous information with the new information to capture long-term dependencies in the sequential data.
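The gate computations described in this subsection can be traced in a single-step sketch. The code below is illustrative only and is not the implementation used in this study: it follows the standard LSTM formulation that matches the description above, uses plain Python lists for readability, and the function names, the parameter dictionary keys, and the all-zero weights are hypothetical choices.

```python
import math


def sigmoid(vector):
    return [1.0 / (1.0 + math.exp(-v)) for v in vector]


def affine(weights, vector, bias):
    """Compute W . v + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params holds W_f, b_f, W_i, b_i, W_o, b_o, W_C, b_C."""
    hx = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = sigmoid(affine(params["W_f"], hx, params["b_f"]))  # forget gate
    i_t = sigmoid(affine(params["W_i"], hx, params["b_i"]))  # input gate
    o_t = sigmoid(affine(params["W_o"], hx, params["b_o"]))  # output gate
    c_tilde = [math.tanh(v) for v in affine(params["W_C"], hx, params["b_C"])]
    # Cell state: retained part of the old state plus gated new information.
    c_t = [f * c + i * ct for f, c, i, ct in zip(f_t, c_prev, i_t, c_tilde)]
    # Hidden state: output gate applied to the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# One hidden unit and one input feature; zero weights make every gate 0.5.
zero = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_o", "W_C")}
zero.update({name: [0.0] for name in ("b_f", "b_i", "b_o", "b_C")})
h_t, c_t = lstm_step(x_t=[0.3], h_prev=[0.0], c_prev=[1.0], params=zero)
print(h_t, c_t)
```

With all weights set to zero, each gate evaluates to 0.5 and the candidate cell state to 0, so the new cell state is half of the previous one, which makes the role of the forget gate easy to verify by hand.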

GRU
GRU is a technique designed to simplify the structure of LSTM in order to reduce computational time [38]. GRU offers a streamlined architecture compared to traditional LSTM while still retaining the ability to capture long-term dependencies in sequential data. In Figure 7, $\tilde{h}$, $h$, $t$, and $x$ denote the candidate output value, the output value, the time step, and the input value, respectively. The variables $r$ and $z$ correspond to the reset gate and the update gate, respectively. Furthermore, Equations (8) and (9) represent the formulas for the reset gate and the update gate, while Equations (10) and (11) describe how the output value of the GRU cell is determined. Here, $\sigma$ denotes the sigmoid function and $\odot$ denotes element-wise multiplication.
$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t]) \tag{8}$$
$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t]) \tag{9}$$
$$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t]) \tag{10}$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \tag{11}$$
Equation (8) represents the output of the current time step's reset gate, which determines how strongly the previous time step's hidden state, $h_{t-1}$, is incorporated into the current time step's output, with a trainable weight matrix, $W_r$. The current time step's reset gate value is calculated based on the previous time step's hidden state and the current time step's input, $x_t$, through a neural network operation. Since the reset gate's activation function is a sigmoid function, the reset gate has values between 0 and 1.
Equation ( 9) depicts the process of computing the update gate in GRU.z t represents the value of the update gate at the current time step, t, and the update gate is determined by the previous time step, h t−1 , and the current time step's input, x t .σ denotes the sigmoid activation function, which compresses the input values into the range between 0 and 1.Therefore, Equation ( 9) represents a convolution operation processed by the sigmoid function.W z represents a trainable weight matrix.This matrix plays a role in determining how to combine the input values, [h t−1 , x t ].
Equation (10) represents the candidate value to be added from the previous time step to the current time step.It uses the hyperbolic tangent function as its activation function and involves element-wise multiplication between the previous time step's hidden state, h t−1 , and the current time step's reset gate value, r t , along with the neural network operation's result, based on the current time step's input, x t , with a trainable weight matrix, W.
Equation ( 11) represents the hidden state for the current time step.According to Equation (11), element-wise multiplication is performed between the value obtained by subtracting the value of the current time step's update gate, z t , from 1, and the previous time step's hidden state value, h t−1 .Additionally, element-wise multiplication is carried out with the value to be added for the current time step, ∼ h t , and the update gate's value.As a result, the ratio between the information to be forgotten from the previous time step and the information to be added from the current time step is determined, thus influencing the calculation of the hidden state for the current time step.
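The description of Equations (8) to (11) matches the standard GRU update, which can be written compactly as follows. This is a sketch reconstructed from the prose definitions above; the omission of bias terms and the use of $\odot$ for element-wise multiplication are assumptions, and the authors' typeset equations may differ in detail.

```latex
\begin{align}
r_t &= \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \tag{8} \\
z_t &= \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \tag{9} \\
\tilde{h}_t &= \tanh\left(W \cdot [r_t \odot h_{t-1}, x_t]\right) \tag{10} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \tag{11}
\end{align}
```

In this form, $z_t$ close to 0 keeps the previous hidden state almost unchanged, while $z_t$ close to 1 replaces it with the candidate value.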

Attention Mechanism
The attention mechanism is a pivotal component in neural network architectures, devised to enhance the processing of sequential data by selectively focusing on relevant information. The attention mechanism enables the network to dynamically allocate varying degrees of attention to different parts of the input sequence, effectively adapting its processing based on the context and importance of each element [39]. Figure 8 below shows the architecture of the attention mechanism.
In Figure 8, $c$, $e_i$, $h$, $M$, $\tilde{s}$, $x$, and $\hat{y}$ represent the attention value, the set of attention scores, the hidden state, the length of the input sequence, the input of the output layer, the input, and the predicted value, respectively.

$$e_i = s^{T} W_c h_i \tag{12}$$
$$\alpha_i = \mathrm{softmax}(e_i) = \frac{\exp(e_i)}{\sum_{j=1}^{M} \exp(e_j)} \tag{13}$$
$$c = \sum_{i=1}^{M} \alpha_i h_i \tag{14}$$
$$\hat{y} = \mathrm{softmax}\left(W_y \tilde{s} + b_y\right) \tag{16}$$
Equations (12) to (16) represent the $i$th attention score, the $i$th attention distribution, the attention value, the input of the output layer, and the predicted value, respectively. Equation (12) calculates the $i$th attention score by multiplying the transpose of the decoder's hidden state, $s^{T}$, with a trainable weight matrix, $W_c$, and the $i$th hidden state of the encoder, $h_i$.
Equation (13) calculates the $i$th element of the attention distribution. The softmax function is applied to the attention scores to produce the attention distribution, which is a probability distribution whose elements sum to 1. Each element of the attention distribution signifies the importance of the corresponding encoder hidden state.
Equation (14) calculates the attention value by summing the products of each element of the attention distribution and the corresponding hidden state. The attention value is therefore a summary of the encoder's hidden states weighted by their importance.
Equation (15) computes the updated hidden state candidate, $\tilde{s}$, which indicates how new information is incorporated into the current hidden state.
Equation (16) represents the final step in the attention mechanism, where the predicted output, $\hat{y}$, is computed. $W_y$ denotes the trainable weight matrix applied to the updated hidden state candidate, $\tilde{s}$, computed in Equation (15), which encapsulates the information processed up to this point. $b_y$ stands for the bias vector added to the product of $W_y$ and $\tilde{s}$. Therefore, the attention mechanism enhances prediction performance by assigning varying degrees of importance to the values of the input sequence, raising the weights of crucial information while attenuating the weights of less significant details when calculating prediction values.
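The first three steps of the attention mechanism (score, distribution, and attention value) can be illustrated with a small numerical sketch. The code below is a minimal pure-Python illustration, not the implementation used in the simulations; the function name `attention` and the toy values of the decoder state, the weight matrix, and the encoder states are hypothetical.

```python
import math

def attention(s, W_c, H):
    """Return (scores, weights, context) for a decoder state s and encoder states H."""
    # Attention score: e_i = s^T W_c h_i (first compute the row vector s^T W_c)
    sW = [sum(s[k] * W_c[k][j] for k in range(len(s))) for j in range(len(W_c[0]))]
    scores = [sum(sW[j] * h[j] for j in range(len(h))) for h in H]
    # Attention distribution: softmax over the scores (shifted by the max for stability)
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # Attention value: weighted sum of the encoder hidden states
    dim = len(H[0])
    context = [sum(w * h[d] for w, h in zip(weights, H)) for d in range(dim)]
    return scores, weights, context

# Hypothetical toy inputs: a 2-dimensional decoder state and M = 3 encoder states
s = [1.0, 0.0]
W_c = [[1.0, 0.0], [0.0, 1.0]]
H = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
scores, weights, context = attention(s, W_c, H)
print(scores, weights, context)
```

The encoder state with the largest score receives the largest weight, so it dominates the attention value, which is the sense in which the mechanism focuses on the most relevant part of the input sequence.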

Bidirectional RNN Techniques
In a standard RNN, information flows only from the past to the present, which means that the current time step's prediction depends solely on past time steps. However, the bidirectional RNN overcomes this limitation by introducing two separate hidden layers in both directions: one capturing information from past time steps in the forward direction, and the other capturing information from future time steps in the backward direction [40]. In this study, the bidirectional technique is applied to the LSTM and GRU architectures, yielding the Bi-LSTM and Bi-GRU models. Figure 9 below illustrates the process through which Bi-LSTM and Bi-GRU generate the predicted values.
In Figure 9, $h$, $h'$, $t$, $y$, and $x$ represent the forward hidden state, backward hidden state, time step, output, and input, respectively. As illustrated in Figure 9, the application of the bidirectional technique to LSTM and GRU introduces two RNN layers, one dedicated to learning information from the forward context and the other from the backward context. This stands in contrast to the single LSTM and GRU models, which consider only unidirectional information. Consequently, when forecasting time series data, bidirectional models leverage information from both directions. Because of this bidirectional training process, Bi-LSTM and Bi-GRU take a longer training time than the conventional LSTM or GRU models.
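The forward and backward passes can be illustrated with a toy scalar recurrent cell. This is a minimal sketch under simplifying assumptions: a plain tanh cell stands in for the LSTM or GRU cell, both directions share the same hypothetical weights `w_x` and `w_h` (real bidirectional layers learn separate weights per direction), and the input sequence is made up.

```python
import math

def run_rnn(xs, w_x=0.5, w_h=0.8):
    """Toy scalar recurrent cell: h_t = tanh(w_x * x_t + w_h * h_{t-1})."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def bidirectional(xs):
    """Pair the forward state h_t with the backward state h'_t at every time step."""
    forward = run_rnn(xs)                  # reads the sequence from past to future
    backward = run_rnn(xs[::-1])[::-1]     # reads it from future to past, then re-aligns
    return list(zip(forward, backward))

xs = [0.0, 0.0, 1.0, 0.0]                  # a single spike at t = 2
pairs = bidirectional(xs)
for t, (h_fwd, h_bwd) in enumerate(pairs):
    print(t, round(h_fwd, 4), round(h_bwd, 4))
```

At the first time step the forward state is still zero, whereas the backward state already reflects the later spike. The sequence is processed twice, which is consistent with the longer training time noted above.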

Simulation Results and Discussion
In this section, a comparative analysis of predictive accuracy and simulation runtime among the six RNN techniques is conducted using the proposed CVPM.

Simulation Environment and Parameters
This section describes the simulation environment and parameters used to predict concatenated vibrations using the six RNN techniques. Table 2 below describes the simulation environment, which comprises Google Colaboratory Pro, Python 3.10.12, and TensorFlow 2.12.0. In addition, Table 3 below provides an explanation of the parameters configured for the simulation.
In Table 3, the number of hidden units is set to 32, the initial learning rate is set to 0.0002, and the number of epochs is set to 10. These settings are chosen for a clear observation of the performance variations when predicting concatenated vibrations using the six RNN techniques. Additionally, the batch size is configured as 32, aligning with the recommended range of 32 to 128 [41]. Furthermore, the adaptive moment estimation (Adam) algorithm is used as the optimization algorithm to minimize training errors and avoid local minima problems in the simulations [42].
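For reproducibility, the parameters stated above can be collected in a single configuration object. The sketch below is only one possible way to organize them; the class name `SimulationConfig` and its field names are hypothetical, and only the values are taken from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationConfig:
    """Simulation parameters as stated in the text; field names are illustrative."""
    hidden_units: int = 32
    learning_rate: float = 0.0002   # initial learning rate
    epochs: int = 10
    batch_size: int = 32
    optimizer: str = "adam"
    # Training segment sizes evaluated in the simulations
    train_fractions: tuple = (0.4, 0.5, 0.6, 0.7, 0.8, 0.9)

config = SimulationConfig()
print(config)
```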

Waveform of Predicted Vibrations
This section explains the predicted vibration waveforms when using the six RNN techniques to train on 40%, 50%, 60%, 70%, 80%, and 90% of the concatenated vibration data and to predict the remaining 60%, 50%, 40%, 30%, 20%, and 10%, respectively. The coefficient of determination, $R^2$, for comparing the predictive accuracy of the six RNN techniques in the proposed CVPM is calculated as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $n$, $y$, $\hat{y}$, and $\bar{y}$ denote the total number of vibration data values, the actual vibration values, the predicted vibration values, and the average vibration value, respectively. Figure 10a-f shows the waveforms of the vibrations predicted by the six RNN techniques for the 40%, 50%, 60%, 70%, 80%, and 90% training segments, respectively.
Figure 10a shows the predicted waveform of the vibrations after training on the 40% segment using the six RNN techniques. Because the 40% training segment contains only normal vibration data, the models cannot learn abnormal vibrations. Therefore, the average $R^2$ value of the vibrations predicted by the six RNN techniques is 0.49, which is very low.
According to Figure 10b, as with the 40% training segment, the 50% training segment contains no abnormal vibration data. However, since more information about the amplitudes and periods of vibrations is learned than in the 40% training segment, the average $R^2$ value of the six RNN techniques is 0.75. This indicates an increase in prediction accuracy of approximately 53.06% compared to the average $R^2$ value of the 40% training segment.
Figure 10c shows the predicted vibration waveform after training on the 60% segment, which comprises 50% of normal vibration data and 10% of abnormal vibration data. Since both normal and abnormal vibrations are learned, the average $R^2$ value of the six RNN techniques is 0.88, an increase of approximately 79.59% compared to the average $R^2$ value in the 40% training segment. Furthermore, as observed in Figure 10d, since 20% of abnormal vibration data is learned in the 70% training segment, the average $R^2$ value of the six RNN techniques is 0.92, an increase of approximately 87.76% compared to the $R^2$ value in the 40% training segment.
According to the simulation results in Figure 10e,f, where 30% and 40% of abnormal vibration data are learned, respectively, the $R^2$ values increase by approximately 85.71% and 93.88%, respectively, compared to the $R^2$ value of the 40% training segment.
In short, according to Figure 10, all six RNN techniques show a general improvement in prediction accuracy as the training segment size increases. In the next section, the predicted and actual vibration values from the six RNN techniques will be comparatively analyzed using scatter plots.
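The quantities used in this section can be reproduced with a few lines of code. The following sketch computes the coefficient of determination, the percentage increase relative to the 40% training segment, and the share of abnormal data inside a training segment. The function names are hypothetical; the $R^2$ averages are the values quoted above, and the 50% normal share follows from the construction of the concatenated dataset.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

def relative_increase(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0

def abnormal_share(train_fraction, normal_fraction=0.5):
    """Fraction of the whole dataset that is abnormal data inside the training segment."""
    return max(0.0, train_fraction - normal_fraction)

baseline = 0.49  # average R^2 of the 40% training segment
for segment, r2 in [(0.5, 0.75), (0.6, 0.88), (0.7, 0.92)]:
    print(f"{segment:.0%} segment: abnormal share {abnormal_share(segment):.0%}, "
          f"R^2 increase {relative_increase(baseline, r2):.2f}%")
```

A perfect prediction gives $R^2 = 1$, and a prediction no better than the mean of the actual values gives $R^2 = 0$, which is why the 0.49 average of the 40% segment is considered very low.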

Comparison of Scatter Plots
In this section, the vibration prediction accuracy of the six RNN techniques is comparatively analyzed based on changes in the training segment using scatter plots.
Figure 11 shows the scatter plot of the vibration data predicted by the six RNN techniques against the actual vibration data. According to Figure 11a,b, after the six RNN techniques are trained on the 40% and 50% training segments, the predicted vibration values notably differ from the actual vibration values. Specifically, the prediction accuracy of LSTM is significantly lower than that of the other five RNN techniques. According to Figure 11c, the vibration values predicted by the six RNN techniques gradually converge toward the actual concatenated vibration values once abnormal vibration data is learned, starting from the 60% segment.
Additionally, as shown in Figure 11d-f, the predicted vibration values match closely with the actual vibration values from the 70% training segment onward. These simulation results are attributed to the inclusion of more abnormal vibration data during training when the training segment is 70% or larger.

Comparative Analysis of Concatenated Vibration Prediction Accuracy
In this section, the changes in the $R^2$ values of the six RNN techniques are analyzed as the training data segment increases from 40% to 90%. According to Figure 12, the $R^2$ values achieved by the six RNN techniques increase as the training segment increases from 40% to 90%. Furthermore, the accuracy of the vibrations predicted by Attn.-LSTM and Attn.-GRU, both utilizing the attention mechanism, is the highest within their respective LSTM and GRU groups across all the training segments.
Furthermore, in the 40% training data segment, the vibration prediction results obtained using LSTM and GRU are similar to or lower than those obtained using Bi-LSTM and Bi-GRU. However, as the training segment increases, the $R^2$ values of LSTM and GRU become higher than those of Bi-LSTM and Bi-GRU. Table 4 below presents the $R^2$ values for each segment of the six RNN techniques. Within the GRU group, Attn.-GRU achieved the highest prediction accuracy in all the training segments, while Bi-GRU showed lower accuracy than GRU. Moreover, in the 40% training segment, the prediction accuracy of LSTM and GRU is similar to or lower than that of Bi-LSTM and Bi-GRU, but from the 60% training segment, the prediction accuracy of LSTM and GRU is higher than that of Bi-LSTM and Bi-GRU. These simulation results indicate that when the training segment is 40% or 50%, LSTM, GRU, Bi-LSTM, and Bi-GRU face the common challenge of having a small training data size and are unable to learn abnormal vibrations. However, within the same training segment, Bi-LSTM and Bi-GRU learn more information about the vibration's amplitude and frequency than LSTM and GRU.
Nevertheless, starting from the 60% training segment, LSTM and GRU also learn sufficient amplitude and frequency information, similar to Bi-LSTM and Bi-GRU. Meanwhile, Bi-LSTM and Bi-GRU learn normal vibrations excessively, which acts as an interfering factor when predicting abnormal vibrations. As a result, the prediction accuracy of Bi-LSTM and Bi-GRU becomes lower than that of LSTM and GRU.
In the next section, we will conduct a detailed comparative analysis of the simulation runtime required by the six RNN techniques in order to predict vibrations.

Comparison of Simulation Runtimes
In this section, the simulation runtimes required for the six RNN techniques to predict concatenated vibrations are analyzed comparatively. A short runtime alone does not make a technique preferable: as evident from Figure 12 and Table 4, in the 40% training segment, LSTM exhibits a very fast simulation runtime but the lowest predictive accuracy among the six RNN techniques. Therefore, in Section 5.6, the simulation runtime and predictive accuracy of the six RNN techniques are compared together to analyze their prediction performance comprehensively.

Comparative Analysis of Accuracy Efficiency
In this section, a comparative analysis is conducted by simultaneously evaluating the simulation runtimes and prediction accuracy of the six RNN techniques. This assessment aims to determine the RNN techniques that demonstrate the best performance in predicting concatenated vibrations. Figure 13 compares the average $R^2$ value of each RNN model with its average simulation runtime. In Figure 13, the bars represent the simulation runtime, and the circle, triangle, square, diamond, star, and inverted triangle represent the $R^2$ values of the 40%, 50%, 60%, 70%, 80%, and 90% training segments, respectively.
According to Figure 13, all six RNN techniques show an increase in simulation runtime as the training segment extends from 40% to 90%. Furthermore, the $R^2$ values increase with the expansion of the training segment, converging from approximately the 70% training segment onward. Among the converged $R^2$ values, Attn.-LSTM and Attn.-GRU show the highest values, while Bi-LSTM and Bi-GRU show the lowest. In terms of simulation runtimes, LSTM and GRU are the shortest, while Bi-LSTM and Bi-GRU are the longest.
Table 6 below presents an analysis of the prediction accuracy improvement rates and the simulation runtime changes of the remaining five RNN techniques relative to LSTM, which has the lowest average prediction accuracy. According to Table 6, within the 40% training segment, the $R^2$ values for Attn.-LSTM, Bi-LSTM, GRU, Attn.-GRU, and Bi-GRU each exhibit an accuracy increase of at least approximately 730% compared to LSTM. Furthermore, Attn.-LSTM, Attn.-GRU, Bi-LSTM, and Bi-GRU exhibit longer simulation runtimes than LSTM and GRU across all the training segments because of the additional attention mechanisms and bidirectional methods applied to the LSTM and GRU models. However, as shown in Table 6, both Attn.-LSTM and Attn.-GRU exhibit an increase in vibration prediction accuracy compared to LSTM across all the training segments. In contrast, the vibration prediction accuracy of Bi-LSTM and Bi-GRU is slightly lower than that of LSTM in the 60%, 70%, and 80% training segments. Whereas LSTM and GRU are unidirectional, Bi-LSTM and Bi-GRU are bidirectional, so they tend to learn normal vibrations excessively, and these act as distractors when predicting vibrations in both directions.
However, averaged over all the training segments, the $R^2$ values of Bi-LSTM and Bi-GRU are higher than those of LSTM. Therefore, in Table 7, the prediction accuracy and simulation runtime of LSTM are compared with those of the other five RNN techniques. Table 7 presents the change rates in average simulation runtimes and average $R^2$ values for the five RNN techniques compared to those of LSTM. According to the findings in Table 7, Attn.-LSTM shows an 8.26% increase in simulation runtime compared to LSTM, along with a 288.17% accuracy improvement. Similarly, Attn.-GRU exhibits a 15.40% average increase in simulation runtime compared to LSTM, with an accompanying 257.50% accuracy improvement. Furthermore, Attn.-LSTM and Attn.-GRU show smaller increases in simulation runtime than Bi-LSTM and Bi-GRU, and yet they achieve a more substantial increase in accuracy. Hence, Attn.-LSTM and Attn.-GRU exhibit high prediction accuracy relative to their modest runtime increases. The lower prediction accuracy of Bi-LSTM and Bi-GRU despite their longer simulation runtimes can be attributed to the following reasons.
Attn.-LSTM and Attn.-GRU, with their attention mechanisms designed to focus on information highly correlated with the prediction target in the training data, achieve a higher prediction accuracy when there is a low proportion of abnormal vibrations in the prediction segment. This is due to their ability to concentrate on training information relevant to abnormal vibrations, resulting in high prediction accuracy despite shorter simulation runtimes. However, Bi-LSTM and Bi-GRU, which employ bidirectional learning based on LSTM and GRU, not only transmit information from the previous time steps to the following ones, but also learn from the backward time steps. This bidirectional learning leads to an excessive focus on normal vibrations, which act as distractors when predicting abnormal vibrations within the prediction segment. Consequently, when predicting abnormal vibrations, Bi-LSTM and Bi-GRU incur significantly longer simulation runtimes and still demonstrate lower prediction accuracy.
Consequently, based on the simulation results, it is established that among the six RNN techniques employed for concatenated vibration data prediction, Attn.-LSTM and Attn.-GRU are the most suitable RNN techniques, while Bi-LSTM and Bi-GRU exhibit the least suitability.
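The trade-off discussed above can be summarized as the accuracy improvement obtained per unit of additional runtime. The ratio below is an illustrative measure introduced here for clarity, not a metric defined in this study; the helper name `gain_per_runtime` is hypothetical, and only the two techniques whose Table 7 change rates are quoted in the text are included.

```python
# Change rates relative to LSTM (percent), as quoted in the text from Table 7
changes = {
    "Attn.-LSTM": {"runtime": 8.26, "accuracy": 288.17},
    "Attn.-GRU": {"runtime": 15.40, "accuracy": 257.50},
}

def gain_per_runtime(entry):
    """Accuracy improvement (%) per 1% of additional runtime; an illustrative ratio only."""
    return entry["accuracy"] / entry["runtime"]

# Rank the techniques from the most to the least accuracy gained per unit of runtime
ranking = sorted(changes, key=lambda name: gain_per_runtime(changes[name]), reverse=True)
for name in ranking:
    print(f"{name}: {gain_per_runtime(changes[name]):.1f}")
```

Under this ratio, a technique with a larger accuracy gain and a smaller runtime increase ranks higher, which mirrors the qualitative comparison made with Figure 13 and Table 7.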

Conclusions
In this study, a CVPM is proposed for predicting concatenated vibrations, and a comparative analysis of vibration prediction accuracy and simulation runtime is conducted among the six RNN techniques: LSTM, Attn.-LSTM, Bi-LSTM, GRU, Attn.-GRU, and Bi-GRU. The vibration data used in this study, including both normal and abnormal vibration data, are collected from the motors connected to the drone's propellers. Then, the concatenated vibration dataset is created by merging 50% of normal vibration data with 50% of abnormal vibration data. Subsequently, this vibration dataset is utilized to comprehensively analyze the prediction accuracy and simulation runtime of the six RNN techniques using the proposed CVPM.
According to the simulation results, as the training segment increases, the predictive accuracies and simulation runtimes also increase for all six RNN techniques. Additionally, when predicting concatenated vibrations, Attn.-LSTM and Attn.-GRU, which utilize the attention mechanism, exhibit the highest predictive accuracy relative to simulation runtime. On the other hand, the Bi-LSTM and Bi-GRU models, which employ the bidirectional technique, exhibit the lowest predictive accuracy relative to simulation runtime. Therefore, it is determined that Attn.-LSTM and Attn.-GRU are the most suitable RNN models for predicting concatenated vibrations.
In future work, we plan to predict vibrations transitioning smoothly from normal to abnormal states by incorporating the early and advanced stages of the transition, and to apply this to an actual vibration prediction system.

Figure 2. Configuration of collecting time series vibration data.

Figure 4 (below) illustrates the waveforms of the normalized normal and abnormal vibration data.

Figure 5. Waveform of a concatenated vibration from the normal and abnormal vibrations.

Figure 7 presents the architecture of GRU, equipped with gating mechanisms that regulate the flow of information through the network. It comprises two gates: the reset gate and the update gate. The reset gate determines the extent to which the previous hidden state is combined with the new input, allowing the model to decide what information to discard and what to keep, while the update gate controls how much of the previous hidden state should be retained and how much of the new information should be added to the current hidden state.

Figure 8. Architecture of LSTM and GRU with the attention mechanism.

Figure 12 below illustrates the $R^2$ values of the six RNN techniques in relation to the changes in the training segment.

Figure 12. Average value of the coefficient of determination according to the size of the training segment.

Figure 13. Comparison of the average value of the coefficient of determination for each RNN model according to the average simulation runtime.

Table 4. $R^2$ values of six RNN techniques as the training segment changes.

Table 5 below presents the average simulation runtimes for each training segment. According to Table 5, the simulation runtimes of all six RNN techniques increase with the training segment size. Furthermore, the LSTM and GRU techniques show faster simulation speeds than the Attn.-LSTM, Bi-LSTM, Attn.-GRU, and Bi-GRU techniques across all the training segments.

Table 5. Average simulation runtime (s) by training segment.

Table 6. Change rates in simulation runtime (s) and prediction accuracy (%) in comparison to LSTM.

Table 7. Change rates in average simulation runtime (s) and prediction accuracy ($R^2$) compared to LSTM.