Motion Sequence Decomposition-Based Hybrid Entropy Feature and Its Application to Fault Diagnosis of a High-Speed Automatic Mechanism

High-speed automatic weapons play an important role in national defense. However, current research on the reliability of automatons relies principally on simulation, because experimental data are difficult to collect. Unlike rotating machinery, a high-speed automaton must accomplish a complex motion consisting of a series of impacts. In addition to strong noise, the impacts generated by different components of the automaton interfere with each other, and no effective approach exists to cope with this in the fault diagnosis of automatic mechanisms. This paper proposes a motion sequence decomposition approach, combined with modern signal processing techniques, for fault detection in high-speed automatons. We first investigate the entire working procedure of the automatic mechanism and calculate the action time of each travel involved. The vibration signal collected from the shooting experiment is then divided into a number of impacts corresponding to the action order. Only the segment generated by a faulty component is isolated from the original impacts according to the action time of that component. Wavelet packet decomposition (WPD) is first applied to the resulting signals to investigate the energy distribution, and the components with higher energy are selected for feature extraction. Three information entropy features, extracted using empirical mode decomposition (EMD), are used to distinguish the various states of the automaton. A gray-wolf optimization (GWO) algorithm is introduced to improve the performance of the support vector machine (SVM) classifier. We carry out shooting experiments to collect vibration data for demonstration of the proposed work. Experimental results show that the proposed approach is effective for fault diagnosis of a high-speed automaton and can be applied in practice. Moreover, GWO provides a competitive diagnosis result compared with the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm.


Introduction
A high-speed automaton is the core equipment of antiaircraft guns, and fault diagnosis of high-speed automatons is vital for national security. Unlike rotating machinery, a high-speed automaton is a complex dynamic system; intense impact, friction, vibration, and beating often appear over the course of component movement. These factors can easily cause the components of the automaton to fail. In addition to heavy background noise, the impacts generated by the components involved influence each other significantly. It is worth pointing out that this motion characteristic, a set of unique impacts, can only be observed in a high-speed automaton. Thus, fault diagnosis of automatic mechanisms is a challenging task in practical applications. Moreover, because of the danger and complexity of this task, the foremost means of investigating the reliability of an automaton is simulation analysis. To the best of our knowledge, fault diagnosis of various non-rotating components in high-speed automatic mechanisms using vibration signals collected from a shooting experiment has not been implemented in previous works. Therefore, further work towards an efficient technique for this challenging task is urgently needed.
A shooting action is accomplished by intense impacts of various components of the automaton in an accurate time order. In these motion sequences, two components may be involved at one time point while a single component may move at the next. Serial shootings repeat the motion sequences rapidly. The disadvantage of impact propagation is that it transmits assorted impact signals from various components of the automaton in addition to the signals of interest. This phenomenon makes it difficult to recognize the real state of a high-speed automaton using existing techniques. Moreover, when non-rotating components operate at high speeds or within a complex environment, the weak fault information may be suppressed by huge impacts. Considering the harsh working conditions and the specific motion characteristic, commonly used techniques cannot be applied directly to the vibration signal derived from serial impacts, although traditional time-frequency analysis techniques are effective for nonlinear and non-stationary problems. This paper proposes a two-stage fault diagnosis scheme integrating motion sequence decomposition and modern signal processing to carry out fault diagnosis of an automaton. In the first stage, the vibration signal generated by a shooting action is segmented into a series of sub-segments that match the actions of the components involved according to their action times. Only the travels in which a fault occurs are considered for automaton fault diagnosis, while the rest are ignored. In the second stage, time-frequency approaches are applied to the segmented signals for feature extraction and fault diagnosis.
As mentioned earlier, time-frequency analysis is an extensively applied tool in the field of fault diagnosis; examples include wavelet packet decomposition (WPD), empirical mode decomposition (EMD), and singular value decomposition (SVD). WPD is a powerful decomposition approach whose applications cover various stages of fault diagnosis. For example, Lu [1] used a WPD-based energy entropy feature for bearing degradation estimation, which is a representative application of WPD in feature extraction. In addition, wavelet packet decomposition is an excellent de-noising method that has been applied widely in fault diagnosis. Cui [2] used wavelet de-noising to preprocess vibration signals for diagnosis of rolling bearing faults prior to feature extraction. Liu [3] employed a wavelet de-noising approach to remove noise in vibration data and increase the signal-to-noise ratio (SNR). Tabrizi [4] used wavelet packet decomposition and reconstruction to reduce noise for incipient damage detection. In order to reveal the signature of repetitive transients caused by early faults, Wang [5,6] introduced novel Bayesian inference-based optimal wavelet filtering to extend the infograms [7] based on an entropy metric for fault identification. WPD makes important information available on high-frequency components, and the impacts covering a wide frequency range are assigned large coefficients while the smaller coefficients are set to 0. In this paper, the vibration signal collected from serial impacts is usually contaminated by environmental noise. Thus we adopt WPD to reduce heavy background noise and retain the most useful information for feature extraction.
Information entropy is able to quantify the complexity and uncertainty of non-periodic vibration signals. Furthermore, many contributions [8][9][10][11][12] have considered information entropy as a fault feature because the occurrence of faults may change the signal distribution in the time and frequency domains. However, a single feature is typically sensitive only to a particular type of failure. As a result, it is necessary to estimate the state of machinery from different points of view in order to implement a more reliable fault diagnosis. The vibration signal generated by the components of the automaton is non-periodic, and information entropy is a suitable measure for capturing non-periodic fault information. Therefore, three information entropy features are presented to comprehensively describe the state of an automaton. The scheme for the extraction of entropy features is as follows: (1) EMD adaptively decomposes the de-noised signal into a number of modes to form a data matrix for singular value decomposition. The aim of this procedure is to avoid construction of a trajectory matrix, so that it is not necessary to reconstruct a phase space involving extra parameters, such as the embedding dimension and lag time; (2) three information entropy features, namely the singular spectrum entropy, the Hilbert time-frequency spectrum entropy, and the Hilbert marginal spectrum entropy, are extracted from the time and frequency domains of the intrinsic mode functions (IMFs) for intelligent fault diagnosis of automatons.
Intelligent fault diagnosis has been an attractive research direction in recent years; many meaningful works [13][14][15][16][17][18] on the implementation of intelligent fault diagnosis in rotating machinery have been reported. It is generally known that the associated parameters have a significant impact on the performance of fault diagnosis classifiers, such as the support vector machine (SVM) and the artificial neural network (ANN). Another challenge related to fault diagnosis of high-speed automatic mechanisms is the lack of samples. For rotating machinery, the number of samples extracted from the vibration signal depends on the working time of the rotating components. For a high-speed automaton, the number of samples depends on the number of shots fired. In practice, the number of bullets used for an experiment is finite. Hence, in this paper, to validate the effectiveness of the proposed work, SVM is employed to recognize the various states of a high-speed automaton because of its strong ability to classify small samples. To improve the identification ability of the fault classifier, the foremost choice for determining the parameters that yield a higher classification rate is an optimization algorithm. Ao [19] proposed an artificial chemical reaction optimization algorithm to optimize SVM parameters for bearing fault diagnosis. Cerrada [20] used a genetic algorithm (GA) to enhance the identification accuracy of the random forest classifier. Particle swarm optimization (PSO) is also an efficient approach for improving the reliability of fault diagnosis [1]. However, in these algorithms, search agents only update their states towards the best solution obtained so far, without a specialized mechanism for local optima avoidance. To exploit the superiority of SVM in small-sample classification, this paper introduces a gray-wolf optimization (GWO) algorithm to enhance the performance of SVM in fault diagnosis of an automaton.
This paper is organized as follows: Section 2.1 introduces the motion sequence decomposition (MSD) scheme. Section 2.2 introduces the three information entropy features prepared for automaton fault diagnosis. Section 2.3 briefly reviews the theoretical background of SVM and GWO. Section 3 demonstrates the effectiveness of the proposed approach in practical applications. Finally, conclusions are drawn in Sections 4 and 5. The flowchart of the proposed work is shown in Figure 1.

Motion Sequence Decomposition
Fault diagnosis of a high-speed automatic mechanism is a challenging task in practical applications, principally due to its complex motion procedures and its harsh working environment, including high temperatures, high pressure, and high impact. Considering that plenty of state information on a non-rotating component can be detected by examining the vibration signal of that component, it is essential to investigate the entire working procedure of an automatic mechanism as well as the corresponding action times. As listed in Table 1, the motion procedure is broken into a number of single actions according to the motion time of each component. The total time for accomplishing one shooting is 113.45 ms (the action time of each travel is highlighted with a grey background). Since the function of each component in a shooting is known a priori, we can easily determine the travels in which the faults occur. Unlike rotating machinery, a high-speed automatic mechanism realizes its function mainly through sequential impacts between the components involved, and the impact signal propagates throughout all components of the high-speed automaton with rapid attenuation. For example, as shown in Figure 2, four components impact each other in a certain order to accomplish a shooting. Two points, A and B, are impacted as component 4 moves to the right. Conversely, component 4 impacts point C as it returns. According to their action times, these actions occur at travel 2 and travel 18, respectively. The vibration signal consisting of a series of impacts is then associated with the decomposed actions. Only the signals derived from the faulty travels are segmented for fault diagnosis of an automatic mechanism. Figure 3 gives an example to illustrate this point. We observe that a shooting gives rise to a series of impulses acting on the components involved, and each impact decays to 0 g quickly during a single shooting. Therefore, in order to enhance the reliability of the fault diagnosis of an automatic mechanism, the impacts generated by the fault should be separated from the signal at the points where its amplitude decreases to approximately 0 g.
As displayed in Figure 3, several free travels exist in the actions of the components, because there is no obvious impact between impact 2 and impact 3. This means that some actions fail to cause a collision. Moreover, the labeled impacts in Figure 3 are derived from the actions of components in various travels. In practical applications, only the high impacts in travel 2 and travel 18 lead to a failure of component 1 and component 3. In this paper, faults are introduced to components 1 and 3 for validation of the proposed work. The details are presented in Section 3. Thus the vibration signal corresponding to travel 2 and travel 18 is selected from each shooting for fault diagnosis.
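The segmentation step can be illustrated with a minimal sketch. It assumes that the start and end time of each travel are known from the action-time table; the sampling rate `FS_HZ` and the two time windows in `FAULTY_TRAVELS` are hypothetical placeholders, not values from Table 1, and only the 113.45 ms shot duration comes from the text.

```python
from typing import Dict, List, Tuple


def segment_by_travel(
    signal: List[float],
    fs_hz: float,
    windows_ms: Dict[int, Tuple[float, float]],
) -> Dict[int, List[float]]:
    """Cut one shot's vibration record into the requested travel segments."""
    segments: Dict[int, List[float]] = {}
    for travel, (start_ms, end_ms) in windows_ms.items():
        if not 0.0 <= start_ms < end_ms:
            raise ValueError(f"invalid window for travel {travel}")
        # Convert action times (ms) to sample indices.
        start = int(round(start_ms * fs_hz / 1000.0))
        stop = min(int(round(end_ms * fs_hz / 1000.0)), len(signal))
        segments[travel] = signal[start:stop]
    return segments


FS_HZ = 100_000.0   # hypothetical sampling rate
SHOT_MS = 113.45    # duration of one shooting, from the text
n_samples = int(round(SHOT_MS * FS_HZ / 1000.0))
shot = [0.0] * n_samples  # placeholder for a measured record

# Hypothetical action-time windows (ms) for the two faulty travels.
FAULTY_TRAVELS = {2: (1.0, 4.0), 18: (90.0, 95.0)}
segments = segment_by_travel(shot, FS_HZ, FAULTY_TRAVELS)
```

Only the segments for travel 2 and travel 18 are returned; every other travel in the shot is discarded, which mirrors the selection described above.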

Hybrid Information Entropy
Considering that the data collected from a machine gun comprise a large number of non-stationary shocks, the presence of a fault may influence the frequency and energy distribution significantly. Thus, the obtained data are first preprocessed through wavelet packet decomposition (WPD) to investigate the frequency components and energy distribution. Only the most important components are chosen for further estimation. Due to space limitations, readers may refer to [4] for details regarding the theoretical background of WPD.
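Energy-based selection of wavelet packet components can be sketched as follows. This is an illustration only: the Haar wavelet, the decomposition level, and the number of retained nodes are assumptions made here for brevity, since the text does not state them at this point, and the helper names are hypothetical.

```python
import math
from typing import List


def haar_split(x: List[float]) -> tuple:
    """One orthonormal Haar step: low-pass (approx) and high-pass (detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wpd_nodes(x: List[float], level: int) -> List[List[float]]:
    """Full wavelet packet tree: split every node at every level.

    Terminal nodes are returned in natural (filter-bank) order,
    which is not the same as increasing frequency order.
    """
    if len(x) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def energy_ratios(nodes: List[List[float]]) -> List[float]:
    """Fraction of the total energy carried by each terminal node."""
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]


def select_nodes(ratios: List[float], keep: int) -> List[int]:
    """Indices of the `keep` nodes with the highest energy ratio."""
    order = sorted(range(len(ratios)), key=lambda i: ratios[i], reverse=True)
    return order[:keep]


# Synthetic decaying oscillation standing in for one segmented impact.
signal = [math.exp(-0.05 * k) * math.sin(0.9 * k) for k in range(64)]
nodes = wpd_nodes(signal, level=3)
ratios = energy_ratios(nodes)
kept = select_nodes(ratios, keep=2)
```

Because the Haar step is orthonormal, the node energies sum to the energy of the input, so the ratios can be read directly as an energy distribution.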
Empirical mode decomposition (EMD) is a powerful time-frequency analysis technique that has been intensively applied for fault diagnosis [21][22][23]. It adaptively decomposes the original vibration signal into a series of intrinsic mode functions (IMFs) containing local features. For a vibration signal $Y(t)$, this procedure can be formulated as:

$$Y(t) = \sum_{i=1}^{n} c_i(t) + r_n(t),$$

where $c_i(t)$ and $r_n(t)$ are the IMFs and the residue, respectively. After EMD, the resulting components show the detailed characteristics of the original input from different points of view.
Entropy is a sensitive index for quantifying the complexity and uncertainty of a system. In this paper, hybrid entropy is employed to extract features that quantify information changes under different conditions of a machine gun. Given a data matrix $A = [c_1(t), c_2(t), \cdots, c_n(t)]^T$, singular value decomposition of the feature space $A$ is carried out to reflect the intrinsic characteristic of each component [13]:

$$A = U S V^T,$$

where $U$ and $V$ are orthogonal matrices and $S = \mathrm{diag}[\delta_1, \delta_2, \cdots, \delta_n]$ is a diagonal matrix whose elements are arranged in descending order. The singular spectrum entropy is then the Shannon entropy of the normalized singular values:

$$-\sum_{i=1}^{n} p_i \log p_i, \qquad p_i = \frac{\delta_i}{\sum_{j=1}^{n} \delta_j}.$$

Singular spectrum entropy is a measure of the uncertainty of the vibration energy in a system and can be regarded as a time-domain feature index. Space spectrum entropy, in contrast, reflects the uncertainty of local energy in the system. After the Hilbert transform, the Hilbert spectrum $H(\omega, t)$ represents the intrinsic mode functions in the time-frequency domain. Similarly, the space spectrum entropy is calculated from the singular values $\lambda_i$ of the Hilbert spectrum:

$$H_{kj} = -\sum_{i} q_i \log q_i, \qquad q_i = \frac{\lambda_i}{\sum_{j} \lambda_j}.$$

Subsequently, the Hilbert marginal spectrum $h(\omega) = \int H(\omega, t)\,dt$ is calculated to reflect the relationship between frequency and signal energy. The marginal spectrum entropy is then defined as:

$$H_{bj} = -\sum_{i} u_i \log u_i, \qquad u_i = \frac{h(\omega_i)}{\sum_{j} h(\omega_j)}.$$

Marginal spectrum entropy $H_{bj}$ and space spectrum entropy $H_{kj}$ are two metrics for capturing the instantaneous characteristics of the modes, so the hybrid entropy feature can assess the real status of a machine gun comprehensively.
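All three entropy features share one computation: a non-negative spectrum (singular values of the IMF matrix, singular values of the Hilbert spectrum, or samples of the Hilbert marginal spectrum) is normalized to a probability distribution, and its Shannon entropy is taken. The sketch below assumes the natural logarithm and takes the spectra as given inputs; the function names and the numbers are hypothetical and serve only to illustrate the normalization.

```python
import math
from typing import Sequence, Tuple


def shannon_entropy(weights: Sequence[float]) -> float:
    """Entropy of a non-negative spectrum after normalizing it to sum to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")
    entropy = 0.0
    for w in weights:
        p = w / total
        if p > 0:  # a zero-probability term contributes nothing
            entropy -= p * math.log(p)
    return entropy


def hybrid_entropy_feature(
    imf_singular_values: Sequence[float],
    hilbert_singular_values: Sequence[float],
    marginal_spectrum: Sequence[float],
) -> Tuple[float, float, float]:
    """Feature vector: (singular, space, marginal) spectrum entropies."""
    return (
        shannon_entropy(imf_singular_values),
        shannon_entropy(hilbert_singular_values),
        shannon_entropy(marginal_spectrum),
    )


# Hypothetical spectra for one segmented impact.
feature = hybrid_entropy_feature(
    imf_singular_values=[4.0, 2.0, 1.0, 1.0],
    hilbert_singular_values=[3.0, 3.0, 3.0],
    marginal_spectrum=[5.0, 0.0, 0.0],
)
```

A flat spectrum gives the maximum entropy for its length, while a spectrum concentrated in a single component gives zero, which is why a fault that redistributes energy changes the feature.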

Support Vector Machine
Support vector machine (SVM) is a foremost choice for small-sample learning problems. For classification, SVM maps the original input space into a high-dimensional feature space through a nonlinear transformation, and it has been applied in many fields to nonlinear, high-dimensional pattern recognition problems that are prone to over-fitting.
The goal of SVM is to find an optimal hyperplane to identify sample patterns. Take a dataset $\{(x_i, d_i)\}_{i=1}^{N}$, where $x_i$ is the $i$th input feature vector, $d_i$ is the target value, and $N$ is the total number of training samples. An error function is constructed to transform this aim into the minimization of a quadratic function [24]:

$$\min_{w, b, \varepsilon} \ \frac{1}{2} w^T w + g \sum_{i=1}^{N} \varepsilon_i \quad \text{s.t.} \quad d_i \left( w^T \phi(x_i) + b \right) \ge 1 - \varepsilon_i, \ \ \varepsilon_i \ge 0, \tag{5}$$

where $\varepsilon_i$ denotes the slack variable, $g$ is a regularization parameter (penalty factor) that controls the complexity of the classifier and the number of inseparable sample points, and $\phi(x_i)$ is a nonlinear mapping function that maps the original input space into a high-dimensional feature space.
The Lagrange function is introduced to determine the solution of Equation (5):

$$J(w, b, \varepsilon, \alpha, \mu) = \frac{1}{2} w^T w + g \sum_{i=1}^{N} \varepsilon_i - \sum_{i=1}^{N} \alpha_i \left[ d_i \left( w^T \phi(x_i) + b \right) - 1 + \varepsilon_i \right] - \sum_{i=1}^{N} \mu_i \varepsilon_i, \tag{6}$$

where $\alpha_i \ge 0$ and $\mu_i \ge 0$ are Lagrange multipliers. Differentiate $J$ with respect to $w$ and $b$, then set the partial derivatives to zero to satisfy the optimality condition: $\partial J / \partial w = 0$, $\partial J / \partial b = 0$. The optimization problem is then converted to:

$$\max_{\alpha} \ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j \, \phi(x_i)^T \phi(x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i d_i = 0, \ \ 0 \le \alpha_i \le g. \tag{7}$$

Thus a classification hyperplane can be obtained by solving Equation (7). The inner product $\phi(x_i)^T \phi(x_j)$ is then replaced with a kernel function $K(x_i, x_j)$ for the purpose of feature mapping. The nonlinear decision function is as follows:

$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{N} \alpha_i d_i K(x_i, x) + b \right), \tag{8}$$

where $K(x_i, x_j)$ is a kernel function satisfying Mercer's theorem. The kernel function is the crucial bridge that maps the original input space into the feature space, and the radial basis function is the foremost choice for this purpose. Here $K(x_i, x_j) = \exp\left( -\lVert x_i - x_j \rVert^2 / (2\sigma^2) \right)$, in which $\sigma$ is the bandwidth. The superiority of SVM relies heavily on the kernel function, and the two parameters $\sigma$ and $g$ may have a significant impact on the performance of the classifier. Therefore, to enhance the reliability and stability of fault diagnosis of a high-speed automaton, a gray-wolf optimization (GWO) algorithm is introduced to improve the identification accuracy of SVM.
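To make the role of the bandwidth σ concrete, the following sketch evaluates a radial basis kernel and a kernel decision function for a trained two-class SVM. It assumes the common Gaussian form with 2σ² in the denominator; the support vectors, multipliers, and bias below are hypothetical values chosen by hand, not the result of training.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], sigma: float) -> float:
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2)) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decide(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    labels: Sequence[int],
    alphas: Sequence[float],
    bias: float,
    sigma: float,
) -> int:
    """Sign of the kernel expansion sum_i alpha_i * d_i * K(x_i, x) + b."""
    score = bias
    for sv, d, alpha in zip(support_vectors, labels, alphas):
        score += alpha * d * rbf_kernel(sv, x, sigma)
    return 1 if score >= 0 else -1


# Hypothetical trained model: one support vector per class.
SUPPORT_VECTORS = [[0.0, 0.0], [2.0, 2.0]]
LABELS = [1, -1]
ALPHAS = [1.0, 1.0]
BIAS = 0.0
SIGMA = 1.0

near_first = decide([0.1, 0.1], SUPPORT_VECTORS, LABELS, ALPHAS, BIAS, SIGMA)
near_second = decide([1.9, 2.0], SUPPORT_VECTORS, LABELS, ALPHAS, BIAS, SIGMA)
```

A small σ makes each support vector influence only its immediate neighborhood, while a large σ smooths the decision boundary, which is why σ is tuned together with the penalty factor g.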

Gray-Wolf Optimization Algorithm
The gray-wolf algorithm is a novel meta-heuristic optimization algorithm imitating the social hierarchy and hunting behavior of gray wolves. Gray wolves generally prefer to live in groups, and the social hierarchy of the gray wolf comprises four levels, as shown in Figure 4 [25]. The first level, α, consisting of a male and a female, is the leader level, which is responsible for decision-making. Other wolves in the group follow the orders of the dominant wolf. The second level is called β. Wolves at this level advise the leaders on decisions and management, and they are probably the best candidates to become the dominant wolf when leaders are lacking. The third level is called δ. Wolves in this rank obey the previous two levels and dominate the wolves in the lowest level, ω. Moreover, the wolves belonging to this level are responsible for performing real tasks for the group. Hunters assist the higher-level wolves in hunting prey and also find food for the group. Sentinels are responsible for protecting the group. Scouts watch the territory and alert the group. These relationships and behaviors (social hierarchy, encircling prey, hunting behavior, and search for prey) are then mathematically modeled for solving optimization problems.

• Step 1. Social hierarchy
To model the social hierarchy of gray wolves in designing the optimization algorithm, α is considered the optimal solution. Accordingly, β and δ are used as the second and third optimal solutions, respectively. The rest of the candidate solutions are defined to be ω. The ω solutions follow the first three levels according to the relationships depicted in Figure 4a.

• Step 2. Encircling prey
Gray wolves are able to determine the position of prey in space and encircle them. This behavior is formulated as:

$$\vec{D} = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right|, \qquad \vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D},$$

where $t$ denotes the iteration number, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}_p$ is the position vector of the prey, and $\vec{X}$ is the position vector of a gray wolf. Here, $\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}$ and $\vec{C} = 2\vec{r}_2$, in which $\vec{a}$ is a vector linearly decreased from 2 to 0 over the iterations ($a = 2 - 2t/M$, where $M$ denotes the maximum number of iterations), and $\vec{r}_1$ and $\vec{r}_2$ are random vectors whose values are drawn from [0, 1]. The position of a gray wolf is updated once a better solution is obtained. A gray wolf can therefore reach different positions around the prey in 2D or 3D space by changing the values of the coefficient vectors $\vec{A}$ and $\vec{C}$.

• Step 3. Hunting prey
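Gray wolves have the ability to determine the position of prey. Since the optimal position of the prey in the search space is not known beforehand, the best candidate solution is assumed to be the target prey. After a better solution is found, the other search agents update their positions towards the optimal search agent. This behavior can be formulated as:

$$D_\alpha = |C_1 \cdot X_\alpha - X|, \quad D_\beta = |C_2 \cdot X_\beta - X|, \quad D_\delta = |C_3 \cdot X_\delta - X|,$$

$$X_1 = X_\alpha - A_1 \cdot D_\alpha, \quad X_2 = X_\beta - A_2 \cdot D_\beta, \quad X_3 = X_\delta - A_3 \cdot D_\delta,$$

$$X(t+1) = \frac{X_1 + X_2 + X_3}{3}.$$

It can be observed that the dominant wolves, α, β, and δ, estimate the position of the prey, while the other wolves update their positions randomly around the estimated position of the prey.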

• Step 4. Attacking prey
After encircling the prey, gray wolves attack once the prey stops moving. As mentioned earlier, this behavior is modeled by linearly decreasing the value of a from 2 to 0. Because A = 2a·r_1 − a with r_1 drawn from [0, 1], A contains random values in [−a, a], so its range of fluctuation shrinks as a decreases. So far, search agents update their positions based on the position estimates of the three dominant wolves, α, β, and δ. To enhance the global optimization ability of the GWO algorithm, a local optima avoidance mechanism is introduced.

• Step 5. Search for prey
In this phase, in order to perform a global search, |A| > 1 is utilized to update the position of a gray wolf according to a randomly chosen search agent. The objective of this inverse operation is to allow other search agents to update the position of an optimal solution. Moreover, vector C comprises random values drawn from [0, 2]. This vector randomly weights the position of the prey so as to enhance or weaken its effect on the defined distance. Unlike vector A, C provides random values throughout the entire iteration process in order to emphasize exploration. For |A| < 1, the search agents tend to converge and the iterations are devoted to exploitation. For |A| > 1, the search agents diverge from the current prey and the iterations attempt exploration. For further details on the gray-wolf optimization (GWO) algorithm, readers may refer to [25].

In this paper, to enhance the capability of SVM for fault diagnosis of the automatic mechanism, GWO is utilized to determine the optimal parameters of SVM. The fitness function is defined as the training accuracy of SVM. That is, the aim of the GWO algorithm is to find the parameters σ and g that maximize the training accuracy of the classifier. The scheme for parameter optimization is summarized as follows:

Set the parameters (e.g., population size, number of iterations) and initialize the gray wolf population
Initialize the best position vectors: X_α, X_β, X_δ
Calculate the fitness of each search agent
While t < M (maximum number of iterations)
    For each agent
        Take the parameters σ and g as the search agent and train the SVM model using training data
        Calculate the fitness (training accuracy) of each search agent
        Update the optimal search agents σ and g
    End for
    Update a, A, C
    t = t + 1
End while
Get the optimal parameters σ and g and test the trained SVM using testing data
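The optimization scheme above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' MATLAB/LIBSVM implementation: the names `gwo_maximize` and `toy_fitness`, the search bounds, and the fitness surface are assumptions introduced here. In the paper's setting, the fitness would be the SVM training accuracy for a candidate (σ, g); a smooth stand-in is used so that the sketch runs without an SVM library.

```python
import random


def gwo_maximize(fitness, lower, upper, n_wolves=20, max_iter=200, seed=0):
    """Maximize `fitness` over a box with a basic gray-wolf optimizer."""
    rng = random.Random(seed)
    dim = len(lower)
    wolves = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
              for _ in range(n_wolves)]
    leaders = []  # best-so-far (fitness, position) pairs: alpha, beta, delta

    for t in range(max_iter):
        # Rank the current wolves together with the leaders found so far.
        scored = [(fitness(w), list(w)) for w in wolves]
        leaders = sorted(leaders + scored, key=lambda s: s[0], reverse=True)[:3]

        a = 2.0 - 2.0 * t / max_iter  # decreases linearly from 2 towards 0
        for w in wolves:
            for j in range(dim):
                estimate = 0.0
                for _, lead in leaders:
                    A = 2.0 * a * rng.random() - a  # A = 2a*r1 - a
                    C = 2.0 * rng.random()          # C = 2*r2
                    D = abs(C * lead[j] - w[j])     # distance to this leader
                    estimate += lead[j] - A * D
                # Average of the leader-based estimates, clipped to the bounds.
                w[j] = min(max(estimate / len(leaders), lower[j]), upper[j])

    scored = [(fitness(w), list(w)) for w in wolves]
    best_fit, best_pos = max(leaders + scored, key=lambda s: s[0])
    return best_pos, best_fit


def toy_fitness(params):
    """Hypothetical stand-in for SVM training accuracy; peak at (1, -2)."""
    sigma_exp, g_exp = params
    return -((sigma_exp - 1.0) ** 2 + (g_exp + 2.0) ** 2)


best_params, best_fit = gwo_maximize(toy_fitness, [-5.0, -5.0], [5.0, 5.0])
```

To use this with a real classifier, `toy_fitness` would be replaced by a function that trains an SVM with the candidate parameters and returns its accuracy. Keeping the three leaders as best-so-far solutions is one common design choice; it guarantees that the returned fitness never gets worse across iterations.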

Experimental Setup
In order to investigate the effectiveness of the proposed work, components 1 and 3 (as shown in Figure 2) are seeded with faults to simulate different working conditions with different crack sizes. The details are summarized as follows: the first is a crack fault of depth 2 mm, which is set near the fillet of component 3. The second is a crack fault of depth 1.5 mm, which is also introduced to component 3 at another position. The third is a crack fault of depth 1.5 mm, which is introduced at the fillet of component 1. Figure 5 shows the details of the faults. Hence, taking the above faults and the normal condition into account, four states of the high-speed automaton are simulated for the validation of the proposed work. For each condition, an experiment of five serial shootings is conducted for data collection. The sampling system for the vibration signal used in this study is based on the LMS Test Lab system, as displayed in Figure 6. Two accelerometers are mounted on the automatic mechanism to collect the vibration signals generated from the five serial shootings at a sampling rate of 204.8 kHz. Figure 7 depicts the experimental setup.

In a high-speed automatic weapon, the initial shooting requires human intervention, but the remaining four shots occur automatically. Therefore, to remove the influence of human intervention from the diagnosis result, the last four shots of each condition are selected for fault diagnosis. We split the resulting signals into four single shootings and then pick the faulty travels (the sampling time is 10 ms) from each shooting according to the motion sequence decomposition scheme proposed in Section 2. As depicted in Figure 8, the isolated signals have a similar appearance and the impacts occur within a very short space of time.
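As a worked illustration of the segmentation arithmetic, a 10 ms window at 204.8 kHz contains 0.010 × 204,800 = 2048 samples. The sketch below shows one way such a window could be cut from a single-shot record. The helper names, the placeholder signal, and the 5 ms start time are hypothetical; the actual start times come from the action times of the travels, which are not reproduced here.

```python
def samples_in_window(duration_s, fs_hz):
    """Number of samples covered by a window of the given duration."""
    return int(round(duration_s * fs_hz))


def extract_travel(signal, start_s, duration_s=0.010, fs_hz=204_800):
    """Return the slice of `signal` that starts at `start_s` seconds."""
    start = int(round(start_s * fs_hz))
    stop = start + samples_in_window(duration_s, fs_hz)
    if start < 0 or stop > len(signal):
        raise ValueError("window falls outside the recorded signal")
    return signal[start:stop]


# Placeholder record: sample indices stand in for acceleration values.
shot = list(range(10_000))
segment = extract_travel(shot, start_s=0.005)  # hypothetical start time
```

Rounding to the nearest sample avoids off-by-one errors caused by floating-point products such as 0.010 × 204,800.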

Experimental Results
Because the data were recorded at a high sampling rate (204.8 kHz), a "db4" wavelet is used to decompose the impact vibration signal into a six-layer wavelet packet tree. The normalized wavelet energy feature is then calculated to investigate the detailed energy distribution of the impact signal. Figure 9 shows the normalized energy of the six-layer wavelet packet decomposition for each state. It is generally known that selecting a wavelet function is difficult and depends on the practical application. Daubechies-family wavelets are often used for defect detection and fault diagnosis because of their advantages in terms of orthogonality and computational simplicity [26]. In general, a proper decomposition level can be determined in practical applications depending on the real conditions, as has been demonstrated by many published works [1,26,27]. It can be seen in Figure 9 that the energy of the four conditions is mainly concentrated in the low-frequency segment (bands 1-8). Moreover, the energy in the low-frequency segment is much larger than that in the high-frequency segment (bands 35-40). Figure 9 suggests that frequency bands 3 and 7, with higher amplitudes, should be chosen for signal reconstruction because these modes have the potential to retain the dominant information about the signal. As an example, the reconstructed signal of fault 1 is presented in Figure 10. Singular value decomposition of the matrix whose rows are the IMFs is performed to calculate the three information entropy features. In total, the feature extraction procedure yields three features and 16 samples, with four samples per state. Half of the samples are presented to the SVM optimized by GWO for model training, while all samples are used for testing.
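The quantities in this paragraph can be illustrated with a small sketch. A six-layer wavelet packet tree has 2^6 = 64 terminal bands, so at a 204.8 kHz sampling rate each band spans (204.8 / 2) / 64 = 1.6 kHz, assuming the bands are arranged in frequency order. The code below is a hedged illustration rather than the authors' procedure: the function names and the toy coefficients are invented here, and the entropy function shows only the generic Shannon form applied to a normalized spectrum (for instance, singular values), not the paper's exact three features.

```python
import math


def band_width_khz(fs_khz=204.8, level=6):
    """Width of one terminal wavelet-packet band, in kHz."""
    return (fs_khz / 2.0) / (2 ** level)


def normalized_energy(band_coeffs):
    """Energy of each band divided by the total energy of all bands."""
    energies = [sum(c * c for c in band) for band in band_coeffs]
    total = sum(energies)
    return [e / total for e in energies]


def shannon_entropy(weights):
    """Shannon entropy (natural log) of non-negative weights."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)


# Toy coefficients for four bands (made-up values for illustration only).
toy_bands = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
energy = normalized_energy(toy_bands)
# Indices of the two bands with the highest normalized energy.
top_two = sorted(range(len(energy)), key=lambda i: energy[i], reverse=True)[:2]
```

In practice, the band coefficients would come from a wavelet packet library and the singular values from a linear algebra routine; both are left out to keep the sketch self-contained.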
The initialization parameters of GWO, namely the gray wolf population N, the maximum number of iterations M, and the search space [lb, ub], are as follows: M = 200, N = 20, [lb, ub] = [10 , 10 ]. The task of GWO is to maximize the SVM accuracy. Furthermore, two meta-heuristic algorithms, the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm, are also examined for comparison. For GA, the number of generations and the population size are set to 200 and 20, respectively, and the generation gap is 0.9. For PSO, the iteration count and the number of particles are set to 200 and 50, respectively (acceleration coefficients c1 = 1.5 and c2 = 1.7, inertia weight W = 1). The search space of these algorithms is the same as that of GWO. We ran the compared algorithms 15 times to investigate their exploration ability. The average optimal solution of the 15 runs is listed in Table 2. All the algorithms were run in Matlab 2014a. The SVM tool in the LIBSVM-3.18 library [28] was employed to identify automaton faults.

For each run, the optimal identification rate is obtained using 5-fold cross-validation. We observe from Table 2 that GWO-SVM provides the highest identification accuracy (87.5%) for the four states of a high-speed automatic mechanism. The optimal classification accuracies of PSO-SVM and GA-SVM reach 75%. This means that GWO has a stronger ability to avoid local optima than GA and PSO during model training. Based on this, the average identification rates for the training and testing sets are presented in Table 3. From Table 3, we find that the testing accuracies of GA-SVM, PSO-SVM, and GWO-SVM are 81.25%, 87.5%, and 87.5%, respectively. Like those of GA and PSO, the diagnosis result of SVM using GWO is satisfactory. Although GWO fails to show obvious superiority in terms of classification accuracy compared with PSO and GA, it may be used as an alternative because of its fewer initialization parameters and stronger exploration capability.
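To put the testing accuracies in perspective, the short sketch below converts them into sample counts. It assumes each reported percentage is a proportion of the 16 testing samples mentioned earlier; if the values in Table 3 are averages over several runs, the counts are only indicative. The helper name `correct_count` is introduced here for illustration.

```python
from fractions import Fraction


def correct_count(accuracy_percent, n_samples):
    """Number of correctly classified samples implied by an accuracy."""
    count = Fraction(str(accuracy_percent)) / 100 * n_samples
    if count.denominator != 1:
        raise ValueError("accuracy does not map to a whole number of samples")
    return int(count)


n_test = 16  # all samples are used for testing
reported = {"GA-SVM": 81.25, "PSO-SVM": 87.5, "GWO-SVM": 87.5}
counts = {name: correct_count(acc, n_test) for name, acc in reported.items()}
# One sample out of 16 corresponds to this many percentage points.
step = Fraction(100, n_test)
```

Under this assumption, the gap between 81.25% and 87.5% corresponds to a single sample, which is consistent with the statement that GWO does not show obvious superiority in classification accuracy.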

Discussion
This paper proposes a motion sequence decomposition technique to isolate useful segments of impact signals collected from action sequences of a high-speed automaton. The major findings are as follows:

(1) It is possible to accurately estimate the state of non-rotating components in a high-speed automaton from the vibration signal recorded in the shooting experiment. The results obtained above reveal the effectiveness of the proposed work in diagnosing the state of an automaton. At present, many published works address fault diagnosis of rotating machinery, such as rolling bearings and gearboxes, while researchers have paid little attention to fault diagnosis of non-rotating machinery. To the best of our knowledge, no published work using a hybrid entropy feature has approached the issue of distinguishing the states of a high-speed automaton based only on the impact signal.

(2) Motion sequence decomposition is an efficient approach to fault diagnosis in high-speed automatic mechanisms, and it is of great significance for fault diagnosis of non-rotating machinery. As described in Section 2, the impact signal of a high-speed mechanism shows specific characteristics that differ from common signals. Moreover, a shooting action comprises 19 motion travels, and the sampled impact signal includes superposed signals coming from these travels as well as from the components involved. In addition, due to the "three-high" working condition, extraction of the fault features is more complicated in practical applications. That is, this paper accomplishes a challenging task.

(3) As displayed in Table 4, the feature indexes are unstable for each fault. This presents a great challenge to fault diagnosis of a high-speed automaton. The main reason may lie in the fact that the mass of the dynamic system (automaton) tends to decrease over the shooting time. That is, the shooting experiment starts with five bullets, and the mass of the system varies during the five shootings. Moreover, the amount of gunpowder in each bullet is different, which also affects the response of the dynamic system.

(4) GWO is introduced to improve the performance of SVM in the fault identification of an automaton. Experimental results confirm that GWO has a stronger ability to avoid local optima while exploring the best performance of SVM compared with the widely used algorithms GA and PSO (Table 2). Although GWO fails to show obvious superiority, it provides a competitive diagnosis result in this study (Table 3). It can be used as a potential alternative for parameter optimization.

Conclusions
Aiming at providing an effective approach to fault diagnosis of a high-speed automatic mechanism, this paper proposes a motion sequence decomposition technique combining modern signal processing methods to diagnose faults in non-rotating machinery. A modern signal processing technique, wavelet packet decomposition (WPD), is applied to remove noise and redundancy. Three information entropy features are calculated from the segmented vibration signal with the help of EMD. The gray-wolf optimization algorithm is introduced to improve the performance of SVM. To sum up, the following conclusions can be obtained: Fault diagnosis of a high-speed automaton is a challenging task in real life, so this subject needs more attention, as well as broader views.

Figure 1. Flowchart of the proposed work for fault diagnosis of a high-speed automaton.

Figure 3. Vibration signal derived from a single shooting.

Figure 4. Gray-wolf optimization algorithm: (a) social hierarchy of the gray wolf; (b) hunting behavior.

Defining the values of A in [−1, 1], a gray wolf is capable of reaching a new position between the original position and the prey's position.

Figure 6. Data collection system: (a) LMS Test Lab system; (b) schematic figure of data collection.

(1) The proposed work in this paper is effective for fault diagnosis of a high-speed automatic mechanism and the results obtained are satisfactory.
(2) This paper demonstrates that the information entropy feature can be used as an efficient measure of fault information to recognize faults in automatic mechanisms.
(3) The gray-wolf optimization (GWO) algorithm is used to improve the classification accuracy of SVM. Although GWO fails to indicate obvious superiority, it provides a competitive diagnosis result compared with GA and PSO.
(4) The proposed work is of great significance for fault diagnosis of non-rotary mechanical parts.

Table 1. Travels and action time of each shooting.

Table 2. Exploration ability of algorithms compared.

Table 3. Diagnosis results using three approaches.