EMG Pattern Classification by Split and Merge Deep Belief Network
Symmetry 2016, 8, 148

Abstract: In this paper, we introduce an enhanced electromyography (EMG) pattern recognition algorithm based on a split-and-merge deep belief network (SM-DBN). EMG features are generally difficult to classify because the EMG signal has nonlinear and time-varying characteristics. Therefore, various machine-learning methods have been applied in previously published studies. A DBN is trained with a fast greedy learning algorithm that can rapidly identify a fairly good set of weights, even in deep networks with a large number of parameters and many hidden layers. To reduce overfitting and to enhance performance, the adopted optimization method was based on genetic algorithms (GA). As a result, the performance of the SM-DBN was 12.06% higher than that of the conventional DBN. Additionally, the SM-DBN converges quickly, which reduces the number of training epochs and, in turn, the risk of overfitting. These results verify that the optimization was improved by the GA.


Introduction
Electromyography (EMG) pattern classification is used in various fields, such as biomedical engineering, rehabilitation engineering, and user interfaces [1][2][3][4]. EMG pattern classification consists of feature extraction and classification algorithms. EMG features can be categorized as time-domain features and frequency-domain features. However, it is difficult to find the dominant features that distinguish gestures of the musculoskeletal system, primarily because the characteristics of the EMG signal are nonlinear and time varying. Therefore, the selection of the classification algorithm is important, and machine learning methods have been applied in numerous previously published works.
Various methods have been applied to the analysis and classification of EMG. Many of these studies were related to the extraction of better features and their application to more efficient classifiers. Many features, including independent component analysis, root-mean-square, distance-based features, and nonlinear multiscale features, have been suggested [5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. Various machine learning algorithms, such as fuzzy classifiers, support vector machines (SVM), and neural networks, were applied to solve the classification problem [7,9,20,21]. In our previous study, a Gaussian mixture model (GMM) and a deep belief network (DBN) were applied [22,23]. The DBN has been shown to yield better accuracy than GMM, linear discriminant analysis (LDA), or SVM. The DBN was proposed by Hinton to dramatically improve classification performance [24]. This algorithm reduced over-fitting and mitigated the local minima problem, which is difficult to solve in conventional multi-layer perceptron models. The performance of the DBN was better than that of other shallow-learning classifiers. However, the DBN needs lengthy iterations to attain good performance in recognizing EMG patterns. Since lengthy iterations induce over-fitting, it is necessary to reduce the number of iterations. To reduce the over-fitting problem and enhance the accuracy of the DBN, a split-and-merge algorithm was proposed in our previous study [25]. The split-and-merge deep belief network (SM-DBN) applies a genetic algorithm (GA) after pre-training the DBN and selects the best offspring. The SM-DBN was shown to perform better than the conventional DBN on the Modified National Institute of Standards and Technology (MNIST) dataset [26]. The DBN is a general-purpose classifier, which can be useful for images, time series signals, text data, and other data types.
In this paper, we propose an enhanced EMG pattern recognition algorithm based on SM-DBN. Because the EMG signal is a time series signal, our work indicates that SM-DBN is efficient for both image and time series data. Herein, the procedures of the EMG pattern recognition and the concepts of the algorithms are described. Subsequently, the results of the evaluation of the algorithms are presented.

Pattern Recognition Process
The pattern recognition algorithm maps numerous input patterns to specified outputs. Generally, pattern recognition is treated as a subtask of machine learning. Figure 1 depicts a flow diagram of the entire EMG pattern recognition process.
First, the EMG signal was acquired by the biosignal amplifier. Figure 2 shows the placement of the electrodes. The amplitude range of the EMG signal was approximately −10 to 10 mV peak-to-peak. Furthermore, the frequency range was approximately 0 to 500 Hz, with the dominant energy in the 50 to 150 Hz range [27]. To recognize the wrist posture, the EMG signals of the flexor carpi ulnaris (FCU) and extensor carpi ulnaris (ECU) were recorded. Next, filters were used to reduce noise sources, such as motion artifacts, hum, high-frequency noise, and aliasing interference during analog-to-digital conversion. In this work, a second-order Butterworth bandpass filter with a 50 to 500 Hz passband was used. Subsequently, the EMG signals were segmented using constant-time segmentation. The temporal span of the Hamming window was 166 ms, and successive windows overlapped by 50%. After the segmentation of the time series signal, feature vectors were extracted from each segment for use as input to the classifier. The feature extraction algorithms are described in Section 2.2.
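The constant-time segmentation described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the 1 kHz sampling rate, the function names `hamming` and `segment`, and the choice to multiply each segment by the window coefficients are all assumptions, since the text only gives the 166 ms span and the 50% overlap.

```python
import math

def hamming(n):
    """Hamming window coefficients of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def segment(signal, fs_hz=1000.0, window_ms=166.0, overlap=0.5):
    """Split a 1-D signal into overlapping, Hamming-weighted segments.

    fs_hz is a hypothetical sampling rate; window_ms and overlap follow the text.
    """
    win_len = int(round(fs_hz * window_ms / 1000.0))       # 166 samples at 1 kHz
    hop = max(1, int(round(win_len * (1.0 - overlap))))    # 83 samples for 50% overlap
    weights = hamming(win_len)
    segments = []
    for start in range(0, len(signal) - win_len + 1, hop):
        chunk = signal[start:start + win_len]
        segments.append([x * w for x, w in zip(chunk, weights)])
    return segments

# Usage on one second of a synthetic 100 Hz sine (not real EMG data).
toy = [math.sin(2.0 * math.pi * 100.0 * t / 1000.0) for t in range(1000)]
segs = segment(toy)
print(len(segs), len(segs[0]))  # 11 segments of 166 samples
```

Trailing samples that do not fill a complete window are dropped in this sketch; padding them instead would be an equally reasonable choice.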

Feature Extraction
Jeong et al. compared commonly used EMG features and suggested several that separate the classes efficiently [22]. Following Jeong [22], we chose the following features. The difference absolute mean value (DAMV) is the average absolute difference between two successive signal samples [22,28]. It is calculated over the time under consideration according to Equation (1), where N is the number of samples of the segmented EMG signal. The difference absolute standard deviation value (DASDV) is the standard deviation of the difference between two successive signal samples, and it is expressed in accordance with Equation (2).
The mean absolute value (MAV) is the average of the absolute values of the EMG signal samples in the segmented signal, formulated in accordance with Equation (3). The zero crossing (ZC) is computed according to Equation (4). In this paper, ZC is normalized using N, the number of samples in the window, to fit the input range of the classifiers.
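The four features can be sketched in a few lines. The definitions below follow the verbal descriptions above and common usage in the EMG literature; the normalization constants (dividing the difference-based features by the number of differences, and ZC by N) and the optional amplitude threshold in `zero_crossings` are assumptions and may differ from Equations (1) to (5).

```python
import math

def damv(x):
    """Difference absolute mean value: mean of |x[i+1] - x[i]|."""
    diffs = [abs(b - a) for a, b in zip(x, x[1:])]
    return sum(diffs) / len(diffs)

def dasdv(x):
    """Difference absolute standard deviation value: sqrt(mean((x[i+1] - x[i])^2))."""
    squares = [(b - a) ** 2 for a, b in zip(x, x[1:])]
    return math.sqrt(sum(squares) / len(squares))

def mav(x):
    """Mean absolute value of the samples in the segment."""
    return sum(abs(v) for v in x) / len(x)

def zero_crossings(x, threshold=0.0):
    """Count sign changes between successive samples.

    threshold is a hypothetical noise guard on the amplitude step.
    """
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0 and abs(a - b) > threshold)

def normalized_zc(x, threshold=0.0):
    """Zero crossings divided by the number of samples N (assumed normalization)."""
    return zero_crossings(x, threshold) / len(x)

def feature_vector(segment):
    """Stack the four features for one segment of one channel."""
    return [damv(segment), dasdv(segment), mav(segment), normalized_zc(segment)]
```

With two channels (FCU and ECU), concatenating the two per-channel vectors would give an eight-dimensional classifier input under these assumptions.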

Split and Merge Deep Belief Network
Back propagation (BP) is a popular learning method for deep neural networks (DNNs). BP is used in conjunction with an optimization method, such as gradient descent [29]. However, this method is associated with some problems, such as local minima, slow learning as the number of hidden layers increases, and over-fitting [30].
To overcome these problems, Hinton proposed DBNs [24]. Training a DBN consists of a pre-training phase and a fine-tuning phase. In the pre-training phase, a multilayer connected model is built from restricted Boltzmann machines (RBMs). The RBMs are trained using an unsupervised learning method. In an RBM, the joint probability of the visible layer and the hidden layer is defined using an energy function [31]. In this phase, each layer is trained in a greedy manner. After pre-training, fine-tuning is completed using BP.
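For reference, the energy function and joint probability of a standard binary RBM are usually written as shown here. The notation is an assumption and is not taken from the source: $\mathbf{v}$ and $\mathbf{h}$ are the visible and hidden vectors, $\mathbf{a}$ and $\mathbf{b}$ their bias vectors, $W$ the weight matrix, and $Z$ the partition function. The source does not state which RBM variant was used.

$$E(\mathbf{v}, \mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h}$$

$$P(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z}, \qquad Z = \sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}$$

Low-energy configurations are therefore the most probable ones, and greedy pre-training adjusts $W$, $\mathbf{a}$, and $\mathbf{b}$ layer by layer so that the training inputs receive low energy.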
The DBN reduces the problems of conventional DNNs. However, in the case of EMG signal classification, the number of iterations required is too large, because the EMG features are poorly separable. This causes overfitting. Therefore, it is necessary to reduce the number of iterations to prevent overfitting.
The split-and-merge deep belief network (SM-DBN) algorithm is proposed to reduce the iteration time and enhance the classification accuracy. The SM-DBN comprises three procedures [25]. In the first step (called split), two separate multilayer neural networks are initialized with the RBM method. In the second step (called merge), these two networks are merged by the GA. In the third step, the merged network is fine-tuned by BP.
Before training the model, the EMG signal data are divided into three datasets: one for training the network, one for validating the network during training, and one for testing performance. The training dataset $S_{\mathrm{training}}$ is divided into two subsets, $S_{\mathrm{training1}}$ and $S_{\mathrm{training2}}$. The validation dataset $S_{\mathrm{valid}}$ is used to evaluate offspring performance in the GA training phase. The test set $S_{\mathrm{test}}$ is used to evaluate the performance of the trained networks. The relationships between these sets follow from this partition: both training subsets are drawn from $S_{\mathrm{training}}$, while $S_{\mathrm{valid}}$ and $S_{\mathrm{test}}$ are held out from it.
Figure 3 shows the flow diagram of the SM-DBN. At the start of the process, the dataset $S_{\mathrm{training}}$ is divided into the two subsets $S_{\mathrm{training1}}$ and $S_{\mathrm{training2}}$, which should have a non-empty intersection. Each subset is used to train its corresponding DBN. Each DBN has the same bias vectors. In the merge phase, the network parameters are treated as chromosomes by the GA, and crossover and mutation occur in this phase. In this paper, the uniform crossover method is used. The performance of each offspring created in the merge phase is evaluated by counting the number of errors (zero-one loss) on the validation set $S_{\mathrm{valid}}$. If $f: \mathbb{R}^{D} \to \{1, \dots, L\}$ is the prediction function, then this loss can be expressed as:
$$\ell_{0,1} = \sum_{i=1}^{|S_{\mathrm{valid}}|} I_{f(x_i) \neq y_i}$$
where $D$ is the number of input dimensions, $L$ is the number of labels, and $S_{\mathrm{valid}}$ is an indexed validation dataset of pairs $(x_i, y_i)$. Here $x_i \in \mathbb{R}^{D}$ is the $i$th input of dimensionality $D$, and $y_i \in \{1, \dots, L\}$ is the $i$th label assigned to input $x_i$. The indicator function $I_x$ and the function $f$ are defined as
$$I_x = \begin{cases} 1 & \text{if } x \text{ is true} \\ 0 & \text{otherwise} \end{cases}$$
$$f(x) = \arg\max_{k} P(Y = k \mid x, \mathrm{Net})$$
where $\mathrm{Net}$ is the set of all parameters of a neural network. The aim of the GA is to find the network $\mathrm{Net}^{m}_{u}$ that minimizes the zero-one loss function. Therefore, the objective function of the GA can be written as:
$$\mathrm{Net}^{m}_{u} = \arg\min_{\mathrm{Net}} \ell_{0,1}$$
After the offspring are evaluated, the two fittest offspring, $\mathrm{Net}^{m}_{1}$ and $\mathrm{Net}^{m}_{2}$, are selected. The subscript refers to the ranking based on the validation test. Both networks become the parents of the next generation.
After iterating for $M$ generations, the fittest offspring $\mathrm{Net}^{N}_{1}$ ($N \leq M$) is finally selected. This network is not yet optimized. Therefore, fine-tuning by BP must be executed.
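The merge phase can be illustrated with a small sketch. It is a simplified stand-in under stated assumptions, not the authors' implementation: each network is flattened to a plain list of weights, `predict` is a hypothetical callable supplied by the caller, the mutation rate, noise scale, and population size are invented for illustration, and letting parents compete with their children (elitism) is an assumption made so that the validation loss never increases.

```python
import random

def zero_one_loss(predict, net, validation):
    """Number of validation pairs (x, y) that the network misclassifies."""
    return sum(1 for x, y in validation if predict(net, x) != y)

def uniform_crossover(parent_a, parent_b, rng):
    """Take each gene (weight) from either parent with probability 0.5."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def mutate(chromosome, rng, rate=0.01, scale=0.1):
    """Add Gaussian noise to a random subset of genes (rate and scale are assumed)."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in chromosome]

def merge(net_1, net_2, predict, validation, generations=10, offspring=20, seed=0):
    """Merge two pre-trained networks with a GA and return the fittest chromosome."""
    rng = random.Random(seed)
    parents = [list(net_1), list(net_2)]
    for _ in range(generations):
        children = [mutate(uniform_crossover(parents[0], parents[1], rng), rng)
                    for _ in range(offspring)]
        # Assumption: parents stay in the pool, so the best loss cannot get worse.
        pool = parents + children
        pool.sort(key=lambda net: zero_one_loss(predict, net, validation))
        parents = pool[:2]  # the two fittest become the next parents
    return parents[0]
```

In the full algorithm the returned chromosome would be reshaped back into layer weights and fine-tuned by BP; that step is outside the scope of this sketch.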

Participants and EMG Signal Acquisition
In this work, EMG signal data from our previous work are used [22,23]. The EMG signals from five wrist motions (up, down, left, right, and rest) were acquired. Figure 4 shows the five motions of the wrist. The EMG signals were acquired from 23 males and 5 females. Each participant was informed about the experimental procedures before the onset of recording.
Ag/AgCl electrodes were used to record the EMG signals, and these were attached on the FCU and ECU, respectively. These electrodes were connected to the biosignal amplifier. In this work, the MP150WSW and BN-EMG2 acquisition systems (BIOPAC Systems, Inc., Goleta, CA, USA) were used to record the signals, with the EMG signals obtained through the BN-EMG2.
In this work, we collected 47 sets of raw EMG signals from 28 persons. We then extracted 11,750 sets of features from the raw EMG signals.

Results
To validate our method, we compared it with the conventional DBN. In our previous work, the DBN showed enhanced performance compared to other machine learning-based classifiers, such as the LDA or the SVM. Figure 5 shows the average learning curves of the DBN and SM-DBN. The learning error rate of the SM-DBN is lower than that of the DBN. Likewise, Figure 6 shows the average test error rate curves of the DBN and SM-DBN. The test result is similar to the average learning curve.
A ten-fold cross-validation was used for statistical evaluation. In the cross-validation method, the entire dataset is divided into k sub-datasets. The classifier is trained with a training dataset and then validated with a test dataset. Every sub-dataset must be used as the test set exactly once. In our work, the dataset was randomly divided into 10 subsets. Eight subsets were used for training, one for validation, and the remaining one for testing. There were ten possible cases. Table 2 shows the validation test results. After 2000 iterations, the average error rate of the DBN was 12.00%, whereas that of the SM-DBN was 10.079%. Moreover, the standard deviations of the DBN and SM-DBN were 1.086 and 1.050, respectively. When the average results of the 10 trials were compared, the performance of the SM-DBN was found to be 12.06% higher than that of the DBN.
A dependent t-test was used to evaluate the statistical significance of the difference between the results of the two algorithms. The statistical program R was used for statistical computing [36]. The resulting p-value was 0.0146, indicating a statistically significant difference between the two algorithms at the 5% level of significance.
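The eight/one/one rotation used in the cross-validation can be sketched as follows. The function name `ten_fold_splits` and the rule that the fold after the test fold serves as the validation fold are assumptions; the text only states the 8/1/1 proportions and that there are ten cases.

```python
import random

def ten_fold_splits(n_samples, k=10, seed=0):
    """Yield (train, valid, test) index lists for each of the k rotations.

    Fold i is the test set and fold i + 1 (wrapping around) is the validation
    set; the remaining k - 2 folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        valid_fold = (i + 1) % k
        test = folds[i]
        valid = folds[valid_fold]
        train = [idx for j, fold in enumerate(folds)
                 if j != i and j != valid_fold for idx in fold]
        yield train, valid, test

# Usage with the 11,750 feature sets reported in the text.
for train, valid, test in ten_fold_splits(11750):
    pass  # train on `train`, select GA offspring on `valid`, report error on `test`
```

Each feature set appears in the test fold of exactly one rotation, so the ten test error rates together cover the whole dataset once.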
The true positive (TP), false negative (FN), false positive (FP), and true negative (TN) counts were obtained from the confusion matrix to evaluate the performance of the model. In addition to the accuracy, the true positive rate (TPR) and false positive rate (FPR) were evaluated. TPR can be defined as:
$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{11}$$
and FPR can be written as:
$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{12}$$
Table 3 shows the results of the sensitivity and specificity analysis. The sensitivity of the SM-DBN was 1.29% higher, and the specificity was 0.66% lower. This suggests that the SM-DBN has better sensitivity and accuracy than the DBN, with slightly lower specificity.
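As a small worked example, the rates can be computed from a confusion matrix as sketched here. The one-vs-rest reduction for the five-class problem and the function names are assumptions, since the text does not say how the per-class counts were aggregated; sensitivity equals TPR and specificity equals 1 − FPR.

```python
def one_vs_rest_counts(confusion, k):
    """Return (TP, FN, FP, TN) for class k.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(row[k] for row in confusion) - tp
    tn = total - tp - fn - fp
    return tp, fn, fp, tn

def rates(tp, fn, fp, tn):
    """TPR (sensitivity), FPR, specificity, and accuracy from the four counts."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return {
        "TPR": tpr,
        "FPR": fpr,
        "specificity": 1.0 - fpr,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical counts, for illustration only (not results from the paper).
print(rates(tp=90, fn=10, fp=20, tn=80))
```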

Discussion
There are some limitations associated with this work. The recording time of the EMG signal is not long. This means that the sample size is too small to be considered big data. Although the DBN is best suited to big data classification, it nevertheless performed well in this work.
The same parameter values were used for both classifiers to allow a fair comparison between the conventional DBN and the SM-DBN. Under these settings, the error rate of the SM-DBN converges faster than that of the conventional DBN. Statistically, the performance of the SM-DBN is improved, because the convergence time of the SM-DBN is shorter than that of the DBN. This is because the GA heuristically optimized the weights of the network. The crossover and mutation of the merge procedure yielded strong connections for neurons that enhance the recognition performance, and weak connections for irrelevant and redundant neurons. Therefore, the feature extractor of the proposed method chooses more separable features than the RBMs alone. As a result, a suitable classifier can be identified at approximately 1000 iterations. It is important to avoid the overfitting problem. In many cases, overfitting occurs in a classifier that is trained for a long time. No instance of overfitting appears in Figure 6.
The standard deviations of the DBN and SM-DBN are small. This indicates that both classifiers are stable.

Conclusions
In this paper, an EMG pattern recognition method based on SM-DBN is proposed. The conventional DBN generates a network with RBMs and is fine-tuned using BP. In the SM-DBN, optimization with a GA is added between the RBM and BP stages. The implemented algorithm was then compared with the conventional DBN. As a result, the statistical performance of the SM-DBN was 12.06% higher than that of the conventional DBN. The SM-DBN converges quickly, which reduces the number of training epochs. It is also efficient in reducing the risk of overfitting. These results show the effectiveness of the optimization with GA.
To enhance the performance of the classifier, more research on extracting dominant features and designing optimized SM-DBN structures is required. The classifier architecture used here is not optimized. Therefore, optimizing its parameters is expected to improve the accuracy of the classifier.
Furthermore, more data need to be recorded, since deep learning inherently requires very large datasets. This is important for reducing the error rate and the bias of the decision function.

Figure 1. Flow diagram of the process for electromyography (EMG) pattern recognition.

Figure 2. We acquired two-channel EMG signals. Channel 1 signals were measured from the flexor carpi ulnaris muscle, and channel 2 signals were measured from the extensor carpi ulnaris.

Figure 5. Average learning curve of DBN and SM-DBN.

Figure 6. Average test error rate curve of DBN and SM-DBN.

Table 1. Parameter settings to acquire raw signals and features.

Table 2. Result of performance test between DBN and SM-DBN.

Table 3. Sensitivity and specificity analysis of SM-DBN and DBN.