1. Introduction
Propelled by successes in discriminating between different human activities, radar has recently been employed for automatic hand gesture recognition for interactive intelligent devices [1,2,3,4,5,6]. This recognition proves important in contactless close-range hand-held or arm-worn devices, such as cell phones and watches. The most recent project on hand gesture recognition, Soli, by Google, monitors contactless interactions with radar embedded in a wrist band and is a good example of this emerging technology [3]. In general, automatic hand or arm gesture recognition, through the use of radio frequency (RF) sensors, is important for the smart environment. It is poised to make homes more user friendly and more efficient by identifying different motions for controlling instruments and household appliances. The same technology can greatly benefit the physically challenged, who might be wheelchair confined or bed-ridden. The goal is to enable these individuals to function independently.
Arm motions assume different kinematics than those of hands, especially in terms of speed and time duration. Compared to hand gestures, arm gestures can be more suitable for contactless man–machine interactions over a longer range, e.g., when commanding appliances, like a TV, from a distant couch. The larger radar cross-sections of the arms, vis-a-vis hands, permit more remote interactive positions in an indoor setting. Further, the ability to use hand gestures for device control can sometimes be hindered by impairments such as Parkinson's disease, which induces strong hand tremors.
The nature and mechanism of arm motions are dictated by their elongated bone structure defined by the humerus, which extends from the shoulders to the elbows, and the radius and ulna that extend from the elbows to hands. Because of such structures, arm motions, excluding hands, can be accurately simulated by two connected rods. In this respect, the instantaneous Doppler frequencies corresponding to different points on the upper arm are closely related. The same can be said for the forearm. This is different from hand motions which involve different and flexible motions of the palm and the fingers, and it is certainly distinct from body motions which yield intricate micro-Doppler (MD) signatures [7,8,9,10,11,12,13,14,15].
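The close relation between the Doppler frequencies along one arm segment can be illustrated with a rigid-rod calculation. The Python sketch below is only an illustration under simplifying assumptions: a single rod rotating about one end, a monostatic radar for which the Doppler shift is 2v/λ, and hypothetical values for the wavelength, rod length, and angular rate. The function name `rod_doppler_hz` is introduced here for illustration and does not come from the paper.

```python
import math

def rod_doppler_hz(distance_m, angular_rate_rad_s, wavelength_m, aspect_rad=0.0):
    """Doppler shift of a point on a rigid rod rotating about one end.

    The point at `distance_m` from the pivot moves with tangential speed
    omega * r. Its radial component toward a monostatic radar is scaled by
    cos(aspect), and the Doppler shift is 2 * v_radial / wavelength.
    """
    radial_speed = angular_rate_rad_s * distance_m * math.cos(aspect_rad)
    return 2.0 * radial_speed / wavelength_m

# Hypothetical values, chosen only for illustration.
WAVELENGTH_M = 0.0125        # assumed carrier wavelength
ANGULAR_RATE = 2.0           # assumed rotation rate in rad/s
points_m = (0.1, 0.2, 0.3)   # assumed positions along the rod

doppler_hz = [rod_doppler_hz(r, ANGULAR_RATE, WAVELENGTH_M) for r in points_m]
```

In this simplified model, every point on the rod shares the same angular rate, so its Doppler shift is proportional to its distance from the pivot, and the largest shift comes from the far end of the rod.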
Recent work in automatic arm motion recognition used the maximum instantaneous Doppler frequencies, i.e., the frequency envelope of the MD signature of the data spectrogram, as features, followed by the nearest neighbor (NN) classifier, and provided classification rates reaching close to 97% [16]. It was shown that the feature vector consisting of the augmented positive frequency and negative frequency envelopes outperforms data-driven automatic feature extraction, such as principal component analysis (PCA), and provides similar results to a convolutional neural network (CNN). Since the NN classifier applies distance metrics to measure closeness of the test data to the training data, shuffling the envelope values of all test and training data in the same manner will not change the metric or the classification results. In this respect, the frequency envelope values, rather than the actual shape of the envelope, decide the classification performance.
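The shuffling argument can be checked numerically. The Python sketch below uses toy envelope values that are made up for illustration (they are not measured data), and the helper names are hypothetical. It applies one fixed permutation to every vector and compares the L1 distances and the NN decision before and after.

```python
import math
import random

def l1_distance(a, b):
    """Sum of absolute element-wise differences of two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbor(test, train_set):
    """Index of the training vector closest to `test` in L1 distance."""
    return min(range(len(train_set)), key=lambda i: l1_distance(test, train_set[i]))

# Toy envelopes, invented for this illustration only.
train = [[0.0, 1.0, 3.0, 1.0, 0.0],
         [0.0, 2.0, 2.0, 2.0, 0.0]]
test = [0.0, 1.0, 2.5, 1.0, 0.5]

# One fixed permutation of the time indices, applied to every vector.
perm = list(range(len(test)))
random.Random(0).shuffle(perm)

def shuffled(vector):
    """Reorder `vector` according to the shared permutation."""
    return [vector[i] for i in perm]

original_choice = nearest_neighbor(test, train)
shuffled_choice = nearest_neighbor(shuffled(test), [shuffled(v) for v in train])
distances_equal = all(
    math.isclose(l1_distance(test, v), l1_distance(shuffled(test), shuffled(v)))
    for v in train
)
```

Because the L1 distance is a sum over matching positions, reordering all vectors identically leaves every distance, and therefore the nearest neighbor, unchanged.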
In this paper, with a focus on improving the results in [17], we employ features that capture the MD signature envelope behavior as well as the evolution characteristics. The envelope represents the maximum instantaneous Doppler frequencies, and thus, can be considered as a time series. Time-series analysis appears in many application domains, including speech recognition, handwriting recognition, weather readings, and financial recordings [18,19,20]. We consider two common time-series recognition methods, namely, the NN-dynamic time warping (DTW) method (NN classifier with the DTW distance) [21,22,23,24] and the long short-term memory (LSTM) method [25,26,27,28]. The former is a conventional machine learning (ML) technique that utilizes the DTW distance, which sums local distances along a warping path between the two series. Its nonlinear warping capability finds an optimal alignment between two time series and, therefore, can determine the similarity between them [29,30,31,32,33,34]. The latter method is a deep learning tool which is more appropriate for time series than CNN. It establishes a memory of the temporal evolution of the data during the training process [35,36,37,38]. The DTW-based NN classifier is shown to outperform both the NN classifier based on the L1 distance norm and the LSTM method, and achieves an average classification rate of above 99%. Both time-series analysis methods are robust to time misalignment. Similar to [17], our feature vector includes the augmented positive and negative frequency envelopes. However, we also augment these two envelopes with a vector of their differences, which properly captures the time synchronization nature of the two envelopes. It is noted that no repetitive motions are considered, and gesture classification is applied to only a single arm motion cycle [39].
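To make the NN-DTW idea concrete, the following Python sketch implements the textbook DTW recursion with an absolute-difference local cost and uses it in a 1-NN classifier. It is a minimal illustration, not the implementation used in this paper. The assumptions are that the three parts of the feature vector are simply concatenated, that the difference is taken as positive minus negative envelope, and that the toy envelopes and the labels "A" and "B" are invented for the example. All function names are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook DTW distance with |x - y| local cost and no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def l1_distance(a, b):
    """Sum of absolute element-wise differences (no warping)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def build_feature(pos_env, neg_env):
    """Assumed layout: positive envelope, negative envelope, their difference."""
    diff = [p - q for p, q in zip(pos_env, neg_env)]
    return list(pos_env) + list(neg_env) + diff

def nn_dtw_classify(test_feature, train_features, train_labels):
    """Label of the training feature with the smallest DTW distance."""
    best = min(range(len(train_features)),
               key=lambda i: dtw_distance(test_feature, train_features[i]))
    return train_labels[best]

# Toy envelopes, invented for illustration (not measured data).
pos_a, neg_a = [0, 0, 1, 2, 1, 0, 0, 0], [0, 0, -1, -1, 0, 0, 0, 0]
pos_b, neg_b = [0, 1, 1, 1, 1, 1, 1, 0], [0, -2, -2, -2, -2, -2, -2, 0]
# Test motion: motion "A" delayed by one time sample.
pos_t, neg_t = [0, 0, 0, 1, 2, 1, 0, 0], [0, 0, 0, -1, -1, 0, 0, 0]

train_features = [build_feature(pos_a, neg_a), build_feature(pos_b, neg_b)]
predicted = nn_dtw_classify(build_feature(pos_t, neg_t), train_features, ["A", "B"])

shift_dtw = dtw_distance(pos_a, pos_t)  # warping absorbs the one-sample delay
shift_l1 = l1_distance(pos_a, pos_t)    # point-wise distance penalizes it
```

In this toy case the one-sample delay leaves the DTW distance between the two positive envelopes at zero, whereas the L1 distance is nonzero, which is the kind of time misalignment that the warping is meant to absorb.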
The main novelty of our work is that, to the best of our knowledge, this is the first time that time-series recognition methods are employed to classify arm motions using the maximum instantaneous Doppler frequency features. Commonly applied classification methods, such as handcrafted feature-based methods and low-dimensional representation techniques based on PCA and CNN, are more suitable for image-like data [2,4,5,40]. The principal motivation of using time-series recognition methods is to exploit the time relations between the different envelope values for improved classification.
The remainder of this paper is organized as follows. In Section 2, we describe a method to extract the MD signature envelopes, and discuss two time-series analysis methods, namely, the dynamic time warping and the long short-term memory. Section 3 describes the arm motion experiments, and presents the gesture recognition accuracy of the two time-series analysis methods. Section 4 discusses the robustness of the proposed methods to time misalignment and time consumption. The paper is concluded in Section 5.
5. Conclusions
In this paper, we considered a time-series analysis method for effective automatic arm motion recognition based on radar MD signature envelopes. No range or angle information was incorporated into the classifications. Taking advantage of the Doppler continuity of the arm motion, the PBC was used to determine the individual motion boundaries from long time series. The positive and negative frequency envelopes of the data spectrogram were then extracted by an energy-based thresholding algorithm. The feature vector consisted of the augmented positive and negative frequency envelopes and their differences. The augmented feature vector was provided to the NN classifier based on the DTW distance, which is more suitable than the L1 and L2 distance measures for describing the similarity between time series. The LSTM, a time-series analysis method commonly used in ML, was also presented for comparison. The experimental results showed that the NN classifier based on the DTW distance achieves close to a 99% classification rate, an overall improvement of 2% over both the existing classifier based on the L1 distance and the LSTM method. It was also shown that the DTW and LSTM methods are robust to the time shift of the signal.
Future work may consider more diverse arm motions, arm speeds, arm angle orientations, and distances between the radar and the person moving their arms. It will be of interest to evaluate the robustness of the arm classification results while the person is standing or walking.