Deep Learning in Physiological Signal Data: A Survey

Deep Learning (DL), a promising approach for discriminative and generative tasks, has recently proved its high potential in 2D medical image analysis; however, physiological data in the form of 1D signals have yet to benefit fully from this novel approach for the desired medical tasks. Therefore, in this paper we survey the latest scientific research on deep learning in physiological signal data such as electromyogram (EMG), electrocardiogram (ECG), electroencephalogram (EEG), and electrooculogram (EOG). We found 147 papers published between January 2018 and October 2019 inclusive from various journals and publishers. The objective of this paper is to conduct a detailed study to comprehend, categorize, and compare the key parameters of the deep-learning approaches that have been used in physiological signal analysis for various medical applications. The key parameters that we review are the input data type, deep-learning task, deep-learning model, training architecture, and dataset sources; these are the main parameters that affect system performance. We taxonomize the research works using deep-learning methods in physiological signal analysis based on: (1) the physiological signal data perspective, such as data modality and medical application; and (2) the deep-learning concept perspective, such as training architecture and dataset sources.


Introduction
Deep learning has succeeded over traditional machine learning in the field of medical image analysis due to its unique ability to learn features from raw data [1]. Objects of interest in medical imaging, such as lesions, organs, and tumors, are very complex, and much time and effort is required to extract their features manually using traditional machine learning. Deep learning in medical imaging thus replaces hand-crafted feature extraction by learning from raw input data, feeding it through several hidden layers, and finally outputting the result from a huge number of parameters in an end-to-end learning manner [2]. Therefore, many research works have benefited from this novel approach by applying it to physiological data to fulfil medical tasks.
Physiological signal data in the form of 1D signals are time-domain data, in which sample data points are recorded over a period of time [3]. These signals change continuously and indicate the health of a human body. Physiological signal data fall into categories such as the electromyogram (EMG), which records changes in skeletal muscles; the electrocardiogram (ECG), which records changes in heartbeat or rhythm; the electroencephalogram (EEG), which records changes in the brain measured from the scalp; and the electrooculogram (EOG), which records changes in the corneo-retinal potential between the front and the back of the human eye.
The convolutional neural network (CNN) is the most successful type of deep-learning model for 2D image analysis tasks such as recognition, classification, and prediction. A CNN receives 2D data as input and extracts high-level features through many hidden convolution layers. Thus, to feed physiological signals into a CNN model, some research works have converted 1D signals into 2D data [4]. Therefore, in this paper we survey 147 contributions which have reported highly accurate and significant results of physiological signal analysis using a deep-learning approach. We overview and explain these solutions and contributions in more detail in Sections 3, 4, and 5.
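As a minimal illustration of the 1D-to-2D conversion idea (a generic sketch, not the method of any specific surveyed work), a 1D signal can be turned into a 2D time-frequency array with a windowed FFT; the window size, hop length, and Hann taper below are arbitrary illustrative choices.

```python
import numpy as np

def signal_to_2d(signal, window_size=64, hop=32):
    """Convert a 1D physiological signal into a 2D time-frequency
    representation by taking the FFT magnitude of overlapping windows.
    The resulting 2D array can be fed to a CNN like an image."""
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        window = signal[start:start + window_size] * np.hanning(window_size)
        # Keep only the non-negative frequency bins of the real FFT.
        frames.append(np.abs(np.fft.rfft(window)))
    return np.stack(frames)  # shape: (n_frames, window_size // 2 + 1)

# A synthetic 1 s "signal" sampled at 256 Hz.
x = np.sin(2 * np.pi * 10 * np.arange(256) / 256)
image = signal_to_2d(x)
print(image.shape)  # (7, 33)
```

Other works instead reshape the 1D samples directly into a 2D grid; either way, the point is to give the CNN a 2D input it can convolve over.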
We collected papers via the search engine PubMed with keywords combining "deep learning" and a type of physiological signal, such as "deep learning electromyogram emg", "deep learning electrocardiogram ecg", "deep learning electroencephalogram eeg", and "deep learning electrooculogram eog" [5]. We found 147 papers published between January 2018 and October 2019 inclusive from various journals and publishers. As illustrated in Figure 1a, the works on EMG, ECG, and EEG using a deep-learning approach increased rapidly in 2018 and 2019, while works on EOG and on combinations of those signals are limited. Within the works on EEG, there has been an increase of 13, from 33 works in 2018 to 46 works in 2019. As shown in Figure 1b, among the four data modalities of physiological signals, EEG has been used in 79 works on a variety of applications. For ECG, 47 works have been conducted. Fifteen works apply to EMG, 1 work to EOG, and 5 works to a combination of those signals. Many papers show that the deep-learning approach is more successful than traditional machine learning in both implementation and performance; however, this paper does not aim to compare the two. In this paper, we review recent deep-learning methods from the last two years that analyze physiological signals. We only compare the key parameters within deep-learning methods, such as input data type, deep-learning task, deep-learning model, training architecture, and dataset sources, which are involved in predicting the state of hand motion, heart disease, brain disease, emotion, sleep stages, age, and gender.

Related Works
There are two types of scientific survey of deep-learning approaches to physiological signal data for healthcare applications published between January 2018 and October 2019 inclusive.
The first is oriented to medical fields, such as a taxonomy based on medical tasks (i.e., disease detection, computer-aided diagnosis, etc.) or a taxonomy based on anatomical application areas (i.e., brain, eyes, chest, lung, liver, kidney, heart, etc.). Faust et al. [6] collected 53 research papers regarding physiological signal analysis using deep-learning methods published from 2008 to 2017 inclusive. This work first introduced deep-learning models such as the auto-encoder, deep belief network, restricted Boltzmann machine, generative adversarial network, and recurrent neural network. It then categorized the papers based on the type of physiological signal data modality. Each category points out the medical application, the deep-learning algorithm, the dataset, and the results.


Physiological Signal Analysis and Modality
Physiological signal analysis is the study of estimating the human health condition from physical phenomena. There are three types of measurement for recording physiological signals: (1) reports; (2) readings; and (3) behavior [8]. A "report" is a questionnaire-based evaluation in which subjects rate their own physiological states. A "reading" is information captured by a device that reads the state of the human body, such as muscle strength, heartbeat, brain functionality, etc. A "behavior" measurement records a variety of actions, such as movement of the eyes. In this paper, we did not review the "report" measurement because its responses are more biased, less precise, and vary widely in question scale. We focus on the "reading" and "behavior" measurement techniques, in which the results take the signal modality of EMG, ECG, EEG, EOG, or a combination of these signals. Table 1 describes the physiological signal modalities that have been used to implement medical applications. The muscle tension pattern of the EMG signal supports hand motion and muscle activity recognition. Variation in heartbeat or heart rhythm supports heart disease, sleep-stage, emotion, age, and gender classification. The diversity of brain responses in the EEG signal supports brain disease, emotion, sleep-stage, motion, gender, word, and age classification. Changes in the eye's corneo-retinal potential in the EOG signal support sleep-stage classification.
Table 1. Medical application in physiological signal analysis.

(Excerpted rows from Table 1.) Words classification: recognition of words from the brain-generated signals of speech-impaired people. Age classification: classification of children's age while performing a verb-generation task and a monosyllable speech-elicitation task [148], using a CNN with accuracy of 95% on data from the University of Toronto, Toronto, Canada.

We make both a quantitative and a qualitative comparison of the deep-learning models. For the quantitative comparison, we illustrate the number of deep-learning models that have been used in each medical application. For the qualitative comparison, since the performance criterion is not reported uniformly across works, we take the accuracy value as the base criterion for an overall performance comparison.

Deep Learning with Electromyogram (EMG)
The electromyogram (EMG) signal records changes in skeletal muscles and is captured by placing non-invasive EMG electrodes on the skin, such as the commercial MYO Armband (MYB). Since different activities produce different muscle information, the EMG signal can discriminate patterns of motion such as an open or closed hand. To classify those motion patterns based on EMG signal information, 15 research works were conducted using deep-learning methods, as shown in Table 3 and Table 4. Within these research works, there are two types of key contribution: one is focused on hand motion recognition and the other on general muscle activity recognition. Figure 2 shows the number of deep-learning models used to analyze the EMG signal: (a) illustrates hand motion recognition and (b) illustrates muscle activity recognition. In hand motion recognition, the CNN and CNN+RNN models are the most commonly used. In muscle activity recognition, the CNN model is the most commonly used. Table 3 describes medical applications using deep-learning methods in EMG signal analysis from public dataset sources. The publicly available datasets are deployed in the CNN model, which provides overall accuracy >68%; however, the CNN+RNN model provides higher accuracy than the CNN model, with accuracy >82%. Table 4 describes medical applications using deep-learning methods in EMG signal analysis from private (in-house) dataset sources. These works use their own in-house datasets to recognize hand motion. The DBN model performs with overall accuracy >88%; therefore, the DBN model performs better than the CNN and CNN+RNN models. For muscle activity recognition, the CNN model achieves an NMSE of 0.033 ± 0.017, while the RNN/long short-term memory (LSTM) model achieves an NMSE of 0.096 ± 0.013; therefore, the CNN model performs better than the RNN/LSTM model.
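The NMSE figures above compare continuous muscle-activity estimates against a target signal. Normalization conventions for NMSE vary between papers; the variance-normalized form below is an assumption for illustration, not necessarily the convention used by the surveyed works.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized mean squared error (lower is better).
    Here the MSE is divided by the variance of the target signal;
    other papers normalize by the target's power instead."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

y = np.array([0.0, 1.0, 0.0, -1.0])
print(nmse(y, y))  # 0.0 for a perfect prediction
```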


Deep Learning with Electrocardiogram (ECG)
Electrocardiogram (ECG) is data regarding changes of heartbeat or rhythm. There are 47 research works using deep-learning methods to analyze the ECG signals, as shown in Table 5, Table 6, and Table 7.
Their key contributions are categorized as heartbeat signal classification, heart disease classification, sleep-stage classification, emotion detection, and age and gender prediction. Figure 3 shows the number of deep-learning models used to analyze the ECG signal: (a) illustrates heartbeat signal classification, in which the CNN model is the most commonly used; (b) illustrates heart disease classification, in which CNN is the most commonly used; (c) illustrates sleep-stage detection, in which CNN is the most commonly used; (d) illustrates emotion detection, in which RNN/LSTM and CNN+RNN are used; and (e) illustrates age and gender classification, in which only CNN is used.
Table 5 describes medical applications using deep-learning methods in ECG signal analysis from public dataset sources. In heartbeat signal classification, the CNN model performs with overall accuracy >95%, the RNN/LSTM model with overall accuracy >98%, and the CNN+RNN/LSTM model with overall accuracy >87%; therefore, the RNN/LSTM model performs better than the CNN and CNN+RNN/LSTM models. In heart disease classification, the CNN model performs with overall accuracy >83%, the RNN/LSTM model with overall accuracy >90%, and the CNN+RNN/LSTM model with overall accuracy >98%; therefore, the CNN+RNN/LSTM model performs the best. In sleep-stage classification, only the CNN model is used, and it performs with overall accuracy >87%. Table 6 describes medical applications using deep-learning methods in ECG signal analysis from private dataset sources. In heartbeat signal classification, only the CNN model is used, and it performs with overall accuracy >78%. In heart disease classification, the CNN model performs with overall accuracy >97%, while the CNN+LSTM model performs with accuracy >83%; therefore, the CNN model performs better than the CNN+LSTM model. In sleep-stage classification, the CNN and GRU models perform with accuracy of 99%. In emotion classification, the CNN+RNN model performs with accuracy >73%. In age and gender prediction, the CNN model performs with accuracy >90%. Table 7 describes medical applications using deep-learning methods in ECG signal analysis from hybrid dataset sources. In heartbeat signal classification, the CNN+LSTM model performs with accuracy >99%.


Deep Learning with Electroencephalogram (EEG)
Electroencephalogram (EEG) is data regarding changes of the brain measured from the scalp. There are 79 research works using deep-learning methods to analyze EEG signals, as shown in Table 8, Table 9, and Table 10. Their key contributions are categorized as brain functionality classification, brain disease classification, emotion classification, sleep-stage classification, motion classification, gender classification, word classification, and age classification. Figure 4 shows the number of deep-learning models used to analyze the EEG signal: (a) illustrates brain functionality classification, in which the CNN model is the most commonly used; (b) illustrates brain disease classification, in which CNN is the most commonly used; (c) illustrates emotion classification, in which CNN is the most commonly used; (d) illustrates sleep-stage classification, in which CNN is the most commonly used; (e) illustrates motion classification, in which CNN is the most commonly used; (f) illustrates gender classification, in which only CNN is used; (g) illustrates word recognition, in which only AE is used; and (h) illustrates age classification, in which only CNN is used.
Table 8 describes medical applications using deep-learning methods in EEG signal analysis from public dataset sources. In brain functionality signal classification, the CNN model performs with overall accuracy >66%, the RNN/LSTM model with overall accuracy >77%, and the CNN+RNN/LSTM model with overall accuracy >74%; therefore, the RNN/LSTM model performs better than the CNN and CNN+RNN/LSTM models. In brain disease classification, the CNN model performs with overall accuracy >93%, the RNN/LSTM model with overall accuracy >95%, and the CNN+RNN/LSTM model with overall accuracy >90%; therefore, the RNN/LSTM model performs better than the CNN and CNN+RNN/LSTM models. In emotion classification, the CNN model performs with overall accuracy >55%, the RNN/LSTM model with overall accuracy >74%, and the RBM model with overall accuracy >75%; therefore, the RBM model performs best. In sleep-stage classification, the CNN model performs with overall accuracy >79%, the RNN/LSTM model with overall accuracy >79%, and the CNN+RNN/LSTM model with overall accuracy >84%; therefore, the CNN+RNN/LSTM model performs better than the CNN and RNN/LSTM models. In motion classification, only RNN/LSTM is used, with accuracy >68%. In gender classification, only CNN is used, with accuracy >80%. In word classification, only CNN+AE is used, with overall accuracy >95%.
Table 9 describes medical applications using deep-learning methods in EEG signal analysis from private dataset sources. In brain functionality signal classification, the CNN model performs with overall accuracy >63%, the CNN+RNN/LSTM model with overall accuracy >83%, and the stacked auto-encoder (SAE)+CNN model with overall accuracy >88%; therefore, the SAE+CNN model performs better than the CNN and CNN+RNN/LSTM models. In brain disease classification, the CNN model performs with overall accuracy >59%, the RNN/LSTM model with overall accuracy >73%, and the CNN+RNN/LSTM model with overall accuracy >70%; therefore, the RNN/LSTM model performs better than the CNN and CNN+RNN/LSTM models. In emotion classification, only the CNN+LSTM model is used, with accuracy >98%. In sleep-stage classification, the CNN model performs with overall accuracy >95%, and the CNN+RNN/LSTM model performs with kappa > 0.8. In motion classification, only the CNN model is used, with accuracy >80%.
Table 10 describes medical applications using deep-learning methods in EEG signal analysis from hybrid dataset sources. In brain disease classification, only the CNN+AE model is used, with kappa > 0.564. In sleep-stage classification, only the CNN+LSTM model is used, with kappa > 0.72. In age classification, only the CNN model is used, with accuracy >95%.
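Several of the works above report Cohen's kappa rather than accuracy, since it corrects class agreement for chance. As a minimal reference sketch of the standard definition (not the evaluation code of any surveyed work):

```python
import numpy as np

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings corrected for
    chance agreement; 1.0 is perfect, 0.0 is chance-level."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels = np.union1d(y_true, y_pred)
    po = np.mean(y_true == y_pred)                      # observed agreement
    pe = sum(np.mean(y_true == c) * np.mean(y_pred == c)
             for c in labels)                           # chance agreement
    return (po - pe) / (1 - pe)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 0]
print(round(cohens_kappa(y_true, y_pred), 3))  # 0.75
```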


Deep Learning with Electrooculogram (EOG)
Electrooculogram (EOG) is data regarding changes of the corneo-retinal potential between the front and the back of the human eye. There is one research work using deep-learning methods to analyze EOG signals, as shown in Table 11; the only contribution of deploying deep learning in EOG signal analysis is sleep-stage classification. Figure 5 shows the number of deep-learning models used to analyze the EOG signal for sleep-stage classification; the work used the GRU model. Table 11 describes the medical application using deep-learning methods in EOG signal analysis from a public dataset source. In sleep-stage classification, the GRU model performs with accuracy of 69.25%.

Deep Learning with a Combination of Signals
There are 5 research works using deep-learning methods to analyze a combination of signals, as shown in Table 12. Sokolovsky et al. [150] combined EEG and EOG signals. Chambon et al. [151] and Andreotti et al. [152] combined polysomnography (PSG) signals such as EEG, EMG, and EOG. The work of Yildirim et al. [153] exploited the combined signals of EEG and EOG. The contribution of Croce et al. [154] was from EEG and magnetoencephalographic (MEG) signals. CNN is used both for sleep-stage classification and for the classification of brain and artifactual independent components. Figure 6 shows the number of deep-learning models used to analyze a combination of signals for sleep-stage classification. Table 12 describes medical applications using deep-learning methods in the analysis of combined signals from public dataset sources. In sleep-stage classification, only the CNN model is used, with an overall accuracy >81%.


Training Architecture
To strive for high accuracy, deep-learning techniques require not only a good algorithm but also a good dataset [155]. The input data are therefore used in one of two ways: (1) the input data are first extracted as features, and the feature data are then fed into the network; based on our review, some contributions use traditional machine-learning methods as feature extractors, described in detail in Section 4.1, while other contributions use deep-learning methods as feature extractors, described in detail in Section 4.2; or (2) the raw input data are fed into the network directly for end-to-end learning, described in detail in Section 4.3.

Traditional Machine Learning as Feature Extractor and Deep Learning as Classifier
To distinguish the label of signals, raw signal data is divided into N levels. This step is called feature extraction. Feature extraction is conducted to strengthen the accuracy of prediction in the classification step. Figure 7 illustrates the training architecture using traditional machine learning as feature extractor and deep learning as classifier. For example, the raw EMG signal is divided into N levels using mean absolute value (MAV). The featured data is fed into a CNN to classify hand motion.
Yu et al [9] designed a feature-level fusion to recognize Chinese sign language. The features are extracted using hand-crafted features and learned features from DBN. These two feature levels are concatenated before being fed into the deep belief network and fully connected network for learning.
For the hand-grasping classification described by Li et al [14], the principal component analysis (PCA) method is used for dimension reduction, and a DNN with a stack of two auto-encoder layers and a SoftMax classifier is applied to classify levels of force.
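The PCA dimension-reduction step itself can be sketched in a few lines; the following generic example (not Li et al's implementation) projects toy feature vectors onto the top principal components via the SVD of the mean-centered data:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components,
    computed from the SVD of the mean-centered data."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 8))   # 50 samples, 8 raw features
Z = pca_reduce(X, 3)
print(Z.shape)  # (50, 3)
```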
Saadatnejad et al [35] proposed ECG heartbeat classification for continuous monitoring. The work extracted raw ECG samples into heartbeat RR interval features and wavelet features. Next, the extracted features were fed into two RNN-based models for classification.
To classify premature ventricular contraction, Jeon et al [45] extracted the features in the QRS pattern from the ECG signal and classified them with modified weights and biases based on the error-backpropagation algorithm.
Liu et al [60] presented heart disease classification based on ECG signals by deploying symbolic aggregate approximation (SAX) for feature extraction and LSTM for classification.
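As an illustration of the SAX step (a generic sketch with an assumed 4-letter alphabet, not Liu et al's exact configuration), a z-normalized signal is averaged per segment (piecewise aggregate approximation) and each segment mean is mapped to a letter using standard-normal breakpoints:

```python
import numpy as np

def sax(signal, n_segments=8, breakpoints=(-0.6745, 0.0, 0.6745)):
    """Symbolic aggregate approximation: z-normalize, average over
    equal-length segments (PAA), then map each mean to a letter via
    standard-normal breakpoints (here for a 4-letter alphabet)."""
    x = (signal - signal.mean()) / signal.std()
    paa = x[: len(x) - len(x) % n_segments].reshape(n_segments, -1).mean(axis=1)
    letters = "abcd"
    return "".join(letters[np.searchsorted(breakpoints, v)] for v in paa)

rng = np.random.default_rng(2)
word = sax(rng.standard_normal(160))
print(word)  # an 8-letter string over {a, b, c, d}
```

The resulting symbolic word is a compact sequence that an LSTM can consume in place of the raw samples.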
Majidov et al [85] proposed motor imagery EEG classification by deploying Riemannian geometry-based feature extraction, comparing convolutional layers followed by a SoftMax layer against convolutional layers followed by fully connected layers with 100 output units.
Abbas et al [87] designed a model for multiclass motor imagery classification, in which fast Fourier transform energy map (FFTEM) is used for feature extraction and CNN is used for classification.
In diagnosing brain disorders, Golmohammadi et al [95] used linear frequency cepstral coefficients (LFCC) for feature extraction and a hybrid of hidden Markov models and a stacked denoising auto-encoder (SDA) for classification.

Deep Learning as Feature Extractor and Traditional Machine Learning as Classifier
Figure 8 illustrates the training architecture using deep learning as the feature extractor and traditional machine learning as the classifier. For example, the raw EEG signal is divided into N levels using an SAE. The featured data are fed into a support vector machine (SVM) to classify the state of emotion. Chauhan et al [26] proposed an ECG anomaly class identification algorithm, in which LSTM and error profile modeling are used as the feature extractor. Then, multiple choices of traditional machine-learning classifier models were evaluated, such as multilayer perceptron, support vector machine, and logistic regression.
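The two-stage shape of this architecture can be sketched as follows; the random-weight encoder and nearest-centroid classifier below are simplified stand-ins for the trained SAE/LSTM extractors and the SVM or logistic-regression classifiers used in the surveyed works:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stage 1 (stand-in for a trained deep extractor): a random linear
# encoder with a tanh nonlinearity. The surveyed works train an SAE
# or LSTM here; this sketch only shows the pipeline shape.
W = rng.standard_normal((32, 8)) * 0.1
encode = lambda X: np.tanh(X @ W)

# Toy two-class "signal windows": class 0 near -1, class 1 near +1.
X = np.vstack([rng.normal(-1.0, 0.3, (20, 32)),
               rng.normal(1.0, 0.3, (20, 32))])
y = np.array([0] * 20 + [1] * 20)

# Stage 2 (traditional classifier): nearest centroid in feature space,
# standing in for the SVM / logistic regression used in the literature.
Z = encode(X)
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
pred = dists.argmin(axis=1)
print("accuracy:", (pred == y).mean())
```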

To diagnose arrhythmia, Yang et al [42] used DL-CCANet and TL-CCANet as feature extractor to discriminate features from dual-lead and three-lead ECGs. Then, the extracted features were fed into the linear support vector machine for classification.
Nguyen et al [63] proposed an algorithm for detecting sudden cardiac arrest in automated external defibrillators, in which a CNN is used as the feature extractor (CNNE) and a boosting (BS) classifier performs the classification.
Ma et al [131] designed a model to detect driving fatigue. The network model integrated PCA and a deep-learning method, called PCANet, for feature extraction. Then, SVM/KNN is used for classification.

End-to-End Learning
Rather than extracting features from the raw data, the raw data are fed into the network directly for classification. This architecture removes the feature-extraction step. Figure 9 illustrates the training architecture of using only deep-learning methods to take the raw input data, perform classification, and output the result. For example, the ECG data are fed into an LSTM network to classify states of sudden cardiac arrest. All works in Tables 3-12 which are not mentioned in Sections 4.1 and 4.2 use a raw dataset for end-to-end learning.
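A minimal end-to-end training loop might look like the following sketch, which trains a single hidden layer on synthetic raw windows; the surveyed works use far deeper CNN/RNN models, but the raw-data-in, label-out shape is the same:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "raw windows": class 0 centered below zero, class 1 above.
X = np.vstack([rng.normal(-0.5, 1.0, (40, 16)),
               rng.normal(0.5, 1.0, (40, 16))])
y = np.hstack([np.zeros(40), np.ones(40)])

# One hidden layer trained end-to-end by gradient descent on the raw
# windows, with no separate feature-extraction step.
W1 = rng.standard_normal((16, 8)) * 0.3
W2 = rng.standard_normal(8) * 0.3
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
for _ in range(500):
    H = np.tanh(X @ W1)                   # hidden activations
    p = sigmoid(H @ W2)                   # predicted P(class 1)
    g = (p - y) / len(y)                  # cross-entropy gradient at output
    dH = np.outer(g, W2) * (1 - H ** 2)   # backprop through tanh
    W2 -= 0.3 * (H.T @ g)
    W1 -= 0.3 * (X.T @ dH)
print("training accuracy:", ((p > 0.5) == y).mean())
```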

Dataset Sources
We deduce that there are three types of dataset sources used. (1) Public datasets, as shown in Table 3, Table 5, Table 8, Table 11, and Table 12, are available online and freely accessible, and have large numbers of samples. Figure 10 illustrates the number of papers using a public dataset based on physiological data modality. For EMG signal analysis, NinaPro DB is the most commonly used. For ECG signal analysis, MIT-BIH is the most commonly used, and PhysioNet is the second most commonly used. For EEG signal analysis, BCI competition II is the most commonly used, with CHB-MIT and DEAP the second most commonly used. For EOG signal analysis, only PhysioNet is used. For the combination of signal analysis, MASS and PhysioNet are the most commonly used.
(2) Private datasets are shown in Table 4, Table 6, and Table 9: these are collected by the authors in their own laboratory, hospital, or institution. Such a dataset requires a specific device for recording or capturing and requires participants or subjects to be involved in the experimental process; thus, it has a small number of samples. (3) Hybrid datasets are shown in Tables 7 and 10: public and private datasets are combined for use in the experiment.

Discussion
We studied contributions based on types of physiological signal data modality and training architecture. The medical application, deep-learning model, and performance of those contributions have been reviewed and illustrated.

Discussion of the Deep-Learning Task
In medical applications, we deduced that most contributions were conducted using a classification task, a feature-extraction task, or a data-compression task. The classification task, also known as a recognition, detection, or prediction task, focuses on whether an instance exists or does not exist. For example, arrhythmia detection [51] analyzes whether the heartbeat signal is normal or arrhythmic. The classification task also covers grouping or leveling the types of instances; for example, emotion classification [126] sorts emotions into groups of sad, happy, neutral, and fear. The feature-extraction task [43] focuses on input data enhancement, in which an unsupervised learning technique is used to label the dataset to avoid the heavy burden of manual labeling. The data-compression task [33] focuses on decreasing the data size while retaining high data quality for storage and transmission.

Discussion of the Deep-Learning Model
Even though there are various deep-learning models, we deduced that CNN, RNN/LSTM, and CNN+RNN/LSTM models are the most commonly used. As theorized in the literature, the RNN/LSTM model predicts continuous sequential data well. However, many contributions convert physiological signals into 2D data and feed those data into a CNN network, with good performance.

Discussion of the Training Architecture
Due to the different characteristics of data modalities, a diversity of training architectures has been investigated. The first type of architecture exploits a traditional machine-learning model as the feature extractor and a deep-learning model as the classifier. This architecture's goal is to boost classification accuracy by converting raw data into feature data; the feature data contain more potentially discriminative characteristics than the raw data. The DL classifier trains on this feature data in a supervised learning manner.
In contrast, the second type of architecture employs a deep-learning model as the feature extractor and a traditional machine-learning model as the classifier. This architecture's goal is to reduce the heavy burden of hand-crafted labeling of the dataset. The DL extractor trains on the raw data in an unsupervised learning manner.
The third architecture type uses only a deep-learning model to train on raw data and produce the final output. This architecture's goal is not to rely on engineered input features but to strengthen the deep-learning algorithm itself, on the belief that the more robust the DL algorithm, the higher the resulting accuracy. This architecture trains on raw data in a supervised learning manner, and it also eases the implementation stage.
In our survey, we could not determine which type of architecture is best, because no contribution applies all three types of architecture to the same input dataset for training and testing on the same desired task.

Discussion of the Dataset Source
We reviewed the dataset sources used in deep-learning applications of physiological signal analysis. The widely used public datasets are MIT-BIH, PhysioNet, BCI competition II, CHB-MIT, DEAP, Bonn University, and NinaPro. Private datasets were collected by authors in their own laboratory, hospital, or institution when the data were not available from a public source. Due to the lack of datasets, contributions such as Nodera et al [23] employed data augmentation, in which synthetic data are generated by duplicating the original data and applying transformations such as translation and rotation. Contributions [12,16,23,46,58] employed a transfer learning technique: rather than training from scratch with the huge dataset that would require, they adapted pre-trained weights from state-of-the-art models such as AlexNet, VGG, ResNet, Inception, or DenseNet.
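For 1D signals, a label-preserving augmentation analogous to the translations described above can be sketched with simple time-shifts and amplitude scaling (an illustrative example, not Nodera et al's method):

```python
import numpy as np

def augment(signal, rng, max_shift=50):
    """Generate a shifted and rescaled copy of a 1D signal: the kind
    of simple label-preserving transformation used to enlarge small
    physiological datasets."""
    shifted = np.roll(signal, rng.integers(-max_shift, max_shift + 1))
    return shifted * rng.uniform(0.8, 1.2)

rng = np.random.default_rng(5)
ecg = rng.standard_normal(500)          # toy "ECG" window
augmented = [augment(ecg, rng) for _ in range(10)]
print(len(augmented), augmented[0].shape)
```

Each augmented copy keeps the original label, so a small recorded dataset can be expanded before training.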

Conclusions
In this paper, we conducted an overview of deep-learning approaches and their applications in medical 1D signal analysis over the past two years. We found 147 papers using deep-learning methods in EMG, ECG, EEG, and EOG signal analysis and in combinations of signal analysis.
By reviewing those works, we contribute to the identification of the key parameters used to estimate the state of hand motion, heart disease, brain disease, emotion, sleep stages, age, and gender. Additionally, we reveal that the CNN model predicts physiological signals at the state-of-the-art level. We have also learned that there is no precise standardized experimental setting; these non-uniform parameters and settings make it difficult to compare exact performance. However, we compared the overall performance. This comparison should help other researchers decide which input data type, deep-learning task, deep-learning model, and dataset are suitable for achieving their desired medical application and reaching the state-of-the-art level. As a lesson learned from this review, our discussion can also help fellow researchers decide on a deep-learning task, deep-learning model, training architecture, and dataset. Those are the main parameters that affect system performance.
In conclusion, a deep-learning approach has proved promising for bringing those current contributions to the state-of-the-art level in physiological signal analysis for medical applications.

Conflicts of Interest:
The authors declare no conflict of interest.