Contactless Real-Time Heartbeat Detection via 24 GHz Continuous-Wave Doppler Radar Using Artificial Neural Networks.

The measurement of human vital signs is a highly important task in a variety of environments and applications. Most notably, the electrocardiogram (ECG) is a versatile signal that could indicate various physical and psychological conditions, from signs of life to complex mental states. The measurement of the ECG relies on electrodes attached to the skin to acquire the electrical activity of the heart, which imposes certain limitations. Recently, due to the advancement of wireless technology, it has become possible to pick up heart activity in a contactless manner. Among the possible ways to wirelessly obtain information related to heart activity, methods based on mm-wave radars proved to be the most accurate in detecting the small mechanical oscillations of the human chest resulting from heartbeats. In this paper, we presented a method based on a continuous-wave Doppler radar coupled with an artificial neural network (ANN) to detect heartbeats as individual events. To keep the method computationally simple, the ANN took the raw radar signal as input, while the output was minimally processed, ensuring low latency operation (<1 s). The performance of the proposed method was evaluated with respect to an ECG reference ("ground truth") in an experiment involving 21 healthy volunteers, who were sitting on a cushioned seat and were refrained from making excessive body movements. The results indicated that the presented approach is viable for the fast detection of individual heartbeats without heavy signal preprocessing.


Introduction
The contactless monitoring of heart rate has numerous advantages over conservative methods that use contact sensors, such as electrocardiogram (ECG) monitors, conventional photoplethysmography (PPG) sensors or piezoresistive sensors [1]. Contactless sensors offer improved mobility and obviate the need for attaching or cleaning electrodes, but also have the unique ability to be used on patients who suffer from skin irritations, painful skin damage like lacerations or burns, as well as patients who exhibit anxiety or allergic reactions to contact sensors. Furthermore, some contactless instruments, such as radar-based sensors [2], can be used for heart rate monitoring through clothes or other obstacles.
The real-time operation of heart rate monitors is required for the timely detection of potentially dangerous conditions in hospitals or in-home health care applications. The heart rate and its variability in the frequency domain when shorter data windows were used (2-5 s). Specific heartbeat signal has been obtained using the short-time Fourier transform (STFT) analysis in [23], which was further filtered through an adaptive band pass filter for improved quality. The control of the adaptive band pass filter was done using the information extracted from the time domain analysis of the heartbeat signal on windows of 2-3 s. In [24] it has been shown that the analysis of the frequency domain only did not give satisfactory results. Therefore, the heart rate information was extracted using frequency domain analysis (window length: 3.5 s) for the coarse estimation and time domain processing using a band pass filter bank for the refinement of the results. This approach resulted in small algorithm delay (~2.5 s) and high accuracy.
Recently, new approaches based on supervised or unsupervised machine-learning algorithms [31,32] were introduced in the CW Doppler radar systems, and the first results have shown promising advantages in terms of heartbeat detection delay and source separation capabilities (robustness of heartbeat detection to respiration motion or random body motion) compared to traditional approaches. Convolutional neural networks (CNN) were applied in [29] to estimate heart rate from UWB radar signals. However, due to the lack of training data, this approach was person-specific since the CNN needed to be trained for each subject separately. To the best of our knowledge, there is no published paper that used artificial neural networks for heartbeat detection using the CW Doppler radar technology. This paper focused on the development of a system for instantaneous heart rate estimation (delay of less than 1 s from heartbeat occurrence) using a shallow artificial neural network (ANN) as a main signal processing element. Additionally, the goal was to develop a detection algorithm that was person-independent. The contribution of this work is in the development of the system for detecting individual heartbeats considering the following requirements: (1) a low-complexity time domain-based algorithm (without relying on periodic occurrences of heartbeat-related chest displacements as in the case of traditional spectral approaches), (2) suitable for real-time human presence detection, (3) calibration-free (no need for I/Q imbalance, offset compensation or the usage of any demodulation techniques) and 4) testing on a separate group of subjects from those whose heart rate signals were used in the ANN model selection and training process.

Basics of CW Doppler Radar Operation
A typical architecture of quadrature continuous-wave Doppler radar is shown in Figure 1a. The radar transmitted sinusoidal electromagnetic waves generated in the local oscillator (LO) and amplified in the power amplifier (PA). The transmitted signal reflects from the target, where the reflected signal is modulated in phase. The transmitted signal can be expressed as where A T and f are the amplitude and the frequency of the transmitted signal, respectively, and θ(t) is the phase noise of the local oscillator. The received signal can be expressed as where A R , f and λ are the amplitude, the frequency and the wavelength of the carrier signal, respectively, c is the speed of light, d 0 is the nominal radar-target distance and x(t) is the target's relative displacement [7]. The received signal is demodulated using a quadrature demodulator as shown in Figure 1a. The resulting signals are two baseband signals: the in-phase signal (I), which is in phase with the carrier and the quadrature signal (Q), which is phase-shifted from the carrier by 90 • . These signals are expressed as [10] I(t) = A I cos θ 0 + 4πx(t) λ + ∆θ(t) + DC I , where A I and A Q are amplitudes (A I A Q due to the I/Q amplitude imbalance), DC I and DC Q are DC offsets, θ 0 is the constant phase shift due to the constant nominal distance d 0 from (2), ∆ϕ is the phase shift due to the I/Q phase imbalance and ∆θ(t) is the total residual phase noise which can be neglected in vital signs detection applications since the distance between the target and the radar system is small [7]. Baseband signals are further digitized using analog-to-digital converters (ADCs) and processed in a digital signal processing unit.
offsets, θ0 is the constant phase shift due to the constant nominal distance d0 from (2), Δφ is the phase shift due to the I/Q phase imbalance and Δθ(t) is the total residual phase noise which can be neglected in vital signs detection applications since the distance between the target and the radar system is small [7]. Baseband signals are further digitized using analog-to-digital converters (ADCs) and processed in a digital signal processing unit.

Instrumentation
In this study, a CW Doppler radar with a carrier frequency of 24 GHz was used for heart rate estimation. The recording of the reference ("ground truth") signal for the ANN training and the validation of the estimation accuracy was done using a wearable cardiorespiratory monitoring system (Smartex Wearable Wellness System (WWS), Pisa, Italy) including a single lead ECG and a piezo band as shown in Figure 1b. The ECG sensor was chosen since its accuracy was higher than other heart rate monitors such as pulse oximeters or photoplethysmographs. The system contained a microcontroller for data acquisition and Bluetooth connection for wireless data transmission to the personal computer (PC). It was a CE (Conformité Européenne) certified system. Additionally, the electrodes were connected to the skin using the wet textile fabric, which eliminated any potential irritation that could come from the sticky adhesive electrodes, particularly if applied on non-glabrous skin [34]. The sample rate for the ECG recording was 250 Hz.  The radar system used in the experiment was a Novelic Radar Module, NRM24 [24]. Figure 1c shows the radar module placed in the experimental setup. It was a DC-coupled Doppler radar sensor. The module was compact (8 × 5 × 1 cm) and portable and consisted of two stacked printed circuit boards (PCBs). The radar sensor PCB included the main part of the analog frontend: antennas, an integrated radar transceiver and a phase-locked loop integrated circuit. Antenna

Instrumentation
In this study, a CW Doppler radar with a carrier frequency of 24 GHz was used for heart rate estimation. The recording of the reference ("ground truth") signal for the ANN training and the validation of the estimation accuracy was done using a wearable cardiorespiratory monitoring system (Smartex Wearable Wellness System (WWS), Pisa, Italy) including a single lead ECG and a piezo band as shown in Figure 1b. The ECG sensor was chosen since its accuracy was higher than other heart rate monitors such as pulse oximeters or photoplethysmographs. The system contained a microcontroller for data acquisition and Bluetooth connection for wireless data transmission to the personal computer (PC). It was a CE (Conformité Européenne) certified system. Additionally, the electrodes were connected to the skin using the wet textile fabric, which eliminated any potential irritation that could come from the sticky adhesive electrodes, particularly if applied on non-glabrous skin [34]. The sample rate for the ECG recording was 250 Hz.
The radar system used in the experiment was a Novelic Radar Module, NRM24 [24]. Figure 1c shows the radar module placed in the experimental setup. It was a DC-coupled Doppler radar sensor. The module was compact (8 × 5 × 1 cm) and portable and consisted of two stacked printed circuit boards (PCBs). The radar sensor PCB included the main part of the analog frontend: antennas, an integrated radar transceiver and a phase-locked loop integrated circuit. Antenna beamwidths (BW) were BW θ, 3-dB = 25 • , BW θ, 6-dB = 33 • , BW θ, 10-dB = 43 • (elevation) and BW ϕ, 3-dB = 44 • , BW ϕ, 6-dB = 65 • , BW ϕ, 10-dB = 90 • (azimuth), whereas the antenna gain was 12 dBi. The maximum power at the transmit antenna input was 10 dBm. The second PCB was the acquisition board, which included baseband amplifiers and filters, a power supply circuitry, an ARM Cortex-M4 based microcontroller (MCU) that integrated a multichannel 12-bit ADC and serial-to-USB converters for data transfer. The baseband filter had a cut-off frequency of 100 Hz, which was considered high enough for the vital sign detection application. The sampling rate was set to f S = 1 kHz, while the data logging on the PC was performed using a custom-made application that communicated via serial connection with the MCU.
The alignment of the datapoints from the Smartex WWS system and the radar system was done by matching timestamps in the logged data.

Database Recordings
The radar recordings, as well as the ECG data used as reference, were obtained from 21 healthy human volunteers who took part in the experiments (14 males and 7 females, aged 26.1 ± 5.1, with a height of 179.5 ± 11.6 cm and a weight of 74.2 ± 16.4 kg). Subjects were free of any diagnosed acute/chronic cardiac or respiratory problems, based on their self-report. Participants were acquainted with the protocol in advance and gave informed consent. The study was approved by the ethical committee of the University of Belgrade-School of Electrical Engineering, Serbia. The subjects had the wearable ECG strapped around their thorax and were instructed to sit comfortably on a cushioned seat in front of the radar sensor. The sensor was mounted on a custom stand, facing the participants at a distance of 75 cm. At this distance, the radar beam focused on the torso area, considering −3 dB beamwidths of the antenna (25 • and 44 • ). The participants were told to breathe as they normally would in a relaxed state, without extremely deep and excessive breaths. Additionally, they were asked to refrain from excessive body and hand movements, since the rapid movements could mask the small chest wall movements that come from heartbeats and hence affect the detection. The radar signals obtained for one participant are shown in Figure 2. Three-hundred seconds of data were acquired for each subject.
Sensors 2020, 20, x FOR PEER REVIEW 5 of 16 beamwidths (BW) were BWθ, 3-dB = 25°, BWθ, 6-dB = 33°, BWθ, 10-dB = 43° (elevation) and BWφ, 3-dB = 44°, BWφ, 6-dB = 65°, BWφ, 10-dB = 90° (azimuth), whereas the antenna gain was 12 dBi. The maximum power at the transmit antenna input was 10 dBm. The second PCB was the acquisition board, which included baseband amplifiers and filters, a power supply circuitry, an ARM Cortex-M4 based microcontroller (MCU) that integrated a multichannel 12-bit ADC and serial-to-USB converters for data transfer. The baseband filter had a cut-off frequency of 100 Hz, which was considered high enough for the vital sign detection application. The sampling rate was set to fS = 1 kHz, while the data logging on the PC was performed using a custom-made application that communicated via serial connection with the MCU. The alignment of the datapoints from the Smartex WWS system and the radar system was done by matching timestamps in the logged data.

Database Recordings
The radar recordings, as well as the ECG data used as reference, were obtained from 21 healthy human volunteers who took part in the experiments (14 males and 7 females, aged 26.1 ± 5.1, with a height of 179.5 ± 11.6 cm and a weight of 74.2 ± 16.4 kg). Subjects were free of any diagnosed acute/chronic cardiac or respiratory problems, based on their self-report. Participants were acquainted with the protocol in advance and gave informed consent. The study was approved by the ethical committee of the University of Belgrade-School of Electrical Engineering, Serbia. The subjects had the wearable ECG strapped around their thorax and were instructed to sit comfortably on a cushioned seat in front of the radar sensor. The sensor was mounted on a custom stand, facing the participants at a distance of 75 cm. At this distance, the radar beam focused on the torso area, considering −3 dB beamwidths of the antenna (25° and 44°). The participants were told to breathe as they normally would in a relaxed state, without extremely deep and excessive breaths. Additionally, they were asked to refrain from excessive body and hand movements, since the rapid movements could mask the small chest wall movements that come from heartbeats and hence affect the detection. The radar signals obtained for one participant are shown in Figure 2. Three-hundred seconds of data were acquired for each subject.
In-phase Quadrature Figure 2. In-phase and quadrature Doppler radar signals during the measurements. The participant was breathing normally in front of the radar sensor. Figure 3a shows a fragment of recorded time-aligned ECG and radar signals. There was a distinctive signal shape in the radar signals with a time delay in relation to the R wave of the ECG. The heartbeat-related disturbance in the radar signals originated from the mechanical movement of the chest wall during the heartbeat. This disturbance corresponded to the ballistocardiogram J-wave peak [35]. It can be seen that this characteristic signal shape became distorted and was reduced in the presence of breathing and movement.
Many previous works considered that the heartbeat-related displacements could be modeled as a sine wave [16], half-sine pulses [21], Gaussian pulses [22], or as an array of two consecutive pulses [24]. However, the mechanical response of the chest wall has a complex waveform that is difficult to model [33]. Figure 3b shows the heartbeat-related displacement obtained from the radar data shown in Figure 3a, using the extended differentiate and cross-multiply (DACM) demodulation algorithm  Figure 3a shows a fragment of recorded time-aligned ECG and radar signals. There was a distinctive signal shape in the radar signals with a time delay in relation to the R wave of the ECG. The heartbeat-related disturbance in the radar signals originated from the mechanical movement of the chest wall during the heartbeat. This disturbance corresponded to the ballistocardiogram J-wave peak [35]. It can be seen that this characteristic signal shape became distorted and was reduced in the presence of breathing and movement.
Many previous works considered that the heartbeat-related displacements could be modeled as a sine wave [16], half-sine pulses [21], Gaussian pulses [22], or as an array of two consecutive pulses [24]. However, the mechanical response of the chest wall has a complex waveform that is difficult to model [33]. Figure 3b shows the heartbeat-related displacement obtained from the radar data shown in Figure 3a, using the extended differentiate and cross-multiply (DACM) demodulation algorithm described in [16]. Before the demodulation, the I/Q imbalance of the radar was measured Sensors 2020, 20, 2351 6 of 16 and compensated offline like in [24]. It can be observed that the heartbeat-related displacement had a complex waveform, which further induced the complex signal shapes of the radar signals. The detector in this paper was trained with the aim to recognize these small distortions in the radar signals, without previous modeling. In order to get a reliable data set for training the detector, the signals were cropped to the period of 200 s of normal breathing ( Figure 2, time interval 50-250 s).
Sensors 2020, 20, x FOR PEER REVIEW 6 of 16 described in [16]. Before the demodulation, the I/Q imbalance of the radar was measured and compensated offline like in [24]. It can be observed that the heartbeat-related displacement had a complex waveform, which further induced the complex signal shapes of the radar signals. The detector in this paper was trained with the aim to recognize these small distortions in the radar signals, without previous modeling. In order to get a reliable data set for training the detector, the signals were cropped to the period of 200 s of normal breathing (

Data Preprocessing
The detector was envisioned as a binary classifier detecting the occurrence of each heartbeat, but such implementation required the reshaping of the recorded data into an appropriate format. The continuous nature of the reference ECG signal was not suitable for this approach, so it was instead transformed into a binary on/off signal. R peaks were detected using the Pan-Tompkins algorithm [36]. The surrogate reference binary signal ("binarized target signal") was synthesized in the form of a 400 ms pulse with 200 ms delay after each R wave in order to highlight the temporal relationship between the heart's electrical activity (ECG) and the resulting mechanical displacement (radar signal), as shown in Figure 4. The window of 400 ms width with a latency of 200 ms after the R wave was able to cover most of the mechanical rippling observed within the radar signals. This choice is in line with a study performed on 92 healthy subjects which showed that the duration range of the R−J interval was 203-290 ms [37]. Additionally, the selection of the value of the latency parameter (200 ms) for the binarized target signal was confirmed on a recorded dataset of seven randomly selected subjects from our study. Using the simplest model that was tested in the scope of this paper (a feed-forward artificial neural network with a single hidden layer containing 10 units), a series of training and testing with a range of delays and target widths was performed. The tested delay values ranged between 0 ms to 300 ms, and the target widths were tested in the range between 300 ms and 500 ms. These values were selected based on the visual inspection of the signals, and the values that were chosen as optimal (window of 400 ms width with a latency of 200 ms after the R wave) were those that yielded the best model accuracy in terms of the percentage of detected peaks.
The following stage in the signal preprocessing was the decimation of both the binary reference signal and the radar signal to the same sampling rate, which was a prerequisite related to the ANN employment, as each input state required a corresponding target output. To ensure the fast computation without a loss of information in the physiological range, 100 Hz was selected as the sampling rate during this computation.

Data Preprocessing
The detector was envisioned as a binary classifier detecting the occurrence of each heartbeat, but such implementation required the reshaping of the recorded data into an appropriate format. The continuous nature of the reference ECG signal was not suitable for this approach, so it was instead transformed into a binary on/off signal. R peaks were detected using the Pan-Tompkins algorithm [36]. The surrogate reference binary signal ("binarized target signal") was synthesized in the form of a 400 ms pulse with 200 ms delay after each R wave in order to highlight the temporal relationship between the heart's electrical activity (ECG) and the resulting mechanical displacement (radar signal), as shown in Figure 4. The window of 400 ms width with a latency of 200 ms after the R wave was able to cover most of the mechanical rippling observed within the radar signals. This choice is in line with a study performed on 92 healthy subjects which showed that the duration range of the R−J interval was 203-290 ms [37]. Additionally, the selection of the value of the latency parameter (200 ms) for the binarized target signal was confirmed on a recorded dataset of seven randomly selected subjects from our study. Using the simplest model that was tested in the scope of this paper (a feed-forward artificial neural network with a single hidden layer containing 10 units), a series of training and testing with a range of delays and target widths was performed. The tested delay values ranged between 0 ms to 300 ms, and the target widths were tested in the range between 300 ms and 500 ms. These values were selected based on the visual inspection of the signals, and the values that were chosen as optimal (window of 400 ms width with a latency of 200 ms after the R wave) were those that yielded the best model accuracy in terms of the percentage of detected peaks.
The following stage in the signal preprocessing was the decimation of both the binary reference signal and the radar signal to the same sampling rate, which was a prerequisite related to the ANN employment, as each input state required a corresponding target output. To ensure the fast computation without a loss of information in the physiological range, 100 Hz was selected as the sampling rate during this computation.  where the values of "1" represent the presence of the mechanical heart displacement after the R wave, and the values of "0" represent its absence.
The input to the ANN consisted of a 200-sample long vector, corresponding to 1 s of recording, where in-phase and quadrature branches were concatenated so that the first 100 samples in the input corresponded to the in-phase signal and the second 100 samples corresponded to the quadrature signal. Such an input was matched to the value of the reference binary signal as the target output. The choice of 1 s signal memory was based on a small-scale test conducted on a subset of recorded signals in which the memory depth varied between 200 ms and 1 s. The lower bound was sufficient to partially incorporate the mechanical wave observed due to the heartbeats and the upper bound corresponded to the interval between the subsequent heartbeats at a normal heart rate of 60 bpm. These initial test results showed that the increases of memory depth from 200 ms to 500 ms significantly increased ANN performance, while further increases resulted in only incremental gains. Nevertheless, we decided to use the deepest memory as we were not concerned about computational complexity, while using longer intervals was expected to contain more physiologically relevant data that the ANN could learn from, and not overfit to potentially insignificant signal details.

The Detection Algorithm
Two main approaches were taken in the design of the heartbeat detector: the classical shallow feed-forward neural networks (FF ANN) and the nonlinear autoregressive exogenous model network (NARX) as representative of the feedback-based topologies. The NARX model took as an input the current sample in the radar data stream and the previous 100 samples, together with their calculated output. The NARX topology was tested for a single hidden layer with 10 and 20 neurons, NARX 10 1 and NARX 20 1 respectively. For the classic shallow FF ANN, 4 configurations were tested: (1) a single hidden layer with 10 neurons, FF 10 1, (2) a single hidden layer with 20 neurons, FF 20 1, (3) two hidden layers with 20 and 2 neurons respectively, FF 20 2, and (4) two hidden layers with 40 and 4 neurons respectively, FF 40 4. The output layer in all the cases consisted of a single neuron with a sigmoid activation function. Activation functions for the hidden layers also varied during this topology search, including a hyperbolic tangent and log-sigmoid and linear transfer functions. Furthermore, for the FF ANNs different loss functions were tested: the mean squared normalized error (MSE), the mean squared error with regularization (MSEREG) and the sum squared error (SSE).
As the FF ANN, containing a single hidden layer with 20 neurons, a hyperbolic tangent activation function, trained using Levenberg-Marquardt optimization and MSE loss function, outperformed all the other ANN topologies, the results presented in this paper were focused mainly on this FF 20 1 ANN topology. The whole flowchart, including the preprocessing and the detection algorithm based on the FF 20 1 ANN, is presented in Figure 5. The ANN input consisted of unprocessed in-phase and quadrature components of a Doppler radar signal (discretized and resampled I(t) and Q(t) signals from Equation (3) and Equation (4), respectively). In the FF ANN The input to the ANN consisted of a 200-sample long vector, corresponding to 1 s of recording, where in-phase and quadrature branches were concatenated so that the first 100 samples in the input corresponded to the in-phase signal and the second 100 samples corresponded to the quadrature signal. Such an input was matched to the value of the reference binary signal as the target output. The choice of 1 s signal memory was based on a small-scale test conducted on a subset of recorded signals in which the memory depth varied between 200 ms and 1 s. The lower bound was sufficient to partially incorporate the mechanical wave observed due to the heartbeats and the upper bound corresponded to the interval between the subsequent heartbeats at a normal heart rate of 60 bpm. These initial test results showed that the increases of memory depth from 200 ms to 500 ms significantly increased ANN performance, while further increases resulted in only incremental gains. Nevertheless, we decided to use the deepest memory as we were not concerned about computational complexity, while using longer intervals was expected to contain more physiologically relevant data that the ANN could learn from, and not overfit to potentially insignificant signal details.

The Detection Algorithm
Two main approaches were taken in the design of the heartbeat detector: the classical shallow feed-forward neural networks (FF ANN) and the nonlinear autoregressive exogenous model network (NARX) as representative of the feedback-based topologies. The NARX model took as an input the current sample in the radar data stream and the previous 100 samples, together with their calculated output. The NARX topology was tested for a single hidden layer with 10 and 20 neurons, NARX 10 1 and NARX 20 1 respectively. For the classic shallow FF ANN, 4 configurations were tested: (1) a single hidden layer with 10 neurons, FF 10 1, (2) a single hidden layer with 20 neurons, FF 20 1, (3) two hidden layers with 20 and 2 neurons respectively, FF 20 2, and (4) two hidden layers with 40 and 4 neurons respectively, FF 40 4. The output layer in all the cases consisted of a single neuron with a sigmoid activation function. Activation functions for the hidden layers also varied during this topology search, including a hyperbolic tangent and log-sigmoid and linear transfer functions. Furthermore, for the FF ANNs different loss functions were tested: the mean squared normalized error (MSE), the mean squared error with regularization (MSEREG) and the sum squared error (SSE).
As the FF ANN, containing a single hidden layer with 20 neurons, a hyperbolic tangent activation function, trained using Levenberg-Marquardt optimization and MSE loss function, outperformed all the other ANN topologies, the results presented in this paper were focused mainly on this FF 20 1 ANN topology. The whole flowchart, including the preprocessing and the detection algorithm based on the FF 20 1 ANN, is presented in Figure 5. The ANN input consisted of unprocessed in-phase and quadrature components of a Doppler radar signal (discretized and resampled I(t) and Q(t) signals from Equation (3) and Equation (4), respectively). In the FF ANN topology, each neuron in the hidden and output layers calculated a linear combination of its inputs in Equation (5): where j refers to the current layer, N is the number of inputs to the current layer, w is the weights and b is the corresponding biases. The output of each neuron was passed through the hyperbolic tangent function, with the exception of the output neuron which used a sigmoid function. The weights were calculated through a numerical optimization of the mean squared error loss function in Equation (6): where M is the number of data points, b i is the binarized target signal and a i is the network output. This was an iterative procedure that was set to run for a maximum of 1000 epochs or to stop early if the solution became sufficiently close to the minimum, that is, if the gradient became smaller than 10 −7 .
The training would also stop if the error on the portion of the data set aside for validation (30%) failed to decrease for 6 consecutive epochs.
To remove fast noisy changes, the sequence of outputs of the ANN calculated for each sample was smoothed. This stage of the detection algorithm was implemented as a moving average filter with a width of 10 consecutive ANN outputs.
The next stage of the algorithm was the peak detection subroutine which marked local maximums of the continuous probabilities output, imposing established constraints on the minimal distance between the consecutive peaks based on the known physiological range in rest 40-120 beats/min [38,39] and their prominence (detection amplitude). When the duration of the detected inter-pulse interval (IPI) was twice as large as the previously detected IPIs (within the established constraints), a beat was interpolated as having occurred at the point in time that was the arithmetic mean between the occurrences of the current and previously detected heartbeats. The detection amplitude was defined empirically for each ANN topology on a small test sample using the error of the number of detected heartbeats as a metric. The same detection amplitude was then used for all the subjects.
Sensors 2020, 20, x FOR PEER REVIEW 8 of 16 topology, each neuron in the hidden and output layers calculated a linear combination of its inputs in Equation (5): where j refers to the current layer, N is the number of inputs to the current layer, w is the weights and b is the corresponding biases. The output of each neuron was passed through the hyperbolic tangent function, with the exception of the output neuron which used a sigmoid function. The weights were calculated through a numerical optimization of the mean squared error loss function in Equation (6): where M is the number of data points, bi is the binarized target signal and ai is the network output. This was an iterative procedure that was set to run for a maximum of 1000 epochs or to stop early if the solution became sufficiently close to the minimum, that is, if the gradient became smaller than 10 −7 . The training would also stop if the error on the portion of the data set aside for validation (30%) failed to decrease for 6 consecutive epochs.
To remove fast noisy changes, the sequence of outputs of the ANN calculated for each sample was smoothed. This stage of the detection algorithm was implemented as a moving average filter with a width of 10 consecutive ANN outputs.
The next stage of the algorithm was the peak detection subroutine which marked local maximums of the continuous probabilities output, imposing established constraints on the minimal distance between the consecutive peaks based on the known physiological range in rest 40-120 beats/min [38][39] and their prominence (detection amplitude). When the duration of the detected inter-pulse interval (IPI) was twice as large as the previously detected IPIs (within the established constraints), a beat was interpolated as having occurred at the point in time that was the arithmetic mean between the occurrences of the current and previously detected heartbeats. The detection amplitude was defined empirically for each ANN topology on a small test sample using the error of the number of detected heartbeats as a metric. The same detection amplitude was then used for all the subjects.

Error Estimation and Statistical Analysis
The inter-pulse interval was calculated as the time elapsed between every two adjacent heartbeats. The classification error was determined through the percentage error in the total number of detected heartbeats and the error in the estimation of median IPIs. The similarities were also assessed between the distributions of the ANN-detected IPIs and those extracted from the reference ECG method.
The performance was evaluated using a three-fold cross-validation: the data were split into three equal subject groups (folds), each containing recordings acquired from 7 subjects, out of which two folds were used for training and one for testing. The training and testing were repeated three times for a different fold held out for evaluation (as shown in Figure 6).
The evaluation metrics were calculated over the set of predictions obtained on the three folds used in the test mode. A statistical analysis was performed on the results obtained via radar and those extracted from the reference ECG signal. The number of detected heartbeats and median IPI for all the recordings used as test were compared to those calculated from the reference signal. Apart from group evaluations, a statistical comparison was also performed on the level of the individual heart event detection within each of the 21-subject sets in the test mode. As in all of the statistical tests, one or more samples were found to be significantly non-normal (Lilliefors test with a 0.05 significance level),Wilcoxon signed rank test was used for statistical comparisons.
All processing was done using the Matlab2018b (The MathWorks, Inc., Natick, Massachusetts, United States).

Error Estimation and Statistical Analysis
The inter-pulse interval was calculated as the time elapsed between every two adjacent heartbeats. The classification error was determined through the percentage error in the total number of detected heartbeats and the error in the estimation of median IPIs. The similarities were also assessed between the distributions of the ANN-detected IPIs and those extracted from the reference ECG method.
The performance was evaluated using a three-fold cross-validation: the data were split into three equal subject groups (folds), each containing recordings acquired from 7 subjects, out of which two folds were used for training and one for testing. The training and testing were repeated three times for a different fold held out for evaluation (as shown in Figure 6).
The evaluation metrics were calculated over the set of predictions obtained on the three folds used in the test mode. A statistical analysis was performed on the results obtained via radar and those extracted from the reference ECG signal. The number of detected heartbeats and median IPI for all the recordings used as test were compared to those calculated from the reference signal. Apart from group evaluations, a statistical comparison was also performed on the level of the individual heart event detection within each of the 21-subject sets in the test mode. As in all of the statistical tests, one or more samples were found to be significantly non-normal (Lilliefors test with a 0.05 significance level),Wilcoxon signed rank test was used for statistical comparisons.
All processing was done using the Matlab2018b (The MathWorks, Inc., Natick, Massachusetts, United States).  Figure 6. Database organization for the purpose of the ANN training and the cross-validation. The dataset was divided into three equal subsets and two subsets were used for training, while the remaining subset was used for testing. This process was repeated for all the combinations of the subsets in the training set. The three training datasets comprised 3473, 3429 and 3386 heartbeats, while the three testing datasets comprised 1671, 1715 and 1758 heartbeats.

Results
The results obtained using six different tested ANN topologies are presented in Table 1. The smallest error in the number of detected beats (count error-CNT error) was achieved by the FF ANN with 20 neurons in the hidden layer (2%), which was notably better than any other tested topology (12.3% < CNT Error < 34.3%). This topology also had the smallest error when the median IPIs were compared. When comparing at the level of the individual IPIs, the number of subjects whose median was significantly different from the reference was also higher than the rest, but was still relatively low. The dataset was divided into three equal subsets and two subsets were used for training, while the remaining subset was used for testing. This process was repeated for all the combinations of the subsets in the training set. The three training datasets comprised 3473, 3429 and 3386 heartbeats, while the three testing datasets comprised 1671, 1715 and 1758 heartbeats.

Results
The results obtained using six different tested ANN topologies are presented in Table 1. The smallest error in the number of detected beats (count error-CNT error) was achieved by the FF ANN with 20 neurons in the hidden layer (2%), which was notably better than any other tested topology (12.3% < CNT Error < 34.3%). This topology also had the smallest error when the median IPIs were compared. When comparing at the level of the individual IPIs, the number of subjects whose median was significantly different from the reference was also higher than the rest, but was still relatively low.  Figure 7 shows an example of the output of the FF 20 1 ANN, the topology which showed the best performance, smoothed with a moving average window, with the prominent peaks detected and marked. This example shows the typical behavior and errors of the methodology.
Sensors 2020, 20, x FOR PEER REVIEW 10 of 16 1 Percentage error of the number of the detected heartbeats out of a total 5144 heartbeats. The negative values correspond to fewer detected heartbeats by ANN compared with the "ground truth"; 2 Inter-pulse interval (IPI) mean relative error; 3 Number of subjects with no significant difference between the medians of the estimated IPI-s and the IPI-s from the ECG reference, out of the total 21 subjects; 4 Feed forward ANN; 5 Nonlinear autoregressive exogenous model; The numbers in the topology descriptions stand for the number of units in each layer. Figure 7 shows an example of the output of the FF 20 1 ANN, the topology which showed the best performance, smoothed with a moving average window, with the prominent peaks detected and marked. This example shows the typical behavior and errors of the methodology. The error of the total number of detected heart events for the FF 20 1 ANN configuration was −2% (104 undetected beats out of a total of 5144 heartbeats extracted from the ECG). The statistical tests showed that there were no significant differences between these two methods in terms of the number of detected events (p>0.05). The difference in the medians of the IPIs calculated using the reference ECG and the FF 20 1 ANN was −2 samples (−20 ms) and was not statistically significant. When it comes to individual heart event detection, for 11 out of the 21 subjects the medians of the ANN detections were not significantly different from the reference.
For the specific ANN that showed the best performance with the recorded database, there were 20 neurons in the first hidden layer, resulting in ~4000 multiply-accumulate operations. In the implementation on an embedded platform (Teensy 4.0 programmed in Arduino IDE) this calculation took 66 µ s, which was more than enough for executing the proposed method in real time. The error of the total number of detected heart events for the FF 20 1 ANN configuration was −2% (104 undetected beats out of a total of 5144 heartbeats extracted from the ECG). The statistical tests showed that there were no significant differences between these two methods in terms of the number of detected events (p>0.05). The difference in the medians of the IPIs calculated using the reference ECG and the FF 20 1 ANN was −2 samples (−20 ms) and was not statistically significant. When it comes to individual heart event detection, for 11 out of the 21 subjects the medians of the ANN detections were not significantly different from the reference.
For the specific ANN that showed the best performance with the recorded database, there were 20 neurons in the first hidden layer, resulting in~4000 multiply-accumulate operations. In the implementation on an embedded platform (Teensy 4.0 programmed in Arduino IDE) this calculation took 66 µs, which was more than enough for executing the proposed method in real time. As the calculation was done with a 100 Hz rate, there were~10 ms in between the consecutive heart event estimations.

Discussion
The work presented in this paper is intended for the detection of individual heartbeats using a state-of-the-art mm-wave radar sensor. The radar sensor relied on the Doppler shift in the signal reflected from the objects within its field of view to detect any movements, even small ones. The real challenge of such a measurement was separating the influence from the different sources. Furthermore, smaller displacements, such as chest movements due to heart activity could be completely hidden or distorted by other physiological sources, such as breathing, talking or change of posture. Thus, the focus of this research was on the specific radar signal footprint in the time domain resulting from the heartbeats and the method by which to identify such small signal ripples. The computational tool selected for this task was shallow artificial neural networks for their high capacity for generalization, ability to be trained without prior knowledge of the signal properties and being computationally inexpensive, enabling easy implementation in an embedded or a high-level system. For the shallow ANNs that were tested, the dominant part of the computational complexity was related to the first hidden layer which performed multiplications of all the input signal samples with the weights ( Figure 5).
With the aim of tracking vital signs and the presence of vehicle drivers in a contactless manner, a database of radar signals, alongside ECG and respiration, was gathered from 21 participants sitting comfortably in a cushioned chair. Up to this date, the number of participants in published papers that presented traditional or machine learning approaches for heartbeat detection was less than 12 , but considering our goal to obtain the realistic results of using a trained ANN on the unseen data, we acquired a larger dataset than in the previous studies. This objective imposed the strict condition of the testing of the performance on radar signals that were completely new to the ANN. Care was taken to split the data in such a manner that the signals obtained from the same person could all be found either in training or in testing ( Figure 6). Although the subjects were instructed to sit quietly in the chair, some of them did substantially move their upper body and head but these recordings were nevertheless included in the database, bearing in mind the potential of neural networks to abstract over a wide variety of inputs and their robustness to noise.
After analyzing the performance of various ANNs it was somewhat surprising to find that a relatively basic ANN outperformed more complex networks. The more complex networks were able to pick up minute details in the signals and use them to model the training set more closely at the cost of loss of generalization, while the reduced single layer network did not have such capacity to overfit. One of the conditions that favored the simpler network architectures could still be the limited database used for the training. The acquisition of yet larger amounts of data in the future could make space for more complex network architectures, such as sequential deep learning models, to further improve the results. This would, however, come at the cost of higher processing requirements.
With respect to the main idea of the ANN-based method, which was the identification of individual heartbeats, the most important metric of the ANN testing was the number of detections. Due to the lack of similar metrics in other scientific publications, the result presented in this paper of 2% undetected heartbeats could not be put into perspective with other approaches.
As another performance evaluation metric, we calculated the time between the consecutive detections. This metric was directly compared with the IPIs from the reference ECG signal and it was shown that the difference between their medians was also not statistically significant, confirming that the method could be used to accurately track averaged heart rates in longer periods. These results were also comparable with the findings presented in [22] where the relative error between the averaged radar and ECG rates was between 0.55% and 1.97%. The method proposed by the authors [24] outperformed all of the previously published methods based on the CW Doppler radar technology regarding the mean relative error (2.07% on the dataset of ten subjects, algorithm latency~2.5 s). The ANN-based approach presented here brings multiple advantages over other methods presented in relevant papers. The ANN method used raw radar signals without the need for any preprocessing or calibration, which can require the implementation of complex algorithms, nor prior knowledge of the signal properties and process models. The implementation of the shallow ANN within a microprocessor system was quite computationally inexpensive as it comprises only a small number of basic arithmetic operations (multiplications and additions for neural layers, while activation functions can be obtained using look-up tables). In comparison with the FFT-based approaches and the heavy use of digital filters, this feature presented a significant improvement in the computational complexity. This method is also beneficial in applications that require the fast detection of human presence, as the latency was below the width of the processing window (1 s), while to our knowledge, the shortest latency reported in relevant publications is 2.5 s [24]. This estimate of 1 s was made for the worst-case scenario of system power-up which requires the initial filling of the ANN input buffer. Once the system is up and running, with a human entering its field of view, this latency is expected to be significantly lower. Namely, due to the training procedure, the heart detections could occur in the 400 ms window that contains the latest signal samples. The smoothing by moving average filter was done on only 10 ANN outputs, which brought negligible delay. The last step of the detection chain which was peak detection required only a few extra samples to identify a local maximum.
To summarize the above contributions of our work, we listed all the relevant previously published methods for heartbeat extraction in normal breathing conditions based on the CW Doppler radar technology in Table 2. This work is the first approach that used artificial neural networks for the heartbeat detection based on the CW Doppler radar technology. The method was not person-specific (as opposed to the supervised machine learning approach applied in [32]) and we performed a more realistic scenario on 21 subjects where all the results were presented on "unseen data". This was a fast and reliable calibration-free method, with low percentage of failed heartbeat detections and with the latency that outperformed all relevant previously published non-person-specific approaches. As shown in Table 1, the ANN implemented in this paper showed weakness in the estimation of the IPI with high accuracies. Consequently, the utilization of the methodology for the evaluation of heart rate variability (HRV) parameters is limited. The goal of the present study was to develop a method for the fast detection of individual heartbeats; thus, all of the ANN optimizations were governed by the error of the detected heartbeats count. The HRV-related errors were not included in any of the methodology steps; therefore, it would be unrealistic to expect high accuracies compared with the methods specifically designed for the HRV estimation. In addition, the choice of the targets has a significant influence on the HRV parameters estimation. As aforementioned, the main goal was related to the robust detection of individual heartbeats, so the target window was made relatively wide (400 ms) to enable the identification of any signal waveform that was related to the mechanical displacements due to heartbeats. This wide window, except for providing more room for heartbeat detection, means that if a heartbeat was detected in any part of the target window, it would be considered as successful during the training phase. In an example, a heartbeat could be detected with the highest probability at the beginning of the target window, while the following heartbeat could be detected with the highest probability at the end, without any penalty related to the ANN performance during training. On the other hand, this kind of detection would result in 400 ms error in estimating the beat-to-beat interval.
In future work, the focus will be on increasing the HRV estimation accuracy. To achieve this goal, the error of the R-R interval estimation will be introduced as an additional loss function during the training procedure. This would inherently force the ANN to bind the maximal detection probability with a specific part of the mechanical oscillation. However, this approach would require some topological ANN changes, such as a feedback loop with the previously detected IPI and an extension of the memory depth.
In this study, there were several limitations that should be noted. All the subjects that participated in the study were young and physically fit adults. During the selected time period they were mostly sitting calmly and were instructed to restrain from prominent body movements. In this study, the chair was placed in a room with no moving objects. In a scenario which involves nonstationary objects, or recording within confined environments, such as inside a car, the performance of the system could deteriorate due to clutter and multipath effects. Given the continuous nature of the unmodulated signal, Doppler radar has no exact information on the absolute position of the observed person. Tracking vital signs of multiple people simultaneously would most likely have to be performed using modulated signals that can resolve observed targets in a space.
The usage of a high-carrier frequency provided small dimensions of the radar system, since small antennas were used, unlike for lower frequency radar systems whose antennas usually need to be larger. An additional advantage of the high-frequency radar was its sensitivity to the small chest displacements that come from heartbeats. Since the method in this paper was using in-phase and quadrature radar signals directly, the radar sensitivity was of crucial importance for the heartbeat detection. The usage of even higher carrier frequency could improve the results, since the sensitivity to small displacements would be larger.
The radar used for this study had a relatively broad field of view, which made it susceptible to picking up clutter from the surroundings. Additionally, in applications that would require a larger distance between the radar and a patient, this broad field of view would pick up even more surrounding movements. Using an antenna with a narrower beam could improve the performance of the system in the future. It is expected that the signal-to-noise ratio (SNR) of the received signals would be smaller if the radar was placed at larger distances. This means that further tests need to be done in order to determine the influence of the SNR on the detection accuracy. Future work will also include tests of the ANN performance in cases when subjects perform natural movements, to estimate the reliability of the sensor and the method in an environment saturated with motion originating from different sources.

Conclusions
In this paper, we presented a simple and efficient contactless method for detecting individual heartbeats. The method is based on the CW Doppler radar directly coupled with an ANN stage to detect small signal ripples resulting from sub-millimeter chest movements due to heartbeats. The method has lower latency, lower computational complexity and an easier implementation on an embedded platform when compared to the traditional methodologies described in the literature, while still achieving a good heart rate estimation accuracy. With the promising results presented in this paper, we could foresee the application of the system in uses that require real-time operation, such as human detection in an industrial, automotive or clinical environment.

Conflicts of Interest:
The authors declare no conflict of interest.