Automated Detection of Hypertension Using Physiological Signals: A Review

Arterial hypertension (HT) is a chronic condition of elevated blood pressure (BP), which may increase the incidence of cardiovascular disease, stroke, kidney failure and mortality. If HT is diagnosed early, effective treatment can control the BP and avert adverse outcomes. Physiological signals like electrocardiography (ECG), photoplethysmography (PPG), heart rate variability (HRV), and ballistocardiography (BCG) can be used to monitor health status but are not directly correlated with BP measurements. The manual detection of HT using these physiological signals is time-consuming and prone to human error. Hence, many computer-aided diagnosis systems have been developed. This paper is a systematic review of studies conducted on the automated detection of HT using ECG, HRV, PPG and BCG signals. In this review, we identified 23 studies out of 250 screened papers that fulfilled our eligibility criteria. Details of the study methods, physiological signal studied, database used, various nonlinear techniques employed, feature extraction, and diagnostic performance parameters are discussed. Machine learning and deep learning methods based on ECG and HRV signals have yielded the best performance and can be used for the development of computer-aided diagnosis of HT. This work provides insights that may be useful for the development of wearables for continuous cuffless remote monitoring of BP based on ECG and HRV signals.


Introduction
In adults, hypertension (HT) is diagnosed when repeated office measurement of systolic blood pressure (SBP) is ≥140 mmHg, or diastolic blood pressure (DBP) is ≥90 mmHg [1]. HT can be classified into different categories based on the office measurement (Table 1) [1]. HT increases the force exerted by the blood against the inner walls of the arteries, which transport oxygen-rich blood pumped out of the heart to the rest of the body [2]. As such, chronic HT can inflict damage to various vital organs of the body, such as lung, brain, heart, and kidneys [2]. The World Health Organization estimates that nearly 1.3 billion people suffered from HT in 2015 globally, and less than 20% received management [2]. HT is largely asymptomatic, but symptoms can sometimes occur, including headaches, panic attacks and dizziness. The electrocardiogram (ECG) records the electrical potentials on the body surface that originate from the heart, and the signals can provide information on the rhythm as well as the structure and function of the heart [3][4][5][6]. Using advanced analysis, the ECG signals in HT subjects can be correlated to BP measurements and can even discriminate higher clinical risk [7][8][9][10]. In HT, the heart works against a greater force and over time becomes hypertrophied, which alters the ECG. Figure 1 is a graphical depiction of a typical normal ECG waveform, which comprises the P wave, QRS complex and T wave, representing atrial depolarization, ventricular depolarization and ventricular repolarization, respectively, as well as standard ECG intervals, including the RR interval, PR interval, QT interval and the lengths of the PR and ST segments. In [7], machine learning (ML) [11,12] was used to find associations between SBP and DBP, respectively, and changes in the ECG over two intervals: from the peak of the R wave to the middle of the T wave, and from the middle of the T wave to the peak of the following R wave, as indicated in Figure 1.

HRV Signal
The temporal variation of sequential heartbeats (RR intervals) is termed HRV [13]. From the ECG signal, R-peaks are first extracted and then HRV is deduced computationally from the differences in RR intervals (Figure 1). HRV reflects the activity of the autonomic nervous system and provides a window into the cardiac sympathetic and parasympathetic activities, which have significant physiological impact on heart rate rhythm and contractile function [13]. HRV measurement is non-intrusive and easy to perform, and the results are reproducible. Importantly, it confers both diagnostic and prognostic implications for wellness and cardiac disease [14,15]. High HRV is associated with normal subjects and reduced HRV may be pathological. HRV can be analyzed over either long or short durations [14]. Long-duration analysis encompasses activity throughout the day and night (24-h analysis), whereas short-duration analysis uses only five minutes of HRV data. In HT patients, HRV is affected by the presence of cardiovascular risk factors. In depressed patients with HT, HRV is associated with vascular, cardiac and renal target organ damage [13,16]. The long-term (24-h) HRV is useful in the diagnosis of severe HT conditions. Increased sympathetic activity saturates the ability to modulate heart rate, hence HRV is depressed. Severely depressed or high-risk HRV is identified when the standard deviation of NN intervals is less than 50 to 70 ms and the HRV triangular index is less than 20 units. Similarly, the NN interval duration is 7.8 ms [17]. In summary, HRV is a simple non-invasive method which can be used to assess the cardiovascular system [18].

Photoplethysmography (PPG) Signal
PPG uses a low-intensity infrared (IR) light sensor to detect the amount of light absorbed by or reflected from tissues supplied by the blood vessels. It produces a photoelectric signal, either transmissive or reflective, which reflects the pulsatile blood volume in the area covered by the sensors [19,20]. The PPG signal contains information about the arterial and venous circulatory system [19,20]. The PPG is correlated with, and has been applied to the measurement of, heart rate, BP, and blood oxygenation, thereby providing clinically useful information for physiological monitoring.

Ballistocardiogram (BCG) Signal
BCG measures whole-body motion in terms of displacement, velocity, and acceleration in response to the cyclical ejection of blood from the heart [21]. It reflects the sum of factors linked to heart and blood vessel function, and is used to diagnose various cardiovascular diseases [21].
In this paper, we reviewed computer-aided diagnosis systems developed for arterial HT using PPG, BCG, ECG and HRV signals. To the best of our knowledge, this is the first review to provide unique ranges of nonlinear features for the healthy control (HC), low-risk hypertension (LRHT), and high-risk hypertension (HRHT) ECG classes.
We further excluded non-English articles and works that were not explicitly designed for diagnosis of HT.
Finally, 23 articles were selected for this review. Figure 2 shows the flow diagram of article selection, where n is the number of articles.

Databases
The ECG, HRV, BCG, and PPG databases used to develop automated HT detection systems are summarized in Tables 2-4. The relative percentages of the different signals used are: ECG = 30.43%, PPG = 17.39%, BCG = 8.69%, and HRV = 43.47%. The most common databases are based on HRV and ECG signals.
One group studied 113 HRV signals [28], and later an expanded 185-sample HRV dataset [29], derived from 7-min Lead II ECG recordings sampled at 500 Hz collected at the same center.
HRV signals were obtained from 7- to 9-hour ECG records (sampling frequency 200 Hz) from 24 subjects in [30]. In [18], a Kubios HRV analyzer was used to derive HRV from 5-min Lead II ECG recordings.
Seventy-one HRV signals derived from 300-s ECG recordings were studied in [31]. Ten-minute 12-lead ECG signals sampled at 200 Hz in 97 subjects were used to derive the HRV dataset in [32].

BCG-Derived HRV Signals Database
BCG recordings sampled at 100 Hz were used to derive HRV signals for 18 subjects in [33]. In [21], HRV signals were extracted from 67 normal and 61 HT BCG signals sampled at 100 Hz with 16-bit resolution.

PPG Signal Database
In two different studies by the same group, 120-second PPG recordings (sampling frequency 125 Hz) from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) database were used [34,35]. In [20], the same authors studied 124 180-second PPG signal recordings (sampling frequency 1 kHz) collected at the same hospital. In [15], HRV signals were extracted from 43 PPG recordings sampled at 64 Hz with 8-bit resolution. Twenty PPG signals encompassing 1536 hours of data in normal and HT subjects were used to derive HRV in [36].

Normalization
Rajput et al. [8,23] used the Z-score normalization method for amplitude scaling of the ECG signal. The Z-score is the difference between the actual ECG signal and its mean, divided by the standard deviation of the ECG signal. Similarly, Liang et al. [20,34,35] and Liu et al. [21] used Z-score normalization to normalize the amplitudes of PPG and BCG signals, respectively.
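Z-score normalization as described above can be sketched in a few lines. This is a minimal illustration, not the code of any cited study: the function name `zscore`, the use of the population standard deviation, and the toy amplitudes are all assumptions made here.

```python
from statistics import fmean, pstdev

def zscore(signal):
    """Return a z-score-normalized copy of a 1-D signal (zero mean, unit SD)."""
    mu = fmean(signal)
    sigma = pstdev(signal)  # population SD; an assumption, a sample SD also works
    if sigma == 0:
        raise ValueError("a constant signal cannot be z-score normalized")
    return [(x - mu) / sigma for x in signal]

# Toy amplitudes (made up for illustration, not data from any cited study).
ecg = [0.1, 0.4, 1.2, 0.3, -0.2, 0.0]
normalized = zscore(ecg)
```

After normalization, signals recorded with different gains share a common amplitude scale, which keeps amplitude-dependent features comparable across recordings.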

Segmentation
Segmentation is used to convert long-duration (e.g., 24-h) signals into short-duration ones requiring shorter computation time for downstream analysis. Soh et al. [2] segmented the ECG signals of 139 HT subjects into 69,500 segments, each of 2000 samples.
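Fixed-length segmentation can be sketched as follows. The helper `segment` and the choice to drop an incomplete trailing window are assumptions for illustration; the cited studies do not state how leftover samples were handled. For scale, 69,500 segments from 139 subjects corresponds to 500 segments per subject on average.

```python
def segment(signal, length):
    """Split a sequence into consecutive, non-overlapping windows of `length`
    samples. An incomplete trailing window is dropped (an assumption)."""
    n_full = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(n_full)]

record = list(range(5000))        # stand-in samples, not real ECG data
segments = segment(record, 2000)  # two full windows; the last 1000 samples are dropped
```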

Signal Filtering
Low- and high-frequency noise generated during the recording of the ECG signal may affect its interpretation [39]. ECG signal noise can be induced by electrode contact noise, electromyogram, channel noise, baseline wander, and power line interference [39]. It is important to remove noise from the ECG signal to obtain higher classification performance, and various methods are available for this. Ni et al. [30] used Savitzky-Golay filtering to remove noise from the digitized ECG signal, while Soh et al. [2] used the discrete wavelet transform (DWT). For the removal of noise from PPG signals, Liang et al. [20,34,35] applied a Chebyshev II band-pass filter with a frequency range of 0.5-10 Hz.
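To illustrate the idea behind Savitzky-Golay smoothing, the sketch below applies the classical 5-point quadratic kernel with weights (-3, 12, 17, 12, -3)/35. The window length, the polynomial order, and the edge handling are assumptions chosen for brevity; the settings used in [30] are not given here.

```python
def savgol5(x):
    """5-point quadratic Savitzky-Golay smoothing.
    The first and last two samples are left unchanged (an assumption)."""
    coeffs = (-3, 12, 17, 12, -3)  # classical 5-point weights, to be divided by 35
    y = list(x)
    for i in range(2, len(x) - 2):
        y[i] = sum(c * x[i + j - 2] for j, c in enumerate(coeffs)) / 35
    return y

noisy = [0.0, 1.2, 3.9, 9.1, 15.8, 25.3, 35.9]  # made-up samples
smoothed = savgol5(noisy)
```

Because each output sample is the value of a locally fitted quadratic, a noise-free quadratic input passes through unchanged, which is why this family of filters tends to preserve peak shapes better than a plain moving average.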

Re-Sampling
Poddar et al. [28] employed BIOPAC 4.0 software to extract the RR tachographs from ECG signals. The tachographs contained samples that were unevenly placed due to beat-to-beat variation of RR intervals. Re-sampling at a frequency of 4 Hz was performed to preserve the uniformity across the entire length of tachograph data.
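A minimal sketch of re-sampling an unevenly spaced RR tachogram onto a uniform 4 Hz grid is given below. Linear interpolation is an assumption made here (the interpolation method used in [28] is not stated), and `resample_rr` is a hypothetical helper name.

```python
from itertools import accumulate

def resample_rr(rr_s, fs=4.0):
    """Linearly interpolate RR intervals (seconds) onto a uniform grid of fs Hz.
    Each interval is placed at the time of the beat that ends it.
    Requires at least two intervals. Returns (times, values)."""
    beat_times = list(accumulate(rr_s))
    n_out = int((beat_times[-1] - beat_times[0]) * fs) + 1
    times, values = [], []
    j = 0
    for k in range(n_out):
        t = beat_times[0] + k / fs
        # advance to the pair of beats that brackets t
        while j + 2 < len(beat_times) and beat_times[j + 1] < t:
            j += 1
        t0, t1 = beat_times[j], beat_times[j + 1]
        w = (t - t0) / (t1 - t0)
        values.append(rr_s[j] + w * (rr_s[j + 1] - rr_s[j]))
        times.append(t)
    return times, values

rr = [0.8, 0.9, 0.85, 0.8, 0.95]  # toy RR intervals in seconds
times, values = resample_rr(rr)
```

The uniformly sampled series can then be passed to spectral estimators, which assume evenly spaced samples.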

Continuous Wavelet Transform (CWT) Used for PPG Signal Transformation
The PPG signals are converted into two-dimensional images called scalograms and fed as input to the convolutional neural network (CNN) for automated detection of HT PPG signals [31,34].

HRV Time-Domain Parameters
HRV parameters are extracted from RR intervals and can be categorized into short-term variation (STV) and long-term variation (LTV) in the time domain [14]. LTV exhibits slower, and STV faster, fluctuation. RR intervals have the following intrinsic features: intervals between normal heart beats of the ECG signal (NN); standard error of NN intervals (SENN); standard deviation of differences between adjacent NN intervals (SDSD); root mean square of successive differences between NN intervals (RMSSD); the number of successive NN intervals that differ from each other by >50 ms over the whole recording (NN50); and the percentage of successive NN intervals that differ by >50 ms over the whole recording (pNN50%) [14].
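The time-domain HRV parameters listed above can be computed directly from a list of NN intervals. The sketch below is illustrative only: `hrv_time_domain` is a hypothetical helper, SENN is taken as the sample standard deviation divided by the square root of the number of intervals (an assumption), and the NN values are made up.

```python
from math import sqrt
from statistics import fmean, stdev

def hrv_time_domain(nn_ms):
    """Compute basic time-domain HRV parameters from NN intervals in ms."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]      # successive differences
    nn50 = sum(1 for d in diffs if abs(d) > 50)            # pairs differing by >50 ms
    return {
        "SENN": stdev(nn_ms) / sqrt(len(nn_ms)),           # assumed definition
        "SDSD": stdev(diffs),
        "RMSSD": sqrt(fmean([d * d for d in diffs])),
        "NN50": nn50,
        "pNN50": 100.0 * nn50 / len(diffs),
    }

nn = [800, 860, 790, 850, 845]   # toy NN intervals in ms
params = hrv_time_domain(nn)
```

For the toy series, the successive differences are 60, -70, 60 and -5 ms, so three of the four exceed 50 ms in magnitude.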

HRV Frequency-Domain Parameters
Fast Fourier transform (FFT) decomposes the RR intervals into their frequency constituents, which can be classified as very low frequency (VLF), low frequency (LF), and high frequency (HF). Total power (TP) is a short-term estimate of the total power of the power spectral density in the range of frequencies between 0 and 0.4 Hz. TP mainly reflects the level of autonomic nervous system (ANS) activity, both parasympathetic (PNS) and sympathetic (SNS), as well as humoral (hormonal) effects and circadian rhythm. Generally, a decrease in TP is observed in individuals under chronic stress or with disease [14,48]. HF (0.15 to 0.4 Hz) and LF (0.04 to 0.15 Hz) reflect the modulatory effects of parasympathetic and sympathetic activity, respectively, on the heart rate. Accordingly, the ratio of LF to HF represents the sympathovagal balance. VLF (0 to 0.04 Hz) reflects the vascular response associated with mechanisms caused by negative feelings [48,49].
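The band definitions above can be turned into a small band-power computation. The sketch below uses a direct discrete Fourier transform instead of an FFT to stay dependency-free (same spectrum, slower), works on a uniformly re-sampled RR series, and uses an unnormalized periodogram since only relative powers are compared. The helper name, the treatment of band edges, and the synthetic test signal are assumptions.

```python
import cmath
import math

BANDS = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}  # Hz

def band_powers(x, fs):
    """Sum periodogram power of a uniformly sampled series in each HRV band."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]          # remove the DC component
    power = dict.fromkeys(BANDS, 0.0)
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        X = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(centered))
        p = abs(X) ** 2 / n
        for name, (lo, hi) in BANDS.items():
            if lo < f <= hi:                  # upper edge inclusive (an assumption)
                power[name] += p
    power["LF/HF"] = power["LF"] / power["HF"] if power["HF"] > 0 else math.inf
    return power

# Synthetic RR series: a 0.1 Hz (LF) and a weaker 0.25 Hz (HF) oscillation.
fs, n = 4.0, 240
x = [0.8 + 0.05 * math.sin(2 * math.pi * 0.1 * i / fs)
         + 0.01 * math.sin(2 * math.pi * 0.25 * i / fs) for i in range(n)]
powers = band_powers(x, fs)
```

In this synthetic case the LF component has five times the amplitude of the HF component, so the LF/HF power ratio is close to 25.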

Features of BCG Fluctuation
Cardiac mechanical operations modulate the fluctuation pattern of the BCG signal, and cardiovascular diseases such as HT can be identified by evaluating this fluctuation pattern [21]. As the BCG signal commonly includes noise from body motion and from the signal acquisition system itself, it is important that the BCG fluctuation features acquired be noise-insensitive. Four noise-insensitive features are used: zero-crossing rate (ZCR); average cumulative amplitude change (ACAC); average number of extreme points (ANEP); and average signal turns count (ASTC) [21].
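Of the four fluctuation features, the zero-crossing rate is the simplest to illustrate. The sketch below is an assumption-laden example: the signal is mean-centered before counting sign changes, and the rate is expressed per sample transition. The exact definitions used in [21], including those of ACAC, ANEP and ASTC, are not reproduced here.

```python
def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose mean-centered values change sign."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    crossings = sum(1 for a, b in zip(centered, centered[1:])
                    if (a < 0) != (b < 0))
    return crossings / (len(x) - 1)

fast = zero_crossing_rate([1, -1, 1, -1])   # alternates on every sample
slow = zero_crossing_rate([1, 2, 3, 4])     # crosses its mean only once
```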

Feature Selection, Reduction, and Ranking
Feature selection helps to select the most relevant features, which can be used to distinguish between normal and HT classes. Various feature reduction techniques employed include principal component analysis (PCA) [13,31], marginal Fisher analysis (MFA), linear discriminant analysis (LDA), temporal pyramid pooling method (TPPM) [13], and independent component analysis.
The features are presented to the given classifier one by one, in order of their ranking, until the highest performance is obtained.
Student's t-test [2,8,13,16,21], Bhattacharyya, Wilcoxon, and receiver operating characteristics (ROC) are the most commonly used feature-ranking techniques. Multivariate analysis of variance (MANOVA) and the chi-square test [16] are utilized as feature selection methods for choosing the highly discriminant features for the classifier.
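Ranking features by a two-sample Student's t statistic can be sketched as below. The pooled-variance form of the statistic is used; the feature names and values are invented for illustration, and `rank_features` is a hypothetical helper rather than the procedure of any cited study.

```python
from math import sqrt
from statistics import fmean, variance

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / sqrt(pooled * (1 / na + 1 / nb))

def rank_features(normal, ht):
    """Return feature names sorted by decreasing |t| between the two classes."""
    scores = {name: abs(student_t(normal[name], ht[name])) for name in normal}
    return sorted(scores, key=scores.get, reverse=True)

# Made-up feature values per class (hypothetical, for illustration only).
normal = {"f_weak": [1.0, 2.0, 3.0], "f_strong": [1.0, 2.0, 3.0]}
ht = {"f_weak": [2.0, 3.0, 4.0], "f_strong": [7.0, 8.0, 9.0]}
ranking = rank_features(normal, ht)
```

The ranked list can then be fed to a classifier one feature at a time, stopping when performance no longer improves.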

Computer-Aided Diagnosis Methods
In Figure 3, an outline of methods based on artificial intelligence (AI) is presented. About 82.6%, 13.04% and 4.34% of authors used ML, the statistical software SPSS, and other traditional methods, respectively, to detect HT ECG signals automatically. ML-based techniques are robust and accurate.

Hypertension Diagnosis Index (HDI) [8]
Rajput et al. [8] developed an HDI using selected features to accurately discriminate low-risk HT (LRHT) and high-risk HT (HRHT) with a single numeric value.
An orthogonal wavelet filter bank was used for five-level wavelet decomposition. Signal fractal dimension (SLFD) and log energy (LOGE) features were then computed from the decomposed coefficients. All 12 feature sub-bands of the ECG signal were ranked by the Student's t-test ranking method. High-ranking feature sub-bands (SUB) were used to develop the HDI (Equation (1)). In Equation (1), various composite features were experimentally merged to achieve the optimal difference between the two groups.

Proposed Work after Understanding Review Studies
An outline of the proposed work is shown in Figure 4. In this work, we used the public databases used by the authors in [23,24]. A total of 3694 ECG segments were obtained from the SHAREE and PTB databases [23,24].

Sample Entropy (SeEn)
It is a measure of uncertainty in the non-linear signal [2,23]. SeEn can be computed as:
$$\mathrm{SeEn} = -\ln\left(\frac{a}{b}\right)$$
where a is the number of matching template vectors of length l + 1, and b is the number of matching template vectors of length l, with l = 2 [23].
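Sample entropy is commonly computed as the negative logarithm of the ratio between the number of matching template pairs of length l + 1 and of length l. The sketch below follows that common definition; the tolerance of 0.2 times the signal standard deviation, the Chebyshev distance, and the inclusive comparison are assumptions not taken from the text.

```python
import random
from math import log
from statistics import pstdev

def sample_entropy(x, l=2, r_factor=0.2):
    """Sample entropy -ln(a / b) with template length l and tolerance r_factor * SD."""
    r = r_factor * pstdev(x)
    n = len(x)

    def count_matches(length):
        # The same number of templates (n - l) is used for both lengths.
        templates = [x[i:i + length] for i in range(n - l)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):   # self-matches are excluded
                if max(abs(u - v) for u, v in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(l)
    a = count_matches(l + 1)
    if a == 0 or b == 0:
        return float("inf")   # undefined for too few matches
    return -log(a / b)

periodic = [0, 1] * 50                       # perfectly regular toy signal
rng = random.Random(0)
noisy = [rng.random() for _ in range(100)]   # irregular toy signal
se_periodic = sample_entropy(periodic)
se_noisy = sample_entropy(noisy)
```

A perfectly repeating signal yields a value of zero, while an irregular signal yields a larger value.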

Approximate Entropy (ApEn)
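It is an approximation technique which is useful for measuring homogeneity and complexity in time-series data containing noise, because of its stability in distinguishing closely linked stochastic processes. It is effective for short data intervals in differentiating between chaotic and noisy time-series data [2,27,28,30]. For finite length N, ApEn is computed as:
$$\mathrm{ApEn}(m, r, N) = \Phi^{m}(r) - \Phi^{m+1}(r), \qquad \Phi^{m}(r) = \frac{1}{N-m+1}\sum_{i=1}^{N-m+1} \ln C_i^{m}(r)$$
where $C_i^{m}(r)$ is the fraction of vectors x(j) lying within tolerance r of x(i), x(i) is a vector of m consecutive samples, and m = 2. The tolerance value r = 0.20 lies within the range of 0.15 to 0.25 [28].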
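A compact sketch of approximate entropy with m = 2 and r = 0.20 is shown below. Scaling r by the signal standard deviation and using the Chebyshev distance are assumptions following common practice; the function name is hypothetical.

```python
import random
from math import log
from statistics import pstdev

def approximate_entropy(x, m=2, r_factor=0.20):
    """Approximate entropy: phi(m) - phi(m + 1), self-matches included."""
    r = r_factor * pstdev(x)
    n = len(x)

    def phi(length):
        count = n - length + 1
        templates = [x[i:i + length] for i in range(count)]
        total = 0.0
        for t in templates:
            # c >= 1 because every template matches itself
            c = sum(1 for u in templates
                    if max(abs(p - q) for p, q in zip(t, u)) <= r)
            total += log(c / count)
        return total / count

    return phi(m) - phi(m + 1)

periodic = [0, 1] * 50
rng = random.Random(0)
noisy = [rng.random() for _ in range(100)]
ap_periodic = approximate_entropy(periodic)
ap_noisy = approximate_entropy(noisy)
```

Unlike sample entropy, each template is counted as matching itself, which keeps the logarithm defined but biases the estimate for short records.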

Renyi Entropy (ReEn)
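It is used to measure the spectral intricacy of time-series signals and is a generalization of the Shannon entropy [2,13]. ReEn is computed as:
$$\mathrm{ReEn}_{\alpha}(K) = \frac{1}{1-\alpha}\log\left(\sum_{j} p_j^{\alpha}\right)$$
where K is a discrete random variable, α is the order of ReEn (α ≥ 2), and p_j denotes the total spectral power [24,50].

Wavelet Entropy (WeEn)
WeEn is computed as:
$$\mathrm{WeEn} = -\sum_{i} P_i \ln P_i$$
where i is the resolution level and P_i is the probability with respect to level i [52].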
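Both entropies in this part of the text operate on a probability distribution, for example normalized spectral power or relative energy per resolution level. The sketch below evaluates the Renyi entropy of order α and the Shannon-type sum on toy distributions; the function names and the input values are illustrative assumptions.

```python
from math import log

def renyi_entropy(p, alpha=2):
    """Renyi entropy of order alpha (alpha != 1) of a probability distribution."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon limit; use shannon_entropy")
    return log(sum(pj ** alpha for pj in p)) / (1 - alpha)

def shannon_entropy(p):
    """Shannon-type entropy -sum(P_i ln P_i); zero-probability terms are skipped."""
    return -sum(pj * log(pj) for pj in p if pj > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # toy distribution over four levels
peaked = [0.7, 0.1, 0.1, 0.1]        # toy distribution concentrated in one level
```

For a uniform distribution both measures equal the logarithm of the number of levels; a concentrated distribution gives lower values, with the order-2 Renyi entropy never exceeding the Shannon entropy.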

Log Energy (LOGE)
It is the logarithm of the energy of the ECG signal. The mathematical expression of LOGE is given below [53]:
$$\mathrm{LOGE}_r = \sum_{n} \log\left(m_r(n)^{2}\right)$$
where LOGE_r is the log energy of the rth time series, and m_r(n) is the amplitude of the nth sample of the rth time series.

Signal Fractal Dimension (SLFD)
It is used to measure similarity and complexity in physiological signals. It is the ratio of fractal pattern with respect to which it is measured [53]. It is given by:

Hurst Exponent (HE)
It is a measure of repeatability [54]. The general equation of HE is as follows:
$$\mathrm{HE} = \frac{\log(A/B)}{\log(Y)}$$
Here, Y represents the length of the time-series data, while A/B denotes the re-scaled range value. A is the difference between the maximum and minimum deviation from the mean, and B is the standard deviation.
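A single-window rescaled-range estimate of the Hurst exponent can be sketched as follows. Taking the range over the cumulative deviations from the mean is the usual rescaled-range convention and is an assumption here, as is the use of the population standard deviation; this one-window estimate is coarse and serves only to show the computation.

```python
from itertools import accumulate
from math import log
from statistics import fmean, pstdev

def hurst_rs(x):
    """Hurst exponent estimate log(A / B) / log(Y) from one rescaled-range window."""
    mean = fmean(x)
    cumulative = list(accumulate(v - mean for v in x))   # cumulative deviations
    a = max(cumulative) - min(cumulative)                # range A
    b = pstdev(x)                                        # standard deviation B
    return log(a / b) / log(len(x))                      # Y = len(x)

trending = hurst_rs(list(range(100)))      # steadily increasing toy series
alternating = hurst_rs([1, -1] * 50)       # sign flips on every sample
```

A persistent, trending series yields a larger exponent than a series that reverses direction at every step.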

Largest Lyapunov Exponent (LLE)
LLE is used to identify chaos in the physiological signal [55].
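(c) A weighted center feature of bispectrum (WCOB) is described as [58]:
$$\mathrm{WCOB}_x = \frac{\sum_{\Omega} k\,B(k,l)}{\sum_{\Omega} B(k,l)}, \qquad \mathrm{WCOB}_y = \frac{\sum_{\Omega} l\,B(k,l)}{\sum_{\Omega} B(k,l)}$$
Here, B(k, l) is the bispectrum, and k and l represent the frequency bin indices in the principal region of the bispectrum plot [58]. Similarly, some moment-related features are given below:
(d) Bispectrum logarithmic amplitude feature [58]:
$$H_1 = \sum_{\Omega} \log\left(|B(f_1, f_2)|\right)$$
(e) Bispectrum sum of logarithmic amplitude of diagonal elements feature [58]:
$$H_2 = \sum_{\Omega} \log\left(|B(f_k, f_k)|\right)$$
(f) Bispectrum first-order spectral moment of amplitude of diagonal elements feature [58]:
$$H_3 = \sum_{k} k \log\left(|B(f_k, f_k)|\right)$$
(g) Bispectrum mean magnitude feature [58]:
$$M_{\mathrm{ave}} = \frac{1}{L}\sum_{\Omega} |B(f_1, f_2)|$$
(h) Bispectrum phase entropy feature [58]:
$$P_e = -\sum_{n} p(\psi_n)\log p(\psi_n), \qquad p(\psi_n) = \frac{1}{L}\sum_{\Omega} \mathbf{1}\left(\Phi(B(f_1, f_2)) \in \psi_n\right)$$
Here, L is the number of points in the principal region, Φ is the phase angle, ψ_n is the nth phase-angle bin, and Ω refers to the space of the region [58].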

Higher Order Spectral Cumulant (HOSC)
Capturing the nonlinear dynamical characteristics of the ECG signal using lower-order (first-order) statistics is difficult [57]. Therefore, higher order statistics, such as second, third and fourth order, are widely used to analyze the ECG signal. Hence, HOS cumulant features are used in the analysis of non-stationary ECG signals in various applications [57].
Let {r_1, r_2, r_3, ..., r_q} represent a zero-mean, q-dimensional multivariate random process, and let m_r1, m_r2, m_r3, and m_r4 be the moments of order one to four [57], with k and l representing the lag parameters. The cumulants can then be computed using non-linear combinations of the moments.
Here, C_r1, C_r2, C_r3, and C_r4 are the cumulants of order one to four. In this work, we have computed the second-, third-, and fourth-order cumulants.

Recurrence Plot (RP)
For a physiological signal in the time domain, RP can reveal hidden patterns which are not otherwise clearly identifiable [57]. Let r_k be the kth point of the trajectory in an L-dimensional space. A recurrence occurs when the distance between r_k and r_l drops below a threshold value, in which case a dot is placed at (k, l). The recurrence plot is symmetric about the main diagonal (k = l), since if r_k is close to r_l, then r_l is close to r_k. Hence, an RP is an R × R square array of dots [57]. This may also be presented in time-related space as an R × R matrix. In the plots, a yellow dot implies there has been a recurrence.

Recurrence Quantification Analysis (RQA)
RQA is used to quantify the number and duration of recurrences of a dynamical system. It helps to measure and analyze the recurrence plot of a non-stationary physiological signal [57]. In the time domain, RQA evaluates the non-stationarity and hidden periodicity of signals. The following RQA features are used in this work: (a) recurrence rate (RR), (b) determinism (DET), (c) entropy (ENT), and (d) laminarity (LMR) [57].
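The recurrence matrix and two of the RQA features can be sketched as below. The embedding dimension, delay, threshold, Chebyshev distance, and minimum diagonal line length are all assumptions chosen for illustration; conventions also differ on whether the main diagonal is counted in the recurrence rate (it is counted here).

```python
def recurrence_matrix(x, dim=2, delay=1, eps=0.1):
    """Binary recurrence matrix of a delay-embedded series (1 = recurrence)."""
    pts = [tuple(x[i + d * delay] for d in range(dim))
           for i in range(len(x) - (dim - 1) * delay)]
    n = len(pts)
    return [[1 if max(abs(a - b) for a, b in zip(pts[k], pts[l])) <= eps else 0
             for l in range(n)] for k in range(n)]

def recurrence_rate(rp):
    """Fraction of recurrence points in the whole matrix."""
    n = len(rp)
    return sum(map(sum, rp)) / (n * n)

def determinism(rp, lmin=2):
    """Share of off-diagonal recurrence points lying on diagonal lines >= lmin."""
    n = len(rp)
    in_lines = total = 0
    for offset in range(-(n - 1), n):
        if offset == 0:
            continue                           # skip the line of identity
        diag = [rp[k][k + offset] for k in range(max(0, -offset), min(n, n - offset))]
        run = 0
        for v in diag + [0]:                   # trailing 0 flushes the last run
            if v:
                run += 1
            else:
                if run >= lmin:
                    in_lines += run
                total += run
                run = 0
    return in_lines / total if total else 0.0

rp = recurrence_matrix([0, 1] * 10)            # perfectly periodic toy signal
rr_value = recurrence_rate(rp)
det_value = determinism(rp)
```

For a periodic signal nearly all recurrence points fall on long diagonal lines, so the determinism is close to one.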

Results
The proposed work is performed on LRHT, HRHT and HC subjects. A total of 3694 ECG segments were obtained from the SHAREE and PTB databases. The LRHT class has 3172, the HRHT class has 442, and the HC class has 80 ECG signal segments of 2 min duration.
The experiment was performed in MATLAB 2016b version 9.1.10 (licensed) on a workstation (personal computer) with an Intel i7 processor, 16 GB RAM, 1 TB HD, and a 4 GB graphics card. We tried several non-linear features to obtain the optimum results. However, higher order spectral cumulant, bispectrum and recurrence quantification analysis (RQA) features yielded the optimum results.
In addition, the optimum performance was obtained using the combination of HOS bispectrum, cumulant and RQA features. A total of 9 bispectrum-based features are extracted and shown in Table 5. The details of the RQA and HOS cumulant features are presented in Tables 6 and 7, respectively.
The highest classification accuracy, sensitivity and specificity of 98.05%, 95.66%, and 96.58%, respectively, are obtained using the support vector machine (SVM) classifier with a ten-fold cross-validation strategy. Table 8 presents the confusion matrix obtained for the SVM classifier using all bispectrum, cumulant and RQA features. Table 9 shows the performance measures obtained for each class using HOS bispectrum, cumulant and RQA features with the SVM classifier. The highest AUC of 1.00 is obtained using the SVM classifier (Figure 5). Figure 13 shows the recurrence plots for the HC, LRHT, and HRHT classes. A summary of the classification performance obtained using various combinations of features is shown in Table 10.
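With class sizes as unequal as 3172, 442 and 80 segments, the way the ten folds are formed matters. The sketch below assigns segments to folds class by class (stratification) so that every fold contains all three classes; whether the reported experiment used stratified folds is not stated, so this is an assumption, and `stratified_folds` is a hypothetical helper.

```python
import random
from collections import Counter

def stratified_folds(labels, k=10, seed=0):
    """Assign each sample a fold index so class proportions are kept per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    fold_of = [None] * len(labels)
    for idxs in by_class.values():
        rng.shuffle(idxs)                      # randomize within each class
        for pos, idx in enumerate(idxs):
            fold_of[idx] = pos % k             # deal samples round-robin to folds
    return fold_of

# Class sizes taken from the text; the labels themselves carry no signal data.
labels = ["LRHT"] * 3172 + ["HRHT"] * 442 + ["HC"] * 80
fold_of = stratified_folds(labels)
per_fold = [Counter(lab for lab, f in zip(labels, fold_of) if f == i)
            for i in range(10)]
```

Each fold then holds 8 HC segments, 44 or 45 HRHT segments, and 317 or 318 LRHT segments, so the minority class is represented in every test fold.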

Discussion
Tables 2-4 summarize all studies using ECG, BCG, PPG, and HRV signals. It is evident from Table 2 that the ECG-based computer-aided diagnosis system obtained the highest area under the receiver operating characteristic curve (AUC = 1.00) compared to the rest of the methods. Moreover, Table 2 shows the highest classification accuracy of 99.99% using ECG signals. Table 11 lists the summary of artificial intelligence (AI) techniques used to classify HT based on ECG and HRV signals. Tables 2-4 also summarize the methods, features, subjects, results and type of databases that have been used to diagnose HT using HRV, ECG, BCG, and PPG signals. In both ECG and HRV signal-based studies, authors have used transformational approaches, converted time-domain signals into the frequency domain, extracted non-linear features and classified them using SVM and KNN classifiers. The automated systems developed for HT are summarized as follows:
• Rajput et al. [8] developed an HDI using ECG signals to accurately stratify low-risk versus high-risk HT with a single numeric value.
• Poddar et al. [28] used HRV signals to classify HT and normal subjects using an SVM classifier with 100% accuracy using 20 features. They used a balanced data set of 56 normal and 57 HT subjects in their study.
• Rajput et al. [23] classified ECG signals into three classes (LRHT, HRHT, and HC) using features extracted from the five-level wavelet decomposition of ECG signals. They obtained 99.95% classification accuracy using SeEn and WeEn features with an unbalanced data set. The testing error was found to be only 3.26% with the hold-out validation method.
• Soh et al. [24] developed a CNN architecture for the classification of normal and HT ECG classes and achieved an accuracy of 99.99%, sensitivity of 100% and specificity of 99.97%.
In this work, bispectrum-based features obtained the highest classification accuracy of 96.5% among the nonlinear features. The summary of classification performance obtained using various combinations of features is shown in Table 10.
It can be noted from Table 5 that bispectrum features are clinically significant and show a clear difference between the three classes. In addition, the mean and standard deviation values of the third-order HOS cumulant show a large difference among the three classes, as shown in Table 7; hence, it is useful for 1-D signal feature extraction. The RQA RR feature also shows a distinct difference among all three classes.
It is well recognized that the bispectrum conserves the phase information [57]. Because of this property, it is used to examine quadratic nonlinear interactions between various frequency components of ECG signals. Such interactions have been observed between different frequencies of the three classes of ECG signals. This analysis may be useful for detecting changes in ECG signals. For normal, LRHT, and HRHT ECG signals, the bispectrum magnitude and its contour representation are shown in Figures 7-12. It can be noted from Figures 7-12 that these plots are unique and can be used to discriminate the class of the ECG signal (HC, LRHT, or HRHT). Similarly, Figure 13 shows the recurrence plots for HC, LRHT and HRHT ECG signals. These plots are unique and can also be used to differentiate the three classes. We have obtained the highest classification performance using only HOS bispectrum features without transforming the ECG signals (Table 10). The advantages of the proposed study are: (i) many nonlinear features are employed that can be used for classification; (ii) unique HOS bispectrum and recurrence plots are proposed for the three classes; and (iii) HOS-based features are more robust to noise.
In general, works conducted using ML and DL coupled with ECG signals have yielded the highest performance. Among the works listed in Tables 2-4, [2,8,13,16,21,23,24,25,26,34,35] have used public (open source) databases, while the rest of the studies have used private databases. This underscores the importance of public databases for the development of computer-aided diagnosis systems.
Liang et al. [20,34,35] detected HT using PPG signals in three separate studies using public and private databases. They achieved a best classification F-score of 94.84%.
Liu et al. [21] diagnosed HT from BCG-derived HRV signals, and achieved the highest classification accuracy of 84.4% using ML.
Such results demonstrate the effectiveness of transformation methods that combine nonlinear and entropy-based features. ML methods work well with balanced and smaller databases. The performance of ML models also depends on the features extracted and classifiers used.
In the future, we intend to use deep learning architectures to detect HT ECG signals using a large database [59]. The biggest challenge for this study is the availability of a large public database. Figure 14 illustrates the proposed cloud-based model. Initially, the ECG, PPG, BCG, and HRV signals are recorded from patients and stored in the hospital database. The stored signals are then sent to the cloud, where the model is installed. The cloud-based model analyzes the provided data and detects hypertension accurately. The results are then returned from the cloud to the hospital. Hence, doctors can compare the results obtained by the cloud-based model with their manual findings. Table 12 lists all the abbreviations used in the paper.

Conclusions
We have reviewed many automated HT diagnosis methods using ECG and other physiological signals. Many ML models have been developed using nonlinear features and various classifiers. Few DL architectures have been proposed to detect HT ECG signals. Combined with low-cost wearable devices, such methods have the potential to enable continuous, non-intrusive, cuffless and wireless remote BP monitoring. Such automated systems are reliable and accurate, and can also be used to detect other cardiac ailments. They can be used in hospital intensive care units (ICUs) to alert the staff immediately to a sudden rise in the BP of patients so that accurate treatment can be provided.