ECG Sensor Card with Evolving RBP Algorithms for Human Verification

Cardiac and respiratory rhythms in electrocardiograms (ECGs) are known to be highly nonlinear and non-stationary. As a result, most traditional time-domain algorithms are inadequate for characterizing the complex dynamics of the ECG. This paper proposes a new ECG sensor card and a statistical-based ECG algorithm built on a reduced binary pattern (RBP), with the aim of achieving faster ECG human identity recognition with high accuracy. The proposed algorithm has one advantage that previous ECG algorithms lack: waveform-complex information and de-noising preprocessing can be bypassed, so it is better suited to non-stationary ECG signals. Experimental results on two public MIT-BIH ECG databases confirm that the proposed scheme is feasible, with excellent accuracy, low complexity, and speedy processing. Specifically, the advanced RBP algorithm achieves high accuracy in human identity recognition and executes at least nine times faster than previous algorithms. Moreover, based on test results from a long-term ECG database, the evolving RBP algorithm also demonstrates superior capability in handling long-term and non-stationary ECG signals.

Intelligence optimization techniques such as the genetic algorithm (GA) [29], particle swarm optimization (PSO), ant-colony optimization, and so on are commonly utilized for tuning the parameters of the aforementioned algorithms. Most ECG-based human identification/verification methods rely on features extracted from the ECG signals. The features are usually extracted according to one of three models: transform-based, waveform-based, and statistical-based.
The transform-based algorithms consist of wavelet transforms [9][10][11][12][13][14] and frequency-domain transforms, including the Fourier transform [15] and the DCT [16]. Since the wavelet transform contains information in both the time and frequency domains, it is more popular than the frequency-domain transforms.
Correlation coefficients and measurements of wavelet distances have been used to match acquired ECG signals against stored templates [9]. Since this identification/verification method requires heavy calculation, its practical implementation is restricted. The feature selection algorithm in [12] applied the feature set evaluation (FSE) k-nearest neighbor (k-NN) algorithm to improve low recognition rates and used the eigenspace method to reduce data dimensions; however, this approach is both complicated and time-consuming. In [13], morphological characteristics are first extracted through the wavelet transform and independent component analysis; an SVM follows for identification/verification purposes. Although a high identification rate can be reached, a lengthy feature extraction process seems unavoidable.
Waveform-based algorithms [17][18][19][20][21][22][23] extract different time domain characteristics (distance, height, and area) from fiducial points inside the ECG waveform. These waveform descriptors will be used to match or classify ECG signals in the identification/verification process. These algorithms usually have good accuracy in recognizing regular ECG signals but show opposite results for irregular data. Some researchers combined a precision-matched result with a waveform neural network in the signal preprocessing stage [18]. This model extracted seven features from the ECG signals based on their amplitude and the interval to be analyzed by the decision-based neural network. The computational complexity depends heavily on the forms of those time-domain ECG signals and the level of difficulty of the matching process carried out by the neural network. Nineteen characteristics are extracted from the time interval, amplitude, and angle of deflection and studied [22]; the identification is examined using Euclidian distances and an adaptive threshold. The eigenvectors used in feature-matching take time but are necessary for all band waves in the ECG signals.
An ECG signal can be described as a non-stationary time series that presents some irregularities in the waveform. Unlike the waveform-based algorithms, the transform-based algorithms analyze the non-stationary information based on the signal's presentation in the frequency domain. Not only is this process slow, but it is also difficult to extract good features for the purpose of identification.
Statistical-based algorithms usually depend on statistical evaluations (count, mean, and variance) for human identification. They are usually less time-consuming but require a well-designed statistical model to assure high accuracy. A method based on rank order statistics was presented to analyze the human heartbeat [30].
The non-stationary behavior of ECG has been utilized in many studies. The fetal ECG was reconstructed with higher-order statistical tools that exploit the ECG's non-stationary properties after wavelet de-noising [27]. A de-trended fluctuation analysis quantifying the correlation property of non-stationary physiological time series was presented in [31]. Our previous work on an ECG card access control system [32] focused on the architecture of ECG human identification.
Compared with algorithms presented in the literature, our proposed scheme is capable of providing secure and accurate results with a user-possessed controller. Moreover, it can be easily embedded into the field application structure to ensure the implementation of a feasible ECG identification hardware.

System Architecture and Application Example
Even though some ECG biometric identification systems have been demonstrated, the use of a centralized ECG database remains a serious issue, and implementation cost and accuracy concerns mean there is not yet a feasible application. In our previous work [32], we put this idea into practice and introduced a portable ECG card for access control. This small ECG card provides a cheap and convenient way to enhance door access security.
An ECG access control system consists of a personal ECG sensor card and an access control device. An ECG card is a small device that stores personal ECG data for identity recognition. As suggested in Figure 2, applications of ECG cards include secure personal keys for cars, houses, deposit boxes, and mobile phones. The following advantages suggest that the ECG approach can outperform the most popular fingerprint approach.
1. The ECG signal is not visible and cannot be photocopied, making replication of ECG signals more difficult. 2. ECG data can be measured by a small, low-power, low-cost, simple circuit.
The blueprint of the architecture of an ECG sensor card is shown on the left of Figure 3. Data are obtained through a contact pad (denoted by "DOT"), and the processing unit, the integrated chip "INA321A", is in charge of common mode noise removal from the original signal. The main processing unit, the integrated microprocessor "MSP430FG439", controls and transmits all data to the ZigBee module, a short-range wireless transmission module that communicates with the access control device. The other modules include "SBLCDA4" and "JTAG", which will be used for LCD (Liquid Crystal Display) display and debugging the microprocessor, respectively. One real implementation of the ECG sensor card is shown on the right of Figure 3. An ECG card contains two voltage-sensitive contacts, noise filter modules, a microprocessor, and a wireless transmission module. This small and low-cost device allows practical ECG identification in real life.
A door access control system, as shown in Figure 4, serves as one real application of ECG access cards. The card checks whether an ECG signal provided by the user matches the one stored inside the card. The controller is an embedded system or personal computer connected to the ECG card via the wireless transmission module. The flowchart in Figure 5 shows the ECG verification process in the door access control.

The Basic RBP Algorithm
The idea of our proposed algorithm, RBP for ECG verification, is related to Yang's [30] and Kumar's [33] works, but we expand it to a different field of application. The differences between Yang's model and our model are as follows: 1. Their approach focuses on the human heartbeats; ours focuses on just the bare ECG signals. 2. They convert, count, and rank P waves in the ECG signals only; we perform these procedures on every sample of ECG data to obtain the reduced binary pattern. 3. They aim for heart disease classification; we focus on human identity recognition through ECG signals. The processing in our design can be roughly divided into three necessary steps that will be illustrated as follows.

Step 1: Reduced Binary Pattern Conversion
All ECG signals are non-stationary. Consider an ECG signal X = {x_1, x_2, x_3, …, x_N}, where the real value x_i corresponds to the i-th input datum. Each pair of consecutive input signals is compared, and the data are categorized into one of two cases: a decrease or an increase in x_i. A preliminary reduced function then maps these two cases to 0 or 1, respectively, according to the rule

b_i = 1 if x_{i+1} ≥ x_i, and b_i = 0 otherwise.   (1)

This procedure converts the ECG signal of length N to a binary sequence B = {b_1, b_2, …, b_{N−1}} of length N − 1. Every m bits in B are grouped to construct a reduced binary sequence of length m, referred to as an m-bit word, and all such words are collected to form a reduced binary pattern W = {w_1, w_2, …, w_{N−m}}, where w_i = {b_i, b_{i+1}, …, b_{i+m−1}}. Each m-bit word w_i is then converted to its decimal expansion d_i.
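Step 1 can be sketched in a few lines of Python; this is an illustrative implementation assuming the notation above (signal x, word length m), not the authors' pseudo code:

```python
def reduced_binary_pattern(x, m):
    """Convert a raw signal into the decimal values of its m-bit words."""
    # Map each consecutive pair to 0 (decrease) or 1 (non-decrease).
    bits = [1 if x[i + 1] >= x[i] else 0 for i in range(len(x) - 1)]
    # Slide an m-bit window over the binary sequence and take its
    # decimal expansion, e.g. {1,0,1,0} -> 10.
    words = []
    for i in range(len(bits) - m + 1):
        d = 0
        for b in bits[i:i + m]:
            d = (d << 1) | b
        words.append(d)
    return words

# Example: a short rising/falling sequence of six samples yields
# N - 1 = 5 bits and N - m = 2 four-bit words.
print(reduced_binary_pattern([3, 5, 4, 4, 2, 6], m=4))  # → [10, 5]
```

A signal of length N thus produces exactly N − m words, matching the definition of W above.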

Step 2: Counting and Ranking Processes
The main theme of this step, as shown in Figure 7, is to count the occurrences of all decimal values d_i and sort them in descending order of frequency.
Let d_i be an integer for i = 1, 2, …, N − m. Clearly, the values of d_i range from 0 to 2^m − 1. Let the integer k ∈ {0, 1, …, 2^m − 1}, and let p(k) and c_k be the relative frequency and occurrence count of k, respectively. To be exact, p(k) = c_k/(N − m) and Σ_{k=0}^{2^m−1} c_k = N − m. Next, each k is ranked according to its frequency from the largest to the smallest. For example, R(k) = 1 means that the m-bit words w_i that convert to k are those that appear most frequently in the reduced binary pattern.
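The counting and ranking step can be sketched as follows (tie-breaking between equally frequent words is an assumption of this sketch, since the paper does not specify it):

```python
from collections import Counter

def rank_statistics(words, m):
    """Return (relative frequency p, rank R) lists over all 2**m values.

    p[k] = c_k / (N - m); the most frequent value gets rank 1.
    Ties are broken by value here, which is an assumption of the sketch.
    """
    n = len(words)
    counts = Counter(words)
    p = [counts.get(k, 0) / n for k in range(2 ** m)]
    # Sort values by descending frequency, then by value for ties.
    order = sorted(range(2 ** m), key=lambda k: (-p[k], k))
    rank = [0] * (2 ** m)
    for r, k in enumerate(order, start=1):
        rank[k] = r
    return p, rank
```

For instance, for the word stream [7, 7, 3] with m = 4, value 7 receives rank 1 and value 3 rank 2.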

Step 3: Measurement of Similarity
Consider two segments of ECG data S_1 and S_2, which may belong to two distinct subjects. To understand how closely they are related, a measurement of similarity needs to be defined. We incorporate a weighted distance formula [30] to measure the similarity between S_1 and S_2:

D(S_1, S_2) = Σ_{k=0}^{2^m−1} [(p_1(k) + p_2(k))/2] · |R_1(k) − R_2(k)| / (2^m − 1),   (2)

where a segment means a sequence of sampled ECG data covering 10 sample periods, which serves as the basic unit for our analysis, and each sample period denotes the ECG signal in an R-R interval. p_j(k) and R_j(k) represent the relative frequency and rank of k in the segment S_j, j = 1 or 2. The absolute difference between two ranks is multiplied by the normalized probabilities to form a weighted sum; the factor 1/(2^m − 1) ensures that all measurement values lie between 0 and 1.
Consider two groups of ECG data, G_a and G_b, containing N_a and N_b segments, respectively. We define the measurement of similarity between these two groups as

D(G_a, G_b) = (1/(N_a N_b)) Σ_{i=1}^{N_a} Σ_{j=1}^{N_b} D(S_1^i, S_2^j),   (3)

where S_1^i and S_2^j are the corresponding segments from G_a and G_b, respectively, and D(S_1^i, S_2^j) denotes the associated distance between these segments. D(G_a, G_b) is the average distance over all segment pairs from G_a and G_b. If G_a = G_b, D(G_a, G_b) is referred to as the intra-group distance; otherwise, it is the inter-group distance.
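Both distance measures can be sketched together; this is a minimal illustration assuming the weighted rank-order form of Equation (2) and the segment-averaging form of Equation (3) as reconstructed here:

```python
def segment_distance(p1, R1, p2, R2, m):
    """Weighted rank-order distance between two segments (Equation (2)
    as reconstructed): rank differences weighted by average frequency,
    normalized by the largest possible rank difference 2**m - 1."""
    norm = 2 ** m - 1
    return sum((p1[k] + p2[k]) / 2 * abs(R1[k] - R2[k]) / norm
               for k in range(2 ** m))

def group_distance(stats_a, stats_b):
    """Average segment distance between two groups (Equation (3));
    each entry of stats_a/stats_b is a (p, R, m) triple for one segment."""
    total = 0.0
    for p1, R1, m in stats_a:
        for p2, R2, _ in stats_b:
            total += segment_distance(p1, R1, p2, R2, m)
    return total / (len(stats_a) * len(stats_b))
```

Two identical segments have distance 0, and the normalization keeps every distance below 1, consistent with the bounds stated above.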

The Advanced RBP Algorithm
Next, we introduce a new scaling factor α, an increment in the length of the comparison interval, into the RBP algorithm. All steps are similar to those of the basic design, except that the binary sequence is obtained from the comparison of x_{α·i+1} and x_{α·(i−1)+1} instead of x_{i+1} and x_i, where x denotes the raw ECG data. The reduced function (1) is now replaced with

b_i = 1 if x_{α·i+1} ≥ x_{α·(i−1)+1}, and b_i = 0 otherwise.   (4)

Figure 8 illustrates the modified RBP conversion with m = 4 and α = 2. The first four-bit word w_1 = {0111} is labeled d_1 = 7 = 2^2 + 2^1 + 2^0. If we are concerned with the effect of the size of variations in amplitude, the reduced function becomes

b_i = 1 if x_{i+1} ≥ x_i + δ, and b_i = 0 otherwise,

where δ denotes the required jump in amplitude. This model tolerates variations (noise) of up to δ units between two consecutive data points, suppressing small amplitude variations by demanding a larger rise in amplitude. In the case where variations in both time and amplitude are allowed, the general reduced function can be set as

b_i = 1 if x_{α·i+1} ≥ x_{α·(i−1)+1} + δ, and b_i = 0 otherwise.   (5)

A proper choice of δ may reduce the impact of noise, and a suitable scaling factor α may yield an ideal reduced binary pattern; hence, appropriate tuning of either parameter improves the verification accuracy. Table 1 and Algorithm 1 give the notation and the main pseudo code of the advanced RBP algorithm. The pseudo code consists of two code segments: the ECG data are converted into the statistical counters, and the rank values are sorted to obtain R(k). To expedite the computation, an unsigned register σ of length m is created, each bit of which stores the corresponding bit of the current word w_i = {b_i, b_{i+1}, …, b_{i+m−1}}. We also set a counter array C, of length 2^m, to accumulate the number of repeated m-bit words in the reduced binary pattern.
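The σ register and counter array described above can be sketched as follows; this is a minimal illustration assuming the general reduced function with scaling factor alpha and amplitude jump delta (alpha = 1 and delta = 0 recover the basic algorithm), not the paper's Algorithm 1 itself:

```python
def advanced_words(x, m, alpha=1, delta=0):
    """General reduced function (Equation (5) as reconstructed): compare
    x_{alpha*i+1} with x_{alpha*(i-1)+1} (0-based: x[alpha*i] vs
    x[alpha*(i-1)]) and require a rise of at least delta."""
    sub = x[::alpha]                 # interval scaling by alpha
    bits = [1 if sub[i + 1] >= sub[i] + delta else 0
            for i in range(len(sub) - 1)]
    sigma = 0                        # m-bit shift register (the sigma above)
    mask = (1 << m) - 1
    counts = [0] * (2 ** m)          # counter array C of length 2**m
    words = []
    for i, b in enumerate(bits):
        sigma = ((sigma << 1) | b) & mask
        if i >= m - 1:               # register now holds a full m-bit word
            counts[sigma] += 1
            words.append(sigma)
    return words, counts
```

Shifting one new bit into σ per sample avoids re-reading m bits for every word, which is the speed-up the register is introduced for.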

The Evolving RBP Algorithm
Since ECG signals change slightly from day to day, modifying our algorithm to handle this issue is crucial. This model utilizes an incremental learning process to improve the advanced version: the advanced RBP algorithm is extended with an incremental-update mechanism for the rank order of the m-bit words. If both ECG signals, the newly obtained one and the original one, come from the same individual, the identity match passes. The original relative frequency p_old(k) is then blended with the relative frequency p_new(k) from the new input ECG:

p(k) = (1 − γ) p_old(k) + γ p_new(k),   (6)
where γ is the weighting factor controlling the degree of impact of the new frequency p_new(k), k = 0, 1, …, 2^m − 1. The value of γ reflects the degree of non-stationarity between the old and new ECG signals: a larger γ indicates that the new data are more non-stationary, and the ranks R(k), k = 0, 1, …, 2^m − 1, are then recalculated and updated. In future work, this non-stationary behavior could be modeled by examining the cross-correlation between the two ECG signals.
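A minimal sketch of this incremental update, assuming the blended-frequency form of Equation (6) (the tie-breaking rule when two blended frequencies are equal is an assumption):

```python
def evolving_update(p_old, p_new, gamma=0.5):
    """Blend old and new relative frequencies as in Equation (6),
    p[k] = (1 - gamma) * p_old[k] + gamma * p_new[k],
    then recompute the ranks from the blended frequencies."""
    p = [(1 - gamma) * po + gamma * pn for po, pn in zip(p_old, p_new)]
    # Re-rank: most frequent value gets rank 1 (ties broken by value).
    order = sorted(range(len(p)), key=lambda k: (-p[k], k))
    rank = [0] * len(p)
    for r, k in enumerate(order, start=1):
        rank[k] = r
    return p, rank
```

With γ = 0.5, as used in the long-term experiment below, the model and the new record contribute equally to the updated statistics.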

Compared Algorithms
Next, we compare the proposed algorithm with two other feature extraction schemes. Note that both the selected waveform-based and wavelet-based algorithms require R-R detection and noise preprocessing, which are completely bypassed in our model.
In the waveform-based study [13], nineteen waveform features are extracted from each heartbeat, as shown in Figure 9. These features form a feature vector S. The closeness between two feature vectors S_1 and S_2 is taken as their distance D(S_1, S_2); the intra- and inter-group distances can then be evaluated through Equation (3). The wavelet-based algorithm [12] in this comparison proceeds as follows: each R-R cardiac cycle is obtained through R-R detection; an interpolation is performed on the R-R interval so that each cycle holds 284 data points; every R-R cycle is cut into three parts containing 85, 156, and 43 points; the first 85 and the last 43 points of each cycle are assembled to form a 128-point segment; and every four segments are grouped, after which an n-level discrete wavelet transform (DWT) is performed to obtain the corresponding wavelet coefficients. Four of the computed wavelet coefficients are gathered into a wavelet vector. The Euclidean distance between two wavelet vectors S_1 and S_2 is regarded as their distance D(S_1, S_2); the intra- and inter-group distances can then be calculated through Equation (3). An example with n = 9 is illustrated in Figure 10 (procedure of the wavelet-based algorithm).
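The segmentation stage of the compared wavelet-based algorithm can be sketched as follows; this is an illustrative reimplementation of the steps just described, not the original authors' code, and the DWT step itself is omitted:

```python
def resample(cycle, n):
    """Linearly resample a cycle to n evenly spaced points."""
    m = len(cycle)
    out = []
    for j in range(n):
        t = j * (m - 1) / (n - 1)     # fractional position in the cycle
        i = min(int(t), m - 2)
        frac = t - i
        out.append(cycle[i] * (1 - frac) + cycle[i + 1] * frac)
    return out

def wavelet_preprocess(cycles):
    """Resample each R-R cycle to 284 points, then keep the first 85
    and last 43 points and concatenate them (85 + 43 = 128 points)."""
    segments = []
    for cycle in cycles:
        r = resample(cycle, 284)
        segments.append(r[:85] + r[-43:])
    return segments
```

Every segment has 128 points regardless of the original cycle length, which is what makes the subsequent fixed-size DWT possible.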

ECG Database
We conducted a comprehensive experiment on three public ECG databases: the MIT-BIH Arrhythmia Database, the MIT-BIH Normal Sinus Rhythm Database, and a Long-Term Database. Descriptions of these three databases are given below.

Measurement Approaches
Two approaches are used to evaluate the implemented algorithms: 1. Success rate: a metric for accuracy. In the pairwise comparisons between individuals, an identification error is counted whenever an intra-subject distance is larger than the corresponding average inter-subject distance. Summing these errors gives the total number of errors; dividing by the total number of comparisons yields the error rate, whose complement is the success rate. 2. False acceptance (FA) and false rejection (FR) rates: also accuracy metrics. FR denotes the relative ratio of subjects that should be accepted but are rejected by the classifier; similarly, FA is the ratio of subjects that should be rejected but are accepted. The FA/FR threshold is obtained from the training set by minimizing (FA + FR)/2.
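The threshold selection for the FA/FR metric can be sketched as follows; this is a minimal illustration assuming the (FA + FR)/2 criterion, with the candidate threshold set and tie handling being assumptions of the sketch:

```python
def fa_fr(genuine, impostor, threshold):
    """False rejection: genuine distances above the threshold.
    False acceptance: impostor distances at or below it."""
    fr = sum(d > threshold for d in genuine) / len(genuine)
    fa = sum(d <= threshold for d in impostor) / len(impostor)
    return fa, fr

def best_threshold(genuine, impostor):
    """Pick the candidate threshold minimizing (FA + FR) / 2 on the
    training distances, trying every observed distance as a candidate."""
    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates,
               key=lambda t: sum(fa_fr(genuine, impostor, t)) / 2)
```

On well-separated training distances the chosen threshold drives both rates to zero; on overlapping distributions it balances the two error types.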

Basic RBP Algorithm
To verify how efficient this algorithm is for human verification by ECG, two types of comparisons are considered: self-comparison and subject-comparison. 1. Self-comparison: two sets of eight segments are arbitrarily selected from one individual, and their pairwise distances are measured using Equation (2). Each segment contains 3600 data points (10 s). All 64 intra-subject distances obtained from segments 1 to 8 of subject ID 100 in the MIT-BIH Arrhythmia Database are listed in Table 2. Note that the table is symmetric, with diagonal entries equal to zero since they denote the distance between two identical segments. 2. Subject-comparison: Table 3 shows the subject-comparison between subject IDs 100 and 101. Next, we measure the average distance between two subjects from the same database using Equation (3). Table 4 lists all 64 average group distances for subject IDs 100 to 107 from the MIT-BIH Arrhythmia Database. For example, the entry in the first row and first column records the average of all entries in Table 2; the average of all values in Table 3 appears in the first row, second column, and in the second row, first column. Intuition suggests that a strong association should exist between certain patterns and each individual; therefore, the intra-group distances should be smaller than the inter-group distances, and it makes sense to treat the opposite cases as verification errors. Similarly, the RBP algorithm is applied to subjects in the MIT-BIH Normal Sinus Rhythm Database. The experimental results show that the success rates for the two groups of people, with and without significant arrhythmias, are 95.791% and 90.196%, respectively.
To seek better accuracy, the effect of the length m of the m-bit word is considered. Using 10 s periods as the duration of the input data should be reasonable for a fair evaluation. A self-comparison experiment is conducted to examine whether the value of m and the stability of identity detection are related. Two sets of 31 segments, each covering a 10 s period, are selected from subject ID 100 in the MIT-BIH Arrhythmia Database. The intra-subject distances obtained using Equation (3) are measured for different values of m, as shown in Table 5. The distances vary abruptly when m = 4 and become more stable as m increases. However, a bigger m leads to higher computational and space complexity. To balance the trade-offs, we set m = 8 in this study. Two parameters, interval and amplitude, are considered in the advanced design. Therefore, the pair of data points x_{α·i+1} and x_{α·(i−1)+1}, instead of x_{i+1} and x_i, is compared to obtain the reduced binary pattern via Equation (4), and the ECG data are examined not only locally but also globally.
Experiments with intervals α = 1 to 36 were conducted; Figures 11 and 12 show their effects on the total number of verification errors in the two databases. The total number of errors shows a sharp drop followed by a short stable zone around α = 15 in the Arrhythmia Database and α = 5 in the Normal Sinus Rhythm Database. Since these two databases are sampled at 360 Hz and 128 Hz, respectively, α = 15 and α = 5 correspond to down-sampled signals at 24 Hz and 25.6 Hz, respectively. These frequencies are quite close to a 25 Hz bandwidth, a finding analogous to the bandwidth of digitized ECG data in [4]. A signal sampled at a higher rate may carry unnecessary noise, whereas a lower rate provides insufficient information. Therefore, about 25 Hz appears to be a suitable effective sampling rate.
In addition to interval tuning, an experiment using the MIT-BIH Arrhythmia Database was conducted to check how recognition accuracy changes when the amplitude parameter is adjusted. Tables 6 and 7 record the results of both algorithms with α = 10. The data show that the success rate of the advanced model is considerably better than that of the basic model. When the amplitude increment δ = 1, the impact of signal noise is reduced and the best performance is obtained; when δ > 1, certain distinguishing features may be removed, resulting in a lower success rate. The evaluation of our algorithms, the basic RBP and the advanced RBP with α = 5 and δ = 1, relies on comparisons with two other feature extraction algorithms: a waveform-based algorithm with 19 extracted waveform features [13] and a transform-based scheme with wavelet feature extraction [12]. It is worth mentioning that R-R detection and noise preprocessing are required in both of the other algorithms but are completely bypassed in ours.
In the evaluation using the MIT-BIH Arrhythmia and Normal databases, it is obvious from the comparison of outcomes shown in Table 8 that the waveform-based algorithm with 19 features performs well, but our advanced RBP algorithm still excels, having an extremely high success rate in both public databases. The FA and FR ratios for the normal sinus rhythm and arrhythmia databases are listed in Table 9.
Here, the associated parameters are m = 8 and α = 5 for the Normal Sinus Rhythm Database (sampling rate 128 Hz) and m = 8 and α = 15 for the Arrhythmia Database (sampling rate 360 Hz). For personal verification, the false rejection and false acceptance rates should be as small as possible. The advanced RBP algorithm was tested 18 × 8 = 144 times and 47 × 3 = 141 times on the Normal and Arrhythmia databases, respectively; it achieves a false rejection rate of around 1.67% and a false acceptance rate of 1.43% on the Normal Database. Thus, the performance of our algorithm should be acceptable. Table 10 provides the execution times for all compared algorithms. The execution time of the advanced scheme is shorter than that of the basic RBP algorithm, and even the basic algorithm is at least nine times faster than the waveform- and transform-based algorithms. The performance of the evolving RBP algorithm is evaluated on the long-term ECG database, from which the ECG data of 20 individuals recorded over 54 days were selected. All subjects had their ECGs measured on a minimum of two and a maximum of six days. Table 11 contains the data for subject ID person_01. In this study, the evolving RBP algorithm, tested on the long-term database, is implemented with α = 13 and δ = 1 in the advanced RBP model. Not all subjects have six days' worth of ECG data; for most, only two or three days' worth of data is on record. Each recorded ECG is 20 s long and is cut into two equal segments of 10 s each. Each segment is converted into rank statistics after reduced binary conversion and m-bit word counting; finally, we take the mean of rank statistics 1 and 2 for the two segments. This approach is illustrated in Figure 13. Table 12 gives the result for person_01 obtained by applying the non-evolving advanced RBP approach to calculate the similarity distance, using Equation (2), between the first record (record No.
1) and the other records. The evolving RBP scheme, Equation (6), builds the model's rank statistics from the average relative frequency array of the first record on the first day from the same individual, and then updates the model with the first record of each subsequent day. The weight distribution between the model and the new record is 0.5:0.5. To be precise, on each day we use only the first valid record to update the model, while the other record serves for comparison purposes only and is not used for the update. The similarity distance between the model and the recorded data is shown in Table 13, where cells with grey backgrounds indicate that the corresponding record has been merged into the model. Comparing these results with those in Table 12, the evolving RBP approach yields much smaller similarity distances than the advanced RBP one. To obtain an overall evaluation of the evolving versus the advanced RBP algorithm, we sum all the similarity distances after the update points on each date, denoted S_arbp and S_erbp for the advanced and the evolving RBP algorithms, respectively; these correspond to the non-grey records in the second-to-fifth rows of Table 13. Their values are shown in Table 14: the improvement rate of evolving RBP over advanced RBP, computed as (S_arbp − S_erbp)/S_arbp, is 26.47%, confirming that the evolving process is effective. This result reveals that the evolving process does improve verification performance for the non-stationary behavior of long-term ECG signals. As shown in Table 15, using the same database and a procedure similar to the similarity-distance evaluation, we measure the false acceptance (FA) and false rejection (FR) rates for the evolving RBP; the results show improvement rates of 43.77% and 9.57% for FA and FR, respectively. The average improvement rate is 25.25%, quite close to the 26.47% improvement reported above.

Conclusions
In this paper, a novel ECG card architecture and an algorithm for ECG human verification are proposed. Verifications tested on subjects from the two public MIT-BIH databases confirm that the RBP algorithm performs in a timely manner with low computational complexity and is efficient in ECG human identity recognition. Moreover, the RBP scheme is enhanced by tuning the interval and amplitude parameters between sample points. The advanced RBP design demonstrates good accuracy with a much shorter execution time than the waveform- and transform-based algorithms. Furthermore, the modified evolving RBP algorithm can not only easily merge new rank data into the old but is also capable of handling non-stationary ECG signals.

Supplementary Materials
Supplementary materials can be accessed at: http://www.mdpi.com/1424-8220/15/8/20730/s1. Kuo-Kun Tseng conceived and designed the experiments, analyzed the data, and wrote the paper. Huang-Nan Huang conceived and designed the experiments, analyzed the data, and wrote the paper. Fufu Zeng conceived and designed the experiments, performed the experiments, analyzed the data, and wrote the paper. Shu-Yi Tu wrote the paper.

Conflicts of Interest
The authors declare no conflict of interest.