Electrocardiograph Identification Using Hybrid Quantization Sparse Matrix and Multi-Dimensional Approaches

Electrocardiograph (ECG) technology is vital for biometric security, and blood oxygen is essential for human survival. In this study, ECG signals and blood oxygen levels are combined to increase the accuracy and efficiency of human identification and verification. The proposed scheme maps the combined biometric information to a matrix and quantifies it as a sparse matrix for reorganizational purposes. Experimental results confirm a much better identification rate than in other ECG-related identification studies. The literature shows no research in human identification using the quantization sparse matrix method with ECG and blood oxygen data combined. We propose a multi-dimensional approach that can improve the accuracy and reduce the complexity of the recognition algorithm.


Introduction
Human identification has become an important issue in many areas, with people more aware than ever before of the need to protect data and prevent information from leakage, theft, or unauthorized modification. Many widespread electronic techniques and approaches, such as fingerprinting and iris scanning, are now used extensively in real-world applications. In medicine, electrocardiograms (ECGs) and blood oxygen levels are widely accepted methods for monitoring heart disease. Recently, ECGs have also been employed extensively in the field of security studies, particularly to verify human identity for access control systems [1][2][3][4].
The ECG method, a technique for measuring and recording different electrical potentials in the heart, is a powerful tool, but most recognition algorithms utilize ECG signals without other biometric characteristics [5,6], ignoring potentially useful additional information. A comprehensive survey of the literature revealed no studies combining ECGs and blood oxygen levels for identity verification. In this research, we present a new identification system that uses an algorithm combining ECG signals and blood oxygen levels, showing that they work together well to verify human identity. Further, the requisite data are easily collected from a fingertip through a wearable device. Identification using these biometrics is based on multidimensional analysis in which both data are mapped into a two-dimensional coordinate system and the intervals between points are quantized by counting their values. All entries in the quantized sparse matrix are used to find the correlation coefficient, an index for determining the success or failure of the identification. We call the proposed algorithm After the mapping processing is finished, the combination of ECG signal and blood oxygen level can be expressed in the format of a diagonal matrix of size × . We now reduce the size of M using the quantization method. Define a mask matrix * of size × , where m is the dimension of the mask matrix for quantization; * is initially set as a zero matrix. × size is also an reduction rating. First, is from 1 to and from 1 to of original matrix , we use matrix * to calculate the sum of window × . The result store as the element of new reduction matrix []. In this way, we follow the rule: Consider two sets of data, ECG signal and blood oxygen level, collected from an individual i. We write E i = e i (t 1 ), e i (t 2 ), · · · , e i (t k ) (1) where e i (t k ), b i (t k ), k = 1, · · · , n denotes the ECG and blood oxygen sample values obtained from the ith individual at a specific time t k . Describe the dataset using a two-dimensional coordinate representation, namely, M i e i (t k ), b i (t k ) . Without loss of generality, M i is initially set to be 0. According to the mapping rule, we have Set Let v i k be the value at the element [e i (t k ), b i (t k )] when counting processing is end; this is a counting process to map two time series into a matrix.
After the mapping processing is finished, the combination of ECG signal and blood oxygen level can be expressed in the format of a diagonal matrix M of size n × n. We now reduce the size of M using the quantization method. Define a mask matrix M * of size m × m, where m is the dimension of the mask matrix for quantization; M * is initially set as a zero matrix. m × m size is also an reduction rating. First, k x is from 1 to n and k y from 1 to n of original matrix M, we use matrix M * to calculate the sum of window m × m. The result store as the element of new reduction matrix RM[]. In this way, we follow the rule: The variables a, b, c, and d are the quantization levels we defined, as shown in Figure 2. In addition, M k x : k x + m − 1, k y : k y + m − 1 means the field of the horizontal ordinate X is from k x to k x + m − 1. The same is true for the vertical ordinate Y.
After biometric data are collected from an individual, the information is transformed immediately into a sparse matrix and is stored using COO without any pre-processing. A sparse matrix [], stores the features of the data, and its entries are inputs to the correlation coefficient classifier. The correlation coefficient between two [] will be calculated for future training and identification purposes, serving as an index to train and test the accuracy of human verification with an appropriately selected threshold. Correlation coefficients are widely applied in the sciences to determine how strongly units from the same group resemble each other; Mathworks documentation (http://www.mathworks.com/help/matlab/ref/corrcoef.html) describes this in detail.

Advanced Model
The advanced model we propose is a combination of the reduced sparse matrix of MDI and DTW methods. This method of mapping multi-lead signal data into multi-dimensional space and extracting sparse features can achieve better matching results. With equipment used to measure ECG signals, 12-lead or 5-lead are the most commonly used methods, which are more suitable for MDI. The main difference from basic model QSMI, MDI reduces input signal level at first; it can greatly reduce the data complexity of high dimensional matrix.
Then, DTW can compute the similarity between two time series, especially for different lengths and different rhythms of time series. DTW automatically warps distorted time series to make the two sequences as consistent as possible and obtain the greatest possible similarity.
Although multi-dimensional data can represent more information, there will be redundant information, so we need to reduce the dimensionality of the multi-dimensional data and enhance the extraction of feature information. Our advanced model relies on an algorithm that fuses multidimensional sparse matrix and DTW. For a large range of ECG data, in order to better aggregate the expression of the data features we first reduce the data to a certain range, then map them to a multidimensional space. There are two stages in our advanced model, as follows. After reduction, RM[] is a matrix with high sparsity since most of its entries are zero. Sparse data are easily compressed and use less storage space; it is beneficial and often necessary to store only the non-zero entries. There are numerous representations for sparse storage; in this study, we use the simplest and most flexible coordinate format (COO), which lists the row, column, and value as a 3-tuple. For example, information in the reduced matrix RM[] in Table 1 can be stored as {   After biometric data are collected from an individual, the information is transformed immediately into a sparse matrix and is stored using COO without any pre-processing. A sparse matrix SM[], stores the features of the data, and its entries are inputs to the correlation coefficient classifier. The correlation coefficient between two SM[] will be calculated for future training and identification purposes, serving as an index to train and test the accuracy of human verification with an appropriately selected threshold. Correlation coefficients are widely applied in the sciences to determine how strongly units from the same group resemble each other; Mathworks documentation (http://www.mathworks.com/help/ matlab/ref/corrcoef.html) describes this in detail.

Advanced Model
The advanced model we propose is a combination of the reduced sparse matrix of MDI and DTW methods. This method of mapping multi-lead signal data into multi-dimensional space and extracting sparse features can achieve better matching results. With equipment used to measure ECG signals, 12-lead or 5-lead are the most commonly used methods, which are more suitable for MDI. The main difference from basic model QSMI, MDI reduces input signal level at first; it can greatly reduce the data complexity of high dimensional matrix.
Then, DTW can compute the similarity between two time series, especially for different lengths and different rhythms of time series. DTW automatically warps distorted time series to make the two sequences as consistent as possible and obtain the greatest possible similarity.
Although multi-dimensional data can represent more information, there will be redundant information, so we need to reduce the dimensionality of the multi-dimensional data and enhance the extraction of feature information. Our advanced model relies on an algorithm that fuses multi-dimensional sparse matrix and DTW. For a large range of ECG data, in order to better aggregate the expression of the data features we first reduce the data to a certain range, then map them to a multi-dimensional space. There are two stages in our advanced model, as follows. In the MDI stage, it wants to map and reduce multiple lead signals into a reduced sparse matrix. According to the multi-dimensional space algorithm mentioned above, we assume that s j i (t k ) is the ith individual for sampling of the jth lead at time t of the sample point value. For reduced matrix processing, we first map multiple time series points s j i (t k ) for each lead j at time t, and we map the signal value to the j-dimensional coordinate point according to the following formula: where point p i (t k ) is constructed by j dimensional reduction value d j i (t k ), R is a reduction level we define, and ∆θ is an increment to prevent negative point values. The result of the second and third lines is to marginalize the result and prevent the map from overflowing, where u is the maximum point value of the space we map.
In this way, a multi-dimensional coordinate space is applied, then the multi-lead signal is mapped to the space according to time t k ; next, reduced sample point processing is performed, and finally the spatial correlation coefficient is calculated to complete the recognition judgment. In the next part, we will show the experimental evaluation results for this algorithm. Since high-dimensional space occupies a large amount of memory, storage is inconvenient. We therefore need to build a reduced sparse matrix to simplify the calculation and storage of the sample coordinates of the multi-dimensional space.
After a reduced sparse matrix is constructed, we upgrade the idea of a two-dimensional sparse matrix to a multi-dimensional coordinate space. For example, if the value of the j dimensional coordinate point (d 1 The subsequent correlation calculation no longer uses the correlation coefficient between the matrices, because the correlation coefficient of the high-dimensional coordinate space is calculated, the time and space complexity are too high to be practically applied, and we will extract a series element of high-dimensional sparse matrix. The sparse matrix is used as the input to the DTW for calculating the correlation between the two series. The algorithm for integrating our multi-dimensional feature with DTW is described in detail below.
As mentioned, a training or test sample for each individual i is ultimately represented as a representation of a series of sparse elements. Assume that the sth sample template of the ith individual is stored as X s i , then the algorithm.
among them, the distance can be used as a metric: Here is two high dimensional sequences, where s and s mean any two sample templates of two i and i indiviuals. In the DTW stage, the two test templates are represented as X s i and X s i . To calculate the similarity between them, the smaller the distance, the higher the similarity between X s i and X s i . Let k and k be the sequence lengths of X s i and X s i , respectively, so D[X s i , X s i ] represents the distance between these two sequences. If: (2) k = k , use dynamic programming to calculate the distance between them.
DTW is described in detail in [27]. We only need to apply this idea to the minimum distance to find two different sequences, which will not be described here. Multi-dimensional DTW is calculated in a similar way to DTW for single-dimensional time series, except that we redefine D X s i , X s i as the cumulative squared Euclidean distances of multiple data points instead of the single data point used in the more familiar one-dimensional case. For details, please refer to [28].

Experiments and Evaluations
A reliable biometric identification system is capable of handling five basic elements: data collection, transmission, signal processing, storage, and decision making. Past research has shown that even with ECG or other biometric-based verification algorithms, no practical solutions have been found to cut the cost of implementation. To overcome this obstacle, we implement a portable device, as shown in Figure 3, so the identification process can be done efficiently and less expensively. This board uses a TI AFE4900 chip, and the AFE4900 device is an analog front-end (AFE) for synchronized ECG and photoplethysmogram (PPG) signal acquisition. The device can also be used for optical bio-sensing applications, such as monitoring heart rate and measuring the saturation level of peripheral capillary oxygen (SpO 2 ). cumulative squared Euclidean distances of multiple data points instead of the single data point used in the more familiar one-dimensional case. For details, please refer to [28].

Experiments and Evaluations
A reliable biometric identification system is capable of handling five basic elements: data collection, transmission, signal processing, storage, and decision making. Past research has shown that even with ECG or other biometric-based verification algorithms, no practical solutions have been found to cut the cost of implementation. To overcome this obstacle, we implement a portable device, as shown in Figure 3, so the identification process can be done efficiently and less expensively. This board uses a TI AFE4900 chip, and the AFE4900 device is an analog front-end (AFE) for synchronized ECG and photoplethysmogram (PPG) signal acquisition. The device can also be used for optical biosensing applications, such as monitoring heart rate and measuring the saturation level of peripheral capillary oxygen (SpO2). ECG signals and blood oxygen level data can be collected by placing this small sensor on a thin area of an individual's body, usually a fingertip; signals are then transmitted to the user's mobile phone via Bluetooth and stored directly in its memory card. This equipment provides a low-cost, effortless way to gather and handle biometric information necessary for human recognition. Given the popularity of wearable monitoring devices [29,30], we believe that our equipment has the potential for wide application.

Experiment for QSMI Algorithm
For our verification experiment, data were collected from 18 individuals. This raw dataset contains a sequence of point values at certain times. For each individual, two sets of data were collected, one for training purposes and the other for testing the proposed algorithm.
We have two training set for ECG and blood oxygen signals data; the rest of the set is for the test samples data. In the training stage, each set of one individual should be a template matrix compared to every other set, including set of different individuals. In the training stage, we use the training data set of all 18 individuals to evaluate the correlation coefficients. To verify the efficiency of this algorithm, we train the sparse matrix containing a segment of sample data points and get thresholds for each individual.
Numerous metrics have been used to assess the performance of a biometric factor. The most common performance metrics-false acceptance (FA) and false rejection (FR) rates-are employed to evaluate the QSMI algorithm. The FA rate, which is the probability that a biometric system will accept an incorrect input as a positive match, is obtained by testing known biometric templates against a large data collection. By contrast, the FR rate represents the probability that a biometric system will incorrectly reject an input as a negative match. FA and FR rates usually show an inverse relationship with one another. We define the accuracy rate, Acc, as ECG signals and blood oxygen level data can be collected by placing this small sensor on a thin area of an individual's body, usually a fingertip; signals are then transmitted to the user's mobile phone via Bluetooth and stored directly in its memory card. This equipment provides a low-cost, effortless way to gather and handle biometric information necessary for human recognition. Given the popularity of wearable monitoring devices [29,30], we believe that our equipment has the potential for wide application.

Experiment for QSMI Algorithm
For our verification experiment, data were collected from 18 individuals. This raw dataset contains a sequence of point values at certain times. For each individual, two sets of data were collected, one for training purposes and the other for testing the proposed algorithm.
We have two training set for ECG and blood oxygen signals data; the rest of the set is for the test samples data. In the training stage, each set of one individual should be a template matrix compared to every other set, including set of different individuals. In the training stage, we use the training data set of all 18 individuals to evaluate the correlation coefficients. To verify the efficiency of this algorithm, we train the sparse matrix containing a segment of sample data points and get thresholds for each individual.
Numerous metrics have been used to assess the performance of a biometric factor. The most common performance metrics-false acceptance (FA) and false rejection (FR) rates-are employed to evaluate the QSMI algorithm. The FA rate, which is the probability that a biometric system will accept an incorrect input as a positive match, is obtained by testing known biometric templates against a large data collection. By contrast, the FR rate represents the probability that a biometric system will incorrectly reject an input as a negative match. FA and FR rates usually show an inverse relationship with one another. We define the accuracy rate, Acc, as The purpose of a threshold is to determine how close to a template the input data must be for it to be considered a match. The threshold for FA+FR 2 will be obtained from the training set; this procedure will be discussed in detail later. Figure 4 illustrates the relationship between FA+FR 2 and δ for the classification procedure. It shows that FA+FR 2 lies between 0.0686 and 0.2059 in template T 1 , and between 0.1830 and 0.2386 in template T 2 . A better Acc rate can be attained using an iterative approach to reach a suitable threshold δ. The purpose of a threshold is to determine how close to a template the input data must be for it to be considered a match. The threshold for will be obtained from the training set; this procedure will be discussed in detail later. Figure 4 illustrates the relationship between and δ for the classification procedure. It shows that lies between 0.0686 and 0.2059 in template , and between 0.1830 and 0.2386 in template . A better Acc rate can be attained using an iterative approach to reach a suitable threshold . As we can see from (5), a smaller value yields a higher Acc. Figure 4 shows that template has a better accuracy rate, so only will be used for discussion. Acc is maximized as reaches its minimum, when = 0.019 , so the appropriate threshold for multiple subject classification is set at 0.019. Figure 5 shows detailed rate information for template as increases from 0.001 to 0.03, clearly showing that reaches its minimum when = 0.019. The FA rates of all 18 individuals in when δ = 0.019 are shown in Figure 6. Since FR equals 0 for all individuals, it is not displayed.
Since the sample size of our ECG and blood oxygen data is small, it is highly possible to have the data map to the same point. To take advantage of this feature point, mapping rule (2) is employed to define the quantization interval. To achieve optimal results, various quantization intervals are assessed, and the experimental results of the improved algorithms are shown in Figure 7. Since all FR rates are 0 for all quantization intervals, only FA and are presented.
As can be seen in Figure 7, the rate is 0.1193 when no quantization is carried out and reaches its minimum, 0.0686, when a three-level (0, 1, 2) quantization procedure is performed for data points reduction. More specifically, implementing the proposed QSMI algorithm raises the Acc rate for identification from 0.8807 to 0.9314, an approximately 5% enhancement. As we can see from (5), a smaller FA+FR 2 value yields a higher Acc. Figure 4 shows that template T 1 has a better accuracy rate, so only T 1 will be used for discussion. Acc is maximized as FA+FR 2 reaches its minimum, when δ = 0.019, so the appropriate threshold δ for multiple subject classification is set at 0.019. Figure 5 shows detailed rate information for template T 1 as δ increases from 0.001 to 0.03, clearly showing that FA+FR 2 reaches its minimum when δ = 0.019. The purpose of a threshold is to determine how close to a template the input data must be for it to be considered a match. The threshold for will be obtained from the training set; this procedure will be discussed in detail later. Figure 4 illustrates the relationship between and δ for the classification procedure. It shows that lies between 0.0686 and 0.2059 in template , and between 0.1830 and 0.2386 in template . A better Acc rate can be attained using an iterative approach to reach a suitable threshold . As we can see from (5), a smaller value yields a higher Acc. Figure 4 shows that template has a better accuracy rate, so only will be used for discussion. Acc is maximized as reaches its minimum, when = 0.019 , so the appropriate threshold for multiple subject classification is set at 0.019. Figure 5 shows detailed rate information for template as increases from 0.001 to 0.03, clearly showing that reaches its minimum when = 0.019. The FA rates of all 18 individuals in when δ = 0.019 are shown in Figure 6. Since FR equals 0 for all individuals, it is not displayed.
Since the sample size of our ECG and blood oxygen data is small, it is highly possible to have the data map to the same point. To take advantage of this feature point, mapping rule (2) is employed to define the quantization interval. To achieve optimal results, various quantization intervals are assessed, and the experimental results of the improved algorithms are shown in Figure 7. Since all FR rates are 0 for all quantization intervals, only FA and are presented.
As can be seen in Figure 7, the rate is 0.1193 when no quantization is carried out and reaches its minimum, 0.0686, when a three-level (0, 1, 2) quantization procedure is performed for data points reduction. More specifically, implementing the proposed QSMI algorithm raises the Acc rate for identification from 0.8807 to 0.9314, an approximately 5% enhancement. The FA rates of all 18 individuals in T 1 when δ = 0.019 are shown in Figure 6. Since FR equals 0 for all individuals, it is not displayed.
Since the sample size of our ECG and blood oxygen data is small, it is highly possible to have the data map to the same point. To take advantage of this feature point, mapping rule (2) is employed to define the quantization interval. To achieve optimal results, various quantization intervals are assessed, and the experimental results of the improved algorithms are shown in Figure 7. Since all FR rates are 0 for all quantization intervals, only FA and FA+FR 2 are presented. As can be seen in Figure 7, the FA+FR 2 rate is 0.1193 when no quantization is carried out and reaches its minimum, 0.0686, when a three-level (0, 1, 2) quantization procedure is performed for data

Experiment for MDI Algorithm
First, to prove the correctness of our multi-dimensional algorithm, we slowly and incrementally use the same evaluation method for the PTB data (see, for example, the PTB database at www.physionet.org/physiobank/database/ptbdb/), from the double-lead ECG sensor, obtaining the results shown in Figure 8.

Experiment for MDI Algorithm
First, to prove the correctness of our multi-dimensional algorithm, we slowly and incrementally use the same evaluation method for the PTB data (see, for example, the PTB database at www.physionet.org/physiobank/database/ptbdb/), from the double-lead ECG sensor, obtaining the results shown in Figure 8.

Experiment for MDI Algorithm
First, to prove the correctness of our multi-dimensional algorithm, we slowly and incrementally use the same evaluation method for the PTB data (see, for example, the PTB database at www.physionet. org/physiobank/database/ptbdb/), from the double-lead ECG sensor, obtaining the results shown in Figure 8.

Experiment for MDI Algorithm
First, to prove the correctness of our multi-dimensional algorithm, we slowly and incrementally use the same evaluation method for the PTB data (see, for example, the PTB database at www.physionet.org/physiobank/database/ptbdb/), from the double-lead ECG sensor, obtaining the results shown in Figure 8.   It can be seen from Figure 8 that when the lead dimension is increased, FA+FR 2 is smaller, that is, the accuracy is higher. When the dimension reaches 5, FA+FR 2 is 0.013. At this point, the error rejection rate is 0, and Acc is up to 98.7%, proving that our multi-dimensional feature algorithm is feasible when mapping multi-dimensional feature data to space.
In the description of the MDI algorithm, we introduce the reduction remapping first, and the variation in the reduced parameter variable ∆θ affects the recognition rate. Figure 9 shows the change in ∆θ with the FA+FR 2 curve. It can be seen from Figure 8 that when the lead dimension is increased, is smaller, that is, the accuracy is higher. When the dimension reaches 5, is 0.013. At this point, the error rejection rate is 0, and Acc is up to 98.7%, proving that our multi-dimensional feature algorithm is feasible when mapping multi-dimensional feature data to space. In the description of the MDI algorithm, we introduce the reduction remapping first, and the variation in the reduced parameter variable ∆θ affects the recognition rate. Figure 9 shows the change in ∆θ with the curve.  Figure 9 we see that the FA/FR rate of the recognition result changes as ∆θ changes. When ∆θ = 20, reaches 0.0133, which is the best result for 5D, as shown in Figure 8. Of course, the 2D to 4D ∆θ in Figure 8 are the best values given under our experiment, and the chart description is no longer listed here. According to our previous charts, when the number of leads is 5-that is, the computational space is 5 dimensions-the experiment achieves good results but takes up a lot of memory. Hence, we need to find a new algorithm to optimize the data. To do so, we introduce the method of using DTW and sparse combination to solve the similarity calculation problem of different sequence lengths. Figure 10 presents the experimental results using DTW and sparse combination in a 5dimensional space.  From Figure 9 we see that the FA/FR rate of the recognition result changes as ∆θ changes. When ∆θ = 20, FA+FR 2 reaches 0.0133, which is the best result for 5D, as shown in Figure 8. Of course, the 2D to 4D ∆θ in Figure 8 are the best values given under our experiment, and the chart description is no longer listed here.
According to our previous charts, when the number of leads is 5-that is, the computational space is 5 dimensions-the experiment achieves good results but takes up a lot of memory. Hence, we need to find a new algorithm to optimize the data. To do so, we introduce the method of using DTW and sparse combination to solve the similarity calculation problem of different sequence lengths. Figure 10 presents the experimental results using DTW and sparse combination in a 5-dimensional space. It can be seen from Figure 8 that when the lead dimension is increased, is smaller, that is, the accuracy is higher. When the dimension reaches 5, is 0.013. At this point, the error rejection rate is 0, and Acc is up to 98.7%, proving that our multi-dimensional feature algorithm is feasible when mapping multi-dimensional feature data to space. In the description of the MDI algorithm, we introduce the reduction remapping first, and the variation in the reduced parameter variable ∆θ affects the recognition rate. Figure 9 shows the change in ∆θ with the curve.  Figure 9 we see that the FA/FR rate of the recognition result changes as ∆θ changes. When ∆θ = 20, reaches 0.0133, which is the best result for 5D, as shown in Figure 8. Of course, the 2D to 4D ∆θ in Figure 8 are the best values given under our experiment, and the chart description is no longer listed here. According to our previous charts, when the number of leads is 5-that is, the computational space is 5 dimensions-the experiment achieves good results but takes up a lot of memory. Hence, we need to find a new algorithm to optimize the data. To do so, we introduce the method of using DTW and sparse combination to solve the similarity calculation problem of different sequence lengths. Figure 10 presents the experimental results using DTW and sparse combination in a 5dimensional space.  In Figure 10, Num_n_Red_m on the abscissa indicates that the size of the window space of size m is reduced when the space size is n. It can be concluded from the figure that when the 5-lead ECG signal is mapped to a 5-dimensional space of 32 and the window is reduced by 2, the recognition result FA+FR 2 is the smallest, 0.0115, slightly higher than the multi-dimensional feature recognition (baseline) does not combine with DTW sparse algorithm, which will generate experimental result of 0.0133. We evaluate the spatial and temporal efficiencies of several of the above-mentioned algorithms using a number of experimental data and averaging the methods to evaluate them fairly. Table 2 compares the average memory space occupied by the DTW and the sparse algorithm, and the average memory space of the baseline algorithm. When MDI is combined with the DTW sparse optimization algorithm, because the number of sparse eigenvalues of each template sample is inconsistent, the memory space they occupy is also different. Table 2 lists all the samples with their average, minimum, and maximum memory space, but the occupied memory space of each sample is no longer listed. As can be seen, when MDI is combined with the DTW sparse algorithm, the average memory space for all the samples is much smaller than for the baseline algorithm. In actual applications, the maximum memory space required is an obvious factor. The original DTW uses the most memory. In comparison, the memory required for the new method is much smaller than for DTW and much smaller than for the baseline, so an identification system with this algorithm could be used in an embedded system or on a small chip.
However, when designing the algorithm, we should consider not only the spatial complexity but also the time complexity, so we evaluate the above algorithm in terms of time efficiency. Table 3 compares the time required to complete the experiment when using DTW in combination with the sparse method for identification. Using Figure 10 and Table 3, we can conclude that when the DTW is combined with the sparse algorithm, the time efficiency decreases as the recognition accuracy increases. This is because the dynamic programming algorithm is used when the final reduced feature value is increased such that the time efficiency rises to the power of O n 2 . In the baseline algorithm, dynamic programming is not used, and only the correlation coefficient between the two samples is calculated, so the time complexity is not as high as with the MDI algorithm. These experimental results show that the MDI algorithm and the DTW combined with the sparse identification multi-dimensional feature algorithm have advantages and disadvantages in terms of time and space efficiency, but the MDI algorithm increases the time by only 1 or 2 seconds in comparison with the baseline algorithm, so it is still very feasible for practical applications.

Comparison with One-Dimensional Models
To date, not much attention has been paid to investigating human identity verification using a combination of ECG signal and blood oxygen level. In this section, three designs involving only ECG signals are introduced. We then compare our scheme with these one-dimensional algorithms. To achieve a fair assessment, all data are selected from the same database-the biometric collection gathered from the wearable device introduced in Section 4.
(a) Reduced binary pattern (RBP) The reduced binary pattern algorithm employs the frequency and rank order statistics of the input ECG signals [31]. Express any collected ECG signal from an individual in the form {x 1 , x 2 , x 3 , . . . , x n }, where each real-value x i denote the ith value from the input data. Compare any two consecutive values and categorize the data into two cases: decrease or increase in x i . A preliminary reduced function then maps these two cases to 0 or 1, respectively, according to the following two-state rule as otherwise. This step converts one segment of the ECG signal of length n to a binary sequence Y = {y 1 , y 2 , . . . , y n−1 } of length n − 1. Group every m bits in Y into a reduced binary sequence and collect all m-bits to form a reduced binary pattern {b 1 , b 2 , . . . , b n−m }, where b k = {y k , y k+1 , . . . , y k+m−1 }.
Let d k be the decimal expansion of b k , k = 1, 2, · · · , n − m. It is obvious that the values of d k can change from 0 to 2 m − 1. Count the occurrences of each d k , sort them in order of descending frequency, and find the corresponding probability. Ranking can be omitted if the tested ECG segments have the same sampling duration.
Consider two segments of ECG data S 1 and S 2 , which may belong to two distinct individuals. To determine how closely they are related, the measurement of similarity between S 1 and S 2 is defined as where p i (d k ) and R i (d k ) represent the relative frequency and rank of w k in the sequence S i , i = 1, 2. The absolute difference between two ranks is multiplied by the normalized probabilities as a weighted sum; the factor 1 2 m −1 ensures all values of D m lie within the scope of (0, 1). (b) Waveform algorithm.
Waveform algorithms [32] use characteristic features obtained directly from the ECG waveform to verify human identity. After an ECG waveform is measured, it passes through high-and low-pass filters for pre-processing. Various techniques are then carried out to extract the characteristic points inside the waveform where ECG complexes P, Q, R, S and T are located. Based on the extracted values of the characteristic points, some identification features are composed of relative representations. The extraction of characteristic points and the composed features are shown in Figure 11. In a waveform-based study [33] Based on the feature-vectors obtained from the individuals, we use the measurement of similarity formula (6) to evaluate the difference between two subjects. Closeness between two feature-vectors S 1 and S 2 is denoted as the distance d(S 1 , S 2 ). We then use (5) to calculate the FA, FR, and FA+FR 2 rates of all obtained distances. where ( ) and ( ) represent the relative frequency and rank of in the sequence , = 1, 2. The absolute difference between two ranks is multiplied by the normalized probabilities as a weighted sum; the factor ensures all values of lie within the scope of (0, 1).
(b) Waveform algorithm. Figure 11. Waveform feature extraction for identification. Figure 11. Waveform feature extraction for identification.

(c) Wavelet transform algorithm
A wavelet transformation is the representation of a function by wavelets, which are scaled copies of finite-length or fast-decaying oscillation waveforms. The wavelet-based algorithm [34] includes the following procedures: each R-R cardiac cycle is obtained through R-R detection; an interpolation is performed on the R-R interval so each R-R cardiac cycle holds 284 data points; every R-R cycle is cut into three parts, each containing 85, 156, and 43 points; the first 85 and the last 43 points in each R-R cycle are assembled to form a 128-point segment; every four segments are grouped, and an n-level discrete wavelet transform (DWT) is performed to obtain the corresponding wavelet coefficients. Four of the computed wavelet coefficients are gathered as a wavelet vector.
The Euclidean distance between two wavelet vectors S 1 and S 2 is denoted as d(S 1 , S 2 ); the intraand inter-group distances can then be calculated using (6). Table 4 contains the outcomes of our QSMI design, with three one-dimensional algorithms for comparison. It is clear that the RBP, waveform, and wavelet transform algorithms perform well, with an approximately 75% success rate for human identification. But with an accuracy of 93%, the proposed QSMI algorithm using both ECG signal and blood oxygen level far outperforms the others. Next, we compare our scheme with two other typical two-dimensional algorithms.

Comparison with TTwo-d Dimensional MModels
Now, we compare the proposed QSMI scheme with two other commonly used two-dimensional algorithms: baseline and principal component analysis.

(a) Baseline algorithm
The baseline algorithm is a simple and effective way to determine the similarity between two sparse matrices [35]. The sparse matrix is formed by the mapping rule and dimensionality reduction of the sample points. There are many standards of similarity. Here, we use the Euclidean distance; that is, we define the baseline to measure the similarity between two sparse matrices SM 1 and SM 2 as: After the similarity is measured using (10), the correlation coefficient between two b's, as well as the minimum and mean correlation coefficients R i min and R i mean for each individual i, can be computed. Furthermore, the FA, FR, and FA+FR 2 rates can be determined using (5).
(b) Principal component analysis (PCA) algorithm PCA is a powerful tool for identifying similarities and differences in patterns of data. For highdimension data, this method reduces the number of dimensions without much loss of information. PCA maps data in a high-dimensional space to a space of lower dimensions using a linear projection. This well-known dimension reduction method is commonly applied to investigate linear correlations across multiple time series [36].
For a matrix X consisting of sample data, a linear transformation converts X to Y through The linear mapping P allows Y to extract the principal components from X, and each row of P contains the eigenvectors of C X , where In PCA assessment, we use the data collected from the previously mentioned wearable device, the difference being that the ECG signal is arranged prior to the blood oxygen level. The PCA algorithm extracts select features that are then used to train and test human identification. Experimental results confirm that the success rate for identification is much better with the five extracted features. Figure 12 shows that with ECG signal and blood oxygen level combined, the proposed QSMI algorithm demonstrates a much lower FA+FR 2 value, which equates to a better accuracy rate than with the other two two-dimensional designs. After the similarity is measured using (10), the correlation coefficient between two ′s, as well as the minimum and mean correlation coefficients and for each individual i, can be computed. Furthermore, the FA, FR, and rates can be determined using (5).
(b) Principal component analysis (PCA) algorithm PCA is a powerful tool for identifying similarities and differences in patterns of data. For highdimension data, this method reduces the number of dimensions without much loss of information. PCA maps data in a high-dimensional space to a space of lower dimensions using a linear projection. This well-known dimension reduction method is commonly applied to investigate linear correlations across multiple time series [36].
For a matrix X consisting of sample data, a linear transformation converts X to Y through The linear mapping P allows Y to extract the principal components from X, and each row of P contains the eigenvectors of , where In PCA assessment, we use the data collected from the previously mentioned wearable device, the difference being that the ECG signal is arranged prior to the blood oxygen level. The PCA algorithm extracts select features that are then used to train and test human identification. Experimental results confirm that the success rate for identification is much better with the five extracted features. Figure 12 shows that with ECG signal and blood oxygen level combined, the proposed QSMI algorithm demonstrates a much lower value, which equates to a better accuracy rate than with the other two two-dimensional designs.

Comparison with MMulti-d Dimensional MModels
Here, we use PTB multi-lead ECG signal data. We read the multi-lead signal from the PTB data, fuse the multi-lead data signal, and then classify the individual. Similar to the experimental models in the previous two parts, the PTB database contains 249 individuals, and we divide the data into training sets and test sets. To evaluate the algorithm fairly and without loss of efficiency, we select 12 individual 12-lead signals to conduct our experiment. To analyze with other multi-dimensional feature algorithms, we compare the MDI algorithm with the PCA algorithm and the baseline-QSMI algorithm. The results are shown in Figure 13.
It can be seen from Figure 13 that the recognition rate of our MDI algorithm reaches 98.67%, which is about 3.4% higher than the recognition result with the baseline algorithm and up to 13% higher than with the PCA algorithm, thus proving that the final proposed MDI algorithm has considerable advantages.

Comparison with MMulti-d Dimensional MModels
Here, we use PTB multi-lead ECG signal data. We read the multi-lead signal from the PTB data, fuse the multi-lead data signal, and then classify the individual. Similar to the experimental models in the previous two parts, the PTB database contains 249 individuals, and we divide the data into training sets and test sets. To evaluate the algorithm fairly and without loss of efficiency, we select 12 individual 12-lead signals to conduct our experiment. To analyze with other multi-dimensional feature algorithms, we compare the MDI algorithm with the PCA algorithm and the baseline-QSMI algorithm. The results are shown in Figure 13.

Additional Discussion
This section briefly addresses two questions. Some studies conduct ECG analysis with common classifiers such as artificial neural networks and support vector machines [37]. Is such an approach suitable for this study? General classification algorithms are more commonly used in ECG diagnosis. Our research is more suitable for ECG identification, as it applies a one-to-one comparison of a similarity algorithm to identify a person. Because ECG is a one-dimensional data source, if it is used for many-to-many identification, high error rates may result.
Another question is whether a system combining ECG and photoplethysmography (PPG) data has practical value. Some studies have applied ECG and PPG to heart-rate analysis or diagnosis [38], and now smart watches are able to integrate these two sensors, so including diagnosis and identification devices in smart watches should be possible. The successful combination of light and electric sensors proven by our research can achieve a higher recognition rate, further increasing the application value of ECG and PPG.

Conclusions
This study presents an innovative and efficient approach to better verify human identification. Experiments confirm that the combination of ECG signal and blood oxygen level yields greater identification accuracy than using ECG signal alone. The identity verification success rate using the proposed QSMI algorithm is 93.14%, almost 20% higher than with other one-dimensional methods and 6% higher than with two commonly used two-dimensional algorithms. Practically speaking, all ECG and blood oxygen level data can be easily collected from a small wearable device and stored in the memory card of a mobile phone. If we can collect more data dimensions, we can increase the ECG recognition rate through our proposed MDI algorithm.  It can be seen from Figure 13 that the recognition rate of our MDI algorithm reaches 98.67%, which is about 3.4% higher than the recognition result with the baseline algorithm and up to 13% higher than with the PCA algorithm, thus proving that the final proposed MDI algorithm has considerable advantages.

Additional Discussion
This section briefly addresses two questions. Some studies conduct ECG analysis with common classifiers such as artificial neural networks and support vector machines [37]. Is such an approach suitable for this study? General classification algorithms are more commonly used in ECG diagnosis. Our research is more suitable for ECG identification, as it applies a one-to-one comparison of a similarity algorithm to identify a person. Because ECG is a one-dimensional data source, if it is used for many-to-many identification, high error rates may result.
Another question is whether a system combining ECG and photoplethysmography (PPG) data has practical value. Some studies have applied ECG and PPG to heart-rate analysis or diagnosis [38], and now smart watches are able to integrate these two sensors, so including diagnosis and identification devices in smart watches should be possible. The successful combination of light and electric sensors proven by our research can achieve a higher recognition rate, further increasing the application value of ECG and PPG.

Conclusions
This study presents an innovative and efficient approach to better verify human identification. Experiments confirm that the combination of ECG signal and blood oxygen level yields greater identification accuracy than using ECG signal alone. The identity verification success rate using the proposed QSMI algorithm is 93.14%, almost 20% higher than with other one-dimensional methods and 6% higher than with two commonly used two-dimensional algorithms. Practically speaking, all ECG and blood oxygen level data can be easily collected from a small wearable device and stored in the memory card of a mobile phone. If we can collect more data dimensions, we can increase the ECG recognition rate through our proposed MDI algorithm.
Author Contributions: K.-K.T. proposed the main idea and organized the paper; J.L. carried out the experiments and wrote the paper; C.-C.C. organized the data and helped with writing it; S.-Y.T. helped with editing the paper; C.-F.Y. helped with editing the paper.
Funding: Thank you for the project support of Shenzhen Municipal Government; the project numbers are KQJSCX20170726104033 357, JCYJ20150513151706567 and JCYJ20160531191837793.

Conflicts of Interest:
The authors declare no conflict of interest.