Missing RRI Interpolation Algorithm based on Locally Weighted Partial Least Squares for Precise Heart Rate Variability Analysis

The R-R interval (RRI) fluctuation in electrocardiogram (ECG) is called heart rate variability (HRV), which reflects activities of the autonomic nervous system (ANS) and has been used for various health monitoring services. Accurate R wave detection is crucial for success in HRV-based health monitoring services; however, ECG artifacts often cause missing R waves and deteriorate the accuracy of HRV analysis. The present work proposes a new missing RRI interpolation technique based on Just-In-Time (JIT) modeling. In the JIT modeling framework, a local regression model is built by weighing samples stored in the database according to the distance from a query and output is estimated only when an estimate is requested. The proposed method builds a local model and estimates missing RRI only when an RRI detection error is detected. Locally weighted partial least squares (LWPLS) is adopted for local model construction. The proposed method is referred to as LWPLS-based RRI interpolation (LWPLS-RI). The performance of the proposed LWPLS-RI was evaluated through its application to RRI data with artificial missing RRIs. We used the MIT-BIH Normal Sinus Rhythm Database for nominal RRI dataset construction. Missing RRIs were artificially introduced and they were interpolated by the proposed LWPLS-RI. In addition, MEAN that replaces the missing RRI by a mean of the past RRI data was compared as a conventional method. The result showed that the proposed LWPLS-RI improved root mean squared error (RMSE) of RRI by about 70% in comparison with MEAN. In addition, the proposed method realized precise HRV analysis. The proposed method will contribute to the realization of precise HRV-based health monitoring services.


Introduction
The RR interval (RRI) fluctuation in an electrocardiogram (ECG) is known as heart rate variability (HRV), which is a physiological activity reflecting the cardiovascular control exerted by the autonomic nervous systems (ANS) [1,2]. Since the ANS is a control system that acts largely unconsciously and regulates functions such as the heart rate, digestion, respiration, perspiration, and body temperature, it is related to various diseases. Many types of HRV features have been proposed for evaluation of ANS activities, [3,4], and application areas of HRV analysis has been expanded due to recent advances in machine learning technologies. A brief introduction of HRV analysis is described in Appendix A.
It is known that changes in sleep condition affect HRV [5,6]. An HRV-based drowsiness detection method using linear discriminant analysis (LDA) was developed by Vicente et al. [7]. Patel et al.  Although missing R waves may easily be found when the ECG data are analyzed offline, such offline analysis cannot be used for online applications like drowsy driving detection and epileptic seizure prediction.
When the (j + 1)th R wave is not detected, r j is as r j =r j +r j+1 (1) wherer j is the true RRI measurement when both the jth and (j + 1)th R waves are detected correctly. The simplest way of missing RRI treatment is to remove or ignore r j [24]. Although Clifford and Tarassenko recommended using the Lomb-Scargle (LS) periodogram after removing r j for frequency domain feature extraction [25], such treatment cannot be used online because it causes time gaps between real-time. An ectopic RRI modification method from ECG data was proposed by using prior knowledge about arrhythmia [26]. However, its use for missing RRI interpolation is difficult since the effect of R wave detection errors on changes in RRI data is different from that of arrhythmia. Thus, missing RRI should be interpolated appropriately in real-time for precise HRV analysis. Highly adequate interpolation is required particularly in drowsy driving detection and epileptic seizure prediction because their errors caused by missing RRI may lead to severe injuries and accidents.
The electrode contact failure or sensor failure may cause long-term ECG measurement failure. In such cases, HRV analysis and its use for health monitoring should be stopped because we cannot use any information for HRV analysis. Thus, this study focuses on the modification of an isolated R wave detection error.
The present work proposes a new missing RRI interpolation method based on just-in-time (JIT) modeling. JIT modeling or lazy learning is a statistical modeling method that builds a local regression model and that estimates an output using the constructed local model only when an estimate is requested. Although some JIT modeling methods have been proposed [27][28][29][30], the present work adopts locally-weighted partial least squares (LWPLS), which utilizes partial least squares (PLS) for local model construction [31,32]. The proposed method, referred to as LWPLS-based RRI interpolation (LWPLS-RI), interpolates missing RRIs by using a local regression model only when an R wave detection error is detected. R wave detection errors are detected by using a threshold of the measured RRI.
This paper is organized as follows. Section 2 introduces LWPLS and proposes new missing RRI interpolation method. Section 3 validates the performance of the proposed missing RRI interpolation method through a case study of RRI data with artificial missing RRI. Finally, the conclusion and future work are described in Section 4. Although a preliminary version of this work has been reported in [33], the data analyzed in [33] was small, and the performance of the proposed method was not compared with other methodologies.

Materials and Methods
The present work proposes a new JIT-based algorithm for interpolating missing RRI, which is referred to as LWPLS-based RRI interpolation (LWPLS-RI). This section begins with partial least squares (PLS) and locally-weighted PLS (LWPLS) used in the proposed algorithm.

PLS
PLS is a widely used linear regression method that can build an accurate model with a small number of latent variables. Given an input data matrix X ∈ N×M whose nth row is the nth input sample x n ∈ M and an output data vector y ∈ N whose nth element is the nth output sample y n ∈ . X and y are mean-centered and appropriately scaled. In PLS, the input X and the output y are broken down as follows: where T ∈ N×K is the latent variable matrix whose columns are the latent variable t k ∈ N (k = 1, · · · , K), P ∈ M×K is the loading matrix of X whose columns are the loading vectors p k ∈ M , and b = [b 1 , · · · , b K ] T is the regression coefficient vector of y. K denotes the number of adopted latent variables. E ∈ N×M and f ∈ N are errors. The nonlinear iterative partial least squares (NIPALS) algorithm can be used to construct a PLS model [34]. Suppose that the first to kth latent variables t 1 , · · · , t k , the loading vectors p 1 , · · · , p k and the loading b 1 , · · · , b k are given. The (k + 1)th residual input and output can be expressed as follows: t k is a linear combination of the columns of X k , that is, t k = X k w k where w k ∈ M is the kth weighting vector. It is defined so that the covariance between y k and t k is maximized under ||w k || = 1. Using the Lagrange multipliers method, the function to maximize can be defined as where µ is the Lagrange multiplier. By solving ∂G k /∂w = 0, w k is derived as The kth loading vector p k and the kth loading b k are as follows: Finally, the above procedure is repeated until the number of adopted latent variables K is achieved; K can be determined by cross validation. Instead of using the Lagrange multipliers method, the derivation of the weighting vectors w k in the NIPALS algorithm can be formulated as an eigenvalue problem. w k is the eigenvector corresponding to the maximum eigenvalue of the following eigenvalue problem: where λ is an eigenvalue.
Algorithm 1 describes the eigenvalue-based NIPALS algorithm for PLS modeling.
Derive the eigenvector w k which corresponds to the maximum eigenvalue of the following eigenvalue problem: 9: if k = K then 10: Output P = [p 1 , · · · , p K ] and b = [b 1 , · · · b K ] T . 11: end if 12: end for

Locally-Weighted Partial Least Squares
In general, a global linear model cannot function well when a system has strong nonlinearity or changes in characteristics with time. The use of nonlinear modeling methods, such as support vector machine (SVM) or artificial neural network (ANN), is the first choice; however, nonlinear modeling methods are not always applicable because a considerable amount of data is required for nonlinear modeling. Also, it is difficult to even for nonlinear models to cope with changes in system characteristics with time.
Another method for dealing with these problems is JIT modeling, which has the following features: • Store new samples into a database when available.
• Construct a local model by the samples located in the neighboring region around a query and estimate an output only when estimation is required. • Discard the constructed local model after its use for output estimation.
In JIT modeling, samples for local modeling should be selected appropriately. LWPLS is an expansion of PLS based on the framework of JIT modeling for dealing with nonlinearity and system characteristics change. In LWPLS, a local PLS model is built by weighted samples stored in a database according to the similarity between the query and the weighted samples only when an estimate is requested. The constructed local model represents a nonlinear relationship between the input and the output around the query because a nonlinear relationship can be approximated as a linear relationship in a small region. After being used for estimation, the used local model is purged [31,32].
Let X and y have already been stored in a database. When an estimate is requested for a query x q , the similarity ω n between x q and the nth sample x n (n = 1, · · · , N) is calculated, and a local PLS model is built by the weighted samples with a similarity matrix Ω ∈ N×N defined as: The similarity between the query x q and a sample x n ω n in this work is defined as: where σ d denotes the standard deviation of d n (n = 1, 2, · · · , N) and ϕ is a localization parameter; the similarity decreases steeply when ϕ is small and gradually when ϕ is large. When the similarity matrix Ω is an identity matrix, LWPLS becomes the original PLS.
Algorithm 2 describes a procedure of LWPLS based on the NIPALS algorithm. Steps 4-7 derive the latent variable t, the loading vector p, and the regression coefficient vector b iteratively. In step 6, w k is calculated as the eigenvector of X T k Ωy k y T k ΩX k , which corresponds to the maximum eigenvalue. The final estimate is output when k = K. A localization parameter ϕ and the number of latent variables K are tuning parameters, which are determined by trial and error or cross-validation.

6:
Derive the kth latent variables of X and y, and the rth latent variable of x q ; Derive the kth loading vectors of X and the rth regression coefficient y; 8: Updateŷ q =ŷ q + t q,k d k . 9: if k = K then 10: Outputŷ q as an estimate. 11: else 12: Calculate X k+1 , y k+1 , and x q,k+1 ; 13: end if 14: end for

Missing RRI Interpolation
When the (j + 1)th R wave is not detected and the measured rth R wave is expressed as Equation (1), only the jth RRI estimater j is required because the next RRI estimater j+1 can be calculated from r j andr j :r j+1 = r j −r j . At this time, successive missing occurrences are not considered.
Appropriate input variables should be determined for interpolating missing RRI by LWPLS. Multiple past RRI measurements r j−1 , · · · , r j−L+1 and the current measurement r j are used as input variables; where L is the number of past measurements. Although the current measurement may be useful for interpolation, the measured r j is about double ofr j . Thus, its average r j /2 is used as input.
The procedure of the proposed LWPLS-RI is described in Algorithm 3. Before missing RRI interpolation starts, the initial RRI buffer has to be stored for more than the buffer size W. In step 4, r is the threshold for finding an R wave detection error. When the RRI measurement r j exceedsr, it is determined that an R wave detection error has occurred. The thresholdr is a predetermined parameter to be tuned beforehand; however, the default value ofr can be set to 1,500 msec. In step 5, the newly measured RRI r j is queued into the RRI buffer in a first-in-first-out (FIFO) manner when an R wave is detected correctly. On the other hand,r j is estimated by LWPLS when an R wave detection error occurs in steps 8-10. Finally, r j is replaced by the estimatedr j andr j+1 which are queued into the RRI buffer in a FIFO manner in steps 11 and 12. The proposed LWPLS-RI has four tuning parameters: the localization parameter ϕ, the numbers of latent variables K and past RRI measurements used for an input L, and the buffer size W.

Algorithm 3 LWPLS-RI.
1: Set φ, R,r, and l. 2: while do 3: Measure the jth RRI r j . 4: if r j ≤r then 5: Enqueue r j into the RRI buffer in the FIFO manner. 6: Wait until the next RRI r j+1 is measured. Construct an input x j as Equation (23) from the RRI buffer. 9: Estimate the jth RRIr j from x j by using LWPLS. 10: Calculate the j + 1 RRI estimate;r j+1 = r j −r k .

end while
In the proposed algorithm, the RRI data collected from any persons can be used for the initial RRI buffer. Even when the RRI data collected from persons other than users are stored in the initial RRI buffer, the stored RRI data are replaced by the RRI data measured from users themselves through the FIFO manner in steps 11 and 12 in Algorithm 3.

Results and Discussion
This section evaluates and discusses the interpolation performance of the proposed LWPLS-RI through its application to RRI data in which missing RRIs were introduced artificially. Long-term ECG measurement failure and an ectopic RRI caused by arrhythmia were not considered here.

Simulation Procedure
This case study used the MIT-BIH Normal Sinus Rhythm Database for objective data construction [35]. ECG data measured from subjects 1-18 were clipped from the database and ECG data containing strong artifacts were eliminated. The R waves in the clipped ECG data were detected by using the peak detection algorithm, and each RRI was calculated. Each of eighteen pieces of RRI data, r [s] , was divided into three datasets for parameter optimization of LWPLS r [s] o , initial RRI buffer in Algorithm 3, r [s] b , and validation r [s] v , where s(s = 1, · · · , 18) denotes the subject index. The numbers of samples in r [s] o , r [s] b , and r [s] v were 6000, 500, and 5000, respectively. There was no R wave detection error in any of these datasets.
Missing RRIs were artificially introduced to the parameter optimization dataset r [s] o and the validation datasets r [s] v in a random manner as r j = r j + r j+1 and eliminated r j+1 . There were no successive missing RRIs, and the missing rates were α = 0.3%, 0.5%, and 1%. Since the ordinal heart rate of a healthy adult is about 60-80 bpm, 1% missing means that R wave detection error occurs about every 1.5 min. Figure 3 illustrates an example of RRI data with α = 0.5 which was generated from the RRI data of subject 16 r [16] v .  In this case study, the HRV features described in Appendix A were extracted. A rectangular sliding window whose size is three minutes was used. The time domain features were extracted directly from the raw RRI data. For frequency domain feature extraction, the RRI data were interpolated by the third-order spline and resampled at 4 Hz. An AR model of order 40 was used to calculate frequency domain features. These HRV feature extraction settings were determined based on [18]. Figure 4 shows NN50, RMSSD, HF, and LF/HF extracted from the data in Figure 3. HRV features except for NN50 change greatly due to missing RRIs and such a situation may deteriorate the performance of HRV-based monitoring services. v , appropriate parameters in LWPLS were determined using the parameter optimization datasets r [s] o . 1% of R wave detection errors artificially occurred in all r [s] o in a random manner. The localization parameter ϕ and the number of latent variables K in LWPLS were determined so that the root mean squared error (RMSE) between the true and interpolated RRIs was minimized. Here, the RRI buffer size W and the number of past RRIs used for input L were fixed to 500 and 3. The determined parameters were ϕ = 1.3 and K = 3, respectively.
The proposed method with the determined parameters was applied to the validation datasets r [s] v (s = 1, · · · , 18). Although step 4 in Algorithm 3 judges missing RRI occurrences, missing RRI positions were known in this case study for interpolation performance evaluation. The following three interpolation methods were tested for comparison: • MEAN: Replace the missing RRI r j by a mean of x j .
• Equal Division (ED): Replace r j by the value of r j /(q + 1), where q is the number of successive R wave detection errors. • PLS-RI: Replace r j by an output of a PLS model whose input is x j .
The ectopic RRI remove method [24] was not tested in this case study because it can not be used in online applications. After interpolation by these methods, RMSE between the true and interpolated RRIs and RMSE between HRV features derived from the true and interpolated RRIs were calculated.
In this case study, missing RRI generation in the validation datasets r [k] v and missing RRI interpolation were repeated 30 times for precise performance evaluation. Figure 5 shows application results of four missing RRI interpolation methods to RRI data with α = 0.5% in Figure 3 and HRV extraction from the interpolated RRI data. These RMSEs of the four methods were calculated through a simulation repeated 30 times using all of the eighteen pieces of validation data. RMSEs of PLS-RI and the proposed LWPLS-RI were lower than those of MEAN and ED, and LWPLS-RI was the best. The proposed LWPLS-RI improved RMSE about 70% in comparison with MEAN when α = 0.5%. In addition, the interpolation performance hardly changed even if the error rate increased. These results demonstrate the usefulness of the proposed LWPLS-RI for HRV analysis from RRI data with missing RRIs.

Discussion
According to the simulation, the proposed LWPLS-RI achieved the best performance of missing RRI interpolation, of which the mean of RMSEs calculated from all subjects was 15.1 when the error rate was α = 1%. However, there were differences among subjects as shown in Figure 9. RMSE calculated from subject 6 was worse than the other subjects. None of the interpolation methods could improve his/her RMSE. The standard deviations of the raw RRI data of subject 6 and subject 5 whose RMSE was small were 61.8 and 140.4, respectively, which clearly showing that RRI fluctuation of subject 6 was much larger than that of subject 5 although there were no RRI detection errors. This indicates that a missing RRI cannot be interpolated appropriately when the fluctuation range of RRI is huge. LWPLS uses the past information most similar to a sample to be estimated; however, the number of past similar RRIs for interpolation becomes small when RRIs fluctuate largely.  . RMSE of RRI interpolation in each subject by LWPLS when α = 0.5%. Although RMSEs of almost all subjects except Subject 6 were less than twenty, that of Subject 6 was more than forty.
The current RRI measurement r j (=r j +r j+1 ) is used for utilizing missing information.
To investigate the effect of using the current measurement, RRI interpolation without r j was tested. The number of past RRIs used for input L was L = 4 so that the total number of input variables did not change. As a result, the means of RMSE by MEAN, PLS-RI, and LWPLS-RI were 49.1, 34.2, and 31.9, respectively, which are worse than RRI interpolation with r j . This shows that the current RRI measurement r j should be used for interpolation in any method.
In the proposed LWPLS-RI, the RRI buffer size W and the number of past RRIs used for input L were fixed to 500 and 3 in this case study. To investigate the effect of these parameters on the interpolation performance of the proposed LWPLS-RI, different parameter settings were compared. The RRI buffer size W was changed to W = 100, 300, 500, and 1000 and other parameters were fixed to the same values as Section 3.1. RMSEs were calculated using validation data with α = 0.5% in a simulation repeated 30 times. The means of RMSE were 15.9, 15.4, 15.7, and 15.1 when W = 100, 300, 500, and 1000, respectively. These numbers show that the interpolation performance was not improved when a large buffer size was selected, and in fact became worse when W = 100. Thus, W = 300 is sufficient for RRI interpolation.
The number of past RRIs used for input L was changed to L = 2, 3, 4, and 5, and other parameters were fixed to the same values as Section 3.1, except the number of latent variables K which was determined using the parameter optimization dataset for every different L. RMSEs were calculated using validation data with α = 0.5% in a simulation repeated 30 times. The means of RMSE were 12.7, 12.8, 12.9, and 12.8 when L = 2, 3, 4, and 5, which shows that the interpolation performance did not change, regardless of which L was selected.
In addition, the proposed LWPLS-RI was applied to the RRI data containing two successive missing RRIs and compared with the single RRI missing case. In this simulation, two successive missing RRIs were randomly generated at five different points in each validation dataset, and they were interpolated sequentially by LWPLS-RI whose parameters were the same as Section 3.1. The following input was used for the interpolation instead of Equation (23): The interpolated RRI for the first missingr j,1 was added to the input for the second interpolation as Using these inputs, the RRI data containing two successive missing RRIs were interpolated by LWPLS-RI sequentially, and RMSEs of the first and the second interpolations were calculated. This procedure was repeated 30 times. The mean of RMSE of the first missing interpolation was 24.9, which was worse than that of the single missing interpolation. More information about RRI was lost in the successive missing case than the single missing case, which makes it difficult to interpolate RRI. Thus, it becomes more difficult to interpolate missing RRIs when more than three successive missing RRIs occur. However, the mean of RMSE of the second missing interpolation was 15.2 which is almost the same as the single missing interpolation case. This may have been caused by input variable construction for the second interpolation. Equation (25) uses the interpolated first RRIr j,1 . Information lost through successive missing RRIs was recovered to some extent by the first interpolation andr j,1 contained such recovered information that may be useful for the second missing interpolation. The proposed LWPLS-RI can be easily implemented in mobile computers such as a smartphone because the computational load is much lighter than methods that need to process ECG signals directly. This is important from the viewpoint of practical use. An HRV-based epileptic seizure prediction smartphone app has already been developed and tested in hospitals [18]. In addition, an HRV-based drowsy driving detection smartphone app has been commercialized [9]. The performances of these smartphone apps will be improved through appropriate RRI interpolation by the proposed method.
It is concluded that the proposed LWPLS-RI has the potential for realizing highly accurate HRV-based health monitoring services in the future.

Conclusions
A missing RRI interpolation method was proposed utilizing the framework of JIT modeling, whereby a local model is constructed, and missing RRI is estimated by the constructed model only when R wave detection errors occur. The result of applying the proposed LWPLS-RI to real RRI data with artificial missing RRIs showed that it interpolated missing RRI more appropriately than other methods. The proposed method is not applicable to long-term ECG measurement failure since it modifies only one RRI. HRV analysis should be stopped during long-term ECG measurement failure.
The limitations of this study include the properties of the objective data collected from the Physionet database, such as the limited number of subjects and the fact that all subjects were healthy and did not have arrhythmia or other cardiovascular diseases.
In the future work, we will develop a unified framework for interpolating and modifying ectopic RRI data caused by arrhythmia as well as R wave detection errors.
Author Contributions: K.K. and K.F. developed the proposed method and wrote the initial draft of the manuscript. T.K. analyzed the data. M.K. contributed to data collection and analysis and assisted in the preparation of the manuscript. All authors approved the final version of the manuscript, and agree to be accountable for all aspects of the work.
Funding: This work was supported in part by AMED SENTAN #171122, JST PRESTO #JPMJPR1859, the Hattori Hokokai foundation, the SECOM science and technology foundation, the SEI Group CSR Foundation, and the Murata science foundation.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Heart Rate Variability Analysis
This appendix introduces HRV analysis. A typical ECG trace consists of some peaks, and the highest peak is called an R wave. The RR interval (RRI) [ms] is defined as an interval between an R wave and the next R wave. The R wave is detected by using a derivative-based peak detection algorithm [36]. Every RRI changes due to ANS activities, and this phenomenon is called HRV. Standard HRV features are classified into time domain features and frequency domain features [1].
The time domain features are obtained from the original RRI data.
• RMSSD: Root means square of the difference of adjacent RRI.
• NN50: Number of pairs of adjacent RRI, whose difference is more than 50 ms.
Frequency domain features are defined based on the power spectrum density (PSD) of the resampled RRI data, which is derived by Fourier analysis or an autoregressive (AR) model.
• LF: Power of the low-frequency band (0.04-0.15Hz) in PSD. LF reflects the activity of both the sympathetic and parasympathetic nervous systems. • HF: Power of the high-frequency band (0.15-0.4Hz) in PSD. HF reflects the parasympathetic nervous system activity. • LF/HF: Ratio of LF to HF. LF/HF expresses the balance between the sympathetic nervous system activity and the parasympathetic nervous system activity.
For frequency analysis, the raw RRI data are interpolated by using spline and are resampled at equal intervals, because they are not sampled at equal intervals. Appropriate settings of HRV analysis should be determined depending on applications; however, the HRV analysis guideline recommends that RRI data should be measured for more than two minutes and that the sampling rate should be at least 200 Hz [1].