Ischemic Stroke Risk Assessment by Multiscale Entropy Analysis of Heart Rate Variability in Patients with Persistent Atrial Fibrillation

It has been recognized that heart rate variability (HRV), defined as the fluctuation of ventricular response intervals in atrial fibrillation (AFib) patients, is not completely random, and its nonlinear characteristics, such as multiscale entropy (MSE), contain clinically significant information. We investigated the relationship between ischemic stroke risk and HRV in a large cohort of stroke-naïve AFib patients (628 patients), focusing on those who had never developed an ischemic/hemorrhagic stroke before the heart rate measurement. The CHA2DS2−VASc score was calculated from the baseline clinical characteristics, while the HRV analysis was performed on the morning, afternoon, and evening recordings. Subsequently, we applied the Kaplan–Meier method and the cumulative incidence function, with mortality as a competing risk, to estimate the survival time function. We found that patients with sample entropy ($S_E^{(s)}$) ≥ 0.68 at 210 s had a significantly higher risk of an ischemic stroke occurrence in the morning recording. Meanwhile, the afternoon recording showed that those with $S_E^{(s)}$ ≥ 0.76 at 240 s and $S_E^{(s)}$ ≥ 0.78 at 270 s had a significantly lower risk of ischemic stroke occurrence. Therefore, $S_E^{(s)}$ at 210 s (morning) and 240 s ≤ s ≤ 270 s (afternoon) demonstrated a statistically significant predictive value for ischemic stroke in stroke-naïve AFib patients.


Introduction
Atrial fibrillation (AFib) is the most common arrhythmia in elderly patients and is a known risk factor for stroke [1]. In patients without underlying rheumatic mitral valve disease, AFib is associated with an almost five-fold increase in the risk of stroke after adjusting for other risk factors [2]. In addition, strokes associated with AFib are often more disabling and fatal than strokes not associated with AFib, as most strokes associated with AFib are cardioembolic [3][4][5][6]. Consequently, the ability to predict ischemic stroke risk in patients with AFib is becoming an urgent issue, as many studies have reported a strong association between the two. The global focus on reducing stroke risks in AFib warrants an expanded understanding of the epidemiology, risk factors, determinants, and outcomes of stroke and other vascular conditions that threaten brain health [7]. The CHA2DS2−VASc score, which is based on clinical characteristics of patients such as cardiovascular disease, prior stroke history, age, and sex, is a widely used stroke predictor in patients with AFib [8].
Heart rate variability (HRV), defined as the interbeat intervals in normal sinus rhythm (without cardiac arrhythmia such as AFib), is closely related to the autonomic nervous

Patient Selection
Data pertaining to clinical characteristics and 24-h electrocardiogram (ECG) of 1093 patients with AFib were recorded from January 2005 to December 2013 at the Fujita Health University Hospital, Aichi, Japan. We excluded patients who met any of the following criteria: (1) those with missing HRV data and missing or incorrect subject information; (2) those with a pacemaker; (3) those with atrial flutter or paroxysmal AFib; (4) those whose length of ECG recording was shorter than 20 h; (5) those who had a prior stroke; (6) those who did not develop an ischemic stroke during the observation period, but had a transient ischemic attack (TIA); and (7) those who had an intracranial hemorrhage. The remaining patients were then separated into two groups: (1) the ischemic stroke and (2) the non-ischemic stroke groups, which comprised patients who developed and did not develop an ischemic stroke during the follow-up years, respectively. Thereafter, the selected 628 stroke-naïve AFib patients were grouped based on the availability of morning (04:00-08:00), afternoon (11:00-15:00), and evening (17:00-21:00) recordings. A flowchart of patient selection is presented in Figure 1. We did not analyze the data during midnight hours, as sleep disorders (such as obstructive sleep apnea syndrome) might affect the sample entropy result [14]. To reduce the effect of confounding factors, we applied propensity score matching to the remaining patients with the following covariates: age; body mass index (BMI); sex; inpatient/outpatient department; underlying diseases such as hypertension, coronary artery disease, heart failure, diabetes, dilated cardiomyopathy, and hypertrophic cardiomyopathy; and treatment with digitalis and warfarin. Propensity score matching was performed by one-to-four matching without replacement, with a caliper of 0.2 of the standard deviation of the logit of the propensity score.
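The matching step described above can be illustrated with a minimal sketch. It assumes that the propensity scores have already been estimated (for example, by logistic regression on the listed covariates) and uses a simple greedy nearest-neighbour rule; the text does not state which matching algorithm was used, so the function name `match_one_to_k` and the greedy strategy are illustrative assumptions only.

```python
import numpy as np

def logit(p):
    """Log-odds of a probability (or array of probabilities)."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

def match_one_to_k(ps_cases, ps_controls, k=4, caliper_sd=0.2):
    """Greedy 1:k nearest-neighbour matching on the logit of the propensity score.

    Hypothetical helper: returns a dict mapping each case index to a list of
    up to k control indices. Each control is used at most once (matching
    without replacement), and only controls within the caliper are accepted.
    """
    lc = logit(ps_cases)
    lk = logit(ps_controls)
    # Caliper: a fraction of the SD of the logit over all subjects (assumption).
    caliper = caliper_sd * np.std(np.concatenate([lc, lk]), ddof=1)
    available = np.ones(len(lk), dtype=bool)
    matches = {}
    for i, value in enumerate(lc):
        dist = np.abs(lk - value)
        dist[~available] = np.inf  # controls already used cannot be reused
        nearest = np.argsort(dist)[:k]
        chosen = [int(j) for j in nearest if dist[j] <= caliper]
        for j in chosen:
            available[j] = False
        matches[i] = chosen
    return matches
```

Greedy matching depends on the order in which cases are processed; optimal matching would remove that dependence at the cost of a more involved implementation.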
In this study, the HRV time series is referred to as ventricular response interval (VRI) in AFib patients. VRI recordings were extracted automatically from the 24-h Holter electrocardiograms (ECGs) (Nihon Kohden, Tokyo, Japan), then interpolated linearly and resampled at 4 Hz. This research was approved by the ethics committee of Fujita Health University (approval No. HM17-232) for data measurement and Kwansei Gakuin University ethics committee (approval No. KG-IRB-18-02) for data analysis, and conformed to the principles outlined in the Declaration of Helsinki. For the use of data, the opt-out recruitment was approved by the Fujita Health University.
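As a rough illustration of the interpolation and 4 Hz resampling step, the following sketch uses `numpy.interp`. The function name `resample_vri`, the input layout (beat times in seconds, intervals in milliseconds), and the toy values are assumptions made for the example, not details taken from the recording system.

```python
import numpy as np

def resample_vri(beat_times_s, vri_ms, fs=4.0):
    """Linearly interpolate an unevenly sampled VRI series onto a uniform grid.

    beat_times_s: time of each beat in seconds (increasing).
    vri_ms: interval value attributed to each beat, in milliseconds.
    fs: resampling frequency in Hz (4 Hz in the text).
    """
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    vri_ms = np.asarray(vri_ms, dtype=float)
    # Uniform time grid spanning the recording, with step 1/fs seconds.
    n = int(np.floor((beat_times_s[-1] - beat_times_s[0]) * fs)) + 1
    grid = beat_times_s[0] + np.arange(n) / fs
    return grid, np.interp(grid, beat_times_s, vri_ms)

# Toy example with made-up beat times and intervals.
t = np.array([0.0, 0.8, 1.5, 2.5])
v = np.array([800.0, 800.0, 700.0, 1000.0])
grid, v4 = resample_vri(t, v)
```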


CHA2DS2−VASc Score
We calculated the CHA2DS2−VASc score, where "V" for vascular disease was represented by coronary artery disease only, because of the absence of myocardial infarction, peripheral artery disease, and aortic plaque data. This score consists of 1 point for participants who had congestive heart failure (C), 1 point for hypertension (H), 2 points for those aged 75 years or older (A2), 1 point for diabetes (D), 2 points for prior stroke or TIA (S2), 1 point for coronary artery disease as the vascular disease (V), 1 point for those aged 65 to 74 years (A), and 1 point for female participants (Sc).
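The point assignment above can be written as a short function. This is a minimal sketch: the function and argument names are illustrative, and age is assumed to be given in whole years.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 prior_stroke_or_tia, coronary_artery_disease):
    """Sum the CHA2DS2-VASc points as listed in the text.

    Vascular disease (V) is represented by coronary artery disease only,
    following the substitution described in the text.
    """
    score = 0
    score += 1 if chf else 0                      # C: congestive heart failure
    score += 1 if hypertension else 0             # H: hypertension
    if age >= 75:                                 # A2: age 75 years or older
        score += 2
    elif age >= 65:                               # A: age 65 to 74 years
        score += 1
    score += 1 if diabetes else 0                 # D: diabetes
    score += 2 if prior_stroke_or_tia else 0      # S2: prior stroke or TIA
    score += 1 if coronary_artery_disease else 0  # V: vascular disease
    score += 1 if female else 0                   # Sc: female sex
    return score
```

For example, a 70-year-old woman with hypertension and no other listed condition scores 1 + 1 + 1 = 3. Because this cohort excludes patients with a prior stroke or TIA, the S2 term is expected to be zero for every participant.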

Analysis of VRI
We calculated the conventional linear indices of VRI in the time-domain, including the mean VRI and standard deviation of VRI (SDVRI). We also applied non-linear analysis to better express the complex nature of VRI, as non-linear analysis is considered to be less dependent on the pre-processing of a recording [15][16][17][18][19]. Long-range correlation properties were evaluated using detrended fluctuation analysis (DFA) [20][21][22]. This was estimated by the scaling exponent α in $F(s) \sim s^{\alpha}$, where F(s) is defined as the square root of mean-square deviations around a linear trend averaged over segments with length n of integrated time series.
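The DFA procedure can be sketched as follows, assuming first-order (linear) detrending in non-overlapping segments. The window length is given in samples, so at 4 Hz a scale of s seconds would correspond to 4s samples; the function names are illustrative and not the authors' implementation.

```python
import numpy as np

def dfa_fluctuation(x, window_lengths):
    """Return the DFA fluctuation F for each window length (in samples)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated time series
    fluct = []
    for n in window_lengths:
        n_seg = len(profile) // n
        segs = profile[:n_seg * n].reshape(n_seg, n)
        t = np.arange(n)
        # Least-squares linear trend fitted to every segment at once.
        coeffs = np.polyfit(t, segs.T, 1)            # shape (2, n_seg)
        trend = np.outer(t, coeffs[0]) + coeffs[1]   # shape (n, n_seg)
        # Root of the mean-square deviation around the local trends.
        fluct.append(np.sqrt(np.mean((segs.T - trend) ** 2)))
    return np.array(fluct)

def scaling_exponent(window_lengths, fluct):
    """Slope of log F versus log window length."""
    return np.polyfit(np.log(window_lengths), np.log(fluct), 1)[0]
```

For uncorrelated white noise this estimate is expected to be close to 0.5, and for a random walk close to 1.5. A crossover such as the one reported in the Results would be examined by fitting the slope separately below and above the crossover scale.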
To measure the irregularity of the VRI time series $x_i$, multiscale entropy analysis (MSE) and multiscale characterizations of the time series were calculated. MSE was performed in two steps [10]: (1) coarse-graining of the VRI time series, and (2) sample entropy measurement of each coarse-grained time series. Given a one-dimensional discrete VRI time series $\{x_1, x_2, \ldots, x_N\}$, the coarse-grained time series $y_j^{(s)}$ was calculated as follows:
$$y_j^{(s)} = \frac{1}{v} \sum_{i=(j-1)v+1}^{jv} x_i,$$
where $y_j^{(s)}$ is the mean value in the j-th non-overlapping segment with length v, and $1 \le j \le N/v$. The length v divided by the resampling frequency of 4 Hz is called the time scale, which is denoted as s = v/4. The time scale s (seconds) was set to the range 1 s ≤ s ≤ 300 s. In the next step, we calculated the sample entropy $S_E^{(s)}$ for each time scale s by the following:
$$S_E^{(s)}(m, r, N) = -\ln \frac{\sum_i n_i(m+1, r)}{\sum_i n_i(m, r)},$$
where m is the subseries length, r is the similarity tolerance, and N is the length of the VRI time series. Furthermore, $n_i(m, r)$ and $n_i(m+1, r)$ represent the number of vectors that match the i-th template of length m and m + 1, respectively, within the tolerance r. We set the value of m to 2 and r to $0.15\sigma_x$, where $\sigma_x$ is the standard deviation of the resampled VRI time series. The sample entropy is the negative natural logarithm of the conditional probability that two subseries similar for m points remain similar for m + 1, where self-matches are not included in calculating the probability [23,24]. In addition, the sample entropy is influenced by the time correlation and the probability distribution characteristics.
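The two MSE steps can be sketched in a few lines. This is a direct, unoptimized illustration rather than the authors' code: the function names are hypothetical, matches are counted with the Chebyshev (maximum) distance, and the comparison `<= r` is an assumption, since some implementations use a strict inequality.

```python
import numpy as np

def coarse_grain(x, v):
    """Mean of each non-overlapping segment of length v (in samples)."""
    x = np.asarray(x, dtype=float)
    n_seg = len(x) // v
    return x[:n_seg * v].reshape(n_seg, v).mean(axis=1)

def sample_entropy(y, m, r):
    """Sample entropy of y with template length m and tolerance r."""
    y = np.asarray(y, dtype=float)
    n = len(y)

    def count_matches(length):
        # The same n - m templates are used for both template lengths.
        templates = np.array([y[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Compare template i with later templates only (no self-matches).
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.nan

def multiscale_entropy(x, scales_s, fs=4.0, m=2, r_factor=0.15):
    """Sample entropy of the coarse-grained series at each time scale (seconds)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * np.std(x)  # tolerance fixed from the resampled series
    return [sample_entropy(coarse_grain(x, int(round(s * fs))), m, r)
            for s in scales_s]
```

Keeping r fixed across scales, as stated in the text, means that changes in the variance of the coarse-grained series also influence the entropy value. The pairwise comparison is quadratic in the series length, so long recordings at small scales may need a faster implementation.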
Therefore, we calculated the autocorrelation coefficient at lag τ = 1 to quantify the time correlation characteristic by the following:
$$R^{(s)}(1) = \frac{\sum_j \left(y_j^{(s)} - \bar{y}^{(s)}\right)\left(y_{j+1}^{(s)} - \bar{y}^{(s)}\right)}{\sum_j \left(y_j^{(s)} - \bar{y}^{(s)}\right)^2},$$
where $\bar{y}^{(s)}$ is the mean of the coarse-grained time series. We also calculated Shannon's information entropy to quantify the probability distribution characteristics by the following:
$$H_D^{(s)} = -\sum_{i=1}^{n_s} p_i \log p_i,$$
where the probabilities $\{p_i\}$ were calculated using the histogram-based probability density function of $\{y_j^{(s)}\}$ with a fixed bin width of $0.15\sigma_x$, and zero bins were not counted in $n_s$. The autocorrelation coefficient at lag τ = 1 has a negative correlation with sample entropy, whereas information entropy has a positive correlation with the sample entropy. Then, we calculated the variance ratio as the ratio between the variance of the coarse-grained time …
All of the analyses were applied to the resampled VRI time series, and the time scale unit of the MSE profiles was in seconds instead of beats, in order to avoid it being affected by the cardiac rhythm (sinus or non-sinus) and heart rate [12].
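The two supporting indices can be sketched as follows. The natural logarithm and the placement of the first bin edge at the minimum value are assumptions made for this example, as the text does not specify them; the function names are illustrative.

```python
import numpy as np

def autocorr_lag1(y):
    """Autocorrelation coefficient of y at lag 1."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d ** 2)

def histogram_entropy(y, bin_width):
    """Shannon entropy of a fixed-bin-width histogram of y (natural log)."""
    y = np.asarray(y, dtype=float)
    # Assign every value to a bin of fixed width, starting at the minimum.
    idx = np.floor((y - y.min()) / bin_width).astype(int)
    counts = np.bincount(idx)
    # Empty bins are dropped, so they do not contribute to the sum.
    p = counts[counts > 0] / len(y)
    return -np.sum(p * np.log(p))
```

In the setting of the text, `y` would be a coarse-grained series at a given time scale and `bin_width` would be 0.15 times the standard deviation of the resampled VRI series.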

Statistical Analysis
Quantitative data are presented as mean ± standard deviation (SD) for continuous variables, and as number (frequency) for categorical variables. For continuous variables, the Mann-Whitney U test was conducted to analyze the significant differences between the two groups. We also calculated Cohen's d effect size to quantify the size of the differences between the groups. For survival time analysis, we applied the Kaplan-Meier method to estimate the survival rate of ischemic stroke, with the log-rank test as the statistical test. Furthermore, we estimated the cumulative incidence function (CIF) to evaluate the VRI index prediction with mortality as a competing risk, and used Gray's test to compare the CIFs directly. A competing risk is an event whose occurrence precludes the occurrence of the primary event of interest [25]. In this study, the competing risk was mortality, and the primary event of interest was ischemic stroke occurrence. Receiver operating characteristic (ROC) curves were assessed to estimate the optimal cut-off value by plotting the true positive rate (sensitivity) against the false positive rate (1-specificity) at various cut-off values. The optimal cut-off value for the Kaplan-Meier survival time function analysis and CIF analysis was taken as the value with the shortest distance to the (0,1) point of the ROC curves. Statistical tests were performed using R statistical software. A two-tailed p-value < 0.05 was considered significant.
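The cut-off selection rule can be illustrated with a small sketch. It assumes that larger index values predict the event (for an index where smaller values indicate risk, the sign of the index would be flipped first); the function name and the toy data are hypothetical, and the analysis in the study itself was carried out in R.

```python
import numpy as np

def closest_to_corner_cutoff(values, events):
    """Cut-off whose ROC point lies closest to (0, 1).

    values: index value per patient; events: 1 if the event occurred, else 0.
    A patient is classified as positive when value >= cut-off (assumption).
    """
    values = np.asarray(values, dtype=float)
    events = np.asarray(events, dtype=bool)
    best_cut, best_dist = None, np.inf
    for cut in np.unique(values):
        predicted = values >= cut
        sensitivity = np.sum(predicted & events) / np.sum(events)
        specificity = np.sum(~predicted & ~events) / np.sum(~events)
        # Distance from (1 - specificity, sensitivity) to the corner (0, 1).
        dist = np.hypot(1.0 - specificity, 1.0 - sensitivity)
        if dist < best_dist:
            best_cut, best_dist = cut, dist
    return best_cut

# Toy data in which 0.7 separates the two groups perfectly.
cut = closest_to_corner_cutoff([0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
                               [0, 0, 0, 1, 1, 1])
```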

Patient Clinical Characteristics
Following patient selection (Figure 1), the morning, afternoon, and evening recordings consisted of 324 (68 with ischemic stroke and 256 without ischemic stroke), 227 (49 with ischemic stroke and 178 without ischemic stroke), and 353 (73 with ischemic stroke and 280 without ischemic stroke) patients, respectively. Those who had developed an ischemic stroke during the follow-up period were included in the ischemic stroke group, while those who did not were included in the non-ischemic stroke group. The event of ischemic stroke was recorded during the follow-up of 3.25 ± 2.89 years, 3.46 ± 3.11 years, and 3.51 ± 3.02 years for the morning, afternoon, and evening recordings, respectively.
Several differences in the baseline clinical characteristics between patients with AFib who had an ischemic stroke and those who did not were observed before the propensity score matching, including the CHA2DS2−VASc score (Table 1). After matching, no significant difference was found in the baseline clinical characteristics between patients with AFib who developed an ischemic stroke and those who remained stroke-naïve for each time range (Table 2 for the morning recording). The CHA2DS2−VASc score did not show any significant differences between the two groups in any time range.

Analysis of VRI
There were no significant differences in the conventional linear indices of the time-domain, except for SDVRI in the evening recording range, which showed a borderline association with ischemic stroke (p = 0.04) (Table 3). The DFA of all of the time recordings showed that the scaling exponent was 0.64 at scales less than approximately 100 s, which implied white-noise-like behavior at small time scales, while it was 0.89 at scales higher than 100 s, which implied near-1/f fluctuation behavior at large time scales (morning recording range result in Figure 2). DFA scaling exponents did not show any significant difference between the two groups at small or large time scales in any of the time recordings. Furthermore, the statistical test of the MSE analysis showed that $S_E^{(s)}$ did not differ significantly between the two patient groups for any of the time ranges. Lastly, the variance ratio did not show any significant difference between the two groups of patients in any of the time ranges, demonstrating that the normalization of the time series was effective.
No significant difference in $S_E^{(s)}$ was found between the two groups using the Mann-Whitney U test. We performed Kaplan-Meier survival analysis to check whether $S_E^{(s)}$ would be able to reflect the survival rate of ischemic stroke in stroke-naïve AFib patients when the survival time for each patient was included. In this analysis, survival time was defined as the length of time starting from the heart rate measurement date to the time when one had an ischemic stroke occurrence in the ischemic stroke group or the end of the follow-up period for the non-ischemic stroke group. The cut-off value was determined by the closest value to the (0,1) point in the ROC curve. Using the log-rank statistical test, the morning recording showed that patients with $S_E^{(s)}$ ≥ 0.68 at the time scale of 210 s had a significantly lower survival rate of ischemic stroke than those with $S_E^{(s)}$ < 0.68 (Figure 3a). This implied that they had a higher risk of an ischemic stroke occurrence. For the afternoon recording, those with $S_E^{(s)}$ ≥ 0.76 at the time scale of 240 s and $S_E^{(s)}$ ≥ 0.78 at the time scale of 270 s had a significantly higher survival rate of ischemic stroke than those with $S_E^{(s)}$ < 0.76 and $S_E^{(s)}$ < 0.78, which indicated they had a lower risk of an ischemic stroke occurrence (the result for 270 s is shown in Figure 3b).
To verify this result, we set the mortality as a competing risk in the CIF method, while ischemic stroke was set as the primary outcome. For the morning recording, the CIF results revealed that patients with $S_E^{(s)}$ ≥ 0.68 at the time scale of 210 s had a significantly higher cumulative incidence probability of ischemic stroke than those with $S_E^{(s)}$ < 0.68, while no significant difference was observed in the risk of mortality (Figure 4a). In contrast, the afternoon recording showed that patients with $S_E^{(s)}$ … (Figure 4b). Meanwhile, the cumulative incidence probability did not show any significant difference in either ischemic stroke risk or mortality risk for the evening recording.
The MSE profiles were affected by characteristics such as the time correlation and probability distribution characteristics. To clarify which characteristic contributes to the ischemic stroke outcome and the difference in $S_E^{(s)}$, we applied survival time analyses to both the autocorrelation coefficient at lag τ = 1 ($R^{(s)}(1)$), which quantifies the time correlation characteristic, and the information entropy ($H_D^{(s)}$), which quantifies the probability distribution characteristic. The cut-off value was determined using the ROC curve. Based on the morning recording, patients with $R^{(s)}(1)$ < 0.88 at the time scale of 210 s had a significantly higher cumulative incidence probability of ischemic stroke than those with $R^{(s)}(1)$ ≥ 0.88, while no significant difference was observed in the risk of mortality (Figure 5a).
For the afternoon recording, those with $R^{(s)}(1)$ < 0.82 at the time scale of 240 s and $R^{(s)}(1)$ < 0.81 at the time scale of 270 s had a significantly lower cumulative incidence probability of ischemic stroke than those with $R^{(s)}(1)$ ≥ 0.82 and $R^{(s)}(1)$ ≥ 0.81, while no significant difference was observed for the risk of mortality (the result for 270 s is shown in Figure 5b). The cumulative incidence probability did not show any significant difference in either ischemic stroke risk or mortality risk for the evening recording.
In terms of probability distribution characteristics, the morning recording showed that patients with $H_D^{(s)}$ ≥ 2.18 at the time scale of 210 s had a significantly higher cumulative incidence probability of ischemic stroke occurrence than those with $H_D^{(s)}$ < 2.18, while no significant difference was observed for mortality risk (Figure 6a). Meanwhile, the cumulative incidence probability did not show any significant difference in either ischemic stroke risk or mortality risk for the afternoon and evening recordings (the result for 270 s of the afternoon recording is shown in Figure 6b).
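The cumulative incidence probabilities reported in this section treat death as a competing event. A minimal sketch of the standard nonparametric estimator is given below; it is not the R routine used in the study, and the status coding (0 = censored, 1 = ischemic stroke, 2 = death) is an assumption chosen for the example.

```python
import numpy as np

def cumulative_incidence(time, status, cause):
    """Nonparametric cumulative incidence for one cause with competing risks.

    time: follow-up time per patient.
    status: 0 = censored, 1 = ischemic stroke, 2 = death (assumed coding).
    cause: the status code whose cumulative incidence is returned.
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.unique(time[status > 0])
    surv = 1.0   # probability of being free of any event just before t
    cif = 0.0
    values = []
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status > 0))
        cif += surv * d_cause / at_risk   # mass assigned to this cause at t
        surv *= 1.0 - d_any / at_risk     # update all-cause event-free survival
        values.append(cif)
    return event_times, np.array(values)

# Toy data: stroke at t=1, death at t=2, censoring at t=3, stroke at t=4.
t_ev, cif_stroke = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=1)
_, cif_death = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=2)
```

When there are no competing events, this estimate coincides with one minus the Kaplan-Meier survival estimate. With competing events, one minus Kaplan-Meier (treating deaths as censored) would overstate the stroke incidence, which is the reason for the additional CIF analysis.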

Discussion
The present study demonstrated that sample entropy ($S_E^{(s)}$) at the time scale of 210 s of the morning recording and 240 s ≤ s ≤ 270 s of the afternoon recording has a statistically significant prognostic value for ischemic stroke outcome in stroke-naïve AFib patients. Furthermore, the autocorrelation coefficient at lag τ = 1, as the time correlation characteristic of the MSE profiles, might contribute to the difference in $S_E^{(s)}$.
We confirmed the presence of white noise-like fluctuations at shorter scales (less than approximately 100 s) and near-1/f fluctuations at longer scales (more than approximately 100 s) in the VRI time series of AFib patients (Figure 2) with $\alpha_1$ = 0.64 and $\alpha_2$ = 0.89, which is similar to the previous study [13]. Several findings have described the existence of white noise characteristics as an uncorrelated process (over short scales) and 1/f noise as a correlated process (over long scales) in the HRV of patients with AFib [11,26]. The dynamics of the regulatory process underlying the long-term component may be common between the HRV of a healthy subject and an AFib patient [11]. Thus, the time scales exhibiting near-1/f fluctuations were reported to reveal new information regarding the complexity of HRV, which could be measured by MSE analysis [10]. To evaluate the irregularity of ECG signals in the 1/f fluctuation area and the onset of a disease, sample entropy is considered to be an effective complex system analysis [27]. Furthermore, Ho et al. [28] also reported that the sample entropy calculated in the 1/f fluctuation area may serve as a significant predictor of mortality in patients with congestive heart failure. However, our Mann-Whitney U statistical test of $S_E^{(s)}$ did not show any significant difference between AFib patients who had an ischemic stroke outcome and those who remained stroke-naïve.
On that account, we intended to check the ability of $S_E^{(s)}$ … (Figure 4a). This indicated that AFib patients who had a higher risk of an ischemic stroke occurrence were those whose VRI generated a higher degree of irregularity in the near-1/f fluctuation area of the morning recording. For the afternoon recording, we discovered that those with $S_E^{(s)}$ … (Figure 4b). This suggests that patients with AFib whose VRI generated a lower degree of irregularity in the near-1/f fluctuation area had a higher risk of ischemic stroke for the afternoon recording. Thus, these results implied that $S_E^{(s)}$ … Mann-Whitney U test, which showed no significant difference between the two groups, may be because of the length of survival time, which was not included in the analysis. Therefore, we discovered that $S_E^{(s)}$ was able to reflect the risk of an ischemic stroke event in stroke-naïve AFib patients when the survival time for each patient was included.
Our morning recording result was in agreement with the previous studies showing that $S_E^{(s)}$ at large time scales was higher in ischemic stroke patients than in non-ischemic stroke patients [12,13], while our afternoon recording result was similar to reports from other studies that demonstrated a decrease in nonlinear behavior of the heart rate, which is associated with the worsening of pathological states [29]. It has been reported that healthy participants generate more complex dynamics than diseased participants [30]. Therefore, our results confirmed that those with a decrease in the degree of irregularity were more likely to develop an ischemic stroke based on the afternoon recording. This may be attributed to the loss of complexity in patients with unfavorable outcomes. Our research agreed with a previous finding showing that acute ischemic stroke patients with a significantly lower complexity were more likely to develop stroke-in-evolution than those with a higher complexity [31]. A similar result was also demonstrated, in which higher values of the complexity index were significantly associated with favorable outcomes in patients [32]. Less complex dynamics were observed in patients with many states of disease compared with patients in healthy conditions [33,34]. This is because the information content is degraded as physiological systems become less complex. Furthermore, they become less adaptable and less able to cope with the exigencies of constantly changing environments [34,35]. Young healthy systems are the most complex and adaptive systems [30].
Based on the time range, there was a diurnal variation in $S_E^{(s)}$ found in this study. One possible explanation for this variation might be the presence of a circadian rhythm of $S_E^{(s)}$ in the 1/f fluctuation area of stroke-naïve AFib patients. A circadian rhythm is a physiological and behavioral cycle with a recurring periodicity of approximately 24 h, controlling a variety of biological processes, such as the sleep-wake cycle [36,37]. According to our results, patients who remained stroke-naïve had a lower degree of irregularity in the morning (indicated by lower $S_E^{(s)}$), followed by a higher degree of irregularity in the afternoon (indicated by higher $S_E^{(s)}$). Thus, we hypothesized that this circadian pattern of $S_E^{(s)}$ in scales larger than 100 s disappeared in patients who were at a higher risk of developing an ischemic stroke event.
As MSE profiles provide a possible characterization of the biosignal complexity [38], factors such as time correlation and probability distribution characteristics affect the results of $S_E^{(s)}$. Therefore, we applied both survival analyses to verify the factors contributing to the difference in $S_E^{(s)}$ in our study. The time correlation characteristic was calculated by the autocorrelation coefficient at lag τ = 1 ($R^{(s)}(1)$), while the probability distribution characteristic was calculated by information entropy ($H_D^{(s)}$). … Matsuoka et al. [13] revealed that the probability distribution characteristic of MSE profiles in a wider range of scales of s ≥ 2 s is a useful measure for ischemic stroke risk assessment, while our result of $H_D^{(s)}$ did not show any significant difference between the two groups. Furthermore, the previous study did not find any significant difference in the time correlation characteristic between the two groups, while the present study found that $R^{(s)}(1)$ at longer time scales might contribute to $S_E^{(s)}$. The difference between our finding and previous studies could be the result of two reasons: firstly, we had a larger number of participants in our study, who were selected by propensity score matching based on several baseline clinical characteristics as covariates to reduce the effect of confounding; secondly, we focused on "stroke-naïve AFib patients", which means that only patients with AFib who had never developed any ischemic or hemorrhagic stroke before the heart rate measurement date were included.
This study had several limitations. The present study was an observational study with a cohort of Japanese patients at a single institution. This may have caused selection bias. To verify the findings of this study, further investigation with more heterogeneous participants (e.g., patients from other countries) may eliminate the possibility of this bias and confirm whether the proposed hypothesis can be applied to a more diverse population. We also did not perform a comparison of the predictive performance in patients based on antithrombotic drug intake due to the limited data on the subject information. Moreover, the vascular disease in the CHA2DS2−VASc score only consisted of coronary artery disease due to the absence of myocardial infarction, peripheral artery disease, and aortic plaque data. Furthermore, there were only a limited number of previous studies to interpret our results. Thus, the generalizability of our findings in each time range (morning, afternoon, and evening) needs to be examined to verify our proposed hypothesis, such as the presence of a circadian rhythm of $S_E^{(s)}$ in stroke-naïve AFib patients with a low risk of developing an ischemic stroke event.
Entropy 2021, 23, 918

Conclusions
Our study found that stroke-naïve AFib patients whose VRI generated a higher degree of irregularity in scales exhibiting near-1/f fluctuations had a higher risk of developing an ischemic stroke in the morning recording, while they had a lower risk of ischemic stroke outcome in the afternoon recording. We also found that the time correlation characteristic of the MSE profiles might contribute to the difference in S