Automatic Detection of Atrial Fibrillation in ECG Using Co-Occurrence Patterns of Dynamic Symbol Assignment and Machine Learning

Early detection of atrial fibrillation from electrocardiography (ECG) plays a vital role in the timely prevention and diagnosis of cardiovascular diseases. Various algorithms have been proposed; however, they do not adequately consider varied-length signals, morphological transitions, and abnormalities over long-term recordings. We propose dynamic symbolic assignment (DSA) to differentiate a normal sinus rhythm (SR) from paroxysmal atrial fibrillation (PAF). We use ECG signals and their interbeat (RR) intervals from two public databases, namely, the AF Prediction Challenge Database (AFPDB) and the AF Termination Challenge Database (AFTDB). We transform RR intervals into a symbolic representation and compute co-occurrence matrices. The DSA feature is extracted using varied symbol-length V and word-size W, and applied to five machine learning algorithms for classification. We test five hypotheses: (i) DSA captures the dynamics of the series, (ii) DSA is a reliable technique for various databases, (iii) optimal parameters improve DSA’s performance, (iv) DSA is consistent for variable signal lengths, and (v) DSA supports cross-data analysis. Our method captures the transition patterns of the RR intervals. The DSA feature exhibits a statistically significant difference between SR and PAF conditions (p < 0.005). The DSA feature with W = 3 and V = 3 yields maximum performance. In terms of F-measure (F), the rotation forest and ensemble learning classifiers are the most accurate for AFPDB (F = 94.6%) and AFTDB (F = 99.8%). Our method is effective for short-length signals and supports cross-data analysis. The DSA is capable of capturing the dynamics of varied-length ECG signals. In particular, the optimal parameters-based DSA feature and ensemble learning could help to detect PAF in long-term ECG signals. Our method maps time series into a symbolic representation and identifies abnormalities in noisy, varied-length, and pathological ECG signals.


Introduction
Cardiovascular diseases (CVDs) are the primary cause of death worldwide, with 45% in the European Union [1,2]. According to the World Health Organization (WHO), the number of deaths from CVDs has increased by 34% since 2000 [2,3]. Atrial fibrillation (AF) is one of the most common CVDs that has affected 33.5 million individuals worldwide and may affect 17.9 million in Europe by 2050 [3].
Electrocardiography (ECG) is the preferred technique to record the electrical activity of the heart. It is a standard clinical tool to detect and diagnose CVDs [4][5][6]. In a classical approach, ECG signals are monitored over a short time, and abnormalities are detected by visual inspection. Automatic analysis of ECG requires reliable identification of fiducial points for accurate measurements [7][8][9]. Most of the existing methods can cope only with relatively noise-free signals and steady features from local waves such as QRS complex, P,

Figure 1. The overall pipeline of the proposed approach. The interbeat (RR) intervals are applied to dynamic symbol assignment (DSA) to map electrocardiography (ECG) signals to a symbolic sequence. The thresholds $T_i$ are used to map the RR intervals to symbols. The pattern transition probability P is computed from the co-occurrence pattern transition matrix M for the symbols {a, b, c}. Finally, P is transformed to a one-dimensional array $\vec{P}$ using row-based concatenation, and the DSA features are extracted. The DSA features are fed to the k-nearest neighbor (kNN), support vector machine (SVM), random forest (RF), rotation forest (RoF), and ensemble learning (EL) classifiers to differentiate normal and paroxysmal atrial fibrillation (PAF) segments. The dark-blue arrow refers to the flow from one process to the next. The grey arrow refers to the intermediate outcome of the process.

Dynamic Symbol Assignment (DSA)
We first introduce our symbolization technique DSA (the source code of this study is available at: https://github.com/nagaganapathy/Dynamic_symbolic_Assignment.git (accessed on 1 May 2021)), designed explicitly for RR interval data. It is mainly used to determine the pattern transitions in the transformed symbolic sequences. The DSA approach consists of two major steps, namely, distance approximation and symbolization. In most of the existing symbolization techniques, predefined thresholds are used as a discretization rule to map the sequential data into symbol sets [16,19]. Predefined thresholds may fail abruptly for time series, as they require prior domain knowledge for a specific application. To avoid the drawbacks of predefined thresholds, we propose a dynamic threshold list (Figure 2).

Figure 2. The pipeline of the dynamic symbol assignment (DSA) approach: representative RR intervals as input to the DSA method (a), distance evaluation of the input data (b), distance approximation using dynamic threshold lists (c), and representation of the symbolic sequence after symbolization (d).

Distance Approximation
At first, we determine the length of each RR interval:

$$r_j = R_{j+1} - R_j,$$

where $R_j$ denotes the $j$th R-wave in the ECG signal (Figure 3). The counter $j$ is $j \in \mathbb{N}_0$. Let $N + 1$ denote the number of R-waves in the ECG signal, then the mean distance $\bar{r}_N$ of all $N$ intervals yields:

$$\bar{r}_N = \frac{1}{N} \sum_{j=0}^{N-1} r_j.$$

Each $r_j$ is mapped onto a particular symbol $v_i \in V$ of the vocabulary V, and its cardinality |V| denotes the number of symbols in V. The number of symbols is initialized as an odd number to determine symmetric threshold lists for mapping $r_j$. For the mapping, we define |V| − 1 thresholds $T_i$. The thresholds $T_i$ are computed relative to $\bar{r}_N$, where $i \in \mathbb{N}_0$, $i \le 2k$, $k$ depicts the step size, and $t$ is initialized as 0.05 for a symmetrical threshold list using random search. For example, with |V| = 5, the four thresholds $T_i$ for mapping are $0.90\,\bar{r}_N$, $0.95\,\bar{r}_N$, $1.05\,\bar{r}_N$, and $1.10\,\bar{r}_N$, for i = 1, i = 2, i = 3, and i = 4, respectively.
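One closed form that reproduces the worked example above is sketched below. It is an assumption chosen to match the stated values, not necessarily the authors' exact expression; in particular, the relation $k = (|V| - 1)/2$ between the step size and the vocabulary size is assumed.

```latex
T_i =
\begin{cases}
\bigl(1 - (k - i + 1)\,t\bigr)\,\bar{r}_N, & 1 \le i \le k,\\[4pt]
\bigl(1 + (i - k)\,t\bigr)\,\bar{r}_N, & k < i \le 2k,
\end{cases}
\qquad k = \frac{|V| - 1}{2}.
```

With |V| = 5 and t = 0.05, this gives k = 2 and the factors 0.90, 0.95, 1.05, and 1.10 for i = 1, 2, 3, and 4.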

Symbolization
After approximation and creating the dynamic threshold lists, we use the corresponding symbol vocabulary V = {a, b, . . . , z} to symbolize the elements $r_j$. The threshold list $T_i$ is used to symbolize the RR intervals: each element $r_j$ is assigned the symbol $v_i \in V$ of the threshold interval into which it falls. The result, $D = \{D_1, D_2, \ldots, D_j\}$, is the final DSA symbolized sequence representation of the $r_j$ elements.
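The two steps above (distance approximation and symbolization) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' released implementation: the function names are hypothetical, the thresholds are assumed to sit at offsets of t, 2t, ..., kt on either side of the mean with k = (|V| − 1)/2, and the lower bound of each threshold band is assumed to be inclusive.

```python
from string import ascii_lowercase


def rr_intervals(r_peaks):
    """Return r_j = R_{j+1} - R_j for a list of R-wave positions."""
    return [later - earlier for earlier, later in zip(r_peaks, r_peaks[1:])]


def dynamic_thresholds(mean_rr, n_symbols, t=0.05):
    """Build |V| - 1 thresholds placed symmetrically around the mean RR interval.

    Assumed closed form: offsets of t, 2t, ..., kt on either side of the mean,
    with k = (|V| - 1) / 2.
    """
    if n_symbols < 3 or n_symbols % 2 == 0:
        raise ValueError("n_symbols must be an odd number >= 3")
    k = (n_symbols - 1) // 2
    below = [(1 - (k - i + 1) * t) * mean_rr for i in range(1, k + 1)]
    above = [(1 + (i - k) * t) * mean_rr for i in range(k + 1, 2 * k + 1)]
    return below + above


def symbolize(rr, n_symbols=3, t=0.05):
    """Map each RR interval to a letter according to the threshold band it falls in."""
    mean_rr = sum(rr) / len(rr)
    thresholds = dynamic_thresholds(mean_rr, n_symbols, t)
    # The symbol index is the number of thresholds the interval reaches;
    # treating the lower bound of each band as inclusive is an assumption.
    return "".join(ascii_lowercase[sum(r >= th for th in thresholds)] for r in rr)


if __name__ == "__main__":
    steady = [0.80, 0.81, 0.79, 0.80, 0.80]   # hypothetical SR-like intervals (s)
    erratic = [0.60, 1.00, 0.80, 1.20, 0.40]  # hypothetical PAF-like intervals (s)
    print(symbolize(steady))
    print(symbolize(erratic))
```

Because the thresholds scale with the mean of the segment itself, the same code applies to recordings with different heart rates without retuning; only |V| and t need to be chosen.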

Co-Occurrence Patterns
Then, we use the DSA to determine the co-occurrence of specific patterns of word size W, each followed by a symbol. In this study, we chose W in the range of 2 ≤ W ≤ 5. For instance, in a symbolized sequence, a simple two-symbol (W = 2) pattern transition is one where the pattern ab is followed by c, and a three-symbol (W = 3) pattern transition is one where the pattern abc is followed by d. The identification of specific patterns and their pattern transitions with a symbol can be used to determine abnormality. We count the frequency of such patterns to compute the pattern transition probabilities (Table 1).

Table 1. An example of pattern transition probability P computed from co-occurrence pattern transition matrix M with word size W = 3 for the symbolic sequence D.

Considering an example symbolic sequence D = {aabcbbbcbcbbbababc} defined using |V| = 3 with symbols {a, b, c}, we count the frequency of occurrence of each pattern transition with W = 2, followed by a symbol, in lexicographical order. Then, we normalize the matrices to determine the pattern transition probability P, which allows us to compare the characteristics of ECG signals with varying lengths. For the symbolic sequence D, P is computed by normalizing the co-occurrence pattern transition matrix M, where $M_{i,j}$ is the co-occurrence matrix representing the frequency of patterns that occurred in the symbolic sequence D, with the $i$th row depicting the symbol pattern and the $j$th column representing a symbol [11]. We computed the pattern transition probabilities similar to Akbilgic et al. [17] (Table 1). The selection of optimal W and |V| is essential to determine the vital information necessary to identify disease conditions and to maintain low computational complexity. Using our DSA method, the RR intervals of ECG are transformed into symbolic sequences. Later, P is computed for each derived symbolic sequence. Each P represents the pattern transition behavior of the signal segments.
Finally, the P matrix is transformed into a single-row feature vector using row-based concatenation. For example, with W = 3 and |V| = 3, the single array $\vec{P}$ for symbolic sequence D is given as $\{P_1, P_2, \ldots, P_{i \cdot j}, \ldots, P_{27}\}$, where i and j are the rows and columns of the co-occurrence matrix. Likewise, the transition behavior of the patterns in the series for varied W and |V| is computed to determine the probabilistic model of the series.
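The counting, normalization, and row-based concatenation described above can be illustrated with the example sequence D. The sketch below is not the authors' code: the function names and the `pattern_len` parameter are hypothetical, and dividing by the total number of transitions (rather than per row) is an assumption. It uses patterns of two symbols followed by one symbol, which for |V| = 3 gives a 9 × 3 matrix and a 27-element vector.

```python
from itertools import product


def transition_probabilities(sequence, alphabet, pattern_len=2):
    """Count pattern -> next-symbol transitions and normalize them.

    Rows are all patterns of `pattern_len` symbols in lexicographical order,
    columns are the symbols of `alphabet`. Dividing by the total number of
    transitions is an assumption made for this sketch.
    """
    patterns = ["".join(p) for p in product(alphabet, repeat=pattern_len)]
    row = {pattern: i for i, pattern in enumerate(patterns)}
    col = {symbol: j for j, symbol in enumerate(alphabet)}
    counts = [[0] * len(alphabet) for _ in patterns]
    for start in range(len(sequence) - pattern_len):
        pattern = sequence[start:start + pattern_len]
        following = sequence[start + pattern_len]
        counts[row[pattern]][col[following]] += 1
    total = sum(sum(r) for r in counts)
    if total == 0:
        raise ValueError("sequence is too short for the chosen pattern length")
    probabilities = [[c / total for c in r] for r in counts]
    return patterns, probabilities


def flatten_rows(matrix):
    """Row-based concatenation of a matrix into a single feature vector."""
    return [value for r in matrix for value in r]


if __name__ == "__main__":
    D = "aabcbbbcbcbbbababc"
    patterns, P = transition_probabilities(D, "abc", pattern_len=2)
    vector = flatten_rows(P)
    print(len(vector))                        # number of entries in the flattened array
    print(patterns[vector.index(max(vector)) // 3], max(vector))
```

For this sequence there are 16 transitions in total, and the most frequent one is bc followed by b (three occurrences), so the largest entry of the flattened vector is 3/16.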

DSA Features
We extract DSA features, namely, maximum co-occurrence value with optimal values of W and |V|, from the symbolic sequence to discriminate SR and PAF segments. For this, we discretize the data using varied |V| and calculate the maximum co-occurrence value from the single array → P for varied |V|. In this study, we vary the |V| as 3, 5, 7, 9 and the W as 2, 3, 4, 5, to determine the pattern transition behaviors, respectively. We employ the non-parametric Wilcoxon test on the DSA feature to determine the optimal values of W and |V| that yield the maximum statistical significance.

Supervised Learning Classification
In the third step, we feed the feature to the supervised learning classifiers, namely, kNN, SVM, RF, RoF, and EL. Further, the grid search cross-validation technique is applied to tune the optimal parameters for all the classifiers mentioned above to ensure fair and unbiased comparison (Table 2) [25].

SVM
SVM is one of the most preferred supervised learning methods for classification, regression, and prediction. The classifier differentiates the members of the classes by using hyperplanes. It nonlinearly maps the input data into a higher-dimensional space. An optimal hyperplane maximizes the distance between the two different classes in training. For our data, we use a linear kernel function [26,27].

kNN
It is a simple and popular classification technique that is based on clustering [26]. kNN classifier determines the label of test data by identifying the nearest groups of c clusters in the training set as its neighbors. It estimates the label of test data in three stages: (1) initialization of the training dataset as kNN, (2) computation of the distance metrics between the neighbors, and (3) classification of a test set based on the majority of labels of neighboring sets in the training set. Based on the classification accuracy, we set c = 5.
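The three stages can be sketched as follows. This is a generic nearest-neighbor vote with c = 5, written for illustration only; the function name, the Euclidean distance metric, and the feature values in the example are assumptions, not taken from the study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, c=5):
    """Label `query` by majority vote among its c nearest training samples."""
    # Stage 2: distance from the query to every stored training sample.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Stage 3: majority label among the c closest neighbors.
    votes = Counter(label for _, label in distances[:c])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Stage 1: the training set is simply stored (hypothetical one-dimensional features).
    train_x = [(0.70,), (0.65,), (0.72,), (0.60,), (0.20,), (0.25,), (0.15,), (0.30,)]
    train_y = ["SR", "SR", "SR", "SR", "PAF", "PAF", "PAF", "PAF"]
    print(knn_predict(train_x, train_y, (0.68,)))
    print(knn_predict(train_x, train_y, (0.22,)))
```

An odd value of c avoids tied votes in a two-class problem such as SR versus PAF.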

RF
It is a powerful ensemble learning-based classifier that computes the final decision based on the predicted outcome of the majority of the decision trees. These multiple trees are distinct and trained with data subsets grouped using random extraction with replacement [27]. As a result, some data samples are used more than once for training, which enhances the stability and reliability of the classifier for slight variations [28]. The aggregated output of the multiple decision trees is taken as the final result.

RoF
The RoF is another ensemble learning-based approach. The features are divided into subsets and the principal components of each subset are applied to the decision trees [27]. Further, an equal number of rotations are performed on the feature subsets to generate new decision-making features. Additionally, these new features are also applied to decision trees to predict the labels of test data. Thus, RoF provides a more diverse and robust performance than the simple RF approach.

EL
Adaboost EL improves accuracy by combining multiple strong and weak classifiers. It yields multiple models of classifiers for the same training set. Further, the best decision is computed by using majority voting labels. The learning technique considers the decision of multiple experts of different weights and thereby further improves the classification accuracy. We consider SVM, kNN, RF and RoF as weak classifiers for integrated output [29].
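The final combination step, in which base classifiers vote with different weights, can be sketched as below. This covers only the weighted vote, not the boosting procedure that would learn the weights; the function name and the example weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the base classifiers."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    score = defaultdict(float)
    for label, weight in zip(predictions, weights):
        score[label] += weight
    return max(score, key=score.get)


if __name__ == "__main__":
    # Hypothetical outputs of SVM, kNN, RF, and RoF for one segment.
    predictions = ["PAF", "SR", "SR", "PAF"]
    weights = [0.9, 0.3, 0.3, 0.4]
    print(weighted_vote(predictions, weights))
```

With equal weights, the function reduces to plain majority voting.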

Evaluation
This section describes the hypotheses, the experiments to prove (or disprove) the hypotheses, our validation approaches, and the performance metrics.

Hypotheses
All the experiments are designed to evaluate particular assumptions. We formulate five hypotheses:

Hypothesis 1 (H1). DSA captures the dynamics of the time-series and learns the pattern transition for PAF conditions;
Hypothesis 2 (H2). DSA is a reliable technique to differentiate SR and PAF segments in various databases;
Hypothesis 3 (H3). There is an improvement in performance with optimal DSA parameters;
Hypothesis 4 (H4). DSA is consistent for a varied length of signals and does not require the same timestamp for comparisons;
Hypothesis 5 (H5). DSA supports cross-data analysis and is reliable for series of a reduced length.

Experiments
All the experiments are designed to evaluate particular assumptions.

1. E1: The heatmaps represent the global transition patterns of SR and PAF segments. The pattern transition between SR and PAF segments is computed, and the obtained patterns are compared visually between the segments. H1 will be accepted if the heatmaps for SR and PAF segments are distinct and exhibit varying patterns.
2. E2: The DSA approach is applied to the various annotated databases, and the performance of the classifiers to differentiate SR and PAF segments is compared using their F-measure (F). H2 will be accepted if the F exceeds 90.0%.
3. E3: The average performance of the DSA feature for different |V| and W is compared individually using two databases. Statistical analysis is performed for varied W, |V|, and also to obtain comparable results. The performance of the classifiers is compared using the receiver operating characteristics (ROC) curve for multiple databases.
5. E5: The F is computed on the series of two different but comparable databases to validate the cross-data analysis, and the values are compared. The majority of previous studies evaluated their methods on the same dataset [21,[30][31][32]. Here, the one-minute PAF segments are obtained from AFTDB, and one-minute SR segments are extracted from AFPDB. We applied our DSA method across datasets. H5 is accepted if the F is above 95.0%.

Validation
We use 5-fold cross-validation [10,25,33]. To ensure reliable results, the 5-fold cross-validation is repeated ten times with randomly selected groups, and the average values are used for the performance comparison of the different classifiers. Furthermore, on the extracted features, statistical assessment is performed using a non-parametric Wilcoxon test with paired samples [34]. The average classification accuracy (ACC) of the results is defined as the performance of the classifier. We apply performance metrics, namely, precision (P), recall (R), F, and AUCs [34,35]. The P is defined as the ratio of correctly predicted PAF to the total number of predicted PAF. The R is defined as the ratio of correctly predicted PAF to the total number of PAF. The F is given by the harmonic mean of P and R [2,34].
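The validation scheme and the metrics defined above can be sketched without any external library. The names below are hypothetical, the shuffling seed is an arbitrary choice, and PAF is assumed to be the positive class; the snippet only illustrates the definitions and is not the evaluation code of the study.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                       # new random grouping per repetition
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield train, test


def precision_recall_f(y_true, y_pred, positive="PAF"):
    """Precision, recall, and their harmonic mean for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return precision, recall, f_measure


if __name__ == "__main__":
    y_true = ["PAF", "PAF", "PAF", "SR", "SR"]   # hypothetical labels
    y_pred = ["PAF", "PAF", "SR", "PAF", "SR"]   # hypothetical predictions
    print(precision_recall_f(y_true, y_pred))
    print(sum(1 for _ in repeated_kfold(20)))    # 5 folds x 10 repetitions
```

Averaging the metric over all 50 train/test splits gives the repeated cross-validation estimate described in the text.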

Databases
Several PAF databases are available online. Our selection criteria include the number of subjects, duration of recordings, the grade of other pathological conditions in the signals, and sampling rate (Table 3).

Table 3. Characteristics of the selected databases.
AF Prediction Challenge-2001: leads, 2; subjects, 48; recordings, 50; sampling rate, 128 Hz; segment lengths, 25 and 5 min; resolution, 16 bit; PAF segment length, 5 min; recording duration, 24 h.
AF Termination Challenge-2004: leads, 2; subjects, 30; recordings, 30; sampling rate, 128 Hz; segment length, 1 min; resolution, 16 bit; PAF segment length, 1 min; recording duration, 20-24 h.

The AF Prediction Challenge Database (AFPDB) has been used extensively in PAF-related studies [22,24]. Hence, we use it to validate experiments H1-H4. It contains 50 two-lead ambulatory ECG recordings obtained from 48 subjects, each including 25- and 5-min ECG segments. The records have been sampled at 128 Hz and digitized at a 16-bit resolution over a range of 200 A/D units per millivolt (Table 2). Each recording contains a 25-min and a 5-min segment for normal rhythm and PAF, respectively (Figure 4a,b). It also includes another set of normal segments of 25- and 5-min lengths without any traces of PAF. A computer-aided method was used to annotate each beat.

The AF Termination Challenge Database (AFTDB) is also available online and accessible to the public. It contains 50 two-lead ambulatory ECG recordings from 30 subjects [23]. The signals are sampled at 128 Hz with a resolution of 16 bits over a range of 200 A/D units per millivolt (Table 2). Each record contains a one-minute PAF segment extracted from a 20-24 h ambulatory ECG recording (Figure 4c,d). A computer-aided system has annotated the data individually. Of the 50 recordings, 30 are part of the training set in the challenge database and are freely available. Thus, we considered only these 30 recordings for experiments.

In this study, the RR intervals extracted from the annotated ECG signals of the AFPDB and AFTDB databases are considered. Moreover, in the AFPDB dataset, the 5-min PAF segments and the first 5-min SR segments extracted from the 25-min-long ECG signals are used for experiments H1-H4. Further, the 1-min PAF segments of the AFTDB dataset and the first one-minute SR segments of the 25-min-long ECG of the AFPDB dataset are used for the experiments H2, H3, and H5.

1. R1: The representative ECG signals, their corresponding RR interval, and the discretized series with symbolic sequences are shown in Figure 5. The differences in the distance between the R-spikes of the signals are visible in SR and PAF segments. The discretized symbolic sequence for SR segments contains relatively constant symbols, while the symbolic sequences of PAF segments are irregular and represent a frequent transition in patterns (Figure 5). For example, the symbol sequences {bbbbbbbbbbbbbbbbbbbb} and {decbaecbddccbbbbadbcdcedbd} are for SR and PAF segments, respectively. The heatmaps for word size W = 3 perform the best (Figure 6), which is in agreement with the literature [11,19]. The pattern transitions are consistent in SR but rather random in PAF segments. This is attributed to the fact that the lengths of RR intervals are regular in SR segments. However, for PAF segments, the lengths are inconsistent and hence, the pattern transition followed by symbol b is found to be random. This confirms our hypothesis H1.
2. R2: Comparing SR and PAF segments in both databases, the highest performance of F = 93.6% and F = 98.3% is obtained using W = 4 and 3, respectively (Table 4). The RoF and EL classifiers have high performance for varied W. The P and R are also found to be the highest for W = 3 in both databases. The RoF classifier yields the highest ACC of 93.6% and 98.3% for AFPDB and AFTDB databases, respectively. With varied classifiers, the DSA method differentiates SR and PAF segments in both databases, which confirms our hypothesis H2.
3. R3: The DSA features for varied |V| discriminate SR and PAF segments in both databases (Figure 7). The maximum difference in the median in each database is 0.526 for |V| = 3. The DSA feature for varied W with |V| = 3 in AFPDB and AFTDB is shown in Figure 7c,d. The higher value of the DSA feature indicates the presence of a similar pattern in the segments. The median is higher for SR segments and ranges from 0.5 to 0.75 for both databases. The mean decreases with higher values of W. The percentage difference of the average DSA feature is greater than 25% for varied W in both AFPDB and AFTDB databases. The smallest and largest differences in the average of the DSA feature are observed with W = 5 and 3, obtaining 0.122 and 0.526, respectively. For AFPDB, the top three parameters |V| = 9, 7, and 5 with W = 3 obtained F of 94.6%, 92.7%, and 91.8%, respectively. Similarly, for the AFTDB database, the top three parameters |V| = 5, 7, and 3 with W = 3 yield F of 99.8%, 98.3%, and 98.3%, respectively. When the word size W is set to 3 in both databases, the F is the highest (Table 5). Moreover, the DSA feature computed using |V| and W as 3 exhibits a statistically significant difference in determining SR and PAF segments in both databases (p < 0.005). SVM and kNN perform unreliably for varied S in both databases. EL and RoF yield the highest AUC for varied V (Figure 8). The DSA feature has a discriminative power to identify PAF segments. Based on the F, we select |V| = 3 and W = 3 as the optimal values for further experiments. Furthermore, this confirms our hypothesis H3.
4. R4: For the AFPDB database, kNN and RoF obtained the highest F of 96.2% and 96.0%, respectively (Table 6). However, the AUC is high for longer time series with 3-min and 4-min sequences (96.9% and 97.2% in Table 6). The gain in AUC for a longer sequence can be explained by the fact that longer signals characterize the dynamics of signals effectively. In terms of P and R, kNN and RoF are found to be consistently high for a varied length of the signals (Figure 9). Furthermore, the ACC is observed to be high and consistent for kNN, RF, and RoF in 3-min sequences. Overall, the pattern transition in varied lengths of the sequences is captured by our DSA method (Table 6), which confirms our hypothesis H4.
5. R5: The performance ranges from 84.1% to 99.8% for varied V using different classifiers. For the cross-dataset, the top three symbol lengths |V| = 5, 7, and 9 obtained the highest F of 99.8% (Table 7). RoF and EL yield the highest performance. Except for SVM and kNN, the performance of DSA with varied |V| is higher than 90.0%. Similarly, the AUC also ranges from 85.0% to 99.8% for varied S using different classifiers. The F is found to be consistent in |V| = 5 and 7, for RoF and EL classifiers. The kNN classifier yields the lowest performance of the classifiers. Thus, we can conclude that our method is effective in the cross-data analysis (H5 is true).
Figure 9. Comparison of P (a) and R (b) obtained for a varied length of time series by the DSA method and its classification using different classifiers.

Existing Challenges
The detection of PAF segments in ambulatory ECG recordings is essential and various works have been proposed. Many of these approaches are database-specific and cannot be applied to the signals with variable length [4,7,11,36]. Indeed, most of the PAF detection studies have been validated on an individual database with selected recordings to improve performance [2,10]. Recent advancements in recording devices such as long-term monitoring, wearables, and contactless ECG devices will improve continuous monitoring but will also yield more diverse quality and nature of signals [2,37]. Furthermore, domain adaptation and cross-data analysis in ECG signals have not been evaluated comprehensively to improve PAF detection and reduce false detection [38,39]. Therefore, the need for robust approaches with high performance in terms of PAF detection accuracy has been widely acknowledged [1,7,19].
Recently, symbolization techniques have been explored in physiological signals to characterize time-series dynamics and capture the transition pattern in pathological periods [16,20]. For instance, Akbilgic et al. [19] used symbolization to determine AF patterns in ECG signals. Mahajan et al. [11] applied symbols to discriminate normal and congestive heart failure conditions. Most of the symbolic approaches discretize the amplitude to capture the dynamic behavior of the signals [16,18,31]. However, this limits them to selected leads and short-term recordings.
In this paper, we proposed a novel DSA method to symbolize RR intervals on the time axis to classify SR and PAF segments. To the best of our knowledge, our method is the first of its kind that uses a dynamic breakpoint list (Equation (4)), which overcomes the selective leads issue. Further, our implementation has been evaluated on multiple databases (Table 3). We have shown that DSA captures reliable transition patterns in sequences to discriminate SR and PAF in ECG with our experiments. DSA is robust, generalizable, and consistent ( Figure 6).

Multi-Length ECGs
To evaluate the performance of the DSA method on sequences of varied length, we computed the F and AUC values for signals of different lengths in the AFPDB database. The DSA method distinguishes the segments effectively across the different lengths. The highest F obtained by the DSA method is 99.8%. The top performance is achieved in shorter sequences (1 and 3 min). Except for SVM, the F of all classifiers is above 90.0% across 1- to 5-min signals. This indicates that the DSA method may detect PAF in mobile recordings, even with short traces.
We observe that the DSA method has a clear tendency to discriminate SR and PAF segments using RR interval data, even in varied time segments. This is in line with the results reported by Mahajan et al., who concluded that smaller segments of the signals are suitable for identifying PAF segments [30]. In most cases, PAF episodes are asymptomatic and episodic in nature; hence, the diagnosis of PAF is quite challenging, especially in the early stages [22,23]. The DSA method captures the transition pattern of the sequences irrespective of signal length. Therefore, identifying irregular patterns in short sequences will aid in the timely diagnosis of AF segments.
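As an illustration of how fixed-duration segments (for example, 1 to 5 min) could be cut from an RR interval series, consider the following sketch. The helper `split_by_duration` is hypothetical, and dropping the trailing incomplete window is an assumption; the segmentation actually used in our experiments may differ.

```python
def split_by_duration(rr_seconds, window_seconds):
    """Group consecutive RR intervals into windows of at least window_seconds.

    Assumption: a window closes as soon as its accumulated duration
    reaches the target, and a trailing incomplete window is discarded.
    """
    segments, current, elapsed = [], [], 0.0
    for rr in rr_seconds:
        current.append(rr)
        elapsed += rr
        if elapsed >= window_seconds:
            segments.append(current)
            current, elapsed = [], 0.0
    return segments


# Illustrative series: 150 beats of exactly 1.0 s (made-up values).
example_rr = [1.0] * 150
one_minute_segments = split_by_duration(example_rr, window_seconds=60)
```

Each resulting segment can then be symbolized independently, so the same feature extraction applies to any window length.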

Cross Dataset Analysis
Most studies have been restricted to a single database to identify and discriminate PAF segments. Park et al. [40] verified their method on the AFPDB and AFTDB databases to differentiate PAF segments. Although that technique has been tested on multiple databases, cross-data analysis and domain adaptation are rarely used to assess the effectiveness of such routines in clinical conditions. Therefore, we tested the DSA method for cross-data analysis. In line with Natarajan et al. [41], the DSA method captures the transition patterns and discriminates the PAF segments in the cross-data analysis. Moreover, the performance of DSA increases for short segments in the cross-data analysis. Therefore, DSA is reliable for determining transition patterns and can be extended to other pathological conditions.
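The cross-data protocol, in which a model is fitted on one database and scored on another, can be sketched as follows. The F-measure follows its standard definition (the harmonic mean of precision and recall); the single-feature threshold classifier (`fit_threshold`, `predict_threshold`) is only a hypothetical stand-in for the classifiers used in this study, and all data values are made up.

```python
def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(features, labels):
    """Toy stand-in model: midpoint between the two class means."""
    negatives = [x for x, y in zip(features, labels) if y == 0]
    positives = [x for x, y in zip(features, labels) if y == 1]
    mean_negative = sum(negatives) / len(negatives)
    mean_positive = sum(positives) / len(positives)
    return (mean_negative + mean_positive) / 2


def predict_threshold(threshold, feature):
    """Label a sample positive (1) when its feature exceeds the threshold."""
    return 1 if feature > threshold else 0


def cross_data_f(train, test):
    """Fit on one database and report F on a different database."""
    threshold = fit_threshold(train["x"], train["y"])
    predictions = [predict_threshold(threshold, x) for x in test["x"]]
    return f_measure(test["y"], predictions)


# Made-up one-dimensional features; 1 denotes PAF and 0 denotes SR.
database_a = {"x": [0.1, 0.2, 0.8, 0.9], "y": [0, 0, 1, 1]}
database_b = {"x": [0.3, 0.7, 0.6, 0.4], "y": [0, 1, 1, 1]}
score = cross_data_f(train=database_a, test=database_b)
```

Keeping the training and test databases strictly separate is what distinguishes this protocol from pooling both databases into a single dataset.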

Application with Novel Sensing Devices
Although wearable sensor-based monitoring provides a continuous recording of physiological signals, the signals from wearables and capacitive sensors are noisy and may contain dropouts on several leads [14]. Furthermore, most existing techniques are not effective on such sensors due to motion artefacts and nonstandard ECG segments [14]. Therefore, we tested our method on ECG segments acquired from two sensors, namely, a wearable T-shirt (Pro-Kit, Hexoskin, QC, Canada) and the cECG chair (Smart Seat, Capical, Braunschweig, Germany), in real-time conditions (Figure 10) [42–45]. The STAPLE approach is used to determine the R peaks of the acquired ECG segments [46]. In line with hypothesis H1, our method also captures the transition patterns and determines the SR segments in such sensors (Figure 10). Moreover, our method also discriminates the SR segments in the nonstandard capacitive ECG segments. Therefore, our method is reliable for wearables and capacitive ECG sensors.
However, the DSA approach has a limitation. Although DSA-based features combined with a classification model reliably identify different disease conditions, the approach depends on the R-wave detection algorithm. Hence, a robust R-wave detection algorithm must be applied before the DSA symbolization step.

Comparison and Future Scope
The comprehensive evaluation of the DSA method has shown that our method can be extended to other leads and has higher potential. Further, the performance of DSA compared with state-of-the-art methods shows its effectiveness (Table 8). Except for Sutton et al., most of the methods used the split validation technique for evaluation. Further, Park et al. evaluated the performance of their method by combining both databases into a single dataset. In the current study, sequences of multiple window lengths have also been tested. The DSA method is computationally efficient, requiring only a few seconds (<9.0 s) for smaller symbol lengths to discriminate SR and PAF segments. Hence, it is possible to implement DSA-based PAF detection routines on PCs or smartphones. In the future, our method can be explored for signals from wearable devices, contactless ECGs, smart textiles, and other portable devices with nonstandard lead positions. In addition, RR interval sequences obtained from camera-based non-contact sensors can be used to evaluate the robustness of the DSA.
Furthermore, DSA with deep learning could be explored to determine the abnormal transition patterns in the symbolic sequences.

Conclusions
In this paper, we have shown that the symbolization technique is a powerful tool for analyzing the irregularities and abnormalities of ECG signals. We proposed DSA to differentiate SR and PAF segments. For this, we used ECG signals and their RR intervals from the AFPDB and AFTDB databases. We transformed RR intervals into a symbolic representation and computed co-occurrence matrices. We extracted the DSA feature using varied |V| and W and then applied it to five machine learning algorithms for classification. Our DSA method determines the transition patterns of the signals and robustly discriminates PAF segments in mobile-recorded ECG signals. Our symbolization approach with low |V| and W parameters is effective in identifying abnormalities. The combination of |V| = 3 and W = 3 features with an ensemble learning-based classifier gives the maximum performance (F = 94.0% and F = 99.80% for the AFPDB and AFTDB databases, respectively). Moreover, our method can differentiate PAF segments using variable-length signals and also supports cross-data analysis. Our method is amplitude-invariant and can be used for long-term signals. Thus, it is useful for abnormality detection in physiological signals. Furthermore, the approach can be explored for acoustic signals, mechanical signals, and other physiological signals.