Beat-to-Beat P-Wave Analysis Outperforms Conventional P-Wave Indices in Identifying Patients with a History of Paroxysmal Atrial Fibrillation during Sinus Rhythm

Early identification of patients at risk for paroxysmal atrial fibrillation (PAF) is essential to attain optimal treatment and a favorable prognosis. We compared the performance of a beat-to-beat (B2B) P-wave analysis with that of standard P-wave indices (SPWIs) in identifying patients prone to PAF. To this end, 12-lead ECG and 10 min vectorcardiogram (VCG) recordings were obtained from 33 consecutive, antiarrhythmic therapy naïve patients, with a short history of low burden PAF, and from 56 age- and sex-matched individuals with no AF history. For both groups, SPWIs were calculated, while the VCG recordings were analyzed on a B2B basis, and the P-waves were classified to a primary or secondary morphology. Wavelet transform was used to further analyze P-wave signals of main morphology. Univariate analysis revealed that none of the SPWIs performed acceptably in PAF detection, while five B2B features reached an AUC above 0.7. Moreover, multivariate logistic regression analysis was used to develop two classifiers—one based on B2B analysis derived features and one using only SPWIs. The B2B classifier was found to be superior to SPWIs classifier; B2B AUC: 0.849 (0.754–0.917) vs. SPWIs AUC: 0.721 (0.613–0.813), p value: 0.041. Therefore, in the studied population, the proposed B2B P-wave analysis outperforms SPWIs in detecting patients with PAF while in sinus rhythm. This can be used in further clinical trials regarding the prognosis of such patients.


Introduction
Atrial fibrillation (AF)-the most common sustained cardiac arrhythmia-while not a life-threatening condition itself, leads to an increased risk of stroke or heart failure and high rates of mortality [1]. Therefore, early detection and diagnosis of AF is a critical issue for all health stakeholders. Identifying individuals at a higher risk of developing AF is feasible using predictive models based on clinical variables, circulating biomarkers, and P-wave indices (PWIs) [2].

Materials and Methods
We invited 40 consecutive patients with newly diagnosed (less than a month) PAF and no history of antiarrhythmic medication to participate in the study; 60 age-and gender-matched individuals without any history of PAF or structural heart disease were used as the control group. Patients with comorbidities, such as previous cardiovascular surgery, previous cardiac ablation, heart failure NYHA class III-IV, severe valvular heart disease, prosthetic valves, reduced life expectancy, age > 75 years, atrioventricular block, presence of implanted pacemaker or cardiac defibrillator, moderate/severe renal or hepatic impairment, were excluded from the analysis. A complete medical history was obtained from all study volunteers, and both groups underwent clinical examination and calculation of the CHA 2 DS 2 -VASc score. Eventually, 33 patients with a PAF history and 56 controls fulfilled inclusion criteria. The baseline characteristics of the two groups can be found in Table 1. Standard 12-lead ECGs were obtained from all participants. The recordings were performed at least 7 days apart from any AF episode to minimize the potential effect of myocardial stunning. The ECGs were scanned, stored in digital format, magnified suffi-Diagnostics 2021, 11,1694 3 of 17 ciently, and analyzed with digital image processing software (imagej.nih.gov/ij, accessed on 21 October 2020). Additionally, three-orthogonal axis system (X-frontal, Y-vertical, and Z-sagittal axis) vectorcardiographic (VCG) signals of 10 min duration were also recorded at the same time, with study individuals resting in the supine position, using a high sampling rate (1000 Hz) Galix GBI-3S Holter monitor.
All participants were informed about the scope of the study and gave written informed consent. The study complied with the Declaration of Helsinki and was approved by the Special Purpose General Assembly of Aristotle University School of Medicine (#8, approved on 9 September 2016).
2.1. Standard P-Wave Indices 2.1.1. The 12-Lead ECG Indices Eight conventional 12 lead ECG PWIs, PR duration, P-wave duration, P-wave dispersion, P-wave peak time, P-wave axis, P-wave voltage in lead I, PTFV 1 , and P-wave area were measured by three observers, and mean values were calculated. P-wave dispersion was defined as the difference between the longest, and the shortest P-wave duration measured in any of the standard ECG leads. P-wave peak time is a novel index, equal to the duration between the beginning and peak of the P-wave measured in leads II or V 1 [11]. An abnormal P-wave axis was determined as a frontal P-wave axis less than 0 • or more than 75 • [3], while PTFV 1 was calculated as the amplitude-duration product of the terminal negative component of the P-wave in lead V 1 [12]. P-wave area (mV × ms) was measured as the sum of the absolute areas underneath the positive and negative P-wave deflections, and the maximum area from among the 12 leads was selected [3].
Furthermore, P-wave biphasicity in inferior leads was assessed to identify IAB type [13]. Partial-IAB (p-IAB) was defined as a P-wave ≥ 120 ms without a negative deflection in the inferior leads (II, III, aVF), and advanced IAB (a-IAB) as a P-wave ≥ 120 ms, along with biphasic morphology in inferior leads. Finally, a relatively new composite score, the MVP score (morphology-voltage-P-wave duration), was calculated, assigning up to two points to each of the three components [14] (Appendices A.1.1-A.1.7).

Orthogonal Morphology
Orthogonal P-wave morphology was accessed according to P-wave positive/negative deflection, or biphasicity, in leads X, Y, and Z. Three predefined types (orthogonal type 1, 2, or 3) indicative of the interatrial conduction route were considered [15]. In types 1 and 2, leads X and Y are positive, and lead Z is either negative or biphasic, while in type 3 lead X is positive and lead Y is biphasic. (Appendix A.1.8).
In conclusion, a total of fourteen SPWIs was taken into consideration; the eight conventional 12 lead ECG PWIs, along with the MVP score, a-IAB, p-IAB, and orthogonal types 1-3.

Beat-to-Beat Classification into Main and Secondary Morphology
Orthogonal vectorcardiograms were further studied to accomplish B2B analysis. Signal processing was performed using MATLAB R2015a, The MathWorks, Inc., Natick, MA, USA. Following an automated signal pre-processing procedure, consisting of denoising and QRS complex detection, artifacts and ectopic beats were removed in a semi-automated manner. According to a methodology previously described [9], where the existence of main and secondary P-wave morphologies was proposed, a clustering technique was used to classify P-waves into distinct groups of main, secondary, or other less frequent morphologies. Specifically, a signal window of 250 ms preceding every QRS complex, named P segment (P SEG ), was defined. Following a k-means clustering method, the optimal number of clusters, which better classify the P SEGs was calculated, and every P SEG was allocated to a cluster. The mean morphology of the cluster containing most P SEGs was used as a template for the detection of the P-waves matching the main morphology. For the remaining P SEGs , following the same process once again, an additional P-wave template was extracted, to detect the P-waves allocated to the secondary morphology. The beats that did not match any of the morphologies were considered as other morphologies. Therefore, the secondary morphology was not extracted merely as the mean morphology of the P-waves not matching the main morphology, but rather as the mean values of a group of beats highly correlated and truly forming a cluster. Finally, the percentage of P-waves matching the main and the secondary morphology in each lead was calculated, as well as the percentage of P-waves following simultaneously the main morphology in all three leads (Appendix A.2.1).

Time-Domain Analysis
Time features regarding P-wave position related to the QRS complex were calculated. Thus, the distance between P-wave onset, offset, and maximum value, and Q-or R-points were assessed for every beat allocated to the main morphology.

Time-Frequency Domain Analysis
The P-waves of the dominant morphology were further analyzed in a B2B aspect using continuous wavelet transform, an approach already been used for processing nonstationary signals such as ECG. The base wavelet used was the complex Morlet wavelet. According to a previous study of ours, where optimal frequency zone range was investigated, analysis was performed in three non-overlapping bands (low-L: 30-70 Hz, medium-M: 70-160 Hz, and high-H: 160-200 Hz) [9]. Frequency features such as mean, median, and maximum energy, as well as time-frequency features, such as maximum energy location regarding P-wave onset, offset, or peak, were calculated (Appendix A.2.2).
Eventually, P-wave B2B analysis, consisting of classification into primary and secondary morphology, and time-domain and time-frequency domain analyses, in three different axes, in three individual frequency zones, and both mean value and coefficient of variation calculation of selected variables ended up to a total of 262 features ( Table 2). Distance between P-wave onset to R-wave 3 -2 6 Distance between P-wave onset to Q-wave 3 -2 6 Distance between P-wave peak to R-wave 3 -2 6 Distance between P-wave peak to Q-wave 3 -2 6 Distance between P-wave offset to R-wave 3 -2 6 Distance between P-wave offset to Q-wave 3 -2 6 Distance between P-wave onset to P-wave peak 3 -2 6 Distance between P-wave onset to P-wave peak/P-wave duration 3 -2 6 Distance between P-wave peak to P-wave offset 3 -2 6 Distance between P-wave onset and maximum energy location 3 3 2 18 Distance between P-wave onset and maximum energy location normalized to P-wave duration 3 3 2 18 Distance between maximum energy location and P-wave offset 3 3 2 18 Distance between maximum energy location and R-wave 3 3 2 18 Distance between maximum energy location and Q-wave 3 3 2 18 Distance between P-wave peak and maximum energy location  Total 262 Features derived from beat-to-beat P-wave morphology and wavelet analysis. Abbreviations: leads (orthogonal leads X, Y, Z; freq. zone: frequency zones (low, mid, and high); mean: mean value; cv: coefficient of variation (SD/mean).

Statistical Analysis
Statistical analysis data were expressed as mean ± standard deviation for continuous variables. The Mann-Whitney U test or the Fisher's exact test was used, as appropriate, for statistical testing. Univariate logistic regression analysis was performed on statistically significant features to determine variables with prognostic value on detecting patients with a history of AF, and variables were ranked according to their area under the curve (AUC). Predictors were checked for collinearity, and the features with the highest AUC among those highly correlated were selected for further study.
Following multivariate logistic regression analysis, two different classifiers were developed-one based on features derived from B2B analysis and one using SPWIs. Receiver operating characteristics (ROC) curves were delineated for these classifiers, and the AUC was calculated to estimate the predictive value of each one. AUC comparison was accomplished using DeLong's method [16]. Statistical analysis was performed using MAT-LAB (R2015a) computer software, and an alpha level of <0.05 was accepted as statistically significant.

Results
The studied dataset consisted of 276 features in total; of those, 262 variables were obtained from B2B analysis and 14 from SPWIs. Five SPWIs were found significantly different between the two groups, while B2B analysis was represented by nineteen statistically significant features; seven of them were derived from the percentage of main or secondary morphologies and twelve from spectral and temporal analysis (Table 3). Table 3. P-wave indices and beat-to-beat variables in PAF patients and healthy controls.

Features
Healthy (n = 56) AF (n = 33)  P-wave peak to R-wave-X, cv 0.08 ± 0.2 0.1 ± 0.08 <0.001 P-wave peak to Q-wave-X, cv 0.03 ± 0.4 0.2 ± 0.2 <0.001 P-wave onset to R-wave-X, cv 0.03 ± 0.02 0.05 ± 0.04 0.004 P-wave onset to Q-wave-X, cv 0.04 ± 0.02 0.08 ± 0.06 0.004 P-wave onset to P-wave peak (peak time)-X, cv 0.05 ± 0.03 0.08 ± 0.06 <0.001 P-wave peak to P-wave offset-X, cv 0.1 ± 0.5 0.09 ± 0.06 <0.001 P-wave peak to P-wave offset-Y, cv 0.05 ± 0.03 0.1 ± 0.2 0.041 Time-frequency domain (ms) Maximum energy location to P-wave offset-mid band, Z, mean 61.1 ± 18.6 53.8 ± 20.1 0.044 Maximum energy location to P-wave onset, normalized to P-wave duration-high band, Z, mean 0.6 ± 0.1 0.6 ± 0.09 0.049 Maximum energy location to P-wave onset, normalized to P-wave duration-mid band, Z, mean 0.6 ± 0. Nine traditional PWIs did not differ significantly between the two groups: PR duration, P-wave dispersion, P-wave peak time, P-wave axis, PTFV 1 , P-wave orthogonal type, and a-IAB values were comparable in control and PAF groups. On the other hand, the widely used P-wave duration was longer, and the P-wave area, along with P-wave voltage in lead I were smaller in PAF patients, compared to the control group. Moreover, p-IAB was more common and the MVP score was higher in the PAF group.
Additionally, morphology analysis revealed significant differences between main, secondary, and secondary to the main ratio in both axis X and Y, along with the percentage of P-waves following main morphology in all leads at the same time. Furthermore, both time-domain features and frequency-derived parameters in all three axes were represented by significant variables such as coefficient of variation of P-wave peak time in axis X, mean value of maximum energy in the high-frequency band in axis Y, or mean value of maximum energy location to P-wave duration in the mid-frequency band in axis Z.
Following univariate logistic regression analysis, variables were ranked according to estimated AUC, and odds ratios (OR) were calculated. Sixteen features were significant predictors, while eight variables did not reach the level of significance or AUC lower limit above 0.5 and were excluded from further analysis (Table 4). Collinearity test revealed a strong correlation between main, secondary, and secondary to main morphology ratio in axis X, between the coefficient of variation of the distance between P-wave onset and the Q-or R-wave in axis X, and between the MVP score and P-wave duration. Only the features with the highest AUC among those highly correlated were selected (Table 4). Thus, eight B2B features and four SPWIs were finally used in multivariable regression analysis. Eventually, two classifier groups were developed: one using SPWIs and one using B2B variables. The risk of overfitting was kept low by ensuring an event-per-variable ratio above 10, and thus, only models consisting of ≤3 compounds were taken into consideration [17].
All suggested models for each category were compared to each other in terms of AUC, and those with the highest performance in our data sample were selected (Table 5). In the first case, a B2B classifier consisting of the following three features was proposed: (i) coefficient of variation of distance between the P-wave peak and Q-wave in X axis, (ii) percentage of P-waves allocated in main morphology in X axis, and (iii) mean value of maximum energy in the high-frequency band in the Y axis. In the second case, the model with the highest AUC among all SPWIs classifiers consisted of (i) p-IAB, (ii) P-wave voltage in lead I, and (iii) P-wave duration.  with the highest AUC among all SPWIs classifiers consisted of (i) p-IAB, (ii) P-wave voltage in lead I, and (iii) P-wave duration.

Discussion
In the current study, we assessed the performance of B2B P-wave analysis in distinguishing PAF patients from individuals with no history of AF, compared to that of standard PWIs. In our sample, a B2B-derived model outperformed a classifier based on established ECG-based PAF predictors.
Univariate analysis revealed that only B2B indices, and none SPWI, achieved an acceptable AUC > 0.7 in predicting PAF (Table 4). Furthermore, the B2B classifier consists of three components coming from three different fields of B2B analysis: main and secondary morphology analysis (percentage of P-waves allocated in main morphology in X axis), temporal analysis (coefficient of variation of distance between the P-wave peak and Q-wave in X axis), and wavelet analysis of main P-wave morphology (mean value of maximum energy in the high-frequency band in the Y axis), underlying the importance of an integrated approach to PAF prediction. Additionally, the SPWI classifier consists of p-IAB, voltage in lead I, and P-wave duration, features that, interestingly, are also compounds of the recently developed MVP score [14].
Many PWIs have been previously found to be highly correlated to AF development. Pwave duration may be the most widely studied among them. In our study, P-wave duration along with voltage in lead I, and patrial IAB was one of the variables included in the SPWI classifier. In a meta-analysis of Framingham Heart Study (FHS) and Atherosclerosis Risk in Communities (ARIC) study, P-wave duration > 120 ms was significantly associated with AF [18]. Moreover, low P-wave amplitude in lead I has been related to displaced interatrial conduction and AF recurrence in another study [19].
Additionally, in the aforementioned work, the P-wave area did not show a robust correlation with AF, and PTFV 1 predicted AF only in the ARIC study and not in FHS [18]. However, PTFV 1 has been found in other studies to be significantly associated with AF occurrence [3,12] but not in ours.
Advanced interatrial block (a-IAB) is an established electrocardiographic phenotype, that has been thoroughly studied. However, in Regicor Study it was found that a-IAB did not seem to provide an additional AF risk beyond that of P-wave duration [20], which was concordant with our findings. Risk factors for developing a-IAB such as aging, hypertension, and obesity are similar to those for developing AF, interpreting the relationship between a-IAB and AF [20]. Since, patients selected in our study were relatively young, with a short history of PAF and few other comorbidities, there were very few cases of a-IAB, not enough to support a statistically significant result. On the contrary, p-IAB was found to be a quite powerful predictor and thus was included in the proposed SPWI classifier.
One of the novelties in our work is the study of P-wave variability with means of B2B analysis. Most previous studies on P-wave analysis were conducted on 10 s ECG recordings or signal-averaged ECGs. However, a B2B approach on a P-wave study is feasible and effective [21,22]. In fact, in our study, not only we managed to observe P-wave variability in 10 min recordings, but we also succeeded in depicting and quantifying primary and secondary morphologies using an algorithm formerly described and repeatedly tested for consistency in various datasets. A higher percentage of recorded beats belonging to secondary morphologies and a lower percentage of beats of dominant morphology, in various orthogonal axes, were strongly associated with PAF, both in the current dataset and in a different dataset, previously studied [9]. This P-wave variability can only be reliably quantified in relatively lengthy ECG recordings such as the ones we use in the current study and could be proved to be the main advantage of the proposed modality. In fact, since the whole process is largely automated, the workload required to analyze a 10-minute recording is considerably low.
In addition, it is of great importance to underline that the patients who participated in our work had a noticeably short history of PAF (less than a month), were relatively young (mean age: 55.4 ± 12.6), and had a low CHA 2 DS 2 -VASc score (1.2 ± 1.3). Thus, using the proposed analysis it is feasible to detect predisposition to AF in the early stages.
Similarly, impaired interatrial conduction, assessed utilizing orthogonal morphology type, has been observed in young patients without comorbidities, implying that alterations in atrial electrophysiology may be the cause rather than the consequence of AF [23]. Moreover, in our study, patients receiving antiarrhythmic drugs were excluded, so it is safe to conclude that there was no interference to results caused by antiarrhythmic medication.
The case-control design of this study is a relative limitation but considered acceptable for a study investigating a novel diagnostic modality. Larger sample size would probably provide more robust results. In this work, simple logistic regression was employed, but in the future, more sophisticated artificial intelligence-based classifiers trained with features derived from larger populations are expected to reach a higher level of accuracy.
Although a well-known ECG-derived AF predictor, signal-averaged ECG (SAECG) was not analyzed in the current study. Recording atrial late potentials is technically challenging; thus, the hypothesis that low-amplitude atrial potentials are associated with AF was initially rejected [24]. However, using improved techniques, several SAECG P-wave indices were studied. In fact, atrial SAECG is widely used to calculate filtered P-wave duration (FPD), a more precise and reproducible measure of P-wave duration. On the contrary, SAECG late potential parameters, such as root mean square voltage of terminal Pwave segments, supposed to represent the activation of the left pulmonary veins [25], have shown little reproducibility and their clinical importance is controversial [26]. Therefore, in patients with PAF, FPD was longer, compared to normal controls, while no statistical difference was found in atrial late potentials [27]. Furthermore, FDP can identify hypertensive PAF patients while in sinus rhythm [28] and, along with left atrial dimension, is considered significant predictor of transmission from paroxysmal to persistent or permanent AF [29,30]. Platonov et al. have shown that fibrosis and fatty infiltration were more manifest in autopsies of patients with permanent AF than in those with paroxysmal AF, regardless of patients' age [31]. Moreover, P-wave duration and amplitude show a significant correlation with low-voltage area size and may be used as a non-invasive tool to predict severe scarring, as experimental and simulation studies suggest [32,33]. Therefore, PWIs probably predict AF progression by reflecting atrial fibrosis extension. Furthermore, B2B P-wave variability in patients with paroxysmal AF could be explained by variability in sinoatrial node exit location in combination with slow conducting regions, such as scars, as has been shown in computational simulation studies [34,35].
In future studies, an association between B2B indices and AF predictors other than SPWIs can be examined. Thus, a correlation between B2B analysis and echocardiographic parameters of left atrial size and function or characteristics of the left atrial substrate such as low voltage areas would be a potential research topic. Moreover, prospective studies can be conducted to determine B2B predictors' ability to detect clinical conditions beyond AF, such as embolic strokes of an undetermined source (ESUS).

Conclusions
Early identification of patients in the absence of organic heart disease or other predisposing factors, who are prone to PAF is a crucial issue for all health stakeholders. B2B P-wave analysis is a novel, effective approach for early AF prediction and seems that it is superior to the conventional PWIs analysis. Larger datasets will be necessary to aid the development of artificial intelligence models based on this analysis, to accurately predict PAF during sinus rhythm. Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The advanced interatrial block is defined as P-wave duration > 120 ms along with biphasic P-wave in inferior leads ( Figure A1) [13]. Figure A1. Advanced interatrial block. P-wave duration more than 120 ms along with biphasic P-wave in inferior leads II, III and aVF.
Appendix A.1.2. P-Wave Dispersion P-wave dispersion is defined as the difference between the longest and the shortest P-wave duration measured in any of the standard ECG leads ( Figure A2). P-wave dispersion is expected to be increased in PAF. Appendix A.1.3. P-Wave Area P-wave area is calculated as the sum of the absolute areas underneath the positive and negative P-wave deflections, and the maximum area from among the 12 leads is selected ( Figure A3). It is decreased in PAF, with a cut-off value of 4 mV × ms [3]. Appendix A.1.3. P-Wave Area P-wave area is calculated as the sum of the absolute areas underneath the positive and negative P-wave deflections, and the maximum area from among the 12 leads is selected ( Figure A3). It is decreased in PAF, with a cut-off value of 4 mV × ms [3]. PTFV1 is calculated by multiplying the depth by the duration of the P-wave terminal negativity in V1 ( Figure A4). A product of more than 4 mV × ms is considered abnormal [12]. The P-wave axis is calculated in the frontal plane ( Figure A5). Axis values between 0° and +75° are considered to be normal [3].   Figure A4). A product of more than 4 mV × ms is considered abnormal [12]. PTFV1 is calculated by multiplying the depth by the duration of the P-wave terminal negativity in V1 ( Figure A4). A product of more than 4 mV × ms is considered abnormal [12]. The P-wave axis is calculated in the frontal plane ( Figure A5). Axis values between 0° and +75° are considered to be normal [3].  The P-wave axis is calculated in the frontal plane ( Figure A5). Axis values between 0 • and +75 • are considered to be normal [3]. PTFV1 is calculated by multiplying the depth by the duration of the P-wave terminal negativity in V1 ( Figure A4). A product of more than 4 mV × ms is considered abnormal [12]. The P-wave axis is calculated in the frontal plane ( Figure A5). Axis values between 0° and +75° are considered to be normal [3].  Appendix A.1.6. P-Wave Voltage in Lead I P-wave voltage below 0.1 mV in lead I is considered abnormal ( Figure A6) [19]. Appendix A.1.6. P-Wave Voltage in Lead I P-wave voltage below 0.1 mV in lead I is considered abnormal ( Figure A6) [19]. The MVP score (Morphology-Voltage-P-wave duration) is calculated by assigning up to two points to each one of the three components (Table A1) [14].

Variable
Value Score

Morphology in inferior leads
Nonbiphasic (<120 ms) 0 Nonbiphasic (≥120 ms) In orthogonal types 1 and 2, leads X and Y are positive, and lead Z is either negative or biphasic. In type 3, lead X is positive, and lead Y is biphasic ( Figure A7). Each type represents a different interatrial activation route, with type 3 being highly correlated to AF [15]. The MVP score (Morphology-Voltage-P-wave duration) is calculated by assigning up to two points to each one of the three components (Table A1) [14]. Table A1. MVP score calculation.

Variable
Value Score

Morphology in inferior leads
Nonbiphasic (<120 ms) 0 Nonbiphasic (≥120 ms) In orthogonal types 1 and 2, leads X and Y are positive, and lead Z is either negative or biphasic. In type 3, lead X is positive, and lead Y is biphasic ( Figure A7). Each type represents a different interatrial activation route, with type 3 being highly correlated to AF [15]. Following a clustering procedure, each P-wave is allocated to main, secondary, or other less common morphologies ( Figure A8). In people suffering from AF, the percentage of P-waves allocated in main morphology is smaller than in healthy individuals. On the contrary, P-waves of secondary morphology are more common in this population [9]. Figure A8. Beat-to-beat allocation of P-waves to main or secondary morphology.