Background: Different treatments may be required for paroxysmal versus non-paroxysmal atrial fibrillation. However, they may be difficult to distinguish on an electrocardiogram (ECG). Machine learning methods may aid in differentiating these conditions, yet current approaches either do not preserve patient privacy or tend to make the unrealistic assumption of uniform data.
Methods: We create a non-independent and identically distributed (non-IID) dataset for paroxysmal versus non-paroxysmal atrial fibrillation detection. Two baselines (a centralized classifier and a federated classifier) are trained, and classifiers that are pretrained on shared data before federated training are compared against them.
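The data setup described in the Methods can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' procedure: the function name `split_shared_and_clients`, the `skew` parameter, and the label-skew scheme are all hypothetical, chosen to show one common way of carving out a shared pretraining subset and distributing the remaining records non-IID across clients.

```python
import random


def split_shared_and_clients(labels, shared_fraction, n_clients, skew, seed=0):
    """Return (shared_indices, per-client index lists).

    Hypothetical helper, not taken from the paper.
    shared_fraction: share of records pooled for pretraining (e.g. 0.1).
    skew: probability that a record goes to the client that "prefers" its
          label; 0.5 with two clients is roughly IID, values near 1.0 give
          strongly non-IID clients.
    """
    if n_clients < 2:
        raise ValueError("need at least two clients")
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    # Carve out the shared subset first; it is never given to a client.
    n_shared = int(round(shared_fraction * len(indices)))
    shared, rest = indices[:n_shared], indices[n_shared:]

    clients = [[] for _ in range(n_clients)]
    for i in rest:
        preferred = labels[i] % n_clients
        if rng.random() < skew:
            clients[preferred].append(i)
        else:
            others = [c for c in range(n_clients) if c != preferred]
            clients[rng.choice(others)].append(i)
    return shared, clients


# Toy labels (0 = paroxysmal, 1 = non-paroxysmal); made-up data.
labels = [0] * 500 + [1] * 500
shared, clients = split_shared_and_clients(labels, 0.1, n_clients=2, skew=0.9)
print(len(shared), [len(c) for c in clients])
```

Under this assumed scheme, a model would first be trained on the `shared` records and then used as the starting point for federated training on the per-client partitions.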
Results: The centralized classifier outperforms all other models, while the federated model performs worst. Classifiers that are pretrained on 10%, 30%, and 50% of shared data (CNN-10, CNN-30, and CNN-50, respectively) perform better than the purely federated model. Furthermore, no performance difference is observed among the models trained on shared data: a one-way analysis of variance across the shared data models does not reject the null hypothesis.
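For readers unfamiliar with the test named in the Results, the following minimal sketch computes the one-way analysis of variance F statistic from scratch. It is not the authors' analysis code; the function name `one_way_anova_f` is hypothetical, and the numbers in the usage lines are toy values, not results from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Each inner list would hold the scores of one model, for example one
    score per repeated training run.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy numbers only, chosen so the arithmetic is easy to check by hand.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)  # 3.0 2 6
```

The F statistic is then compared against the F distribution with the returned degrees of freedom to obtain a p-value; a large p-value means the null hypothesis of equal group means is not rejected.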
Conclusions: Partially sharing data when building federated machine learning models may significantly improve performance. Furthermore, the volume of data that needs to be shared may be relatively small.