Federated Graph-Transformer Network for Coronary Artery Disease Severity Grading from X-Ray Coronary Angiography

Alphonse, Suja; Venkatesan, R.; Gunasekaran, Hemalatha; Swaminathan, Deepa Kanmani; Ramalakshmi, Krishnamoorthi

doi:10.3390/make8070187


by Suja Alphonse ^1,2, R. Venkatesan ^1,*, Hemalatha Gunasekaran ^3,*, Deepa Kanmani Swaminathan ^4 and Krishnamoorthi Ramalakshmi ^5

^1 Division of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore 641114, India
^2 CSE (Artificial Intelligence and Data Science), GMRIT Deemed to Be University, Rajam 532127, India
^3 College of Computing and Information Sciences, University of Technology and Applied Sciences, Ibri 516, Oman
^4 Department of Information Technology, Sri Krishna College of Engineering and Technology, Coimbatore 641008, India
^5 Center of Excellence in Computer Vision, Alliance School of Advanced Computing, Alliance University, Bengaluru 562106, India
^* Authors to whom correspondence should be addressed.

Mach. Learn. Knowl. Extr. 2026, 8(7), 187; https://doi.org/10.3390/make8070187

Submission received: 14 April 2026 / Revised: 27 June 2026 / Accepted: 29 June 2026 / Published: 2 July 2026

(This article belongs to the Section Learning)


Abstract

Automated assessment of coronary artery disease (CAD) severity from invasive X-ray angiography is important for diagnostic accuracy, but it is constrained by limited labeled data and by privacy issues in multi-institutional collaboration. This research proposes a Federated Graph-Transformer Network (FGTN) that models coronary vessel structures as graphs and uses a transformer module to encode global anatomical context for severity grading. The publicly available X-ray angiography images and SYNTAX Score dataset is used, consisting of 232 X-ray coronary angiography images with corresponding clinically calculated SYNTAX scores and angiographic factors from 231 patients, manually annotated by an experienced cardiologist. The vascular tree is first segmented and then transformed into a node-edge graph representing bifurcations and vessel segments, preserving topological features. The graph is then processed by graph convolutions integrated with transformer self-attention to capture both local stenosis features and global vessel relationships. A horizontal Federated Learning strategy allows collaborative model training across clinical sites without sharing raw data. The proposed FGTN achieved an overall accuracy of 99.4%, precision of 97.6%, recall of 98.8%, and F1-score of 98.2%, exceeding conventional CNN, Attention-UNet, and Capsule Network baselines by a margin of 4–7%. For the non-obstructive, mild, moderate, and severe stenosis classes, the AUC values were 0.98, 0.97, 0.96, and 0.95, respectively. Moreover, the Federated Learning framework shows stable convergence with less than 1.8% performance degradation compared to centralized training, and remains robust under heterogeneous data distributions. These results show that the proposed solution can automate CAD severity grading from coronary angiography images.

Keywords:

coronary artery disease; X-ray coronary angiography; federated learning; federated graph-transformer network; transformer-based deep learning

1. Introduction

Coronary artery disease (CAD) is a major form of Cardiovascular Disease (CVD) and remains one of the leading causes of mortality worldwide. According to the World Health Organization (WHO), cardiovascular diseases account for approximately 17.9 million deaths annually, representing nearly 32% of all global deaths [1]. In India, reported CAD prevalence has increased from 3 to 4% in the 1970s to over 11% in urban populations, placing increasing demands on healthcare systems for effective diagnosis and management [2]. Accurate assessment of CAD severity is essential for clinical diagnosis and for selecting appropriate revascularization strategies. X-ray coronary angiography (XCA) remains the clinical reference standard for visualizing coronary artery anatomy and identifying luminal stenosis. However, XCA interpretation remains subjective and operator-dependent, with previous studies reporting inter-observer disagreement rates of approximately 15–30% in the visual assessment of coronary stenosis severity [3]. In addition, factors such as vessel overlap, low image contrast, vascular tortuosity, and complex branching anatomy make accurate interpretation challenging even for experienced cardiologists. These limitations have motivated the development of automated computer-aided systems for reliable CAD severity assessment.

The SYNTAX score is an established angiographic scoring system used to quantify the complexity of coronary artery disease based on lesion characteristics such as anatomical location, bifurcation involvement, total occlusion, calcification, vessel tortuosity, and lesion morphology [4]. In clinical practice, Synergy Between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) scores are commonly categorized into low-, intermediate-, and high-risk groups and are widely used to guide revascularization decisions between Percutaneous Coronary Intervention (PCI) and Coronary Artery Bypass Grafting (CABG). However, manual SYNTAX score calculation is time-consuming, requires detailed evaluation of multiple angiographic features, and remains subject to inter-observer variability [5]. Therefore, automated SYNTAX score estimation has the potential to improve diagnostic consistency, reduce clinical workload, and enhance workflow efficiency.

Recent advances in deep learning (DL) have significantly improved automated medical image analysis, including applications in coronary angiography. Convolutional Neural Networks (CNNs) have been widely applied to angiographic image analysis tasks such as vessel segmentation, stenosis detection, and CAD severity classification [6]. Previous studies have reported strong performance in coronary vessel segmentation and binary stenosis detection, demonstrating the effectiveness of CNN-based feature extraction methods. However, conventional CNNs primarily operate on grid-based image representations and may not explicitly encode the topological structure of coronary vessels, which is important for understanding lesion distribution and disease progression within the vascular network.

In Graph Neural Networks (GNNs), coronary arteries can be represented as graphs in which nodes correspond to bifurcation points or vessel segments, while edges represent anatomical vessel connections [7]. This representation enables topology-aware analysis of vascular morphology and lesion distribution, which is important for accurate CAD severity assessment [8]. Many existing GNN-based methods rely on handcrafted features or shallow graph architectures, limiting their ability to capture complex long-range dependencies across the vascular network.

Transformer architectures have recently emerged as promising approaches for medical imaging tasks because of their ability to capture long-range contextual dependencies using self-attention mechanisms [9]. Vision Transformers (ViTs) and hybrid CNN–transformer models have demonstrated competitive performance in selected medical imaging applications, including cardiac MRI analysis and retinal image classification [10]. However, despite their potential, transformer-based models have rarely been integrated with graph representations of coronary anatomy, leaving an important gap in topology-aware and context-rich CAD severity assessment.

Training such models benefits from multi-institutional data, but privacy constraints restrict the direct sharing of patient images between institutions. Federated Learning (FL) addresses this challenge by enabling collaborative model training across institutions without transferring raw patient data to a centralized server [11,12]. By exchanging model parameters instead of imaging datasets, FL can reduce data-sharing requirements and support privacy-preserving collaborative learning. However, FL alone does not automatically guarantee regulatory compliance or complete privacy protection, and additional security and governance measures may still be required. In addition, many previous studies have relied on proprietary or single-center datasets, limiting reproducibility and generalizability [13,14].

To address these challenges, this study proposes a Federated Graph-Transformer Network (FGTN) for automated CAD severity grading from X-ray coronary angiography images. The proposed framework represents coronary vasculature as graph-structured data, integrates graph convolutional learning with transformer-based global attention, and employs Federated Learning to support collaborative multi-institutional model training without sharing raw patient data. Using the publicly available SYNTAX-annotated angiography dataset, the proposed approach aims to improve topology-aware feature representation, contextual vessel analysis, and privacy-preserving CAD severity assessment. Overall, this work contributes a unified framework for interpretable and scalable automated coronary artery disease severity grading.

1.1. Limitations of Existing Works

Existing CAD severity assessment methods present several important limitations:

Conventional CNN-based models primarily rely on grid-based image representations and may not adequately preserve coronary vessel topology and anatomical connectivity.
Graph-based approaches improve topology-aware analysis but are highly dependent on accurate vessel segmentation, where segmentation errors can propagate and reduce classification reliability.
Transformer-based architectures generally require large-scale training datasets and may be prone to overfitting when applied to relatively small coronary angiography datasets.
Many existing studies employ centralized training frameworks, limiting their applicability in multi-institutional healthcare environments where direct patient data sharing is restricted.

1.2. Objectives of the Proposed Work

To address these limitations, the present study aims to achieve the following objectives:

Develop a hybrid framework combining graph learning and transformer attention for CAD severity grading.
Capture both coronary vessel topology and global contextual relationships from angiography images.
Improve robustness and diagnostic accuracy in complex angiographic conditions.
Enable privacy-preserving multi-institutional learning using Federated Learning.

2. Literature Review

Machine learning-based tools for automated CAD assessment from XCA have improved significantly in recent years, especially with the introduction of DL architectures that can handle complex vascular structures. A recent CNN-based method for detecting coronary stenosis showed promising results, with reported accuracy between 82% and 88% for binary classification of coronary obstructions [15]. Building on this, multi-scale CNN architectures have been introduced, allowing more precise differentiation between changes in vessel diameter and stenoses [16]. Jansson et al. [17] observed that a Graph Convolutional Network (Graph CNN) achieved an Area under the Receiver Operating Characteristic curve (AUC) of 0.91 in stenosis grading; however, the model remains vulnerable to changes in projection angle and to delineation variability. Graph learning has proven to be a promising approach for coronary artery assessment because of its built-in capability to capture the topology of the vasculature.

Kipf et al. [18] showed that graph learning improved the accuracy of lesion localization by 6–8% compared with pixel-based Convolutional Neural Networks (CNNs), especially in cases with multiple lesions. However, GNNs alone appear limited in their capacity to capture global contextual relationships, especially in diffuse disease patterns. Vaswani et al. [19] observed that transformer-based medical models outperform traditional models on complex anatomical data by a margin of 3–6%. However, basic transformer models require large datasets and lack explicit structural constraints, which limits their use on small-scale angiography datasets.

To overcome data availability issues in medical imaging, Federated Learning (FL) has emerged as a promising solution for developing medical AI models [20]. Sheller et al. [21] proposed federated analysis of angiography images with stable convergence and reduced data leakage risk, although performance degraded as data distributions became increasingly non-IID. However, the majority of federated models are based on traditional CNNs, which use neither vessel topological features nor global attention mechanisms, limiting their diagnostic accuracy.

Despite these improvements, existing automated CAD risk assessment techniques suffer from several limitations. First, several CNN-based and attention-based models show decreased robustness in cases of severe vessel overlap, heavy calcification, or poor contrast opacification, with a reported accuracy drop of 8–12% in such cases. Second, graph-based approaches frequently rely on segmentation of the target vessel, and segmentation errors propagate directly into incorrect predictions, reducing performance by up to 10% in noisy angiograms. Third, transformer-based methods require large training datasets; when applied to small angiography datasets (300 samples) without robust regularization or pretraining, a performance decline of 5–7% has been reported. Finally, although Federated Learning preserves privacy, it is frequently sensitive to non-IID data distributions, leading to convergence instability and up to 3% performance loss compared with a centralized baseline. Together, these limitations call for an integrated framework that combines topology awareness, global contextual modeling, and federated optimization for reliable CAD severity grading.

3. Dataset

The X-ray angiography images and SYNTAX Score Dataset contains 232 angiographic views obtained from 231 unique patients, where one patient contributed two independent angiographic acquisitions, resulting in the slight difference between the number of views and patients. As shown in Table 1, three optimal frames were automatically selected from each angiographic projection using MSSIM-based frame selection, producing 3459 total images. Each patient in the dataset included multiple angiographic frames and projection views acquired during coronary angiography. To avoid data leakage and overly optimistic evaluation, dataset partitioning and 10-fold cross-validation were performed strictly at the patient level, ensuring that all images belonging to a given patient remained exclusively within a single training, validation, or testing subset.
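The patient-level partitioning described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patient_level_folds`, the round-robin assignment, and the toy patient IDs are assumptions made for illustration, and class stratification is omitted for brevity.

```python
import random
from collections import defaultdict

def patient_level_folds(image_patient_ids, n_folds=10, seed=0):
    """Assign every image to a fold so that no patient spans two folds.

    image_patient_ids: list where entry i is the patient ID of image i.
    Returns a list of folds, each a sorted list of image indices.
    """
    # Group image indices by patient.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(image_patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients (not images) and deal them round-robin into folds.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [[] for _ in range(n_folds)]
    for rank, pid in enumerate(patients):
        folds[rank % n_folds].extend(by_patient[pid])
    return [sorted(fold) for fold in folds]

# Toy example: six hypothetical patients with several frames each.
ids = ["p1", "p1", "p1", "p2", "p2", "p3", "p4", "p4", "p5", "p6", "p6", "p6"]
folds = patient_level_folds(ids, n_folds=3)
```

Because whole patients are shuffled and assigned, all frames of a patient land in the same fold, which is the property the text requires to avoid leakage.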

The SYNTAX score is a clinically established angiographic grading system used to quantify the anatomical complexity of coronary artery disease. The score considers lesion location, bifurcation involvement, calcification, chronic total occlusion, vessel tortuosity, and lesion length. In this study, the continuous SYNTAX scores were converted into four discrete CAD severity categories for multi-class classification. The mapping strategy was defined as follows: non-obstructive CAD for SYNTAX score < 10, mild CAD for scores between 10 and 22, moderate CAD for scores between 23 and 32, and severe CAD for scores > 32. These thresholds were selected based on clinically established SYNTAX-score risk stratification principles commonly used for coronary artery disease assessment and revascularization planning. The resulting four-class categorization enabled finer-grained severity prediction while preserving clinical interpretability. Because multiple angiographic frames and projection views were available for each patient, strict patient-level separation was enforced throughout the entire experimental pipeline to prevent data leakage. Specifically, all images belonging to a given patient were assigned exclusively to a single training, validation, or testing subset and were never distributed across multiple folds. Consequently, no patient appeared simultaneously in training and evaluation data at any stage of cross-validation or federated partitioning.
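The threshold mapping stated above can be written as a small function. This is a sketch under one labeled assumption: the text gives only integer boundaries, so fractional scores between 22 and 23 are treated here as mild. The function name `syntax_to_severity` is hypothetical.

```python
def syntax_to_severity(score):
    """Map a SYNTAX score to one of the four severity classes in the text.

    Assumption: fractional scores between 22 and 23 are treated as mild,
    because the text only specifies integer boundaries.
    """
    if score < 0:
        raise ValueError("SYNTAX score cannot be negative")
    if score < 10:
        return "non-obstructive"
    if score < 23:
        return "mild"
    if score <= 32:
        return "moderate"
    return "severe"

labels = [syntax_to_severity(s) for s in (4, 10, 22, 23, 32, 33)]
```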

Key points were extracted from the vessel centerline using curvature-based analysis, where points exhibiting significant directional variation or local geometric importance were selected to preserve vascular morphology. The angiography dataset consisted of four CAD severity categories: non-obstructive (310 samples, 24.8%), mild (345 samples, 27.6%), moderate (358 samples, 28.6%), and severe (237 samples, 19.0%). The dataset therefore exhibited a mildly imbalanced distribution, with comparatively fewer severe CAD samples than mild and moderate ones. To ensure reliable model evaluation, patient-level separation was strictly maintained during dataset partitioning so that images from the same patient never appeared in more than one of the training, validation, and testing subsets. Stratified sampling was applied to preserve balanced representation of all CAD severity classes across subsets and federated clients. To further improve model generalization and reduce overfitting, data augmentation techniques, including random rotation, horizontal flipping, contrast enhancement, and minor geometric transformations, were applied only to the training subset, while the validation and testing datasets remained unchanged.

4. Methodology

Figure 1 shows the overall architecture of the proposed Federated Graph-Transformer Network (FGTN) for automated assessment of CAD severity from X-ray angiography images. The primary objective of the proposed FGTN framework is to perform multi-class CAD severity grading from coronary angiography images based on SYNTAX-score-derived severity categories. The model classifies cases into four classes: non-obstructive, mild, moderate, and severe CAD. These severity classes were derived from the predefined SYNTAX-score threshold ranges described in Section 3. In addition, plaque localization overlays are generated to provide visual interpretability of vessel obstructions.

The procedure begins with a raw X-ray angiography projection, followed by vessel segmentation to extract the coronary tree structure. Vessel segments are converted to a graph representation, where a node represents a bifurcation or an endpoint, and an edge represents a vessel segment. The graph convolutional layers capture local vessel morphology, while the transformer encoder models global dependencies between nodes, which enhances the feature representation of complex vessel structures. To predict the severity class, the graph embedding is passed through a Multi-Layer Perceptron (MLP) classifier. The FGTN uses federated averaging across several clients so that raw data never leave the local sites while each site still contributes to the global model.

The proposed FGTN integrates graph-based coronary vessel modeling, transformer-based attention, and Federated Learning for automated CAD severity grading. The methodology consists of four key stages:

Vascular Segmentation and Graph Construction;
Graph Convolutional Feature Extraction;
Transformer-Based Global Context Encoding;
Federated Learning Aggregation.

4.1. Vascular Segmentation and Graph Construction

Each X-ray coronary angiography image is first processed using a U-Net-based vascular segmentation model to extract the coronary vessel tree. The segmentation network was initialized using pre-trained encoder weights and fine-tuned on the angiography dataset used in this study. The segmentation stage achieved a Dice Similarity Coefficient (DSC) of 0.962 and an Intersection over Union (IoU) of 0.931 on the independent test subset, confirming reliable vessel delineation for downstream graph construction. Following segmentation, morphological skeletonization was applied to obtain one-pixel-wide vessel centerlines while preserving vascular topology, from which anatomically significant points (bifurcations, endpoints, and curvature-based centerline landmarks) were identified. Node extraction was performed using pixel-connectivity analysis on the skeletonized vessel map: pixels connected to more than two neighboring pixels were identified as bifurcation nodes, while pixels connected to only one neighboring pixel were considered vessel endpoints. Additional anatomically significant centerline points were extracted using curvature-based sampling to preserve vessel geometry in elongated segments. Graph edges were subsequently formed by connecting adjacent nodes along continuous vessel centerline paths, thereby preserving anatomical vessel connectivity throughout the coronary tree. The construction is described mathematically as follows.

The X-ray coronary angiography image $I \in R^{H \times W}$ is first processed to segment the coronary vessels, typically using classical image processing techniques or deep learning models such as U-Net. Once the vessel tree is extracted, it is represented as a graph $G = (V, E)$, where the nodes $V = \{v_{1}, v_{2}, \dots, v_{n}\}$ correspond to vessel bifurcations, vessel endpoints, and anatomically significant centerline points identified based on curvature variation and vessel directional changes, and the edges $E = \{e_{i j} \mid v_{i}, v_{j} \in V\}$ represent vessel segments connecting these nodes.
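The pixel-connectivity rule used for node extraction (more than two skeleton neighbors for a bifurcation, exactly one for an endpoint) can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes 8-connectivity and a skeleton stored as nested lists, and the name `classify_skeleton_pixels` is hypothetical. On real skeletons, pixels adjacent to a junction can also exceed two neighbors, so some pruning or merging of nearby candidates would likely be needed.

```python
def classify_skeleton_pixels(skeleton):
    """Label skeleton pixels by their number of 8-connected skeleton neighbours.

    skeleton: list of rows of 0/1 values (1 = centerline pixel).
    Returns (bifurcations, endpoints) as lists of (row, col) tuples.
    """
    rows, cols = len(skeleton), len(skeleton[0])
    bifurcations, endpoints = [], []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            # Count skeleton pixels among the eight surrounding positions.
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skeleton[rr][cc]:
                        neighbours += 1
            if neighbours > 2:
                bifurcations.append((r, c))
            elif neighbours == 1:
                endpoints.append((r, c))
    return bifurcations, endpoints

# Toy Y-shaped skeleton: one trunk splitting into two diagonal branches.
toy = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
bifurcations, endpoints = classify_skeleton_pixels(toy)
```

In this toy case the pixel where the trunk splits is the only bifurcation, and the three branch tips are the endpoints.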

Each node $v_{i}$ is associated with a feature vector $x_{i}$ capturing local morphological properties:

x_{i} = [d_{i}, θ_{i}, l_{i}, c_{i}] \in R^{F}

(1)

where

$d_{i}$ = local vessel diameter estimated using the Euclidean distance between opposing vessel boundaries along the centerline.
$θ_{i}$ = vessel orientation computed as the tangent angle of the local centerline segment relative to the horizontal axis.
$l_{i}$ = vessel segment length measured as the geodesic distance between adjacent graph nodes along the centerline.
$c_{i}$ = vessel curvature calculated from directional changes between consecutive centerline points using discrete curvature approximation.

Prior to graph construction, all node features were normalized using min-max normalization to ensure consistent numerical scaling across angiographic samples and federated clients.
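Min-max normalization of the node features can be sketched as below. The feature values are made up for illustration, and the sketch normalizes within a single set of nodes; the text does not state whether the minimum and maximum are computed per image, per client, or globally, so that choice is left open here.

```python
def min_max_normalize(features):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    n_feat = len(features[0])
    lows = [min(row[j] for row in features) for j in range(n_feat)]
    highs = [max(row[j] for row in features) for j in range(n_feat)]
    normalized = []
    for row in features:
        normalized.append([
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_feat)
        ])
    return normalized

# Hypothetical node features in the order [diameter, orientation, length, curvature].
raw = [
    [3.0, 0.00, 12.0, 0.1],
    [2.0, 1.50, 20.0, 0.3],
    [1.0, 0.75, 16.0, 0.1],
]
scaled = min_max_normalize(raw)
```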

The adjacency matrix $A \in R^{n \times n}$ encodes connectivity between nodes:

A_{i j} = \begin{cases} 1, & \text{if } v_{i} \text{ is connected to } v_{j} \\ 0, & \text{otherwise} \end{cases}

(2)

For a graph containing $n$ nodes, the adjacency matrix has dimensions $n \times n$; for example, a graph with 100 nodes produces a 100 × 100 adjacency matrix. Thus $A_{i j} = 1$ if nodes $v_{i}$ and $v_{j}$ are directly connected by a vessel segment, and $0$ otherwise.
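Equation (2) can be illustrated with a short sketch that builds the adjacency matrix from an edge list. It assumes vessel segments are undirected, so the matrix is symmetric; the function name and the toy edge list are hypothetical.

```python
def build_adjacency(n_nodes, edges):
    """Return the n x n binary adjacency matrix for an undirected graph."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1  # assumption: vessel segments are treated as undirected
    return A

# Toy vessel tree: node 1 is a bifurcation joined to nodes 0, 2, and 3.
A = build_adjacency(4, [(0, 1), (1, 2), (1, 3)])
```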

Figure 2 shows how the X-ray coronary angiography image is converted to a graph representation of the vascular system. The image is first preprocessed and then segmented using a classical method such as the Frangi filter or a deep learning method such as U-Net. Segmentation is followed by skeletonization, which yields the centerlines, and by identification of key points such as bifurcations and endpoints. These key points are then used to form a graph, where each node is a key point and the vessels connecting them form the edges. Since coronary angiography is a 2D projection of a 3D vascular structure, vessel overlap may occasionally create pseudo-intersections during graph construction. To reduce false node generation, connectivity validation based on vessel continuity, orientation consistency, and intensity profile analysis was incorporated during skeleton refinement. Each node is then assigned attributes that describe its local characteristics, such as diameter, orientation, length, and curvature. Finally, an adjacency matrix is constructed to describe how these nodes are connected, completing the graph representation. This representation preserves the topological and morphological characteristics of the coronary arteries while providing a framework for further vascular assessment and for integration with graph neural networks.

Since the downstream graph representation depends directly on the quality of vessel segmentation, segmentation inaccuracies may introduce false vessel discontinuities, pseudo-intersections, or incorrect bifurcation nodes, which can affect CAD severity prediction. To improve robustness, vessel continuity refinement, orientation-consistency verification, morphological filtering, and intensity-profile validation were incorporated during skeleton refinement and graph construction. These measures reduce the impact of noisy angiographic regions, vessel overlap, and incomplete segmentation outputs, thereby improving graph stability and classification reliability.

4.2. Graph Convolutional Feature Extraction

Graph Convolutional Networks (GCNs) extend the concept of traditional convolutional operations from regular grid-structured data, such as images, to irregular graph-structured data. This capability makes them particularly well suited for modeling vascular networks, where anatomical structures can be naturally represented as graphs.

Here, each node represents a junction or another critical point in a vessel together with its local feature vector, while each edge represents the connectivity between such points. The central operation in a GCN layer is graph convolution, which updates the feature vector of each node by aggregating information from neighboring nodes. The forward propagation from layer $l$ to layer $l + 1$ is given in Equation (3).

H^{(l+ 1)} = σ ({\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2} H^{(l)} W^{(l)})

(3)

Here:

$H^{(l)} \in R^{n \times F_{l}}$ is the matrix of node features at layer $l$ , where $H^{(0)} = X$ is the initial feature matrix containing morphological descriptors like vessel diameter, orientation, length, and curvature.
$\tilde{A} = A + I_{n}$ is the adjacency matrix with added self-loops, allowing each node to also consider its own features during the update.
$\tilde{D}$ is the degree matrix of $\tilde{A}$ , and ${\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2}$ is the symmetric normalized adjacency, which ensures proper scaling during aggregation and prevents feature magnitude explosion in deeper layers.
$W^{(l)}$ is a trainable weight matrix for the layer that learns to transform and combine features.
$σ (\cdot)$ is a non-linear activation function, typically ReLU, that enables the network to capture complex patterns.

Intuitively, each node updates its features by averaging information from its neighbors (weighted by adjacency) and applying a learned linear transformation. By stacking multiple GCN layers ($L$ layers), nodes progressively gather information from farther regions of the graph, effectively capturing both local vessel morphology and global topological context.

Finally, to obtain a graph-level embedding $h_{G}$ representing the entire vascular network, node features from the last GCN layer are pooled (mean or sum pooling), as shown in Equation (4).

h_{G} = Pooling (H^{(L)}) = \frac{1}{n} \sum_{i = 1}^{n} H_{i}^{(L)}

(4)

This aggregate embedding condenses the structural and morphological information of the entire vessel tree into a fixed-size vector, which can then be used for downstream tasks such as disease severity assessment, classification, or prediction.
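Equations (3) and (4) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the layer widths, random weights, and three-node toy graph are assumptions, and training is not shown.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    deg = A_tilde.sum(axis=1)                  # node degrees of A_tilde
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization, written element-wise instead of with diag matrices.
    A_hat = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)      # ReLU activation

def mean_pool(H):
    """Graph-level embedding as the mean of all node embeddings."""
    return H.mean(axis=0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)         # toy 3-node path graph
X = rng.random((3, 4))                         # 4 features per node, as in Eq. (1)
W0 = rng.normal(size=(4, 8))                   # hypothetical layer widths
W1 = rng.normal(size=(8, 8))

H1 = gcn_layer(A, X, W0)
H2 = gcn_layer(A, H1, W1)
h_G = mean_pool(H2)                            # fixed-size vector of length 8
```

Stacking the two calls mirrors the $L$-layer propagation: after two layers, each node of the toy graph has received information from nodes up to two hops away.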

The pipeline for extracting features from a coronary vessel graph using a Graph Convolutional Network (GCN) is illustrated in Figure 3. On the left, the vessel graph is shown, in which nodes represent bifurcations and endpoints while edges represent vessel segments. Each node carries a feature vector that captures local vessel properties such as diameter, orientation, length, and curvature. The central section is a GCN layer in which each node aggregates its neighbors' features, applies a trainable weight matrix, and passes the result through a non-linear activation function. On the right, the updated node features are pooled (mean or sum) to generate a fixed-size graph-level embedding that encodes the topological and morphological information of the entire coronary network for downstream tasks such as disease diagnosis or classification.

4.3. Transformer-Based Global Context Encoding

The Transformer-Based Global Context Encoding is designed to capture long-range dependencies and global anatomical relationships within a coronary vessel graph, which traditional GCNs may miss. After local vessel features are extracted using the GCN, the node embeddings are fed into a transformer encoder. Each node embedding $h_{i}$ is projected into query $Q$, key $K$, and value $V$ vectors, as shown in Equation (5).

Q = H W_{Q}, K = H W_{K}, V = H W_{V}

(5)

All participating clients used the same predefined graph feature extraction pipeline and network architecture; therefore, the node embedding dimensionality remained consistent across all clients during federated aggregation. In addition, multi-head attention can simultaneously capture different types of inter-node relationships. The self-attention mechanism is computed as shown in Equation (6).

Attention (Q, K, V) = softmax (\frac{Q K^{⊤}}{\sqrt{d_{k}}}) V

(6)

where $d_{k}$ is the dimension of the key vectors. The transformer layers allow the network to learn inter-node relationships across the entire vessel graph, improving severity prediction.
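Equations (5) and (6) can be sketched for a single attention head in NumPy. The embedding size, the random projection matrices, and the five-node input are assumptions for illustration; the multi-head variant mentioned in the text would run several such heads in parallel and concatenate their outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H, W_Q, W_K, W_V):
    """Single-head scaled dot-product self-attention over node embeddings H."""
    Q, K, V = H @ W_Q, H @ W_K, H @ W_V         # Eq. (5)
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # Eq. (6), one row per query node
    return weights @ V, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))                     # 5 nodes, hypothetical 8-dim embeddings
W_Q, W_K, W_V = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(H, W_Q, W_K, W_V)
```

Each row of `weights` sums to one, so every node's updated embedding is a convex combination of the value vectors of all nodes, regardless of graph distance.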

The updated node embeddings from the attention mechanism are then combined and passed through an MLP to construct a final graph representation. This vector summarizes the entire coronary vessel network, encoding both local and global context, and can be used for downstream tasks such as estimating the severity of coronary artery disease. The final graph-level representation is obtained as shown in Equation (7).

h_{final} = MLP (Concat (Attention Output))

(7)

Figure 4 illustrates the Transformer-Based Global Context Encoding framework for coronary artery disease severity prediction. The coronary vessel graph is first processed by a GCN to extract node embeddings representing local vessel features. Each node embedding is then projected into query (Q), key (K), and value (V) vectors and processed through a multi-head self-attention mechanism to compute inter-node relationships across the entire graph. The updated node embeddings are concatenated and passed through an MLP to generate a final graph-level representation $h_{final}$.

This representation captures both local and global anatomical context, enabling accurate downstream severity prediction. Equations for Q, K, V projections, self-attention, and final graph embedding are shown to link the visual workflow with the mathematical formulation. Since the available angiography dataset is relatively limited for transformer-based learning, multiple strategies were employed to reduce overfitting and improve model generalization. Dropout regularization was applied within the transformer attention layers and MLP classifier, while L2 weight regularization was incorporated during optimization. Early stopping based on validation loss was used to prevent excessive training beyond convergence. In addition, data augmentation techniques, including rotation, horizontal flipping, contrast adjustment, and minor geometric transformations, were applied during preprocessing to improve variability in angiographic appearance and vessel orientation.

4.4. Federated Learning Aggregation

The FL framework consisted of three simulated clinical clients (Hospitals A, B, and C), each representing an independent healthcare institution. The dataset was partitioned using patient-level separation to ensure that all images belonging to a single patient remained within the same client. To simulate realistic clinical heterogeneity, a mildly non-IID distribution strategy was adopted in which the severity-class proportions and angiographic characteristics varied slightly across clients while maintaining overall class balance. Each client independently trained the local FGTN model using its private dataset and shared only model parameters with the central aggregation server without transferring raw patient data. To ensure privacy-preserving training across simulated clinical sites, a horizontal Federated Learning (HFL) approach was adopted. Suppose $K$ clients (hospitals) hold local models $w_{k}$. The global model is updated as shown in Equation (8).

w^{t + 1} = \sum_{k = 1}^{K} \frac{n_{k}}{n} w_{k}^{t}

(8)

where

$w_{k}^{t}$ = model parameters at client $k$ at round $t$ ;
$n_{k}$ = number of local training samples at client $k$ ;
$n = \sum_{k = 1}^{K} n_{k}$ .
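The weighted average in Equation (8) can be illustrated with a short sketch. The parameter vectors and sample counts below are hypothetical values chosen for the example, and `fedavg` is an illustrative helper name rather than part of the published code.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameter vectors, as in Equation (8)."""
    n = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n_k / n * w_k[i] for w_k, n_k in zip(client_weights, client_sizes))
        for i in range(dim)
    ]

# Hypothetical parameters and sample counts for Hospitals A, B, and C.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [50, 30, 20]
global_w = fedavg(weights, sizes)  # approximately [2.4, 3.4]
```

With these counts the clients contribute with weights 0.5, 0.3, and 0.2, so the hospital with the most training samples has the largest influence on the global model.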

In this framework, horizontal Federated Learning (HFL) allows the hospitals (clients) to jointly train a global model without sharing sensitive patient information, thereby maintaining confidentiality. Each client keeps its local model $w_{k}$ and trains it on its local dataset using Stochastic Gradient Descent (SGD), updating the model parameters based on the local cross-entropy loss $L_{k}$ and the learning rate $η$. The local SGD step performed by each client is shown in Equation (9).

w_{k}^{t + 1} = w_{k}^{t} - η \nabla L_{k} (w_{k}^{t})

(9)

where $η$ is the learning rate and $L_{k}$ is the local cross-entropy loss for severity classification.
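As a worked illustration of Equation (9), the sketch below applies one SGD step to a cross-entropy loss. It assumes a deliberately simplified setting in which the trainable parameters are the four class logits themselves; the learning rate of 0.1 and the chosen label are hypothetical and are not the study's settings.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(logits, label):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(softmax(logits)[label])

def sgd_step(w, grad, lr):
    """One update w <- w - lr * grad, as in Equation (9)."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# Toy setting: parameters are the logits for the four severity classes
# (non-obstructive, mild, moderate, severe); the true class is index 2.
w = [0.0, 0.0, 0.0, 0.0]
label = 2
probs = softmax(w)
# Gradient of cross-entropy w.r.t. the logits: softmax output minus one-hot target.
grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
w_new = sgd_step(w, grad, lr=0.1)
loss_before = cross_entropy(w, label)
loss_after = cross_entropy(w_new, label)
```

The step raises the logit of the true class and lowers the others, so `loss_after` is smaller than `loss_before`.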

During federated optimization, each communication round consisted of one local training epoch performed independently at every client using Stochastic Gradient Descent (SGD). After local training, the model parameters were transmitted to the central server and aggregated using the Federated Averaging (FedAvg) algorithm, in which the contribution of each client was weighted in proportion to its number of local training samples. The aggregated global model was then redistributed to all participating clients for the next communication round. This iterative process was conducted for 20 communication rounds until convergence.
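The round structure described here (broadcast, one local epoch per client, weighted aggregation, repeat) can be sketched on a toy problem. The scalar quadratic loss per client, the client optima, the sample counts, and the learning rate are assumptions made so that the loop is easy to follow; they do not reproduce the FGTN training itself.

```python
def local_epoch(w, optimum, lr):
    """One gradient step on the toy loss 0.5 * (w - optimum) ** 2."""
    return w - lr * (w - optimum)

def run_fedavg(optima, sizes, rounds=20, lr=0.5, w0=0.0):
    """Broadcast, train locally for one epoch, aggregate; repeat for `rounds`."""
    n = sum(sizes)
    w_global = w0
    history = []
    for _ in range(rounds):
        # Client side: every client starts from the current global model.
        local = [local_epoch(w_global, c, lr) for c in optima]
        # Server side: sample-weighted average of the returned models.
        w_global = sum(n_k / n * w_k for w_k, n_k in zip(local, sizes))
        history.append(w_global)
    return w_global, history

# Hypothetical client optima and sample counts for three simulated hospitals.
optima, sizes = [1.0, 2.0, 4.0], [50, 30, 20]
w_final, history = run_fedavg(optima, sizes)
target = sum(s * c for s, c in zip(sizes, optima)) / sum(sizes)
```

On this toy loss the global parameter moves toward the sample-weighted mean of the client optima (`target`), which shows how clients with different local data still agree on one shared model.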

As shown in Figure 5, each hospital (client) trains a local model $w_{k}^{t}$ on its own dataset using SGD and updates the model parameters based on the local cross-entropy loss. The updated local model is then sent to a central server, where it is aggregated into a global model $w^{t+1}$ by a weighted average in which each client is weighted by its number of local training samples $n_{k}$. The global model thus uses information from all clients, while the data remain isolated at each site, to predict the severity of coronary artery disease.

Table 2 summarizes the distribution of training, validation, and testing images across the three simulated federated clients. The distribution was performed using patient-level stratified partitioning to maintain a balanced CAD severity-class representation across all federated clients while preventing patient overlap between subsets.

The federated experimental setup was conducted in two stages. First, patient-level partitioning was performed to distribute the dataset across three simulated federated clients while ensuring that all images belonging to the same patient remained within a single client. Subsequently, patient-level 10-fold cross-validation was performed within the Federated Learning framework to evaluate model generalization and statistical stability. During each fold, training, validation, and testing subsets were generated while preserving patient exclusivity and severity-class balance across all clients. Therefore, no patient appeared simultaneously across multiple clients or across training and testing subsets at any stage of federated optimization or cross-validation.
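The patient-exclusivity constraint can be sketched as follows. The round-robin assignment is an assumption used for brevity (the study reports stratified partitioning), and the record format and function name are hypothetical.

```python
from collections import defaultdict

def partition_by_patient(records, num_clients=3):
    """Assign whole patients to clients so that no patient is split.

    `records` is a list of (image_id, patient_id) pairs. Round-robin
    assignment is an assumption; it does not stratify by severity class.
    """
    by_patient = defaultdict(list)
    for image_id, patient_id in records:
        by_patient[patient_id].append(image_id)
    clients = [dict() for _ in range(num_clients)]
    for idx, patient_id in enumerate(sorted(by_patient)):
        clients[idx % num_clients][patient_id] = by_patient[patient_id]
    return clients

# Hypothetical records: patient "p2" contributes two images.
records = [("img1", "p1"), ("img2", "p2"), ("img3", "p2"),
           ("img4", "p3"), ("img5", "p4")]
clients = partition_by_patient(records)
patient_sets = [set(c) for c in clients]
```

Because the unit of assignment is the patient rather than the image, both images of `p2` end up at the same client, and the patient sets of any two clients are disjoint. The same grouping idea applies when forming cross-validation folds.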

4.5. Implementation Details

The proposed FGTN was implemented using Python 3.10 with the PyTorch 2.2.2 and PyTorch Geometric 2.5.3 deep learning frameworks. Model training was performed using the Adam optimizer with an initial learning rate of 0.0001 and weight decay (L2 regularization) of 1 × 10⁻⁵. A batch size of 16 was used during local client training, and each local model was trained for one epoch per communication round during federated optimization. The global federated training process was conducted for 20 communication rounds until convergence.

Dropout regularization with a dropout rate of 0.3 was applied within the transformer encoder and multilayer perceptron (MLP) layers to reduce overfitting. Early stopping based on validation loss was employed with a patience value of 5 communication rounds. The graph convolutional layers used ReLU activation functions and hidden embedding dimensions of 128 features. Experiments were conducted on a workstation equipped with an NVIDIA RTX 4090 GPU (NVIDIA, Santa Clara, CA, USA) with 24 GB memory, an Intel Core i9 processor, and 64 GB RAM. The operating environment used Ubuntu Linux with CUDA-enabled GPU acceleration.
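The early-stopping rule with a patience of 5 communication rounds can be sketched as below. The class name and the validation losses are hypothetical; only the patience value comes from the text.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` rounds."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_rounds = 0

    def step(self, val_loss):
        """Record one round; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience

# Hypothetical validation losses per communication round.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.63, 0.64, 0.59]
stopper = EarlyStopping(patience=5)
stopped_at = None
for round_idx, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = round_idx
        break
```

In this example the best loss is reached in round 3, the following five rounds bring no strict improvement, and training stops at round 8 before the final value is seen.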

For fair comparative evaluation, the baseline CNN, Attention U-Net, and Capsule Network models were independently re-implemented within the same experimental framework rather than directly adopting previously published numerical results. The CNN baseline consisted of sequential convolutional and max-pooling layers followed by fully connected classification layers. The Attention U-Net incorporated an encoder–decoder segmentation architecture with attention gates for feature refinement, while the Capsule Network employed dynamic routing between capsule layers for hierarchical feature representation. Hyperparameter optimization for all baseline models was performed using validation-based tuning under identical preprocessing, augmentation, patient-level cross-validation, optimizer settings, and evaluation protocols.

Table 3 summarizes the validation-based hyperparameter optimization process. Multiple hyperparameter configurations were evaluated, and the final settings were selected based on validation performance, convergence stability, and generalization capability. The chosen configuration consistently provided the best overall performance and was therefore used in all subsequent experiments.

A representative challenging example was also observed in one severe CAD case, where the model exhibited partial localization ambiguity near overlapping coronary branches with reduced contrast enhancement. In this case, the predicted high-attention region extended beyond the exact expert-annotated lesion boundary due to vessel superposition and low local vessel visibility within the angiographic projection. Such discrepancies are clinically plausible because severe multi-vessel disease frequently produces anatomically complex imaging patterns that complicate precise lesion delineation even during manual interpretation. The overall severity classification remained clinically correct, indicating that the proposed framework preserved a robust global disease assessment.

5. Results and Discussions

The proposed Federated Graph-Transformer Network (FGTN) was evaluated on a dataset comprising 232 X-ray coronary angiography images from 231 patients, annotated with clinically computed SYNTAX scores and angiographic variables. The vascular trees were segmented and converted into node-edge graphs, preserving topological features. The FGTN processed these graphs using graph convolutional layers combined with transformer-based self-attention to capture both local stenosis and global vessel relationships.

Figure 6 presents a comprehensive analysis of coronary artery disease from X-ray angiography. The original angiogram, showing the coronary vessel system, appears in the top-left panel. In the segmented vessel panel on the right, the vessels are shown in white for clear visualization. Plaque detection is color-coded by severity: green for non-obstructive or mild plaques, yellow for moderate stenosis, and red for severe stenosis. The bottom-right panel summarizes the most severe classification for the major coronary branch: severe stenosis (red) in the key vessel, alongside regions of moderate stenosis and non-obstructive plaque, indicating a spectrum of coronary artery disease. The quantitative analysis indicates severe stenosis in 40% of the vessel length, moderate stenosis in 25%, and non-obstructive plaques in 35%.

The class distribution across all dataset splits remained statistically balanced, ensuring unbiased evaluation of CAD severity grading. In the example of Figure 6, the mix of severe, moderate, and non-obstructive segments indicates clinically significant plaque dispersion, which is relevant for planning targeted intervention.

Figure 7 shows five representative X-ray coronary angiography cases with vessel analysis, plaque detection, and disease severity. For each case, the first column shows the original angiogram with the anatomy of the coronary arteries, the second column provides the segmented vessel map, and the third column shows the plaque detection overlay with color-coded severity. Case 1 has widespread severe CAD (SYNTAX Score 34), Case 2 has moderate stenosis (SYNTAX Score 23), Case 3 has predominantly low-risk CAD (SYNTAX Score 12), Case 4 has severe CAD (SYNTAX Score 32), and Case 5 has multiple stenoses, including moderate-to-severe stenosis (SYNTAX Score 39).

The highlighted vascular regions generated by the proposed FGTN framework showed strong spatial correspondence with lesion locations identified by expert clinical assessment. In severe CAD cases, the attention-weighted graph representations predominantly focused on stenotic vessel segments exhibiting pronounced luminal narrowing and irregular contrast flow patterns that were consistent with clinically annotated high-risk lesions. This alignment suggests that the topology-aware graph transformer successfully captured diagnostically relevant vascular structures rather than relying on unrelated image artifacts or background patterns.

5.1. Quantitative Performance

All comparative baseline models (CNN, Attention U-Net, and Capsule Network) were independently re-implemented and evaluated using the same preprocessing pipeline, augmentation strategy, patient-level training-validation-testing splits, federated partitioning protocol, optimization settings, and evaluation metrics to ensure fair and reproducible comparison. Hyperparameter tuning for all models was conducted using validation-based optimization under identical experimental conditions.

The FGTN achieved strong performance in CAD severity grading. Figure 8 compares the performance of FGTN with the standard models, namely CNN, Attention U-Net, and Capsule Network. The FGTN achieved high values across all metrics, with an accuracy of 99.4%, precision of 97.6%, recall of 98.8%, and F1-score of 98.2%. The baselines, with F1-scores ranging between 83% and 94%, highlight the advantages of topology-aware graph embedding and transformer-based attention.
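As a quick consistency check, the F1-score is the harmonic mean of precision and recall, and the reported values agree with each other. The sketch below assumes the reported precision and recall are aggregate figures to which the standard formula applies directly; per-class averaging could give a slightly different result.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported precision (97.6%) and recall (98.8%) of the FGTN.
f1 = f1_score(0.976, 0.988)
print(f"F1 = {100 * f1:.1f}%")  # prints "F1 = 98.2%"
```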

Table 3 demonstrates consistent performance across folds with low variance, confirming the stability and generalization capability of the proposed FGTN framework. The low standard deviation observed across folds indicates stable model performance and reduced sensitivity to variations in dataset partitioning. The overall AUC values reported in Table 4 correspond to the macro-average multi-class AUC computed from one-vs-rest ROC analysis across the four CAD severity categories (non-obstructive, mild, moderate, and severe). This macro-average AUC summarizes the model’s overall discrimination capability across all severity classes under the same patient-level 10-fold cross-validation protocol.

To evaluate statistical reliability, performance metrics were computed using patient-level 10-fold cross-validation and are reported as the mean ± standard deviation across folds. In addition, 95% confidence intervals (CI) were estimated for the primary evaluation metrics of the proposed FGTN model. The FGTN achieved an accuracy of 98.3% (95% CI: 97.8–98.8%), F1-score of 98.2% (95% CI: 97.7–98.7%), and macro-average AUC of 0.96 (95% CI: 0.95–0.97). Paired statistical comparisons across folds demonstrated that the improvements of FGTN over baseline CNN, Attention U-Net, and Capsule Network models were statistically significant (p < 0.05). The relatively high performance is considered plausible due to the combined effects of topology-aware graph representation, transformer-based global context encoding, robust vascular segmentation, and strict patient-level data separation during cross-validation, which collectively improved feature discrimination while minimizing data leakage.
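The fold-level summary statistics can be computed as in the following sketch. The text does not state how the confidence intervals were estimated, so the Student's t interval used here is an assumption, and the per-fold scores are hypothetical values rather than results from the study.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom (10 folds).
T_CRIT_DF9 = 2.262

def mean_std_ci(fold_scores, t_crit=T_CRIT_DF9):
    """Return the mean, sample standard deviation, and t-based 95% CI."""
    mean = statistics.mean(fold_scores)
    std = statistics.stdev(fold_scores)  # sample standard deviation
    half_width = t_crit * std / math.sqrt(len(fold_scores))
    return mean, std, (mean - half_width, mean + half_width)

# Hypothetical per-fold F1-scores (%), not values from the study.
folds = [98.0, 98.4, 97.9, 98.3, 98.1, 98.5, 98.2, 98.0, 98.4, 98.2]
mean, std, (ci_low, ci_high) = mean_std_ci(folds)
```

A narrow interval around the mean, as in this toy example, is what the low fold-to-fold variance reported in the text would produce.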

The shallow CNN baseline achieved substantially lower performance compared to the proposed FGTN framework, indicating that the reported improvements were not solely attributable to dataset characteristics or evaluation bias. The progressive performance gains observed from shallow CNNs to conventional CNNs, and subsequently to graph-transformer-based architectures, further support the contribution of topology-aware vascular representation learning and transformer-based contextual modeling toward improved CAD severity discrimination. To assess training stability, the proposed FGTN framework was evaluated across three independent runs using different random initialization seeds. Only minor performance variation (below ±0.5%) was observed across runs, indicating stable convergence and reproducible optimization behavior.

5.2. Class-Wise Severity Evaluation

The class-wise one-vs-rest ROC analysis demonstrates robust discrimination performance across all CAD severity categories, as summarized in Table 5. These class-specific AUC values collectively contribute to the overall macro-average multi-class AUC reported in Table 3.

Class-wise AUC values were computed using one-vs-rest ROC analysis for each CAD severity category. The overall macro-average AUC reported in Table 3 represents the average performance across these four class-wise ROC evaluations.
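The relationship between class-wise and macro-average AUC can be sketched in plain Python. The rank-based AUC estimate and the predicted probabilities below are illustrative assumptions; the helper names are hypothetical.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(prob_rows, y_true, num_classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    aucs = []
    for c in range(num_classes):
        scores = [row[c] for row in prob_rows]
        labels = [1 if y == c else 0 for y in y_true]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / num_classes, aucs

# Hypothetical predicted probabilities; columns are the four severity classes
# (non-obstructive, mild, moderate, severe), one sample per class.
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.3, 0.5, 0.1],  # severe case scored as moderate
]
y_true = [0, 1, 2, 3]
macro_auc, class_aucs = macro_ovr_auc(probs, y_true, num_classes=4)
```

Here the first three classes are separated perfectly, while the severe sample is tied with all negatives, giving class-wise AUCs of 1.0, 1.0, 1.0, and 0.5 and a macro-average of 0.875.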

Figure 9 shows the area under the curve (AUC) for the different CAD severity levels. For the non-obstructive, mild, moderate, and severe classes, FGTN achieved AUC values of 0.98, 0.97, 0.96, and 0.95, respectively. This indicates consistent and robust discrimination in the early stages of disease, with slightly lower performance in severe cases due to the overlap of anatomical features.

The slightly lower AUC observed for severe CAD cases (0.95) compared to non-obstructive and mild categories may be attributed to the increased anatomical complexity frequently present in advanced coronary disease. Severe lesions often involve diffuse stenosis, overlapping vessel structures, heavy calcification, tortuous vascular morphology, and reduced contrast visibility, which can increase feature similarity between moderate and severe categories and complicate precise classification. In addition, severe disease patterns may extend across multiple vessel branches, introducing greater structural heterogeneity within angiographic projections. Clinically, this observation reflects the inherent difficulty of accurately distinguishing advanced stenotic burden using 2D angiography alone. The proposed FGTN framework maintained high discriminatory performance even in severe cases, indicating that the integration of topology-aware graph learning and transformer-based global contextual encoding effectively captures complex vascular relationships relevant to high-risk CAD assessment.

5.3. Federated Learning Performance

The federated experiments were conducted using three simulated non-IID clinical clients over 20 communication rounds, with one local training epoch per round and FedAvg aggregation. Compared with centralized training, the global model trained with horizontal Federated Learning, which simulates multi-site collaboration, achieved stable convergence with a 1.8% performance degradation. The proposed FGTN can therefore be trained effectively on distributed, heterogeneous data without compromising accuracy or privacy. The experimental results indicate that the FGTN framework effectively integrates local and global vessel information. The graph convolutional layers capture local stenosis patterns around vessel segments and bifurcations, while the transformer-based self-attention models long-range dependencies across the vascular tree. This combination enables highly accurate severity grading, surpassing traditional CNNs, Attention U-Net, and Capsule Networks, which emphasize local features or cannot adequately preserve the vessel graph topology. Federated Learning ensures privacy-preserving cooperation among the simulated clinical sites. The FGTN should therefore be suitable for real multi-institutional applications in which sharing patient data for centralized training is restricted, and it shows robustness to data heterogeneity.

The global federated model was evaluated using the independent test dataset obtained from the same publicly available X-ray angiography dataset. Patient-wise separation was strictly maintained so that no patient images appeared simultaneously in training and testing subsets across clients.

The class-wise evaluation shows a slightly lower AUC for severe cases (0.95), reflecting the inherent difficulty of distinguishing severe from moderate stenosis due to the overlap of anatomical features in X-ray angiography. Nevertheless, automated CAD severity grading has high clinical applicability: it may support cardiologists in establishing a diagnosis, reduce manual annotation effort, and improve treatment planning. Overall, the FGTN provides a topology-aware, privacy-preserving, and clinically reliable framework for automated CAD assessment from coronary angiography, combining the advantages of graph-based modeling, transformer attention, and Federated Learning.

Figure 10 compares the progression of the F1-score of the federated global model with that of centralized training over 20 training rounds. The federated FGTN steadily approaches a 98.2% F1-score, closely matching the centralized model, which achieves 97.0%, demonstrating stable convergence and minimal performance degradation (1.8%) in a privacy-preserving multi-client setup.

As shown in Table 6, the global federated model improves steadily across rounds and surpasses the individual clients, demonstrating effective aggregation and minimal performance loss compared to centralized training.

The iterative performance of the FGTN model over 20 federated training rounds is shown in Figure 11. The first panel shows the convergence of the F1-score for the three simulated clients and the global model, highlighting the steady increase in client and global performance, with the global F1-score reaching 98.2% by round 20. The second panel shows the progression of the class-wise AUC for the non-obstructive, mild, moderate, and severe CAD classes, showing stable convergence with final AUC values of 0.98, 0.97, 0.96, and 0.95, respectively. The third panel reports the average image-level prediction confidence, which increased from 91.2% to 97.5% while its standard deviation decreased, indicating increasingly stable and reliable image-level predictions. Overall, the figure shows that the federated FGTN model achieves reliable, precise, and confident image-level CAD severity classification through iterative collaborative learning.

Figure 12 illustrates the effect of varying the hidden embedding dimension on the validation F1-score of the proposed FGTN model. Increasing the hidden dimension from 64 to 128 improves the validation F1-score from 96.1% to 98.2%, indicating enhanced feature representation capability. However, a further increase to 256 results in a marginal decrease to 98.1%, suggesting that larger embedding dimensions provide limited additional benefit while increasing computational complexity. Therefore, a hidden dimension of 128 was selected as the optimal configuration.

Figure 13 shows the impact of different dropout rates on the validation F1-score. The model achieves a validation F1-score of 97.2% at a dropout rate of 0.1, which increases to 98.2% at 0.3, demonstrating improved generalization and reduced overfitting. When the dropout rate is further increased to 0.5, the validation F1-score decreases slightly to 97.6%, likely due to excessive regularization. These results confirm that a dropout rate of 0.3 provides the best balance between model robustness and classification performance.

The confusion matrix shown in Figure 14 represents patient-level CAD severity predictions obtained from the independent testing subsets during patient-level 10-fold cross-validation. Although the complete dataset contained a substantially larger number of angiographic images acquired from multiple projections and frames, all images belonging to the same patient were grouped during evaluation to maintain patient exclusivity and avoid data leakage. Predictions from individual angiographic images were aggregated to generate a single final CAD severity classification for each patient. Consequently, the 232 entries shown in the confusion matrix correspond to patient-wise evaluation samples rather than individual angiographic images. The matrix covers the four stenosis severity classes: Non-Obstructive, Mild, Moderate, and Severe. It shows strong dominance along the diagonal, indicating a high level of correctness. Misclassification is minimal and occurs mostly between neighboring severity levels, such as between Mild and Moderate or between Moderate and Severe, reflecting the ambiguity that is characteristic of borderline stenotic lesions.
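The aggregation from image-level to patient-level predictions can be sketched as follows. The text does not specify the aggregation rule, so the majority vote (with ties resolved toward the more severe class) is an assumption, and all predictions below are hypothetical.

```python
from collections import Counter, defaultdict

def aggregate_by_patient(image_preds):
    """Majority vote over each patient's image-level predictions.

    Majority voting is an assumption; ties resolve to the more severe class.
    Class indices 0..3 stand for Non-Obstructive, Mild, Moderate, Severe.
    """
    votes = defaultdict(list)
    for patient_id, pred in image_preds:
        votes[patient_id].append(pred)
    result = {}
    for patient_id, preds in votes.items():
        counts = Counter(preds)
        result[patient_id] = max(counts, key=lambda c: (counts[c], c))
    return result

def confusion_matrix(y_true, y_pred, num_classes=4):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

# Hypothetical image-level predictions as (patient_id, class index) pairs.
image_preds = [("p1", 3), ("p1", 3), ("p1", 2), ("p2", 1), ("p3", 2), ("p3", 1)]
truth = {"p1": 3, "p2": 1, "p3": 1}
patient_pred = aggregate_by_patient(image_preds)
patients = sorted(truth)
cm = confusion_matrix([truth[p] for p in patients],
                      [patient_pred[p] for p in patients])
```

In this toy example patient `p3` is truly Mild but is graded Moderate, which places one count just off the diagonal, the same neighboring-class pattern described for Figure 14.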

The ablation results presented in Table 7 indicate that each component of the proposed model contributes to improved performance. Removing the graph module reduces the ability of the model to capture vessel topology, leading to lower classification accuracy. Excluding the transformer module limits the model’s capability to learn global contextual relationships across vessel segments. Similarly, removing the Federated Learning framework slightly decreases the generalization capability of the model across distributed datasets. The complete FGTN architecture achieves the best performance, confirming the effectiveness of integrating graph learning, transformer attention, and federated optimization for accurate coronary artery disease severity classification.

Further, the performance of the FGTN model is compared with various recent deep learning architectures that have been proposed for CAD detection, such as CNN, RF-CNN-F, U-Net, Capsule Networks, Graph Neural Networks, Transformer, and Federated Learning architectures.

As shown in Table 8, the proposed model achieves superior performance across key evaluation metrics such as accuracy, F1-score, and AUC. Conventional CNN and segmentation-based approaches reported relatively lower performance due to limited capability in capturing complex vascular structures [8,22], whereas hybrid learning models such as RF-CNN-F demonstrated moderate improvements [23]. Graph-based learning methods improved topology awareness of coronary vessels [7], and transformer architectures enhanced global contextual modeling through attention mechanisms [10,24]. Federated Learning frameworks further enabled privacy-preserving distributed training across institutions [11,20,21]. By integrating graph learning, transformer attention, and federated optimization, the proposed FGTN framework achieves higher diagnostic accuracy and robustness compared to existing methods.

Table 8 shows that the proposed FGTN achieves the highest classification performance (99.4% accuracy, 98.2% F1-score, and 0.96 AUC) with 10.1 M parameters and 17.3 GFLOPs. Although some models, such as the Transformer model (12.8 M parameters, 24.7 GFLOPs) and Capsule Network (11.3 M parameters, 22.5 GFLOPs), exhibit higher computational complexity, they achieve lower classification performance. Conversely, lightweight CNN-based methods require fewer parameters and FLOPs but provide substantially lower accuracy. These results indicate that the superior performance of FGTN arises from the effective combination of graph-based topological learning and transformer-based contextual modeling rather than increased model complexity alone, resulting in a favorable complexity–performance trade-off.

The proposed FGTN framework is intended as a decision-support tool for cardiologists rather than a replacement for clinical expertise. In practical settings, the model could assist in automated CAD severity assessment by highlighting high-risk stenotic regions and providing preliminary severity grading during coronary angiography interpretation. Such support may help improve diagnostic consistency, reduce interpretation time, and facilitate early triage of patients requiring urgent interventional evaluation. However, the current framework should be considered a promising assistive technology requiring further large-scale clinical validation before routine deployment.

In real-world federated healthcare environments, participating hospitals may exhibit substantially different CAD severity distributions, imaging protocols, patient demographics, and acquisition conditions. Consequently, the local datasets available at each institution are inherently heterogeneous and non-IID, which may affect the generalization capability of a single global federated model across all hospitals. Although the present study simulated mildly heterogeneous client distributions, stronger institutional variability may introduce client-specific performance imbalance and convergence challenges. The proposed FGTN framework improves robustness by combining topology-aware graph learning with transformer-based global contextual encoding; however, future work will investigate personalized Federated Learning, adaptive aggregation strategies, and validation using real multi-center hospital datasets to further improve hospital-specific generalization and robustness under highly heterogeneous clinical settings.

5.4. Limitations of the Study

Despite achieving high classification performance, the proposed FGTN framework has several limitations. First, the dataset size remains relatively limited for large-scale transformer-based learning, which may affect generalization across diverse clinical populations. Second, coronary angiography represents a 2D projection of a complex 3D vascular structure, and vessel overlap may occasionally introduce pseudo-connections during graph construction. Third, the Federated Learning environment was simulated using partitioned public datasets rather than real multi-institutional deployments. Finally, the graph-transformer architecture introduces increased computational complexity and communication overhead during federated training. Future work will focus on large-scale real-world multi-center federated deployment, integration of 3D angiographic reconstruction, and validation using heterogeneous clinical datasets.

6. Conclusions

The Federated Graph-Transformer Network (FGTN) provides a robust and privacy-preserving framework for automated CAD severity grading from X-ray coronary angiography images. FGTN captures both local stenosis features and global vessel topology by combining graph convolutions with transformer-based self-attention, enabling highly accurate vessel-level and graph-level predictions. The model achieved an overall accuracy of 99.4%, precision of 97.6%, recall of 98.8%, and an F1-score of 98.2%, surpassing conventional CNNs, Attention U-Net, and Capsule Networks by 4–7%. The class-wise evaluation shows robust class discrimination, with AUC values of 0.98 for non-obstructive, 0.97 for mild, 0.96 for moderate, and 0.95 for severe CAD, reflecting reliable performance even in severe cases. The horizontal Federated Learning technique allows collaborative training across simulated clinical sites while maintaining patient privacy, and the resulting global model shows less than 1.8% degradation in performance compared to centralized training. The proposed FGTN framework demonstrates promising potential for topology-aware and privacy-preserving CAD severity assessment and represents an encouraging step toward future clinical translation. Although the proposed framework demonstrated strong comparative performance under consistent experimental conditions, additional benchmarking against broader state-of-the-art CAD angiography models and external multi-center datasets remains an important direction for future investigation.

Author Contributions

Conceptualization, S.A., K.R. and R.V.; methodology, S.A. and R.V.; software, R.V. and H.G.; validation, S.A., D.K.S. and H.G.; formal analysis, R.V. and K.R.; resources, S.A. and R.V.; data curation, R.V. and D.K.S.; writing—original draft preparation, S.A.; writing—review and editing, S.A., H.G. and K.R.; supervision, R.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the article are available on Figshare: X-Ray Angiography Images and SYNTAX-Score Dataset, 16 September 2025. https://figshare.com/articles/dataset/X-Ray_Angiography_Images_and_SYNTAX-Score_Dataset/25801447 (accessed on 28 June 2026).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AUC	area under the curve
CABG	Coronary Artery Bypass Grafting
CAD	Coronary Artery Disease
CNN	Convolutional Neural Network
CVD	Cardiovascular Disease
DICOM	Digital Imaging and Communications in Medicine
DL	Deep Learning
FGTN	Federated Graph-Transformer Network
FL	Federated Learning
GCNs	Graph Convolutional Networks
GDPR	General Data Protection Regulation
GNNs	Graph Neural Networks
HFL	Horizontal Federated Learning
HIPAA	Health Insurance Portability and Accountability Act
MLP	Multi-Layer Perceptron
MRI	Magnetic Resonance Imaging
PCI	Percutaneous Coronary Intervention
RF-CNN-F	Random Forest with Convolutional Neural Network Features
SGD	Stochastic Gradient Descent
SYNTAX	Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery
ViTs	Vision Transformers
XCA	X-ray Coronary Angiography

References

World Health Organization. Cardiovascular Diseases (CVDs). Available online: https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) (accessed on 12 February 2026).
Prabhakaran, D.; Jeemon, P.; Roy, A. Cardiovascular diseases in India: Current epidemiology and future directions. Circulation 2016, 133, 1605–1620.
Genereux, P.; Palmerini, T.; Caixeta, A.; Cristea, E.; Mehran, R.; Sanchez, R.; Lazar, D.; Jankovic, I.; Corral, M.D.; Dressler, O.; et al. SYNTAX score reproducibility and variability between interventional cardiologists, core laboratory technicians, and quantitative coronary measurements. Circ. Cardiovasc. Interv. 2011, 4, 553–561.
Sianos, G.; Morel, M.A.; Kappetein, A.P.; Morice, M.C.; Colombo, A.; Dawkins, K.; Van Den Brand, M.; Van Dyck, N.; Russell, M.E.; Mohr, F.W.; et al. The SYNTAX score: An angiographic tool grading the complexity of coronary artery disease. EuroIntervention 2005, 1, 219–227.
Barac, Y.D.; Witberg, G.; Assali, A.; Klempfner, R.; Krutzwald-Josefson, E.; Rubchevsky, V.; Abergel, E.; Kornowski, R.; Aravot, D. The Clinical SYNTAX score predicts survival better than the SYNTAX score in coronary revascularization. J. Thorac. Cardiovasc. Surg. 2024, 167, 164–173.
Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; Van Der Laak, J.A.; Van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
Wolterink, J.M.; Leiner, T.; Išgum, I. Graph convolutional networks for coronary artery segmentation in cardiac CT angiography. In Proceedings of the International Workshop on Graph Learning in Medical Imaging, Shenzhen, China, 17 October 2019; Springer: Cham, Switzerland, 2019; pp. 62–69.
Fan, T.; Wang, G.; Li, Y.; Wang, H. MA-Net: A multi-scale attention network for liver and tumor segmentation. IEEE Access 2020, 8, 179656–179665.
Hatamizadeh, A.; Tang, Y.; Nath, V.; Yang, D.; Myronenko, A.; Landman, B.; Roth, H.; Xu, D. UNETR: Transformers for 3D medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 574–584.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated learning: Challenges, methods, and future directions. IEEE Signal Process. Mag. 2020, 37, 50–60.
Rieke, N.; Hancox, J.; Li, W.; Milletari, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. npj Digit. Med. 2020, 3, 119.
Khened, M.; Kori, A.; Rajkumar, H.; Krishnamurthi, G.; Srinivasan, B. A generalized deep learning framework for whole-slide image segmentation and analysis. Sci. Rep. 2021, 11, 11579.
X-Ray Angiography Images and SYNTAX-Score Dataset. Available online: https://figshare.com/articles/dataset/X-Ray_Angiography_Images_and_SYNTAX-Score_Dataset/25801447 (accessed on 16 September 2025).
Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Advances in Neural Information Processing Systems, 2017; Curran Associates, Inc.: Red Hook, NY, USA, 2017.
Matta, S.S.; Bolli, M. Federated learning for privacy-preserving healthcare data sharing: Enabling global AI collaboration. Am. J. Sch. Res. Innov. 2025, 4, 320–351.
Jansson, L.; Sandström, T. Graph Convolutional Neural Networks for Brain Connectivity Analysis. Master’s Thesis, Chalmers University of Technology, Gothenburg, Sweden, 2020.
Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems, 2017; Curran Associates, Inc.: Red Hook, NY, USA, 2017.
Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends Mach. Learn. 2021, 14, 1–210.
Sheller, M.J.; Reina, G.A.; Edwards, B.; Martin, J.; Bakas, S. Multi-institutional deep learning modeling without sharing patient data: A feasibility study on brain tumor segmentation. In Proceedings of the MICCAI Brainlesion Workshop, Granada, Spain, 16 September 2018; Springer: Cham, Switzerland, 2018; pp. 92–104.
Alothman, A.F.; Sait, A.R.W.; Alhussain, T.A. Detecting coronary artery disease from computed tomography images using a deep learning technique. Diagnostics 2022, 12, 2073.
Khozeimeh, F.; Sharifrazi, D.; Izadi, N.H.; Joloudari, J.H.; Shoeibi, A.; Alizadehsani, R.; Tartibi, M.; Hussain, S.; Sani, Z.A.; Khodatars, M.; et al. RF-CNN-F: Random forest with convolutional neural network features for coronary artery disease diagnosis based on cardiac magnetic resonance. Sci. Rep. 2022, 12, 11178.
Zhang, J.; Ye, Z.; Chen, M.; Yu, J.; Cheng, Y. TransGraphNet: A novel network for medical image segmentation based on transformer and graph convolution. Biomed. Signal Process. Control 2025, 104, 107510.

Figure 1. Federated Graph-Transformer Network (FGTN) for CAD severity grading.

Figure 2. Pipeline for vascular segmentation and graph representation from X-ray coronary angiography.

Figure 3. Graph convolutional feature extraction for coronary vessel networks.

Figure 4. Transformer-based global context encoding for coronary vessel graphs.

Figure 5. Horizontal Federated Learning for coronary disease severity classification.

Figure 6. Coronary artery plaque detection and severity analysis from X-ray angiography.

Figure 7. Plaque detection and severity analysis across five representative X-ray coronary angiography cases.

Figure 8. Comparison of the proposed FGTN model with baseline deep learning and graph-based approaches under identical experimental conditions.

Figure 9. AUC by severity class.

Figure 10. Federated Learning convergence.

Figure 11. Iterative image-based classification results of the Federated Graph-Transformer Network (FGTN) for CAD severity grading.

Figure 12. Sensitivity analysis of hidden dimension.

Figure 13. Sensitivity analysis of dropout rate.

Figure 14. Confusion matrix for multi-class coronary artery stenosis severity classification.

Table 1. Dataset image descriptions and subset specifications [14].

Subset	Image Count	Patient-Wise Separation
Training Set	2421 images	Images from 161 patients
Validation Set	519 images	Images from 35 patients
Test Set	519 images	Images from 35 patients
Total Images	3459 images	231 patients
Total Projections	1153 views	Multiple views per patient
Frame Selection Method	3 best frames per projection	MSSIM-based automatic selection
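
The patient-wise separation in Table 1 means that all images from one patient fall into a single subset. A minimal sketch of such a split is given below, assuming hypothetical integer patient IDs and a simple seeded shuffle; the source does not describe how patients were actually assigned, so the function and its parameters are illustrative.

```python
import random


def split_patients(patient_ids, n_train, n_val, n_test, seed=0):
    """Shuffle patient IDs and cut them into disjoint train/val/test groups."""
    ids = list(patient_ids)
    if n_train + n_val + n_test != len(ids):
        raise ValueError("split sizes must sum to the number of patients")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# Hypothetical IDs 0..230 standing in for the 231 patients in Table 1.
train, val, test = split_patients(range(231), 161, 35, 35)
print(len(train), len(val), len(test))  # 161 35 35

# Consistency check on the table: 1153 projections x 3 frames per projection.
print(1153 * 3)  # 3459
```

Splitting on patient IDs rather than on individual images keeps frames from the same patient out of both the training and test sets.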

Table 2. Distribution of images across three simulated federated clients under mildly non-IID patient-level partitioning.

Client	Training Images	Validation Images	Test Images
Client 1	807	173	173
Client 2	807	173	173
Client 3	807	173	173
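
For illustration, the sketch below shows sample-size-weighted parameter averaging in the style of FedAvg, using the training-set sizes from Table 2. This is an assumption made for the example: the aggregation rule actually used by FGTN is described in Section 4.4, not here, and the function name and toy parameter vectors are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its local sample count.
            global_weights[i] += (size / total) * w
    return global_weights


# Toy two-parameter "models"; the sizes are the training images per client in Table 2.
sizes = [807, 807, 807]
clients = [[0.10, 1.0], [0.20, 2.0], [0.30, 3.0]]
print(fedavg(clients, sizes))
```

Because the three clients hold the same number of training images, each receives a weight of 1/3 and the weighted average reduces to a plain mean of the client parameters.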

Table 3. Hyperparameter optimization results on validation set.

Hyperparameter	Tested Value	Validation Accuracy (%)	Validation F1-Score (%)	Selected Value
Learning Rate	1 × 10⁻³	95.8	95.2	1 × 10⁻⁴
Learning Rate	5 × 10⁻⁴	97.1	96.8	1 × 10⁻⁴
Learning Rate	1 × 10⁻⁴	98.4	98.1	1 × 10⁻⁴
Hidden Dimension	64	96.8	96.1	128
Hidden Dimension	128	98.4	98.1	128
Hidden Dimension	256	98.3	98.0	128
Dropout Rate	0.1	97.5	97.2	0.3
Dropout Rate	0.3	98.4	98.1	0.3
Dropout Rate	0.5	97.8	97.5	0.3
Weight Decay	1 × 10⁻⁴	97.9	97.4	1 × 10⁻⁵
Weight Decay	1 × 10⁻⁵	98.4	98.1	1 × 10⁻⁵
Weight Decay	1 × 10⁻⁶	98.2	97.8	1 × 10⁻⁵

Table 4. Performance results reported as mean ± standard deviation obtained from 10-fold patient-level cross-validation.

Model	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Macro-Average AUC	95% CI (Accuracy)
Shallow CNN	81.4 ± 2.3	80.7 ± 2.5	81.0 ± 2.2	80.8 ± 2.3	0.84 ± 0.03	80.0–82.8
CNN	86.8 ± 1.9	85.9 ± 2.1	86.4 ± 1.8	86.1 ± 1.9	0.89 ± 0.02	85.6–88.0
Attention U-Net	88.2 ± 1.7	87.6 ± 1.8	88.0 ± 1.6	87.8 ± 1.7	0.91 ± 0.02	87.1–89.3
Capsule Network	89.0 ± 1.6	88.4 ± 1.7	88.7 ± 1.5	88.5 ± 1.6	0.92 ± 0.01	88.0–90.0
GTN (Centralized)	98.9 ± 0.6	98.1 ± 0.7	98.4 ± 0.6	98.2 ± 0.6	0.97 ± 0.01	98.5–99.3
FGTN (Federated)	98.3 ± 0.8	97.6 ± 0.8	98.0 ± 0.7	98.2 ± 0.8	0.96 ± 0.01	97.8–98.8
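
The 95% CI column in Table 4 can be reproduced from the accuracy mean and standard deviation with a normal-approximation interval over the 10 folds, mean ± 1.96 · SD / √10. This is an observation about the reported numbers, not a procedure stated in the source; the helper below is a hypothetical sketch.

```python
import math


def normal_ci(mean, std, n, z=1.96):
    """Two-sided interval mean +/- z * std / sqrt(n)."""
    half_width = z * std / math.sqrt(n)
    return mean - half_width, mean + half_width


# (mean, standard deviation) of accuracy in percent, taken from Table 4.
rows = {
    "Shallow CNN": (81.4, 2.3),
    "CNN": (86.8, 1.9),
    "Attention U-Net": (88.2, 1.7),
    "Capsule Network": (89.0, 1.6),
    "GTN (Centralized)": (98.9, 0.6),
    "FGTN (Federated)": (98.3, 0.8),
}
for name, (mean, std) in rows.items():
    low, high = normal_ci(mean, std, n=10)
    print(f"{name}: {low:.1f}-{high:.1f}")
```

A Student-t interval with 9 degrees of freedom would be somewhat wider than these values, which is why the normal approximation is the assumption used here.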

Table 5. Class-wise one-vs-rest AUC evaluation for CAD severity classification.

Severity Class	CNN AUC	Attention U-Net AUC	Capsule Network AUC	FGTN AUC
Non-Obstructive	0.90	0.91	0.92	0.98
Mild	0.88	0.89	0.90	0.97
Moderate	0.87	0.88	0.89	0.96
Severe	0.85	0.87	0.88	0.95
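
A macro-average AUC is the unweighted mean of the per-class one-vs-rest AUC values. The short sketch below applies that definition to the FGTN column of Table 5 and also computes the per-class margin over the Capsule Network column; the variable names are illustrative, and whether Table 5 comes from the same runs as the cross-validated macro-average in Table 4 is not stated in the source.

```python
from statistics import fmean

# Per-class one-vs-rest AUC values copied from Table 5.
classes = ["Non-Obstructive", "Mild", "Moderate", "Severe"]
fgtn_auc = [0.98, 0.97, 0.96, 0.95]
capsule_auc = [0.92, 0.90, 0.89, 0.88]

# Macro average: every class counts equally, regardless of its sample count.
macro_auc = fmean(fgtn_auc)
print(f"macro-average AUC: {macro_auc:.3f}")

# Per-class margin of FGTN over the Capsule Network baseline.
for name, ours, base in zip(classes, fgtn_auc, capsule_auc):
    print(f"{name}: +{ours - base:.2f}")
```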

Table 6. Federated Learning convergence: client and global F1-scores (%) by communication round.

Round	Client 1 F1	Client 2 F1	Client 3 F1	Global FGTN F1
1	95.2	94.8	95.0	96.2
5	96.5	96.2	96.4	97.8
10	97.1	96.9	97.0	98.2
15	97.3	97.2	97.4	98.4
20	97.4	97.3	97.5	98.5

Table 7. Ablation results of the proposed FGTN model.

Model Variant	Graph Module	Transformer Module	Federated Learning	Accuracy (%)	F1-Score (%)	AUC
FGTN—without Graph	✗	✓	✓	91.8	90.9	0.93
FGTN—without Transformer	✓	✗	✓	92.1	91.3	0.94
FGTN—without Federated Learning	✓	✓	✗	93.4	92.5	0.95
Proposed FGTN (Full Model)	✓	✓	✓	99.4	98.2	0.96

Table 8. Comparison of the proposed method with existing methods.

Method	Parameters (M)	FLOPs (G)	Accuracy (%)	F1-Score (%)	AUC
CNN-based CAD Detection [23]	3.2	8.4	88.4	87.2	0.89
RF-CNN-F [24]	4.1	9.7	90.1	89.0	0.91
U-Net [8]	8.7	18.6	89.3	88.6	0.90
Capsule Network [15]	11.3	22.5	90.5	89.3	0.91
Graph Neural Network [7]	7.6	14.2	91.2	90.4	0.92
Transformer Model [10]	12.8	24.7	92.1	91.0	0.93
Federated CNN [12]	5.8	11.6	92.8	91.6	0.94
Proposed FGTN	10.1	17.3	99.4	98.2	0.96

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alphonse, S.; Venkatesan, R.; Gunasekaran, H.; Swaminathan, D.K.; Ramalakshmi, K. Federated Graph-Transformer Network for Coronary Artery Disease Severity Grading from X-Ray Coronary Angiography. Mach. Learn. Knowl. Extr. 2026, 8, 187. https://doi.org/10.3390/make8070187

