Article

Rapid Right Coronary Artery Extraction from CT Images via Global–Local Deep Learning Method Based on GhostNet

by Yanjun Li 1, Takaaki Yoshimura 2,3,4,5 and Hiroyuki Sugimori 4,5,6,*

1 Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812, Japan
2 Department of Health Sciences and Technology, Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812, Japan
3 Department of Medical Physics, Hokkaido University Hospital, Sapporo 060-8648, Japan
4 Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, Sapporo 060-8648, Japan
5 Clinical AI Human Resources Development Program, Faculty of Medicine, Hokkaido University, Sapporo 060-8648, Japan
6 Department of Biomedical Science and Engineering, Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812, Japan
* Author to whom correspondence should be addressed.
Electronics 2025, 14(7), 1399; https://doi.org/10.3390/electronics14071399
Submission received: 15 March 2025 / Revised: 24 March 2025 / Accepted: 28 March 2025 / Published: 31 March 2025
(This article belongs to the Special Issue Artificial Intelligence-Driven Emerging Applications)

Abstract

The right coronary artery plays a crucial role in cardiac function and its accurate extraction and 3D reconstruction from CT images are essential for diagnosing and treating coronary artery disease. This study proposes a novel, automated, deep learning pipeline that integrates a transformer-based network with GhostNet to improve segmentation and 3D reconstruction. The dataset comprised CT images from 32 patients, with the segmentation model effectively extracting vascular cross-sections, achieving an F1 score of 0.887 and an Intersection over Union of 0.797. Meanwhile, the proposed model achieved an inference speed of 7.03 ms, outperforming other state-of-the-art networks used for comparison, making it highly suitable for real-time clinical applications. Compared to conventional methods, the proposed approach demonstrates superior segmentation performance while maintaining computational efficiency. The results indicate that this framework has the potential to significantly improve diagnostic accuracy and interventional planning for coronary artery disease. Future work will focus on expanding dataset diversity, refining real-time processing capabilities, and extending the methodology to other vascular structures.

1. Introduction

The right coronary artery (RCA) plays a vital role in cardiac function by delivering oxygen-rich blood to crucial heart structures, including the right atrium, right ventricle, and parts of the left ventricle. It also supplies the essential sinoatrial and atrioventricular nodes [1]. As the source of the posterior descending artery in most cases, the RCA is particularly significant in determining coronary dominance. Any RCA pathology, such as stenosis or anatomical variations, can result in serious complications including inferior myocardial infarction and cardiac arrhythmias [2]. While invasive coronary angiography remains the gold standard for RCA assessment, computed tomography has emerged as a valuable, non-invasive alternative that provides detailed anatomical visualization without procedural risks [3].
The extraction and reconstruction of coronary arteries from CT images pose significant challenges due to the complex structure of the coronary vessels, the presence of noise and calcifications, and the dynamic nature of the heart, which is influenced by both respiratory and cardiac motion [4]. Manual segmentation of these vessels is a time-consuming process prone to inter-observer variability, making it an unreliable option for consistent clinical use [5]. Automated methods, particularly those involving artificial intelligence, are increasingly being pursued as viable alternatives to overcome these challenges, ensuring consistency, accuracy, and efficiency [6,7].
In recent years, convolutional neural networks have emerged as powerful tools for medical image segmentation due to their ability to learn hierarchical features from large volumes of image data. For instance, the U-Net architecture [8] has become a benchmark in medical imaging tasks, demonstrating strong performance in segmenting various types of tissues and organs. This foundational architecture has inspired numerous variants and improvements, each addressing specific challenges in medical image analysis. Attention U-Net [9] enhances the original architecture by incorporating attention gates, which enable the network to focus on relevant image regions while suppressing irrelevant features. This modification has proven to be particularly effective in multi-organ segmentation tasks, where precise boundary delineation is crucial. Similarly, Residual U-Net [10] integrates residual connections to facilitate deeper network training and mitigate the vanishing gradient problem, resulting in improved feature representation and segmentation accuracy. Furthermore, Dense U-Net [11] leverages dense connectivity patterns to strengthen feature propagation and encourage feature reuse, leading to more efficient network training and robust segmentation results. In addition, 3D U-Net [12] extends the original architecture to process volumetric medical data directly, making it particularly suitable for three-dimensional imaging modalities such as CT and MRI scans; this adaptation has significantly improved the accuracy of volumetric segmentation tasks. A more recent development, TransUNet [13], combines the strengths of U-Net with transformer architectures, capitalizing on both local and global feature dependencies for enhanced segmentation performance. Swin-Unet [14] further advances this approach by incorporating hierarchical transformer blocks, demonstrating state-of-the-art performance across various medical image segmentation benchmarks. However, despite their effectiveness, CNN-based approaches struggle to capture long-range dependencies due to their limited receptive fields. This makes them less suitable for tasks where understanding the broader anatomical context is crucial, such as coronary artery segmentation.
To overcome the limitations of CNNs in capturing global context, transformer-based architectures have gained increasing attention in the field of medical imaging. Originally developed for natural language processing, transformers utilize self-attention mechanisms that excel at modeling global dependencies and contextual relationships within data. In medical imaging, transformers have proven effective at enhancing segmentation accuracy by allowing the model to consider the entire image context rather than being constrained to local features, a capability that is particularly beneficial for segmenting coronary arteries. The Vision Transformer (ViT) [15] utilizes self-attention mechanisms to model global dependencies within the input data. More recent transformer models, such as the Swin Transformer [16], have further improved the efficiency of self-attention by introducing a hierarchical representation, making them more computationally feasible for medical applications. Additionally, several studies have focused on vessel segmentation in coronary artery CT images, exploring various methods. For instance, a CNN-based approach [17] was applied for coronary artery segmentation from CT images, demonstrating strong performance in detecting vessels; however, the method suffered from high computational requirements and occasional inaccuracies in small vessel detection. Similarly, a 3D fully convolutional network (FCN) [18] was employed to segment coronary arteries, which improved the continuity of segmented vessels but faced challenges in distinguishing closely positioned vessel branches, especially in the presence of noise or calcifications. Another notable work introduced a hybrid approach combining a U-Net with attention mechanisms to segment coronary arteries from CT images [19]. This method provided improved segmentation accuracy, particularly for smaller vessels, by focusing on critical regions; however, its reliance on complex attention layers resulted in increased memory consumption and slower inference times, limiting its real-time applicability in clinical settings. These methods collectively highlight the ongoing efforts and challenges in achieving precise and efficient coronary artery segmentation, paving the way for more advanced approaches, such as the transformer-based integration discussed in this work.
Despite the advantages transformers bring to medical image analysis, they come with significant computational costs, which limits their practical applicability in clinical environments that require real-time processing. One promising approach to mitigate these issues is the integration of lightweight network architectures, such as GhostNet, into transformer models. GhostNet [20] employs Ghost modules to generate additional feature maps from intrinsic features, significantly reducing the computational burden while retaining accuracy. This hybrid approach, combining the global attention capabilities of transformers with the parameter efficiency of GhostNet, offers a balanced solution that is both effective and practical for clinical use. In medical imaging, GhostNet has been incorporated into U-Net architectures, such as Ghost-U-Net [21], to improve liver tumor segmentation in CT scans, achieving efficient processing without compromising accuracy. Additionally, GhostNet has been utilized in mobile healthcare applications, including real-time ECG classification, where its lightweight design enables rapid analysis with reduced latency and power consumption [22].
Furthermore, transitioning from 2D cross-sectional images to a 3D reconstruction of coronary arteries presents notable advantages. Three-dimensional reconstructions provide a holistic view of the vascular structures, enabling clinicians to assess vessel morphology more intuitively and accurately. Studies have shown that 3D models can significantly improve the accuracy of detecting and quantifying stenosis [23]. For sonographers and radiologists, the added depth information facilitates a more precise evaluation of complex coronary anatomy, aiding in better preoperative planning and reducing uncertainties during interventions. This advancement not only enhances diagnostic capabilities but also reduces the cognitive load on healthcare professionals, ultimately leading to improved clinical outcomes. For patients, the shift towards non-invasive 3D imaging provides multiple benefits, including reduced exposure to the risks associated with invasive coronary angiography, such as arterial damage or infection. Additionally, a 3D representation allows for more personalized treatment planning, as a detailed visualization of the coronary artery tree can help clinicians devise more accurate interventions, potentially improving prognosis and reducing the likelihood of adverse events [24].
This paper presents a method utilizing a transformer-integrated convolutional neural network with GhostNet for segmentation, enabling the automated extraction of the RCA from CT images and its reconstruction in 3D. The approach aims to address both the accuracy and efficiency challenges inherent in current methodologies, ultimately contributing to improved clinical workflows and patient care in the context of coronary artery disease management.
Our study makes the following contributions in this manuscript:
  • We propose a method for extracting cross-sections of the right coronary artery that achieves effective results without requiring excessive computational resources.
  • By employing volume rendering and surface smoothing techniques, we successfully reconstructed the three-dimensional structure of the vessel, facilitating more accurate localization of coronary artery stenosis for clinical diagnosis.

2. Materials and Methods

2.1. Materials

The research dataset consisted of CT images collected from 32 patients examined at Hokkaido University Hospital. The CT volumes ranged from 400 to 640 slices per patient, with each slice having dimensions of 512 × 512 pixels. Through careful manual selection, 4830 CT slices containing diagnostically significant information were separated from the complete volumes. These selected images underwent detailed annotation to mark the right coronary artery cross-sections, establishing an accurate ground truth dataset necessary for model development and evaluation. The dataset distribution involved 3400 CT images assigned for training purposes, 500 allocated for validation, and 930 designated for testing, all selected through random sampling from the 32 cases.
The image preparation workflow incorporated both preprocessing and data augmentation techniques. Preprocessing enhanced image quality, while augmentations such as rotations, scaling, and flipping artificially expanded the effective dataset size and diversity, strengthening the model’s ability to generalize and reducing the risk of overfitting. The entire preprocessing and augmentation workflow was executed in MATLAB 2023b.
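Although the study’s pipeline ran in MATLAB 2023b, the augmentation step can be sketched in Python as follows; the rotation and scaling ranges are assumptions for illustration only, not the study’s settings. The key point is that the image and its annotation mask receive identical spatial transforms.

```python
import numpy as np
from scipy import ndimage

def augment(image, mask, rng=None):
    """Apply one random rotation/scaling/flipping pass to a CT slice and
    its annotation mask, keeping the two spatially aligned."""
    rng = rng or np.random.default_rng()
    angle = rng.uniform(-15, 15)     # rotation in degrees (assumed range)
    scale = rng.uniform(0.9, 1.1)    # isotropic zoom factor (assumed range)
    image = ndimage.rotate(image, angle, reshape=False, order=1)
    mask = ndimage.rotate(mask, angle, reshape=False, order=0)  # nearest for labels
    image = ndimage.zoom(image, scale, order=1)
    mask = ndimage.zoom(mask, scale, order=0)
    if rng.random() < 0.5:           # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    return image, mask
```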

2.2. Model Descriptions

2.2.1. GhostNet

GhostNet [20] is a lightweight neural network module designed to reduce computational load without sacrificing performance. It achieves this by using Ghost modules, which generate additional feature maps from the original features through simple linear operations. The core idea of GhostNet can be expressed mathematically as

$$F_{\mathrm{ghost}} = F_{\mathrm{intrinsic}} + h(F_{\mathrm{intrinsic}}),$$

where $F_{\mathrm{intrinsic}}$ represents the intrinsic feature maps generated by a standard convolution and $h(\cdot)$ represents a set of cheap linear transformations applied to $F_{\mathrm{intrinsic}}$ to produce more feature maps. This approach significantly reduces the number of parameters and computational costs associated with the model, making it suitable for applications where efficiency is crucial.
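To make the Ghost module concrete, the following PyTorch sketch is a minimal version under our own assumptions, not the paper’s exact implementation: part of the output channels come from a standard convolution (the intrinsic features), the rest from a cheap depthwise convolution playing the role of $h(\cdot)$, and the two sets are combined by channel concatenation, the usual realization of the equation above.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Minimal Ghost module sketch. Assumes out_ch is divisible by ratio."""
    def __init__(self, in_ch, out_ch, ratio=2):
        super().__init__()
        intrinsic_ch = out_ch // ratio          # channels from the primary conv
        cheap_ch = out_ch - intrinsic_ch        # channels from cheap operations
        self.primary = nn.Sequential(           # standard convolution
            nn.Conv2d(in_ch, intrinsic_ch, 1, bias=False),
            nn.BatchNorm2d(intrinsic_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(             # cheap linear transformation h(.)
            nn.Conv2d(intrinsic_ch, cheap_ch, 3, padding=1,
                      groups=intrinsic_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        intrinsic = self.primary(x)
        return torch.cat([intrinsic, self.cheap(intrinsic)], dim=1)
```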
GhostNet’s architecture employs a dual approach to feature map generation, utilizing standard convolution layers for primary feature extraction alongside Ghost modules for derivative feature map creation. This architectural design enables GhostNet to maintain robust feature representation capabilities while achieving significant reductions in parameter count and computational operations.
The implementation positions GhostNet between the encoder and decoder components, creating an optimized architecture for coronary artery segmentation that balances performance with computational efficiency. This integration leverages Ghost modules to maintain segmentation precision while reducing model complexity, which is particularly beneficial for real-time clinical applications. The architecture incorporates batch normalization and activation layers throughout the Ghost module sequence, ensuring optimal feature map conditioning for subsequent network processing stages.

2.2.2. Volume Rendering

Volume rendering serves as a core visualization technique for examining complex, three-dimensional anatomical structures, with significant applications in medical imaging. This methodology transforms volumetric data obtained from CT and MRI scans into comprehensive visual displays that highlight intricate spatial and anatomical relationships [25]. In the specific context of coronary artery visualization, volume rendering plays a crucial role by offering detailed perspectives of vessel structure, thus enhancing diagnostic accuracy and supporting treatment planning processes.
The technique involves converting volumetric data into three-dimensional representations through projection onto two-dimensional viewing planes. In contrast to surface-rendering methods that only show structural boundaries, volume rendering examines the complete data volume to display internal characteristics. This capability is essential in medical imaging scenarios where healthcare professionals must evaluate both external surfaces and internal features, including tissue density variations [26]. The technique’s flexibility in modifying opacity levels and color mappings enables the selective emphasis of various tissue types and anatomical structures, establishing volume rendering as an adaptable tool for medical visualization applications. Volume rendering distinguishes itself through its capacity to preserve fine details that might be lost in conventional surface-rendering methods. Studies have demonstrated that volume rendering enhances the precision of clinical evaluations by providing more realistic anatomical visualizations compared to surface-rendering techniques [27].

2.2.3. Proposed Methods

Despite the proven capabilities of the Medical Transformer in many applications, it has a significant drawback: deep networks generally require long training times and incur high computational costs. Although the Medical Transformer employs optimizations such as reduced attention window sizes and hierarchical processing to lower training costs while preserving accuracy, the fundamental transformer architecture limits the effectiveness of these measures. Additionally, the object queries in the decoder of the Medical Transformer are not highly correlated with the image, which impacts segmentation quality.
The proposed attention mechanism model is based on enhancing the axial attention mechanism, an approach that improves the efficiency of self-attention, particularly for high-dimensional data such as images or 3D volumes. Unlike traditional self-attention, which calculates relationships between all elements in the input tensor simultaneously, axial attention breaks this computation into sequential steps along specific dimensions, such as rows, columns, or depth. The input tensor $x$ is first transformed using learnable weight matrices ($W_q$, $W_k$, and $W_v$) into queries ($Q$), keys ($K$), and values ($V$). These components allow the model to compute attention scores and aggregate information from relevant positions. Positional information ($r_q$, $r_k$, and $r_v$) is added at various stages to capture spatial dependencies, ensuring that the mechanism respects the structure of the input.

The attention process begins by computing the interaction between $Q$ and $K$, incorporating positional encodings to reflect relative positions. These interactions are normalized using a softmax function, producing attention weights, which are then used to combine the corresponding values into an aggregated representation. This process is repeated for each axis independently, for example row-wise and then column-wise, so that the model captures dependencies across all dimensions.

By decomposing the computation along each axis, axial attention significantly reduces the computational cost compared to traditional self-attention: instead of scaling quadratically with the total number of elements in the input tensor, the computation scales linearly with the size of each axis. This efficiency makes axial attention particularly suitable for handling large inputs while maintaining high performance. The output of axial attention, denoted $y$, is a context-aware representation of the input, combining information along all dimensions. This mechanism preserves the global context of the input while allowing more efficient processing of high-dimensional data, making it a powerful tool for tasks like image and volume analysis, where computational efficiency and spatial structure are both critical.
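As a minimal illustration of this decomposition, the following PyTorch sketch applies content-only attention along the width axis and then the height axis; positional encodings and gating are omitted for brevity, and the weight matrices are placeholders rather than trained parameters.

```python
import torch

def axial_attention(x, w_q, w_k, w_v):
    """Toy axial self-attention on a feature map x of shape (H, W, C).

    Full self-attention over H*W positions scales with (H*W)^2; attending
    along rows and then columns scales with H*W*(W + H) instead.
    """
    def attend(t):                                   # t: (..., L, C)
        q, k, v = t @ w_q, t @ w_k, t @ w_v
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        return scores.softmax(dim=-1) @ v
    x = attend(x)                                    # width axis: each row attends within itself
    return attend(x.transpose(0, 1)).transpose(0, 1) # height axis: each column

# Example: a 64x64 map with 16 channels keeps its shape.
h, w, c = 64, 64, 16
x = torch.randn(h, w, c)
w_q, w_k, w_v = (torch.randn(c, c) / c ** 0.5 for _ in range(3))
y = axial_attention(x, w_q, w_k, w_v)
```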
The gated axial attention mechanism builds upon axial attention by introducing gate parameters ($q_r$, $k_r$, $s_v$, $s_{ve}$) to dynamically control the contributions of queries, keys, values, and positional encodings. This adjustment addresses a limitation in axial attention, where positional information ($r_q$, $r_k$, $r_v$) can overly influence attention scores. By scaling the impact of positional encodings, the gates ensure a better balance between content relevance and spatial dependencies, allowing the model to focus more effectively on meaningful relationships in the data. In this mechanism, the gates act as learnable coefficients that modulate the interaction between content and position. For example, $q_r$ and $k_r$ scale the influence of queries and keys when combined with positional encodings, while $s_v$ adjusts the contribution of values during aggregation. This dynamic modulation reduces the risk of overfitting to positional biases, making the model more robust to variations in spatial structure. The gated axial attention mechanism retains the computational efficiency of axial attention as it still processes data along individual axes. However, the added flexibility of gate parameters allows the model to adapt to diverse data distributions and the varying importance of spatial axes. This makes it especially useful for high-dimensional structured data, such as images and volumetric datasets, where spatial dependencies can differ significantly across dimensions.
We observed that when using gated axial attention for large tissue segmentation, the segmentation performance improved when the gate unit weights were smaller. Conversely, for smaller tissues, the segmentation accuracy improved when the gate unit weights were larger. This indicates that the smaller the impact of gate unit parameters on positional information, the more effective the model becomes in segmenting small targets. Since our goal was to identify small targets, we introduced four additional parameters, as shown in Figure 1, to mitigate the weakening effect of the gate unit on positional information, thereby enhancing the model’s capability to segment vascular cross-sections.
With the proposed modifications, the self-attention mechanism applied to the width axis can be expressed as follows:

$$y_{ij} = \sum_{w=1}^{W} \operatorname{softmax}\left( q_{ij}^{T} k_{iw} + \beta_{qr} G_Q\, q_{ij}^{T} r_{iw}^{q} + \beta_{kr} G_K\, k_{iw}^{T} r_{iw}^{k} \right) \left( \beta_{sv} G_{V1}\, v_{iw} + \beta_{sve} G_{V2}\, r_{iw}^{v} \right)$$
We introduced amplification factors into the gated attention mechanism. Specifically, $G_Q$, $G_K$, $G_{V1}$, $G_{V2}$, $\beta_{qr}$, $\beta_{kr}$, $\beta_{sv}$, and $\beta_{sve}$ were learnable parameters that collectively formed the new parameters for modulating positional information. These parameters controlled the influence of the learned relative positional encoding for capturing non-local contextual information.
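A sketch of how this modified width-axis attention could be implemented in PyTorch is shown below; the single-head layout, tensor shapes, and initial gate values are our assumptions rather than details taken from the paper. The height axis would use a second copy of the module with the height dimension folded into the batch.

```python
import torch
import torch.nn as nn

class GatedAxialAttention1D(nn.Module):
    """One-axis gated axial attention with amplification factors beta_*."""
    def __init__(self, dim, width):
        super().__init__()
        self.to_qkv = nn.Conv1d(dim, dim * 3, kernel_size=1, bias=False)
        # Relative positional encodings r^q, r^k, r^v, one vector per (i, w) pair
        self.rel_q = nn.Parameter(torch.randn(width, width, dim) * 0.02)
        self.rel_k = nn.Parameter(torch.randn(width, width, dim) * 0.02)
        self.rel_v = nn.Parameter(torch.randn(width, width, dim) * 0.02)
        # Gates G_Q, G_K, G_V1, G_V2 of gated axial attention
        self.G_Q, self.G_K = nn.Parameter(torch.zeros(1)), nn.Parameter(torch.zeros(1))
        self.G_V1, self.G_V2 = nn.Parameter(torch.ones(1)), nn.Parameter(torch.zeros(1))
        # Proposed amplification factors beta_qr, beta_kr, beta_sv, beta_sve
        self.b_qr, self.b_kr = nn.Parameter(torch.ones(1)), nn.Parameter(torch.ones(1))
        self.b_sv, self.b_sve = nn.Parameter(torch.ones(1)), nn.Parameter(torch.ones(1))

    def forward(self, x):                                   # x: (B, C, W), one row per item
        q, k, v = self.to_qkv(x).chunk(3, dim=1)            # each (B, C, W)
        logits = torch.einsum('bci,bcw->biw', q, k)         # content term q_ij^T k_iw
        logits = logits + self.b_qr * self.G_Q * torch.einsum('bci,iwc->biw', q, self.rel_q)
        logits = logits + self.b_kr * self.G_K * torch.einsum('bcw,iwc->biw', k, self.rel_k)
        attn = logits.softmax(dim=-1)                       # softmax over positions w
        out = self.b_sv * self.G_V1 * torch.einsum('biw,bcw->bci', attn, v)
        out = out + self.b_sve * self.G_V2 * torch.einsum('biw,iwc->bci', attn, self.rel_v)
        return out                                          # (B, C, W)
```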
Additionally, we utilized GhostNet as the feature extractor, which significantly reduced the number of parameters and computational memory requirements while maintaining prediction accuracy. This approach not only lowered the time required for feature extraction but also enhanced the model’s efficiency. To further improve performance, we incorporated a structure that combined both local and global features extracted by GhostNet and fed them into the decoder. This integration enabled the effective detection of small regional targets, a critical factor for accurately segmenting the coronary arteries.
The overall network structure is shown in Figure 2. In the proposed architecture, feature extraction is divided into two pathways: global and local. After feature extraction, the two sets of features are fused and passed through fully connected layers. For global feature extraction, where the input image is large, we reduced the number of convolutional layers; for local feature extraction, where the segmented regions are smaller, we increased the number of convolutional layers. This adjustment ensured that the time required for global and local feature extraction was approximately the same, facilitating the subsequent feature fusion process.
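The following PyTorch sketch illustrates this two-pathway layout; the channel counts and layer depths are illustrative assumptions, not the exact configuration of Figure 2. Both branches downsample by the same overall factor so their feature maps can be concatenated and fused.

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    """Two-pathway feature extraction: a shallow global branch over the
    full image and a deeper local branch, fused before the decoder."""
    def __init__(self, in_ch=1, feat_ch=32):
        super().__init__()
        self.global_branch = nn.Sequential(            # fewer convolutions
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU())
        self.local_branch = nn.Sequential(             # more convolutions
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(feat_ch * 2, feat_ch, 1)  # 1x1 fusion

    def forward(self, x):
        g = self.global_branch(x)
        l = self.local_branch(x)
        return self.fuse(torch.cat([g, l], dim=1))
```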

2.3. The 3D Reconstruction

The 3D reconstruction of the vessel was achieved by creating a mesh grid from extracted boundary points in sequential 2D slices. This mesh grid represented the vessel’s morphology across layers and served as the foundation for volume rendering. By later applying Laplacian Pyramid Interpolation to the volume, we enhanced the resolution and smoothness of the reconstructed structure. This section details the steps involved in mesh grid construction, data storage, and 3D rendering.

2.3.1. Boundary Point Extraction and Layer-by-Layer Mesh Grid Construction

To represent the vessel’s structure in 3D, we first extracted the boundary points from each 2D slice of the vessel. Let $P_i = \{(x_{i,1}, y_{i,1}), (x_{i,2}, y_{i,2}), \ldots, (x_{i,N}, y_{i,N})\}$ denote the set of $N$ boundary points extracted from the $i$-th layer, where $(x_{i,n}, y_{i,n})$ is the coordinate of the $n$-th boundary point in the 2D plane. These points were obtained through the segmentation of each vessel slice and served as the vertices of the mesh grid.
Within each layer $i$, we constructed a closed curve by connecting each pair of adjacent points $(x_{i,n}, y_{i,n})$ and $(x_{i,n+1}, y_{i,n+1})$ with a line segment, where the index $n+1$ is taken modulo $N$ to form a continuous boundary. Mathematically, this intra-layer connection can be represented by the set of line segments:

$$L_{\mathrm{intra},i} = \left\{ \left( (x_{i,n}, y_{i,n}),\, (x_{i,n+1}, y_{i,n+1}) \right) \mid n = 1, 2, \ldots, N \right\}.$$
To extend the vessel structure across layers, we connected corresponding points in adjacent layers. For each point $(x_{i,n}, y_{i,n})$ in layer $i$, we drew a line segment to the point $(x_{i+1,n}, y_{i+1,n})$ in the next layer $i+1$. This connection across layers was defined as

$$L_{\mathrm{inter}} = \bigcup_{i=1}^{M-1} \left\{ \left( (x_{i,n}, y_{i,n}, z{=}i),\, (x_{i+1,n}, y_{i+1,n}, z{=}i{+}1) \right) \mid n = 1, 2, \ldots, N \right\},$$

where $M$ represents the total number of layers and $z = i$ indicates the layer’s position in the 3D space.
Combining the intra-layer and inter-layer connections, we obtained a complete 3D mesh grid representing the vessel’s structure. The 3D coordinates of each boundary point were given by $(x_{i,n}, y_{i,n}, z=i)$, where $z$ served as the depth axis.
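In code, the construction of $L_{\mathrm{intra}}$ and $L_{\mathrm{inter}}$ might look as follows: a NumPy sketch assuming every slice contributes the same number of consistently ordered boundary points (function and variable names are ours).

```python
import numpy as np

def build_vessel_mesh(layers):
    """Build the 3D vessel mesh from per-slice boundary points.
    `layers` is a list of (N, 2) arrays with identical N and consistent
    point ordering. Returns vertices (M*N, 3) plus index pairs for the
    intra-layer and inter-layer line segments."""
    n = layers[0].shape[0]
    verts, intra, inter = [], [], []
    for i, pts in enumerate(layers):
        base = i * n
        verts.append(np.column_stack([pts, np.full(n, i, dtype=float)]))  # z = i
        intra += [(base + j, base + (j + 1) % n) for j in range(n)]       # closed curve, mod N
        if i + 1 < len(layers):                                           # link to next slice
            inter += [(base + j, base + n + j) for j in range(n)]
    return np.vstack(verts), np.array(intra), np.array(inter)
```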

2.3.2. Saving Mesh Grid Data

The mesh grid data, consisting of the boundary points and connection lines, were saved for each layer to preserve the structure and facilitate further processing. These data were stored in a structured format as a sequence of points and line segments for use in subsequent volume-rendering and interpolation steps. Each point $(x_{i,n}, y_{i,n}, z=i)$ was indexed, and connectivity matrices were created to represent the intra-layer and inter-layer connections.

2.3.3. Initial Volume Rendering of the Mesh Grid

With the 3D mesh grid constructed, initial volume rendering was performed to visualize the vessel structure in the 3D space. By treating the mesh grid as a spatial representation of the vessel, we obtained a preliminary volume-rendering that provided an initial overview of the vessel’s morphology.

2.3.4. Resolution Enhancement and Smoothness Adjustment via Laplacian Pyramid Interpolation

To improve the resolution and smoothness of the reconstructed volume, we applied Laplacian Pyramid Interpolation to the initial rendering. The Laplacian Pyramid Interpolation process refined the spatial details and enhanced the continuity of the vessel surface, resulting in a high-quality 3D visualization.
Given the original volume data $V$, represented by the 3D coordinates of the mesh points, we constructed a Laplacian pyramid $\{V_0, V_1, \ldots, V_L\}$, where $V_0$ was the finest level and $V_L$ the coarsest. Using Gaussian filtering and downsampling, we defined each level of the Gaussian pyramid $\{G_0, G_1, \ldots, G_L\}$ as

$$G_k = \mathrm{Downsample}(\mathrm{Gaussian}(G_{k-1})),$$

where $G_0 = V$. Subsequently, each level of the Laplacian pyramid was formed by subtracting the upsampled version of the next coarser Gaussian level from the current Gaussian level:

$$V_k = G_k - \mathrm{Upsample}(G_{k+1}).$$

Starting from the coarsest level $V_L$, we iteratively upsampled and combined the layers to reconstruct the volume at higher resolution:

$$\hat{V}_k = \mathrm{Upsample}(\hat{V}_{k+1}) + V_k.$$
The Laplacian Pyramid Interpolation enhanced structural continuity and detail across the vessel surface, ensuring that the 3D rendering retained both smoothness and resolution. This step provided a refined, high-resolution 3D volume suitable for detailed analysis and visualization of the vessel morphology.
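For concreteness, a Python sketch of this pyramid decomposition and reconstruction on a 3D volume is given below, assuming SciPy’s ndimage for the Gaussian filtering and resampling; the filter width, pyramid depth, and interpolation order are illustrative choices, not the paper’s settings.

```python
import numpy as np
from scipy import ndimage

def laplacian_pyramid_refine(volume, levels=3):
    """Decompose a 3D volume into a Laplacian pyramid and reconstruct it
    level by level, following the three equations above."""
    # Gaussian pyramid: G_k = Downsample(Gaussian(G_{k-1})), with G_0 = V
    gauss = [np.asarray(volume, dtype=float)]
    for _ in range(levels):
        smoothed = ndimage.gaussian_filter(gauss[-1], sigma=1.0)
        gauss.append(ndimage.zoom(smoothed, 0.5, order=1))

    def upsample(v, shape):
        factors = [s / vs for s, vs in zip(shape, v.shape)]
        return ndimage.zoom(v, factors, order=1)

    # Laplacian levels: V_k = G_k - Upsample(G_{k+1})
    lap = [g - upsample(gn, g.shape) for g, gn in zip(gauss[:-1], gauss[1:])]
    # Reconstruction from the coarsest level: V_hat_k = Upsample(V_hat_{k+1}) + V_k
    recon = gauss[-1]
    for v_k in reversed(lap):
        recon = upsample(recon, v_k.shape) + v_k
    return recon   # detail layers can be reweighted or upsampled further for refinement
```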

2.4. Evaluation Indicators

2.4.1. F1 Score

The F1 score is a metric used to evaluate the accuracy of a binary classification model, considering both precision and recall. It is particularly useful when there is an imbalanced class distribution. The F1 score is the harmonic mean of precision and recall, providing a single measure that balances the two.
The F1 score can be calculated as

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
The F1 score is maximized when both precision and recall are high, making it a useful metric when you need a balance between the two.

2.4.2. Mean Intersection over Union

The mean Intersection over Union (mIoU) is a metric commonly used in semantic segmentation tasks. It measures the overlap between the predicted segmentation and the ground truth. For a single class, the IoU is defined as the ratio of the intersection of the predicted and true positive pixels to the union of all predicted and true positive pixels. The mean IoU is computed by averaging the IoU values across all classes.
The mean IoU is the average of the IoU scores across all classes:

$$\mathrm{mIoU} = \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{TP}_i}{\mathrm{TP}_i + \mathrm{FP}_i + \mathrm{FN}_i},$$

where $N$ is the number of classes (e.g., background and different object categories in segmentation tasks) and the summation is taken over all classes.
This metric provides a comprehensive measure of performance for segmentation tasks as it takes into account the overlap between predicted and actual regions across all classes.
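Both indicators reduce to counts of true positives, false positives, and false negatives. A reference implementation for label maps could look as follows; the small epsilon guarding against empty classes is an implementation choice of ours.

```python
import numpy as np

def f1_and_miou(pred, gt, num_classes=2, eps=1e-9):
    """Compute the foreground F1 score and mean IoU from integer label
    maps `pred` and `gt` of identical shape (class 1 = vessel)."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        ious.append(tp / (tp + fp + fn + eps))
    tp = np.sum((pred == 1) & (gt == 1))
    fp = np.sum((pred == 1) & (gt != 1))
    fn = np.sum((pred != 1) & (gt == 1))
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return f1, float(np.mean(ious))
```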

3. Experiment and Results

In this study, we conducted training and prediction experiments using the same dataset across various state-of-the-art deep learning networks to evaluate their performance. The networks included Medical Transformer [28], RCA-IUnet [29], ResUNet [10], TransUNet [13], UNet++ [30], DeepLabV3+ [31], and our proposed networks, one without GhostNet and the other with GhostNet.
The comparative results of these networks are presented in Table 1, where we provide detailed statistics, including the number of parameters, inference speed, F1 score, and IoU. These metrics were chosen to offer a comprehensive evaluation of the computational efficiency and prediction accuracy of each model.
To visually demonstrate the differences in performance, we selected two representative images from the test set. These images highlight the recognition capabilities of each network in generating segmentation masks. The prediction masks generated by each network, alongside their corresponding ground truth, are displayed in Figure 3. This figure allows for a clear qualitative comparison of the models’ effectiveness in segmenting the input images.
The performance comparison of various deep learning models, as detailed in Table 1, highlighted both the accuracy and computational efficiency of our proposed methods on the given CT image dataset. Among the tested methods, our proposed network without GhostNet achieved the highest F1 score of 0.897, demonstrating superior segmentation accuracy for the RCA cross-sections across multiple CT layers. This reflected the effectiveness of our approach in extracting fine vascular structures. When GhostNet was incorporated (Ours*), there was a marginal decrease in segmentation accuracy, with the F1 score reduced to 0.887 and the IoU to 0.797. However, this configuration achieved a significant improvement in inference speed, recording the fastest time of 7.03 ms. This trade-off between accuracy and computational efficiency makes the GhostNet-enabled model particularly suitable for applications where real-time performance is critical.
The inference speed of our proposed model was measured on a system equipped with an NVIDIA GeForce RTX 3060 GPU, an 11th Gen Intel(R) Core(TM) i5 CPU, and 32 GB of RAM. The total number of parameters was approximately 5.5 million, confirming the model’s lightweight design. Given its relatively low computational complexity and memory requirements, the proposed architecture demonstrated the potential for real-time deployment on edge devices commonly used in clinical environments, such as the NVIDIA Jetson platform or devices compatible with Intel OpenVINO. The integration of GhostNet contributed significantly to the model’s efficiency, making it suitable for resource-constrained settings without sacrificing performance.
Comparatively, other networks, such as the Medical Transformer and DeepLabV3+, exhibited competitive performance in terms of accuracy, with F1 scores of 0.882 and 0.841, respectively. However, their computational costs, including larger parameter counts and slower inference speeds, make them less efficient for practical deployment. Networks like RCA-IUnet and ResUNet, while lighter in terms of parameters, showed significantly lower F1 scores of 0.783 and 0.723, indicating less effective segmentation capabilities.
After extracting the cross-sectional masks of each layer of the right coronary artery, we proceeded to extract the edges of the predicted cross-sectional masks. On these edges, 30 points were uniformly distributed. Each point was assigned a unique identifier and neighboring points were connected by lines. This approach enabled the construction of a point-based matrix, which was then utilized to generate a three-dimensional mesh grid for reconstruction.
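A sketch of this resampling step is given below, assuming OpenCV for contour extraction (the paper does not name its exact edge-extraction routine); spacing the 30 points by equal arc length keeps them uniformly distributed along the boundary.

```python
import numpy as np
import cv2

def sample_boundary_points(mask, n_points=30):
    """Resample the outer contour of a binary mask to n_points placed at
    uniform arc-length intervals; returns an (n_points, 2) array."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).squeeze(1).astype(float)
    closed = np.vstack([boundary, boundary[:1]])            # close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)   # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    targets = np.linspace(0.0, arc[-1], n_points, endpoint=False)
    xs = np.interp(targets, arc, closed[:, 0])              # interpolate x along arc
    ys = np.interp(targets, arc, closed[:, 1])              # interpolate y along arc
    return np.column_stack([xs, ys])
```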
Figure 4a illustrates the reconstructed RCA of one patient. In this visualization, blue lines represent the connections between points within each individual layer, while red lines indicate the connections between corresponding points across adjacent layers. For better visualization and clarity, the Z-axis—representing the inter-layer distance—was appropriately stretched to emphasize the spatial relationships.
Subsequently, based on the constructed mesh grid, we employed volume-rendering techniques to further enhance the visualization of the vascular structure. Specifically, we utilized a surface-rendering approach within the volume-rendering framework to transform the mesh grid into a continuous, smooth representation of the vessel surface. This method enabled a detailed depiction of the vascular topology, providing a clear and intuitive visualization of both the location and morphology of vascular narrowing.
The rendering process involved assigning specific color and opacity values to the surface based on the geometric attributes of the mesh grid, allowing for the differentiation of structural details. This facilitated the identification of stenotic regions by highlighting their spatial characteristics and transitions. The resulting image, shown in Figure 4b, represented a preliminary rendered visualization of the RCA. The coordinate system, displayed in the lower-left corner, provided a spatial reference for interpreting the reconstructed geometry.
To further enhance the visual quality and improve the interpretability of the reconstructed vascular image, we applied smoothing techniques. Specifically, we utilized two methods: Laplacian Pyramid Interpolation and Gaussian Smoothing Interpolation. These methods were employed to reduce visual noise and ensure a smoother representation of the vascular surface, facilitating better observation and analysis.
The results of the smoothing processes are illustrated in Figure 5. Subfigure (a) shows the outcome of Gaussian Smoothing Interpolation, which emphasized gradual transitions and overall uniformity in the surface structure. Subfigure (b) presents the results of Laplacian Pyramid Interpolation, which preserved fine details while reducing abrupt variations in the mesh.
According to widely accepted clinical guidelines, a coronary artery stenosis is considered moderate when the lumen diameter is reduced by more than 50% and severe when the reduction exceeds 75%. In our reconstructed and smoothed 3D vascular images, we attempted to identify an optimal viewing angle that allowed for the clearest observation of both the location and apparent degree of stenosis. In Figure 6, the stenotic region is marked with a red bounding box to highlight an area of visually significant narrowing. While the narrowing in this region appeared prominent and corresponded well with the typical morphological patterns of stenosis, we must note that our dataset lacked access to corresponding clinical diagnostic reports. Therefore, although the narrowing was evident from visual inspection, we refrained from assigning a specific stenosis grade and instead conservatively describe it as visibly significant.

4. Discussion

This study presents a comprehensive evaluation of state-of-the-art deep learning models and a proposed architecture for the automated segmentation and 3D reconstruction of the right coronary artery from CT images. The results demonstrate notable advancements in both segmentation accuracy and computational efficiency, indicating the practicality of our method for real-time clinical applications. Through comparative analysis, our proposed model without GhostNet achieved the highest segmentation performance, with an F1 score of 0.897 and IoU of 0.814. These results underscore the effectiveness of the Transformer-based module in capturing long-range dependencies and global contextual information within the vascular structure. Compared with conventional convolution-based architectures such as ResUNet, our model demonstrated superior accuracy, particularly in detecting small, complex vascular cross-sections that are often subject to segmentation failure. This improvement can be attributed to the enhanced representational capacity introduced by the attention mechanism within the transformer module, which allows the network to focus on subtle but clinically important vessel features that may be overlooked by local receptive fields in traditional CNNs.
However, this gain in accuracy is accompanied by an increased computational burden. To address this issue, we integrated the GhostNet backbone into the network and conducted an ablation study to compare its performance with and without GhostNet. The model without GhostNet achieved the highest segmentation accuracy, with an F1 score of 0.897 and an IoU of 0.814, but had a larger parameter size of approximately 11.0 M and an inference time of 14.05 ms. After incorporating GhostNet (with s = 2), the parameter size was reduced by half to approximately 5.5 M and the inference speed improved significantly to 7.03 ms, demonstrating a clear advantage in computational efficiency. This result suggests that GhostNet is highly effective in reducing computational complexity while maintaining competitive segmentation accuracy (F1 score of 0.887, IoU of 0.797). The trade-off between speed and accuracy is acceptable for real-time clinical scenarios, such as intraoperative visualization or bedside diagnostics, where processing speed is critical. It is worth noting that the GhostNet architecture includes a tunable parameter s, which controls the degree of parameter reduction. In this study, we fixed s = 2 to balance speed and segmentation quality. Preliminary experiments indicate that increasing s could further reduce the number of parameters and accelerate inference, making the model even more suitable for deployment on resource-limited clinical edge devices. However, to maintain a reasonable level of segmentation accuracy, we limited the comparison in this work to s = 2, which we found to be a practical configuration.
The transition from 2D segmentation masks to 3D reconstruction represents a critical step in understanding the spatial structure of coronary arteries. The proposed reconstruction process involved extracting the edges of segmented masks, distributing points uniformly, and generating a mesh grid that connected points within and across layers. This approach facilitated the construction of a detailed 3D representation of the RCA, allowing for enhanced visualization of its topology. Volume rendering further improved the interpretability of the reconstructed RCA: using surface-rendering techniques, the vascular structures were transformed into a continuous and smooth representation that effectively highlighted morphological features and provided an intuitive way to identify stenotic regions, aiding clinical diagnosis and planning.

The smoothing techniques, Gaussian Smoothing and Laplacian Pyramid Interpolation, played a key role in refining the visual quality of the 3D models. Gaussian Smoothing produced a more uniform surface with faster processing times, making it attractive for applications requiring both high quality and efficiency, whereas Laplacian Pyramid Interpolation excelled at preserving edge clarity and fine structural details, making it the more suitable choice for tasks focused on detecting subtle abnormalities. The reconstructed models provided clinicians with a clear visualization of the degree and location of vascular narrowing. This capability can enhance the diagnostic process, reducing the likelihood of missed or inaccurate assessments. Moreover, the ability to adjust viewing angles and analyze smoothed vascular images supports more comprehensive evaluations, which could improve patient outcomes.
The clinical value of the 3D reconstruction method can significantly impact both procedural planning and patient care. Surgeons can utilize detailed 3D visualizations to better understand complex anatomical relationships and plan optimal approaches for interventional procedures. This enhanced understanding allows for more precise preoperative planning, potentially reducing procedure times and improving outcomes. The 3D models also serve as powerful tools for patient education, helping clinicians explain conditions and planned procedures more effectively, which can improve patient understanding and compliance. In multidisciplinary settings, these reconstructions enhance communication between different specialists, facilitating more comprehensive treatment planning. The technology also offers valuable opportunities for medical education, allowing trainees to study and practice on accurate representations of patient-specific anatomies. This application in training could accelerate the learning curve for new interventional cardiologists while improving their preparedness for complex cases.
This study, while demonstrating promising results, has several limitations that must be addressed to further validate and enhance the proposed methods:
Firstly, due to the unavailability of diagnostic data for the 32 patients included in the dataset, we were unable to validate the accuracy of stenosis location and stenosis rate predictions. This limitation hinders the ability to directly verify the clinical effectiveness of our method. Future work should aim to integrate diagnostic information to provide a more comprehensive assessment of the method’s accuracy and reliability.
Secondly, given the complexity of CT images, our study focused exclusively on the right coronary artery as it is relatively easier to identify and analyze. This approach excluded the evaluation of stenoses in branching vessels or the left coronary artery, which are equally critical in clinical diagnoses. Addressing these areas in future research would ensure a more comprehensive and versatile method capable of analyzing the full coronary vascular structure.
Thirdly, incorporating the proposed pipeline into real-world clinical workflows requires further testing. Key factors such as noise robustness, compatibility with various imaging modalities, and the user-friendliness of visualization tools need to be evaluated in collaboration with clinicians. These assessments will help ensure that the methods are practical and effective for daily clinical use.
Future work could also explore the application of these methods in related domains, such as the analysis of other vascular regions or integration with intraoperative imaging systems. In particular, we plan to extend the proposed method to other coronary arteries, such as the left anterior descending (LAD) artery and the left circumflex artery (LCX). Given the morphological and contrast similarities between these vessels and the right coronary artery, we expect that only minimal modifications to the architecture or parameters will be necessary. Furthermore, we envision integrating our method into a clinical decision support system (CDSS), where real-time vessel extraction and 3D visualization could provide intuitive anatomical insights and assist clinicians in diagnosis, lesion localization, and surgical planning. In addition, incorporating advanced techniques such as reinforcement learning or generative adversarial networks (GANs) could further improve the model’s adaptability and accuracy across varied clinical scenarios. While our current study is limited to static CT imaging, incorporating temporal dynamics and modeling the persistent motion of anatomical features may provide valuable insights for future clinical applications, such as predicting stenosis progression or tracking mobile structures like blood clots. Integrating such motion-predictive frameworks with our vascular modeling approach could open new possibilities for dynamic risk assessment and longitudinal monitoring in cardiovascular imaging.

5. Conclusions

In conclusion, this study introduced an effective deep learning framework for right coronary artery segmentation and 3D reconstruction from CT images, integrating transformer-based global feature extraction with the lightweight GhostNet backbone. The proposed method achieves a strong balance between segmentation accuracy, computational efficiency, and 3D visualization, addressing key challenges in coronary artery analysis. The reconstructed 3D models provide intuitive insights into vascular structure and potential stenosis, supporting diagnosis and surgical planning. While this study was limited to static imaging data, the presented approach establishes a solid foundation for future work involving dynamic feature modeling, clinical decision support integration, and generalization to other coronary arteries. With further validation, this framework holds promise as a practical tool to support cardiovascular diagnostics and improve patient care.

Author Contributions

Conceptualization, Y.L. and H.S.; Methodology, Y.L. and H.S.; Software, Y.L.; Validation, Y.L. and T.Y.; Formal analysis, Y.L.; Investigation, Y.L.; Resources, H.S.; Data curation, Y.L.; Writing—original draft, Y.L.; Writing—review and editing, T.Y.; Visualization, Y.L.; Supervision, H.S.; Project administration, H.S.; Funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JST SPRING, grant number JPMJSP2119.

Institutional Review Board Statement

This study involving human participants was reviewed and approved by the Clinical Research Administration Center at Hokkaido University Hospital. Ethics Committee Name: Ethical Review Board for Life Science and Medical Research, Hokkaido University Hospital. Approval Code: 016-0495. Approval Date: 11 May 2017.

Informed Consent Statement

Informed consent was not obtained for this study because it was a retrospective study. Accordingly, information about the study is published on the institution’s website (https://www.huhp.hokudai.ac.jp/date/rinsho-johokokai/etc_ika/ [accessed on 13 March 2025]) with an opt-out option for information disclosure.

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no competing interests that could have influenced the conduct or presentation of the work described in this manuscript.

Abbreviations

The following abbreviations are used in this manuscript:
RCA    Right coronary artery
ViT    Vision Transformer
FCN    Fully convolutional network
Q      Queries
K      Keys
V      Values
mIoU   Mean Intersection over Union
LAD    Left anterior descending
LCX    Left circumflex artery
CDSS   Clinical decision support system
GANs   Generative adversarial networks

References

  1. Katta, N.; Loethen, T.; Lavie, C.J.; Alpert, M.A. Obesity and coronary heart disease: Epidemiology, pathology, and coronary artery imaging. Curr. Probl. Cardiol. 2021, 46, 100655. [Google Scholar] [PubMed]
  2. Komilovich, E.B. Coronary Artery Disease. Eur. J. Mod. Med. Pract. 2023, 3, 81–87. [Google Scholar]
  3. Smith, J.; Doe, J. CT as a Non-Invasive Alternative to Coronary Angiography: Advances and Clinical Applications. J. Cardiovasc. Imaging 2021, 15, 123–135. [Google Scholar]
  4. Serruys, P.W.; Hara, H.; Garg, S.; Kawashima, H.; Nørgaard, B.L.; Dweck, M.R.; Bax, J.J.; Knuuti, J.; Nieman, K.; Leipsic, J.A.; et al. Coronary computed tomographic angiography for complete assessment of coronary artery disease: JACC state-of-the-art review. J. Am. Coll. Cardiol. 2021, 78, 713–736. [Google Scholar]
  5. Choi, A.D.; Marques, H.; Kumar, V.; Griffin, W.F.; Rahban, H.; Karlsberg, R.P.; Zeman, R.K.; Katz, R.J.; Earls, J.P. CT evaluation by artificial intelligence for atherosclerosis, stenosis and vascular morphology (CLARIFY): A multi-center, international study. J. Cardiovasc. Comput. Tomogr. 2021, 15, 470–476. [Google Scholar] [PubMed]
  6. Wang, B.; Jin, S.; Yan, Q.; Xu, H.; Luo, C.; Wei, L.; Zhao, W.; Hou, X.; Ma, W.; Xu, Z.; et al. AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system. Appl. Soft Comput. 2021, 98, 106897. [Google Scholar]
  7. Lee, M.H.; Zea, R.; Garrett, J.W.; Graffy, P.M.; Summers, R.M.; Pickhardt, P.J. Abdominal CT body composition thresholds using automated AI tools for predicting 10-year adverse outcomes. Radiology 2022, 306, e220574. [Google Scholar]
  8. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; Springer International Publishing: New York, NY, USA, 2015; Volume 9351, pp. 234–241. [Google Scholar] [CrossRef]
  9. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention u-net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999. [Google Scholar]
  10. Zhang, Z.; Liu, Q.; Wang, Y. Road extraction by deep residual u-net. IEEE Geosci. Remote. Sens. Lett. 2018, 15, 749–753. [Google Scholar]
  11. Cai, S.; Tian, Y.; Lui, H.; Zeng, H.; Wu, Y.; Chen, G. Dense-UNet: A novel multiphoton in vivo cellular image segmentation model based on a convolutional neural network. Quant. Imaging Med. Surg. 2020, 10, 1275–1285. [Google Scholar] [CrossRef]
  12. Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, 17–21 October 2016; Proceedings, Part II 19. Springer: New York, NY, USA, 2016; pp. 424–432. [Google Scholar]
  13. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. Transunet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306. [Google Scholar]
  14. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-unet: Unet-like pure transformer for medical image segmentation. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2022; pp. 205–218. [Google Scholar]
  15. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar] [CrossRef]
  16. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar] [CrossRef]
  17. Zreik, M.; Lessmann, N.; van Hamersvelt, R.W.; Wolterink, J.M.; Voskuil, M.; Viergever, M.A.; Leiner, T.; Išgum, I. A Recurrent CNN for Automatic Detection and Classification of Coronary Artery Plaques in Cardiac CT Angiography. IEEE Trans. Med. Imaging 2019, 38, 1588–1598. [Google Scholar] [CrossRef] [PubMed]
  18. Wolterink, J.M.; Leiner, T.; Viergever, M.A.; Išgum, I. 3D Fully Convolutional Networks for Cardiac Segmentation in CT Scans. Comput. Biol. Med. 2019, 104, 95–102. [Google Scholar] [CrossRef]
  19. Yang, X.; Zeng, G.; Liu, Y. Attention-Gated Networks for Coronary Artery Segmentation from CT Images. J. Digit. Imaging 2020, 33, 989–998. [Google Scholar]
  20. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More Features from Cheap Operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1580–1589. [Google Scholar] [CrossRef]
  21. Feng, Z.; Zhang, J.; Ran, X.; Li, D.; Zhang, C. Ghost-Unet: Multi-stage network for image deblurring via lightweight subnet learning. Vis. Comput. 2024, 41, 141–155. [Google Scholar]
  22. Tran, T.D.; Tran, N.Q.; Le, V.T.D.; Pham, H.L. Efficient ECG Classification with Light Weight Shuffle GhostNet Architecture. In Proceedings of the 2023 International Conference on Advanced Technologies for Communications (ATC), Da Nang City, Vietnam, 19–21 October 2023; pp. 421–426. [Google Scholar] [CrossRef]
  23. Sato, M.; Tamura, A.; Motoyama, S.; Kondo, S. The Impact of 3D Coronary Artery Imaging on Diagnostic Accuracy and Treatment Planning in Coronary Artery Disease. J. Cardiovasc. Comput. Tomogr. 2019, 13, 181–188. [Google Scholar] [CrossRef]
  24. Ullah, M.; Bibi, A.; Wahab, A.; Hamayun, S.; Rehman, M.U.; Khan, S.U.; Awan, U.A.; Naeem, M.; Saeed, S.; Hussain, T.; et al. Shaping the future of cardiovascular disease by 3D printing applications in stent technology and its clinical outcomes. Curr. Probl. Cardiol. 2024, 49, 102039. [Google Scholar] [CrossRef]
  25. Caton, M.T., Jr.; Wiggins, W.F.; Nunez, D. Three-dimensional cinematic rendering to optimize visualization of cerebrovascular anatomy and disease in CT angiography. J. Neuroimaging 2020, 30, 286–296. [Google Scholar]
  26. Levoy, M. Display of Surfaces from Volume Data. IEEE Comput. Graph. Appl. 1988, 8, 29–37. [Google Scholar] [CrossRef]
  27. Lichtenbelt, B.; Crane, R.; Naqvi, S. Introduction to Volume Rendering; Prentice Hall: Englewood Cliffs, NJ, USA, 1998. [Google Scholar]
  28. Valanarasu, J.M.J.; Oza, P.; Hacihaliloglu, I.; Patel, V.M. Medical transformer: Gated axial-attention for medical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Proceedings, Part I 24. Springer: Cham, Switzerland, 2021; pp. 36–46. [Google Scholar]
  29. Punn, N.S.; Agarwal, S. RCA-IUnet: A residual cross-spatial attention-guided inception U-Net model for tumor segmentation in breast ultrasound imaging. Mach. Vis. Appl. 2022, 33, 27. [Google Scholar]
  30. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. Unet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 2019, 39, 1856–1867. [Google Scholar] [PubMed]
  31. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
Figure 1. The architecture of the proposed attention mechanism.
Figure 2. The structure of the proposed method.
Figure 3. Prediction results of different networks for two input images. (a1,a2) represent the two input example images; (b1,b2) correspond to their respective ground truth; (c1–j2) show the performance on the two images under the Medical Transformer, RCA-IUnet, ResUNet, TransUNet, UNet++, DeepLabV3+, Ours, and Ours* networks, respectively.
Figure 4. (a) The mesh grid generated from prediction results; (b) result of volume-rendering.
Figure 5. Comparison between Laplacian Pyramid Interpolation and Gaussian Smoothing Interpolation. (a) represents the result after applying Laplacian Pyramid Interpolation; (b) shows the result after applying Gaussian Smoothing Interpolation.
Figure 6. An example position to observe stenosis. The stenotic region is marked with the red bounding box to highlight an area of visually significant narrowing.
Table 1. Performance comparison of different models on the CT dataset.

Method                     Parameters (M)   Inference Speed (ms)   F1 Score   IoU
Medical Transformer [28]   10.6             12.82                  0.882      0.788
RCA-IUnet [29]             7.3              8.37                   0.783      0.643
TransUNet [13]             98.9             114.74                 0.829      0.708
ResUNet [10]               56.2             54.31                  0.723      0.566
UNet++ [30]                63.9             73.32                  0.828      0.707
DeepLabV3+ [31]            31.5             36.10                  0.841      0.726
Ours                       11.0             14.05                  0.897      0.814
Ours *                     5.5              7.03                   0.887      0.797

* With GhostNet.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
