1. Introduction
Coronary artery disease (CAD) is characterized by plaque build-up in the coronary arteries, which can restrict blood flow, impair cardiac function, and ultimately lead to the development of heart failure [
1]. Non-obstructive CAD can often be managed with medication alone, whereas an obstructive lesion, in which a coronary artery is blocked by ≥70%, may require further treatment. In clinical practice, when a lesion visualized on invasive coronary angiography is indeterminate, the functional significance must be carefully evaluated to determine the appropriate management strategy. Fractional flow reserve (FFR) is a widely used clinical index for assessing the physiological impact of coronary artery stenosis. It is defined as the ratio of maximal blood pressure distal to a stenotic lesion to the maximal aortic pressure under conditions of induced hyperemia. FFR provides a quantitative measure of the extent to which coronary narrowing limits blood flow to the myocardium. An FFR value ≤ 0.80 indicates hemodynamically significant stenosis and reduced myocardial function [
2], in which case revascularization procedures such as percutaneous coronary intervention (PCI) or coronary artery bypass grafting (CABG) are generally recommended [
3].
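As a minimal sketch of the definition above, the following Python snippet computes FFR from hyperemic distal and aortic pressures and applies the 0.80 cut-off. The function names and the example pressures are illustrative assumptions, not values from this study.

```python
def fractional_flow_reserve(p_distal_mmhg: float, p_aortic_mmhg: float) -> float:
    """Ratio of distal to aortic pressure, both taken under hyperemia."""
    if p_aortic_mmhg <= 0:
        raise ValueError("aortic pressure must be positive")
    return p_distal_mmhg / p_aortic_mmhg


def is_hemodynamically_significant(ffr: float, threshold: float = 0.80) -> bool:
    """An FFR at or below the threshold is treated as significant."""
    return ffr <= threshold


# Hypothetical pressures (mmHg), chosen only for illustration.
ffr = fractional_flow_reserve(p_distal_mmhg=72.0, p_aortic_mmhg=96.0)
print(f"FFR = {ffr:.2f}, significant = {is_hemodynamically_significant(ffr)}")
```

Because the index is a ratio, only the relative pressure drop across the lesion matters, which is why accurate pressure prediction on both sides of a stenosis is sufficient to estimate it.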
Clinically, FFR is measured invasively using a pressure-sensing guidewire that is advanced across the lesion during coronary angiography. This wire-based FFR measurement is considered the gold standard for determining whether a stenosis is hemodynamically significant. Despite its clinical reliability, wire-based FFR is invasive [
4], time-consuming, and associated with procedural risks and additional costs, as 27.4% of device failures result in patient adverse events [
5]. As a result, computational fluid dynamics (CFD)-based blood pressure prediction has emerged as a non-invasive alternative for estimating FFR. By simulating coronary blood flow and pressure proximal and distal to suspected stenotic segments, CFD-based approaches enable preliminary functional assessment without the need for a pressure wire. Therefore, accurate blood pressure prediction is a crucial step toward reliable non-invasive FFR estimation and early identification of CAD.
In recent years, non-invasive FFR estimation has been developed using computational modeling techniques [
6]. These approaches do not measure pressure directly but instead compute FFR from anatomical data derived from coronary imaging; fractional flow reserve computed tomography ($\mathrm{FFR}_{\mathrm{CT}}$) is one such method that is employed in clinical practice [
7]. Invasive coronary angiography (ICA), while highly accurate, is invasive and involves an arteriotomy through which a guiding catheter is maneuvered through the vasculature to the aortic root, posing considerable procedural risk [
8]. In contrast, coronary computed tomography angiography (CCTA) provides three-dimensional anatomical imaging of the coronary arteries with lower risk, which requires the injection of a contrast agent before a CT scan [
7,
9].
$\mathrm{FFR}_{\mathrm{CT}}$ models the coronary artery tree based on CT images and uses CFD to simulate blood flow. Cardiologists can then visualize these simulations to assess the functional significance of stenoses and inform treatment planning, ranging from medication for low-risk cases to surgical intervention for high-risk patients.
Despite its clinical utility, $\mathrm{FFR}_{\mathrm{CT}}$ relies on complex CFD simulations that involve multiple preprocessing and modeling steps, including image loading, segmentation, geometry reconstruction, meshing, boundary condition assignment, and numerical simulation. In addition, transient CFD analysis requires solving complex nonlinear equations with millions of degrees of freedom, typically resulting in computational times of 12–24 h per case [10], which limits its use in emergency settings and rapid clinical decision-making. Commercial solutions such as HeartFlow [
11,
12] and ArteryFlow [
13] demonstrate the feasibility of CCTA-based functional assessment but remain constrained by the inherent computational burden of CFD workflows.
To address these limitations, data-driven approaches based on deep learning have been explored for direct hemodynamic prediction from imaging data. Conventional regression architectures, including ResNet-based feature extractors [
14], multilayer perceptrons (MLPs) [
15], and recurrent models such as LSTMs and Bi-LSTMs [
16], have been applied to continuous prediction tasks but may suffer from limited robustness when modeling complex spatial pressure distributions. More recently, diffusion models [
17], initially developed for image synthesis, have demonstrated potential in medical imaging applications, including conditional CT generation [
18]. Compared with approaches such as Classification and Regression Diffusion (CARD) [
19], which primarily focus on uncertainty estimation, diffusion-head-based frameworks offer a structured refinement mechanism for discrete and continuous-value prediction. These emerging methods provide a scalable and computationally efficient alternative to simulation-based strategies and may facilitate broader clinical adoption of non-invasive functional assessment from CCTA.
Importantly, FFR is fundamentally defined as a pressure ratio across a coronary stenosis under hyperemic conditions. Therefore, accurate estimation of coronary blood pressure is the key step in FFR computation. Motivated by this observation, this study proposes a deep learning-based framework that leverages CCTA to directly predict coronary pressure distribution, thereby enabling efficient FFR estimation without performing full CFD simulations. Performing CFD simulations from CCTA typically involves multiple steps: (a) loading CT images, (b) loading the 3D artery segmentation, (c) clipping the artery ends, (d) modeling the clipped artery, (e) meshing the model, (f) setting simulation parameters, (g) running the blood flow simulation, and (h) processing the simulation results, as illustrated in
Figure 1.
The proposed approach consists of two main components: an automated data pipelining framework and a novel deep learning model designed for hemodynamic prediction. Together, these components enable non-linear estimation of CFD-derived coronary pressure distributions directly from medical imaging data. The data pipeline standardizes and streamlines the generation of datasets by automating key preprocessing steps, including coronary artery segmentation, vessel clipping, transformation of patient-specific geometries into a unified reference coordinate system and application of necessary linear mappings. This structured preprocessing ensures geometric consistency across patients and facilitates robust model training. Building on this pipeline, the proposed deep learning framework is trained to predict blood pressure throughout the coronary artery tree without requiring full CFD simulations. Since FFR is fundamentally derived from pressure ratios, accurate pressure prediction enables an efficient functional assessment of coronary stenosis. By eliminating the need for computationally intensive CFD workflows, the proposed method has the potential to reduce diagnostic time and cost, increase clinical performance, and minimize patient risk associated with invasive evaluation procedures.
2. Materials and Methods
The proposed method comprises two main parts: the Patch-Based Dataset Pipeline (PBDP) and inverted conditional diffusion (ICD) for blood flow prediction. The data pipeline simulates blood flow in the coronary arteries and extracts the centerline, the image patches, and the scalar pressure at each centerline point of every artery into a dataset. The pipeline additionally converts each volume to the Left Posterior Superior (LPS) coordinate system so that all files share a common frame, translates the volume to the center of the pressure polygon so that the patches align with the centerline, and finally converts world coordinates to local voxel coordinates.
2.1. Blood Fluid Simulation Using CCTA
The blood fluid data for coronary arteries is generated through a multi-step pipeline involving data loading, artery clipping, centerline extraction, modeling, meshing, parameter assignment, simulation, and postprocessing, as shown in
Figure 1. Each step is detailed below.
(1) Data Loading and Artery Clipping. Raw patient data, including CT scans and corresponding 3D segmentation labels, are imported into 3D Slicer [
21]. Within the Slicer interface, the Segmentation Editor is selected, and the 3D view is used to visualize the coronary artery. The target artery is then clipped to isolate the region of interest.
(2) Centerline Extraction. Using the VMTK module in Slicer, centerlines are extracted from the segmented arteries. The surface is set to the segmentation label, and new endpoints are defined. A new centerline model is generated and applied, with endpoint adjustments ensuring that endpoints are located only at the termini of the artery tree [
22].
(3) Inlet and Outlet Modification. To prepare the artery for CFD meshing, the inlet and outlet segments are clipped using VMTK’s Clip Vessel module in 3D Slicer. The newly generated centerline and endpoints are used to define clipping locations. Caps are optionally generated for the inlet and outlet surfaces; however, final capping can be deferred to the modeling step for precise boundary treatment.
(4) Modeling. Using SimVascular, the clipped artery is imported as a model. Face extraction is performed with a separation angle of , and the global representation is reinitialized. Surfaces not corresponding to anatomical structures are removed, holes are filled, and caps are labeled according to their anatomical location (e.g., left anterior descending, LAD).
(5) Meshing. A new mesh is generated for the imported model. Mesh size is estimated, and the mesher is executed. Successful meshing requires properly clipped inlets and outlets; if meshing fails, the clipping and capping procedures are adjusted. The model can then be re-meshed as necessary to ensure mesh quality. The completed meshes had an average of 62,088 nodes, 321,323 elements, 65,734 edges, and 43,822 faces.
(6) Blood Flow Simulation. The parameters listed below are assigned, and the simulation is then executed in SimVascular using the configured mesh. Parallel computing with MPI is enabled to leverage available computational resources (10 processes in this setup). The simulation generates time-resolved hemodynamic data across 150 timesteps × 0.001 s = 0.15 s of a 0.8 s cardiac cycle. Because simulating a full cardiac cycle is time-consuming, only 12% of a cardiac cycle’s timesteps were simulated. After the simulation, visualization and analysis are conducted in ParaView [
23], enabling playback of the entire simulation. Average pressures are extracted using the averaged mmHg field for downstream analysis. The average pressure distributions for datasets CCTA1000 and CCTA36 (introduced in
Section 2.7) are shown in
Figure 2.
The initial pressure was set to 133,300 Pa, which yielded an output of around 100 mmHg in the reported results. The value 13,330 Pa directly converts to 100 mmHg; however, using it did not yield the expected results. It is possible that SimVascular internally scales the average pressure in mmHg down by a factor of 10. To avoid confusion, we use 133,300 Pa and, during training, take the “average pressure mmHg” property from the simulation file as the ground truth. For boundary conditions, every outlet cap (that is, every cap except the inlet) is assigned a resistance of 1333 Pa·s/cm³. The vessel wall properties are defined as follows: wall thickness of 0.2 mm, elastic modulus of
Pa, density of 0.8 g/cm³, and wall pressure of 133,300 Pa [
24]. The remaining solver parameters are summarized in
Table 1 [
25]. SimVascular uses laminar flow modeling with no turbulence model. The blood viscosity is 0.04 g/(cm·s) and the blood density is 1.06 g/cm³ [
26].
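The unit relationships discussed in this subsection can be checked with a short sketch. The conversion constants are standard; the dyn/cm² interpretation is only an assumption about the solver's unit system that the text does not confirm, and it is included because exactly a factor of 10 separates Pa from dyn/cm².

```python
PA_PER_MMHG = 133.322     # 1 mmHg expressed in pascals (conventional value)
DYN_CM2_PER_PA = 10.0     # 1 Pa = 10 dyn/cm^2


def pa_to_mmhg(p_pa: float) -> float:
    """Convert a pressure in pascals to mmHg."""
    return p_pa / PA_PER_MMHG


def dyn_cm2_to_mmhg(p_dyn_cm2: float) -> float:
    """Convert a pressure in dyn/cm^2 (CGS) to mmHg."""
    return pa_to_mmhg(p_dyn_cm2 / DYN_CM2_PER_PA)


print(round(pa_to_mmhg(13_330), 2))         # 13,330 Pa in mmHg
print(round(dyn_cm2_to_mmhg(133_300), 2))   # 133,300 dyn/cm^2 in mmHg
```

Under that assumption, an input of 133,300 read as dyn/cm² and an input of 13,330 read as Pa describe the same physical pressure, which would be consistent with the factor of 10 observed above.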
2.2. Patch-Based Data Preparation
The blood flow simulation generates pressure values on every face of the 3D artery mesh. However, FFR evaluation does not require pressure information at every point on the arterial wall; it depends only on the pressure along the artery centerline, which mimics the clinical procedure of inserting a pressure wire invasively to measure blood pressure at the center of the artery. Therefore, instead of using the full volumetric pressure field, only the average pressures along the centerline are extracted. Local 3D image patches are simultaneously extracted around centerline points. These patches capture imaging features of the surrounding vessel and are used as inputs to predict the corresponding centerline pressure, thereby linking anatomical imaging to hemodynamic function.
Furthermore, this strategy reduces data dimensionality, focuses on clinically relevant locations, and aligns the deep learning framework with real-world blood pressure measurement procedures.
For each point $\mathbf{p}_i^n$ on the centerline, where $i$ denotes the index of the centerline point in artery $n$, with volume origin $\mathbf{o}$ and voxel spacing $\mathbf{s}$, a voxel patch is extracted and centered at the point coordinates. The average pressure within a radius $r$ around the centerline point is calculated and assigned as the label for the patch.
All coordinates are converted to local voxel coordinates relative to the artery centerline to ensure spatial consistency. Mathematically, the local voxel indices $\mathbf{v}_i^n$ are computed as in Equation (1).
$$\mathbf{v}_i^n = \left(\mathbf{p}_i^n - \mathbf{o}\right) \odot \mathbf{s} \tag{1}$$
where ⊙ denotes element-wise division.
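The world-to-voxel conversion described above can be sketched in a few lines of Python. Rounding to the nearest voxel, the patch half-size of 8 voxels, and the example origin and spacing are assumptions of this sketch; the text does not specify them.

```python
from typing import Sequence, Tuple


def world_to_voxel(point: Sequence[float],
                   origin: Sequence[float],
                   spacing: Sequence[float]) -> Tuple[int, ...]:
    """Map a world-space point (mm) to voxel indices.

    Computes (point - origin) / spacing element-wise and rounds to the
    nearest integer index (the rounding rule is an assumption).
    """
    return tuple(int(round((p - o) / s)) for p, o, s in zip(point, origin, spacing))


def patch_bounds(center: Sequence[int], half_size: int) -> Tuple[Tuple[int, int], ...]:
    """Half-open index range [start, stop) per axis for a cubic patch."""
    return tuple((c - half_size, c + half_size + 1) for c in center)


# Hypothetical geometry, for illustration only.
voxel = world_to_voxel((10.0, 20.0, 30.0), (-100.0, -100.0, -50.0), (0.4, 0.4, 0.5))
bounds = patch_bounds(voxel, half_size=8)
print(voxel, bounds)
```

In practice the resulting ranges would also need to be clipped to the volume dimensions before slicing, which is omitted here for brevity.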
A prerequisite for this computation is that volumes, centerlines, and pressure data are aligned within the same coordinate system to avoid spatial mismatch. This is ensured through a volume alignment procedure. Specifically, when a volume in the Right Anterior Superior (RAS) coordinate system is converted to a Left Posterior Superior (LPS) compliant format, the x and y coordinates are sign-flipped. After the conversion, the volume loses all spatial coordination. To align the converted volume, it must be translated from its new LPS coordinate center to the artery tree’s center and rotated by along the x-axis if required. The volume center is computed as half of the element-wise product between the sum of the origin and spacing and the volume dimensions.
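The RAS-to-LPS conversion and the subsequent re-centering can be illustrated as follows. This is a sketch of the sign flip and translation only; the helper names and the example centers are hypothetical, and the optional rotation about the x-axis is not shown.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ras_to_lps(point: Sequence[float]) -> Point:
    """Flip the signs of x and y; z (superior) is shared by both systems."""
    x, y, z = point
    return (-x, -y, z)


def translate(point: Sequence[float],
              src_center: Sequence[float],
              dst_center: Sequence[float]) -> Point:
    """Shift a point so that src_center is moved onto dst_center."""
    return tuple(p - s + d for p, s, d in zip(point, src_center, dst_center))


# Hypothetical coordinates (mm), for illustration only.
p_lps = ras_to_lps((12.0, -30.0, 45.0))
p_aligned = translate(p_lps, src_center=(0.0, 0.0, 0.0), dst_center=(5.0, 5.0, 0.0))
print(p_lps, p_aligned)
```

Applying the sign flip twice returns the original point, so the same function also converts LPS back to RAS.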
For each artery, the algorithm iterates along the centerline, extracting a 3D image patch centered at every point to capture local anatomical features. The corresponding pressure values are averaged within a small neighborhood around each centerline point to serve as the training label. All spatial coordinates are transformed into a local reference frame relative to the artery centerline, ensuring spatial consistency across arteries and patients.
This procedure systematically constructs a dataset of paired imaging patches and centerline pressure values, enabling the model to learn the mapping from anatomical structure to hemodynamic function. Consequently, the trained model can simulate blood pressure distributions and support non-invasive FFR estimation from coronary CT imaging.
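A minimal sketch of the label construction is given below: for each centerline point, the pressures of nearby mesh points are averaged to form the regression target. The data layout (lists of coordinate tuples), the dictionary keys, and the toy values are assumptions made for illustration; image-patch extraction is omitted.

```python
import math
from typing import Dict, List, Optional, Sequence


def mean_pressure_near(center: Sequence[float],
                       mesh_points: List[Sequence[float]],
                       mesh_pressures: List[float],
                       radius: float) -> Optional[float]:
    """Average the pressures of all mesh points within `radius` of `center`."""
    selected = [p for pt, p in zip(mesh_points, mesh_pressures)
                if math.dist(center, pt) <= radius]
    return sum(selected) / len(selected) if selected else None


def build_samples(centerline: List[Sequence[float]],
                  mesh_points: List[Sequence[float]],
                  mesh_pressures: List[float],
                  radius: float) -> List[Dict]:
    """Pair every centerline point with its averaged pressure label."""
    samples = []
    for idx, center in enumerate(centerline):
        label = mean_pressure_near(center, mesh_points, mesh_pressures, radius)
        if label is not None:  # skip points with no mesh data nearby
            samples.append({"index": idx, "center": tuple(center), "pressure_mmhg": label})
    return samples


# Toy data, for illustration only.
centerline = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
mesh_points = [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0), (10.0, 1.0, 0.0), (50.0, 0.0, 0.0)]
mesh_pressures = [100.0, 98.0, 90.0, 10.0]
samples = build_samples(centerline, mesh_points, mesh_pressures, radius=2.0)
print(samples)
```

A brute-force distance search is used for clarity; for meshes with tens of thousands of nodes, a spatial index would be the more practical choice.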
2.3. Inverted Conditional Diffusion
The proposed ICD model is a modification of traditional conditional diffusion frameworks, in which the roles of input and label are repurposed. Unlike standard diffusion models that generate images conditioned on labels, ICD treats the labels (i.e., pressure values) as inputs and the anatomical features (image patches) as conditioning variables, effectively inverting the diffusion paradigm. This formulation enables the network to regress blood pressure values from local imaging features in a manner that conventional encoder-only networks cannot replicate.
2.3.1. Forward Diffusion Process
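The forward Markov chain is parameterized by a predefined variance schedule $\{\beta_t\}_{t=1}^{T}$, where $\beta_t$ denotes the noise variance at diffusion step $t$, and $\beta_t \in (0, 1)$. Let $y_0$ denote the simulated CFD blood pressure ground truth, and $y_{1:T}$ denote the sequence of latent variables. The forward process is defined as in Equations (2) and (3).
$$q(y_{1:T} \mid y_0) = \prod_{t=1}^{T} q(y_t \mid y_{t-1}) \tag{2}$$
$$q(y_t \mid y_{t-1}) = \mathcal{N}\!\left(y_t;\ \sqrt{1-\beta_t}\, y_{t-1},\ \beta_t \mathbf{I}\right) \tag{3}$$
After $T$ diffusion steps, the latent variable $y_T$ approaches an isotropic Gaussian distribution [26], as shown in Equation (4):
$$q(y_T \mid y_0) \approx \mathcal{N}\!\left(y_T;\ \mathbf{0},\ \mathbf{I}\right) \tag{4}$$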
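The forward process can be illustrated with a short sketch that uses the standard closed form $q(y_t \mid y_0) = \mathcal{N}(\sqrt{\bar{\alpha}_t}\, y_0,\ (1-\bar{\alpha}_t)\mathbf{I})$, with $\bar{\alpha}_t = \prod_{s \le t} (1-\beta_s)$. The linear schedule, its endpoints, $T = 1000$, and the normalized pressure value are assumptions of this sketch rather than settings reported here.

```python
import math
import random
from typing import List, Tuple


def linear_beta_schedule(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + step * t for t in range(T)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(y0: float, t: int, alpha_bars: List[float], rng: random.Random) -> Tuple[float, float]:
    """Draw y_t ~ q(y_t | y_0) in one step; t is 1-indexed. Returns (y_t, eps)."""
    a_bar = alpha_bars[t - 1]
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(a_bar) * y0 + math.sqrt(1.0 - a_bar) * eps, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
y_mid, _ = q_sample(0.5, 500, alpha_bars, rng)   # partially noised label
y_end, _ = q_sample(0.5, 1000, alpha_bars, rng)  # almost pure Gaussian noise
print(alpha_bars[-1], y_mid, y_end)
```

Because $\bar{\alpha}_T$ is close to zero under this schedule, the contribution of $y_0$ to $y_T$ is negligible, which is the behavior summarized by Equation (4).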
2.3.2. Reverse Diffusion Process
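The reverse process learns a parameterized conditional distribution $p_\theta(y_{t-1} \mid y_t, c)$, where $c$ denotes the conditioning variable formed by concatenating imaging features and relative centerline coordinates. The model iteratively denoises $y_T$ back to the original pressure value $y_0$.
The reverse transitions are modeled as Gaussian distributions with learned mean $\mu_\theta$ and variance $\sigma_t^2$ following the DDPM sampling formulation [27], as shown in Equation (5).
$$p_\theta(y_{t-1} \mid y_t, c) = \mathcal{N}\!\left(y_{t-1};\ \mu_\theta(y_t, t, c),\ \sigma_t^2 \mathbf{I}\right) \tag{5}$$
The variance and mean are defined as in Equations (6) and (7).
$$\sigma_t^2 = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\, \beta_t \tag{6}$$
$$\mu_\theta(y_t, t, c) = \frac{1}{\sqrt{\alpha_t}} \left( y_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, \epsilon_\theta(y_t, t, c) \right) \tag{7}$$
where $\alpha_t = 1 - \beta_t$, $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$, and $\epsilon_\theta$ is a neural network trained to predict the added noise, as in [28].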
Through iterative denoising, the model effectively reconstructs pressure values from imaging features by reversing the forward noise process.
2.3.3. Conditioning and Sampling
Conditioning is implemented by concatenating the image patch features and relative centerline coordinates into the denoising network. During inference, the model initializes from Gaussian noise and iteratively applies the learned reverse transitions to recover predicted pressure values at centerline locations.
This strategy enables ICD to construct a latent representation from the pressure labels and propagate backward through the diffusion chain, allowing accurate regression even under complex non-linear anatomical–hemodynamic relationships. The complete ICD procedure is summarized in Algorithm 1.
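| Algorithm 1 Proposed inverted conditional diffusion (ICD). |
- 1: Input: sample $y_0$, conditioning variables $c$, noise schedule $\{\beta_t\}_{t=1}^{T}$
- 2: Define $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$
- 3: // Forward Process
- 4: for $t = 1$ to $T$ do
- 5: Sample $\epsilon_t \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$
- 6: $y_t = \sqrt{\alpha_t}\, y_{t-1} + \sqrt{1-\alpha_t}\, \epsilon_t$
- 7: end for
- 8: // Reverse Process
- 9: Train conditional network $\epsilon_\theta(y_t, t, c)$ to predict $\epsilon_t$
- 10: for $t = T$ down to 1 do
- 11: Sample $z \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ if $t > 1$, else $z = \mathbf{0}$
- 12: $y_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( y_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\, \epsilon_\theta(y_t, t, c) \right) + \sigma_t z$
- 13: end for
- 14: Ensure: Denoised sample $\hat{y}_0$
|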
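The reverse loop of Algorithm 1 can be exercised with a toy example. In the sketch below, the trained network is replaced by an oracle noise predictor that reads the target value from the conditioning dictionary; this stand-in, the 50-step linear schedule, the posterior-variance choice for $\sigma_t$, and all names are assumptions made purely so that the loop is runnable. In ICD, $c$ would hold image and coordinate features and $\epsilon_\theta$ would be learned.

```python
import math
import random
from typing import Callable, Dict, List


def make_alpha_bars(betas: List[float]) -> List[float]:
    """alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def make_oracle(alpha_bars: List[float]) -> Callable[[float, int, Dict], float]:
    """Stand-in for eps_theta: returns the exact noise implied by c['target']."""
    def eps_model(y_t: float, t: int, c: Dict) -> float:
        a_bar = alpha_bars[t - 1]
        return (y_t - math.sqrt(a_bar) * c["target"]) / math.sqrt(1.0 - a_bar)
    return eps_model


def sample(c: Dict, eps_model: Callable, betas: List[float], rng: random.Random) -> float:
    """DDPM-style ancestral sampling of a scalar, conditioned on c."""
    alpha_bars = make_alpha_bars(betas)
    y = rng.gauss(0.0, 1.0)                      # start from Gaussian noise
    for t in range(len(betas), 0, -1):
        beta_t = betas[t - 1]
        a_t, ab_t = 1.0 - beta_t, alpha_bars[t - 1]
        ab_prev = alpha_bars[t - 2] if t > 1 else 1.0
        z = rng.gauss(0.0, 1.0) if t > 1 else 0.0
        eps = eps_model(y, t, c)
        mean = (y - beta_t / math.sqrt(1.0 - ab_t) * eps) / math.sqrt(a_t)
        sigma = math.sqrt((1.0 - ab_prev) / (1.0 - ab_t) * beta_t)
        y = mean + sigma * z
    return y


T = 50
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
eps_model = make_oracle(make_alpha_bars(betas))
y_hat = sample({"target": 0.7}, eps_model, betas, random.Random(0))
print(y_hat)
```

With a perfect noise predictor, the final step at $t = 1$ adds no noise and returns the target exactly, so any error in practice comes from the learned network rather than from the sampler.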
2.4. Architecture of ICD
The architecture of the proposed ICD is shown in
Figure 3. The ICD model is designed to predict blood pressure at each coronary artery centerline point by integrating local anatomical information with geometric positional cues.
To achieve this, the framework begins with a 3D convolutional encoder that processes a volumetric CCTA patch centered around the vessel lumen. This encoder consists of sequential 3D convolution, batch normalization, ReLU activation, and max-pooling layers, progressively extracting multi-scale spatial features. After the final convolutional block, the resulting feature map is flattened and projected into a compact 128-dimensional embedding that captures local vessel morphology, lumen intensity patterns, plaque burden, and surrounding tissue characteristics.
In parallel, each centerline coordinate is encoded through a lightweight linear mapping to produce a 3-dimensional geometric embedding. This embedding provides the model with structural context, enabling it to learn how blood pressure naturally varies along the artery according to vessel curvature, branch location, and proximity to stenotic regions. The anatomical and geometric embeddings are concatenated to form a unified 288-dimensional representation describing the imaging context and spatial identity of each target point. This fused representation serves as conditioning input for the diffusion process.
The diffusion module forms the core of the ICD model. Unlike conventional diffusion models used for image generation, our formulation inverts the process to operate directly in the regression domain. During training, Gaussian noise is progressively added to the ground truth pressure values, forming a forward diffusion trajectory. The model then learns the reverse denoising trajectory, where a neural network—conditioned on the fused anatomical–geometric representation—iteratively predicts a cleaner pressure estimate at each diffusion timestep. Two fully connected layers with ReLU activation serve as the denoising backbone, and a final linear layer outputs a single pressure value representing the blood pressure prediction at that centerline point. The ICD design eliminates sequential dependencies found in recurrent models and yields smooth, physically consistent pressure predictions that align more closely with coronary physiology.
2.5. Loss Function and Optimization
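Huber loss is employed to optimize the proposed model. Huber loss has been reported to outperform other loss functions in time-series analysis by combining MSE and MAE, applying the quadratic MSE form to small errors and the linear MAE form to large errors, which provides more stable results [29]. The Huber loss function is defined as follows, where $\delta$ is a threshold parameter, $y$ is the ground truth value, and $\hat{y}$ is the predicted value, as shown in Equation (8).
$$L_\delta(y, \hat{y}) = \begin{cases} \frac{1}{2}\,(y - \hat{y})^2, & |y - \hat{y}| \le \delta \\ \delta \left( |y - \hat{y}| - \frac{1}{2}\,\delta \right), & \text{otherwise} \end{cases} \tag{8}$$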
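The piecewise behavior of the Huber loss can be made concrete with a small Python sketch. The default threshold of 1.0 and the example values are assumptions; the threshold used for training is not stated in this section.

```python
from typing import Sequence


def huber_loss(y_true: float, y_pred: float, delta: float = 1.0) -> float:
    """Quadratic for residuals up to delta, linear beyond it."""
    residual = abs(y_true - y_pred)
    if residual <= delta:
        return 0.5 * residual * residual
    return delta * (residual - 0.5 * delta)


def mean_huber_loss(y_true: Sequence[float], y_pred: Sequence[float], delta: float = 1.0) -> float:
    """Average Huber loss over a batch of predictions."""
    return sum(huber_loss(t, p, delta) for t, p in zip(y_true, y_pred)) / len(y_true)


small = huber_loss(0.0, 0.5)   # inside the quadratic region
large = huber_loss(0.0, 3.0)   # inside the linear region
print(small, large, mean_huber_loss([0.0, 0.0], [0.5, 3.0]))
```

The two branches agree at a residual of exactly $\delta$, so the loss and its gradient are continuous, which is what limits the influence of outlying pressure values during training.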
The optimizer used is Adam with decoupled weight decay (AdamW). Decoupling the weight decay from the gradient-based update allows the learning rate and the weight decay to be tuned largely independently, which reduces hyperparameter tuning effort, speeds up convergence, and mitigates overfitting.
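To make the idea of decoupled weight decay concrete, the sketch below implements a single scalar update in plain Python. The hyperparameter values are common defaults used here as assumptions; the settings used in this study are not given in this section.

```python
import math
from typing import Tuple


def adamw_step(theta: float, grad: float, m: float, v: float, t: int,
               lr: float = 1e-3, beta1: float = 0.9, beta2: float = 0.999,
               eps: float = 1e-8, weight_decay: float = 1e-2) -> Tuple[float, float, float]:
    """One AdamW update for a scalar parameter; t is the 1-indexed step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * weight_decay * theta     # decay applied directly to the weight
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# With a zero gradient, only the decoupled decay changes the parameter.
decayed, _, _ = adamw_step(theta=1.0, grad=0.0, m=0.0, v=0.0, t=1)
# With a non-zero gradient, the first Adam step has magnitude close to lr.
stepped, _, _ = adamw_step(theta=5.0, grad=10.0, m=0.0, v=0.0, t=1)
print(decayed, stepped)
```

The decay term never passes through the adaptive scaling by the second moment, which is the difference from adding an L2 penalty to the gradient.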
2.6. Evaluation Metrics
The evaluation metrics used in this study include the $R^2$ score, Pearson correlation coefficient (PCC), normalized root mean squared error (NRMSE), and root mean squared error (RMSE).
$R^2$ Score: The coefficient of determination, which measures the proportion of variance in the dependent variable predictable from the independent variables, as shown in Equation (9).
$$R^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2} \tag{9}$$
Here, $y_i$ is the ground truth blood pressure for one patch, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the observed pressures.
Pearson Correlation Coefficient (PCC) measures the strength of the linear relationship between two continuous variables. Let $\hat{y}_i$ be the prediction, $\bar{\hat{y}}$ the mean of predictions, $y_i$ the ground truth, and $\bar{y}$ the mean of observations, as defined in Equation (10).
$$\mathrm{PCC} = \frac{\sum_{i=1}^{m} (\hat{y}_i - \bar{\hat{y}})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{m} (\hat{y}_i - \bar{\hat{y}})^2}\ \sqrt{\sum_{i=1}^{m} (y_i - \bar{y})^2}} \tag{10}$$
Root Mean Squared Error (RMSE) measures the average magnitude of prediction errors, as shown in Equation (11).
$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2} \tag{11}$$
Normalized Root Mean Squared Error (NRMSE) normalizes RMSE relative to the observed mean, facilitating comparison across models with different scales, as denoted in Equation (12).
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \tag{12}$$
Here, $m$ is the number of samples, and $\bar{y}$ is the mean observed value.
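The four metrics can be computed directly from their definitions, as in the following sketch. The function names and the toy pressure values (in mmHg) are illustrative assumptions and are not results from this study.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = _mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_cc(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Linear correlation between predictions and ground truth."""
    t_bar, p_bar = _mean(y_true), _mean(y_pred)
    cov = sum((p - p_bar) * (t - t_bar) for t, p in zip(y_true, y_pred))
    norm_p = math.sqrt(sum((p - p_bar) ** 2 for p in y_pred))
    norm_t = math.sqrt(sum((t - t_bar) ** 2 for t in y_true))
    return cov / (norm_p * norm_t)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(_mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def nrmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """RMSE divided by the mean of the observed values."""
    return rmse(y_true, y_pred) / _mean(y_true)


# Toy values, for illustration only.
y_true = [90.0, 95.0, 100.0, 105.0]
y_pred = [91.0, 94.0, 101.0, 104.0]
print(r2_score(y_true, y_pred), pearson_cc(y_true, y_pred),
      rmse(y_true, y_pred), nrmse(y_true, y_pred))
```

Note that $R^2$ and PCC can disagree: a prediction with a constant offset keeps a PCC of 1 while its $R^2$ drops, which is why both are reported.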
2.7. Enrolled Datasets
Two datasets were used in this study: CCTA1000 and CCTA36. CCTA1000 corresponds to the publicly available ImageCAS dataset [
30], which contains high-quality coronary CT angiography (CCTA) scans with expertly annotated coronary artery segmentations. CCTA36 is a private dataset [31] and includes patients with invasive blood pressure measurements.
The CCTA1000 dataset consists of 3D CTA images captured by a Siemens 128-slice dual-source scanner from 1000 patients. For patients who had previously been diagnosed with coronary artery disease, early revascularization within 90 days afterward is included. High-dose CTA was performed, and during reconstruction, the 30–40% phase or the 60–70% phase was selected to obtain the best coronary artery images. The resulting scans have a spatial resolution of $512 \times 512 \times$ (206–275) voxels, a planar resolution of 0.29–0.43 mm², and a spacing of 0.25–0.45 mm. The data were collected from real clinical cases at the Guangdong Provincial People’s Hospital from April 2012 to December 2018. Only patients older than 18 years with a documented medical history of ischemic stroke, transient ischemic attack, or peripheral artery disease were eligible for inclusion. In total, 414 females and 586 males were included, with average ages of 59.98 and 57.68, respectively. The left and right coronary arteries in each image were independently labeled by two radiologists, and their results were cross-validated. The labeled coronary arteries include the left main coronary artery, left anterior descending coronary artery, left circumflex coronary artery, right coronary artery, diagonal 1, diagonal 2, diagonal 3, obtuse marginal branch 1, obtuse marginal branch 2, obtuse marginal branch 3, ramus intermedius, posterior descending arteries, acute marginal 1, and other blood vessels [30]. The data are provided as NIfTI volumes with labeled segmentations.
The CCTA36 dataset includes thirty-six patients with at least one coronary stenosis . These cases were retrospectively collected, and each patient underwent SPECT MPI imaging followed by invasive FFR assessment, providing ground truth functional measurements.
For experiments, a subset of each dataset was used. From CCTA1000, 40 scans were used for training, 5 for validation, and 10 for testing. From CCTA36, 10 out of 36 patients were enrolled, with 6 scans used for training, 2 for validation, and 2 for testing. CCTA1000 is considered a medium-sized dataset, whereas CCTA36 represents a small, functionally annotated dataset.
4. Discussion
The experimental results demonstrate that the proposed ResNet50-ICD model achieves superior performance on the medium-sized CCTA1000 dataset, consistently outperforming competing architectures in $R^2$, RMSE, and NRMSE, while maintaining a high Pearson correlation. The comparison across architectures highlights two important observations. First, deeper feature extraction using ResNet improves predictive stability compared with plain CNN-based encoders, suggesting that richer anatomical representation from CCTA patches is beneficial for pressure regression. Second, diffusion-based regression provides improved robustness over conventional MLP and BiLSTM regressors, likely due to its iterative refinement mechanism and reduced sensitivity to sequential ordering along the vessel centerline. In contrast, recurrent architectures such as BiLSTM may suffer from order dependency, where hidden-state propagation across branches can accumulate noise and degrade regression accuracy. These findings indicate that order-invariant modeling is advantageous for spatially structured coronary data.
From a clinical perspective, accurate and scalable pressure prediction is a critical step toward non-invasive FFR estimation. By avoiding computationally intensive CFD simulations, the proposed ICD framework substantially reduces inference complexity while preserving strong agreement with reference values. This efficiency may facilitate broader clinical deployment, particularly in scenarios requiring rapid decision-making or large-scale CCTA screening. However, performance on the small CCTA36 dataset was inconclusive, reflecting the data-intensive nature of diffusion-based models and the challenges of training high-capacity architectures with limited samples. Additional limitations include higher computational demands during training and the need for carefully standardized preprocessing pipelines. Future work will focus on improving data efficiency, incorporating multi-center datasets for better generalization, and exploring hybrid physics-informed learning strategies to further enhance robustness and clinical interpretability. In addition, fluid dynamical factors such as arterial wall shear stress, recirculation zones, and pressure drops can be incorporated as features in the dataset to improve regression accuracy and to make the CFD modeling more physiologically realistic.