Quantitative Prediction of SYNTAX Score for Cardiovascular Artery Disease Patients via the Inverse Problem Algorithm Technique as Artificial Intelligence Assessment in Diagnostics

This study explored the quantitative prediction of the SYNTAX score for cardiovascular artery disease patients using the inverse problem algorithm (IPA) technique in artificial intelligence. A 29-term semi-empirical formula was defined according to seven risk factors: (1) age, (2) mean arterial pressure, (3) body surface area, (4) pre-prandial blood glucose, (5) low-density-lipoprotein cholesterol, (6) Troponin I, and (7) C-reactive protein. The formula was then computed via the STATISTICA 7.0 program to obtain a compromised solution for a 405-patient dataset, with the loss function [actual − predicted]² as low as 3.177, where 0.0 implies a 100% match between the prediction and observation via “the lower, the better” principle. The IPA technique first created a data matrix [405 × 29] from the included patients’ data and then derived a compromised solution for the column matrix of the 29-term coefficients [29 × 1]. The correlation coefficient, r², of the regression line for the actual versus predicted SYNTAX score was 0.8958, showing a high coincidence within the dataset. The follow-up verification based on another 105 patients’ data from the same group also yielded a high correlation coefficient of r² = 0.8304. Nevertheless, the verified group’s low derived average agreement (AT) (ATavg = 0.308 ± 0.193) also revealed a slight deviation between the theoretical prediction from the STATISTICA 7.0 program and the grades assigned by clinical cardiologists or interventionists. The predicted SYNTAX scores were compared with earlier reported findings based on a single-factor statistical analysis or scanned images obtained by sonography or cardiac catheterization. Cardiologists can obtain the SYNTAX score from the semi-empirical formula for an instant referral before performing a cardiac examination.


Introduction
The quantitative prediction of the SYNTAX score for cardiovascular artery disease patients using the inverse problem algorithm technique as an artificial intelligence assessment in clinical diagnostics was evaluated in this study. The SYNTAX score is an angiographic tool to help cardiologists, interventionists, and surgeons to grade the complexity of coronary artery lesions. A higher SYNTAX score indicates a more complex condition and a worse prognosis in patients undergoing contemporary revascularization [1]. The symptoms of tiredness, wheezing, and swelling that often occur in clinical cardiovascular artery disease (CAD) are often mistaken for normal aging, so up to 90% of patients are unable to detect cardiovascular-artery-related symptoms at an early stage and often miss the golden treatment time [2]. Heart disease featured among the top ten causes of death worldwide from 2000 to 2019. More than 70,000 patients are hospitalized due to cardiovascular artery disease in Taiwan every year, according to WHO statistics [3]. Statistics from the Heart Failure Registration Program released in 2017 show that up to 32.3% of patients will be hospitalized again within six months, and the mortality rate within five years of being diagnosed with cardiovascular artery disease is nearly 50% [4]. Cardiovascular artery disease is a significant disease that should not be neglected. Moreover, patients suffering from cardiovascular artery disease usually deteriorate without proper treatment in advance.
Many researchers have noticed this crucial problem and proposed preliminary predictions of the SYNTAX score to prevent cardiovascular artery disease in advance. For instance, Akboga et al. claimed that SYNTAX had a significant correlation with the ratio of monocytes to high-density lipoprotein cholesterol from the observation of 1229 patients [5]. Ikeda et al. found a significant correlation between carotid intima-media thickness and SYNTAX from 370 consecutive patients [6] and between carotid artery intima-media thickness and the plaque score from 501 consecutive patients [7]. Ikeda et al. also revealed that carotid artery ultrasound imaging and the ankle-brachial index could reasonably predict the severity of SYNTAX from 496 patient cases [8]. Rahmani et al. investigated the correlation between the Global Registry for Acute Coronary Events (GRACE) and SYNTAX for the risk stratification of 330 patients, although the regression correlation coefficient was as low as 0.116 [9]. However, properly grading the SYNTAX score in clinical diagnosis is quite problematic. As depicted in Figure 1, two scenarios illustrate SYNTAX score grading from a cardiac X-ray examination. Every lesion was graded according to its size or calcification with different weighted factors, and the two scenarios eventually received scores of 47 (high) and 10 (low), respectively [1,2]. In contrast, seven essential factors (1. age; 2. mean arterial pressure; 3. body surface area; 4. pre-prandial blood glucose; 5. low-density-lipoprotein cholesterol; 6. Troponin I; and 7. C-reactive protein) were adopted as risk factors to predict the SYNTAX score using an inverse problem algorithm in this study. In doing so, 29 customized terms of a first-order nonlinear semi-empirical formula were derived via the STATISTICA 7.0 software to perform the analysis and provide reliable results on both numerical coincidence and clinical verification.
A related discussion concerning the IPA technique or SYNTAX prediction is also included. A comparison of various forecasts of the SYNTAX score was also performed.

Figure 1. Two scenarios showing how the SYNTAX score was graded in a cardiac examination. The SYNTAX score was graded as 47 (high) and 10 (low) for scenarios 1 and 2, respectively.

Basics of the Inverse Problem Algorithm
In the first-order linear equation y = βx, y is the expected value, while the sensitivity of y to x is reflected by β. If y = y [405 × 1] is the expected value, also referred to as the actual SYNTAX score, which correlates with the 29-term coefficients, M [29 × 1], then the respective correlation equation takes the following form:

$$y_{[405 \times 1]} = V_{[405 \times 29]} \cdot M_{[29 \times 1]} \quad (1)$$

If Φ is the standard loss function, then

$$\Phi = (y - V \cdot M)^{T} \cdot (y - V \cdot M) \quad (2)$$

$$\Phi = y^{T} y - 2 M^{T} V^{T} y + M^{T} V^{T} V M \quad (3)$$

$$\frac{\partial \Phi}{\partial M} = -2 V^{T} y + 2 V^{T} V M = 0 \quad (4)$$

$$V^{T} V \cdot M = V^{T} y \quad (5)$$

$$M = (V^{T} V)^{-1} \cdot V^{T} y \quad (6)$$

where V and V^T are the direct and transpose dataset matrices of the risk factors and cross-interactions between two factors [405 × 29]. For calculating the extreme values of the proposed function, according to the first-order necessary condition for an extremum, Equation (4) implies that the first-order total differential of the loss function Φ has a zero value. Then, the particular inverse matrix (V^T·V)^{-1} (cf. Equations (2)-(6)) is used to derive the column matrix of the 29-term coefficients M [10]. The computation is performed via the STATISTICA 7.0 default program, yielding a compromised solution with the minimal loss function Φ. This solution can be further customized according to user demand. The IPA technique's most valuable feature is that, besides providing a quantitative expectation of the particular syndrome based on several biological indices, it also forecasts the potential risk to medical staff when facing patients with no significant syndrome detected. In addition, solving the inverse matrix of biological datasets satisfies the convergence of numerical analysis. The derived semi-empirical formula offers an additional suggestion for clinical imaging diagnosis from any radiological facility, such as cardiac X-ray, sonography, or CT angiography.
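The least-squares step described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the STATISTICA 7.0 procedure used in the study: the function names (`fit_coefficients`, `solve_linear_system`, `loss`) and the toy dataset are hypothetical, and the coefficients are obtained by solving V^T·V·M = V^T·y directly rather than by the Rosenbrock and quasi-Newton search reported later.

```python
def transpose(matrix):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*matrix)]


def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("V^T·V is singular; the terms are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_coefficients(V, y):
    """Return M minimizing sum((y - V·M)^2) by solving V^T·V·M = V^T·y."""
    Vt = transpose(V)
    VtV = [[sum(p * q for p, q in zip(ri, rj)) for rj in Vt] for ri in Vt]
    Vty = [sum(p * q for p, q in zip(ri, y)) for ri in Vt]
    return solve_linear_system(VtV, Vty)


def loss(V, y, M):
    """Sum of squared differences between actual and predicted values."""
    return sum((yi - sum(v * m for v, m in zip(row, M))) ** 2
               for row, yi in zip(V, y))


# Hypothetical toy dataset: 5 "patients", 2 normalized factors plus a constant.
V = [[-1.0, 0.5, 1.0], [0.0, -0.5, 1.0], [0.5, 1.0, 1.0],
     [1.0, 0.0, 1.0], [-0.5, -1.0, 1.0]]
true_M = [3.0, -2.0, 1.0]
y = [sum(v * m for v, m in zip(row, true_M)) for row in V]
M_fit = fit_coefficients(V, y)
```

With noise-free toy data the fitted coefficients recover `true_M` and the loss is essentially zero; with real patient data the loss stays positive, and the "compromised solution" is the coefficient vector that makes it as small as possible.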

The IPA Flowchart
The IPA technique in artificial intelligence can be schematized by the flowchart in Figure 2. It implies that the SYNTAX score in this study (as a quantified expectation value of the particular project) should be defined first. Next, one has to preset the number of risk factors that have to be orthogonal. Then, the estimated expectation value should be verified using data from another group of patients to ensure accuracy. Any failure in verifying or checking the program outcomes (loss function, variance, or correlation coefficient) via the STATISTICA 7.0 program requires going back to the preliminary stage to redefine the risk factors or increase the number of patients' data. Otherwise, due to the limited data scope, the program may not converge to an acceptable range.

Semi-Empirical Formula Elaboration
In IPA, semi-empirical formulas contain only contributions from one factor and cross-interactions between two factors. Thus, all triple (v1 × v2 × v3, or v1 × v2 × v4, etc.) or quadruple (v1 × v2 × v3 × v4, or v1 × v2 × v3 × v5, etc.) cross-interactions among factors are ignored. Instead, all residual higher-order cross-interactions are merged into the final constant term as a minor oscillation to reach convergence of the numerical solution. With seven single-factor terms, twenty-one two-factor cross-interaction terms, and one constant (7 + 21 + 1 = 29), the mathematical expression is defined as follows:

$$v_8 = a_1 v_1 + a_2 v_2 + \cdots + a_7 v_7 + a_8 v_1 v_2 + a_9 v_1 v_3 + \cdots + a_{28} v_6 v_7 + a_{29} \quad (7)$$

As depicted, the expectation value (v8, i.e., the SYNTAX score in this study) is always listed on the left side of the equation, whereas the right side contains the semi-empirical formula of seven variables (v1~v7).
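The structure of the formula (single-factor terms, two-factor cross-interactions, and one constant) can be illustrated with a short Python sketch. The function names `design_row` and `predict_score` are hypothetical, and the ordering of the cross-interaction terms is an assumption made for illustration; the actual ordering is the one defined by the paper's Equation (7) and Table 3.

```python
from itertools import combinations


def design_row(factors):
    """Expand seven normalized factors (v1..v7) into the 29 formula terms.

    Assumed term order: 7 single factors, 21 pairwise products, 1 constant.
    """
    if len(factors) != 7:
        raise ValueError("exactly seven risk factors are expected")
    single_terms = list(factors)
    cross_terms = [a * b for a, b in combinations(factors, 2)]
    return single_terms + cross_terms + [1.0]


def predict_score(coefficients, factors):
    """Evaluate the semi-empirical formula for one patient."""
    terms = design_row(factors)
    if len(coefficients) != len(terms):
        raise ValueError("expected one coefficient per term")
    return sum(c * t for c, t in zip(coefficients, terms))


# Hypothetical normalized readings for one patient (not from the study).
example_factors = [0.1, -0.2, 0.3, 0.0, 0.5, -0.4, 0.2]
example_terms = design_row(example_factors)
```

Stacking one such row per patient gives the [405 × 29] data matrix that the IPA technique works with.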

SYNTAX Score and Seven Risk Factors
The SYNTAX score is the sum of the points assigned to each lesion identified in the coronary tree with >50% diameter narrowing in vessels above 1.5 mm in diameter. Further, the SYNTAX score is subdivided into three categories, namely, low (≤16), intermediate (16-22), and high (>22) [1]. In this study, the SYNTAX score was graded by seasoned cardiologists or interventionists when patients underwent cardiovascular examinations.
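The three-level grading above can be written as a small helper. This is a minimal sketch using the thresholds stated in this section; the function name `syntax_category` is hypothetical, and assigning a score of exactly 16 to the low category is an assumption, since the stated low and intermediate ranges touch at 16.

```python
def syntax_category(score):
    """Map a SYNTAX score to the category used in this study."""
    if score < 0:
        raise ValueError("SYNTAX score cannot be negative")
    if score <= 16:
        return "low"
    if score <= 22:
        return "intermediate"
    return "high"


# The two grading scenarios mentioned in the Introduction (scores 47 and 10).
scenario_1 = syntax_category(47)
scenario_2 = syntax_category(10)
```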
Seven essential biological indices were assigned as risk factors in this study: (1) age, (2) mean arterial pressure (MAP), (3) body surface area (BSA), (4) pre-prandial blood glucose (glucose AC), (5) low-density-lipoprotein cholesterol (LDL-C), (6) Troponin I (cTnI), and (7) C-reactive protein (CRP). MAP is a widely used parameter, reflecting the mean pressure in human arteries per complete cardiac cycle. It is considered a better indicator of perfusion to organs than systolic blood pressure (SBP) or diastolic blood pressure (DBP), being derived as follows: MAP = (SBP + 2 × DBP)/3. The body surface area (BSA) strongly correlates with human metabolic mechanisms and is defined as √(H × W/3600) [m²] (H: height [cm]; W: weight [kg]). Glucose AC in fasting individuals is known to be maintained at a constant level at the expense of glycogen stores in the liver and skeletal muscle. LDL-C is one of the five major groups of lipoproteins that transport all fat molecules around the body in extracellular water. Troponin I (cTnI) is a cardiac and skeletal muscle protein that binds to actin in thin myofilaments and holds the actin-tropomyosin complex in place. The last factor, CRP, is an annular pentameric protein found in blood plasma, whose circulating concentrations rise in response to inflammation. It is an acute-phase protein.
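The two derived indices, MAP and BSA, follow directly from the formulas quoted in this paragraph. A minimal Python sketch is given below; the function names and the example readings are hypothetical and serve only to show the arithmetic.

```python
import math


def mean_arterial_pressure(sbp_mmhg, dbp_mmhg):
    """MAP = (SBP + 2 × DBP) / 3, in mmHg."""
    return (sbp_mmhg + 2 * dbp_mmhg) / 3


def body_surface_area(height_cm, weight_kg):
    """BSA = sqrt(H × W / 3600), in m² (H in cm, W in kg)."""
    return math.sqrt(height_cm * weight_kg / 3600)


# Hypothetical readings, not taken from the patient dataset.
example_map = mean_arterial_pressure(120, 80)   # (120 + 160) / 3
example_bsa = body_surface_area(180, 80)        # sqrt(14400 / 3600)
```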

All risk factors should be normalized to the same domain range from −1 to +1 before executing the STATISTICA 7.0 program so that the clinical data can be incorporated into the IPA structure and the dimensionality of each risk factor is unified. Each critical risk factor reading X* is normalized via the following equation:

$$X^{*} = \frac{2\,(X - X_{min})}{X_{max} - X_{min}} - 1 \quad (8)$$

where X, X_min, and X_max are the respective risk factor's original, minimum, and maximum readings (V1-V7). For example, for MAP (V2)'s maximum and minimum readings of 153 and 50 mmHg, respectively, the MAP values of case Nos. 100 and 183 were normalized from their original values (71 and 120) to −0.6026 and +0.3550, respectively. Thus, the MAP scale range was normalized from −1.0 to +1.0.

The readings of the seven factors and their actual (original) SYNTAX scores before the normalization process (cf. Equation (8)) were obtained for 405 cardiovascular artery disease patients with their cardiac diagnoses reported in the Taichung Armed Forces General Hospital, Taiwan, from 1 January 2016 to 30 June 2021. In addition, another group of 105 patients with a similar syndrome was randomly assigned as a verified group from the original 510-patient group (405 + 105 = 510) in the follow-up study. The survey was authorized by the Institutional Review Board (IRB) of the Tri-Service General Hospital, Taiwan (Permit No. B202005075). The individual results are given in Table 1.
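The normalization to the interval from −1 to +1 can be checked numerically with the MAP example given above. The sketch below assumes the usual min-max mapping X* = 2(X − Xmin)/(Xmax − Xmin) − 1; the function name `normalize` is hypothetical. One further assumption is needed to reproduce the quoted values: the minimum MAP is taken as 152/3 ≈ 50.67 mmHg rather than exactly 50 mmHg, on the guess that the reported minimum was rounded. This is an inference from the arithmetic, not a value stated in the source.

```python
def normalize(x, x_min, x_max):
    """Map a reading from [x_min, x_max] linearly onto [-1, +1]."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    return 2 * (x - x_min) / (x_max - x_min) - 1


# Assumed unrounded MAP minimum (152/3 mmHg); the text quotes 50 mmHg.
MAP_MIN = 152 / 3
MAP_MAX = 153.0

case_100 = normalize(71, MAP_MIN, MAP_MAX)    # approximately -0.6026
case_183 = normalize(120, MAP_MIN, MAP_MAX)   # approximately +0.3550
```

With the minimum taken as exactly 50 mmHg, the same function gives about −0.592 and +0.359, which is why the unrounded minimum is assumed here.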

Running STATISTICA 7.0 Program
The STATISTICA 7.0 default program [11] was used to execute the IPA algorithm. The correlation and cross-interaction among seven factors (cf. Equation (7)) were assigned and treated as nonlinear estimations, nonlinear models, and user-specified regressions with customized loss functions. The numerical simulations adopted the normalized data from 405 patients. The loss function was minimized via the combined Rosenbrock and quasi-Newton numerical analyses, yielding the converged solution. Notably, alternative methods, such as Simplex or Rosenbrock pattern search, failed to obtain the minimum loss function that would satisfy user demands in this study.
The actual SYNTAX scores of cardiovascular artery disease patients were the expectation values of the computational results. Therefore, 11,745 individual data points (405 × 29 = 11,745) were included in the algorithm to optimize the compromised column matrix (405 × 1 = 405) of the patients' SYNTAX scores as the final numerical solution. In addition, twenty-nine terms, including one constant, were incorporated into this algorithm to reveal any possible links among clinical factors. The loss function (Φ) was defined according to the total fluctuation between each theoretical and actual SYNTAX score for all 405 cardiovascular artery disease patients. The STATISTICA 7.0 program operation is visualized in Figure 3. One has to follow the proposed options and define the unique loss function to construct the coefficient matrix via the IPA.

Table 2 shows the precise data of the risk factors after normalization. As clearly illustrated, the mean value should approach 0.0 if the specific biological index of the patient group follows the normal distribution (range from −1.0 to +1.0). Accordingly, glucose AC (0.68), cTnI (0.90), and CRP (0.74) had high average values, whereas MAP (0.00), BSA (0.07), age (0.17), and even the SYNTAX score itself (0.11) fulfilled the definition of a standard normal distribution from a total of 405 patients' statistical data.

The minimal loss function Φ obtained was 3.177. The sample variance and regression correlation were 0.8958 and 0.9465, respectively, indicating the high coincidence of the derived prediction according to the original data matrix. The calculated coefficients of the 29-term semi-empirical formula, as defined in Equation (7), are listed in Table 3. Since all risk factors were normalized from −1 to +1, high coefficients corresponded to significant contributions in dominating the prediction of the SYNTAX score.

Table 3. The coefficients of the 29-term semi-empirical formula (cf. Equation (7)) from the calculated outcomes of the STATISTICA 7.0 program. The factors were all normalized from −1 to +1. Thus, the large derived coefficients significantly dominate the performance of SYNTAX score prediction.

Verifying the Predicted SYNTAX Score
Another group of 105 patients, randomly selected from the original patient group in this study, was assigned as a verified group to verify the prediction of the SYNTAX score from the derived semi-empirical formula. In doing so, the biological indices of the verified group were input as a dataset matrix and then used to calculate the predicted SYNTAX score. Table 4 shows the detailed information of the verified group. As demonstrated, each risk factor's maximum or minimum value falls into a range similar to that of the original group. Figure 5 shows that the data derived from the verified group of 105 patients coincided with the original data from the group of 405 patients. As depicted, the two data groups consistently merged along the axis of the actual SYNTAX score. Specifically, the defined agreement (AT) equals [(actual − prediction)/actual] of the SYNTAX score. The average AT avg and standard deviation of the 105 ATs are 0.308 and 0.193, respectively, implying high agreement between the actual and predicted values of the SYNTAX score [12][13][14]. Figure 6 illustrates the distribution of 105 individual ATs in this study. As demonstrated, most ATs lie below 0.4, showing a reliable prediction of the SYNTAX score.
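The agreement statistic can be computed as in the short sketch below. Two points are assumptions rather than statements from the source: the relative deviation is taken as an absolute value (so that over- and under-predictions do not cancel in the average), and the spread is reported as the sample standard deviation. The function names and the three score pairs are hypothetical and are not patient data.

```python
from statistics import mean, stdev


def agreement(actual, predicted):
    """AT = |actual - predicted| / actual (absolute value is an assumption)."""
    if actual == 0:
        raise ValueError("AT is undefined for an actual score of zero")
    return abs(actual - predicted) / actual


def summarize_agreement(actual_scores, predicted_scores):
    """Return (average AT, sample standard deviation of AT)."""
    at_values = [agreement(a, p)
                 for a, p in zip(actual_scores, predicted_scores)]
    return mean(at_values), stdev(at_values)


# Hypothetical actual/predicted score pairs for illustration only.
actual_scores = [10.0, 20.0, 40.0]
predicted_scores = [8.0, 25.0, 30.0]
at_avg, at_sd = summarize_agreement(actual_scores, predicted_scores)
```

Under this definition an AT of 0.0 means a perfect match, so a smaller average AT indicates closer agreement between the predicted and clinically graded scores.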

Dominant Factors of the SYNTAX Score Prediction
LDL-C (ranking: 2), age (3), BSA (5), glucose AC (6), and MAP (8) are the dominant risk factors in predicting the SYNTAX score, whereas cTnI (20) and CRP (27) are minor contributors according to the corresponding coefficients from the STATISTICA 7.0 program outcomes (cf. Table 3). Since all risk factors were normalized in the preliminary stage to eliminate their dimensionality and fit them to the interval between −1.0 and +1.0, the derived coefficient of any specific risk factor reflects its dominance in the semi-empirical formula. Although the individual factors in this study may not provide dominant contributions to the expectation value, their cross-interactions could strongly dominate the performance. According to IPA's computational assumption, the cross-interaction between two factors (for instance, A (age) and B (cTnI) in this study) was interpreted as A × B and mathematically defined as a cross-product (A × B) with a vector perpendicular to both A and B. Thus, additional terms of cross-interactions between two factors in the semi-empirical formula provided alternative paths for optimizing the compromised solution. In addition, the assigned vector of either a factor itself or the cross-interaction between factors created three specific degrees of freedom (DOFs) along the vector for optimizing the compromised solution in the numerical analysis.

Reducing the Number of Risk Factors
The 29-term semi-empirical formula based on seven risk factors could reasonably predict the SYNTAX score, according to the verification performed using another group of 105 patients. However, if the formulation is robust, the correspondingly shortened semi-empirical formula should lose accuracy once the number of risk factors is decreased. Accordingly, we ranked the risk factors in decreasing order of importance, namely LDL-C (7), age (6), BSA (5), glucose AC (4), MAP (3), cTnI (2), and CRP (1) (cf. Table 3), and removed them one at a time, starting with the least important. Thus, the semi-empirical formulas corresponding to six, five, or fewer factors could be defined via Equations (9)-(13). Restated, the risk factors were removed sequentially from the first (CRP) to the second (cTnI) and so on to the last one (age), each time yielding a correspondingly shorter semi-empirical formula. Thus, the last semi-empirical formula (v2) was defined according to LDL-C (v1) only since it provided the most dominant contribution to the prediction of the SYNTAX score (cf. Equation (14)).
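The length of each reduced formula follows from simple counting, assuming the shorter formulas keep the same structure as the full one (one term per factor, one term per pair of factors, and one constant). That structural assumption and the function name `term_count` are illustrative and are not taken from Equations (9)-(14) themselves.

```python
def term_count(n_factors):
    """Number of terms: n single factors + n(n-1)/2 pairs + 1 constant."""
    if n_factors < 1:
        raise ValueError("at least one risk factor is required")
    return n_factors + n_factors * (n_factors - 1) // 2 + 1


# Term counts when going from seven factors down to one.
counts = [term_count(n) for n in range(7, 0, -1)]
# → [29, 22, 16, 11, 7, 4, 2]
```

Dropping a single factor from the seven-factor formula therefore removes seven of its 29 terms, which is consistent with accuracy falling as factors are removed.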
The prediction from the STATISTICA 7.0 program based on various numbers of risk factors is summarized in Table 5. As illustrated, the regression curve reflects the predicted versus the actual SYNTAX score under various risk factors, and a good predictive capability of the program can be observed from either high sensitivity (i.e., the regression curve slope) or high coincidence (i.e., the correlation coefficient of the regression curve) [15]. In addition, the original prediction according to seven risk factors is also listed for comparison. The accuracy of the SYNTAX score prediction dropped as the number of risk factors was reduced. Therefore, abundant information on a patient's biological indices is always preferable for data collection in artificial intelligence. In contrast, even the most dominant factor, namely, LDL-C in this study, cannot reasonably predict the SYNTAX score alone since the respective correlation coefficient drops to a tiny 0.0644. However, with one additional factor added (age, as the second dominant factor), the coefficient increases to 0.2922, although it is still lower than the one derived from seven factors (0.8958) in this study.

Table 5. Best-fitting parameters of the linear regression line. The results were calculated based on various numbers of risk factors via the STATISTICA 7.0 program. In addition, the original predictions according to seven risk factors are also listed for comparison.
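The two regression quantities used above, the slope (sensitivity) and the squared correlation coefficient (coincidence), can be obtained from paired actual and predicted scores as in the following sketch. The function name `linear_fit` and the sample numbers are hypothetical; the study's own values come from the STATISTICA 7.0 output in Table 5.

```python
def linear_fit(x, y):
    """Least-squares line y = slope·x + intercept and its r² value."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical actual (x) versus predicted (y) scores for illustration.
actual = [5.0, 12.0, 20.0, 31.0, 44.0]
predicted = [7.0, 11.0, 22.0, 28.0, 41.0]
slope, intercept, r_squared = linear_fit(actual, predicted)
```

A slope near 1 together with an r² near 1 would indicate that the predicted scores track the actual ones closely; either value falling well below 1 signals a weaker prediction.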

Discussion of Similar Research Results Based on Various Risk Factors
Scholars have performed similar research using various risk factors to predict SYNTAX scores for cardiovascular artery disease patients from multiple perspectives. In particular, Akboga et al. [5] adopted the monocyte-to-HDL-C ratio to predict the SYNTAX score. The acquired clinical patient data were subjected to statistical analysis by SPSS, yielding linear correlations of the monocyte-to-HDL-C ratio with the SYNTAX score and C-reactive protein (with respective correlation coefficients of only 0.371 and 0.336). Ikeda et al. [7,8] applied the ultrasonography technique to measure the cardiovascular artery's intima-media thickness or plaque, revealing a correlation between the SYNTAX score and the derived data. Alternatively, Rahmani et al. [9] recommended using the GRACE (Global Registry of Acute Coronary Events) score to predict the SYNTAX score. However, the regression results barely showed a positive correlation (r² = 0.116). Studies by Kurtul et al. [16] and Sebastianki et al. [17] attempted to predict the SYNTAX score via the serum albumin concentration and the ankle-brachial index, respectively. However, these studies revealed that glucose AC, LDL-C, and creatine were also correlated, reflecting their coupled contribution to the SYNTAX score at a particular level. Notably, a single-factor regression cannot capture the complexity that a multiple-factor correlation reveals; thus, a suitable IPA technique, as proposed in this study, can satisfactorily resolve this challenge in numerical analysis. In addition, Liu et al. [18] reported that systolic and diastolic echocardiographic parameters could also predict the SYNTAX score via the sonography technique, but more convincing results are still needed to demonstrate the true correlation, since the measured group of patients contained only 74 persons.
The advantage of the IPA technique over the approaches above is that it furnishes a reliable and rapid suggestion of the SYNTAX score, instantly alerting clinical cardiologists to potential risks before patients undergo any substantive examinations.

IPA Technique in Artificial Intelligence Applications
To further explore the application of the IPA technique in artificial intelligence, as described in this study, the prediction of the SYNTAX score using seven risk factors was portrayed by a color ladder diagram, as shown in Figure 7. In doing so, four of the seven risk factors were preset as average values to represent their general behavior in the CAD patient group because of their minor contributions to the accuracy of the SYNTAX score (cf. Table 3) [19,20]. The four minor factors were preset as MAP (0.0), glucose AC (0.68), cTnI (0.90), and CRP (0.74). Accordingly, the resulting readings after normalization were all equal to 0.0 for these risk factors (cf. Equation (8)). The three major factors, age (30-90 yr), BSA (1.0-2.2 m²), and LDL-C (10-300 mg/dL), were preset as the X-, Y-, and Z-axes, respectively, in this study. As clearly demonstrated, the SYNTAX score is high (>22) when LDL-C is higher than 100 mg/dL and becomes severe for high LDL-C (>250 mg/dL). Either young age or a small BSA is beneficial for maintaining a low SYNTAX score. In addition, cardiologists or interventionists can quickly obtain the suggested SYNTAX score by calculating the semi-empirical formula with only three major factors or obtain a precise outcome with all seven risk factors, as mentioned in this study.

Figure 7. Color ladder diagram of the predicted SYNTAX score. The three major factors, age (30-90 yr), BSA (1.0-2.2 m²), and LDL-C (10-300 mg/dL), were preset as X-, Y-, and Z-axes, respectively, in this study. As clearly demonstrated, the SYNTAX score is high (>22) when LDL-C is higher than 100 mg/dL and becomes severe for high LDL-C (>250 mg/dL).

Conclusions
The quantitative prediction of the SYNTAX score for cardiovascular artery disease patients using the IPA technique as an artificial intelligence assessment in clinical diagnostics was evaluated in this study. The 29-term semi-empirical formula was defined according to seven risk factors (age, mean arterial pressure, body surface area, pre-prandial blood glucose, low-density-lipoprotein cholesterol, Troponin I, and C-reactive protein). The correlation between actual and predicted SYNTAX scores reached r² = 0.8958, implying that a highly coincident solution was obtained based on the dataset of 405 patients. The obtained formula was verified by a dataset of 105 patients with similar symptoms, yielding r² = 0.8304. The derived average AT of the verified group (ATavg = 0.308 ± 0.193) revealed a slight deviation between the theoretical prediction from the STATISTICA 7.0 program and the grades assigned by cardiologists or interventionists. The proposed IPA technique proved to be a valuable and reliable tool in helping clinical diagnosis. Patients can receive an instant SYNTAX score from personal biological indices before undergoing cardiovascular artery examination by either ultrasonography or cardiac catheterization.