Importance of Visual Estimation of Coronary Artery Stenoses and Use of Functional Evaluation for Appropriate Guidance of Coronary Revascularization—Multiple Operator Evaluation

Background: Visual estimation (VE) of coronary stenoses is the first step during invasive coronary angiography. The aim of this study was to evaluate the accuracy of VE together with invasive functional assessment (IFA) in defining the functional significance (FS) of coronary stenoses based on the opinion of multiple operators. Methods: Fourteen independent operators visually evaluated 133 coronary lesions which had a previous FFR measurement, indicating the degree of stenosis (DS), FS and IFA intention. We determined the accuracy of FS prediction using several scenarios combining individual and group decision, considering IFA as deemed necessary by the operator or only in intermediate lesions. Results: The accuracy of VE in predicting FS was largely variable between operators (average 66.1%); it improved significantly when IFA was used either as per operator’s opinion (86.3%; p < 0.0001) or only in intermediate DS (82.9; p < 0.0001). There was no significant difference between using IFA per observer’s opinion or only in intermediate DS lesions (p = 0.166). The poorest accuracy of VE for FS was obtained in intermediate DS lesions (59.1%). Conclusions: There are significant inter-observer differences in reporting the degree of DS, while the accuracy of VE prediction of FS is also largely dependent on the operator, and the worst performance is obtained in the evaluation of intermediate DS.


Introduction
Visual evaluation of coronary stenoses angiographic appearance (also known as eyeballing) by the interventional cardiologist represents the first evaluation step during invasive coronary angiography (ICA). It informs the physician about need for further anatomical or functional evaluation (either invasive or non-invasive) or supports the therapeutic decision regarding the need for revascularization. One of the key issues during stenosis evaluation is the quantification of the degree of stenosis. This may be performed either visually (visual estimation-VE) or using dedicated software packages for quantitative coronary angiography (QCA) to estimate the degree of diameter stenosis (DS). The percentage DS (%DS) is calculated by reporting the minimum diameter of stenosis to the reference diameter. There are two current methods for determining the %DS: the first one averages the proximal and distal reference vessel diameter and the second one interpolates the reference vessel diameter [1]. According to current guidelines, there are different DS thresholds which are recommended to guide the indication for coronary revascularization. According to the %DS, lesions may be classified as probably nonsignificant functionally (DS < 50% in the European and <40% in the American guideline) or probably functional significant (DS > 90% in the European and >70% in the American guideline). The effect of a 50% DS on maximum achievable coronary flow was firstly formulated by Gould in 1974 on an experimental model of coronary stenosis [2]. Subsequently, more complex elements were proposed to comprehensively evaluate the severity of coronary plaques, such as eccentricity of the plaques, bifurcation lesions, morphological and multidimensional geometric parameters derived from computer tomography imaging [3]. Moreover, with angiography we evaluate the % DS based on a 2D projection of a 3D vessel structure which may not correctly reflect the area reduction induced by the stenosis, the latter being considered a more reliable parameter for the evaluation of the functional impact of an atherosclerotic lesion [4].
The use of coronary physiology to guide the decision of revascularization has been proven to improve long term patient outcomes compared to angiographic guidance [5]. The fractional flow reserve (FFR), calculated as the ratio of the pressure measured distally to a stenosis to the proximally measured pressure during maximal hyperemia, can determine objectively the functional significance of coronary stenosis using a threshold of 0.8 [6]. This is most relevant in the evaluation of intermediate lesions, defined as 50-90% DS (European guidelines) or 40-70% DS (American guidelines) when there is no proof for ischemia prior of coronary angiography. These DS thresholds rely however mainly on VE. Significant mismatch between visual and functional severity of DS have been observed and pinpointed in earlier studies [7,8]. Seen these disparities, the VE of %DS on ICA is being challenged as the gold standard for the hemodynamic significance of a lesion, being replaced with FFR [9].
In patients with ICA, prior documentation of ischemia is present in only 44% of cases [10], but the percentage largely depends on current practices and test availability [11]. For patients with multivessel disease (defined as more than one major coronary artery with DS > 50%), even if prior proof of ischemia is obtained, it is important to define the functional significance of each stenosis for a proper guidance of coronary revascularization. Limitations of a more extensive use of invasive functional evaluation at the time of ICA include costs, increased procedural time [12] or reduced experience of the operator [13].
Based on real-world observational data, a recent report [10] shows that in patients with stable coronary disease and angiographic intermediate lesions (40-70% DS based on VE), FFR was used in only 16.5% of cases, while prior noninvasive tests were performed just in 38.7% of patients. The same report shows that less than one third of the patients which were revascularized (29.5%) without invasive functional evaluation had a prior noninvasive proof of ischemia. Hence, it is obvious that VE plays an important role in predicting the functional significance and the need for subsequent revascularization. Zir et al. [14] established 45 years ago the limitations in VE accuracy, documenting a significant inter-observer variability in DS assessment. That is why we sought to evaluate the variability in visual interpretation by practicing interventional cardiologists, in the current era when there is strong evidence about the need to pursue a functional strategy for coronary revascularization guidance.
The main aim of the present study is to evaluate the accuracy of VE in establishing the need for revascularization based on the evaluation of multiple operators, compared to invasive FFR with threshold of 0.8 as gold standard. Also, we evaluated the interoperator variability of VE in grading the severity of coronary stenoses (degree of DS). Furthermore, we sought to identify individual predictors for the decision to perform an Diagnostics 2021, 11, 2241 3 of 18 invasive functional evaluation for establishing the functional significance (stenosis severity, lesion location, etc.), as well as to assess the performance of the selective use of FFR (based either on operator opinion after VE or degree of DS as evaluated visually).

Material and Methods
Invasive coronary angiography evaluations were performed in 86 patients as per local practice based on clinical indication for evaluation, between 2014 and 2016, for the purpose of a local prospective study aiming at the hemodynamic characterization of coronary lesions in relation to angiographic appearance. Briefly, this cohort comprised patients needing invasive functional evaluation for at least one lesion based on the clinical judgement of the attending physician after ICA. The exclusion criteria were the following: subjects with acute coronary syndromes during previous 1 week, occluded of sub-totally occluded vessels, left main lesions or previous myocardial infarction in the investigated vessel. The protocol was approved by the local ethical committee and all patients provided informed consent prior to the inclusion. Investigators were advised to perform at least two projections for each coronary segment separated by at least 30 degrees in rotation/angulation, to allow for an accurate visualization of the coronary stenosis. After standard image acquisition, the physician performed FFR measurements in all coronary segments which had a clinical indication, using intracoronary administration of adenosine as per general recommendations for FFR measurement. Coronary revascularization was decided by the attending physician based on the clinical and invasive evaluation (including the FFR measurements), using clinical practice guidelines recommendations.
The images were anonymized, and two different views showing worst angiographic stenosis were selected for each lesion for which FFR was available. These views were included in a web-based annotation application which was developed for the purpose of this study. Each pair of selected images was evaluated independently by certified interventional cardiologists (which are regularly involved in taking decisions regarding coronary revascularization) who accepted to take part in the study. The evaluation process was performed between January and March 2021. Because this process was based entirely on anonymized images, no further ethical committee approval was deemed necessary.
For each lesion the operators were asked to answer to the following: -Indicate the segment of the most severe lesion on each major coronary artery (LAD, left circumflex (LCx) or right coronary artery (RCA)). The performance of visual DS grading was assessed initially using the 7-degree scale (see above), for general agreement between all operators. For each lesion we determined also the DS determined based on the vote of the majority. For further analyses we reclassified the visual DS based on a three-grade scale using the 50% and 70 % thresholds, and this classification was again analyzed for inter-observer agreement. We also used this 3-grade classification as a nominal variable because of the different clinical significance for each class: <50%-probably non-significant lesions; 50-70%-intermediate lesions; >70%-probably significant lesions.

Statistical Analysis
All analyses were conducted using the statistical software program SPSS version 23 (IBM, Armonk, NY, USA). Data was presented as mean ± SD for continuous variables and as number and percentage for categorical variables. Contingency tables were generated for each of the 14 operators to assess the agreement between the indication for revascularization as per his/her VE and FFR as proof of ischemia (using ≤0.8 as threshold). Kendall's W was employed to determine the agreement between the operators' VE concerning the severity of each lesion evaluated based on a seven-or three-degree severity scale; a Kendall W value of one depicting perfect agreement and zero, no agreement. The agreement concerning revascularization or the necessity for functional evaluation between two observers was calculated by Cohen's kappa, and between multiple observers by Fleiss kappa. We also used Fleiss kappa for the inter-observer agreement regarding 3-grade DS lesion classification when we interpreted the categories on the clinical significance basis (<50%-probably nonsignificant; 50-70%-indeterminate; >70% probably significant). The kappa agreement was interpreted as follows: poor (<0.1), slight (0.1-0.2), fair (0.21-0.4), moderate (0.41-0.6), substantial (0.61-0.8), almost perfect (0.81-1), or perfect (1). For each observer sensitivity, specificity, the positive (PPV) and negative predictive values (NPV) and the accuracy (ACC) were calculated. For comparing the various scenarios of revascularization (VE, hybrid 1, hybrid 2) we used the McNemar test. The Phi correlation coefficient was calculated to evaluate the correlation between the decision to perform invasive testing and several variables, such as the localization of the stenosis (proximal or distal), the artery (LAD, LCx, RCA) or the visual grading. The proximal segments of major coronary arteries were considered as follows: 1 (RCA), 11 (LCx) and 6 (LAD). The receiver operator characteristic (ROC) analysis was used to evaluate the diagnostic capability of the VE of DS to detect hemodynamically significant stenoses (based on a FFR value ≤ 0.80).

Results
Complete evaluation of all lesions was performed by 14 interventional cardiologists. Based on their reporting they had a previous experience in interventional cardiology of 12.9 ± 6.7 (5-26) years, with an annual PCI load of 234 ± 124.8 (50-400) procedures. The cohort included 86 patients with stable coronary syndrome, with a total number of 133 lesions. In our study, 67.4% of patients were male with a mean age of 62.6 ± 8.9 years. A percentage of 88.3 of the patients were hypertensive, 30.2% were diabetic and 51.16% were smokers ( Table 1). Most of the evaluated lesions were located on the LAD (71 lesions, 53.38%), whereas 32 were on RCA (24.06%), and 30 (22.5%) on LCx. Among all evaluated lesions, 39.09% (52 lesions) were hemodynamically significant when assessed by FFR with a mean FFR value of 0.67 (range 0.19-0.8), while the mean FFR was 0.88 (0.81-0.98) in lesions with non-ischemic FFR. Subsequent revascularization was performed by PCI in 44 subjects (51 lesions), while in four patients (five lesions) the decision was towards surgical revascularization. The general and FFR related characteristics are displayed in Table 1. The results for the VE of the stenoses severity are presented in Table A1 (Appendix A). The agreement for the VE of lesions among all observers was good both on the 7-degree scale (Kendall's W = 0.866, p < 0.005) and on the 3-degree scale (Kendall's W = 0.748, p < 0.005). Based on the classification of the majority, 25 (18.8%) lesions had DS < 50%, 60 (45.1%) had DS 50-70% and 48 (36.1%) had DS > 70%. For all operators the individual gradings for the 3-degree scale are presented in Tables A2 and A3. For each interval of DS there were 5-57, 29-90 and 16-83 lesions classified as <50%, 50-70% and respectively >70%. When the 3-gade scale agreement was measured using Fleiss kappa for nominal variables, there was an overall fair agreement (k = 0.392 ± 0.07; p < 0.0001) and a moderate agreement between the decision of the top three operators (k = 0.442 ± 0.037; p < 0.0001).
Invasive functional evaluation was variably requested by different operators in 27.3-74.6% of the cases (mean value of 49.2%) ( Table A4). The agreement for the necessity to proceed to invasive-functional tests among all operators was only fair, with a Fleiss kappa of 0.214. For lesions evaluated as <50%, 50-70% and >70% the functional evaluation was considered necessary in 0-76,9%, 60-100% and 0-67.7% respectively, by different operators resulting in mean values of 22.8%, 88.3% and 29% for each category. It was not possible to evaluate the agreement based on the degree of DS because the lesions were graded differently by different evaluators. For the top three operators there was only a moderate overall agreement, (average Cohen kappa 0.534) ( Table 2). However, when comparing the decision of the majority (at least eight positive votes) to the majority of the top three operators, the agreement was much better (Cohen k 0.729). The decision to perform functional assessment was not influenced by the type of vessel (LAD, LCx or RCA) or the localization of the lesion (proximal versus distal) but correlated for the majority of the investigators (11 of them) with visual classification as intermediate stenosis (phi value 0.204-0.763- Table 3). When comparing the operator decision for revascularization based on VE with measured FFR as gold standard, there was an average accuracy of 66.1 ± 5.8%, ranging from 55% to 75.8% for different operators (Table 4, Figure 1).    Figure 1. The accuracy to predict an ischemic fractional flow reserve based on each operator's visual estimation, Hybrid 1 or Hybrid 2 approaches. * above each column depicts a p-value < 0.05 between VE and Hybrid approach; ** a p < 0.005 between the Hybrid 1 and Hybrid 2 approach.
The overall agreement for the indication of revascularization based only on th of stenoses was moderate (k = 0.448, p < 0.0001). Based on our analysis we did not pr significant correlation between the operator's experience quantified by the yea interventional cardiology practice (r = -0.48; p = 0.08) or annual PCI load (r = -0.4 0.11) and the accuracy in predicting the functional significance of the stenoses. Whe decision for revascularization was based on votes of the majority (at least eight), it l a slightly better accuracy, 67.7%. When considering the majority opinion of the top the accuracy was 75.9%, but the difference was not significant statistically (p = 0.09) there was a better agreement between these two group decisions (Cohen kappa 0 The overall agreement for the indication of revascularization based only on the VE of stenoses was moderate (k = 0.448, p < 0.0001). Based on our analysis we did not prove a significant correlation between the operator's experience quantified by the years of interventional cardiology practice (r = -0.48; p = 0.08) or annual PCI load (r = -0.44; p = 0.11) and the accuracy in predicting the functional significance of the stenoses. When the decision for revascularization was based on votes of the majority (at least eight), it led to a slightly better accuracy, 67.7%. When considering the majority opinion of the top three the accuracy was 75.9%, but the difference was not significant statistically (p = 0.09), and there was a better agreement between these two group decisions (Cohen kappa 0.773). Still, at the individual level there was only a moderate inter-observer agreement, even between the top three raters, concerning the indication for revascularization with Cohen kappa values between 0.55-0.68 (Table 5). When classifying the lesions based on the decision of the majority, we observed that in the category <50% there were only three lesions with ischemic FFR (12%), while for 50-70% stenoses only 28.3% were functionally significant and in the category >70% stenoses, 70.8% were functionally significant (Table A2). The average accuracy for the operators' decision for revascularization in each of these groups were 81.2%, 59.1%, 65.6%, respectively. However, the accuracy improved when we used the decision of the top three operators (88%, 75% and 70.8%), especially for the intermediate degree stenoses (Table A2). Evaluating non-intermediate lesions (<50% or >70%) in terms of predicting the need for revascularization, between top three and the overall majority, we obtained an almost perfect agreement, with a Cohen kappa 0.914 (p < 0.0001), while the agreement for the intermediate lesions was only fair with a Cohen kappa 0.385 (p = 0.001).
We performed the ROC analysis for the dependence between the DS severity (as quantified by each operator and by the decision of majority) and the functional significance of the lesions (Figure 2, Table A6). The values for the area under the curve for different operators were similar (range 0.706-0.802), without a significant difference between the largest and the lowest value, while the value for AUC obtained using the decision of the majority lied also in this range (0.752). p = 0.044).
We performed the ROC analysis for the dependence between the DS severity (as quantified by each operator and by the decision of majority) and the functional significance of the lesions (Figure 2, Table A6). The values for the area under the curve for different operators were similar (range 0.706-0.802), without a significant difference between the largest and the lowest value, while the value for AUC obtained using the decision of the majority lied also in this range (0.752).

VE of the DS
The first step in the evaluation of a coronary lesion is the estimation of the maximum degree of diameter reduction, at the level of the stenosis, compared to the reference diameter. The reference is considered the average diameter of the proximal and distal segments. There are several reports on inter-and intra-observer variability in DS reporting [14,15] performed several decades earlier. We asked the operators to report the range of DS, which allowed us to perform a further classification in more broader severity groups, as recommended by current guidelines. The process of grading resulted in a Kendall's W value of 0.866, (p < 0.005) for the 7-grade severity scale, considered to illustrate an overall good agreement, and a Kendall's W value of 0.748, (p < 0.005) for the 3-grade severity scale (a value of 1 means a perfect agreement). This overall good agreement measured with Kendall's W should be interpreted cautiously as there were large variations between evaluators regarding number of lesions classified in each interval of the 3-grade severity scale (Tables A2 and A3). Because of this wide inter-observer variation in grading the lesions, we used the decision of the majority in order to classify the lesions for some.

Need for Invasive Functional Evaluation
In our study we asked the operators to indicate the need for invasive functional evaluation to assess the lesions in an ideal scenario without any constraints, for guiding the therapeutic decision regarding revascularization. There were large differences between the operator's opinions in considering the need for functional evaluation with fair agreement (Fleiss kappa of 0.214), whereas between the top three operators there was a moderate agreement with an average Cohen kappa of 0.534 ( Table 2). The evaluation was considered necessary most often when the lesion was classified as intermediate (DS between 50-70%), as generally recommended, without a significant influence based on the vessel involved or the proximal location of the lesion. However, at individual level there were wide variations even when we quantified the need for evaluation based on the 3-grade stenosis classification. These wide differences in opinion may reflect differences in experience using functional evaluation or other lesion characteristics impacting the operator's opinion. There are registry data supporting the significant variation in utilization of FFR during every day practice in patients with intermediate degree stenosis [10]. To overcome the variability introduced by the operator's subjectivity, it might be indicated to use group decision. Analyzing the decision of the majority (at least eight votes) and the decision of the top three operators (at least two votes), we obtained a much better agreement with a Cohen k of 0.729 (Table 2). This finding may support the use of group decision when evaluating the need for invasive functional assessment.

The Functional Significance Prediction by VE
In our cohort of patients there were 54 (40.6%) lesions with an ischemic FFR value. Considering the three grade DS categories (<50%, 50-70% and >70 as classified by the majority of the operators), there were 3, 17 and 34 lesions, representing 12%, 28.3% and 70.8% from each category. Our data compare well to those derived from the FAME trial where in lesions >50% DS by VE [16] there were 35% significant lesions in the intermediate degree stenosis (50-70%), while among those >70% DS there were 80% functionally significant lesions. It means that for visually intermediate lesions and for those >70% (non-sub-totally occluded lesions) there are around one third and three fourths of significant lesions respectively. These findings argue against using the generally proposed 50% threshold for defining the significance of the lesions. In our study the average accuracy of identifying functional significant lesions in the intermediate severity range was 59% ranging from 29.8% to 74.6%, meaning that for some evaluators the accuracy was less than 50% (worse than "tossing a coin"). Moreover, it decreased slightly to 56.7% when we modelled the decision based on the opinion of the majority of the operators (we used the threshold of at least 8 positive votes). However, it improved significantly when we used the group decision of the top three operators (to 75.9%). In fact, only in the intermediate degree stenoses by VE the discriminating capacity of top three operators performed better compared to the overall accuracy (Table A2). It is worth mentioning here that in our study the evaluators' accuracy did not correlate with his/her overall previous experience in the field of interventional cardiology, as evaluated by means of years of experience or the annual PCI load. There are other reports like Lopez et al. [17] showing similar accuracy in functional significance prediction based on consensus VE evaluation for experienced operators evaluating intermediate DS lesions as demonstrated by QCA, while the reclassification of revascularization adequacy after FFR reached almost 50% in the French registry R3F [18]. It is known that fore severe lesions DS might be overestimated when compared to QCA [13]. However, VE was proven to be a better clinical tool compared to QCA for evaluating the functional significance [13]. Also, Shah et al. [19] demonstrated that among lesions considered as non-significant on QCA, those evaluated visually as significant had a higher long-term risk compared to non-significant lesions based on both VE and QCA.
We also evaluated the dependence between the VE of degree of DS and the functional significance of the stenoses. We performed the ROC analysis for each grader (Figure 2, Table A6). Overall, there were similar results as measured by the AUC. The difference between the largest (Operator 9) and the smallest values (Operator 3) was not significantly different (p = 0.116; AUC difference −0.097; 95% CI −0.217-0.029). We may conclude that we had similar capabilities for discrimination based on DS evaluation by VE for all operators, which may support, alongside with the general overall agreement between the evaluators in grading the stenosis (as illustrated by the Kendall w of 0.866 for the agreement on the 7-grade severity scale), the validity of our methodology of lesion evaluation. When analyzing the curve obtained using the classification based on the decision of the majority (less prone to individual observer variation), we may identify the threshold values for DS (evaluated by VE) which may guide the strategy of functional evaluation (Figure 2). Using a threshold of <50% for ruling out significant stenosis we obtain a sensitivity of 94.4%, while using a value of >80% to define functional significance we obtain a specificity of 97.5%. If we decrease the threshold to 70% the specificity decreases to 77.2% which seems not appropriate. We must also acknowledge the limitation of VE in discriminating the DS severity around the 70% threshold because of the tendency of overestimation by VE compared to QCA [13]. Classifying a lesion as 60-70% will support invasive functional assessment, while grading it as 70-80% might preclude it, making the intermediate-high grade lesions less prone to be evaluated functionally.
This supports the use of a >80% DS by VE to define functional significance, as was mentioned in a recent consensus [20], a less restrictive strategy than the one recommended in the European Guidelines for Revascularization [21] which advocates a threshold of 90% to justify revascularization based solely on angiographic evaluation of stenosis severity. It is important to mention that in every-day practice we may have additional clinical information, which should always be used to support the decision to revascularization, when the DS does not meet those specific criteria. A further evaluation of the subjective decision (both DS quantification and functional significance prediction) performed specifically for these intermediate-high DS lesions should be envisaged.
The Hybrid 1 scenario attempted to simulate an ideal every-day practice. The operator performs the angiography, evaluates it visually and decides if the vessel requires further functional evaluation to define the need for revascularization. Then, if the VE is considered to be enough, no further information will be obtained and the lesion will be treated according to the VE quantification, otherwise, functional evaluation is performed to inform the decision. Using this approach, which incorporates the previous experience of the operator, an overall accuracy of 86.3% (71.2-93.9%) was obtained, which is significantly higher compared to the VE accuracy (p < 0.0001). The scenario of performing functional evaluation only in intermediate DS (Hybrid 2), as classified by each observer, resulted in a nonsignificant lower accuracy compared to the Hybrid 1 scenario (82.9 vs 86.3%; p = 0.166). There were on average 14.6 less functional evaluations with the Hybrid 2 strategy compared to the Hybrid 1 strategy. This significant increase in accuracy for the Hybrid 1 strategy compared to VE originated from a better classification of intermediate DS lesions, which were the most frequently indicated as needing functional evaluation (see Table A4). Consequently, in intermediate DS lesions the overall accuracy of the Hybrid 1 strategy reached 94.3% (Table A3). Our data does not support the routine use of a Hybrid 1 like strategy (invasive functional assessment in all lesions deemed necessary by the operator) over a Hybrid 2 like strategy (restricting functional assessment only to intermediate DS lesions, as evaluated visually), as this does not result in a significant increase in accuracy at the expense of increasing the need for invasive functional evaluation. Furthermore, using the Hybrid 1 strategy there is a worse classification in terms of functional significance for those lesions graded as >70% compared to <50% (79.5% vs. 86.9%, p = 0.044), most probably related to the tendency of overestimating the significance of tight lesions by VE. It is intriguing however that even for a hypothetically liberal use of invasive functional testing the overall accuracy is limited to almost 85% (with a maximum of 93.9%), while the best individual accuracy was achieved using a Hybrid 2 scenario (95.5%).
When a lesion is classified as <50% it means that it is regarded as probably nonsignificant, and a functional assessment might be considered overall in only one fifth of the cases, or in less than one tenth based on the decision of the majority. By visual estimation of the functional significance there is a correct decision in 80-90% of these cases. When a lesion is classified in the >70% range it means that it is regarded as probably significant, and a functional assessment might be considered overall in only 20-30% of the cases. By visual estimation of the functional significance there is a correct decision in 60-70% of the cases. Only for intermediate degree stenosis there is a high level of indication for invasive functional testing, while the chance of a correct prediction of functional significance is unacceptably low (50-60%). These are the reasons for which we interpreted the 3-grade classification also as a nominal variable when we analyzed the inter-observer agreement in DS estimation, and we demonstrated that there is only a fair overall agreement (k = 0.39 ± 0.07) and a moderate agreement even between the top three operators. This means that there are in fact important operator dependent variations in terms of clinical decision. It seems that, bearing in mind the limitations of our study, there is still some room for improvement in the way we currently integrate the use of invasive functional testing with the information obtained by visual evaluation of coronary angiographies.

Limitations
We assumed that the data that were made available were based entirely on the operator's opinion after proper evaluation of the included lesions. We could not control for evaluator's accidental errors during data collection in the central database. The images used (two views for each coronary artery) were selected as being those which display the highest DS based on the author's opinion. The operators annotated these images and did not have access to the entire coronary evaluation as is the case in usual practice. Moreover, a web-based annotation tool was employed for evaluating the medical images. Real life accuracy may be better due to the integration of the overall clinical assessment of the patients, which is based not only on angiography images. We did not perform an analysis of intra-observer variability which would have been of much interest especially for the top three raters. For the purposes of our study we have considered FFR as golden standard for defining the functional significance of the stenoses as currently, without taking into consideration its possible limitation. Fractional flow rate reflects the hemodynamic severity of coronary stenosis in the proximal artery assuming that maximum hyperemia may be achieved; abnormal resistance in the microcirculation may lead to a decreased maximum coronary flow and subsequently to a smaller pressure gradient across the stenosis [22]; to the best of our knowledge the clinical impact of these FFR limits is unknown.

Conclusions
Visual evaluation of coronary lesions is an important tool for evaluation during ICA. There are significant inter-observer differences in reporting the degree of DS. The accuracy of VE for predicting the functional significance is largely dependent on the operator, and the worst performance is seen in the evaluation of intermediate degree stenoses. The average accuracy of the visual prediction of functional significance of intermediate degree stenosis was 55%, advocating against the sole use of VE to support the decision of revascularization. The whole-group decision did not improve the accuracy of functional significance prediction in intermediate lesions. The visual prediction of functional significance for intermediate stenosis may increase up to 75% in accuracy for some operators, most probably based on specific personal experience. There were wide variations between the evaluators regarding the need for invasive functional assessment. Intermediate degree stenoses are most likely to be evaluated functionally. Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki and did not require an Institutional Review Board approval as this work was performed using anonymized images collected previously.

Informed Consent Statement:
This work was performed using anonymized images collected previously. Informed consent was obtained at the time of performing the procedures, prior to any procedure.