Assessment of Liver Metastases Using CT and MRI Scans in Patients with Pancreatic Ductal Adenocarcinoma: Effects of Observer Experience on Diagnostic Accuracy

The aim of this study was to investigate the impact of radiologic experience on the diagnostic accuracy of computed tomography (CT) vs. magnetic resonance imaging (MRI) reporting on the liver metastases of pancreatic ductal adenocarcinoma (LM of PDAC). Intra-individual CT and MRI examinations of 112 patients with clinically proven LM of PDAC were included. Four radiologists with varying years of experience (A > 20, B > 5, C > 1 and D < 1) assessed the liver segments affected by LM of PDAC, as well as the number of metastases occurring in each patient. Their sensitivity and specificity in evaluating the segments were compared. Cohen’s Kappa (κ) for diagnosed liver segments and Intra-class Correlation Coefficients (ICC) for the number of metastatic lesions in each patient were calculated. The radiologists’ sensitivity and specificity for CT vs. MRI were, respectively: Reader A: 94.4%, 90.3% vs. 96.6%, 94.8%; B: 86.7%, 79.7% vs. 83.9%, 82.0%; C: 78.0%, 76.7% vs. 83.3%, 78.9% and D: 71.8%, 79.2% vs. 64.0%, 69.5%. Reviewers A and B achieved greater agreement in assessing results from the MRI (κ = 0.72, p < 0.001; ICC = 0.73, p < 0.001) vs. the CT (κ = 0.58, p < 0.001; ICC = 0.61, p < 0.001), in contrast to readers C and D (MRI: κ = 0.34, p < 0.001; ICC = 0.42, p < 0.001, and CT: κ = 0.48, p < 0.001; ICC = 0.59, p < 0.001). Our results indicate that the accurate diagnosis of LM of PDAC depends more on radiologic experience in MRI than in CT scans.


Introduction
International guidelines advise contrast-enhanced computed tomography (CE-CT) for the routine diagnosis and staging of pancreatic cancer, whereas magnetic resonance imaging (MRI) is mostly used for the characterization of indeterminate liver lesions [1,2]. The accurate assessment of liver metastases (LM), both colorectal (CRLM) and non-colorectal LM, is crucial in multidisciplinary oncology [1][2][3][4], especially for patients with pancreatic ductal adenocarcinoma (PDAC). The enhanced detection of LM could reduce the futile resection of tumors and markedly increase life expectancy.
At present, CE-CT is widely used as a standard imaging modality to determine the stage of pancreatic cancer. However, its ability to detect LM less than 1 cm in size is reported to be limited and unsatisfactory, given that its rate of accuracy currently stands at just 50% [3,4]. Fortunately, liver-specific magnetic resonance contrast agents like gadoxetate disodium appear to offer great promise because of their ability to provide more precise evaluations of tumor infiltrations [5,6], and are now recommended for the diagnosis and characterization of malignant lesions in non-cirrhotic livers [7]. Moreover, prior studies have demonstrated that multidisciplinary team meetings are associated with significant improvements in clinical outcomes, as imaging data are reviewed by all physicians involved in the patient's care, regardless of their radiologic experience [8][9][10][11]. It is possible that the widespread availability of medical imaging data may eventually lead to independent read-outs of CT and MRI examinations without consultation of the corresponding radiological report. Yet, since marking imagery is not a standardized practice amongst radiologists, but depends more on their reporting preferences and level of experience [12], the comprehensibility of oncologic findings for inexperienced readers remains unknown [2].
In this study, we aimed to determine the impact of observer experience in CT and MRI examinations on the diagnostic accuracy for LM of PDAC.

Diagnostic Performance
Sensitivity, specificity, positive predictive values (PPV) and negative predictive values (NPV) were calculated on a segmental basis and are shown in Table 1. Overall, diagnostic performance tended to increase with experience for the reporting of both CT and MRI examinations (Table 2). The differences between Reviewers A and B, as well as B and D, were significant for CT reporting (p = 0.001 for both). Regarding MRI, the following comparisons reached the level of significance: Readers A and C (p = 0.013), A and D (p < 0.001), B and D (p = 0.001) and C and D (p = 0.014). Additional data are summarized in Table 3.
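To illustrate how such segment-based indices are derived, the following minimal Python sketch computes sensitivity, specificity, PPV and NPV from one reader's binary segment ratings and a reference standard. The names `SegmentCounts`, `count_segments` and `diagnostic_metrics`, as well as the example ratings, are hypothetical; they are not taken from the study data or from the software used in the study.

```python
from dataclasses import dataclass


@dataclass
class SegmentCounts:
    """Confusion-matrix counts for one reader, pooled over liver segments."""
    tp: int  # segment rated positive, reference positive
    fp: int  # segment rated positive, reference negative
    tn: int  # segment rated negative, reference negative
    fn: int  # segment rated negative, reference positive


def count_segments(reader, reference):
    """Tally counts from two equal-length lists of 0/1 segment ratings."""
    if len(reader) != len(reference):
        raise ValueError("reader and reference must have the same length")
    counts = SegmentCounts(tp=0, fp=0, tn=0, fn=0)
    for rated, truth in zip(reader, reference):
        if rated and truth:
            counts.tp += 1
        elif rated and not truth:
            counts.fp += 1
        elif not rated and truth:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


def diagnostic_metrics(c):
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    def ratio(num, den):
        # An empty denominator leaves the index undefined.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Hypothetical ratings (1 = affected, 0 = not affected), not study data.
reader = [1, 1, 0, 0, 1, 0, 0, 0, 1]
reference = [1, 0, 0, 0, 1, 1, 0, 0, 1]
metrics = diagnostic_metrics(count_segments(reader, reference))
```

In this sketch the counts are pooled over all segments of all patients, which is one plausible reading of a "segmental basis"; a per-patient analysis would require a different aggregation.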

Inter-Observer Agreement
Among all the reviewers, inter-reader agreement for the liver segments that were affected by malignancy was higher for MRI (κ = 0.44, p < 0.001) than for CT (κ = 0.43, p < 0.001). In particular, the experienced reviewers, A and B, achieved greater agreement for MRI (κ = 0.72, p < 0.001) than for CT (κ = 0.58, p < 0.001), unlike the less experienced C and D reviewers (MRI: κ = 0.34, p < 0.001 and CT: κ = 0.48, p < 0.001, respectively). A similar trend was found for the inter-observer agreement regarding the number of LM present in each patient. All reviewers, together, achieved greater inter-observer agreement for MRI (ICC = 0.59, p < 0.001) than for CT (ICC = 0.53, p < 0.001). The experienced reviewers, A and B, showed higher agreement for MRI (ICC = 0.73, p < 0.001) than for CT (ICC = 0.61, p < 0.001), as opposed to the less experienced reviewers, C and D (MRI: ICC = 0.41, p < 0.001 and CT: ICC = 0.59, p < 0.001). Complementary data are shown in Table 4.

Table 4. Inter-observer agreement regarding liver segments that were affected by the liver metastases of pancreatic ductal adenocarcinoma (LM of PDAC) (κ) and the number of lesions reported per patient (ICC) for CT and MRI analyses. The experienced reviewers showed a significantly higher agreement for MRI than for CT reporting.
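For readers less familiar with the agreement statistic, the following minimal Python sketch shows how Cohen's κ can be computed for two readers who each rate the same liver segments as affected (1) or not affected (0). The function name `cohens_kappa` and the example ratings are hypothetical and do not reproduce the study data; the sketch covers only the pairwise case and makes no assumption about how agreement among all four reviewers was pooled.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e the agreement expected by chance
    from each rater's marginal label frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of marginal frequencies, summed over labels.
    labels = set(ratings_a) | set(ratings_b)
    p_e = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )

    if p_e == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical segment ratings for two readers (not study data).
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(reader_1, reader_2)
```

In this example the readers agree on 8 of 10 segments (p_o = 0.80) while chance agreement is p_e = 0.52, so κ is approximately 0.58, noticeably lower than the raw agreement.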

Discussion
This study aimed to investigate the impact of observer experience on diagnostic performance and inter-observer agreement in reporting LM of PDAC using CT and MRI scans because data from these imaging modalities are commonly reviewed by physicians with varying levels of experience in the clinical and radiological practice of oncology [13].
We found that diagnostic performance was primarily proportional to reviewer experience, with the most experienced reviewer, A, achieving the highest sensitivity, specificity, PPV and NPV [14][15][16]. Thus, our results indicate that observer experience is essential for achieving a high diagnostic accuracy in the aforementioned imaging modalities (Figure 1). More importantly, values indicating diagnostic accuracy differed more distinctly for MRI than for CT, as sensitivity, specificity, PPV and NPV were distributed over a larger range for the MRI analyses [14][15][16]. Therefore, our data suggest that reviewer experience has a greater impact on MRI than on CT reporting. This hypothesis is further supported by the fact that the more experienced radiologists, Reviewers A and B, showed a greater difference in diagnostic performance indices for MRI. This suggests that there is a steeper learning curve for MRI interpretation after 5 years of radiologic experience compared with CT reporting. Accordingly, we detected a greater inter-observer agreement among the experienced reviewers for the MRI analyses, both for the number of LM of PDAC detected per patient and the affected liver segments, as opposed to the less experienced reviewers, who achieved a higher inter-observer agreement using CT images. This may indicate that experienced reviewers are more likely to be consistent in their findings when evaluating MRI scans, unlike less experienced reviewers. Potential factors that may contribute to the need for more experience in MRI analyses than in CT interpretations are the greater number of images that need to be assessed as well as the challenging physical theory of MRI. For instance, chemical shift and diffusion-weighted imaging in MRI may be more difficult to assess for less experienced reviewers than CT series that are based on density values and primarily defined by the time of image acquisition relative to the contrast media administration.
In contrast, standard CT scans for liver imaging in patients with LM of PDAC include a smaller number of series and often consist solely of a non-contrast and portal-venous phase image acquisition [16]. Thus, our data suggest that the reporting of CT images may be more comprehensible and intuitive for less experienced reviewers. Our results suggest that marking and explaining findings in MRI reports will make them easier to understand for less experienced readers and clinicians who evaluate imaging data in the absence of an experienced radiologist. Additionally, these results further justify the educational practice in radiological departments wherein residents undergo CT before MRI training.
Evidence regarding the influence of observer experience in interpreting oncological CT and MRI examinations is scarce, despite the fact that both clinicians and radiologists with varying levels of experience routinely review such studies and thus may affect therapeutic regimens and patient care. Although few investigations have examined the impact of reader experience on the reproducibility of tumor measurements [12,17], a prior study examined the influence of observer expertise on CT and MRI reporting in patients with CRLM [11]. The authors concluded that the MRI analysis of CRLM is more affected by observer experience than CT interpretation [11]. Our results are consistent with this hypothesis and likewise indicate that reviewer experience is a crucial determinant of diagnostic accuracy. More importantly, they also support the assumption that this effect is more distinct in MRI than in CT reporting. Therefore, we conclude that experience may have a greater impact on diagnostic accuracy for the MRI reporting of LM of varying primary carcinomas compared with CT analyses. Extending this prior investigation, we found consistent results using CT and MRI examinations of the same individuals, which included a diffusion-weighted MRI series.
This study has limitations that should be mentioned. First, we only included four radiologists with varying levels of experience. Because the influence of experience on diagnostic accuracy is challenging to investigate, results may be more representative with a larger population of reviewers in future studies. Additionally, subtle changes of LM of PDAC that were not covered by the criteria used in the image analyses may limit the comparability of the intra-individual CT and MRI examinations.

Patients
This retrospective study was approved by our local institutional review board (Ethical Committee of Kindai University No. 23-101), and written informed consent was obtained from each patient before undergoing a CT or MRI scan. This study used data from clinical records and images collected from Kindai University Hospital, a high-volume regional referral center. Between January 2009 and December 2017, patients with histologically confirmed pancreatic cancer who underwent a CE-CT and gadoxetic acid-enhanced MRI were enrolled in the study. CT and MRI examinations of 164 patients with LM of PDAC were retrospectively included. From the initial study group, 52 patients were excluded because the time interval between their CT and MRI examinations was greater than one month, leaving 112 individuals in the final study group (62 males, mean age ± standard deviation: 62 ± 12.4 y). There were 172 LM of PDAC and 157 liver segments that were affected by LM of PDAC.

CT Imaging Protocol
Intravenous CE-CT imaging was performed using a 64-channel multidetector row scanner (Light Speed VCT Vision, GE Healthcare, Waukesha, WI, USA) with a tube voltage of 100-170 kV, automatic dose modulation, a pitch of 2.0 and a slice thickness of 1 mm. Axial, coronal and sagittal slices were reconstructed with a section thickness of 5.0 mm and an increment of 3 mm. After the unenhanced images were acquired, 510 mg/kg of iodinated contrast material (Optiray 320, Guerbet Japan, Tokyo, Japan) was administered intravenously into the antecubital vein at a rate of 3-4 mL/s. Scanning was performed at the beginning of the pancreatic parenchymal phase (after 40 s), and the subsequent liver phase was obtained 70 s after the intravenous administration of the contrast material.

MR Imaging Protocol
Gadoxetic acid-enhanced magnetic resonance imaging (EOB-MRI) was performed on one of two 3.0-T superconducting magnet systems (Magnetom Trio, Siemens Medical Systems, Erlangen, Germany; n = 48; Achieva TX, Philips Healthcare, Best, The Netherlands; n = 64) using a 32-channel phased-array body coil for all patients. The Magnetom Trio scanner was actively shielded with a 45 mT/m gradient field strength and slew rate of 200 T/m/s. The Achieva TX scanner was actively shielded with a 50 mT/m gradient field strength and slew rate of 220 T/m/s. For both scanners, breath-hold double-echo T1-weighted gradient recalled echo (GRE) images (in-phase and opposed-phase images) and navigator-triggered fat-suppressed T2-weighted turbo spin-echo (TSE) images were obtained first. Dynamic fat-suppressed T1-weighted images were then obtained with a three-dimensional (3D) GRE sequence before (pre-contrast), 14-30 s after (arterial phase by means of a bolus-triggered technique), 70 s after and 3 min after the intravenous administration of gadoxetic acid (EOB Primovist; Bayer Yakuhin, Osaka, Japan), which was injected as a bolus (2.0 mL/s) at a dose of 0.025 mmol/kg of body weight, followed by a 20 mL saline flush. Hepatocyte-phase images were obtained 20 min after the gadoxetic acid injection.

Image Analysis
All CT and MRI series were reviewed by four reviewers with varying levels of experience in oncologic radiology (Reviewer A > 20 y, B > 5 y, C > 1 y, D < 1 y) according to Albrecht's methodology [11]. They were aware that this study focused on the detection of LM, but they were not told that LM of PDAC was present in each patient and were blinded to patients' ages, primary cancers, clinical course and previous treatments. Images were assessed in a randomized order, with a mandatory time interval of one week between the intra-individual read-outs of CT and MRI datasets, so as to reduce potential recall biases. Preset window settings could be freely adjusted. Reviewers reported the number of metastases they detected per patient, as well as the liver segments that were affected by LM of PDAC, according to the Couinaud classification of hepatic anatomy. Segment 4 was evaluated as a cranial (4a) and a caudal (4b) portion. Thus, each liver segment was rated as either affected (positive) or not affected (negative) regarding malignant infiltration by LM of PDAC.
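The segment-wise rating described above can be represented as a fixed-length binary vector per patient and read-out. The following minimal Python sketch assumes nine rated units (Couinaud segments 1 to 8, with segment 4 split into 4a and 4b); the names `SEGMENTS` and `encode_segments` are hypothetical and only illustrate one possible encoding, not the data format used in the study.

```python
# Nine rated units: Couinaud segments 1-8 with segment 4 split into 4a/4b.
SEGMENTS = ("1", "2", "3", "4a", "4b", "5", "6", "7", "8")


def encode_segments(affected):
    """Encode the segments a reader marked as affected as a 0/1 vector.

    `affected` is an iterable of segment labels; any label not listed
    in SEGMENTS is rejected to catch transcription errors.
    """
    affected = set(affected)
    unknown = affected - set(SEGMENTS)
    if unknown:
        raise ValueError(f"unknown segment labels: {sorted(unknown)}")
    return [1 if segment in affected else 0 for segment in SEGMENTS]


# Hypothetical read-out: a reader reports lesions in segments 4a and 7.
rating = encode_segments({"4a", "7"})
```

Vectors of this kind, one from each reader and one from the reference standard, are a convenient input for segment-based sensitivity, specificity and κ calculations.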
The criteria for the radiological diagnosis of LM on the CE-CT were that they had to be ill-defined, heterogeneous nodules with a higher attenuation than bile and having some degree of enhancement. The criteria for the radiological diagnosis of LM on the EOB-MRI were that they had to be focal, discrete nodular lesions that showed a high signal intensity relative to the liver parenchyma on T2-weighted fast spin-echo (FSE) images (with a lower signal intensity than the gallbladder or cerebrospinal fluid), a low signal intensity relative to the liver parenchyma on T1-weighted GRE images obtained at 70 s and 3 min after the gadoxetic acid injection, and were more conspicuous on the hepatocyte-phase images. The diagnosis of LM was more definite when perilesional enhancement was detected on the T1-weighted GRE images obtained 30 s after the gadoxetic acid injection (Figure 1).

Standard of Reference
In patients who underwent endoscopic ultrasonography fine-needle aspiration (EUS-FNA) for liver tumors, the diagnosis of LM was made based on a combination of the histopathological findings of EUS-FNA samples and follow-up imaging examinations. In cases without histopathological diagnosis, the final diagnosis was confirmed by combining all available imaging examinations performed 2 to 3 months after the initial CT and MRI. If possible, the lesions were followed up with these imaging examinations every 2 months. In cases lacking a histopathological diagnosis, the final diagnosis was made when a significant change in the tumor's size was confirmed using these imaging examinations. Most of the tumors (159 tumors) increased in size during the interval. A few tumors were confirmed, by combining all available imaging examinations, to have shrunk after chemotherapy and not to be abscesses. In this study, LM that were identified by follow-up imaging within 3 months of the initial examination were considered to have pre-existed at the time of the initial examination, even if they had not been detected by any imaging modality at that time.

Conclusions
Our results indicate that the accurate diagnosis of LM of PDAC depends more on radiologic experience in MRI than in CT scans. Although observer experience is crucial for a high diagnostic accuracy regarding both CT and MRI analysis in patients with LM of PDAC, this effect seems to be more pronounced in MRI reporting.