Diffusion-Weighted MRI in Patients with Testicular Tumors—Intra- and Interobserver Variability

Magnetic resonance (MR) diffusion-weighted imaging (DWI) has shown potential in clinical settings. In testicular parenchyma, DWI helps differentiate benign from malignant lesions. Placement and size of the region of interest (ROI) may affect the apparent diffusion coefficient (ADC) value. Therefore, the aim of this study was to investigate the intra- and interobserver variability in testicular tumors when measuring ADC using various types of ROI. Two observers performed the ADC measurements in testicular lesions based on three ROI methods: (1) whole volume, (2) round, and (3) small sample groups. Intra- and interobserver variability was analyzed for all ROI methods using intraclass correlation coefficients (ICC) and Bland-Altman plots. The two observers performed the measurements twice, three months apart. A total of 26 malignant testicular tumors were included. Interobserver agreement was excellent for tumor length (ICC = 0.98) and tumor width (ICC = 0.98). Intraobserver agreement was also excellent for tumor length (ICC = 0.98) and tumor width (ICC = 0.99). The whole volume interobserver agreement in the first reading was excellent (ICC = 0.93). Round ROI ADC had excellent (ICC = 0.93) and fair (ICC = 0.58) interobserver agreement in the first and second reading, respectively. Interobserver agreement for small ROI ADC was good in both the first (ICC = 0.87) and second reading (ICC = 0.78). Intraobserver agreement ranged from fair to excellent. The ROI methods showed varying inter- and intraobserver agreement in ADC measurement. Using multiple small ROIs yielded the highest interobserver variability; thus, the whole volume or round ROI seem to be the preferable methods.


Introduction
Diffusion-weighted imaging (DWI) is a noninvasive magnetic resonance imaging (MRI) technique that measures the diffusion of water molecules. DWI may be used for evaluation of tumors and has shown potential in clinical settings. During the last decade, DWI, including the apparent diffusion coefficient (ADC), has been applied in discriminating malignant from benign nodules in a variety of organs [1][2][3][4][5]. In general, adding DWI sequences to existing MR protocols is neither complicated nor time-consuming.
Ultrasound is the first-choice modality when investigating testicles due to low cost, availability, and diagnostic accuracy. However, ultrasound is limited by operator dependency. MRI is becoming popular as new features, such as ADC measurements, have the potential to impact the diagnostic field. In addition, MRI has the ability to image small structures [6] and can be used as both a clinical and a problem-solving tool [7,8]. DWI may add valuable information and is currently being applied when investigating various testicular conditions and diseases, e.g., microlithiasis, torsion, undescended testis and varicocele [1,9,10,11].
The Scrotal and Penile Imaging Working Group of the European Society of Urogenital Radiology (ESUR) recommends additional scrotal MRI to characterize testicular masses [12], including T1, T2 and DWI sequences with at least three b-values. The working group recommends scrotal MRI when ultrasound findings are inconclusive or inconsistent with clinical examination, when performing local staging of testicular malignancy, when differentiating benign from malignant tumors, and when assessing acute scrotum [13].
In testicular parenchyma, DWI can differentiate benign from malignant tumors [1,14]. Tumors often display a heterogeneous appearance, so the ADC measurement may depend on the selected region of interest (ROI), and the choice of ROI shape and size may affect the ADC value. Few studies have investigated MRI interobserver variability in testicular tumors using different ROI sizes [14][15][16][17][18]. Tsili et al. investigated interobserver variability in testicular tumors using ADC and found DWI ADC interobserver agreement to be excellent regardless of ROI shape [14].
To our knowledge, no studies have evaluated the intraobserver variation of ROIs in testicular tumors. The aim of this study was, firstly, to assess the intra- and interobserver variability in ADC values using three different methods of ROI selection in testicular tumors and, secondly, to evaluate the measurement of testicular tumor length and width on MRI, including intra- and interobserver variability.

Materials and Methods
This retrospective study was approved by the local Danish Data Protection Agency (2012-58-0018) and the local hospital review board. Approval by the Regional Committee on Health Research Ethics was not required due to the retrospective study design. However, written informed consent was mandatory prior to MRI examination.

Patients
We retrospectively evaluated a total of 26 patients with malignant testicular tumors examined in the period 2013-2016. Only patients with a histopathologically confirmed malignant testicular tumor were included. The National Pathology Registry database was searched [19] using the Danish unique civil registration number [20]. We excluded one patient, as the MRI investigation was performed after the orchiectomy.

MRI
The MRI examination was performed using an Ingenia 1.5-Tesla system (Philips Medical Systems, Best, Netherlands) with patients in prone position using the posterior coil placed at the center of the scrotum. T2-weighted spin echo, T1-weighted and DWI sequences with b-values of 0, 100, 300, 600, 900 and 1100 s/mm² were performed, and ADC maps were generated automatically at the workstation. The MRI protocol has been published previously and details are presented in Table 1 [1]. All data were stored in the hospital's Picture Archiving and Communication System (PACS).

Observers
Two senior radiologists with 8-10 years of experience participated in this study as observers. The two observers individually interpreted the 26 malignant testicular tumors and measured tumor length and width on the T2-weighted images. Both used the same type of diagnostic screen (21.3-inch monitor CCL358i2 from TOTOKU, JVCKENWOOD Corporation, Kanagawa, Japan) to review all 26 cases. All images were viewed using an Easyviz Impax PACS workstation (Medical Insight, Valby, Denmark).
The observers were placed in an undisturbed office and were unable to discuss the cases with colleagues. The observers independently measured the ADC on 26 testicular tumors using the three ROI types and were blinded to histopathological findings, previous examinations and patient medical history. The observers were blinded to each other's measurements.
Three ROI types were used: whole volume ROI, round ROI and small ROIs. The whole volume ROI was drawn as large as possible, covering all the tumor tissue. The round ROI was placed in the center of the tumor and drawn as large as possible. The small ROIs were 3 mm in size and were placed in up to five separate locations within the tumor without any overlap; a mean was calculated (Figure 1). In general, all ROIs were placed carefully to avoid overlap with other testicular tissue. To limit observer bias, the two observers re-evaluated the 26 testicular tumors after a period of 3 months. Cases were presented in random order.
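The three ROI strategies can be illustrated with a minimal Python sketch on a synthetic, single-slice ADC map. Everything in it is an assumption made for illustration: the helper names (disk_mask, roi_mean_adc, small_rois_mean_adc), the pixel radii, the ROI centers and the random map are hypothetical, the tumor outline is replaced by a disk, and the small-ROI value is taken as the mean of the per-ROI means. In the study itself the ROIs were drawn manually on the PACS workstation.

```python
import numpy as np

def disk_mask(shape, center, radius):
    """Boolean mask of pixels within `radius` (in pixels) of `center` (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def roi_mean_adc(adc_map, mask):
    """Mean ADC over the pixels selected by a boolean ROI mask."""
    return float(adc_map[mask].mean())

def small_rois_mean_adc(adc_map, centers, radius):
    """Average of the per-ROI mean ADC over several small, non-overlapping ROIs."""
    means = [roi_mean_adc(adc_map, disk_mask(adc_map.shape, c, radius))
             for c in centers]
    return float(np.mean(means))

# Synthetic ADC map (hypothetical values, in units of 10^-3 mm^2/s).
rng = np.random.default_rng(0)
adc_map = rng.normal(loc=0.8, scale=0.1, size=(64, 64))

# Stand-in for a hand-drawn tumor outline: a disk of radius 20 pixels.
tumor_mask = disk_mask(adc_map.shape, center=(32, 32), radius=20)

whole_volume_adc = roi_mean_adc(adc_map, tumor_mask)
round_adc = roi_mean_adc(adc_map, disk_mask(adc_map.shape, (32, 32), 10))
small_centers = [(24, 32), (40, 32), (32, 24), (32, 40), (32, 32)]
small_adc = small_rois_mean_adc(adc_map, small_centers, radius=2)

print(whole_volume_adc, round_adc, small_adc)
```

In a heterogeneous tumor the three values can differ noticeably, because the small ROIs sample only a few locations while the whole volume ROI averages over all tumor tissue.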

Statistical Analysis
Intra- and interobserver absolute agreement of length, width, and tumor ADC value in the three ROI methods was assessed by intraclass correlation coefficients (ICC). A two-way random effect absolute agreement model was used to estimate the interobserver ICC, including 95% confidence intervals (95% CI). A two-way mixed effect absolute agreement model was used for calculating the intraobserver ICC. ICC values were interpreted as poor (below 0.50), fair (0.50-0.75), good (0.76-0.90) and excellent (above 0.90) [21].
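As an illustration of the agreement statistic, the following Python sketch computes a single-measurement, absolute-agreement ICC from a two-way ANOVA decomposition and maps it onto the interpretation bands above. It is not the STATA routine used in the study: whether single or average measures were reported is not stated here, so the single-measure form is an assumption, and the function names are hypothetical. Confidence intervals are omitted.

```python
import numpy as np

def icc_absolute_agreement(ratings):
    """Single-measure, absolute-agreement ICC for an (n_subjects, k_raters) table."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_error = np.sum((x - grand_mean) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Rater variance stays in the denominator, so systematic offsets lower the ICC.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def interpret_icc(icc):
    """Bands as given in the text; values between 0.75 and 0.76 are treated as good here."""
    if icc < 0.50:
        return "poor"
    if icc <= 0.75:
        return "fair"
    if icc <= 0.90:
        return "good"
    return "excellent"

# Hypothetical readings: observer 2 measures exactly 1 unit higher than observer 1.
offset_readings = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc_offset = icc_absolute_agreement(offset_readings)
print(round(icc_offset, 3), interpret_icc(icc_offset))
```

The example readings are perfectly correlated but shifted by a constant, which is why an absolute-agreement ICC falls below 1 where a consistency-type ICC would not.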
Bland-Altman plots were used to assess the intraobserver agreement across the scale on testicular tumor length, width, and ADC measurements by plotting the differences between the two readings against the mean of each test-retest measurement pair.
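The Bland-Altman quantities can be sketched in a few lines of Python. The sketch assumes paired first and second readings and uses percentile-based (non-parametric) limits of agreement; the function name and the example values are hypothetical, and NumPy's default percentile interpolation may differ slightly from the percentile definition of the statistical package used in the study.

```python
import numpy as np

def bland_altman_nonparametric(first_reading, second_reading):
    """Differences, pair means and percentile-based limits of agreement."""
    first = np.asarray(first_reading, dtype=float)
    second = np.asarray(second_reading, dtype=float)
    differences = first - second          # y-axis of the Bland-Altman plot
    pair_means = (first + second) / 2.0   # x-axis of the Bland-Altman plot
    loa_lower, loa_upper = np.percentile(differences, [2.5, 97.5])
    return {
        "pair_means": pair_means,
        "differences": differences,
        "mean_difference": float(differences.mean()),
        "loa_lower": float(loa_lower),
        "loa_upper": float(loa_upper),
    }

# Hypothetical test-retest ADC values (10^-3 mm^2/s) for five tumors.
reading_1 = [0.80, 0.95, 1.10, 0.70, 1.25]
reading_2 = [0.82, 0.90, 1.15, 0.72, 1.20]
result = bland_altman_nonparametric(reading_1, reading_2)
print(result["mean_difference"], result["loa_lower"], result["loa_upper"])
```

Plotting `differences` against `pair_means` with horizontal lines at the mean difference and the two limits gives the usual Bland-Altman display.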
Since data were not normally distributed, limits of agreement (LOA) were estimated non-parametrically using the 2.5th and 97.5th percentiles. Statistical analyses were carried out with STATA statistical software (version 17.0 STATA, College Station, TX, USA).

Results
Figure 1 shows placement of the three ROI methods within a testicular tumor. The patients' median age was 38 years (range 23-79 years). All tumors were confirmed by histopathology. Main MRI scan parameters are presented in Table 1. Descriptive statistics on ADC, tumor length and width assessed by the two observers using the three ROI methods are presented in Tables 2 and 3.
Interobserver agreement for round ROI ADC was excellent in the first reading (ICC = 0.93, 95% CI 0.85-0.97) and fair in the second reading (ICC = 0.58, 95% CI 0.25-0.78). The interobserver agreement in ADC small ROIs was good in the first and second reading with ICC = 0.87 (95% CI 0.72-0.94) and ICC = 0.78 (95% CI 0.54-0.90), respectively. Tumor size was heterogeneous (Table 3).
Intraobserver variability (observer 1) showed good to excellent agreement for whole volume ROI.
Bland-Altman plots on the three ROI methods of ADC measurement showed a better absolute agreement in ROI whole volume (mean difference = −0.02, 95% LOA 0.26 to −0.40) and ROI round (mean difference = −0.01, 95% LOA 0.46 to −0.68) than ROI small (mean difference = −0.04, 95% LOA 0.26 to −0.34). In general, the Bland-Altman plots suggested no systematic bias in terms of varying extent of agreement along the measurement scale.
The mean difference in all the ROI methods was very close to 0 (ranging from −0.01 × 10⁻³ mm²/s in round ROI ADC to 0.07 cm in tumor width), indicating excellent absolute agreement. However, round ROI ADC showed the widest LOA (95% LOA 0.46 to −0.68 × 10⁻³ mm²/s) compared to whole volume ROI and small ROI.

Discussion
Our results show that the reliability of intraobserver measurements of testicular tumor length and width was excellent in both readings. Agreement for the whole volume, round and small ROI methods ranged from good to excellent in the two readings.
The excellent level of interobserver agreement shown in this study may be explained by the similar level of experience of the two observers. However, measurement repeatability is important, regardless of experience. Still, variability may be related to the shape and size of the ROI. Clauser et al. pointed out that using a small ROI placed only in the lowest signal intensity area versus an ROI containing the whole lesion may introduce variability [28]. Furthermore, Inoue et al. found significant differences in endometrial cancer when using various ROI methods of the same size [31]. In rectal cancer, Lambregts et al. found significant variability when using different ROI sizes, with whole-volume ROI showing the best reliability [29]. Zhou et al. investigated benign and malignant thyroid nodules and found ROI methods to affect the ADC values, with whole-volume ROI showing the best reliability [32].
In testicular lesions, Tsili et al. found excellent interobserver agreement regardless of ROI shape and size [14], which was also the case in our study, indicating that the organ placement, appearance, and accessibility may positively affect the variability.
In this study, the observers used three ROI methods: whole volume, round and small ROIs. In general, it is important to know that the choice of ROI type can affect the outcome. Priola et al. investigated whole-volume ADC and found whole tumor volume to provide the most reproducible results in patients with advanced rectal cancer [33]. We also found whole volume to be the most reproducible method.
We used a total of six b-values (0, 100, 300, 600, 900 and 1100 s/mm²), which is considered a high number. The b-value reflects the degree of diffusion weighting applied to the image. The ADC is calculated from images acquired at two or more b-values. Adding more b-values to the DWI increases the scan time. Kuhnke et al. advocate that ADC values can be calculated using two b-values [34]. Many papers choose to include two b-values, as the gain from adding more b-values is relatively small. Still, it is important to consider the number and size of b-values, as the ADC values will be affected by this choice.
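To make the dependence on b-values concrete, the sketch below estimates the ADC under the standard mono-exponential model S(b) = S0 · exp(−b · ADC), once from two b-values and once by a log-linear least-squares fit over all six b-values of the protocol. The model choice, the function names and the synthetic signal values are assumptions for illustration; the ADC maps in this study were generated automatically at the workstation, and the fitting method used by the vendor software is not described here.

```python
import numpy as np

def adc_two_point(s_low, s_high, b_low, b_high):
    """ADC from two signals: ln(S_low / S_high) / (b_high - b_low)."""
    return float(np.log(s_low / s_high) / (b_high - b_low))

def adc_log_linear_fit(b_values, signals):
    """ADC and S0 from a straight-line fit of ln(S) against b."""
    b = np.asarray(b_values, dtype=float)
    log_signal = np.log(np.asarray(signals, dtype=float))
    slope, intercept = np.polyfit(b, log_signal, 1)
    return float(-slope), float(np.exp(intercept))

# b-values of the study protocol (s/mm^2); signal values are synthetic.
b_values = [0, 100, 300, 600, 900, 1100]
true_adc = 1.0e-3   # hypothetical ADC in mm^2/s
s0 = 1000.0         # hypothetical signal at b = 0
signals = s0 * np.exp(-true_adc * np.asarray(b_values, dtype=float))

adc_fit, s0_fit = adc_log_linear_fit(b_values, signals)
adc_2pt = adc_two_point(signals[0], signals[-1], b_values[0], b_values[-1])
print(adc_fit, adc_2pt)
```

On noise-free mono-exponential data both estimates coincide. With real data, noise and non-mono-exponential signal decay make the result depend on which and how many b-values are included.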

Strengths and Limitations
The present study has both strengths and limitations. MRI scans were all performed on the same 1.5-tesla MRI unit, which is considered a strength. It is possible that a 3-tesla unit could have achieved improved image quality and, hence, the observers may have had better visualization of the testicular tissue and potentially better placement of the ROIs within the malignant lesion. However, this seems unlikely, as both 1.5- and 3-tesla units provide high-quality images. The number of patients included (n = 26) and the number of radiologists (n = 2) are limited; however, other studies have included the same number of radiologists and patients. We used three types of ROIs (whole volume, round, and small) assessed by two senior observers; we consider it a strength to include different types of ROIs, as radiologists typically have dissimilar ROI preferences. Furthermore, it is a strength that all patients included had histopathologically confirmed testicular cancer. However, there is currently no generally accepted ADC difference to compare the LOA with, which makes it difficult to compare results. In this study, we found low LOA, e.g., ADC −0.68 × 10⁻³ mm²/s.

Conclusions
This study demonstrates that all three ROI methods can be used when measuring ADC values in malignant testicular tumors; however, we found the whole volume ROI to be the most reproducible method. The intra- and interobserver agreement in tumor length and width was excellent, and agreement for the whole volume, round, and small ROI methods ranged from good to excellent.