Influence of a Deep Learning Noise Reduction on the CT Values, Image Noise and Characterization of Kidney and Ureter Stones

Deep-learning (DL) noise reduction techniques in computed tomography (CT) are expected to reduce the image noise while maintaining the clinically relevant information in reduced dose acquisitions. This study aimed to assess the size, attenuation, and objective image quality of reno-ureteric stones denoised using DL-software in comparison to traditionally reconstructed low-dose abdominal CT-images and evaluated its clinical impact. In this institutional review-board-approved retrospective study, 45 patients with renal and/or ureteral stones were included. All patients had undergone abdominal CT between August 2019 and October 2019. CT-images were reconstructed using the following three methods: filtered back-projection, iterative reconstruction, and PixelShine (DL-software) with both sharp and soft kernels. Stone size, CT attenuation, and objective image quality (signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR)) were evaluated and compared using Bonferroni-corrected Friedman tests. Objective image quality was measured in six regions-of-interest. Stone size ranged between 4.4 × 3.1–4.4 × 3.2 mm (sharp kernel) and 5.1 × 3.8–5.6 × 4.2 mm (soft kernel). Mean attenuation ranged between 704–717 Hounsfield Units (HU) (soft kernel) and 915–1047 HU (sharp kernel). Differences in measured stone sizes were ≤1.3 mm. DL-processed images resulted in significantly higher CNR and SNR values (p < 0.001) by decreasing image noise significantly (p < 0.001). DL-software significantly improved objective image quality while maintaining both correct stone size and CT-attenuation values. Therefore, the clinical impact of stone assessment in denoised image data sets remains unchanged. Through the relevant noise suppression, the software additionally offers the potential to further reduce radiation exposure.


Introduction
In patients with acute flank pain and suspected renal or ureteral stones, non-contrast-enhanced computed tomography (CT) is recommended for diagnosis [1]. CT also allows for the assessment of the composition and size of potential urinary tract stones [2,3]. The attenuation of reno-ureteric stones can provide information on their composition or origin. Furthermore, the size of the stones can influence the treatment strategy [4]. Measurements of stone size and CT-attenuation can be performed in either soft tissue window or bone window, depending on the clinic's standard operating procedure. There is still no general or international standard on how the measurements should be performed. Both methods are described in the literature [5], though they are known to provide different measurements. Usually, soft-tissue window settings tend to overestimate stone size, while bone window settings tend to slightly underestimate stone size [6]. Furthermore, the attenuation is influenced by the size of the measurement area (region of interest) and partial volume effects.
A major concern of CT imaging is the associated radiation exposure and the potential carcinogenic effects [7,8]. Therefore, it is important to reduce the radiation burden of CT while maintaining its diagnostic accuracy. Dose reduction in CT is threefold and relates to (a) hardware optimization, (b) protocol optimization, and (c) post-processing software, i.e., noise-reduction techniques. Noise-reduction techniques, such as iterative reconstruction or deep-learning-based post-processing software, allow for a reduction in the required radiation exposure [8,9]. However, the underlying principle of these techniques is often kept confidential. The potential effect on image quality and quantitative image information therefore needs to be evaluated thoroughly prior to clinical application.
One example of a deep-learning-based software for noise suppression is PixelShine (AlgoMedica, Sunnyvale, CA, USA). PixelShine was already evaluated in terms of objective image quality in different body regions, such as low-dose abdominal CT and whole-body low-dose CT [8,10]. However, the influence on small structures, such as the size and CT-attenuation values of reno-ureteric stones, has not been evaluated so far. Therefore, the aim of this study was to assess whether and to what degree the post-processing software PixelShine influences attenuation measurements and stone size in patients who undergo low-dose abdominal CT and the degree to which the objective image quality is altered by the artificial intelligence (AI)-based technique.

Patient Cohort
This IRB-approved retrospective study included all patients that underwent low-dose abdominal CT in our department with radiological indication of suspected stone disease and were diagnosed with at least one stone in the urinary tract between 16 August 2019 and 13 October 2019.

CT Acquisition
CT examinations were performed on a Somatom Definition Flash CT scanner (Siemens Healthineers, Forchheim, Germany) without contrast-enhancement. The protocol was a dedicated low-dose protocol. In detail, patients were scanned in the prone position and scan coverage included the upper poles of the kidneys to the pelvic floor. Scan parameters were a tube potential of 100 kVp, reference tube-current time product 80 mAs, collimation 128 × 0.6 mm, pitch 0.6, and rotation time 0.5 sec. Dose parameters (CTDIvol and DLP) were documented to calculate the effective dose with the DLP to effective dose conversion factor of k = 0.0151 mSv/(mGycm) [11].
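The conversion from DLP to effective dose described above is a single multiplication, E = k × DLP. The following minimal sketch applies the conversion factor stated in the text to the mean, minimum, and maximum DLP values reported in the Patient Characteristics section. The function and constant names are illustrative and not part of the study.

```python
# Conversion factor stated in the text, in mSv/(mGy*cm).
K_ABDOMEN = 0.0151


def effective_dose_msv(dlp_mgycm, k=K_ABDOMEN):
    """Estimate the effective dose (mSv) from the dose-length product (mGy*cm)."""
    if dlp_mgycm < 0:
        raise ValueError("DLP must be non-negative")
    return k * dlp_mgycm


# DLP values reported in the Patient Characteristics section.
reported_dlp = {"mean": 108.9, "minimum": 47.4, "maximum": 288.5}

for label, dlp in reported_dlp.items():
    print(f"{label}: DLP {dlp} mGycm -> {effective_dose_msv(dlp):.1f} mSv")
```

Rounded to one decimal place, the results reproduce the reported effective doses of 1.6, 0.7, and 4.4 mSv.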
The reconstructions (e) and (f) were obtained by sending the filtered back-projections from (a) and (b), stored in our picture archiving and communication system (PACS, Sectra Medical Systems, Linköping, Sweden), to the PixelShine server. Since PixelShine is a commercial software product, the algorithm is kept confidential. Post-processed images were returned to PACS for further evaluation.

Image Analysis
Image analysis was performed in ImageJ version 1.52p (National Institutes of Health, Bethesda, MD, USA). Circular regions-of-interest (ROIs) with a radius of approximately 10 mm (area 314 mm²) were drawn in the liver, spleen, paravertebral muscle, fat, vertebral body and in the air outside the patient for all six reconstructions for all patients. The position of the ROI was identical in each image set. Measured parameters were ROI area size, CT value, standard deviation (noise), and minimum and maximum CT values.
Subsequently, signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated from the measured CT values and image noise.
Furthermore, one radiologist-in-training with 4 years of experience (B.V.) in reading abdominal CT measured the size (x- and y-diameter) and CT-attenuation of all detected stones in all six image data sets for all patients. Stones were magnified for the measurement in order to obtain exact measurement boundaries. Reading of images was performed in PACS with anonymized image data (patient information and type of reconstruction were unknown to the radiologist). In case of multiple stones, the largest stone was evaluated. The same stone was evaluated in each set of reconstructions.
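The SNR and CNR formulas themselves are not reproduced in this text. The sketch below therefore uses commonly applied definitions as an assumption: SNR as the mean CT value of an ROI divided by its standard deviation, and CNR as the absolute difference between two mean CT values divided by the noise of a reference ROI. Which ROI supplied the noise term in the study is not stated, and all numerical values are hypothetical.

```python
def snr(mean_hu, sd_hu):
    """Signal-to-noise ratio of one ROI: mean CT value / standard deviation.

    Assumed definition; the sign follows the mean (fat and air are negative in HU).
    """
    if sd_hu <= 0:
        raise ValueError("noise (standard deviation) must be positive")
    return mean_hu / sd_hu


def cnr(mean_a_hu, mean_b_hu, sd_reference_hu):
    """Contrast-to-noise ratio: |mean_a - mean_b| / noise of a reference ROI.

    Assumed definition; the choice of reference ROI is an assumption.
    """
    if sd_reference_hu <= 0:
        raise ValueError("noise (standard deviation) must be positive")
    return abs(mean_a_hu - mean_b_hu) / sd_reference_hu


# Hypothetical ROI measurements, for illustration only.
liver_snr = snr(mean_hu=55.0, sd_hu=11.0)
stone_fat_cnr = cnr(mean_a_hu=900.0, mean_b_hu=-100.0, sd_reference_hu=20.0)
print(liver_snr, stone_fat_cnr)
```

Under these definitions, lowering the noise term while leaving the mean CT values unchanged raises both ratios, which is the mechanism by which a denoising step can improve objective image quality.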

Statistical Analysis
Analysis of patient data was performed using Microsoft Excel 2016 (Redmond, WA, USA). Statistical differences in CT values, image noise and size measurements between the six reconstructions were calculated using SPSS version 28 (IBM, Chicago, IL, USA) [13]. For the statistical analyses, Friedman tests with related samples and post hoc Bonferroni-correction were performed. The level of significance was p < 0.05. Figures were built using R [14].
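To illustrate the kind of test described above, the following sketch computes a Friedman statistic for three related samples (for example, one measurement per patient for each of the three reconstruction techniques) and applies a Bonferroni correction to a set of pairwise p-values. It is a plain-Python illustration, not the SPSS procedure used in the study. The data are hypothetical, no tie correction is applied, and the closed-form p-value is valid only for three conditions (two degrees of freedom).

```python
import math


def average_ranks(row):
    """Rank the values of one patient (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_three(rows):
    """Friedman statistic and p-value for exactly three related samples."""
    k = 3
    n = len(rows)
    if n == 0 or any(len(row) != k for row in rows):
        raise ValueError("each row needs exactly three measurements")
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    p = math.exp(-q / 2.0)  # chi-square survival function for 2 degrees of freedom
    return q, p


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical noise values (HU) per patient: technique 1, 2, 3.
noise = [[30.0, 22.0, 13.0], [28.0, 21.0, 12.0], [35.0, 25.0, 15.0], [31.0, 24.0, 14.0]]
q, p = friedman_three(noise)
print(q, p, bonferroni([0.01, 0.02, 0.5]))
```

The omnibus test only indicates that at least one technique differs; the corrected pairwise p-values are what identify which pairs differ.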

Patient Characteristics
In the evaluated study period, 45 patients (32 males, 13 females) were diagnosed with a stone in the urinary tract. All stones were visible in each respective reconstruction. A mean CTDIvol of 2.6 ± 1.1 mGy (range 1.3-7.9 mGy) and a mean DLP of 108.9 ± 42.4 mGycm (range 47.4-288.5 mGycm) were obtained. The average effective dose amounted to 1.6 ± 0.6 mSv (range 0.7-4.4 mSv).

Stone Size
Stone size varied between measurements in soft tissue and sharp reconstructions. Stones in sharp kernel reconstructions presented with a more distinct edge than in soft kernel reconstructions. Differences are depicted in Figure 1.
In a direct comparison between soft tissue and sharp reconstructions, the measured stone sizes were significantly smaller (p < 0.001) when measured in sharp reconstructions, except for the x-diameter of the stones on iterative reconstructed images (p = 1.000). The largest differences in size amounted to 3.3 mm (P30f vs. P70f). Within one reconstruction kernel, differences in stone size were ≤1.3 mm. Results of statistical analyses are provided in Table 1.
Table 1. Results of the stone size measurements and statistical analysis. Size measurements of the three evaluated reconstructions were compared (B vs. I, I vs. P, P vs. B). Differences between measurements with shared superscripts were statistically significant. There were no significant differences between sharp kernel reconstructions.
When applying soft kernels, stone diameters (x and y) measured with iterative reconstruction were smaller than measured on filtered back-projections or PixelShine-processed reconstructions in 87/90 (96.7%) and 83/90 (92.2%) of the cases, respectively. When applying sharp kernels, there were no statistically significant differences in the size measurements between the three reconstruction techniques.

CT-Attenuation Values of Stones
In a direct comparison between all reconstructions with sharp and soft kernels, the measured CT values were significantly higher in reconstructions with sharp kernels (p < 0.001).
Mean CT values of stones in sharp and soft kernels are provided in Table 2, together with their statistical analysis. In general, attenuation measurements were lowest in PixelShine-processed images. Figure 2 visualizes the measured stone attenuations and their distributions. In one tiny stone (size <1 mm × 1 mm), the CT value in the B70f-reconstruction was 751 HU higher than in P70f- and I70f-reconstructions.


Stone Composition
A urinary stone analysis was available for 25/45 patients (55.6%). Most of the analyzed stones were composed of calcium oxalate (15/45, 33.3%). The remaining stones were composed of a mixture of calcium oxalate and carbonate apatite (5/45, 11.1%), uric acid (3/45, 6.7%), a mixture of carbonate apatite and magnesium ammonium phosphate (1/45, 2.2%), or cysteine (1/45, 2.2%). Corresponding CT values are presented in Table 3. Although the CT values of stones composed of uric acid and calcium oxalate are similar for soft kernel reconstructions, differences are larger for sharp kernel reconstructions.
Table 3. Composition of the stones and corresponding attenuation values where results of X-ray diffraction were available. Data provided as median with 25% and 75% quartiles in parentheses, where n > 1 stone was available.

CT Values and Image Noise in Tissues and Air
Differences in CT values between ROIs in liver, spleen, fat, and muscle were comparable in reconstructions with soft kernels, with maximum differences of 0.5% between the reconstructions (see Table 4a). In air and bone ROIs reconstructed with sharp kernels, differences in attenuation were within 5.5%. Still, there were significant differences in mean attenuation. Image noise varied between the reconstructions as follows (see Table 4b): highest noise values were measured in filtered back-projections, whereas lowest noise values were measured with the denoising software PixelShine. Differences between the reconstructions were significant (p < 0.001). In fat, image noise was 57% lower in P30f- than in B30f-reconstructions.

SNR and CNR
The highest SNR values, independent of the kernel type and tissue, were determined for PixelShine-reconstructed images. Due to higher noise values when applying sharp kernels, SNR values were considerably lower than with soft kernels, where the image noise was suppressed. See Figure 3 for the SNR measured in the liver and bone.
Figure 3. Signal to noise ratio calculated in the liver (a) and bone (b). Differences between the three reconstruction techniques were significant (*** p < 0.001).
Highest CNR values measured in stones and fat tissue were determined for PixelShine due to lowest noise levels both in soft tissue and in bone reconstructions (CNR 29.3 for B30f compared to 67.1 for P30f). Differences between the kernels were statistically significant (p < 0.001) (see Figure 4).
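The reported CNR gain can be cross-checked against the reported noise reduction with simple arithmetic. The sketch below assumes that the contrast between stone and fat is unchanged by denoising and that CNR scales inversely with the noise measured in fat; neither assumption is stated explicitly in the text. The input numbers are those reported above (57% lower noise in fat for P30f versus B30f, CNR 29.3 versus 67.1).

```python
# Reported values for the soft kernel (B30f vs. P30f).
noise_reduction_fat = 0.57  # fraction by which noise in fat is lower in P30f
cnr_b30f = 29.3
cnr_p30f = 67.1

# If contrast is unchanged and CNR ~ 1/noise (assumption), the expected gain is:
expected_gain = 1.0 / (1.0 - noise_reduction_fat)

# Gain actually reported:
observed_gain = cnr_p30f / cnr_b30f

print(f"expected CNR gain: {expected_gain:.2f}")
print(f"observed CNR gain: {observed_gain:.2f}")
```

Both factors are close to 2.3, so under these assumptions the CNR improvement is consistent with being driven by noise reduction rather than by a change in contrast.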

Discussion
Reno-ureteric stones were evaluable with traditional filtered back-projection, iterative reconstruction, and the novel deep-learning method PixelShine. Sharp kernel reconstructions resulted in smaller stone size measurements and significantly higher CT-attenuation values than soft kernel reconstructions. The differentiation between stone compositions was improved using sharp kernel reconstructions. Using AI-based methods offers increased signal-to-noise and contrast-to-noise ratios with the potential to further reduce radiation exposure to the patient.
There is still no gold standard for how to measure the attenuation and size of stones, e.g., with a defined reconstruction method or type of kernel. Especially with the increasing number of scanner-integrated or vendor-independent post-processing techniques, the comparability between the different methods may be difficult.
The software PixelShine has already been evaluated in a few technical and clinical research investigations, such as in ultra-low dose abdominal, pelvic, or midfacial trauma CT [8,10,15–19]. The publications show that the deep learning technique provides diagnostic images even at radiation exposures of 30% of the initial dose, regardless of the scanner type or reconstruction technique [8]. This is achieved by vigorously reducing image noise, resulting in increased signal-to-noise and contrast-to-noise ratios [8,10,15,18]. However, until now, no evaluation of (a) the detection and (b) the characterization of reno-ureteric stones in PixelShine-post-processed low-dose computed tomography of the abdomen has been published. The influence of deep learning techniques on the detection, image quality, size and attenuation of stones has already been studied for other vendors (AiCE, Canon Medical Systems, and TrueFidelity™, GE Healthcare) (see Table 5) [20–22]. Comparable to our results, the aforementioned techniques reduce noise, possibly allowing for a reduction in the radiation dose in the future. Unfortunately, smaller stones <3 mm could possibly be missed when reducing the radiation exposure, so the techniques should still be considered with caution.
In general, reno-ureteric stones were evaluable in the post-processed reconstructions in this study, both with a sharp kernel and a soft kernel. No stone was missed. However, CT values and stone sizes differed between the three evaluated techniques. When measuring the CT values in the six evaluated image data sets per patient, the highest CT numbers were determined using sharp kernel and filtered back-projection as reconstruction methods. Iterative reconstruction and PixelShine resulted in lower CT-attenuation values. The measured stone diameters explain the difference. Diameters in reconstructions with sharp kernels were approximately 1 mm smaller compared to the diameters in soft kernel reconstructions. Presuming that the stone appears smaller because its margins are better distinguishable when employing sharp kernels, the CT value is more likely to be measured in the centre region of the stone. In this case, the periphery of the stone, which might already contain soft tissue instead of stone material, is excluded. Therefore, fewer partial volume artifacts, which would otherwise decrease the average CT value, influence the measurement of attenuation. We therefore recommend the diagnosis and measurement of reno-ureteric stones using sharp reconstructions and bone window settings (see Figure 5).
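The reasoning above, that smoother edges enlarge the apparent stone and pull surrounding tissue into the attenuation measurement, can be illustrated with a one-dimensional toy model. The sketch is purely illustrative: a moving-average filter stands in for kernel smoothing (real reconstruction kernels are not box filters), a fixed HU threshold stands in for the visual stone boundary, and all numbers (stone width, HU values, filter widths, threshold) are hypothetical.

```python
def box_blur(profile, width):
    """Moving average over `width` samples (odd number), edges clamped."""
    half = width // 2
    last = len(profile) - 1
    blurred = []
    for i in range(len(profile)):
        window = [profile[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def measure(profile, threshold_hu, spacing_mm):
    """Apparent size (mm) and mean HU of all samples above the threshold."""
    inside = [value for value in profile if value > threshold_hu]
    if not inside:
        raise ValueError("no sample exceeds the threshold")
    return len(inside) * spacing_mm, sum(inside) / len(inside)


SPACING_MM = 0.1     # sample spacing along the profile (hypothetical)
STONE_HU = 1000.0    # hypothetical stone attenuation
TISSUE_HU = 40.0     # hypothetical soft-tissue attenuation
THRESHOLD_HU = 200.0 # hypothetical boundary criterion

# 20 mm profile with a 4 mm stone in the middle.
truth = [STONE_HU if 80 <= i < 120 else TISSUE_HU for i in range(200)]

size_sharp, mean_sharp = measure(box_blur(truth, 5), THRESHOLD_HU, SPACING_MM)
size_soft, mean_soft = measure(box_blur(truth, 25), THRESHOLD_HU, SPACING_MM)

print(f"narrow filter: size {size_sharp:.1f} mm, mean {mean_sharp:.0f} HU")
print(f"wide filter:   size {size_soft:.1f} mm, mean {mean_soft:.0f} HU")
```

In this toy model the wider filter yields a larger apparent size and a lower mean attenuation for the same underlying object, which mirrors the direction of the differences observed between soft and sharp kernels.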
Lidén et al. evaluated the impact of image post-processing parameters on the size of renal stones and further assessed the inter- and intra-reader variability of stone size measurements [23]. They noticed considerable differences between reconstructions of different slice thickness and increment, window settings (bone vs. soft tissue), and furthermore, an intra-reader variability of ±0.5 mm and an inter-reader variability of ±1.3 mm. They concluded that a difference in size estimation of a structure of one or two millimeters is of no significance in most clinical situations [23]. In our study, differences in size were at most 3.3 mm between soft and sharp kernel reconstructions and 1.3 mm within one reconstruction kernel. These differences are unlikely to affect the treatment of patients.
Although differences in CT values of the stones were visible within one kernel, the CT value only influenced the immediate treatment if the stone passed without medical intervention. Unfortunately, single-source CT cannot provide information on the stone composition that is as detailed as that of a dual-energy CT [24–26]. It is possible to obtain a rough differentiation between different compositions by means of the CT value; however, there is no strict cut-off value for each composition (e.g., uric acid, cysteine, calcium) [27]. In general, stones consisting of uric acid have CT values below 600 HU, which was supported by our study [2,3,28,29]. Stones consisting of calcium oxalate tend to have higher CT values (>1000 HU), again supported by our study [2,3,28,29]. We were able to see that the differentiation between stones composed of uric acid and stones composed of calcium oxalate is clearer with sharp kernel reconstructions. In the case of medical intervention due to an immobile stone, this differentiation helps to initiate medical therapy for uric acid stones, whereas calcium oxalate stones require interventional therapy.
In this study, the range of CT values within one reconstruction kernel could result in a misinterpretation of the composition since differences in CT values could be as large as 750 HU; however, this is only the situation in rare cases and with very small stones. The referring doctor usually only receives information regarding the occurrence, size and position of a stone rather than the composition. The direct treatment is usually based on the clinical presentation and patient symptoms rather than the stone composition. Furthermore, stones are frequently calcified, which might hide the actual stone composition [30].
Unfortunately, many articles do not describe the reconstruction kernel they use and only detail the window settings (e.g., bone or soft tissue window). However, the reconstruction kernel changes the sharpness and image noise of structures [31]. Typically, soft tissue reconstruction kernels tend to smoothen tissue edges. In contrast, sharp kernels are often edge enhancing, creating sharper and more distinct tissue edges. Sharp kernels influence the measured size of small structures and, consequently, also the CT value of measured stones, since the partial volume effects are reduced. Nevertheless, even when using the same kernel, CT values and stone sizes are influenced by employing different window settings (bone vs. soft tissue windows) [23,28,29,32].
Both soft and sharp reconstruction kernels exhibit advantages and disadvantages in the detection and measurement of stones. The anatomical classification is easier with soft kernels and soft tissue window, as the ureters and their anatomical course can be better distinguished here. However, the detectability of stones is usually higher when using sharp kernels and bone window, since the high attenuation of stones in these settings is more prominent compared to soft tissue settings. Umbach et al. and Danilovic et al. proposed using bone window and small slice thickness to determine the stone size due to a higher accuracy compared to measurements in soft tissue window settings [32,33]. However, they did not provide information on the employed reconstruction filter. Moreover, Eisner et al. also proposed using bone window settings and magnifying the image to increase the accuracy of stone measurements [34].
Independent of the stone characteristics, this study also examined the objective image quality between the six different reconstructions. PixelShine's image noise reduction ability was already proven in different studies [8,10,15,18]. This study demonstrated the highest signal-to-noise and contrast-to-noise ratios with reconstructions employing PixelShine, both for soft and sharp kernels. A noise reduction in the region around the stones might improve the detection of very small stones. However, in this study, all stones could be detected in all reconstructions, independent of the noise level.
Some limitations need to be mentioned. The evaluation of stones in terms of material composition is best performed by means of dual-energy CT [35]. However, this technique is not available in every center and is often associated with higher radiation exposure [24,35,36]. In our institute, the detection of reno-ureteric stones was performed with a single tube potential. Furthermore, the true stone size was not available for any of the patients, and composition was only available for some patients. Hence, we could not evaluate stone composition and stone size thoroughly. However, this study did not aim at determining the exact stone composition. Furthermore, other reconstruction methods, such as ADMIRE (Siemens Healthcare, Forchheim, Germany), might have resulted in different characterizations. Additionally, since the software PixelShine is a commercial product, the exact algorithm is kept confidential, so the software could only be assessed through its results.

Conclusions
The size and CT value of reno-ureteric stones differ between sharp and soft reconstruction kernels. Within one type of kernel, post-processing methods such as PixelShine influence the measurements to a certain degree; however, this does not impede the clinical decision. Currently, the software is not part of our standard reconstruction process and is only employed for research purposes. Therefore, PixelShine images were reconstructed in a subsequent step. However, it is possible to integrate the reconstruction process into the general workflow. A noise-reduction algorithm that can decrease the radiation exposure of these patients is of great advantage and is highly recommended.
Yet, a standard measuring procedure within one institute is required, since the differences in size and CT value between soft and sharp kernel reconstructions were statistically significant and might therefore be relevant to further treatment. In general, we recommend using PixelShine together with sharp kernel reconstructions, read in a bone window, to increase the differentiability of stone compositions.