Diagnostic Accuracy of Diffusion-Weighted MRI for Differentiating Benign and Malignant Thyroid Nodules: Systematic Review and Meta-Analysis

Noto, Benjamin; Bobe, Carolin; Brandt, Jonas; Raum, Heiner N.; Nacul, Nabila Gala; Riemann, Burkhard; Helfen, Anne

doi:10.3390/cancers17162677

by Benjamin Noto ¹,²,*, Carolin Bobe ¹, Jonas Brandt ¹, Heiner N. Raum ¹, Nabila Gala Nacul ¹, Burkhard Riemann ² and Anne Helfen ¹

¹ Clinic for Radiology, University of Münster and University Hospital Münster, 48149 Münster, Germany

² Department of Nuclear Medicine, University of Münster and University Hospital Münster, 48149 Münster, Germany

* Author to whom correspondence should be addressed.

Cancers 2025, 17(16), 2677; https://doi.org/10.3390/cancers17162677

Submission received: 2 July 2025 / Revised: 12 August 2025 / Accepted: 14 August 2025 / Published: 18 August 2025

(This article belongs to the Section Systematic Review or Meta-Analysis in Cancer Research)

Simple Summary

Thyroid nodules are highly prevalent, but most are benign. The limited specificity of current diagnostic approaches leads to unnecessary interventions and overtreatment. Diffusion-weighted MRI (DWI), which quantifies tissue microstructure using the apparent diffusion coefficient (ADC), is a promising non-invasive imaging modality for thyroid nodule classification. This meta-analysis of 46 studies demonstrates that DWI offers high diagnostic accuracy, with pooled sensitivity and specificity of 0.84 and 0.88, respectively. Evaluation of acquisition techniques and imaging parameters identified reduced field-of-view DWI and the mono-exponential ADC model as particularly promising for clinical application. However, the analysis also revealed a lack of technical standardization—especially regarding b-value selection—as a major hurdle to clinical translation. To fully realize the clinical potential of DWI, coordinated efforts toward standardizing acquisition protocols are needed.

Abstract

Background: Thyroid nodules are highly prevalent, affecting up to 75% of the population, yet most are benign. The limited specificity of ultrasound-based workup leads to substantial overdiagnosis and overtreatment, underscoring the need for improved imaging-based classification. Diffusion-weighted MRI (DWI), quantified via the apparent diffusion coefficient (ADC), has emerged as a promising imaging biomarker. This meta-analysis updates pooled diagnostic performance metrics and systematically evaluates which DWI acquisition techniques, imaging parameters, and combinations with other MRI modalities are most promising for clinical translation. Methods: PubMed, Web of Science, Scopus, and ProQuest were systematically searched. Pooled sensitivity, specificity, and area under the curve (AUC) were calculated using bivariate random-effects models. The effects of b-value, magnetic field strength, echo time, and diffusion model on diagnostic accuracy and ADC values were examined through subgroup and meta-regression analyses. Results: Forty-six studies (3003 nodules) were included. Pooled sensitivity and specificity were 0.84 (95% CI: 0.81–0.86) and 0.88 (95% CI: 0.85–0.90), with an AUC of 0.912. Intravoxel incoherent motion and diffusion kurtosis imaging showed no added value over the mono-exponential model. For the mono-exponential model, a negative association between b-values and reported ADCs was observed, whereas no association was found between b-values and diagnostic accuracy. Magnetic field strength and echo time did not affect ADCs. Combining DWI with morphological imaging showed the potential to further enhance diagnostic performance. Conclusions: DWI holds strong potential to improve the diagnostic workup of thyroid nodules. Technical standardization, particularly of key acquisition parameters, should be pursued to enable clinical implementation.

Keywords: thyroid nodule; magnetic resonance imaging; diffusion magnetic resonance imaging; thyroid gland; meta-analysis

1. Introduction

Thyroid nodules are a common clinical finding, with a prevalence of up to 75% in the general population, though only about 5–15% are malignant [1,2,3]. The primary aim of thyroid nodule workup is to accurately differentiate malignant from benign nodules. Ultrasound remains the first-line imaging modality in euthyroid patients, and several Thyroid Imaging Reporting and Data System (TIRADS) frameworks, based on B-mode ultrasound characteristics, have been developed to improve diagnostic consistency [4]. However, specificity remains a challenge: recent meta-analyses report specificities as low as 50% for the ATA Guidelines and around 70% for ACR-TIRADS [5]. Even the specificity of fine needle aspiration cytology (FNA) is limited. In a recent meta-analysis including 16,597 patients from 36 studies, FNA sensitivity was around 86%, but specificity was only 71% [6]. Given the high prevalence of benign nodules, this leads to substantial overdiagnosis and unnecessary interventions [7], underlining the need for improved diagnostic strategies.
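To make the link between limited specificity and overtreatment concrete, the following minimal sketch applies Bayes' rule to the FNA figures quoted above (sensitivity 0.86, specificity 0.71). The 10% malignancy prevalence is an assumed value chosen from within the 5–15% range cited above, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence                  # true positives per unit population
    fn = (1.0 - sensitivity) * prevalence          # false negatives
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives
    tn = specificity * (1.0 - prevalence)          # true negatives
    return tp / (tp + fp), tn / (tn + fn)


# FNA estimates from the cited meta-analysis; prevalence of 0.10 is an assumption
ppv, npv = predictive_values(0.86, 0.71, 0.10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.25, NPV = 0.98
```

Under these assumptions only about one in four positive results corresponds to a malignant nodule, which illustrates why a low-prevalence setting amplifies the cost of imperfect specificity.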

Diffusion-weighted MRI (DWI) and its quantitative parameter, the apparent diffusion coefficient (ADC), offer a non-invasive means to probe tissue microstructure, with lower ADC values observed in malignant thyroid nodules due to higher cellular density and reduced extracellular space [8,9]. Thus, ADC may serve either as a standalone imaging biomarker or as a complementary measure integrated into existing frameworks like TIRADS.
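For readers less familiar with how the ADC is obtained, the sketch below shows the two-point calculation under the mono-exponential signal model S(b) = S0 · exp(−b · ADC). The function name and the signal values are hypothetical and serve only to illustrate the formula; clinical ADC maps are computed voxel by voxel by the scanner or post-processing software.

```python
import math


def adc_two_point(s_low, s_high, b_low, b_high):
    """ADC in mm^2/s from signals measured at two b-values (in s/mm^2).

    Assumes mono-exponential decay: S(b) = S0 * exp(-b * ADC).
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical region-of-interest signals at b = 0 and b = 1000 s/mm^2
adc = adc_two_point(s_low=1000.0, s_high=340.0, b_low=0.0, b_high=1000.0)
print(f"ADC = {adc * 1e3:.2f} x 10^-3 mm^2/s")
```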

Previous reviews and meta-analyses have demonstrated a high diagnostic performance of DWI for thyroid nodule classification [10,11,12]. However, these analyses have not accounted for the considerable methodological heterogeneity across studies, particularly in terms of technical parameters such as magnetic field strength and b-value selection. At least for the mono-exponential model, ADC values decrease with increasing b-values, making it essential to consider b-value dependence when comparing or applying thresholds. Moreover, previous reviews have not evaluated the diagnostic value of advanced diffusion models—such as intravoxel incoherent motion (IVIM) or diffusion kurtosis imaging (DKI)—or advanced acquisition techniques like reduced field-of-view (rFOV) DWI. This omission limits the interpretability and clinical translatability of the findings.
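One way to see why a mono-exponential ADC depends on the b-value is to assume that the true signal follows the standard diffusion kurtosis expansion. This is an illustrative assumption rather than a model fitted in this review; D denotes the diffusion coefficient and K the dimensionless kurtosis.

```latex
\ln\frac{S(b)}{S_{0}} = -b\,D + \tfrac{1}{6}\,b^{2} D^{2} K
\quad\Longrightarrow\quad
\mathrm{ADC}(0, b) = -\frac{1}{b}\,\ln\frac{S(b)}{S_{0}} = D - \tfrac{1}{6}\,b\,D^{2} K
```

For tissue with K > 0, the two-point ADC computed from b = 0 and b therefore falls as b increases, so thresholds derived at one maximum b-value cannot be transferred directly to another.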

This meta-analysis builds on prior reviews by updating pooled diagnostic performance metrics while also introducing a critical additional dimension: a structured investigation into the impact of imaging parameters and techniques. By doing so, we seek to clarify the path toward standardizing DWI as a clinically useful and reproducible tool for thyroid nodule classification, and to identify a set of imaging parameters best suited for future research and clinical implementation.

2. Materials and Methods

The meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy (PRISMA-DTA) guidelines [13]. A systematic literature search was conducted in PubMed, Web of Science, Scopus, and ProQuest. No date limit was set. The search parameters are listed in Table A1. After removal of duplicates, a two-phase study selection process was used. First, titles and abstracts of retrieved articles were screened for relevance. Subsequently, the full texts of potentially eligible articles were reviewed in detail. This selection process was independently performed by two reviewers (A.H. and B.N.) on two separate occasions. Disagreements were resolved through consensus.

Study inclusion was based on the following PICOS criteria: Population (P): adults with thyroid nodules; Intervention (I): diffusion-weighted magnetic resonance imaging quantified by the apparent diffusion coefficient; Comparison (C): benign or malignant status of the nodule according to post-surgical histologic workup or fine needle aspiration cytology; Outcomes (O): sensitivity, specificity, and mean ADC of benign and malignant nodules; Study Design (S): retrospective or prospective studies. Exclusion criteria were as follows: (1) duplicate articles; (2) abstracts without full texts, editorial comments, letters, case reports, reviews, and meta-analyses; (3) non-English full-text articles; (4) studies with incomplete or ambiguous data regarding sensitivity, specificity, or number of benign and malignant nodules; (5) studies with a highest b-value below 500 s/mm², which falls short of the minimum highest b-value recommended by the Quantitative Imaging Biomarkers Alliance (QIBA) of the Radiological Society of North America for reliable ADC quantification [14].

Data were extracted by B.N. and independently cross-validated by J.B. and N.G.N. Discrepancies were resolved by consensus. The QUADAS-2 tool tailored to this review was applied by C.B. and independently cross-validated by B.N. to assess the quality of included studies and applicability concerns [15]. Disagreements were resolved by consensus discussion.

This review was not registered.

Statistical Analysis

All analyses were conducted in R (version 4.1.0, R Foundation for Statistical Computing, Vienna, Austria). From the reported sensitivity, specificity, and number of malignant and benign cases, $2 \times 2$ contingency tables (true positive, false positive, true negative, false negative) were reconstructed.
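The reconstruction step can be sketched as follows. This is a minimal illustration, not the authors' R code: the function name is hypothetical, and rounding to the nearest integer is an assumption about how fractional counts were handled. The example uses the DWI figures that this review quotes for Taha Ali (2017): 14 malignant and 28 benign nodules, sensitivity 85.7%, specificity 89.2%.

```python
def reconstruct_2x2(sensitivity, specificity, n_malignant, n_benign):
    """Rebuild a 2x2 table from reported accuracy and group sizes.

    Counts are rounded to the nearest integer (an assumption), because
    published sensitivities and specificities are themselves rounded.
    """
    tp = round(sensitivity * n_malignant)
    fn = n_malignant - tp
    tn = round(specificity * n_benign)
    fp = n_benign - tn
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


table = reconstruct_2x2(0.857, 0.892, n_malignant=14, n_benign=28)
print(table)  # {'TP': 12, 'FN': 2, 'FP': 3, 'TN': 25}
```

Recomputing 12/14 and 25/28 from the reconstructed counts returns values close to the reported percentages, which is a useful sanity check for each study.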

Separate univariate random-effects meta-analyses for sensitivity and specificity were performed using the metaprop() function from the meta package [16], with logit transformation and inverse-variance pooling. Forest plots were generated to visualize study-level and summary estimates.
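The idea behind logit-transformed, inverse-variance random-effects pooling can be sketched in a few lines. This is not the implementation used by the meta package: the DerSimonian-Laird estimator for the between-study variance and the 0.5 continuity correction are simplifying assumptions, and the input counts are hypothetical.

```python
import math


def pool_logit_random_effects(events, totals):
    """Pool proportions on the logit scale with inverse-variance weights.

    Returns (pooled proportion, lower 95% limit, upper 95% limit, tau^2).
    tau^2 is estimated with the DerSimonian-Laird method (an assumption).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        if e == 0 or e == n:          # continuity correction for boundary cases
            e, n = e + 0.5, n + 1.0
        y.append(math.log(e / (n - e)))        # logit of the proportion
        v.append(1.0 / e + 1.0 / (n - e))      # approximate variance of the logit
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return expit(mu), expit(mu - 1.96 * se), expit(mu + 1.96 * se), tau2


# Hypothetical true-positive counts and malignant-nodule totals from three studies
pooled, lower, upper, tau2 = pool_logit_random_effects([45, 30, 52], [50, 40, 60])
print(f"pooled sensitivity = {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Pooling on the logit scale keeps the back-transformed confidence limits inside the interval from 0 to 1, which is the main reason for using the transformation with proportions close to 1.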

To jointly model sensitivity and specificity while accounting for their correlation, a bivariate random-effects meta-analysis via the reitsma() function from the mada package was applied [17]. Summary estimates and 95% confidence intervals were extracted, and an sROC curve was plotted, including individual study points and the pooled summary estimate with confidence region.

Subgroup analyses were conducted for studies using mono-exponential vs. IVIM-based DWI models.

To examine the effects of technical parameters (maximum b-value, magnetic field strength, echo time) on reported mean ADC values of studies using the mono-exponential ADC model, a weighted mixed-effects meta-regression was performed using the rma() function from the metafor package. The relative contribution of each predictor was quantified via changes in explained heterogeneity ($R^{2}$) when variables were removed from the full model.
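For orientation, a common definition of explained heterogeneity in mixed-effects meta-regression, and to our understanding the one reported by metafor, compares the between-study variance of a model without moderators with the residual between-study variance of the model with moderators. The notation below is introduced here for illustration and does not appear elsewhere in this review.

```latex
R^{2} = \max\!\left(0,\ \frac{\hat{\tau}^{2}_{0} - \hat{\tau}^{2}_{M}}{\hat{\tau}^{2}_{0}}\right),
\qquad
\Delta R^{2}_{j} = R^{2}_{\text{full}} - R^{2}_{-j}
```

Here $\hat{\tau}^{2}_{0}$ is the estimated heterogeneity without moderators, $\hat{\tau}^{2}_{M}$ the residual heterogeneity with moderators, and $R^{2}_{-j}$ the value obtained after removing predictor $j$ from the full model.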

3. Results

3.1. Literature Search

Figure 1 illustrates the flow of studies through the literature search and screening process in accordance with PRISMA guidelines and the predefined inclusion criteria.

3.2. Risk of Bias and Applicability Assessment

Risk of bias and concerns regarding applicability were assessed using a modified version of the QUADAS-2 tool, as detailed in the appendix (Figure A1). The tool was adapted for this review by adding a tailored signaling question to the risk of bias domain: Is the reference standard likely to correctly classify the target condition? This was answered as follows: Yes, if only histopathology was used as the reference standard; No, if the reference standard was clearly unsuitable; and Unclear, if fine needle aspiration cytology (FNAC) was used as the reference standard in some or all cases. This modification reflects the limited diagnostic accuracy of FNAC compared to histopathology. Results are summarized in Figure 2 and visualized in Figure 3.

High risk of bias was rare. One study was rated as having a high risk of bias in the patient selection domain due to inappropriate exclusion criteria [18], and one study was deemed to have a high risk of bias in the reference standard domain due to the use of clinical follow-up as a reference in some cases [19]. Twelve out of forty-six studies (26.1%) used FNAC as the reference standard in some or all cases and were therefore rated as having unclear risk of bias in the reference standard domain. Apart from these specific concerns, most unclear ratings stemmed from insufficient reporting.

3.3. Study Characteristics

Forty-six studies with 3003 nodules (1746 benign and 1257 malignant) were included [8,9,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61]. Concerning the standard of reference, 33 (71.7%) studies used histology as the sole reference, 8 (17.4%) used histology and FNA as references, 4 studies (8.7%) used FNA as the sole standard of reference, and one study (2.2%) used a combination of FNA and follow-up. Concerning magnetic field strength, 23 (50.0%) studies used MRI machines with a field strength of 1.5 Tesla, 21 (45.7%) studies with 3.0 Tesla, and two studies used a machine with 1.0 Tesla (4.3%). Countries of origin of the first authors were as follows: China (21), Egypt (7), Turkey (5), Japan (3), Pakistan (2), Austria (2), India (1), Iran (2), UK (1), Vietnam (1), and South Korea (1). Five studies applied IVIM [21,23,26,30,56]. For studies applying IVIM the diagnostic performance of D (pure diffusion) was assessed. For studies applying a mono-exponential diffusion model, the diagnostic performance of the ADC value based on the highest b-value was assessed.

3.4. Meta-Analysis of Diagnostic Performance

All 46 studies demonstrated the diagnostic utility of DWI in differentiating benign from malignant lesions. The bivariate random-effects meta-analysis, using the Reitsma model, yielded a pooled sensitivity of 0.84 (95% CI: 0.81–0.86) and a pooled specificity of 0.88 (95% CI: 0.85–0.90). Forest plots and results of univariable analysis of sensitivity and specificity are presented in Figure 4 and Figure 5. The area under the summary receiver operating characteristic (sROC) curve was 0.91 (Figure 6). The normalized partial AUC, restricted to the observed false positive rates, was 0.85. Between-study heterogeneity, as quantified by $I^{2}$ statistics using the Zhou and Dendukuri approach, was 3.1% [62].

Subgroup analyses showed no statistically significant difference in the pooled sensitivity and specificity of studies using IVIM-based DWI compared to those using mono-exponential DWI (sensitivity: 0.87 vs. 0.84, $p = 0.537$; specificity: 0.90 vs. 0.88, $p = 0.919$).

Among studies using mono-exponential DWI, a bivariate meta-regression was conducted to assess whether the maximum b-value was associated with diagnostic performance. The maximum b-values reported in these studies ranged from 500 to 2000 s/mm², with 1000 s/mm² being most commonly used. No significant association was found between maximum b-value and either sensitivity ($p = 0.806$) or specificity ($p = 0.397$), indicating that variation in the b-value across studies did not meaningfully influence test accuracy.

3.5. Studies Reporting Lower ADC Values for Benign than for Malignant Nodules

In contrast to the 42 studies reporting lower ADC values for malignant than for benign thyroid nodules, four studies (Schueller-Weidekamm et al., 2009 and 2010; Le Tuan Linh et al., 2019; and Chung et al., 2020) reported the opposite, namely lower mean ADC values for benign nodules than for malignant ones [57,58,59,60]. Still, all four studies reported positively on the diagnostic capacity of DWI. A possible explanation for the findings of Chung et al. may lie in the specific composition of their study cohort, which included only follicular neoplasms. In their analysis, follicular carcinomas demonstrated a higher mean ADC value ($0.783 \times 10^{-3}$ mm²/s) than follicular adenomas ($0.581 \times 10^{-3}$ mm²/s) [60]. This is in contrast to findings reported by Abdel Razek and colleagues (2008), who observed a mean ADC of $0.77 \times 10^{-3}$ mm²/s for follicular carcinoma and a considerably higher value of $1.7 \times 10^{-3}$ mm²/s for follicular adenoma [48]. The reasons for the reversed trend reported in the studies by the groups of Schueller-Weidekamm and Le Tuan Linh remain unclear.

3.6. Influence of Highest b-Value, Magnetic Field Strength, and Echo Time on Reported ADC Values

A weighted mixed-effects meta-regression was performed to examine the influence of imaging parameters (maximum b-value, magnetic field strength, echo time) and nodule type (benign vs. malignant) on reported mean ADC values. For this meta-regression, studies using IVIM (5 studies) were excluded. The studies by Abdel-Rahman et al. (2016), Abd-Alhamid et al. (2016), Saeed et al. (2023), and Wang et al. (2024) were also excluded because they did not report standard deviations of mean ADC values, which are necessary for meta-regression calculations [22,25,38,49]. Furthermore, the studies by Schueller-Weidekamm et al. (2009 and 2010), Le Tuan Linh et al. (2019), and Chung et al. (2020), which reported lower ADC values for benign nodules than for malignant ones, in contrast to all other studies, were excluded [57,58,59,60]. Hence, 33 studies were included. The impact of nodule type, b-value, magnetic field strength, and echo time (TE) on ADC values is visualized in Figure 7.

For the included studies, the pooled estimate of the mean ADC was $1.76 \times 10^{-3}$ mm²/s (95% CI: 1.68–1.85) for benign nodules and $1.08 \times 10^{-3}$ mm²/s (95% CI: 0.98–1.18) for malignant nodules. The meta-regression model explained 71.1% of the residual heterogeneity in reported ADC values ($R^{2} = 0.71$). Higher maximum b-values were significantly associated with lower mean ADC values ($\beta = -0.0003 \times 10^{-3}\ \mathrm{mm^{4}/s^{2}}$, $p < 0.001$), while magnetic field strength ($\beta = 0.0606 \times 10^{-3}\ \mathrm{mm^{2}/(s \cdot T)}$, $p = 0.208$) and echo time ($\beta = -4.5620 \times 10^{-3}\ \mathrm{mm^{2}/s^{2}}$, $p = 0.081$) showed no statistically significant association with ADC values. Malignant nodules had substantially lower mean ADC values than benign nodules ($\beta = -0.6801 \times 10^{-3}\ \mathrm{mm^{2}/s}$, $p < 0.001$). Analysis of relative variable importance showed that nodule type accounted for 89.2% of the explained variance, followed by maximum b-value (9.7%), magnetic field strength (0.6%) and echo time (0.6%).

3.7. Studies Reporting on Multimodal Analysis

Several studies evaluated the added diagnostic value of combining diffusion-weighted imaging (DWI) with other MRI-based techniques for thyroid nodule classification:

T1 Mapping: Yuan et al. (2023) assessed the feasibility of combining T1 mapping with ADC measurements. They reported an AUC of 0.837 for ADC and 0.845 for T1 mapping, with a combined AUC of 0.956. Notably, the acquisition time for T1 mapping was only 36 s [32].
Morphologic Parameters: Tang et al. (2023) integrated DWI metrics with morphological features commonly assessed in TIRADS. A combined model incorporating mean diffusivity from diffusion kurtosis imaging, maximum diameter, and margin irregularity achieved an AUC of 0.996, with a sensitivity of 95.1% and specificity of 100.0% [29].
Wang et al. (2018) explored a multivariable model combining DWI with post-contrast and morphologic parameters. Independent predictors included ADC, irregular shape, a ring sign in the delayed phase, and cystic degeneration. An ADC-only model achieved an AUC of 0.95, while a combined model reached an AUC of 0.99 [45].
Amide Proton Transfer-Weighted Imaging (APT): Li et al. (2020) examined APT imaging in combination with DWI (44 nodules; 22 malignant and 22 benign). APT alone yielded an AUC of 0.835. While ADC alone achieved an AUC of 0.95, the addition of APT did not further improve diagnostic performance (combined AUC: 0.95) [18].
Dynamic Contrast-Enhanced Imaging (DCE): Song et al. (2020) evaluated the incremental value of DCE imaging alongside IVIM-derived diffusion parameters. Pharmacokinetic modeling of DCE parameters produced modest diagnostic performance (AUC = 0.668 for $K^{trans}$ , and 0.682 for $K_{ep}$ ). In contrast, the IVIM-derived diffusion coefficient D alone achieved an AUC of 0.969. A combined model ( $D + K^{trans} + K_{ep}$ ) improved the AUC to 0.991, although it was not significantly different from the AUC of D alone [21].
Similarly, Sasaki et al. (2013) proposed a stepwise diagnostic approach combining ADC and DCE time intensity curves. While the combined model showed an accuracy of 91%, ADC alone achieved the same accuracy, suggesting limited added value of DCE [51].
Spectroscopy: Three studies, all conducted in Egypt, evaluated the diagnostic performance of MR spectroscopy (MRS) in addition to DWI [46,50,52]. All reported improved sensitivity and specificity when combining ADC with MRS, although none described how the combined models were constructed, and none tested whether the improvements were statistically significant. Two of the studies originated from the same institution [50,52]. El-Hariri et al. (2012) reported sensitivities and specificities of 94% and 95% for DWI, 94.7% and 89.2% for MRS, and 96% and 100% for the combined approach, respectively [52]. Elshafey et al. (2014) reported 96% sensitivity and 85% specificity for DWI, 96% and 92% for MRS, and 100% and 93% for the combination [46]. Taha Ali (2017), in a cohort of 42 nodules (28 benign, 14 malignant), used MRS to assess the presence of a choline peak. Reported sensitivity and specificity were 100% and 89.3% for MRS, 85.7% and 89.2% for DWI, and 100% and 96% for the combined method [50].

3.8. Studies Investigating Advanced DWI Techniques

Liling Jiang et al. (2022) compared reduced field-of-view (rFOV) DWI with simultaneous multislice readout-segmentation of long variable echo-trains DWI (SMS-RESOLVE-DWI). rFOV-DWI demonstrated superior image sharpness, fewer artifacts, and overall better image quality compared to SMS-RESOLVE-DWI [27].

Xiuyu Wang et al. (2023) evaluated multiplexed sensitivity-encoding diffusion-weighted imaging (MUSE-DWI) against conventional DWI. MUSE-DWI provided improved image quality, clearer thyroid contour delineation, and greater lesion conspicuity. The reported acquisition times were similar: 3 min 40 s for MUSE-DWI and 3 min 27 s for conventional DWI [25].

3.9. Studies Investigating Advanced Diffusion Models

Xian Zhu et al. (2022) assessed the diagnostic performance of several advanced diffusion models, including bi-exponential, stretched exponential, and diffusion kurtosis imaging (DKI), in comparison to the conventional mono-exponential DWI model. No improvement in diagnostic accuracy was observed for any of the advanced models over the standard approach [33].

Similarly, two additional studies directly compared DKI with the mono-exponential model and reported no significant differences in diagnostic performance [9,29]. However, DKI required substantially longer acquisition times—6 min 52 s versus 1 min 52 s for conventional DWI [29].

IVIM has also been compared with the mono-exponential model in two studies, with no difference in diagnostic accuracy between the IVIM-derived true diffusion coefficient (D) and ADC values obtained from mono-exponential DWI [23,56].

Another study compared the diagnostic performance of IVIM and DKI-derived parameters. Again, no difference in diagnostic performance was found [26].

Five studies examined the IVIM-derived perfusion fraction (f) in benign versus malignant nodules, yielding inconsistent findings. Four studies reported significantly higher perfusion fraction values in benign nodules [23,26,30,56], while one study found higher values in malignant nodules [21].

Studies Investigating the Influence of b-Value Choice on Diagnostic Performance

Three studies investigated the influence of b-value selection on the diagnostic performance of ADCs calculated using mono-exponential DWI models [28,33,44]. ADC values were derived from both two-point models—specifically using the combinations (b0 and b800), (b0 and b1000), and (b0 and b2000)—as well as from more complex multi-point models incorporating up to 13 b-values: 0, 30, 50, 80, 100, 150, 200, 400, 600, 800, 1000, 1500, and 2000 s/mm².

Both Xian Zhu et al. (2022) and Qingjun Wang et al. (2019) reported no significant differences in diagnostic performance between ADCs calculated from different b-value combinations. In contrast, in a previous study, Qingjun Wang et al. (2018) found significantly lower AUCs for ADC (b0–800) compared to ADC (b0–2000) and ADC (b0–800–2000), with p-values of 0.047 and 0.041, respectively [44].

With regard to signal intensity ratios (SIR), Liling Jiang et al. (2024) described differences in signal intensity between benign and malignant nodules at a b-value of 1500 s/mm², which were visible to the naked eye [53]. Similarly, Qingjun Wang et al. (2019) reported high diagnostic accuracy for SIR measured at b = 2000 s/mm² (SIR_b2000), achieving an AUC of 0.975 for differentiating malignant from benign thyroid micronodules.

4. Discussion

This systematic review and meta-analysis aimed to evaluate the diagnostic performance of DWI in differentiating benign from malignant thyroid nodules and to examine the influence of technical imaging parameters. Across 46 included studies comprising 3003 nodules, DWI demonstrated a high diagnostic accuracy, with a pooled sensitivity of 0.84 (95% CI: 0.81–0.86), a pooled specificity of 0.88 (95% CI: 0.85–0.90), and an area under the summary receiver operating characteristic (sROC) curve of 0.91.
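For clinical interpretation, the pooled point estimates can be converted into likelihood ratios and a diagnostic odds ratio (DOR). These derived figures are not reported in the included analyses; they are computed here from the point estimates alone and ignore the uncertainty and correlation of sensitivity and specificity.

```latex
\mathrm{LR}^{+} = \frac{0.84}{1 - 0.88} = 7.0, \qquad
\mathrm{LR}^{-} = \frac{1 - 0.84}{0.88} \approx 0.18, \qquad
\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}} = 38.5
```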

Three previous meta-analyses have investigated the diagnostic performance of DWI for thyroid nodule classification, all reporting high diagnostic accuracy [10,11,12]. Compared to our study—which includes 17 more studies than the largest of these earlier reviews—each of the prior analyses found similarly high pooled sensitivities, with confidence intervals ranging from 0.85 to 0.94. Our slightly lower sensitivity estimate may be explained by our use of a bivariate random-effects model that accounts for the correlation between sensitivity and specificity, offering a more rigorous statistical approach. Pooled specificities in the earlier meta-analyses ranged from 0.83 to 0.98 within their respective confidence intervals. Our pooled specificity of 0.88 (95% CI: 0.85–0.90) is consistent with these findings. Studies included in previous reviews but excluded from our review are listed in Table A2. Overall, our finding of a high diagnostic performance of DWI for thyroid nodule classification is consistent with earlier meta-analyses.

The $I^{2}$ statistic from our bivariate meta-analysis was 3.1%, indicating low between-study heterogeneity and a high degree of consistency in diagnostic performance across studies, despite differences in patient populations and technical parameters. This suggests robustness and potential generalizability of our findings.

4.1. Beyond Previous Meta-Analyses: Analysis of Technical Parameters

In contrast to earlier meta-analyses, our study not only updates pooled diagnostic performance estimates but also includes a detailed review and meta-regression analysis on the influence of technical imaging parameters. We further assessed studies that combined DWI with other MRI-based techniques.

Among studies applying the mono-exponential model to calculate ADC values, our meta-regression revealed a statistically significant negative association between the highest b-value and the reported ADC values. This relationship is consistent with theoretical expectations [63]. These findings emphasize the importance of standardizing DWI acquisition protocols to enhance inter-study comparability, support the development of robust ADC thresholds, and ultimately enable clinical translation. The diagnostic performance of the ADC was not influenced by b-value choice. Still, some studies reported that high b-values (e.g., ≥1500 s/mm²) assisted in visually detecting malignant nodules on trace DWI images. That said, the relevance of this advantage for potential future clinical implementations may be limited. In our opinion, patients referred for DWI due to thyroid nodules would almost certainly have already undergone thyroid ultrasound. The role of MRI in this context would likely not be initial detection but rather further characterization of nodules that appear suspicious on ultrasound. Given the drawbacks of high b-value imaging (i.e., the need for advanced hardware and longer acquisition times, since more signal averages are required), acquiring high b-value images does not seem essential. If facilitation of malignancy detection is desired, calculated high b-value DWI images could be an option.
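The idea of calculated high b-value images can be sketched as follows: once S0 and the ADC are known from a lower b-value acquisition, the mono-exponential model extrapolates the signal to any target b-value. The function names are hypothetical, and the assumption of equal S0 in benign and malignant tissue is a simplification made only to isolate the diffusion contrast; the ADC values are the pooled means from Section 3.6.

```python
import math


def computed_dwi(s0, adc, b_target):
    """Signal at b_target (s/mm^2) extrapolated from S0 and ADC (mm^2/s)."""
    return s0 * math.exp(-b_target * adc)


def lesion_contrast(adc_benign, adc_malignant, b_target, s0=1.0):
    """Malignant-to-benign signal ratio at b_target, assuming equal S0."""
    return computed_dwi(s0, adc_malignant, b_target) / computed_dwi(s0, adc_benign, b_target)


ADC_BENIGN = 1.76e-3     # pooled mean ADC of benign nodules, mm^2/s
ADC_MALIGNANT = 1.08e-3  # pooled mean ADC of malignant nodules, mm^2/s

for b in (800, 1500):
    ratio = lesion_contrast(ADC_BENIGN, ADC_MALIGNANT, b)
    print(f"b = {b} s/mm^2: malignant/benign signal ratio = {ratio:.2f}")
```

Under these assumptions the signal ratio rises from roughly 1.7 at b = 800 s/mm² to roughly 2.8 at b = 1500 s/mm², which is the visual advantage attributed to high b-values, obtained here without acquiring the high b-value image.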

Several studies also evaluated advanced diffusion models, primarily IVIM and DKI. However, none demonstrated superior diagnostic performance over the standard mono-exponential model. Given their extended acquisition times and hardware demands, these advanced techniques currently offer no clear advantage for clinical use. Our analysis also found no evidence that magnetic field strength affects reported ADC values, suggesting it plays a negligible role in ADC variability and can be deprioritized in future standardization efforts.

Regarding DWI technique, our review suggests that reduced field-of-view DWI is currently the most suitable option, as it provides improved image quality and lesion conspicuity, clinically acceptable acquisition times, and minimizes artifacts from surrounding anatomical structures.

In conclusion, our findings support the following DWI configuration as the most promising for future research and potential routine implementation: reduced field-of-view DWI using mono-exponential ADC calculation with a maximum b-value in the range of 500–1000 s/mm².

4.2. Combination with Other Imaging Techniques

Several studies included in this review have investigated the feasibility of combining DWI with other MRI imaging techniques. In our opinion, the most obvious option is the combination with morphological image properties, following the established parameters of the ultrasound-based TIRADS systems. Two studies included in this review report near-perfect AUC for models combining morphological imaging parameters with DWI [29,45].

Further research on prediction models that combine DWI with morphological MRI parameters is therefore warranted.

4.3. Limitations

This study is limited by including only studies published in English, resulting in the exclusion of 13 potentially relevant studies. Also, some studies included in this meta-analysis were published prior to the recognition of the pathological entity known as noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP). As a result, nodules that would now be classified as benign may have been previously categorized as malignant, potentially affecting the diagnostic accuracy estimates [64].

As previously discussed, the most probable clinical role of MRI lies in the further characterization of thyroid nodules that appear suspicious on ultrasound. However, only eight studies included in this review explicitly restricted inclusion to nodules indicated for FNA, either according to the American Thyroid Association Guidelines or a TIRADS classification system, limiting the applicability of this meta-analysis to the most likely real-world clinical use case [9,23,28,30,38,44,60,61]. Although all eight studies reported favorable diagnostic performance for DWI, further research specifically targeting nodules recommended for FNA under the current TIRADS appears warranted. Moreover, only one study focused exclusively on nodules classified as Bethesda category III or IV following FNA [60]. Notably, in many studies, prior needle biopsy was an explicit exclusion criterion [25,27,32,53]. Therefore, future investigations are needed to determine whether DWI remains feasible and diagnostically useful after needle biopsy, or whether post-bioptic changes diminish its value.

While our meta-analysis focused on diagnostic performance, it is important to acknowledge that resource availability, cost, and practicality of implementing MRI as a second-line diagnostic tool must also be considered. The feasibility of such use is likely to vary significantly across healthcare systems, depending on factors such as local infrastructure, reimbursement policies, and individual patient insurance coverage. These context-dependent factors should be taken into account when considering the broader clinical applicability of our findings.

4.4. Conclusions

In conclusion, the apparent diffusion coefficient represents a highly promising quantitative imaging biomarker for thyroid nodule classification. Provided that technical parameters can be standardized, DWI has the potential to advance toward routine clinical application. Moreover, future research exploring the combined diagnostic value of DWI with morphological imaging features appears to hold considerable potential to further improve diagnostic accuracy.

Author Contributions

Conceptualization, B.N., B.R. and A.H.; methodology, B.N.; software, B.N.; validation, B.N., A.H., N.G.N. and C.B.; formal analysis, B.N.; investigation, B.N., A.H., J.B., N.G.N. and C.B.; data curation, B.N.; writing—original draft preparation, B.N.; writing—review and editing, B.N., A.H., J.B. and H.N.R.; visualization, B.N.; supervision, B.N.; project administration, B.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ACR	American College of Radiology
ADC	Apparent diffusion coefficient
ATA	American Thyroid Association
AUC	Area under the receiver operating characteristic curve
cDWI	Conventional single-shot spin echo echo-planar imaging
DCE	Dynamic contrast-enhanced MR perfusion
DKI	Diffusion kurtosis imaging
DWI	Diffusion-weighted MRI
EPI DWI	Echo-Planar Imaging DWI
FNA	Fine needle aspiration (cytology)
IVIM	Intravoxel incoherent motion
MRI	Magnetic resonance imaging
MRS	Magnetic Resonance Spectroscopy
MUSE-DWI	Multiplexed sensitivity-encoding diffusion-weighted imaging
NIFTP	Noninvasive follicular thyroid neoplasm with papillary-like nuclear features
PRISMA-DTA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy
QIBA	Quantitative Imaging Biomarkers Alliance
rFOV	Reduced field-of-view DWI
SIR	Signal intensity ratios
SMS-RESOLVE-DWI	Simultaneous Multi-Slice Readout Segmentation of Long Variable Echo-Trains DWI
sROC	Summary receiver operating characteristic
TE	Echo time
TIRADS	Thyroid Imaging Reporting and Data System

Appendix A

Table A1. Search strategies for each database.

Database	Search Query
PubMed	((“Thyroid nodule”[Mesh]) OR (Thyroid[Title/Abstract])) AND (“Diffusion Magnetic Resonance Imaging”[Mesh] OR (diffusion-weighted[Title/Abstract]) OR (DWI[Title/Abstract]))
Web of Science	((TS=(DWI)) OR TS=(diffusion-weighted)) AND TS=(Thyroid nodule)
Scopus	TITLE-ABS-KEY (thyroid) AND TITLE-ABS-KEY (nodule) AND TITLE-ABS-KEY (diffusion) AND (LIMIT-TO (DOCTYPE, “ar”) OR LIMIT-TO (DOCTYPE, “re”) OR LIMIT-TO (DOCTYPE, “cp”))
ProQuest	noft(thyroid) AND noft(diffusion) AND noft(Magnetic resonance imaging) AND noft(nodule)

Figure A1. Items of the adapted QUADAS-2 tool and risk of bias and level of concerns about applicability.

Table A2. Comparison with previous reviews.

	This Meta-Analysis	Lian-Ming Wu et al. (2014) [10]	Lihua Chen et al. (2016) [11]	Meyer et al. (2021) [12]
Number of included studies	46	7	15	24
Number of Nodules	3003	358	765	1714
Studies excluded from our review and reasons		[65]: Only available in Chinese. [66]: Could not be found based on the citation in the review.	[66]: Could not be found based on the citation in the review. [65,67,68]: Only available in Chinese.	[69]: Also included lymph nodes besides thyroid nodules. [70,71]: Insufficient maximum b-value. [71,72,73,74]: Reported only ADC values but not sensitivity and specificity. [75,76]: Included only malignant nodules. [77]: Unclear standard of reference.
Pooled sensitivity	0.84 (95% CI: 0.81–0.86)	0.91 (95% CI: 0.87–0.94)	0.90 (95% CI: 0.85–0.93)	-
Pooled specificity	0.88 (95% CI: 0.85–0.90)	0.93 (95% CI: 0.86–0.96)	0.95 (95% CI: 0.88–0.98)	-
AUC	0.91	0.94	0.95	-
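
As a reading aid for Table A2, pooled sensitivity and specificity can be converted into likelihood ratios using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below applies these to the pooled point estimates of this meta-analysis. Likelihood ratios were not reported in the original analysis, the function name `likelihood_ratios` is an illustrative choice, and a calculation from point estimates ignores both the confidence intervals and the correlation between sensitivity and specificity that the bivariate model accounts for.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Pooled point estimates of this meta-analysis (Table A2).
lr_pos, lr_neg = likelihood_ratios(0.84, 0.88)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")  # LR+ = 7.00, LR- = 0.18
```

These values are a rough orientation only and should not be read as pooled estimates in their own right.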

References

1. Fisher, S.B.; Perrier, N.D. The incidental thyroid nodule. CA A Cancer J. Clin. 2018, 68, 97–105.
2. Guth, S.; Theune, U.; Aberle, J.; Galach, A.; Bamberger, C. Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur. J. Clin. Investig. 2009, 39, 699–706.
3. Haugen, B.R. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: What is new and what has changed? Cancer 2017, 123, 372–381.
4. Sollmann, L.; Eveslage, M.; Danzer, M.F.; Schäfers, M.; Heitplatz, B.; Conrad, E.; Hescheler, D.; Riemann, B.; Noto, B. Additional Value of Pertechnetate Scintigraphy to American College of Radiology Thyroid Imaging Reporting and Data Systems and European Thyroid Imaging Reporting and Data Systems for Thyroid Nodule Classification in Euthyroid Patients. Cancers 2024, 16, 4184.
5. Li, W.; Wang, Y.; Wen, J.; Zhang, L.; Sun, Y. Diagnostic performance of American College of Radiology TI-RADS: A systematic review and meta-analysis. Am. J. Roentgenol. 2021, 216, 38–47.
6. Hsiao, V.; Massoud, E.; Jensen, C.; Zhang, Y.; Hanlon, B.M.; Hitchcock, M.; Arroyo, N.; Chiu, A.S.; Fernandes-Taylor, S.; Alagoz, O.; et al. Diagnostic accuracy of fine-needle biopsy in the detection of thyroid malignancy: A systematic review and meta-analysis. JAMA Surg. 2022, 157, 1105–1113.
7. Papaleontiou, M.; Haymart, M.R. Too much of a good thing? A cautionary tale of thyroid cancer overdiagnosis and overtreatment. Thyroid 2020, 30, 651.
8. Shi, H.F.; Feng, Q.; Qiang, J.W.; Li, R.K.; Wang, L.; Yu, J.P. Utility of diffusion-weighted imaging in differentiating malignant from benign thyroid nodules with magnetic resonance imaging and pathologic correlation. J. Comput. Assist. Tomogr. 2013, 37, 505–510.
9. Shi, R.y.; Yao, Q.y.; Zhou, Q.y.; Lu, Q.; Suo, S.t.; Chen, J.; Zheng, W.j.; Dai, Y.m.; Wu, L.m.; Xu, J.r. Preliminary study of diffusion kurtosis imaging in thyroid nodules and its histopathologic correlation. Eur. Radiol. 2017, 27, 4710–4720.
10. Wu, L.M.; Chen, X.X.; Li, Y.L.; Hua, J.; Chen, J.; Hu, J.; Xu, J.R. On the utility of quantitative diffusion-weighted MR imaging as a tool in differentiation between malignant and benign thyroid nodules. Acad. Radiol. 2014, 21, 355–363.
11. Chen, L.; Xu, J.; Bao, J.; Huang, X.; Hu, X.; Xia, Y.; Wang, J. Diffusion-weighted MRI in differentiating malignant from benign thyroid nodules: A meta-analysis. BMJ Open 2016, 6, e008413.
12. Meyer, H.J.; Wienke, A.; Surov, A. Discrimination between malignant and benign thyroid tumors by diffusion-weighted imaging–A systematic review and meta analysis. Magn. Reson. Imaging 2021, 84, 41–57.
13. Salameh, J.P.; Bossuyt, P.M.; McGrath, T.A.; Thombs, B.D.; Hyde, C.J.; Macaskill, P.; Deeks, J.J.; Leeflang, M.; Korevaar, D.A.; Whiting, P.; et al. Preferred reporting items for systematic review and meta-analysis of diagnostic test accuracy studies (PRISMA-DTA): Explanation, elaboration, and checklist. BMJ 2020, 370, m2632.
14. RSNA Quantitative Imaging Biomarkers Alliance: QIBA Profile: Diffusion-Weighted Magnetic Resonance Imaging (DWI). 2019. Available online: https://qibawiki.rsna.org/images/6/63/QIBA_DWIProfile_Consensus_Dec2019_Final.pdf (accessed on 13 August 2025).
15. Whiting, P.F.; Rutjes, A.W.; Westwood, M.E.; Mallett, S.; Deeks, J.J.; Reitsma, J.B.; Leeflang, M.M.; Sterne, J.A.; Bossuyt, P.M.; QUADAS-2 Group. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann. Intern. Med. 2011, 155, 529–536.
16. Balduzzi, S.; Rücker, G.; Schwarzer, G. How to perform a meta-analysis with R: A practical tutorial. Evidence-Based Mental Health 2019, 22, 153–160.
17. Doebler, P.; Sousa-Pinto, B. mada: Meta-Analysis of Diagnostic Accuracy. In R Package, Version 0.5.11; The R Foundation: Vienna, Austria, 2022.
18. Li, G.; Jiang, G.; Mei, Y.; Gao, P.; Liu, R.; Jiang, M.; Zhao, Y.; Li, M.; Wu, Y.; Fu, S.; et al. Applying Amide Proton Transfer-Weighted Imaging (APTWI) to Distinguish Papillary Thyroid Carcinomas and Predominantly Solid Adenomatous Nodules: Comparison With Diffusion-Weighted Imaging. Front. Oncol. 2020, 10, 918.
19. Ilica, A.T.; Artaş, H.; Ayan, A.; Günal, A.; Emer, O.; Kilbas, Z.; Meric, C.; Atasoy, M.M.; Uzuner, O. Initial experience of 3 Tesla apparent diffusion coefficient values in differentiating benign and malignant thyroid nodules. J. Magn. Reson. Imaging 2013, 37, 1077–1082.
20. Khizer, A.T.; Raza, S.; Slehria, A.U.R. Diffusion-weighted MR imaging and ADC mapping in differentiating benign from malignant thyroid nodules. J. Coll. Physicians Surg.-JCPSP 2015, 25, 785–788.
21. Song, M.; Yue, Y.; Guo, J.; Zuo, L.; Peng, H.; Chan, Q.; Jin, Y. Quantitative analyses of the correlation between dynamic contrast-enhanced MRI and intravoxel incoherent motion DWI in thyroid nodules. Am. J. Transl. Res. 2020, 12, 3984–3992.
22. Saeed, Z.A.; Arooj, S.; Rashid, N.; Masood, M.; Asghar, A. Role of Diffusion-Weighted MRI (DWI) in Differentiating Between Benign and Malignant Nodules of Thyroid Taking Histopathology as Gold Standard. Ann. King Edw. Med. Univ. Lahore Pak. 2023, 29, 23–28.
23. Ai, Z.D.; Yu, X.P.; Hou, L.; Liu, J.; Lu, Q.; Zhang, Z.Y.; Chen, J. The value of intravoxel incoherent motion diffusion-weighted magnetic resonance imaging in the differential diagnosis of thyroid carcinoma and nodular goiter. Int. J. Clin. Exp. Med. 2022, 15, 61–67.
24. Bhargava, K.; Narula, H.; Mittal, A.; Sharma, D.; Goel, K.; Nijhawan, D. Role of Diffusion-Weighted Magnetic Resonance Imaging in Differentiating Benign From Malignant Thyroid Lesions: A Prospective Study. J. Clin. Diagn. Res. 2020, 14, TC1–TC4.
25. Wang, X.; Wang, P.; Zhang, H.; Wang, X.; Shi, J.; Hu, S. Multiplexed sensitivity-encoding versus single-shot echo-planar imaging: A comparative study for diffusion-weighted imaging of the thyroid lesions. Jpn. J. Radiol. 2024, 42, 268–275.
26. Jiang, L.; Chen, J.; Huang, H.; Wu, J.; Zhang, J.; Lan, X.; Liu, D.; Zhang, J. Comparison of the Differential Diagnostic Performance of Intravoxel Incoherent Motion Imaging and Diffusion Kurtosis Imaging in Malignant and Benign Thyroid Nodules. Front. Oncol. 2022, 12, 895972.
27. Jiang, L.; Zhang, J.; Chen, J.; Li, Q.; Liu, W.; Wu, J.; Liu, D.; Zhang, J. rFOV-DWI and SMS-RESLOVE-DWI in patients with thyroid nodules: Comparison of image quality and apparent diffusion coefficient measurements. Magn. Reson. Imaging 2022, 91, 62–68.
28. Wang, Q.; Guo, Y.; Zhang, J.; Ning, H.; Zhang, X.; Lu, Y.; Shi, Q. Diagnostic value of high b-value (2000s/mm2) DWI for thyroid micronodules. Medicine 2019, 98, e14298.
29. Tang, Q.; Liu, X.; Jiang, Q.; Zhu, L.; Zhang, J.; Wu, P.Y.; Jiang, Y.; Zhou, J. Unenhanced magnetic resonance imaging of papillary thyroid carcinoma with emphasis on diffusion kurtosis imaging. Quant. Imaging Med. Surg. 2023, 13, 2697–2707.
30. Song, M.H.; Jin, Y.F.; Guo, J.S.; Zuo, L.; Xie, H.; Shi, K.; Yue, Y.L. Application of whole-lesion intravoxel incoherent motion analysis using iZOOM DWI to differentiate malignant from benign thyroid nodules. Acta Radiol. 2019, 60, 1127–1134.
31. Latif, M.A.; Rakhawy, M.M.E.; Saleh, M.F. Diagnostic accuracy of B-mode ultrasound, ultrasound elastography and diffusion weighted MRI in differentiation of thyroid nodules (prospective study). Egypt. J. Radiol. Nucl. Med. 2021, 52, 256.
32. Yuan, L.; Zhao, P.; Lin, X.; Yu, T.; Diao, R.; Ning, G. T1 mapping and reduced field-of-view DWI at 3.0 T MRI for differentiation of thyroid papillary carcinoma from nodular goiter. Clin. Physiol. Funct. Imaging 2023, 43, 137–145.
33. Zhu, X.; Wang, J.; Wang, Y.C.; Zhu, Z.F.; Tang, J.; Wen, X.W.; Fang, Y.; Han, J. Quantitative differentiation of malignant and benign thyroid nodules with multi-parameter diffusion-weighted imaging. World J. Clin. Cases 2022, 10, 8587–8598.
34. Kong, W.; Yue, X.; Ren, J.; Tao, X. A comparative analysis of diffusion-weighted imaging and ultrasound in thyroid nodules. BMC Med. Imaging 2019, 19, 92.
35. Zheng, T.; Xie, X.; Ni, Z.; Tang, L.; Wu, P.Y.; Song, B. Quantitative evaluation of diffusion-weighted MRI for differentiating benign and malignant thyroid nodules larger than 4 cm. BMC Med. Imaging 2023, 23, 212.
36. Bayraktaroglu, S.; Öztürk, P.K.; Ceylan, N.; Makay, O.; Icöz, G.; Ertan, Y. 3 tesla apparent diffusion coefficient (ADC) values of thyroid nodules: Prediction of benignancy and malignancy. Iran. J. Radiol. 2019, 16, e63974.
37. Özer, B.M.; Pabuscu, Y.; Tarhan, S.; Oval, G.Y.; Aydede, H.; Demireli, P.; Karadeniz, T. Effectiveness of diffusion-weighted magnetic resonance imaging (DW-MRI) in the differentiation of thyroid nodules. Thyroid Res. 2024, 17, 24.
38. Abd-Alhamid, T.; Khafagy, A.G.; Sersy, H.A.; Askora, A.; Rabie, T.M.; Taha, M.S.; Ebrahim, A.; Sayed, H.M.E. Certainty of pretreatment apparent diffusion coefficient in the characterization of thyroid gland pathologies. Egypt. J. Otolaryngol. 2017, 33, 495–501.
39. Nakahira, M.; Saito, N.; Murata, S.I.; Sugasawa, M.; Shimamura, Y.; Morita, K.; Takajyo, F.; Omura, G.; Matsumura, S. Quantitative diffusion-weighted magnetic resonance imaging as a powerful adjunct to fine needle aspiration cytology for assessment of thyroid nodules. Am. J. Otolaryngol.–Head Neck Med. Surg. 2012, 33, 408–416.
40. Hao, Y.; Pan, C.; Chen, W.W.; Li, T.; Zhu, W.Z.; Qi, J.P. Differentiation between malignant and benign thyroid nodules and stratification of papillary thyroid cancer with aggressive histological features: Whole-lesion diffusion-weighted imaging histogram analysis. J. Magn. Reson. Imaging 2016, 44, 1546–1555.
41. Noda, Y.; Kanematsu, M.; Goshima, S.; Kondo, H.; Watanabe, H.; Kawada, H.; Bae, K.T. MRI of the thyroid for differential diagnosis of benign thyroid nodules and papillary carcinomas. Am. J. Roentgenol. 2015, 204, W332–W335.
42. Aghaghazvini, L.; Sharifian, H.; Yazdani, N.; Hosseiny, M.; Kooraki, S.; Pirouzi, P.; Ghadiri, A.; Shakiba, M.; Kooraki, S. Differentiation between benign and malignant thyroid nodules using diffusion-weighted imaging, a 3-T MRI study. Indian J. Radiol. Imaging 2018, 28, 460–464.
43. Brown, A.M.; Nagala, S.; McLean, M.A.; Lu, Y.; Scoffings, D.; Apte, A.; Gonen, M.; Stambuk, H.E.; Shaha, A.R.; Tuttle, R.M.; et al. Multi-institutional validation of a novel textural analysis tool for preoperative stratification of suspected thyroid tumors on diffusion-weighted MRI. Magn. Reson. Med. 2016, 75, 1708–1716.
44. Wang, Q.; Guo, Y.; Zhang, J.; Shi, L.; Ning, H.; Zhang, X.; Lu, Y. Utility of high b-value (2000 sec/mm2) DWI with RESOLVE in differentiating papillary thyroid carcinomas and papillary thyroid microcarcinomas from benign thyroid nodules. PLoS ONE 2018, 13, e0200270.
45. Wang, H.; Wei, R.; Liu, W.; Chen, Y.; Song, B. Diagnostic efficacy of multiple MRI parameters in differentiating benign vs. malignant thyroid nodules. BMC Med. Imaging 2018, 18, 50.
46. Elshafey, R.; Elattar, A.; Mlees, M.; Esheba, N. Role of quantitative diffusion-weighted MRI and 1H MR spectroscopy in distinguishing between benign and malignant thyroid nodules. Egypt. J. Radiol. Nucl. Med. 2014, 45, 89–96.
47. Mutlu, H.; Sivrioglu, A.K.; Sonmez, G.; Velioglu, M.; Sildiroglu, H.O.; Basekim, C.C.; Kizilkaya, E. Role of apparent diffusion coefficient values and diffusion-weighted magnetic resonance imaging in differentiation between benign and malignant thyroid nodules. Clin. Imaging 2012, 36, 1–7.
48. Razek, A.A.K.A.; Sadek, A.G.; Kombar, O.R.; Elmahdy, T.E.; Nada, N. Role of apparent diffusion coefficient values in differentiation between malignant and benign solitary thyroid nodules. Am. J. Neuroradiol. 2008, 29, 563–568.
49. Abdel-Rahman, H.M.; Abowarda, M.H.; Abdel-Aal, S.M. Diffusion-weighted MRI and apparent diffusion coefficient in differentiation of benign from malignant solitary thyroid nodule. Egypt. J. Radiol. Nucl. Med. 2016, 47, 1385–1390.
50. Ali, T.F.T. Solitary thyroid nodule: Diagnostic yield of combined diffusion weighted imaging and magnetic resonance spectroscopy. Egypt. J. Radiol. Nucl. Med. 2017, 48, 593–601.
51. Sasaki, M.; Sumi, M.; Kaneko, K.I.; Ishimaru, K.; Takahashi, H.; Nakamura, T. Multiparametric MR imaging for differentiating between benign and malignant thyroid nodules: Initial experience in 23 patients. J. Magn. Reson. Imaging 2013, 38, 64–71.
52. El-Hariri, M.A.; Gouhar, G.K.; Said, N.S.; Riad, M.M. Role of diffusion-weighted imaging with ADC mapping and in vivo 1H-MR spectroscopy in thyroid nodules. Egypt. J. Radiol. Nucl. Med. 2012, 43, 183–192.
53. Jiang, L.; Chen, J.; Tan, Y.; Wu, J.; Zhang, J.; Liu, D.; Zhang, J. Comparative analysis of the image quality and diagnostic performance of the zooming technique with diffusion-weighted imaging using different b-values for thyroid papillary carcinomas and benign nodules. Front. Oncol. 2024, 14, 1241776.
54. Wu, Y.; Yue, X.; Shen, W.; Du, Y.; Yuan, Y.; Tao, X.; Tang, C.Y. Diagnostic value of diffusion-weighted MR imaging in thyroid disease: Application in differentiating benign from malignant disease. BMC Med. Imaging 2013, 13, 1–7.
55. Ekinci, O.; Boluk, S.E.; Eren, T.; Ozemir, I.A.; Boluk, S.; Salmaslioglu, A.; Leblebici, M.; Alimoglu, O. Utilidad de la resonancia magnética ponderada por difusión cervical en la detección del cáncer tiroideo. Cirugía Española 2018, 96, 620–626.
56. Song, M.; Yue, Y.; Jin, Y.; Guo, J.; Zuo, L.; Peng, H.; Chan, Q. Intravoxel incoherent motion and ADC measurements for differentiating benign from malignant thyroid nodules: Utilizing the most repeatable region of interest delineation at 3.0 T. Cancer Imaging 2020, 20, 1–9.
57. Schueller-Weidekamm, C.; Kaserer, K.; Schueller, G.; Scheuba, C.; Ringl, H.; Weber, M.; Czerny, C.; Herneth, A. Can quantitative diffusion-weighted MR imaging differentiate benign and malignant cold thyroid nodules? Initial results in 25 patients. Am. J. Neuroradiol. 2009, 30, 417–422.
58. Schueller-Weidekamm, C.; Schueller, G.; Kaserer, K.; Scheuba, C.; Ringl, H.; Weber, M.; Czerny, C.; Herneth, A.M. Diagnostic value of sonography, ultrasound-guided fine-needle aspiration cytology, and diffusion-weighted MRI in the characterization of cold thyroid nodules. Eur. J. Radiol. 2010, 73, 538–544.
59. Linh, L.T.; Cuong, N.N.; Hung, T.V.; Hieu, N.V.; Lenh, B.V.; Hue, N.D.; Pham, V.H.; Nga, V.T.; Chu, D.T. Value of diffusion weighted MRI with quantitative ADC map in diagnosis of malignant thyroid disease. Diagnostics 2019, 9, 129.
60. Chung, S.; Lee, J.; Yoon, R.; Sung, T.Y.; Song, D.; Pfeuffer, J.; Kim, I. Differentiation of follicular carcinomas from adenomas using histogram obtained from diffusion-weighted MRI. Clin. Radiol. 2020, 75, 878.e13–878.e19.
61. Shayganfar, A.; Azin, N.; Hashemi, P.; Ghanei, A.M.; Hajiahmadi, S. Diagnostic accuracy of multiple MRI parameters in dealing with incidental thyroid nodules. SN Compr. Clin. Med. 2022, 4, 228.
62. Zhou, Y.; Dendukuri, N. Statistics for quantifying heterogeneity in univariate and bivariate meta-analyses of binary data: The case of meta-analyses of diagnostic accuracy. Stat. Med. 2014, 33, 2701–2717.
63. Iima, M.; Partridge, S.C.; Le Bihan, D. Six DWI questions you always wanted to know but were afraid to ask: Clinical relevance for breast diffusion MRI. Eur. Radiol. 2020, 30, 2561–2570.
64. Nikiforov, Y.E.; Seethala, R.R.; Tallini, G.; Baloch, Z.W.; Basolo, F.; Thompson, L.D.; Barletta, J.A.; Wenig, B.M.; Al Ghuzlan, A.; Kakudo, K.; et al. Nomenclature revision for encapsulated follicular variant of papillary thyroid carcinoma: A paradigm shift to reduce overtreatment of indolent tumors. JAMA Oncol. 2016, 2, 1023–1029.
65. Ren, S.; Liu, C.H.; Bai, R.J. Value of diffusion weighted imaging in diagnosis of nodular lesions of thyroid: A preliminary study. Zhonghua Yi Xue Za Zhi 2010, 90, 3351–3354.
66. Li, R.; QJ, L.W.; Liu, W. Application of MR diffusion weighted imaging in the differentiation of malignant from benign thyroid focal lesions. Radiol. Pract. (China) 2009, 24, 719–722.
67. Yan, B.; Liu, H.J.; Wang, C.B.; Li, M.; Min, Z.G.; Ma, S.H. ADC values in differentiation of benign and malignant thyroid nodules. Chin. J. Med. Imaging Technol. 2011, 27, 510–514.
68. Yue, X.; Tao, X.; Gao, X. Application of diffusion-weighted MR imaging in the diagnosis of thyroid disease. Chin. J. Radiol. 2012, 12, 500–504.
69. Aydın, H.; Kızılgöz, V.; Tatar, İ.; Damar, Ç.; Güzel, H.; Hekimoğlu, B.; Delibaşı, T. The role of proton MR spectroscopy and apparent diffusion coefficient values in the diagnosis of malignant thyroid nodules: Preliminary results. Clin. Imaging 2012, 36, 323–333.
70. Bozgeyik, Z.; Coskun, S.; Dagli, A.F.; Ozkan, Y.; Sahpaz, F.; Ogur, E. Diffusion-weighted MR imaging of thyroid nodules. Neuroradiology 2009, 51, 193–198.
71. Erdem, G.; Erdem, T.; Muammer, H.; Mutlu, D.Y.; Fırat, A.K.; Sahin, I.; Alkan, A. Diffusion-weighted images differentiate benign from malignant thyroid nodules. J. Magn. Reson. Imaging 2010, 31, 94–100.
72. Dilli, A.; Ayaz, U.Y.; Cakir, E.; Cakal, E.; Gultekin, S.S.; Hekimoglu, B. The efficacy of apparent diffusion coefficient value calculation in differentiation between malignant and benign thyroid nodules. Clin. Imaging 2012, 36, 316–322.
73. Liu, R.; Jiang, G.; Gao, P.; Li, G.; Nie, L.; Yan, J.; Jiang, M.; Duan, R.; Zhao, Y.; Luo, J.; et al. Non-invasive amide proton transfer imaging and ZOOM diffusion-weighted imaging in differentiating benign and malignant thyroid micronodules. Front. Endocrinol. 2018, 9, 747.
74. Tan, H.; Chen, J.; Zhao, Y.L.; Liu, J.H.; Zhang, L.; Liu, C.S.; Huang, D. Feasibility of intravoxel incoherent motion for differentiating benign and malignant thyroid nodules. Acad. Radiol. 2019, 26, 147–153.
75. Hu, S.; Zhang, H.; Wang, X.; Sun, Z.; Ge, Y.; Li, J.; Dou, W. Can diffusion-weighted MR imaging be used as a tool to predict extrathyroidal extension in papillary thyroid carcinoma? Acad. Radiol. 2021, 28, 467–474.
76. Schob, S.; Voigt, P.; Bure, L.; Meyer, H.J.; Wickenhauser, C.; Behrmann, C.; Höhn, A.; Kachel, P.; Dralle, H.; Hoffmann, K.T.; et al. Diffusion-weighted imaging using a readout-segmented, multishot EPI sequence at 3 T distinguishes between morphologically differentiated and undifferentiated subtypes of thyroid carcinoma–a preliminary study. Transl. Oncol. 2016, 9, 403–410.
77. Shi, H.; Yuan, Z.; Yang, C.; Zhang, J.; Liu, C.; Sun, J.; Ye, X. Role of multi-modality functional imaging in differentiation between benign and malignant thyroid 18F-fluorodeoxyglucose incidentaloma. Clin. Transl. Oncol. 2019, 21, 1561–1567.

Figure 1. PRISMA flowchart showing the study selection process.

Figure 2. Summary of the risk of bias and concerns regarding applicability for the 46 included studies according to the QUADAS-2 tool, modified for this systematic review [8,9,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61]. * Studies excluded from the analysis of the influence of the highest b-value, magnetic field strength, and echo time on reported ADC values because they reported higher ADC values for malignant than for benign nodules, contrary to the majority of the literature.

Figure 3. Graphical summary of the risk of bias and concern for applicability according to the modified QUADAS-2 tool.

Figure 4. Forest plot for sensitivity (univariable analysis). Note the difference of 0.01 for the point estimate of pooled sensitivity in the univariable analysis compared to the bivariate analysis using the Reitsma method [8,9,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61].

Figure 5. Forest plot for specificity (univariable analysis). Note the difference of 0.01 for the point estimate of pooled specificity in the univariable analysis compared to the bivariate analysis using the Reitsma method [8,9,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61].

Figure 6. Summary receiver operating characteristic (sROC) curve from the bivariate meta-analysis of 46 included studies. The point estimates for the pooled sensitivity and specificity were 0.84 and 0.88, respectively. The area under the sROC curve was 0.91, indicating high overall diagnostic performance. The summary estimate and its 95% confidence region are shown.

Figure 7. Mean ADC values (in 10⁻³ mm²/s) reported by the included studies for (A) benign and malignant nodules, (B) stratified by magnetic field strength (in Tesla) of the MRI scanner used, and (C) stratified by the maximal b-value (in s/mm²) used in the respective studies. (D) Mean ADC values plotted against echo time (TE in ms), stratified by nodule type.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
