The Diagnostic Efficiency of Quantitative Diffusion Weighted Imaging in Differentiating Medulloblastoma from Posterior Fossa Tumors: A Systematic Review and Meta-Analysis

Medulloblastoma (MB) is considered the most common highly malignant posterior fossa tumor (PFT) in children. Accurate preoperative diagnosis of MB helps in choosing the appropriate surgical methods and treatment strategies. Diffusion-weighted imaging (DWI) has improved the accuracy of differential diagnosis of posterior fossa tumors. Nonetheless, further studies are needed to confirm its value for clinical application. This study aimed to evaluate the performance of DWI in differentiating MB from other PFT. A literature search was conducted in PubMed, Embase, and Web of Science for studies reporting the diagnostic performance of DWI for PFT from January 2000 to January 2022. A bivariate random-effects model was employed to evaluate the pooled sensitivities and specificities. A univariable meta-regression analysis was used to assess relevant factors for heterogeneity, and subgroup analyses were performed. A total of 15 studies with 823 patients were eligible for data extraction. Overall pooled sensitivity and specificity of DWI were 0.94 (95% confidence interval [CI]: 0.89–0.97) and 0.94 (95% CI: 0.90–0.96), respectively. The area under the curve (AUC) of DWI was 0.98 (95% CI: 0.96–0.99). Heterogeneity was found in the sensitivity (I² = 62.59%) and the specificity (I² = 35.94%). Magnetic field strength, region of interest definition, and DWI diagnostic parameters were the factors that affected the diagnostic performance of DWI. DWI has excellent diagnostic accuracy for differentiating MB from other PFT. Hence, it is necessary to set DWI as a routine examination sequence for posterior fossa tumors.


Introduction
Posterior fossa tumors (PFT) are the most common central nervous system tumors in children, accounting for around 60% of pediatric brain tumors [1]. The pathological characteristics, malignant degree, metastasis frequency, and prognosis vary greatly among different histological types of PFT [2]. Accurate preoperative diagnosis plays a crucial role for pediatric patients with PFT, as the most common tumors in this location and age group, such as pilocytic astrocytoma (PA) and medulloblastoma (MB), may determine the need for different surgical approaches with markedly different outcomes and disease progression [3]. Medulloblastoma (MB) is the most common malignant tumor of the posterior fossa, accounting for about 40% of PFT [4]. While the current treatment strategy for MB is surgery combined with postoperative chemoradiotherapy [5], a recent study has reported that the complete resection rate of metastatic MB after neoadjuvant chemotherapy was higher, and the neuropsychological prognosis of children with delayed surgical resection was better [6]. The new treatment direction also emphasizes the importance of the preoperative diagnosis of MB.

Literature Search
We searched PubMed, Web of Science, and Embase databases for studies published from January 2000 to January 2022. The keywords below were applied to search for eligible records: (diffusion OR diffusion-weighted imaging OR DWI OR apparent diffusion coefficient OR ADC) AND posterior fossa tumors. A search of the lists of references from included studies was also performed.

Study Selection
Two qualified authors (Y. Luo and S. Zhang) independently screened and checked the retrieved articles according to the predefined inclusion and exclusion criteria. First, the title and abstract of each study were reviewed. Then the full text of potentially eligible studies was skimmed. Finally, the eligible studies were screened by reading the full text carefully. Eligibility criteria were set as follows: (1) the tumors studied included MB; (2) the DWI sequence and its related parameters were used for differential diagnosis; and (3) the purpose of the study was to explore the value of DWI in differentiating PFT, and true positive, false negative, false positive, and true negative results could be extracted. Disagreements were resolved by discussion until a consensus was reached.

Data Extraction and Quality Assessment
Data extraction was performed by one author (Y. Luo) and confirmed by a second author (S. Zhang). The study characteristics extracted included country of origin, study type, reference standard, patient age, sample size, magnetic field strength, method of ROI definition, b values (s/mm²), sensitivity, specificity, and the number of true positive, false negative, false positive, and true negative findings using DWI. For studies that reported multiple sensitivities and specificities in identifying posterior fossa tumors in different groups, we extracted the group with the highest number of correctly classified lesions (true-positive findings + true-negative findings) to avoid overrepresentation of a sample [22], whereas in subgroup analysis, we extracted each subgroup of data that we were interested in. Two investigators independently evaluated the risk of bias and applicability of each study by using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool, with any disagreement resolved by consensus.
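The extraction rule above can be illustrated with a minimal Python sketch. The class name `TwoByTwo`, the helper `pick_primary_group`, and all counts are hypothetical and serve only to show how sensitivity and specificity follow from the 2×2 counts and how the group with the most correctly classified lesions would be selected; they are not data from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one 2x2 diagnostic table (MB is the positive class)."""
    tp: int  # MB correctly identified as MB
    fn: int  # MB missed by DWI
    fp: int  # other PFT wrongly identified as MB
    tn: int  # other PFT correctly identified as non-MB

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def correctly_classified(self) -> int:
        return self.tp + self.tn


def pick_primary_group(groups: list[TwoByTwo]) -> TwoByTwo:
    """Return the group with the most correctly classified lesions (TP + TN)."""
    return max(groups, key=lambda g: g.correctly_classified)


# Hypothetical counts for one study that reported two comparisons.
groups = [TwoByTwo(tp=18, fn=2, fp=3, tn=17), TwoByTwo(tp=19, fn=1, fp=1, tn=19)]
primary = pick_primary_group(groups)
print(primary, round(primary.sensitivity, 2), round(primary.specificity, 2))
```

Selecting a single group per study keeps each sample from contributing more than once to the pooled estimate, which is the stated motivation for the rule.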

Statistical Analysis
Meta-analysis was performed using STATA version 16.0 (StataCorp, College Station, TX, USA). A bivariate random-effects model was used to calculate the pooled sensitivity and specificity. Heterogeneity of the pooled estimates was evaluated using the following criteria: (1) Cochran's Q-test (p < 0.05 indicating the presence of heterogeneity); (2) the Higgins inconsistency index (I²) test. Multilevel mixed-effects logistic regression analysis was used to compare the diagnostic efficacy of DWI in differentiating MB from other PFT, with a significance level of p < 0.05. Publication bias was assessed using Deeks' funnel plots.
In order to explore the possible causes of heterogeneity, a meta-regression was conducted to evaluate the following factors: (1) the average age of the subjects (<18 years vs. >18 years); (2) country of origin (Asia vs. Non-Asia); (3) magnetic field strength (3.0 Tesla or mixed vs. 1.5 Tesla); (4) the calculation method of diagnostic parameters (only lesions are counted vs. normal tissue was referenced); (5) whether the optimal threshold value of diagnostic parameters is obtained from the ADC histogram; (6) the region of interest (ROI) in DWI or ADC map (single layer vs. multiple layers).
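As a rough illustration of the heterogeneity statistics named above, the following Python sketch computes Cochran's Q and I² for logit-transformed proportions with inverse-variance weights. This is a simplified univariate calculation, not the bivariate random-effects model fitted in STATA; the function names, the 0.5 continuity correction, and the study counts are assumptions made for the example.

```python
import math


def logit_and_variance(events: int, total: int) -> tuple[float, float]:
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction (an assumption of this sketch) avoids
    division by zero when a study has no misclassified cases.
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def cochran_q_and_i2(estimates: list[float], variances: list[float]) -> tuple[float, float]:
    """Cochran's Q and Higgins I² (in percent) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I² = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical (true positives, total MB cases) per study, for illustration only.
studies = [(18, 20), (25, 26), (30, 36), (12, 12)]
pairs = [logit_and_variance(tp, n) for tp, n in studies]
q, i2 = cochran_q_and_i2([y for y, _ in pairs], [v for _, v in pairs])
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```

The p-value for Q would come from a chi-square distribution with `df` degrees of freedom; it is omitted here to keep the sketch free of third-party dependencies.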

Literature Search and Article Selection
A total of 1789 articles met the retrieval requirements (Figure 1), of which 423 duplicate articles were removed. After reviewing the titles and abstracts, 78 articles remained. Sixty-three articles were excluded by full-text review for the following reasons: 24 articles did not have enough data for calculation or construction of TP, FP, TN, and FN values; 9 articles used machine learning for analysis; 17 articles did not include MB; and 13 articles only distinguished high-grade from low-grade or benign from malignant tumors. Finally, a total of 15 studies were included for quantitative analysis [3,17,23–35].
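The selection counts reported above can be cross-checked with a few lines of Python. The variable names are illustrative; the numbers are those given in the text.

```python
# Counts reported in the text above.
records_retrieved = 1789
duplicates_removed = 423
full_text_reviewed = 78

exclusion_reasons = {
    "insufficient data for TP/FP/TN/FN": 24,
    "machine-learning analysis": 9,
    "MB not included": 17,
    "only grade or benign/malignant distinction": 13,
}

records_after_deduplication = records_retrieved - duplicates_removed
excluded_at_full_text = sum(exclusion_reasons.values())
included = full_text_reviewed - excluded_at_full_text

print(records_after_deduplication, excluded_at_full_text, included)  # 1366 63 15
```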

Basic Characteristics of the Included Studies
The basic characteristics of the included studies are summarized in Table 1. The present meta-analysis included a total of 823 patients with PFT, 371 of whom had MB. Six studies differentiated MB from EP, four studies differentiated MB from pilocytic astrocytoma (PA), and five studies differentiated MB from mixed PFT. Only two of the studies focused on adults, and the average age of patients in 12 of the studies was younger than 18. In one study the age of patients was not reported; we assumed that all patients were younger than 18 years old since the study sample was from a children's hospital. Six studies obtained images with a magnetic field of 1.5 Tesla (T), seven studies used a combination of 1.5 T and 3.0 T, one study used 3.0 T, and one study did not report the magnetic field strength. The majority of studies (14/15) were retrospective.

The diagnostic parameters of DWI and regions of interest selected by different studies were also heterogeneous. Minimum ADC value was used in two studies, mean ADC value was used in two, ADC ratio (the ratio of mean or minimum ADC value to normal tissue) was used in five, ADC histogram was used in five, and relative diffusion-weighted signal intensity (rDWSI) was used in one. While four studies took the entire tumor volume as their region of interest, four studies chose a single layer, and another seven studies chose three continuous or discontinuous layers. The reference standard for diagnosis in all studies was the pathological diagnosis. The DWI diagnostic parameters of all studies were measured without knowledge of the pathological diagnosis.
Figure 2 presents the results of quality assessment using QUADAS-2. The majority of the studies had high quality with a low risk of bias. All studies except two [30,32] did not report whether diagnostic thresholds were predetermined, so we marked them [3,17,23–29,31,33–35] as unclear risk in the index test domain. Two studies [24,28] had a high risk of flow and timing bias because the number of individual tumor types enrolled did not reach the basic level required for statistical analysis. One study [30] had an unclear concern regarding applicability of the index test, since the diagnostic index used in this study is unconventional and has not been reported in other studies; we doubt whether it can be replicated in clinical work.

Main Statistical Analysis Results
The forest plot of the sensitivity and specificity of DWI is displayed in Figure 3.

Subgroup Analysis and Meta-Regression
Among the covariates, magnetic field strength was revealed to be a significant factor affecting study heterogeneity (Table 2). Studies using 3.0 T scanners showed a higher sensitivity of 0.98 (95% CI: 0.96-1.00) and specificity of 0.94 (95% CI: 0.91-0.98) than studies using only 1.5 T scanners, which showed a sensitivity of 0.89 (95% CI: 0.84-0.94) and specificity of 0.93 (95% CI: 0.87-0.98) (p < 0.05). The way the ROIs were delineated also explained the heterogeneity of specificity, with a specificity of 0.93 (95% CI: 0.89-0.96) for studies delineating a single layer and 0.97 (95% CI: 0.93-1.00) for multiple layers (p = 0.01), but it was not significantly related to the heterogeneity of sensitivity (p = 0.13). In addition, the country of origin explained the heterogeneity of sensitivity (p = 0.02), while the average age was not revealed to be a significant factor affecting study heterogeneity. As for the differences in diagnostic parameters of DWI selected by different studies, we applied two grouping methods. The first separated parameters obtained directly from ADC images of lesions (min ADC, mean ADC, ADC histogram) from parameters calculated with reference to normal tissues (ADC ratio, rDWSI). The second grouping criterion was whether the diagnostic parameters were obtained directly from the PACS system, as the ADC histogram was calculated by other software. The results showed that the diagnostic parameters of DWI obtained by calculating the ratio of lesions to normal tissues had a higher specificity of 0.95 (95% CI: 0.91-0.98) than those obtained based on lesions alone, which had a specificity of 0.92 (95% CI: 0.87-0.98) (p < 0.001). However, whether DWI parameters were obtained directly from PACS had no significant effect on the efficacy of identifying PFT.

Comparison of DWI for Differentiating MB from EP and PA
Six of the included studies differentiated MB from EP, and four differentiated MB from PA. The forest plots are shown in Figure 6. The sensitivity and specificity of DWI for differentiating MB from PA were 0.96 (95% CI: 0.77-1.00) and 0.99 (95% CI: 0.65-1.00), respectively. The sensitivity and specificity of DWI for differentiating MB from EP were 0.90 (95% CI: 0.81-0.95) and 0.88 (95% CI: 0.77-0.94), respectively (Table 3). There was no significant difference in sensitivity (p = 0.74), while the difference in specificity (p = 0.01) was statistically significant. The key information of each study is shown in Table 4. The ADC ratio of MB was 0.91-1.02, and that of EP was 1.30-1.58. The optimal threshold value for differentiating MB from EP was 1.00-1.20.
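To make the ADC ratio and its threshold concrete, here is a minimal Python sketch. The function names, the default cutoff of 1.10 (an arbitrary value inside the reported 1.00-1.20 range), and the example ADC values are all assumptions for illustration; the sketch is not a validated diagnostic rule.

```python
def adc_ratio(lesion_adc: float, normal_adc: float) -> float:
    """Ratio of the lesion ADC value to the ADC value of normal tissue."""
    if normal_adc <= 0:
        raise ValueError("normal-tissue ADC must be positive")
    return lesion_adc / normal_adc


def suggests_mb(ratio: float, threshold: float = 1.10) -> bool:
    """True when the ratio falls below the cutoff (a lower ADC ratio favors MB).

    The default threshold is a hypothetical value chosen from within the
    1.00-1.20 range reported in the text.
    """
    return ratio < threshold


# Hypothetical ADC values in units of 10^-3 mm^2/s.
ratio_a = adc_ratio(lesion_adc=0.70, normal_adc=0.75)
ratio_b = adc_ratio(lesion_adc=1.10, normal_adc=0.78)
print(round(ratio_a, 2), suggests_mb(ratio_a))
print(round(ratio_b, 2), suggests_mb(ratio_b))
```

Normalizing to normal tissue in this way is what the meta-regression associates with higher specificity, since it reduces the influence of between-patient and between-scanner differences in absolute ADC.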

Discussion
This meta-analysis investigated 15 studies with 823 patients to evaluate the diagnostic performance of DWI for differentiating MB from other posterior fossa tumors. DWI demonstrated an overall high diagnostic performance. Sensitivity was heterogeneous across studies, which was probably caused by differences in magnetic field strength, ROI definition method, and the diagnostic parameters of DWI. DWI showed the highest sensitivity and specificity in differentiating MB from PA. Therefore, we suggest that quantitative DWI, as a non-invasive imaging method, could improve the accuracy of the differential diagnosis of posterior fossa tumors.
The ADC value is the most commonly used diagnostic parameter of the DWI sequence and has proved efficient in describing the restricted diffusion of water molecules [36,37]. Compared with benign lesions, malignant lesions are characterized by more densely packed cells, which restrict the diffusion of water molecules, resulting in higher DWI signal intensity and lower ADC values [38,39]. Fourteen of the fifteen studies selected parameters derived from the ADC map as diagnostic indicators. However, different studies used different calculation methods for ADC values. Previous studies have suggested that using the histogram to process DWI data may help to provide quantitative information on tumor heterogeneity and can be more advantageous in the differential diagnosis of different tumors and in tumor grading [40–42]. Our results showed that the diagnostic performance of the ADC histogram was not significantly higher than that of other ADC parameters obtained directly from the PACS system in differentiating posterior fossa tumors. For radiologists in daily clinical practice, it is more convenient and faster to use ADC parameters obtained directly from the PACS system, such as the ADC ratio. Moreover, the diagnostic specificity of DWI parameters can be enhanced by calculating the ratio of lesions to normal tissue (such as ADC ratio and rDWSI), which leads us to believe that individual differences in normal tissue should be considered in the clinical application of DWI.
The definition of the ROI used to calculate DWI diagnostic parameters also had an important influence on the results. The ADC value of necrotic and cystic tumor components is very high compared to solid tumor tissue, so the inclusion of these areas would artificially increase the ADC value of the tumor [43]. Although the ROIs of the included studies did not select these components, it must be noted that the ROIs still differed among studies. We found that the definition of ROI can account for the heterogeneity of specificity, with DWI parameters measured on multiple layers having higher specificity than those measured on a single layer. Therefore, we recommend obtaining the DWI parameters by measuring the mean values of multiple layers rather than the mean values of different regions on the same layer.
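One possible reading of the multi-layer recommendation is sketched below in Python: the ADC is averaged within the ROI on each slice, and the per-slice means are then averaged. The function name and the ADC values are hypothetical, and the included studies may have combined slices differently.

```python
from statistics import mean


def multi_slice_mean_adc(roi_values_by_slice: list[list[float]]) -> float:
    """Mean of per-slice mean ADC values; each inner list is one slice's ROI."""
    if not roi_values_by_slice or any(not s for s in roi_values_by_slice):
        raise ValueError("every slice needs at least one ROI value")
    return mean(mean(s) for s in roi_values_by_slice)


# Hypothetical ROI measurements (10^-3 mm^2/s) on three slices of solid tumor,
# with cystic and necrotic areas assumed to be excluded beforehand.
slices = [[0.6, 0.8], [0.7], [0.9, 1.1]]
print(round(multi_slice_mean_adc(slices), 3))
```

Averaging per-slice means gives each slice equal weight regardless of how many ROI samples it contains, which is one reasonable choice when slices are sampled unevenly.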
In univariable meta-regression analysis, we also found that DWI showed better sensitivity and specificity in studies with high-field (3.0 T) MRI than in studies with only 1.5 T MRI. High-field imaging provides a higher signal-to-noise ratio, which can in turn improve the spatial resolution of DWI images [44]. Our results suggested that high-field imaging may have better diagnostic performance in the differential diagnosis of posterior fossa tumors. This is consistent with many previous reports of DWI being applied to other organs or systems [36,45]. Since 3.0 T has the advantages of improving signal-to-noise ratio and reducing artifacts such as medium effects, we suggest using 3.0 T DWI to improve the diagnostic performance for posterior fossa tumors. However, fewer than three studies used only 3.0 T MRI, and thus we could not compare 3.0 T and 1.5 T directly. Consequently, more studies are needed to confirm the validity of this comparison.
The three most common pediatric posterior fossa tumors are MB, EP, and PA [46]. There are important differences in the incidence, degree of malignancy, frequency of metastasis, and prognosis of these tumors, as well as in the treatment strategies based on tumor type and histological subtype [47–50]. MB consists of densely arranged sheets of homogeneous small tumor cells with a small number of necrotic elements and normal tissue [10]. This histological feature of MB results in the lowest ADC value of the three tumors, since the ADC value of brain tumors is related to cellularity [51,52]. A previous meta-analysis also suggested that the mean ADC values increase sequentially from MB to EP to PA [37]. Therefore, distinguishing MB from EP is more difficult than distinguishing MB from PA because of the smaller difference in ADC values. This explains why the specificity of DWI for differentiating MB from EP was significantly lower than that for differentiating MB from PA. In contrast, the sensitivity of DWI for differentiating MB from EP was not significantly different from that for differentiating MB from PA. This means that quantitative DWI can be a beneficial auxiliary diagnostic tool when it is difficult to distinguish between MB and EP in the clinic.
At the same time, MB needs to be differentiated from atypical teratoid rhabdoid tumors (ATRT). Only one of the studies included in this work evaluated the efficacy of DWI in differentiating MB from ATRT, and hence subgroup analysis could not be performed. This study [23] showed that the sensitivity and specificity of DWI in differentiating MB from ATRT were 66.7% and 50%, respectively. Other studies [33,53] showed no significant difference in ADC value between MB and ATRT. This might be because both MB and ATRT are embryonal tumors with large nuclei and scant cytoplasm, so the ADC values of the two tumors are low and similar, and it is difficult to distinguish them by DWI alone [53]. The preoperative differential diagnosis of MB and ATRT needs to be explored with newer MRI techniques.
In recent years, new artificial intelligence (AI) methods have emerged for the analysis of brain tumor imaging in children. Although the use of AI tools in routine clinical practice has yet to be explored, AI has shown great potential in the identification of common PFT in children [54]. The preoperative diagnosis of embryonal tumors in the posterior fossa may therefore become more accurate in the future. Meanwhile, MRI-based AI technology is also used to predict the survival of brain tumor patients, which can provide supplementary information for clinical decision-making. Combining AI with quantitative features derived from DWI may be of great value for predicting survival in brain tumor patients [55].
The present study had a few limitations. First, the number of eligible studies was relatively limited, because some potentially relevant studies did not report the data needed to calculate TP, FN, FP, and TN. Second, the included studies showed significant heterogeneity in pooled sensitivities, which may reduce the general applicability of the combined estimates. However, we identified sources of the heterogeneity through univariable meta-regression and, according to the meta-regression results, found some methods that may enhance diagnostic efficiency. Finally, most of the included studies were retrospective (14/15), and confounding factors and bias are less well controlled in retrospective studies than in prospective studies. Therefore, further large-scale prospective studies on the diagnosis of posterior fossa tumors should be conducted to provide a valuable reference for clinical diagnosis.

Conclusions
In conclusion, the diagnostic parameters of the DWI sequence have excellent diagnostic accuracy for differentiating MB from other posterior fossa tumors. Standardizing the definition of the region of interest and the calculation method of diagnostic parameters will help clinicians improve the accuracy of diagnosis in daily work. High-field MRI, multi-slice ROI, and DWI parameters calculated with reference to normal tissues may be beneficial factors. The results of this study have the potential to provide valuable information for the treatment planning of pediatric PFT, including the extent of tumor resection and the implementation of adjuvant therapy. Meanwhile, further prospective studies with standardized scanning protocols and large samples are needed to accurately quantify the diagnostic threshold. Quantitative DWI combined with artificial intelligence technology may be a future direction for in-depth exploration of posterior fossa tumors.