Deciphering the Prognostic Efficacy of MRI Radiomics in Nasopharyngeal Carcinoma: A Comprehensive Meta-Analysis

This meta-analysis investigates the prognostic value of MRI-based radiomics in nasopharyngeal carcinoma treatment outcomes, specifically focusing on overall survival (OS) variability. The study protocol was registered with INPLASY (INPLASY202420101). Initially, a systematic review identified 15 relevant studies involving 6243 patients through a comprehensive search across PubMed, Embase, and Web of Science, adhering to PRISMA guidelines. The methodological quality was assessed using the Quality in Prognosis Studies (QUIPS) tool and the Radiomics Quality Score (RQS), highlighting a low risk of bias in most domains. Our analysis revealed a significant average concordance index (c-index) of 72% across studies, indicating the potential of radiomics in clinical prognostication. However, moderate heterogeneity was observed, particularly in OS predictions. Subgroup analyses and meta-regression identified validation methods and radiomics software as significant heterogeneity moderators. Notably, the number of features in the prognosis model correlated positively with its performance. These findings suggest radiomics’ promising role in enhancing cancer treatment strategies, though the observed heterogeneity and potential biases call for cautious interpretation and standardization in future research.


Introduction
Nasopharyngeal carcinoma (NPC) exhibits notable epidemiological differences globally, with a significantly higher incidence in East and Southeast Asia compared to Western countries. These disparities are attributed to genetic susceptibility, environmental factors, and Epstein-Barr virus (EBV) infection prevalence. The distinct epidemiological patterns of NPC necessitate tailored approaches in diagnosis, treatment, and prognosis across different populations [1,2]. In pursuing personalized medicine, radiomics and machine learning have emerged as transformative tools, offering new avenues for the prognostic assessment of NPC [3,4].
Radiomics involves extracting high-dimensional data from medical images, which, when analyzed through machine learning algorithms, can reveal patterns indicative of tumor phenotype, aggressiveness, and likely response to treatment. This methodology extends the value of conventional MRI scans beyond anatomical visualization, enabling the quantification of tumor heterogeneity at a microscopic level that may not be visually apparent [5,6]. Machine learning further enhances this process by identifying complex relationships between radiomic features and clinical outcomes, facilitating the development of predictive models for NPC prognosis [7,8].
The integration of radiomics and machine learning in NPC research holds the potential to revolutionize patient care. By accurately predicting treatment outcomes, these technologies can guide the selection of therapeutic strategies tailored to individual patient profiles, thus improving survival rates and quality of life. Moreover, the ability to monitor tumor response non-invasively through advanced imaging analytics could lead to more dynamic and responsive treatment plans, adjusting to changes in tumor behavior over time [9-12].
Despite the promising prospects of radiomics and machine learning in enhancing the prognosis of NPC, significant challenges in the standardization of image acquisition, feature extraction, and model validation persist. These hurdles must be overcome to fully leverage the clinical potential of these advanced technologies. A recent meta-analysis highlighted the efficacy of MRI radiomics in predicting progression-free survival in NPC, presenting a pooled concordance index (C-index) of 0.762 (95% CI, 0.687-0.837) [13]. However, this analysis also noted a high level of heterogeneity (I² = 89%) due to the amalgamation of various endpoints, such as Local Recurrence-Free Survival, Distant Metastasis-Free Survival, and Progression-Free Survival. Our research aims to provide an updated synthesis of the current evidence while offering separate analyses for different endpoints. This approach intends to deliver a more nuanced and comprehensive analysis, potentially reducing heterogeneity and enhancing the interpretability of radiomics in the prognosis of NPC.

Materials and Methods
This investigation was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for meta-analysis [14]. The PRISMA checklists can be found in Supplemental Table S1. The study was registered in INPLASY with the registration number INPLASY202420101. Approval from an ethical review board and participant informed consent were not required for this study.

Database Searches and the Identification of Eligible Manuscripts
Two independent researchers (C-KW and T-WW) conducted an exhaustive literature review, employing a detailed search strategy across PubMed, Embase, and Web of Science, as outlined in Supplementary Table S2. This review spanned the inception of these databases to 17 February 2024. Articles were systematically screened for relevance based on titles and abstracts, with the inclusion and exclusion criteria established collaboratively. Reference lists of key review articles, including [13], were examined to ensure completeness and supplemented by manual searches to capture any overlooked studies. Discrepancies regarding study inclusion were resolved through consultation with a third investigator.

Inclusion and Exclusion Criteria
The inclusion criteria were specified for participants definitively diagnosed with nasopharyngeal carcinoma (NPC), focusing exclusively on adult populations of both sexes. The required imaging criteria stipulated that subjects must have undergone magnetic resonance imaging (MRI) for initial radiomic assessment. This criterion applied to both individuals receiving a new diagnosis and those previously subjected to medical interventions such as surgery, radiation, or chemotherapy. Only studies that included the concordance index (c-index) were considered. The c-index, a measure of the prognostic accuracy of models in time-to-event analysis where data may be censored, was selected based on its utilization in prior research [15], owing to its advantage in providing consistent results across studies with variable endpoints, in contrast to the time-independent area under the curve (AUC), which may lead to heterogeneous outcomes. All observational studies, including retrospective or prospective studies, were included.
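To make the c-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the implementation used by any included study: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter time corresponds to an observed event (not censoring).
        if times[a] == times[b] or events[a] == 0:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the third patient is censored at t = 12.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c_index(times, events, risk))  # 4 of 5 comparable pairs agree: 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is why the c-index can be compared across studies with different follow-up horizons.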
The exclusion criteria were as follows: studies concerning cancers other than NPC; research employing deep learning-based radiomics, owing to its lower interpretability; research incorporating multiple timepoint radiomics; prediction of prognosis from individual radiomics features without integration into a model; radiomics models incorporating clinical features, not radiomics studies; overlapping datasets; and letters, conference proceedings, retracted papers, or documents devoid of images. Further, studies using imaging modalities other than MRI, covering topics or outcomes irrelevant to the study aims, presenting data unsuitable for quantitative analysis, or not reporting the c-index were excluded.

Methodological Quality Appraisal
The methodological integrity of each study incorporated in the analysis was assessed via two established instruments: the Quality in Prognosis Studies (QUIPS) tool and the Radiomics Quality Score (RQS) [16,17]. The QUIPS tool is employed to assess the risk of bias across various domains in prognostic studies, each evaluated with specific criteria to ascertain the risk level. Study Participation examines how representative the sample is of the target population, focusing on recruitment efficacy and the demographic and clinical similarity to the broader population. Study Attrition assesses the completeness of the follow-up, scrutinizing follow-up rates and the reasons for dropout to determine potential biases if outcomes for those lost differ from those who completed the study. Prognostic Factor Measurement evaluates the accuracy and consistency with which prognostic factors are measured across participants, emphasizing the method's reliability and uniform application. Outcome Measurement investigates the reliability and validity of outcome assessments, ensuring clarity in definitions and uniformity in measurement methods. Study Confounding involves identifying and adjusting for potential confounders, assessing the adequacy of their measurement and control. Statistical Analysis and Reporting reviews the appropriateness of statistical methods and the integrity of result reporting. Each domain's risk of bias is rated as low, moderate, or high, guiding the overall evaluation of a study's methodological soundness and result reliability.
The RQS, tailored for scrutinizing radiomics research, comprises 16 elements to assess the research's reliability and susceptibility to bias. Each element within a study was evaluated and scored, and the scores were summed across all components into an aggregate score reflecting the overall methodological quality of the radiomics studies under review.

Definitions of Prognostic Endpoints
Local Recurrence-Free Survival (LRFS): Constitutes a composite metric evaluating the efficacy of therapeutic interventions in maintaining control at both the primary tumor site and regional levels.
Distant Metastasis-Free Survival (DMFS): Denotes the interval from the commencement of therapeutic measures to the first instance of distant metastasis or death, whichever occurs first, indicating the treatment's capacity to inhibit tumor dissemination to distal anatomical sites.
Progression-Free Survival (PFS), Disease-Free Survival (DFS), and Failure-Free Survival (FFS): Although these terms are sometimes utilized interchangeably within the oncological lexicon, they predominantly refer to the duration from treatment initiation to the onset of tumor progression, recurrence, or mortality. These measures are critical for assessing the period during which a patient remains unaffected by worsening or reemergence of the disease. For the objectives of this research, these indicators are collectively considered under the umbrella of PFS.
Overall Survival (OS): This parameter measures the time elapsed from the beginning of treatment to death attributable to any cause, serving as a fundamental criterion for evaluating the overall effectiveness of cancer treatment modalities.

Data Extraction and Management
Data extraction was carried out independently by two authors (C-KW and T-WW), encompassing demographic details, study methodology, and specifics of MRI imaging, as well as radiomics and prognostic model characteristics. The collection, conversion, and amalgamation of the results followed the guidelines stipulated in the Cochrane Handbook for Systematic Reviews of Prognosis Studies and pertinent previous studies.
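One conversion that commonly arises when pooling c-indices is recovering a standard error from a reported confidence interval. The sketch below assumes a symmetric, normal-approximation interval on the raw c-index scale; the text does not state which conversion was actually applied, and the helper name `se_from_ci` is hypothetical. The example values are the pooled C-index interval quoted from [13] in the Introduction.

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Back-calculate a standard error from a symmetric normal-based CI.

    Assumes the interval was built as estimate +/- z * SE, so that
    SE = (upper - lower) / (2 * z).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% CI
    return (upper - lower) / (2.0 * z)


# Interval quoted from [13]: C-index 0.762 (95% CI, 0.687-0.837).
se = se_from_ci(0.687, 0.837)
print(round(se, 4))
```

If a study reports an interval computed on a transformed scale (for example, logit), the same back-calculation would need to be done on that scale instead.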

Statistical Analysis
To address the variability inherent in the selected studies, we employed a random-effects meta-analytical model [18], with statistical significance predetermined at a p-value of less than 0.05 (two-tailed). Our findings were visually synthesized in forest plots. During the preliminary exploration phase, we incorporated all outcomes associated with radiomics across various endpoints. The analysis was stratified according to specific endpoints (LRFS, DMFS, PFS, and OS). Concordance indices derived from composite models (those incorporating radiomic features in conjunction with clinical or other model variables) were excluded from the analysis because the clinical features included varied across studies, which may introduce substantial heterogeneity [13]. Sensitivity analysis was carried out with the leave-one-out method. In a more focused secondary analysis, attention was narrowed to the Overall Survival endpoint, recognizing its heterogeneity across the studies. Subgroup analyses were conducted based on geographical location (Asia versus Europe), validation methodology (internal validation versus external validation), MRI sequence (single versus multiple), and radiomics software (in-house versus Pyradiomics), alongside meta-regressions on publication year, training sample size, and the number of features used. The consistency of results across the studies was assessed using the Q-test, with a p-value less than 0.05 indicating significant heterogeneity. The extent of this heterogeneity was evaluated using the I² statistic, categorized as negligible (0-25%), low (26-50%), moderate (51-75%), or high (76-100%) [19]. Potential publication bias was assessed with Egger's method, a diagnostic tool for identifying asymmetry in funnel plots [20]. All statistical analyses were conducted using STATA software (Stata/SE 18.0 for Mac).
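The pooling, heterogeneity, and leave-one-out steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the Stata analysis: it uses the DerSimonian-Laird estimator of between-study variance (the text does not say which estimator was used), a normal-approximation 95% interval, and hypothetical c-indices and standard errors. The function names are likewise illustrative.

```python
import math


def random_effects_pool(estimates, std_errors):
    """Pool estimates with a DerSimonian-Laird random-effects model."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"pooled": pooled, "ci": ci, "q": q, "tau2": tau2, "i2": i2}


def categorize_i2(i2):
    """Map I^2 (in percent) to the bands used in the text.

    The handling of non-integer values between bands is an assumption.
    """
    if i2 <= 25:
        return "negligible"
    if i2 <= 50:
        return "low"
    if i2 <= 75:
        return "moderate"
    return "high"


def leave_one_out(estimates, std_errors):
    """Re-pool the data k times, omitting one study each time."""
    pooled = []
    for i in range(len(estimates)):
        rest_y = estimates[:i] + estimates[i + 1:]
        rest_se = std_errors[:i] + std_errors[i + 1:]
        pooled.append(random_effects_pool(rest_y, rest_se)["pooled"])
    return pooled


# Hypothetical study-level c-indices and standard errors (not from the paper).
c_indices = [0.70, 0.74, 0.68, 0.76]
std_errs = [0.02, 0.03, 0.025, 0.03]
result = random_effects_pool(c_indices, std_errs)
print(result["pooled"], result["i2"], categorize_i2(result["i2"]))
print(leave_one_out(c_indices, std_errs))
```

If the leave-one-out estimates stay close to the full pooled value, no single study is driving the result, which is the purpose of the sensitivity analysis.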

Basic Characteristics of Included Studies
A total of 15 studies involving 6243 patients were included. Among these, the majority were conducted in China, with one study from Thailand [21] and another from Italy [32]. The endpoint measures included Local Recurrence-Free Survival (LRFS), Distant Metastasis-Free Survival (DMFS), Progression-Free Survival (PFS), Disease-Free Survival (DFS), Time to Treatment Failure (TTF), and Overall Survival (OS). For details on validation methods, study design, duration, patient demographics, staging, and treatment, refer to Table 1. Information on MRI protocols, such as slice thickness, magnetic field strength, sequences, and scanner types, is available in Table 2. Details on tumor segmentation software, annotators, radiomics software, features, prognostic models, and performance are provided in Table 3.

Methodological Quality of the Included Studies
In assessing the methodological quality of the included studies, we found that the majority exhibited a low risk of bias in the sample size domain as per the Quality in Prognosis Studies (QUIPS) tool. However, approximately 13.3% (2 out of 15) demonstrated some risk of bias in the study attrition domain, and nearly 46.7% (7 out of 15) showed some risk in the confounding domain (see Figure 2). Studies identified as having some risk of bias exhibited protocol variations, potentially influencing the adherence to and outcomes of the prognostic models. Detailed assessments of bias risk using QUIPS and the Radiomics Quality Score (RQS) are documented in Tables S4 and S5.

Secondary Outcome: Overall Survival Prediction of Radiomics Prognosis Model
Given that Overall Survival (OS) was the only endpoint associated with moderate heterogeneity, we conducted further subgroup analyses and meta-regression to identify potential moderators that could explain this heterogeneity. Significant differences were observed when considering the validation method and radiomics software as moderators (see Table 4). Specifically, subgroups based on radiomics software exhibited low heterogeneity, although caution is advised due to potential bias from the small number of studies included [131]. This observation warrants confirmation with additional studies. In our meta-regression analysis, a significant association (p < 0.01) was found between the number of features in the prognosis model and its performance, with a coefficient of 0.010622 (Figure 4). No significant association was found with the publication year (coefficient = 0.0220509, p = 0.084). There was also no significant relationship between training size (coefficient = 2.72 × 10⁻⁶, p = 0.985) and model performance.
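As an illustration of what a meta-regression coefficient of this kind represents, the sketch below fits an inverse-variance weighted regression of c-index on the number of model features. It is a simplified stand-in, not the Stata model used in the analysis: the function `weighted_meta_regression` is hypothetical, the between-study variance `tau2` is passed in rather than estimated, and the data are invented for demonstration.

```python
import math


def weighted_meta_regression(x, y, std_errors, tau2=0.0):
    """Weighted least squares of effect size y on a single moderator x.

    Weights are 1 / (SE^2 + tau2); tau2 = 0 gives a fixed-effect
    meta-regression, which is a simplification of the random-effects case.
    """
    w = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)  # valid when the weights are true inverse variances
    return slope, intercept, se_slope


# Hypothetical data: number of features and c-index for four OS models.
n_features = [3, 6, 9, 12]
c_index = [0.64, 0.69, 0.70, 0.75]
std_errs = [0.03, 0.025, 0.03, 0.02]
slope, intercept, se_slope = weighted_meta_regression(n_features, c_index, std_errs)
print(slope, intercept, se_slope)
```

Read this way, and assuming the reported coefficient of 0.010622 is on the raw c-index scale (which the text does not state), each additional feature would be associated with an increase of about 0.01 in the c-index, so five more features would correspond to roughly 0.05. This is an association across studies, not evidence that adding features to a given model improves it.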

Overview of Key Findings
Our meta-analysis systematically evaluated 37 radiomics prognostic outcomes, revealing a notable average concordance index (c-index) of 72% (95% Confidence Interval (CI): 70-74%) across studies, with the range stretching from 54% to 81%. This variation not only underscores the potential utility of radiomics in clinical prognostication but also highlights the substantial heterogeneity encountered, particularly in Overall Survival (OS) predictions, where a moderate Higgins I² statistic of 64.44% was observed. Notably, our findings identified a significant positive correlation between the number of features in the prognosis model and its performance, with a meta-regression coefficient of 0.010622 (p < 0.01), emphasizing the complexity and potential of detailed models. The analysis also demonstrated that validation methods and radiomics software significantly influenced heterogeneity, pinpointing crucial areas for standardization and improvement in future research.

Comparison with Existing Literature
When compared with the existing literature, our findings both validate the recognized potential of radiomics and highlight the challenges of achieving consistent performance across studies. Notably, the c-index range we report aligns with those found in similar meta-analyses, such as a c-index of 0.762 (95% CI, 0.687-0.837) for Progression-Free Survival (PFS) prediction [13]. Similarly, we limited our analysis to prognosis models incorporating radiomics features. However, we further refined our approach by categorizing endpoints into more specific subgroups, leading to observed reductions in heterogeneity for Local Recurrence-Free Survival (LRFS), Distant Metastasis-Free Survival (DMFS), and PFS. Additionally, our study aggregated results for Overall Survival (OS), an analysis not conducted in the prior study. The moderate heterogeneity observed in OS predictions (I² = 64.44%) suggests that OS may be influenced by a broader range of clinical conditions, potentially necessitating the inclusion of additional clinical features for more robust prediction models.
Our analysis also advances the discussion on methodological variables, specifically the number of features in a model and the selection of radiomics software. These factors have been less frequently quantified in prior reviews. By highlighting these aspects, our study underscores the need for a more standardized approach to radiomics model development, potentially leading to more consistent and reliable prognostic tools.

Implications for Clinical Practice and Research
The significance of our findings is emphasized through the substantial average concordance index, indicating that radiomic models have the potential to considerably refine patient stratification and treatment planning. Nonetheless, the observed diversity and fluctuations in performance, especially concerning Overall Survival (OS) predictions, mandate a prudent integration into clinical guidelines.
The observed variability in the consistency of radiomics prognostic models across different oncological endpoints has important implications for clinical practice. The finding that models performed more consistently for Local Recurrence-Free Survival (LRFS), Distant Metastasis-Free Survival (DMFS), and Progression-Free Survival (PFS) compared to OS suggests that radiomics may be particularly useful for predicting locoregional control and disease progression. This could help clinicians identify high-risk patients who may need more aggressive local therapies or closer surveillance. However, the greater variability in performance for OS indicates that radiomics alone may not be as reliable for predicting long-term survival, which is impacted by many factors beyond the primary tumor.
These results underscore the need to carefully consider the specific clinical endpoint of interest when developing and applying radiomics prognostic models. Models that perform well for one endpoint may not necessarily generalize to other endpoints. Clinicians should look for models that have been validated for the specific outcomes most relevant to their patients and practice. The variability across endpoints also highlights the importance of incorporating other clinical, pathologic, and genomic factors alongside radiomics to develop more holistic prognostic models, particularly for OS. Radiomics can provide valuable information about the primary tumor, but integrating it with other key determinants of survival may be necessary to maximize prognostic value.
Crucially, our detailed subgroup analysis and meta-regression reveal distinct moderators (for instance, validation methodologies profoundly affecting heterogeneity and radiomic software that reduces subgroup heterogeneity) that, upon standardization, could streamline the enhancement and validation of radiomic models. This standardization is pivotal for augmenting their reliability and applicability within clinical frameworks, thus facilitating improved patient care and treatment outcomes.
Despite these promising avenues, moderate heterogeneity in OS predictions persists, highlighting the complex interplay between radiomic data and patient-specific clinical factors such as health status, comorbidities, and response to treatment. The multifactorial nature of OS suggests that while radiomics can provide valuable insights into tumor characteristics, a comprehensive approach that integrates radiomic data with clinical parameters is essential for making more accurate prognostic assessments. Moreover, variations in treatment protocols and intrinsic tumor heterogeneity contribute further to the observed disparities in survival predictions.
Finally, the results suggest that further research is needed to understand the biological underpinnings of the radiomics features that drive prognostic performance for different endpoints. Better mechanistic insight could help refine models and identify radiomics signatures that are more specifically linked to the most clinically meaningful outcomes. Ongoing research to refine and integrate radiomics into multifaceted prognostic models will be key to realizing its potential to guide precision oncology care.

Technical Considerations of Radiomics Features and Imaging Protocols
Standardized approaches have been implemented in radiomics studies to address the potential risk of bias due to protocol variations and to ensure consistency and comparability across different cancer pathologies. Khanfari et al. [132] employed a standardized dataset alongside consistent preprocessing techniques, including normalization and enhancement across various mpMRI images. This method was vital for minimizing data handling variability and included the use of uniform fusion techniques and robust preprocessing methods, which are essential in prostate cancer grading and reducing bias from data processing variations. Similarly, Reginelli et al. [133] standardized the radiomics pipeline by using consistent image acquisition protocols and radiomics software, thus enhancing the reliability of their findings and mitigating the risk of bias across studies.
To further ensure the relevance and accuracy of prognosis models, statistical techniques and machine learning were used for selecting radiomics features. Methods like the Least Absolute Shrinkage and Selection Operator (LASSO), recursive feature elimination, and correlation analyses (Pearson and Spearman) identified features with minimal redundancy. The reproducibility and consistency of these features were evaluated using intraclass and interclass correlation coefficients. Univariate and multivariate analyses, including Cox regression, further refined the selection based on statistical significance and clinical relevance.
Additionally, robust radiomics software was utilized to normalize data across different MRI scanner settings, mitigating the impact of scanning variability on feature extraction and model performance. MRI sequences were categorized into single and multiple sequences to examine how sequence variations affect the predictive power of radiomics features. This structured approach clarified the influence of technical variations in MRI on radiomic analysis and enhanced the results' reliability and applicability. These comprehensive measures effectively addressed potential biases due to protocol variations, leading to more reliable and applicable outcomes in radiomics studies.

Methodological Considerations and Strengths
The foundational strength of our study lies in its methodological precision, highlighted by a meticulous systematic review and exhaustive analyses, including the use of meta-regression to identify sources of heterogeneity. The employment of established evaluation tools such as the Quality in Prognosis Studies (QUIPS) and the Radiomics Quality Score (RQS) enhances the reliability of our findings. QUIPS provides a qualitative assessment of bias across various domains of prognostic studies, adding depth to our analysis, while the RQS offers a quantitative measure of methodological quality. Higher RQS scores denote studies with lower risks of bias and greater methodological reliability, essential for ensuring the validity and reproducibility of results. This scoring system not only aids in distinguishing high-quality studies but also pinpoints areas needing improvement in study design and execution.
Moreover, the integration of RQS in a meta-regression against study results allows for a nuanced exploration of how methodological quality impacts reported outcomes in radiomics research. Our findings from the meta-regression, showing a coefficient of −0.0083655 with a p-value of 0.294, indicate no significant association between RQS scores and study outcomes at conventional levels of statistical significance. This analysis underscores the importance of robust methodological design in influencing the findings of radiomics studies and provides a reproducible framework for future research in this evolving field.

Limitations and Future Research Directions
Notwithstanding the compelling nature of our results, they are accompanied by limitations. The moderate heterogeneity (I² = 64.44% for OS), potential biases, and paucity of studies within certain subgroups reflect the intricate nature of radiomics research and might temper the strength of our deductions. The inability to include unpublished studies raises the possibility of publication bias, while heterogeneity in patient populations, treatments, endpoints, and radiomics methods may limit the reliability and generalizability of pooled estimates. The retrospective nature of the included studies, lack of prospective validation, and absence of a direct assessment of the clinical utility of radiomics compared to standard prognostic tools are also important limitations that underscore the need for ongoing research.
Future inquiries should focus on conducting multi-institutional prospective studies to validate radiomics models in diverse patient cohorts and real-world settings. Methodological standardization, integration of radiomics with other prognostic factors, and mechanistic investigations are key priorities. Rigorous assessments of the clinical utility and impact of integrating radiomics into prognostic models and treatment strategies are essential. Expanding radiomics research to other cancer types and imaging modalities, as well as fostering multidisciplinary collaboration and data sharing, will be crucial for advancing the field. Such endeavors are pivotal for bridging the gap between radiomics research and its clinical application, ultimately leading to more effective and personalized treatment approaches. By addressing these challenges and opportunities, future research can help transform radiomics from a promising research tool into a validated and impactful asset for advancing precision oncology.

Conclusions
In summary, our meta-analysis highlighted the significance and variability of radiomics in predicting cancer treatment outcomes, particularly focusing on overall survival due to its heterogeneity.

Figure 2. The results of QUIPS quality assessment for included studies.

Figure 1. PRISMA flowchart for the current meta-analysis.

Figure 4. Bubble plot of feature number on radiomics prognosis models with Overall Survival as endpoint.

Table 1. Basic characteristics of studies.

Table 3. Summary of details of and prognosis model.

Table 4. Subgroup analysis of radiomics prognosis model with Overall Survival as endpoint.