Abstract
Background/Objectives: To evaluate the diagnostic accuracy of artificial intelligence (AI)-based imaging techniques for liver fibrosis and metabolic dysfunction-associated steatotic liver disease (MASLD). Materials and Methods: We performed a comprehensive search in PubMed, Embase, Cochrane Library, and Web of Science until August 2025. A total of 15 studies (mean age of patients 56 years, 60% male) were included. The risk of bias in the included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. Diagnostic performance metrics were calculated using a random-effects bivariate model, including the area under the curve (AUC), sensitivity, specificity, positive and negative likelihood ratios, and diagnostic odds ratio. Meta-regression analysis was conducted to investigate potential sources of heterogeneity when I2 was ≥50%. A p-value < 0.05 was considered statistically significant. Results: For liver fibrosis, pooled sensitivity was 0.85, specificity was 0.81, and AUC was 0.92. For MASLD, sensitivity was 0.86, specificity was 0.95, and AUC was 0.99. Different imaging modalities and AI classifiers caused significant study heterogeneity. To avoid misleading pooled estimates across varied datasets, imaging modality and AI model subgroup analyses were performed. Only three studies were used to estimate MASLD; therefore, considerable between-study heterogeneity should be considered. Conclusions: AI-based imaging modalities demonstrate promising diagnostic accuracy for liver fibrosis and MASLD, warranting further standardization to enhance diagnostic consistency.
1. Introduction
AI has made substantial strides in diagnostic radiology, particularly in the evaluation of liver conditions, including liver fibrosis and metabolic dysfunction-associated steatotic liver disease (MASLD) [1]. Traditional cross-sectional imaging techniques, including computed tomography (CT), magnetic resonance imaging (MRI), and ultrasonography, have been enhanced by AI technologies [2,3,4]. These advancements involve convolutional neural networks (CNNs) for analyzing complex imaging patterns, support vector machines (SVMs) for image classification, and deep learning models such as Residual Networks (ResNet) for improving feature extraction and classification accuracy [5]. The application of AI in this context focuses on qualitative assessments (e.g., presence or absence of disease) and semi-quantitative grading of liver fibrosis (mild, moderate, severe), as well as fat fraction imaging in MRI [6]. AI technologies are designed to automate image analysis, detect subtle patterns that may be missed by human reviewers, and integrate diverse data sources to enhance overall diagnostic performance [7].
Accurate diagnosis and staging of liver fibrosis and MASLD are crucial for effective patient management and therapeutic interventions [8]. The goals of utilizing AI are to provide reliable, consistent, and timely assessments that can support or even surpass traditional imaging methods [9,10,11].
Importantly, the nomenclature of fatty liver disease has recently changed: in June 2023 an international multi-society Delphi consensus renamed the formerly used term nonalcoholic fatty liver disease (NAFLD) to MASLD, and nonalcoholic steatohepatitis (NASH) to metabolic dysfunction-associated steatohepatitis (MASH) [12]. This change reflects increasing recognition of the central role of metabolic dysfunction in driving hepatic steatosis and seeks to provide positive diagnostic criteria rather than exclusion-based definitions [13]. Because the volume of literature published under the new MASLD definition remains modest, especially in imaging and AI integration, it is timely to update reviews and meta-analyses using the updated terminology and to clarify how prior work under the NAFLD heading maps to MASLD.
This meta-analysis evaluates the diagnostic performance of AI-integrated abdominal imaging modalities for liver fibrosis and MASLD (formerly NAFLD), aiming to assess the effectiveness of various AI techniques in detecting and staging these conditions and to synthesize current research findings.
2. Materials and Methods
This systematic review and meta-analysis followed the PRISMA guidelines [14] as described in Supplementary Table S1. This review protocol is registered with PROSPERO, with the registration number CRD42024592089.
2.1. Inclusion and Exclusion Criteria
Relevant articles were screened by title and abstract after removing duplicates. Studies were eligible for inclusion if they assessed the diagnostic performance of AI-based abdominal imaging modalities for liver fibrosis and metabolic dysfunction-associated steatotic liver disease (MASLD). Because the MASLD terminology was only introduced in 2023, studies published before this date generally referred to “non-alcoholic fatty liver disease (NAFLD)” while describing essentially the same disease spectrum. Therefore, for consistency and inclusivity, both NAFLD- and MASLD-related studies were considered, acknowledging their clinical and pathophysiological overlap. The remaining studies were then examined in full text to confirm eligibility. Inclusion criteria for articles were: (1) observational studies reporting the diagnostic performance of AI-based abdominal imaging modalities in the diagnosis or staging of liver fibrosis and/or MASLD (previously referred to as NAFLD); (2) articles had to specify the reference standard (diagnostic method(s)) and class(es) of AI; (3) publications reporting sensitivity and specificity data or providing sufficient information to calculate them; (4) articles that clearly reported training and test datasets or contained information on validation methods; and (5) studies published as original articles. Exclusion criteria were: (1) no full text electronically available; (2) publication in a language other than English; (3) comments, letters, editorials, protocols, guidelines, and review papers; (4) studies with insufficient outcome data; and (5) animal studies.
2.2. Search Strategy
We conducted a comprehensive literature search in PubMed, the Web of Science, Scopus, the Cochrane Library, Embase, Google Scholar, and CINAHL until 30 August 2025. Search terms included combinations of “artificial intelligence,” “deep learning,” “machine learning,” “imaging,” “liver fibrosis,” “steatosis,” “non-alcoholic fatty liver disease (NAFLD),” and “metabolic dysfunction-associated steatotic liver disease (MASLD)” to ensure inclusion of studies published both before and after the nomenclature change. We also manually searched the references mentioned in narrative reviews and pertinent non-systematic papers to find further relevant studies that our search approach could have overlooked. The detailed search strategy is described in Supplementary Table S2.
2.3. Screening and Selection Process
Two independent reviewers, K.B.M. and A.P., both with over five years of experience in abdominal imaging and AI-based diagnostics, conducted the search and selection process. Each reviewer handled distinct sets of studies to ensure exhaustive coverage and accuracy. The kappa coefficient (κ) was used to evaluate the inter-rater reliability for the review process and data extraction [15].
Discrepancies were resolved through discussion with J.A. (10 years of experience in radiology) to ensure consistent and precise study inclusion.
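As an illustration of the agreement statistic mentioned above, the following minimal sketch computes Cohen's kappa from a 2 × 2 table of include/exclude decisions by two reviewers. The function name and the counts are hypothetical and chosen only for arithmetic clarity; they are not the screening data of this review.

```python
def cohen_kappa(both_include: int, only_a: int, only_b: int, both_exclude: int) -> float:
    """Cohen's kappa for two raters making a binary include/exclude decision."""
    n = both_include + only_a + only_b + both_exclude
    # Observed agreement: proportion of records on which the raters agree.
    p_observed = (both_include + both_exclude) / n
    # Chance agreement from each rater's marginal include/exclude rates.
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    p_expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts (not taken from the review): 50 screened records.
kappa = cohen_kappa(both_include=20, only_a=5, only_b=10, both_exclude=15)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is generally preferred over simple percent agreement when reporting screening reliability.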
2.4. Data Extraction
The two independent authors conducted the data extraction process in duplicate. They retrieved information from the eligible articles following the predefined inclusion and exclusion criteria. Information was systematically collected on a standardized data sheet, which included the following variables: (1) study ID, (2) country, (3) study design (retrospective, cohort, prospective), (4) diagnosis (liver fibrosis or MASLD/NAFLD), (5) fibrosis stage (F1–F4), (6) abdominal imaging modality (MRI, CT, ultrasound, shear wave elastography), (7) total number of patients, (8) total number of images, (9) AI classifier (SVM, CNN, etc.), (10) AI performance metrics, (11) type of dataset (training, validation, or test), (12) sensitivity (%), (13) specificity (%), and (14) accuracy (%). Given that most included studies predated the 2023 MASLD terminology, diagnostic definitions were interpreted according to their equivalence to the current MASLD criteria, as described in the EASL–EASD–EASO Clinical Practice Guidelines [12]. Any inconsistencies in the extracted data were resolved by consulting a third reviewer with substantial expertise in diagnostic imaging. The accuracy of the data extraction process was ensured through systematic checks and cross-verification. Although the κ coefficient demonstrated almost perfect agreement, a total of three discrepancies occurred during screening, all of which were resolved by consensus. No formal calibration exercise was performed prior to screening.
2.5. Quality Assessment of the Studies
The methodological quality of the included studies was independently evaluated using the QUADAS-2 tool, which includes four criteria: “patient selection”, “index test”, “reference standard”, and “flow and timing”, and judges bias and applicability [16]. Each article was assessed in terms of risk of bias, and the first 3 domains were assessed with respect to applicability. Each item is answered with “yes,” “no,” or “unclear”. The answer of “yes” means low risk of bias, whereas “no” or “unclear” means the opposite, as attached in Supplementary Table S3. The Grading of Recommendations, Assessment, Development and Evaluations (GRADE) assessment tool was not used due to the challenge of applying it to diagnostic test accuracy (DTA) reviews [17]. The quality assessment was conducted by the two independent reviewers. Any discrepancy was settled through mutual discussion and consensus with the third reviewer. The results were presented using Review Manager 5.4.
2.6. Data Syntheses and Analyses
This diagnostic meta-analysis was conducted using the analytical software Meta-DiSc 1.4 and the statistical software Comprehensive Meta-Analysis version 3 (Biostat Inc., Englewood, CO, USA) to estimate the pooled sensitivity, specificity, Positive Likelihood Ratio (PLR), Negative Likelihood Ratio (NLR), Diagnostic Odds Ratio (DOR), and AUC values with 95% confidence intervals (CIs) across studies. As the terminology shift from NAFLD to MASLD occurred recently, the pooled analysis integrated both terms under a unified MASLD framework, reflecting the current understanding that both refer to the same metabolic liver disease spectrum. The data were considered statistically significant when two-sided p < 0.05. A random effects model was used in all analyses owing to an expectation of heterogeneity of data across studies [18]. The summary receiver operating characteristic (SROC) curve was also used based on the sensitivity and specificity of each study to assess the diagnostic performance [19].
Because of the differences in the basic features of the included articles, their diverging results may have been caused by heterogeneity or random errors. Therefore, the Cochran chi-squared (Q) test was used to evaluate heterogeneity among articles, with p < 0.05 indicating the existence of heterogeneity [20]. To estimate the impact of heterogeneity on the meta-analysis, the I2 value was also calculated. If p < 0.05 and I2 > 50%, heterogeneity was defined as statistically significant [21]. In order to explore heterogeneity, the threshold effect was assessed using the Spearman correlation coefficient [22]. A strong positive correlation would suggest a threshold effect. Subgroup and random effects meta-regression analyses were also performed according to the type of abdominal imaging modality and the AI classifier used, both to identify potential sources of heterogeneity and to determine which approaches seem most promising. We did not evaluate further covariates, such as age and sex, because they were not reported in all included papers. Also, there was no significant difference in study quality and sample size between studies, so these variables were excluded from subgroup and meta-regression analysis. Furthermore, a sensitivity analysis was conducted to evaluate the validity and robustness of the meta-analysis. Finally, Egger’s test was conducted to evaluate publication bias [23]. The latter was further assessed by the visual inspection of the symmetry in funnel plots.
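The relationship between the chi-squared heterogeneity statistic and I2 can be sketched as follows. This is a minimal illustration of the standard formula I2 = (Q − df)/Q with df = k − 1 for k studies, not the code used by the analysis software; the function name is an assumption, and the example Q values are those reported in the Results for the fibrosis sensitivity (12 studies) and the MASLD specificity (3 studies).

```python
def i_squared(q: float, n_studies: int) -> float:
    """I2 (in percent) from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    # I2 is truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, (q - df) / q) * 100.0


# Q values as reported in the Results section of this review.
print(f"{i_squared(57.19, 12):.1f}%")  # fibrosis sensitivity, 12 studies
print(f"{i_squared(12.00, 3):.1f}%")   # MASLD specificity, 3 studies
```

Because I2 depends only on Q and the number of studies, it can be recomputed from any reported chi-squared value as a quick consistency check.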
3. Results
3.1. Study Selection
A total of 824 studies were initially identified through comprehensive database searches. Following the screening of 446 abstracts, 15 studies met the eligibility criteria and were included in the final systematic review and meta-analysis. The Kappa coefficient was 0.94, indicating almost perfect agreement between the two reviewers.
The PRISMA flowchart in Figure 1 details this process.
Figure 1.
PRISMA Flow diagram of the literature study process and selection.
Although the updated MASLD terminology was included in our search, most eligible studies still referred to NAFLD, reflecting that the majority of AI-based imaging literature predates the 2023 redefinition. These studies, however, address the same metabolic liver disease continuum now classified as MASLD, supporting their inclusion in this analysis. Characteristics of included studies are summarized in Table 1.
Table 1.
Summary of Key Studies on AI Classifiers for Liver Conditions: Study Details and Performance Metrics.
3.2. Study Characteristics
The 15 studies assessed AI-based imaging for liver fibrosis and MASLD (previously NAFLD), published between 2012 and 2024 [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38], across seven countries: China (n = 6), Republic of Korea (n = 2), USA (n = 2), Japan (n = 2), The Netherlands (n = 1), Egypt (n = 1), and Iran (n = 1). Study designs included retrospective (n = 7), prospective (n = 3), and cohort (n = 5) studies. Fibrosis staging definitions varied across studies, most commonly using the Kleiner system (F0–F4) [39].
The imaging modalities used were MRI (n = 5), CT (n = 4), ultrasonography (n = 4), shear wave elastography (n = 1), and transient elastography (n = 1). AI classifiers included CNNs, SVMs, ResNet, ANNs, Random Forest, and deep learning radiomics. Study sample sizes ranged from 37 to 7461 participants, with sensitivities from 63% to 97% and specificities from 59% to 100%.
To align pre-2023 NAFLD studies within the MASLD framework, only studies in which at least one metabolic risk factor was explicitly reported (obesity, diabetes, insulin resistance, dyslipidemia, or hypertension) were retained. Sensitivity analysis excluding two studies without clear metabolic profiling did not materially change pooled fibrosis estimates (ΔAUC < 0.02, Δsensitivity < 0.03), supporting the validity of this reclassification approach. When interpreted under the MASLD framework, these findings collectively represent AI’s growing potential to quantify hepatic steatosis and fibrosis across the metabolic disease spectrum.
The primary diagnostic effect size for this meta-analysis was the diagnostic odds ratio (DOR), with AUC and sensitivity/specificity treated as secondary accuracy metrics. Prediction intervals were additionally calculated for pooled DOR and AUC.
The AI classifiers included:
Convolutional Neural Networks (CNNs, n = 8): A type of deep learning model designed to automatically and adaptively learn spatial hierarchies of features from images. CNNs are particularly effective in image classification tasks [40].
Support Vector Machines (SVMs, n = 3): A machine learning algorithm that finds the optimal hyperplane to separate different classes in a dataset. It is effective for classification problems with a clear margin of separation [41].
ResNet (n = 1): A deep learning architecture that uses residual connections to help train very deep networks by mitigating the vanishing gradient problem [42].
Deep Learning Radiomics (n = 1): An approach combining deep learning with radiomics to extract and analyze features from medical images for improved diagnostic accuracy [43].
Artificial Neural Networks (ANNs, n = 1): A class of machine learning models inspired by the human brain, consisting of interconnected nodes (neurons) that process information in layers to learn complex patterns [44].
Random Forest (n = 1): An ensemble learning method that uses multiple decision trees to improve classification accuracy by averaging the results of individual trees to reduce overfitting and increase predictive performance [45].
Study sample sizes ranged from 37 to 7461 participants. Sensitivities varied from 63% to 97.2%, specificities from 59% to 100%, and accuracies from 74% to 98.64%. Disease stages reported included various fibrosis stages, i.e., ≥F2 (≥fibrosis stage 2), F1–F3 (fibrosis stages 1 through 3), and F0–F4 (fibrosis stages 0 through 4). AI classifiers were assessed through different methodologies, including cross-validation techniques, feature selection, and deep learning architectures, which reflect a broad range of approaches to enhancing diagnostic precision. More details about the methodological characteristics of the selected studies utilizing AI in abdominal imaging for liver disease, such as model types, validation strategies, and clinical implications, can be found in Supplementary Tables S4 and S5. The Kappa coefficient was 0.92, indicating almost perfect agreement between the two reviewers.
3.3. Assessment of Risk of Bias
The quality of the 15 studies was methodologically assessed using the QUADAS-2 tool. In the patient selection domain, 4/15 studies had a high risk of bias because they did not use a consecutive or random sample. Nine studies showed a high risk of bias in the index test domain because the threshold was not prespecified. The reference standard and flow and timing domains were not affected by risk of bias. Applicability concerns were absent for the majority of included studies; high applicability concerns were identified in 2 studies for patient selection and in 7 studies for the index test. QUADAS-2 results for individual studies are tabulated in Supplementary Tables S6–S8.
3.4. Publication Bias
To evaluate the potential presence of publication bias, we employed a funnel plot asymmetry test, supplemented by Egger’s regression test. This dual-method approach ensures a thorough assessment of whether small-study effects or selective reporting biases might have distorted our meta-analytic findings. Egger’s regression analysis identified a statistically significant publication bias (p < 0.001) for the DOR values associated with AI-based imaging for liver fibrosis. Visual inspection of the funnel plot (Figure S1) confirmed notable asymmetry, indicating the presence of publication bias. Because only three MASLD studies were available, publication bias testing and meaningful pooling are statistically unreliable and must be interpreted as exploratory only.
3.5. Findings
3.5.1. Liver Fibrosis
Twelve studies [24,25,26,27,28,29,30,31,32,33,34,35] investigated the diagnostic accuracy of AI-based abdominal imaging for the detection of liver fibrosis. The analysis demonstrated significant heterogeneity in sensitivity (Chi2 = 57.19, p < 0.0001; I2 = 80.8%), requiring a random-effects model for the pooled estimate. Sensitivity scores ranged from moderate to high and yielded a pooled estimate of 0.85 (95% CI: 0.82–0.87), representing excellent diagnostic performance. Specificity also showed high heterogeneity across studies (Chi2 = 103.03, p < 0.0001; I2 = 89.3%), with a pooled estimate of 0.81 (95% CI: 0.79–0.83) (Figure 2A,B).
Figure 2.
Diagnostic Performance of AI-Based Abdominal Imaging for Liver Fibrosis: Meta-Analysis Forest Plots. Forest plots display pooled specificity (A), sensitivity (B), negative likelihood ratio (C), and positive likelihood ratio (D) across 12 studies. Red circles show study estimates, blue lines their 95% CIs, red diamonds the pooled effects, red dashed lines their positions, and grey dashed lines reference values.
The pooled positive likelihood ratio (PLR) demonstrated extreme heterogeneity (Chi2 = 109.81, p < 0.0001; I2 = 90%), with a pooled estimate of 5.08 (95% CI: 3.48–7.42), indicating a strong ability to correctly identify diseased individuals. Similarly, the negative likelihood ratio (NLR) was highly heterogeneous (Chi2 = 52.23, p < 0.0001; I2 = 78.9%), with a pooled value of 0.18 (95% CI: 0.13–0.25), supporting the use of AI to reliably rule out disease in healthy individuals (Figure 2C,D).
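For readers less familiar with these metrics, the sketch below shows how likelihood ratios and the diagnostic odds ratio follow from a single sensitivity/specificity pair. The function name is illustrative. The inputs are the pooled fibrosis sensitivity and specificity reported above; because the review pooled PLR, NLR, and DOR separately across studies under a random-effects model, the derived values need not coincide with the pooled estimates in the text.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (PLR, NLR, DOR) for one sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds of disease
    nlr = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds of disease
    dor = plr / nlr                           # single summary of discriminative ability
    return plr, nlr, dor


# Pooled fibrosis sensitivity and specificity from this section.
plr, nlr, dor = likelihood_ratios(0.85, 0.81)
print(f"PLR = {plr:.2f}, NLR = {nlr:.3f}, DOR = {dor:.2f}")
```

A PLR well above 1 and an NLR well below 1 both increase the DOR, which is why the DOR is often used as a single effect size in diagnostic meta-analyses.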
DOR analysis also showed significant heterogeneity (Chi2 = 69.19, p < 0.0001; I2 = 84.1%), with individual study estimates varying widely. The pooled DOR was 30.87 (95% CI: 17.06–55.86), indicating outstanding overall diagnostic accuracy (Figure 3A). The SROC curve yielded an AUC of 0.9165, confirming excellent diagnostic discrimination of liver fibrosis by AI-based imaging modalities (Figure 3B).
Figure 3.
Overall Diagnostic Accuracy of AI-Based Imaging for Liver Fibrosis. The forest plot (A) shows diagnostic odds ratios across 12 studies, with red circles as study estimates, blue lines as 95% CIs, red diamonds as pooled effects, red dashed lines marking pooled positions, and grey dashed lines reference values. The SROC curve (B) uses red circles and blue curves, confirming high accuracy (AUC = 0.92).
Detailed ranges of PLR, NLR, and DOR for individual studies are presented in Supplementary Table S9.
In the context of metabolic dysfunction-associated steatotic liver disease (MASLD), these findings underscore the critical role of AI in detecting and staging fibrosis, which represents the main prognostic determinant of disease progression. Although the majority of included studies predated the formal adoption of the MASLD terminology, the same metabolic and histopathological processes underlie what was historically referred to as NAFLD. Thus, these results can be confidently interpreted within the MASLD framework, highlighting AI’s ability to support early, non-invasive risk stratification and improve longitudinal monitoring of hepatic fibrosis in metabolically driven liver disease.
3.5.2. MASLD (Previously NAFLD)
Three studies [36,37,38] evaluated the diagnostic accuracy of AI-augmented abdominal imaging for MASLD. Given the limited number of included studies and considerable methodological heterogeneity, these pooled results should be interpreted with caution. Sensitivity was extremely heterogeneous (Chi2 = 61.58, p < 0.001; I2 = 96.8%), with a pooled estimate of 0.86 (95% CI: 0.81–0.89). Specificity was also highly variable (Chi2 = 12.00, p = 0.002; I2 = 83.3%), with a pooled estimate of 0.95 (95% CI: 0.92–0.97) (Figure 4A,B).
Figure 4.
AI-Based Imaging for MASLD Diagnosis: Summary of Diagnostic Accuracy Metrics. Forest plots show pooled specificity (A), sensitivity (B), NLR (C), and PLR (D) from three studies. Red circles depict study estimates, blue lines their 95% CIs, red diamonds pooled effects, red dashed lines pooled positions, and grey dashed lines reference values. Results indicate high diagnostic performance with heterogeneity in likelihood ratios.
The pooled positive likelihood ratio (PLR) was 16.95 (95% CI: 4.87–59.08), with significant heterogeneity (Chi2 = 8.43, p = 0.014; I2 = 76.3%). Negative likelihood ratio (NLR) results showed even greater inconsistency (Chi2 = 69.00, p < 0.001; I2 = 97.1%), with a pooled estimate of 0.08 (95% CI: 0.01–1.18) (Figure 4C,D). Corresponding individual study values are detailed in Supplementary Table S10.
The pooled DOR was 305.08 (95% CI: 13.53–6877.82), indicating very strong diagnostic performance but again with substantial heterogeneity (Chi2 = 25.50, p < 0.001; I2 = 92.2%) (Figure 5A). The SROC curve demonstrated near-perfect discrimination, with an AUC of 0.9881 (Figure 5B).
Figure 5.
Diagnostic Accuracy of AI-Based Imaging for MASLD Detection. (A) Forest plot displays diagnostic odds ratios from three studies, with red circles indicating individual estimates, blue diamonds pooled effects, blue triangles study weights, and red and grey dashed lines marking confidence and reference intervals. (B) SROC curve shows excellent performance (AUC = 0.99), with red points representing study sensitivity–specificity pairs and the blue line depicting the fitted curve.
Despite this high AUC, the extreme heterogeneity in sensitivity and specificity severely limits interpretability and precludes drawing definitive conclusions. The wide confidence intervals of several performance metrics further highlight the uncertainty surrounding the diagnostic accuracy of AI for MASLD.
To assess whether variability in diagnostic thresholds contributed to the observed heterogeneity, a threshold effect analysis was conducted using the Moses model and inverse variance weighting. The Spearman correlation coefficient was 0.143 (p = 0.760), indicating no significant threshold effect across studies. Accordingly, the pooled diagnostic accuracy metrics are unlikely to be biased by threshold variation.
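The threshold-effect check described above can be sketched as a Spearman rank correlation between the logit of sensitivity and the logit of (1 − specificity) across studies. This is a minimal standard-library illustration under that assumption, not the Meta-DiSc implementation; the helper names are assumptions, and the input values are hypothetical toy data, not the per-study estimates of this review.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def average_ranks(values: list[float]) -> list[float]:
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_spearman(sensitivities: list[float], specificities: list[float]) -> float:
    """Spearman correlation between logit(TPR) and logit(FPR) across studies."""
    logit_tpr = [logit(s) for s in sensitivities]
    logit_fpr = [logit(1.0 - s) for s in specificities]
    return pearson(average_ranks(logit_tpr), average_ranks(logit_fpr))


# Hypothetical toy data: sensitivity rises as specificity falls, the pattern
# expected when studies differ mainly in their positivity threshold.
rho = threshold_spearman([0.70, 0.80, 0.90, 0.95], [0.95, 0.90, 0.80, 0.70])
print(f"Spearman rho = {rho:.2f}")
```

A coefficient close to +1, as in this toy example, would point to a threshold effect, whereas a value near zero, such as the one reported above, would not.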
Given the very small number of included studies and the extreme between-study heterogeneity (I2 > 90% for several parameters), these pooled estimates are statistically unstable, and formal meta-analytic pooling may not be valid under these conditions. Therefore, the quantitative results should be interpreted cautiously and primarily as hypothesis-generating rather than confirmatory. A narrative interpretation is more appropriate for the MASLD subgroup until a larger number of methodologically comparable studies becomes available.
3.6. Subgroup Analyses
Subgroup analyses revealed variations in diagnostic accuracy depending on the imaging modality and AI classifier used. Meta-regression identified study quality, AI classifier type, and imaging modality as significant sources of heterogeneity. Statistically significant heterogeneity (p < 0.05) was observed across diagnostic performance metrics (sensitivity, specificity, PLR, NLR, DOR, and AUC). Further subgroup and meta-regression analyses for liver fibrosis studies showed no statistically significant differences between imaging modalities (p = 0.6962) or AI classifiers (p = 0.5479) (Table 2).
Table 2.
Subgroup and Meta-Regression Analysis of Diagnostic Accuracy in Liver Fibrosis Studies Based on Imaging Modality and AI Classifier.
Additionally, to further explore the likely origin of heterogeneity, a leave-one-out sensitivity analysis was performed. The DOR values of AI-based abdominal imaging modalities for the diagnosis of liver fibrosis did not differ markedly when individual studies were omitted, indicating that the meta-analysis was robust. The DOR values ranged from 19.431 (95% CI: 15.214–24.818; p < 0.001) to 26.225 (95% CI: 20.237–33.985; p < 0.001) (Table 3).
Table 3.
Sensitivity Analysis of Diagnostic Odds Ratios for AI-Enhanced Imaging in Liver Fibrosis Diagnosis.
Sensitivity analysis could not be performed for MASLD due to the limited number of studies.
4. Discussion
This meta-analysis emphasizes the transformative potential of AI in the restructuring of the diagnostic landscape of liver disease, particularly within the contemporary framework of MASLD. The transition from the NAFLD terminology to MASLD is not merely a change in nomenclature; it is a paradigm shift that acknowledges the metabolic underpinnings of hepatic steatosis and fibrosis [46]. AI is a potent ally in this changing clinical context, capable of overcoming long-standing limitations in non-invasive liver imaging, including operator dependency, interpretative variability, and the limited sensitivity of traditional tools in early disease detection [47,48]. By integrating data from various imaging modalities, including CT, elastography, MRI, and ultrasound, AI exhibits a distinctive ability to standardize interpretation across centers and harmonize diagnostic procedures [49,50]. The complex spectrum of metabolic, hepatic, and cardiovascular interactions that characterize MASLD is contingent upon these characteristics [46]. Unlike traditional diagnostic scores, such as the FIB-4 or the NAFLD fibrosis score, which rely on indirect biochemical parameters [51], AI-driven imaging models derive complex, multidimensional data from raw imaging signals [52]. This capability enables the more comprehensive characterization of tissue texture, elasticity, and composition, thereby establishing a data-rich, objective, and reproducible framework for disease assessment [53].
The present findings support this evolving clinical role, as AI demonstrated high pooled diagnostic accuracy for liver fibrosis (sensitivity 0.85; AUC 0.92); however, prediction intervals indicated wide expected variability (DOR PI 3.9–172.4), underscoring that real-world performance may differ substantially from the pooled averages. This variability is consistent with the substantial heterogeneity (I2 > 80%) observed across studies, which was predominantly driven by differences in imaging modality, AI model architecture, fibrosis thresholds, reference standards, and study design.
Interpretation of pooled diagnostic accuracy requires caution because the included AI systems differed substantially in model architecture (e.g., CNNs, hybrid networks, radiomics-based approaches), imaging modalities (ultrasound, CT, MRI), and training datasets. Such methodological variability limits direct comparability across studies and reduces the stability of pooled metrics. Future research should implement standardized reporting guidelines, harmonized validation frameworks, and external multi-center testing to enable more reliable cross-study comparisons. The significance of AI in the diagnosis of MASLD is not limited to accuracy metrics from a clinical perspective. The genuine clinical value of this test is its capacity to identify subclinical or early fibrotic changes that frequently precede irreversible liver injury [54]. The most recent EASL–EASD–EASO Clinical Practice Guidelines [12] underscore the importance of detecting fibrosis progression prior to the onset of cirrhosis in the management of MASLD. In order to enable more personalized and timely interventions, these guidelines promote the development of improved non-invasive diagnostic instruments. The potential of AI to systematize quantification and risk stratification directly supports this agenda, as it offers scalable solutions that can be integrated into population-level screening programs [55]. Machine learning algorithms, including convolutional neural networks (CNNs), ResNet models, and support vector machines (SVMs), can identify intricate visual features that are invisible to the human eye [56]. These algorithms convert subjective interpretation into a standardized computational process [42,45,57]. This is particularly advantageous for MASLD, as it may be feasible to distinguish benign steatosis from early fibrotic transformation by making use of nuanced imaging distinctions.
However, the MASLD findings of this review must be interpreted cautiously. Only three studies were eligible for MASLD analysis, and extreme heterogeneity (I2 > 90%) produced unstable pooled estimates (AUC PI 0.71–1.00). Therefore, these results should be considered exploratory and hypothesis-generating rather than definitive. A narrative synthesis is more appropriate for this limited evidence base, as individual study effect sizes varied markedly and confidence intervals were wide.
When comparing endpoints, MASLD studies showed a pooled sensitivity of 0.86 and an AUC of 0.99, but the uncertainty was considerably higher than for fibrosis, limiting generalizability despite seemingly high point estimates.
Quantitatively, AI-based imaging demonstrated stronger and more consistent performance for liver fibrosis (pooled sensitivity 0.85, specificity 0.81, AUC 0.92) compared with MASLD, where the pooled AUC remained high (0.99), but estimates were unstable due to extreme heterogeneity (I2 > 90%). Fibrosis results were supported by narrower confidence intervals and a more reliable evidence base, whereas MASLD findings should be considered exploratory. The redefinition of NAFLD to MASLD does not invalidate prior research; rather, it contextualizes it within a broader metabolic spectrum. According to certain authors [58], the overlap between NAFLD and MASLD exceeds 95%, thereby guaranteeing that historical data remains highly pertinent. In fact, the current analysis is one of the most comprehensive and early attempts to reinterpret the existing AI-imaging literature through the MASLD lens. This alignment bolsters the study’s scientific contribution and novelty by illustrating how established datasets can inform the new nomenclature, thereby advancing toward a unified, metabolism-focused understanding of liver disease [59].
To ensure methodological rigor, earlier NAFLD studies were reclassified as MASLD only when metabolic dysfunction was explicitly documented. Sensitivity analyses excluding borderline cases produced no meaningful change in pooled fibrosis accuracy (ΔAUC < 0.02), supporting the robustness of this approach.
From a methodological perspective, the heterogeneity of clinical practice is reflected in the diversity of AI architectures and imaging modalities in this meta-analysis [60]. Although this diversity introduces variability, it also reflects real-world conditions, under which consistent performance across diverse contexts is essential for clinical translation [61]. Separate meta-analyses by fibrosis threshold (≥F2, ≥F3, F4) were not feasible because individual studies used heterogeneous staging cutoffs and did not consistently report threshold-specific accuracy metrics.
Nevertheless, the interpretability of AI systems remains a significant obstacle. Black-box algorithms may generate accurate predictions without giving clinicians transparent reasoning [62]. The next generation of AI tools must integrate explainable AI (XAI) principles to earn clinical trust and support regulatory adoption. This involves linking model outputs to clinically interpretable imaging biomarkers and providing decision support rather than opaque classification [63].
Risk-of-bias assessment revealed key methodological shortcomings across studies, particularly non-prespecified index test thresholds and mixed reference standards, which may inflate diagnostic accuracy and should be acknowledged when interpreting pooled estimates. Several studies raised high concern in key QUADAS-2 domains, most notably the index test domain, where non-prespecified or data-driven thresholds increased the risk of inflated accuracy estimates. Additional concerns were identified in the reference standard and flow/timing domains in some studies, further underscoring the need for more rigorous methodological standardization in future AI-imaging research.
A further aspect of the impact of this work is its implications for global health equity. Populations with metabolic risk factors, including obesity and type 2 diabetes, which are highly prevalent worldwide, are disproportionately affected by MASLD [64]. AI-driven imaging platforms have the potential to democratize access to dependable diagnostics in resource-limited settings where biopsy or specialized imaging expertise is unavailable [65]. The capacity of AI to integrate with standard imaging devices, such as ultrasound, provides a practicable route to broader implementation without extensive infrastructural investment. This aligns with the multidisciplinary European guidelines [12], which emphasize diagnostic strategies that are both cost-effective and scalable for the management of MASLD.
Nevertheless, caution is advisable as the field advances toward clinical deployment. Model performance can be influenced by algorithmic bias, variability in imaging quality, and inconsistent ground-truth definitions [66]. Transparent reporting, external validation, and multicenter standardization should be prioritized in future prospective studies to ensure reproducibility.
Furthermore, the integration of clinical, biochemical, and imaging data through multimodal AI systems may yield even more precise diagnostic signatures, thereby providing a comprehensive representation of the metabolic–liver interface [67]. This study suggests that AI is not simply a marginal enhancement to traditional imaging but a potentially transformative approach that augments the MASLD framework. Combining clinical needs with technology in this way can support personalized hepatology that is objective, scalable, and equitable. This meta-analysis provides a conceptual and methodological reference point for future research by consolidating a varied range of data under the unified MASLD classification. The findings support the view that AI-driven imaging is poised to become an important element of modern hepatology. Ultimately, the technology may improve patient outcomes by increasing the accuracy of care, facilitating earlier disease detection, and improving consistency.
5. Conclusions
AI-assisted imaging shows promising diagnostic accuracy for liver fibrosis and metabolic dysfunction-associated steatotic liver disease. However, results for MASLD are based on only three heterogeneous studies and should be interpreted as preliminary. Through the integration of advanced computational algorithms and multimodal imaging, AI has the potential to offer evaluations that are more reproducible and standardized than those of conventional instruments, provided that interpretability is addressed. Future work must incorporate standardized validation frameworks, threshold harmonization, and external testing across imaging platforms before clinical deployment.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm14238466/s1, Table S1: PRISMA-DTA checklist. Table S2: Full database search strategies. Table S3: QUADAS-2 data extraction template. Table S4: Methodological characteristics of included studies. Table S5: Clinical implications and limitations of AI models. Table S6: QUADAS-2 assessments for all included studies. Table S7: Reviewer agreement for QUADAS-2 domains. Table S8: Summary of risk of bias and applicability across studies. Table S9: Diagnostic performance metrics for liver fibrosis studies. Table S10: Diagnostic odds ratios for MASLD studies. Figure S1: Funnel plot for pooled DOR analysis.
Author Contributions
Conceptualization, R.A.P. and J.A.; methodology, R.A.P., K.B.M. and A.P.; software, R.A.P.; validation, K.B.M., A.P. and J.A.; formal analysis, R.A.P. and J.A.; investigation, K.B.M. and A.P.; resources, D.S.K.; data curation, R.A.P., K.B.M. and A.P.; writing—original draft preparation, R.A.P.; writing—review and editing, J.A. and V.R.; visualization, R.A.P.; supervision, J.A.; project administration, R.A.P. and J.A.; critical revision of the manuscript, D.S.K. and V.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Ethical review and approval were waived for this study, as it is a meta-analysis based on previously published data and does not involve any new studies with human participants or animals.
Informed Consent Statement
Patient consent was waived because all data were obtained from published studies in which informed consent had already been acquired by the original investigators.
Data Availability Statement
The extracted dataset is available in FigShare at: Pugliesi, Rosa Alba (2025). Meta-Analysis of AI Integration in Abdominal Imaging for Liver Fibrosis and MASLD: Evaluating Diagnostic Accuracy and Clinical Impact. figshare. Dataset. https://doi.org/10.6084/m9.figshare.30505748.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Decharatanachart, P.; Chaiteerakij, R.; Tiyarattanachai, T.; Treeprasertsuk, S. Application of artificial intelligence in chronic liver diseases: A systematic review and meta-analysis. BMC Gastroenterol. 2021, 21, 10.
2. Paudyal, R.; Shah, A.D.; Akin, O.; Do, R.K.G.; Konar, A.S.; Hatzoglou, V.; Mahmood, U.; Lee, N.; Wong, R.J.; Banerjee, S.; et al. Artificial Intelligence in CT and MR Imaging for Oncological Applications. Cancers 2023, 15, 2573.
3. Primakov, S.P.; Ibrahim, A.; van Timmeren, J.E.; Wu, G.; Keek, S.A.; Beuque, M.; Granzier, R.W.Y.; Lavrova, E.; Scrivener, M.; Sanduleanu, S.; et al. Automated detection and segmentation of non-small cell lung cancer computed tomography images. Nat. Commun. 2022, 13, 3423.
4. Dercle, L.; McGale, J.; Sun, S.; Marabelle, A.; Yeh, R.; Deutsch, E.; Mokrane, F.-Z.; Farwell, M.; Ammari, S.; Schoder, H.; et al. Artificial intelligence and radiomics: Fundamentals, applications, and challenges in immunotherapy. J. Immunother. Cancer 2022, 10, e005292.
5. Abdar, M.; Pourpanah, F.; Hussain, S.; Rezazadegan, D.; Liu, L.; Ghavamzadeh, M.; Fieguth, P.; Cao, X.; Khosravi, A.; Acharya, U.R.; et al. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Inf. Fusion 2021, 76, 243–297.
6. Le Berre, C.; Sandborn, W.J.; Aridhi, S.; Devignes, M.-D.; Fournier, L.; Smaïl-Tabbone, M.; Danese, S.; Peyrin-Biroulet, L. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology 2020, 158, 76–94.
7. Diaz, O.; Kushibar, K.; Osuala, R.; Linardos, A.; Garrucho, L.; Igual, L.; Radeva, P.; Prior, F.; Gkontra, P.; Lekadir, K. Data preparation for artificial intelligence in medical imaging: A comprehensive guide to open-access platforms and tools. Phys. Med. 2021, 83, 25–37.
8. Yu, J.H.; Lee, H.A.; Kim, S.U. Noninvasive imaging biomarkers for liver fibrosis in nonalcoholic fatty liver disease: Current and future. Clin. Mol. Hepatol. 2023, 29 (Suppl. S1), S136–S149.
9. Frank, R.A.; Bossuyt, P.M.; McInnes, M.D.F. Systematic reviews and meta-analyses of diagnostic test accuracy: The PRISMA-DTA statement. Radiology 2018, 288, 313–314.
10. Maleki Varnosfaderani, S.; Forouzanfar, M. The Role of AI in Hospitals and Clinics: Transforming Healthcare in the 21st Century. Bioengineering 2024, 11, 337.
11. Zhang, X.; Zhang, Y.; Zhang, G.; Qiu, X.; Tan, W.; Yin, X.; Liao, L. Deep learning with radiomics for disease diagnosis and treatment: Challenges and potential. Front. Oncol. 2022, 12, 773840.
12. European Association for the Study of the Liver; European Association for the Study of Diabetes; European Association for the Study of Obesity. EASL–EASD–EASO Clinical Practice Guidelines on the management of metabolic dysfunction-associated steatotic liver disease (MASLD): Executive Summary. Diabetologia 2024, 67, 2375–2392.
13. Rajan, V.; Das, A.; Venkatachalam, J.; Lohani, K.K.; Lahariya, C. Managing Metabolic Dysfunction-associated Steatotic Liver Disease (MASLD) in Primary Care Settings: A Review. Prev. Med. Res. Rev. 2025, 2, 183–191.
14. Liberati, A.; Altman, D.G.; Tetzlaff, J.; Mulrow, C.; Gotzsche, P.C.; Ioannidis, J.P.A.; Clarke, M.; Devereaux, P.J.; Kleijnen, J.; Moher, D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: Explanation and elaboration. BMJ 2009, 339, b2700.
15. McHugh, M.L. Interrater reliability: The kappa statistic. Biochem. Med. 2012, 22, 276–282.
16. Whiting, P.F.; Rutjes, A.W.S.; Westwood, M.E.; Mallett, S.; Deeks, J.J.; Reitsma, J.B.; Leeflang, M.M.G.; Sterne, J.A.C.; Bossuyt, P.M.M.; QUADAS-2 Group. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann. Intern. Med. 2011, 155, 529–536.
17. Gopalakrishna, G.; Mustafa, R.A.; Davenport, C.; Scholten, R.J.; Hyde, C.; Brozek, J.; Schünemann, H.J.; Bossuyt, P.M.; Leeflang, M.M.; Langendam, M.W. Applying grading of recommendations assessment, development and evaluation (GRADE) to diagnostic tests was challenging but doable. J. Clin. Epidemiol. 2014, 67, 760–768.
18. Dettori, J.R.; Norvell, D.C.; Chapman, J.R. Fixed-Effect vs Random-Effects Models for Meta-Analysis: 3 Points to Consider. Glob. Spine J. 2022, 12, 1624–1626.
19. Patel, A.; Cooper, N.; Freeman, S.; Sutton, A. Graphical enhancements to summary receiver operating characteristic plots to facilitate the analysis and reporting of meta-analysis of diagnostic test accuracy data. Res. Synth. Methods 2021, 12, 34–44.
20. Higgins, J.P.; Thompson, S.G.; Deeks, J.J.; Altman, D.G. Measuring inconsistency in meta-analyses. BMJ 2003, 327, 557–560.
21. Thorlund, K.; Imberger, G.; Johnston, B.C.; Walsh, M.; Awad, T.; Thabane, L.; Gluud, C.; Devereaux, P.J.; Wetterslev, J. Evolution of heterogeneity (I2) estimates and their 95% confidence intervals in large meta-analyses. PLoS ONE 2012, 7, e39471.
22. Rovetta, A. Raiders of the Lost Correlation: A Guide on Using Pearson and Spearman Coefficients to Detect Hidden Correlations in Medical Sciences. Cureus 2020, 12, e11794.
23. Deeks, J.J.; Macaskill, P.; Irwig, L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J. Clin. Epidemiol. 2005, 58, 882–893.
24. Ahmed, Y.; Hussein, R.S.; Basha, T.A.; Khalifa, A.M.; Ibrahim, A.S.; Abdelmoaty, A.S.; Abdella, H.M.; Fahmy, A.S. Detecting liver fibrosis using a machine learning-based approach to the quantification of the heart-induced deformation in tagged MR images. NMR Biomed. 2019, 33, e4215.
25. Choi, K.J.; Jang, J.K.; Lee, S.S.; Sung, Y.S.; Shim, W.H.; Kim, H.S.; Yun, J.; Choi, J.-Y.; Lee, Y.; Kang, B.-K.; et al. Development and validation of a deep learning system for staging liver fibrosis using contrast agent-enhanced CT images. Radiology 2018, 289, 688–697.
26. Hectors, S.J.; Kennedy, P.; Huang, K.-H.; Stocker, D.; Carbonell, G.; Greenspan, H.; Friedman, S.; Taouli, B. Fully automated prediction of liver fibrosis using deep learning analysis of gadoxetic acid-enhanced MRI. Eur. Radiol. 2021, 31, 3805–3814.
27. Lee, J.H.; Joo, I.; Kang, T.W.; Paik, Y.H.; Sinn, D.H.; Ha, S.Y.; Kim, K.; Choi, C.; Lee, G.; Yi, J.; et al. Deep learning with ultrasonography: Automated classification of liver fibrosis using a deep convolutional neural network. Eur. Radiol. 2020, 30, 1264–1273.
28. Li, W.; Huang, Y.; Zhuang, B.-W.; Liu, G.-J.; Hu, H.-T.; Li, X.; Liang, J.-Y.; Wang, Z.; Huang, X.-W.; Zhang, C.-Q.; et al. Multiparametric ultrasomics of significant liver fibrosis: A machine learning-based analysis. Eur. Radiol. 2019, 29, 1496–1506.
29. Li, Q.; Yu, B.; Tian, X.; Cui, X.; Zhang, R.; Guo, Q. Deep residual nets model for staging liver fibrosis on plain CT images. Int. J. Comput. Assist. Radiol. Surg. 2020, 15, 1399–1406.
30. Wang, K.; Lu, X.; Zhou, H.; Gao, Y.; Zheng, J.; Tong, M.; Wu, C.; Liu, C.; Huang, L.; Jiang, T.; et al. Deep learning radiomics of shear wave elastography significantly improved diagnostic performance for assessing liver fibrosis in chronic hepatitis B: A prospective multicentre study. Gut 2019, 68, 729–741.
31. Yasaka, K.; Akai, H.; Kunimatsu, A.; Abe, O.; Kiryu, S. Liver fibrosis: Deep convolutional neural network for staging by using gadoxetic acid-enhanced hepatobiliary phase MR images. Radiology 2017, 287, 146–155.
32. Yasaka, K.; Akai, H.; Kunimatsu, A.; Abe, O.; Kiryu, S. Deep learning for staging liver fibrosis on CT: A pilot study. Eur. Radiol. 2018, 28, 4578–4585.
33. Yin, Y.; Yakar, D.; Dierckx, R.A.J.O.; Mouridsen, K.B.; Kwee, T.C.; de Haas, R.J. Liver fibrosis staging by deep learning: A visual-based explanation of diagnostic decisions of the model. Eur. Radiol. 2021, 31, 9620–9627.
34. Zhu, Z.; Lv, D.; Zhang, X.; Wang, S.-H.; Zhu, G. Deep learning in the classification of stage of liver fibrosis in chronic hepatitis B with magnetic resonance ADC images. Contrast Media Mol. Imaging 2021, 2021, 2015780.
35. Zhang, L.; Li, Q.-Y.; Duan, Y.-Y.; Yan, G.-Z.; Yang, Y.-L.; Yang, R.-J. Artificial neural network aided non-invasive grading evaluation of hepatic fibrosis by duplex ultrasonography. BMC Med. Inform. Dec. Mak. 2012, 12, 55.
36. Han, A.; Byra, M.; Heba, E.; Andre, M.P.; Erdman, J.J.W.; Loomba, R.; Sirlin, C.B.; O’Brien, W.D. Noninvasive diagnosis of nonalcoholic fatty liver disease and quantification of liver fat with radiofrequency ultrasound data using one-dimensional convolutional neural networks. Radiology 2020, 295, 342–350.
37. Zamanian, H.; Mostaar, A.; Azadeh, P.; Ahmadi, M. Implementation of Combinational Deep Learning Algorithm for Non-alcoholic Fatty Liver Classification in Ultrasound Images. J. Biomed. Phys. Eng. 2021, 11, 73–84.
38. Zhang, L.; Huang, Y.; Huang, M.; Zhao, C.H.; Zhang, Y.J.; Wang, Y. Development of cost-effective fatty liver disease prediction models in a Chinese population: Statistical and machine learning approaches. JMIR Form. Res. 2024, 8, e53654.
39. Kleiner, D.E.; Brunt, E.M.; Van Natta, M.; Behling, C.; Contos, M.J.; Cummings, O.W.; Ferrell, L.D.; Liu, Y.-C.; Torbenson, M.S.; Unalp-Arida, A.; et al. Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 2005, 41, 1313–1321.
40. Mienye, I.D.; Swart, T.G.; Obaido, G.; Jordan, M.; Ilono, P. Deep Convolutional Neural Networks in Medical Image Analysis: A Review. Information 2025, 16, 195.
41. Patil, P.U.; Lande, S.B.; Nagalkar, V.J.; Nikam, S.B.; Wakchaure, G.C. Grading and sorting technique of dragon fruits using machine learning algorithms. J. Agric. Food Res. 2021, 4, 100118.
42. Borawar, L.; Kaur, R. ResNet: Solving vanishing gradient in deep networks. In Proceedings of the International Conference on Recent Trends in Computing, Delhi, India, 5–6 July 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 235–247.
43. Calabrese, E.; Rudie, J.D.; Rauschecker, A.M.; Villanueva-Meyer, J.E.; Clarke, J.L.; Solomon, D.A.; Cha, S. Combining radiomics and deep convolutional neural network features from preoperative MRI for predicting clinically relevant genetic biomarkers in glioblastoma. Neurooncol. Adv. 2022, 4, vdac060.
44. Kufel, J.; Bargieł-Łączek, K.; Kocot, S.; Koźlik, M.; Bartnikowska, W.; Janik, M.; Czogalik, Ł.; Dudek, P.; Magiera, M.; Lis, A.; et al. What is machine learning, artificial neural networks, and deep learning?—Examples of practical applications in medicine. Diagnostics 2023, 13, 2582.
45. Schonlau, M.; Zou, R.Y. The random forest algorithm for statistical learning. Stata J. Promot. Commun. Stat. Stata 2020, 20, 3–29.
46. Nam, D.; Chapiro, J.; Paradis, V.; Seraphin, T.P.; Kather, J.N. Artificial intelligence in liver diseases: Improving diagnostics, prognostics and response prediction. JHEP Rep. 2022, 4, 100443.
47. Alshagathrh, F.M.; Househ, M.S. Artificial intelligence for detecting and quantifying fatty liver in ultrasound images: A systematic review. Bioengineering 2022, 9, 748.
48. Karalis, V.D. The Integration of Artificial Intelligence into Clinical Practice. Appl. Biosci. 2024, 3, 14–44.
49. Pinto-Coelho, L. How Artificial Intelligence Is Shaping Medical Imaging Technology: A Survey of Innovations and Applications. Bioengineering 2023, 10, 1435.
50. Al Kuwaiti, A.; Nazer, K.; Al-Reedy, A.; Al-Shehri, S.; Al-Muhanna, A.; Subbarayalu, A.V.; Al Muhanna, D.; Al-Muhanna, F.A. A Review of the Role of Artificial Intelligence in Healthcare. J. Pers. Med. 2023, 13, 951.
51. Angulo, P.; Hui, J.M.; Marchesini, G.; Bugianesi, E.; George, J.; Farrell, G.C.; Enders, F.; Saksena, S.; Burt, A.D.; Bida, J.P.; et al. The NAFLD fibrosis score: A noninvasive system that identifies liver fibrosis in patients with NAFLD. Hepatology 2007, 45, 846–854.
52. Njei, B.; Osta, E.; Njei, N.; Al-Ajlouni, Y.A.; Lim, J.K. An explainable machine learning model for prediction of high-risk nonalcoholic steatohepatitis. Sci. Rep. 2024, 14, 8589.
53. Alowais, S.A.; Alghamdi, S.S.; Alsuhebany, N.; Alqahtani, T.; Alshaya, A.I.; Almohareb, S.N.; Aldairem, A.; Alrashed, M.; Bin Saleh, K.; Badreldin, H.A.; et al. Revolutionizing healthcare: The role of artificial intelligence in clinical practice. BMC Med. Educ. 2023, 23, 689.
54. Pugliese, N.; Bertazzoni, A.; Hassan, C.; Schattenberg, J.M.; Aghemo, A. Revolutionizing MASLD: How Artificial Intelligence Is Shaping the Future of Liver Care. Cancers 2025, 17, 722.
55. McTeer, M.; Applegate, D.; Mesenbrink, P.; Ratziu, V.; Schattenberg, J.M.; Bugianesi, E.; Geier, A.; Gomez, M.R.; Dufour, J.-F.; Ekstedt, M.; et al. Machine learning approaches to enhance diagnosis and staging of patients with MASLD using routinely available clinical information. PLoS ONE 2024, 19, e0299487.
56. Apell, P.; Eriksson, H. Artificial intelligence (AI) healthcare technology innovations: The current state and challenges from a life science industry perspective. Technol. Anal. Strateg. Manag. 2023, 35, 179–193.
57. Javed, H.; El-Sappagh, S.; Abuhmed, T. Robustness in deep learning models for medical diagnostics: Security and adversarial challenges towards robust AI applications. Artif. Intell. Rev. 2025, 58, 12.
58. Rinella, M.E.; Lazarus, J.V.; Ratziu, V.; Francque, S.M.; Sanyal, A.J.; Kanwal, F.; Romero, D.; Abdelmalek, M.F.; Anstee, Q.M.; Arab, J.P.; et al. A multisociety Delphi consensus statement on new fatty liver disease nomenclature. J. Hepatol. 2023, 79, 1542–1556.
59. Bourganou, M.V.; Chondrogianni, M.E.; Kyrou, I.; Flessa, C.-M.; Chatzigeorgiou, A.; Oikonomou, E.; Lambadiari, V.; Randeva, H.S.; Kassi, E. Unraveling Metabolic Dysfunction-Associated Steatotic Liver Disease Through the Use of Omics Technologies. Int. J. Mol. Sci. 2025, 26, 1589.
60. Riedl, R. Is trust in artificial intelligence systems related to user personality? Review of empirical evidence and future research directions. Electron. Mark. 2022, 32, 2021–2051.
61. Chen, R.J.; Wang, J.J.; Williamson, D.F.K.; Chen, T.Y.; Lipkova, J.; Lu, M.Y.; Sahai, S.; Mahmood, F. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat. Biomed. Eng. 2023, 7, 719–742.
62. Kharchenko, V.; Fesenko, H.; Illiashenko, O. Quality models for artificial intelligence systems: Characteristic-based approach, development and application. Sensors 2022, 22, 4865.
63. Qadri, Y.A.; Shaikh, S.; Ahmad, K.; Choi, I.; Kim, S.W.; Vasilakos, A.V. Explainable Artificial Intelligence: A Perspective on Drug Discovery. Pharmaceutics 2025, 17, 1119.
64. Lu, F.; Liu, J.; She, B.; Yang, H.; Ji, F.; Zhang, L. Global Trends and Inequalities of Liver Complications Related to Metabolic Dysfunction-Associated Steatotic Liver Disease: An Analysis From 1990 to 2021. Liver Int. 2025, 45, e16120.
65. Bin Rashid, A.; Kausik, A.K. AI revolutionizing industries worldwide: A comprehensive overview of its diverse applications. Hybrid Adv. 2024, 7, 100277.
66. Koçak, B.; Ponsiglione, A.; Stanzione, A.; Bluethgen, C.; Santinha, J.; Ugga, L.; Huisman, M.; Klontzas, M.E.; Cannella, R.; Cuocolo, R. Bias in artificial intelligence for medical imaging: Fundamentals, detection, avoidance, mitigation, challenges, ethics, and prospects. Diagn. Interv. Radiol. 2025, 31, 75–88.
67. Jimenez Ramos, M.; Kendall, T.J.; Drozdov, I.; Fallowfield, J.A. A data-driven approach to decode metabolic dysfunction-associated steatotic liver disease. Ann. Hepatol. 2024, 29, 101278.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).




