Applications of Artificial Intelligence in Selected Internal Medicine Specialties: A Critical Narrative Review of the Latest Clinical Evidence
Abstract
1. Introduction
2. Materials and Methods
Inclusion criteria:
- Prospective randomized controlled trials (RCTs) evaluating AI interventions
- Large prospective cohort studies with real-world implementation of AI tools
- Post hoc analyses of major prospective trials that incorporated AI-based analyses
- Studies demonstrating improvement in clinically relevant outcomes or resulting in regulatory approval
- Publication date from 2010 onward

Exclusion criteria:
- Purely retrospective studies without prospective external validation
- Proof-of-concept or preclinical studies lacking patient-centered clinical endpoints
- Non-English language publications
3. Results
3.1. Cardiology
| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Deisenhofer, I., et al. [33] | Multicenter, randomized, controlled, double-blind superiority trial; patients with drug-refractory persistent/long-standing persistent AF were 1:1 randomized to either conventional anatomical PVI-only or PVI plus AI-guided ablation of areas showing spatio-temporal electrogram dispersion. | At 12 months after a single procedure, freedom from documented AF was 88% in the tailored (AI-guided + PVI) arm vs. 70% in the PVI-only arm (log-rank p < 0.0001). No significant difference in freedom from any atrial arrhythmia. Procedure and ablation times were twice as long in the tailored arm; safety outcomes were similar between groups. | AI-guided targeting of spatio-temporal dispersion areas significantly improves 1-year AF-free survival compared to PVI alone in persistent/long-standing persistent AF. This establishes a new, more effective ablation strategy beyond standard PVI, although longer procedures and potential need for additional tachycardia ablation should be considered. |
| Kim, Y., et al. [34] | Multicenter, randomized (1:1), controlled trial involving 400 patients undergoing PCI; comparison of fully automated real-time AI-based quantitative coronary angiography (AI-QCA)-assisted PCI versus intravascular OCT-guided PCI. Primary endpoint: post-PCI minimal stent area (MSA) measured by OCT, tested for noninferiority (margin 0.8 mm²). | Post-PCI MSA was 6.3 ± 2.2 mm² (AI-QCA) vs. 6.2 ± 2.2 mm² (OCT) (difference −0.16 mm²; 95% CI −0.59 to 0.28; p for noninferiority < 0.001). Most OCT-defined endpoints (stent underexpansion, dissection, untreated reference disease) were similar; stent malapposition was higher in the AI-QCA group (13.6% vs. 5.6%, p = 0.007). | AI-QCA-assisted PCI is noninferior to OCT-guided PCI in achieving optimal stent expansion (MSA) while being faster and not requiring additional imaging equipment or expertise. It offers a practical, fully automated alternative to intravascular imaging guidance for everyday PCI with comparable stent optimization outcomes. |
| Liu, W. T., et al. [35] | Open-label, cluster-randomized controlled trial at two hospitals in Taiwan; noncardiologists were randomized by cluster to either receive real-time AI-ECG alerts for undetected AF in at-risk patients (CHA2DS2-VASc ≥ 1 in men/≥ 2 in women) or usual care without alerts (NCT05127460). | In patients with AI-detected AF, NOAC prescription within 90 days was significantly higher in the intervention group (23.3% vs. 12.0%; HR 1.85, 95% CI 1.11–3.07). New AF diagnosis rate was also higher (HR 1.40, 95% CI 1.03–1.90). No differences in echocardiogram ordering, cardiology referrals, ischemic stroke, CV death, or all-cause death. | Simple AI-ECG alerts substantially increased AF detection and guideline-directed NOAC prescribing by noncardiologists, narrowing the care gap with cardiologists. This low-cost, scalable intervention can improve stroke prevention in undiagnosed AF without increasing downstream testing or hard clinical events. |
| Tsai, D. J., et al. [36] | Pragmatic randomized controlled trial at a single academic center in Taiwan; 13,631 inpatients under non-cardiologist care were 1:1 randomized to AI-ECG interpretation (low-EF probability displayed) versus standard ECG care without AI results. | New low EF (≤50%) diagnoses within 30 days were significantly higher in the intervention group (1.5% vs. 1.1%; HR 1.50, 95% CI 1.11–2.03). Effect was stronger in AI-flagged high-risk patients (13.0% vs. 8.9%; HR 1.55). Positive predictive value of echocardiograms for low EF rose from 20.2% to 34.2% (p < 0.001) with no increase in overall echo utilization; cardiology consultations increased in high-risk patients. | A simple AI-ECG tool significantly improved early detection of low ejection fraction in routine inpatient care without raising resource use. It enhanced diagnostic yield of downstream testing and facilitated more appropriate specialist referral, demonstrating an efficient way to close the gap in heart failure diagnosis by non-cardiologists. |
| Kolossváry, M., et al. [37] | Post hoc analysis of the SCOT-HEART trial; coronary CT angiography from 1750 patients was segmented and analyzed for both conventional attenuation-based plaque burden and advanced radiomic features (eigen radiomic descriptors of plaque morphology). Univariable and multivariable Cox models plus Harrell’s C-statistic and time-dependent AUC with cross-validation assessed incremental prognostic value for fatal/nonfatal myocardial infarction over a median of 8.6 years. | 82 myocardial infarctions occurred. Eight radiomic features remained independently associated with MI after adjustment for cardiovascular risk score and plaque burden. Adding plaque burden to a clinical model did not improve discrimination (C-statistic 0.70 → 0.70), but further adding radiomic features increased performance to C-statistic 0.74, with significantly higher cumulative/dynamic AUC after year 5. | Radiomics-based detailed plaque morphology characterization substantially improves long-term MI risk prediction beyond clinical factors, calcium score, stenosis, and simple plaque burden. This precision-phenotyping approach from routine coronary CTA identifies higher-risk plaques and could enable better risk stratification and targeted prevention. |
| Trivedi, R., et al. [38] | Qualitative semistructured interviews with purposive sampling of 30 patients with atrial fibrillation who completed a 6-month fully automated voice-based conversational AI intervention (weekly AI phone calls with speech recognition and natural language processing) as part of the CHAT-AF trial; thematic analysis of transcribed interviews. | Four main themes emerged: (1) AI interactions felt human-like yet limited by scripted responses and trusted because hospital-delivered; (2) engagement depended on personalization, novel content, manageable information volume, and multichannel flexibility; (3) AI improved perceived access to continuous AF care and information; (4) patients felt empowered in self-management through reminders and reassurance from linked rhythm-monitoring devices. | Patients with AF found conversational AI an acceptable and engaging tool for education and self-management support, particularly when personalized and hospital-affiliated. Findings highlight the value of voice-based AI in bridging care gaps while identifying key areas (natural dialog flow, tailored content, and information dosing) for future improvement. |
| Mekonnen, D., et al. [39] | Post hoc imaging sub-analysis of the RESUS-AMI trial; AI-assisted echocardiographic software (CAAS Qardia 2.0) was used to measure GLS (fully automated and semi-automated), LVEF, and volumes in 169 patients after primary PCI for STEMI. Results were correlated with CMR-derived infarct size, LVEF, and volumes (n = 81); intra- and inter-observer reproducibility of AI-derived parameters was assessed using ICC, bias, and limits of agreement. | AI-derived GLS showed moderate-to-good correlation with CMR infarct size (r = 0.58 automated, r = 0.64 semi-automated; both p < 0.001) and CMR LVEF (r = −0.63 and −0.65). Correlation with echo-derived LVEF was r = −0.51 (automated) and r = −0.67 (semi-automated). Inter- and intra-observer reproducibility of GLS was excellent (ICC 0.93–0.94). | AI-assisted GLS provides a reproducible and reliable marker of infarct size and LV systolic function after STEMI, with good correlation to gold-standard CMR. It enables fast, operator-independent strain assessment in routine post-PCI echocardiography, potentially improving risk stratification and follow-up. |
| Williams, M. C., et al. [40] | Retrospective analysis of the SCOT-HEART trial (n = 1769); two separate XGBoost machine learning models with 10-fold cross-validation and grid-search hyperparameter tuning were trained on clinical variables (symptoms, demographics, risk factors, ECG, exercise tolerance testing) to predict (1) any coronary artery disease and (2) increased low-attenuation plaque (LAP) burden on CCTA. | ML model predicted any CAD significantly better than the 10-year CV risk score alone (AUC 0.80 vs. 0.75, p = 0.004); key features: CV risk score, age, sex, total cholesterol, abnormal ETT. The model predicting high LAP burden showed no significant improvement over the CV risk score (AUC 0.75 vs. 0.72, p = 0.08). | Machine learning using readily available clinical data meaningfully improves pre-test prediction of obstructive CAD on CCTA, potentially optimizing selection for imaging. However, clinical variables alone are insufficient to reliably predict high-risk (low-attenuation) plaque burden, suggesting imaging-derived features remain essential for identifying vulnerable plaques. |
| Saklica, D., et al. [41] | Randomized controlled trial with 52 CAD patients allocated to three groups: telerehabilitation (TRG, n = 18), mobile app-based rehabilitation (MAG, n = 13), or control (physical activity advice only, CG, n = 21). All intervention groups followed a 12-week supervised calisthenic/resistance program (3×/week). Outcomes: exercise capacity (Incremental Shuttle Walk Test), QoL (SF-36), adherence, and patient feedback analyzed with fine-tuned BERT NLP model plus anomaly detection. | Both TRG (+87.2 m) and MAG (+89.4 m) significantly improved ISWT distance vs. CG (+10.9 m; p = 0.001). Adherence was markedly higher in TRG (100%) and MAG (80%) than CG (30%; p < 0.001). NLP-analyzed patient satisfaction strongly correlated with ISWT gains (r = 0.75, p < 0.001); AI anomaly detection identified adherence–outcome mismatches. | Technology-supported cardiac rehabilitation (tele- or app-based) substantially outperforms usual-care advice in improving exercise capacity and adherence. AI tools (NLP for sentiment analysis and anomaly detection) provide objective, scalable enhancement of outcome evaluation and patient engagement monitoring in cardiac rehabilitation. |
| Trivedi, R., et al. [42] | Single-blinded, 4:1 randomized controlled feasibility trial; 103 post-discharge AF patients allocated to 6 months of fully automated conversational AI phone calls (speech recognition + NLP) with self-management support, symptom monitoring, triggered clinical alerts, supplementary SMS/email surveys, nudges, and an educational website versus usual care. | Trial stopped early (103/385 planned). No significant between-group difference in AFEQT QoL score at 6 months (adjusted mean difference 2.08, 95% CI −7.79 to 11.96; p = 0.46). Within the intervention group, AFEQT improved significantly from baseline (69.9 to 79.9; p = 0.01). Engagement was moderate (average 4/7 outreaches completed); 88.4% of completed contacts rated useful. | Conversational AI delivered via phone calls is feasible, acceptable, and engaging for post-discharge AF support, with high perceived usefulness and preliminary evidence of within-group QoL improvement. Despite early termination preventing definitive efficacy assessment, it establishes proof-of-concept for scalable, fully automated patient support in chronic AF management. |
| Fiolet, A. T. L., et al. [43] | Cross-sectional subanalysis of the LoDoCo2 trial; 151 patients with chronic coronary disease on stable therapy underwent coronary CTA after median 28.2 months of blinded low-dose colchicine (0.5 mg/day) or placebo. AI-enabled software quantified pericoronary adipose tissue (PCAT) attenuation and total/detailed plaque volumes (non-calcified, low-attenuation, calcified, dense-calcified) across the entire coronary tree. | No difference in pericoronary inflammation (PCAT attenuation: −79.5 HU colchicine vs. −78.7 HU placebo, p = 0.236). Colchicine group showed significantly higher calcified plaque volume (169.6 vs. 113.1 mm³, p = 0.041), calcified plaque burden (9.6% vs. 7.0%, p = 0.035), and dense calcified plaque volume (192.8 vs. 144.3 mm³, p = 0.048). Low-attenuation plaque burden was lower with colchicine only in patients on low-intensity statins (p-interaction = 0.037). | Low-dose colchicine does not reduce pericoronary adipose tissue inflammation but promotes coronary plaque calcification and increases dense calcified plaque—features associated with greater plaque stability. This provides a mechanistic explanation for the observed cardiovascular event reduction in LoDoCo2 and supports plaque stabilization as a key anti-atherosclerotic effect of colchicine. |
| Li, G., et al. [44] | Post hoc blinded analysis of the prospective CAREER trial (NCT04665817); fully automatic AI-based CCTA reconstruction and CT-μFR computation (Murray-law based quantitative flow ratio) performed in 242 patients (657 vessels) who had invasive coronary angiography with FFR or μFR within 30 days. Reference standard: invasive FFR ≤ 0.80 or μFR ≤ 0.80. | Fully automatic CT-μFR was successful in all cases with mean analysis time 1.60 ± 0.34 min/patient. CT-μFR showed good correlation (r = 0.62) and agreement (bias −0.01 ± 0.10) with invasive physiology. Patient-level diagnostic accuracy was 83.0% (sensitivity 84.2%, specificity 81.9%, PPV 82.1%, NPV 84.0%). | Fully automatic, AI-powered CT-μFR provides rapid (~1.6 min), operator-independent functional assessment directly from routine CCTA with high diagnostic performance comparable to invasive FFR. It enables accurate pre-catheterization identification of hemodynamically significant stenoses, potentially reducing unnecessary invasive procedures. |
| Yu, X., et al. [45] | Single-center, open-label, randomized controlled trial (n = 2086 post-PCI CHD patients in China); 1:1 randomization to a comprehensive web-based telemedicine platform (personalized education, medication reminders, vital sign monitoring, AI-assisted consultations) plus usual care versus usual care alone (phone follow-up at 1, 3, 6, 12 months). Primary endpoint: 1-year MACCE (cardiac death, MI, stroke, target vessel revascularization). | Telemedicine significantly reduced 1-year MACCE (3.5% vs. 5.3%, p = 0.04), driven by lower cardiac death (1.0% vs. 2.3%, p = 0.02) and MI (0.8% vs. 1.8%, p = 0.03). Serious bleeding (BARC 3–5) was lower (0.6% vs. 1.6%, p = 0.03). Intervention group achieved better BP control, higher adherence to aspirin and ACEI/ARB/ARNI, and reduced alcohol consumption; smoking showed a favorable trend. | A multicomponent AI-supported telemedicine program significantly lowered hard clinical events (MACCE) and improved secondary prevention metrics at 1 year after PCI. This demonstrates that scalable, technology-driven remote management can close implementation gaps and meaningfully improve long-term outcomes in high-risk CHD patients. |
| Ishiguchi, H., et al. [46] | Post hoc analysis of the WARCEF trial; 2213 HFrEF patients without atrial fibrillation were included. Nine machine learning models (including SVM, XGBoost, LightGBM) were trained on 12 selected clinical/demographic variables to predict incident ischaemic stroke during mean 3.3-year follow-up (74 events). Model performance was compared to CHA2DS2-VASc using AUC and decision curve analysis; feature importance assessed via SHAP values. | ML models strongly outperformed CHA2DS2-VASc (AUC 0.643). Best-performing models were SVM (AUC 0.874, 95% CI 0.769–0.959) and XGBoost (AUC 0.873, 95% CI 0.783–0.953), with SVM and LightGBM showing consistent net clinical benefit. Top predictive features across models: creatinine clearance, blood urea nitrogen, and warfarin use. | Machine learning substantially improves risk stratification for ischaemic stroke in HFrEF patients in sinus rhythm compared to traditional scores. Renal function markers (CrCl, BUN) and anticoagulation status emerged as dominant predictors, highlighting new targets for stroke prevention beyond AF in this high-risk population. |
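A recurring analytic pattern in the rows above (e.g., Williams et al. [40], Ishiguchi et al. [46]) is comparing a cross-validated machine-learning classifier against a single clinical risk score by AUC. The sketch below illustrates only that pattern on synthetic data; the variable names, effect sizes, and gradient-boosting model are illustrative assumptions, not the trials' actual pipelines.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data: one baseline clinical risk score plus four additional
# clinical variables, with an outcome driven by several of them.
risk_score = rng.normal(size=n)
extra = rng.normal(size=(n, 4))
logit = 0.8 * risk_score + 1.0 * extra[:, 0] - 0.8 * extra[:, 1] + 0.6 * extra[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Out-of-fold predicted probabilities from 10-fold cross-validation,
# mirroring the cross-validated designs described in the table.
X = np.column_stack([risk_score, extra])
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
oof = cross_val_predict(GradientBoostingClassifier(random_state=0),
                        X, y, cv=cv, method="predict_proba")[:, 1]

auc_score_only = roc_auc_score(y, risk_score)  # clinical score alone
auc_model = roc_auc_score(y, oof)              # ML model using all variables
print(f"score-only AUC {auc_score_only:.2f} vs. ML AUC {auc_model:.2f}")
```

Grid-search hyperparameter tuning, SHAP feature attributions, and decision-curve analysis, as used in [40] and [46], would layer on top of this skeleton.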
3.2. Pulmonology
3.3. Neurology
3.4. Hepatology
| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Ratziu, V., et al. [74] | Post hoc analysis of digitized liver biopsies from 251 patients with biopsy-confirmed NASH (F1–F3) enrolled in a 72-week phase 2 RCT of once-daily semaglutide (NCT02970942). Paired baseline and week-72 biopsies were scored by two expert pathologists using conventional NASH CRN criteria and independently analyzed by PathAI’s machine-learning (ML) NASH model. Both categorical (ordinal) scores and continuous quantitative feature scores for fibrosis, steatosis, inflammation, and ballooning were generated. Treatment effects on the two co-primary endpoints (NASH resolution without worsening of fibrosis; fibrosis improvement ≥ 1 stage without worsening of NASH) were compared between pathologist and ML assessments. | Both pathologist and ML categorical scoring detected significantly higher rates of NASH resolution with semaglutide 0.4 mg versus placebo (pathologist: 58.5% vs. 22.0%, p < 0.0001; ML: 36.9% vs. 11.9%, p = 0.0015). Fibrosis improvement trended higher but was not significant with either method. ML-derived continuous scores revealed significant semaglutide-induced reductions in fibrosis (p = 0.0099) and other features that were not detectable by conventional categorical pathologist or ML ordinal assessments. | ML-based digital pathology reproduces expert pathologist categorical assessments of treatment response in NASH but provides superior sensitivity via continuous quantitative scoring, uncovering an antifibrotic effect of semaglutide missed by traditional histopathology. This demonstrates the value of AI-powered continuous metrics as more responsive endpoints in NASH clinical trials. |
| Tiyarattanachai, T., et al. [75] | Single-center prospective randomized controlled trial (TCTR20201230003); 504 patients (260 with FLLs for non-experts, 244 for experts) underwent real-time ultrasound twice: with and without AI assistance from a CNN-based system. Detection rates and false positives were compared using McNemar’s test; non-experts were trainees; experts were board-certified radiologists. | For non-experts, AI assistance increased FLL detection rate (36.9% vs. 21.4%, p < 0.001) without significantly raising false positives (14.2% vs. 9.2%, p = 0.08). For experts, AI did not significantly improve detection (66.7% vs. 63.3%, p = 0.32) or alter false positives (8.6% vs. 9.0%, p = 0.85). | AI assistance significantly boosts focal liver lesion detection during ultrasound for non-expert operators, potentially enabling high-quality HCC surveillance in resource-limited settings with limited expert availability. No added benefit for experts suggests AI’s primary value in democratizing access to reliable imaging. |
| Zhang, Y., et al. [76] | Prospective randomized diagnostic trial: 100 patients with suspected small hepatocellular carcinoma (≤3 cm) were 1:1:1:1 randomized to four ultrasound modalities: color Doppler alone, contrast-enhanced ultrasound (CEUS) alone, elastography alone, or multimodal (all three combined). All images were processed and segmented by a Mask R-CNN deep-learning algorithm. Diagnostic performance was compared against pathological biopsy as gold standard. Additionally, EZH2 and p57 expression were measured by immunohistochemistry in tumor tissue, peritumoral tissue, and normal liver. | Mask R-CNN achieved the highest segmentation accuracy (97.23%) and average precision (71.90%). The multimodal ultrasound group outperformed single-modality groups with sensitivity 88.87%, specificity 90.91%, accuracy 89.47%, and Cohen’s κ 0.68 (all p < 0.05). Multimodal imaging features of malignancy: irregular shape, unclear borders, uneven internal echo, grade 1–2 blood flow, elasticity score 4–5, and fast-in fast-out contrast pattern. EZH2 was overexpressed (75.95% positive) and p57 underexpressed (80.79% negative) in tumor tissue; p57 negativity was significantly higher in poorly differentiated HCC. | Mask R-CNN-enhanced multimodal ultrasound (Doppler + CEUS + elastography) provides excellent diagnostic performance for small HCC, significantly superior to any single modality. High EZH2 and low p57 expression are strongly associated with oncogenesis and poor differentiation in small HCC, supporting their potential as diagnostic and prognostic biomarkers. This AI-augmented multimodal approach offers a highly accurate, non-invasive tool for early HCC detection. |
| Briceño, J., et al. [77] | Multicenter retrospective analysis of 1003 liver transplants from 11 Spanish centers. Sixty-four donor and recipient variables were used to develop two complementary artificial neural network (ANN) models via Neural Net Evolutionary Programming (NNEP): a positive-survival model (NN-CCR) and a negative-loss model (NN-MS) to predict 3-month graft survival/loss for each donor–recipient pair. Performance was compared against six established prognostic scores (MELD, D-MELD, DRI, P-SOFT, SOFT, BAR) using ROC curves and AUROC. | ANN models significantly outperformed all conventional scores. NN-CCR predicted 3-month graft survival with 90.79% accuracy (AUROC 0.80) and NN-MS predicted graft loss with 71.42% accuracy (AUROC 0.82). AUROCs of traditional scores ranged from 0.41 to 0.67 (all p < 0.001 vs. ANN). ANN also outperformed multiple regression models. | Artificial neural networks provide substantially superior, individualized prediction of 3-month graft survival compared to all currently validated prognostic scores. This ANN-based approach offers a more objective, accurate, and equitable tool for donor–recipient matching and organ allocation, with potential to optimize justice, utility, and transplant outcomes in clinical practice. |
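Tiyarattanachai et al. [75] compared paired with- and without-AI readings of the same cases using McNemar's test. A minimal exact version of that test needs only the two discordant-pair counts; the counts below are invented for illustration and are not the trial's data.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value from discordant-pair counts:
    b = cases positive only under condition A (e.g., detected only with AI),
    c = cases positive only under condition B (detected only without AI)."""
    n = b + c
    k = min(b, c)
    # Under the null, discordant pairs split Binomial(n, 0.5); double the tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented example: 45 lesions found only with AI assistance, 12 only without.
p = mcnemar_exact(45, 12)
print(f"exact McNemar p = {p:.2e}")
```

With the trial's actual paired detection data, the discordant counts would plug into the same function in the same way.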
3.5. Pancreatic Disease
| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Cui, H., et al. [86] | Multicenter randomized crossover trial (4 centers in China, Jan–Jun 2023); 12 endoscopists (junior and senior) diagnosed 130 prospective patients with solid pancreatic lesions twice: conventionally and with assistance from a multimodal joint-AI model. The AI was trained/validated on EUS images + clinical data from 439 internal patients (2014–2022) and externally tested on 189 patients from 3 other institutions. Primary outcome: diagnostic accuracy with vs. without AI; secondary: AUC of the joint-AI model and human-AI interaction effects. | Joint-AI model achieved outstanding performance: internal AUC 0.996, external AUCs 0.955–0.976. In prospective crossover testing, AI assistance significantly improved the diagnostic accuracy of novice endoscopists (p < 0.001) with no decline in senior endoscopists. Explainability features reduced skepticism among experienced users, demonstrating positive human-AI collaboration. | The multimodal joint-AI model (EUS images + clinical data) dramatically outperforms single-modality AI and significantly boosts real-world diagnostic accuracy, especially for less-experienced endoscopists. It establishes a new benchmark for AI-assisted EUS diagnosis of solid pancreatic lesions and shows that well-designed, explainable multimodal AI can be readily adopted across expertise levels in clinical practice. |
| Chen, P. T., et al. [87] | Retrospective multicenter development and validation of an end-to-end deep learning (DL) system combining a segmentation CNN and a classifier ensemble of five CNNs. Training/validation used 546 pancreatic cancer CTs (2006–2018) and 733 normal pancreas CTs (2004–2019). Internal testing and comparison with original radiologist reports were performed, followed by nationwide external real-world validation on 1473 CT studies from institutions across Taiwan. | Internal test set: DL tool achieved 89.9% sensitivity (98/109) and 95.9% specificity (141/147), AUC 0.96, with no significant difference from original radiologist sensitivity (96.1%, p = 0.11). Nationwide real-world test set (n = 1473): sensitivity 89.7% (600/669), specificity 92.8% (746/804), AUC 0.95. For tumors < 2 cm, sensitivity was 74.7% (68/91). | This fully automated DL tool reliably detects pancreatic cancer on contrast-enhanced CT, including a large proportion of sub-2 cm tumors that are frequently missed by radiologists in routine practice. It offers performance comparable to or exceeding human interpretation and has proven generalizability across Taiwan, establishing it as a robust second-reader or triage tool to reduce missed pancreatic cancers in clinical workflow. |
| Kovatchev, B., et al. [88] | Pilot randomized crossover feasibility trial: 15 adults with T1D on commercial AID systems underwent two identical 20 h supervised hotel stays. The University of Virginia Model-Predictive Control (UMPC) algorithm was encoded into a neural network to create its Neural-Net Artificial Pancreas (NAP) approximation. Participants were randomly assigned to receive either NAP or the original UMPC algorithm during each session (crossover design with washout). | NAP and UMPC achieved nearly identical glycemic outcomes: TIR 86% vs. 87% (adjusted difference 1 percentage point), time < 70 mg/dL 2.0% vs. 1.8%, and CV 29.3% vs. 29.1%. Mean absolute difference in insulin delivery was only 0.031 U/h under identical inputs. NAP required sixfold lower computational resources than UMPC. No serious adverse events occurred with either controller. | A neural-network-encoded version of a clinically validated model-predictive control AID algorithm (NAP) replicated the performance of the original UMPC algorithm in real-world conditions while dramatically reducing computational burden. This first-in-human demonstration opens regulatory and clinical pathways for modern machine-learning techniques to replace traditional control algorithms in artificial pancreas systems, enabling faster innovation and deployment of next-generation AID. |
| Atlas, E., et al. [89] | Early feasibility clinical study of the MD-Logic Artificial Pancreas (MDLAP), a closed-loop system based on fuzzy logic theory that mimics diabetes caregiver reasoning using control-to-range and control-to-target strategies. Seven young adults with well-controlled T1D (age 19–30, A1C 6.6 ± 0.7%) underwent a total of 14 supervised sessions: 8 h sessions (overnight fasting + meal challenges) and 24 h full closed-loop sessions in a clinical research center. | Mean postprandial peak glucose was 224 ± 22 mg/dL, returning to <180 mg/dL within 2.6 ± 0.6 h and remaining stable in range for ≥1 h. During 24 h closed-loop control, 73% of sensor values were in 70–180 mg/dL, 27% > 180 mg/dL, and 0% < 70 mg/dL. No symptomatic hypoglycemia occurred in any trial. | The MD-Logic Artificial Pancreas, one of the earliest fuzzy-logic-based fully closed-loop systems, demonstrated safe and effective automated glucose control in a controlled setting, with good postprandial handling and zero hypoglycemia despite meal challenges. This proof-of-concept study established the clinical feasibility of non-model-based, expert-knowledge-driven artificial pancreas technology and paved the way for subsequent real-world and long-term trials. |
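The patient-level accuracy figures in these tables reduce to simple confusion-matrix arithmetic. Using the internal-test counts reported for Chen et al. [87] (98/109 cancers detected, 141/147 normal scans correctly cleared), sensitivity and specificity reproduce the quoted values; the PPV and NPV shown are derived here from the same counts, not quoted in the row above.

```python
# Confusion-matrix counts from the internal test set of Chen et al. [87],
# as reported in the table above.
tp, fn = 98, 109 - 98     # cancers: detected / missed
tn, fp = 141, 147 - 141   # normal pancreases: correctly cleared / false alarms

sensitivity = tp / (tp + fn)   # 98/109, matches the quoted 89.9%
specificity = tn / (tn + fp)   # 141/147, matches the quoted 95.9%
ppv = tp / (tp + fp)           # derived from the same counts
npv = tn / (tn + fn)           # derived from the same counts
print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}, "
      f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

Note that AUC cannot be recovered from these four counts alone; it requires the underlying model scores across thresholds.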
3.6. Other Applications of AI (Table 6)
| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Lång, K., et al. [96] | Prospective, population-based, two-arm randomized controlled trial (MASAI, NCT04838756) at four Swedish screening sites. 80,033 women aged 40–80 were randomized 1:1 to AI-supported screening (Transpara v1.7.0 risk score 1–10 used to triage: scores 1–9 single reading, score 10 double reading; CAD marks shown for scores 8–10) versus standard double reading without AI. Participants and radiographers were masked; radiologists were not. Prespecified clinical safety analysis performed after 80,000 enrolments, focusing on cancer detection rate, recall rate, false-positive rate, PPV, and screen-reading workload. | AI-supported arm (n = 39,996): cancer detection 6.1/1000 (244 cancers), recall 2.2%, false-positive rate 1.5%, PPV 28.3%, with 44.3% reduction in screen readings (46,345 vs. 83,231). Standard double-reading arm (n = 40,024): cancer detection 5.1/1000 (203 cancers), recall 2.0%, false-positive rate 1.5%, PPV 24.8%. Cancer detection ratio 1.2 (95% CI 1.0–1.5, p = 0.052); the AI-arm detection rate exceeded the prespecified lowest acceptable safety threshold of 3/1000. Proportion of invasive vs. in situ cancers was similar. | In the first large-scale randomized trial of AI in mammography screening, AI-supported reading proved clinically safe with a comparable (slightly higher) cancer detection rate, similar recall/false-positive rates, and a 44% reduction in radiologist workload. This establishes AI as a viable and efficient alternative to standard double reading, with potential to address global mammography workforce shortages while maintaining or improving screening performance. |
| Dembrower, K., et al. [97] | Prospective, population-based, paired-screening non-inferiority trial (ScreenTrustCAD, NCT04778670) at a single mammography unit in Stockholm. 55,581 consecutive women aged 40–74 years attending screening (Apr 2021–Jun 2022) had each mammogram assessed both by standard double reading (two radiologists) and by three AI-involving strategies: one radiologist + AI (Transpara), AI alone, and two radiologists + AI. Non-inferiority margin was a ≤15% relative reduction in cancer detection rate compared with standard double reading by two radiologists. | 269 screen-detected cancers (0.49%). One radiologist + AI: 261 cancers (relative proportion 1.04, 95% CI 1.00–1.09)—non-inferior and numerically 4% higher. AI alone: 246 cancers (relative proportion 0.98, 95% CI 0.93–1.04)—non-inferior. Two radiologists + AI: 269 cancers (relative proportion 1.08, 95% CI 1.04–1.11)—superior. False-positive rates and recall rates were comparable across strategies. | In the first prospective trial of AI as an independent reader in population-based screening, replacing one of the two radiologists with AI maintained (and slightly improved) cancer detection while halving radiologist workload. AI alone was also non-inferior to standard double reading. These findings support safe, controlled implementation of AI to address radiologist shortages and increase screening capacity without compromising detection performance. |
| Sadeh-Sharvit, S., et al. [98] | Single-site randomized controlled trial at a U.S. community mental health clinic; 47 adults with primary depression or anxiety disorders starting outpatient individual CBT were 1:1 randomized to therapy augmented by the Eleos Health AI platform or treatment-as-usual (TAU) for the first 2 months. The AI platform automatically transcribed sessions, measured fidelity to evidence-based practices, integrated patient-reported outcome measures, and auto-drafted progress notes. | Patients in the AI-augmented group attended 67% more sessions (mean 5.24 vs. 3.14). Depression (PHQ-9) scores decreased 34% in the AI group vs. 20% in TAU, and anxiety (GAD-7) scores decreased 29% vs. 8%, both with large effect sizes favoring AI-augmented therapy. Therapists using the AI platform submitted progress notes 55 h earlier on average (p < 0.001). Treatment satisfaction and perceived helpfulness were equivalent between groups. | In the first randomized trial of an AI platform designed specifically for behavioral health, AI-augmented CBT significantly improved patient engagement, symptom reduction, and clinician documentation efficiency compared with standard care. These results establish clinical proof-of-concept that AI tools can meaningfully enhance the delivery and outcomes of routine outpatient psychotherapy in real-world community settings. |
| Repici, A., et al. [99] | Prospective, randomized, controlled non-inferiority trial (AID-2); 10 non-expert endoscopists (<2000 lifetime colonoscopies) performed 660 screening/surveillance/diagnostic colonoscopies in patients aged 40–80 years, randomized 1:1 to high-definition colonoscopy with or without real-time AI CADe (GI Genius, Medtronic). Primary endpoint: adenoma detection rate (ADR). Post hoc pooled analysis combined these data with the previously published AID-1 trial (6 expert endoscopists, similar design). | In non-experts (AID-2), CADe significantly increased ADR (53.3% vs. 44.5%; RR 1.22, 95% CI 1.04–1.40; p < 0.01 for non-inferiority, p = 0.02 for superiority), adenomas per colonoscopy, and detection of small/distal lesions, without increasing non-neoplastic resections. Pooled analysis of 1020 patients (AID-1 + AID-2) confirmed CADe as a strong independent predictor of higher ADR (RR 1.29, 95% CI 1.16–1.42), while endoscopist experience level was not significant (RR 1.02, 95% CI 0.89–1.16). | Real-time AI CADe substantially improves ADR in less experienced colonoscopists to levels exceeding those of unaided experts, and pooled data show that the benefit of CADe is largely independent of physician experience. This supports universal implementation of CADe to standardize and elevate colonoscopy quality across all skill levels, especially during training periods. |
| Marcuzzi, A., et al. [100] | Single-center, three-arm randomized clinical trial at a Danish multidisciplinary specialist outpatient clinic. 294 adults (≥18 years) with persistent neck and/or low back pain on the waiting list for specialist care were randomized 1:1:1 to (1) the SELFBACK app + usual care (AI-based, individually tailored weekly exercise, physical-activity, and education plans); (2) e-Help + usual care (non-tailored generic web-based self-management information); or (3) usual care alone. Primary outcome: change in Musculoskeletal Health Questionnaire (MSK-HQ) score at 3 months. Follow-up: 6 weeks, 3 months, and 6 months. | At 3 months (82.7% complete data), the adjusted mean difference for SELFBACK vs. usual care was +0.62 MSK-HQ points (95% CI −1.66 to 2.90; p = 0.60) and for SELFBACK vs. e-Help +1.08 points (95% CI −1.24 to 3.41; p = 0.36). No significant differences were found in the primary outcome or any secondary outcome (pain-related disability, pain intensity, catastrophizing, fear-avoidance, quality of life) at any time point. | In patients already referred to specialist care, adding an AI-based, individually tailored self-management app (SELFBACK) to usual care did not improve musculoskeletal health, pain, or function more than usual care alone or simple non-tailored web-based information. This negative trial suggests that highly personalized digital self-management support may offer little additional benefit in a secondary/tertiary care setting where patients already expect specialist intervention, highlighting the importance of context when deploying digital health tools. |
| Wallace, M. B., et al. [101] | Multicenter (Italy, UK, US), randomized (1:1), tandem colonoscopy trial in 230 screening/surveillance patients. Same-day back-to-back procedures were performed with or without real-time AI assistance (deep-learning CADe), with order randomized (AI-first vs. colonoscopy-first). Primary endpoint: adenoma miss rate (AMR) = adenomas detected only on second colonoscopy/total adenomas from both colonoscopies. Secondary endpoints: mean adenomas at second colonoscopy, false-negative patient rate, and adverse events. | Overall AMR was significantly lower when AI was used first (15.5% vs. 32.4%; adjusted OR 0.38, 95% CI 0.23–0.62). AI benefit was greatest for ≤5 mm (15.9% vs. 35.8%; OR 0.34) and nonpolypoid lesions (16.8% vs. 45.8%; OR 0.24), and consistent in proximal (18.3% vs. 32.5%) and distal colon. Mean adenomas missed at second colonoscopy were halved with AI-first (0.33 vs. 0.70, p < 0.001). False-negative patient rate dropped from 29.6% to 6.8% (OR 0.17). No difference in adverse events. | In this rigorous tandem-colonoscopy design, real-time AI approximately halved the miss rate of colorectal adenomas, especially small/subtle lesions that commonly escape human perception. This provides direct evidence that AI reduces perceptual errors during standard colonoscopy, supporting its role in substantially improving colorectal cancer prevention through higher neoplasia detection. |
| Liaw, S. Y., et al. [102] | Single-center randomized controlled trial with 64 nursing students 1:1 allocated to sepsis team training in virtual reality simulation with either an AI-powered virtual doctor or a human-controlled virtual doctor (played by medical students). Both groups underwent identical sepsis scenarios. Outcomes assessed: sepsis care knowledge, interprofessional communication knowledge, self-efficacy in interprofessional communication (pre- and post-intervention), and objective performance in sepsis care and communication during a post-intervention simulation-based test. | Both groups significantly improved communication knowledge and self-efficacy from baseline, but only the AI-powered group significantly improved sepsis care knowledge (p < 0.001 vs. p = 0.16 in the human-controlled group). Post-test sepsis knowledge was significantly higher in the AI-powered group (mean 9.06 vs. 7.75, p = 0.009); sepsis care performance (13.63 vs. 12.75, p = 0.39) and interprofessional communication performance (29.34 vs. 27.06, p = 0.21) did not differ; self-efficacy in interprofessional communication was higher in the human-controlled group (69.6 vs. 60.1, p = 0.008). | An AI-powered virtual doctor was not inferior to a human-controlled (medical student) virtual doctor in training nursing students for sepsis care and interprofessional communication, demonstrating feasibility and scalability of AI-driven team training when medical student availability is limited. The AI group achieved superior sepsis knowledge retention, but human partners provided a greater boost in perceived communication self-efficacy. This supports hybrid models that combine AI scalability with human sociability for optimal interprofessional sepsis training. |
| Luo, H., et al. [103] | Multicenter diagnostic study across six Chinese hospitals of varying tiers. A total of 1,036,496 standard white-light endoscopy images from 84,424 patients were used to develop and validate GRAIDS, a deep-learning convolutional neural network system for real-time detection of upper gastrointestinal cancers (oesophageal and gastric). Training and tuning used images from Sun Yat-sen University Cancer Center; validation used internal, prospective (same centre), and five external datasets from primary-care hospitals. Performance of GRAIDS was directly compared with endoscopists of three expertise levels (expert, competent, trainee). | GRAIDS achieved a diagnostic accuracy of 91.5–97.8% across all validation sets (internal 95.5%, prospective 92.7%) with a sensitivity of 94.2%, not significantly different from expert endoscopists (94.5%, p = 0.692) but markedly superior to competent (85.8%, p < 0.0001) and trainee (72.2%, p < 0.0001) endoscopists. Specificity was 91–98%, positive predictive value 81.4%, and negative predictive value 97.8%. | GRAIDS is the first deep-learning system prospectively validated across multiple tiers of hospitals for real-time detection of upper gastrointestinal cancers using standard endoscopy. It performs at expert-endoscopist level for sensitivity and significantly outperforms non-expert endoscopists, offering a scalable, high-performance tool to improve early cancer detection in community and primary-care settings where endoscopic expertise is often limited. This represents a major step toward AI-assisted universal upper GI cancer screening. |
| Wilson, P. M., et al. [104] | Pragmatic, cluster-randomized, stepped-wedge trial across 12 inpatient units at two U.S. hospitals (Aug 2019–Nov 2020). An AI/ML tool continuously screened the electronic health record and predicted need for palliative care consultation. Units were sequentially randomized to activate the intervention: when triggered, the tool sent a real-time Best Practice Advisory to the primary team recommending palliative care consultation. Control periods used usual care (no AI alert). Primary outcome: documented palliative care consultation. Secondary outcomes: hospital length of stay, ICU transfers, and 30-/60-/90-day readmissions. | Of 3183 hospitalizations enrolled, 2544 were retained for analysis (1212 intervention, 1332 usual care). Palliative care consultation rate was significantly higher with AI support (IRR 1.44, 95% CI 1.11–1.92). Exploratory analyses showed reduced 60-day (OR 0.75, 95% CI 0.57–0.97) and 90-day readmissions (OR 0.72, 95% CI 0.55–0.93). No significant differences in length of stay or ICU transfers. | This is the first large-scale randomized trial demonstrating that a real-time AI/ML decision-support tool safely and effectively increases palliative care consultation rates in general hospital inpatients. The associated reduction in 60- and 90-day readmissions suggests that proactive, AI-triggered palliative care involvement improves downstream utilization and potentially quality of care. The pragmatic design and integration into routine workflow support scalability across diverse hospital settings. |
| Papachristou, P., et al. [105] | Prospective multicentre diagnostic trial at 36 primary care centres in Sweden. Primary care physicians used a smartphone-based AI clinical decision support tool (dermoscopic photo → dichotomous text output: “evidence of melanoma” or not) on 253 skin lesions of concern in 228 patients. All lesions were managed according to standard care (excision or specialist referral) regardless of app result. Final histopathological or specialist diagnoses were retrieved from medical records and compared with the AI output. | Of 253 lesions, 21 were melanomas (11 invasive, 10 in situ). For overall melanoma detection, AUROC was 0.960 (95% CI 0.928–0.980), corresponding to a maximum sensitivity of 95.2% with specificity 84.5%; for invasive melanomas only, AUROC was 0.988 (95% CI 0.965–0.997), with maximum sensitivity 100% and specificity 92.6%. | This is one of the first prospective trials of an AI-based melanoma detection tool used by primary care physicians in real-world clinical practice. The smartphone app achieved extremely high diagnostic accuracy (near perfect for invasive melanomas) when assessing dermoscopic images of lesions already flagged as suspicious. These results suggest that AI decision support can safely and effectively augment primary care physicians’ ability to triage and detect melanoma, potentially reducing diagnostic delay and unnecessary excisions in primary care settings. |
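The trials above summarize performance with a small, recurring family of screening metrics (sensitivity, specificity, positive and negative predictive value), and the tandem-colonoscopy row [101] defines the adenoma miss rate as adenomas found only on the second pass divided by all adenomas from both passes. As an illustrative sketch of how these quantities relate to confusion-matrix counts (the counts below are hypothetical and not drawn from any of the cited trials):

```python
# Standard screening metrics from confusion-matrix counts.
# All example counts are hypothetical, for illustration only.

def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from true/false positives/negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

def adenoma_miss_rate(missed_first_pass: int, total_adenomas: int) -> float:
    """AMR = adenomas detected only on the second (tandem) colonoscopy
    divided by all adenomas detected across both passes."""
    return missed_first_pass / total_adenomas

m = diagnostic_metrics(tp=95, fp=20, tn=870, fn=5)
print(round(m["sensitivity"], 3))            # 0.95
print(round(adenoma_miss_rate(16, 103), 3))  # 0.155
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the screened population, which is why the same AI tool can show very different predictive values in screening versus referral settings.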
4. Discussion
5. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| AF | Atrial fibrillation |
| AI | Artificial intelligence |
| AKI | Acute kidney injury |
| AUC | Area under the (receiver operating characteristic) curve |
| CAD | Coronary artery disease |
| CADe | Computer-aided detection |
| CNN | Convolutional neural network |
| CT | Computed tomography |
| DL | Deep learning |
| ECG | Electrocardiogram |
| EHR | Electronic health record |
| EUS | Endoscopic ultrasound |
| FDA | U.S. Food and Drug Administration |
| HF | Heart failure |
| HFpEF | Heart failure with preserved ejection fraction |
| HPI | Hypotension Prediction Index |
| IPMN | Intraductal papillary mucinous neoplasm |
| LVEF | Left ventricular ejection fraction |
| MACCE | Major adverse cardiac and cerebrovascular events |
| ML | Machine learning |
| MRI | Magnetic resonance imaging |
| NOAC | Non-vitamin K antagonist oral anticoagulant |
| NSCLC | Non-small cell lung cancer |
| PCI | Percutaneous coronary intervention |
| PDAC | Pancreatic ductal adenocarcinoma |
| PPV | Positive predictive value |
| QoL | Quality of life |
| RCT | Randomized controlled trial |
| T1D | Type 1 diabetes |
References
- Bajwa, J.; Munir, U.; Nori, A.; Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Healthc. J. 2021, 8, e188–e194. [Google Scholar] [CrossRef]
- Shi, J.; Bendig, D.; Vollmar, H.C.; Rasche, P. Mapping the Bibliometrics Landscape of AI in Medicine: Methodological Study. J. Med. Internet Res. 2023, 25, e45815. [Google Scholar] [CrossRef] [PubMed]
- Howard, F.M.; Li, A.; Riffon, M.F.; Garrett-Mayer, E.; Pearson, A.T. Characterizing the Increase in Artificial Intelligence Content Detection in Oncology Scientific Abstracts From 2021 to 2023. JCO Clin. Cancer Inform. 2024, 8, e2400077. [Google Scholar] [CrossRef] [PubMed]
- Shajari, S.; Kuruvinashetti, K.; Komeili, A.; Sundararaj, U. The Emergence of AI-Based Wearable Sensors for Digital Health Technology: A Review. Sensors 2023, 23, 9498. [Google Scholar] [CrossRef] [PubMed]
- Huang, G.; Chen, X.; Liao, C. AI-Driven Wearable Bioelectronics in Digital Healthcare. Biosensors 2025, 15, 410. [Google Scholar] [CrossRef] [PubMed]
- Hannun, A.Y.; Rajpurkar, P.; Haghpanahi, M.; Tison, G.H.; Bourn, C.; Turakhia, M.P.; Ng, A.Y. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 2019, 25, 65–69. [Google Scholar] [CrossRef] [PubMed]
- Virella Pérez, Y.I.; Medlow, S.; Ho, J.; Steinbeck, K. Mobile and Web-Based Apps That Support Self-Management and Transition in Young People with Chronic Illness: Systematic Review. J. Med. Internet Res. 2019, 21, e13579. [Google Scholar] [CrossRef] [PubMed]
- Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. npj Digit. Med. 2020, 3, 119. [Google Scholar] [CrossRef] [PubMed]
- Ziller, A.; Usynin, D.; Braren, R.; Makowski, M.; Rueckert, D.; Kaissis, G. Medical imaging deep learning with differential privacy. Sci. Rep. 2021, 11, 13524. [Google Scholar] [CrossRef] [PubMed]
- Yadalam, A.K.; Liu, C.; Hui, Q.; Razavi, A.C.; Sperling, L.S.; Quyyumi, A.A.; Sun, Y.V. Large-Scale Proteomics-Based Risk Score for the Prediction of Incident Cardio-Kidney-Metabolic Disease Risk. Circ. Genom. Precis. Med. 2025, 18, e005125. [Google Scholar] [CrossRef]
- Singh, M.; Kumar, A.; Khanna, N.N.; Laird, J.R.; Nicolaides, A.; Faa, G.; Johri, A.M.; Mantella, L.E.; Fernandes, J.F.E.; Teji, J.S.; et al. Artificial intelligence for cardiovascular disease risk assessment in personalised framework: A scoping review. EClinicalMedicine 2024, 73, 102660. [Google Scholar] [CrossRef]
- Hadida Barzilai, D.; Sudri, K.; Goshen, G.; Klang, E.; Zimlichman, E.; Barbash, I.; Cohen Shelly, M. Randomized Controlled Trials Evaluating Artificial Intelligence in Cardiovascular Care: A Systematic Review. JACC Adv. 2025, 4, 102152. [Google Scholar] [CrossRef]
- Komorowski, M.; Cecconi, M. Deploying AI in the ICU: Learning from successes and failures. Intensive Care Med. 2025, 51, 2410–2413. [Google Scholar] [CrossRef] [PubMed]
- Tomašev, N.; Glorot, X.; Rae, J.W.; Zielinski, M.; Askham, H.; Saraiva, A.; Mottram, A.; Meyer, C.; Ravuri, S.; Protsyuk, I.; et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 2019, 572, 116–119. [Google Scholar] [CrossRef] [PubMed]
- Kellum, J.A.; Bihorac, A. Artificial intelligence to predict AKI: Is it a breakthrough? Nat. Rev. Nephrol. 2019, 15, 663–664. [Google Scholar] [CrossRef]
- Tiwari, A.; Mishra, S.; Kuo, T.R. Current AI technologies in cancer diagnostics and treatment. Mol. Cancer 2025, 24, 159. [Google Scholar] [CrossRef] [PubMed]
- Zhang, J.; Che, Y.; Liu, R.; Wang, Z.; Liu, W. Deep learning-driven multi-omics analysis: Enhancing cancer diagnostics and therapeutics. Brief. Bioinform. 2025, 26, bbaf440. [Google Scholar] [CrossRef]
- Olang, O.; Mohseni, S.; Shahabinezhad, A.; Hamidianshirazi, Y.; Goli, A.; Abolghasemian, M.; Shafiee, M.A.; Aarabi, M.; Alavinia, M.; Shaker, P. Artificial Intelligence-Based Models for Prediction of Mortality in ICU Patients: A Scoping Review. J. Intensive Care Med. 2025, 40, 1240–1246. [Google Scholar] [CrossRef]
- Singhal, K.; Tu, T.; Gottweis, J.; Sayres, R.; Wulczyn, E.; Amin, M.; Hou, L.; Clark, K.; Pfohl, S.R.; Cole-Lewis, H.; et al. Toward expert-level medical question answering with large language models. Nat. Med. 2025, 31, 943–950. [Google Scholar] [CrossRef]
- Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Costa, A.B.; Flores, M.G.; et al. A large language model for electronic health records. npj Digit. Med. 2022, 5, 194. [Google Scholar] [CrossRef]
- Sivakumar, R.; Lue, B.; Kundu, S. FDA Approval of Artificial Intelligence and Machine Learning Devices in Radiology: A Systematic Review. JAMA Netw. Open 2025, 8, e2542338. [Google Scholar] [CrossRef] [PubMed]
- Shuja, M.H.; Shakil, F.; Shuja, S.H.; Hasan, M.; Edhi, M.; Abbasi, A.F.; Jawaid, A.; Shakil, S. Harnessing Artificial Intelligence in Cardiology: Advancements in Diagnosis, Treatment, and Patient Care. Heart Views 2024, 25, 241–248. [Google Scholar] [CrossRef]
- Ahmad, A.; Ahmad, S.; Ahmad, R.; Bodi, J.; Mohamed, A.; Wasim, A. Artificial Intelligence in Cardiovascular Diagnosis: Innovations and Impact on Disease Screenings. J. Pharm. Bioallied Sci. 2025, 17, S1900–S1903. [Google Scholar] [CrossRef]
- Medhi, D.; Kamidi, S.R.; Mamatha Sree, K.P.; Shaikh, S.; Rasheed, S.; Thengu Murichathil, A.H.; Nazir, Z. Artificial Intelligence and Its Role in Diagnosing Heart Failure: A Narrative Review. Cureus 2024, 16, e59661. [Google Scholar] [CrossRef] [PubMed]
- Udoy, I.A.; Hassan, O. AI-Driven Technology in Heart Failure Detection and Diagnosis: A Review of the Advancement in Personalized Healthcare. Symmetry 2025, 17, 469. [Google Scholar] [CrossRef]
- Almansouri, N.E.; Awe, M.; Rajavelu, S.; Jahnavi, K.; Shastry, R.; Hasan, A.; Hasan, H.; Lakkimsetti, M.; AlAbbasi, R.K.; Gutiérrez, B.C.; et al. Early Diagnosis of Cardiovascular Diseases in the Era of Artificial Intelligence: An In-Depth Review. Cureus 2024, 16, e55869. [Google Scholar] [CrossRef] [PubMed]
- Yasmin, F.; Shah, S.M.I.; Naeem, A.; Shujauddin, S.M.; Jabeen, A.; Kazmi, S.; Siddiqui, S.A.; Kumar, P.; Salman, S.; Hassan, S.A.; et al. Artificial intelligence in the diagnosis and detection of heart failure: The past, present, and future. Rev. Cardiovasc. Med. 2021, 22, 1095–1113. [Google Scholar] [CrossRef]
- Sanders-van Wijk, S.; Barandiarán Aizpurua, A.; Brunner-La Rocca, H.P.; Henkens, M.T.H.M.; Weerts, J.; Knackstedt, C.; Uszko-Lencer, N.; Heymans, S.; van Empel, V. The HFA-PEFF and H2FPEF scores largely disagree in classifying patients with suspected heart failure with preserved ejection fraction. Eur. J. Heart Fail. 2021, 23, 838–840. [Google Scholar] [CrossRef]
- Bourazana, A.; Xanthopoulos, A.; Briasoulis, A.; Magouliotis, D.; Spiliopoulos, K.; Athanasiou, T.; Vassilopoulos, G.; Skoularigis, J.; Triposkiadis, F. Artificial Intelligence in Heart Failure: Friend or Foe? Life 2024, 14, 145. [Google Scholar] [CrossRef] [PubMed]
- Holt, D.B.; El-Bokl, A.; Stromberg, D.; Taylor, M.D. Role of Artificial Intelligence in Congenital Heart Disease and Interventions. J. Soc. Cardiovasc. Angiogr. Interv. 2025, 4, 102567. [Google Scholar] [CrossRef]
- Paul, D.; Sanap, G.; Shenoy, S.; Kalyane, D.; Kalia, K.; Tekade, R.K. Artificial intelligence in drug discovery and development. Drug Discov. Today 2021, 26, 80–93. [Google Scholar] [CrossRef]
- Meder, B.; Asselbergs, F.W.; Ashley, E. Artificial intelligence to improve cardiovascular population health. Eur. Heart J. 2025, 46, 1907–1916. [Google Scholar] [CrossRef]
- Deisenhofer, I.; Albenque, J.P.; Busch, S.; Gitenay, E.; Mountantonakis, S.E.; Roux, A.; Horvilleur, J.; Bakouboula, B.; Oza, S.; Abbey, S.; et al. Artificial intelligence for individualized treatment of persistent atrial fibrillation: A randomized controlled trial. Nat. Med. 2025, 31, 1286–1293. [Google Scholar] [CrossRef]
- Kim, Y.; Yoon, H.J.; Suh, J.; Kang, S.H.; Lim, Y.H.; Jang, D.H.; Park, J.H.; Shin, E.S.; Bae, J.W.; Lee, J.H.; et al. Artificial Intelligence-Based Fully Automated Quantitative Coronary Angiography vs Optical Coherence Tomography-Guided PCI: The FLASH Trial. JACC. Cardiovasc. Interv. 2025, 18, 187–197. [Google Scholar] [CrossRef]
- Liu, W.T.; Lin, C.; Lee, C.C.; Chang, C.H.; Fang, W.H.; Tsai, D.J.; Lin, W.Y.; Hung, Y.; Chen, K.C.; Lee, C.H.; et al. Artificial Intelligence-Enabled ECGs for Atrial Fibrillation Identification and Enhanced Oral Anticoagulant Adoption: A Pragmatic Randomized Clinical Trial. J. Am. Heart Assoc. 2025, 14, e042106. [Google Scholar] [CrossRef] [PubMed]
- Tsai, D.J.; Lin, C.; Liu, W.T.; Lee, C.C.; Chang, C.H.; Lin, W.Y.; Liu, Y.L.; Chang, D.W.; Hsieh, P.H.; Tsai, C.S.; et al. Artificial intelligence-assisted diagnosis and prognostication in low ejection fraction using electrocardiograms in inpatient department: A pragmatic randomized controlled trial. BMC Med. 2025, 23, 342. [Google Scholar] [CrossRef] [PubMed]
- Kolossváry, M.; Lin, A.; Kwiecinski, J.; Cadet, S.; Slomka, P.J.; Newby, D.E.; Dweck, M.R.; Williams, M.C.; Dey, D. Coronary Plaque Radiomic Phenotypes Predict Fatal or Nonfatal Myocardial Infarction: Analysis of the SCOT-HEART Trial. JACC Cardiovasc. Imaging 2025, 18, 308–319. [Google Scholar] [CrossRef]
- Trivedi, R.; Shaw, T.; Sheahen, B.; Chow, C.K.; Laranjo, L. Patient Perspectives on Conversational Artificial Intelligence for Atrial Fibrillation Self-Management: Qualitative Analysis. J. Med. Internet Res. 2025, 27, e64325. [Google Scholar] [CrossRef] [PubMed]
- Mekonnen, D.; Spitzer, E.; McFadden, E.P.; Caplice, N.M.; Ren, C.B. Artificial intelligence-assisted left ventricular global longitudinal strain assessment in patients with acute myocardial infarction: A RESUS-AMI trial sub-analysis. Int. J. Cardiovasc. Imaging 2025, 41, 1225–1236. [Google Scholar] [CrossRef]
- Williams, M.C.; Guimaraes, A.R.M.; Jiang, M.; Kwieciński, J.; Weir-McCall, J.R.; Adamson, P.D.; Mills, N.L.; Roditi, G.H.; van Beek, E.J.R.; Nicol, E.; et al. Machine learning to predict high-risk coronary artery disease on CT in the SCOT-HEART trial. Open Heart 2025, 12, e003162. [Google Scholar] [CrossRef]
- Saklica, D.; Vardar-Yagli, N.; Saglam, M.; Yuce, D.; Ates, A.H.; Yorgun, H. The Impact of Technology-Based Cardiac Rehabilitation on Exercise Capacity and Adherence in Patients with Coronary Artery Disease: An Artificial Intelligence Analysis. Arq. Bras. Cardiol. 2025, 122, e20240765. [Google Scholar] [CrossRef]
- Trivedi, R.; Laranjo, L.; Marschner, S.; Thiagalingam, A.; Thomas, S.; Kumar, S.; Shaw, T.; Chow, C.K. Conversational AI Phone Calls to Support Patients with Atrial Fibrillation: Randomized Controlled Trial. JMIR Cardio 2025, 9, e64326. [Google Scholar] [CrossRef]
- Fiolet, A.T.L.; Lin, A.; Kwiecinski, J.; Tutein Nolthenius, J.; McElhinney, P.; Grodecki, K.; Kietselaer, B.; Opstal, T.S.; Cornel, J.H.; Knol, R.J.; et al. Effect of low-dose colchicine on pericoronary inflammation and coronary plaque composition in chronic coronary disease: A subanalysis of the LoDoCo2 trial. Heart 2025, 111, 1156–1163. [Google Scholar] [CrossRef] [PubMed]
- Li, G.; Weng, T.; Sun, P.; Li, Z.; Ding, D.; Guan, S.; Han, W.; Gan, Q.; Li, M.; Qi, L.; et al. Diagnostic performance of fully automatic coronary CT angiography-based quantitative flow ratio. J. Cardiovasc. Comput. Tomogr. 2025, 19, 40–47. [Google Scholar] [CrossRef]
- Yu, X.; Cao, J.; Xu, J.; Xu, Q.; Chen, H.; Yu, D.; Ou, A.; Hu, Y.; Ma, L. Efficacy of Telemedical Interventional Management in Patients with Coronary Heart Disease Undergoing Percutaneous Coronary Intervention: Randomized Controlled Trial. J. Med. Internet Res. 2025, 27, e63350. [Google Scholar] [CrossRef] [PubMed]
- Ishiguchi, H.; Chen, Y.; Huang, B.; Gue, Y.; Correa, E.; Homma, S.; Thompson, J.L.P.; Qian, M.; Lip, G.Y.H.; Abdul-Rahim, A.H. Machine learning for stroke in heart failure with reduced ejection fraction but without atrial fibrillation: A post-hoc analysis of the WARCEF trial. Eur. J. Clin. Investig. 2025, 55, e14360. [Google Scholar] [CrossRef] [PubMed]
- Sindhu, A.; Jadhav, U.; Ghewade, B.; Bhanushali, J.; Yadav, P. Revolutionizing Pulmonary Diagnostics: A Narrative Review of Artificial Intelligence Applications in Lung Imaging. Cureus 2024, 16, e57657. [Google Scholar] [CrossRef]
- Cellina, M.; Cacioppa, L.M.; Cè, M.; Chiarpenello, V.; Costa, M.; Vincenzo, Z.; Pais, D.; Bausano, M.V.; Rossini, N.; Bruno, A.; et al. Artificial Intelligence in Lung Cancer Screening: The Future Is Now. Cancers 2023, 15, 4344. [Google Scholar] [CrossRef]
- Ma, K.; Zheng, M.; Chen, W.; Qi, Y.; Rong, H. Research progress in computer-aided diagnosis systems for lung cancer. npj Digit. Med. 2025, 8, 722. [Google Scholar] [CrossRef]
- Ren, F.; Aliper, A.; Chen, J.; Zhao, H.; Rao, S.; Kuppe, C.; Ozerov, I.V.; Zhang, M.; Witte, K.; Kruse, C.; et al. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models. Nat. Biotechnol. 2025, 43, 63–75. [Google Scholar] [CrossRef]
- Xu, Z.; Ren, F.; Wang, P.; Cao, J.; Tan, C.; Ma, D.; Zhao, L.; Dai, J.; Ding, Y.; Fang, H.; et al. A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: A randomized phase 2a trial. Nat. Med. 2025, 31, 2602–2610. [Google Scholar] [CrossRef]
- Habicher, M.; Denn, S.M.; Schneck, E.; Akbari, A.A.; Schmidt, G.; Markmann, M.; Alkoudmani, I.; Koch, C.; Sander, M. Perioperative goal-directed therapy with artificial intelligence to reduce the incidence of intraoperative hypotension and renal failure in patients undergoing lung surgery: A pilot study. J. Clin. Anesth. 2025, 102, 111777. [Google Scholar] [CrossRef]
- Hong, L.; Cheng, X.; Zheng, D. Application of Artificial Intelligence in Emergency Nursing of Patients with Chronic Obstructive Pulmonary Disease. Contrast Media Mol. Imaging 2021, 2021, 6423398. [Google Scholar] [CrossRef] [PubMed]
- Ladbury, C.; Li, R.; Danesharasteh, A.; Ertem, Z.; Tam, A.; Liu, J.; Hao, C.; Li, R.; McGee, H.; Sampath, S.; et al. Explainable Artificial Intelligence to Identify Dosimetric Predictors of Toxicity in Patients with Locally Advanced Non-Small Cell Lung Cancer: A Secondary Analysis of RTOG 0617. Int. J. Radiat. Oncol. Biol. Phys. 2023, 117, 1287–1296. [Google Scholar] [CrossRef] [PubMed]
- Pasipanodya, J.G.; Smythe, W.; Merle, C.S.; Olliaro, P.L.; Deshpande, D.; Magombedze, G.; McIlleron, H.; Gumbo, T. Artificial intelligence-derived 3-Way Concentration-dependent Antagonism of Gatifloxacin, Pyrazinamide, and Rifampicin During Treatment of Pulmonary Tuberculosis. Clin. Infect. Dis. 2018, 67, S284–S292. [Google Scholar] [CrossRef]
- Ding, Y.; Zhang, J.; Zhuang, W.; Gao, Z.; Kuang, K.; Tian, D.; Deng, C.; Wu, H.; Chen, R.; Lu, G.; et al. Improving the efficiency of identifying malignant pulmonary nodules before surgery via a combination of artificial intelligence CT image recognition and serum autoantibodies. Eur. Radiol. 2023, 33, 3092–3102. [Google Scholar] [CrossRef]
- Bosman, S.; Ayakaka, I.; Muhairwe, J.; Kamele, M.; van Heerden, A.; Madonsela, T.; Labhardt, N.D.; Sommer, G.; Bremerich, J.; Zoller, T.; et al. Evaluation of C-Reactive Protein and Computer-Aided Analysis of Chest X-rays as Tuberculosis Triage Tests at Health Facilities in Lesotho and South Africa. Clin. Infect. Dis. 2024, 79, 1293–1302. [Google Scholar] [CrossRef] [PubMed]
- Kalani, M.; Anjankar, A. Revolutionizing Neurology: The Role of Artificial Intelligence in Advancing Diagnosis and Treatment. Cureus 2024, 16, e61706. [Google Scholar] [CrossRef]
- Kale, M.; Wankhede, N.; Pawar, R.; Ballal, S.; Kumawat, R.; Goswami, M.; Khalid, M.; Taksande, B.; Upaganlawar, A.; Umekar, M.; et al. AI-driven innovations in Alzheimer’s disease: Integrating early diagnosis, personalized treatment, and prognostic modelling. Ageing Res. Rev. 2024, 101, 102497. [Google Scholar] [CrossRef]
- Onciul, R.; Tataru, C.-I.; Dumitru, A.V.; Crivoi, C.; Serban, M.; Covache-Busuioc, R.-A.; Radoi, M.P.; Toader, C. Artificial Intelligence and Neuroscience: Transformative Synergies in Brain Research and Clinical Applications. J. Clin. Med. 2025, 14, 550. [Google Scholar] [CrossRef]
- Zhang, H.; Jiao, L.; Yang, S.; Li, H.; Jiang, X.; Feng, J.; Zou, S.; Xu, Q.; Gu, J.; Wang, X.; et al. Brain-computer interfaces: The innovative key to unlocking neurological conditions. Int. J. Surg. 2024, 110, 5745–5762. [Google Scholar] [CrossRef]
- Buttar, A.M.; Shaheen, Z.; Gumaei, A.H.; Mosleh, M.A.A.; Gupta, I.; Alzanin, S.M.; Akbar, M.A. Enhanced neurological anomaly detection in MRI images using deep convolutional neural networks. Front. Med. 2024, 11, 1504545. [Google Scholar] [CrossRef] [PubMed]
- Gombolay, G.Y.; Silva, A.; Schrum, M.; Gopalan, N.; Hallman-Cooper, J.; Dutt, M.; Gombolay, M. Effects of explainable artificial intelligence in neurology decision support. Ann. Clin. Transl. Neurol. 2024, 11, 1224–1235. [Google Scholar] [CrossRef] [PubMed]
- Alagapan, S.; Choi, K.S.; Heisig, S.; Riva-Posse, P.; Crowell, A.; Tiruvadi, V.; Obatusin, M.; Veerakumar, A.; Waters, A.C.; Gross, R.E.; et al. Cingulate dynamics track depression recovery with deep brain stimulation. Nature 2023, 622, 130–138. [Google Scholar] [CrossRef]
- Boutet, A.; Madhavan, R.; Elias, G.J.B.; Joel, S.E.; Gramer, R.; Ranjan, M.; Paramanandam, V.; Xu, D.; Germann, J.; Loh, A.; et al. Predicting optimal deep brain stimulation parameters for Parkinson’s disease using functional MRI and machine learning. Nat. Commun. 2021, 12, 3043. [Google Scholar] [CrossRef]
- Cowan, R.P.; Rapoport, A.M.; Blythe, J.; Rothrock, J.; Knievel, K.; Peretz, A.M.; Ekpo, E.; Sanjanwala, B.M.; Woldeamanuel, Y.W. Diagnostic accuracy of an artificial intelligence online engine in migraine: A multi-center study. Headache 2022, 62, 870–882. [Google Scholar] [CrossRef] [PubMed]
- Gorenshtein, A.; Weisblat, Y.; Khateb, M.; Kenan, G.; Tsirkin, I.; Fayn, G.; Geller, S.; Shelly, S. AI-Based EMG Reporting: A Randomized Controlled Trial. J. Neurol. 2025, 272, 586.
- Davidovic, V.; Giglio, B.; Albeloushi, A.; Alhaj, A.K.; Alhantoobi, M.; Saeedi, R.; Deraiche, S.; Yilmaz, R.; Tee, T.; Fazlollahi, A.M.; et al. Effect of Artificial Intelligence-Augmented Human Instruction on Feedback Frequency and Surgical Performance During Simulation Training. J. Surg. Educ. 2025, 82, 103743.
- Hassan, A.E.; Ravi, S.; Desai, S.; Saei, H.M.; Mckennon, E.; Tekle, W.G. An artificial intelligence (AI)-based approach to clinical trial recruitment: The impact of Viz RECRUIT on enrollment in the EMBOLISE trial. Interv. Neuroradiol. J. Perither. Neuroradiol. Surg. Proced. Relat. Neurosci. 2025, 31, 739–744.
- Macea, J.; Heremans, E.R.M.; Proost, R.; De Vos, M.; Van Paesschen, W. Automated Sleep Staging in Epilepsy Using Deep Learning on Standard Electroencephalogram and Wearable Data. J. Sleep Res. 2025, 34, e70061.
- Kröner, P.T.; Engels, M.M.; Glicksberg, B.S.; Johnson, K.W.; Mzaik, O.; van Hooft, J.E.; Wallace, M.B.; El-Serag, H.B.; Krittanawong, C. Artificial intelligence in gastroenterology: A state-of-the-art review. World J. Gastroenterol. 2021, 27, 6794–6824.
- Bian, Y.; Li, J.; Ye, C.; Jia, X.; Yang, Q. Artificial intelligence in medical imaging: From task-specific models to large-scale foundation models. Chin. Med. J. 2025, 138, 651–663.
- Vivek, K.; Papalois, V. AI and Machine Learning in Transplantation. Transplantology 2025, 6, 23.
- Ratziu, V.; Francque, S.; Behling, C.A.; Cejvanovic, V.; Cortez-Pinto, H.; Iyer, J.S.; Krarup, N.; Le, Q.; Sejling, A.S.; Tiniakos, D.; et al. Artificial intelligence scoring of liver biopsies in a phase II trial of semaglutide in nonalcoholic steatohepatitis. Hepatology 2024, 80, 173–185.
- Tiyarattanachai, T.; Apiparakoon, T.; Chaichuen, O.; Sukcharoen, S.; Yimsawad, S.; Jangsirikul, S.; Chaikajornwat, J.; Siriwong, N.; Burana, C.; Siritaweechai, N.; et al. Artificial intelligence assists operators in real-time detection of focal liver lesions during ultrasound: A randomized controlled study. Eur. J. Radiol. 2023, 165, 110932.
- Zhang, Y.; Cui, J.; Wan, W.; Liu, J. Multimodal Imaging under Artificial Intelligence Algorithm for the Diagnosis of Liver Cancer and Its Relationship with Expressions of EZH2 and p57. Comput. Intell. Neurosci. 2022, 2022, 4081654.
- Briceño, J.; Cruz-Ramírez, M.; Prieto, M.; Navasa, M.; Ortiz de Urbina, J.; Orti, R.; Gómez-Bravo, M.Á.; Otero, A.; Varo, E.; Tomé, S.; et al. Use of artificial intelligence as an innovative donor-recipient matching model for liver transplantation: Results from a multicenter Spanish study. J. Hepatol. 2014, 61, 1020–1028.
- Yang, M.; Zhao, Y.; Li, C.; Weng, X.; Li, Z.; Guo, W.; Jia, W.; Feng, F.; Hu, J.; Sun, H.; et al. Multimodal integration of liquid biopsy and radiology for the noninvasive diagnosis of gallbladder cancer and benign disorders. Cancer Cell 2025, 43, 398–412.e4.
- Goyal, H.; Mann, R.; Gandhi, Z.; Perisetti, A.; Zhang, Z.; Sharma, N.; Saligram, S.; Inamdar, S.; Tharian, B. Application of artificial intelligence in pancreaticobiliary diseases. Ther. Adv. Gastrointest. Endosc. 2021, 14, 2631774521993059.
- Podină, N.; Gheorghe, E.C.; Constantin, A.; Cazacu, I.; Croitoru, V.; Gheorghe, C.; Balaban, D.V.; Jinga, M.; Țieranu, C.G.; Săftoiu, A. Artificial Intelligence in Pancreatic Imaging: A Systematic Review. United Eur. Gastroenterol. J. 2025, 13, 55–77.
- Lopez-Ramirez, F.; Syailendra, E.A.; Tixier, F.; Kawamoto, S.; Fishman, E.K.; Chu, L.C. Early detection of pancreatic cancer on computed tomography: Advancements with deep learning. Radiol. Adv. 2025, 2, umaf028.
- Korfiatis, P.; Suman, G.; Patnam, N.G.; Trivedi, K.H.; Karbhari, A.; Mukherjee, S.; Cook, C.; Klug, J.R.; Patra, A.; Khasawneh, H.; et al. Automated Artificial Intelligence Model Trained on a Large Data Set Can Detect Pancreas Cancer on Diagnostic Computed Tomography Scans As Well As Visually Occult Preinvasive Cancer on Prediagnostic Computed Tomography Scans. Gastroenterology 2023, 165, 1533–1546.e4.
- Udriștoiu, A.L.; Podină, N.; Ungureanu, B.S.; Constantin, A.; Georgescu, C.V.; Bejinariu, N.; Pirici, D.; Burtea, D.E.; Gruionu, L.; Udriștoiu, S.; et al. Deep learning segmentation architectures for automatic detection of pancreatic ductal adenocarcinoma in EUS-guided fine-needle biopsy samples based on whole-slide imaging. Endosc. Ultrasound 2024, 13, 335–344.
- Shi, Y.J.; Zhang, H.; Wang, L.L.; Liu, Y.L.; Zhu, H.T.; Li, X.T.; Wei, Y.Y.; Sun, Y.S. Deep learning automatic segmentation and radiomics model for diagnosing pancreatic solid neoplasms in MRI. BMC Cancer 2025, 25, 1563.
- Kui, B.; Pintér, J.; Molontay, R.; Nagy, M.; Farkas, N.; Gede, N.; Vincze, Á.; Bajor, J.; Gódi, S.; Czimmer, J.; et al. EASY-APP: An artificial intelligence model and application for early and easy prediction of severity in acute pancreatitis. Clin. Transl. Med. 2022, 12, e842.
- Cui, H.; Zhao, Y.; Xiong, S.; Feng, Y.; Li, P.; Lv, Y.; Chen, Q.; Wang, R.; Xie, P.; Luo, Z.; et al. Diagnosing Solid Lesions in the Pancreas With Multimodal Artificial Intelligence: A Randomized Crossover Trial. JAMA Netw. Open 2024, 7, e2422454.
- Chen, P.T.; Wu, T.; Wang, P.; Chang, D.; Liu, K.L.; Wu, M.S.; Roth, H.R.; Lee, P.C.; Liao, W.C.; Wang, W. Pancreatic Cancer Detection on CT Scans with Deep Learning: A Nationwide Population-based Study. Radiology 2023, 306, 172–182.
- Kovatchev, B.; Castillo, A.; Pryor, E.; Kollar, L.L.; Barnett, C.L.; DeBoer, M.D.; Brown, S.A. Neural-Net Artificial Pancreas: A Randomized Crossover Trial of a First-in-Class Automated Insulin Delivery Algorithm. Diabetes Technol. Ther. 2024, 26, 375–382.
- Atlas, E.; Nimri, R.; Miller, S.; Grunberg, E.A.; Phillip, M. MD-logic artificial pancreas system: A pilot study in adults with type 1 diabetes. Diabetes Care 2010, 33, 1072–1076.
- Ali, A.; Alghamdi, M.; Marzuki, S.S.; Tengku Din, T.A.D.A.A.; Yamin, M.S.; Alrashidi, M.; Alkhazi, I.S.; Ahmed, N. Exploring AI Approaches for Breast Cancer Detection and Diagnosis: A Review Article. Breast Cancer 2025, 17, 927–947.
- Debellotte, O.; Dookie, R.L.; Rinkoo, F.; Kar, A.; Salazar González, J.F.; Saraf, P.; Aflahe Iqbal, M.; Ghazaryan, L.; Mukunde, A.C.; Khalid, A.; et al. Artificial Intelligence and Early Detection of Breast, Lung, and Colon Cancer: A Narrative Review. Cureus 2025, 17, e79199.
- Barrett, M.; Boyne, J.; Brandts, J.; Brunner-La Rocca, H.P.; De Maesschalck, L.; De Wit, K.; Dixon, L.; Eurlings, C.; Fitzsimons, D.; Golubnitschaja, O.; et al. Artificial intelligence supported patient self-care in chronic heart failure: A paradigm shift from reactive to predictive, preventive and personalised care. EPMA J. 2019, 10, 445–464.
- Garzonis, K.; Mann, E.; Wyrzykowska, A.; Kanellakis, P. Improving Patient Outcomes: Effectively Training Healthcare Staff in Psychological Practice Skills: A Mixed Systematic Literature Review. Eur. J. Psychol. 2015, 11, 535–556.
- Arioz, U.; Allsop, M.J.; Goodman, W.D.; Timmons, S.; Simbirtseva, K.; Mlakar, I.; Mocnik, G. Artificial intelligence-based approaches for advance care planning: A scoping review. BMC Palliat. Care 2025, 24, 268.
- Bienefeld, N.; Keller, E.; Grote, G. AI Interventions to Alleviate Healthcare Shortages and Enhance Work Conditions in Critical Care: Qualitative Analysis. J. Med. Internet Res. 2025, 27, e50852.
- Lång, K.; Josefsson, V.; Larsson, A.M.; Larsson, S.; Högberg, C.; Sartor, H.; Hofvind, S.; Andersson, I.; Rosso, A. Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI): A clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study. Lancet Oncol. 2023, 24, 936–944.
- Dembrower, K.; Crippa, A.; Colón, E.; Eklund, M.; Strand, F.; ScreenTrustCAD Trial Consortium. Artificial intelligence for breast cancer detection in screening mammography in Sweden: A prospective, population-based, paired-reader, non-inferiority study. Lancet Digit. Health 2023, 5, e703–e711.
- Sadeh-Sharvit, S.; Camp, T.D.; Horton, S.E.; Hefner, J.D.; Berry, J.M.; Grossman, E.; Hollon, S.D. Effects of an Artificial Intelligence Platform for Behavioral Interventions on Depression and Anxiety Symptoms: Randomized Clinical Trial. J. Med. Internet Res. 2023, 25, e46781.
- Repici, A.; Spadaccini, M.; Antonelli, G.; Correale, L.; Maselli, R.; Galtieri, P.A.; Pellegatta, G.; Capogreco, A.; Milluzzo, S.M.; Lollo, G.; et al. Artificial intelligence and colonoscopy experience: Lessons from two randomised trials. Gut 2022, 71, 757–765.
- Marcuzzi, A.; Nordstoga, A.L.; Bach, K.; Aasdahl, L.; Nilsen, T.I.L.; Bardal, E.M.; Boldermo, N.Ø.; Falkener Bertheussen, G.; Marchand, G.H.; Gismervik, S.; et al. Effect of an Artificial Intelligence-Based Self-Management App on Musculoskeletal Health in Patients with Neck and/or Low Back Pain Referred to Specialist Care: A Randomized Clinical Trial. JAMA Netw. Open 2023, 6, e2320400.
- Wallace, M.B.; Sharma, P.; Bhandari, P.; East, J.; Antonelli, G.; Lorenzetti, R.; Vieth, M.; Speranza, I.; Spadaccini, M.; Desai, M.; et al. Impact of Artificial Intelligence on Miss Rate of Colorectal Neoplasia. Gastroenterology 2022, 163, 295–304.e5.
- Liaw, S.Y.; Tan, J.Z.; Bin Rusli, K.D.; Ratan, R.; Zhou, W.; Lim, S.; Lau, T.C.; Seah, B.; Chua, W.L. Artificial Intelligence Versus Human-Controlled Doctor in Virtual Reality Simulation for Sepsis Team Training: Randomized Controlled Study. J. Med. Internet Res. 2023, 25, e47748.
- Luo, H.; Xu, G.; Li, C.; He, L.; Luo, L.; Wang, Z.; Jing, B.; Deng, Y.; Jin, Y.; Li, Y.; et al. Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: A multicentre, case-control, diagnostic study. Lancet Oncol. 2019, 20, 1645–1654.
- Wilson, P.M.; Ramar, P.; Philpot, L.M.; Soleimani, J.; Ebbert, J.O.; Storlie, C.B.; Morgan, A.A.; Schaeferle, G.M.; Asai, S.W.; Herasevich, V.; et al. Effect of an Artificial Intelligence Decision Support Tool on Palliative Care Referral in Hospitalized Patients: A Randomized Clinical Trial. J. Pain Symptom Manag. 2023, 66, 24–32.
- Papachristou, P.; Söderholm, M.; Pallon, J.; Taloyan, M.; Polesie, S.; Paoli, J.; Anderson, C.D.; Falk, M. Evaluation of an artificial intelligence-based decision support for the detection of cutaneous melanoma in primary care: A prospective real-life clinical trial. Br. J. Dermatol. 2024, 191, 125–133.
- Kazemzadeh, K. Artificial intelligence in ophthalmology: Opportunities, challenges, and ethical considerations. Med. Hypothesis Discov. Innov. Ophthalmol. J. 2025, 14, 255–272.
- Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J.W.L. Artificial intelligence in radiology. Nat. Rev. Cancer 2018, 18, 500–510.
- Bhandari, A. Revolutionizing Radiology with Artificial Intelligence. Cureus 2024, 16, e72646.
- Försch, S.; Klauschen, F.; Hufnagl, P.; Roth, W. Artificial Intelligence in Pathology. Dtsch. Arztebl. Int. 2021, 118, 194–204.
- Shafi, S.; Parwani, A.V. Artificial intelligence in diagnostic pathology. Diagn. Pathol. 2023, 18, 109.
- Sajithkumar, A.; Thomas, J.; Saji, A.M.; Ali, F.; E.K, H.H.; Adampulan, H.A.G.; Sarathchand, S. Artificial Intelligence in pathology: Current applications, limitations, and future directions. Ir. J. Med. Sci. 2024, 193, 1117–1121.
- Hashemian, H.; Peto, T.; Ambrósio, R., Jr.; Lengyel, I.; Kafieh, R.; Muhammed Noori, A.; Khorrami-Nejad, M. Application of Artificial Intelligence in Ophthalmology: An Updated Comprehensive Review. J. Ophthalmic Vis. Res. 2024, 19, 354–367.
- Lim, J.I.; Regillo, C.D.; Sadda, S.R.; Ipp, E.; Bhaskaranand, M.; Ramachandra, C.; Solanki, K. Artificial Intelligence Detection of Diabetic Retinopathy: Subgroup Comparison of the EyeArt System with Ophthalmologists’ Dilated Examinations. Ophthalmol. Sci. 2022, 3, 100228.
- Li, Z.; Wang, L.; Wu, X.; Jiang, J.; Qiang, W.; Xie, H.; Zhou, H.; Wu, S.; Shao, Y.; Chen, W. Artificial intelligence in ophthalmology: The path to the real-world clinic. Cell Rep. Med. 2023, 4, 101095.
- Johnson, K.W.; Torres Soto, J.; Glicksberg, B.S.; Shameer, K.; Miotto, R.; Ali, M.; Ashley, E.; Dudley, J.T. Artificial Intelligence in Cardiology. J. Am. Coll. Cardiol. 2018, 71, 2668–2679.
- Karatzia, L.; Aung, N.; Aksentijevic, D. Artificial intelligence in cardiology: Hope for the future and power for the present. Front. Cardiovasc. Med. 2022, 9, 945726.
- Patrascanu, O.S.; Tutunaru, D.; Musat, C.L.; Dragostin, O.M.; Fulga, A.; Nechita, L.; Ciubara, A.B.; Piraianu, A.I.; Stamate, E.; Poalelungi, D.G.; et al. Future Horizons: The Potential Role of Artificial Intelligence in Cardiology. J. Pers. Med. 2024, 14, 656.
- Lotter, W.; Hassett, M.J.; Schultz, N.; Kehl, K.L.; Van Allen, E.M.; Cerami, E. Artificial Intelligence in Oncology: Current Landscape, Challenges, and Future Directions. Cancer Discov. 2024, 14, 711–726.
- Kann, B.H.; Hosny, A.; Aerts, H.J.W.L. Artificial intelligence for clinical oncology. Cancer Cell 2021, 39, 916–927.
- Shimizu, H.; Nakayama, K.I. Artificial intelligence in oncology. Cancer Sci. 2020, 111, 1452–1460.
- Rizzo, M.; Dawson, J.D. AI in Neurology: Everything, Everywhere, All at Once Part 1: Principles and Practice. Ann. Neurol. 2025, 98, 211–230.
- Khalilian, M.; Godefroy, O.; Roussel, M.; Mousavi, A.; Aarabi, A. Post-stroke outcome prediction based on lesion-derived features. NeuroImage Clin. 2025, 45, 103747.
- Voigtlaender, S.; Pawelczyk, J.; Geiger, M.; Vaios, E.J.; Karschnia, P.; Cudkowicz, M.; Dietrich, J.; Haraldsen, I.R.J.H.; Feigin, V.; Owolabi, M.; et al. Artificial intelligence in neurology: Opportunities, challenges, and policy implications. J. Neurol. 2024, 271, 2258–2273.
- Shrestha, U.K. Emerging role of artificial intelligence in gastroenterology and hepatology. World J. Gastroenterol. 2025, 31, 111495.
- Urquhart, S.A.; Christof, M.; Coelho-Prabhu, N. The impact of artificial intelligence on the endoscopic assessment of inflammatory bowel disease-related neoplasia. Ther. Adv. Gastroenterol. 2025, 18, 17562848251348574.
- El-Sayed, A.; Lovat, L.B.; Ahmad, O.F. Clinical Implementation of Artificial Intelligence in Gastroenterology: Current Landscape, Regulatory Challenges, and Ethical Issues. Gastroenterology 2025, 169, 518–530.
- Nahm, W.J.; Sohail, N.; Burshtein, J.; Goldust, M.; Tsoukas, M. Artificial Intelligence in Dermatology: A Comprehensive Review of Approved Applications, Clinical Implementation, and Future Directions. Int. J. Dermatol. 2025, 64, 1568–1583.
- Young, A.T.; Xiong, M.; Pfau, J.; Keiser, M.J.; Wei, M.L. Artificial Intelligence in Dermatology: A Primer. J. Investig. Dermatol. 2020, 140, 1504–1512.
- Fliorent, R.; Fardman, B.; Podwojniak, A.; Javaid, K.; Tan, I.J.; Ghani, H.; Truong, T.M.; Rao, B.; Heath, C. Artificial intelligence in dermatology: Advancements and challenges in skin of color. Int. J. Dermatol. 2024, 63, 455–461.
- Cruz-Gonzalez, P.; He, A.W.; Lam, E.P.; Ng, I.M.C.; Li, M.W.; Hou, R.; Chan, J.N.; Sahni, Y.; Vinas Guasch, N.; Miller, T.; et al. Artificial intelligence in mental health care: A systematic review of diagnosis, monitoring, and intervention applications. Psychol. Med. 2025, 55, e18.
- Prégent, J.; Chung, V.H.; El Adib, I.; Désilets, M.; Hudon, A. Applications of Artificial Intelligence in Psychiatry and Psychology Education: Scoping Review. JMIR Med. Educ. 2025, 11, e75238.
- Garcia, G. The role of AI in transforming psychiatric-mental health care: Enhancing the role of psychiatric-mental health nurse practitioners. Nurs. Outlook 2025, 73, 102461.
- Niazi, S.K.; Mariam, Z. Artificial intelligence in drug development: Reshaping the therapeutic landscape. Ther. Adv. Drug Saf. 2025, 16, 20420986251321704.
- Zhang, K.; Yang, X.; Wang, Y.; Yu, Y.; Huang, N.; Li, G.; Li, X.; Wu, J.C.; Yang, S. Artificial intelligence in drug development. Nat. Med. 2025, 31, 45–59.
- Fu, C.; Chen, Q. The future of pharmaceuticals: Artificial intelligence in drug discovery and development. J. Pharm. Anal. 2025, 15, 101248.
- Baghbani, S.; Mehrabi, Y.; Movahedinia, M.; Babaeinejad, E.; Joshaghanian, M.; Amiri, S.; Shahrezaee, M. The revolutionary impact of artificial intelligence in orthopedics: Comprehensive review of current benefits and challenges. J. Robot. Surg. 2025, 19, 511.
- Wu, S.; Miao, Y.; Mei, J.; Xiong, S. The Rise of Artificial Intelligence in Orthopedics: A Bibliometric and Visualization Analysis. J. Multidiscip. Healthc. 2025, 18, 6037–6050.
- Song, J.; Wang, G.C.; Wang, S.C.; He, C.R.; Zhang, Y.Z.; Chen, X.; Su, J.C. Artificial intelligence in orthopedics: Fundamentals, current applications, and future perspectives. Mil. Med. Res. 2025, 12, 42.
- Smith, M.E.; Zalesky, C.C.; Lee, S.; Gottlieb, M.; Adhikari, S.; Goebel, M.; Wegman, M.; Garg, N.; Lam, S.H.F. Artificial Intelligence in Emergency Medicine: A Primer for the Nonexpert. J. Am. Coll. Emerg. Physicians Open 2025, 6, 100051.
- Stewart, J.; Sprivulis, P.; Dwivedi, G. Artificial intelligence and machine learning in emergency medicine. Emerg. Med. Australas. EMA 2018, 30, 870–874.
- Amiot, F.; Potier, B. Artificial Intelligence (AI) and Emergency Medicine: Balancing Opportunities and Challenges. JMIR Med. Inform. 2025, 13, e70903.
- Kerth, J.L.; Bischops, A.C.; Hagemeister, M.; Reinhart, L.; Konrad, K.; Heinrichs, B.; Meissner, T. Künstliche Intelligenz in der Gesundheitsvorsorge von Kindern und Jugendlichen—Anwendungsmöglichkeiten und Akzeptanz [Artificial intelligence in preventive medicine for children and adolescents-applications and acceptance]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2025, 68, 907–914.
- Dinc, R.; Ardic, N. The Next Frontiers in Preventive and Personalized Healthcare: Artificial Intelligent-powered Solutions. J. Prev. Med. Public Health = Yebang Uihakhoe Chi 2025, 58, 441–452.
- Whiteson, H.Z.; Frishman, W.H. Artificial Intelligence in the Prevention and Detection of Cardiovascular Disease. Cardiol. Rev. 2025, 33, 239–242.
- Lopes, S.; Rocha, G.; Guimarães-Pereira, L. Artificial intelligence and its clinical application in Anesthesiology: A systematic review. J. Clin. Monit. Comput. 2024, 38, 247–259.
- Dost, A.; Alaraj, R.; Mayet, R.; Agrawal, D.K. Reshaping Anesthesia with Artificial Intelligence: From Concept to Reality. Anesth. Crit. Care 2025, 7, 77–90.
- Hashimoto, D.A.; Witkowski, E.; Gao, L.; Meireles, O.; Rosman, G. Artificial Intelligence in Anesthesiology: Current Techniques, Clinical Applications, and Limitations. Anesthesiology 2020, 132, 379–394.
- Scott, I.A.; van der Vegt, A.; Lane, P.; McPhail, S.; Magrabi, F. Achieving large-scale clinician adoption of AI-enabled decision support. BMJ Health Care Inform. 2024, 31, e100971.
- Fihn, S.; Saria, S.; Mendonça, E.; Hain, S.; Matheny, M.; Shah, N.; Liu, H.; Auerbach, A. Deploying artificial intelligence in clinical settings. In Artificial Intelligence in Health Care: The Hope, the Hype, the Promise, the Peril; Whicher, D., Ahmed, M., Israni, S.T., Eds.; National Academies Press: Washington, DC, USA, 2023. Available online: https://www.ncbi.nlm.nih.gov/books/NBK605954/ (accessed on 20 December 2025).
- Bahl, M. Artificial intelligence in clinical practice: Implementation considerations and barriers. J. Breast Imaging 2022, 4, 632–639.
- de Hond, A.A.H.; Leeuwenberg, A.M.; Hooft, L.; Kant, I.M.J.; Nijman, S.W.J.; van Os, H.J.A.; Aardoom, J.J.; Debray, T.P.A.; Schuit, E.; van Smeden, M.; et al. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: A scoping review. npj Digit. Med. 2022, 5, 2.
- Liu, M.; Ning, Y.; Teixayavong, S.; Liu, X.; Mertens, M.; Shang, Y.; Li, X.; Miao, D.; Liao, J.; Xu, J.; et al. A scoping review and evidence gap analysis of clinical AI fairness. npj Digit. Med. 2025, 8, 360.




| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Ren, F., et al. [50] | AI-driven target identification selected TNIK as a high-confidence anti-fibrotic target. A generative AI platform designed and optimized the small-molecule TNIK inhibitor INS018_055. The compound was tested in multiple preclinical in vivo models of lung, liver, skin, and kidney fibrosis (oral, inhaled, and topical routes) and in anti-inflammatory models. Safety, tolerability, and pharmacokinetics were assessed in two randomized, double-blind, placebo-controlled phase I trials (NCT05154240 and CTR20221542) involving 78 healthy volunteers. | INS018_055 showed potent pan-organ anti-fibrotic activity and additional anti-inflammatory effects across preclinical models via all tested administration routes. Both phase I trials confirmed favorable safety, tolerability, and pharmacokinetic profiles with no serious adverse events. The entire process from target identification to clinical candidate nomination was completed in approximately 18 months. | This study demonstrates the first successful clinical translation of a generative AI-discovered drug for idiopathic pulmonary fibrosis. By identifying TNIK as a novel anti-fibrotic target and rapidly delivering a safe inhibitor (INS018_055) bioavailable by oral and inhaled routes, it validates an ultra-fast AI-powered drug discovery pipeline capable of dramatically accelerating therapeutic development for high-unmet-need fibrotic diseases. |
| Xu, Z., et al. [51] | Multicenter, double-blind, randomized, placebo-controlled phase 2a trial (NCT05938920); 71 patients with idiopathic pulmonary fibrosis were randomized 1:1:1:1 to rentosertib (AI-discovered first-in-class TNIK inhibitor) 30 mg QD, 30 mg BID, 60 mg QD, or placebo for 12 weeks. Primary endpoint: safety and tolerability (treatment-emergent adverse events). Key secondary endpoints: pharmacokinetics and change in forced vital capacity (FVC). | Safety profile was favorable and comparable across groups (TEAEs 70–83%); serious treatment-related AEs were rare, with liver toxicity and diarrhea as the main reasons for discontinuation. At the highest dose (60 mg QD), FVC increased by +98.4 mL (95% CI 10.9–185.9) versus a decline of −20.3 mL (95% CI −116.1 to 75.6) with placebo, indicating a dose-dependent signal of lung function stabilization/improvement. | Rentosertib is the first fully AI-generated novel small molecule (new target + new chemical entity) to demonstrate clinical proof-of-concept in a phase 2 trial. The positive safety profile and early evidence of FVC benefit in IPF validate generative AI as a viable end-to-end drug discovery engine capable of delivering clinical-stage candidates for previously intractable diseases. |
| Habicher, M., et al. [52] | Single-center, single-blinded randomized controlled trial; 150 patients undergoing lung surgery with single-lung ventilation were 1:1 randomized to either AI-based goal-directed hemodynamic therapy using the Hypotension Prediction Index (HPI) or standard care without a specific protocol. | The HPI-guided intervention significantly reduced the number of hypotensive episodes (0 [0–1] vs. 1 [0–2], p = 0.01), duration of hypotension (0 vs. 2.33 min, p = 0.01), area under MAP < 65 mmHg (0 vs. 10.67 mmHg·min, p < 0.01), and time-weighted average below MAP 65 (0 vs. 0.07 mmHg, p < 0.01). Postoperative acute kidney injury (AKI) rates were similar (6.7% vs. 4.2%, p = 0.72); myocardial injury after noncardiac surgery (MINS) showed a strong trend toward reduction (17.1% vs. 31.8%, p = 0.07), with a similar trend for postoperative infections. | AI-driven predictive hemodynamic management using the HPI effectively minimized intraoperative hypotension during single-lung ventilation surgery. Although AKI was not significantly reduced, the observed trends toward lower myocardial injury and infections suggest potential clinical benefit, supporting broader adoption of predictive AI tools for perioperative hemodynamic optimization. |
| Hong, L., et al. [53] | Two prospective randomized controlled trials in COPD patients: 1. A 12-month RCT with 447 patients assessing AI-based medical intervention vs. standard care, with quality-of-life (QoL) and psychological outcomes measured at 4 and 12 months. 2. A separate 9-month RCT with 101 patients randomized to a web-based AI-driven educational and exercise program for prevention of acute exacerbations versus control. | At 4 months no significant QoL improvement was observed. At 12 months the AI-intervention group showed significantly better quality of life, emotional status, and psychological well-being compared with controls. In the second trial, the AI-supported group had lower hospitalization rates and shorter length of hospital stay than the control group. Not every endpoint reached statistical significance in single-factor analysis, but overall results were positive. | AI-supported medical and educational interventions appear feasible and effective in long-term COPD management, improving quality of life and psychological health and reducing acute exacerbations and hospitalizations over 9–12 months. Although some individual endpoints lacked statistical power, the trials provide preliminary evidence supporting the integration of artificial intelligence into routine COPD care and secondary prevention. |
| Ladbury, C., et al. [54] | Secondary analysis of the RTOG 0617 trial; multiple machine learning models (including XGBoost, random forest, naive Bayes) were trained on clinical and dosimetric variables to predict grade ≥ 3 pulmonary, cardiac, and esophageal toxicity after definitive chemoradiation for locally advanced NSCLC. Best models were interpreted using SHAP values to identify and quantify dosimetric thresholds, validated with logistic regression and bootstrapping. | XGBoost best predicted pulmonary toxicity (AUC 0.739), random forest cardiac (AUC 0.706), and naive Bayes esophageal toxicity (AUC 0.721), all outperforming traditional logistic regression. Key thresholds: lung mean dose > 18 Gy and V20 > 37% for pulmonary toxicity (OR 2.47 and 2.72); esophageal mean dose > 34 Gy and V20 > 37% for esophageal toxicity (OR 4.01 and 3.73). No significant cardiac thresholds identified. | Machine learning combined with explainable AI (SHAP) validated known and identified new clinically actionable dosimetric thresholds for radiation-induced toxicity, outperforming conventional statistical methods. This approach enables data-driven, precise optimization of radiotherapy planning constraints in lung cancer chemoradiation. |
| Pasipanodya, J. G., et al. [55] | Nested pharmacokinetic substudy within the OFLOTUB trial; 126 patients with drug-susceptible pulmonary TB received a 4-month gatifloxacin-containing regimen. Intensive PK sampling was performed on two days in the first 2 months. Therapy failure was defined as failure to culture-convert, relapse (confirmed by spoligotyping), or death within 24 months. An ensemble machine-learning approach (multiple algorithms) was used to rank 27 clinical/laboratory/PK variables and detect interactions predicting outcome. | 19/126 patients (15%) had unfavorable outcomes. Machine learning ranked pyrazinamide and rifampicin exposure (peak concentration [Cmax] and area under the concentration–time curve [AUC]) as more important than gatifloxacin exposure. A significant antagonistic 3-way interaction between low concentrations of pyrazinamide, gatifloxacin, and rifampicin was identified; this negative interaction disappeared when rifampicin Cmax exceeded 7 mg/L. Drug concentrations explained 31–75% of outcome variance across sites. | Concentration-dependent antagonism among the three key drugs contributes to treatment failure in shortened TB regimens but can be overcome by higher rifampicin exposure (>7 mg/L). The findings provide a mechanistic explanation for prior trial failures and strongly support dose optimization of both rifampicin and fluoroquinolones to improve efficacy of short-course TB regimens. |
| Ding, Y., et al. [56] | Retrospective analysis of 424 patients with pulmonary nodules undergoing surgical resection; all had preoperative CT-based AI malignancy probability score, 7-autoantibody (7-AAB) panel, and carcinoembryonic antigen (CEA) testing. Patients were randomly split 1:1 into training (n = 212) and validation (n = 212) sets. A logistic regression-based nomogram was built using forward stepwise selection of significant predictors (age, AI score, 7-AAB result, CEA) and internally validated. | The nomogram achieved AUC 0.899, sensitivity 82.3%, specificity 90.5%, and PPV 97.2% in the validation set. It significantly outperformed 7-AAB alone (sensitivity 82.3% vs. 35.9%, p < 0.001), CEA alone (sensitivity 82.3% vs. 18.8%, p < 0.001), and standalone CT-AI (specificity 90.5% vs. 69.0%, p = 0.022). For nodules ≤ 2 cm, nomogram specificity remained high at 90.0% vs. 67.5% for AI alone (p = 0.022). | Integration of CT-based AI, the 7-AAB panel, age, and CEA into a simple nomogram substantially improves diagnostic accuracy for pulmonary nodules, especially small (≤2 cm) lesions, offering higher sensitivity than biomarkers alone and higher specificity than AI alone. This non-invasive, clinically practical tool can reduce unnecessary surgeries while maintaining excellent malignancy detection. |
| Bosman, S., et al. [57] | Prospective diagnostic accuracy study at health facilities in Lesotho and South Africa; 1392 symptomatic adults (≥1 TB symptom) underwent digital chest X-ray with CAD4TBv7 analysis, point-of-care CRP, and microbiological confirmation (Xpert MTB/RIF Ultra + liquid culture as composite reference standard). CAD4TBv7 performance was compared to CRP and expert radiologist reading. | CAD4TBv7 AUC was 0.87 (95% CI 0.84–0.91) vs. 0.80 for CRP (95% CI 0.76–0.84). At ≥90% sensitivity, CAD4TBv7 specificity was 68.2% (95% CI 65.4–71.0%), nearly meeting the WHO target product profile (TPP) threshold (>70%), while CRP specificity was only 38.2%. CAD4TBv7 performed equivalently to expert radiologist interpretation. | CAD4TBv7 is a highly accurate, non-sputum triage test that nearly achieves WHO TPP criteria in high TB/HIV-burden settings, enabling rapid rule-out of TB and prioritization of confirmatory testing. It significantly outperforms CRP and matches expert radiology, supporting its deployment as a scalable frontline TB screening tool. |
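
Several trials in the table above report an AUC together with an operating point fixed by a sensitivity target (e.g., CAD4TBv7 evaluated at ≥90% sensitivity, then read off for specificity). As a minimal sketch of that threshold-selection step, using entirely synthetic scores and labels with no relation to any trial data:

```python
# Sketch: choose the highest score threshold whose sensitivity still meets
# a target, then report the specificity achieved at that operating point.
# Scores and labels are synthetic illustrations only.

def operating_point(scores, labels, min_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) for the highest
    threshold that keeps sensitivity >= min_sensitivity."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s in positives if s >= t)
        sens = tp / len(positives)
        if sens >= min_sensitivity:
            tn = sum(1 for s in negatives if s < t)
            # The highest qualifying threshold yields the fewest false
            # positives, hence the best specificity at the target sensitivity.
            return t, sens, tn / len(negatives)
    return None

scores = [0.95, 0.90, 0.85, 0.80, 0.40, 0.75, 0.30, 0.20, 0.60, 0.10]
labels = [1,    1,    1,    1,    1,    0,    0,    0,    0,    0]
t, sens, spec = operating_point(scores, labels)
```

With these toy data the selected threshold is 0.40, giving 100% sensitivity at 60% specificity; real triage evaluations apply the same logic to thousands of continuous scores.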
| Authors | Methodology | Results | Significance |
|---|---|---|---|
| Gombolay, G. Y., et al. [63] | Randomized, blinded vignette-based study comparing 81 neurologists/child neurologists (members of the Child Neurology Society and American Academy of Neurology) with 284 general-population participants. Each received AI-based diagnostic recommendations accompanied by one of eight randomly assigned explainable AI (xAI) methods (decision tree, crowd-sourced agreement, case-based reasoning, probability scores, counterfactuals, feature importance, templated explanations, or no explanation). Primary outcomes: task performance, perceived explainability, trust, and social competence of the decision-support system (DSS). | Decision trees were rated significantly more explainable by neurologists than by the general population (p < 0.01) and more explainable than probability scores among neurologists (p < 0.001). Higher clinical experience and higher perceived explainability paradoxically correlated with worse task performance (p = 0.0214). Performance was driven by perceived explainability rather than the specific xAI technique used. | Different xAI techniques are not equally effective across clinicians and laypeople; neurologists strongly prefer decision-tree explanations, and there is no universal “best” xAI method. Perceived explainability can negatively affect diagnostic accuracy in experts, highlighting the need for clinician-centered, personalized xAI design rather than one-size-fits-all approaches in clinical decision-support systems. |
| Alagapan, S., et al. [64] | Prospective study in 10 patients with treatment-resistant depression receiving subcallosal cingulate (SCC) DBS with a bidirectional implant allowing chronic local field potential (LFP) recording (NCT01984710). SCC LFPs from 6 patients were analyzed using explainable AI to derive individualized electrophysiological biomarkers of clinical state. Preoperative structural/functional connectivity of the target network was quantified with tractography and resting-state fMRI; objective mood changes were measured via automated video-based facial expression analysis. | At 24 weeks, 90% of patients were responders and 70% achieved remission. Explainable AI identified stable, patient-specific SCC LFP biomarkers that accurately tracked clinical state, discriminated therapeutic from transient stimulation effects, and responded to programming changes. Recovery trajectories strongly correlated with preoperative integrity of the white-matter treatment network and were objectively mirrored by changes in data-driven facial expression metrics. | This is the first demonstration of chronic, individualized electrophysiological biomarkers from the SCC DBS target that objectively track clinical state in TRD. Combined with preoperative connectivity and automated behavioral analysis, these biomarkers enable personalized, biomarker-guided DBS management, reduce reliance on subjective reporting, and explain inter-individual variability in long-term outcome of SCC DBS for severe depression. |
| Boutet, A., et al. [65] | Prospective observational trial in 67 PD patients with chronic DBS; 3T fMRI was performed under clinically optimized (ON) and deliberately non-optimal (OFF) stimulation settings. Brain response patterns were compared, and a machine learning classifier (trained on 39 patients with a priori known optimal settings) was built to predict optimal vs. non-optimal stimulation from whole-brain fMRI activation maps. | Optimal DBS produced a distinct fMRI signature with strong engagement of the motor network. The ML model classified optimal vs. non-optimal settings with 88% accuracy in the training cohort and successfully generalized to unseen held-out data, including stimulation-naïve patients, correctly predicting the clinically optimal settings. | fMRI-based brain response patterns serve as an objective, patient-specific biomarker of therapeutic DBS efficacy in Parkinson’s disease. This proof-of-concept demonstrates the feasibility of imaging-guided DBS programming, potentially reducing the number of clinical visits and accelerating the identification of optimal stimulation parameters. |
| Cowan, R. P., et al. [66] | Cross-sectional diagnostic accuracy study at three academic headache centers; 212 adults completed both a self-administered web-based Computer-based Diagnostic Engine (CDE) and a semi-structured telephone interview (SSI) by headache specialists, both strictly applying ICHD-3 criteria. Order of administration was randomized; concordance and accuracy metrics (Cohen’s kappa, sensitivity, specificity, PPV/NPV) were calculated using SSI as the reference standard. | Excellent concordance between CDE and SSI for migraine/probable migraine diagnosis (κ = 0.83, 95% CI 0.75–0.91). CDE showed sensitivity 90.1% (95% CI 83.6–94.6%), specificity 95.8% (95% CI 88.1–99.1%), PPV 97.0%, and NPV 86.6% at 60% study prevalence. At a population prevalence of 10%, PPV was 70.3% and NPV 98.9%. | A fully automated, self-administered online diagnostic tool using ICHD-3 logic achieves near-specialist-level accuracy in migraine diagnosis. It reliably rules in migraine (high specificity) and rules out migraine (high sensitivity), offering a scalable, valid solution to reduce diagnostic delay and the need for specialist interviews in both clinical and research settings. |
| Gorenshtein, A., et al. [67] | Prospective single-center randomized controlled trial; 200 patients referred for electrodiagnostic (EDX) studies were 1:1 randomized to physician-only interpretation (control) or physician + AI-assisted interpretation using the multi-agent INSPIRE framework (intervention). Three board-certified physicians rotated across arms. In the intervention arm, physicians reviewed and integrated an AI-generated preliminary report. Primary outcome: report quality measured by AIGERS score (0–1). Secondary outcomes: physician-rated AI integration (PAIR) and usability survey. | AI-generated preliminary reports showed only moderate consistency. Final integrated physician-AI reports did not significantly outperform physician-only reports on AIGERS (p > 0.05). Physicians rated trust in AI suggestions moderately (3.7/5) but scored efficiency (2.0/5), ease of use (1.7/5), and workload reduction (1.7/5) poorly, citing interpretability issues and workflow disruption. | In real-world clinical use, the tested AI-assisted multi-agent framework (INSPIRE) did not improve EDX report quality over expert physician interpretation alone and was perceived as cumbersome. Current limitations in usability and workflow integration highlight that AI tools for complex interpretive tasks like EDX require substantial improvement before they can deliver meaningful clinical benefit. |
| Davidovic, V., et al. [68] | Cross-sectional cohort follow-up of a randomized controlled trial (NCT06273579) at McGill University Neurosurgical Simulation Centre; 87 medical students were block-randomized to three training conditions on the NeuroVR virtual reality simulator: pure AI-tutor feedback, scripted human instruction, or AI-augmented personalized human instruction (human instructor received real-time AI-detected error data). Participants performed repeated simulated tumor resections. Primary measure: feedback frequency (as proxy for errors); secondary: objective performance metrics (healthy tissue removal, instrument separation, aspirator force). | By the third repetition, the AI-augmented personalized instruction group required significantly fewer total feedback instances (IRR 1.50, 95% CI 1.16–1.94, p < 0.001) and high-force aspirator corrections (IRR 1.71, 95% CI 1.15–2.55, p = 0.002) than in earlier repetitions. Compared to pure AI-tutor instruction, AI-augmented human instruction achieved significantly less healthy tissue removal (p = 0.01), smaller instrument tip separation (mean ratio 1.25, p = 0.008), and lower aspirator force (mean ratio 1.68, p < 0.001), with sustained improvement across all metrics from baseline. | Real-time AI error detection fed to a human instructor (AI-augmented personalized instruction) markedly reduces feedback frequency (indicating fewer errors) and produces superior technical skill acquisition compared to either AI-only or traditional scripted human teaching. This hybrid approach leverages the strengths of both AI precision and human pedagogical adaptability, establishing a new benchmark for effective VR-based surgical training. |
| Hassan, A. E., et al. [69] | Observational before-and-after study within the ongoing EMBOLISE trial (NCT04402632) at a single large comprehensive stroke center. Pre-AI period: 153 days (5 May–6 October 2021); post-AI period: 316 days after activation of Viz RECRUIT SDH (6 October 2021–18 August 2022). The AI platform automatically analyzed all non-contrast head CTs, flagged suspected subacute/chronic SDH, and calculated volume, thickness, and midline shift. All AI alerts were manually reviewed to assess positive predictive value (PPV) and enrollment impact. | Pre-AI: 5 patients enrolled (0.99/month), 1 screen failure. Post-AI: 14 patients enrolled (1.35/month), representing a 36% increase in enrollment rate and zero screen failures. Of 6244 processed CTs, 207 SDH detections (3% prevalence); PPV of the algorithm was 81.4% (95% CI 75.3–86.7%). Median response time to alerts: 50% viewed within 1 h, 35% within 10 min. | Real-time AI-based automated screening of routine head CTs (Viz RECRUIT SDH) significantly accelerated patient identification and enrollment in a randomized trial for chronic/subacute SDH by 36%, eliminated screen failures, and demonstrated high real-world performance (PPV > 80%) and rapid clinical response. This validates AI-driven mobile platforms as powerful tools to improve recruitment efficiency in time-sensitive neurosurgical trials. |
| Macea, J., et al. [70] | Prospective observational study within the SeizeIT2 trial (NCT04284072); 223 in-hospital overnight recordings from 50 epilepsy patients were simultaneously captured with full polysomnography (including standard EEG) and a wearable EEG + accelerometry device. A single deep-learning model performed automated 30 s epoch sleep staging on both modalities. Automated scoring was compared against consensus clinical expert scoring on 20 nights (one per patient) using accuracy, Cohen’s kappa, F1-scores, and Bland–Altman analysis. Mixed-effects models compared sleep macrostructure between patients with and without in-hospital seizures. | Automated staging showed moderate agreement with expert scoring on standard EEG (accuracy 0.73, κ = 0.59) and lower agreement on wearable data (accuracy 0.61, κ = 0.43). Sensitivity was poor for N1 across both modalities; wearable-based staging systematically underestimated total sleep time and most stages except N2. Patients with seizures slept significantly longer (6.37 h, 95% CI 5.86–7.87) than those without (5.68 h, 95% CI 5.24–6.13; p = 0.001) and spent more time in N2. | Wearable-based deep learning can perform automated sleep staging in epilepsy patients with moderate accuracy, sufficient to detect clinically relevant differences (longer sleep and more N2 in patients with seizures). However, current performance (especially poor N1 detection and systematic underestimation by the wearable) requires further model improvement before reliable clinical or research use in epilepsy monitoring. |
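The Cowan et al. row above illustrates a key point for any AI triage tool: sensitivity and specificity are properties of the test, but PPV and NPV shift with pre-test prevalence. The following sketch applies Bayes' rule to the reported sensitivity/specificity and approximately reproduces the prevalence-adjusted figures (small differences arise from rounding the published inputs; the helper function itself is ours, not from the study):

```python
# Prevalence-adjusted predictive values via Bayes' rule: the same test
# yields very different PPV/NPV at in-study (60%) vs. population (10%)
# migraine prevalence, as reported for the CDE study.

def predictive_values(sens, spec, prev):
    """Return (PPV, NPV) for a test with the given sensitivity and
    specificity applied at a given disease prevalence."""
    tp = sens * prev              # true positives per unit population
    fp = (1 - spec) * (1 - prev)  # false positives
    fn = (1 - sens) * prev        # false negatives
    tn = spec * (1 - prev)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)

sens, spec = 0.901, 0.958  # CDE vs. SSI reference standard

for prev in (0.60, 0.10):  # in-study vs. general-population prevalence
    ppv, npv = predictive_values(sens, spec, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Run at 60% prevalence this recovers roughly the reported PPV 97.0% and NPV 86.6%; at 10% prevalence PPV drops to about 70% while NPV rises to about 99%, matching the pattern the authors emphasize.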
| Medical Area | AI Methods | Purposes | Status |
|---|---|---|---|
| Radiology | Deep Learning (DL), Convolutional Neural Networks (CNN), Diffusion Models, Machine Learning (ML), Computer Vision, Large Multimodal Models (LMMs) | Image analysis, diagnosis (e.g., cancer detection in MRI/CT), report generation, anomaly detection, image enhancement, segmentation, triaging acute diseases (e.g., stroke, pneumothorax). | Established with emerging generative AI for synthetic imaging and bias mitigation [106,107,108]. |
| Pathology | DL, CNN, ML, Neural Networks, Virtual Staining | Histopathology classification, mutation prediction, diagnosis (e.g., tumor subclassification), prognosis prediction, biomarker analysis. | Established, with emerging applications in precision pathology and immunotherapy outcome prediction [109,110,111]. |
| Ophthalmology | DL, ML, CNN | Diabetic retinopathy screening, macular edema detection, glaucoma staging, refractive error detection. | Emerging, fastest-growing field with high publication growth [112,113,114]. |
| Cardiology | ML, DL, LMMs, CNN | ECG interpretation, arrhythmia detection, cardiovascular risk prediction, echocardiography analysis. | Established, with emerging multimodal analysis for improved accuracy [115,116,117]. |
| Oncology | DL, ML, CNN, Generative AI | Cancer detection/subtyping, treatment planning, prognosis, drug response prediction, precision therapy. | Established, emerging in genomic subtyping and AI for chemotherapy dose optimization [118,119,120]. |
| Neurology | DL, LMMs, Transformer-based Models, ML | Lesion detection, brain condition distinction, stroke triage, neurosurgery outcome prediction. | Emerging, with multimodal integration for diagnostics [121,122,123]. |
| Gastroenterology | CNN, Support Vector Machines (SVM), ML, DL | Neoplasia detection, polyp identification, inflammatory bowel disease prediction, endoscopy assistance. | Established, emerging integration with genomic data [124,125,126]. |
| Dermatology | CNN, DL, ML | Skin lesion classification, melanoma detection, triage of lesions. | Established, addressing health disparities through data mining [127,128,129]. |
| Mental Health/Psychiatry | NLP, LLMs, ML, Chatbots | Therapy delivery, psychosis prediction, mental health monitoring, patient engagement. | Emerging, with AI chatbots for accessible care and relapse prediction [130,131,132]. |
| Drug Discovery & Development | ML, DL, NLP, Generative AI | Drug repurposing, target identification, toxicity prediction, vaccine design, clinical trial management. | Emerging, accelerating personalized medicine and synthetic data generation [133,134,135]. |
| Orthopedics | ML, CNN, Diffusion Models | Fracture detection, osteoarthritis prediction, surgical outcome prediction, gait analysis. | Emerging, patient-specific biomechanical testing [136,137,138]. |
| Emergency Medicine | ML, CNN, DL | Patient triage, risk stratification, hyperglycemic crises prediction, decision support. | Established, faster triage for acute conditions [139,140,141]. |
| Preventive Medicine | ML, DL, Predictive Analytics | Risk assessment, disease progression prediction, population health management. | Emerging, second-fastest-growing field with wearable integration [142,143,144]. |
| Anesthesiology | ML, DL | Depth of anesthesia monitoring, pain identification, intraoperative support. | Established, with emerging decision support techniques [145,146,147]. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Łoś, A.; Bartusik-Aebisher, D.; Mytych, W.; Aebisher, D. Applications of Artificial Intelligence in Selected Internal Medicine Specialties: A Critical Narrative Review of the Latest Clinical Evidence. Algorithms 2026, 19, 54. https://doi.org/10.3390/a19010054