Search Results (4,459)

Search Parameters:
Keywords = gradient boosting model

18 pages, 3170 KB  
Article
Grey Wolf Optimization-Optimized Ensemble Models for Predicting the Uniaxial Compressive Strength of Rocks
by Xigui Zheng, Arzoo Batool, Santosh Kumar and Niaz Muhammad Shahani
Appl. Sci. 2026, 16(2), 1130; https://doi.org/10.3390/app16021130 (registering DOI) - 22 Jan 2026
Abstract
Reliable models for predicting the uniaxial compressive strength (UCS) of rocks are crucial for mining operations and rock engineering design. Empirical and statistical methods often generalize poorly across a wide range of lithological types. To address this limitation, this study investigates the capability of grey wolf optimization (GWO)-optimized ensemble machine learning models, including decision tree (DT), extreme gradient boosting (XGBoost), and adaptive boosting (AdaBoost), for predicting UCS using a small dataset of easily measurable and non-destructive rock index properties. The study’s objective is to evaluate whether metaheuristic-based hyperparameter optimization can enhance model robustness and generalization performance under small-sample conditions. A unified experimental framework incorporating GWO-based optimization, three-fold cross-validation, sensitivity analysis, and multiple statistical performance indicators was implemented. The findings confirm that although the GWO-XGBoost model achieves the highest training accuracy, it exhibits signs of mild overfitting. In contrast, the GWO-AdaBoost model showed a significant improvement, with a coefficient of determination (R2) of 0.993, a root mean square error (RMSE) of 2.2830, a mean absolute error (MAE) of 1.6853, and a mean absolute percentage error (MAPE) of 4.6974. GWO-AdaBoost therefore proved the most effective model for UCS prediction, and its effectively learned parameters give it significant potential for adaptation. From a theoretical perspective, this study highlights the non-equivalence between training accuracy and predictive reliability in UCS modeling. Practically, the findings support the use of GWO-AdaBoost as a reliable decision-support tool for preliminary rock strength assessment in mining and geotechnical engineering, particularly when comprehensive laboratory testing is not feasible. Full article
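For readers who want to reproduce the general idea, the sketch below shows one way to wrap an AdaBoost regressor in a grey wolf optimizer with three-fold cross-validation, as the abstract describes. The search bounds, population size, and iteration count are illustrative assumptions, not the paper's settings.

```python
# Minimal grey wolf optimizer (GWO) tuning an AdaBoost regressor with 3-fold CV.
# Bounds, population size, and iteration count are illustrative assumptions.
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
bounds = np.array([[10.0, 500.0],   # n_estimators
                   [0.01, 1.0]])    # learning_rate

def cv_rmse(pos, X, y):
    """Cross-validated RMSE for one candidate hyperparameter position (lower is better)."""
    model = AdaBoostRegressor(n_estimators=int(pos[0]), learning_rate=pos[1], random_state=0)
    scores = cross_val_score(model, X, y, cv=3, scoring="neg_root_mean_squared_error")
    return -scores.mean()

def gwo_search(X, y, n_wolves=8, n_iter=20):
    dim = len(bounds)
    wolves = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_wolves, dim))
    costs = np.array([cv_rmse(w, X, y) for w in wolves])
    for t in range(n_iter):
        leaders = wolves[np.argsort(costs)[:3]]      # alpha, beta, delta wolves
        a = 2.0 - 2.0 * t / n_iter                   # coefficient decreasing from 2 to 0
        for i in range(n_wolves):
            new_pos = np.zeros(dim)
            for leader in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                new_pos += (leader - A * np.abs(C * leader - wolves[i])) / 3.0
            wolves[i] = np.clip(new_pos, bounds[:, 0], bounds[:, 1])
            costs[i] = cv_rmse(wolves[i], X, y)
    best = wolves[np.argmin(costs)]
    return {"n_estimators": int(best[0]), "learning_rate": float(best[1])}, float(costs.min())
```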
27 pages, 5594 KB  
Article
Conditional Tabular Generative Adversarial Network Based Clinical Data Augmentation for Enhanced Predictive Modeling in Chronic Kidney Disease Diagnosis
by Princy Randhawa, Veerendra Nath Jasthi, Kumar Piyush, Gireesh Kumar Kaushik, Malathy Batamulay, S. N. Prasad, Manish Rawat, Kiran Veernapu and Nithesh Naik
BioMedInformatics 2026, 6(1), 6; https://doi.org/10.3390/biomedinformatics6010006 (registering DOI) - 22 Jan 2026
Abstract
The lack of clinical data for chronic kidney disease (CKD) prediction frequently results in model overfitting and inadequate generalization to novel samples. This research mitigates this constraint by utilizing a Conditional Tabular Generative Adversarial Network (CTGAN) to enhance a constrained CKD dataset sourced from the University of California, Irvine (UCI) Machine Learning Repository. The CTGAN model was trained to produce realistic synthetic samples that preserve the statistical and feature distributions of the original dataset. Multiple machine learning models, such as AdaBoost, Random Forest, Gradient Boosting, and K-Nearest Neighbors (KNN), were assessed on both the original and enhanced datasets with incrementally increasing degrees of synthetic data dilution. AdaBoost attained 100% accuracy on the original dataset, signifying considerable overfitting; however, the model exhibited enhanced generalization and stability with the CTGAN-augmented data. The occurrence of 100% test accuracy in several models should not be interpreted as realistic clinical performance. Instead, it reflects the limited size, clean structure, and highly separable feature distributions of the UCI CKD dataset. Similar behavior has been reported in multiple previous studies using this dataset. Such perfect accuracy is a strong indication of overfitting and limited generalizability, rather than feature or label leakage. This observation directly motivates the need for controlled data augmentation to introduce variability and improve model robustness. The dataset with the greatest dilution, comprising 2000 synthetic cases, attained a test accuracy of 95.27% utilizing a stochastic gradient boosting approach. Ensemble learning techniques, particularly gradient boosting and random forest, consistently surpassed conventional models such as KNN in predictive accuracy and resilience. The results demonstrate that CTGAN-based data augmentation introduces critical variability, diminishes model bias, and serves as an effective regularization technique. This method provides a viable alternative for reducing overfitting and improving predictive modeling accuracy in data-deficient medical fields, such as chronic kidney disease diagnosis. Full article
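A minimal sketch of the augmentation-then-evaluation loop described above, assuming the open-source `ctgan` package and a cleaned, imputed CKD table; the file name, column names, and epoch count are placeholders rather than the study's configuration.

```python
# Sketch of CTGAN-based tabular augmentation followed by a boosted-tree evaluation.
import pandas as pd
from ctgan import CTGAN
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

real = pd.read_csv("ckd_clean.csv")                    # hypothetical cleaned UCI CKD table
discrete_cols = ["rbc", "htn", "dm", "class"]          # assumed categorical columns

synth = CTGAN(epochs=300)
synth.fit(real, discrete_cols)
fake = synth.sample(2000)                              # synthetic cases used for "dilution"

augmented = pd.concat([real, fake], ignore_index=True)
X = pd.get_dummies(augmented.drop(columns="class"))    # one-hot encode remaining categoricals
y = augmented["class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("test accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```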
20 pages, 1962 KB  
Article
Machine Learning-Based Prediction and Feature Attribution Analysis of Contrast-Associated Acute Kidney Injury in Patients with Acute Myocardial Infarction
by Neriman Sıla Koç, Can Ozan Ulusoy, Berrak Itır Aylı, Yusuf Bozkurt Şahin, Veysel Ozan Tanık, Arzu Akgül and Ekrem Kara
Medicina 2026, 62(1), 228; https://doi.org/10.3390/medicina62010228 (registering DOI) - 22 Jan 2026
Abstract
Background and Objectives: Contrast-associated acute kidney injury (CA-AKI) is a frequent and clinically significant complication in patients with acute myocardial infarction (AMI) undergoing coronary angiography. Early and accurate risk stratification remains challenging with conventional models that rely on linear assumptions and limited variable integration. This study aimed to evaluate and compare the predictive performance of multiple machine learning (ML) algorithms with traditional logistic regression and the Mehran risk score for CA-AKI prediction and to explore key determinants of risk using explainable artificial intelligence methods. Materials and Methods: This retrospective, single-center study included 1741 patients with AMI who underwent coronary angiography. CA-AKI was defined according to KDIGO criteria. Multiple ML models, including gradient boosting machine (GBM), random forest (RF), XGBoost, support vector machine, elastic net, and standard logistic regression were developed using routinely available clinical and laboratory variables. A weighted ensemble model combining the best-performing algorithms was constructed. Model discrimination was assessed using area under the receiver operating characteristic curve (AUC), along with sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Model interpretability was evaluated using feature importance and SHapley Additive exPlanations (SHAP). Results: CA-AKI occurred in 356 patients (20.4%). In multivariable logistic regression, lower left ventricular ejection fraction, higher contrast volume, lower sodium, lower hemoglobin, and higher neutrophil-to-lymphocyte ratio (NLR) were independently associated with CA-AKI. Among ML approaches, the weighted ensemble model demonstrated the highest discriminative performance (AUC 0.721), outperforming logistic regression and the Mehran risk score (AUC 0.608). Importantly, the ensemble model achieved a consistently high NPV (0.942), enabling reliable identification of low-risk patients. Explainability analyses revealed that inflammatory markers, particularly NLR, along with sodium, uric acid, baseline renal indices, and contrast burden, were the most influential predictors across models. Conclusions: In patients with AMI undergoing coronary angiography, interpretable ML models, especially ensemble and gradient boosting-based approaches, provide superior risk stratification for CA-AKI compared with conventional methods. The high negative predictive value highlights their clinical utility in safely identifying low-risk patients and supporting individualized, risk-adapted preventive strategies. Full article
(This article belongs to the Section Urology & Nephrology)
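As a rough illustration of the weighted-ensemble-plus-SHAP workflow described above (not the authors' code), the sketch below blends predicted probabilities from three tree-based learners and inspects the boosted-tree member with SHAP; the ensemble weights and model settings are assumptions.

```python
# Sketch: weighted probability ensemble plus SHAP attribution (weights and settings assumed).
import numpy as np
import shap
import xgboost as xgb
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score

def weighted_ensemble_auc(X_train, y_train, X_test, y_test, weights=(0.4, 0.3, 0.3)):
    models = [
        GradientBoostingClassifier(random_state=0),
        RandomForestClassifier(n_estimators=500, random_state=0),
        xgb.XGBClassifier(eval_metric="logloss", random_state=0),
    ]
    probs = []
    for m in models:
        m.fit(X_train, y_train)
        probs.append(m.predict_proba(X_test)[:, 1])
    blended = np.average(np.column_stack(probs), axis=1, weights=weights)
    # SHAP values for the boosted-tree member, to inspect drivers such as NLR or sodium.
    shap_values = shap.TreeExplainer(models[2]).shap_values(X_test)
    return roc_auc_score(y_test, blended), shap_values
```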
16 pages, 1569 KB  
Article
Honey Botanical Origin Authentication Using HS-SPME-GC-MS Volatile Profiling and Advanced Machine Learning Models (Random Forest, XGBoost, and Neural Network)
by Amir Pourmoradian, Mohsen Barzegar, Ángel A. Carbonell-Barrachina and Luis Noguera-Artiaga
Foods 2026, 15(2), 389; https://doi.org/10.3390/foods15020389 (registering DOI) - 21 Jan 2026
Abstract
This study develops a comprehensive workflow integrating Headspace Solid-Phase Microextraction Gas Chromatography–Mass Spectrometry (HS-SPME-GC-MS) with advanced supervised machine learning to authenticate the botanical origin of honeys from five distinct floral sources—coriander, orange blossom, astragalus, rosemary, and chehelgiah. While HS-SPME-GC-MS combined with traditional chemometrics (e.g., PCA, LDA, OPLS-DA) is well-established for honey discrimination, the application and direct comparison of Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Neural Network (NN) models represent a significant advancement in multiclass prediction accuracy and model robustness. A total of 57 honey samples were analyzed to generate detailed volatile organic compound (VOC) profiles. Key chemotaxonomic markers were identified: anethole in coriander and chehelgiah, thymoquinone in astragalus, p-menth-8-en-1-ol in orange blossom, and dill ester (3,6-dimethyl-2,3,3a,4,5,7a-hexahydrobenzofuran) in rosemary. Principal component analysis (PCA) revealed clear separation across botanical classes (PC1: 49.8%; PC2: 22.6%). Three classification models—RF, XGBoost, and NN—were trained on standardized, stratified data. The NN model achieved the highest accuracy (90.32%), followed by XGBoost (86.69%) and RF (83.47%), with superior per-class F1-scores and near-perfect specificity (>0.95). Confusion matrices confirmed minimal misclassification, particularly in the NN model. This work establishes HS-SPME-GC-MS coupled with deep learning as a rapid, sensitive, and reliable tool for multiclass honey botanical authentication, offering strong potential for real-time quality control, fraud detection, and premium market certification. Full article
(This article belongs to the Section Food Quality and Safety)
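A compact sketch of the model comparison the abstract reports: a stratified split, standardization for the neural network, and per-class metrics for RF, XGBoost, and an MLP. The feature matrix (VOC peak areas), hidden-layer sizes, and split ratio are assumptions.

```python
# Sketch: stratified comparison of RF, XGBoost, and a neural network on VOC profiles.
import xgboost as xgb
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def compare_voc_models(X, botanical_origin):
    y = LabelEncoder().fit_transform(botanical_origin)   # five floral classes -> 0..4
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)
    models = {
        "RF": RandomForestClassifier(n_estimators=500, random_state=0),
        "XGBoost": xgb.XGBClassifier(eval_metric="mlogloss", random_state=0),
        "NN": make_pipeline(StandardScaler(),
                            MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        print(name)
        print(classification_report(y_te, model.predict(X_te), digits=3))
```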
31 pages, 1700 KB  
Review
Prospective of Colorectal Cancer Screening, Diagnosis, and Treatment Management Using Bowel Sounds Leveraging Artificial Intelligence
by Divyanshi Sood, Surbhi Dadwal, Samiksha Jain, Iqra Jabeen Mazhar, Bipasha Goyal, Chris Garapati, Sagar Patel, Zenab Muhammad Riaz, Noor Buzaboon, Ayushi Mendiratta, Avneet Kaur, Anmol Mohan, Gayathri Yerrapragada, Poonguzhali Elangovan, Mohammed Naveed Shariff, Thangeswaran Natarajan, Jayarajasekaran Janarthanan, Shreshta Agarwal, Sancia Mary Jerold Wilson, Atishya Ghosh, Shiva Sankari Karuppiah, Joshika Agarwal, Keerthy Gopalakrishnan, Swetha Rapolu, Venkata S. Akshintala and Shivaram P. Arunachalam
Cancers 2026, 18(2), 340; https://doi.org/10.3390/cancers18020340 - 21 Jan 2026
Abstract
Background: Colorectal cancer (CRC) is the second leading cause of cancer-related mortality worldwide, accounting for approximately 10% of all cancer cases. Despite the proven effectiveness of conventional screening modalities such as colonoscopy and fecal immunochemical testing (FIT), their invasive nature, high cost, and limited patient compliance hinder widespread adoption. Recent advancements in artificial intelligence (AI) and bowel sound-based signal processing have enabled non-invasive approaches for gastrointestinal diagnostics. Among these, bowel sound analysis—historically considered subjective—has reemerged as a promising biomarker using digital auscultation and machine learning. Objective: This review explores the potential of AI-powered bowel sound analytics for early detection, screening, and characterization of colorectal cancer. It aims to assess current methodologies, summarize reported performance metrics, and highlight translational opportunities and challenges in clinical implementation. Methods: A narrative review was conducted across PubMed, Scopus, Embase, and Cochrane databases using the terms colorectal cancer, bowel sounds, phonoenterography, artificial intelligence, and non-invasive diagnosis. Eligible studies involving human bowel sound-based recordings, AI-based sound analysis, or machine learning applications in gastrointestinal pathology were reviewed for study design, signal acquisition methods, AI model architecture, and diagnostic accuracy. Results: Across studies using convolutional neural networks (CNNs), gradient boosting, and transformer-based models, reported diagnostic accuracies ranged from 88% to 96%. Area under the curve (AUC) values were ≥0.83, with F1 scores between 0.71 and 0.85 for bowel sound classification. In CRC-specific frameworks such as BowelRCNN, AI models successfully differentiate abnormal bowel sound intervals and spectral patterns associated with tumor-related motility disturbances and partial obstruction. Distinct bowel sound-based signatures—such as prolonged sound-to-sound intervals and high-pitched “tinkling” proximal to lesions—demonstrate the physiological basis for CRC detection through bowel sound-based biomarkers. Conclusions: AI-driven bowel sound analysis represents an emerging, exploratory research direction rather than a validated colorectal cancer screening modality. While early studies demonstrate physiological plausibility and technical feasibility, no large-scale, CRC-specific validation studies currently establish sensitivity, specificity, PPV, or NPV for cancer detection. Accordingly, bowel sound analytics should be viewed as hypothesis-generating and potentially complementary to established screening tools, rather than a near-term alternative to validated modalities such as FIT, multitarget stool DNA testing, or colonoscopy. Full article
(This article belongs to the Section Methods and Technologies Development)
19 pages, 2181 KB  
Article
Gut Microbiota and Type 2 Diabetes: Genetic Associations, Biological Mechanisms, Drug Repurposing, and Diagnostic Modeling
by Xinqi Jin, Xuanyi Chen, Heshan Chen and Xiaojuan Hong
Int. J. Mol. Sci. 2026, 27(2), 1070; https://doi.org/10.3390/ijms27021070 - 21 Jan 2026
Abstract
Gut microbiota is a potential therapeutic target for type 2 diabetes (T2D), but its role remains unclear. Investigating causal associations between them could further our understanding of their biological and clinical significance. A two-sample Mendelian randomization (MR) analysis was conducted to assess the causal relationship between gut microbiota and T2D. Key genes and mechanisms were identified through the integration of Genome-Wide Association Studies (GWAS) and cis-expression quantitative trait loci (cis-eQTL) data. Network pharmacology was applied to identify potential drugs and targets. Additionally, gut microbiota community analysis and machine learning models were used to construct a diagnostic model for T2D. MR analysis identified 17 gut microbiota taxa associated with T2D, with three showing significant associations: Actinomyces (odds ratio [OR] = 1.106; 95% confidence interval [CI]: 1.06–1.15; p < 0.01; adjusted p-value [padj] = 0.0003), Ruminococcaceae (UCG010 group) (OR = 0.897; 95% CI: 0.85–0.95; p < 0.01; padj = 0.018), and Deltaproteobacteria (OR = 1.072; 95% CI: 1.03–1.12; p < 0.01; padj = 0.029). Ten key genes, such as EXOC4 and IGF1R, were linked to T2D risk. Network pharmacology identified INSR and ESR1 as target driver genes, with drugs like Dienestrol showing promise. Gut microbiota analysis revealed reduced α-diversity in T2D patients (p < 0.05), and β-diversity showed microbial community differences (R2 = 0.012, p = 0.001). Furthermore, molecular docking confirmed the binding affinity of potential therapeutic agents to their targets. Finally, we developed a class-weight optimized Extreme Gradient Boosting (XGBoost) diagnostic model, which achieved an area under the curve (AUC) of 0.84 with balanced sensitivity (95.1%) and specificity (83.8%). Integrating machine learning predictions with MR causal inference highlighted Bacteroides as a key biomarker. Our findings elucidate the gut microbiota-T2D causal axis, identify therapeutic targets, and provide a robust tool for precision diagnosis. Full article
(This article belongs to the Special Issue Type 2 Diabetes: Molecular Pathophysiology and Treatment)
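The class-weight-optimized XGBoost diagnostic model could be approximated as below, using `scale_pos_weight` to counter class imbalance and reporting AUC, sensitivity, and specificity; the threshold and weighting rule are illustrative, not the paper's tuned values.

```python
# Sketch: class-weight-adjusted XGBoost diagnostic model with AUC, sensitivity, specificity.
import xgboost as xgb
from sklearn.metrics import roc_auc_score, confusion_matrix

def fit_weighted_xgb(X_train, y_train, X_test, y_test, threshold=0.5):
    # Up-weight the positive (T2D) class in proportion to the class imbalance.
    pos_weight = (y_train == 0).sum() / max((y_train == 1).sum(), 1)
    model = xgb.XGBClassifier(scale_pos_weight=pos_weight, eval_metric="logloss", random_state=0)
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    pred = (prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    return {
        "auc": roc_auc_score(y_test, prob),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```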
26 pages, 12256 KB  
Article
High-Precision River Network Mapping Using River Probability Learning and Adaptive Stream Burning
by Yufu Zang, Zhaocai Chu, Zhen Cui, Zhuokai Shi, Qihan Jiang, Yueqian Shen and Jue Ding
Remote Sens. 2026, 18(2), 362; https://doi.org/10.3390/rs18020362 - 21 Jan 2026
Abstract
Accurate river network mapping is essential for hydrological modeling, flood risk assessment, and watershed environment management. However, conventional methods based on either optical imagery or digital elevation models (DEMs) often suffer from river network discontinuity and poor representation of morphologically complex rivers. To overcome these limitations, this study proposes a novel method integrating a river-oriented Gradient Boosting Tree (RGBT) model with an adaptive stream burning algorithm for high-precision and topologically consistent river network extraction. Water-oriented multispectral indices and multi-scale linear geometric features are first fused and fed into the river-oriented Gradient Boosting Tree model to generate river probability maps. A direction-constrained region growing strategy is then applied to derive spatially coherent river vectors. These vectors are finally integrated into a spatially adaptive stream burning algorithm to construct a conditional DEM for hydrologically coherent river network extraction. We selected eight representative regions with diverse topographical characteristics to evaluate the performance of our method. Quantitative comparisons against reference networks and mainstream hydrographic products demonstrate that the method achieves the highest positional accuracy and network continuity, with errors mainly concentrated within a 0–40 m range. Improvements are most pronounced for narrow tributaries, highly meandering rivers, and braided channels. The experiments demonstrate that the proposed method provides a reliable solution for high-resolution river network mapping in complex environments. Full article
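The river-probability step described above can be schematized as a per-pixel classification: stack water indices and geometric features, train a boosted-tree classifier on labelled pixels, and reshape the predicted probabilities into a map. The sketch below uses scikit-learn's histogram gradient boosting as a stand-in for the paper's RGBT model and omits the region growing and stream burning stages.

```python
# Schematic river-probability step: per-pixel features -> boosted-tree probability map.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

def river_probability_map(feature_stack, labels_mask, train_mask):
    """feature_stack: (H, W, F) array of water indices / geometric features;
    labels_mask: (H, W) binary river labels; train_mask: (H, W) bool of labelled pixels."""
    H, W, F = feature_stack.shape
    X = feature_stack.reshape(-1, F)
    y = labels_mask.reshape(-1)
    m = train_mask.reshape(-1)
    clf = HistGradientBoostingClassifier(random_state=0).fit(X[m], y[m])
    prob = clf.predict_proba(X)[:, 1]
    return prob.reshape(H, W)            # river probability surface for later region growing
```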
22 pages, 1714 KB  
Article
Integrating Machine-Learning Methods with Importance–Performance Maps to Evaluate Drivers for the Acceptance of New Vaccines: Application to AstraZeneca COVID-19 Vaccine
by Jorge de Andrés-Sánchez, Mar Souto-Romero and Mario Arias-Oliva
AI 2026, 7(1), 34; https://doi.org/10.3390/ai7010034 - 21 Jan 2026
Abstract
Background: The acceptance of new vaccines under uncertainty—such as during the COVID-19 pandemic—poses a major public health challenge because efficacy and safety information is still evolving. Methods: We propose an integrative analytical framework that combines a theory-based model of vaccine acceptance—the cognitive–affective–normative (CAN) model—with machine-learning techniques (decision tree regression, random forest, and Extreme Gradient Boosting) and SHapley Additive exPlanations (SHAP) integrated into an importance–performance map (IPM) to prioritize determinants of vaccination intention. Using survey data collected in Spain in September 2020 (N = 600), when the AstraZeneca vaccine had not yet been approved, we examine the roles of perceived efficacy (EF), fear of COVID-19 (FC), fear of the vaccine (FV), and social influence (SI). Results: EF and SI consistently emerged as the most influential determinants across modelling approaches. Ensemble learners (random forest and Extreme Gradient Boosting) achieved stronger out-of-sample predictive performance than the single decision tree, while decision tree regression provided an interpretable, rule-based representation of the main decision pathways. Exploiting the local nature of SHAP values, we also constructed SHAP-based IPMs for the full sample and for the low-acceptance segment, enhancing the policy relevance of the prioritization exercise. Conclusions: By combining theory-driven structural modelling with predictive and explainable machine learning, the proposed framework offers a transparent and replicable tool to support the design of vaccination communication strategies and can be transferred to other settings involving emerging health technologies. Full article
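A minimal sketch of a SHAP-based importance-performance map as described above: importance is the mean absolute SHAP value of each construct from a boosted-tree regressor, and performance is the construct's mean observed score rescaled to 0-100. The 7-point scale and plotting choices are assumptions.

```python
# Sketch of a SHAP-based importance-performance map (IPM); scaling choices are illustrative.
import numpy as np
import pandas as pd
import shap
import xgboost as xgb
import matplotlib.pyplot as plt

def shap_ipm(X: pd.DataFrame, y, max_score=7.0):
    model = xgb.XGBRegressor(random_state=0).fit(X, y)
    shap_values = shap.TreeExplainer(model).shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)               # mean |SHAP| per construct
    performance = 100 * X.mean(axis=0).to_numpy() / max_score   # observed construct level, 0-100
    fig, ax = plt.subplots()
    ax.scatter(importance, performance)
    for name, xi, yi in zip(X.columns, importance, performance):
        ax.annotate(name, (xi, yi))                             # e.g. EF, FC, FV, SI
    ax.set_xlabel("Importance (mean |SHAP|)")
    ax.set_ylabel("Performance (% of scale maximum)")
    return fig
```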
24 pages, 6765 KB  
Article
Optimizing Reference Evapotranspiration Estimation in Data-Scarce Regions Using ERA5 Reanalysis and Machine Learning
by Emre Tunca, Václav Novák, Petr Šařec and Eyüp Selim Köksal
Agronomy 2026, 16(2), 253; https://doi.org/10.3390/agronomy16020253 - 21 Jan 2026
Abstract
This study aims to optimize the estimation of reference evapotranspiration (ETo) in data-scarce regions by integrating ERA5-Land reanalysis data with machine learning (ML) models. Daily meteorological data from 33 stations across Turkey’s diverse climate zones (1981–2010) were utilized to train and validate three ML models: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Extreme Learning Machine (ELM). The methodology involved rigorous quality control of ground-based observations, spatial correlation of ERA5-Land grids to station locations, and performance evaluation under various data-limited scenarios. Results indicate that while ERA5-Land provides highly accurate solar radiation (Rs) and temperature (T) data, variables like wind speed (U2) and relative humidity (RH) exhibit systematic biases. Among the models tested, XGBoost demonstrated superior performance (R2 = 0.95, RMSE = 0.43 mm day−1, and MAE = 0.30 mm day−1) and computational efficiency. This study provides a robust, regionally calibrated framework that corrects reanalysis biases using ML, offering a reliable alternative for ETo estimation in areas where local measurements are insufficient for sustainable water management. Full article
(This article belongs to the Section Precision and Digital Agriculture)
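A bare-bones version of the ETo regression step, assuming a table of ERA5-Land predictors and Penman-Monteith ETo targets; the split ratio and XGBoost settings are placeholders rather than the study's calibrated configuration.

```python
# Sketch: XGBoost regression of daily ETo from ERA5-Land predictors, with the reported metrics.
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

def evaluate_eto_model(X, y):      # X: ERA5-Land Rs, T, U2, RH, ...; y: Penman-Monteith ETo
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = xgb.XGBRegressor(n_estimators=600, learning_rate=0.05, random_state=0)
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    return {
        "R2": r2_score(y_te, pred),
        "RMSE (mm/day)": np.sqrt(mean_squared_error(y_te, pred)),
        "MAE (mm/day)": mean_absolute_error(y_te, pred),
    }
```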
17 pages, 783 KB  
Article
Hospital-Wide Sepsis Detection: A Machine Learning Model Based on Prospectively Expert-Validated Cohort
by Marcio Borges-Sa, Andres Giglio, Maria Aranda, Antonia Socias, Alberto del Castillo, Cristina Pruenza, Gonzalo Hernández, Sofía Cerdá, Lorenzo Socias, Victor Estrada, Roberto de la Rica, Elisa Martin and Ignacio Martin-Loeches
J. Clin. Med. 2026, 15(2), 855; https://doi.org/10.3390/jcm15020855 - 21 Jan 2026
Abstract
Background/Objectives: Sepsis detection remains challenging due to clinical heterogeneity and limitations of traditional scoring systems. This study developed and validated a hospital-wide machine learning model for sepsis detection using retrospectively developed data from prospectively expert-validated cases, aiming to improve diagnostic accuracy beyond conventional approaches. Methods: This retrospective cohort study analysed 218,715 hospital episodes (2014–2018) at a tertiary care centre. Sepsis cases (n = 11,864, 5.42%) were prospectively validated in real-time by a Multidisciplinary Sepsis Unit using modified Sepsis-2 criteria with organ dysfunction. The model integrated structured data (26.95%) and unstructured clinical notes (73.04%) extracted via natural language processing from 2829 variables, selecting 230 relevant predictors. Thirty models including random forests, support vector machines, neural networks, and gradient boosting were developed and evaluated. The dataset was randomly split (5/7 training, 2/7 testing) with preserved patient-level independence. Results: The BiAlert Sepsis model (random forest + Sepsis-2 ensemble) achieved an AUC-ROC of 0.95, sensitivity of 0.93, and specificity of 0.84, significantly outperforming traditional approaches. Compared to the best rule-based method (Sepsis-2 + qSOFA, AUC-ROC 0.90), BiAlert reduced false positives by 39.6% (13.10% vs. 21.70%, p < 0.01). Novel predictors included eosinopenia and hypoalbuminemia, while traditional variables (MAP, GCS, platelets) showed minimal univariate association. The model received European Medicines Agency approval as a medical device in June 2024. Conclusions: This hospital-wide machine learning model, trained on prospectively expert-validated cases and integrating extensive NLP-derived features, demonstrates superior sepsis detection performance compared to conventional scoring systems. External validation and prospective clinical impact studies are needed before widespread implementation. Full article
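A schematic of fusing note-derived and structured features for a tree-ensemble sepsis screen, in the spirit of the pipeline described above; it is not the BiAlert Sepsis model, and the vectorizer settings and feature layout are assumptions.

```python
# Schematic: fuse TF-IDF note features with structured variables for a random-forest screen.
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def train_sepsis_screen(notes_train, X_struct_train, y_train, notes_test, X_struct_test, y_test):
    vec = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
    T_train = vec.fit_transform(notes_train)           # unstructured clinical notes
    T_test = vec.transform(notes_test)
    F_train = hstack([T_train, csr_matrix(X_struct_train)])   # append structured variables
    F_test = hstack([T_test, csr_matrix(X_struct_test)])
    clf = RandomForestClassifier(n_estimators=500, class_weight="balanced", random_state=0)
    clf.fit(F_train, y_train)
    return roc_auc_score(y_test, clf.predict_proba(F_test)[:, 1])
```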
23 pages, 9975 KB  
Article
Leveraging LiDAR Data and Machine Learning to Predict Pavement Marking Retroreflectivity
by Hakam Bataineh, Dmitry Manasreh, Munir Nazzal and Ala Abbas
Vehicles 2026, 8(1), 23; https://doi.org/10.3390/vehicles8010023 - 20 Jan 2026
Abstract
This study focused on developing and validating machine learning models to predict pavement marking retroreflectivity using Light Detection and Ranging (LiDAR) intensity data. The retroreflectivity data were collected using a Mobile Retroreflectometer Unit (MRU) due to its increasing acceptance among states as a compliant measurement device. A comprehensive dataset was assembled spanning more than 1000 miles of roadways, capturing diverse marking materials, colors, installation methods, pavement types, and vehicle speeds. The final dataset used for model development focused on dry condition measurements and roadway segments most relevant to state transportation agencies. A detailed synchronization process was implemented to ensure the accurate pairing of retroreflectivity and LiDAR intensity values. Using these data, several machine learning techniques were evaluated, and an ensemble of gradient boosting-based models emerged as the top performer, predicting pavement retroreflectivity with an R2 of 0.94 on previously unseen data. The repeatability of the predicted retroreflectivity was tested and showed consistency similar to that of the MRU. The model’s accuracy was confirmed against independent field segments, demonstrating the potential for LiDAR to serve as a practical, low-cost alternative to MRU measurements in routine roadway inspection and maintenance. The approach presented in this study enhances roadway safety by enabling more frequent, network-level assessments of pavement marking performance at lower cost, allowing agencies to detect and correct visibility problems sooner and helping to prevent nighttime and adverse weather crashes. Full article
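The "ensemble of gradient boosting-based models" could be as simple as averaging several boosted-tree regressors, as sketched below; the member models and equal weighting are assumptions, not the authors' configuration.

```python
# Sketch: average the predictions of several boosted-tree regressors for retroreflectivity.
import numpy as np
import xgboost as xgb
import lightgbm as lgb
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import r2_score

def ensemble_retroreflectivity(X_train, y_train, X_test, y_test):
    models = [
        xgb.XGBRegressor(random_state=0),
        lgb.LGBMRegressor(random_state=0),
        HistGradientBoostingRegressor(random_state=0),
    ]
    preds = [m.fit(X_train, y_train).predict(X_test) for m in models]
    blended = np.mean(np.column_stack(preds), axis=1)   # simple unweighted average
    return r2_score(y_test, blended)
```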
29 pages, 5451 KB  
Article
Machine Learning as a Tool for Sustainable Material Evaluation: Predicting Tensile Strength in Recycled LDPE Films
by Olga Szlachetka, Justyna Dzięcioł, Joanna Witkowska-Dobrev, Mykola Nagirniak, Marek Dohojda and Wojciech Sas
Sustainability 2026, 18(2), 1064; https://doi.org/10.3390/su18021064 - 20 Jan 2026
Abstract
This study contributes to the advancement of circular economy practices in polymer manufacturing by applying machine learning algorithms (MLA) to predict the tensile strength of recycled low-density polyethylene (LDPE) building films. As the construction and packaging industries increasingly seek eco-efficient and low-carbon materials, recycled LDPE offers a valuable route toward sustainable resource management. However, ensuring consistent mechanical performance remains a challenge when reusing polymer waste streams. To address this, tensile tests were conducted on LDPE films produced from recycled granules, measuring tensile strength, strain, mass per unit area, thickness, and surface roughness. Three established machine learning algorithms—feed-forward Neural Network (NN), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost)—were implemented, trained, and optimized on the experimental dataset in R statistical software (version 4.4.3). The models achieved high predictive accuracy, with XGBoost providing the most robust performance and the highest level of explainability. Feature importance analysis revealed that mass per unit area and surface roughness have a significant influence on film durability and performance. These insights enable more efficient production planning, reduced raw material usage, and improved quality control, key pillars of sustainable technological innovation. The integration of data-driven methods into polymer recycling workflows demonstrates the potential of artificial intelligence to accelerate circular economy objectives by enhancing process optimization, material performance, and resource efficiency in the plastics sector. Full article
(This article belongs to the Special Issue Circular Economy and Sustainable Technological Innovation)
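The study's workflow was implemented in R; as a Python stand-in, the sketch below fits a boosted-tree regressor to the measured film properties and extracts gain-based feature importance. The column names are hypothetical.

```python
# Sketch (Python stand-in for the study's R workflow): XGBoost tensile-strength model
# with gain-based feature importance over the measured film properties.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import cross_val_score

def tensile_strength_importance(df: pd.DataFrame):
    features = ["mass_per_unit_area", "thickness", "surface_roughness", "strain"]  # assumed names
    X, y = df[features], df["tensile_strength"]
    model = xgb.XGBRegressor(n_estimators=400, learning_rate=0.05, random_state=0)
    print("CV R2:", cross_val_score(model, X, y, cv=5, scoring="r2").mean())
    model.fit(X, y)
    return model.get_booster().get_score(importance_type="gain")   # which properties dominate
```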
17 pages, 4604 KB  
Article
Machine Learning Predictions of the Flexural Response of Low-Strength Reinforced Concrete Beams with Various Longitudinal Reinforcement Configurations
by Batuhan Cem Öğe, Muhammet Karabulut, Hakan Öztürk and Bulent Tugrul
Buildings 2026, 16(2), 433; https://doi.org/10.3390/buildings16020433 - 20 Jan 2026
Abstract
Few studies have investigated the flexural behavior of existing reinforced concrete (RC) beams with insufficient concrete strength using machine learning methods. This study investigates the flexural response of low-strength concrete (LSC) RC beams reinforced exclusively with steel rebars, focusing on the effectiveness of three different longitudinal reinforcement configurations. Nine beams, each measuring 150 × 200 × 1100 mm and cast with C10-grade low-strength concrete, were divided into three groups according to their reinforcement layout: Group 1 (L2L) with two Ø12 mm rebars, Group 2 (L3L) with three Ø12 mm rebars, and Group 3 (F10L3L) with three Ø10 mm rebars. All specimens were tested under three-point bending to evaluate their load–deflection characteristics and failure mechanisms. The experimental findings were compared with ML approaches. To enhance predictive understanding, several ML regression models were developed and trained using the experimental datasets. Among them, the Light Gradient Boosting, K-Neighbors, and AdaBoost regressors exhibited the best predictive performance, estimating beam deflections with R2 values of 0.89, 0.90, 0.94, 0.74, 0.84, 0.64, 0.70, 0.82, and 0.72. The results highlight that the proposed ML models effectively capture the nonlinear flexural behavior of RC beams and that longitudinal reinforcement configuration plays a significant role in the flexural performance of low-strength concrete beams, providing valuable insights for both design and structural assessment. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)
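A minimal way to reproduce the model comparison named above (Light Gradient Boosting, K-Neighbors, AdaBoost) is cross-validated R2 on a tabular load-deflection dataset, as sketched below; the feature set and fold count are assumptions.

```python
# Sketch: compare the three regressors named in the abstract on load-deflection data.
import lightgbm as lgb
from sklearn.ensemble import AdaBoostRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

def compare_deflection_models(X, y):   # X: load step, rebar diameter/count, group label, ...
    models = {
        "LightGBM": lgb.LGBMRegressor(random_state=0),
        "KNN": KNeighborsRegressor(n_neighbors=5),
        "AdaBoost": AdaBoostRegressor(random_state=0),
    }
    return {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
            for name, m in models.items()}
```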
13 pages, 6367 KB  
Article
Gene Expression-Based Colorectal Cancer Prediction Using Machine Learning and SHAP Analysis
by Yulai Yin, Zhen Yang, Xueqing Li, Shuo Gong and Chen Xu
Genes 2026, 17(1), 114; https://doi.org/10.3390/genes17010114 - 20 Jan 2026
Abstract
Objective: To develop and validate a genetic diagnostic model for colorectal cancer (CRC). Methods: First, differentially expressed genes (DEGs) between colorectal cancer and normal groups were screened using the TCGA database. Subsequently, a two-sample Mendelian randomization analysis was performed using the eQTL genomic data from the IEU OpenGWAS database and colorectal cancer outcomes from the R12 Finnish database to identify associated genes. The intersecting genes from both methods were selected for the development and validation of the CRC genetic diagnostic model using nine machine learning algorithms: Lasso Regression, XGBoost, Gradient Boosting Machine (GBM), Generalized Linear Model (GLM), Neural Network (NN), Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Random Forest (RF), and Decision Tree (DT). Results: A total of 3716 DEGs were identified from the TCGA database, while 121 genes were associated with CRC based on the eQTL Mendelian randomization analysis. The intersection of these two methods yielded 27 genes. Among the nine machine learning methods, XGBoost achieved the highest AUC value of 0.990. The top five genes predicted by the XGBoost method—RIF1, GDPD5, DBNDD1, RCCD1, and CLDN5—along with the five most significantly differentially expressed genes (ASCL2, IFITM3, IFITM1, SMPDL3A, and SUCLG2) in the GSE87211 dataset, were selected for the construction of the final colorectal cancer (CRC) genetic diagnostic model. The ROC curve analysis revealed an AUC (95% CI) of 0.9875 (0.9737–0.9875) for the training set, and 0.9601 (0.9145–0.9601) for the validation set, indicating strong predictive performance of the model. SHAP model interpretation further identified IFITM1 and DBNDD1 as the most influential genes in the XGBoost model, with both making positive contributions to the model’s predictions. Conclusions: The gene expression profile in colorectal cancer is characterized by enhanced cell proliferation, elevated metabolic activity, and immune evasion. A genetic diagnostic model constructed based on ten genes (RIF1, GDPD5, DBNDD1, RCCD1, CLDN5, ASCL2, IFITM3, IFITM1, SMPDL3A, and SUCLG2) demonstrates strong predictive performance. This model holds significant potential for the early diagnosis and intervention of colorectal cancer, contributing to the implementation of third-tier prevention strategies. Full article
(This article belongs to the Section Bioinformatics)
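The panel-construction step could be sketched as: rank genes by XGBoost importance, keep the top k, and refit a compact classifier, as below. The value of k, the split, and the use of built-in importances (rather than the paper's combined DEG/Mendelian-randomization/SHAP selection) are simplifications.

```python
# Sketch: rank genes by XGBoost importance, then refit a compact diagnostic panel (k assumed).
import pandas as pd
import xgboost as xgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def gene_panel_model(expr: pd.DataFrame, labels, k=10):
    """expr: samples x genes expression matrix; labels: 1 = tumour, 0 = normal."""
    X_tr, X_te, y_tr, y_te = train_test_split(expr, labels, stratify=labels, random_state=0)
    full = xgb.XGBClassifier(eval_metric="logloss", random_state=0).fit(X_tr, y_tr)
    ranked = pd.Series(full.feature_importances_, index=expr.columns).sort_values(ascending=False)
    panel = ranked.head(k).index.tolist()                      # e.g. a 10-gene diagnostic panel
    compact = xgb.XGBClassifier(eval_metric="logloss", random_state=0).fit(X_tr[panel], y_tr)
    auc = roc_auc_score(y_te, compact.predict_proba(X_te[panel])[:, 1])
    return panel, auc
```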
24 pages, 10530 KB  
Article
Agri-Fuse Spatiotemporal Fusion Integrated Multi-Model Synergy for High-Precision Cotton Yield Estimation in Arid Regions
by Xianhui Zhong, Jiechen Wang, Jianan Chi, Liang Jiang, Qi Wang, Lin Chang and Tiecheng Bai
Remote Sens. 2026, 18(2), 339; https://doi.org/10.3390/rs18020339 - 20 Jan 2026
Abstract
Accurate cotton yield estimation in arid oasis regions faces challenges from landscape fragmentation and the conflict between monitoring precision and computational costs. To address this, we developed a robust integrated framework combining multi-source remote sensing, spatiotemporal fusion, and data assimilation. To resolve spatiotemporal data gaps, the existing Agricultural Fusion (Agri-Fuse) algorithm was validated and employed to generate high-resolution time-series data, which achieved superior spectral fidelity (Root Mean Square Error, RMSE = 0.041) compared to traditional methods like Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM). Subsequently, high-precision Leaf Area Index (LAI) time series retrieved via the eXtreme Gradient Boosting (XGBoost) algorithm (c = 0.97) were integrated into the Ensemble Kalman Filter (EnKF)-assimilated World Food Studies (WOFOST) model. This approach significantly corrected simulation biases, improving the yield estimation accuracy (R2 = 0.86, RMSE = 171 kg/ha) compared to the open-loop model. Crucially, we systematically evaluated the trade-off between assimilation frequency and efficiency. Findings identified the 3-day fusion interval as the optimal operational strategy, maintaining high accuracy (R2 = 0.83, RMSE = 181 kg/ha) while reducing computational costs by 66.5% compared to daily assimilation. This study establishes a scalable, cost-effective benchmark for precision agriculture in complex arid environments. Full article
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)
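The assimilation step at the core of the framework is a standard ensemble Kalman filter update. The sketch below shows the generic analysis step for a single retrieved LAI observation; it is textbook EnKF algebra, not the paper's WOFOST coupling, and the state-vector layout is assumed.

```python
# Minimal ensemble Kalman filter analysis step for one retrieved LAI observation.
import numpy as np

def enkf_update(state_ens, obs_lai, obs_var, lai_index=0, rng=np.random.default_rng(0)):
    """state_ens: (n_ens, n_state) forecast ensemble; obs_lai: retrieved LAI value;
    obs_var: observation error variance; lai_index: position of LAI in the state vector."""
    n_ens, _ = state_ens.shape
    hx = state_ens[:, lai_index]                             # modelled LAI per ensemble member
    anomalies = state_ens - state_ens.mean(axis=0)
    hx_anom = hx - hx.mean()
    cov_xh = anomalies.T @ hx_anom / (n_ens - 1)             # cross-covariance: state vs. LAI
    var_h = hx_anom @ hx_anom / (n_ens - 1)
    gain = cov_xh / (var_h + obs_var)                        # Kalman gain over the state vector
    perturbed_obs = obs_lai + rng.normal(0.0, np.sqrt(obs_var), size=n_ens)
    return state_ens + np.outer(perturbed_obs - hx, gain)    # analysis (updated) ensemble
```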