Search Results (86)

Search Parameters:
Keywords = stacking predictor

30 pages, 12869 KB  
Article
Integrative Nutritional Assessment of Avocado Leaves Using Entropy-Weighted Spectral Indices and Fusion Learning
by Zhen Guo, Juan Sebastian Estrada, Xingfeng Guo, Redmond R. Shamshiri, Marcelo Pereyra and Fernando Auat Cheein
Computation 2026, 14(2), 33; https://doi.org/10.3390/computation14020033 - 1 Feb 2026
Viewed by 338
Abstract
Accurate and non-destructive assessment of plant nutritional status remains a key challenge in precision agriculture, particularly under dynamic physiological conditions such as dehydration. Therefore, this study focused on developing an integrated nutritional assessment framework for avocado (Persea americana Mill.) leaves across progressive dehydration stages using spectral analysis. A novel nutritional function index (NFI) was constructed using an entropy-weighted multi-criteria decision-making approach. This unified assessment metric integrated critical physiological indicators, such as moisture content, nitrogen content, and chlorophyll content estimated from soil and plant analyzer development (SPAD) readings. To enhance the prediction accuracy and interpretability of NFI, vegetation indices (VIs) specifically tailored to NFI were systematically constructed using exhaustive wavelength-combination screening. Optimal wavelengths identified from short-wave infrared regions (1446, 1455, 1465, 1865, and 1937 nm) were employed to build physiologically meaningful VIs, which were highly sensitive to moisture and biochemical constituents. Feature wavelengths selected via the successive projections algorithm and competitive adaptive reweighted sampling further reduced spectral redundancy and improved modeling efficiency. Both feature-level and algorithm-level data fusion methods effectively combined VIs and selected feature wavelengths, significantly enhancing prediction performance. The stacking algorithm demonstrated robust performance, achieving the highest predictive accuracy (R²_V = 0.986, RMSE_V = 0.032) for NFI estimation. This fusion-based modeling approach outperformed conventional single-model schemes in terms of accuracy and robustness. Unlike previous studies that focused on isolated spectral predictors, this work introduces an integrative framework combining entropy-weighted feature synthesis and multiscale fusion learning.
The developed strategy offers a powerful tool for real-time plant health monitoring and supports precision agricultural decision-making.
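
The entropy-weighted construction of the NFI mentioned in this abstract can be illustrated with the standard entropy weight method. The following is a minimal sketch, not the authors' implementation: the function names, the three indicator columns, and the sample values are hypothetical, and the indicators are assumed to be already scaled to non-negative values with a positive column sum.

```python
import math

def entropy_weights(rows):
    """Entropy weight method: one weight per indicator (column)."""
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        shares = [v / total for v in column]  # each sample's share of the indicator
        # Normalized Shannon entropy in [0, 1]; zero shares contribute nothing.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)  # low entropy -> more discriminating
    norm = sum(divergences)
    return [d / norm for d in divergences]

def composite_index(rows, weights):
    """Weighted sum of the indicators for every sample."""
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]

# Hypothetical scaled indicators per leaf: moisture, nitrogen, SPAD-based chlorophyll.
samples = [
    [0.90, 0.60, 0.80],
    [0.55, 0.58, 0.70],
    [0.20, 0.55, 0.40],
]
weights = entropy_weights(samples)
index = composite_index(samples, weights)
```

An indicator that barely varies across samples (nitrogen in the made-up data) receives a small weight, because it carries little information for ranking the samples.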

32 pages, 2526 KB  
Article
HSE-GNN-CP: Spatiotemporal Teleconnection Modeling and Conformalized Uncertainty Quantification for Global Crop Yield Forecasting
by Salman Mahmood, Raza Hasan and Shakeel Ahmad
Information 2026, 17(2), 141; https://doi.org/10.3390/info17020141 - 1 Feb 2026
Viewed by 290
Abstract
Global food security faces escalating threats from climate variability and resource constraints. Accurate crop yield forecasting is essential; however, existing methods frequently overlook complex spatial dependencies driven by climate teleconnections, such as the ENSO, and lack rigorous uncertainty quantification. This paper presents HSE-GNN-CP, a novel framework integrating heterogeneous stacked ensembles, graph neural networks (GNNs), and conformal prediction (CP). Domain-specific features, including growing degree days and climate suitability scores, are engineered, and spatial patterns are explicitly modeled via rainfall correlation graphs. The ensemble combines random forest and gradient boosting learners with bootstrap aggregation, while GNNs encode inter-regional climate dependencies. Conformalized quantile regression ensures statistically valid prediction intervals. Evaluated on a global dataset spanning 15 countries and six major crops from 1990 to 2023, the framework achieves an R² of 0.9594 and an RMSE of 4882 hg/ha. Crucially, it delivers calibrated 80% prediction intervals with 80.72% empirical coverage, significantly outperforming uncalibrated baselines at 40.03%. SHAP analysis identifies crop type and rainfall as dominant predictors, while the integrated drought classifier achieves perfect accuracy. These contributions advance agricultural AI by merging robust ensemble learning with explicit teleconnection modeling and trustworthy uncertainty quantification.
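
The conformalized quantile regression step mentioned above can be sketched as a split-conformal calibration of an existing quantile band. This is an illustrative sketch under assumptions, not the HSE-GNN-CP code: the lower and upper quantile predictions are taken as given, and the function names and numbers are hypothetical.

```python
import math

def cqr_margin(lower, upper, y_true, alpha=0.2):
    """Margin that widens (or narrows) a quantile band, fitted on a calibration set."""
    # Conformity score: distance by which the true value falls outside the band
    # (negative when it lies inside).
    scores = sorted(max(lo - y, y - hi) for lo, hi, y in zip(lower, upper, y_true))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    return scores[rank - 1] if rank <= n else math.inf

def conformalize(lo, hi, margin):
    """Apply the calibrated margin to a new predicted band."""
    return lo - margin, hi + margin

# Hypothetical calibration data: a band of [0, 1] predicted for every sample.
margin = cqr_margin([0.0] * 7, [1.0] * 7, [0.5, 0.5, 0.5, 0.5, 1.5, 2.0, 3.0], alpha=0.25)
interval = conformalize(0.0, 1.0, margin)
```

With `alpha=0.2` the same procedure targets the 80% intervals reported in the abstract; the coverage guarantee relies on the calibration and test samples being exchangeable.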

33 pages, 4298 KB  
Article
Synergistic Phishing Intrusion Detection: Integrating Behavioral and Structural Indicators with Hybrid Ensembles and XAI Validation
by Isaac Kofi Nti, Murat Ozer and Chengcheng Li
Future Internet 2026, 18(1), 30; https://doi.org/10.3390/fi18010030 - 4 Jan 2026
Viewed by 482
Abstract
Phishing websites continue to evolve in sophistication, making them increasingly difficult to distinguish from legitimate platforms and challenging the effectiveness of current detection systems. In this study, we investigate the role of subtle deceptive behavioral cues such as mouse-over effects, pop-up triggers, right-click restrictions, and hidden iframes in enhancing phishing detection beyond traditional structural and domain-based indicators. We propose a hierarchical hybrid detection framework that integrates dimensionality reduction through Principal Component Analysis (PCA), phishing campaign profiling using K-Means clustering, and a stacked ensemble classifier for final prediction. Using a public phishing dataset, we evaluate multiple feature configurations to quantify the added value of behavioral indicators. The results demonstrate that behavioral indicators, while weak predictors in isolation, significantly improve performance when combined with conventional features, achieving a macro F1 score of 97 percent. Explainable AI analysis using SHAP confirms the contribution of specific behavioral characteristics to model decisions and reveals interpretable patterns in attacker manipulation strategies. This study shows that behavioral interactions leave measurable forensic signatures and provides evidence that combining structural, domain, and behavioral features offers a more comprehensive and reliable approach to phishing intrusion detection.
(This article belongs to the Special Issue Anomaly and Intrusion Detection in Networks)

21 pages, 2057 KB  
Article
Estimating Plant Physiological Parameters for Vitis vinifera L. Using In Situ Hyperspectral Measurements and Ensemble Machine Learning
by Marco Lutz, Emilie Lüdicke, Daniel Heßdörfer, Tobias Ullmann and Melanie Brandmeier
Remote Sens. 2025, 17(23), 3918; https://doi.org/10.3390/rs17233918 - 3 Dec 2025
Viewed by 569
Abstract
Accurate prediction of photosynthetic parameters is pivotal for precision viticulture, as it enables non-invasive monitoring of plant physiological status and informed management decisions. In this study, spectral reflectance data were used to predict key photosynthetic parameters such as assimilation rate (A), effective photosystem II (PSII) quantum yield (ΦPSII), and electron transport rate (ETR), as well as stem and leaf water potential (Ψstem and Ψleaf), in Vitis vinifera (cv. Müller-Thurgau) grown in an experimental vineyard in Lower Franconia (Germany). Measurements were obtained on 25 July, 7 August, and 12 August 2024 using a LI-COR LI-6800 system and a PSR+ hyperspectral spectroradiometer. Various machine learning models (SVR, Lasso, ElasticNet, Ridge, PLSR, a simple ANN, and Random Forest) were evaluated, both as standalone predictors and as base learners in a stacking ensemble regressor with a Random Forest meta-learner. First derivative reflectance (FDR) preprocessing enhanced predictive performance, particularly for ΦPSII and ETR, with the ensemble approach achieving R² values up to 0.92 for ΦPSII and 0.85 for A at 1 nm resolution. At coarser spectral resolutions, predictive accuracy declined, though FDR preprocessing provided some mitigation of the performance loss. Diurnal patterns revealed that morning to mid-morning measurements, particularly between 9:00 and 11:00, captured peak photosynthetic activity, making them optimal for assessing vine vigor, while midday water potential declines indicated favorable timing for irrigation scheduling. These findings demonstrate the potential of integrating hyperspectral data with ensemble machine learning and FDR preprocessing for accurate, scalable, and high-throughput monitoring of grapevine physiology, supporting real-time vineyard management and the use of cost-effective sensors under diverse environmental conditions.
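
First derivative reflectance (FDR) preprocessing, as named in this abstract, amounts to differentiating each spectrum with respect to wavelength. A minimal finite-difference sketch follows; the function name and the four-band spectrum are hypothetical, and the study may well have used a smoothed derivative instead.

```python
def first_derivative(wavelengths, reflectance):
    """dR/d(lambda): central differences inside the spectrum, one-sided at both ends."""
    n = len(wavelengths)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((reflectance[hi] - reflectance[lo]) / (wavelengths[hi] - wavelengths[lo]))
    return out

# Hypothetical spectrum sampled at 1 nm resolution.
fdr = first_derivative([400.0, 401.0, 402.0, 403.0], [0.1, 0.2, 0.4, 0.7])
```

Differentiation removes constant offsets between spectra, which is one common motivation for using it before regression.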

22 pages, 3518 KB  
Article
Dose-Guided Hybrid AI Model with Deep and Handcrafted Radiomics for Explainable Radiation Dermatitis Prediction in Breast Cancer VMAT
by Tsair-Fwu Lee, Ling-Chuan Chang-Chien, Lawrence Tsai, Chia-Hui Chen, Po-Shun Tseng, Jun-Ping Shiau, Yang-Wei Hsieh, Shyh-An Yeh, Cheng-Shie Wuu, Yu-Wei Lin and Pei-Ju Chao
Cancers 2025, 17(23), 3767; https://doi.org/10.3390/cancers17233767 - 26 Nov 2025
Viewed by 906
Abstract
Purpose: To improve the prediction accuracy of radiation dermatitis (RD) in breast cancer patients undergoing volumetric modulated arc therapy (VMAT), we developed a hybrid artificial intelligence (AI) model that integrates deep learning radiomics (DLR), handcrafted radiomics (HCR), clinical features, and dose–volume histogram (DVH) parameters, aiming to enhance the early identification of high-risk individuals and support personalized prevention strategies. Methods: A retrospective cohort of 156 breast cancer patients treated with VMAT at Kaohsiung Veterans General Hospital (2018–2023) was analyzed; 148 patients were eligible after exclusions, with RD graded according to the RTOG criteria. Clinical variables and 12 DVH indices were collected, while HCR features were extracted via PyRadiomics. DLR features were derived from a pretrained VGG16 network across four input designs: original CT images (DLROriginal), a 5 mm subcutaneous region (DLRSkin5mm), a planning target volume with a 100% prescription dose (DLRPTV100%), and a subcutaneous region receiving ≥ 5 Gy (DLRV5Gy). The features were preselected via ANOVA (p < 0.05), followed by Boruta–SHAP refinement across 11 feature sets. Predictive models were built via logistic regression, random forest, gradient boosting decision tree, and stacking ensemble (SE) methods. Explainability was assessed via SHapley Additive exPlanations (SHAP) and gradient-weighted class activation mapping (Grad-CAM). Results: Among the 148 patients, 49 (33%) developed Grade ≥ 2 RD. The DLR models outperformed the HCR models (AUC = 0.72 vs. 0.66). The best performance was achieved with DLRV5Gy + clinical + DVH features, yielding AUC = 0.76, recall = 0.68, and F1 score = 0.60. SE consistently surpassed single classifiers. SHAP identified convolutional DLR features as the strongest predictors, whereas Grad-CAM focused attention on subcutaneous high-dose regions, which was consistent with the clinical RD distribution.
Conclusions: The proposed hybrid AI framework, which integrates DLR, clinical, and DVH features, provides accurate and explainable predictions of Grade ≥ 2 RD after VMAT in breast cancer patients. By combining ensemble learning with XAI methods, the model offers reliable high-risk stratification and potential clinical utility for personalized treatment planning.
(This article belongs to the Special Issue Cancer Survivors: Late Effects of Cancer Therapy)

28 pages, 2720 KB  
Article
Ensemble Transfer Learning for Gastric Cancer Prediction Using Electronic Health Records in a Data-Scarce Single-Hospital Setting
by Hyon Hee Kim, Ji Yeon Han, Yae Bin Lim, Young Seo Lim, Seung-In Seo, Kyung Joo Lee and Woon Geon Shin
Appl. Sci. 2025, 15(23), 12428; https://doi.org/10.3390/app152312428 - 23 Nov 2025
Viewed by 572
Abstract
Gastric cancer is a significant health concern in East Asia, where early risk prediction is critical for prevention. However, the scarcity of single-hospital electronic health record (EHR) data limits the applicability and generalizability of machine learning models. To address this challenge, we propose an ensemble transfer learning framework for gastric cancer prediction using structured EHRs in a data-scarce single-hospital setting. Three base models, Support Vector Machine (SVM), Random Forest, and Deep Neural Network (DNN), were pretrained on a large-scale national dataset from the Republic of Korea National Health Insurance Service (NHIS) and fine-tuned on a smaller institutional dataset from Kangdong Sacred Heart Hospital (KSHH). These fine-tuned models were combined via stacking ensemble learning with logistic regression as a meta-learner. The proposed model achieved strong performance with precision 0.78, recall 0.92, F1-score 0.83, accuracy 0.91, and AUC 0.93. For interpretability, permutation feature importance and Shapley Additive Explanations (SHAP) were applied. Smoking status, gender, and hypertensive disorder were identified as key predictors consistent with previous studies. This study demonstrates the successful application of transfer learning to overcome data scarcity in single-hospital structured EHRs. Furthermore, our stacking ensemble strategy outperformed the individual fine-tuned models, offering a generalizable framework for gastric cancer prediction in data-scarce clinical settings.
(This article belongs to the Special Issue Advances in Machine Learning for Healthcare Applications)

22 pages, 3470 KB  
Article
A Multi-Sensor Machine Learning Framework for Field-Scale Soil Salinity Mapping Under Data-Scarce Conditions
by Joyce Mongai Chindong, Jamal-Eddine Ouzemou, Ahmed Laamrani, Ali El Battay, Soufiane Hajaj, Hassan Rhinane and Abdelghani Chehbouni
Remote Sens. 2025, 17(22), 3778; https://doi.org/10.3390/rs17223778 - 20 Nov 2025
Cited by 2 | Viewed by 1512
Abstract
Soil salinity severely constrains agricultural productivity and soil health, particularly in arid and semi-arid regions. Conventional salinity assessment methods are labor-intensive, time-consuming, and spatially limited. This study developed a data-scarce workflow integrating proximal sensing (EM38-MK2), very high-resolution multispectral imagery, and machine learning to map soil salinity at field scale in the semi-arid Sehb El Masjoune area, central Morocco. A total of 26 soil samples were analyzed for Electrical Conductivity (EC), and 500 Apparent Electrical Conductivity (ECa) measurements were collected and calibrated using the field samples. Spectral and topographic covariates derived from Unmanned Aerial Vehicle (UAV) and PlanetScope imagery supported model training using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Random Forest (RF), and a Stacked Ensemble Learning Model (ELM). Regression Kriging (RK) was applied to model residuals to improve spatial prediction. ELM achieved the highest accuracy (R² = 0.87, RMSE ≈ 4.15), followed by RF, which effectively captured nonlinear spatial patterns. RK improved PLSR accuracy (by 11.1% for PlanetScope, 13.8% for UAV) but offered limited gains for RF, SVR, and ELM. SHAP analysis identified topographic covariates as the most influential predictors. Both UAV and PlanetScope delineated similar saline–sodic zones. The study demonstrates the following: (1) a scalable, data-efficient workflow for salinity mapping; (2) model and RK performance depend more on algorithmic design than sensor type; (3) interpretable ML and spatial modeling enhance understanding of salinity processes in semi-arid systems.

35 pages, 5223 KB  
Article
Physics-Based Machine Learning for Vibration Mitigation by Open Buried Trenches
by Luís Pereira, Luís Godinho, Fernando G. Branco, Paulo da Venda Oliveira, Pedro Alves Costa and Aires Colaço
Appl. Sci. 2025, 15(21), 11609; https://doi.org/10.3390/app152111609 - 30 Oct 2025
Viewed by 596
Abstract
Mitigating ground vibrations from sources like vehicles and construction operations poses significant challenges, often relying on computationally intensive numerical methods such as Finite Element Methods (FEM) or Boundary Element Methods (BEM) for analysis. This study addresses these limitations by developing and evaluating Machine Learning (ML) methodologies for the rapid and accurate prediction of Insertion Loss (IL), a critical parameter for assessing the effectiveness of open trenches as vibration barriers. A comprehensive database was systematically generated through high-fidelity numerical simulations, capturing a wide range of geometric, elastic, and physical configurations of a stratified geotechnical system. Three distinct ML strategies, Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Random Forests (RF), were initially assessed for their predictive capabilities. Subsequently, a Meta-RF stacking ensemble model was developed, integrating the predictions of these base methods. Model performance was rigorously evaluated using complementary statistical metrics (RMSE, MAE, NMAE, R), substantiated by in-depth statistical analyses (normality tests, Bootstrap confidence intervals, Wilcoxon tests) and an analysis of input parameter sensitivity. The results clearly demonstrate the high efficacy of ML in accurately predicting IL across diverse, realistic scenarios. While all models performed strongly, the RF and the Meta-RF stacking ensemble models consistently emerged as the most robust and accurate predictors. They exhibited superior generalization capabilities and effectively mitigated the inherent biases found in the ANN and SVM models.
This work is intended to function as a proof-of-concept and offers promising avenues for overcoming the significant computational costs associated with traditional simulation methods, thereby enabling rapid design optimization and real-time assessment of vibration mitigation measures in geotechnical engineering.
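
The stacking idea described in this abstract, where a meta-learner is trained on the predictions of base models, can be sketched in plain Python. This is a toy illustration under stated assumptions rather than the authors' Meta-RF: the two base learners (a least-squares line and a 1-nearest-neighbour rule), the linear meta-learner, and all function names are hypothetical stand-ins chosen to keep the example dependency-free.

```python
from statistics import mean

def fit_line(xs, ys):
    """Base learner 1: ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    """Base learner 2: 1-nearest-neighbour lookup."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def out_of_fold(fitters, xs, ys, k=5):
    """Predict each sample with base models that never saw it during fitting."""
    n = len(xs)
    oof = [[0.0] * len(fitters) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        for j, fit in enumerate(fitters):
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in range(fold, n, k):  # held-out indices of this fold
                oof[i][j] = model(xs[i])
    return oof

def fit_meta(oof, ys):
    """Meta-learner: least-squares weights for two base predictions (2x2 normal equations)."""
    a = sum(z[0] * z[0] for z in oof)
    b = sum(z[0] * z[1] for z in oof)
    d = sum(z[1] * z[1] for z in oof)
    p = sum(z[0] * y for z, y in zip(oof, ys))
    q = sum(z[1] * y for z, y in zip(oof, ys))
    det = a * d - b * b
    return (d * p - b * q) / det, (a * q - b * p) / det

def fit_stack(xs, ys, fitters=(fit_line, fit_nearest)):
    """Return the stacked predictor and the meta-learner weights."""
    w = fit_meta(out_of_fold(fitters, xs, ys), ys)
    models = [fit(xs, ys) for fit in fitters]  # refit base learners on all data
    return (lambda x: sum(wi * m(x) for wi, m in zip(w, models))), w
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base models were fitted on, is what keeps it from simply rewarding the base model that overfits most.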

48 pages, 31470 KB  
Article
Integrating Climate and Economic Predictors in Hybrid Prophet–(Q)LSTM Models for Sustainable National Energy Demand Forecasting: Evidence from The Netherlands
by Ruben Curiël, Ali Mohammed Mansoor Alsahag and Seyed Sahand Mohammadi Ziabari
Sustainability 2025, 17(19), 8687; https://doi.org/10.3390/su17198687 - 26 Sep 2025
Cited by 1 | Viewed by 1225
Abstract
Forecasting national energy demand is challenging under climate variability and macroeconomic uncertainty. We assess whether hybrid Prophet–(Q)LSTM models that integrate climate and economic predictors improve long-horizon forecasts for The Netherlands. This study covers 2010–2024 and uses data from ENTSO-E (hourly load), KNMI and Copernicus/ERA5 (weather and climate indices), Statistics Netherlands (CBS), and the World Bank (macroeconomic and commodity series). We evaluate Prophet–LSTM and Prophet–QLSTM, each with and without stacking via XGBoost, under rolling-origin cross-validation; feature choice is guided by Bayesian optimisation. Stacking provides the largest and most consistent accuracy gains across horizons. The quantum-inspired variant performs on par with the classical ensemble while using a smaller recurrent core, indicating value as a complementary learner. Substantively, short-run variation is dominated by weather and calendar effects, whereas selected commodity and activity indicators stabilise longer-range baselines; combining both domains improves robustness to regime shifts. In sustainability terms, improved long-horizon accuracy supports renewable integration, resource adequacy, and lower curtailment by strengthening seasonal planning and demand-response scheduling. The pipeline demonstrates the feasibility of integrating quantum-inspired components into national planning workflows, using The Netherlands as a case study, while acknowledging simulator constraints and compute costs.

39 pages, 9593 KB  
Article
An Integrated AI Framework for Occupational Health: Predicting Burnout, Long COVID, and Extended Sick Leave in Healthcare Workers
by Maria Valentina Popa, Călin Gheorghe Buzea, Irina Luciana Gurzu, Camer Salim, Bogdan Gurzu, Dragoș Ioan Rusu, Lăcrămioara Ochiuz and Letiția Doina Duceac
Healthcare 2025, 13(18), 2266; https://doi.org/10.3390/healthcare13182266 - 10 Sep 2025
Cited by 2 | Viewed by 1414
Abstract
Background: Healthcare workers face multiple, interlinked occupational health risks: burnout, post-COVID-19 sequelae (Long COVID), and extended medical leave. These outcomes often share predictors, contribute to each other, and, together, impact workforce capacity. Yet, existing tools typically address them in isolation. Objective: The objective of this study was to develop and deploy an integrated, explainable artificial intelligence (AI) framework that predicts these three outcomes using the same structured occupational health dataset, enabling unified workforce risk monitoring. Methods: We analyzed data from 1244 Romanian healthcare professionals with 14 demographic, occupational, lifestyle, and comorbidity features. For each outcome, we trained a separate predictive model within a common framework: (1) a lightweight transformer neural network with hyperparameter optimization, (2) a transformer with multi-head attention, and (3) a stacked ensemble combining transformer, XGBoost, and logistic regression. The data were SMOTE-balanced and evaluated on held-out test sets using Accuracy, ROC-AUC, and F1-score, with 10,000-iteration bootstrap testing for statistical significance. Results: The stacked ensemble achieved the highest performance: ROC-AUC = 0.70 (burnout), 0.93 (Long COVID), and 0.93 (extended leave). The F1 scores were >0.89 for Long COVID and extended leave, whereas the performance gains for burnout were comparatively modest, reflecting the multidimensional and heterogeneous nature of burnout as a binary construct. The gains over logistic regression were statistically significant (p < 0.0001 for Long COVID and extended leave; p = 0.0355 for burnout). The SHAP analysis identified overlapping top predictors (tenure, age, job role, cancer history, pulmonary disease, and obesity), supporting the value of a unified framework. Conclusions: We trained separate models for each occupational health risk but deployed them in a single, real-time web application.
This integrated approach improves efficiency, enables multi-outcome workforce surveillance, and supports proactive interventions in healthcare settings.

22 pages, 1286 KB  
Article
Multiclass Classification of Sarcopenia Severity in Korean Adults Using Machine Learning and Model Fusion Approaches
by Arslon Ruziboev, Dilmurod Turimov, Jiyoun Kim and Wooseong Kim
Mathematics 2025, 13(18), 2907; https://doi.org/10.3390/math13182907 - 9 Sep 2025
Viewed by 978
Abstract
This study presents a unified machine learning strategy for identifying various degrees of sarcopenia severity in older adults. The approach combines three optimized algorithms (Random Forest, Gradient Boosting, and Multilayer Perceptron) into a stacked ensemble model, which is assessed with clinical data. A thorough data preparation process involved synthetic minority oversampling to ensure class balance and a dual approach to feature selection using Least Absolute Shrinkage and Selection Operator regression and Random Forest importance. The integrated model achieved remarkable performance with an accuracy of 96.99%, an F1 score of 0.9449, and a Cohen’s Kappa coefficient of 0.9738 while also demonstrating excellent calibration (Brier Score: 0.0125). Interpretability analysis through SHapley Additive exPlanations values identified appendicular skeletal muscle mass, body weight, and functional performance metrics as the most significant predictors, enhancing clinical relevance. The ensemble approach showed superior generalization across all sarcopenia classes compared to individual models. Although limited by dataset representativeness and the use of conventional multiclass classification techniques, the framework shows considerable promise for non-invasive sarcopenia risk assessments and exemplifies the value of interpretable artificial intelligence in geriatric healthcare.

16 pages, 3521 KB  
Article
Temporal Trends and Machine Learning-Based Risk Prediction of Female Infertility: A Cross-Cohort Analysis Using NHANES Data (2015–2023)
by Ismat Ara Begum, Deepak Ghimire and A. S. M. Sanwar Hosen
Diagnostics 2025, 15(17), 2250; https://doi.org/10.3390/diagnostics15172250 - 5 Sep 2025
Viewed by 1507
Abstract
Background: Female infertility represents a significant global public health concern, yet its evolving trends and data-driven risk prediction remain underexamined in nationally representative cohorts. This study investigates temporal changes in infertility prevalence and evaluates Machine Learning (ML) models for infertility risk prediction using harmonized clinical features from NHANES cycles (2015, 2016, 2017, 2018, 2021, 2022, and 2023). Methods: Women aged 19 to 45 with complete data on infertility-related variables (including reproductive history, menstrual irregularity, Pelvic Inflammatory Disease (PID), hysterectomy, and bilateral oophorectomy) were analyzed. Descriptive statistics and cohort comparisons employed ANOVA and Chi-square tests, while multivariate Logistic Regression (LR) estimated Adjusted Odds Ratios (OR) and informed feature importance. Predictive models (LR, Random Forest, XGBoost, Naive Bayes, SVM, and a Stacking Classifier ensemble) were trained and tuned via GridSearchCV with five-fold cross-validation. Model performance was evaluated using accuracy, precision, recall, F1-score, specificity, and AUC-ROC. Results: We observed a notable increase in infertility prevalence from 14.8% in 2017–2018 to 27.8% in 2021–2023, suggesting potential post-pandemic impacts on reproductive health. In multivariate analysis, prior childbirth emerged as the strongest protective factor (Adjusted OR 0.00), while menstrual irregularity showed a significant positive association with infertility (OR = 0.55, 95% CI 0.40 to 0.77, p < 0.001). Unexpectedly, PID, hysterectomy, and bilateral oophorectomy were not significantly associated with infertility after adjustment (p > 0.05), which may partly reflect the inherent definition of self-reported infertility used in this study. All six ML models demonstrated excellent and comparable predictive ability (AUC > 0.96), reinforcing the effectiveness of even a minimal common predictor set for infertility risk stratification.
Conclusions: The rising prevalence of self-reported infertility among U.S. women underscores emerging public health challenges. Despite relying on a streamlined feature set, interpretable and ensemble ML models successfully predicted infertility risk, showcasing their potential applicability in broader surveillance and personalized care strategies. Future models should integrate additional sociodemographic and behavioral factors to enhance precision and support tailored interventions.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

19 pages, 2725 KB  
Article
Enhancing Photovoltaic Energy Output Predictions Using ANN and DNN: A Hyperparameter Optimization Approach
by Atıl Emre Cosgun
Energies 2025, 18(17), 4564; https://doi.org/10.3390/en18174564 - 28 Aug 2025
Cited by 1 | Viewed by 858
Abstract
This study investigates the use of artificial neural networks (ANNs) and deep neural networks (DNNs) for estimating photovoltaic (PV) energy output, with a particular focus on hyperparameter tuning. Supervised regression for PV direct current power prediction was conducted using only sensor-based inputs (PanelTemp, Irradiance, AmbientTemp, Humidity), together with physically motivated derived features (ΔT, IrradianceEff, IrradianceSq, Irradiance × ΔT). Samples acquired under very low irradiance (<50 W m−2) were excluded. Predictors were standardized with training-set statistics (z-score), and the target variable was modeled in log space to stabilize variance. A shallow artificial neural network (ANN; single hidden layer, widths {4–32}) was compared with deeper multilayer perceptrons (DNN; stacks {16 8}, {32 16}, {64 32}, {128 64}, {128 64 32}). Hyperparameters were selected with a grid search using validation mean squared error in log space with early stopping; Bayesian optimization was additionally applied to the ANN. Final models were retrained and evaluated on a held-out test set after inverse transformation to watts. Test performance was reported as MSE, RMSE, MAE, R², and MAPE for the ANN and DNN. The ANN showed superior absolute/squared error and explained variance, whereas the DNN achieved lower relative error with a marginal MAE advantage. Ablation studies showed that moderate depth can be beneficial (e.g., two-layer variants), and a simple bootstrap ensemble improved robustness. In summary, the ANN demonstrated superior performance in terms of absolute-error accuracy, whereas the DNN exhibited better consistency with relative-error accuracy.
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)
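The preprocessing described in the abstract above (low-irradiance filtering, z-scoring with training-set statistics, a log-space target, and inverse transformation to watts) can be sketched roughly as follows. All numbers and helper names are hypothetical, and the natural logarithm is an assumption, since the abstract does not state the log base.

```python
import math
from statistics import mean, pstdev

# Hypothetical (irradiance W/m^2, DC power W) samples, not data from the study.
raw_train = [(30.0, 12.0), (200.0, 150.0), (400.0, 310.0),
             (600.0, 480.0), (800.0, 640.0)]
raw_test = [(500.0, 395.0), (1000.0, 810.0)]

# Exclude very low irradiance samples (< 50 W/m^2), as in the abstract.
train = [(g, p) for g, p in raw_train if g >= 50.0]


def fit_standardizer(column):
    """Return mean and population std computed on the training column only."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # avoid division by zero for constant columns
    return mu, sigma


def standardize(column, mu, sigma):
    """Apply a z-score using previously fitted statistics."""
    return [(x - mu) / sigma for x in column]


train_irradiance = [g for g, _ in train]
mu, sigma = fit_standardizer(train_irradiance)
train_z = standardize(train_irradiance, mu, sigma)
# The test set reuses the training statistics to avoid information leakage.
test_z = standardize([g for g, _ in raw_test], mu, sigma)

# Model the target in log space (natural log assumed), then map back to watts.
train_log_power = [math.log(p) for _, p in train]
recovered_watts = [math.exp(v) for v in train_log_power]

print(train_z, test_z, recovered_watts)
```

Fitting the scaler on the training split only keeps test-set information out of the model inputs, which is why the test column is transformed with `mu` and `sigma` from training.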

36 pages, 14469 KB  
Article
Multi-Objective Optimization Design Based on Prototype High-Rise Office Buildings: A Case Study in Shandong, China
by Hangyue Zhang and Zhi Zhuang
Buildings 2025, 15(17), 3071; https://doi.org/10.3390/buildings15173071 - 27 Aug 2025
Cited by 1 | Viewed by 992
Abstract
Urbanization in China and the proliferation of high-rise office buildings have led to increased demand for daylighting and thermal comfort. These requirements often result in reliance on active systems, including heating, cooling, and artificial lighting, which increase energy consumption. Existing studies have often focused on individual cases or room-scale models, which makes it difficult to generalize findings to the design of various high-rise office building types. Therefore, in this study, parametric prototype building models for high-rise office buildings were developed based on surveys of completed and under-construction projects. These surveys reflected actual design practices and were used to support systematic performance evaluation and typology-level optimization. Building performance was simulated using Grasshopper and Honeybee to generate large-scale datasets, and stacking ensemble learning models were used as surrogate predictors for energy use, daylighting, and thermal comfort. Multi-objective optimization was conducted using the non-dominated sorting genetic algorithm III (NSGA-III), followed by strategy formulation. The results revealed the following: (1) the proposed prototype model establishes clear parameter ranges for geometry, envelope design, and thermal performance, offering reusable models and data; (2) the stacking ensemble model outperforms individual models, improving the coefficient of determination (R²) by 0.5–16.1%, with mean squared error (MSE) reductions of 4.4–70.6%, and mean absolute error (MAE) reductions of 2.8–45.8%; (3) space length, aspect ratio, usable area ratio, window U-value, and solar heat gain coefficient (SHGC) were identified as primary performance drivers; and (4) optimized solutions reduced energy use by 3.79–11.81% and enhanced daylighting comfort by 40.16–50.32% while maintaining thermal comfort.
The proposed framework provides localized, data-driven guidance for early-stage performance optimization in high-rise office building design.
(This article belongs to the Section Building Energy, Physics, Environment, and Systems)
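Multi-objective optimizers such as NSGA-III, mentioned in the abstract above, rest on Pareto dominance: one design dominates another if it is no worse in every objective and strictly better in at least one. The sketch below shows only this dominance filter, not NSGA-III itself, and the candidate values are hypothetical, with both objectives assumed to be minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better somewhere.

    Both tuples hold objective values to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (energy use, daylight discomfort) pairs, both minimized.
candidates = [(100.0, 0.30), (90.0, 0.35), (95.0, 0.25), (110.0, 0.40)]
front = non_dominated(candidates)
print(front)
```

The quadratic pairwise comparison is fine for small candidate sets; production optimizers use faster sorting schemes and add reference-point selection on top of this filter.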

26 pages, 3734 KB  
Article
Impact of PM2.5 Pollution on Solar Photovoltaic Power Generation in Hebei Province, China
by Ankun Hu, Zexia Duan, Yichi Zhang, Zifan Huang, Tianbo Ji and Xuanhua Yin
Energies 2025, 18(15), 4195; https://doi.org/10.3390/en18154195 - 7 Aug 2025
Viewed by 1498
Abstract
Atmospheric aerosols significantly impact solar photovoltaic (PV) energy generation through their effects on surface solar radiation. This study quantifies the impact of PM2.5 pollution on PV power output using observational data from 10 stations across Hebei Province, China (2018–2019). Our analysis reveals that elevated PM2.5 concentrations substantially attenuate solar irradiance, resulting in PV power losses of up to 48.2% during severe pollution episodes. To capture these complex aerosol–radiation–PV interactions, we developed and compared the following six machine learning models: Support Vector Regression, Random Forest, Decision Tree, K-Nearest Neighbors, AdaBoost, and Backpropagation Neural Network. The inclusion of PM2.5 as a predictor variable systematically enhanced model performance across all algorithms. To further optimize prediction accuracy, we implemented a stacking ensemble framework that integrates multiple base learners through meta-learning. The optimal stacking configuration achieved superior performance (MAE = 0.479 MW, indicating an average prediction error of 479 kilowatts; R² = 0.967, reflecting that 96.7% of the variance in power output is explained by the model), demonstrating robust predictive capability under diverse atmospheric conditions. These findings underscore the importance of aerosol–radiation interactions in PV forecasting and provide crucial insights for grid management in pollution-affected regions.
(This article belongs to the Section B: Energy and Environment)
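The two headline metrics in the abstract above, MAE (reported in MW and restated in kW) and R² (the share of variance explained), can be computed as in the following minimal sketch. The power values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical PV power values in MW (illustration only).
actual_mw = [10.0, 12.0, 14.0, 16.0]
predicted_mw = [10.5, 11.5, 14.5, 15.5]

mae_mw = mean_absolute_error(actual_mw, predicted_mw)
mae_kw = mae_mw * 1000.0  # 1 MW = 1000 kW
r2 = r_squared(actual_mw, predicted_mw)

print(f"MAE = {mae_mw:.3f} MW ({mae_kw:.0f} kW), R^2 = {r2:.3f}")
```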