Search Results (915)

Search Parameters:
Keywords = bayes prediction

33 pages, 1895 KB  
Article
Leveraging Feature Selection and Ensemble Learning to Predict Secondary School Achievement: A Comparative Study of Three Grade Granularities
by Dimitrios Galiatsatos and Panagiota Galiatsatou
Information 2026, 17(6), 517; https://doi.org/10.3390/info17060517 - 22 May 2026
Abstract
Predictive analytics has become increasingly important in educational decision-making, supporting at-risk identification and adaptive tutoring. The accurate early prediction of school achievement can enable timely intervention. Using the Math Students dataset, which contains data on students from two Portuguese secondary schools, we model three categorical outcomes derived from the students’ final grade, namely the final grade level (low, medium, high), its qualitative evaluation (fail, satisfactory, good, excellent), and the final pass/fail outcome. After preprocessing, three filter methods—Correlation-Based Feature Subset Selection (CFS), Correlation Attribute Evaluation (CorrEval), and Information Gain (InfoGain)—are applied to reduce the dimensionality of the datasets. Nine classifiers (Naive Bayes, Logistic, MLP, SMO, IBk, Bagging, J48, Random Forest, Random Tree) are evaluated using ten-fold cross-validation in the Waikato Environment for Knowledge Analysis (Weka) platform. Random Forest with InfoGain achieves 90.7% accuracy on the three-band task, while Bagging with InfoGain achieves 92.5% on the binary pass/fail outcome, outperforming benchmarks in prior Educational Data Mining (EDM) studies. Results confirm that prior academic performance indicators (first- and second-period grades) and failure history dominate predictive power and contribute substantially to the success of ensemble models, particularly when paired with feature selection methods that reduce noise and highlight relevant attributes. Full article
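The Information Gain filter named in this abstract can be illustrated with a minimal sketch. It is not the Weka implementation used in the study; the function names and the toy data below are hypothetical, and only categorical features are handled.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: 'failures' separates the outcomes perfectly, 'school' not at all.
outcome = ["pass", "pass", "fail", "fail"]
failures = ["none", "none", "some", "some"]
school = ["A", "B", "A", "B"]

# Rank features from most to least informative, as a filter method would.
gains = {
    "failures": information_gain(failures, outcome),
    "school": information_gain(school, outcome),
}
ranking = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
```

A filter method of this kind keeps the top-ranked attributes and passes only those to the classifier.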
22 pages, 2725 KB  
Article
Machine Learning Classification of Pacemaker Low-Longevity Status Using Device Interrogation Reports
by Samikshya Neupane and Tarun Goswami
Appl. Sci. 2026, 16(10), 5134; https://doi.org/10.3390/app16105134 - 21 May 2026
Viewed by 128
Abstract
Pacemaker generator replacement remains clinically important because battery depletion influences the timing of elective replacement procedures and associated procedural risk. This study investigated whether routinely available pacemaker interrogation-derived telemetry can classify devices with manufacturer-estimated low longevity status, defined as remaining device life below 12 months. A total of 39 Medtronic pacemaker interrogation snapshots were analyzed, including 11 single-chamber, 21 dual-chamber, and 7 triple-chamber CRT-P devices. Four machine learning classifiers, namely, Naïve Bayes, HistGradientBoosting, Logistic Regression, and Random Forest, were evaluated using leave-one-out cross-validation as the primary internal validation strategy, with bootstrap 95% confidence intervals calculated from aggregated out-of-fold predictions. The study is application-based rather than algorithmic, focusing on transparent comparison and interpretation of established classifiers for pacemaker-specific low-longevity classification. Logistic Regression achieved the strongest LOOCV performance, with ROC-AUC 0.941, accuracy 0.872, precision 0.958, recall 0.852, and F1-score 0.902. Random Forest also showed favorable performance, with ROC-AUC 0.855 and F1-score 0.873. Parsimonious baseline analysis showed that months since implantation plus battery voltage achieved ROC-AUC 0.883, accuracy 0.846, and F1-score 0.875, approaching the full Logistic Regression model. SHAP analysis identified months since implantation, battery voltage, and right ventricular capture threshold as the most influential predictors. These findings suggest that routine interrogation variables can classify manufacturer-estimated low-longevity status, but the incremental benefit of full multivariable machine learning over simple predictors was modest. 
Because the endpoint was based on manufacturer-estimated longevity rather than observed clinical battery failure or generator replacement timing, larger longitudinal and multi-manufacturer validation studies are needed before clinical application.
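The relationship between the reported accuracy, precision, recall, and F1-score can be checked with a short calculation. The abstract does not report a confusion matrix; the counts below are an assumption, back-calculated as one set of values over 39 snapshots that reproduces the published Logistic Regression figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Assumed counts (not stated in the abstract): 23 + 1 + 4 + 11 = 39 snapshots.
metrics = classification_metrics(tp=23, fp=1, fn=4, tn=11)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```

With a sample this small, a single reclassified device moves accuracy by about 2.6 percentage points, which is consistent with the abstract's call for larger validation studies.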
22 pages, 1794 KB  
Article
A Python-Based Framework for Learning-from-Demonstration in Robotic Object Sorting: Comparative Evaluation of Lightweight Classifiers
by Marius-Valentin Drăgoi, Cozmin Adrian Cristoiu, Roxana-Mariana Nechita, Bogdan-Cătălin Navligu and Bogdan-Marian Verdete
Appl. Sci. 2026, 16(10), 5107; https://doi.org/10.3390/app16105107 - 20 May 2026
Viewed by 137
Abstract
This paper presents a Python 3.12-based framework for robotic object sorting in a virtual workcell, combining learning-from-demonstration with a comparative evaluation of classical machine learning classifiers. A user provides a minimal demonstration (e.g., one cube and one cylinder placed into two bins) from which a dynamic type-to-bin rule is inferred. In this study, learning-from-demonstration is implemented at the level of rule acquisition from minimal task examples rather than at the level of trajectory imitation or low-level motion teaching. This rule is used to relabel a larger dataset of pre-generated object positions, enabling training with a selectable number of file-based samples (2–1600) optionally augmented with manual samples. Five classifiers (decision tree, k-nearest neighbors, logistic regression, naive Bayes, and linear SVM) were trained and then used to drive autonomous pick-and-place execution while logging replication time and correctness (correct/incorrect moves and accuracy). Because the task reaches accuracy saturation under a deterministic rule, an additional offline inference benchmark was included to compare prediction throughput using 10,000 probes with repeated timing (median over 50 runs or mean ± standard deviation over 30 runs). To complement this nominal evaluation, the framework also includes a perturbation-aware robustness protocol based on controlled positional perturbation, systematic bias, controlled shape corruption, repeated perturbation voting, and stability-aware scoring. This additional layer makes it possible to examine classifier behavior under controlled uncertainty, especially in reduced-data settings, without changing the compact simulator-based nature of the workflow.
Results indicate identical sorting accuracy across models, while inference-time differences remain measurable, highlighting deployment-oriented trade-offs and confirming that end-to-end cycle time is dominated by robot motion rather than model computation.
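The offline inference benchmark described in this abstract (10,000 probes, median over 50 timed runs) can be sketched with the standard library. The `predict_bin` function is a hypothetical stand-in for a trained classifier, not the authors' code.

```python
import statistics
import timeit

def predict_bin(x):
    """Hypothetical stand-in for a trained classifier: a fixed type-to-bin rule."""
    return 0 if x < 0.5 else 1

# 10,000 probe inputs, matching the probe count given in the abstract.
probes = [i / 10_000 for i in range(10_000)]

def run_batch():
    """Classify every probe once."""
    return [predict_bin(x) for x in probes]

# 50 independent timings of one full batch each.
timings = timeit.repeat(run_batch, number=1, repeat=50)

# The median is less sensitive than the mean to occasional slow runs.
median_seconds = statistics.median(timings)
throughput = len(probes) / median_seconds  # predictions per second
```

Reporting the median rather than a single timing reduces the influence of scheduler noise, which matters when the differences between lightweight models are small.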
26 pages, 3333 KB  
Article
An Interpretable and Reproducibility-Focused Evaluation Pipeline for Automatic Short-Answer Grading in Low-Resource Mathematics and Science Educational Datasets
by Miguel Ángel González Maestre, Javier Cubero Juánez, Alejandro de la Hoz Serrano and Lina Melo
Computers 2026, 15(5), 320; https://doi.org/10.3390/computers15050320 - 18 May 2026
Viewed by 218
Abstract
Automated short-answer grading (ASAG) in educational contexts faces a fundamental trade-off between predictive performance, interpretability, and methodological transparency, particularly under data-constrained educational settings. While recent approaches rely on deep learning architectures, these models require large annotated datasets and offer limited transparency, restricting their applicability in authentic classroom environments. This study proposes a fully specified and interpretable machine learning pipeline for ASAG across multiple educational concepts. The approach is based on a shared TF–IDF representation and evaluates three linear classifiers—Logistic Regression, Multinomial Naïve Bayes, and Linear Support Vector Machines—under a stratified cross-validation framework adapted to small datasets. Model performance is assessed using accuracy, precision, recall, and F1-score. Statistical comparisons using the Wilcoxon signed-rank test indicate exploratory evidence of statistically significant differences between classifiers, although the observed differences remain small in practical magnitude. Additionally, the methodology incorporates token-level analysis to identify discriminative lexical patterns and examine consensus across classifiers. To enhance interpretability, tokens are presented using a bilingual Spanish/English representation while preserving the original feature space. The results across ten concept-specific datasets show consistent performance across models (accuracy ≈ 0.82–0.88) and reveal stable lexical patterns consistently associated with model predictions of correctness. The findings highlight that lightweight, interpretable models can provide consistent and reliable performance under resource-constrained educational conditions. The proposed framework contributes a stability-oriented and interpretable evaluation paradigm for ASAG, offering a practical alternative to data-intensive approaches in educational assessment. 
It is intended as a methodological reference protocol rather than a performance benchmark. The findings should be interpreted as evidence of within-context consistency instead of broad external generalization.
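The shared TF-IDF representation mentioned in this abstract can be illustrated with a minimal sketch. It uses one common weighting variant, term frequency times ln(N / document frequency), which may differ from the variant the authors used; the example answers are hypothetical and empty answers are not handled.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {token: weight} dict per document using tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears at least once.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            token: (count / len(tokens)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return vectors

# Hypothetical short answers.
answers = [
    "plants make food by photosynthesis",
    "plants need water",
    "animals need food",
]
weights = tfidf(answers)
```

Tokens that occur in few answers, such as "photosynthesis" here, receive higher weights than tokens shared across answers, which is what makes token-level inspection of a linear classifier informative.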
24 pages, 9510 KB  
Article
Overcoming Generalization Issues in Flood Prediction: A Machine Learning Approach Across Multiple Basins
by Ufuk Yükseler, Omerul Faruk Dursun, Mete Yağanoğlu and Abdolmajid Mohammadian
Sustainability 2026, 18(10), 4724; https://doi.org/10.3390/su18104724 - 9 May 2026
Viewed by 220
Abstract
Flooding is a complex, unpredictable disaster that occurs frequently and can have devastating impacts. Over the past two decades, the advent of machine learning (ML) methods has led to a surge in studies focused on flood prediction, emphasizing high-performance algorithms and fast processing times. The present study aims to investigate the challenges of generalization in flood prediction models using machine learning techniques. A dataset of 18,810 samples was compiled from 40 river basins covering the period 1959–2020. Nine machine learning algorithms were applied to the analysis: Logistic Regression, Support Vector Machine, K-Nearest Neighbors, Decision Tree, Random Forest, AdaBoost, Gradient Boosting, Extra Trees, and Gaussian Naive Bayes. Four distinct validation methods were employed to assess the performance of the models, and the results were thoroughly analyzed. The Gradient Boosting model demonstrated exceptional validation performance indicating its robustness across diverse datasets. High accuracy was also observed in the Decision Tree, Random Forest, Extra Trees, and AdaBoost models. However, for datasets with fewer than 200 samples, these four models experienced a decline in performance. Elevation was identified as the most important factor influencing flooding in 36 basins. NDVI was the dominant factor in 3 basins, while rainfall was the main driver in only 1 basin. The results highlight the contributions and shortcomings of machine learning methods in sustainable flood disaster management systems. Full article
(This article belongs to the Section Sustainable Engineering and Science)
37 pages, 1173 KB  
Article
Advances in Bayesian and Non-Bayesian Approaches Under Progressive Type-II Censoring with Applications
by Neama T. AL-Sayed, Asmaa M. Abd AL-Fattah, Hebatalla H. Mohammad, Gannat R. AL-Dayian and Abeer A. EL-Helbawy
Symmetry 2026, 18(5), 805; https://doi.org/10.3390/sym18050805 (registering DOI) - 8 May 2026
Viewed by 175
Abstract
Recent advances in lifetime modeling have led to several extensions of classical distributions, among which the extended inverted Kumaraswamy lifetime model, known as the exponentiated generalized inverted Kumaraswamy model, represents a significant innovation. In parallel, progressive Type-II censoring frameworks have garnered growing attention for their adaptability and relevance across various applied disciplines, including medical, engineering, and social sciences. Driven by this motivation, the present study focuses on the problem of parameter estimation for the proposed lifetime model under a progressive Type-II censoring scheme. Both Bayesian and non-Bayesian frameworks are utilized to estimate the model parameters, reliability function, and hazard rate. Furthermore, interval estimation is conducted by constructing confidence and Bayesian credible intervals for these measures. Assuming independent gamma priors, Bayes estimators are derived under both symmetric and asymmetric loss functions to account for different decision-making perspectives. In addition, conditional and Bayesian predictive analyses are developed within a two-sample prediction framework, and their associated prediction intervals are constructed. The efficiency and robustness of the proposed estimation and prediction methodologies are thoroughly evaluated through an extensive simulation study conducted under various sample sizes and censoring schemes. To further demonstrate the model’s practical relevance, real-world medical and engineering datasets are analyzed, highlighting the applicability and effectiveness of the proposed distribution in empirical contexts. Full article
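For readers unfamiliar with how the loss function changes a Bayes estimator, the two standard cases can be written out. The abstract does not say which asymmetric loss was used; LINEX is shown here only as a common choice and is an assumption.

```latex
% Symmetric case: squared-error loss gives the posterior mean.
L_{\mathrm{SE}}(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2
\quad\Longrightarrow\quad
\hat{\theta}_{\mathrm{SE}} = E[\theta \mid x].

% Asymmetric case (assumed example): LINEX loss with shape parameter a \neq 0.
L_{\mathrm{LINEX}}(\theta, \hat{\theta}) = e^{a(\hat{\theta} - \theta)} - a(\hat{\theta} - \theta) - 1
\quad\Longrightarrow\quad
\hat{\theta}_{\mathrm{LINEX}} = -\frac{1}{a} \ln E\!\left[e^{-a\theta} \mid x\right].
```

For a > 0, LINEX penalizes overestimation more heavily than underestimation, so the resulting estimate lies at or below the posterior mean.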
(This article belongs to the Section Mathematics)
30 pages, 1771 KB  
Article
Lightweight Multi-Label IoT Device Classification and Unknown Device Detection Using Early DHCP and DNS Metadata
by Ahmad Enaya and Xavier Fernando
Electronics 2026, 15(9), 1951; https://doi.org/10.3390/electronics15091951 - 4 May 2026
Viewed by 579
Abstract
Zero Trust architectures require immediate identification of IoT devices before granting network access; however, most existing classification methods rely on extended traffic observation windows or computationally intensive deep learning models. This study proposes a lightweight multi-label IoT device classification framework based solely on early-stage DHCP and DNS metadata captured during device boot-up. Traditional supervised classifiers, including Naïve Bayes, Decision Tree, Random Forest, and Multi-Layer Perceptron, are adapted to support probabilistic multi-label prediction and integrated unknown device detection through confidence-based thresholding. The approach enables devices with identical or overlapping behavioral fingerprints to be grouped for policy enforcement while preserving detection sensitivity for unseen devices under open-set conditions. Experimental evaluation on 40 IoT devices representing 31 device types demonstrates that Random Forest achieves the most reliable balance between classification accuracy and unknown detection robustness, while maintaining low computational overhead suitable for constrained gateways. The results show that early metadata alone is sufficient for real-time Zero Trust enforcement and least-privilege policy activation. The proposed unified framework reduces architectural complexity by combining classification and unknown detection into a single model, making it practical for scalable IoT deployments. Full article
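The confidence-based thresholding described in this abstract can be sketched as follows. The function name, the device labels, and the threshold value are hypothetical; the abstract does not specify how the threshold was set.

```python
def assign_labels(probabilities, threshold=0.5):
    """Return every label whose probability meets the threshold, or ['unknown'] if none does."""
    accepted = sorted(label for label, p in probabilities.items() if p >= threshold)
    return accepted or ["unknown"]

# Hypothetical per-label probabilities from a multi-label classifier.
known_device = {"camera_a": 0.81, "camera_b": 0.78, "plug": 0.05}
unseen_device = {"camera_a": 0.22, "camera_b": 0.19, "plug": 0.31}

# Two labels pass the threshold: devices with overlapping fingerprints are grouped.
known_result = assign_labels(known_device, threshold=0.6)

# No label passes the threshold: the device is flagged as unknown.
unseen_result = assign_labels(unseen_device, threshold=0.6)
```

Because one model produces both the label set and the unknown decision, no separate outlier detector is needed, which matches the single-model design the abstract describes.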
27 pages, 4942 KB  
Article
Ancestral BG1 Alleles and Structural Conservation Ensure Immune-Related Genetic Resilience in Southeast Asian Chicken Lineages
by Anh Huynh Luu, Trifan Budi, Worapong Singchat, Chien Tran Phuoc Nguyen, Thitipong Panthum, Nivit Tanglertpaibul, Kanithaporn Vangnai, Aingorn Chaiyes, Chotika Yokthongwattana, Chomdao Sinthuvanich, Orathai Sawatdichaikul, Kyudong Han, Narongrit Muangmai, Darren K. Griffin, Prateep Duengkae, Ngu Trong Nguyen and Kornsorn Srikulnath
Animals 2026, 16(9), 1398; https://doi.org/10.3390/ani16091398 - 3 May 2026
Viewed by 496
Abstract
Chicken (Gallus gallus domesticus) domestication, likely associated with dry-rice farming in central Thailand, has led to substantial loss of ancestral immune-related genetic diversity in commercial chicken lineages. This study addresses allelic loss by providing the first comprehensive analysis of the highly polymorphic BG1 gene, an MHC-linked marker across the wild–domestic interface in Thailand and Vietnam, using high-depth Illumina amplicon sequencing. Genomic DNA from 47 Thai and Vietnamese chicken populations was extracted using a salting-out protocol following ethical sampling. Allelic variation was examined by targeting the BG1 intron 15–exon 16 region using triplicate PCR and Salus Pro NGS sequencing. Evolutionary dynamics and selection pressures were analyzed using AmpliSAS, MrBayes, and Datamonkey, while AlphaFold 3 was used to predict and validate 3D protein structures. We identified 98 novel alleles and 172 polymorphic sites within the BG1 intron 15–exon 16 region encoding an Ig-like domain. Extensive allele sharing between indigenous chickens and red junglefowl indicated strong balancing selection and trans-species polymorphism. Selection analyses showed that purifying selection conserved structural integrity at codons 9, 13, and 18, while variation at other sites enhanced immune recognition. AlphaFold 3 modeling confirmed conservation of the β-sandwich fold across variants, maintaining stability of the Immunoreceptor Tyrosine-based Inhibition Motif (ITIM). Thus, despite the regional gene flow, geographic isolation has shaped distinct signatures, as evidenced by the presence of 38 unique Thai and 9 unique Vietnamese alleles in addition to breed-specific private markers in the Betong (BG1*TH88), Decoy (BG1*TH91), and Tre (BG1*VN54) populations. A notable adaptive outlier under positive selection (ω = 1.357) was detected in the Dong Tao population, suggesting a recent selective sweep. 
These findings support the mission of the Siam Chicken Bioresource Project (SCBP) to utilize indigenous breeds as genetic reservoirs and provide a molecular basis for restoring resilience traits in domestic poultry to enhance global food security.
(This article belongs to the Section Animal Genetics and Genomics)
57 pages, 16524 KB  
Review
A Review and Experimental Analysis of Supervised Learning Systems and Methods for Protein–Protein Interaction Detection
by Kamal Taha
Int. J. Mol. Sci. 2026, 27(9), 4094; https://doi.org/10.3390/ijms27094094 - 2 May 2026
Viewed by 368
Abstract
The exponential growth of genomic and proteomic data has made computational protein–protein interaction (PPI) prediction indispensable, driving the need for a comprehensive and method-aware evaluation of supervised learning approaches. PPIs are fundamental to understanding cellular processes and disease mechanisms, yet experimental identification remains slow, costly, and difficult to scale. This survey systematically investigates ten supervised learning models—Extreme Learning Machine (ELM), Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Deep Neural Networks (DNNs), Naïve Bayes, Probabilistic Decision Tree, Support Vector Machine (SVM), Least Squares SVM (LS-SVM), K-Nearest Neighbor (KNN), and Weighted K-Nearest Neighbor (WKNN)—through a tri-layered framework that integrates Comparative Quantitative Analysis, Comparative Observational Analysis, and Experimental Evaluations. Beyond conventional accuracy summaries, this work provides critical commentary tied to real-world use, analyzing where techniques succeed or fail in practice—for instance, when instance-based methods bottleneck during inference, when kernel choices influence SVM variance, or when deep architectures trade accuracy for computational cost. The survey also offers concrete deployment guidance, such as calibration insights for WKNN versus KNN under varying feature noise or dataset curation quality, delivering operational perspectives that typical surveys omit. Comparative Quantitative Analysis consolidates metrics such as accuracy, F1-score, and computational time from the existing literature, while Comparative Observational Analysis evaluates interpretability, scalability, dataset suitability, and efficiency. Complementing these, Experimental Evaluations conducted by the authors empirically validate model performance on benchmark datasets. Together, these layers provide a unified and evidence-backed perspective on algorithmic strengths, weaknesses, and practical applicability. 
Findings show that GNNs and DNNs achieve the highest predictive accuracy due to their ability to capture structural and topological relationships, whereas ELM and Naïve Bayes offer superior efficiency. SVM and LS-SVM maintain robust stability under noisy conditions, and CNNs are well-suited for sequence-based prediction tasks. By combining empirical validation, critical insights, and deployment-focused recommendations, this survey delivers decision-grade guidance that bridges theoretical understanding with real-world implementation, thus clarifying the trade-offs among accuracy, efficiency, and scalability in PPI detection research.
(This article belongs to the Section Molecular Biology)
24 pages, 8968 KB  
Article
FetalNet 1.0: TOPSIS-Guided Ensemble Learning with Genetic Feature Selection and SHAP Explainability for Fetal Health Classification from Cardiotocography
by Shweta, Neha Gupta, Meenakshi Gupta, Massimo Donelli, Yogita Arora and Achin Jain
Computers 2026, 15(5), 291; https://doi.org/10.3390/computers15050291 - 2 May 2026
Viewed by 319
Abstract
Fetal health assessment is a crucial aspect of prenatal care, aimed at the early detection of potential complications to ensure optimal outcomes for both mother and child. Traditional methods, such as the visual analysis of cardiotocography (CTG) data by healthcare professionals, are valuable but often subjective and time-consuming. This work investigates the application of machine learning techniques, with a focus on ensemble learning, to enhance the accuracy and efficiency of fetal health classification based on CTG data. Genetic Algorithm (GA) is employed for optimal feature selection, identifying the most discriminative subset of CTG attributes to improve model performance and reduce computational complexity. We employ a combination of advanced machine learning models, including AdaBoost, Gaussian Naïve Bayes, Decision Tree, k-nearest neighbors (KNN), and Logistic Regression. The top two models were selected based on comprehensive performance metrics using the TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) method. These models were then integrated through ensemble learning approaches, such as stacking, Particle Swarm Optimization (PSO) weighted averaging, and soft voting, to improve prediction reliability. Our proposed stacking ensemble model achieves a remarkable accuracy of 97.9%, demonstrating its potential as a robust, data-driven tool for fetal health monitoring and the early identification of at-risk pregnancies. The results indicate that machine learning can effectively complement traditional fetal health assessment methods by providing an objective framework to support clinical decision-making. Full article
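The TOPSIS step used here to pick the top two models can be sketched in a few lines. This is the textbook procedure (vector normalization, weighting, distance to ideal and anti-ideal points); the model names, metric values, and weights are hypothetical and are not taken from the paper.

```python
import math

def topsis(matrix, weights, benefit):
    """Score alternatives (rows) on criteria (columns); returns closeness values in [0, 1]."""
    n_criteria = len(weights)
    # Vector-normalize each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix]
    columns = list(zip(*weighted))
    # Ideal point: best value per criterion; anti-ideal point: worst value per criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti_ideal = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti_ideal)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical models scored on accuracy and F1 (higher is better) and fit time (lower is better).
models = ["model_a", "model_b", "model_c"]
matrix = [
    [0.95, 0.94, 2.0],
    [0.90, 0.89, 3.0],
    [0.93, 0.92, 9.0],
]
scores = topsis(matrix, weights=[0.4, 0.4, 0.2], benefit=[True, True, False])

# Keep the two highest-scoring models for the ensemble stage.
top_two = [m for _, m in sorted(zip(scores, models), reverse=True)[:2]]
```

In this toy case the slow model drops out even though its accuracy is second best, which shows how a multi-criteria ranking can differ from ranking on accuracy alone.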
(This article belongs to the Section AI-Driven Innovations)
15 pages, 2236 KB  
Article
Temporal Machine Learning Models for Classifying Suspected Dengue Cases in Mexico Using Surveillance Data from 2025
by Jorge Soria-Cruz, Enrique Luna-Ramírez, Iván Castillo-Zúñiga, Jaime Iván López-Veyna, Ma. Angélica Estrada-Ramírez and Juan Antonio González-Morales
Diseases 2026, 14(5), 155; https://doi.org/10.3390/diseases14050155 - 28 Apr 2026
Viewed by 366
Abstract
Background: Dengue fever remains a major public health challenge in Mexico, exhibiting pronounced seasonal behavior and substantial geographic heterogeneity. Using recent epidemiological surveillance data may improve predictive performance and better reflect the current epidemiological context. Objective: The aim of this study was to develop and compare temporal machine learning models for the binary classification of confirmed and negative dengue cases in Mexico using 2025 national surveillance data. Methods: A total of 68,222 suspected dengue cases reported in 2025 were analyzed. The outcome variable was CASE_STATUS, encoded as 0 for negative cases and 1 for confirmed cases. The dataset was divided chronologically into training (January–September), validation (October), and testing (November–December) subsets. Nine machine learning algorithms were evaluated: Random Forest, Bayesian Network, XGBoost, CatBoost, Naïve Bayes, Logistic Regression, Multilayer Perceptron, Support Vector Machine, and LightGBM. Preprocessing included scaling, encoding, age discretization for Bayesian Network, class imbalance treatment, and model-specific feature-importance analyses. Performance was assessed using accuracy, Precision, Recall, F1-score, ROC-AUC, and PR-AUC. Results: Random Forest achieved the best overall performance, with the highest test F1-score (0.7254) and PR-AUC (0.7300) at an optimized threshold of 0.397, together with a high Recall (0.8938). Bayesian Network achieved the highest test accuracy (0.7023) and ROC-AUC (0.7756), although its overall operational balance was less favorable considering class imbalance. Geographic and institutional variables were the most influential predictors across models, whereas comorbidities generally contributed less. Conclusions: Temporal machine learning models are useful for dengue case classification in Mexico, and Random Forest was the most robust approach, balancing sensitivity and overall predictive performance. 
From an operational perspective, this finding is especially relevant in dengue surveillance, where failure to identify true confirmed cases may have important public health consequences.
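An optimized decision threshold such as the 0.397 reported for Random Forest is typically chosen on the validation split and then applied unchanged to the test split. The abstract does not state the selection procedure; the sketch below assumes a simple F1-maximizing sweep, with hypothetical scores and labels.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 for the positive class when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest F1 (the first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))

# Hypothetical validation-month scores and true labels (1 = confirmed, 0 = negative).
val_scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.80]
val_labels = [0, 0, 1, 1, 0, 1]
candidates = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

chosen = best_threshold(val_scores, val_labels, candidates)
```

Tuning on the validation month only keeps the test months untouched, so the reported test metrics remain an honest estimate of forward-in-time performance.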
(This article belongs to the Section Infectious Disease)
14 pages, 1565 KB  
Article
Enhancing Intrusion Detection Systems Using Machine Learning and Advanced Feature Selection Methods
by Ahmed Abu-Khadrah, Shaima AlKhudair, Mohammad R. Hassan, Ali Mohd Ali, Tareq A. Alawneh, Emad Alnawafa and Ahmed A. M. Sharadqh
Electronics 2026, 15(9), 1860; https://doi.org/10.3390/electronics15091860 - 28 Apr 2026
Viewed by 468
Abstract
Machine learning helps intrusion detection systems learn to recognize new attacks quickly. These systems are trained on a dataset covering several threat types and can then flag anomalous behavior. This research detects intrusions using Random Forest, KNN, and Gaussian Naive Bayes, evaluated on a comprehensive dataset. A Dynamics Feature Selector (DFS) improves performance by eliminating unnecessary inputs, using statistical analysis and feature importance. The effectiveness of DFS is tested on the NSL-KDD dataset. The recommended hybrid approach, Gaussian NB, Random Forest, and KNN are compared in a meta-learning setting. The aim is to achieve high accuracy with fewer features. To demonstrate how the model may function in real cybersecurity scenarios, the final test uses common performance metrics such as accuracy, precision, recall, and F1-score. The proposed method outperforms previously reported results, with around 96.09% accuracy, 93.21% precision, 92.53% recall, 92.79% F1-score, and 93.65% average performance.
(This article belongs to the Section Artificial Intelligence)

16 pages, 1002 KB  
Article
Nutritional Status of Children with Short Stature Is Oppositely Associated with Growth Hormone Peak in Stimulation Tests and Insulin-like Growth Factor-1 Concentration
by Joanna Smyczyńska, Urszula Smyczyńska, Maciej Hilczer and Renata Stawerska
J. Clin. Med. 2026, 15(9), 3333; https://doi.org/10.3390/jcm15093333 - 27 Apr 2026
Viewed by 226
Abstract
Background/Objectives: A blunted growth hormone (GH) response in stimulation tests (GHSTs) in obese patients is well documented, with less evidence for insulin-like growth factor-1 (IGF-1) concentrations. The aim of this study was to assess the relationships between nutritional status, GH peak in GHST, and IGF-1 concentrations, and to develop machine learning prediction models of GH deficiency (GHD) in children with short stature. Methods: A case–control study included 1592 children with short stature, whose height, weight, body mass index (BMI), GH peak in two GHSTs, IGF-1 concentration and bone age (BA) were assessed. The cut-off of GH peak in two GHSTs between GHD and idiopathic short stature (ISS) was 10.0 µg/L; additionally, a lower cut-off of 7.0 µg/L was used in a repeated analysis. Univariate statistical analyses and classification models were used to identify variables related to normal and subnormal GHST results. Results: Depending on the cut-off of GH peak (10.0 vs. 7.0 µg/L), GHD was diagnosed in 604 vs. 279 patients (37.9% vs. 17.5%). Children with GHD had significantly lower (p < 0.001) BMI SDS and IGF-1 SDS than those with ISS for both cut-offs of GH peak. Overnutrition was associated with the lowest GH peak but the highest IGF-1 SDS; the opposite results were observed in undernutrition. A decision tree predicted GHD in 156 patients, in 149 of them based on BMI SDS > 0.91. A Naïve Bayes classifier predicted GHD in 118 cases, with BMI SDS and IGF-1 SDS being the only significant variables. The best multilayer perceptron (MLP) neural network predicted GHD in 310 patients, while a logistic regression model did so in 269 patients. Conclusions: Interpretation of GHST should include the patient’s nutritional status in order to avoid overdiagnosis of GHD in overweight and obese children.

24 pages, 6075 KB  
Article
Complexity and Performance Analysis of Supervised Machine Learning Models for Applied Technologies: An Experimental Study with Impulsive α-Stable Noise
by Areeb Ahmed and Zoran Bosnić
Technologies 2026, 14(5), 252; https://doi.org/10.3390/technologies14050252 - 23 Apr 2026
Viewed by 522
Abstract
Impulsive alpha (α)-stable noise, characterized by heavy tails and intense outliers, is a key ingredient in simulating financial, medical, seismic, and digital communication technologies. It poses versatile challenges to conventional machine learning (ML) algorithms in predicting noise parameters for multidisciplinary artificial intelligence (AI)-embedded devices. In this study, we adopted a two-phase methodology to investigate the complexity and performance of supervised ML algorithms while classifying impulsive noise parameters. We generated synthetic datasets of α-stable noise distributions for experimentation in a controlled environment. This was followed by an experimental evaluation to derive the complexity and performance of five ML classifiers: k-nearest neighbors (KNN), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), and Random Forest (RF). Moreover, we employed a very high channel noise level of −15 dB in the test datasets to ensure that the derived analysis applies to real-world devices. The results demonstrate the high performance of DT and RF in structured binary classification of the α regime and the sign of skewness, while incurring satisfactory computational costs. However, SVM and KNN are comparatively more robust for multi-class classification, albeit with higher memory and training costs. In contrast, NB fails to address the skewed and impulsive behavior of α-stable noise. We observed that even the most effective classifiers struggle to achieve perfect accuracy in multi-class classification. Overall, the experimental results reveal significant trade-off relationships between the complexity and performance of ML classifiers. In conclusion, simple models are well-suited for coarse-grained tasks, such as α-approximation and sign-of-skewness classification. In contrast, sophisticated models can be deployed to predict noise parameters to some extent. Our study provides a clear set of trade-offs for future applied AI devices that address adversarial and impulsive noise.

19 pages, 4525 KB  
Article
Interval Prediction of Remaining Useful Life Based on Uncertainty Quantification with Bayesian Convolutional Neural Networks Featuring Dual-Output Units
by Zhendong Qu, Jialong He, Yan Liu, Song Mao and Xiaowu Han
Sensors 2026, 26(9), 2592; https://doi.org/10.3390/s26092592 - 22 Apr 2026
Viewed by 393
Abstract
Remaining useful life (RUL) prediction methods do not fully account for the uncertainties caused by data scarcity and inherent noise, and they also suffer from low reliability of RUL point estimates. To tackle these challenges, this paper proposes a Bayesian convolutional neural network with dual-output units for RUL interval predictions. The network employs the negative log-likelihood as the loss function. Thanks to its dual-output structure, it not only provides point estimates but also quantifies the aleatoric uncertainty inherent in the data. During training, the CNN is reformulated using Bayesian principles, and the Bayes-by-backprop method is applied to train the network. This transformation converts model parameters from fixed values into random variables. As a result, epistemic uncertainty caused by model inaccuracies and limited data can be quantified. Experimental validation on the IEEE PHM Challenge 2012 dataset demonstrated that the proposed method achieved higher prediction accuracy than state-of-the-art uncertainty-aware prediction approaches, indicating better applicability in engineering practice.
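The abstract does not give the exact form of the loss. As a rough illustration only, a dual-output network is commonly trained with a Gaussian negative log-likelihood, where one output unit is read as a predicted mean and the other as a predicted variance. The symbols below ($\mu_\theta$, $\sigma_\theta^2$, $N$) are assumptions for this sketch and are not taken from the paper:

```latex
\mathcal{L}(\theta)
  = \frac{1}{N} \sum_{i=1}^{N}
    \left[
      \frac{\bigl(y_i - \mu_\theta(x_i)\bigr)^2}{2\,\sigma_\theta^2(x_i)}
      + \frac{1}{2} \log \sigma_\theta^2(x_i)
    \right]
```

In this form, $\mu_\theta(x_i)$ would play the role of the RUL point estimate and $\sigma_\theta^2(x_i)$ the input-dependent (aleatoric) noise level; the paper's actual formulation may differ.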
(This article belongs to the Special Issue Sensing Technologies in Industrial Defect Detection)
